Researchers in Cambridge have developed a standard for storing and exchanging data between different programs in the field of macromolecular NMR spectroscopy. The standard has been used as the foundation for the development of an open source software suite for NMR data analysis, leading to improved research tools which have been widely adopted by both industrial and academic research groups, who benefit from faster drug development times and lower development costs. The CCPN data standard is an integral part of major European collaborative efforts for NMR software integration, and is being used by the major public databases for protein structures and NMR data, namely Protein Data Bank in Europe (PDBe) and BioMagResBank.
Targeted Projection Pursuit (TPP), developed at Northumbria University, is a novel method for interactive exploration of high-dimensional data sets without loss of information. The TPP method performs better than existing dimension-reduction methods because it finds the projections that best approximate a target view embodying prior knowledge about the data. "Valley Care", part of Northumbria Healthcare NHS Foundation Trust, provides a Telecare service to over 5,000 customers: a core service for vulnerable and elderly people (receiving an estimated 129,000 calls per annum) that allows them to live independently and remain in their homes for longer. The service also informs a wider UK ageing community through the NHS Foundation Trust.
Applying our research enabled the managers of Valley Care to establish the volume, type and frequency of calls, to identify users at high risk, and to advise the equipment manufacturers on updating the database software. Valley Care managers and staff could then analyse the information quickly and plan the work of call operators and social care workers efficiently. Our study also yielded knowledge about usage patterns of the technology and, valuably, identified clients at high risk of falls. This was the first mathematical and statistical analysis of data sets of this type in the UK and Europe.
As a result of applying the TPP method to its call centre's multivariate data, Valley Care has been able to transform the quality and efficiency of its service while operating within the same budget.
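To make the core idea concrete, here is a minimal sketch assuming random stand-in data: TPP seeks a linear projection that brings the data as close as possible to a user-supplied target view. The unconstrained least-squares solve below illustrates only that core step; it is not the published interactive algorithm.

```python
import numpy as np

# Minimal sketch of the idea behind Targeted Projection Pursuit (TPP):
# find a linear projection P that maps high-dimensional data X as close
# as possible (in the least-squares sense) to a 2-D target view T that
# encodes prior knowledge, e.g. a hand-arranged scatter plot.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))   # 100 observations, 10 variables (stand-in data)
T = rng.normal(size=(100, 2))    # hypothetical target view of the same rows

# Solve min_P ||X @ P - T||_F (least squares with a 2-column right-hand side).
P, *_ = np.linalg.lstsq(X, T, rcond=None)

view = X @ P                     # 2-D projection approximating the target
print(view.shape)                # (100, 2)
```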
Open Data has lowered barriers to data access, increased government transparency and delivered significant economic, social and environmental benefits. Southampton research and leadership have led to the UK Public Data Principles, which were enshrined in the UK Government Open Data White Paper, and to data.gov.uk, which provides access to 10,000 government datasets. The open datasets are providing a means of strong citizen engagement and are delivering economic benefit through the £10 million Open Data Institute. These developments have placed the UK at the forefront of the global data revolution: the UK experience has informed open data initiatives in the USA, the EU and the G8.
The research improves digital data archives by embedding computation into the storage controllers that maintain the integrity of the data within the archive. This opens up a number of possibilities:
This has impact on three different classes of beneficiary:
Research carried out at Birkbeck's Department of Computer Science and Information Systems since 2000 has produced techniques for the management and integration of complex, heterogeneous life sciences data that were not previously possible with large-scale life sciences data repositories. The research has involved members of the department and researchers from the European Bioinformatics Institute (EBI) and University College London (UCL), and has led to the creation of several resources providing information about genes and proteins. These resources include the BioMap data warehouse, which integrated the CATH database (a classification of proteins into families according to their structure), the Gene3D database (information about protein sequences) and other related information on protein families, structures and the functions of proteins such as enzymes. These resources are heavily used by companies worldwide to explore relationships between protein structure and protein function and to aid in drug design.
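As an illustration of the warehouse-style integration described above, the sketch below joins a CATH-like structural classification with Gene3D-like sequence annotations on a shared protein identifier. The identifiers, field names and values are hypothetical and do not reflect the actual BioMap schema.

```python
# Illustrative sketch (not the BioMap schema): warehouse-style integration
# joins structural classifications (CATH-like) with sequence/family
# annotations (Gene3D-like) on a shared protein identifier.

cath = {  # hypothetical: protein id -> structural superfamily code
    "P12345": "3.40.50.720",
    "P67890": "1.10.510.10",
}
gene3d = {  # hypothetical: protein id -> predicted domain family
    "P12345": "G3DSA:3.40.50.720",
    "P99999": "G3DSA:2.60.40.10",
}

# Integrated view: one record per protein, merging both sources.
integrated = {
    pid: {"cath": cath.get(pid), "gene3d": gene3d.get(pid)}
    for pid in set(cath) | set(gene3d)
}

for pid, record in sorted(integrated.items()):
    print(pid, record)
```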
The research in this case study has pioneered knowledge management technology. It has had a major impact on drug discovery and translational medicine and is widely adopted in the pharmaceutical and healthcare industries. The impacts are:
The advanced information management research of the Department of Digital Humanities (DDH) has led to a better understanding of pollution processes in inland waterways and lakes. It has also improved the standard of water quality information that is available to government and regulatory authorities. The information management framework which DDH has provided supports government-funded activities to improve environmental standards and has helped ensure that the UK Environment Agency is able to comply with the EU's Water Framework Directive, reducing the risk of financial penalties for non-compliance. Moreover, important and accurate evidence about water quality has been made freely available to beneficiaries, including governmental and non-governmental agencies, farmers and land managers, and the general public.
Data-to-text systems use Natural Language Generation (NLG) technology to produce narrative summaries of complex data sets. These summaries help experts, professionals and managers to understand quickly the information contained within large and complex data sets. The technology has been developed since 2000 by Prof Reiter and Dr Sripada at the University of Aberdeen, supported by several EPSRC grants. The impact of the research has two dimensions.
As economic impact, a spinout company, Data2Text (www.data2text.com), was created in late 2009 to commercialise the research. As of May 2013, Data2Text had 14 employees. Much of Data2Text's work is carried out in collaboration with another UK company, Arria NLG (www.arria.com), which as of May 2013 had about 25 employees, most of whom were involved in collaborative projects with Data2Text.
As impact on practitioners and professional services, case studies have been developed in the oil & gas sector, in weather forecasting and in healthcare, where NLG provides tools to rapidly develop narrative reports that facilitate planning and decision making, bringing benefits in improved access to information and resultant cost and/or time savings. In addition, the research led to the creation of simplenlg (http://simplenlg.googlecode.com/), an open-source software package which performs some basic natural language generation tasks. The simplenlg package is used by several companies, including Agfa, Nuance and Siemens, as well as by Data2Text and Arria NLG.
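The sketch below illustrates the general data-to-text pattern in miniature: numeric readings in, a narrative sentence out. It is a toy rule-based example, not the Data2Text, Arria NLG or simplenlg implementation; the function name and thresholds are invented for illustration.

```python
# Minimal data-to-text sketch: map numeric sensor readings to a short
# narrative summary via simple rules, the core pattern of NLG over data.

def describe_trend(name, values, unit):
    """Return one sentence summarising the series' direction and range."""
    start, end = values[0], values[-1]
    if end > start * 1.1:
        trend = "rose"
    elif end < start * 0.9:
        trend = "fell"
    else:
        trend = "remained steady"
    return (f"{name.capitalize()} {trend} from {start} {unit} "
            f"to {end} {unit}, peaking at {max(values)} {unit}.")

# Hypothetical wind-speed readings over a forecast period.
print(describe_trend("wind speed", [8, 10, 14, 12], "knots"))
# -> "Wind speed rose from 8 knots to 12 knots, peaking at 14 knots."
```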
There is growing evidence that official population statistics based on the decennial UK Census are inaccurate at the local authority level, the fundamental administrative unit of the UK. The use of locally-available administrative data sets for counting populations can result in more timely and geographically more flexible data which are more cost-effective to produce than the survey-based Census. Professor Mayhew of City University London has spent the last 13 years conducting research on administrative data and their application to counting populations at local level. This work has focused particularly on linking population estimates to specific applications in health and social care, education and crime. Professor Mayhew developed a methodology that is now used as an alternative to the decennial UK Census by a large number of local councils and health care providers. They have thereby gained access to more accurate, detailed and relevant data which have helped local government officials and communities make better policy decisions and save money. The success of this work has helped to shape thinking on statistics in England, Scotland and Northern Ireland and has contributed to the debate over whether the decennial UK Census should be discontinued.
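The toy sketch below illustrates the general principle of counting populations from linked administrative sources, using hypothetical records and field names rather than Professor Mayhew's actual methodology: records from two local data sets are linked on a shared identifier so that each resident is counted exactly once.

```python
# Illustrative sketch of population counting from administrative data
# (hypothetical sources, identifiers and fields).

gp_register = [  # hypothetical GP patient register extracts
    {"id": "A1", "postcode": "EC1V 0HB"},
    {"id": "A2", "postcode": "EC1V 0HB"},
    {"id": "A3", "postcode": "N1 7AA"},
]
council_tax = [  # hypothetical council tax records
    {"id": "A2", "postcode": "EC1V 0HB"},
    {"id": "A4", "postcode": "N1 7AA"},
]

# A person counts as resident if they appear in either source;
# linking on the shared id removes double counting.
residents = {r["id"] for r in gp_register} | {r["id"] for r in council_tax}
print(len(residents))  # 4 distinct residents in this toy example
```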
Visual analytics is a powerful method for understanding large and complex datasets that makes information accessible to users without statistical training. The Non-linearity and Complexity Research Group (NCRG) developed several fundamental algorithms and brought them to users through interactive software tools, notably the Netlab pattern analysis toolbox in 2002 (more than 40,000 downloads) and the Data Visualisation and Modelling System (DVMS) in 2012. A simplified sketch of the kind of projection such tools rely on appears after the examples below.
Industrial products. These software tools are used by industrial partners (Pfizer, Dstl) in their business activities. The algorithms have been integrated into a commercial tool (p:IGI) used in geochemical analysis for oil and gas exploration with a 60% share of the worldwide market.
Improving business performance. As an enabling technology, visual analytics has played an important role in the data analysis that has led to the development of new products, such as the Body Volume Index, and the enhancement of existing products (Wheelright: automated vehicle tyre pressure measurement).
Impact on practitioners. The software is used internationally to educate and train skilled people in more than six institutions, and is also used by finance professionals.
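As noted above, here is a simplified sketch of projecting a high-dimensional data set to two dimensions for visual analysis. It uses plain principal component analysis on random stand-in data as a generic example; the NCRG's algorithms (e.g. those in Netlab and DVMS) are more sophisticated than this.

```python
import numpy as np

# Generic illustration of visual analytics via linear projection (PCA):
# map a high-dimensional data set to 2-D coordinates suitable for plotting.

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))          # 200 samples, 8 variables (stand-in data)

Xc = X - X.mean(axis=0)                # centre the data
# Principal axes = right singular vectors of the centred data matrix.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
coords = Xc @ Vt[:2].T                 # project onto the first two components

print(coords.shape)                    # (200, 2) -> ready for a scatter plot
```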