Research carried out at Birkbeck's Department of Computer Science and Information Systems since 2000 has produced techniques for managing and integrating complex, heterogeneous life sciences data in ways not previously possible with large-scale life sciences data repositories. The research has involved members of the department and researchers from the European Bioinformatics Institute (EBI) and University College London (UCL) and has led to the creation of several resources providing information about genes and proteins. These resources include the BioMap data warehouse, which integrated the CATH database (holding a classification of proteins into families according to their structure), the Gene3D database (holding information about protein sequences), and other related information on protein families, structures and the functions of proteins such as enzymes. These resources are heavily utilised by companies worldwide to explore relationships between protein structure and protein function and to aid in drug design.
Research on data compression produced novel algorithms that optimise the use of bandwidth and processing power. This research has led to the establishment of a product line that applies these algorithms to video surveillance software, marketed by Digital Barriers plc. Since 2008 this compression technology has allowed the company to grow from 8 to 41 staff and increase revenue from £800K to £6M in 2013. The novelty and usefulness of the data compression research was also appreciated by ThinkAnalytics plc. This led the company to the optimal design for data compression in their recommender system, which is currently being supplied to 130M cable TV customers, making the product the most deployed content recommendation system in the market.
KCL research played an essential role in the development of data provenance standards published by the World Wide Web Consortium (W3C), the standards body for web technologies, which is responsible for HTTP, HTML, etc. The provenance of data concerns records of the processes by which data was produced, by whom, from what other data, and similar metadata. The standards directly impact on practitioners and professional services through adoption by commercial, governmental and other bodies, such as Oracle, IBM, and NASA, in handling computational records of the provenance of data.
The research improves digital data archives by embedding computation into the storage controllers that maintain the integrity of the data within the archive. This opens up a number of possibilities:
This has impact on three different classes of beneficiary:
Targeted Projection Pursuit (TPP), developed at Northumbria University, is a novel method for interactive exploration of high-dimensional data sets without loss of information. The TPP method performs better than current dimension-reduction methods since it finds projections that best approximate a target view enhanced by certain prior knowledge about the data. "Valley Care" provides a Telecare service to over 5,000 customers as part of Northumbria Healthcare NHS Foundation Trust, and delivers a core service for vulnerable and elderly people (receiving an estimated 129,000 calls per annum) that allows them to live independently and remain in their homes longer. The service informs a wider UK ageing community as part of the NHS Foundation Trust.
Applying our research enabled the managers of Valley Care to establish the volume, type and frequency of calls, to identify users at high risk, and to inform the manufacturers of the equipment how to update the database software. This enabled Valley Care managers and staff to analyse the information quickly in order to plan the work of call operators and social care workers efficiently. Our study also provided knowledge about usage patterns of the technology and identified clients at high risk of falls. This is the first time that mathematical and statistical analysis of data sets of this type has been done in the UK and Europe.
As a result of applying the TPP method to its Call Centre multivariate data, Valley Care has been able to transform the quality and efficiency of its service, while operating within the same budget.
[text removed for publication], a developer of high-precision medical devices, have produced a new data annotation tool ([text removed for publication]) based on research in CSRI on data storage formats and activity recognition for applications within smart home environments. Within [text removed for publication] stereo-based cameras record activities in a specified environment (e.g. kitchen) which are then annotated using user-based pre-configured activity labels (e.g. prepare meal, wash dishes). [text removed for publication] is currently used by [text removed for publication] users and has yielded additional sales worth [text removed for publication]. [text removed for publication] have employed [text removed for publication] additional technical development staff to extend [text removed for publication] functionality, and through an MoU [text removed for publication] now supports automated annotation based on CSRI's research on activity recognition.
The advanced information management research of the Department of Digital Humanities (DDH) has led to a better understanding of pollution processes in inland waterways and lakes. It has also improved the standard of water quality information that is available to government and regulatory authorities. The information management framework which DDH has provided supports government-funded activities to improve environmental standards and has helped ensure that the UK Environment Agency is able to comply with the EU's Water Framework Directive, reducing the risk of financial penalties for non-compliance. Moreover, key and accurate evidence about water quality has been made freely available to beneficiaries, including governmental and non-governmental agencies, farmers and land managers, and the general public.
Researchers in Cambridge have developed a data standard for storing and exchanging data between different programs in the field of macromolecular NMR spectroscopy. The standard has been used as the foundation for the development of an open source software suite for NMR data analysis, leading to improved research tools which have been widely adopted by both industrial and academic research groups, who benefit from faster drug development times and lower development costs. The CCPN data standard is an integral part of major European collaborative efforts for NMR software integration, and is being used by the major public databases for protein structures and NMR data, namely Protein Data Bank in Europe (PDBe) and BioMagResBank.
The impact of this work stems from the provision of better quality information models, and is manifest via: (a) reduced cost through improved reuse and less rework; (b) improved system interoperability; and (c) enhanced assurance and checking that information requirements are supported by the resultant systems. The approach has been applied in commercial environments, such as Shell (UK), where it has reduced development costs by up to 50% ($1m in one case). It has also been applied in the defence environment, forming a part of underpinning standards currently being implemented by the UK and Swedish Armed Forces.
Open Data has lowered barriers to data access, increased government transparency and delivered significant economic, social and environmental benefits. Southampton research and leadership has led to the UK Public Data Principles, which were enshrined in the UK Government Open Data White Paper, and to data.gov.uk, which provides access to 10,000 government datasets. The open datasets are proving to be a means of strong citizen engagement and are delivering economic benefit through the £10 million Open Data Institute. These in turn have placed the UK at the forefront of the global data revolution: the UK experience has informed open data initiatives in the USA, EU and G8.