Research carried out at Birkbeck's Department of Computer Science and Information Systems since 2000 has produced techniques for managing and integrating complex, heterogeneous life sciences data in ways not previously possible with large-scale life sciences data repositories. The research has involved members of the department and researchers from the European Bioinformatics Institute (EBI) and University College London (UCL) and has led to the creation of several resources providing information about genes and proteins. These resources include the BioMap data warehouse, which integrated the CATH database (holding a classification of proteins into families according to their structure), the Gene3D database (holding information about protein sequences), and other related information on protein families, structures and the functions of proteins such as enzymes. These resources are heavily utilised by companies worldwide to explore relationships between protein structure and protein function and to aid in drug design.
Novel rapid methods for predicting protein structure, particularly functional loop structures, have been developed by researchers at the University of Oxford. These have been made accessible to a large audience through a suite of computational tools. The methods have had general impact through download and online access and specific impact through extensive use within UCB Pharma. The tools are much faster than other methods, creating equal or better predictions in approximately a thousandth of the time. Commonly exploited by UCB Pharma in their drug discovery pipeline, they have cut computational cost, but, more importantly, they have greatly reduced the time for process improvements. UCB Pharma estimate that the tool pyFREAD alone saves over £5 million in the discovery costs for a single drug molecule. FREAD (a version of pyFREAD coded in C) is also being used more widely, for example by Crysalin Ltd and InhibOx.
The CATH classification of protein structure, developed at the Institute of Structural and Molecular Biology, UCL, by Janet Thornton and Christine Orengo, has been used widely across the pharmaceutical industry and academia to guide experiments on proteins. This has led to significant cost and time savings in drug discovery. The UCL-hosted online CATH database receives around 10,000 unique visitors per month, and is a partner in InterPro — the most frequently accessed protein function annotation server available.
Researchers in Cambridge have developed a data standard for storing and exchanging data between different programs in the field of macromolecular NMR spectroscopy. The standard has been used as the foundation for the development of an open source software suite for NMR data analysis, leading to improved research tools which have been widely adopted by both industrial and academic research groups, who benefit from faster drug development times and lower development costs. The CCPN data standard is an integral part of major European collaborative efforts for NMR software integration, and is being used by the major public databases for protein structures and NMR data, namely Protein Data Bank in Europe (PDBe) and BioMagResBank.
The research improves digital data archives by embedding computation into the storage controllers that maintain the integrity of the data within the archive. This opens up a number of possibilities:
This has impact on three different classes of beneficiary:
Open Data has lowered barriers to data access, increased government transparency and delivered significant economic, social and environmental benefits. Southampton research and leadership have led to the UK Public Data Principles, which were enshrined in the UK Government Open Data White Paper, and to data.gov.uk, which provides access to 10,000 government datasets. The open datasets are proving a means of strong citizen engagement and are delivering economic benefit through the £10 million Open Data Institute. These in turn have placed the UK at the forefront of the global data revolution: the UK experience has informed open data initiatives in the USA, EU and G8.
Research at the University of Oxford's Glycobiology Institute (OGBI) has led to the development of 'state-of-the-art' platform technologies for the analysis of oligosaccharides (sugars) that are linked to proteins and lipids. These enabling technologies have had major impacts worldwide on drug discovery programmes, have enabled robust procedures to be developed for the quality control of biopharmaceutical production, and have been widely adopted by the pharmaceutical industry.
PolySNAP is an extensive commercial computer program developed at WestCHEM to process and classify large volumes of crystallographic and spectroscopic data. It is a market-leading product sold and supported by Bruker Corporation (a manufacturer of scientific instruments for molecular and materials research selling products world-wide) and is used in laboratories throughout the world supporting business in the pharmaceutical, materials, mining, geology, and polymer science sectors. The PolySNAP software was and continues to be sold in combination with all Bruker x-ray powder diffractometers.
Targeted Projection Pursuit (TPP), developed at Northumbria University, is a novel method for interactive exploration of high-dimensional data sets without loss of information. The TPP method performs better than current dimension-reduction methods since it finds projections that best approximate a target view enhanced by certain prior knowledge about the data. "Valley Care" provides a Telecare service to over 5,000 customers as part of Northumbria Healthcare NHS Foundation Trust, and delivers a core service for vulnerable and elderly people (receiving an estimated 129,000 calls per annum) that allows them to live independently and remain in their homes longer. The service informs a wider UK ageing community as part of the NHS Foundation Trust.
Applying our research enabled the managers of Valley Care to establish the volume, type and frequency of calls, to identify users at high risk, and to inform the manufacturers of the equipment how to update the database software. This enabled Valley Care managers and staff to analyse the information quickly and so plan the work of call operators and social care workers efficiently. Our study also provided knowledge about usage patterns of the technology and identified clients at high risk of falls. This is the first time that mathematical and statistical analysis of data sets of this type has been done in the UK and Europe.
As a result of applying the TPP method to its Call Centre multivariate data, Valley Care has been able to transform the quality and efficiency of its service, while operating within the same budget.
The protein research of Imperial's Mass Spectrometry group led to the development of Mass Mapping/Fingerprinting for rapid protein characterisation, and new methods for disulphide bridge and glycosylation assignment. Commercialising these discoveries, the company M-SCAN has developed methods to accelerate industrial research and commercialisation of the next generation of recombinant drug therapies, such as monoclonal antibodies targeting cancers. M-SCAN is the pioneer of Biopharmaceutical Characterisation. It has influenced regulatory advice and, in the past ten years, has assisted many hundreds of companies worldwide in developing their products for market, leading to the growth of a profitable business. In 2010, SGS S.A., a multinational company that provides inspection, verification, testing and certification services, acquired M-SCAN for an undisclosed sum, satisfying SGS's vision to become one of the top players within the Biologics testing arena.