The DichroWeb Analysis Server and Protein Circular Dichroism Data Bank: analysis tools for structural biology
Submitting Institutions
University College London, Birkbeck College
Unit of Assessment
Biological Sciences
Summary Impact Type
Technological
Research Subject Area(s)
Biological Sciences: Biochemistry and Cell Biology
Information and Computing Sciences: Artificial Intelligence and Image Processing, Information Systems
Summary of the impact
DICHROWEB is a comprehensive, user-friendly server that provides access
to computational tools
for the determination of protein secondary structure from data obtained
through circular dichroism
(CD) and synchrotron radiation (SRCD) spectroscopy. The Protein Circular
Dichroism Data Bank
(PCDDB) is a database of spectra obtained using these techniques and
allied data. Both resources
are widely and increasingly used in many countries and are proving useful
in industrial research
(for example, in drug discovery) as well as academia and advanced
teaching. DICHROWEB
currently has over 3,600 registered users and over 375,000 DICHROWEB
analyses have been
run. Since the launch of PCDDB in 2009, the database has had over 175,000
unique hits from 41
different countries, and 89,890 downloads.
Underpinning research
Circular dichroism (CD) spectroscopy is a powerful tool for the
structural analysis of peptides and
proteins that was first developed in the 1960s. It measures the difference in
absorption of left and right circularly polarised light, for proteins typically in
the far ultraviolet (UV) part of the electromagnetic spectrum. As this
differential absorption is influenced by
the geometry of the peptide backbone, the CD spectrum can be used
to determine the
proportion of each secondary structure type present in a protein. It is a
method that is widely used
in both academia and industry for the characterisation of proteins. In
recent years, technical
improvements, particularly the development of synchrotron radiation CD
(SRCD) spectroscopy,
have enabled lower wavelength data to be collected and thus increased the
information content of
the spectra. Wallace and her group have been, and remain, at the
forefront of the development of
these new techniques.
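For orientation, the quantity recorded in a CD experiment can be written out as below. This is the standard textbook definition rather than anything specific to this case study; the symbols (A_L and A_R for the absorbances of left and right circularly polarised light, c for molar concentration, l for path length) are introduced here for illustration.

```latex
\begin{align}
  \Delta A(\lambda) &= A_L(\lambda) - A_R(\lambda)
                     = \Delta\varepsilon(\lambda)\, c\, l, \\
  \theta(\lambda)\ [\mathrm{mdeg}] &\approx 32\,980 \times \Delta A(\lambda).
\end{align}
```

The second line is the usual small-signal conversion between differential absorbance and ellipticity, which is one reason spectra from different instruments and programs can appear in different units.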
An analysis of the secondary structure of a protein from its CD spectrum
is based on comparisons
with a set of reference spectra obtained from proteins with structures
known to atomic resolution.
Many different programs for the analysis of CD data have been made
publicly available, but
these required input and produced output in different formats and
units, and reported
different types of comparison metrics. Wallace and her co-workers
developed and continue to
update and maintain the DICHROWEB server (http://dichroweb.cryst.bbk.ac.uk)
as a user-friendly
web-based interface to a wide range of both well-characterised and novel
tools for the analysis of
SRCD and CD spectroscopy data [1]. DICHROWEB users are able to
submit their spectra in any
recognised format and have access to many combinations of analysis
programs and reference
databases. The degree of agreement between the experimental data and a
spectrum obtained by
back-calculation from the predicted secondary structures is returned as a
simple root-mean-square
goodness-of-fit parameter and is also displayed as an easily interpretable
graphical comparison of
the experimental and calculated data [2].
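A goodness-of-fit parameter of the kind described above can be sketched in a few lines of Python. The text states only that it is a simple root mean square measure; the normalisation by the summed squares of the experimental values used here is an assumption (a commonly used normalised RMSD form), and the function name `nrmsd` and the sample values are hypothetical, not taken from DICHROWEB.

```python
import math


def nrmsd(experimental, calculated):
    """Normalised root-mean-square deviation between two spectra.

    Both sequences must be sampled at the same wavelengths and in the
    same units. Assumed form (not confirmed by the source):
        sqrt( sum((exp - calc)^2) / sum(exp^2) )
    """
    if len(experimental) != len(calculated):
        raise ValueError("spectra must have the same number of points")
    if not experimental:
        raise ValueError("spectra must not be empty")

    # Sum of squared residuals between measured and back-calculated values.
    residual = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    # Normalising term: total squared signal in the experimental spectrum.
    signal = sum(e ** 2 for e in experimental)
    if signal == 0:
        raise ValueError("experimental spectrum is all zeros")

    return math.sqrt(residual / signal)


# Made-up illustrative values, not real CD data.
exp_spectrum = [10.0, -5.0, -8.0, -7.5]
calc_spectrum = [9.0, -5.5, -8.5, -7.0]
fit = nrmsd(exp_spectrum, calc_spectrum)
```

Under this definition a value of zero means the back-calculated spectrum reproduces the experimental one exactly, and larger values indicate a poorer fit.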
Research by Wallace and her collaborators has enabled a range of new
features, input
parameters, data types and reference databases to be added to DICHROWEB,
taking into account
advances in X-ray crystallography, NMR spectroscopy and bioinformatics
that have greatly
increased the range of atomic-resolution protein structures available and
categorised them more
precisely into fold families. Among these additions, new reference data sets were
produced that cover
protein secondary structure and fold space, using high-quality SRCD
data, allowing more
accurate prediction of secondary structure [3]. Furthermore, new
computational and bioinformatics
developments have been incorporated by the Wallace group [4].
Techniques for the analysis of the
dynamic properties of protein folding and of protein-protein interactions
have also been developed
[5].
In a parallel development, Wallace created the Protein Circular Dichroism
Data Bank (PCDDB) in
collaboration with Dr Robert Janes (Queen Mary University of London) as a
publicly available
repository for CD and SRCD data and a BBSRC Bioinformatics resource for
data sharing. It was
first released in December 2009 and is now fully operational for data
deposit as well as access [6].
Besides the spectra, which are available for download graphically or as
text files, this databank
includes metadata such as sample types and experimental conditions. It
also contains a set of
tools for validating CD spectra which are extremely valuable for the
community, particularly as
journals do not insist on CD spectra being validated if the main emphasis
of the publication is on a
different technique. The database is freely available to the community via
a web-based interface
that runs on all up-to-date platforms and browsers and is user-friendly
enough for non-experts to
use. Each entry in PCDDB is linked to a wide range of other structural
biology resources and also
linked through to DICHROWEB to allow users to run their own analyses on
stored datasets. This
database is a valuable resource for bioinformatics and experimental
studies on proteins; it plays a role for CD data parallel to that of
the Protein Data Bank (PDB) for crystallographic data, and
enables comparisons
with known proteins [7].
References to the research
[2] Whitmore L, Wallace BA. DICHROWEB, an online server for protein
secondary structure
analyses from circular dichroism spectroscopic data. Nucleic Acids Res.
2004 Jul 1;32(Web
Server issue):W668-73. http://dx.doi.org/10.1093/nar/gkh371
[4] Whitmore L, Wallace BA. Protein secondary structure analyses from
circular dichroism
spectroscopy: methods and reference databases. Biopolymers. 2008
May;89(5):392-400.
http://dx.doi.org/10.1002/bip.20853
[5] Wallace BA, Janes RW. Synchrotron radiation circular dichroism (SRCD)
spectroscopy: an
enhanced method for examining protein conformations and protein
interactions. Biochem Soc
Trans. 2010 Aug;38(4):861-73. http://dx.doi.org/10.1042/BST0380861
[6] Whitmore L, Woollett B, Miles AJ, Janes RW, Wallace BA. The protein
circular dichroism data
bank, a Web-based site for access to circular dichroism spectroscopic
data. Structure. 2010
Oct 13;18(10):1267-9. http://dx.doi.org/10.1016/j.str.2010.08.008
[7] Whitmore L, Woollett B, Miles AJ, Klose DP, Janes RW, Wallace BA.
PCDDB: the Protein
Circular Dichroism Data Bank, a repository for circular dichroism spectral
and metadata.
Nucleic Acids Res. 2011 Jan;39(Database issue):D480-6.
http://dx.doi.org/10.1093/nar/gkq1026
Grants supporting this work:
BBSRC BB/J019135/1. Bioinformatics Resources for Circular Dichroism
Spectroscopy. 1/1/2013-
31/12/2017. £856,528.
BBSRC F010346: Support for the Protein Circular Dichroism Data Bank and
the Dichroweb
Analysis Server. 01/01/2008-31/12/2012. £545,279.
BBSRC B02959. Bioinformatics tools for CD spectroscopy: validation and
processing software and
creation of a CD deposition data bank. 01/04/04-31/03/06. £235,446.
BBSRC 27/B13586. Bioinformatics for CD spectroscopy. Apr 01-Mar 04.
£152,256.
Details of the impact
Both DICHROWEB and the PCDDB are widely used by researchers in industry
as well as for
academic research and advanced teaching. DICHROWEB currently has over
3,600 registered
users. Over 375,000 DICHROWEB analyses have been run and the service has
been cited in over
1,000 publications [a]. This resource is free for academic and
non-profit research use but industrial
subscribers each pay a small annual fee for use of the service. The list
of past and present
industrial users includes representatives of the pharmaceutical,
biotechnology and food sectors,
and ranges from large multinational companies to the SME sector. Fees from
industrial users
provide an income of £2,000-5,000 per year, which is used as a contribution to
running costs. No formal
registration is required to access PCDDB, but users can register to
facilitate multiple queries or
updates of contents. Currently registered industrial users include a
number of large pharmaceutical
companies and life science SMEs, plus about a dozen biomedical product and
clinical diagnostic
suppliers, amongst others. Since its launch in 2009, the database has had
over 175,000 unique
hits from 41 different countries, and 89,890 downloads (as at March 2012)
[b].
Analysis of citations of our key publications reveals that a wide range
of companies have obtained
valuable results from using the tools that have been made accessible
through DICHROWEB (many
of which were developed by Wallace's group). For example, three studies
cited below [c-e]
illustrate typical recent cases in which the analysis tools within
DICHROWEB have been applied to
analysis of the stability and solution properties of proteins being
developed as biological products,
particularly vaccine candidates and tools for biomedical investigation. A
complete list of citations
with industrial co-authors is available [f]. This indicates use of
DICHROWEB by a wide range of
large and small companies from the pharmaceutical, vaccine, biotech, food
and instrumentation
sectors. A review of the patent literature has shown 11 patent
applications from major corporations
referring to the underpinning research papers, further indicating the
commercial relevance of our
work. The assignees that hold these patents include Roche Holding
AG, Pfizer Inc. and
Collplant Ltd [g].
The use of both DICHROWEB and the PCDDB is promoted and encouraged by all
the major
commercial manufacturers of equipment for CD spectroscopy, who have
established links from
their own company websites either to DICHROWEB alone or to both DICHROWEB
and the
PCDDB [h]. The well-established and prestigious RCSB Protein Data
Bank Knowledge Base
website, which is a primary access point for publicly available
atomic-resolution protein structures,
also links relevant entries to the appropriate entries in the PCDDB; both
sites are also linked from a
number of widely used bioinformatics websites and are now being promoted
via social media.
Wallace's work has been recognised in several awards and in invitations
to present her work at
international conferences. In 2010 the Biochemical Society presented her
with the prestigious AstraZeneca Award in
recognition of her "outstanding work which,
through biomedical
advances, leads to development of a new reagent or method" [i].
In 2011 she was a keynote
lecturer at the CASSS Higher Order Structure meeting on spectroscopy for
industry (Rockville, MD,
USA) [j].
DICHROWEB and the PCDDB are both widely used in advanced teaching, in
the UK outside
Birkbeck College and internationally. These uses include workshops for
researchers such as an annual
EU/UK summer school, which regularly involves delegates from industry, and
university courses;
DICHROWEB is mentioned in course notes and curricula from the Universities
of Texas and
Pittsburgh [k]. The use of both databases as resources for
undergraduate teaching of CD
spectroscopy has recently been recommended [l].
Sources to corroborate the impact
[a] Exact figures from DICHROWEB Annual Report (March 2012): 3,459 users;
1,079 publications.
Report available on request.
[b] Exact figure from PCDDB Annual Report (March 2012): 175,527. Report
available on request.
[c] Salnikova MS, Joshi SB, Rytting JH, Warny M, Middaugh CR. Physical
characterization of
clostridium difficile toxins and toxoids: effect of the formaldehyde
crosslinking on thermal
stability. J Pharm Sci. 2008 Sep;97(9):3735-52. http://dx.doi.org/10.1002/jps.21261.
[d] Martino A, Magagnoli C, De Conciliis G, D'Ascenzi S, Forster MJ,
Allen L, Brookes C, Taylor S,
Bai X, Findlow J, Feavers IM, Rodger A, Bolgiano B. Structural
characterisation, stability and
antibody recognition of chimeric NHBA-GNA1030: an investigational vaccine
component
against Neisseria meningitidis. Vaccine. 2012 Feb 8;30(7):1330-42.
http://dx.doi.org/10.1016/j.vaccine.2011.12.066.
[e] Paoletti F, Malerba F, Konarev PV, Visintin M, Scardigli R, Fasulo L,
Lamba D, Svergun DI,
Cattaneo A. Direct intracellular selection and biochemical
characterization of a recombinant
anti-proNGF single chain antibody fragment. Arch Biochem Biophys. 2012 Jun
1;522(1):26-36.
http://dx.doi.org/10.1016/j.abb.2012.04.003.
[f] A complete list of citations with industrial collaborators is available
on request and summarized
below:
Reference [1]: 385 citations in total (July 2013); 11 papers with
industrial co-authors
Reference [2]: 462 citations in total (July 2013); 13 papers with
industrial co-authors
Reference [4]: 363 citations in total (July 2013); 14 papers with
industrial co-authors
[g] Report by Cambridge IP Ltd available on request.
[h] Links on commercial websites include
[i] AstraZeneca award notification: http://www.biochemistry.org/Awards/AstraZenecaAward.aspx
[j] Programme available at
http://www.casss.org/associations/9165/files/HOS2011_FINALProgram_091211.pdf
[k] University of Texas: http://www.biomachina.org/courses/structures/052.pdf
University of Pittsburgh: www.cs.cmu.edu/~pittMB3/Lectures/CDLecture.ppt
[l] DICHROWEB and the PCDDB are cited in: Urbach AR. Circular dichroism in
the undergraduate
curriculum. J. Chem. Ed. 2010;87(9):891-3. http://dx.doi.org/10.1021/ed1005954
(see Refs 14
& 15).