Bioinformatics Advance Access published online on September 27, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti690
1 School of Biological and Chemical Sciences, Queen Mary, University of London, Mile End Road, E1 4NS, UK
* To whom correspondence should be addressed.
Motivation: Circular dichroism (CD) spectroscopy has become established as a key method for determining the secondary structure content of proteins and has had a significant impact on molecular biology. Many excellent mathematical protocols have been developed for this purpose, and their quality is above question. However, the key resource for this task is the reference database sets of proteins, in which CD spectra are matched to secondary structure components derived from X-ray structures. These databases were created many years ago, before most CD spectrophotometers became standardised and before it was commonplace to validate X-ray structures prior to publication. The analyses presented here were undertaken to investigate the overall quality of these reference databases in light of their extensive usage in determining protein secondary structure content from CD spectra.
Results: The analyses show that there are a number of significant problems associated with the CD reference database sets in current use. There are disparities between CD spectra for the same protein collected by different groups, including differences in magnitudes, peak positions, or both. Many current reference sets are nevertheless amalgamations of spectra from these groups, which introduces inconsistencies that can lead to inaccuracies in the determination of secondary structure components from the CD spectra. A number of the X-ray structures used fall short of the validation criteria now employed as standard for structure determination. Many have substantial percentages of residues in the disallowed regions of the Ramachandran plot; hence their calculated secondary structure components, used as a foundation for the reference databases, are likely to be in error. Additionally, the coverage of secondary structure space in the reference data sets is poorly correlated with the secondary structure components found in the Protein Data Bank.
A conclusion is that a new reference CD database is now needed, one with cross-correlated, machine-independent CD spectra and validated X-ray structures that cover more secondary structure components and more diverse protein folds. Nevertheless, the fact that reasonably accurate values for the secondary structure content of proteins can be determined from spectra is a testament to CD spectroscopy being a very powerful technique.
Received August 2, 2005
Revised September 21, 2005
Accepted September 23, 2005
Article
Bioinformatics analyses of circular dichroism protein reference databases
Robert W. Janes, E-mail: r.w.janes{at}qmul.ac.uk
This article has been cited by other articles:
J. G. Lees, A. J. Miles, F. Wien, and B. A. Wallace. A reference database for circular dichroism spectroscopy covering fold and secondary structure space. Bioinformatics, August 15, 2006; 22(16): 1955-1962.
