Bioinformatics Vol. 19 no. 14 2003
Pages 1748-1759
© 2003 Oxford University Press
Recognizing the fold of a protein structure
1 Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK, 2 Inpharmatica, 60 Charlotte Street, London W1P 2NU, UK, 3 Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK, 4 Crystallography Department, Birkbeck College, Malet Street, London WC1E 7HX, UK and 5 European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Received on August 27, 2002; revised on December 18, 2002; accepted on February 26, 2003
This paper reports a graph-theoretic program, GRATH, that rapidly and accurately matches a novel structure against a library of domain structures to find the most similar ones. GRATH generates distributions of scores by comparing the novel domain against the different fold types previously classified in the CATH database of structural domains.
GRATH uses a similarity measure that captures what any two protein structures share: geometric information, the number of secondary structures, and the number of residues within those secondary structures. Although GRATH builds on well-established approaches for secondary structure comparison, a novel scoring scheme has been introduced to allow ranking of any matches identified by the algorithm. More importantly, we have benchmarked the algorithm using a large dataset of 1702 non-redundant structures from the CATH database which have already been classified into fold groups, with manual validation. This has facilitated the introduction of further constraints, optimization of parameters and identification of reliable thresholds for fold identification. Following these benchmarking trials, the correct fold receives the top score with a frequency of 90% and appears within the ten most likely assignments with a frequency of 98%.
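The two benchmark frequencies quoted above are top-k hit rates over ranked fold assignments. The sketch below shows one way such a figure could be computed. It is not the GRATH benchmarking code: the function name `top_k_frequency`, the domain names and the ranked lists are hypothetical toy data, and the CATH-style fold codes are used only as labels.

```python
from typing import Dict, List


def top_k_frequency(ranked: Dict[str, List[str]],
                    truth: Dict[str, str],
                    k: int) -> float:
    """Fraction of query domains whose true fold is among the k best-ranked folds.

    ranked maps each query domain to fold labels sorted from best to worst score;
    truth maps each query domain to its validated fold label.
    """
    if not ranked:
        raise ValueError("no query domains supplied")
    hits = sum(1 for query, folds in ranked.items() if truth[query] in folds[:k])
    return hits / len(ranked)


# Hypothetical toy benchmark: four query domains, three ranked folds each.
ranked = {
    "dom1": ["3.40.50", "3.30.70", "1.10.10"],
    "dom2": ["2.60.40", "2.40.50", "3.40.50"],
    "dom3": ["1.10.10", "3.30.70", "2.60.40"],
    "dom4": ["3.30.70", "1.20.5", "2.40.50"],
}
truth = {
    "dom1": "3.40.50",
    "dom2": "2.40.50",
    "dom3": "1.10.10",
    "dom4": "2.60.40",
}

top1 = top_k_frequency(ranked, truth, k=1)  # correct fold has the top score
top2 = top_k_frequency(ranked, truth, k=2)  # correct fold within the best two
```

With the real benchmark, `k=1` and `k=10` would correspond to the 90% and 98% figures reported above.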
GRATH is available via a server (http://www.biochem.ucl.ac.uk/cgi-bin/cath/Grath.pl). GRATH's speed and accuracy mean that it can be used as a reliable front-end filter for the more accurate, but computationally expensive, residue-based structure comparison algorithm SSAP, currently used to classify domain structures in the CATH database. With an increasing number of structures being solved by the structural genomics initiatives, the GRATH server also provides an essential resource for determining whether newly determined structures are related to any known structures from which functional properties may be inferred.
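The front-end filter idea can be illustrated with a generic two-stage search: a cheap score ranks the whole library, and the expensive comparison runs only on a shortlist. The sketch below is a minimal illustration under stated assumptions and implements neither GRATH nor SSAP. The names `filter_then_refine`, `fast_score` and `slow_score`, the toy domain representation (a name plus a tuple of secondary structure lengths) and both scoring functions are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy representation: (name, lengths of secondary structure elements).
Domain = Tuple[str, Tuple[int, ...]]


def filter_then_refine(query: Domain,
                       library: Sequence[Domain],
                       fast_score: Callable[[Domain, Domain], float],
                       slow_score: Callable[[Domain, Domain], float],
                       shortlist_size: int = 10) -> List[Tuple[Domain, float]]:
    """Rank the library with a cheap score, then rescore only the shortlist."""
    # Stage 1: cheap comparison against every library domain.
    shortlist = sorted(library,
                       key=lambda d: fast_score(query, d),
                       reverse=True)[:shortlist_size]
    # Stage 2: expensive comparison restricted to the shortlisted domains.
    refined = [(d, slow_score(query, d)) for d in shortlist]
    refined.sort(key=lambda pair: pair[1], reverse=True)
    return refined


def fast(q: Domain, d: Domain) -> float:
    """Stand-in cheap score: penalise a differing number of elements."""
    return -abs(len(q[1]) - len(d[1]))


slow_calls: List[str] = []  # records which domains reached the expensive stage


def slow(q: Domain, d: Domain) -> float:
    """Stand-in expensive score: penalise element-wise length differences."""
    slow_calls.append(d[0])
    return -sum(abs(x - y) for x, y in zip(q[1], d[1]))


library: List[Domain] = [
    ("a", (5, 7, 9)),
    ("b", (5, 7)),
    ("c", (12, 3, 4, 8)),
    ("d", (5, 8, 9)),
]
query: Domain = ("query", (5, 7, 10))

result = filter_then_refine(query, library, fast, slow, shortlist_size=2)
```

In this toy run only two of the four library domains reach the expensive stage, which is the saving a fast filter is meant to provide. The trade-off is that a true match ranked below the shortlist cut-off is never rescored, so the shortlist size has to be chosen against the filter's measured recall.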
Contact: harry{at}biochem.ucl.ac.uk
* To whom correspondence should be addressed.