Bioinformatics Advance Access originally published online on November 8, 2005
Bioinformatics 2006 22(2):164-171; doi:10.1093/bioinformatics/bti766
Prediction of functional specificity determinants from protein sequences using log-likelihood ratios
1 Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390-9050, USA
2 Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390-9050, USA
*To whom correspondence should be addressed.
Motivation: A number of methods have been developed to predict functional specificity determinants in protein families based on sequence information. Most of these methods rely on pre-defined functional subgroups. Manual subgroup definition is difficult because of the limited number of experimentally characterized subfamilies with differing specificity, while automatic subgroup partitioning using computational tools is a non-trivial task and does not always yield ideal results.
Results: We propose a new approach, SPEL (specificity positions by evolutionary likelihood), to detect positions that are likely to be functional specificity determinants. SPEL, which does not require subgroup definition, takes a multiple sequence alignment of a protein family as its only input and assigns a P-value to every position in the alignment. Positions with low P-values are likely to be important for functional specificity. An evolutionary tree is reconstructed during the calculation, and P-value estimation is based on a random model that involves evolutionary simulations. Evolutionary log-likelihood is chosen as the measure of amino acid distribution at a position. To illustrate the performance of the method, we carried out a detailed analysis of two protein families (LacI/PurR and G protein α subunit) and compared our method with two existing methods (evolutionary trace and mutual information based). All three methods were also compared on a set of protein families with known ligand-bound structures.
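The abstract does not spell out the random model, so the following is only a minimal sketch of how a per-position P-value could be estimated once simulated scores are available. It compares an observed log-likelihood score for one alignment column with scores from simulated columns. The function name `empirical_p_value`, the example scores, and the choice of the upper tail are assumptions made for illustration; they are not taken from SPEL.

```python
from typing import Sequence


def empirical_p_value(observed: float, simulated: Sequence[float]) -> float:
    """Estimate a P-value as the share of simulated scores >= the observed score.

    Assumption: larger scores are treated as more extreme (upper tail).
    The +1 in numerator and denominator keeps the estimate above zero
    when no simulated score reaches the observed value.
    """
    if not simulated:
        raise ValueError("need at least one simulated score")
    hits = sum(1 for s in simulated if s >= observed)
    return (hits + 1) / (len(simulated) + 1)


# Hypothetical log-likelihood scores for one column (not data from the paper).
null_scores = [-12.4, -11.9, -13.1, -12.7, -12.0, -12.9, -11.5, -12.2, -13.4]
observed_score = -11.0

p = empirical_p_value(observed_score, null_scores)
print(f"P = {p:.2f}")  # prints "P = 0.10"
```

With only nine simulated scores the smallest attainable estimate is 0.10, which shows why many simulations are needed before low P-values can be resolved.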
Availability: SPEL is freely available for non-commercial use. Pre-compiled versions for several platforms and the alignments used in this work are available at ftp://iole.swmed.edu/pub/SPEL/
Contact: grishin{at}chop.swmed.edu.
Supplementary information: Supplementary materials are available at ftp://iole.swmed.edu/pub/SPEL/
Received on August 9, 2005; revised on November 3, 2005; accepted on November 3, 2005