Bioinformatics Advance Access originally published online on May 1, 2008
Bioinformatics 2008 24(13):1473-1480; doi:10.1093/bioinformatics/btn214
Characterization and prediction of residues determining protein functional specificity
Department of Computer Science, Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
*To whom correspondence should be addressed.
Abstract
Motivation: Within a homologous protein family, proteins may be grouped into subtypes that share specific functions that are not common to the entire family. Often, the amino acids present in a small number of sequence positions determine each protein's particular functional specificity. Knowledge of these specificity-determining positions (SDPs) aids in protein function prediction, drug design and experimental analysis. A number of sequence-based computational methods have been introduced for identifying SDPs; however, their further development and evaluation have been hindered by the limited number of known experimentally determined SDPs.
Results: We combine several bioinformatics resources to automate a process, typically undertaken manually, to build a dataset of SDPs. The resulting large dataset, which consists of SDPs in enzymes, enables us to characterize SDPs in terms of their physicochemical and evolutionary properties. It also facilitates the large-scale evaluation of sequence-based SDP prediction methods. We present a simple sequence-based SDP prediction method, GroupSim, and show that, surprisingly, it is competitive with a representative set of current methods. We also describe ConsWin, a heuristic that considers sequence conservation of neighboring amino acids, and demonstrate that it improves the performance of all methods tested on our large dataset of enzyme SDPs.
Availability: Datasets and GroupSim code are available online at http://compbio.cs.princeton.edu/specificity/
Contact: msingh{at}cs.princeton.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Burkhard Rost
Received on March 21, 2008; revised on April 22, 2008; accepted on April 28, 2008
