Bioinformatics Advance Access originally published online on March 29, 2005
Bioinformatics 2005 21(11):2629-2635; doi:10.1093/bioinformatics/bti396
Determining functional specificity from protein sequences
Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
*To whom correspondence should be addressed.
Motivation: Given a large family of homologous protein sequences, many methods can divide the family into smaller groups that correspond to the different functions carried out by proteins within the family. One important problem, however, has been the absence of a general method for selecting an appropriate level of granularity, or size of the groups.
Results: We propose a consistent way of choosing the granularity that is independent of the sequence similarity and sequence clustering method used. We study three large, well-investigated protein families: basic leucine zippers, nuclear receptors and proteins with three consecutive C2H2 zinc fingers. Our method is tested against known functional information, namely the experimentally determined binding specificities, using a simple scoring method. The significance of the groups is also measured by randomizing the data. Finally, we compare our algorithm with a popular method of grouping proteins, the TRIBE-MCL method. We find that dividing the families at the proposed level of granularity creates very significant and useful groups of proteins that correspond to the different DNA-binding motifs. We expect that such groupings will be useful in studying not only DNA binding but also other protein interactions.
Contact: shakhnovich{at}chemistry.harvard.edu
Supplementary information: The supplementary material contains: experimental binding specificities, a list of proteins in the proposed clusters, a table listing the percentage of proteins with binding data and from humans, visualizations of nuclear receptor and zinc finger proteins from humans, gene trees for two families, BLAST results and TRIBE-MCL results.
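The randomization check mentioned in the Results can be illustrated with a generic permutation test. The sketch below is not the authors' scoring method, which the abstract does not specify: the purity score, the function names `purity_score` and `permutation_p_value`, and the toy group and motif labels are all assumptions made for illustration only.

```python
import random
from collections import Counter


def purity_score(groups, labels):
    """Fraction of proteins whose label matches the majority label of their group.

    This is an assumed stand-in for the paper's "simple scoring method".
    """
    by_group = {}
    for group, label in zip(groups, labels):
        by_group.setdefault(group, []).append(label)
    # For each group, count the members that carry its most common label.
    agree = sum(Counter(members).most_common(1)[0][1] for members in by_group.values())
    return agree / len(labels)


def permutation_p_value(groups, labels, n_shuffles=1000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones."""
    rng = random.Random(seed)
    observed = purity_score(groups, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any link between group and label
        if purity_score(groups, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_shuffles + 1)


# Hypothetical toy data: 12 proteins in 3 groups, each with a binding-motif label.
groups = ["g1"] * 4 + ["g2"] * 4 + ["g3"] * 4
labels = ["motifA"] * 4 + ["motifB"] * 4 + ["motifC"] * 4

observed, p_value = permutation_p_value(groups, labels)
print(f"observed score = {observed:.2f}, permutation p-value = {p_value:.4f}")
```

A small p-value here means that random assignments of binding labels rarely match the grouping as well as the real labels do, which is the sense in which a grouping can be called significant.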

