Bioinformatics Advance Access originally published online on March 29, 2005
Bioinformatics 2005 21(11):2629-2635; doi:10.1093/bioinformatics/bti396
A correction to this article has been published.

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

Determining functional specificity from protein sequences

Jason E. Donald and Eugene I. Shakhnovich*

Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA

*To whom correspondence should be addressed.

Motivation: Given a large family of homologous protein sequences, many methods can divide the family into smaller groups that correspond to the different functions carried out by proteins within the family. One important problem, however, has been the absence of a general method for selecting an appropriate level of granularity, or size of the groups.

Results: We propose a consistent way of choosing the granularity that is independent of the sequence similarity and sequence clustering method used. We study three large, well-investigated protein families: basic leucine zippers, nuclear receptors and proteins with three consecutive C2H2 zinc fingers. Our method is tested against known functional information, the experimentally determined binding specificities, using a simple scoring method. The significance of the groups is also measured by randomizing the data. Finally, we compare our algorithm against a popular method of grouping proteins, the TRIBE-MCL method. In the end, we determine that dividing the families at the proposed level of granularity creates very significant and useful groups of proteins that correspond to the different DNA-binding motifs. We expect that such groupings will be useful in studying not only DNA binding but also other protein interactions.
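The idea of scoring groups against known binding specificities and then measuring significance by randomizing the data can be illustrated with a small permutation test. This is only a sketch under stated assumptions: the abstract does not define the scoring method, so the `purity` score below is a hypothetical stand-in, and the function names, protein identifiers and labels are invented for illustration rather than taken from the paper.

```python
import random
from collections import Counter, defaultdict


def purity(clusters, labels):
    """Fraction of labeled proteins that carry the majority label of their cluster.

    clusters: dict mapping protein id -> cluster id
    labels:   dict mapping protein id -> known binding specificity
    NOTE: purity is an assumed stand-in, not the paper's scoring method.
    """
    by_cluster = defaultdict(list)
    for protein, cluster in clusters.items():
        if protein in labels:  # only proteins with experimental data count
            by_cluster[cluster].append(labels[protein])
    total = sum(len(members) for members in by_cluster.values())
    if total == 0:
        return 0.0
    agree = sum(Counter(m).most_common(1)[0][1] for m in by_cluster.values())
    return agree / total


def permutation_p_value(clusters, labels, n_shuffles=1000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones."""
    rng = random.Random(seed)
    observed = purity(clusters, labels)
    proteins = [p for p in clusters if p in labels]
    shuffled_values = [labels[p] for p in proteins]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled_values)  # break the protein-to-label link
        shuffled = dict(zip(proteins, shuffled_values))
        if purity(clusters, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Hypothetical toy data: four proteins, two clusters, two binding motifs.
    toy_clusters = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}
    toy_labels = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
    score, p_value = permutation_p_value(toy_clusters, toy_labels)
    print(score, p_value)
```

With a real family, the same test could be repeated at each candidate level of granularity; a small p-value would suggest that the grouping tracks binding specificity better than chance.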

Contact: shakhnovich{at}chemistry.harvard.edu

Supplementary information: The supplementary material contains: experimental binding specificities, a list of proteins in the proposed clusters, a table listing the percentage of proteins that have binding data and the percentage that are from humans, visualizations of nuclear receptor and zinc finger proteins from humans, gene trees for two families, BLAST results and TRIBE-MCL results.






