Bioinformatics Advance Access originally published online on November 11, 2004
Bioinformatics 2005 21(7):1020-1027; doi:10.1093/bioinformatics/bti135
Predicting fold novelty based on ProtoNet hierarchical classification
1 Department of Biological Chemistry, Institute of Life Sciences, Jerusalem 91904, Israel
2 School of Computer Science and Engineering, The Hebrew University, Jerusalem 91904, Israel
*To whom correspondence should be addressed.
Motivation: Structural genomics projects aim to solve a large number of protein structures, with the ultimate objective of representing the entire protein space. The computational challenge is to identify and prioritize a small set of proteins that are likely to belong to new, currently unknown superfamilies or folds.
Results: We develop a method that assigns each protein a likelihood of belonging to a new, as yet undetermined structural superfamily. The method relies on a variant of ProtoNet, an automatic hierarchical classification scheme of all protein sequences from SwissProt. Our results show that proteins that are remote from solved structures in the ProtoNet hierarchy are more likely to belong to new superfamilies. The results are validated against SCOP releases from recent years, which account for about half of the solved structures known to date. We show that our new method and the ProtoNet representation are superior in detecting new targets compared with our previous method, which used the ProtoMap classification. Furthermore, our method outperforms a PSI-BLAST search in detecting potential new superfamilies.
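As a toy illustration of how remoteness from solved structures in a hierarchy could be turned into a ranking, the sketch below counts how many merge levels a protein must climb in a cluster tree before it reaches a cluster that contains a solved structure. The tree, the protein names, and the functions `mark_solved_clusters` and `remoteness` are hypothetical and are not taken from ProtoNet or ProTarget. The actual method assigns a likelihood, and its details are not given in this abstract.

```python
from typing import Dict, Optional, Set

# Hypothetical toy hierarchy: each node maps to its parent cluster.
# The root has no entry. This is NOT real ProtoNet data.
parent: Dict[str, str] = {
    "A": "root", "B": "root",
    "p1": "A", "p2": "A",
    "C": "B", "p5": "B",
    "p3": "C", "p4": "C",
}

# Assumption: only p1 has a solved structure.
solved: Set[str] = {"p1"}


def mark_solved_clusters(parent: Dict[str, str], solved: Set[str]) -> Set[str]:
    """Return every node whose subtree contains at least one solved protein."""
    marked: Set[str] = set()
    for leaf in solved:
        node: Optional[str] = leaf
        # Stop early if this ancestor chain has already been marked.
        while node is not None and node not in marked:
            marked.add(node)
            node = parent.get(node)
    return marked


def remoteness(protein: str, parent: Dict[str, str], marked: Set[str]) -> Optional[int]:
    """Count the levels climbed before reaching a cluster with a solved structure.

    Returns None when no ancestor cluster contains a solved structure.
    """
    steps = 0
    node: Optional[str] = protein
    while node is not None:
        if node in marked:
            return steps
        node = parent.get(node)
        steps += 1
    return None


marked = mark_solved_clusters(parent, solved)
proteins = ["p1", "p2", "p3", "p4", "p5"]

# Rank candidates so the most remote proteins come first.
ranking = sorted(proteins, key=lambda p: -remoteness(p, parent, marked))
print(ranking)
```

In this toy tree, p3 and p4 must climb three levels before meeting a solved structure, so they rank first. A real scoring scheme would need to account for cluster quality and size, which this sketch ignores.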
Availability: An interactive tool implementing this method, named ProTarget, is available at http://www.protarget.cs.huji.ac.il. It can be used to retrieve a list of candidate proteins for structural genomics projects. Supplementary material is available at http://www.protarget.cs.huji.ac.il/supplement.
Contact: michall{at}cc.huji.ac.il

