Machine Learning in Computational Biology
Rediscovering secondary structures as network motifs: an unsupervised learning approach


1 Department of Computer Science & Applied Mathematics, Weizmann Institute of Science, Rehovot, 76100, Israel
2 Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, 76100, Israel
*To whom correspondence should be addressed.
Abstract
Motivation: Secondary structures are key descriptors of a protein fold and its topology. In recent years, they have facilitated computationally intensive tasks such as finding structural homologues, fold prediction and protein design. Their popularity stems from an appealing regularity in patterns of geometry and chemistry. However, the definition of secondary structures is subjective. An unsupervised de novo discovery of these structures would shed light on their nature and improve the way they are used in structural bioinformatics algorithms.
Methods: We developed a new method for unsupervised partitioning of undirected graphs, based on patterns of small recurring network motifs. Our input was the network of all H-bonds and covalent interactions of protein backbones. This method can also be used for other biological and non-biological networks.
Results: In a fully unsupervised manner, and without assuming any explicit prior knowledge, we were able to rediscover the existence of conventional α-helices, parallel β-sheets, anti-parallel sheets and loops, as well as various non-conventional hybrid structures. The relation between connectivity and crystallographic temperature factors establishes the existence of novel secondary structures.
Contact: barak.raveh{at}weizmann.ac.il; gideon.schreiber{at}weizmann.ac.il
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
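The idea in the Methods paragraph, partitioning an undirected graph by the small recurring motifs each node takes part in, can be illustrated with a toy example. The following is a minimal sketch and not the authors' algorithm: it assumes a residue-level graph (one node per residue), uses short simple cycles as the only motif type, and groups nodes by exact equality of their cycle counts. The function names (`build_backbone_graph`, `cycles_through`, `motif_signature`, `group_by_signature`) and the idealized helix with `i -> i+4` H-bonds are hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_backbone_graph(n_residues, hbonds):
    """Undirected graph: covalent edges (i, i+1) plus the given H-bond pairs."""
    adj = defaultdict(set)
    for i in range(n_residues):
        adj[i]  # make sure isolated residues still appear as nodes
    for i in range(n_residues - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    for a, b in hbonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def cycles_through(adj, start, max_len):
    """Count simple cycles of each length (3..max_len) that pass through `start`."""
    counts = defaultdict(int)

    def dfs(node, visited, depth):
        # `depth` is the number of nodes on the current path, including `start`.
        for nxt in adj[node]:
            if nxt == start and depth >= 3:
                counts[depth] += 1
            elif nxt not in visited and depth < max_len:
                visited.add(nxt)
                dfs(nxt, visited, depth + 1)
                visited.remove(nxt)

    dfs(start, {start}, 1)
    # Every undirected cycle is traversed once in each direction.
    return {length: c // 2 for length, c in counts.items()}


def motif_signature(adj, node, max_len=5):
    """Hashable per-node descriptor: sorted (cycle length, count) pairs."""
    return tuple(sorted(cycles_through(adj, node, max_len).items()))


def group_by_signature(adj, max_len=5):
    """Partition nodes into groups that share an identical motif signature."""
    groups = defaultdict(list)
    for node in sorted(adj):
        groups[motif_signature(adj, node, max_len)].append(node)
    return dict(groups)


# Toy input (assumption): a 24-residue idealized helix with i -> i+4 H-bonds.
helix = build_backbone_graph(24, [(i, i + 4) for i in range(20)])
helix_groups = group_by_signature(helix)

# Control: a bare chain with no H-bonds has no cycles at all.
chain = build_backbone_graph(24, [])
chain_groups = group_by_signature(chain)

for signature, nodes in helix_groups.items():
    print(signature, nodes)
```

In this toy graph, residues deep inside the helix end up with the same signature, residues near the termini fall into separate groups, and the bare chain collapses into a single group with an empty signature. A realistic pipeline would need atom-level backbone graphs, a richer motif vocabulary, and a proper clustering step rather than exact signature matching.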

