Skip Navigation

Bioinformatics 2005 21(Suppl 2):ii59-ii65; doi:10.1093/bioinformatics/bti1110
This Article
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (6)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Tsuda, K.
Right arrow Articles by Schölkopf, B.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Tsuda, K.
Right arrow Articles by Schölkopf, B.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oxfordjournals.org

Fast protein classification with multiple networks

Koji Tsuda 1,2, HyunJung Shin 1,3,* and Bernhard Schölkopf 1

1Max Planck Institute for Biological Cybernetics Spemannstrasse 38, 72076 Tübingen, Germany
2Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology (AIST) 2-42 Aomi Koto-ku, Tokyo, Japan
3Friedrich Miescher Laboratory, Max Planck Society Spemannstrasse 37, 72076, Tübingen, Germany

*To whom correspondence should be addressed.

Motivation: Support vector machines (SVMs) have been successfully used to classify proteins into functional categories. Recently, to integrate multiple data sources, a semidefinite programming (SDP) based SVM method was introduced. In SDP/SVM, multiple kernel matrices corresponding to each of data sources are combined with weights obtained by solving an SDP. However, when trying to apply SDP/SVM to large problems, the computational cost can become prohibitive, since both converting the data to a kernel matrix for the SVM and solving the SDP are time and memory demanding. Another application-specific drawback arises when some of the data sources are protein networks. A common method of converting the network to a kernel matrix is the diffusion kernel method, which has time complexity of O(n3), and produces a dense matrix of size n x n.

Results: We propose an efficient method of protein classification using multiple protein networks. Available protein networks, such as a physical interaction network or a metabolic network, can be directly incorporated. Vectorial data can also be incorporated after conversion into a network by means of neighbor point connection. Similar to the SDP/SVM method, the combination weights are obtained by convex optimization. Due to the sparsity of network edges, the computation time is nearly linear in the number of edges of the combined network. Additionally, the combination weights provide information useful for discarding noisy or irrelevant networks. Experiments on function prediction of 3588 yeast proteins show promising results: the computation time is enormously reduced, while the accuracy is still comparable to the SDP/SVM method.

Availability: Software and data will be available on request.

Contact: shin{at}tuebingen.mpg.de



Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
BioinformaticsHome page
H. N. Chua, W.-K. Sung, and L. Wong
An efficient strategy for extensive integration of diverse biological data for protein function prediction
Bioinformatics, December 15, 2007; 23(24): 3364 - 3373.
[Abstract] [Full Text] [PDF]



Disclaimer:
Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.