Skip Navigation


Bioinformatics Advance Access originally published online on June 28, 2007
Bioinformatics 2007 23(17):2314-2321; doi:10.1093/bioinformatics/btm342
This Article
Right arrow Full Text Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Supplementary data
Right arrowOA All Versions of this Article:
23/17/2314    most recent
btm342v2
btm342v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Google Scholar
Right arrow Articles by Chen, P.-Y.
Right arrow Articles by Reinert, G.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Chen, P.-Y.
Right arrow Articles by Reinert, G.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

A statistical approach using network structure in the prediction of protein characteristics

Pao-Yang Chen *, Charlotte M. Deane and Gesine Reinert

Department of Statistics, University of Oxford, Oxford, OX1 3TG, UK

*To whom correspondence should be addressed.


   Abstract

Motivation: The Majority Vote approach has demonstrated that protein–protein interactions can be used to predict the structure or function of a protein. In this article we propose a novel method for the prediction of such protein characteristics based on frequencies of pairwise interactions. In addition, we study a second new approach using the pattern frequencies of triplets of proteins, thus for the first time taking network structure explicitly into account. Both these methods are extended to jointly consider multiple organisms and multiple characteristics.

Results: Compared to the standard non-network-based method, namely the Majority Vote method, in large networks our predictions tend to be more accurate. For structure prediction, the Frequency-based method reaches up to 71% accuracy, and the Triplet-based method reaches up to 72% accuracy, whereas for function prediction, both the Triplet-based method and the Frequency-based method reach up to 90% accuracy. Function prediction on proteins without homologues showed slightly less but comparable accuracies. Including partially annotated proteins substantially increases the number of proteins for which our methods predict their characteristics with reasonable accuracy. We find that the enhanced Triplet-based method does not currently yield significantly better results than the enhanced Frequency-based method, suggesting that triplets of interactions do not contain substantially more information about protein characteristics than interaction pairs. Our methods offer two main improvements over current approaches—first, multiple protein characteristics are considered simultaneously, and second, data is integrated from multiple species. In addition, the Triplet-based method includes network structure more explicitly than the Majority Vote and the Frequency-based method.

Availability: The program is available upon request.

Contact: pchen{at}stats.ox.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

Associate Editor: Jonathan Wren


Received on February 15, 2007; revised on June 8, 2007; accepted on June 22, 2007

Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?




Disclaimer: Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.