

Bioinformatics Advance Access first published online on June 28, 2007
This version published online on June 30, 2007

Bioinformatics, doi:10.1093/bioinformatics/btm342

© The Author (2007). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

A statistical approach using network structure in the prediction of protein characteristics

Pao-Yang Chen *, Charlotte M. Deane and Gesine Reinert †

Department of Statistics, University of Oxford, Oxford, OX1 3TG, UK.

*To whom correspondence should be addressed. Pao-Yang Chen, E-mail: pchen@stats.ox.ac.uk


Abstract

Motivation: The Majority Vote approach has demonstrated that protein-protein interactions can be used to predict the structure or function of a protein. In this paper we propose a novel method for the prediction of such protein characteristics based on frequencies of pairwise interactions. In addition, we study a second new approach using the pattern frequencies of triplets of proteins, thus for the first time taking network structure explicitly into account. Both these methods are extended to jointly consider multiple organisms and multiple characteristics.
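As a rough illustration of the two pairwise ideas mentioned above, the following Python sketch contrasts a Majority Vote prediction with a simple score built from the frequencies of annotated interaction pairs. The abstract does not specify how the frequency-based score is computed, so the unnormalised counting used here, the function names (majority_vote, pair_frequencies, frequency_scores) and the toy network are all assumptions made for illustration and should not be read as the authors' method.

```python
from collections import Counter


def neighbours_of(protein, edges):
    """Return the set of interaction partners of `protein` (self-loops ignored)."""
    partners = set()
    for a, b in edges:
        if a == protein and b != protein:
            partners.add(b)
        elif b == protein and a != protein:
            partners.add(a)
    return partners


def majority_vote(protein, edges, labels):
    """Predict the most common label among the annotated neighbours."""
    votes = Counter(labels[n] for n in neighbours_of(protein, edges) if n in labels)
    if not votes:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return min(votes, key=lambda lab: (-votes[lab], lab))


def pair_frequencies(edges, labels):
    """Count unordered label pairs over interactions whose two ends are annotated."""
    freq = Counter()
    for a, b in edges:
        if a in labels and b in labels:
            freq[frozenset((labels[a], labels[b]))] += 1
    return freq


def frequency_scores(protein, edges, labels, candidates):
    """Score each candidate label by how often it is seen interacting with
    the labels of the protein's annotated neighbours (assumed, unnormalised score)."""
    freq = pair_frequencies(edges, labels)
    scores = {c: 0 for c in candidates}
    for n in neighbours_of(protein, edges):
        if n in labels:
            for c in candidates:
                scores[c] += freq[frozenset((c, labels[n]))]
    return scores


# Hypothetical toy network: P1 is the unannotated query protein.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
labels = {"P2": "kinase", "P3": "kinase", "P4": "transport", "P5": "transport"}

vote = majority_vote("P1", edges, labels)
scores = frequency_scores("P1", edges, labels, ["kinase", "transport"])
print(vote, scores)
```

Majority Vote uses only the labels of the direct neighbours, whereas the frequency score also draws on how often pairs of labels co-occur across the rest of the annotated network, which is the kind of information a frequency-based approach can exploit.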

Results: Compared to the standard non-network-based method, namely the Majority Vote method, our predictions tend to be more accurate in large networks. For structure prediction, the frequency-based method reaches up to 71% accuracy and the triplet-based method up to 72% accuracy, whereas for function prediction both the triplet-based method and the frequency-based method reach up to 90% accuracy. Function prediction on proteins without homologs showed slightly lower but comparable accuracies. Including partially annotated proteins substantially increases the number of proteins whose characteristics our methods predict with reasonable accuracy. We find that the enhanced triplet-based method does not currently yield significantly better results than the enhanced frequency-based method, suggesting that triplets of interactions do not contain substantially more information about protein characteristics than interaction pairs. Our methods offer two main improvements over current approaches: firstly, multiple protein characteristics are considered simultaneously, and secondly, data is integrated from multiple species. In addition, the triplet-based method includes network structure more explicitly than the Majority Vote and the frequency-based method.

Availability: The program is available upon request.

Associate Editor: Dr. Jonathan Wren

† Funded in part by MMCOMNET Grant No. FP6-2003-BEST-Path-012999.


Received on February 15, 2007; revised on June 8, 2007; accepted on June 22, 2007




