Skip Navigation


Bioinformatics Advance Access originally published online on November 3, 2005
Bioinformatics 2006 22(2):248-250; doi:10.1093/bioinformatics/bti757
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Supplementary Data
Right arrow All Versions of this Article:
22/2/248    most recent
bti757v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (6)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Baudot, A.
Right arrow Articles by Brun, C.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Baudot, A.
Right arrow Articles by Brun, C.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

Published by Oxford University Press 2005.

PRODISTIN Web Site: a tool for the functional classification of proteins from interaction networks

Anaïs Baudot 1,{dagger}, David Martin 1,4,{dagger}, Pierre Mouren 1,{dagger}, François Chevenet 3, Alain Guénoche 2, Bernard Jacq 1 and Christine Brun 1,*

1LGPD/IBDM UMR6545 CNRS
2Institut de Mathématiques de Luminy, UPR9016 CNRS, Parc Scientifique et Technologique de Luminy Case 907, 13288 Marseille cedex 9, France
3GEMI, UMR IRD/CNRS9926 IRD - BP 64501, 911 Avenue Agropolis, 34394 Montpellier cedex 5, France

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 SERVER OVERVIEW
 APPLICATION
 CONCLUSION
 REFERENCES
 

Summary: The PRODISTIN Web Site is a web service allowing users to functionally classify genes/proteins from any type of interaction network. The resulting computation provides a classification tree in which (1) genes/proteins are clustered according to the identity of their interaction partners and (2) functional classes are delineated in the tree using the Biological Process Gene Ontology annotations.

Availabitily: The PRODISTIN Web Site is freely accessible at http://gin.univ-mrs.fr/webdistin

Contact: brun{at}ibdm.univ-mrs.fr


    INTRODUCTION
 TOP
 ABSTRACT
 INTRODUCTION
 SERVER OVERVIEW
 APPLICATION
 CONCLUSION
 REFERENCES
 
Protein–protein interaction maps are now available for at least four eukaryotic model organisms: the budding yeast (Ito et al., 2001; Uetz et al., 2000), the worm (Li et al., 2004), the fruit fly (Formstecher et al., 2005; Giot et al., 2003) and human (Stelzl et al., 2005; Rual et al., 2005). These maps form large intricate networks leading to a renewed vision of cell biology as an integrated system. However, extracting and revealing the functional information they contain depends on our ability to analyze them in detail. Indeed, although they are far from being complete, the size and the complexity of these networks (~6000, 5000, 20 000, 5500 interactions/graph edges in yeast, worm, fly and human, respectively) make their functional analysis a difficult task.

In order to extract this information, we developed two years ago a bioinformatic method named PRODISTIN (PROtein DISTance based on INteractions) which allows a functional classification of the proteins according to the identity of their interacting partners (Brun et al., 2003). The central idea in this interaction-based functional clustering is to compare interaction partners for all protein pairs, assuming that the more two proteins share interacting partners, the more they should be functionally related. Applying it to the yeast protein–protein interaction network, we have previously shown that the method (1) clusters proteins participating in the same cellular processes in the same functional classes, (2) predicts function for unknown proteins and (3) is statistically valid (Brun et al., 2003). In addition, we showed that PRODISTIN can successfully be used to study the evolutionary fates of the yeast duplicated genes (Baudot et al., 2004). We now propose an automated version of the PRODISTIN method for the community, through a web interface called Prodistin Web Site (PWS).


    SERVER OVERVIEW
 TOP
 ABSTRACT
 INTRODUCTION
 SERVER OVERVIEW
 APPLICATION
 CONCLUSION
 REFERENCES
 
This server can be used to functionally classify genes/proteins from interaction networks. The resulting computation provides the user with a classification tree in which the network genes/proteins are clustered according to their functional similarity (Step 1, see below) and the functional classes are identified using the Biological Process Gene Ontology annotations of the genes/proteins (Step 2, see below).

The PWS backbone is mainly written in PHP with calls to Perl scripts and C programs.

Step 1. The user enters the interaction network to be analyzed as a list of binary interactions and specifies a minimal connectivity threshold for genes/proteins to be classified. The use of this threshold is intended to eliminate poorly connected genes/proteins from the classification process for which the functional analysis is likely to be biased by the presence of putative false-positive interactions (default value: 3).

The server then computes the Czekanowski–Dice distance between all possible pairs of proteins (for details, see Brun et al., 2003). The obtained values are subsequently clustered using the BioNJ algorithm (Gascuel, 1997), leading to a classification tree (Fig. 1, Step 1).


Figure 1
View larger version (74K):
[in this window]
[in a new window]
 
Fig. 1 Synopsis of the Prodistin Web Site and output.

 
Step 2. Functional classes (Prodistin classes) are then identified in the tree. They correspond to the largest possible subtrees composed of at least X proteins sharing the same Biological Process GO annotations and representing at least Y% of the individual class members for which an annotation is available (X and Y are user defined). For this, the PWS is taking into account not only the GO term(s) annotating the genes/proteins contained in the tree but also the complete hierarchy of parent terms. This hierarchy is retrieved by PWS through GOToolBox, a software suite devoted to gene sets analysis based on associated GO terms that we developed recently (Martin et al., 2004). Once the terms shared by at least X genes/proteins in the same subtree are selected, the more precise ones (the deeper in the ontology) are kept to annotate the class. In order to assess the statistical quality of the annotation(s) attributed to the functional classes by the method, a P-value of the over-representation of the proposed GO term(s) in the class compared with the tree is calculated using the hypergeometric distribution. This calculation constitutes an optional step of the computation (Step 3).

In metazoan organisms, genes/proteins may be annotated with a very large number of GO terms encompassing a broad range of biological processes. For instance, the Drosophila wingless gene is annotated with 48 Biological Process GO terms. Because this situation may blur the network analysis, it may be useful to identify functional classes with only subparts of the Biological Process ontology. For this purpose, the user can select a GO ID corresponding to the GO term to be used as a root term for the subpart of the ontology. The default root value corresponds to the complete Biological Process ontology root term.

The result of the computation is then visualized as a coloured classification tree using an integrated TreeDyn module (http://www.treedyn.org) as a tree viewer (Fig. 1, Step 2). All intermediate files created during the computation are downloadable for further investigations. The provided formats allow the user to load the generated files into an external TreeDyn program for a subsequent deeper analysis of the obtained results.


    APPLICATION
 TOP
 ABSTRACT
 INTRODUCTION
 SERVER OVERVIEW
 APPLICATION
 CONCLUSION
 REFERENCES
 
The Prodistin Web Site can be used to analyse any kind of biological networks (protein–protein, genetic, protein–DNA). The class annotation process is available for all model organisms supported by GOToolBox (currently Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Mus musculus, Rattus norvegicus and Saccharomyces cerevisiae).


    CONCLUSION
 TOP
 ABSTRACT
 INTRODUCTION
 SERVER OVERVIEW
 APPLICATION
 CONCLUSION
 REFERENCES
 
Prodistin Web Site is a tool allowing the extraction of a part of the functional information contained in interaction networks. Functional classes are identified by a computation of the network and functional annotations are provided with all of them. The combination of these two features makes PWS different from other existing tools. Indeed, for instance, the ‘Significant Attribute Plugin’ provided with Cytoscape (Shannon et al., 2003) is designed to search for aggregation of attribute values such as annotations in networks without any computation of the network itself, and MCODE (Bader and Hogue, 2003) proposes a computation of the network for the identification of modules without any functional annotations. For these particular reasons, PWS should be valuable to the research community.


    Acknowledgments
 
A.B. and D.M. were supported by fellowships from the French ‘Ministère de l'Enseignement Supérieur et de la Recherche’. This work was supported by an ACI IMPBio grant (EIDIPP project) to B.J.

Conflict of Interest: none declared.


    FOOTNOTES
 
{dagger}These authors wish it to be known that, in their opinion, the first three authors should be regarded as joint First Authors. Back

4Present address: Centre for Molecular Medicine and Therapeutics, University of British Columbia Vancouver, BC, Canada. Back

Associate Editor: John Quackenbush

Received on July 28, 2005; revised on October 17, 2005; accepted on October 28, 2005

    REFERENCES
 TOP
 ABSTRACT
 INTRODUCTION
 SERVER OVERVIEW
 APPLICATION
 CONCLUSION
 REFERENCES
 

    Bader, G.D. and Hogue, C.W. (2003) An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinfoinformatics, 4, 2.

    Baudot, A., et al. (2004) A scale of functional divergence for yeast duplicated genes revealed from analysis of the protein–protein interaction network. Genome Biol, . 5, R76[CrossRef][Medline].

    Brun, C., et al. (2003) Functional classification of proteins for the prediction of cellular function from a protein–protein interaction network. Genome Biol, . 5, R6[CrossRef][Medline].

    Formstecher, E., et al. (2005) Protein interaction mapping: a Drosophila case study. Genome Res, . 15, 376–384[Abstract/Free Full Text].

    Gascuel, O. (1997) BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol, . 14, 685–695[Abstract].

    Giot, L., et al. (2003) A protein interaction map of Drosophila melanogaster. Science, 302, 1727–1736[Abstract/Free Full Text].

    Ito, T., et al. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl Acad. Sci. USA, 98, 4569–4574[Abstract/Free Full Text].

    Li, S., et al. (2004) A map of the interactome network of the metazoan C. elegans. Science, 303, 540–543[Abstract/Free Full Text].

    Martin, D., et al. (2004) GOToolBox: functional analysis of gene datasets based on Gene Ontology. Genome Biol, . 5, R101[CrossRef][Medline].

    Rual, J.-F., et al. (2005) Towards a proteome-scale map of the human protein–protein interaction network. Nature, 437, 1173–1178[CrossRef][Medline].

    Shannon, P., et al. (2003) Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res, . 13, 2498–2504[Abstract/Free Full Text].

    Stelzl, U., et al. (2005) A human protein–protein interaction network: a resource for annotating the proteome. Cell, 122, 957–968[CrossRef][ISI][Medline].

    Uetz, P., et al. (2000) A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae. Nature, 403, 623–627[CrossRef][Medline].


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
ScienceHome page
W. Zhong and P. W. Sternberg
Genome-wide prediction of C. elegans genetic interactions.
Science, March 10, 2006; 311(5766): 1481 - 1484.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Supplementary Data
Right arrow All Versions of this Article:
22/2/248    most recent
bti757v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (6)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Baudot, A.
Right arrow Articles by Brun, C.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Baudot, A.
Right arrow Articles by Brun, C.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?