Bioinformatics Advance Access originally published online on March 1, 2006
Bioinformatics 2006 22(8):1015-1017; doi:10.1093/bioinformatics/btl072

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

PIANA: protein interactions and network analysis

Ramon Aragues , Daniel Jaeggi and Baldo Oliva *

Structural Bioinformatics Group (GRIB–IMIM). Departament de Ciències Experimentals i de la Salut. Universitat Pompeu Fabra C/Doctor Aiguader, 83, Barcelona 08003, Catalonia, Spain

*To whom correspondence should be addressed.


    ABSTRACT

Summary: We present a software framework and tool called Protein Interactions And Network Analysis (PIANA) that facilitates working with protein interaction networks by (1) integrating data from multiple sources, (2) providing a library that handles graph-related tasks and (3) automating the analysis of protein–protein interaction networks. PIANA can also be used as a stand-alone application to create protein interaction networks and perform tasks such as predicting protein interactions and helping to identify spots in a 2D electrophoresis gel.

Availability: PIANA is under the GNU GPL. Source code, database and detailed documentation may be freely downloaded from http://sbi.imim.es/piana.

Contact: ramon.aragues{at}upf.edu; boliva{at}imim.es


    1 INTRODUCTION
The analysis of protein interaction networks is fundamental to the understanding of cellular processes (Salwinski and Eisenberg, 2003; Yook et al., 2004). Furthermore, protein interaction networks are being used in tasks such as assigning function to uncharacterized proteins (Huynen et al., 2003) and searching for remote similarities between proteins (Espadaler et al., 2005a). Tools developed to visualize and analyze protein–protein interaction networks include Cytoscape (Shannon et al., 2003), Osprey (Breitkreutz et al., 2003), VisANT (Hu et al., 2005) and ProViz (Iragne et al., 2005). Most of these tools focus on visualizing the networks, and only a few of them have analytic capabilities.

Protein Interactions And Network Analysis (PIANA) is a software framework that integrates data from multiple sources into a single repository, creates interaction networks, predicts novel interactions and performs automatic analyses. PIANA is different from most other tools in that (1) it is also a framework on which developers can base their applications, (2) it integrates most protein and interaction databases into a single repository and (3) it performs analyses not provided by other tools.


    2 PIANA ARCHITECTURE
PIANA has been implemented as a collection of Python modules that can be used separately as libraries or as a stand-alone application through a user interface.

The database module
The database module consists of a MySQL database and a library used as an interface to the database. A limited version of a PIANA MySQL database containing interactions from DIP (Salwinski et al., 2004) and interactions predicted from sequence/structure distant patterns (Espadaler et al., 2005b) can be downloaded from our website.

The parsing module
PIANA includes parsers for the main protein databases [UniProt (Bairoch et al., 2005), NCBI GenBank (Benson et al., 2005)] and for protein interaction repositories such as DIP, STRING (von Mering et al., 2003), MIPS (Pagel et al., 2005), BIND (Alfarano et al., 2005) and HPRD (Peri et al., 2003). PIANA can also parse flat text files and interaction data that follow the HUPO PSI MI standard (Hermjakob et al., 2004). Moreover, PIANA provides parsers for databases such as COG (Tatusov et al., 2003), GO (Ashburner et al., 2000) and SCOP (Murzin et al., 1995). These databases contain information that PIANA uses when performing the analyses.

The network module
PIANA implements classes and methods for working with networks. Moreover, PIANA has methods specifically designed for biological networks such as clustering proteins by their molecular function and visualizing the networks in formats appropriate for biological analysis.


    3 PIANA CAPABILITIES
Data integration
PIANA accepts as input most types of protein database identifiers and contains cross-references between them. Therefore, interactions from different sources can be integrated into a single network. Currently, the types of input and output protein identifiers accepted by PIANA are UniProt entry names and accession numbers, gene names, NCBI GenBank gi, EMBL, PDB, PIR and the protein sequence.
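
As an illustration of how cross-references allow interactions from different sources to be merged, the following minimal sketch maps external identifiers onto a single internal key before combining edges. It is not PIANA code: the function name `integrate`, the cross-reference table layout and all identifiers (`ACC_A`, `GENE_A`, `db1`, ...) are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical cross-reference table:
# (identifier type, identifier) -> internal protein key.
XREF = {
    ("uniprot_acc", "ACC_A"): 1,
    ("gene_name", "GENE_A"): 1,
    ("uniprot_acc", "ACC_B"): 2,
    ("gene_name", "GENE_B"): 2,
}


def integrate(sources, xref):
    """Merge interactions from several sources into one network.

    sources: iterable of (source_name, identifier_type, list of id pairs).
    Returns a dict mapping each undirected edge (a frozenset of internal
    keys) to the set of sources that report it.
    """
    merged = defaultdict(set)
    for source_name, id_type, pairs in sources:
        for id_a, id_b in pairs:
            key_a = xref.get((id_type, id_a))
            key_b = xref.get((id_type, id_b))
            if key_a is None or key_b is None:
                continue  # no cross-reference available for this identifier
            merged[frozenset((key_a, key_b))].add(source_name)
    return dict(merged)


# Toy input: the same interaction reported with two identifier types.
toy_sources = [
    ("db1", "uniprot_acc", [("ACC_A", "ACC_B")]),
    ("db2", "gene_name", [("GENE_B", "GENE_A")]),
]
toy_network = integrate(toy_sources, XREF)
```

In this toy case both sources collapse onto one edge between internal keys 1 and 2, annotated with both source names.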

Creation of protein–protein interaction networks
Usually, a list of proteins of interest is given as input. PIANA searches its database for interactions in which these proteins are involved and adds edges (i.e. interactions) and nodes (i.e. protein interaction partners) to the network until a given depth is reached, where depth is defined as the number of interaction steps taken from the original proteins. Internally, a protein interaction network is represented as a set of nodes (proteins) connected by edges (interactions). The networks can be output in different formats, mainly tables that describe each interaction in detail and DOT files, which can be used to produce network images. PIANA can also apply output filters, such as highlighting proteins that perform specific functions or identifying proteins in the network whose genes have been found over- or under-expressed in a microarray experiment.
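
The depth-limited expansion and the DOT output described above can be sketched as follows. This is an illustrative reimplementation under simple assumptions (an in-memory list of interaction pairs instead of the MySQL database), not the PIANA implementation; `build_network` and `to_dot` are hypothetical names.

```python
def build_network(seeds, interactions, depth):
    """Expand from the seed proteins for `depth` interaction steps.

    interactions: iterable of (protein_a, protein_b) pairs.
    Returns (nodes, edges), with each edge stored as a sorted tuple.
    """
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    nodes = set(seeds)
    edges = set()
    frontier = set(seeds)
    for _ in range(depth):
        next_frontier = set()
        for protein in frontier:
            for partner in neighbours.get(protein, ()):
                edges.add(tuple(sorted((protein, partner))))
                if partner not in nodes:
                    nodes.add(partner)
                    next_frontier.add(partner)
        frontier = next_frontier
    return nodes, edges


def to_dot(nodes, edges):
    """Serialize the network as an undirected graph in DOT syntax."""
    lines = ["graph network {"]
    lines += [f'  "{node}";' for node in sorted(nodes)]
    lines += [f'  "{a}" -- "{b}";' for a, b in sorted(edges)]
    lines.append("}")
    return "\n".join(lines)


# Toy chain A - B - C - D, expanded from seed A.
toy_interactions = [("A", "B"), ("B", "C"), ("C", "D")]
nodes_d1, edges_d1 = build_network(["A"], toy_interactions, depth=1)
nodes_d2, edges_d2 = build_network(["A"], toy_interactions, depth=2)
dot_text = to_dot(nodes_d2, edges_d2)
```

The resulting text can be passed to a DOT renderer such as Graphviz to produce an image of the network.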

Predicting novel interactions
PIANA transfers interactions between proteins that share a given property. For example, PIANA predicts interactions using ‘interologs’ (Yu et al., 2004) by means of COG codes. In a similar way, SCOP codes can be used to transfer interactions between proteins that share a domain family.
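
A minimal sketch of this kind of transfer is shown below, assuming that each protein carries a set of property codes (for example COG or SCOP codes) and a species label. The function name, the data layout and the restriction to pairs from the same species are assumptions made for the illustration, not a description of PIANA internals.

```python
def predict_interactions(known_interactions, codes, species):
    """Transfer known interactions to proteins that share a property code.

    known_interactions: list of (protein_a, protein_b) pairs.
    codes: protein -> set of property codes (e.g. COG codes).
    species: protein -> species label; a pair is predicted only when
             both proteins belong to the same, known species.
    """
    members = {}
    for protein, protein_codes in codes.items():
        for code in protein_codes:
            members.setdefault(code, set()).add(protein)

    def equivalents(protein):
        # The protein itself plus every protein sharing one of its codes.
        result = {protein}
        for code in codes.get(protein, ()):
            result |= members[code]
        return result

    known = {frozenset(pair) for pair in known_interactions}
    predicted = set()
    for a, b in known_interactions:
        for x in equivalents(a):
            for y in equivalents(b):
                pair = frozenset((x, y))
                if x == y or pair in known:
                    continue
                if species.get(x) is None or species.get(x) != species.get(y):
                    continue
                predicted.add(pair)
    return predicted


# Toy data: one known interaction in species 1, orthologs in species 2.
toy_known = [("s1_A", "s1_B")]
toy_codes = {
    "s1_A": {"CODE1"}, "s2_A": {"CODE1"},
    "s1_B": {"CODE2"}, "s2_B": {"CODE2"},
}
toy_species = {"s1_A": "sp1", "s1_B": "sp1", "s2_A": "sp2", "s2_B": "sp2"}
toy_predicted = predict_interactions(toy_known, toy_codes, toy_species)
```

Here the interaction between `s1_A` and `s1_B` is transferred to the pair `s2_A` and `s2_B`, which share the same codes in the other species.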

Finding ‘interaction distance’ between proteins
PIANA can obtain lists of proteins that are at a certain interaction distance (i.e. minimum number of edges separating two proteins) from another protein, which can be useful for tasks such as searching for remote similarities between proteins (Espadaler et al., 2005a).
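
With the interaction distance defined as the minimum number of edges between two proteins, the proteins at a given distance can be found with a breadth-first search. The sketch below is a generic illustration on an in-memory edge list; the function name `proteins_at_distance` is hypothetical.

```python
from collections import deque


def proteins_at_distance(interactions, source, distance):
    """Return the proteins whose shortest path to `source` has `distance` edges."""
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    shortest = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if shortest[current] == distance:
            continue  # no need to explore beyond the requested distance
        for partner in neighbours.get(current, ()):
            if partner not in shortest:
                shortest[partner] = shortest[current] + 1
                queue.append(partner)
    return {p for p, d in shortest.items() if d == distance}


# Toy network: A-B, B-C, C-D and a shortcut A-C.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
at_one = proteins_at_distance(toy_edges, "A", 1)
at_two = proteins_at_distance(toy_edges, "A", 2)
```

Because of the shortcut A-C, protein D is at distance 2 from A rather than 3.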

Matching spots from electrophoresis experiments
PIANA can be used to help identify spots in a 2D electrophoresis gel. Spots not identified by mass spectrometry are putatively assigned to proteins in the network by comparing their molecular weights and isoelectric points.
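
The comparison of molecular weights and isoelectric points can be sketched as below. The tolerance values, the ranking of candidates and the function name `match_spots` are assumptions chosen for the illustration; the source does not state which thresholds PIANA uses.

```python
def match_spots(spots, proteins, mw_tolerance=0.1, pi_tolerance=0.5):
    """Propose candidate proteins for each unidentified gel spot.

    spots: spot id -> (molecular weight, isoelectric point).
    proteins: protein name -> (molecular weight, isoelectric point),
              in the same units as the spots.
    mw_tolerance: maximum relative difference in molecular weight (assumed).
    pi_tolerance: maximum absolute difference in isoelectric point (assumed).
    Returns spot id -> list of candidate names, closest molecular weight first.
    """
    matches = {}
    for spot_id, (spot_mw, spot_pi) in spots.items():
        candidates = []
        for name, (mw, pi) in proteins.items():
            mw_error = abs(mw - spot_mw) / spot_mw
            pi_error = abs(pi - spot_pi)
            if mw_error <= mw_tolerance and pi_error <= pi_tolerance:
                candidates.append((mw_error, pi_error, name))
        matches[spot_id] = [name for _, _, name in sorted(candidates)]
    return matches


# Toy values only; they do not correspond to real proteins.
toy_spots = {"spot1": (50.0, 6.0)}
toy_proteins = {"P1": (52.0, 6.2), "P2": (80.0, 6.0), "P3": (50.5, 5.9)}
toy_matches = match_spots(toy_spots, toy_proteins)
```

Such assignments remain putative: several proteins in the network may fall within the tolerances of the same spot.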

Clustering proteins by their GO terms
Networks can become very complex, and clustering methods are therefore needed to facilitate their interpretation. PIANA provides a library for applying agglomerative hierarchical clustering to protein interaction networks. For example, using the annotation provided by GO, PIANA groups into clusters those proteins in the network that have similar biological processes or molecular functions. The distance function used for this clustering is based on the length of the path between the GO terms in the GO hierarchical tree. The stop condition is set by the user by means of two thresholds: the minimum similarity accepted in order to group two clusters and the minimum distance from the terms in the cluster to the GO root term.
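
The following sketch illustrates agglomerative clustering with a path-length distance on a small term tree. It makes several simplifying assumptions that are not taken from PIANA: each protein has exactly one term, each term has a single parent, clusters are compared by single linkage, and only one stop threshold (a maximum path distance, standing in for the minimum similarity) is applied; the threshold on the distance to the root term is omitted. Term names are toy placeholders.

```python
def path_to_root(term, parent):
    """Return the list of terms from `term` up to the root, `term` first."""
    path = [term]
    while term in parent:
        term = parent[term]
        path.append(term)
    return path


def term_distance(term_a, term_b, parent):
    """Number of edges on the tree path between two terms."""
    steps_from_a = {t: i for i, t in enumerate(path_to_root(term_a, parent))}
    for steps_from_b, term in enumerate(path_to_root(term_b, parent)):
        if term in steps_from_a:
            return steps_from_a[term] + steps_from_b
    raise ValueError("terms do not share a root")


def cluster_proteins(annotation, parent, max_distance):
    """Single-linkage agglomerative clustering of proteins by term distance.

    annotation: protein -> term; parent: term -> parent term.
    Merging stops when the closest pair of clusters is farther apart
    than `max_distance`.
    """
    clusters = [{protein} for protein in sorted(annotation)]

    def linkage(c1, c2):
        return min(term_distance(annotation[x], annotation[y], parent)
                   for x in c1 for y in c2)

    while len(clusters) > 1:
        distance, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if distance > max_distance:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Toy term tree and annotation.
toy_parent = {
    "binding": "root", "catalysis": "root",
    "protein_binding": "binding", "dna_binding": "binding",
    "kinase": "catalysis",
}
toy_annotation = {"P1": "protein_binding", "P2": "dna_binding", "P3": "kinase"}
toy_clusters = cluster_proteins(toy_annotation, toy_parent, max_distance=2)
```

With a maximum distance of 2, the two binding proteins are grouped together while the kinase stays in its own cluster, since its term is four edges away from either binding term.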

Extending PIANA
New functionality can be added to PIANA by extending the current Python classes. Moreover, PIANA implements a class called PianaApi that can be used from other Python programs to work with interaction networks.


    4 EXAMPLE
We illustrate the use of PIANA with two genes (MMP1 and LTBP1) that have been found to mediate breast cancer metastasis to lung (Minn et al., 2005). First, we create a PIANA configuration file in which we set (1) the input parameters (e.g. input proteins and network depth), (2) the output parameters (e.g. type of protein identifiers to be used) and (3) the PIANA commands to execute (e.g. create network for the proteins and predict interactions based on interologs). Then, we run PIANA with the configuration file as an argument. Figure 1 shows the protein interaction network for MMP1 and LTBP1 (a) before and (b) after adding predictions based on interologs. A detailed PIANA example using all the genes from Minn et al. (2005) and performing an in-depth analysis of the interaction network can be found at http://sbi.imim.es/piana/example.html.


Fig. 1 (a) Protein interaction network for MMP1 and LTBP1 and (b) network obtained after adding predictions based on interologs.

 
Furthermore, PIANA has been previously used for the study of biological pathways in breast cancer cells (España et al., 2005).


    5 FUTURE WORK
Future plans for PIANA include annotating proteins based on network motifs, predicting protein structure using interactions (Espadaler et al., 2005b) and developing a reliability score for interactions. We also intend to introduce algorithms that split proteins into the domains that perform the interactions.


    Acknowledgments
 
The authors thank J. Planas, P. Boixeda, B. Gregori and L. Salwinski for their helpful comments. R.A. is supported by a grant from the Spanish Ministerio de Ciencia y Tecnología (MCyT, BIO2002-03609). This work has been supported by grants from Fundación Ramón Areces, from the Spanish Ministerio de Educación y Ciencia (MEC, BIO02005-00533), the ‘Programa Gaspar de Portolà (DURSI)’, and by EU grant INFOBIOMED-NoE (IST-507585).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Jonathan Wren

Received on December 16, 2005; revised on February 8, 2006; accepted on February 23, 2006

    REFERENCES

    Alfarano, C., et al. (2005) The Biomolecular Interaction Network Database and related tools 2005 update. Nucleic Acids Res., 33, D418–D424.

    Ashburner, M., et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet., 25, 25–29.

    Bairoch, A., et al. (2005) The Universal Protein Resource (UniProt). Nucleic Acids Res., 33, D154–D159.

    Benson, D.A., et al. (2005) GenBank. Nucleic Acids Res., 33, D34–D38.

    Breitkreutz, B.J., et al. (2003) Osprey: a network visualization system. Genome Biol., 4, R22.

    Espadaler, J., et al. (2005a) Detecting remotely related proteins by their interactions and sequence similarity. Proc. Natl Acad. Sci. USA, 102, 7151–7156. [Erratum (2005) Proc. Natl Acad. Sci. USA, 102, 9429.]

    Espadaler, J., et al. (2005b) Prediction of protein–protein interactions using distant conservation of sequence patterns and structure relationships. Bioinformatics, 21, 3360–3368.

    España, L., et al. (2005) Bcl-x(L)-mediated changes in metabolic pathways of breast cancer cells: from survival in the blood stream to organ-specific metastasis. Am. J. Pathol., 167, 1125–1137.

    Hermjakob, H., et al. (2004) The HUPO PSI's molecular interaction format—a community standard for the representation of protein interaction data. Nat. Biotechnol., 22, 177–183.

    Hu, Z., et al. (2005) VisANT: data-integrating visual framework for biological networks and modules. Nucleic Acids Res., 33, W352–W357.

    Huynen, M.A., et al. (2003) Function prediction and protein networks. Curr. Opin. Cell Biol., 15, 191–198.

    Iragne, F., et al. (2005) ProViz: protein interaction visualization and exploration. Bioinformatics, 21, 272–274.

    Minn, A.J., et al. (2005) Genes that mediate breast cancer metastasis to lung. Nature, 436, 518–524.

    Murzin, A.G., et al. (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol., 247, 536–540.

    Pagel, P., et al. (2005) The MIPS mammalian protein–protein interaction database. Bioinformatics, 21, 832–834.

    Peri, S., et al. (2003) Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res., 13, 2363–2371.

    Salwinski, L. and Eisenberg, D. (2003) Computational methods of analysis of protein–protein interactions. Curr. Opin. Struct. Biol., 13, 377–382.

    Salwinski, L., et al. (2004) The Database of Interacting Proteins. Nucleic Acids Res., 32, D449–D451.

    Shannon, P., et al. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res., 13, 2498–2504.

    Tatusov, R.L., et al. (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics, 4, 41.

    von Mering, C., et al. (2003) STRING: a database of predicted functional associations between proteins. Nucleic Acids Res., 31, 258–261.

    Yook, S.H., et al. (2004) Functional and topological characterization of protein interaction networks. Proteomics, 4, 928–942.

    Yu, H., et al. (2004) Annotation transfer between genomes: protein–protein interologs and protein–DNA regulogs. Genome Res., 14, 1107–1118.

