Bioinformatics Advance Access originally published online on October 4, 2006
Bioinformatics 2006 22(23):2968-2970; doi:10.1093/bioinformatics/btl488
The tYNA platform for comparative interactomics: a web tool for managing, comparing and mining multiple networks
1 Department of Computer Science, Yale University, 51 Prospect Street, New Haven, CT 06511, USA
2 Department of Cancer Biology, Dana-Farber Cancer Institute, 1 Jimmy Fund Way, SM854, Boston, MA 02115, USA
3 Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520, USA
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Summary: Biological processes involve complex networks of interactions between molecules. Various large-scale experiments and curation efforts have led to preliminary versions of complete cellular networks for a number of organisms. To grapple with these networks, we developed TopNet-like Yale Network Analyzer (tYNA), a Web system for managing, comparing and mining multiple networks, both directed and undirected. tYNA efficiently implements methods that have proven useful in network analysis, including identifying defective cliques, finding small network motifs (such as feed-forward loops), calculating global statistics (such as the clustering coefficient and eccentricity), and identifying hubs and bottlenecks. It also allows one to manage a large number of private and public networks using a flexible tagging system, to filter them based on a variety of criteria, and to visualize them through an interactive graphical interface. A number of commonly used biological datasets have been pre-loaded into tYNA, standardized and grouped into different categories.
Availability: The tYNA system can be accessed at http://networks.gersteinlab.org/tyna. The source code, JavaDoc API and WSDL can also be downloaded from the website. tYNA can also be accessed from the Cytoscape software using a plugin.
Contact: mark.gerstein{at}yale.edu
Supplementary information: Additional figures and tables can be found at http://networks.gersteinlab.org/tyna/supp
| 1 INTRODUCTION |
|---|
In the era of systems biology, the focus on understanding complex organisms is shifting from studying individual genes and proteins towards the relationships between them (Sharan and Ideker, 2006). These relationships are usually expressed in terms of various kinds of biological networks. Recent developments of large-scale experiments such as mass spectrometry and array-based techniques (Lee et al., 2002, 2006; Krogan et al., 2006) have generated rough descriptions of the complete networks in the cells. Many studies have reported interesting biological findings from these networks, including the relationships between various statistical properties of a gene and its function and essentiality, and the elucidation of controls at the molecular level based on network motifs (Lee et al., 2002; Gavin et al., 2006; Milo et al., 2002; Shen-Orr et al., 2002; Han et al., 2004; Yu et al., 2004).
These studies require heavy computations on multiple networks. We have developed a Web system, tYNA (TopNet-like Yale Network Analyzer), to provide researchers with a set of tools for carrying out such computations easily. The system provides five main types of functionality. (1) Network management: storing, retrieving and categorizing networks. A comprehensive set of widely used network datasets is preloaded, put into standard form, and categorized with a set of tags. (2) Network visualization: displaying networks in an interactive graphical interface (Fig. 1). (3) Network comparison and manipulation: various kinds of filtering and multiple-network operations. (4) Network analysis: computing various statistics for the whole network and for subsets, and finding motifs and defective cliques. (5) Network mining: predicting one network based on the information in another.
Our system shares some elements with other network analysis and visualization systems, such as Cytoscape (Shannon et al., 2003), JUNG (http://jung.sourceforge.net/), N-Browse (http://nematoda.bio.nyu.edu:8080/NBrowse/N-Browse.jsp?last=false) and Osprey (http://biodata.mshri.on.ca/osprey/), but also offers additional features such as defective clique finding. As a Web-based system, tYNA also has several advantages:
- Users can share networks through a centralized database.
- Computationally intensive tasks such as motif finding and statistics calculations can be performed on powerful servers.
- The system can be linked from/to other online resources.
- Users can incorporate some functions of tYNA into their own programs using the SOAP-based web service interface.
| 2 USING tYNA |
|---|
tYNA provides a simple view with some basic features and an advanced view for more complex analyses.
2.1 Uploading networks and categories
The first step of analysis is to upload networks. tYNA accepts various file formats, including the SIF format of Cytoscape. One may also enter additional attributes to organize the networks into groups, such as network type (e.g. protein-protein interaction), organism (e.g. yeast) and experimental method (e.g. yeast two-hybrid) (Supplementary Figure S1). Furthermore, tYNA allows users to analyze subsets of the networks [e.g. active parts in a dynamic network (Han et al., 2004; Luscombe et al., 2004)] by using category files.
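As an illustration of the kind of input involved, the following minimal sketch reads the simple interaction format (SIF) of Cytoscape, in which each line names a source node, an interaction type and one or more target nodes. The function name `parse_sif` and the node names in the usage note are hypothetical, and the sketch is not tYNA's own parser.

```python
def parse_sif(text):
    """Parse SIF text and return (nodes, edges).

    Each non-empty line has the form
        source <interaction type> target [target ...]
    Fields are split on tabs if the line contains a tab, otherwise on
    whitespace. A line holding a single name declares an isolated node.
    edges is a list of (source, interaction_type, target) tuples.
    """
    nodes = set()
    edges = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        fields = line.split("\t") if "\t" in line else line.split()
        fields = [f.strip() for f in fields if f.strip()]
        nodes.add(fields[0])
        if len(fields) == 1:
            continue  # isolated node, no edge to record
        if len(fields) == 2:
            raise ValueError("SIF line has an interaction type but no target: %r" % raw)
        source, interaction = fields[0], fields[1]
        for target in fields[2:]:
            nodes.add(target)
            edges.append((source, interaction, target))
    return nodes, edges
```

For example, `parse_sif("nodeA pp nodeB nodeC\nnodeD\n")` yields four nodes and two `pp` edges from `nodeA`.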
2.2 Loading networks into workspaces
After uploading a network, one may view its statistics and visualize it graphically by loading it into a workspace. A workspace is a working area for a single network (Fig. 1). Various statistics are computed, such as the clustering coefficient, eccentricity and betweenness (Yu et al., 2004). Networks are visualized in scalable vector graphics (SVG) using the aiSee package (http://www.aisee.com/), which facilitates an interactive interface: one may change the appearance of the network in real time (Supplementary Figures S2 and S3).
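Two of the statistics named above can be sketched in a few lines for an undirected, unweighted network. The definitions used here are the common textbook ones (fraction of connected neighbor pairs for the clustering coefficient, largest shortest-path distance to a reachable node for eccentricity); tYNA's exact conventions may differ, and all function names are hypothetical.

```python
from collections import deque

def build_undirected(edges):
    """Build an adjacency map {node: set(neighbors)}, ignoring self-loops."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Each neighbor-neighbor link is seen from both ends, hence the // 2.
    links = sum(1 for u in nbrs for v in adj[u] if v in nbrs) // 2
    return 2.0 * links / (k * (k - 1))

def eccentricity(adj, node):
    """Largest breadth-first-search distance from `node` to a reachable node."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())
```

Averaging `clustering_coefficient` over all nodes gives one common form of the whole-network clustering coefficient.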
2.3 Single-network operations (advanced view)
Filtering allows one to retain a portion of the network, based on a statistics cutoff (e.g. the 5% of nodes with the highest out-degrees) or on node names. Such filters make it easy to identify the hubs and bottlenecks in a graph. Motif finding identifies various regular patterns in the network, including chains, cycles, feed-forward loops and complete two-layers. These patterns generalize the motifs discussed in previous studies (Lee et al., 2002; Milo et al., 2002; Shen-Orr et al., 2002). Currently, all occurrences of a specified motif pattern are returned; we will study the feasibility of returning only statistically over-represented motifs in future work. tYNA also identifies defective cliques (Yu et al., 2006), which suggest potential missing edges in a network (Supplementary Figures S4 and S5).
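The two simplest operations in this section, a degree-based filter and feed-forward loop enumeration, can be sketched as follows for a directed network given as (source, target) pairs. This is an illustrative brute-force version with hypothetical function names, not the algorithm implemented in tYNA; rounding the cutoff up so that at least one node is kept is also an assumption.

```python
import math

def top_fraction_by_out_degree(edges, fraction=0.05):
    """Return the nodes in the top `fraction` by out-degree (at least one)."""
    deg = {}
    for src, dst in set(edges):  # set() drops duplicate edges
        deg[src] = deg.get(src, 0) + 1
        deg.setdefault(dst, 0)
    keep = max(1, math.ceil(fraction * len(deg)))
    # Sort by decreasing out-degree, breaking ties by node name.
    ranked = sorted(deg, key=lambda n: (-deg[n], n))
    return ranked[:keep]

def feed_forward_loops(edges):
    """List all (x, y, z) with edges x->y, y->z and x->z."""
    succ = {}
    for src, dst in edges:
        if src != dst:  # ignore self-loops
            succ.setdefault(src, set()).add(dst)
    loops = []
    for x, x_targets in succ.items():
        for y in x_targets:
            for z in succ.get(y, ()):
                if z != x and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)
```

As in the text, every occurrence of the pattern is returned; no test for statistical over-representation is attempted.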
2.4 Multiple-network operations (advanced view)
Multiple-network operations allow one to select multiple networks, perform some operations on each of them, and merge them into a single network. For example, the intersection of multiple high-throughput protein-protein interaction networks offers a high-confidence set of potential interactions. The relationships between different kinds of networks, such as gene regulation and co-expression, can also be studied (Supplementary Figure S6).
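The intersection example can be written directly with set operations. In this minimal sketch each network is a list of node pairs treated as undirected edges; the function names are hypothetical and the code only illustrates the idea, not tYNA's implementation.

```python
def undirected_edge_set(edges):
    """Normalize (a, b) pairs so that (a, b) and (b, a) are the same edge."""
    return {frozenset((a, b)) for a, b in edges if a != b}

def intersect_networks(*networks):
    """Edges present in every one of the given networks."""
    edge_sets = [undirected_edge_set(n) for n in networks]
    if not edge_sets:
        return set()
    return set.intersection(*edge_sets)

def union_networks(*networks):
    """Edges present in at least one of the given networks."""
    return set().union(*(undirected_edge_set(n) for n in networks))
```

An edge that survives `intersect_networks` has been reported by every input dataset, which is the sense in which the intersection is a high-confidence set.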
2.5 Mining and edge overlap (advanced view)
The edge overlap feature allows the comparison of the edges in two networks. It can test, using some prediction functions, how well one network predicts another. Some prediction functions are predefined, such as identity, sibling and couple (Supplementary Table S2).
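For the identity function, an edge between two nodes in the first network predicts an edge between the same two nodes in the second. A minimal sketch of such a test follows; scoring the overlap by precision and recall is an assumption made here for illustration, since the text does not state how tYNA reports the result, and the sibling and couple functions are not sketched because their definitions are given only in Supplementary Table S2.

```python
def identity_overlap(predictor_edges, target_edges):
    """Compare two undirected networks edge by edge.

    Returns the number of shared edges, the fraction of predictor edges
    found in the target (precision) and the fraction of target edges
    recovered by the predictor (recall).
    """
    predicted = {frozenset(e) for e in predictor_edges if e[0] != e[1]}
    actual = {frozenset(e) for e in target_edges if e[0] != e[1]}
    shared = predicted & actual
    precision = len(shared) / len(predicted) if predicted else 0.0
    recall = len(shared) / len(actual) if actual else 0.0
    return {"shared": len(shared), "precision": precision, "recall": recall}
```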
2.6 Saving and downloading analyzed networks
Finally, one may save a working network into the database, or send it to another workspace (as a temporary backup). Likewise, one may download a working network in various network and graphics formats, including SIF, bitmap, postscript and PDF (Supplementary Figures S3 and S4).
| 3 IMPLEMENTATION |
|---|
All source code was written in Java using standard J2EE architecture. A detailed JavaDoc API is available for users who want to use the classes in their own code. The tYNA database can also be accessed through standard SOAP-based web services, and we have developed a plugin (available on the tYNA website) to interface tYNA with the network visualization system Cytoscape (Supplementary Figure S7).
| 4 DISCUSSION |
|---|
As biological network analysis is a vigorous research area, new statistics, motifs and mining algorithms are expected to emerge continuously. tYNA was thus designed in a modular fashion so that new features can be readily added. Because tYNA is a Web system, new features become accessible to users as soon as they are deployed.
We plan to connect tYNA with dataset management systems such as YeastHub (Cheung et al., 2005), BIND (Alfarano et al., 2005), DIP (Xenarios et al., 2002), MINT (Zanzoni et al., 2002) and Reactome (Joshi-Tope et al., 2005), as well as annotation databases, to provide a unified platform for performing complex analyses. We think that the combination of the analysis features provided by tYNA and the advanced visualization facilities of Cytoscape can prove particularly powerful. We also plan to interface tYNA with analysis and visualization tools such as bioPIXIE (Myers et al., 2005) and Pajek (Batagelj and Mrvar, 2003), which would allow researchers to combine the distinct features of each tool.
| Acknowledgments |
|---|
Funding to pay the Open Access publication charges for this article was provided by NIH.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Chris Stoeckert
Received on June 30, 2006; revised on August 25, 2006; accepted on September 15, 2006
| REFERENCES |
|---|
Alfarano, C., et al. (2005) The Biomolecular Interaction Network database and related tools: 2005 update. Nucleic Acids Res., 33, D418-D424.
Batagelj, V. and Mrvar, A. (2003) Pajek - analysis and visualization of large networks. In Jünger, M. and Mutzel, P. (Eds.), Graph Drawing Software. Springer, pp. 77-103.
Cheung, K.H., et al. (2005) YeastHub: a semantic web use case for integrating data in the life sciences domain. Bioinformatics, 21, Suppl. 1, i85-i96.
Gavin, A.-C., et al. (2006) Proteome survey reveals modularity of the yeast cell machinery. Nature, 440, 631-636.
Han, J.-D.H., et al. (2004) Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature, 430, 88-93.
Ito, T., et al. (2000) Toward a protein-protein interaction map of the budding yeast: a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Proc. Natl Acad. Sci. USA, 98, 4569-4574.
Joshi-Tope, G., et al. (2005) Reactome: a knowledgebase of biological pathways. Nucleic Acids Res., 33, D428-D432.
Krogan, N.J., et al. (2006) Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature, 440, 637-643.
Lee, T.I., et al. (2002) Transcriptional regulatory networks in Saccharomyces cerevisiae. Science, 298, 799-804.
Luscombe, N.M., et al. (2004) Genomic analysis of regulatory network dynamics reveals large topological changes. Nature, 431, 308-312.
Milo, R., et al. (2002) Network motifs: simple building blocks of complex networks. Science, 298, 824-827.
Myers, C.L., et al. (2005) Discovery of biological networks from diverse functional genomic data. Genome Biol., 6, R114.
Shannon, P., et al. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res., 13, 2498-2504.
Sharan, R. and Ideker, T. (2006) Modeling cellular machinery through biological network comparison. Nat. Biotechnol., 24, 427-433.
Shen-Orr, S.S., et al. (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet., 31, 64-68.
Uetz, P., et al. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 403, 623-627.
Xenarios, I., et al. (2002) DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res., 30, 303-305.
Yu, H., et al. (2004) TopNet: a tool for comparing biological subnetworks, correlating protein properties with topological statistics. Nucleic Acids Res., 32, 328-337.
Yu, H., et al. (2006) Predicting interactions in protein networks by completing defective cliques. Bioinformatics, 22, 823-829.
Zanzoni, A., et al. (2002) MINT: a Molecular INTeraction Database. FEBS Lett., 513, 135-140.