Bioinformatics Advance Access originally published online on March 28, 2007
Bioinformatics 2007 23(10):1299-1300; doi:10.1093/bioinformatics/btm107
Cyclone: java-based querying and computing with Pathway/Genome databases
Computational Systems Biology Group, Genoscope/CNRS-UMR8030, 2 rue Gaston Crémieux, 91057 Evry Cedex, France
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Summary: Cyclone aims to facilitate the use of BioCyc, a collection of Pathway/Genome Databases (PGDBs). Cyclone provides a fully extensible Java Object API to analyze and visualize these data. Cyclone can read and write PGDBs, and can write its own data in the CycloneML format. This format is automatically generated from the BioCyc ontology by Cyclone itself, ensuring continued compatibility. Cyclone objects can also be stored in a relational database, CycloneDB. Queries can be written in SQL, and in an intuitive and concise object-oriented query language, the Hibernate Query Language (HQL). In addition, Cyclone interfaces easily with Java software, including the Eclipse IDE for HQL editing, the Jung API for graph algorithms and Cytoscape for graph visualization.
Availability: Cyclone is freely available under an open source license at: http://sourceforge.net/projects/nemo-cyclone
Contact: cyclone{at}genoscope.cns.fr
Supplementary information: For download and installation instructions, tutorials, use cases and examples, see http://nemo-cyclone.sourceforge.net
| 1 INTRODUCTION |
|---|
The availability of usable biological pathway information is both a key enabler and a bottleneck in systems biology research. Usability implies not only high quality, but also the possibility to query and access the information in a format suitable for a variety of modeling and analytical tasks. The two most prominent general-purpose metabolic pathway databases are the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa and Goto, 1999) and BioCyc (Karp et al., 2002). BioCyc is a collection of 205 species-specific Pathway/Genome Databases (PGDBs) managed by the proprietary software Pathway Tools. These PGDBs include automated reconstructions of metabolic networks from genome annotation and also curated sets of metabolic pathways, regulatory networks and chemical information for model organisms. The MetaCyc database recapitulates a set of non-redundant, experimentally elucidated metabolic pathways from 900 organisms.
MetaCyc and some PGDBs have been very successful, in particular among microbiologists, as reference datasets. Whereas they can be queried and visualized through the Pathway Tools interface, their use for advanced querying, model building or computation is more problematic.
Non-standard querying requires the writing of LISP code that targets BioCyc's native frame-based representation scheme (Karp et al., 1995). An alternative solution is the use of the JavaCyc (arabidopsis.org) or the PerlCyc APIs, which provide full access to BioCyc data by encapsulating calls to Pathway Tools Lisp functions in Java or Perl, but do not create native objects for each biological entity. The latest version of BioCyc uses a relational database to store utility classes, but not the biological objects. Finally, the recently developed BioWarehouse (Lee et al., 2006) integrates a set of biological databases, including PGDBs, into a single platform. Its use for querying and computation is limited, however: only part of BioCyc's model is included, and data from PGDBs can be read but not updated by BioWarehouse.
Cyclone addresses some of these limitations by providing a Java Object-Oriented API aimed at accessing, manipulating and computing with BioCyc information in an intuitive manner.
| 2 INFORMATION FLOW IN CYCLONE |
|---|
Cyclone maps BioCyc objects onto Java objects. Using an extension of JavaCyc, Cyclone extracts the BioCyc data model from Pathway Tools and converts it into an XML Schema (Fig. 1a), defining an object model which is Cyclone's pivotal representation.
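As a rough illustration of this conversion step, the sketch below turns a frame class with named slots into an XML Schema complex type, mapping the parent frame class to an `xs:extension`. It is a simplified assumption about what such a conversion might look like, not Cyclone's actual generator: the `FrameClassSketch` type, the example class and slot names, and the choice of `xs:string` for every slot are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified view of a frame class; not a Cyclone or JavaCyc type. */
final class FrameClassSketch {
    final String name;
    final String parent;      // null when the class has no parent
    final List<String> slots; // slot names only; real slots also carry types and facets

    FrameClassSketch(String name, String parent, List<String> slots) {
        this.name = name;
        this.parent = parent;
        this.slots = slots;
    }
}

final class SchemaSketch {

    /** Renders one frame class as an xs:complexType fragment (assumed mapping). */
    static String toComplexType(FrameClassSketch fc) {
        StringBuilder sb = new StringBuilder();
        boolean hasParent = fc.parent != null;
        String indent = hasParent ? "      " : "  ";

        sb.append("<xs:complexType name=\"").append(fc.name).append("\">\n");
        if (hasParent) {
            // Inheritance between frame classes becomes schema type extension.
            sb.append("  <xs:complexContent>\n");
            sb.append("    <xs:extension base=\"").append(fc.parent).append("\">\n");
        }
        sb.append(indent).append("<xs:sequence>\n");
        for (String slot : fc.slots) {
            // Every slot is treated as an optional, repeatable string (an assumption).
            sb.append(indent).append("  <xs:element name=\"").append(slot)
              .append("\" type=\"xs:string\" minOccurs=\"0\" maxOccurs=\"unbounded\"/>\n");
        }
        sb.append(indent).append("</xs:sequence>\n");
        if (hasParent) {
            sb.append("    </xs:extension>\n");
            sb.append("  </xs:complexContent>\n");
        }
        sb.append("</xs:complexType>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative names only; not taken from the actual Cyclone schema.
        FrameClassSketch reactions = new FrameClassSketch(
            "Reactions", "Generalized-Reactions", Arrays.asList("LEFT", "RIGHT"));
        System.out.print(toComplexType(reactions));
    }
}
```

Because the schema is derived from the data model rather than written by hand, regenerating it is in principle enough to follow a change in the source model, which is the property the text relies on.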
Cyclone uses JAXB (Java Architecture for XML Binding) to define Java classes corresponding to this schema (Fig. 1b). HyperJAXB is used to define an adequate correspondence in the object-relational mapping software Hibernate (Eliott, 2004), which implements persistence of the classes using any compatible relational database management system (CycloneDB) (Fig. 1c). Overall, this mechanism ensures the automatic adjustment of the Cyclone representation model to updates in the BioCyc model.
Once the Cyclone representation has been defined and instantiated, e.g. using data from a PGDB, the resulting biological objects can be queried using HQL (Fig. 1d). Query results can be further manipulated as Java objects. These objects can be stored in CycloneDB using Hibernate (Fig. 1c). They can also be exported to and imported from Cyclone Markup Language (CycloneML) files using JAXB (Fig. 1b). Partial exports to other formats (SIF, GraphML) are possible (Fig. 1e). All changes made in Cyclone, such as adding or editing a pathway, can be committed back to BioCyc (Fig. 1f).
| 3 FUNCTIONALITIES |
|---|
Cyclone can load an entire PGDB from BioCyc, modify it, for instance by adding user-specific information, and save it back into BioCyc. Cyclone can export and import data in CycloneML, allowing easy interface with other XML tools. The current distribution is fully compatible with BioCyc v9.0 and above.
The Cyclone API allows the extraction of data from BioCyc in order to build biological networks (e.g. a bipartite metabolic graph or a transcriptional regulatory network). The resulting networks can be manipulated as graphs using the Jung graph library (jung.sourceforge.net), bundled with the Cyclone installation package. They can also be exported in the GraphML exchange format, or in the SIF format, readable by Cytoscape. Cytoscape is a popular and intuitive software tool dedicated to biological network visualization (Shannon et al., 2003).
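To make the bipartite metabolic graph and its SIF export more concrete, here is a minimal plain-Java sketch. It does not use the Cyclone or Jung APIs: `SketchReaction`, the interaction labels `substrate_of` and `produces`, and the reaction identifier are hypothetical names chosen for illustration. Each output line follows the SIF convention of source node, interaction type and target node, here separated by tabs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a reaction object; not a Cyclone class. */
final class SketchReaction {
    final String id;
    final List<String> substrates;
    final List<String> products;

    SketchReaction(String id, List<String> substrates, List<String> products) {
        this.id = id;
        this.substrates = substrates;
        this.products = products;
    }
}

final class BipartiteSifSketch {

    /**
     * Builds the edges of a bipartite graph whose two node sets are compounds
     * and reactions, one SIF line per edge. The edge labels are assumptions.
     */
    static List<String> toSifLines(List<SketchReaction> reactions) {
        List<String> lines = new ArrayList<>();
        for (SketchReaction r : reactions) {
            for (String substrate : r.substrates) {
                // compound -> reaction edge
                lines.add(substrate + "\t" + "substrate_of" + "\t" + r.id);
            }
            for (String product : r.products) {
                // reaction -> compound edge
                lines.add(r.id + "\t" + "produces" + "\t" + product);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // "RXN-1" is a made-up identifier used only for this example.
        List<SketchReaction> reactions = Arrays.asList(
            new SketchReaction("RXN-1",
                Arrays.asList("glucose", "ATP"),
                Arrays.asList("glucose-6-phosphate", "ADP")));
        for (String line : toSifLines(reactions)) {
            System.out.println(line);
        }
    }
}
```

Compounds never link directly to other compounds in this representation; every path alternates between compound and reaction nodes, which is what makes the graph bipartite.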
Cyclone queries can be written in SQL and, more interestingly, in the Hibernate Query Language (HQL), an object-oriented query language. HQL queries are very concise and can express notions such as inheritance, polymorphism and association.
Below is a simple example. The query 'Find all enzymes of Escherichia coli for which ATP is an inhibitor' (Krummenacker et al., 2005) can be written as follows:
SELECT er.enzyme
FROM EnzymaticReactions er
WHERE er.Organism = 'Ecoli'
AND er.InhibitorsAll.Value like 'ATP'
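To spell out what this query selects, the following plain-Java sketch applies the same two conditions to an in-memory list. The class `EnzymaticReactionSketch` and its fields are hypothetical stand-ins named after the properties used in the query; they are not the classes generated by Cyclone, and an actual HQL query would be executed by Hibernate against CycloneDB. Since the `like` pattern contains no wildcard, it is treated here as an exact match, which is an assumption about the intended semantics.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical stand-in for an enzymatic reaction object; not a Cyclone class. */
final class EnzymaticReactionSketch {
    final String enzyme;
    final String organism;
    final List<String> inhibitorsAll;

    EnzymaticReactionSketch(String enzyme, String organism, List<String> inhibitorsAll) {
        this.enzyme = enzyme;
        this.organism = organism;
        this.inhibitorsAll = inhibitorsAll;
    }
}

final class InhibitedEnzymesSketch {

    /** Returns the enzyme of every reaction of the organism that lists the inhibitor. */
    static List<String> enzymesInhibitedBy(List<EnzymaticReactionSketch> reactions,
                                           String organism, String inhibitor) {
        return reactions.stream()
            .filter(er -> er.organism.equals(organism))         // WHERE er.Organism = ...
            .filter(er -> er.inhibitorsAll.contains(inhibitor)) // AND er.InhibitorsAll.Value like ...
            .map(er -> er.enzyme)                               // SELECT er.enzyme
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Placeholder data; the enzyme names are not real database entries.
        List<EnzymaticReactionSketch> reactions = Arrays.asList(
            new EnzymaticReactionSketch("enzyme-A", "Ecoli", Arrays.asList("ATP", "citrate")),
            new EnzymaticReactionSketch("enzyme-B", "Ecoli", Arrays.asList("NADH")),
            new EnzymaticReactionSketch("enzyme-C", "Other", Arrays.asList("ATP")));
        System.out.println(enzymesInhibitedBy(reactions, "Ecoli", "ATP"));
    }
}
```

The HQL form expresses the traversal from a reaction to its inhibitors as a property path, whereas the Java version has to walk the association explicitly.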
In conclusion, through its use of mainstream technologies such as Java and XML, Cyclone facilitates access to PGDBs for a broad community of computational biologists and bioinformaticians. The structured and curated pathway information from these databases thus becomes more readily usable for a variety of exploratory and computational goals.
| ACKNOWLEDGEMENTS |
|---|
We are grateful to the NeMo group for beta testing, to A. Yartseva for the Cyclone name and to the P. Karp group at SRI for its help with BioCyc. This work was supported by BioSapiens, an EU FP6 Network of Excellence (contract number LSHG-CT-2003-503265).
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Alfonso Valencia
Received on November 3, 2006; revised on March 12, 2007; accepted on March 13, 2007
| REFERENCES |
|---|
Eliott J. Hibernate: a developer's notebook (2004) Sebastopol, CA, USA: O'Reilly Media. ISBN 0596006969.
Lee TJ, et al. BioWarehouse: a bioinformatics database warehouse toolkit. BMC Bioinformatics (2006) 7:170.
Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. (1999) 27:29–34.
Krummenacker M, et al. Querying and computing with BioCyc databases. Bioinformatics (2005) 21:3454–3455.
Karp PD, et al. The Generic Frame Protocol. In: Proceedings of the 1995 International Joint Conference on Artificial Intelligence (1995) San Francisco, USA: Morgan Kaufmann Publishers. 768–774.
Karp P, et al. The Pathway Tools software. Bioinformatics (2002) 18:S225–S232.
Shannon P, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. (2003) 13:2498–2504.