Bioinformatics Advance Access originally published online on July 25, 2008
Bioinformatics 2008 24(18):2122-2123; doi:10.1093/bioinformatics/btn390
The CellML Model Repository
Auckland Bioengineering Institute, The University of Auckland, Auckland 1010, New Zealand
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Summary: The CellML Model Repository provides free access to over 330 biological models. The vast majority of these models are derived from published, peer-reviewed papers. Model curation is an important and ongoing process to ensure the CellML model is able to accurately reproduce the published results. As the CellML community grows, and more people add their models to the repository, model annotation will become increasingly important to facilitate data searches and information retrieval.
Availability: The CellML Model Repository is publicly accessible at http://www.cellml.org/models
Contact: c.lloyd{at}auckland.ac.nz
| 1 INTRODUCTION |
|---|
High-throughput experimental techniques have led to the population of web-accessible databases with vast amounts of biological data. Mathematical models of biological systems are playing an essential role in the interpretation of these data. The scientific community now faces the challenge of the mathematical models themselves becoming increasingly complex and numerous. There is a need for centralized databases to store all these models in standard formats to make them easily accessible and reusable by the research community. Publishing the models in a standard format, concurrent with the submission of a written paper, will eliminate many of the errors introduced into the model during the publication process. Here we introduce the CellML Model Repository (http://www.cellml.org/models) and discuss it as a solution to these challenges. The BioModels database (Le Novere et al., 2006) is a similar effort, containing biochemical pathway models that have been described in peer-reviewed publications, expressed in SBML (Hucka et al., 2003). Similarly, JWS Online (Olivier and Snoep, 2004) is a repository of kinetic models describing biological systems, and ModelDB (Hines et al., 2004) is a database which stores published models in the field of computational neuroscience.
CellML (Lloyd et al., 2004) and the CellML Model Repository are part of the IUPS Physiome Project (Hunter and Nielsen, 2005) effort to create a virtual physiological human. The explicit representation of modularity, together with the flexible nature of the CellML language which allows the description of a diverse range of cellular and subcellular systems, are two essential features of CellML with regard to its role in the Physiome Project.
The CellML Model Repository started out as a set of examples to illustrate how the language could be applied to describe various biological processes, and to test its features as the language evolved. Later, once the CellML 1.0 specification was stabilized, the CellML repository became a collection of CellML descriptions of models drawn from peer-reviewed journal publications. The CellML Model Repository has since undergone significant growth, with over 330 freely available, quantitative models of biological processes taken from the peer-reviewed literature. In contrast with other databases, such as BioModels, JWS and ModelDB, which focus on specific areas such as systems biology pathway models or computational neuroscience, the CellML Model Repository contains models describing a wide range of biological processes, including: signal transduction pathways, metabolic pathways, electrophysiology, immunology, the cell cycle, muscle contraction and mechanical models and constitutive laws. This wide scope exemplifies CellML's ability to describe much of the biochemistry, electrophysiology and mechanics of the intracellular environment. Lumped parameter models dealing with systems physiology (e.g. blood pressure control, fluid retention, electrolyte balance, endocrine function, etc.) are also within the scope of CellML.
| 2 MODEL CURATION |
|---|
Currently, of the 330 models in the CellML Model Repository, approximately half have been curated to some degree. A star system signifies the curation status of a CellML model. No stars indicate the model has yet to be curated (level 0); one star denotes the CellML model is consistent with the published paper (level 1); two stars imply the CellML model has been checked for typographical errors, unit consistency, completeness (i.e. there are no missing parameters or equations), overconstraints and finally, and arguably most importantly, the CellML model is capable of reproducing the published results (level 2). If a CellML model has three stars it is known to satisfy physical constraints such as conservation of mass, momentum, charge, etc. At this level the curation is conducted by a domain expert (level 3). From experience, we have found that levels 1 and 2 can be mutually exclusive. Frequently, the errors introduced into the model during the publication process require us to correct minor typographical errors or unit inconsistencies, and/or contact the original model author to request missing parameter values or equations.
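As an illustration only, the star system described above could be represented as a small enumeration. This is a minimal sketch: the names `CurationLevel`, `stars` and `curated_fraction` are hypothetical and are not part of the repository software.

```python
from enum import IntEnum


class CurationLevel(IntEnum):
    """Star ratings as described in the text (member names are illustrative)."""
    NOT_CURATED = 0            # no stars: not yet curated
    CONSISTENT_WITH_PAPER = 1  # one star: consistent with the published paper
    REPRODUCES_RESULTS = 2     # two stars: checked, reproduces published results
    PHYSICALLY_CONSISTENT = 3  # three stars: satisfies physical constraints


def stars(level):
    """Render a curation level as a string of '*' characters."""
    return "*" * int(level)


def curated_fraction(levels):
    """Return the fraction of models that carry at least one star."""
    levels = list(levels)
    if not levels:
        return 0.0
    curated = sum(1 for lv in levels if lv > CurationLevel.NOT_CURATED)
    return curated / len(levels)
```

Because levels 1 and 2 can be mutually exclusive, the numeric ordering here should be read as a star count and not as a guarantee that a two-star model also satisfies the one-star criterion.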
The process of model curation involves the following sequence of actions:
- The CellML model is loaded into an editing and simulation environment such as the Physiome CellML Environment (PCEnv) or Cellular Open Resource (COR). Any obvious typographical errors and unit inconsistencies are corrected, which is facilitated by a series of error messages and validation prompts generated by the software, and the rendering of the MathML equations in an easily readable format.
- Assuming the model is able to be run, we then compare the simulation output with the results in the published paper—this typically involves comparing the graphical results with the published figures.
- If we cannot get the CellML model to run, or the simulation output disagrees with the published results, we then attempt to contact the original model author(s) and seek their advice and, where possible, obtain the original model code, which may be in a wide range of different programming languages.
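The comparison in the second step is described above as a largely graphical one. Purely as a sketch of how such a check might be made numerical, the following assumes that a few points have been read off a published figure by hand and that the simulation output is available as a sampled time series; the function names and the 5% tolerance are assumptions, not part of the curation procedure.

```python
from bisect import bisect_left


def interpolate(times, values, t):
    """Linearly interpolate a simulated trace (sorted times) at time t."""
    if not times[0] <= t <= times[-1]:
        raise ValueError(f"t={t} lies outside the simulated interval")
    i = bisect_left(times, t)
    if times[i] == t:
        return values[i]
    t0, t1 = times[i - 1], times[i]
    y0, y1 = values[i - 1], values[i]
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


def max_relative_deviation(times, values, reference_points):
    """Largest |simulated - published| over the points, scaled by the
    largest published magnitude (or 1.0 if all published values are zero)."""
    scale = max(abs(y) for _, y in reference_points) or 1.0
    return max(
        abs(interpolate(times, values, t) - y) / scale
        for t, y in reference_points
    )


def reproduces_published(times, values, reference_points, tolerance=0.05):
    """True if the trace stays within the (assumed) tolerance of every point."""
    return max_relative_deviation(times, values, reference_points) <= tolerance


# Made-up numbers for illustration only.
sim_times = [0.0, 1.0, 2.0]
sim_values = [0.0, 10.0, 20.0]
digitized = [(0.5, 5.0), (1.5, 15.2)]
ok = reproduces_published(sim_times, sim_values, digitized)
```

A failed check would correspond to the third step, in which the original model author is contacted.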
| 3 MODEL ANNOTATION |
|---|
Metadata, the extra information associated with a model, are embedded in CellML using the W3C approved RDF standard. In order for a CellML model to be committed to the repository, at the very least it must contain the full citation of the peer-reviewed publication from which the model was taken. This may also be complemented by non-compulsory metadata such as the model authorship and modification histories. While currently these data are non-compulsory, this information is regarded as essential to the utility of a model as a public resource. The CellML Model Repository curators respect the MIRIAM framework (Le Novere et al., 2005) for minimum model annotation requirements, but place different emphasis on annotation requirements. For example, a full modification history explaining what changes were made at what time, by whom, and for what reason is suggested, but not required, by MIRIAM, whereas the CellML Model Repository curators place great importance on this information. A model which is supposedly correct, but offers no explanation as to why it differs from the incorrect description given in the original publication is of limited use to a researcher.
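To make the idea of embedded RDF concrete, the sketch below reads Dublin Core fields from an `rdf:RDF` block inside a CellML document using Python's standard library. The embedded document, the placeholder title and author, and the use of `dc:title` as a stand-in for the citation are all assumptions made for illustration; the repository's actual citation vocabulary may differ.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical, heavily simplified CellML document with embedded RDF.
EXAMPLE = """
<model xmlns="http://www.cellml.org/cellml/1.0#"
       xmlns:cmeta="http://www.cellml.org/metadata/1.0#"
       name="example_model" cmeta:id="example_model">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
    <rdf:Description rdf:about="#example_model">
      <dc:title>Placeholder title of the source publication</dc:title>
      <dc:creator>Placeholder Author</dc:creator>
    </rdf:Description>
  </rdf:RDF>
</model>
"""


def extract_metadata(cellml_text):
    """Map each rdf:about target to its Dublin Core fields."""
    root = ET.fromstring(cellml_text)
    records = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about", "")
        fields = records.setdefault(about, {})
        for child in desc:
            if child.tag.startswith(f"{{{DC_NS}}}"):
                key = child.tag.split("}", 1)[1]  # strip the namespace part
                fields.setdefault(key, []).append((child.text or "").strip())
    return records


def has_citation(cellml_text):
    """Assumed minimum check: some description carries a non-empty dc:title."""
    return any(
        any(title for title in fields.get("title", []))
        for fields in extract_metadata(cellml_text).values()
    )


metadata = extract_metadata(EXAMPLE)
```

A check such as `has_citation` mirrors the rule that a model must carry its source citation before it is committed, while authorship and modification history remain optional fields.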
Mathematical descriptions of biological systems implemented in CellML can be given semantic meaning by annotating elements within the CellML model with ontologies and constrained vocabularies, such as SBO (http://www.ebi.ac.uk/sbo/), BioPAX (http://www.biopax.org/), UniProt (http://beta.uniprot.org/), Gene Ontology (http://www.geneontology.org/), etc. One of the primary goals of the semantic annotation of CellML models within the CellML Model Repository is to facilitate searches of models, and elements within models, to allow them to be reused. Further, the annotation of models with semantic information will increase intercompatibility between CellML and other modelling languages, such as SBML, by facilitating the identification of common elements.
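One way to picture how semantic annotation supports searching and the identification of common elements is an inverted index from ontology terms to annotated model elements. The sketch below is an assumption about how such a search could work, not a description of the repository; the model names, element names and `EX:` term identifiers are placeholders.

```python
from collections import defaultdict

# (model, element, ontology term) triples; all values are placeholders.
ANNOTATIONS = [
    ("model_a", "sodium_current", "EX:0000001"),
    ("model_a", "calcium_pump", "EX:0000002"),
    ("model_b", "i_Na", "EX:0000001"),
    ("model_c", "glucose_uptake", "EX:0000003"),
]


def build_term_index(annotations):
    """Build an inverted index: ontology term -> set of (model, element)."""
    index = defaultdict(set)
    for model, element, term in annotations:
        index[term].add((model, element))
    return index


def find_elements(index, term):
    """Return all (model, element) pairs annotated with the given term."""
    return sorted(index.get(term, ()))


def shared_terms(index, model_a, model_b):
    """Return terms used in both models, i.e. candidate common elements."""
    shared = []
    for term, hits in index.items():
        models = {model for model, _ in hits}
        if model_a in models and model_b in models:
            shared.append(term)
    return sorted(shared)


index = build_term_index(ANNOTATIONS)
```

Here two differently named elements (`sodium_current` and `i_Na`) are found to be related only because they carry the same term, which is the kind of matching that annotation is intended to enable across models and modelling languages.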
| 4 FUTURE DIRECTIONS |
|---|
We encourage the scientific modelling community—including model authors, journals and publishing houses—to publish models in the CellML Model Repository concurrent with the publication of their printed article. This eliminates the need for code-to-text-to-code translations and thus avoids many of the errors which are introduced during the model translation process.
As the CellML community continues to grow, there will be more users submitting their CellML models to the repository, and model curation and annotation will be essential to the maintenance of the CellML Model Repository as a useful resource. We anticipate the development and improvement of simulation and editing tools will further facilitate the model curation process, while model annotation will be enhanced through links with biological ontologies.
Finally, with the implementation of CellML 1.1, we intend to decompose CellML 1.0 models into a series of reusable modules. The repository will become a library of reusable models, allowing the creation of new, more complex models from pre-existing parts.
Funding: Wellcome Trust; Maurice Wilkins Centre for Molecular Biodiscovery.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Jonathan Wren
Received on June 16, 2008; revised on July 15, 2008; accepted on July 23, 2008
| REFERENCES |
|---|
Hines ML, et al. ModelDB: a database to support computational neuroscience. J. Comput. Neurosci. (2004) 17:7–11.
Hucka M, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics (2003) 19:524–531.
Hunter P, Nielsen P. A strategy for integrative computational physiology. Physiology (Bethesda) (2005) 20:316–325.
Le Novere N, et al. Minimum information requested in the annotation of biochemical models (MIRIAM). Nat. Biotechnol. (2005) 23:1509–1515.
Le Novere N, et al. BioModels database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res. (2006) 34:D689–D691.
Lloyd CM, et al. CellML: its future, present and past. Prog. Biophys. Mol. Biol. (2004) 85:433–450.
Olivier BG, Snoep JL. Web-based kinetic modelling using JWS Online. Bioinformatics (2004) 20:2143–2144.