Bioinformatics Advance Access originally published online on November 18, 2004
Bioinformatics 2005 21(7):1257-1262; doi:10.1093/bioinformatics/bti147
© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

CRAVE: a database, middleware and visualization system for phenotype ontologies

Georgios V. Gkoutos*, Eain C.J. Green, Simon Greenaway, Andrew Blake, Ann-Marie Mallon and John M. Hancock

Bioinformatics Group, MRC Mammalian Genetics Unit Harwell, Oxfordshire OX11 0RD, UK

*To whom correspondence should be addressed.


    Abstract
 

Motivation: A major challenge in modern biology is to link genome sequence information to organismal function. In many organisms this is being done by characterizing phenotypes resulting from mutations. Efficiently expressing phenotypic information requires combinatorial use of ontologies. However, tools are not currently available to visualize combinations of ontologies. Here we describe CRAVE (Concept Relation Assay Value Explorer), a package allowing storage, active updating and visualization of multiple ontologies.

Results: CRAVE is a web-accessible JAVA application that accesses an underlying MySQL database of ontologies via a JAVA persistent middleware layer (Chameleon). This maps the database tables into discrete JAVA classes and creates memory resident, interlinked objects corresponding to the ontology data. These JAVA objects are accessed via calls through the middleware's application programming interface. CRAVE allows simultaneous display and linking of multiple ontologies and searching using Boolean and advanced searches.

Availability: Direct access: http://www.mgu.har.mrc.ac.uk/CRAVE/

Contact: g.gkoutos{at}har.mrc.ac.uk


    INTRODUCTION
 
Ontologies have become an important tool for structuring biological information since the advent of the Gene Ontology (GO) in 2000 (Gene Ontology Consortium, 2000). GO describes gene products and genome annotation using consistent terminology. Ontologies in biology have proven successful in facilitating access to information. By accessing an ontology at a particular level of detail, terms relating to more specific or more detailed aspects of a part of the structure can be identified. In this way, apparently different descriptions, at different levels of detail, can be related. This approach is forming the basis of new approaches to the mining of biological data (Camon et al., 2004; Philippi and Kohler, 2004).

The description of (mutant) phenotypes using ontologies is a relatively new area that presents major conceptual and practical problems. These include the potential for a combinatorial explosion if phenotypic attributes are represented individually, especially if a fine-grained representation is required. Representing phenotypes, such as behaviour phenotypes, that do not correspond to physical attributes of an organism (but are, rather, responses of the organism to challenges, which may have many significant components) also poses practical problems. To address these problems, we have proposed a schema (Gkoutos et al., 2004) (Fig. 1) upon which phenotype representation can be modelled. The idea behind this schema is to provide a common framework onto which different core ontologies (such as behaviour, anatomy, etc.) can be mapped to form phenotype ontologies. Individual core ontologies could be linked to the Phenotype And Trait Ontology (PATO; see http://obo.sourceforge.net/), an ontology of relationships, assays and values, to provide phenotypic descriptions. According to this schema, for any type of organism (which may be defined by an ID number and various other descriptors, such as species, genotype, strain, genotypic sex, alleles at a specified locus, handling conditions and age or stage of development), concepts (such as parts of anatomy or types of behaviour) may be defined using an appropriate ontology. Such concepts have Attributes defined by PATO and are characterized by an Assay, which provides information on the range of values that a particular Attribute can adopt under that Assay. A combination of a Concept with a Relationship is described as a Phenotypic Attribute (Gkoutos et al., 2004).



 
Fig. 1 Proposed ontology schema (Gkoutos et al., 2004).

 
Navigating and visualizing complex ontologies of this kind requires appropriate retrieval and visualization software. There is also a need to map between ontologies from different domains on a common platform. Although several ontology browsers are available on the Gene Ontology Consortium website (http://www.geneontology.org/GO.tools.html), all of these are built to read simple directed acyclic graphs (DAGs) and cannot represent complicated ontology schemas such as the ones proposed for phenotype ontologies.

Here we present a system, CRAVE (Concept Relation Assay Value Explorer), which allows users to search, retrieve and visualize phenotype ontologies held in a custom database. CRAVE is a free, open source computer resource available at the MRC Mammalian Genetics Unit website. It can be accessed directly via http://www.mgu.har.mrc.ac.uk/servlet/browser.frameset or downloaded from http://informatics.har.mrc.ac.uk/software/.


    SYSTEMS AND METHODS
 
Database and middleware
Ontologies are stored in a MySQL database (http://www.mysql.com/) that mimics the functionality of the schema presented in Figure 1. The database is based on the functionality of the Gene Ontology database (http://www.godatabase.org/dev/database/archive/) but allows the storage of multiple phenotype ontologies of different species and domains. Furthermore, the database is designed to hold instances of phenotypic descriptions, generating knowledgebases for individual domains and allowing cross-referencing and indexing between them. CRAVE accesses these ontologies via a JAVA middleware layer (Chameleon) that maps the relational database tables into discrete JAVA classes and then creates a collection of memory resident, interlinked objects that correspond to the ontology data stored within the database tables. These live JAVA objects are then accessed via calls through the middleware's application programming interface (API). CRAVE is based on a variety of open source JAVA classes and developer tools, plus our own custom software engineering. We chose JAVA and JavaScript to achieve platform independence. Figure 2 shows the data flow between the functions schematically. Having the web server hold the data as persistent objects has a number of advantages. First, it makes applications easier to program, as access to ontology data can be handled by one or two JAVA method calls. Second, it improves performance, as no database interactions are needed once the data structure has been loaded. Finally, and importantly, the method ensures that all applications are using the same version of the ontologies.



 
Fig. 2 Browser schema. The CRAVE GUI comprises four frames as described in the text. Individual frames mediate the acquisition of particular types of data from the ontology database by way of the middleware layer.

 
Source code and a tutorial for the use of Chameleon can be downloaded from http://informatics.har.mrc.ac.uk/software/chameleon.html. The schema of the ontology database can be accessed via http://informatics.har.mrc.ac.uk/software/crave.html.
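
The table-to-object mapping described for the middleware can be illustrated with a minimal sketch. It is written in Python rather than JAVA, it is not Chameleon's code, and the table names (`term`, `term_link`), column names and the `Term` class are assumptions made only for this illustration. An in-memory SQLite database stands in for MySQL. The point it shows is that the tables are read once, after which every lookup follows object references and needs no further database access.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass(eq=False)
class Term:
    """In-memory object for one ontology term (hypothetical fields)."""
    term_id: int
    name: str
    ontology_type: str  # e.g. "concept", "relation", "assay", "value"
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)


def load_terms(conn):
    """Read both tables once and return interlinked Term objects keyed by id."""
    terms = {
        row[0]: Term(*row)
        for row in conn.execute("SELECT id, name, ontology_type FROM term")
    }
    for child_id, parent_id in conn.execute(
        "SELECT child_id, parent_id FROM term_link"
    ):
        terms[child_id].parents.append(terms[parent_id])
        terms[parent_id].children.append(terms[child_id])
    return terms


# Toy data standing in for the ontology database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE term (id INTEGER PRIMARY KEY, name TEXT, ontology_type TEXT);
    CREATE TABLE term_link (child_id INTEGER, parent_id INTEGER);
    INSERT INTO term VALUES (1, 'behaviour', 'concept');
    INSERT INTO term VALUES (2, 'feeding_behaviour', 'concept');
    INSERT INTO term_link VALUES (2, 1);
    """
)
TERMS = load_terms(conn)
conn.close()  # later lookups touch only the in-memory objects
```

After loading, `TERMS[2].parents[0]` is the same object as `TERMS[1]`, so several applications sharing `TERMS` would all see one version of the ontology data.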

Browser graphical user interface
A custom graphical user interface (GUI) provides the user with flexibility and allows visualization of the schema. The browser consists of four frames that are listed below:

  • A Main Frame situated in the middle. This frame is the main visualization tool for the ontologies.
  • A Navigation (Hierarchy) Frame situated at the top left-hand side. This frame allows the user to switch between the four different kinds of ontologies accommodated by the schema (Concepts, Relations, Assays and Values) and provides hierarchical navigation of a selected ontology.
  • A Search Frame situated at the top right-hand side, allowing simple and advanced text searches of the ontologies.
  • A Metadata Frame situated at the bottom right-hand side. This frame displays all metadata associated with a particular term.

Generating the hierarchies
The browser queries the middleware, extracting the root nodes of all available ontologies, and then categorizes them into Concept, Relationship, Assay and Value types. It then builds the hierarchy of each individual ontology by invoking JAVA methods that dynamically generate the JavaScript tree-like menu.
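
The two steps above (grouping root nodes by ontology type, then building a hierarchy under each root) can be sketched as follows. This is a Python illustration under assumed data, not the JAVA methods used by CRAVE: the flat `RECORDS` list, its term names and the function names are hypothetical, and each term is assumed to have at most one parent.

```python
from collections import defaultdict

# Hypothetical flat records: (term name, ontology type, parent name or None).
RECORDS = [
    ("behaviour", "concept", None),
    ("feeding_behaviour", "concept", "behaviour"),
    ("drinking_behaviour", "concept", "behaviour"),
    ("attribute", "relationship", None),
    ("relative_quantity", "relationship", "attribute"),
]


def roots_by_type(records):
    """Group the root terms (those without a parent) by ontology type."""
    grouped = defaultdict(list)
    for name, ontology_type, parent in records:
        if parent is None:
            grouped[ontology_type].append(name)
    return dict(grouped)


def build_tree(records, root):
    """Return a nested dict describing the hierarchy below `root`."""
    children = defaultdict(list)
    for name, _ontology_type, parent in records:
        if parent is not None:
            children[parent].append(name)

    def expand(node):
        # Recurse into each child; leaves map to an empty dict.
        return {child: expand(child) for child in children[node]}

    return {root: expand(root)}
```

A nested structure of this kind could then be serialized for a client-side tree menu; how CRAVE passes its hierarchy to JavaScript is not specified here.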

Populating the Main Frame
The browser then allows the user to either perform a text search or browse the tree to explore the ontologies. The search form invokes a JAVA method that retrieves all matching terms and categorizes them in the Main Frame (see above) according to their ontology type. Browsing the tree and selecting a node invokes two methods. The first retrieves the ontology type of the term and the range of its relations, and displays them in the Main Frame, while the second retrieves all the associated metadata for the term and displays them in the Metadata Frame (see below). For example, if the node is a concept, all relations allowed in its range are also retrieved and all its associated metadata such as term definition, documentation, etc. are displayed in the Metadata Frame.

Formation of concept–relationship pairs (phenotypic attributes)
Further manipulation in the main display allows the generation of concept–relationship pairs (phenotypic attributes) and invokes a method that retrieves the range of the allowed values. The method also allows retrieval of the range of phenotypic attributes for a particular value. The same applies for displaying relationships between Assays and their Values.
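
The lookups involved in forming a phenotypic attribute can be sketched with plain dictionaries. This Python fragment is only an illustration: the three range tables, the function names and in particular the value names are invented, and only the concept, attribute and assay names are taken from the worked example later in this section.

```python
# Hypothetical range tables; the value names are invented for illustration.
CONCEPT_RELATIONS = {
    "feeding_behaviour": {"relative_quantity", "food_substance"},
}
ATTRIBUTE_ASSAYS = {
    ("feeding_behaviour", "food_substance"): ["undefined_assay_of:food_substance"],
}
ASSAY_VALUES = {
    "undefined_assay_of:food_substance": ["example_value_1", "example_value_2"],
}


def phenotypic_attribute(concept, relation):
    """Pair a concept with a relation, rejecting pairs outside the range."""
    if relation not in CONCEPT_RELATIONS.get(concept, set()):
        raise ValueError(f"{relation!r} is not allowed for {concept!r}")
    return (concept, relation)


def allowed_values(attribute):
    """Map each assay allowed for the attribute to its allowed values."""
    return {
        assay: ASSAY_VALUES.get(assay, [])
        for assay in ATTRIBUTE_ASSAYS.get(attribute, [])
    }


def attributes_for_value(value):
    """Reverse lookup: phenotypic attributes whose assays allow `value`."""
    return [
        attribute
        for attribute, assays in ATTRIBUTE_ASSAYS.items()
        if any(value in ASSAY_VALUES.get(assay, []) for assay in assays)
    ]
```

The forward lookup (`allowed_values`) and the reverse lookup (`attributes_for_value`) correspond to the two retrieval directions described in the paragraph above.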

Browser function
In this section we give a brief description of the browser's function. There are four buttons on the left-hand side named Concept, Relations, Assays and Values, which access the core concept ontologies, the Phenotype And Trait Ontology (PATO), the assay controlled vocabularies and the Value ontologies, respectively.

Concepts
The browser can upload different ontologies from different species. The user must first select the species of interest and then select the particular core phenotype ontology. For example, for the mouse phenotype ontology at least the following ontologies are currently available or will be in the near future:

By selecting a particular ontology, the user can browse that ontology in a simple hierarchical manner as described elsewhere (Gene Ontology Consortium, 2000). When a term is clicked, it appears on the right-hand side of the browser under Concepts, and the relations allowed for that particular term appear under Relationship. At the bottom of the browser, one can view all the metadata associated with that particular term.

Relationships
According to the schema there is a set of common attributes (PATO; http://obo.sourceforge.net/) that are linked with concepts to provide the phenotypic character. The user can browse this ontology and see all associated concepts that are allowed for a particular attribute in their range.

Assays
The different assay controlled vocabularies can be visualized here. By clicking a node, all the allowed values and the associated phenotypic attributes are displayed in the Main Frame.

Values
Here all the Value ontologies, including the PATO Value Ontology, are accessed. By browsing the ontologies one can view all assays that allow these values within their range.

Searching the ontologies
There are two types of searches allowed within CRAVE: Boolean searches and advanced searches. By default, CRAVE performs a Boolean ‘AND’ query over all requested parameters, such that ‘A B C’ is equivalent to ‘A and B and C’. It also converts any punctuation mark used in a term name (e.g. an underscore or slash character) to white space and allows users to search for the term either way. So, for example, body_tone could be searched for as body_tone or as body tone, or even BODY TONE, since the search is case insensitive. Users also have the option to edit the default query to be any Boolean expression, so one can search for A or B, or A not C. More complicated queries can be performed by using parentheses to set evaluation priorities, for example (A and B) and not C; without parentheses, expressions are evaluated from left to right. The wildcard search (*) retrieves all entities held in the database.
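
The search rules above can be made concrete with a small evaluator. This is a Python sketch, not the CRAVE query engine: the function names are hypothetical, matching is done on whole words of the normalized term name, and giving `and` and `or` equal precedence with left-to-right evaluation is one reading of the description. A side effect of this simple design is that term names containing the words and, or or not cannot be searched for.

```python
import re


def normalize(text):
    """Lower-case and turn punctuation (underscore, slash, ...) into spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def tokenize(query):
    """Keep parentheses and '*' as tokens; other punctuation separates words."""
    return re.findall(r"[()*]|[a-z0-9]+", query.lower())


def search(query, term_names):
    """Return the set of term names matching a Boolean query."""
    index = {name: set(normalize(name).split()) for name in term_names}
    everything = set(index)
    tokens = tokenize(query)
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("incomplete query")
        token = tokens[pos]
        pos += 1
        if token == "not":
            return everything - operand()
        if token == "(":
            result = expression()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if token == "*":
            return set(everything)  # wildcard: every term
        if token in (")", "and", "or"):
            raise ValueError(f"unexpected token {token!r}")
        return {name for name, words in index.items() if token in words}

    def expression():
        nonlocal pos
        result = operand()
        while pos < len(tokens) and tokens[pos] != ")":
            op = "and"  # adjacent operands are joined by an implicit AND
            if tokens[pos] in ("and", "or"):
                op = tokens[pos]
                pos += 1
            right = operand()
            result = result & right if op == "and" else result | right
        return result

    result = expression()
    if pos != len(tokens):
        raise ValueError("unbalanced parenthesis")
    return result
```

Because both the query and the term names pass through the same normalization, `search("body_tone", names)`, `search("body tone", names)` and `search("BODY TONE", names)` return the same set.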

CRAVE allows advanced searches through the same single search box. By using qualifiers such as value: in the search box, searches are restricted to individual groups of ontologies. For example, one could search for concepts linked to the eye that were identified using particular assays. Users can also restrict a search to a particular species and to particular ontologies for that species. For example, species:mouse AND ontology:behaviour AND concept:body AND attribute:position makes a specific search in the Mouse Behavioural ontology for the phenotypic attribute of body_position. This one-line command form is intended to let external applications reverse engineer the API. In CRAVE's user interface the first two options (i.e. species and ontology) are provided in a drop-down menu and are filled in automatically when a user selects a particular species and/or ontology through the graphical interface.
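
A qualified query of this form can be split into per-qualifier filters in a few lines. The Python sketch below is an assumption-laden illustration, not CRAVE's parser: the set of accepted qualifier names (in particular `assay`), the function names and the record layout are hypothetical.

```python
import re

# Assumed qualifier names; only some of them appear explicitly in the text.
KNOWN_QUALIFIERS = {"species", "ontology", "concept", "attribute", "assay", "value"}


def parse_qualified(query):
    """Split 'key:word AND key:word ...' into a dict mapping key to words."""
    filters = {}
    for part in re.split(r"\s+AND\s+", query.strip(), flags=re.IGNORECASE):
        key, sep, word = part.partition(":")
        key, word = key.strip().lower(), word.strip().lower()
        if not sep or key not in KNOWN_QUALIFIERS or not word:
            raise ValueError(f"cannot parse {part!r}")
        filters.setdefault(key, []).append(word)
    return filters


def matches(record, filters):
    """True if every qualifier word occurs in the record field of that name."""
    return all(
        word in record.get(key, "").lower()
        for key, words in filters.items()
        for word in words
    )
```

For instance, `parse_qualified("species:mouse AND concept:body")` yields one filter list per qualifier, which `matches` can then apply to a record such as `{"species": "mouse", "concept": "body"}`.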

Finally, although ontologies should hold synonyms and alternative spellings (e.g. British versus American English) for term names, we anticipate that there will be occasions when these fields are incomplete. We have therefore adopted lists (held in a separate standalone database) of terms that are spelled differently in British and American English, together with common synonyms, so that CRAVE can search for them. So, if B has an alternative spelling B' (e.g. Behavior–Behaviour), our example ‘A and B and C’ becomes equivalent to ‘A and (B or B') and C’. This feature could also be used for conversions between other pairs of languages. Figure 3 shows schematically how the JAVA middleware executes the queries.



 
Fig. 3 JAVA middleware query engine. The figure illustrates the series of transformations carried out by the middleware in the process of executing a given query.
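
The spelling-variant step, in which ‘A and B and C’ becomes ‘A and (B or B') and C’, can be sketched as a query rewrite. This Python fragment is illustrative only: the `SPELLING_VARIANTS` table and the function name are hypothetical, whereas CRAVE keeps its lists in a separate database.

```python
# Hypothetical variant table standing in for the separate spelling database.
SPELLING_VARIANTS = {
    "behaviour": {"behavior"},
    "behavior": {"behaviour"},
}


def expand_query(words):
    """Rewrite an AND query so each word also matches its listed variants."""
    clauses = []
    for word in words:
        key = word.lower()
        alternatives = [key] + sorted(SPELLING_VARIANTS.get(key, set()))
        if len(alternatives) == 1:
            clauses.append(key)
        else:
            clauses.append("(" + " or ".join(alternatives) + ")")
    return " and ".join(clauses)
```

With this table, `expand_query(["mouse", "behaviour", "feeding"])` returns `"mouse and (behaviour or behavior) and feeding"`. A table mapping terms between two other languages could be used in the same way.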

 
Visualization of an example
We illustrate here an example of a typical browse and search using CRAVE. We browse (or search) for the concept of Feeding Behaviour in the Mouse Behaviour ontology. We then click on this concept and view all the associated attributes coming from the PATO ontology that Feeding Behaviour allows, namely attribute:relative_quantity, attribute:food_substance, attribute:deviation(from_normal), as presented in Figure 4.



 
Fig. 4 CRAVE snapshot. The figure shows the result of the search for the terms Feeding and Behaviour. The left-hand (hierarchy) frame shows the position of Feeding Behaviour in the relevant ontology (Mouse Behaviour Ontology). The main frame (right) shows the Relationships associated with this Concept.

 
If we then choose, for example, the attribute of food type by clicking on the button next to it, we form a phenotypic attribute (concept plus relationship). All the assays that are allowed in the range of the phenotypic attribute (in this particular case only one, the undefined_assay_of:food_substance assay) are displayed along with the allowed values (Fig. 5).



 
Fig. 5 CRAVE snapshot. The figure shows the result of selecting the attribute: food type term from the Relationship column by clicking on it, forming a phenotypic attribute. Assays and Values associated with this Phenotypic Attribute are displayed in the relevant columns.

 

    DISCUSSION AND CONCLUSION
 
We have presented here a freely available resource that allows storage, visualization and retrieval of phenotype ontologies based on our proposed schema (Gkoutos et al., 2004). CRAVE is meant to be an aid that will allow different phenotype communities to develop their ontologies. A domain of the Mouse Behavioural Phenotype Ontology being developed by us as part of the EUMORPHIA effort (http://www.eumorphia.org/) is currently available for browsing, searching and visualizing in CRAVE. CRAVE highlights the advantages and functionality of the proposed schema for modelling phenotype ontologies (Gkoutos et al., 2004). It also demonstrates how ontologies could be mapped onto a common platform to allow expressivity in phenotype description across ontologies describing a particular domain, and across different species.

We anticipate that in the future CRAVE will be linked to a variety of databases holding individual instances (e.g. Strivens et al., 2000; Blake et al., 2000) to allow searching, retrieval and visualization of annotated phenotypes. CRAVE could also be used as an annotation tool, giving curators a custom interface for providing instances of their annotation. Different parsers could allow CRAVE to express the ontologies in a variety of knowledge representation languages (Gruber, 1993; Baader, 1996; Stevens et al., 2000) and, more importantly, in OWL (http://www.w3.org/TR/owl-guide/), making it a major aid, as far as phenotypes are concerned, in the realization of the Semantic Web (Staab, 2003; Gkoutos et al., 2002; Berners-Lee, unpublished data, see http://www.w3.org/DesignIssues/CG.html). Finally, although CRAVE has been developed as a tool for describing phenotypes, it could also be used in other domains that require complex descriptions involving multiple ontologies.


    Acknowledgments
 
This project is funded by the European Commission under contract number QLG2-CT-2002-00930. We thank Michael Ashburner and Suzie Lewis for their comments.

Received on September 3, 2004; revised on November 5, 2004; accepted on November 11, 2004

    REFERENCES
 

    Baader, F. (1996) A formal definition for the expressive power of terminological knowledge representation languages. J. Log. Comput., 6, 33–54.

    Blake, J.A., Eppig, J.T., Richardson, J.E., Davisson, M.T. and the Mouse Genome Database Group. (2000) The Mouse Genome Database (MGD): expanding genetic and genomic resources for the laboratory mouse. Nucleic Acids Res., 28, 108–111.

    Camon, E., Barrell, D., Lee, V., Dimmer, E., Apweiler, R. (2004) The Gene Ontology Annotation (GOA) Database—an integrated resource of GO annotations to the UniProt Knowledgebase. In Silico Biol., 4, 5–6.

    Davidson, D., Bard, J., Kaufman, M., Baldock, R. (2001) The MouseAtlas Database: a community resource for mouse development. Trends Genet., 17, 49–51.

    The Gene Ontology Consortium. (2000) Gene Ontology: tool for the unification of biology. Nat. Genet., 25, 25–29.

    Gkoutos, G.V., Leach, C., Rzepa, H.S. (2002) ChemDig: new approaches to chemically significant indexing and searching of distributed web collections. New J. Chem., 26, 656–666.

    Gkoutos, G.V., Green, E.C.J., Mallon, A.M., Hancock, J.M., Davidson, D. (2004) Ontologies for the description of mouse phenotypes. Pac. Symp. Biocomput., 9, 178–189.

    Gruber, T.R. (1993) A translation approach to portable ontologies. Knowl. Acquis., 5, 199–220.

    Philippi, S. and Kohler, J. (2004) Using XML technology for the ontology-based semantic integration of life science databases. IEEE Trans. Inf. Technol. Biomed., 8, 154–160.

    Staab, S. (2003) The semantic web—new ways to present and integrate information. Comp. Funct. Genom., 4, 98–103.

    Stevens, R., Goble, C.A., Bechhofer, S. (2000) Ontology-based knowledge representation for bioinformatics. Brief. Bioinformatics, 4, 398–414.

    Strivens, M.A., Selley, R.L., Greenaway, S.J., Hewitt, M., Liu, X., Battershill, K., McCormack, S.L., Pickford, K.A., Vizor, L., Nolan, P.M., et al. (2000) Informatics for mutagenesis: the design of mutabase—a distributed data recording system for animal husbandry, mutagenesis, and phenotypic analysis. Mamm. Genome, 11, 577–583.

