Bioinformatics Advance Access originally published online on May 12, 2007
Bioinformatics 2007 23(14):1868-1870; doi:10.1093/bioinformatics/btm258
OBO to OWL: a Protégé OWL tab to read/save OBO ontologies
1Stanford Medical Informatics, Stanford University School of Medicine, Stanford, CA 94002, USA and 2USP, São Paulo, Brazil
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
The Open Biomedical Ontologies (OBO) format from the GO consortium is a very successful format for biomedical ontologies, including the Gene Ontology. However, it lacks formal computational definitions for its constructs, and it lacks tools, such as description logic (DL) reasoners, to facilitate ontology development and maintenance. We describe the OBO Converter, a Java tool that converts files from OBO format to the Web Ontology Language (OWL) and vice versa, and that can also be used as a Protégé Tab plug-in. It uses the OBO to OWL mapping provided by the National Center for Biomedical Ontology (NCBO), a joint effort of OBO developers and OWL experts, and offers options to ease the task of saving and reading files in both formats.
Availability: bioontology.org/tools/oboinowl/obo_converter.html
Contact: dilvan{at}stanford.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
| 1 INTRODUCTION |
|---|
The Gene Ontology (Ashburner et al., 2000) and a significant number of bio-ontologies are in the OBO format (GO, 2004). This format has evolved to support the needs of the bio-ontologies under the Open Biomedical Ontologies (OBO) umbrella and aims for human readability, ease of parsing, extensibility and minimal redundancy. It has served the biomedical community well and currently forms the backbone of most GO-based data analysis tools.
In parallel with the developments in bio-ontologies, ontologies in general have become more prevalent in information technology, with the most visible push coming from the W3C in the form of the Web Ontology Language, OWL (http://www.w3.org/TR/owl-ref/), as a standard for ontologies. There has also been a corresponding increase in the number, diversity and quality of the tools available to construct, maintain and view ontologies in OWL.
Ontologies in the OBO format typically lack computational definitions that differentiate a term from other similar terms. A computer is therefore unable to determine the meaning of a term, which presents problems for tools such as automated reasoners (Mungall, 2004). This lack of computational definitions leaves the task of maintaining ontology integrity entirely to the ontology developers. However, if the ontology language is unambiguous and expressive enough, then computer programs can parse it and perform basic maintenance. As pointed out by Mungall (2004), reasoners can be of enormous benefit in managing a complex ontology. A mapping between OBO format and OWL will enable users to leverage an increasing number of OWL tools, such as DL reasoners.
Once OBO ontologies are converted to OWL, they (1) are available to a wider user community, (2) can make use of reasoners for hierarchical classification and (3), when logical statements (such as necessary and sufficient definitions for classes) are added, can take full advantage of reasoners. The new OBO 1.2 format (not yet adopted by most OBO ontologies) can support some reasoning, but it lacks OWL expressiveness and a DL reasoner.
| 2 THE OBO CONVERTER TAB FOR PROTÉGÉ |
|---|
The OBO Converter Tab implements the mapping of OBO format files into OWL. It is based on the mapping provided by the National Center for Biomedical Ontology (NCBO, 2007) in a joint effort of OBO developers and ontology experts. It maps OBO semantic information into appropriate OWL constructs and OBO lexical information into predefined OWL annotation tags.
This Tab converts OBO format files into OWL and vice versa. COBrA (Aitken et al., 2004) is an OBO editor that provides comparable mapping functionality, but it does not support DL quantifiers or OBO 1.2, making it incompatible with the NCBO mapping.
The Tab has the following parts: a mapping program, an ontology interface to the Protégé OWL API, a graphical user interface (GUI) to work as a Protégé Tab and a command line interface to work as a stand-alone application. First, the mapping program parses an OBO 1.0 or 1.2 format file, using the OBO-Edit 1.002 parser, and then uses the OBO-Edit API to map it to a generic ontology, using the simplified ontology interface to the Protégé OWL API. This interface maps generic ontology constructs, such as class and property, to the OWL API. As the mapping does not need all the features of the OWL API, it makes sense to use just a subset. Using the mapping program and the simplified ontology interface, the OBO to OWL mapping task is broken into two steps:
- map OBO format to a simplified ontology description (using a Java interface); and
- map this simplified ontology description into OWL.
An advantage of this modular design is that the Java classes implementing the second step can be replaced by others that perform the same mapping to other formats or APIs, such as Jena.
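The two-step design can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the actual tool uses a Java interface on top of the Protégé OWL API; the names `SimpleOntology`, `InMemoryOntology` and `map_terms` are hypothetical and do not come from the OBO Converter code. The point is only that the first step talks to a small interface, so the backend behind it can be swapped.

```python
from abc import ABC, abstractmethod


class SimpleOntology(ABC):
    """Hypothetical simplified ontology interface (the target of step 1)."""

    @abstractmethod
    def add_class(self, name, label):
        """Register a class with a human-readable label."""

    @abstractmethod
    def add_subclass(self, child, parent):
        """Record that `child` is a subclass of `parent`."""


class InMemoryOntology(SimpleOntology):
    """Stand-in backend; a real one would call an OWL API (step 2)."""

    def __init__(self):
        self.labels = {}
        self.parents = {}

    def add_class(self, name, label):
        self.labels[name] = label
        self.parents.setdefault(name, [])

    def add_subclass(self, child, parent):
        self.parents.setdefault(child, []).append(parent)


def map_terms(terms, backend):
    """Step 1: push already-parsed terms through the interface only."""
    for term in terms:
        backend.add_class(term["id"], term["name"])
    for term in terms:
        for parent in term.get("is_a", []):
            backend.add_subclass(term["id"], parent)
    return backend


# Illustrative terms with made-up ids.
TERMS = [
    {"id": "EX:0000001", "name": "cell"},
    {"id": "EX:0000002", "name": "nurse cell", "is_a": ["EX:0000001"]},
]
ontology = map_terms(TERMS, InMemoryOntology())
```

Because `map_terms` depends only on `SimpleOntology`, a second backend class (for example one that writes to a different API) could be passed in without changing the first step.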
When saving OWL projects in the OBO format, the Tab uses this simplified ontology interface to read the OWL ontology from Protégé, uses the OBO-Edit API to map the OWL constructs into OBO format, and finally writes them as OBO format files.
The mapping allows the conversion of OWL files back into OBO without loss of information, provided they follow the mapping conventions. However, when mapping arbitrary OWL files to OBO format, the Tab will lose information that OBO cannot encode (as OWL is more expressive). Users who are comfortable with this information loss can use the Tab to force the conversion.
| 3 THE FEATURES OF THE TAB |
|---|
The OBO Converter Tab has two main panels, one to read OBO files and one to save them. The save operation is straightforward as the user chooses the file name and the conversion is done. The read operation has the same functionality plus a set of options that can alter the way an OBO file is read (Fig. 1).
3.1 Class name generation
A combo box lets users choose one of three ways in which the OWL class names are generated from the OBO format terms:
- OBO id: this option will generate the name from the OBO term id. This is the default option and generates the OWL id in the way described in the mapping (NCBO, 2007).
- Class name: this option will generate the name from the OBO term name. This has to be used with care because the names are not required to be unique. If the names are not unique, there will be a parser error.
- Class name + OBO id: this option will generate the name from the combination of the OBO term name and id.
In all cases, characters other than letters (a–z, A–Z) or numbers are converted to underscore characters (_), e.g. the OBO term name nurse cell is converted to the OWL class name nurse_cell.
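The naming rules above can be sketched as follows. This is an illustration under stated assumptions, not the Tab's implementation: the function names and the `mode` values are hypothetical, the example ids are made up, and the way the name and id are joined in the third option (name first, underscore separator) is an assumption, since the text does not specify it.

```python
import re


def sanitize(text):
    """Replace every character that is not a-z, A-Z or 0-9 with '_'."""
    return re.sub(r"[^A-Za-z0-9]", "_", text)


def owl_class_name(term_id, term_name, mode="id"):
    """Build an OWL class name from an OBO term (hypothetical helper)."""
    if mode == "id":          # default option: name from the OBO id
        return sanitize(term_id)
    if mode == "name":        # name from the OBO term name
        return sanitize(term_name)
    if mode == "name+id":     # assumed join order and separator
        return sanitize(f"{term_name}_{term_id}")
    raise ValueError(f"unknown mode: {mode!r}")


def generate_names(terms, mode="id"):
    """Map term id -> class name, failing on duplicate class names."""
    seen = {}
    for term_id, term_name in terms:
        owl_name = owl_class_name(term_id, term_name, mode)
        if owl_name in seen:
            raise ValueError(
                f"duplicate class name {owl_name!r} "
                f"for {seen[owl_name]} and {term_id}"
            )
        seen[owl_name] = term_id
    return {term_id: owl_name for owl_name, term_id in seen.items()}
```

The duplicate check mirrors the warning about the Class name option: two terms with the same name produce the same class name, which this sketch reports as an error.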
The default behavior of the Tab is to generate the OWL class names from the OBO id. If users want to see OBO names, rather than opaque ids, as class identifiers, Protégé has an option to display the OWL class label as the identifier. As OWL labels can also carry language identifiers (such as en for English), converted OBO ontologies can have names in different languages that all point to the same entities, allowing for language localization while preserving the OBO ids.
The other options are aimed at users who want to create their ontology using a specific OBO ontology as a starting point, but want to name their entities in a different way. Using these options does not guarantee naming compatibility with the OBO format.
3.2 Exclusion of OBO namespaces
In OBO format, it is possible to define more than one ontology in a file using the OBO namespace: tag (note that OBO namespaces have no connection with OWL namespaces). For instance, the Gene Ontology actually contains three independent ontologies defined in the same file (gene_ontology.obo) using three different namespaces (biological_process, molecular_function and cellular_component). When the Tab reads such a file, the user can specify which namespaces should be read from it. If no choice is made, the Tab collapses all ontologies into one (namespace information is also always stored as an annotation).
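Namespace selection can be pictured with a small sketch. It assumes a very reduced reading of the OBO format (only [Term] stanzas with simple `tag: value` lines) and ignores details such as a default namespace declared in the file header; `parse_terms` and `select_namespaces` are hypothetical names, and the sample ids are made up.

```python
def parse_terms(obo_text):
    """Return one dict (tag -> list of values) per [Term] stanza."""
    terms, current = [], None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and comment lines
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ":" in line:
            tag, value = line.split(":", 1)
            current.setdefault(tag.strip(), []).append(value.strip())
    return terms


def select_namespaces(terms, wanted=None):
    """Keep terms whose namespace is in `wanted`; keep all if none chosen."""
    if not wanted:
        return list(terms)
    return [t for t in terms if t.get("namespace", [None])[0] in wanted]


SAMPLE = """format-version: 1.2

[Term]
id: EX:0000001
name: example process
namespace: biological_process

[Term]
id: EX:0000002
name: example function
namespace: molecular_function

[Typedef]
id: part_of
name: part of
"""

all_terms = parse_terms(SAMPLE)
functions_only = select_namespaces(all_terms, {"molecular_function"})
```

Calling `select_namespaces(all_terms)` with no selection returns every term, which corresponds to collapsing all ontologies in the file into one.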
3.3 Default namespace URI
It is the author's responsibility to set the default namespace URI for the OWL ontology, because that information is not available in the OBO format file. The Tab has a panel where the author can enter this URI. If the author does not enter one, whatever default URI Protégé is using becomes the default namespace URI.
| 4 THE ROUND TRIP TEST |
|---|
The round trip test is done by reading an OBO file, transforming it to OWL, saving it back to OBO and checking whether the two files are equal (using the OBO-Edit Diff tool). To make the testing more thorough, we wrote a test suite that tests each OBO file available in the OBO repository (any file ending in .obo).
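The idea of a round trip check can be sketched as follows. This is a toy illustration, not the test suite described here: it compares parsed stanzas rather than using the OBO-Edit Diff tool, it handles only a tiny subset of the OBO syntax, and the converters (`to_model`, `from_model`, `lossy_to_model`) are hypothetical stand-ins for the OBO to OWL mapping and its inverse.

```python
def parse_obo(text):
    """Return a list of (stanza_type, [(tag, value), ...]) tuples."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        if line.startswith("[") and line.endswith("]"):
            stanzas.append((line[1:-1], []))
        elif stanzas and ":" in line:
            tag, value = line.split(":", 1)
            stanzas[-1][1].append((tag.strip(), value.strip()))
    return stanzas


def write_obo(stanzas):
    """Serialize stanzas back to OBO-like text."""
    blocks = []
    for kind, pairs in stanzas:
        lines = [f"[{kind}]"] + [f"{tag}: {value}" for tag, value in pairs]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def to_model(stanzas):
    """Stand-in for the forward mapping (keeps every tag)."""
    return [{"type": kind, "tags": list(pairs)} for kind, pairs in stanzas]


def lossy_to_model(stanzas):
    """Stand-in for a defective mapping that drops 'def' tags."""
    return [
        {"type": kind, "tags": [p for p in pairs if p[0] != "def"]}
        for kind, pairs in stanzas
    ]


def from_model(model):
    """Stand-in for the reverse mapping."""
    return [(entry["type"], list(entry["tags"])) for entry in model]


def round_trip_equal(text, convert, convert_back):
    """Convert, convert back, re-serialize, and compare with the input."""
    original = parse_obo(text)
    restored = parse_obo(write_obo(convert_back(convert(original))))
    return original == restored


SAMPLE = """[Term]
id: EX:0000001
name: cell
def: "An illustrative definition." []

[Term]
id: EX:0000002
name: nurse cell
is_a: EX:0000001
"""
```

With these stand-ins, `round_trip_equal(SAMPLE, to_model, from_model)` holds, while swapping in `lossy_to_model` makes the check fail, which is the kind of discrepancy a round trip test is meant to expose.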
Using this test suite, we were able to discover and correct problems in the mapping. It allowed us to identify features of the OBO format that were underspecified (this tool was actually used to develop the NCBO mapping). We also found defective ontology files among the 42 OBO files on the site at the time of testing (27 November 2006): two were not readable (mosquito_anatomy.obo and seb.obo); three were read correctly but had errors detected (fly_development.obo, psi-mi.obo and psi-mod.obo); and four had illegal characters (zea_mays_anatomy.obo, image.obo, fly_anatomy.obo and human-dev-anat-abstract.obo).
| 5 CONCLUSION |
|---|
The test suite allowed rigorous testing of the Tab against a large number of ontologies, demonstrating robustness and translation without loss of information. The Tab was also used to easily test mapping changes against a large number of ontologies, during the mapping creation. The possibility of having OBO bio-ontologies in OWL with just a mouse click is the main contribution of this tool.
| ACKNOWLEDGEMENTS |
|---|
We would like to acknowledge Nigam Shah for his useful opinions about this work. This work was funded by NIH grant U54 HG004028 and a grant from CAPES-Brazil.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Dmitrij Frishman
Received on February 21, 2007; revised on May 7, 2007; accepted on May 8, 2007
| REFERENCES |
|---|
Aitken S, et al. (2004) COBrA: a bio-ontology editor. Bioinformatics, 21, 825–826. Epub 28 October 2004.
Ashburner M, et al. (2000) Gene Ontology: tool for the unification of biology. Nat. Genet., 25, 25–29.
GO (2004) The OBO Flat File Format Specification, Version 1.2. http://www.geneontology.org/GO.format.obo-1_2.shtml
Mungall CJ (2004) Obol: integrating language and meaning in bio-ontologies. Comp. Funct. Genom., 5, 509–520.
NCBO (2007) OBO in OWL: Mapping and Tools, February 12. http://www.bioontology.org/wiki/index.php?title=OboInOwl:Main_Page