Bioinformatics Advance Access originally published online on February 21, 2008
Bioinformatics 2008 24(7):1026-1028; doi:10.1093/bioinformatics/btn068

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

GOTreePlus: an interactive gene ontology browser

Bongshin Lee 1, Kristy Brown 2, Yetrib Hathout 2 and Jinwook Seo 2,*

1Microsoft Research, One Microsoft Way, Redmond, WA 98052 and 2Children's National Medical Center, 111 Michigan Ave, NW, Washington, DC 20010, USA

*To whom correspondence should be addressed.


    ABSTRACT

Summary: We developed an interactive gene ontology (GO) browser named GOTreePlus that superimposes annotation information over GO structures. It facilitates the identification of important GO terms by visualizing them interactively within the GO structure. An interactive pie chart summarizing the annotation distribution for a selected GO term gives users a succinct, context-sensitive overview of their experimental results. We tested GOTreePlus on a proteome profiling dataset, obtained from the differentiation of retinal pigment epithelial cells, in which 399 proteins were quantified.

Availability: http://bioinformatics.cnmcresearch.org/GOTreePlus/

Contact: jseo{at}cnmcresearch.org


    1 INTRODUCTION
Hypothesis generation and testing in biology now involve informatics tasks because of the large volume of data generated by cutting-edge techniques. Microarray techniques increased data resolution to tens of thousands of features per chip. Recent single nucleotide polymorphism chips push the limit even further, to one million features per chip. Mass spectrometry data in the proteomics field also stretch the limit. As datasets become larger, it becomes more challenging to extract global information on the underlying biochemical pathways and biological processes. To deal with such large datasets, it is essential to aggregate or summarize the initial dataset in a universal language so that future testing can focus on the parts of the data most relevant to the aims of the project. This has led biomedical researchers to project their datasets onto the gene ontology (GO) to reveal the overall meaning of their data and to set more focused hypotheses.

GO enrichment tools such as GoMiner (Zeeberg et al., 2003) and the Database for Annotation, Visualization, and Integrated Discovery (DAVID; Dennis et al., 2003) systematically sort the massive amount of GO data into a more meaningful and focused format based on various enrichment algorithms. However, it is still challenging to explore the enrichment results effectively over the GO structure. This is due to the size and complexity of the GO data structure and the lack of visualization tools for efficiently browsing and searching the structure in combination with experimental data.

GO contains ontology terms organized as a Directed Acyclic Graph (DAG), which is more complex than a tree structure because of its cross links. Various GO browsers help interpret microarray and proteomics data using the GO structure (see www.geneontology.org/GO.tools.browsers.shtml). Most of them are text-based tools that show the structure using a simple tree control, which places a ‘+’ or ‘–’ sign in front of each term and indents terms to show different levels. There are some graphical browsers, but they still do not support effective user interactions. The lack of effective navigation in these tools prompted the use of tools equipped with interactive visualization techniques. Baehrecke et al. (2004) incorporated a well-known 2D space-filling visualization called ‘Treemap’ to provide an intuitive graphical overview of the raw dataset. Treemap improved the way biologists interpret their datasets. However, since it is not trivial to show the hierarchical structure in a Treemap, users often struggle to grasp the underlying structure of the GO.

A different approach can be used to show expression/proteome profiling data with the GO annotation in a generalized Venn diagram (Kestler et al., 2005). Each GO term is represented as a circle whose size is proportional to the number of proteins/genes mapped to that term. It shows intersections between GO terms as well as a graphical overview of the GO association with the input dataset. But again the hierarchical structure of the GO was not incorporated in this visualization.

To address these problems, we developed an interactive GO browser called GOTreePlus (Fig. 1) by improving TreePlus (Lee et al., 2006). It visualizes the GO DAG as a tree and provides interactive zoom and pan. GOTreePlus maps proteome profiling data to the GO terms and visualizes them over the GO structure. It also provides, in a pie chart, a succinct context-sensitive overview of the annotation distribution over the child nodes of a selected node (Fig. 2). We believe GOTreePlus could help users better understand their genomic or proteomic datasets.


Fig. 1. GOTreePlus consists of two lists (GO terms list and Proteins/Genes list) on the left and the TreePlus control on the right. An online user manual with high-resolution figures is available at http://bioinformatics.cnmcresearch.org/GOTreePlus/.

 

Fig. 2. Annotation distribution for ‘biological process’ node. By selecting ‘Show annotation distribution’ from the pop up menu, users can see the overall annotation distribution of all child nodes of the selected node using a pie chart.

 

    2 FEATURES AND FUNCTIONALITIES
GOTreePlus (Fig. 1) consists of the GO terms list, the proteins/genes list and the TreePlus control. When users open a file containing a list of proteins/genes and an annotation file from the GO annotations download page (www.geneontology.org/GO.current.annotations.shtml), the number of annotations for each GO term is computed and shown in the GO structure using the TreePlus control. Each node representing a GO term has six attributes: name, ID, number of its own annotations, sum of the number of its descendants’ and its own annotations, average value of the proteins mapped to this node and average value of the proteins mapped to it or its descendants. Since the nodes in the TreePlus control are, by default, sorted by the sum of the number of their own annotations, users can easily see which GO term is most relevant in their data. Each node has a colored dot that shows up- or down-regulation of the proteins/genes mapped to that node: red indicates up-regulation and green indicates down-regulation (Fig. 1).
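The two count attributes described above can be illustrated with a minimal Python sketch. It is not taken from GOTreePlus: the function name, the input dictionaries and the toy term names are hypothetical, and it assumes that a descendant reachable through several parents is counted only once.

```python
from collections import defaultdict

def annotation_counts(parents, annotations):
    """Return (own, total) annotation counts per term.

    parents:     term -> list of parent terms (hypothetical input format)
    annotations: protein -> set of terms it is annotated to
    Assumption: each descendant is counted once, even if reachable by
    several paths in the DAG.
    """
    # Invert the child -> parents map into parent -> children.
    children = defaultdict(set)
    for term, term_parents in parents.items():
        for parent in term_parents:
            children[parent].add(term)

    # Own count: number of proteins annotated directly to the term.
    own = defaultdict(int)
    for terms in annotations.values():
        for term in terms:
            own[term] += 1

    def descendants(term):
        # Iterative depth-first walk collecting each descendant once.
        seen, stack = set(), [term]
        while stack:
            node = stack.pop()
            for child in children.get(node, ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    all_terms = set(parents) | set(children) | set(own)
    total = {
        t: own.get(t, 0) + sum(own.get(d, 0) for d in descendants(t))
        for t in all_terms
    }
    return dict(own), total

# Toy DAG: D has two parents (B and C), both children of A.
parents = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
annotations = {"p1": {"B"}, "p2": {"D"}, "p3": {"C", "D"}}
own, total = annotation_counts(parents, annotations)
```

In this toy example the cumulative count of A includes D once, although D is reachable through both B and C. Whether GOTreePlus counts shared descendants once or once per path is not stated in the text.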

Since the ontologies are structured as DAGs, it is common for a node to have more than one parent. GOTreePlus can visualize multiple parent nodes in a node-link diagram with color-coded edges. When users click on a GO term node, all child nodes are shown as outgoing edges with blue arrows and all parent nodes are shown as incoming edges with red arrows. Any structural change required to show the DAG as a tree is smoothly animated to help users follow the change. Unlike other GO browsers, GOTreePlus enables users to select any GO term and make it the root node to start a focused exploration from that node. Users can also select any node to see a localized overview of the annotation distribution over its child nodes in a standard pie chart with a coordinated list view (Fig. 2).
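One generic way to show a DAG as a tree from an arbitrary root is a breadth-first traversal that keeps the first edge reaching each node and records the remaining edges as cross links. The Python sketch below illustrates that idea only; it is an assumption, not the layout algorithm of TreePlus, and the function name and toy data are hypothetical.

```python
from collections import deque

def tree_from_dag(children, root):
    """Build a spanning tree of the sub-DAG below `root`.

    children: term -> iterable of child terms (hypothetical input format)
    Returns (tree_parent, cross_links): tree_parent maps each reachable
    node to its parent in the tree, and cross_links lists the DAG edges
    that were left out of the tree.
    """
    tree_parent = {root: None}
    cross_links = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Sorted only to make the result deterministic.
        for child in sorted(children.get(node, ())):
            if child in tree_parent:
                # Already placed in the tree via another parent.
                cross_links.append((node, child))
            else:
                tree_parent[child] = node
                queue.append(child)
    return tree_parent, cross_links

# Toy DAG: D has two parents, B and C.
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
full_tree, full_cross = tree_from_dag(children, "A")
# Re-rooting at C restricts the view to C and its descendants.
sub_tree, sub_cross = tree_from_dag(children, "C")
```

With the root at A, D is attached under B and the edge from C to D is kept as a cross link, which a viewer could draw as an extra colored edge. Re-rooting at C yields a smaller tree with no cross links.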

GOTreePlus provides a way to search for a specific GO term: a simple substring match either by term or by ID. Search results are shown in the GO terms list. When users select a GO term from the list, the selected term is shown in the TreePlus control. Furthermore, with the ‘GO Terms’ radio button selected, the proteins/genes list is updated with the proteins/genes associated with the selected GO term. The number of proteins/genes in the proteins/genes list is also updated and shown next to the ‘Proteins/Genes’ radio button. Similarly, users can search for proteins by name or by symbol. When users select a protein/gene from the list, all GO terms related to the selected protein/gene are shown in the GO structure in the TreePlus control. If the ‘Proteins/Genes’ radio button is selected, the GO terms list is updated with the GO terms associated with the selected protein/gene.
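The substring search described above can be sketched in a few lines of Python. The function name and the dictionary layout are hypothetical, and case-insensitive matching is an assumption; the text only says that the match is a simple substring match on the term or the ID.

```python
def search_terms(terms, query):
    """Return the IDs of GO terms whose ID or name contains `query`.

    terms: GO ID -> term name (hypothetical input format)
    Assumption: matching ignores case and surrounding whitespace.
    """
    needle = query.strip().lower()
    if not needle:
        return []
    return sorted(
        go_id
        for go_id, name in terms.items()
        if needle in go_id.lower() or needle in name.lower()
    )

terms = {
    "GO:0008150": "biological_process",
    "GO:0005575": "cellular_component",
    "GO:0007049": "cell cycle",
}
by_name = search_terms(terms, "cell")    # matches two term names
by_id = search_terms(terms, "0008150")   # matches one ID
```

The same function could serve the protein search by passing a mapping from protein symbols to protein names instead of GO IDs to term names.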

In summary, GOTreePlus has the following distinctive features:

  • Visualize GO DAG structure as a tree with smooth animations
  • Explore GO with any selected GO node as a root node
  • Provide a context-sensitive overview of annotation distribution of children nodes of an interactively selected GO node in a pie chart
  • Search for a GO term in GO and its associated proteins
  • Search for a protein in user dataset and its associated GO terms in GO
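The annotation distribution shown in the pie chart can be thought of as each child's share of the annotations under the selected node. The Python sketch below illustrates this under stated assumptions: the function name, the input format and the counts are hypothetical, and it uses cumulative annotation counts per child, which the text does not specify.

```python
def child_distribution(children, total, term):
    """Return each child's share of annotations under `term`.

    children: term -> iterable of child terms (hypothetical input format)
    total:    term -> annotation count (assumed here to be cumulative)
    Children without annotations are omitted from the result.
    """
    counts = {child: total.get(child, 0) for child in children.get(term, ())}
    grand_total = sum(counts.values())
    if grand_total == 0:
        return {}
    return {child: n / grand_total for child, n in counts.items() if n > 0}

# Hypothetical counts, chosen only for illustration.
children = {
    "biological_process": [
        "cellular process",
        "metabolic process",
        "response to stimulus",
        "growth",
    ]
}
total = {"cellular process": 6, "metabolic process": 3, "response to stimulus": 1}
shares = child_distribution(children, total, "biological_process")
```

Because a protein may be annotated to several child terms, these shares describe annotations rather than distinct proteins.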


    3 APPLICATIONS
We used proteome profiling data obtained on dividing versus confluent human retinal pigment epithelial (RPE) cells to show the utility of GOTreePlus. Proteome profiling of dividing versus resting human RPE cells was obtained using the stable isotope labeling by amino acids in cell culture (SILAC) strategy in combination with shotgun proteome profiling (Hathout et al., 2005). Labeled dividing cells were mixed at a 1:1 ratio with unlabeled resting cells. Total cytosolic proteins were extracted and digested with trypsin, and the resulting peptides were analyzed by 2D chromatography coupled to a linear trap quadrupole (LTQ) ion trap mass spectrometer. A total of 399 proteins were identified and quantified in this study.

The bottleneck is how to examine these data and extract useful information based on GO terms and differential expression. One can rank proteins into up-regulated and down-regulated groups and look at their function and subcellular localization one by one. However, this is time-consuming, and the overall underlying biological process may be overlooked. By mapping this dataset to GO using GOTreePlus, we could easily follow the overall function and biological process underlying the global proteome profiling data obtained on dividing versus resting RPE cells. Most of the up-regulated proteins in dividing cells were found directly involved in cell cycle, synthesis and biogenesis of cell components, while the down-regulated proteins in resting cells were involved in actin remodeling and stress management.

The ontology terms that we can extract from this application directly reflect the biological status of the system studied. One can select all the proteins that are up-regulated in dividing cells and check whether they have common ontology terms. The exploratory nature of using the GO structure makes the interactive graph visualization methods in GOTreePlus most useful. Users can delve into the sublevels of a GO term with annotated protein information to gain deeper insight into their datasets.


    ACKNOWLEDGEMENTS
This work was supported by NIH 5R24HD050846-02 Integrated molecular core for rehabilitation medicine, and NIH 1P30HD40677-01 (MRDDRC Genetics Core).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: John Quackenbush

Received on December 13, 2007; revised on January 13, 2008; accepted on February 17, 2008

    REFERENCES

    Baehrecke EH, et al. Visualization and analysis of microarray and gene ontology data with treemaps. BMC Bioinformatics (2004) 5:84.

    Dennis G, et al. DAVID: Database for annotation, visualization, and integrated discovery. Genome Biol (2003) 4:R60.

    Hathout Y, et al. Metabolic labeling of human primary retinal pigment epithelial cells for accurate comparative proteomics. J Proteome Res (2005) 4:620–627.

    Kestler HA, et al. Generalized Venn diagrams: a new method of visualizing complex genetic set relations. Bioinformatics (2005) 21:1592–1595.

    Lee B, et al. TreePlus: interactive exploration of networks with enhanced tree layouts. IEEE TVCG (2006) 12:1414–1426.

    Zeeberg BR, et al. GoMiner: a resource for biological interpretation of genomic and proteomic data. Genome Biol (2003) 4:R28.




This article has been cited by other articles:

C. Herrmann, S. Berard, and L. Tichit
SimCT: a generic tool to visualize ontology-based relationships for biological objects
Bioinformatics, December 1, 2009; 25(23): 3197-3198.

