Bioinformatics Advance Access originally published online on December 1, 2007
Bioinformatics 2008 24(4):588-590; doi:10.1093/bioinformatics/btm581
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

KEGGanim: pathway animations for high-throughput data

Priit Adler 1,†, Jüri Reimand 2,†, Jürgen Jänes 2, Raivo Kolde 3, Hedi Peterson 1,3 and Jaak Vilo 1,2,3,*

1 Estonian Biocentre, Riia 23b, 2 University of Tartu, Institute of Computer Science, Liivi 2 and 3 QureTec Inc., Ülikooli 6a, Tartu, Estonia

*To whom correspondence should be addressed.


    ABSTRACT

Motivation: Gene expression analysis with microarrays has become one of the most widely used high-throughput methods for gathering genome-wide functional data. Emerging -omics fields such as proteomics and interactomics introduce new information sources. With the rise of systems biology, researchers need to concentrate on entire complex pathways that guide individual genes and related processes. Bioinformatics methods are needed to link the existing knowledge about pathways with the growing amounts of experimental data.

Results: We present KEGGanim, a novel web-based tool for visualizing experimental data in biological pathways. KEGGanim produces animations and images of KEGG pathways using public or user-uploaded high-throughput data. Pathway members are coloured according to experimental measurements, and animated over experimental conditions. KEGGanim visualization highlights dynamic changes over conditions and allows the user to observe important modules and key genes that influence the pathway. The simple user interface of KEGGanim provides options for filtering genes and experimental conditions. KEGGanim may be used with public or private data for 14 organisms, with a large collection of public microarray data readily available. Most common gene and protein identifiers and microarray probesets are accepted as visualization input.

Availability: http://biit.cs.ut.ee/KEGGanim/

Contact: vilo@ut.ee


    1 INTRODUCTION
High-throughput methods such as microarrays have changed the research pace in molecular biology. Thousands of genes and proteins are now routinely studied under experimental conditions (Buck et al., 2004; Eads et al., 2000; Schena et al., 1995), with results stored in public microarray databases like GEO (Barrett et al., 2007) and ArrayExpress (Parkinson et al., 2007), and protein databases like Pride (Jones et al., 2006).

As genomic and proteomic data accumulate, researchers envisage complex systems behind biological processes and functions. Genes and proteins rarely operate alone in the cell, but are regulated by elaborate mechanisms and bound into networks (Alon, 2007). Systems biology approaches are applied to view these networks in detail. Well-studied parts of networks called pathways have roles in cell signaling, gene regulation and metabolism as well as human disease. Pathways are described in databases like KEGG (Kanehisa et al., 2006) and Reactome (Vastrik et al., 2007).

Knowledge of pathways stored in databases is often far from complete. Bioinformatics methods are needed that combine various experimental data to verify existing knowledge and propose new hypotheses. Visualization has a key role in understanding complex and dynamical phenomena of pathways, proteomics and gene expression. Several efforts have been made in this area, but there is still a need for interactive web-based pathway resources. For example, KEGG allows the user to colour genes on the pathway. Reactome Skypainter and PathwayExpress (Khatri et al., 2007) link genes to pathways using overrepresentation analysis. However, these tools have no means for directly incorporating experimental data. Also, fixed images fail to deliver the temporal and spatial dynamics behind pathways and gene expression. GenMAPP (Dahlquist et al., 2002) and BioCyc Pathway Tools (Paley et al., 2006) produce user-defined pathways and cellular wiring diagrams, and allow inclusion of expression data with some animation capabilities. No such systematic visualization functions are available for the comprehensive KEGG resource.


    2 APPROACH
KEGGanim is a novel web-based visualization tool that links manually curated pathway maps from KEGG with experimental data from sources like gene expression and proteomics. KEGGanim shows animated figures of pathways with genes and proteins depicted as coloured rectangles. Pathway members are painted red or green according to their experimental values in the given dataset. Animation changes the colour values of these rectangles while looping over experimental conditions in the dataset, for instance moments in a timeseries (Spellman et al., 1998), healthy and diseased samples (Alon et al., 1999), or samples of healthy tissues (Ge et al., 2005).
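The colouring and looping described above can be sketched roughly as follows. This is only an illustration, not KEGGanim's actual code: the function names, the choice of red for positive and green for negative values, white for zero, and the linear scaling by the largest absolute value are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]


def value_to_rgb(value: float, max_abs: float) -> RGB:
    """Map a value to a colour (assumed scheme, not taken from the paper).

    Positive values shade towards red, negative values towards green,
    and zero stays white. Intensity grows linearly up to max_abs.
    """
    if max_abs <= 0:
        return (255, 255, 255)
    # Clamp the relative magnitude to the range [0, 1].
    intensity = min(abs(value) / max_abs, 1.0)
    fade = int(round(255 * (1.0 - intensity)))
    if value > 0:
        return (255, fade, fade)
    if value < 0:
        return (fade, 255, fade)
    return (255, 255, 255)


def animation_frames(
    matrix: Dict[str, List[float]], conditions: List[str]
) -> List[Dict[str, RGB]]:
    """Build one colour assignment per condition, in condition order.

    matrix maps a pathway member to its values, one per condition.
    Each returned frame maps every pathway member to its colour.
    """
    max_abs = max(abs(v) for row in matrix.values() for v in row)
    frames = []
    for i in range(len(conditions)):
        frame = {member: value_to_rgb(row[i], max_abs)
                 for member, row in matrix.items()}
        frames.append(frame)
    return frames
```

Showing the frames one after another, and repeating from the first, gives the looping effect; a real implementation would additionally draw each colour into the rectangle of the corresponding pathway member on the KEGG map.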

KEGGanim allows a researcher to observe expression and protein production dynamics in the context of pathway dependencies. Animating a pathway over consecutive timepoints reflects the behaviour of master regulatory genes, propagation of signals in the pathway over time, and the avalanche of up- and downregulation caused by the master regulator. When analysing a set of conditions or tissues on a microarray, KEGGanim allows the user to reason about tissue specificity or the influence of global conditions on the pathway and its components. Figure 1 shows an example of KEGGanim output.


Fig. 1. Example of KEGGanim animated pathways: cinefilm snapshots display the expression of genes involved in the apoptosis pathway during myocardial remodelling. The snapshots show pathway behaviour in three samples of left ventricle under acute ischaemic (left), non-ischaemic (middle) and ischaemic (right) conditions. Ischaemia, a common cause of heart disease, is a decrease in blood supply that leads to apoptosis and eventually to tissue death.

 

    3 METHODS
KEGGanim combines KEGG pathway data with a matrix of experimental values of genes and proteins. First, the user needs to select a pathway of interest from a dropdown menu in the web interface, which corresponds to a graphical map downloaded from the KEGG database.

The second input is a matrix containing experimental values for genes and proteins. A number of gene expression datasets from GEO and ArrayExpress are available in KEGGanim for immediate analysis. KEGGanim automatically fetches all associations to the genes in the pathway from the g:Profiler software (Reimand et al., 2007), and creates an animation of the related experimental values over different conditions. If several probesets or proteins match a pathway member, the corresponding node is split into smaller coloured areas to reflect the different experimental values. Users can also upload their own data for analysis and visualization. Most common gene and protein IDs and microarray probesets are accepted as input, for instance standard names, RefSeq, Entrez, Affymetrix, UniProt, EnsEMBL as well as species-specific IDs. Uploaded data are optionally centred and normalized, and missing values may be replaced with fixed values or estimated with the kNN method (Troyanskaya et al., 2001) as implemented in GEPAS (Montaner et al., 2006).
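To make the missing-value step more concrete, a simplified nearest-neighbour imputation might look like the sketch below. It is not the GEPAS implementation: the distance (root mean squared difference over columns observed in both rows), the unweighted mean of the k nearest rows, the default k and the fixed fallback value are all assumptions chosen to keep the example short, and the function names are hypothetical.

```python
import math
from typing import List, Optional

Row = List[Optional[float]]


def row_distance(a: Row, b: Row) -> Optional[float]:
    """Root mean squared difference over columns observed in both rows.

    Returns None when the two rows share no observed column.
    """
    shared = [(x, y) for x, y in zip(a, b)
              if x is not None and y is not None]
    if not shared:
        return None
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(matrix: List[Row], k: int = 2,
               fallback: float = 0.0) -> List[List[float]]:
    """Fill each missing value (None) from the k most similar rows.

    Only rows that have the missing column observed are candidates.
    If no candidate exists, the fixed fallback value is used instead
    (an assumption of this sketch).
    """
    filled = []
    for i, row in enumerate(matrix):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue
                d = row_distance(row, other)
                if d is not None:
                    candidates.append((d, other[j]))
            # Sort by distance only; ties keep their original order.
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            new_row[j] = sum(nearest) / len(nearest) if nearest else fallback
        filled.append(new_row)
    return filled
```

Distances are always computed on the original matrix, so values imputed earlier in the pass do not influence later ones.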

Additional options help to interpret the animations and concentrate on specific conditions or components. KEGGanim tooltips display names and descriptions of genes and proteins when the user hovers over corresponding pathway members. Lineplots display the amounts of proteins or the gene expression levels. The user can narrow down the study by selecting a subset of conditions to view. Experimental values for irrelevant pathway members and related probesets may also be excluded from the animation. The cinefilm feature overcomes the technical difficulty of including animations in printed materials by allowing the user to extract pathway snapshots of timepoints or conditions into a separate image (Fig. 1). These features are especially useful in visualizing timeseries data. The GIF animations produced by KEGGanim do not require special software packages for viewing, and can easily be inserted into presentation slides, web pages, tutorials, etc.
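The filtering options described above amount to selecting columns (conditions) and dropping rows (pathway members) of the data matrix before the frames are drawn. A minimal sketch, assuming the matrix is held as a mapping from pathway member to a list of values; the function name `select_view` and its parameters are hypothetical, not part of KEGGanim:

```python
from typing import Dict, FrozenSet, List, Sequence


def select_view(
    matrix: Dict[str, List[float]],
    conditions: Sequence[str],
    keep_conditions: Sequence[str],
    exclude_members: FrozenSet[str] = frozenset(),
) -> Dict[str, List[float]]:
    """Restrict a data matrix to chosen conditions and pathway members.

    The order of keep_conditions defines the order of the snapshots,
    so the same call can reorder conditions for a cinefilm-style strip.
    """
    unknown = [c for c in keep_conditions if c not in conditions]
    if unknown:
        raise ValueError(f"unknown conditions: {unknown}")
    columns = [list(conditions).index(c) for c in keep_conditions]
    return {
        member: [row[i] for i in columns]
        for member, row in matrix.items()
        if member not in exclude_members
    }
```

Each remaining column then corresponds to one animation frame, or to one snapshot when the frames are laid out side by side for print.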


    4 DISCUSSION
Advancing high-throughput technologies allow researchers to gather information about organizational, functional and physical layers of the cell. There is an increasing need for ideas that successfully integrate layers of data and explain the elaborate mechanisms responsible for creating the observed measurements. With the development of KEGGanim, we wish to contribute to the data integration goal and provide methods that take advantage of the powerful human visual analysis skill.


    5 CONCLUSION
KEGGanim is a simple web-based visualization tool that links manually curated KEGG pathway maps with high-throughput data. The tool creates animations that allow intuitive visual analysis of condition- or tissue-specific changes in gene expression or protein levels within the selected pathway. KEGGanim is already actively used in several research initiatives, e.g. in functional profiling of mouse embryonic stem cell development.


    ACKNOWLEDGEMENTS
This research has been supported by the EU FP6 grants ENFIN LSHG-CT-2005-518254, FunGenES LSHG-CT-2003-503494 and Estonian Science Foundation ETF5724. The authors would like to thank Dr N. Billon, M. Kull, J. Hansen and the reviewers of this manuscript.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Olga Troyanskaya

†The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

Received on August 5, 2007; revised on October 22, 2007; accepted on November 19, 2007

    REFERENCES

    Alon U. Network motifs: theory and experimental approaches. Nat. Rev. Genet (2007) 8:450–61.

    Alon U, et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl Acad. Sci. USA (1999) 96:6745–50.

    Barrett T, et al. NCBI GEO: mining tens of millions of expression profiles – database and tools update. Nucleic Acids Res (2007) D35:D760–D765.

    Buck MJ, et al. ChIP-chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics (2004) 83:349–360.

    Dahlquist KD, et al. GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways. Nat. Genet (2002) 31:19–20.

    Eads CA, et al. MethyLight: a high-throughput assay to measure DNA methylation. Nucleic Acids Res (2000) 28:E32.

    Ge X, et al. Interpreting expression profiles of cancers by genome-wide survey of breadth of expression in normal tissues. Genomics (2005) 86:127–141.

    Jones P, et al. PRIDE: a public repository of protein and peptide identifications for the proteomics community. Nucleic Acids Res (2006) D34:D659–D663.

    Kanehisa M, et al. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res (2006) D34:D354–D357.

    Khatri P, et al. Onto-Tools: new additions and improvements in 2006. Nucleic Acids Res (2007) W35:W206–W211.

    Montaner D, et al. Next station in microarray data analysis: GEPAS. Nucleic Acids Res (2006) W34:W486–W491.

    Paley S, et al. The Pathway Tools cellular overview diagram and Omics Viewer. Nucleic Acids Res (2006) 34:3771–3778.

    Parkinson H, et al. ArrayExpress – a public database of microarray experiments and gene expression profiles. Nucleic Acids Res (2007) D35:D747–D750.

    Reimand J, et al. g:Profiler – a web-based toolset for functional profiling of gene lists from large-scale experiments. Nucleic Acids Res (2007) W35:W193–W200.

    Schena M, et al. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science (1995) 270:467–470.

    Spellman PT, et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell (1998) 9:3273–3297.

    Troyanskaya O, et al. Missing value estimation methods for DNA microarrays. Bioinformatics (2001) 17:520–525.

    Vastrik I, et al. Reactome: a knowledge base of biologic pathways and processes. Genome Biol (2007) 8:R39.

