Bioinformatics Advance Access originally published online on June 28, 2007
Bioinformatics 2007 23(19):2631-2632; doi:10.1093/bioinformatics/btm333
© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Eu.Gene Analyzer a tool for integrating gene expression data with pathway databases

Duccio Cavalieri 1,*, Cinzia Castagnini 1,†, Simona Toti 1,2,†, Karolina Maciag 1, Thomas Kelder 3, Luca Gambineri 4, Samuele Angioli 4 and Piero Dolara 1

1Department of Preclinical and Clinical Pharmacology, 2Department of Statistics, University of Florence, Florence, Italy, 3BIGCAT Bioinformatics, University of Maastricht, Maastricht, The Netherlands and 4Inspect.it, Capolona (Arezzo), Italy

*To whom correspondence should be addressed.


    ABSTRACT

Motivation: Eu.Gene Analyzer is an easy-to-use, stand-alone application that allows rapid and powerful microarray data analysis in the context of biological pathways. Its intuitive graphical user interface makes it an easy and flexible tool, even for the first-time user. Eu.Gene supports a variety of array platforms, organisms and pathway ontologies, transparently deals with multiple nomenclature systems and seamlessly integrates data from different sources. Two different statistical methods, the Fisher Exact Test and the Gene Set Enrichment Analysis (GSEA), are implemented to identify biological pathways transcriptionally affected under experimental conditions. A suite of tools is offered to define, visualize and share custom non-redundant pathway sets.

In conclusion, Eu.Gene Analyzer is a new software application that takes advantage of information from multiple pathway databases to build a comprehensive interpretation of experimental results in a simple, intuitive environment.

Availability: The Java version of Eu.Gene Analyzer can be downloaded free of charge by academic users from the web page: http://www.ducciocavalieri.org/bio/Eugene.htm

Contact: duccio.cavalieri@unifi.it

Supplementary information: http://www.ducciocavalieri.org/bio/Eugene/Suppl_Inf


    1 INTRODUCTION
Pathway-based microarray analysis methods look for patterns of gene expression variation in any predefined set of genes. While the effect of each individual gene can be subtle, a coordinated change among many gene products can produce potent biological effects. Eu.Gene explores this type of multi-gene effect.


    2 FEATURES
Many collections of pathways and other meaningful gene sets are publicly available. Eu.Gene loads pathway definitions from the latest update of multiple public databases, KEGG (Kanehisa et al., 2006), Reactome (Joshi-Tope et al., 2005) and GenMAPP (Dahlquist et al., 2002), and stores them within the stand-alone application.

One obstacle to the concurrent use of different pathway databases is that they use different types of identifiers, or primary names, to refer to the same entities. The same problem arises among different microarray platforms. Eu.Gene resolves these nomenclature problems by converting among the various annotation and nomenclature systems transparently to the user (see Table 1a, Supplementary Material). To do so, a built-in conversion map converts all identifiers to a common format, Ensembl Gene and Transcript IDs (Hubbard et al., 2007). Thus, Eu.Gene users can load data from different array platforms (see Tables 1d and 2b, Supplementary Material) and different pathway sources using the original nomenclature. The entire conversion map is also available directly as a Conversion Tool, which converts user-generated lists of gene names among the different nomenclature systems. In some cases, the conversion map is very large and the analysis tasks become memory demanding (see Table 2a, Supplementary Material).
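The idea of a conversion map can be illustrated with a minimal Python sketch. This is not Eu.Gene's implementation: the map layout, the function name `to_ensembl` and the handful of entries are assumptions chosen for illustration, and a real map would be loaded from annotation files rather than written inline.

```python
# Hypothetical conversion map: (nomenclature system, identifier) -> Ensembl gene ID.
# The layout and the entries are illustrative assumptions, not Eu.Gene's data.
CONVERSION_MAP = {
    ("entrez", "7157"): "ENSG00000141510",
    ("symbol", "TP53"): "ENSG00000141510",
    ("entrez", "672"): "ENSG00000012048",
    ("symbol", "BRCA1"): "ENSG00000012048",
}


def to_ensembl(system, identifiers, conversion_map=CONVERSION_MAP):
    """Convert identifiers from one nomenclature system to Ensembl gene IDs.

    Returns (converted, unmapped): a dict of identifier -> Ensembl ID and a
    list of the identifiers that have no entry in the map.
    """
    converted, unmapped = {}, []
    for identifier in identifiers:
        ensembl_id = conversion_map.get((system, identifier))
        if ensembl_id is None:
            unmapped.append(identifier)
        else:
            converted[identifier] = ensembl_id
    return converted, unmapped
```

Once array probes and pathway members are both expressed as Ensembl IDs, they can be compared with ordinary set operations, whatever nomenclature each source used originally.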

The available pathways can be browsed to select which to include in a custom pathway set. Pathways can be browsed by source database. Alternatively, Eu.Gene provides a search function to identify and select pathways with specific keywords in the name. Thus, a pathway set may represent the space of biological categories of interest and may conveniently be stored for future use and shared with others.

The use of several pathway sources often results in the occurrence of redundant pathways with a high degree of overlap (Cary et al., 2005). The ‘Entity Affinity Filter’ tool helps to navigate the territory and create robust, non-redundant pathway sets. Users select threshold values for maximum tolerated overlap among pathways to include in the selected pathway set. A dendrogram visualization tool guides the user in the selection of redundancy threshold values. The dendrogram illustrates hierarchical trees of the loaded pathways, ordered according to the mutual overlap. To minimize overlap among pathways, users can employ the ‘SuperSetMode’ to collapse all available pathways into a super-pathway that can be stored and used for further analysis.
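A threshold-based redundancy filter of this kind can be sketched in a few lines of Python. The text does not specify how Eu.Gene measures overlap, so the Jaccard index used here, the greedy keep-first strategy and the function names are all assumptions made for illustration only.

```python
def jaccard(a, b):
    """Jaccard index of two gene sets: |a & b| / |a | b| (assumed overlap measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_redundant(pathways, max_overlap):
    """Greedily keep pathways whose overlap with every kept pathway is <= max_overlap.

    pathways: dict of pathway name -> set of gene IDs, visited in insertion order,
    so earlier pathways take precedence over later, overlapping ones.
    """
    kept = {}
    for name, genes in pathways.items():
        if all(jaccard(genes, other) <= max_overlap for other in kept.values()):
            kept[name] = genes
    return kept


def super_pathway(pathways):
    """Collapse all pathways into a single gene set (the union of their members)."""
    merged = set()
    for genes in pathways.values():
        merged |= genes
    return merged
```

Lowering `max_overlap` gives a smaller, less redundant set, while `super_pathway` corresponds to the opposite extreme of merging everything into one set.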

Pathways frequently include a number of genes not represented on the microarray. Users can exclude from a pathway set those pathways in which the percentage of genes present on the microarray falls below a threshold value.
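As a rough sketch of such a coverage filter, assuming pathways and the array are both given as sets of identifiers in a common nomenclature (the function name and data layout are hypothetical, not taken from Eu.Gene):

```python
def filter_by_coverage(pathways, array_genes, min_percent):
    """Keep pathways with at least min_percent of their genes present on the array.

    pathways: dict of pathway name -> set of gene IDs.
    array_genes: set of gene IDs represented on the microarray.
    """
    kept = {}
    for name, genes in pathways.items():
        if not genes:
            continue  # an empty pathway has no defined coverage; skip it
        percent = 100.0 * len(genes & array_genes) / len(genes)
        if percent >= min_percent:
            kept[name] = genes
    return kept
```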

Eu.Gene Analyzer implements two different statistical methods to evaluate which pathways are most affected by the differences in gene expression observed in a functional genomic experiment: the one-tailed Fisher Exact Test (FET) (Grosu et al., 2002) and Gene Set Enrichment Analysis (GSEA) (Subramanian et al., 2005). The FET requires a user-defined threshold for gene expression to identify pathways enriched in both over- and under-expressed genes. GSEA does not use thresholds and is suited to the detection of coordinated, modest changes in the transcriptional activity of many genes in a pathway. The output provided by the program is a text file with the relevant analytical information. For the FET, output values include signed and unsigned P-values. The P-value is a measure of the significance of a pathway's enrichment in transcriptionally altered genes; its sign reflects the relative number of up-regulated genes with respect to down-regulated genes for each pathway. GSEA output files contain enrichment scores (ES) and empirical P-values for each pathway. The output is exportable in HTML or MS-Excel format.
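For readers unfamiliar with the one-tailed FET, the following Python sketch computes the upper-tail hypergeometric probability for a single pathway. It is a textbook formulation, not Eu.Gene's code. The parameter names are hypothetical, and the sign convention in `signed_p_value` (positive when up-regulated genes are at least as numerous as down-regulated ones) is an assumption, since the text only states that the sign reflects the relative number of up- and down-regulated genes.

```python
from math import comb


def fisher_one_tailed(k, n, K, N):
    """One-tailed Fisher exact P-value, P(X >= k), for pathway enrichment.

    N: genes on the array, K: genes passing the expression threshold,
    n: pathway genes on the array, k: pathway genes passing the threshold.
    """
    upper = min(n, K)
    # Sum hypergeometric probabilities for all outcomes at least as extreme as k.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def signed_p_value(p, n_up, n_down):
    """Attach a sign to p: positive if n_up >= n_down, negative otherwise (assumed convention)."""
    return p if n_up >= n_down else -p
```

For example, with 10 genes on the array, 5 of them altered, and a 4-gene pathway containing 3 altered genes, `fisher_one_tailed(3, 4, 5, 10)` evaluates (C(5,3)·C(5,1) + C(5,4)·C(5,0)) / C(10,4) = 55/210.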


    3 CONCLUSION
The integration of pathway browsing and statistical tools for microarray analysis in a single, stand-alone application enables the user to perform cycles of analysis and fine-tuning to construct a biologically meaningful pathway set. Results from multiple experiments can be used for further analysis, such as clustering. The pathway set and analysis criteria used for an experimental group can be saved as a project and used to analyze additional data. This feature allows comparison between multiple experiments and keeps track of the analyses performed.

The significant improvement provided by Eu.Gene lies in the power and flexibility offered to biologists in selecting and understanding the set of pathways analyzed, and in the possibility of applying different statistical methods and comparing their results. The number of pathways, genomes, platforms and statistical methods implemented in Eu.Gene Analyzer can be easily expanded in future versions of the program.


    ACKNOWLEDGEMENTS
This work was funded by the European Networks of Excellence NuGO (FOOD-2003-5063-60) and Dc Thera (LSHB-CT-2004-512074), by a year 2007 grant from Cassa di Risparmio di Firenze at the ‘Noi per Voi’ association and by a grant from AIRC Milan, Italy. We thank Prof. Bruce Conklin, Dr Chris Evelo and Dr Andre Boorsma for useful discussions.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: David Rocke

†The authors wish it to be known that, in their opinion, the second and third authors should be regarded as joint First Authors.

Received on April 19, 2007; revised on June 15, 2007; accepted on June 17, 2007

    REFERENCES
    Cary MP, et al. (2005) Pathway information for systems biology. FEBS Lett., 579, 1815–1820.

    Dahlquist KD, et al. (2002) GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways. Nat. Genet., 31, 19–20.

    Grosu P, et al. (2002) Pathway processor: a tool for integrating whole-genome expression results into metabolic networks. Genome Res., 12, 1121–1126.

    Hubbard TJP, et al. (2007) Ensembl 2007. Nucleic Acids Res., 35, D610–D617.

    Joshi-Tope G, et al. (2005) Reactome: a knowledgebase of biological pathways. Nucleic Acids Res., 33, D428–D432.

    Kanehisa M, et al. (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res., 34, D354–D357.

    Subramanian A, et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA, 102, 15545–15550.

