Bioinformatics Advance Access originally published online on April 10, 2006
Bioinformatics 2006 22(13):1600-1607; doi:10.1093/bioinformatics/btl140
Improved scoring of functional groups from gene expression data by decorrelating GO graph structure
Max-Planck-Institute for Informatics Stuhlsatzenhausweg 85, D-66123 Saarbrücken, Germany
*To whom correspondence should be addressed.
Motivation: The result of a typical microarray experiment is a long list of genes with corresponding expression measurements. This list is only the starting point for a meaningful biological interpretation. Modern methods identify relevant biological processes or functions from gene expression data by scoring the statistical significance of predefined functional gene groups, e.g. based on Gene Ontology (GO). We develop methods that increase the explanatory power of this approach by integrating knowledge about relationships between the GO terms into the calculation of the statistical significance.
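As an illustration of what scoring the statistical significance of a predefined gene group can look like, the sketch below computes an upper-tail hypergeometric probability (the one-sided Fisher's exact test), a test commonly used for this kind of enrichment analysis. The abstract does not name the test statistic, so the choice of test, the function name `enrichment_p_value`, and the toy numbers are assumptions made for illustration only.

```python
from math import comb


def enrichment_p_value(n_total, n_significant, group_size, group_significant):
    """Upper-tail hypergeometric probability P(X >= group_significant).

    n_total           -- genes measured in the experiment (assumed universe)
    n_significant     -- genes called differentially expressed
    group_size        -- genes annotated to the functional group
    group_significant -- genes that are both in the group and significant
    """
    max_overlap = min(group_size, n_significant)
    total_draws = comb(n_total, n_significant)
    tail = sum(
        comb(group_size, k) * comb(n_total - group_size, n_significant - k)
        for k in range(group_significant, max_overlap + 1)
    )
    return tail / total_draws


# Toy example (made-up numbers): 1000 genes, 50 significant,
# a group of 20 genes of which 6 are significant.
p = enrichment_p_value(1000, 50, 20, 6)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value suggests that the group contains more significant genes than expected by chance. Scoring each GO term independently in this way ignores the fact that a parent term inherits the genes of its children, which is the dependency the abstract refers to.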
Results: We present two novel algorithms that improve GO group scoring using the underlying GO graph topology. The algorithms are evaluated on real and simulated gene expression data. We show that both methods eliminate local dependencies between GO terms and point to relevant areas in the GO graph that remain undetected with state-of-the-art algorithms for scoring functional terms. A simulation study demonstrates that the new methods detect relevant biological terms at a higher rate than competing methods.
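The abstract does not describe how the two algorithms work, so the following is not the authors' method. It is a hypothetical sketch of one way graph topology could be used to reduce local dependencies: terms are scored from the leaves upward, and once a term is found significant its genes are discounted when scoring its ancestors. All names (`bottom_up_scores`, `parents`, `direct`, `alpha`) and the toy graph are assumptions.

```python
from collections import deque
from math import comb


def upper_tail(n_total, n_sig, size, hits):
    """Hypergeometric P(X >= hits) for a group of `size` genes."""
    total = comb(n_total, n_sig)
    return sum(comb(size, k) * comb(n_total - size, n_sig - k)
               for k in range(hits, min(size, n_sig) + 1)) / total


def ancestors(term, parents):
    """All terms reachable from `term` by following parent links."""
    seen, stack = set(), list(parents.get(term, ()))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen


def bottom_up_scores(parents, direct, universe, significant, alpha=0.05):
    """Score terms children-first; genes of significant terms are
    removed from every ancestor before that ancestor is scored."""
    terms = set(parents) | set(direct)
    terms |= {p for ps in parents.values() for p in ps}
    # Each term inherits the genes annotated to its descendants.
    genes = {t: set(direct.get(t, ())) for t in terms}
    for t in terms:
        for a in ancestors(t, parents):
            genes[a] |= set(direct.get(t, ()))
    # Kahn-style ordering: a term is ready once all its children are done.
    pending = {t: 0 for t in terms}
    for t in terms:
        for p in parents.get(t, ()):
            pending[p] += 1
    queue = deque(t for t in terms if pending[t] == 0)
    removed = {t: set() for t in terms}
    scores = {}
    while queue:
        t = queue.popleft()
        members = genes[t] - removed[t]
        hits = len(members & significant)
        scores[t] = upper_tail(len(universe), len(significant),
                               len(members), hits)
        if scores[t] < alpha:
            for a in ancestors(t, parents):
                removed[a] |= members
        for p in parents.get(t, ()):
            pending[p] -= 1
            if pending[p] == 0:
                queue.append(p)
    return scores


# Toy graph: B is a child of A, and A is a child of root.
universe = {f"g{i}" for i in range(20)}
significant = {"g0", "g1", "g2", "g3"}
parents = {"B": ["A"], "A": ["root"]}
direct = {"B": {"g0", "g1", "g2", "g3"}, "A": {"g4"},
          "root": {"g5", "g6", "g7", "g8", "g9"}}
for term, p in sorted(bottom_up_scores(parents, direct, universe,
                                       significant).items()):
    print(term, f"{p:.3g}")
```

In this toy graph, scoring A independently would make it look enriched only because it inherits the four significant genes of B. With the bottom-up discounting, B keeps its small p-value while A and root no longer appear significant.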
Availability: topgo.bioinf.mpi-inf.mpg.de
Contact: alexa{at}mpi-sb.mpg.de
Supplementary Information: Supplementary data are available at Bioinformatics online.
Received on September 28, 2005; revised on March 30, 2006; accepted on April 4, 2006