Bioinformatics Advance Access originally published online on December 20, 2006
Bioinformatics 2007 23(4):401-407; doi:10.1093/bioinformatics/btl633
Enrichment or depletion of a GO category within a class of genes: which test?
Équipe de Statistique Appliquée, 10 rue Vauquelin, 75005 Paris, France
1 Laboratoire de Neurobiologie et Diversité Cellulaire, École Supérieure de Physique et de Chimie Industrielles (ESPCI), 10 rue Vauquelin, 75005 Paris, France
*To whom correspondence should be addressed.
Abstract
Motivation: A number of available program packages determine the significant enrichments and/or depletions of GO categories among a class of genes of interest. Whereas a correct formulation of the problem leads to a single exact null distribution, these GO tools use a large variety of statistical tests whose denominations often do not clarify the underlying P-value computations.
Summary: We review the different formulations of the problem and the tests they lead to: the binomial, χ², equality of two probabilities, Fisher's exact and hypergeometric tests. We clarify the relationships existing between these tests, in particular the equivalence between the hypergeometric test and Fisher's exact test. We recall that the other tests are valid only for large samples, the test of equality of two probabilities and the χ²-test being equivalent. We discuss the appropriateness of one- and two-sided P-values, as well as some discreteness and conservatism issues.
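The two equivalences stated in the summary can be checked numerically. The sketch below is illustrative only: the 2×2 counts are hypothetical, the function names are ours, and it assumes a one-sided (enrichment) P-value, a pooled-variance z statistic for the test of equality of two probabilities, and a χ² statistic without continuity correction. It uses only the Python standard library.

```python
from fractions import Fraction
from math import comb, factorial, isclose, sqrt

# Hypothetical 2x2 table (illustrative counts, not taken from the article):
#                      in GO category   not in category
# genes of interest         a = 5            b = 5
# other genes               c = 15           d = 75
a, b, c, d = 5, 5, 15, 75
r1, r2 = a + b, c + d      # row totals: genes of interest, other genes
c1, c2 = a + c, b + d      # column totals: in category, not in category
N = r1 + r2                # all genes

def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the category (hypergeometric formulation)."""
    total = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, min(K, n) + 1))
    return float(Fraction(total, comb(N, n)))

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher P-value: sum of the probabilities of all tables with
    the same margins whose top-left cell is at least a."""
    r1, r2, c1 = a + b, c + d, a + c
    c2, N = b + d, a + b + c + d
    margins = factorial(r1) * factorial(r2) * factorial(c1) * factorial(c2)
    p = Fraction(0)
    for x in range(a, min(r1, c1) + 1):
        cells = (factorial(x) * factorial(r1 - x)
                 * factorial(c1 - x) * factorial(r2 - c1 + x))
        p += Fraction(margins, factorial(N) * cells)
    return float(p)

# Equivalence 1: hypergeometric tail and one-sided Fisher P-value coincide.
p_hyper = hypergeom_upper_tail(N, c1, r1, a)
p_fisher = fisher_one_sided(a, b, c, d)

# Equivalence 2: the chi-square statistic equals the squared pooled z statistic.
chi2 = N * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)
pooled = c1 / N
z = (a / r1 - c / r2) / sqrt(pooled * (1 - pooled) * (1 / r1 + 1 / r2))

print(isclose(p_hyper, p_fisher), isclose(chi2, z * z))
```

The first pair agrees because both formulas describe the same distribution of the top-left cell given fixed margins. The second pair agrees algebraically for any table, which is why the two large-sample tests yield the same two-sided P-value.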
Contact: isabelle.rivals{at}espci.fr
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Jonathan Wren
Received on June 20, 2006; revised on December 11, 2006; accepted on December 11, 2006








