Bioinformatics Advance Access published online on January 18, 2008
Bioinformatics, doi:10.1093/bioinformatics/btm620
A comparison of meta-analysis methods for detecting differentially expressed genes in microarray experiments
1Department of Biostatistics, Division of Information Sciences, City of Hope National Medical Center, Beckman Research Institute, 1500 Duarte Rd, Duarte, CA 91010, USA. Current address: Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health, 44 Binney Street, Boston, MA 02115, USA.
2Groningen Bioinformatics Centre, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Kerklaan 30, 9751 NN Haren, The Netherlands.
*To whom correspondence should be addressed. Dr. Fangxin Hong, E-mail: fxhong{at}jimmy.harvard.edu
Abstract
Motivation: The proliferation of public data repositories creates a need for meta-analysis methods that can efficiently evaluate, integrate and validate related datasets produced by independent groups. Choi et al. (2003) proposed a t-based approach that integrates effect sizes from multiple studies by modeling both within- and between-study variation. Hong et al. (2006) applied a non-parametric "rank product" method, which derives a statistic from the biological reasoning behind fold-change criteria, to combine multiple datasets directly into one meta-study. Fisher's inverse χ² method, which depends only on the p-values from the individual analysis of each dataset, has been used in a few medical studies (Moreau et al., 2003). While these methods address the question from different angles, it is not clear how they compare with each other.
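As an illustration of why Fisher's inverse χ² method needs only per-study p-values, the following is a minimal sketch, not the authors' implementation. It assumes k independent p-values for one gene, computes the statistic −2 Σ ln p_i, and evaluates its upper tail under a χ² distribution with 2k degrees of freedom using the closed form that holds for even degrees of freedom. The function name `fisher_combined_pvalue` is hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's inverse chi-square method.

    Returns (statistic, combined_p). Illustrative sketch only; the function
    name and interface are assumptions, not taken from the paper.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(pvalues)
    # Fisher's statistic: -2 * sum of log p-values, chi-square with 2k df
    # under the null hypothesis when the p-values are independent.
    statistic = -2.0 * sum(math.log(p) for p in pvalues)

    # For even degrees of freedom 2k, the chi-square survival function is
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return statistic, math.exp(-half) * total


# Example: one gene with p-values from three hypothetical studies.
stat, p_combined = fisher_combined_pvalue([0.04, 0.10, 0.03])
```

Because the combination sees only the p-values, its behaviour is inherited from whichever per-study test produced them.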
Results: We comparatively evaluate three methods: t-based hierarchical modeling, rank products, and Fisher's inverse χ² test with p-values from either the t-based or the rank product method. A simulation study shows that the rank product method, in general, has higher sensitivity and selectivity than the t-based method in both individual and meta-analysis, especially when sample sizes are small and/or between-study variation is large. Not surprisingly, the performance of Fisher's inverse χ² method depends heavily on the method used in the individual analysis. Application to real datasets demonstrates that meta-analysis achieves more reliable identification than an individual analysis, and that rank products are more robust in gene ranking, which leads to much higher reproducibility among independent studies. Although t-based meta-analysis greatly improves on the individual analysis, it can produce a large number of false positives when p-values serve as the threshold. We conclude that careful meta-analysis is a powerful tool for integrating multiple array studies.
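The rank-based idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure evaluated in the paper: it assumes one log fold change per gene per study, ranks genes within each study (rank 1 for the largest fold change), and takes the geometric mean of each gene's ranks across studies. The permutation step normally used to assess significance is omitted, and the name `rank_products` is hypothetical.

```python
import math


def rank_products(fold_changes):
    """Geometric mean of within-study ranks for each gene.

    fold_changes[g][s] is the log fold change of gene g in study s.
    Rank 1 is assigned to the most up-regulated gene in a study; ties are
    broken by gene order. Illustrative sketch only.
    """
    n_genes = len(fold_changes)
    n_studies = len(fold_changes[0])
    if any(len(row) != n_studies for row in fold_changes):
        raise ValueError("every gene needs one value per study")

    # Accumulate log ranks so the product cannot overflow for many studies.
    log_rank_sum = [0.0] * n_genes
    for s in range(n_studies):
        order = sorted(range(n_genes),
                       key=lambda g: fold_changes[g][s],
                       reverse=True)
        for rank, g in enumerate(order, start=1):
            log_rank_sum[g] += math.log(rank)

    return [math.exp(total / n_studies) for total in log_rank_sum]


# Example: three genes measured in three hypothetical studies.
example = [
    [2.0, 1.8, 2.2],     # consistently the most up-regulated gene
    [0.1, -0.2, 0.3],
    [-1.5, -1.0, -2.0],
]
rp = rank_products(example)
```

A small rank product indicates a gene that sits near the top of the list in every study. Since only ranks enter the statistic, differences in scale between studies do not affect it.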
Contact: fxhong{at}jimmy.harvard.edu
Associate Editor: Prof. David Rocke
Received on June 8, 2007; revised on December 4, 2007; accepted on December 8, 2007