Bioinformatics Advance Access originally published online on February 1, 2008
Bioinformatics 2008 24(7):943-949; doi:10.1093/bioinformatics/btn049
Bayesian models based on test statistics for multiple hypothesis testing problems
1Department of Bioinformatics and Computational Biology and 2Department of Systems Biology, The University of Texas, M. D. Anderson Cancer Center, Houston, TX 77030, USA
*To whom correspondence should be addressed.
Abstract
Motivation: We propose a Bayesian method for the multiple hypothesis testing problems that arise routinely in bioinformatics research, such as differential gene expression analysis. Our algorithm models the distributions of test statistics under both the null and alternative hypotheses. By modeling the test statistics directly instead of the full data, we substantially reduce the complexity of defining posterior model probabilities. Computationally, we apply a Bayesian false discovery rate (FDR) approach to control the number of rejections of null hypotheses. To check whether our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool.
Results: Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. Finally, we apply the proposed methodology to an siRNA screening experiment and a gene expression experiment.
Contact: yuanji{at}mdanderson.org
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Chris Stoeckert
Received on September 24, 2007; revised on December 3, 2007; accepted on January 29, 2008
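The abstract describes two steps: computing posterior model probabilities from test statistics, and applying a Bayesian FDR rule to decide which null hypotheses to reject. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a two-component mixture with a standard normal null density and a normal alternative density; the function names and the parameters `p0`, `alt_mean`, and `alt_sd` are hypothetical and chosen only for illustration. The rejection rule shown is a common posterior-expected-FDR rule (reject the largest set whose average posterior null probability stays at or below `alpha`); the abstract does not specify which rule the paper uses.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_null_prob(t, p0, alt_mean, alt_sd):
    """Posterior probability that the null is true given test statistic t.

    Assumption (illustrative only): t ~ N(0, 1) under the null and
    t ~ N(alt_mean, alt_sd^2) under the alternative, with prior null
    probability p0.
    """
    f0 = normal_pdf(t, 0.0, 1.0)
    f1 = normal_pdf(t, alt_mean, alt_sd)
    return p0 * f0 / (p0 * f0 + (1.0 - p0) * f1)


def bayesian_fdr_reject(null_probs, alpha):
    """Return indices of hypotheses rejected under a posterior-expected-FDR rule.

    Hypotheses are ranked by posterior null probability (smallest first).
    The rejection set is the longest prefix of that ranking whose mean
    posterior null probability does not exceed alpha.
    """
    order = sorted(range(len(null_probs)), key=lambda i: null_probs[i])
    rejected = []
    running_sum = 0.0
    for k, i in enumerate(order, start=1):
        running_sum += null_probs[i]
        # The running mean is non-decreasing because the values are sorted,
        # so the first violation ends the search.
        if running_sum / k > alpha:
            break
        rejected.append(i)
    return sorted(rejected)


if __name__ == "__main__":
    # Made-up test statistics, used only to exercise the functions.
    stats = [0.2, 4.1, -0.5, 3.3, 1.0]
    probs = [posterior_null_prob(t, p0=0.8, alt_mean=3.0, alt_sd=1.0) for t in stats]
    print(bayesian_fdr_reject(probs, alpha=0.05))
```

Working with posterior null probabilities directly means the expected proportion of false rejections within any candidate set is simply the average of those probabilities, which is why the rule reduces to a running mean over the sorted list.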