Bioinformatics Advance Access originally published online on November 30, 2006
Bioinformatics 2007 23(3):328-335; doi:10.1093/bioinformatics/btl612
Flexible empirical Bayes models for differential gene expression
Department of Statistics, University of British Columbia, 333-6356 Agricultural Road, Vancouver, BC, Canada V6T 1Z2
*To whom correspondence should be addressed.
Abstract
Motivation: Inference about differential expression is a typical objective when analyzing gene expression data. Recently, Bayesian hierarchical models have become increasingly popular for this type of problem. The two most common hierarchical models are the hierarchical Gamma-Gamma (GG) and Lognormal-Normal (LNN) models. However, to facilitate inference, some unrealistic assumptions have been made. One such assumption is that of a common coefficient of variation across genes, which can adversely affect the resulting inference.
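To make the common coefficient of variation assumption concrete, the following is a minimal sketch, assuming the usual GG observation model in which each gene's intensities follow a Gamma distribution with a shape parameter shared by all genes and a gene-specific rate. Under that assumption the coefficient of variation is 1/sqrt(shape) for every gene, whatever its mean. The gene names, parameter values, and the helper `gamma_cv` are hypothetical and chosen only for illustration; they are not taken from the paper or its R code.

```python
import random
import statistics


def gamma_cv(shape, rate, n=50_000, seed=0):
    """Empirical coefficient of variation (sd / mean) of Gamma(shape, rate) draws."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    draws = [rng.gammavariate(shape, 1.0 / rate) for _ in range(n)]
    return statistics.pstdev(draws) / statistics.fmean(draws)


# Hypothetical values: one shape shared by all genes, one rate per gene.
shape = 10.0
gene_rates = {"gene_A": 0.01, "gene_B": 0.5, "gene_C": 20.0}

# For a Gamma distribution, mean = shape / rate and sd = sqrt(shape) / rate,
# so the CV depends on the shape alone.
theoretical_cv = shape ** -0.5

empirical_cv = {
    gene: gamma_cv(shape, rate, seed=i)
    for i, (gene, rate) in enumerate(gene_rates.items())
}

for gene, cv in empirical_cv.items():
    print(f"{gene}: mean ~ {shape / gene_rates[gene]:g}, empirical CV = {cv:.3f}")
print(f"theoretical CV = {theoretical_cv:.3f}")
```

Although the three simulated genes have means that differ by orders of magnitude, their empirical CVs all sit close to the same theoretical value, which is the restriction that gene-specific variances are meant to relax.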
Results: In this paper, we extend both the GG and LNN modeling frameworks to allow for gene-specific variances and propose EM-based algorithms for parameter estimation. The proposed methodology is evaluated on three experimental datasets: one cDNA microarray experiment and two Affymetrix spike-in experiments. Compared with the original models, the two extended models significantly reduce the false positive rate while maintaining high sensitivity. Finally, using a simulation study, we show that the new frameworks are also more robust to model misspecification.
Availability: The R code for implementing the proposed methodology can be downloaded at http://www.stat.ubc.ca/~c.lo/FEBarrays
Contact: c.lo{at}stat.ubc.ca
Supplementary information: The supplementary material is available at http://www.stat.ubc.ca/~c.lo/FEBarrays/supp.pdf
Associate Editor: John Quackenbush
Received on October 1, 2006; revised on November 21, 2006; accepted on November 26, 2006