Bioinformatics Advance Access originally published online on December 14, 2007
Bioinformatics 2008 24(3):404-411; doi:10.1093/bioinformatics/btm612
Incorporating gene networks into statistical tests for genomic data via a spatially correlated mixture model
Division of Biostatistics, School of Public Health, University of Minnesota, A460 Mayo Building (MMC 303), Minneapolis, MN 55455-0378, USA
*To whom correspondence should be addressed.
Abstract
Motivation: A common task in genomic studies is to identify a subset of genes satisfying certain conditions, such as differentially expressed genes or the regulatory target genes of a transcription factor (TF). This can be formulated as a statistical hypothesis testing problem. Most existing approaches treat the genes as independent and identically distributed a priori, testing each gene independently or testing subsets of genes one at a time. However, genes are known to act in coordination, as dictated by gene networks. Treating genes equally and independently ignores the information contained in gene networks, leading to inefficient analysis and reduced power.
Results: We propose incorporating gene network information into the statistical analysis of genomic data. Specifically, rather than treating the genes equally and independently a priori, as in a standard mixture model, we assume that gene-specific prior probabilities are correlated as induced by a gene network: genes are allowed to have different prior probabilities, but genes that are neighbors in the network have similar prior probabilities, reflecting their shared biological functions. We applied both the standard mixture model and the proposed network-based model to a real ChIP-chip dataset (and to simulated data) to identify the transcriptional target genes of the TF GCN4. The new method was found to be more powerful in discovering the target genes.
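To make the contrast in the Results paragraph concrete, the following Python sketch compares a standard two-component mixture, in which every gene shares one prior probability, with a variant in which each gene's prior is pulled toward the evidence at its network neighbors. It is only a toy illustration: the normal null and alternative densities, their parameters, the function names, the toy network and the neighbor-averaging update are all assumptions introduced here. The abstract does not specify the authors' model or fitting procedure, and this heuristic should not be read as their method.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_nonnull(score, prior_nonnull, null=(0.0, 1.0), alt=(2.0, 1.0)):
    """Posterior probability that a gene is non-null (e.g. a TF target).

    Assumed two-component mixture: (1 - prior) * f0 + prior * f1, where
    f0 and f1 are normal densities with hypothetical parameters.
    """
    f0 = normal_pdf(score, *null)
    f1 = normal_pdf(score, *alt)
    num = prior_nonnull * f1
    return num / (num + (1.0 - prior_nonnull) * f0)


def network_smoothed_posteriors(scores, neighbors, base_prior=0.2,
                                weight=0.5, n_iter=20):
    """Heuristic stand-in for network-correlated, gene-specific priors.

    Each gene's prior is a blend of a shared base prior and the mean
    posterior of its network neighbors, so neighboring genes end up with
    similar priors. This update rule is an assumption for illustration.
    """
    genes = list(scores)
    priors = {g: base_prior for g in genes}
    post = {g: posterior_nonnull(scores[g], priors[g]) for g in genes}
    for _ in range(n_iter):
        new_priors = {}
        for g in genes:
            nbrs = neighbors.get(g, [])
            if nbrs:
                nbr_mean = sum(post[n] for n in nbrs) / len(nbrs)
                new_priors[g] = (1.0 - weight) * base_prior + weight * nbr_mean
            else:
                # Isolated genes fall back to the shared prior.
                new_priors[g] = base_prior
        priors = new_priors
        post = {g: posterior_nonnull(scores[g], priors[g]) for g in genes}
    return priors, post


if __name__ == "__main__":
    # Toy data: g1 and g4 have the same borderline score, but g1 sits in a
    # module of high-scoring genes and g4 in a module of low-scoring genes.
    scores = {"g1": 1.0, "g2": 3.5, "g3": 3.0,
              "g4": 1.0, "g5": -0.5, "g6": 0.0}
    neighbors = {"g1": ["g2", "g3"], "g2": ["g1", "g3"], "g3": ["g1", "g2"],
                 "g4": ["g5", "g6"], "g5": ["g4", "g6"], "g6": ["g4", "g5"]}
    priors, post = network_smoothed_posteriors(scores, neighbors)
    for g in scores:
        standard = posterior_nonnull(scores[g], 0.2)
        print(g, round(standard, 3), round(priors[g], 3), round(post[g], 3))
```

Under the shared prior, the two borderline genes `g1` and `g4` receive identical posterior probabilities because their scores are equal. With the network-informed priors, `g1` is ranked above `g4` because its neighbors carry strong evidence, which is the kind of gain in power the abstract attributes to using network information.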
Contact: weip{at}biostat.umn.edu
Associate Editor: Chris Stoeckert
Received on September 18, 2007; revised on November 19, 2007; accepted on December 7, 2007
