Bioinformatics Advance Access published online on May 5, 2009
Bioinformatics, doi:10.1093/bioinformatics/btp295
Evaluating reproducibility of differential expression discoveries in microarray studies by considering correlated molecular changes
1 College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150086, China
2 Bioinformatics Centre and School of Life Science, University of Electronic Science and Technology of China, Chengdu, 610054, China
*To whom correspondence should be addressed. Prof. Zheng Guo, E-mail: guoz@ems.hrbmu.edu.cn, markzguo@163.com
| Abstract |
|---|
Motivation: According to current consistency metrics such as POG (Percentage of Overlapping Genes), lists of differentially expressed genes (DEGs) detected in different microarray studies of a complex disease are often highly inconsistent. This irreproducibility problem also exists in other high-throughput post-genomic areas such as proteomics and metabolomics. A complex disease is often characterized by many coordinated molecular changes, which should be considered when evaluating the reproducibility of discovery lists from different studies.
Results: We propose the metrics POGR (Percentage of Overlapping Genes-Related) and nPOGR (normalized POGR) to evaluate the consistency between two DEG lists for a complex disease, considering correlated molecular changes rather than only counting gene overlaps between the lists. Using microarray datasets for three diseases, we show that although the POG scores for DEG lists from different studies of each disease are extremely low, the POGR and nPOGR scores can be rather high, suggesting that the apparently inconsistent DEG lists may be highly reproducible in the sense that they are significantly correlated. Comparing different discovery results for a disease by their POGR and nPOGR scores can therefore reduce the uncertainty of microarray studies. The proposed metrics may also be applicable in many other high-throughput post-genomic areas.
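To make the contrast between counting overlaps and counting related genes concrete, the following minimal Python sketch computes an overlap score and a related-aware score for two gene lists. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the formulas, so the function names `pog` and `pogr`, the choice of the first list's length as denominator, and the `related` predicate (standing in for whatever test of significant correlation a study adopts) are all hypothetical. The gene identifiers and the related pair are invented toy data, and the normalization behind nPOGR is not sketched because the abstract does not describe it.

```python
def pog(list_a, list_b):
    """Fraction of genes in list_a that also appear in list_b (overlap only)."""
    a, b = set(list_a), set(list_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


def pogr(list_a, list_b, related):
    """Fraction of genes in list_a that are shared with list_b or are
    related to at least one gene in list_b.

    `related` is a hypothetical predicate (gene_x, gene_y) -> bool; here it
    stands in for a test of significant correlation between two genes.
    """
    a, b = set(list_a), set(list_b)
    if not a:
        return 0.0
    shared = a & b
    # Genes unique to list_a that are related to some gene in list_b.
    related_only = {g for g in a - shared if any(related(g, h) for h in b)}
    return (len(shared) + len(related_only)) / len(a)


# Toy data (invented for illustration only).
list_a = ["g1", "g2", "g3", "g4"]
list_b = ["g1", "g5", "g6"]
related_pairs = {frozenset(("g3", "g6"))}


def related(x, y):
    # Symmetric lookup in the toy set of related gene pairs.
    return frozenset((x, y)) in related_pairs


score_pog = pog(list_a, list_b)             # 1 shared gene out of 4
score_pogr = pogr(list_a, list_b, related)  # 1 shared + 1 related out of 4
print(score_pog, score_pogr)
```

In this toy case the overlap score is 0.25, while the related-aware score is 0.5, because `g3` is counted through its relation to `g6`. Both scores are asymmetric in this sketch: swapping the two lists changes the denominator.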
Contact: guoz@ems.hrbmu.edu.cn
Associate Editor: Dr. Limsoon Wong
Min Zhang and Lin Zhang contributed equally to this work.
Received on October 13, 2008; revised on April 28, 2009; accepted on April 28, 2009