Bioinformatics Advance Access originally published online on August 10, 2009
Bioinformatics 2009 25(19):2486-2491; doi:10.1093/bioinformatics/btp471
Double error shrinkage method for identifying protein binding sites observed by tiling arrays with limited replication
1Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA, 2Department of Statistics, Seoul National University, Seoul, Korea and 3Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA
*To whom correspondence should be addressed.
Abstract
Motivation: ChIP–chip has been widely used for various genome-wide biological investigations. Given the small number of replicates (typically two to three) per biological sample, methods of analysis that control the variance are desirable but in short supply. We propose a double error shrinkage (DES) method that uses moving average statistics based on local-pooled error estimates; these statistics effectively control both the heterogeneous error variances and the correlation structures of the extremely large number of individual probes on tiling arrays.
Results: Applying DES to a ChIP–chip tiling array study for discovering genome-wide protein-binding sites, we identified 8400 target regions that include highly likely TFIID binding sites. About 33% of these were well matched with the known transcription start sites in the DBTSS library, while many other newly identified sites have a high chance of being real binding sites, based on the high positive predictive value of DES. We also showed the superior performance of DES compared with other commonly used methods for detecting actual protein binding sites.
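The two shrinkage steps named in the Motivation can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the window sizes, the neighbor-based pooling rule and the final scaling are all assumptions chosen only to show the general idea of pooling error estimates across probes of similar signal and then averaging along the genome.

```python
from statistics import mean, variance


def moving_average(values, window):
    """Centered moving average; the window is truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(mean(values[lo:hi]))
    return out


def local_pooled_variance(means, variances, n_neighbors):
    """First shrinkage (assumed form): pool each probe's variance with
    the variances of probes that have a similar mean signal."""
    order = sorted(range(len(means)), key=lambda i: means[i])
    smoothed = moving_average([variances[i] for i in order], n_neighbors)
    pooled = [0.0] * len(means)
    for rank, i in enumerate(order):
        pooled[i] = smoothed[rank]
    return pooled


def des_like_statistic(log_ratios, window=3, n_neighbors=3):
    """Second shrinkage (assumed form): moving averages along the genome.

    log_ratios: one list of replicate values per probe, in genomic order.
    The scaling ignores window size and probe correlation, so the result
    is a crude signal-to-noise score, not a calibrated test statistic.
    """
    n_rep = len(log_ratios[0])
    means = [mean(r) for r in log_ratios]
    variances = [variance(r) for r in log_ratios]
    pooled = local_pooled_variance(means, variances, n_neighbors)
    ma_mean = moving_average(means, window)
    ma_var = moving_average(pooled, window)
    return [
        m / (v / n_rep) ** 0.5 if v > 0 else 0.0
        for m, v in zip(ma_mean, ma_var)
    ]
```

With only two or three replicates, a per-probe variance is very noisy, which is why the sketch borrows information twice: once from probes with similar signal and once from neighboring probes on the array.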
Contact: tspark{at}snu.ac.kr; jaeklee{at}virginia.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Alex Bateman
Received on April 15, 2009; revised on July 21, 2009; accepted on July 30, 2009