Bioinformatics Advance Access published online on January 29, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth006
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Department of Statistics, Harvard University, 1 Oxford ST, Cambridge, MA 02138, USA
* To whom correspondence should be addressed. E-mail: jliu{at}stat.harvard.edu.
Motivation: The position-specific weight matrix (PWM) model, which assumes that each position in the DNA site contributes independently to the overall protein-DNA interaction, has been the primary means of describing transcription factor binding site motifs. Recent biological experiments, however, suggest that positions within binding sites are interdependent. To exploit this interdependence in motif discovery, we extend the PWM model to include pairs of correlated positions and design a Markov chain Monte Carlo algorithm to sample in the model space. We then combine the model sampling step with the Gibbs sampling framework for de novo motif discovery.
Results: Testing on experimentally validated binding sites, we find that about 25% of the transcription factor binding motifs show significant within-site position correlations, and that 80% of these motif models can be improved by considering the correlated positions. Using both simulated data and real promoter sequences, we show that the new de novo motif-finding algorithm can infer the true correlated position pairs accurately and is more precise in finding putative TF binding sites than the standard Gibbs sampling algorithms.
Availability: The program is available at http://www.people.fas.harvard.edu/~junliu/
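To make the modeling idea in the abstract concrete, the following is a minimal sketch, not the authors' program: it contrasts a plain PWM, which scores each position independently, with a PWM extended by one pair of positions modeled jointly. The function names, the pseudocount value, and the toy sites are all assumptions made for illustration. The sketch also takes the correlated pair as given, whereas the paper infers such pairs by Markov chain Monte Carlo sampling in the model space.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def pwm_from_sites(sites, pseudocount=0.5):
    """Estimate per-position base probabilities from aligned, equal-length sites."""
    width = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for j in range(width):
        counts = Counter(site[j] for site in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm

def pair_from_sites(sites, i, j, pseudocount=0.5):
    """Estimate the joint distribution over the 16 dinucleotides at positions (i, j)."""
    counts = Counter(site[i] + site[j] for site in sites)
    total = len(sites) + pseudocount * len(ALPHABET) ** 2
    return {a + b: (counts[a + b] + pseudocount) / total
            for a in ALPHABET for b in ALPHABET}

def log_prob_independent(seq, pwm):
    """Log-probability of seq under the PWM (every position independent)."""
    return sum(math.log(pwm[k][base]) for k, base in enumerate(seq))

def log_prob_with_pair(seq, pwm, pair, i, j):
    """Log-probability when positions i and j are scored jointly, the rest independently."""
    rest = sum(math.log(pwm[k][base])
               for k, base in enumerate(seq) if k not in (i, j))
    return rest + math.log(pair[seq[i] + seq[j]])

# Toy aligned sites (hypothetical): positions 1 and 2 are always "CG" or "TA".
sites = ["ACGT", "ATAT", "ACGT", "ATAT"]
pwm = pwm_from_sites(sites)
pair = pair_from_sites(sites, 1, 2)

for candidate in ("ACGT", "ACAT"):
    print(candidate,
          round(log_prob_independent(candidate, pwm), 3),
          round(log_prob_with_pair(candidate, pwm, pair, 1, 2), 3))
```

On this toy data the independent PWM cannot distinguish the observed combination "CG" from the unobserved "CA" at positions 1 and 2, because both bases are equally frequent in each column. The pair model assigns the unobserved combination a much lower probability.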
Revised October 31, 2003
Accepted November 3, 2003
Modeling within-motif dependence for transcription factor binding site predictions