Identification of consensus patterns in unaligned DNA sequences known to be functionally related
Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO 80309-0347, USA
*To whom reprint requests should be sent
We have developed a method for identifying consensus patterns in a set of unaligned DNA sequences known to bind a common protein or to have some other common biochemical function. The method is based on a matrix representation of binding site patterns. Each row of the matrix represents one of the four possible bases, each column represents one of the positions of the binding site, and each element is determined by the frequency with which the indicated base occurs at the indicated position. The goal of the method is to find the most significant matrix (i.e. the one with the lowest probability of occurring by chance) out of all the matrices that can be formed from the set of related sequences. The reliability of the method improves with the number of sequences, while the time required increases only linearly with the number of sequences. To test this method, we analysed 11 DNA sequences containing promoters regulated by the Escherichia coli LexA protein. The matrices we found were consistent with the known consensus sequence, and could distinguish the generally accepted LexA binding sites from other DNA sequences.
Received on November 6, 1989; accepted on December 20, 1989
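The matrix representation described in the abstract can be illustrated with a minimal Python sketch. It assumes the candidate sites have already been aligned and are of equal length, so it shows only how one matrix is built and scored, not the search over unaligned sequences. The abstract does not state the significance formula; the score used here (information content relative to a uniform background of 0.25 per base) is an assumption chosen for illustration, and the function names and the example sites are hypothetical.

```python
import math

BASES = "ACGT"  # row order of the matrix


def count_matrix(sites):
    """Count how often each base occurs at each position.

    Returns a dict mapping each base (row) to a list of counts,
    one entry per site position (column).
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    counts = {base: [0] * length for base in BASES}
    for site in sites:
        for pos, base in enumerate(site.upper()):
            counts[base][pos] += 1
    return counts


def frequency_matrix(counts, n_sites):
    """Convert base counts to per-position frequencies."""
    return {base: [c / n_sites for c in row] for base, row in counts.items()}


def information_content(freqs, background=0.25):
    """Sum over positions and bases of f * log2(f / background).

    Assumed stand-in for the matrix 'significance' mentioned in the
    abstract; a higher value means a pattern less expected by chance.
    """
    length = len(freqs[BASES[0]])
    total = 0.0
    for pos in range(length):
        for base in BASES:
            f = freqs[base][pos]
            if f > 0:  # the term for f == 0 is taken as 0
                total += f * math.log2(f / background)
    return total


if __name__ == "__main__":
    # Made-up toy sites, not real LexA binding sites.
    toy_sites = ["CTGTAT", "CTGTAC", "CTGGAT", "CTGTAT"]
    counts = count_matrix(toy_sites)
    freqs = frequency_matrix(counts, len(toy_sites))
    for base in BASES:
        print(base, counts[base])
    print("information content (bits):", round(information_content(freqs), 3))
```

Under this assumed score, a fully conserved position contributes 2 bits and a position with all four bases equally frequent contributes 0, so matrices built from more consistent site alignments rank higher.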








