Bioinformatics, Vol 15, 362-369, Copyright © 1999 by Oxford University Press
U Ohler, S Harbeck, H Niemann, E Noth and MG Reese
MOTIVATION: We describe a new content-based approach for the detection of
promoter regions of eukaryotic protein encoding genes. Our system is based
on three interpolated Markov chains (IMCs) of different order which are
trained on coding, non-coding and promoter sequences. It was recently shown
that the interpolation of Markov chains leads to stable parameters and
improves on the results in microbial gene finding (Salzberg et al., Nucleic
Acids Res., 26, 544-548, 1998). Here, we present new methods for an
automated estimation of optimal interpolation parameters and show how the
IMCs can be applied to detect promoters in contiguous DNA sequences. Our
interpolation approach can also be employed to obtain a reliable scoring
function for human coding DNA regions, and the trained models can easily be
incorporated in the general framework for gene recognition systems.
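
The core idea, mixing Markov chains of several orders so that sparsely observed long contexts fall back on shorter ones, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the count-based weight n / (n + confidence), the pseudocount and the toy training strings are hypothetical, and none of them is the parameter estimation method or the data used in the paper.

```python
from collections import defaultdict
import math

ALPHABET = "ACGT"


def train_counts(sequences, max_order):
    """Count how often each base follows every context of length 0..max_order."""
    counts = [defaultdict(lambda: defaultdict(int)) for _ in range(max_order + 1)]
    for seq in sequences:
        for i, base in enumerate(seq):
            for k in range(min(i, max_order) + 1):
                counts[k][seq[i - k:i]][base] += 1
    return counts


def interpolated_prob(counts, context, base, pseudo=1.0, confidence=10.0):
    """Estimate P(base | context) by interpolating orders 0..len(context).

    Illustrative weighting (an assumption, not the paper's estimator):
    the order-k estimate gets weight n / (n + confidence), where n is the
    number of times the length-k context occurred in training.
    """
    zero = counts[0].get("", {})
    total = sum(zero.values())
    # Order 0 with a pseudocount, so no base ever gets probability zero.
    prob = (zero.get(base, 0) + pseudo) / (total + pseudo * len(ALPHABET))
    max_order = len(counts) - 1
    for k in range(1, min(len(context), max_order) + 1):
        seen = counts[k].get(context[len(context) - k:])
        n = sum(seen.values()) if seen else 0
        if n == 0:
            break  # unseen context: keep the lower-order estimate
        lam = n / (n + confidence)
        prob = lam * (seen.get(base, 0) / n) + (1.0 - lam) * prob
    return prob


def log_score(counts, seq):
    """Log-likelihood of seq under one interpolated model."""
    max_order = len(counts) - 1
    return sum(
        math.log(interpolated_prob(counts, seq[max(0, i - max_order):i], base))
        for i, base in enumerate(seq)
    )


def classify(models, seq):
    """Return the name of the model that assigns seq the highest score."""
    return max(models, key=lambda name: log_score(models[name], seq))


# Toy training data (hypothetical strings, not real promoter or coding DNA).
models = {
    "promoter": train_counts(["TATAAATATATAAATATA"], max_order=2),
    "coding": train_counts(["ATGGCCGGCGCCGGCATGGCC"], max_order=2),
}
label = classify(models, "TATATA")
```

In this sketch a sequence window is assigned to whichever class model scores it highest; a detector for contiguous DNA would slide such a window along the sequence and apply a decision threshold, which is left out here.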
RESULTS: A 5-fold cross-validation evaluation of our IMC approach on a
representative sequence set yielded a mean correlation coefficient of 0.84
(promoter versus coding sequences) and 0.53 (promoter versus non-coding
sequences). Applied to the task of eukaryotic promoter region
identification in genomic DNA sequences, our classifier identifies 50% of
the promoter regions in the sequences used in the most recent review and
comparison by Fickett and Hatzigeorgiou (Genome Res., 7, 861-878, 1997),
while having a false-positive rate of 1/849 bp.
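
For readers unfamiliar with the two figures of merit quoted above, the following Python sketch shows how they are typically computed. It assumes that the correlation coefficient is the one commonly used in gene-finding evaluations (the Matthews correlation coefficient over a two-class confusion matrix) and that the false-positive rate is expressed as base pairs scanned per false prediction; the abstract does not spell out either definition, and the function names are hypothetical.

```python
import math


def correlation_coefficient(tp, tn, fp, fn):
    """Correlation coefficient of a two-class confusion matrix (assumed definition)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # undefined case, reported as 0 by convention in this sketch
    return (tp * tn - fp * fn) / denom


def bases_per_false_positive(false_positives, total_bp):
    """Average number of base pairs scanned per false-positive prediction."""
    if false_positives == 0:
        return math.inf
    return total_bp / false_positives
```

Under this reading, a rate of 1/849 bp means one false prediction for every 849 bp of scanned sequence on average.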
ARTICLES
Interpolated Markov chains for eukaryotic promoter recognition
University of Erlangen-Nuremberg, Martensstrasse 3, D-91058 Erlangen, Germany. ohler@informatik.uni-erlangen.de









