First and second moment of counts of words in random texts generated by Markov chains
Institute of Molecular Biology and Biochemistry, Department of Molecular Biology and Informatics, Free University of Berlin, Arnimallee 22, D-1000 Berlin 33, Germany
School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA and Institute of Molecular Genetics, 123182 Moscow
An exact expression is presented for the variance of the random frequency with which a given word occurs in text generated by a Markov chain. The result is applied to periodic Markov chains, which describe protein-coding DNA sequences better than simple Markov chains. A new solution to the problem of word overlap is proposed. The expected frequency and the overlapping properties of a word were found to determine most of the variance. The expectation and variance of triplet counts are compared with experimental counts in Escherichia coli coding sequences.
Received on June 11, 1991; accepted on January 31, 1992
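To make the quantities named in the abstract concrete, the following is a minimal sketch of the expected count of a word in text generated by a stationary first-order Markov chain, together with the shifts at which a word overlaps itself. It is not the article's exact variance expression or its proposed treatment of overlap, neither of which is reproduced in this abstract. The function names, the four-letter DNA alphabet, and the uniform transition matrix `UNIFORM_P` are illustrative assumptions.

```python
ALPHABET = "ACGT"  # assumption: DNA alphabet, chosen for illustration

# Assumption: a uniform transition matrix, used only as a toy example.
UNIFORM_P = {a: {b: 0.25 for b in ALPHABET} for a in ALPHABET}


def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of P by power iteration."""
    pi = {a: 1.0 / len(ALPHABET) for a in ALPHABET}
    for _ in range(iterations):
        pi = {b: sum(pi[a] * P[a][b] for a in ALPHABET) for b in ALPHABET}
    return pi


def word_probability(word, P, pi):
    """Probability that the word starts at a given position of a stationary chain."""
    prob = pi[word[0]]
    for a, b in zip(word, word[1:]):
        prob *= P[a][b]
    return prob


def expected_count(word, text_length, P, pi):
    """Expected number of (possibly overlapping) occurrences of the word."""
    positions = max(text_length - len(word) + 1, 0)
    return positions * word_probability(word, P, pi)


def overlap_shifts(word):
    """Shifts k at which the word can overlap a copy of itself."""
    return [k for k in range(1, len(word)) if word[k:] == word[:-k]]


pi = stationary_distribution(UNIFORM_P)
print(expected_count("ATG", 1002, UNIFORM_P, pi))  # 1000 start positions * (1/4)**3
print(overlap_shifts("AAA"), overlap_shifts("ATA"), overlap_shifts("ATG"))
```

Words with non-empty overlap shifts, such as AAA, can occur in clumps, which is one reason overlap structure matters for the variance of the count even when two words have the same expected frequency.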




