Bioinformatics Vol. 18 no. 90002 2002
Pages S211-S218
© 2002 Oxford University Press
Genome segmentation using piecewise constant intensity models and reversible jump MCMC
1 HIIT Basic Research Unit, Department of Computer Science,
University of Helsinki, PO Box 26, FIN-00014 Helsinki, Finland
2 Molecular Genetics, Department of Biosciences at Novum,
Karolinska Institute, S-141 57 Huddinge, Sweden
Received on April 8, 2002; accepted on June 15, 2002
The existence of whole genome sequences makes it possible to search for global structure in the genome. We consider modeling the occurrence frequencies of discrete patterns (such as starting points of ORFs or other interesting phenomena) along the genome. We use piecewise constant intensity models with a varying number of pieces, and show how a reversible jump Markov chain Monte Carlo (RJMCMC) method can be used to obtain the posterior distribution of the intensity of the patterns along the genome. We apply the method to modeling the occurrence of ORFs in the human genome. The results show that the chromosomes consist of 5 to 35 clearly distinct segments, and that the posterior number and length of the segments show significant variation. On the other hand, for the yeast genome the intensity of ORFs is nearly constant.
Contact: Marko.Salmenkivi@cs.helsinki.fi; Juha.Kere@biosci.ki.se; Heikki.Mannila@cs.helsinki.fi
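
As an illustration of the piecewise constant intensity model named in the abstract, the following minimal sketch evaluates the log-likelihood of a set of pattern positions under a given segmentation, treating the occurrences as an inhomogeneous Poisson process. It is not the authors' implementation. The function name `piecewise_loglik` and its parameters are hypothetical, the Poisson assumption is ours, and the RJMCMC sampler that would propose adding, removing, or moving change points is not shown.

```python
import math
from bisect import bisect_right


def piecewise_loglik(events, change_points, intensities, start, end):
    """Log-likelihood of event positions under a piecewise constant
    Poisson intensity (hypothetical helper, for illustration only).

    events        -- positions of the pattern occurrences, all in [start, end]
    change_points -- sorted segment boundaries strictly inside (start, end)
    intensities   -- one positive rate per segment (len(change_points) + 1)
    """
    if len(intensities) != len(change_points) + 1:
        raise ValueError("need exactly one intensity per segment")
    if any(lam <= 0 for lam in intensities):
        raise ValueError("intensities must be positive")

    bounds = [start, *change_points, end]

    # Count the events that fall into each segment.
    counts = [0] * len(intensities)
    for x in events:
        counts[bisect_right(change_points, x)] += 1

    # Sum over segments j of: n_j * log(lambda_j) - lambda_j * length_j
    loglik = 0.0
    for j, lam in enumerate(intensities):
        length = bounds[j + 1] - bounds[j]
        loglik += counts[j] * math.log(lam) - lam * length
    return loglik


# Toy data (made up): three occurrences before position 10, one after.
toy_events = [1.0, 2.0, 3.0, 12.0]
two_segments = piecewise_loglik(toy_events, [10.0], [0.3, 0.1], 0.0, 20.0)
one_segment = piecewise_loglik(toy_events, [], [0.2], 0.0, 20.0)
```

In a sampler of the kind the abstract describes, a quantity like this would be combined with priors on the number of segments, the change points, and the intensities to compare segmentations of different dimension.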