Bioinformatics Advance Access originally published online on January 10, 2006
Bioinformatics 2006 22(5):517-522; doi:10.1093/bioinformatics/btk029
Bayesian classifiers for detecting HGT using fixed and variable order Markov models of genomic signatures
1Department of Computing Science, Chalmers University SE 412 96 Göteborg, Sweden
2Department of Cell and Molecular Biology, Microbiology, Göteborg University 405 30 Göteborg, Sweden
*To whom correspondence should be addressed.
Motivation: Analyses of genomic signatures are gaining attention as they allow studies of species-specific relationships without involving alignments of homologous sequences. A naïve Bayesian classifier was built to discriminate between different bacterial compositions of short oligomers, also known as DNA words. The classifier has proven successful in identifying foreign genes in Neisseria meningitidis. In this study we extend the classifier approach using either a fixed higher-order Markov model (Mk) or a variable length Markov model (VLMk).
Results: We propose a simple algorithm to lock a variable length Markov model to a certain number of parameters and show that the use of Markov models greatly increases flexibility and prediction accuracy compared with a naïve model. We also test the integrity of the classifiers in terms of false negatives and give estimates of the minimal sizes of training data. We end the report by proposing a method to reject a false hypothesis of horizontal gene transfer.
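To make the fixed-order case concrete, the following is a minimal sketch of a Bayesian classifier built on order-k Markov models of DNA, under stated assumptions. It is not the authors' software: the function names (train_markov, log_likelihood, classify), the add-one pseudocount smoothing and the toy sequences are all hypothetical choices made for illustration, and the variable length model (VLMk) is not covered.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def train_markov(seqs, k, pseudo=1.0):
    """Return a function giving log P(base | previous k bases).

    Counts are smoothed with a pseudocount (an assumption of this sketch),
    so contexts never seen in training fall back to a uniform distribution.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1.0

    def logp(ctx, base):
        c = counts.get(ctx, {})
        total = sum(c.values()) + pseudo * len(ALPHABET)
        return math.log((c.get(base, 0.0) + pseudo) / total)

    return logp

def log_likelihood(seq, logp, k):
    """Sum of conditional log-probabilities over the sequence (first k bases skipped)."""
    return sum(logp(seq[i - k:i], seq[i]) for i in range(k, len(seq)))

def classify(seq, models, k, log_priors=None):
    """Pick the class with the highest log posterior (uniform priors by default)."""
    scores = {}
    for name, logp in models.items():
        prior = log_priors[name] if log_priors else 0.0
        scores[name] = prior + log_likelihood(seq, logp, k)
    return max(scores, key=scores.get), scores

# Toy training data (hypothetical, far smaller than any realistic genome sample).
K = 2
MODELS = {
    "at_rich": train_markov(["ATATATATTATAATAT", "TTATAATATATTAATA"], K),
    "gc_rich": train_markov(["GCGCGGCCGCGCCGGC", "CCGCGGCGCGCCGCGG"], K),
}

if __name__ == "__main__":
    label, scores = classify("ATTATATA", MODELS, K)
    print(label, scores)
```

In this sketch a gene whose best-scoring class differs from its host genome would be a candidate for horizontal transfer; how such a call should be tested and possibly rejected is the subject of the paper itself and is not reproduced here.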
Availability: Software and supplementary information are available at www.cs.chalmers.se/~dalevi/genetic_sign_classifiers/
Contact: dalevi{at}cs.chalmers.se
Received on June 21, 2005; revised on December 8, 2005; accepted on December 27, 2005