Bioinformatics Vol. 17 no. 9 2001
Pages 803-820
© 2001 Oxford University Press
Evolutionary HMMs: a Bayesian approach to multiple alignment
Group T10, Los Alamos National Laboratory, NM 87545, USA
Received on February 21, 2001; revised on April 6, 2001; accepted on April 6, 2001
Motivation: We review proposed syntheses of probabilistic sequence alignment, profiling and phylogeny. We develop a multiple alignment algorithm for Bayesian inference in the links model proposed by Thorne et al. (1991, J. Mol. Evol., 33, 114–124). The algorithm, described in detail in Section 3, samples from and/or maximizes the posterior distribution over multiple alignments for any number of DNA or protein sequences, conditioned on a phylogenetic tree. The individual sampling and maximization steps of the algorithm require no more computational resources than pairwise alignment.
Methods: We present a software implementation (Handel) of our algorithm and report test results on (i) simulated data sets and (ii) the structurally informed protein alignments of BAliBASE (Thompson et al., 1999, Nucleic Acids Res., 27, 2682–2690).
Results: We find that the mean sum-of-pairs score (a measure of residue-pair correspondence) for the BAliBASE alignments is only 13% lower for Handel than for CLUSTALW (Thompson et al., 1994, Nucleic Acids Res., 22, 4673–4680), despite the relative simplicity of the links model (CLUSTALW uses affine gap scores and increased penalties for indels in hydrophobic regions). With reference to these benchmarks, we discuss potential improvements to the links model and implications for Bayesian multiple alignment and phylogenetic profiling.
Availability: The source code to Handel is freely distributed on the Internet at http://www.biowiki.org/Handel under the terms of the GNU Public License (GPL, 2000, http://www.fsf.org./copyleft/gpl.html).
Contact: ihh{at}fruitfly.org
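The sum-of-pairs score mentioned in the Results can be illustrated with a minimal sketch. It assumes the common benchmark definition: the fraction of residue pairs aligned in the reference alignment that are also aligned in the test alignment. The function names `aligned_pairs` and `sum_of_pairs_score`, the row-of-strings input format and the `-` gap character are illustrative assumptions; this is not code from Handel or BAliBASE.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share a column.

    Each residue is identified by (row index, ungapped position), so the
    result can be compared across alignments of the same sequences.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all alignment rows must have the same length")
    n_cols = len(alignment[0]) if alignment else 0
    position = [0] * len(alignment)  # next ungapped position in each row
    pairs = set()
    for col in range(n_cols):
        present = []
        for row_idx, row in enumerate(alignment):
            if row[col] != gap:
                present.append((row_idx, position[row_idx]))
                position[row_idx] += 1
        # Every pair of residues in this column counts as aligned.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference, gap="-"):
    """Fraction of reference residue pairs recovered by the test alignment."""
    ref_pairs = aligned_pairs(reference, gap)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned residue pairs")
    return len(aligned_pairs(test, gap) & ref_pairs) / len(ref_pairs)


# Hypothetical toy alignments of the sequences ACGT and ACAGT.
reference = ["AC-GT", "ACAGT"]
test = ["ACG-T", "ACAGT"]
print(sum_of_pairs_score(test, reference))  # 3 of 4 reference pairs: 0.75
```

Under this definition a test alignment identical to the reference scores 1, and misplacing a single gap, as in the toy example, loses only the residue pairs that the shift breaks.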