
Bioinformatics 20(Suppl. 1) © Oxford University Press 2004; all rights reserved.

Efficient approximations for learning phylogenetic HMM models from data

Vladimir Jojic 1,*, Nebojsa Jojic 1, Chris Meek 1, Dan Geiger 2, Adam Siepel 3, David Haussler 3,4 and D. Heckerman 1

1 Microsoft Research, Redmond, WA 98052, USA, 2 Technion—Israel Institute of Technology Computer Science Department, Haifa 32000, Israel, 3 Center for Biomolecular Science and Engineering and 4 Howard Hughes Medical Institute, University of California Santa Cruz, CA 95064, USA

Received on January 15, 2004; accepted on March 1, 2004

Motivation: We consider models useful for learning an evolutionary or phylogenetic tree from data consisting of DNA sequences corresponding to the leaves of the tree. In particular, we consider a general probabilistic model described in Siepel and Haussler, which we call the phylogenetic-HMM model and which generalizes the classical probabilistic models of Neyman and Felsenstein. Unfortunately, computing the likelihood of phylogenetic-HMM models is intractable. We consider several approximations for computing the likelihood of such models, including an approximation introduced in Siepel and Haussler, loopy belief propagation and several variational methods.

Results: We demonstrate that, unlike the other approximations, variational methods are accurate and are guaranteed to lower bound the likelihood. In addition, we identify one particular variational approximation as the best: one in which the posterior distribution is variationally approximated using the classic Neyman–Felsenstein model. The application of our best approximation to data from the cystic fibrosis transmembrane conductance regulator gene region across nine eutherian mammals reveals a CpG effect.
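
The lower-bound guarantee mentioned above follows from the standard variational inequality, sketched here in generic notation. The symbols are assumptions introduced for illustration and do not come from the abstract: x stands for the observed leaf sequences, h for the hidden variables (for example, ancestral sequences), \theta for the model parameters, and q for any approximating distribution over h.

```latex
\log p(x \mid \theta)
  = \log \sum_{h} p(x, h \mid \theta)
  \;\ge\; \sum_{h} q(h) \log \frac{p(x, h \mid \theta)}{q(h)}
  = \log p(x \mid \theta) - \mathrm{KL}\bigl(q(h) \,\|\, p(h \mid x, \theta)\bigr)
```

Because the KL divergence is non-negative, the bound holds for every choice of q and becomes tight when q matches the true posterior. Restricting q to a tractable family, such as one with the structure of the Neyman–Felsenstein model, is what keeps the bound computable.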

Contact: vjojic{at}psi.toronto.edu

* To whom correspondence should be addressed.

