Evolution and Phylogenetics
Efficient parsimony-based methods for phylogenetic network reconstruction

1 Department of Computer Science, Rice University, Houston, TX, USA
2 Department of Mathematics, University of California, Berkeley, CA, USA
3 School of Computer Science, Tel Aviv University, Tel Aviv, Israel
*To whom correspondence should be addressed.
Abstract
Motivation: Phylogenies, the evolutionary histories of groups of organisms, play a major role in representing relationships among biological entities. Although many biological processes can be effectively modeled as tree-like relationships, others, such as hybrid speciation and horizontal gene transfer (HGT), result in networks, rather than trees, of relationships. Hybrid speciation is a significant evolutionary mechanism in plants, fish and other groups of species. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Maximum parsimony is one of the most commonly used criteria for phylogenetic tree inference. Roughly speaking, inference based on this criterion seeks the tree that minimizes the amount of evolution. In 1990, Jotun Hein proposed using this criterion for inferring the evolution of sequences subject to recombination. Preliminary results on small synthetic datasets (Nakhleh et al., 2005) demonstrated the criterion's application to phylogenetic network reconstruction in general and HGT detection in particular. However, the naive algorithms used by the authors are inapplicable to large datasets due to their demanding computational requirements. Further, no rigorous theoretical analysis of computing the criterion was given, nor was it tested on biological data.
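To make the tree criterion concrete, the following is a minimal sketch of scoring a fixed rooted binary tree under maximum parsimony using Fitch's classical algorithm. It is an illustration only, not the method developed in this article. The nested-tuple tree encoding, the function names `fitch_score` and `parsimony_score`, and the toy alignment are all assumptions made for the example.

```python
def fitch_score(tree, leaf_states):
    """Minimum number of state changes for one character on a fixed tree.

    tree: a leaf is a taxon name (str); an internal node is a pair
          (left, right). This encoding is an assumption of the sketch.
    leaf_states: dict mapping taxon name -> observed state, e.g. 'A'.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {leaf_states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common      # children agree on at least one state
        changes += 1           # disjoint sets: one change is required
        return left | right

    visit(tree)
    return changes


def parsimony_score(tree, alignment):
    """Sum the per-site Fitch scores; alignment maps taxon -> sequence."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {taxon: seq[i] for taxon, seq in alignment.items()})
        for i in range(length)
    )


# Toy data (hypothetical): four taxa, three aligned sites.
alignment = {"a": "AAG", "b": "AAG", "c": "CAT", "d": "CAG"}
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
print(parsimony_score(tree_1, alignment))  # 2
print(parsimony_score(tree_2, alignment))  # 3
```

On this toy alignment the first topology needs fewer changes, so it is the one the criterion prefers. Scoring a fixed tree is the easy part; inference additionally has to search over tree topologies.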
Results: In the present work, we prove that the problem of scoring the parsimony of a phylogenetic network is NP-hard and provide an improved fixed-parameter tractable algorithm for it. Further, we devise efficient heuristics for parsimony-based reconstruction of phylogenetic networks. We test our methods on both synthetic and biological data (rbcL gene in bacteria) and obtain very promising results.
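The abstract does not spell out how a network is scored. In the formulation commonly attributed to Hein and to Nakhleh et al. (2005), each site is scored on the best tree displayed by the network, and the per-site minima are summed. The sketch below assumes that definition and assumes the displayed trees are supplied explicitly as a list; it is a brute-force illustration of the kind of naive computation the abstract calls inapplicable to large datasets, not the fixed-parameter tractable algorithm of this article. The names `fitch` and `network_parsimony` and the toy data are hypothetical.

```python
def fitch(tree, states):
    """Fitch pass for one site; returns (candidate state set, changes).

    A leaf is a taxon name (str); an internal node is a pair (left, right).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    (ls, lc), (rs, rc) = fitch(tree[0], states), fitch(tree[1], states)
    common = ls & rs
    if common:
        return common, lc + rc
    return ls | rs, lc + rc + 1


def network_parsimony(displayed_trees, alignment):
    """Sum over sites of the minimum Fitch score among the displayed trees."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += min(fitch(tree, site)[1] for tree in displayed_trees)
    return total


# Hypothetical network with one reticulation: taxon b descends either from
# the lineage next to a (tree_1) or, via a transfer, from the lineage next
# to c (tree_2).
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = ("a", (("b", "c"), "d"))
alignment = {"a": "AA", "b": "AC", "c": "CC", "d": "CA"}

print(network_parsimony([tree_1], alignment))          # 3
print(network_parsimony([tree_2], alignment))          # 3
print(network_parsimony([tree_1, tree_2], alignment))  # 2
```

Each site picks the tree that explains it best, so the network scores lower than either tree alone. Because the number of displayed trees can grow exponentially with the number of reticulation nodes, this enumeration quickly becomes impractical.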
Contact: ssagi{at}math.berkeley.edu
The authors wish it to be known that, in their opinion, all the authors should be regarded as joint First Authors.
