Bioinformatics Advance Access originally published online on May 12, 2008
Bioinformatics 2008 24(13):1481-1488; doi:10.1093/bioinformatics/btn231
A distance metric for a class of tree-sibling phylogenetic networks
1Department of Mathematics and Computer Science, University of the Balearic Islands, E-07122 Palma de Mallorca and 2Algorithms, Bioinformatics, Complexity and Formal Methods Research Group, Technical University of Catalonia, E-08034 Barcelona, Spain
*To whom correspondence should be addressed.
Abstract
Motivation: The presence of reticulate evolutionary events in phylogenies turns phylogenetic trees into phylogenetic networks. These events imply in particular that there may exist multiple evolutionary paths from a non-extant species to an extant one, and this multiplicity makes the comparison of phylogenetic networks much more difficult than the comparison of phylogenetic trees. In fact, all attempts to define a sound distance measure on the class of all phylogenetic networks have failed so far. Thus, the only practical solutions have been either to use rough estimates of similarity (based on comparison of the trees embedded in the networks), or to narrow the class of phylogenetic networks to one where such a distance is known and can be efficiently computed. The first approach has the problem that one may identify two networks as equivalent when they are not; the second has the drawback that there may not exist algorithms to reconstruct such networks from biological sequences.
Results: We present in this article a distance measure on the class of semi-binary tree-sibling time consistent phylogenetic networks, which generalize tree-child time consistent phylogenetic networks, and thus also galled trees. The practical interest of this distance measure is twofold: it can be computed in polynomial time by means of simple algorithms, and there also exist polynomial-time algorithms for reconstructing networks of this class from DNA sequence data.
Availability: The Perl package Bio::PhyloNetwork, included in the BioPerl bundle, implements many algorithms on phylogenetic networks, including the computation of the distance presented in this article.
Contact: gabriel.cardona@uib.es
Supplementary information: Some counterexamples, proofs of the results not included in this article, and some computational experiments are available at Bioinformatics online.
Associate Editor: Martin Bishop
Received on March 19, 2007; revised on May 11, 2008; accepted on May 11, 2008