Bioinformatics Vol. 18 no. 90001 2002
Pages S285-S293
© 2002 Oxford University Press
Statistically based postprocessing of phylogenetic analysis by clustering
1 Texas Institute for Computational and Applied Mathematics,
University of Texas, ACES 6.412, Austin TX 78712, USA
2 Department of Computer Sciences,
University of Texas, Austin TX 78712, USA
Received on January 24, 2002; revised on March 29, 2002; accepted on March 29, 2002
Motivation: Phylogenetic analyses often produce thousands of candidate trees. Biologists resolve the conflicts among them by computing a consensus of these trees. Single-tree consensus methods, used as postprocessing, can be unsatisfactory due to their inherent limitations.
Results: In this paper we present an alternative approach that applies clustering algorithms to the set of candidate trees. We propose bicriterion problems, in particular ones based on the concept of information loss, and new consensus trees, called characteristic trees, that minimize the information loss. Our empirical study using four biological datasets shows that our approach provides a significant improvement in information content while adding only a small amount of complexity. Furthermore, the consensus trees we obtain for each of our large clusters are more resolved than the single-tree consensus trees. We also report some initial progress on theoretical questions that arise in this context.
Availability: Software is available upon request from the authors. The agglomerative clustering is implemented in Matlab (MathWorks, 2000) with the Statistics Toolbox. The Robinson-Foulds distance matrices and the strict consensus trees are computed using PAUP (Swofford, 2001) and Daniel Huson's tree library on Intel Pentium workstations running Debian Linux.
Contact: lisan{at}cs.utexas.edu
Supplementary Information: http://www.cs.utexas.edu/users/lisan/ismb02/
Keywords: consensus methods; clustering; phylogenetics; information theory; maximum parsimony.
* To whom correspondence should be addressed.
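The pipeline named in the abstract (Robinson-Foulds distances, agglomerative clustering, one strict consensus tree per cluster) can be illustrated with a minimal Python sketch. This is not the authors' Matlab/PAUP implementation. It assumes each tree is given as a set of nontrivial splits (each split written as the side that does not contain taxon A), uses complete linkage as an assumed merge rule, and runs on a hypothetical four-tree, six-taxon example. The function names are invented for illustration.

```python
from itertools import combinations

def rf_distance(a, b):
    """Robinson-Foulds distance: number of splits found in exactly one tree."""
    return len(a ^ b)

def strict_consensus(trees):
    """Splits shared by every tree in the collection."""
    return frozenset.intersection(*trees)

def agglomerative(trees, k):
    """Merge clusters of tree indices until k remain (complete linkage, assumed)."""
    clusters = [[i] for i in range(len(trees))]

    def linkage(c1, c2):
        # Complete linkage: largest pairwise distance between the two clusters.
        return max(rf_distance(trees[i], trees[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        # combinations yields index pairs with x < y, so deleting y is safe.
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

def tree(*splits):
    """Build a tree from split strings, e.g. 'CD' for the split CD | rest."""
    return frozenset(frozenset(s) for s in splits)

# Hypothetical candidate trees on taxa A..F (toy data, not from the paper).
candidates = [
    tree("CDEF", "CD", "EF"),
    tree("CDEF", "CD", "CDF"),
    tree("BDEF", "BD", "EF"),
    tree("BDEF", "BD", "BDF"),
]

clusters = agglomerative(candidates, k=2)
single = strict_consensus(candidates)
per_cluster = [strict_consensus([candidates[i] for i in c]) for c in clusters]

print("clusters:", clusters)
print("splits in single strict consensus:", len(single))
print("splits in per-cluster consensus trees:", [len(c) for c in per_cluster])
```

On this toy input the single strict consensus over all four trees keeps no nontrivial split, whereas each of the two clusters has a consensus with two splits. This mirrors, on a very small scale, the abstract's observation that per-cluster consensus trees are more resolved than a single-tree consensus.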