Bioinformatics Vol. 18 no. 90001 2002
Pages S285-S293
© 2002 Oxford University Press

Statistically based postprocessing of phylogenetic analysis by clustering

Cara Stockham 1, Li-San Wang 1,* and Tandy Warnow 2

1 Texas Institute for Computational and Applied Mathematics, University of Texas, ACES 6.412, Austin TX 78712, USA
2 Department of Computer Sciences, University of Texas, Austin TX 78712, USA

Received on January 24, 2002; revised on March 29, 2002; accepted on March 29, 2002

Motivation: Phylogenetic analyses often produce thousands of candidate trees. Biologists resolve the conflict by computing the consensus of these trees. Single-tree consensus methods, when used for postprocessing, can be unsatisfactory because of their inherent limitations.

Results: In this paper we present an alternative approach that applies clustering algorithms to the set of candidate trees. We propose bicriterion problems, in particular using the concept of information loss, and new consensus trees, called characteristic trees, that minimize the information loss. Our empirical study using four biological datasets shows that our approach provides a significant improvement in information content while adding only a small amount of complexity. Furthermore, the consensus trees we obtain for each of our large clusters are more resolved than the single-tree consensus trees. We also provide some initial progress on theoretical questions that arise in this context.
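The claim that per-cluster consensus trees are more resolved than a single consensus tree can be illustrated with a small sketch. It is not the authors' software: the nested-tuple tree encoding, the function names, the four six-taxon trees, and the hand-picked two-cluster partition are all assumptions made for illustration. A strict consensus is computed here as the set of nontrivial bipartitions shared by every tree in a collection, so more shared bipartitions means a more resolved consensus.

```python
from functools import reduce


def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the nontrivial bipartitions of an unrooted tree.

    Each bipartition is stored as the side that does not contain the
    alphabetically first taxon, so equal splits compare equal.
    """
    taxa = leaves(tree)
    ref = min(taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = taxa - clade if ref in clade else clade
        # Keep only splits with at least two taxa on each side.
        if 2 <= len(side) <= len(taxa) - 2:
            found.add(side)
        return clade

    visit(tree)
    return found


def strict_consensus_splits(trees):
    """Bipartitions present in every tree (the strict consensus)."""
    return reduce(set.intersection, (splits(t) for t in trees))


# Hypothetical candidate trees on taxa A..F (illustrative data only).
t1 = (("A", "B"), ("C", "D"), ("E", "F"))
t2 = (("A", "B"), (("C", "D"), "E"), "F")
t3 = (("A", "F"), ("B", "E"), ("C", "D"))
t4 = (("A", "F"), (("B", "E"), "C"), "D")

overall = strict_consensus_splits([t1, t2, t3, t4])
cluster_1 = strict_consensus_splits([t1, t2])
cluster_2 = strict_consensus_splits([t3, t4])

print(len(overall), len(cluster_1), len(cluster_2))
```

In this toy case the strict consensus of all four trees keeps no nontrivial bipartition (an unresolved star tree), while each two-tree cluster keeps two.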

Availability: Software is available upon request from the authors. The agglomerative clustering is implemented in Matlab (MathWorks, 2000) with the Statistics Toolbox. The Robinson-Foulds distance matrices and the strict consensus trees are computed using PAUP (Swofford, 2001) and Daniel Huson's tree library on Intel Pentium workstations running Debian Linux.
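The pipeline described above (a Robinson-Foulds distance matrix followed by agglomerative clustering) can be sketched in plain Python as a stand-in for the Matlab and PAUP tools. This is a minimal sketch under stated assumptions, not the authors' implementation: each tree is assumed to be already reduced to its set of nontrivial bipartitions, the Robinson-Foulds distance is taken as the size of the symmetric difference (some definitions halve this), complete linkage is an assumed choice, and all names and data are hypothetical.

```python
from itertools import combinations


def rf_distance(splits_a, splits_b):
    """Robinson-Foulds distance: bipartitions found in exactly one tree."""
    return len(splits_a ^ splits_b)


def rf_matrix(trees):
    """Symmetric matrix of pairwise Robinson-Foulds distances."""
    n = len(trees)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = rf_distance(trees[i], trees[j])
    return dist


def complete_linkage(dist, cluster_a, cluster_b):
    """Largest distance between any member of one cluster and the other."""
    return max(dist[i][j] for i in cluster_a for j in cluster_b)


def agglomerate(dist, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: complete_linkage(dist, clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical trees on taxa A..F, each given as its set of bipartitions
# (one side of each split, written as a frozenset of taxon letters).
s = frozenset
trees = [
    {s("CDEF"), s("CD"), s("EF")},
    {s("CDEF"), s("CD"), s("CDE")},
    {s("BCDE"), s("BE"), s("CD")},
    {s("BCDE"), s("BE"), s("BCE")},
]

dist = rf_matrix(trees)
clusters = agglomerate(dist, k=2)
print(dist)
print(clusters)
```

Trees are referred to by index, so the clustering step only needs the distance matrix and could be fed a matrix computed by any other tool.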

Contact: lisan{at}cs.utexas.edu

Supplementary Information: http://www.cs.utexas.edu/users/lisan/ismb02/

Keywords: consensus methods; clustering; phylogenetics; information theory; maximum parsimony.

* To whom correspondence should be addressed.






