Bioinformatics Vol. 19 no. 16 2003
pages 2122-2130
© 2003 Oxford University Press
A new sequence distance measure for phylogenetic tree construction
1 Department of Electrical Engineering, University of Nebraska-Lincoln, 209N WSEC, Lincoln, NE 68503, USA and 2 New England Baptist Bone and Joint Institute, Beth Israel Deaconess Medical Center Genomics Center, Harvard Medical School, Boston, MA 02215, USA
Received on November 18, 2002; revised on March 5, 2003; accepted on April 17, 2003
Motivation: Most existing approaches for phylogenetic inference use multiple alignment of sequences and assume some form of evolutionary model. The multiple alignment strategy does not work for all types of data, e.g. whole genome phylogeny, and the evolutionary models may not always be correct. We propose a new sequence distance measure based on the relative information between the sequences using Lempel-Ziv complexity. The distance matrix thus obtained can be used to construct phylogenetic trees.
Results: The proposed approach does not require sequence alignment and is fully automatic. The algorithm has successfully constructed consistent phylogenies for real and simulated data sets.
Availability: Available on request from the authors.
Contact: hotu{at}bidmc.harvard.edu
* To whom correspondence should be addressed.
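The abstract does not give the formula for the distance, so the following is only a minimal sketch of how an alignment-free, Lempel-Ziv-based distance matrix could be computed. It assumes that the complexity c(S) is the number of components in the exhaustive Lempel-Ziv (1976) parsing of S, and that the distance takes the normalized form max{c(SQ) - c(S), c(QS) - c(Q)} / max{c(S), c(Q)}. The function names `lz_complexity`, `lz_distance` and `distance_matrix` are hypothetical and are not taken from the authors' software.

```python
def lz_complexity(s):
    """Number of components in the exhaustive Lempel-Ziv (1976) parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the candidate component while it can still be copied
        # from the text that precedes its last symbol (overlap allowed).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def lz_distance(s, q):
    """Assumed normalized distance: how much new information each
    sequence adds when appended to the other."""
    cs, cq = lz_complexity(s), lz_complexity(q)
    denom = max(cs, cq)
    if denom == 0:  # both sequences empty
        return 0.0
    return max(lz_complexity(s + q) - cs, lz_complexity(q + s) - cq) / denom


def distance_matrix(seqs):
    """Symmetric pairwise distance matrix as a list of lists."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            d[a][b] = d[b][a] = lz_distance(seqs[a], seqs[b])
    return d


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    seqs = ["AACGTACCATTG", "AACGTACCATTC", "GGGTTTAAACCC"]
    for row in distance_matrix(seqs):
        print(["%.3f" % x for x in row])
```

A matrix produced this way could then be passed to any standard distance-based tree-building method, such as neighbor joining. Note that under this assumed form the distance of a sequence to itself is small but not exactly zero, which is why the sketch sets the diagonal to zero explicitly.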