Bioinformatics Advance Access originally published online on November 21, 2006
Bioinformatics 2007 23(3):372-374; doi:10.1093/bioinformatics/btl592
PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences
1 Digital Medicine Initiative, Kyushu University Fukuoka 812-8582, Japan
2 Medical Institute of Bioregulation, Kyushu University Fukuoka 812-8582, Japan
*To whom correspondence should be addressed.
Abstract
Motivation: To construct a multiple sequence alignment (MSA) of a large number (>10 000) of sequences, the calculation of a guide tree with a complexity of O(N^2) to O(N^3), where N is the number of sequences, is the most time-consuming process.
Results: To overcome this limitation, we have developed an approximate algorithm, PartTree, to construct a guide tree with an average time complexity of O(N log N). The new MSA method with the PartTree algorithm can align 60 000 sequences in several minutes on a standard desktop computer. The loss of accuracy in MSA caused by this approximation was estimated to be several percent in benchmark tests using Pfam.
Availability: The present algorithm has been implemented in the MAFFT sequence alignment package (http://align.bmr.kyushu-u.ac.jp/mafft/software/).
Contact: katoh{at}bioreg.kyushu-u.ac.jp
Supplementary information: Supplementary information is available at Bioinformatics online.
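To give a rough sense of the scaling gap described in the abstract, the sketch below compares the number of sequence pairs in a full distance matrix, N(N-1)/2, with the growth term N log N. It is only an order-of-magnitude illustration, not the PartTree algorithm: the function names are hypothetical, the base-2 logarithm is an assumption, and constant factors are ignored.

```python
import math


def pairwise_comparisons(n: int) -> int:
    """Number of distinct sequence pairs in a full N x N distance matrix."""
    return n * (n - 1) // 2


def n_log_n(n: int) -> float:
    """Growth term N * log2(N); base 2 is an assumption, constants are ignored."""
    return n * math.log2(n)


# Compare the two growth terms for a few input sizes, up to the
# 60 000 sequences mentioned in the abstract.
for n in (1_000, 10_000, 60_000):
    full = pairwise_comparisons(n)
    approx = n_log_n(n)
    print(f"N={n:>6}: all pairs={full:>13,}  "
          f"N log2 N={approx:>10,.0f}  ratio={full / approx:,.0f}")
```

For 60 000 sequences the all-pairs count is about 1.8 billion, which suggests why a quadratic guide-tree step dominates the running time at this scale.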
Associate Editor: Thomas Lengauer
Received on August 23, 2006; revised on October 30, 2006; accepted on November 17, 2006