Bioinformatics Advance Access originally published online on July 26, 2005
Bioinformatics 2005 21(19):3794-3796; doi:10.1093/bioinformatics/bti594
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
pIQPNNI: parallel reconstruction of large maximum likelihood phylogenies
1Albert-Ludwigs-University Freiburg, Germany
2NIC FZ Jülich, Germany
3Heinrich-Heine-University Düsseldorf, Germany
*To whom correspondence should be addressed.
| Abstract |
|---|
|
|
|---|
Summary: IQPNNI is a program to infer maximum-likelihood phylogenetic trees from DNA or protein data with a large number of sequences. We present an improved and MPI-parallel implementation showing very good scaling and speedup behavior.
Availability: IQPNNI (http://www.bi.uni-duesseldorf.de/software/iqpnni) is written in C++, executable on UNIX/Linux, Windows and MacOS systems. (Free) MPI libraries can be found at http://www.lam-mpi.org/mpi/implementations/.
Contact: haeseler{at}cs.uni-duesseldorf.de
| 1 INTRODUCTION |
|---|
|
|
|---|
Likelihood approaches to infer phylogenetic trees are nowadays widely used in evolutionary studies. However, its application to large datasets, i.e. many species, is severely hampered by its computational complexity (Chor and Tuller, 2005; Felsenstein, 1981). Although the application of efficient heuristics lead to a substantial reduction in computing time, the speed at which new sequence data accumulate almost annihilate the savings.
Therefore, the runtime of several programs has been further reduced by porting them to parallel platforms using a variety of parallel tools and methods (Olsen et al, 1994; Schmidt et al., 2002; Stamatakis and Ludwig, 2004; Altekar et al., 2004; Keane et al., 2005).
In the present study, we present an algorithmically improved version of the IQPNNI method (Vinh and von Haeseler, 2004), that performs well on large datasets. The optimized version was furthermore parallelized using message passing interface (MPI, Snir et al., 1998), an industrial standard supported on parallel platforms like massively parallel processors, SMP computers and workstation clusters.
| 2 METHODS |
|---|
|
|
|---|
2.1 Sequential IQPNNI
The original IQPNNI algorithm (Vinh and von Haeseler, 2004) consists of two major steps: the initial step, BIONJ (Gascuel, 1997) combined with fast nearest neighbor interchange (NNI, Guindon and Gascuel, 2003) is used to provide a (locally) optimal maximum likelihood tree. In the subsequent optimization step (OS), leaves are randomly removed from the tree. These leaves are reinserted in a random order by the important quartet puzzling (IQP) algorithm, a simplified but computationally more efficient version of quartet puzzling (Strimmer and von Haeseler, 1996). The resulting tree topology is again optimized by NNI until no further likelihood improvement is achieved. If the likelihood of the resulting tree exceeds that of the current best tree, then the current best tree is replaced by the new tree. The OS is repeated many times to thoroughly search the tree space. Iterations stop if either a user-defined number of iterations is met with, or if the so-called stopping rule is fulfilled. The stopping rule is based on a Weibull distribution to decide whether it is likely (with a 95% confidence level) that a continuation of the search will lead to no further improvement.
2.2 Algorithmic improvement of sequential code
The runtime analysis of IQPNNI revealed, that the implementation of Brent's method (1973) to determine the optimal branch lengths, consumed the largest fraction. However, Newton's method (Press et al., 1992) has been shown to be faster by taking advantage of the first and second derivatives which can be efficiently calculated (Yang, 2000). However, it sometimes suffers from a well-known problem of non-convergence (Press et al., 1992). Therefore, we applied a slight modification: if Newton's method returns a result, we check whether it is (locally) optimal and reoptimize with Brent's algorithm if necessary. Furthermore, a number of more efficient algorithms and data structures have been implemented compared with the original implementation.
2.3 Parallelization of optimization step
The OS, even after improving the sequential code, uses 9099% of total running time, depending on the number of iterations and the size of the data. Hence, total running time will benefit from an efficient parallelization of the OS. We parallelized OS using a master/worker scheme (Fig. 1). Since the OS acts on the current best tree, iterations in the original version are not independent. To account for this, the sequential scheme was slightly modified: starting from the current best tree, every worker runs its own optimization as explained above. When sending back its result, the current best tree is updated by the master, if the returned tree shows a higher likelihood. In such a case, the master will broadcast the better tree to all workers by non-blocking communication, ensuring that a worker finishes its current optimization even if a new better tree is sent. This also takes advantage of the time saving by overlapping computation and communication (Snir et al., 1998). After sending the resulting tree and likelihood to the master, the worker checks for the best tree and starts the next iteration with the current best tree. In addition, the master checks the stop condition and sends a stop message to all workers if the stopping rule applies.
|
| 3 RESULTS |
|---|
|
|
|---|
The performance of the sequential improvement and parallelization relative to the accelerated sequential version was tested on four datasets (Table 1) of a homogeneous Linux cluster of 15 nodes with 2 CPUs of 2.0 GHz each, connected via Gigabit Ethernet. We had to refrain to relative speedup because a best known sequential implementation does not exist. We used the following parameters: for DNA data the HKY85 model (Hasegawa et al., 1985) of evolution was used and the WAG model (Whelan and Goldman, 2001) for protein sequences. The number of iterations in the OS was fixed (Table 1) to avoid runtime fluctuations caused by the stopping rule. Table 1 and Figure 2 display the average speedup from 10 independent runs for each dataset.
|
|
Using Newton's method for DNA data we attained a reduction of 3345% in running time. For the protein data, we got a reduction in runtime of (only) 17%. This is attributable to the fact, that Newton's method suffers from the higher complexity while computing derivatives. Moreover, Newton's method also converges slower.
Figure 2 displays the speedup as function of the number worker processes for the parallelized IQPNNI. Figure 2a shows that the speedup of the OS is almost linear. It ranges from 25.0 to 28.3 with 29 worker processes. Moreover, a saturation effect is not apparent.
Even for complete runs of the program (Fig. 2b), including parallel computation of the pairwise distances and sequential code from the initial step, the speedup drops only slightly, i.e. ranging from 22.2 to 25.6 for 29 workers, again without showing saturation. Despite its small number of sequences, the protein data scale similar to the largest DNA dataset owing to longer computation times per iteration (112s compared with 56s), which reduces the communication/computation ratio.
| 4 CONCLUSION |
|---|
|
|
|---|
We presented an improved and parallelized implementation of the IQPNNI method. Both the sequential optimization and the parallel implementation yield significant reductions in runtime without introducing a large overhead. The runtime reduction and parallel scaling behavior suggests that pIQPNNI is well-fitted to analyze large datasets. For example, with just 10 CPUs one can compute a maximum-likelihood tree for 500 sequences in
25 min.
| Acknowledgments |
|---|
Financial support for B.Q.M. by the Konrad Adenauer Stiftung is gratefully acknowledged.
Conflict of Interest: none declared.
Received on July 5, 2005; revised on July 20, 2005; accepted on July 21, 2005
| REFERENCES |
|---|
|
|
|---|
Altekar, G., et al. (2004) Parallel metropolis coupled markov chain monte carlo for bayesian phylogenetic inference. Bioinformatics, 20, 407415
Brent, R.P. Algorithms for Minimization without Derivatives, (1973) , Englewood Cliffs, NJ, USA Prentice Hall.
Chor, B. and Tuller, T. (2005) Maximum likelihood of evolutionary trees is hard. Proceedings of the 9th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2005) , NY, USA volume 3500 of Lecture Notes in Computer Science ACM Press, pp. , pp. 296310.
Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol., 17, 368376[CrossRef][Web of Science][Medline].
Gascuel, O. (1997) BIONJ: An improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol., 14, 685695[Abstract].
Guindon, S. and Gascuel, O. (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol., 52, 696704
Hasegawa, M., et al. (1985) Dating of the humanape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol., 22, 160174[CrossRef][Web of Science][Medline].
Keane, T.M., et al. (2005) DPRml: distributed phylogeny reconstruction by maximum likelihood. Bioinformatics, 21, 969974
Olsen, G.J., et al. (1994) fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci., 10, 4148
Press, W.H., et al. Numerical Recipes in C: The Art of Scientific Computing, (1992) 2nd edition Cambridge University Press.
Schmidt, H.A., et al. (2002) TREE-PUZZLE: Maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics, 18, 502504
Snir, M., Otto, S.W., Huss-Lederman, S., Walker, D.W., Dongarra, J. MPI: The Complete ReferenceThe MPI Core, (1998) 2nd edition , Cambridge, Massachusetts The MIT Press volume 1, .
Stamatakis, A. and Ludwig, T. (2004) The AxML program family for phylogenetic tree inference. Concurr. Comput.-Pract. Exp., 16, 975988[CrossRef].
Strimmer, K. and von Haeseler, A. (1996) Quartet puzzling: A quartet maximumlikelihood method for reconstructing tree topologies. Mol. Biol. Evol., 13, 964969[Web of Science].
Vinh, L.S. and von Haeseler, A. (2004) IQPNNI: Moving fast through tree space and stopping in time. Mol. Biol. Evol., 21, 15651571
Whelan, S. and Goldman, N. (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum likelihood approach. Mol. Biol. Evol., 18, 691699
Yang, Z. (2000) Maximum likelihood estimation on large phylogenies and analysis of adaptive evolution in human influenza virus A. J. Mol. Evol., 51, 423432[Web of Science][Medline].
This article has been cited by other articles:
![]() |
E. W. Bloomquist and M. A. Suchard Unifying Vertical and Nonvertical Evolution: A Stochastic ARG-based Framework Syst Biol, November 9, 2009; (2009) syp076v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Q. Minh, S. Klaere, and A. von Haeseler Taxon Selection under Split Diversity Syst Biol, September 21, 2009; (2009) syp058v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. A. Suchard and A. Rambaut Many-core algorithms for statistical phylogenetics Bioinformatics, June 1, 2009; 25(11): 1370 - 1376. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Stamatakis and M. Ott Efficient computation of the phylogenetic likelihood function on multi-gene alignments and multi-core architectures Phil Trans R Soc B, December 27, 2008; 363(1512): 3977 - 3984. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Stamatakis, P. Hoover, and J. Rougemont A Rapid Bootstrap Algorithm for the RAxML Web Servers Syst Biol, October 1, 2008; 57(5): 758 - 771. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. D. Swingley, R. E. Blankenship, and J. Raymond Integrating Markov Clustering and Molecular Phylogenetics to Reconstruct the Cyanobacterial Species Tree from Conserved Protein Families Mol. Biol. Evol., April 1, 2008; 25(4): 643 - 654. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Moslavac, K. Nicolaisen, O. Mirus, F. Al Dehni, R. Pernil, E. Flores, I. Maldener, and E. Schleiff A TolC-Like Protein Is Required for Heterocyst Development in Anabaena sp. Strain PCC 7120 J. Bacteriol., November 1, 2007; 189(21): 7887 - 7895. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Stamatakis RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models Bioinformatics, November 1, 2006; 22(21): 2688 - 2690. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||






