Bioinformatics Advance Access published online on January 19, 2007
Bioinformatics, doi:10.1093/bioinformatics/btm015
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Assessment of phylogenomic and orthology approaches for phylogenetic inference
1 Center for Molecular and Biomolecular Informatics / Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen Medical Center. PO Box 9101, 6500 HB, Nijmegen, The Netherlands.
2 Centraalbureau voor Schimmelcultures. Uppsalalaan 8, 3584 CT, Utrecht, The Netherlands.
3 Bioinformatics group, Department Biology, Utrecht University. Padualaan 8, 3584 CH, Utrecht, The Netherlands.
*To whom correspondence should be addressed. B.E. Dutilh, E-mail: dutilh{at}cmbi.ru.nl
| Abstract |
|---|
Motivation: Phylogenomics integrates the vast amount of phylogenetic information contained in complete genome sequences, and is rapidly becoming the standard for inferring reliable species phylogenies. There are however fundamental differences between the ways in which phylogenomic approaches like gene content, superalignment, superdistance and supertree integrate the phylogenetic information from separate orthologous groups. Furthermore, they all depend on the method by which the orthologous groups are initially determined. Here, we systematically compare these four phylogenomic approaches, in parallel with three approaches for large-scale orthology determination: pairwise orthology, cluster orthology and tree-based orthology.
Results: Including various phylogenetic methods, we apply a total of 54 fully automated phylogenomic procedures to the Fungi, the eukaryotic clade with the largest number of sequenced genomes, for which we retrieved a golden standard phylogeny from the literature. Phylogenomic trees based on gene content show, relative to the other methods, a bias in the tree topology that parallels convergence in life style among the species compared, indicating convergence in gene content.
Conclusions: Complete genomes are no warrant for good, or even consistent phylogenies. However, the large amounts of data in genomes enable us to carefully select the data most suitable for phylogenomic inference. In terms of performance, the superalignment approach, combined with restrictive orthology, is the most successful in recovering a fungal phylogeny that agrees with current taxonomic views, and allows us to obtain a high resolution phylogeny. We provide solid support for what has grown to be common practice in phylogenomics during its advance in recent years.
Associate Editor: Martin Bishop
Received on October 30, 2006; revised on January 15, 2007; accepted on January 15, 2007
This article has been cited by other articles:
![]() |
Y. I. Wolf, P. S. Novichkov, G. P. Karev, E. V. Koonin, and D. J. Lipman Inaugural Article: The universal distribution of evolutionary rates of genes and distinct characteristics of eukaryotic genes of different apparent ages PNAS, May 5, 2009; 106(18): 7273 - 7280. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Liu, J. W. Leigh, H. Brinkmann, M. T. Cushion, N. Rodriguez-Ezpeleta, H. Philippe, and B. F. Lang Phylogenomic Analyses Support the Monophyly of Taphrinomycotina, including Schizosaccharomyces Fission Yeasts Mol. Biol. Evol., January 1, 2009; 26(1): 27 - 34. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. E. Dutilh, B. Snel, T. J. G. Ettema, and M. A. Huynen Signature Genes as a Phylogenomic Tool Mol. Biol. Evol., August 1, 2008; 25(8): 1659 - 1667. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. E. Dutilh, Y. He, M. L. Hekkelman, and M. A. Huynen Signature, a web server for taxonomic characterization of sequence samples using signature genes Nucleic Acids Res., July 1, 2008; 36(suppl_2): W470 - W474. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. R Kensche, V. van Noort, B. E Dutilh, and M. A Huynen Practical and theoretical advances in predicting the function of a protein by its phylogenetic distribution J R Soc Interface, February 6, 2008; 5(19): 151 - 170. [Abstract] [Full Text] [PDF] |
||||



