Automatic clustering of orthologs and inparalogs shared by multiple proteomes
1 Center for Genomics and Bioinformatics, Karolinska Institutet, S-17177 Stockholm, Sweden
2 Stockholm Bioinformatics Center, Albanova, Stockholm University, SE-10691 Stockholm, Sweden
3 Present address: Department of Molecular Biology & Functional Genomics, Stockholm University, SE-10691 Stockholm, Sweden
*To whom correspondence should be addressed.
Motivation: The complete sequencing of many genomes has made it possible to identify orthologous genes descending from a common ancestor. However, reconstruction of evolutionary history over long time periods faces many challenges due to gene duplications and losses. Identification of orthologous groups shared by multiple proteomes therefore becomes a clustering problem in which an optimal compromise between conflicting pieces of evidence needs to be found.
Results: Here we present a new proteome-scale analysis program called MultiParanoid that can automatically find orthology relationships between proteins in multiple proteomes. The software is an extension of the InParanoid program that identifies orthologs and inparalogs in pairwise proteome comparisons. MultiParanoid applies a clustering algorithm to merge multiple pairwise ortholog groups from InParanoid into multi-species ortholog groups. To avoid outparalogs in the same cluster, MultiParanoid only combines species that share the same last ancestor.
To validate the clustering technique, we compared the results to a reference set obtained by manual phylogenetic analysis. We further compared the results to ortholog groups in KOGs and OrthoMCL, which revealed that MultiParanoid produces substantially fewer outparalogs than these resources.
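The merging step described under Results can be illustrated with a minimal sketch. It assumes that each pairwise InParanoid group is given as a list of (species, protein) pairs and that groups sharing any protein are joined by simple single-linkage clustering (union-find). The function name, the input layout, and the example identifiers are hypothetical. This is not the MultiParanoid implementation, and it does not model the restriction to species that share the same last ancestor.

```python
from collections import defaultdict


def merge_pairwise_groups(pairwise_groups):
    """Merge pairwise ortholog groups that share at least one protein.

    pairwise_groups: iterable of groups, each a list of
    (species, protein_id) tuples (an assumed, illustrative layout).
    Returns a list of sets, ordered deterministically.
    """
    parent = {}

    def find(x):
        # Register unseen proteins as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every member of a pairwise group to the group's first member.
    for group in pairwise_groups:
        members = list(group)
        for member in members:
            find(member)
        for other in members[1:]:
            union(members[0], other)

    # Collect proteins by their final root.
    clusters = defaultdict(set)
    for protein in list(parent):
        clusters[find(protein)].add(protein)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical pairwise groups from three species comparisons.
pairwise = [
    [("human", "H1"), ("fly", "F1")],
    [("human", "H1"), ("worm", "W1"), ("worm", "W2")],
    [("fly", "F1"), ("worm", "W1")],
    [("human", "H2"), ("fly", "F2")],
]
groups = merge_pairwise_groups(pairwise)
for cluster in groups:
    print(sorted(cluster))
```

In this toy input the first three pairwise groups overlap and collapse into one multi-species group, while the fourth stays separate. Pure single linkage can chain unrelated groups together through a single shared protein, which is one reason a real tool has to weigh conflicting evidence rather than merge blindly.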
Availability: MultiParanoid is a freely available standalone program that enables efficient orthology analysis much needed in the post-genomic era. A web-based service providing access to the original datasets, the resulting groups of orthologs, and the source code of the program can be found at http://multiparanoid.cgb.ki.se.
Contact: Erik.Sonnhammer@sbc.su.se
Supplementary information: http://multiparanoid.cgb.ki.se/ISMB2006/




