Bioinformatics Vol. 19 no. 12 2003
Pages 1524-1530
© 2003 Oxford University Press

Identification of functional links between genes using phylogenetic profiles

Jie Wu 1, Simon Kasif 1,2 and Charles DeLisi 1,2

1 Department of Biomedical Engineering and 2 Bioinformatics Graduate Program, Boston University, 44 Cummington St., Boston, MA 02215, USA

Received on November 22, 2002; revised on February 14, 2003; accepted on February 15, 2003

Motivation: Genes with identical patterns of occurrence across the phyla tend to function together in the same protein complexes or participate in the same biochemical pathway. However, the requirement that the profiles be identical (i) severely restricts the number of functional links that can be established by such phylogenetic profiling; (ii) limits detection to very strong functional links, failing to capture relations between genes that are not in the same pathway, but nevertheless subserve a common function and (iii) misses relations between analogous genes. Here we present and apply a method for relaxing the restriction, based on the probability that a given arbitrary degree of similarity between two profiles would occur by chance, with no biological pressure. Function is then inferred at any desired level of confidence.

Results: We derive an expression for the probability distribution of a given number of chance co-occurrences of a pair of non-homologous orthologs across a set of genomes. The method is applied to 2905 clusters of orthologous genes (COGs) from 44 fully sequenced microbial genomes representing all three domains of life. Among the results are the following. (1) Of the 51 000 annotated intrapathway gene pairs, 8935 are linked at a level of significance of 0.01. This is over 30-fold greater than the 271 intrapathway pairs obtained at the same confidence level when identical profiles are used. (2) Of the 540 000 interpathway gene pairs, some 65 000 are linked at the 0.01 level of significance, some 12 standard deviations beyond the number expected by chance at this confidence level. We speculate that many of these links involve nearest-neighbor pathways, and discuss some examples. (3) The difference in the percentage of linked interpathway and intrapathway genes is highly significant, consistent with the intuitive expectation that genes in the same pathway are generally under greater selective pressure than those that are not. (4) The method appears to recover metabolic networks well. This is illustrated by the TCA cycle, which is recovered as a highly connected, weighted-edge network of 30 of its 31 COGs. (5) The fraction of pairs having a common pathway is a symmetric function of the Hamming distance between their profiles. This finding, that the functional correlation between profiles with near maximum Hamming distance is as large as between profiles with near zero Hamming distance, and as statistically significant, is plausibly explained if the former group represents analogous genes.
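
The abstract does not reproduce the derived expression, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes each gene's profile is a binary presence/absence vector over the same ordered set of genomes, and it uses a hypergeometric null model (a standard choice for chance overlap between two fixed-size sets, not necessarily the exact form derived in the paper). The function names `hamming_distance` and `cooccurrence_pvalue` are illustrative, not taken from the article.

```python
from math import comb


def hamming_distance(profile_a, profile_b):
    """Number of genomes in which the two presence/absence profiles differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def cooccurrence_pvalue(profile_a, profile_b):
    """Upper-tail probability of at least the observed number of co-occurrences.

    Assumed null model (hypergeometric): gene A occurs in n of N genomes,
    gene B in m of N, and the genomes containing B are placed at random.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    N = len(profile_a)                      # total genomes
    n = sum(profile_a)                      # genomes containing gene A
    m = sum(profile_b)                      # genomes containing gene B
    k = sum(a and b for a, b in zip(profile_a, profile_b))  # observed overlap
    total = comb(N, m)                      # all ways to place gene B
    tail = sum(comb(n, i) * comb(N - n, m - i) for i in range(k, min(n, m) + 1))
    return tail / total


# Toy profiles over six hypothetical genomes (1 = present, 0 = absent).
gene_a = [1, 1, 1, 0, 0, 0]
gene_b = [1, 1, 1, 0, 0, 0]   # identical to gene_a
gene_c = [0, 0, 0, 1, 1, 1]   # complementary to gene_a

print(hamming_distance(gene_a, gene_b), cooccurrence_pvalue(gene_a, gene_b))
print(hamming_distance(gene_a, gene_c), cooccurrence_pvalue(gene_a, gene_c))
```

Under this sketch a pair would be called linked when the p-value falls below the chosen significance level (0.01 in the abstract). Note that the upper tail only detects profiles that overlap more than expected; the complementary pair above scores 1.0, so detecting the near-maximum Hamming distance pairs described in result (5) would require looking at the lower tail as well.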

Contact: delisi{at}bu.edu




