Bioinformatics Vol. 19 no. 1 2003
Pages 125-134
© 2003 Oxford University Press
Whole-proteome interaction mining
Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0412, USA
Received on August 23, 2001; revised on March 11, 2002; accepted on June 27, 2002
Motivation: A major post-genomic scientific and technological pursuit is to describe the functions performed by the proteins encoded by the genome. One strategy is first to identify the protein-protein interactions in a proteome, then to determine pathways and overall structure relating these interactions, and finally to infer statistically the functional roles of individual proteins. Although huge amounts of genomic data are at hand, current experimental protein interaction assays must overcome technical problems to scale up for high-throughput analysis. In the meantime, bioinformatics approaches may help bridge the information gap required for inference of protein function. In this paper, a previously described data mining approach to prediction of protein-protein interactions (Bock and Gough, 2001, Bioinformatics, 17, 455-460) is extended to interaction mining on a proteome-wide scale. An algorithm (the phylogenetic bootstrap) is introduced, which suggests traversal of a phenogram, interleaving rounds of computation and experiment, to develop a knowledge base of protein interactions in genetically similar organisms.
Results: The interaction mining approach was demonstrated by building a learning system based on 1,039 experimentally validated protein-protein interactions in the human gastric bacterium Helicobacter pylori. An estimate of the generalization performance of the classifier was derived from 10-fold cross-validation, which indicated expected upper bounds on precision of 80% and sensitivity of 69% when applied to related organisms. One such organism is the enteric pathogen Campylobacter jejuni, in which comprehensive machine learning prediction of all possible pairwise protein-protein interactions was performed. The resulting network of interactions shares an average protein connectivity characteristic with previous investigations reported in the literature, offering strong evidence supporting the biological feasibility of the hypothesized map. For inferences about complete proteomes, in which the number of pairwise non-interactions is expected to be much larger than the number of actual interactions, we anticipate that the sensitivity will remain the same but precision may decrease. We present specific biological examples of two subnetworks of protein-protein interactions in C. jejuni resulting from the application of this approach, including elements of a two-component signal transduction system for thermoregulation, and a ferritin uptake network.
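The expectation that sensitivity stays fixed while precision falls can be illustrated with a small calculation. The sketch below is not the authors' method. It assumes, purely for illustration, that the cross-validation figures (80% precision, 69% sensitivity) were measured on a set with equal numbers of interacting and non-interacting pairs; the abstract does not state this ratio. Under that assumption it derives the implied false-positive rate and recomputes precision as non-interactions become more common. The function names and the ratios tried are hypothetical.

```python
def false_positive_rate(precision, sensitivity, neg_per_pos=1.0):
    """False-positive rate implied by a precision/sensitivity pair.

    neg_per_pos is the number of non-interacting pairs per interacting
    pair in the evaluation set (ASSUMED to be 1.0 here; not from the paper).
    """
    return sensitivity * (1.0 - precision) / (precision * neg_per_pos)


def precision_at_ratio(sensitivity, fpr, neg_per_pos):
    """Precision when each true interaction is accompanied by
    neg_per_pos non-interacting pairs."""
    true_pos = sensitivity           # expected true positives per real interaction
    false_pos = fpr * neg_per_pos    # expected false positives per real interaction
    return true_pos / (true_pos + false_pos)


SENSITIVITY = 0.69    # reported in the abstract
CV_PRECISION = 0.80   # reported in the abstract

# Assumption: cross-validation used a 1:1 class ratio.
fpr = false_positive_rate(CV_PRECISION, SENSITIVITY, neg_per_pos=1.0)

for ratio in (1, 10, 100, 1000):
    p = precision_at_ratio(SENSITIVITY, fpr, ratio)
    print(f"non-interactions per interaction = {ratio:>4}: "
          f"sensitivity = {SENSITIVITY:.2f}, precision = {p:.3f}")
```

Sensitivity depends only on how the classifier treats true interactions, so it is unaffected by the class ratio. Precision also counts false positives drawn from the much larger pool of non-interacting pairs, so it drops as that pool grows.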
Contact: dgough{at}bioeng.ucsd.edu
* To whom correspondence should be addressed.