Bioinformatics Vol. 18 no. 5 2002
Pages 715-724
© 2002 Oxford University Press
Evaluation of computational metabolic-pathway predictions for Helicobacter pylori
Bioinformatics Research Group, SRI International, EK207, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
Received on August 26, 2001; revised on December 10, 2001; accepted on December 17, 2001
Motivation: We seek to determine the accuracy of computational methods for predicting metabolic pathways in sequenced genomes, and to understand the contributions of both the prediction algorithms, and the reference pathway databases used by those algorithms, to the prediction accuracy.
Results: The comparisons we performed were as follows. (1) We compared two predictions of the pathway complement of Helicobacter pylori that were computed by an early version of our pathway-prediction algorithm: prediction A used the EcoCyc E. coli pathway database (DB) as the reference DB for prediction, and prediction B used the MetaCyc pathway DB (a superset of EcoCyc) as the reference pathway DB. The MetaCyc-based prediction contained 75% more pathway predictions, but we believe a significant number of those predictions were false positives. (2) We compared two predictions of the pathway complement of H. pylori that used MetaCyc as the reference pathway DB, but that used different algorithms: the original PathoLogic algorithm, and an enhanced version of the algorithm designed to eliminate false-positive pathway predictions. The improved algorithm predicted 30% fewer metabolic pathways than the original algorithm; all of the eliminated pathways are believed to be false-positive predictions. (3) We compared the 98 pathways predicted by the enhanced algorithm with the results of a manual analysis of the pathways of H. pylori. In that comparison, 40 of the computationally predicted pathways were consistent with the manual analysis, 13 pathways are considered false-positive predictions, and four pathways had partially overlapping topologies. Twenty-six predicted pathways were not mentioned in the manual analysis; we believe these are correct predictions by PathoLogic that were not found by the manual analysis. Five pathways from the manual analysis were not found computationally. Agreement between the computational and manual predictions was good overall, with the computational analysis inferring many pathways that the manual analysis did not identify. Ultimately, the manual analysis is also partially speculative, and therefore is not an absolute measure of correctness.
The algorithm is designed to err on the side of more false positives to bring more potential pathways to the user's attention. The resulting H. pylori pathway DB is freely available at http://ecocyc.org:1555/HPY/organism-summary?object=HPY.
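The bookkeeping behind a comparison of this kind can be illustrated with a minimal Python sketch. It is not the PathoLogic algorithm and does not reproduce the evaluation reported here: the function name `compare_pathway_sets` and the pathway identifiers are hypothetical placeholders, and pathways are assumed to be matched by exact name, which ignores partially overlapping topologies.

```python
def compare_pathway_sets(predicted, manual):
    """Split two collections of pathway names into agreement categories.

    Hypothetical helper for illustration only; pathways are matched by
    exact name, so partial topological overlap is not detected.
    """
    predicted, manual = set(predicted), set(manual)
    return {
        # Pathways found by both the computational and manual analyses.
        "consistent": predicted & manual,
        # Predicted computationally but absent from the manual analysis.
        # Expert review is still needed to decide whether each one is a
        # false positive or a correct prediction the manual analysis missed.
        "predicted_only": predicted - manual,
        # Present in the manual analysis but not found computationally.
        "manual_only": manual - predicted,
    }


# Placeholder identifiers, not real H. pylori pathways.
predicted = {"PWY-A", "PWY-B", "PWY-C", "PWY-D"}
manual = {"PWY-A", "PWY-B", "PWY-E"}

result = compare_pathway_sets(predicted, manual)
for category, names in result.items():
    print(category, sorted(names))
```

The set arithmetic only separates agreement from disagreement. Because the manual analysis is itself partially speculative, classifying the `predicted_only` group still requires curation.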
Availability: The Pathway Tools software is freely available to academic users, and is available to commercial users for a fee. Contact pkarp{at}ai.sri.com for information on obtaining the software.
Contact: paley{at}ai.sri.com; pkarp{at}ai.sri.com
* To whom correspondence should be addressed.