Bioinformatics Advance Access originally published online on January 29, 2004

Bioinformatics 20(5) © Oxford University Press 2004; all rights reserved.

Using functional and organizational information to improve genome-wide computational prediction of transcription units on pathway-genome databases

P. R. Romero and P. D. Karp *

Bioinformatics Research Group, Artificial Intelligence Center, SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA

Received on August 1, 2003; revised on October 7, 2003; accepted on October 9, 2003
Advance Access Publication January 29, 2004

Motivation: The prediction of transcription units (TUs, which are similar to operons) is an important problem that has been tackled using many different approaches. The availability of complete microbial genomes has made genome-wide TU predictions possible. Pathway-genome databases (PGDBs) add metabolic and other organizational information (i.e. protein complexes) to the annotated genome, and can capture TU organization information. These characteristics make PGDBs a suitable framework for the development and implementation of TU predictors.

Results: We implemented a TU predictor that uses only intergenic distance and functional classification of genes to predict TU boundaries, and applied it to EcoCyc, our PGDB of Escherichia coli. To this original predictor, we added information on metabolic pathways, protein complexes and transporters, all readily available in EcoCyc, to generate an enhanced predictor. The enhanced predictor correctly predicted 80% of the known E.coli TUs (69% of the known operons), a moderate improvement over the original predictor's performance (75% of TUs and 65% of operons correctly predicted), demonstrating that the extra information available in the PGDB does indeed improve prediction performance. We tested this E.coli-based predictor on a genome other than that of E.coli using BsubCyc, our computationally generated PGDB for Bacillus subtilis, for which a set of 100 known operons is available. Prediction accuracy decreased substantially (46% of the known operons correctly predicted). This was due in part to missing information in BsubCyc, which prevented full use of the predictor's features. The enhanced predictor has been implemented as part of our Pathway Tools software suite, and can be used to populate a PGDB with predicted TUs.
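As a rough illustration of the idea, and not the scoring scheme actually used by the predictor, the following Python sketch joins adjacent same-strand genes into a TU when their intergenic distance is small, and tolerates a larger gap when the genes share a functional class or a pathway, protein complex or transporter. The `Gene` fields, the `same_tu` rule and the thresholds `max_gap` and `relaxed_gap` are hypothetical choices made for this example; the abstract does not specify them. The helper `fraction_recovered` shows one plausible way to compute a "percentage of known TUs correctly predicted" figure, assuming a TU counts as correct only when its gene set is matched exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int                       # leftmost coordinate (bp)
    end: int                         # rightmost coordinate (bp)
    strand: str                      # "+" or "-"
    func_class: str = ""             # functional classification label
    groups: frozenset = frozenset()  # pathway / complex / transporter IDs


def same_tu(a, b, max_gap=50, relaxed_gap=200):
    """Hypothetical rule: do adjacent genes a and b belong to one TU?

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if a.strand != b.strand:
        return False
    gap = b.start - a.end - 1  # intergenic distance; negative if genes overlap
    if gap <= max_gap:
        return True
    # Shared functional or organizational evidence tolerates a larger gap.
    shared = (a.func_class != "" and a.func_class == b.func_class) or bool(
        a.groups & b.groups
    )
    return shared and gap <= relaxed_gap


def predict_tus(genes):
    """Chain adjacent genes (ordered by start) into TUs using same_tu."""
    tus, current = [], []
    for gene in sorted(genes, key=lambda g: g.start):
        if current and not same_tu(current[-1], gene):
            tus.append(tuple(g.name for g in current))
            current = []
        current.append(gene)
    if current:
        tus.append(tuple(g.name for g in current))
    return tus


def fraction_recovered(predicted, known):
    """Share of known TUs whose exact gene set appears among the predictions."""
    predicted_sets = {frozenset(tu) for tu in predicted}
    hits = sum(1 for tu in known if frozenset(tu) in predicted_sets)
    return hits / len(known) if known else 0.0
```

In this sketch, a gene pair separated by 149 bp is split under the distance rule alone but joined once both genes are annotated as subunits of the same complex. This mirrors, in a simplified way, how the extra PGDB information can change a boundary call.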

Availability: The TU predictor is included in version 7.0 of the Pathway Tools software suite, which is available free of charge to academic institutions and for a fee to commercial enterprises. It runs on Sun Solaris 8, Linux and Windows. TUs predicted on the Caulobacter crescentus and Mycobacterium tuberculosis (H37Rv) genomes can be found in our CauloCyc and MtbrvCyc databases at the BioCyc web site (http://biocyc.org). To obtain version 7.0 of Pathway Tools, follow the directions on our web site, http://biocyc.org/download.shtml.

Contact: pkarp@ai.sri.com

* To whom correspondence should be addressed.




