Bioinformatics Vol. 17 no. 12 2001
Pages 1123-1130
© 2001 Oxford University Press
A probabilistic method for identifying start codons in bacterial genomes
1 Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
2 The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD 20850, USA
3 Department of Biochemistry, University of Otago, PO Box 56, Dunedin, New Zealand
Received on December 18, 2000; revised on July 4, 2001; accepted on July 9, 2001
As the pace of genome sequencing has accelerated, the need for highly accurate gene prediction systems has grown. Computational systems for identifying genes in prokaryotic genomes have sensitivities of 98–99% or higher (Delcher et al., Nucleic Acids Res., 27, 4636–4641, 1999). These accuracy figures are calculated by comparing the locations of verified stop codons to the predictions. Determining the accuracy of start codon prediction is more problematic, however, due to the relatively small number of start sites that have been confirmed by independent, non-computational methods. Nonetheless, the accuracy of gene finders at predicting the exact gene boundaries at both the 5' and 3' ends of genes is of critical importance for microbial genome annotation, especially in light of the important signaling information that is sometimes found on the 5' end of a protein coding region. In this paper we propose a probabilistic method to improve the accuracy of gene identification systems at finding precise translation start sites. The new system, RBSfinder, is tested on a validated set of genes from Escherichia coli, for which it improves the accuracy of start site locations predicted by computational gene finding systems from the range 67–77% to 90% correct.
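The two kinds of accuracy figure discussed in the abstract can be illustrated with a minimal sketch. This is not the evaluation code used for RBSfinder or any gene finder; the function name `evaluate`, the `(strand, stop position)` gene key, and all coordinates below are hypothetical and chosen only for illustration. It assumes a gene counts as found when its stop codon position is predicted, and that a start site is correct only when it matches the verified start exactly.

```python
from typing import Dict, Tuple

# Hypothetical representation: a gene is keyed by (strand, stop position),
# and the dictionary value is its start position.
Gene = Tuple[str, int]


def evaluate(verified: Dict[Gene, int], predicted: Dict[Gene, int]) -> Tuple[float, float]:
    """Return (sensitivity, start_accuracy).

    sensitivity:    fraction of verified genes whose stop codon was predicted.
    start_accuracy: among those found genes, the fraction whose predicted
                    start position equals the verified start position.
    """
    if not verified:
        raise ValueError("verified gene set is empty")
    found = [gene for gene in verified if gene in predicted]
    sensitivity = len(found) / len(verified)
    if not found:
        return sensitivity, 0.0
    exact = sum(1 for gene in found if predicted[gene] == verified[gene])
    return sensitivity, exact / len(found)


# Made-up coordinates, for illustration only.
verified = {("+", 1200): 301, ("-", 5000): 5900, ("+", 9000): 8100, ("+", 12000): 11400}
predicted = {("+", 1200): 301, ("-", 5000): 5840, ("+", 9000): 8100, ("+", 20000): 19000}

sensitivity, start_accuracy = evaluate(verified, predicted)
print(f"sensitivity={sensitivity:.2f}, start accuracy={start_accuracy:.2f}")
```

Keying genes by strand and stop position reflects the point made above: the stop codon identifies the gene unambiguously, whereas the start site is the quantity whose prediction is being assessed.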
* To whom correspondence should be addressed.