Translation initiation site prediction on a genomic scale: beauty in simplicity
1Department of Plant Systems Biology, VIB, Technologiepark 927, B-9052 Ghent, Belgium, 2Department of Molecular Genetics, Ghent University, Ghent, Belgium and 3Pronota, Technologiepark - Zwijnaarde 927, B-9052 Ghent, Belgium
*To whom correspondence should be addressed.
Abstract
Motivation: The correct identification of translation initiation sites (TIS) remains a challenging problem for computational methods. Furthermore, the lion's share of these techniques focuses on identifying TIS in transcript data. In the gene prediction context, however, TIS must be identified at the genomic level, which is harder still: many more pseudo-TIS occur at the genome level, so models produce a higher number of false positive predictions.
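To make the pseudo-TIS problem concrete, the following minimal Python sketch enumerates every candidate start codon in a genomic sequence on both strands. It is an illustration only and not the StartScan method: the function name `candidate_tis_positions` and the toy sequence are hypothetical, and the sketch assumes that a candidate is simply any ATG trinucleotide (seen as CAT on the forward strand when it lies on the reverse strand).

```python
def candidate_tis_positions(seq: str) -> list[tuple[int, str]]:
    """Return (position, strand) for every ATG in seq (hypothetical helper).

    position is the 0-based forward-strand index of the leftmost base of
    the match. A reverse-strand ATG appears as CAT on the forward strand.
    """
    seq = seq.upper()
    hits = []
    for motif, strand in (("ATG", "+"), ("CAT", "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            # Step by one base so overlapping matches are not missed.
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Toy sequence (made up for illustration): one forward and two reverse candidates.
toy_hits = candidate_tis_positions("CCATGGCATAA")
```

Since at most one of these candidates per gene is the annotated TIS, nearly every hit in a long genomic sequence is a negative example, which is why false positives dominate at the genomic scale.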
Results: In this article, we evaluate the performance of several simple TIS recognition methods at the genomic level, and compare them to state-of-the-art models for TIS prediction in transcript data. We conclude that the simple methods largely outperform the complex ones at the genomic scale, and we propose a new model for TIS recognition at the genome level that combines the strengths of these simple models. The new model obtains a false positive rate of 0.125 at a sensitivity of 0.80 on a well-annotated human chromosome (chromosome 21). Detailed analyses show that the model is useful, both on its own and in a simple gene prediction setting.
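A figure such as "false positive rate at a sensitivity of 0.80" can be read as follows: choose the score threshold at which 80% of the true TIS are recovered, then measure the fraction of pseudo-TIS that score at or above that threshold. The Python sketch below shows that calculation under the standard definitions (sensitivity = TP / (TP + FN), false positive rate = FP / (FP + TN)). The function name `fpr_at_sensitivity` and the toy scores are hypothetical, and the exact evaluation protocol used in the article is an assumption here.

```python
import math


def fpr_at_sensitivity(scores, labels, target_sensitivity=0.80):
    """Return (threshold, sensitivity, false positive rate).

    scores: classifier scores, higher means more TIS-like.
    labels: 1 for a true TIS, 0 for a pseudo-TIS.
    """
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Smallest number of true TIS that must be recovered to reach the target.
    k = max(1, math.ceil(target_sensitivity * len(pos)))
    threshold = pos[k - 1]
    tp = sum(1 for s in pos if s >= threshold)
    fp = sum(1 for s in neg if s >= threshold)
    return threshold, tp / len(pos), fp / len(neg)


# Toy data (made up): 5 true TIS and 8 pseudo-TIS.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.1, 0.95, 0.65, 0.5, 0.4, 0.3, 0.2, 0.05, 0.0]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
threshold, sensitivity, fpr = fpr_at_sensitivity(toy_scores, toy_labels)
```

With tied scores, the achieved sensitivity can exceed the target, because every example at the threshold score is counted as predicted positive.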
Availability: Datafiles and a web interface for the StartScan program are available at http://bioinformatics.psb.ugent.be/supplementary_data/
Contact: yvan.saeys{at}psb.ugent.be
