Bioinformatics Advance Access originally published online on February 5, 2004
Bioinformatics 20(7):1157 © Oxford University Press 2004; all rights reserved.

Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus

Volker Brendel 1,2,*, Liqun Xing 1,† and Wei Zhu 1,‡

1 Department of Genetics, Development and Cell Biology and 2 Department of Statistics, Iowa State University, 2112 Molecular Biology Building, Ames, Iowa 50011–3260, USA

Received on July 14, 2003; revised on December 12, 2003; accepted on December 13, 2003
Advance Access Publication February 5, 2004

Motivation: Accurate gene structure annotation is a challenging computational problem in genomics. The best results are achieved with spliced alignment of full-length cDNAs or multiple expressed sequence tags (ESTs) with sufficient overlap to cover the entire gene. For most species, cDNA and EST collections are far from comprehensive. We sought to overcome this bottleneck by exploring the possibility of using combined EST resources from fairly diverged species that still share a common gene space. Previous spliced alignment tools were found inadequate for this task because they rely on very high sequence similarity between the ESTs and the genomic DNA.

Results: We have developed a computer program, GeneSeqer, which is capable of aligning thousands of ESTs with a long genomic sequence in a reasonable amount of time. The algorithm is uniquely designed to tolerate a high percentage of mismatches and insertions or deletions in the EST relative to the genomic template. This feature allows use of non-cognate ESTs for gene structure prediction, including ESTs derived from duplicated genes and homologous genes from related species. The increased gene prediction sensitivity results in part from novel splice site prediction models that are also available as a stand-alone splice site prediction tool. We assessed GeneSeqer performance relative to a standard Arabidopsis thaliana gene set and demonstrate its utility for plant genome annotation. In particular, we propose that this method provides a timely tool for the annotation of the rice genome, using abundant ESTs from other cereals and plants.
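As a toy illustration of how several overlapping ESTs matching one locus can jointly support a gene structure, the sketch below counts how many ESTs agree on each intron. It is not GeneSeqer's algorithm: the exon coordinates, the EST names and the helper functions `introns_from_exons` and `intron_support` are all hypothetical, and the spliced alignments are assumed to have been computed already.

```python
from collections import Counter

def introns_from_exons(exons):
    """Return the introns implied by a sorted list of exon intervals.

    Exons are (start, end) pairs in 1-based, inclusive genomic coordinates;
    each intron is the gap between two consecutive exons.
    """
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]

def intron_support(est_alignments):
    """Count how many ESTs support each intron.

    est_alignments maps an EST name to its list of aligned exon intervals.
    """
    counts = Counter()
    for exons in est_alignments.values():
        counts.update(introns_from_exons(sorted(exons)))
    return counts

# Hypothetical spliced alignments of three ESTs to the same genomic locus.
alignments = {
    "est_a": [(100, 250), (400, 520)],
    "est_b": [(180, 250), (400, 560), (700, 820)],
    "est_c": [(450, 560), (700, 900)],
}

support = intron_support(alignments)
for intron, n_ests in sorted(support.items()):
    print(intron, n_ests)
```

In this made-up example no single EST spans the whole locus, yet together the three ESTs cover positions 100 to 900 and each of the two introns is confirmed by two independent ESTs.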

Availability: The source code is available for download at http://bioinformatics.iastate.edu/bioinformatics2go/gs/download.html. Web servers for Arabidopsis and other plant species are accessible at http://www.plantgdb.org/cgi-bin/AtGeneSeqer.cgi and http://www.plantgdb.org/cgi-bin/GeneSeqer.cgi, respectively. For non-plant species, use http://bioinformatics.iastate.edu/cgi-bin/gs.cgi. The splice site prediction tool (SplicePredictor) is distributed with the GeneSeqer code. A SplicePredictor web server is available at http://bioinformatics.iastate.edu/cgi-bin/sp.cgi.

Supplementary information: http://www.plantgdb.org/AtGDB/prj/BXZ03B

Contact: vbrendel@iastate.edu

* To whom correspondence should be addressed.

† Current address: BASF Plant Science NC, 26 Davis Drive, Research Triangle Park, NC 27709-3528, USA.

‡ Current address: NewLink Genetics, 2901 S. Loop Dr, Ames, IA 50010, USA.

