Bioinformatics Vol. 16 no. 3 2000
Pages 203-211
© 2000 Oxford University Press
Optimal spliced alignment of homologous cDNA to a genomic DNA template
1 Department of Chemistry, Stanford University, Stanford, CA 94305, USA
2 Department of Zoology and Genetics, Iowa State University, 2112 Molecular Biology Building, Ames, IA 50011-3260, USA
Received on March 28, 1999; revised on July 30, 1999; accepted on August 28, 1999
Motivation: Supplementary cDNA or EST evidence is often decisive for discriminating between alternative gene predictions derived from computational sequence inspection by any of a number of requisite programs. Without additional experimental effort, this approach must rely on the occurrence of cognate ESTs for the gene under consideration in available, generally incomplete, EST collections for the given species. In some cases, particular exon assignments can be supported by sequence matching even if the cDNA or EST is produced from non-cognate genomic DNA, including different loci of a gene family or homologous loci from different species. However, marginally significant sequence matching alone can also be misleading. We sought to develop an algorithm that would simultaneously score for predicted intrinsic splice site strength and sequence matching between the genomic DNA template and a related cDNA or EST. In this case, weakly predicted splice sites may be chosen for the optimal scoring spliced alignment on the basis of surrounding sequence matching. Strongly predicted splice sites will enter the optimal spliced alignment even without strong sequence matching.
Results: We designed a novel algorithm that produces the optimal spliced alignment of a genomic DNA with a cDNA or EST based on scoring for both sequence matching and intrinsic splice site strength. By example, we demonstrate that this combined approach appears to improve gene prediction accuracy compared with current methods that rely only on either search by content and signal or on sequence similarity.
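The idea of scoring sequence matching and splice site strength together can be illustrated with a small dynamic-programming sketch. This is not the published algorithm, whose recurrences and parameters are not given in this abstract. The function name `spliced_align`, the additive scoring scheme, the default penalties, and the per-position `donor` and `acceptor` score lists are all assumptions made for illustration. Introns here have no minimum length and no extension cost.

```python
def spliced_align(genomic, cdna, donor, acceptor,
                  match=1.0, mismatch=-1.0, gap=-2.0, intron_open=-3.0):
    """Toy spliced alignment (hypothetical scoring, not the published method).

    donor[k] / acceptor[k] are additive scores for an intron starting / ending
    at 0-based genomic index k. Returns (score, exons), where exons are
    1-based inclusive genomic (start, end) pairs.
    """
    n, m = len(genomic), len(cdna)
    NEG = float("-inf")
    # E[i][j]: best score with genomic[:i] and cdna[:j] consumed, base i exonic.
    # I[i][j]: same prefixes consumed, but genomic base i lies inside an intron.
    E = [[NEG] * (m + 1) for _ in range(n + 1)]
    I = [[NEG] * (m + 1) for _ in range(n + 1)]
    back = {}
    for i in range(n + 1):
        E[i][0] = 0.0                      # leading genomic sequence is free
    for j in range(1, m + 1):
        E[0][j] = j * gap                  # cDNA bases before the genomic start
        back[("E", 0, j)] = ("E", 0, j - 1)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Open an intron at base i (donor score) or extend one at no cost.
            cands = [(E[i - 1][j] + intron_open + donor[i - 1], ("E", i - 1, j)),
                     (I[i - 1][j], ("I", i - 1, j))]
            I[i][j], back[("I", i, j)] = max(cands, key=lambda t: t[0])

            sub = match if genomic[i - 1] == cdna[j - 1] else mismatch
            # Leaving an intron that ended at base i-1 earns its acceptor score.
            acc = acceptor[i - 2] if i >= 2 else NEG
            cands = [(E[i - 1][j - 1] + sub, ("E", i - 1, j - 1)),
                     (E[i - 1][j] + gap, ("E", i - 1, j)),
                     (I[i - 1][j - 1] + acc + sub, ("I", i - 1, j - 1)),
                     (I[i - 1][j] + acc + gap, ("I", i - 1, j)),
                     (E[i][j - 1] + gap, ("E", i, j - 1))]
            E[i][j], back[("E", i, j)] = max(cands, key=lambda t: t[0])

    # Trailing genomic sequence is free: pick the best end point.
    end = max(range(n + 1), key=lambda i: E[i][m])
    score = E[end][m]

    # Trace back, recording every genomic base consumed in the exon state.
    positions = []
    node = ("E", end, m)
    while node in back:
        prev = back[node]
        if node[0] == "E" and prev[1] == node[1] - 1:
            positions.append(node[1])
        node = prev
    positions.reverse()

    # Merge consecutive exonic positions into exon segments.
    exons = []
    for p in positions:
        if exons and exons[-1][1] == p - 1:
            exons[-1][1] = p
        else:
            exons.append([p, p])
    return score, [tuple(e) for e in exons]


# Made-up example: two exons separated by a 17-base GT...AG intron.
genomic = "ACGTAC" + "GTAAGTTTTTTTTTCAG" + "GATTACA"
cdna = "ACGTAC" + "GATTACA"
donor = [-5.0] * len(genomic)
acceptor = [-5.0] * len(genomic)
donor[6] = 2.0       # assumed strong donor at the intron's first base
acceptor[22] = 2.0   # assumed strong acceptor at the intron's last base
print(spliced_align(genomic, cdna, donor, acceptor))
```

In this example the intron could be shifted by one base without losing any sequence matches, because the base after the first exon and the first base of the second exon are both G. Only the splice site scores separate the two placements, which is the kind of case the combined scoring is meant to resolve.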
Availability: The algorithm is available as a C subroutine and is implemented in the SplicePredictor and GeneSeqer programs. The source code is available via anonymous ftp from ftp.zmdb.iastate.edu. Both programs are also implemented as a Web service at http://gremlin1.zool.iastate.edu/cgi-bin/sp.cgi and http://gremlin1.zool.iastate.edu/cgi-bin/gs.cgi, respectively.
Contact: vbrendel{at}iastate.edu
*To whom correspondence should be addressed.