Bioinformatics Advance Access originally published online on February 22, 2005
Bioinformatics 2005 21(9):1859-1875; doi:10.1093/bioinformatics/bti310
GMAP: a genomic mapping and alignment program for mRNA and EST sequences
1 Department of Bioinformatics, Genentech, Inc., South San Francisco, CA 94080, USA
2 Department of Corporate Information Technology, Genentech, Inc., South San Francisco, CA 94080, USA
*To whom correspondence should be addressed.
Motivation: We introduce GMAP, a standalone program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. Methodology underlying the program includes a minimal sampling strategy for genomic mapping, oligomer chaining for approximate alignment, sandwich DP for splice site detection, and microexon identification with statistical significance testing.
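To make the idea of oligomer chaining concrete, the following is a minimal sketch, not GMAP's actual implementation. It assumes a toy oligomer length of k = 4, exact k-mer matching, and a simple quadratic-time chaining step; the function names (`index_oligomers`, `find_hits`, `chain_hits`) and the toy sequences are hypothetical and chosen only for illustration. It shows how shared oligomers between a cDNA and a genome can be linked into a colinear chain that spans an intron.

```python
from collections import defaultdict


def index_oligomers(genome, k):
    """Map each k-mer in the genome to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_hits(query, index, k):
    """Return sorted (query_pos, genome_pos) pairs for every shared k-mer."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for gpos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, gpos))
    return sorted(hits)


def chain_hits(hits):
    """Longest chain of hits that increase in both query and genome position."""
    if not hits:
        return []
    best_len = [1] * len(hits)   # best chain length ending at each hit
    prev = [-1] * len(hits)      # back-pointer to the previous hit in that chain
    for i, (qi, gi) in enumerate(hits):
        for j in range(i):
            qj, gj = hits[j]
            if qj < qi and gj < gi and best_len[j] + 1 > best_len[i]:
                best_len[i] = best_len[j] + 1
                prev[i] = j
    end = max(range(len(hits)), key=best_len.__getitem__)
    chain = []
    while end != -1:
        chain.append(hits[end])
        end = prev[end]
    return chain[::-1]


# Toy data (assumed for illustration): two exons separated by a 16-base intron.
exon1, intron, exon2 = "ACGTTGCA", "GTAAGTTTTTTTTCAG", "CCGGATAT"
genome = exon1 + intron + exon2
query = exon1 + exon2          # spliced cDNA

k = 4
hits = find_hits(query, index_oligomers(genome, k), k)
chain = chain_hits(hits)
print(chain)
```

In this toy case the chain covers both exons, and the jump in genomic position between the last hit in the first exon and the first hit in the second exon marks the approximate intron location. A real aligner would refine those boundaries afterwards, which is the role the abstract assigns to the splice site detection step.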
Results: On a set of human messenger RNAs with random mutations introduced at rates of 1% and 3%, GMAP identified all splice sites accurately in over 99.3% of the sequences, an error rate one-tenth that of existing programs. On a large set of human expressed sequence tags, GMAP provided higher-quality alignments more often than BLAT did. On a set of Arabidopsis cDNAs, GMAP performed comparably with GeneSeqer. In these experiments, GMAP ran several times faster than existing programs.
Availability: Source code for GMAP and associated programs is available at http://www.gene.com/share/gmap
Contact: twu{at}gene.com
Supplementary information: http://www.gene.com/share/gmap