Bioinformatics Vol. 17 no. 90001 2001
Pages S140-S148
© 2001 Oxford University Press
Integrating genomic homology into gene structure prediction
1 Department of Computer Science, Washington University, Campus Box 1045, St. Louis, MO, 63130, USA
2 Department of Biomedical Engineering, Washington University, Campus Box 1097, St. Louis, MO, 63130, USA
Received on February 6, 2001; revised on April 2, 2001; accepted on April 2, 2001
TWINSCAN is a new gene-structure prediction system that directly extends the probability model of GENSCAN, allowing it to exploit homology between two related genomes. Separate probability models are used for conservation in exons, introns, splice sites, and UTRs, reflecting the differences among their patterns of evolutionary conservation. TWINSCAN is specifically designed for the analysis of high-throughput genomic sequences containing an unknown number of genes. In experiments on high-throughput mouse sequences, using homologous sequences from the human genome, TWINSCAN shows notable improvement over GENSCAN in exon sensitivity and specificity and dramatic improvement in exact gene sensitivity and specificity. This improvement can be attributed entirely to modeling the patterns of evolutionary conservation in genomic sequence.
Contact: ikorf{at}cs.wustl.edu; pflicek{at}cs.wustl.edu; duan{at}cs.wustl.edu; brent{at}cs.wustl.edu
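The abstract reports exon sensitivity and specificity without defining them. The sketch below is an illustration only, assuming the convention common in gene-prediction evaluation: an exon counts as correct when both of its boundaries match an annotated exon exactly, sensitivity is the fraction of annotated exons predicted correctly, and specificity is the fraction of predicted exons that are correct. The function name, the coordinate representation, and the example data are hypothetical and are not taken from the article.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: an exon is a (start, end) coordinate pair.
Exon = Tuple[int, int]


def exon_accuracy(predicted: Iterable[Exon],
                  annotated: Iterable[Exon]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the exact-exon level.

    Assumed convention: a predicted exon is correct only if both of its
    boundaries match an annotated exon exactly.
    """
    pred: Set[Exon] = set(predicted)
    true: Set[Exon] = set(annotated)
    exact = len(pred & true)  # exons with both boundaries correct
    sensitivity = exact / len(true) if true else 0.0
    specificity = exact / len(pred) if pred else 0.0
    return sensitivity, specificity


# Made-up coordinates, for illustration only.
annotated = [(100, 200), (300, 400), (500, 650)]
predicted = [(100, 200), (300, 410), (500, 650), (800, 900)]

sn, sp = exon_accuracy(predicted, annotated)
print(f"exon sensitivity={sn:.2f}, specificity={sp:.2f}")
```

Under the same assumption, exact gene accuracy can be scored the same way by comparing whole gene structures instead of single exons, which is why a single wrong exon boundary costs more at the gene level than at the exon level.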