Bioinformatics, Vol 14, 384-390, Copyright © 1998 by Oxford University Press
AA Salamov, T Nishikawa and MB Swindells
MOTIVATION: In cDNA sequencing projects, it is vital to know whether the
protein coding region of a sequence is complete, or whether errors have
occurred during library construction. Here we present a linear discriminant
approach that predicts this completeness by estimating the probability of
each ATG being the initiation codon.

RESULTS: Because of the current shortage of full-length cDNA data on which
to base this work, tests were performed on a non-redundant set of 660
initiation codon-containing DNA sequences that had been conceptually
spliced into mRNA/cDNA. We also used an edited set of the same sequences
that only contained the region following the initiation codon as a negative
control. Using the criterion that only a single prediction is allowed for
each sequence, a cut-off was selected at which discrimination of both
positive and negative sets was equal. At this cut-off, 67% of each set
could be correctly distinguished, with the correct ATG codon also being
identified in the positive set. Reliability could be increased further by
raising the cut-off or including homologues, the relative merits of which
are discussed.

AVAILABILITY: The prediction program, called ATGpr, and other data are
available at http://www.hri.co.jp/atgpr

CONTACT: swintech@hri.co.jp
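
The general idea described in the abstract (score every ATG with a linear
combination of features, allow one prediction per sequence, and choose a
cut-off at which positive and negative sets are discriminated equally) can be
illustrated with a minimal Python sketch. This is not the ATGpr
implementation: the features, the weights in WEIGHTS and all function names
are hypothetical choices made for illustration, and the balance check is
simplified in that it does not verify that the predicted ATG in a positive
sequence is the annotated one.

```python
from typing import List, Optional, Tuple

# Hypothetical weights, for illustration only; these are NOT the ATGpr parameters.
WEIGHTS = {
    "purine_minus3": 1.0,   # A or G three bases upstream of the ATG
    "g_plus4": 0.5,         # G immediately after the ATG
    "orf_codons": 0.01,     # length of the open reading frame in codons
    "upstream_atgs": -0.3,  # number of ATGs found earlier in the sequence
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_codons(seq: str, start: int) -> int:
    """Count codons from `start` up to the first in-frame stop or the sequence end."""
    count = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


def score_atgs(seq: str) -> List[Tuple[int, float]]:
    """Return (position, linear score) for every ATG in the sequence."""
    seq = seq.upper()
    scored = []
    upstream = 0
    pos = seq.find("ATG")
    while pos != -1:
        features = {
            "purine_minus3": 1.0 if pos >= 3 and seq[pos - 3] in "AG" else 0.0,
            "g_plus4": 1.0 if pos + 3 < len(seq) and seq[pos + 3] == "G" else 0.0,
            "orf_codons": float(orf_codons(seq, pos)),
            "upstream_atgs": float(upstream),
        }
        score = sum(WEIGHTS[name] * value for name, value in features.items())
        scored.append((pos, score))
        upstream += 1
        pos = seq.find("ATG", pos + 1)
    return scored


def predict_start(seq: str, cutoff: float) -> Optional[int]:
    """Single prediction per sequence: the top-scoring ATG, if it reaches the cut-off."""
    scored = score_atgs(seq)
    if not scored:
        return None
    pos, score = max(scored, key=lambda item: item[1])
    return pos if score >= cutoff else None


def balanced_cutoff(pos_scores: List[float], neg_scores: List[float]) -> float:
    """Pick the cut-off at which the fraction of positives accepted is closest
    to the fraction of negatives rejected. Inputs are per-sequence top scores."""
    def gap(cutoff: float) -> float:
        accepted = sum(s >= cutoff for s in pos_scores) / len(pos_scores)
        rejected = sum(s < cutoff for s in neg_scores) / len(neg_scores)
        return abs(accepted - rejected)

    return min(sorted(set(pos_scores) | set(neg_scores)), key=gap)
```

Under these assumptions, predict_start returns the position of the
best-scoring ATG, or None when no ATG reaches the cut-off, which would be
read as a sequence whose coding region is likely incomplete at the 5' end.
Raising the cut-off trades coverage for reliability, as the abstract notes.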
ARTICLES
Assessing protein coding region integrity in cDNA sequencing projects
Helix Research Institute, 1532-3 Yana, Kisarazu-shi, Chiba-ken 292, Japan.








