Bioinformatics Advance Access published online on October 18, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti709
1 Department of Plant Sciences, University of Arizona, Tucson, AZ 85721-0036, USA
* To whom correspondence should be addressed.
Summary: Errors are prevalent in cDNA sequences, but the extent to which sequence collections differ in the frequencies and types of errors has not been investigated systematically. cDNA Quality Control, or cQC, was developed to evaluate the quality of cDNA sequence collections and to revise those sequences that differ from a higher-quality genomic sequence. After removing rRNA, vector, bacterial IS, and chimeric cDNA contaminants, small-scale nucleotide discrepancies were found in 51% of cDNA sequences from one Arabidopsis cDNA collection, 89% from a second Arabidopsis collection, and 75% from a rice collection. These errors created premature termination codons in 4% and 42% of cDNA sequences in the respective Arabidopsis collections and in 7% of the rice cDNA sequences.
Availability: A web-based version of cQC, source code and revised cDNA collections are available at http://genomics.arizona.edu/software/cQC/.
Supplementary data: Further text, tables and figures are available at the above website or on Bioinformatics online.
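To illustrate what a premature termination codon looks like in practice, the following is a minimal Python sketch. It is not cQC's implementation, which is not described here. It assumes the input is a coding sequence that begins at the start codon and is read in frame 0 with the standard stop codons; the function names `first_stop_index` and `has_premature_stop` are hypothetical.

```python
from typing import Optional

# Standard stop codons (assumption: standard nuclear genetic code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(cds: str) -> Optional[int]:
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    cds = cds.upper().replace("U", "T")
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def has_premature_stop(cds: str) -> bool:
    """True if an in-frame stop codon occurs before the last complete codon."""
    n_codons = len(cds) // 3
    idx = first_stop_index(cds)
    return idx is not None and idx < n_codons - 1


# Hypothetical example: a single-base deletion shifts the reading frame
# and exposes an early TGA.
reference = "ATGCTGAAAGGGTAA"  # ATG CTG AAA GGG TAA
with_deletion = "ATGTGAAAGGGTAA"  # ATG TGA AAG GGT AA

print(has_premature_stop(reference))      # False
print(has_premature_stop(with_deletion))  # True
```

A single-nucleotide insertion or deletion, one kind of small-scale discrepancy, can shift the reading frame in this way, which is why such errors can introduce termination codons upstream of the true one.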
Received May 27, 2005
Revised September 1, 2005
Accepted October 6, 2005
Applications note
Evaluating and improving cDNA sequence quality with cQC
2 Department of Computer Science, University of Arizona, Tucson, AZ 85721-0077, USA
Richard A. Jorgensen, E-mail: raj{at}ag.arizona.edu