© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
Beware of mis-assembled genomes
1Center for Bioinformatics and Computational Biology, University of Maryland College Park, MD 20742, USA
2Institute for Physical Sciences and Technology, University of Maryland College Park, MD 20742, USA
*To whom correspondence should be addressed. E-mail: salzberg@umd.edu
The first 10% of the full text of this article appears below.
With hundreds of genomes now in GenBank, researchers might be forgiven for assuming that genome sequence data are correct, at least at a large scale. Certainly there might be errors at some small rate, perhaps 1 in 50 000 or 100 000 bases (Schmutz et al., 2004; Read et al., 2002), but at a large scale these genomes are put together correctly, aren't they? Well, not always.
We have been looking at the assemblies of large genomes for several years now, and for every draft genome we look at, we find hundreds, and sometimes thousands, of mis-assemblies. These include regions where a genome is incorrectly re-arranged as well as places where large chunks of DNA sequence are simply
