Bioinformatics Advance Access published online on November 11, 2004
Bioinformatics, doi:10.1093/bioinformatics/bti129
Bioinformatics © Oxford University Press 2004; all rights reserved
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 Department of Electrical Engineering, UET, Lahore-54890, Pakistan
* To whom correspondence should be addressed.
Motivation: With the potential availability of nanopore devices that can sense the bases of translocating single-stranded DNA (ssDNA), it is likely that reads of length We assume that Results: We have developed an algorithm that addresses these issues. It can be considered an extreme variation of the well-known Eulerian path approach. It searches over a space of de Bruijn graphs until it finds one in which (a) the impact of errors is eliminated and (b) both possible orientations of the two ssDNA sequences can be identified separately and unambiguously. Our algorithm is able to correctly reconstruct real DNA sequences of the order of 106 bases (e.g. the bacterium Mycoplasma pneumoniae) from simulated erroneous reads on a modest workstation in about one hour. We describe, and give measured timings of, a parallel implementation of this algorithm on the Cray Multithreaded Architecture (MTA-2) supercomputer, whose architecture is ideally suited to this unstructured problem. Our parallel implementation is crucial to the problem of rapidly sequencing long DNA sequences and also to the situation where multiple nanopores are used to obtain a high-bandwidth stream of reads.
Revised October 2, 2004
Accepted October 25, 2004
Article
A parallel graph decomposition algorithm for DNA sequencing with nanopores
2 Eagle Research & Development, 1898 S. Flatiron Court, Suite 128, Boulder, Colorado, 80301
Shahid H. Bokhari, E-mail: shb{at}acm.org
![]()
Abstract
105 will be available in large numbers and at high speed. We address the problem of complete DNA sequencing using such reads.
102 copies of a DNA sequence are split into single strands that break into randomly sized pieces as they translocate the nanopore in arbitrary orientations. The nanopore senses and reports each individual base that passes through, but all information about orientation and complementarity of the ssDNA subsequences is lost. Random errors (both biological and transduction) in the reads create further complications.![]()
CiteULike
Connotea
Del.icio.us What's this?
This article has been cited by other articles:
![]() |
D. R. Zerbino and E. Birney Velvet: Algorithms for de novo short read assembly using de Bruijn graphs Genome Res., May 1, 2008; 18(5): 821 - 829. [Abstract] [Full Text] [PDF] |
||||
