Bioinformatics Advance Access published online on October 27, 2009
Bioinformatics, doi:10.1093/bioinformatics/btp614
The GNUMAP Algorithm: Unbiased Probabilistic Mapping of Oligonucleotides from Next-Generation Sequencing
1Department of Computer Science, Brigham Young University, Provo, UT, 84602
2Department of Statistics, Brigham Young University, Provo, UT, 84602
3Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, 84105
*To whom correspondence should be addressed. Nathan L. Clement, E-mail: nathanlclement{at}gmail.com
Abstract
Motivation: The advent of next-generation sequencing technologies has increased the accuracy and quantity of sequence data, opening the door to greater opportunities in genomic research.
Results: We present GNUMAP (Genomic Next-generation Universal MAPper), a program that overcomes two major obstacles in mapping reads from next-generation sequencing runs. First, we have created an algorithm that probabilistically maps reads to repeat regions in the genome on a quantitative basis. Second, we have developed a probabilistic Needleman-Wunsch algorithm that uses the _prb.txt and _int.txt files produced by the Solexa/Illumina pipeline to improve mapping accuracy for lower-quality reads and to increase the amount of usable data produced in a given experiment.
Availability: The source code for the software can be downloaded from http://dna.cs.byu.edu/gnumap.
Contact: nathanlclement{at}gmail.com
Associate Editor: Prof. John Quackenbush
Received on May 18, 2009; revised on September 24, 2009; accepted on October 16, 2009
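The two ideas named in the Results paragraph can be illustrated with a small sketch. It is not GNUMAP's implementation. It assumes each read position has already been converted to a probability vector over A, C, G, T (the abstract only says that the _prb.txt and _int.txt files are used, not how), and the scoring values, the function names, and the exponential weighting across candidate locations are all hypothetical choices made for illustration.

```python
from math import exp

BASES = "ACGT"  # order assumed for each per-position probability vector


def expected_score(probs, genome_base, match=1.0, mismatch=-1.0):
    """Expected substitution score of one uncertain read position
    against a known genome base (match/mismatch values are illustrative)."""
    return sum(p * (match if b == genome_base else mismatch)
               for b, p in zip(BASES, probs))


def prob_nw_score(read_probs, genome, gap=-2.0):
    """Needleman-Wunsch global alignment score where the read is a list of
    probability vectors instead of a list of called bases."""
    n, m = len(read_probs), len(genome)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1]
                + expected_score(read_probs[i - 1], genome[j - 1]),
                dp[i - 1][j] + gap,   # gap in the genome window
                dp[i][j - 1] + gap,   # gap in the read
            )
    return dp[n][m]


def location_weights(scores):
    """Split a single read across candidate locations in proportion to
    exp(score); this weighting is an assumption, not the paper's formula."""
    top = max(scores)
    likes = [exp(s - top) for s in scores]  # subtract max for stability
    total = sum(likes)
    return [like / total for like in likes]


# A three-base read: confident A, confident C, then a fully uncertain base.
read = [(0.97, 0.01, 0.01, 0.01),
        (0.01, 0.97, 0.01, 0.01),
        (0.25, 0.25, 0.25, 0.25)]
windows = ["ACG", "ACT", "TTT"]  # hypothetical candidate genome windows
scores = [prob_nw_score(read, w) for w in windows]
weights = location_weights(scores)
```

In this toy case the first two windows score identically because the third read base carries no information, so the read is shared equally between them rather than assigned to one arbitrarily or discarded, while the poorly matching third window receives only a small share.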