Bioinformatics Advance Access published online on May 8, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn223
PatMaN: rapid alignment of short sequences to large databases

1 Max-Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany
To whom correspondence should be addressed. Kay Prüfer, E-mail: pruefer{at}eva.mpg.de
Abstract
Summary: We present a tool for searching large databases for many short nucleotide sequences, allowing a predefined number of gaps and mismatches. The command-line program implements a non-deterministic automaton matching algorithm on a keyword tree of the search strings. Queries both with and without ambiguity codes can be searched. Search time is short for perfect matches, and retrieval time rises exponentially with the number of edits allowed.
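To make the idea of matching against a keyword tree concrete, the following minimal Python sketch builds a tree (trie) of the query strings and walks it over the database sequence while tracking a set of active states, each a pair of tree node and mismatches used so far. It is an illustration only and not PatMaN's C++ implementation: the function names `build_trie` and `search` are hypothetical, and the sketch handles mismatches but neither gaps nor ambiguity codes.

```python
def build_trie(patterns):
    """Build a keyword tree; each node stores its children and the patterns ending there."""
    root = {"children": {}, "hits": []}
    for pattern in patterns:
        node = root
        for base in pattern:
            node = node["children"].setdefault(base, {"children": {}, "hits": []})
        node["hits"].append(pattern)
    return root


def search(text, patterns, max_mismatches=0):
    """Return sorted (start, pattern, mismatches) tuples for all matches in text.

    Hypothetical helper for illustration: mismatches only, no gaps,
    no ambiguity codes.
    """
    root = build_trie(patterns)
    results = []
    for start in range(len(text)):
        # Active states of the non-deterministic walk: (trie node, mismatches used).
        states = [(root, 0)]
        for pos in range(start, len(text)):
            next_states = []
            for node, used in states:
                for base, child in node["children"].items():
                    # Following an edge costs one mismatch if the base differs.
                    cost = used + (base != text[pos])
                    if cost <= max_mismatches:
                        next_states.append((child, cost))
                        for pattern in child["hits"]:
                            results.append((start, pattern, cost))
            if not next_states:
                break  # no state survived; nothing can match from this start
            states = next_states
    return sorted(results)
```

For example, `search("ACGTACGA", ["ACGT", "CGA"])` returns `[(0, 'ACGT', 0), (5, 'CGA', 0)]`, and allowing one mismatch adds `(1, 'CGA', 1)` and `(4, 'ACGT', 1)`. With no mismatches allowed, at most one tree edge can be followed per state. Each additional permitted edit lets more edges survive at every step, which is consistent with the steep growth in retrieval time described above.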
Availability: The C++ source code for PatMaN is distributed under the GNU General Public License and has been tested on the GNU/Linux operating system. It is available from http://bioinf.eva.mpg.de/patman.
Contact: pruefer{at}eva.mpg.de
Associate Editor: Dr. Limsoon Wong
*These authors contributed equally to this work
Received on March 21, 2008; revised on April 24, 2008; accepted on May 3, 2008