Bioinformatics Advance Access originally published online on March 28, 2007
Bioinformatics 2007 23(10):1195-1202; doi:10.1093/bioinformatics/btm114
A fast and flexible approach to oligonucleotide probe design for genomes and gene families
1Institute of Computing Technology, Chinese Academy of Sciences, China, 2Ontario Cancer Institute, University Health Network, Toronto, Canada and 3Department of Medical Biophysics, University of Toronto, Canada
*To whom correspondence should be addressed.
Abstract
Motivation: With hundreds of completely sequenced microbial genomes available and continuing advances in DNA microarray technology, it may become possible to detect genes in microbial communities consisting of hundreds of thousands of sequences. Existing strategies for DNA probe design, which are geared toward identifying specific sequences, are unsuitable for metagenomics because they lack the necessary coverage, flexibility and efficiency.
Methods: ProDesign is a tool developed for the selection of oligonucleotide probes to detect members of gene families present in environmental samples. Gene family-specific probe sequences are generated based on specific and shared words, which are found with the spaced seed hashing algorithm. To detect more sequences, sequences that share common words are re-clustered into new families, and probes specific to these new families are then generated.
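To make the idea of specific and shared words concrete, the following is a minimal sketch of spaced seed hashing, not ProDesign's actual implementation. The seed pattern `1101`, the function names and the rule used to separate specific from shared words are all assumptions made for illustration; the abstract does not state which seed or data structures the program uses.

```python
from collections import defaultdict


def spaced_words(seq, seed):
    """Yield (start, key) for each window of len(seed) in seq.

    The key keeps only the bases at the '1' positions of the seed, so
    windows that differ only at a '0' (don't-care) position share a key.
    """
    care = [i for i, c in enumerate(seed) if c == "1"]
    for start in range(len(seq) - len(seed) + 1):
        window = seq[start:start + len(seed)]
        yield start, "".join(window[i] for i in care)


def classify_words(families, seed):
    """Split spaced words into family-specific and shared ones.

    families: {family name: {sequence id: sequence}} (hypothetical layout).
    Returns (specific, shared), where specific maps a family to the words
    seen only in that family and shared maps a word to the set of
    families that contain it.
    """
    index = defaultdict(set)  # hash table: word -> families containing it
    for family, seqs in families.items():
        for seq in seqs.values():
            for _, key in spaced_words(seq, seed):
                index[key].add(family)

    specific = defaultdict(set)
    shared = {}
    for key, fams in index.items():
        if len(fams) == 1:
            specific[next(iter(fams))].add(key)
        else:
            shared[key] = fams
    return specific, shared


if __name__ == "__main__":
    toy = {"A": {"a1": "ACGTACGT"}, "B": {"b1": "TTACGTGG"}}
    specific, shared = classify_words(toy, "1101")
    print(dict(specific), shared)
```

In this sketch, words found in a single family are candidate anchors for family-specific probes, while words found in several families indicate sequences that could be re-clustered into a new family.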
Results: The program is very flexible in that it can be used to design probes that detect many gene families simultaneously and specifically in one or more genomes. Neither the length nor the melting temperature of the probes needs to be predefined. We have found that ProDesign provides more flexibility, coverage and speed than other software programs used in the selection of probes for genomic and gene family arrays.
Availability: ProDesign is licensed free of charge to academic users. ProDesign and Supplementary Material can be obtained by contacting the authors. A web server for ProDesign is available at http://www.uhnresearch.ca/labs/tillier/ProDesign/ProDesign.html
Contact: e.tillier{at}utoronto.ca or fsz{at}ncic.ac.cn
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Martin Bishop
Received on January 3, 2007; accepted on March 15, 2007