Bioinformatics Advance Access originally published online on November 30, 2004
Bioinformatics 2005 21(8):1365-1370; doi:10.1093/bioinformatics/bti182
YODA: selecting signature oligonucleotides
Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University Blacksburg, VA 24061, USA
| Abstract |
|---|
Motivation: Selecting oligonucleotide probes for use in microarray design, and other applications requiring signature sequences, involves identifying sequences which will bind strongly to their intended target, while binding only weakly (or preferably, not at all) to non-target sequences which may be present in the hybridization reaction. While many tools to assist in selection of such sequences exist, all the ones we examined lack important oligo design and software features.
Results: YODA is an application for assisting biological researchers in selecting signature sequences. It incorporates a custom sequence similarity search to find potential cross-hybridizing non-target sequences. For this task, most oligo design tools rely on BLAST, which is ill suited for it due to an unacceptable risk of false negatives. YODA supports multiple probe design goals including single-genome, multiple-genome, pathogen-host and species/strain-identification. A graphical interface is provided as well as a command-line interface, both of which support many user-controlled parameters. YODA is easy to install and use and runs on Windows, Mac OS X and Linux platforms.
Availability: Freely available (LGPL) along with source code and additional documentation at http://pathport.vbi.vt.edu/YODA
Contact: enordber{at}vbi.vt.edu
| INTRODUCTION |
|---|
Oligo design considerations
Three general considerations are important in designing oligonucleotide probes for microarrays: sensitivity of the individual probes, specificity of individual probes and consistency among the set of probes.
Sensitivity refers to the ability of the oligonucleotide probe to bind strongly to its target sequence. This primarily requires selecting probe sequences which do not tend to form stable secondary (stem-loop or hairpin) structures and which do not have a tendency to form homo-dimers. The presence of stable secondary structures and/or probe dimers limits the availability of the probe for binding to its target sequence.
Specificity refers to the inability of the probe to bind strongly to non-target sequences that may be present during the hybridization. This can be accomplished by avoiding probes with excessive sequence similarity to a non-target sequence that may be present during the hybridization. There are multiple criteria for excessive sequence similarity.
- Total percent identity: >75–80% identity with a non-target sequence (Hughes et al., 2001; Kane et al., 2000).
- Contiguous identical stretches: contiguous stretches of identity >15 nt with a non-target sequence (Kane et al., 2000).
- Low complexity: low complexity regions, such as long stretches of the same base.
Consistency among the set of probes primarily involves all probes having similar melting temperatures (Tm). Depending on the protocol used for sample preparation, it may be important for probes to be targeted towards various regions of the sequence. If poly-dT priming is used, then probes should be close to the 3' end of the sequence. If random priming is used then probes should be towards the 5' end or the center of the sequence.
Software features
In addition to the technical requirements for oligo selection, there are certain desirable features for the software performing the selection. The oligo selection program should run on multiple computer platforms, including Windows, Macintosh and Linux. RAM requirements should be reasonable (e.g. 512 MB). It should run in a reasonable amount of time. While this is a subjective criterion, we would suggest that reasonable times would be measured in minutes, possibly hours, but certainly not days, for single genome designs.
Ideally, the program will not require any external local or remote programs to accomplish its task. Dependence on external programs can create problems beyond the control of the oligo selection program. Using remote (server-side) programs involves network latencies and makes performance dependent on network traffic and server load. Local programs may be platform-specific and require complicated installations.
For the software to be useful to a range of users it should provide a Graphical User Interface (GUI) as well as a command-line interface. A GUI allows non-computer-expert users to use the program easily. A command-line interface allows the program to be easily incorporated into scripts and pipelines.
Available oligo selection tools
Many tools for oligo probe selection exist [Bozdech et al., 2003; Kaderali and Schliep, 2002; Li and Stormo, 2001; Nielsen et al., 2003; Premier Biosoft (http://www.premierbiosoft.com/dnamicroarray/dnamicroarray.com); Raddatz et al., 2001; Reymond et al., 2004; Rouillard et al., 2002, 2003; Wang and Seed, 2003; Xu et al., 2002]. Several of these are shown in Table 1, which summarizes certain features of each tool. The available tools are generally difficult to install and configure, non-intuitive to use and, as Table 1 indicates, lack important oligo design and software features. Additionally, most oligo selection tools rely heavily on BLAST. This has serious implications for oligo specificity. BLAST is designed for rapid searching of a sequence database to find highly significant alignments to a query sequence (Altschul et al., 1990). To achieve its speed, BLAST takes shortcuts. It uses a word-based lookup approach, with a minimum word size of 7 nt for DNA sequences, and identifies database sequences sharing words with the query sequence. If a database sequence does not share at least one word with the query, BLAST will not attempt to align the two sequences. This means that two sequences that are identical except at every seventh position have no chance of being aligned by BLAST. Oligo selection programs using BLAST may therefore select oligos which are >86% identical to a non-target sequence, regardless of the identity threshold which is used. An examination of oligo sets produced by some of the available programs shows that this does occur in practice (see Results and Discussion).
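The word-lookup limitation described above can be illustrated with a minimal sketch. It does not model BLAST itself; it only compares two sequences in a single ungapped alignment. The helper names (`percent_identity`, `longest_match_run`, `mutate_every_nth`) and the 50 nt example sequence are hypothetical and chosen for illustration only.

```python
def percent_identity(a, b):
    """Percent identity of two equal-length sequences, compared without gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def longest_match_run(a, b):
    """Length of the longest stretch of consecutive identical positions."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


def mutate_every_nth(seq, n):
    """Copy of seq in which every n-th base (1-based) is replaced by another base."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    return "".join(swap[b] if (i + 1) % n == 0 else b for i, b in enumerate(seq))


# Arbitrary 50 nt example oligo (not taken from any real genome).
oligo = "ACGTTGCATGCAAGCTTGACCGATAGCTAGGCTTAACGGATCCGTAGCTA"
near_copy = mutate_every_nth(oligo, 7)

print(percent_identity(oligo, near_copy))   # 86.0
print(longest_match_run(oligo, near_copy))  # 6
```

In this alignment the two sequences are 86% identical, yet the longest identical stretch is 6 nt, so no 7 nt word is shared at the aligned positions and a word-seeded search has nothing to extend from.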
Other problems with available tools include reliance on external software, limited user interface, platform-specificity, lack of consideration of secondary structure or dimerization, and excessive running times.
Yet-another Oligonucleotide Design Application
YODA (Yet-another Oligonucleotide Design Application) is an easy-to-use stand-alone application for the rapid design of microarray probe sets. YODA supports the design of one or more oligo probes per sequence using parameters such as oligo length, maximum percent identity to a non-target sequence, maximum consecutive identities with a non-target sequence, melting temperature range, GC content range, secondary structure and homo-dimerization potential. We describe the design process and provide examples of probe design tasks. An examination of existing tools, and individual comparisons with YODA, demonstrates the need for another oligonucleotide design tool.
To support multiple design strategies, YODA considers three classes of DNA sequence files, termed Design, Genome, and Host. Design files contain sequences for which oligo probes are to be designed. Host files contain sequences to be considered for potential cross-hybridization to all designed probes. Genome files are primarily a convenience to support the design of probes for a subset of genes in a genome, which is particularly useful in iterative designs.
| ALGORITHM |
|---|
In order to minimize the time required to perform a design task, oligos are processed in several steps of increasing computational intensity, with unfit candidate probes being removed at each successive step. The steps in the design process are: determining average Tm and GC content, identifying prohibited sequences and contiguous stretches of identities, calculating Tm, checking for potential secondary structure, checking for dimerization potential and checking for excessive similarity to non-target sequences.
Determine average Tm and GC content
YODA uses the nearest-neighbor formula with the parameters reported by SantaLucia (1998). First, the average Tm for all oligos of the specified length is calculated. The user specifies a range of acceptable Tms. For example, if the calculated average is 65°C and the specified range is 4°C, then any Tm from 63 to 67°C is acceptable. This saves the user the trouble of determining a good Tm for the sequences being considered, while maximizing the number of oligos that can be found within the desired range of Tms. Average GC content for all oligos of the specified length is also calculated. Sequences provided in Host files are not considered for these calculations.
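The windowing logic can be sketched as follows. This is only an illustration under stated assumptions: the Tm values are taken as already computed (the nearest-neighbor calculation is not reproduced here), the window is assumed to be centred on the mean with inclusive bounds, and the function names `tm_window`, `gc_percent` and `within` are hypothetical rather than part of YODA.

```python
def tm_window(tms, tm_range):
    """Acceptable (low, high) Tm window: the mean Tm plus or minus half the range."""
    mean_tm = sum(tms) / len(tms)
    half = tm_range / 2.0
    return mean_tm - half, mean_tm + half


def gc_percent(oligo):
    """GC content of an oligo as a percentage of its length."""
    oligo = oligo.upper()
    return 100.0 * (oligo.count("G") + oligo.count("C")) / len(oligo)


def within(value, window):
    """True if value lies inside the window (bounds assumed inclusive)."""
    low, high = window
    return low <= value <= high


# Precomputed Tm values (degrees C) for three hypothetical candidate oligos.
window = tm_window([64.0, 65.0, 66.0], 4.0)
print(window)                # (63.0, 67.0)
print(within(62.5, window))  # False
```

With a mean of 65°C and a range of 4°C, the window is 63 to 67°C, matching the example in the text.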
Prohibited sequences and contiguous identities
Initially, all sequences of the target oligo length are considered as candidate probes. Each oligo is first examined for the presence of any prohibited sequences or any long stretches of contiguous identities with a non-target sequence. The user can specify prohibited sequences, which are short (<16 nt) sequences to be avoided in probes. Poly-X (e.g. AAAAA or TTTTT) sequences are also considered prohibited sequences.
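A minimal sketch of this first filter is given below. All function names are hypothetical, and two details are assumptions rather than statements about YODA's implementation: a poly-X run is rejected when it is longer than the Maximum Poly-X value, and contiguous identities are detected by looking up each k-length word of the candidate in a set of words collected from non-target sequences.

```python
def candidate_oligos(sequence, length):
    """Yield (start, oligo) for every window of the given length."""
    for start in range(len(sequence) - length + 1):
        yield start, sequence[start:start + length]


def has_poly_x(oligo, max_poly_x=4):
    """True if any single base is repeated more than max_poly_x times in a row."""
    run, prev = 0, ""
    for base in oligo.upper():
        run = run + 1 if base == prev else 1
        prev = base
        if run > max_poly_x:
            return True
    return False


def has_prohibited(oligo, prohibited):
    """True if the oligo contains any user-specified prohibited sequence."""
    oligo = oligo.upper()
    return any(p.upper() in oligo for p in prohibited)


def kmer_set(sequences, k):
    """All words of length k found in the given (non-target) sequences."""
    return {s[i:i + k] for s in sequences for i in range(len(s) - k + 1)}


def has_long_shared_stretch(oligo, non_target_kmers, k):
    """True if the oligo shares at least one k-length word with a non-target."""
    return any(oligo[i:i + k] in non_target_kmers
               for i in range(len(oligo) - k + 1))


# Toy example with k = 8; a real design would use a larger k.
non_target = kmer_set(["TTTTACGTACGTTTTT"], 8)
print(has_long_shared_stretch("GGACGTACGTGG", non_target, 8))  # True
print(has_poly_x("ACGTAAAAAC"))                                # True
```

Because every check is a simple scan or set lookup, this stage is cheap relative to the later similarity search, which is consistent with running it first.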
Filter for Tm
All oligo candidates passing the previous tests are examined for Tm. If the Tm for the candidate is outside of the acceptable range, the candidate is rejected.
Filter for secondary structure
All remaining candidates are checked for potential stem-loop structures by looking for short stretches of complementary sequences (the stem) that are separated by a few bases (the loop). The user can set parameters for length and stringency of the stem and length of the loop. A thermodynamic approach is not used for this step because the intent is not to find the best secondary structure, but to determine if any such structure is likely.
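The following sketch shows one way such a non-thermodynamic stem-loop check could look. It is a simplification and not YODA's code: the stem must be perfectly complementary (the stringency parameter mentioned above is not modelled), and the parameter names `stem_len`, `min_loop` and `max_loop` and their default values are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_stem_loop(oligo, stem_len=6, min_loop=3, max_loop=10):
    """True if a perfect stem of stem_len bases can pair across a short loop.

    For each possible left stem, look downstream for its reverse complement,
    separated from it by between min_loop and max_loop bases.
    """
    oligo = oligo.upper()
    n = len(oligo)
    for i in range(n - 2 * stem_len - min_loop + 1):
        partner = reverse_complement(oligo[i:i + stem_len])
        first = i + stem_len + min_loop
        last = min(i + stem_len + max_loop, n - stem_len)
        for j in range(first, last + 1):
            if oligo[j:j + stem_len] == partner:
                return True
    return False


# GACCTG ... TTTT ... CAGGTC can fold back on itself; the AC repeat cannot.
print(has_stem_loop("AAGACCTGTTTTCAGGTCAA"))  # True
print(has_stem_loop("ACACACACACACACACACAC"))  # False
```

The check stops at the first possible structure, in keeping with the stated aim of deciding whether any such structure is likely rather than finding the most stable one.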
Surviving candidates for each sequence are stored in a temporary file for later sorting and validation.
Sort and validate candidates
At this stage, each target sequence may have anywhere from zero to thousands of oligo candidates. Selecting a final oligo, or set of oligos, for each sequence is performed by various Probe Sorter modules. The default Probe Sorter is the Coverage Probe Sorter, which selects multiple oligos per sequence (up to a maximum specified by the user) with the goal of even spacing across the length of the sequence. Other Probe Sorters may select oligos close to the 3' end, close to the 5' end, close to the center of the sequence, or all non-overlapping oligos. Multiple Probe Sorters may be used and each produces its own output file containing its set of selected oligos.
Only those candidates selected by a Probe Sorter are subjected to final validation, which checks the oligo for dimerization potential and for sequence similarity to other sequences being considered. Sequence similarity is determined using the SeqMatch tool.
SeqMatch is a custom sequence similarity search tool developed for use in YODA. It serves the same purpose for which BLAST is used by many other tools. The goal is to determine if there is any non-target sequence which has a greater percent identity with the oligo candidate than a specified cutoff value (80% by default). SeqMatch uses two different algorithms, depending on the percent identity cutoff. For 80% and greater, a word-based search with word length of 4 is used. With a word length of 4, it is possible to guarantee that all matches with ≥80% identity will be found. To find matches with <80% identity, an exhaustive search is used. This search can find matches at any level, but it requires significantly more time. At 94% and higher, running SeqMatch is not necessary because the candidate probe has already been checked for any non-unique 15 mers. If the candidate has at least one mismatch every 15 bases when aligned with another sequence, it must be less than 94% identical to the other sequence. While 94% identity may seem unacceptable, there are applications where a guaranteed mismatch at least every 15 bases is sufficient, particularly when dealing with very short probes (25–35 nt) (Relogio et al., 2002).
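One way to see why a short word length can give such a guarantee is a pigeonhole argument on ungapped alignments: m mismatches split the matching positions into at most m + 1 runs, so at least one run must reach a certain minimum length. The sketch below computes that minimum; the function name `guaranteed_match_run` is hypothetical, and the argument is offered as an illustration, not as a description of how SeqMatch was derived.

```python
def guaranteed_match_run(oligo_len, min_identity_pct):
    """Shortest possible 'longest run of matches' in an ungapped alignment
    of oligo_len bases with at least min_identity_pct percent identity."""
    # Fewest matches that still reach the identity level (ceiling division).
    min_matches = -(-oligo_len * min_identity_pct // 100)
    max_mismatches = oligo_len - min_matches
    # Mismatches split the matches into at most max_mismatches + 1 runs,
    # so the longest run has at least ceil(min_matches / runs) bases.
    return -(-min_matches // (max_mismatches + 1))


for length in (50, 60, 70):
    print(length, guaranteed_match_run(length, 80))  # always 4

print(guaranteed_match_run(60, 94))  # 15
```

For 50, 60 and 70 nt oligos at 80% identity the bound is 4, so any such match shares at least one aligned 4 nt word with the oligo. For the default 60 nt oligo length at 94% identity the bound is 15, which is why the earlier 15 mer check already covers that case.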
SeqMatch does not attempt to find the best match for a given sequence, nor does it perform gapped alignments. It searches only until it finds a single match above the identity threshold, at which time the candidate is rejected and the Probe Sorter moves on to the next candidate. SeqMatch can be used independently via a command-line interface (additional documentation at website).
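The stop-at-first-hit behaviour can be sketched with an exhaustive ungapped scan. This is a simplified illustration, not SeqMatch: only the strand as given is compared, no word index is used, and the names `exceeds_identity` and `first_cross_hybridizer` are hypothetical.

```python
def exceeds_identity(oligo, target, max_identity_pct=80):
    """True as soon as one ungapped window of target is more than
    max_identity_pct percent identical to the oligo."""
    oligo, target = oligo.upper(), target.upper()
    n = len(oligo)
    for start in range(len(target) - n + 1):
        mismatches = 0
        for i in range(n):
            if oligo[i] != target[start + i]:
                mismatches += 1
                # Abandon the window once it can no longer exceed the cutoff.
                if (n - mismatches) * 100 <= max_identity_pct * n:
                    break
        if (n - mismatches) * 100 > max_identity_pct * n:
            return True
    return False


def first_cross_hybridizer(oligo, non_targets, max_identity_pct=80):
    """Name of the first non-target sequence exceeding the cutoff, or None."""
    for name, seq in non_targets.items():
        if exceeds_identity(oligo, seq, max_identity_pct):
            return name
    return None


non_targets = {
    "geneA": "ACGTTCGTAA",        # 2 mismatches in 10 positions: exactly 80%
    "geneB": "TTTACGTACGTAATTT",  # contains a window with 1 mismatch: 90%
}
print(first_cross_hybridizer("ACGTACGTAC", non_targets))  # geneB
```

Integer arithmetic is used for the threshold comparison so that a window at exactly the cutoff (80% here) is not rejected because of floating-point rounding.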
| IMPLEMENTATION |
|---|
YODA is written entirely in Java and runs on any platform with Java 1.3.1 or later. A graphical interface and a command-line interface are provided. The application can be launched by double-clicking on the file icon, or by running the command java -jar YODA.jar. Users wanting to use the command-line interface can get a list of command-line options with the command java -jar YODA.jar -h. No additional software or libraries are required, nor is an active network connection. Testing has been performed on Linux, Mac OS X and Windows. Approximately 400 MB of RAM is required and a machine with 512 MB is sufficient for most uses.
YODA is freely available, under the terms of the Lesser GNU Public License, and can be downloaded along with source code and documentation from http://pathport.vbi.vt.edu/YODA. A web service version is also available, with access via the ToolBus/PathPort system (Eckart and Sobral, 2003) (http://pathport.vbi.vt.edu).
Input files
All sequence files must be in FASTA format. The only restriction on the FASTA title line is that it must be unique for each Design File sequence. The DNA sequences may contain ambiguity characters (e.g. N), but they will not be considered for probe design (i.e. YODA does not design degenerate probes). No probe will be selected that contains any character other than ATGC (uppercase and lowercase are accepted). There are no explicit limits on the number of files or the size of sequences that may be used. Multiple files of each type may be used.
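The input rules above can be sketched as follows. This is an illustrative reader, not YODA's parser; the function names `parse_fasta`, `check_unique_titles` and `is_designable` are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (title, sequence) pairs."""
    records = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:].strip(), []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


def check_unique_titles(records):
    """Raise ValueError if two records share the same title line."""
    seen = set()
    for title, _ in records:
        if title in seen:
            raise ValueError("duplicate FASTA title: " + title)
        seen.add(title)


def is_designable(oligo):
    """True if the oligo is non-empty and contains only A, T, G or C (any case)."""
    return len(oligo) > 0 and set(oligo.upper()) <= set("ACGT")


records = parse_fasta(">g1\nACGT\nNNAC\n>g2\nacgt\n")
check_unique_titles(records)
print(records)                 # [('g1', 'ACGTNNAC'), ('g2', 'acgt')]
print(is_designable("ACGTN"))  # False
```

Sequences containing ambiguity characters are still read; only the candidate windows that overlap such characters are excluded from design.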
All accepted oligos will have been screened for ability to cross-hybridize to all sequences in the Design files. At least one Design file is required. Host files are optional and, if provided, must not contain any sequences from the Design files. Genome files are also optional, differing from Host files in two ways: sequences in Genome files are used in calculating the average melting temperature and GC content used for the design process; if any Genome files are provided, all sequences in the Design files must be present in the Genome files exactly once.
Output files
YODA produces three output files for each Probe Sorter used. The first contains the probe sequences and details, including location in the sequence, Tm and GC%. This is a tab-delimited file, which can be viewed with many spreadsheet programs or with YODA's built-in OligoViewer. The second contains the FASTA title lines of any sequences for which the Probe Sorter was unable to select oligo probes. The third contains, in FASTA format, the sequences for which the Probe Sorter was unable to select oligo probes. This last file is useful for an iterative design process, in which the stringency for probe selection is initially set very high, but is reduced on successive rounds to select probes for those sequences not meeting the more stringent criteria.
| RESULTS AND DISCUSSION |
|---|
Unless otherwise noted, the computer used for the design tasks described here has a 2.0 GHz Pentium 4 processor with 512 MB of physical RAM and runs the Linux operating system. Unless otherwise noted, for design tasks, Oligo Length = 60 nt, Maximum Percent Identity = 80%, Tm range = 6°C, GC range = 12%, Maximum Consecutive Matches = 15, Maximum Poly-X = 4, DNA concentration = 50 nM and salt concentration = 50 mM.
All sequence files used were obtained from NCBI (ftp://ftp.ncbi.nih.gov/genomes/).
Single-genome designs
Table 2 shows the results of several single-genome design tasks. In each case, a single file containing multiple gene sequences from the species was given as a Design file. Parameters are all set to default values (see above). For each genome, probes are selected for at least 90% of the genes. Most of the designs (13 of 18) were completed in <10 min.
Host-pathogen design
A probe set for the Escherichia coli K12 genome was designed with 28,410 human genes included as Host sequences, to prepare for the possibility that an E.coli sample being analyzed may be contaminated with sequences from the human host cells. Without consideration of the human sequences, probes were selected for 4103 of 4289 E.coli genes in about 6 min. The addition of the host sequences increases the time to about 40 min and reduces the number of genes for which a probe can be selected to 4037. This means that the probes selected for at least 66 of the E.coli genes would have an unacceptable possibility of hybridizing to host sequences in a sample, producing erroneous expression values for those genes. By including the host genes in the design process, the chance of such erroneous results can be reduced.
Species/strain identification design
In addition to transcriptional profiling, microarrays can be used to identify an unknown DNA sample. YODA can design probe sets for distinguishing between a set of given species or strains. Probes were designed from the genome sequences of four strains of E.coli, one of which (O157:H7) contains two plasmids. A total of 2470 diagnostic 60 mer probes were found spanning all six DNA molecules (four chromosomes and two plasmids) in 25 min, 12 s. An E.coli DNA sample from an unknown strain could be identified as coming from one of these four strains, ruling out the other three, with the use of a microarray containing this probe set. Such a design requires that the sequences be sufficiently different to allow identification of specific probes. YODA is not intended, however, to design probe sets for re-sequencing or SNP analysis/identification arrays.
Iterative design
An iterative design process may be used to maximize the number of sequences for which probes can be designed, without sacrificing probe quality for the majority of sequences. Each Probe Sorter produces a file containing the sequences for which it was unable to select probes. By using this file as a Design file, and the original file, containing all genes, as a Genome file, a second round of design can be performed with slightly relaxed stringency. The mean Tm and GC values remain constant because they are determined from the Genome file sequences. This may enable the selection of probes for some of the initially missed sequences. These probes may not be as reliable as the original set of probes, which must be considered when analyzing data.
An iterative design process was used to select probes for E.coli genes. The first round of the design used the default parameters, finding probes for 4103 genes in about 6 min. For the second round, the file containing all E.coli genes was loaded as a Genome file, and the file containing the rejected sequences from the first round was loaded as the Design file. The Tm range parameter was increased to 10°C and the GC range was increased to 20%. In <1 min, probes were found for an additional 54 genes that had been missed in the more stringent first round. For the third round, the rejected sequences from the second round were used as the Design file. The Tm range was increased to 14°C, the GC range was increased to 30% and the Maximum Poly-X parameter was increased to 5. In <1 min, probes were selected for another 30 genes. This process can be continued, but at some point the value of the probes becomes suspect. For some genes it may not be possible to design useful signature oligonucleotides.
Tool comparisons
For comparison, OligoArray1.0, OligoPicker, ArrayDesigner3.0 and YODA were used to design probe sets for the yeast Saccharomyces cerevisiae. OligoArray1.0, OligoPicker and YODA are free software, while ArrayDesigner3.0 is a commercial tool. Each tool has its own set of parameters, complicating a direct comparison. Comparisons with YODA were done individually, attempting to adjust the YODA parameters to match the default setting of the other tool. The results of these designs are shown in Table 3.
OligoArray1.0 (Rouillard et al., 2002) selected probes for 5844 of the 5864 genes in 46 h, 45 min. Most of these probes are reported as cross-hybridizing probes, expected to bind significantly to multiple target genes. Of the probes, 250 are reported as specific to a single gene. Total percent identity is not available as a parameter in OligoArray1.0, but the documentation indicates that for 50 nt, 50% identity is the cutoff. Of the 250 specific probes, 21 meet this threshold. All 250 specific probes meet the 80% identity threshold. Of the cross-hybridizing probes, 262 have exact matches in a non-target gene. In one such case, identical 50 mers are selected as probes for 27 different genes. This 50 mer is present in a total of 55 of the genes. YODA selected specific 50 mer probes for 5484 of the 5864 genes in 21 min. None of the YODA probes are >80% identical to a non-target sequence.
OligoPicker (Wang and Seed, 2003) selected specific probes for 5453 of the 5864 genes in 32 min. While OligoPicker appears to perform better than OligoArray1.0, it suffers from the same pitfalls inherent in using BLAST to identify potentially cross-hybridizing sequences. Among the set of selected probes are 36 with 80% or greater identity to a non-target sequence. There are cases where probes for two genes are >80% identical to each other. OligoPicker screens probes for stretches of 15 nt or more consecutive identities with non-target sequences. While largely successful at avoiding such regions, OligoPicker did select 22 probes with contiguous identical stretches of 15 nt. YODA selected specific 70 mer probes for 5396 of the 5864 genes in 13 min. None of the YODA probes are >80% identical to a non-target sequence or have contiguous identical stretches of 15 nt with a non-target sequence.
The ArrayDesigner3.0 (Premier Biosoft) comparison was performed using a 933 MHz G4 PowerMac with 1.5 GB of RAM running Mac OS X version 10.3.5 to run both ArrayDesigner3.0 and YODA. ArrayDesigner3.0 selected specific probes for 5849 of the 5864 yeast genes in about 3.5 days. The selected probes for 244 of the genes have exact matches in a non-target gene, with another 125 of the probes having matches with only one mismatched nucleotide in a non-target gene. A total of 705 of the probes match a non-target sequence with greater than 80% identity. YODA selected specific probes for 5417 of the 5864 genes in 22 min. None of the YODA probes are >80% identical to a non-target sequence.
Table 3 summarizes the comparisons of these tools. ArrayDesigner3.0 and OligoArray1.0 have long running times, primarily due to the use of remote servers. These tools have the benefit of running on all major platforms. OligoPicker has short running times and does a relatively good job of probe selection. However, it is only available for Linux, lacks a graphical interface and uses a less reliable method for predicting melting temperatures. All three use BLAST for detecting cross-hybridization risks, rendering them incapable of guaranteeing the specificity of selected probes.
Running time
Depending on the value of the Maximum Percent Identity parameter, YODA has two worst-case order running times with respect to effective sequence length. Effective sequence length is LT - [n * (LO - 1)], where LT is the total length of all sequences in the design task, n is the number of individual sequences and LO is the oligo length. This is the total number of oligos of length LO present in the sequences. When only Design sequences are given, at 94% identity and greater, YODA runs in O(n) time. That is, if the sequence length doubles, the running time doubles. Below 94% Maximum Percent Identity, YODA runs in O(n²) time. That is, if the sequence length doubles, the running time will quadruple.
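As a small worked example, the effective sequence length (total length minus the number of sequences times the oligo length minus one) can be computed directly and checked against a per-sequence count of oligo windows. The function names are hypothetical, and the sketch assumes every sequence is at least as long as the oligo.

```python
def effective_length(seq_lengths, oligo_len):
    """Total length minus n * (oligo_len - 1), for n sequences."""
    total = sum(seq_lengths)
    n = len(seq_lengths)
    return total - n * (oligo_len - 1)


def count_oligo_windows(seq_lengths, oligo_len):
    """Number of oligo-length windows, counted sequence by sequence."""
    return sum(length - oligo_len + 1 for length in seq_lengths)


lengths = [100, 200]  # two hypothetical gene lengths, in nt
print(effective_length(lengths, 60))     # 182
print(count_oligo_windows(lengths, 60))  # 182
```

Both calculations give 182 candidate 60 mers for these two sequences, confirming that the formula counts the oligo windows available to the design.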
When Design, Genome and Host sequences are all considered, the complexity analysis becomes more complicated. At 94% identity and greater, the complexity is (a + b + c + d)D + aH + (a + d)G, where D is the effective length of Design sequences, H is the effective length of Host sequences, G is the effective length of non-Design Genome sequences (i.e. Genome sequences that are not also present in the Design sequences) and a, b, c and d are constants for a given set of parameters. The time to populate the list of all 15 mers is described by a. The time to search the list of 15 mers, calculate Tm and GC, and identify potential hairpin structures in each probe candidate is described by b. The number of probes being selected per Design sequence is represented by c. The time to calculate the mean Tm and GC content is described by d. Below 94% identity, the use of SeqMatch adds a term to the running time, resulting in (a + b + c + d)D + aH + (a + d)G + ecD(D + H + G), where the additional constant e represents the time to search for similar sequences using SeqMatch.
Actual running time, and the number and quality of oligos selected, depend on the sequences being considered and the parameter values used. The Maximum Percent Identity parameter has a large impact on running time because different search algorithms are used at different values.
Melting temperature
To date, there is no established method of predicting melting temperature of an immobilized oligonucleotide and a longer, complementary, oligonucleotide in solution. The favored melting-temperature prediction method is the nearest-neighbor formula using the thermodynamic parameters reported by SantaLucia (1998). This formula, and these parameter values, assumes two strands in solution with known concentrations. As this is clearly not the case in a microarray experiment, the validity of Tms calculated in this way is uncertain.
It is, however, reasonable to expect that two sequences with very similar Tms as predicted by the nearest-neighbor method will have actual Tms very near each other in a microarray context, even though the absolute predicted Tms are not accurate. In other words, if all probes have predicted Tms within a narrow range of each other, they should have actual Tms within a narrow range of each other. Therefore, specifying a specific target Tm for probes is not as important as specifying that all probes have Tms within a certain number of degrees of each other.
| CONCLUSIONS |
|---|
YODA provides the research biologist with a tool for the easy, rapid, flexible and free design of signature oligonucleotides for use in microarrays and other applications.
| Acknowledgments |
|---|
This work was supported by DoD grant (DAAD 13-02-C-0018) to Dr Bruno Sobral. We are grateful to Premier Biosoft for providing an evaluation license for ArrayDesigner3.0. Thanks to Mike Horsmon for helpful user feedback. Thanks to Drs Bruno Sobral, J. Dana Eckart and Karen Duca for critical reading and thoughtful comments.
Received on September 14, 2004; revised on September 22, 2004; accepted on September 23, 2004
| REFERENCES |
|---|
Altschul, S.F., Gish, W., Miller, W., Myers, E., Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
Bozdech, Z., Zhu, J., Joachimiak, M.P., Cohen, F.E., Pulliam, B., DeRisi, J.L. (2003) Expression profiling of the schizont and trophozoite stages of Plasmodium falciparum with a long-oligonucleotide microarray. Genome Biol., 4, R9.
Eckart, J.D. and Sobral, B.W. (2003) A life scientist's gateway to distributed data management and computing: the PathPort/ToolBus framework. OMICS, 7, 79–88.
Hughes, T.R., Mao, M., Jones, A.R., Burchard, J., Marton, M.J., Shannon, K.W., Lefkowitz, S.M., Ziman, M., Schelter, J.M., Meyer, M.R., et al. (2001) Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat. Biotechnol., 19, 342–347.
Kaderali, L. and Schliep, A. (2002) Selecting signature oligonucleotides to identify organisms using DNA arrays. Bioinformatics, 18, 1340–1349.
Kane, M.D., Jatkoe, T.A., Strumpf, C.R., Lu, J., Thomas, J.D., Madore, S.J. (2000) Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays. Nucleic Acids Res., 28, 4552–4557.
Li, F. and Stormo, G. (2001) Selection of optimal DNA oligos for gene expression arrays. Bioinformatics, 17, 1067–1076.
Nielsen, H.B., Wernersson, R., Knudsen, S. (2003) Design of oligonucleotides for microarrays and perspectives for design of multi-transcriptome arrays. Nucleic Acids Res., 31, 3491–3496.
Raddatz, G., Dehio, M., Meyer, T.F., Dehio, C. (2001) PrimerArray: genome-scale primer design for DNA-microarray construction. Bioinformatics, 17, 98–99.
Relogio, A., Schwager, C., Richter, A., Ansorge, W., Valcárcel, J. (2002) Optimization of oligonucleotide-based DNA microarrays. Nucleic Acids Res., 30, e51.
Reymond, N., Charles, H., Duret, L., Calevro, F., Beslon, G., Fayard, J.-M. (2004) ROSO: optimizing oligonucleotide probes for microarrays. Bioinformatics, 20, 271–273.
Rouillard, J.-M., Herbert, C.J., Zuker, M. (2002) OligoArray: genome-scale oligonucleotide design for microarrays. Bioinformatics, 18, 486–487.
Rouillard, J.-M., Zuker, M., Gulari, E. (2003) OligoArray 2.0: design of oligonucleotide probes for DNA microarrays using a thermodynamic approach. Nucleic Acids Res., 31, 3057–3062.
SantaLucia, J. (1998) A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl Acad. Sci. USA, 95, 1460–1465.
Wang, X. and Seed, B. (2003) Selection of oligonucleotide probes for protein coding sequences. Bioinformatics, 19, 796–802.
Xu, D., Li, G., Wu, L., Zhou, J., Xu, Y. (2002) PRIMEGENS: robust and efficient design of gene-specific probes for microarray analysis. Bioinformatics, 18, 1432–1437.