Bioinformatics Advance Access originally published online on April 1, 2004
All versions of this article: 20/13/2101 (most recent); bth210v1
Bioinformatics 20(13) © Oxford University Press 2004; all rights reserved.

Efficient selection of unique and popular oligos for large EST databases†

Jie Zheng 1, Timothy J. Close 2, Tao Jiang 1 and Stefano Lonardi 1,*

1 Department of Computer Science and Engineering and 2 Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA

Received on September 29, 2003; revised on January 2, 2004; accepted on February 10, 2004
Advance Access Publication April 1, 2004

Motivation: Expressed sequence tag (EST) databases have grown exponentially in recent years and now represent the largest collection of genetic sequences. These databases are an important source of information for designing gene-specific oligonucleotides (or simply, oligos) that can be used in PCR primer design, microarray experiments and genomic library screening.

Results: In this paper, we study two complementary problems concerning the selection of short oligos, e.g. 20–50 bases, from a large database of tens of thousands of ESTs: (i) selection of oligos each of which appears (exactly) in one unigene but does not appear (exactly or approximately) in any other unigene and (ii) selection of oligos that appear (exactly or approximately) in many unigenes. The first problem is called the unique oligo problem and has applications in PCR primer and microarray probe design, and in library screening for gene-rich clones. The second is called the popular oligo problem and is also useful in screening genomic libraries. We present an efficient algorithm to identify all unique oligos in the unigenes and an efficient heuristic algorithm to enumerate the most popular oligos. By taking into account the distribution of the frequencies of the words in the unigene database, the algorithms have been carefully engineered to achieve remarkable running times on regular PCs. Each algorithm takes only a couple of hours (on a machine with a 1.2 GHz CPU and 1 GB of RAM) to run on a 28 Mb dataset of barley unigenes from the HARVEST database. We present simulation results on synthetic data and a preliminary analysis of the barley unigene database.

Availability: Available on request from the authors.

Contact: stelo@cs.ucr.edu

* To whom correspondence should be addressed.

† A preliminary version of this work was presented at the Symposium on Combinatorial Pattern Matching, Morelia, Mexico, and included in its Proceedings, pp. 273–283, LNCS 2676, Springer (2003).
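
To make the two problem statements in the Results paragraph concrete, the following is a naive brute-force sketch in Python. It is not the authors' algorithm, which is engineered for speed on large databases; it only illustrates the definitions on toy data. Several points are assumptions made for illustration: "approximately" is interpreted as Hamming distance at most `max_mismatches`, candidate popular oligos are restricted to substrings that occur in the input, and all function names are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def windows(seq, length):
    """Yield every substring of the given length."""
    for i in range(len(seq) - length + 1):
        yield seq[i:i + length]


def occurrences(unigenes, length):
    """Map each oligo to the set of unigene indices that contain it exactly."""
    occ = defaultdict(set)
    for idx, seq in enumerate(unigenes):
        for w in windows(seq, length):
            occ[w].add(idx)
    return occ


def unique_oligos(unigenes, length, max_mismatches):
    """Oligos found exactly in one unigene and not found, even approximately
    (assumed: Hamming distance <= max_mismatches), in any other unigene.
    Returns a dict mapping each such oligo to the index of its unigene."""
    result = {}
    for oligo, owners in occurrences(unigenes, length).items():
        if len(owners) != 1:
            continue
        (owner,) = owners
        clash = any(
            hamming(oligo, w) <= max_mismatches
            for idx, seq in enumerate(unigenes) if idx != owner
            for w in windows(seq, length)
        )
        if not clash:
            result[oligo] = owner
    return result


def popular_oligos(unigenes, length, max_mismatches, top=5):
    """Rank oligos by the number of unigenes in which they appear exactly or
    approximately. Candidates are limited to oligos present in the input
    (an assumption of this sketch). Ties are broken alphabetically."""
    counts = {}
    for oligo in occurrences(unigenes, length):
        counts[oligo] = sum(
            any(hamming(oligo, w) <= max_mismatches for w in windows(seq, length))
            for seq in unigenes
        )
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
```

This sketch compares every candidate against every window of every unigene, so its running time grows quadratically with the database size; it is suitable only for small examples, which is precisely the cost that an efficient algorithm for these problems has to avoid.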






