Bioinformatics Vol. 18 no. 2 2002
Pages 244-250
© 2002 Oxford University Press
Identification of characteristic oligonucleotides in the bacterial 16S ribosomal RNA sequence dataset
1 Department of Biology and Biochemistry,
University of Houston, Houston, TX 77204-5001, USA
2 Department of Chemical Engineering,
University of Houston, Houston, TX 77204-4004, USA
Received on June 22, 2001; revised on August 12, 2001; accepted on October 5, 2001
Motivation: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. It is not clear, however, to what extent such signature sequences exist in the 16S rRNA dataset. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism, or serve as the basis for novel approaches to bacterial identification.
Results: A computer-based algorithm was developed that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping. A measure of signature quality, Qs, was formulated and calculated for every individual oligonucleotide sequence in the size range of 5–11 nucleotides, and for 15mers, with reference to each cluster and subcluster in a 929-organism representative phylogenetic tree. The perfect signature sequences were then compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.
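The abstract does not give the formula for Qs, so the following is only a minimal sketch of the general idea of scoring an oligonucleotide against a bacterial grouping, not the authors' algorithm. It assumes a hypothetical score defined as the fraction of in-group sequences containing the oligonucleotide minus the fraction of out-group sequences containing it, and it assumes that a "perfect" signature is one present in every group member and in no other sequence. The function names, the toy sequences, and the group labels are all illustrative.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of distinct oligonucleotides of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_scores(sequences, group, k):
    """Score every k-mer that occurs in at least one group member.

    sequences: dict mapping organism name -> 16S rRNA sequence
    group:     set of organism names forming the cluster of interest
    Returns a dict mapping k-mer -> (in_fraction, out_fraction, score),
    where score = in_fraction - out_fraction is a hypothetical stand-in
    for the paper's Qs (whose definition is not given in the abstract).
    """
    in_count = defaultdict(int)
    out_count = defaultdict(int)
    n_in = 0
    for name, seq in sequences.items():
        is_member = name in group
        n_in += is_member
        target = in_count if is_member else out_count
        for kmer in kmers(seq, k):  # each k-mer counted once per sequence
            target[kmer] += 1
    n_out = len(sequences) - n_in

    scores = {}
    for kmer, hits in in_count.items():
        f_in = hits / n_in
        f_out = out_count.get(kmer, 0) / n_out if n_out else 0.0
        scores[kmer] = (f_in, f_out, f_in - f_out)
    return scores


def perfect_signatures(scores):
    """k-mers found in every group member and in no out-group sequence
    (an assumed reading of 'perfect signature')."""
    return sorted(k for k, (f_in, f_out, _) in scores.items()
                  if f_in == 1.0 and f_out == 0.0)


# Toy data, invented for illustration only.
toy = {
    "a1": "ACGUACGGA",
    "a2": "UUACGGAC",
    "b1": "ACGUUUUC",
    "b2": "GGGACGUA",
}
toy_scores = signature_scores(toy, group={"a1", "a2"}, k=5)
print(perfect_signatures(toy_scores))
```

In practice the same scoring would be repeated for each node of the phylogenetic tree and each oligonucleotide length, and candidate signatures would then be checked against a larger sequence set to estimate how often they produce false positives.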
Availability: The programs described herein are available at http://prion.bchs.uh.edu/16S_signatures/programs/. A preliminary database of signature sequences identified in this paper is available at: http://prion.bchs.uh.edu/16S_signatures/.
Contact: zzhang{at}bayou.uh.edu; fox{at}uh.edu
* To whom correspondence should be addressed.