Bioinformatics Advance Access originally published online on April 21, 2005
Bioinformatics 2005 21(14):3122-3130; doi:10.1093/bioinformatics/bti452
Classification of oligonucleotide fingerprints: application for microbial community and gene expression analyses
1Department of Statistics, University of California Riverside, CA 92521, USA
2Department of Plant Pathology, University of California Riverside, CA 92521, USA
3Central Laboratories, Israeli Ministry of Health Yaakov Eliav 9, 94467 Jerusalem, Israel
*To whom correspondence should be addressed.
Motivation: Oligonucleotide fingerprinting of ribosomal RNA genes (OFRG) is a procedure that sorts rRNA gene (rDNA) clones into taxonomic groups through a series of hybridization experiments. The hybridization signals are classified into three discrete values, 0, 1 and N, where 0 and 1 specify negative and positive hybridization events, respectively, and N designates an uncertain assignment. This study examined several approaches for assigning these values: Bayesian classification assuming normally, exponentially or gamma distributed signal data, and tree-based classification. The classified data from each approach were then clustered using the unweighted pair group method with arithmetic mean (UPGMA).
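As a rough illustration of the idea, the sketch below assigns 0, 1 or N to a single hybridization signal from the posterior probability of a positive event under two normal densities. It is not the procedure used in the study: the class means and standard deviations, the equal prior, the 0.9 posterior cutoff and the function names are all assumptions made for the example.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_positive(x, neg=(0.0, 1.0), pos=(4.0, 1.0), prior_pos=0.5):
    """Posterior probability that signal x is a positive hybridization.

    neg and pos are (mean, sd) pairs for the two classes; the defaults
    are hypothetical values, not estimates from the paper.
    """
    like_pos = normal_pdf(x, *pos) * prior_pos
    like_neg = normal_pdf(x, *neg) * (1.0 - prior_pos)
    return like_pos / (like_pos + like_neg)

def classify(x, cutoff=0.9, **params):
    """Return '1', '0' or 'N' (uncertain) for one signal value."""
    p = posterior_positive(x, **params)
    if p >= cutoff:
        return "1"
    if p <= 1.0 - cutoff:
        return "0"
    return "N"  # neither class is sufficiently probable

# One clone probed with four oligonucleotides (made-up intensities).
fingerprint = "".join(classify(s) for s in [0.2, 3.9, 2.0, 5.1])
print(fingerprint)
```

A signal midway between the two assumed class means has a posterior of 0.5 and is therefore labelled N; raising the cutoff widens the range of signals treated as uncertain.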
Results: The performance of each classification/clustering procedure was compared with results from known reference data. The comparisons indicated that Bayesian classification with normal densities, followed by tree clustering, outperformed all other approaches. The paper also discusses how this Bayesian approach may be useful for the analysis of gene expression data.
Contact: james.press{at}ucr.edu
Received on February 3, 2005; revised on April 13, 2005; accepted on April 13, 2005