Bioinformatics Advance Access published online on February 24, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti346
1 Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI 48824, USA
* To whom correspondence should be addressed.
Motivation: Rapid, automated means of organizing biological data are required if we hope to keep abreast of the flood of data emanating from sequencing, microarray, and similar high-throughput analyses. Faced with the need to validate the annotation of thousands of sequences and to generate biologically meaningful classifications based on the sequence data, we turned to statistical methods in order to automate these processes.
Results: An algorithm for automated classification based on evolutionary distance data was written in S. The algorithm was tested on a data set of 1,433 small subunit ribosomal RNA sequences and was able to classify the sequences according to an extant scheme, use statistical measurements of group membership to detect sequences that were misclassified within this scheme, and produce a new classification. The use of the algorithm to address problems in prokaryotic taxonomy is discussed.
Availability: S-Plus is available from Insightful, Inc. An S-Plus implementation of the algorithm and the associated data are available at http://taxoweb.mmg.msu.edu/datasets.
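The abstract does not specify the algorithm, so the following is only an illustrative sketch of the general idea of using distance data and an existing labeling to flag possible misclassifications. It is not the authors' S implementation. The function names, the toy distance matrix, and the choice of mean within-group distance as the membership statistic are all assumptions made for this example.

```python
from collections import defaultdict

def mean_distance_to_groups(dist, labels, i):
    """Mean distance from item i to the members of each group, excluding i itself."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for j, label in enumerate(labels):
        if j == i:
            continue
        sums[label] += dist[i][j]
        counts[label] += 1
    return {group: sums[group] / counts[group] for group in sums}

def flag_misclassified(dist, labels):
    """Return {index: (assigned, nearest)} for items that are, on average,
    strictly closer to another group than to their assigned group.

    Hypothetical membership statistic: mean pairwise distance. The paper's
    actual statistical measurements are not described in the abstract.
    """
    flagged = {}
    for i, assigned in enumerate(labels):
        means = mean_distance_to_groups(dist, labels, i)
        nearest = min(means, key=means.get)
        own = means.get(assigned, float("inf"))  # singleton groups have no peers
        if nearest != assigned and means[nearest] < own:
            flagged[i] = (assigned, nearest)
    return flagged

# Toy symmetric distance matrix (made-up values, not real rRNA distances).
# Items 0-2 cluster together, items 3-4 cluster together, but item 2
# carries the label of the second cluster.
dist = [
    [0.00, 0.10, 0.20, 0.90, 0.80],
    [0.10, 0.00, 0.15, 0.85, 0.90],
    [0.20, 0.15, 0.00, 0.80, 0.95],
    [0.90, 0.85, 0.80, 0.00, 0.10],
    [0.80, 0.90, 0.95, 0.10, 0.00],
]
labels = ["A", "A", "B", "B", "B"]

flagged = flag_misclassified(dist, labels)
print(flagged)
```

In this toy case only item 2 is flagged, with a suggested move from group "B" to group "A". Applying such suggestions and repeating until no item is flagged is one simple way a classification could be made self-correcting, though whether the published algorithm iterates this way is not stated in the abstract.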
Received December 23, 2004
Revised January 27, 2005
Accepted February 19, 2005
Article
Self-organizing and self-correcting classifications of biological data
2 Science Information Systems, American Type Culture Collection, Manassas, VA 20110, USA
George M. Garrity, E-mail: garrity{at}msu.edu