Bioinformatics Advance Access published online on February 24, 2005

Bioinformatics, doi:10.1093/bioinformatics/bti346
All versions of this article: 21/10/2309 (most recent); bti346v1

© The Author (2005). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
Received December 23, 2004
Revised January 27, 2005
Accepted February 19, 2005

Article

Self-organizing and self-correcting classifications of biological data

George M. Garrity 1* and Timothy G. Lilburn 2

1 Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI 48824, USA
2 Science Information Systems, American Type Culture Collection, Manassas, VA 20110, USA

* To whom correspondence should be addressed.
George M. Garrity, E-mail: garrity{at}msu.edu


Abstract

Motivation: Rapid, automated means of organizing biological data are required if we hope to keep abreast of the flood of data emanating from sequencing, microarray, and similar high-throughput analyses. Faced with the need to validate the annotation of thousands of sequences and to generate biologically meaningful classifications based on the sequence data, we turned to statistical methods in order to automate these processes.

Results: An algorithm for automated classification based on evolutionary distance data was written in S. The algorithm was tested on a data set of 1,433 small subunit ribosomal RNA sequences and was able to classify the sequences according to an extant scheme, use statistical measurements of group membership to detect sequences that were misclassified within this scheme, and produce a new classification. The use of the algorithm to address problems in prokaryotic taxonomy is discussed.
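
Illustration: the abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of using a distance-based measure of group membership to flag possibly misclassified items. It is written in Python rather than S, and it is not the authors' implementation. The function name `flag_misclassified`, the mean-distance criterion, and the toy distance matrix and labels are all assumptions made up for this example.

```python
from collections import defaultdict

# Toy symmetric distance matrix for five hypothetical sequences.
# These values are invented for illustration; they are not data from the paper.
TOY_DIST = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.9, 0.9],
    [0.2, 0.1, 0.0, 0.8, 0.9],
    [0.9, 0.9, 0.8, 0.0, 0.1],
    [0.8, 0.9, 0.9, 0.1, 0.0],
]
# Hypothetical existing classification; item 2 is deliberately mislabeled.
TOY_LABELS = ["A", "A", "B", "B", "B"]


def flag_misclassified(dist, labels):
    """Return (index, current_label, nearest_label) for every item whose mean
    distance to some other group is strictly smaller than its mean distance
    to the remaining members of its own group."""
    groups = defaultdict(list)
    for i, lab in enumerate(labels):
        groups[lab].append(i)

    flagged = []
    for i, own in enumerate(labels):
        mean_to = {}
        for lab, members in groups.items():
            others = [j for j in members if j != i]
            if others:  # a group whose only member is i has no mean distance
                mean_to[lab] = sum(dist[i][j] for j in others) / len(others)
        if not mean_to:
            continue
        nearest = min(mean_to, key=mean_to.get)
        own_mean = mean_to.get(own, float("inf"))
        if nearest != own and mean_to[nearest] < own_mean:
            flagged.append((i, own, nearest))
    return flagged


if __name__ == "__main__":
    for index, current, suggested in flag_misclassified(TOY_DIST, TOY_LABELS):
        print(f"item {index}: labeled {current}, closer on average to {suggested}")
```

On the toy data, only item 2 is flagged, since it lies much closer to the members of group A than to the other members of group B. A real analysis would presumably use a more careful statistical test of group membership than a simple comparison of mean distances.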

Availability: S-Plus is available from Insightful, Inc. An S-Plus implementation of the algorithm and the associated data are available at http://taxoweb.mmg.msu.edu/datasets.

