Bioinformatics 2007 23(4):408-413; doi:10.1093/bioinformatics/btl133

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Accurate automated clustering of two-dimensional data for single-nucleotide polymorphism genotyping by a combination of clustering methods: evaluation by large-scale real data

Shuichi Takitoh 1,4, Shogo Fujii 1,4, Yoichi Mase 1,4, Junichi Takasaki 1, Toshimasa Yamazaki 1, Yozo Ohnishi 2,5, Masao Yanagisawa 4, Yusuke Nakamura 3,5 and Naoyuki Kamatani 1,6,*

1 Laboratory for Statistical Analysis, Waseda University, Shinjuku, Tokyo, Japan
2 Laboratory of SNP Analysis, Waseda University, Shinjuku, Tokyo, Japan
3 Laboratory of Pharmacogenetics, RIKEN SNP Research Center, Waseda University, Shinjuku, Tokyo, Japan
4 Department of Computer Science, Waseda University, Shinjuku, Tokyo, Japan
5 Institute of Medical Science, University of Tokyo, Tokyo, Japan
6 Institute of Rheumatology, Tokyo Women's Medical University, Tokyo, Japan

*To whom correspondence should be addressed.


Abstract

Motivation: The Invader assay is a fluorescence-based high-throughput genotyping technology. If the output data from the Invader assay were classified automatically, then genotypes for individuals would be determined efficiently. However, existing classification methods do not necessarily yield results with the same accuracy as can be achieved by technicians. Our clustering algorithm, Genocluster, is intended to increase the proportion of data points that need not be manually corrected by technicians.

Results: Genocluster worked well even when the number of clusters was unknown in advance and when there were only a few points in a cluster. The use of Genocluster enabled us to achieve an acceptance rate (proportion of assay results that did not need to be corrected by expert technicians) of 84.4% and a proportion of uncorrected points of 95.8%, as determined using the data from over 31 million points.
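The two figures reported above measure different things: the acceptance rate is counted per assay, while the proportion of uncorrected points is counted per data point. The following minimal Python sketch illustrates the distinction on toy data. It is not the authors' code. The data layout, the function names and the rule that an assay counts as "accepted" only when none of its points was changed by a technician are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout (an assumption, not from the paper):
# assay id -> list of (automated call, technician's final call), one pair per point.
Assays = Dict[str, List[Tuple[str, str]]]


def acceptance_rate(assays: Assays) -> float:
    """Fraction of assays in which no point was changed by a technician.

    Assumes an assay "needs correction" as soon as one point differs.
    """
    accepted = sum(
        all(auto == final for auto, final in points)
        for points in assays.values()
    )
    return accepted / len(assays)


def uncorrected_point_proportion(assays: Assays) -> float:
    """Fraction of all points, pooled over assays, left unchanged."""
    total = sum(len(points) for points in assays.values())
    unchanged = sum(
        auto == final
        for points in assays.values()
        for auto, final in points
    )
    return unchanged / total


# Toy data: the second assay has one point corrected from AG to AA.
toy: Assays = {
    "snp_a": [("AA", "AA"), ("AG", "AG"), ("GG", "GG")],
    "snp_b": [("AA", "AA"), ("AG", "AA"), ("GG", "GG"), ("AG", "AG")],
}

print(acceptance_rate(toy))               # 1 of 2 assays accepted
print(uncorrected_point_proportion(toy))  # 6 of 7 points unchanged
```

Under these assumptions a single corrected point rejects its whole assay, which is consistent with the per-point figure (95.8%) being higher than the per-assay figure (84.4%).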

Availability: Information on obtaining the executable code, example data and an example analysis is available at http://www.genstat.net/genocluster

Contact: kamatani{at}ior.twmu.ac.jp

Associate Editor: Alfonso Valencia


Received on September 20, 2005; revised on March 30, 2006; accepted on March 31, 2006
