Bioinformatics Vol. 19 no. 8 2003
Pages 973-980
© 2003 Oxford University Press
Fuzzy C-means method for clustering microarray data
Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS-INSERM-ULP, BP 10142, 67404 Illkirch Cedex, France
Received on August 14, 2002; revised on November 14, 2002; accepted on January 3, 2003
Motivation: Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene on the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes.
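For reference, the abstract does not spell out the FCM formulation. The block below gives the standard textbook form, with assumed notation: x_i is the expression profile of gene i, c_k the centroid of cluster k, u_ik the membership of gene i in cluster k, and m > 1 the fuzziness parameter. It is a sketch of the general method and may differ in detail from the variant used in the paper.

```latex
% Objective minimized by standard FCM, subject to \sum_{k=1}^{C} u_{ik} = 1 for every gene i
J_m = \sum_{i=1}^{N} \sum_{k=1}^{C} u_{ik}^{m} \, \lVert x_i - c_k \rVert^{2}

% Alternating updates of centroids and memberships
c_k = \frac{\sum_{i=1}^{N} u_{ik}^{m} \, x_i}{\sum_{i=1}^{N} u_{ik}^{m}},
\qquad
u_{ik} = \left[ \sum_{j=1}^{C}
  \left( \frac{\lVert x_i - c_k \rVert}{\lVert x_i - c_j \rVert} \right)^{\frac{2}{m-1}}
\right]^{-1}
```

As m approaches 1 the memberships tend toward 0 or 1, which recovers a hard, K-means-like partition. Larger values of m make the memberships progressively more uniform across clusters.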
Results: A major problem in applying the FCM method for clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes which are tightly associated with a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster.
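To make the roles of m and the membership threshold concrete, here is a minimal NumPy sketch of standard FCM followed by threshold-based gene selection. It is not the authors' Matlab code. The function names `fuzzy_c_means` and `select_core_genes`, the default threshold of 0.5, and the Euclidean distance are illustrative assumptions, and the empirical procedure for choosing m is not reproduced here.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard FCM on X (n_genes x n_conditions); returns (centroids, U).

    Hypothetical helper for illustration; m = 2.0 is only the commonly
    used default, which may not suit a given data set.
    """
    if m <= 1.0:
        raise ValueError("fuzziness parameter m must be > 1")
    rng = np.random.default_rng(seed)
    # Random initial membership matrix whose rows sum to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centroids are membership-weighted means of the profiles.
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance of every gene to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        d2 = np.fmax(d2, 1e-12)  # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centroids, U

def select_core_genes(U, threshold=0.5):
    """Assign each gene to its top cluster; flag genes whose top
    membership reaches the (assumed) threshold."""
    labels = U.argmax(axis=1)
    keep = U.max(axis=1) >= threshold
    return labels, keep
```

Calling `fuzzy_c_means(X, c=6, m=1.5)` and then `select_core_genes(U, 0.7)` would keep only genes with a membership of at least 0.7 in their best cluster. The values 6, 1.5 and 0.7 are placeholders, not recommendations from the paper.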
Availability: Supplementary text and Matlab functions are available at http://www-igbmc.u-strasbg.fr/fcm/
Contact: doulaye{at}titus.u-strasbg.fr
* To whom correspondence should be addressed.