Bioinformatics Advance Access originally published online on June 9, 2006
Bioinformatics 2006 22(16):1988-1997; doi:10.1093/bioinformatics/btl284
Clustering microarray gene expression data using weighted Chinese restaurant process
Center for Statistical Genetics, Department of Biostatistics, School of Public Health, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109-2029, USA
Abstract
Motivation: Clustering microarray gene expression data is a powerful tool for elucidating co-regulatory relationships among genes. Many clustering techniques have been applied successfully, with promising results. However, substantial fluctuation in microarray data, uncertainty about the number of clusters and the complex regulatory mechanisms underlying biological systems make the clustering problem tremendously challenging.
Results: We devised an improved model-based Bayesian approach to cluster microarray gene expression data. Cluster assignment is carried out by an iterative weighted Chinese restaurant seating scheme, so that the optimal number of clusters is determined simultaneously with cluster assignment. A predictive updating technique is applied to improve the efficiency of the Gibbs sampler. An additional step during reassignment allows genes that display complex correlation relationships, such as time-shifted and/or inverted patterns, to be clustered together. Analysis of a real dataset showed that as many as 30% of the significant genes clustered in the same group display complex relationships with the consensus pattern of the cluster. Other notable features include automatic handling of missing data and quantitative measures of cluster strength and assignment confidence. Synthetic and real microarray gene expression datasets were analyzed to demonstrate the method's performance.
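The seating scheme can be illustrated with a minimal sketch. In the standard Chinese restaurant process, a customer joins an existing table k with probability proportional to its occupancy n_k and opens a new table with probability proportional to a concentration parameter alpha. The sketch below assumes that the "weighted" variant multiplies each of these prior terms by a likelihood of the gene's expression profile under the corresponding cluster. The abstract does not specify the likelihood model, so the log-likelihood values are supplied by the caller, and the function names, parameter names and default alpha are hypothetical rather than taken from the CRC program.

```python
import math
import random


def seating_probabilities(cluster_sizes, log_likelihoods,
                          log_likelihood_new, alpha=1.0):
    """Probabilities of seating one gene at each existing table or a new one.

    cluster_sizes      -- occupancy n_k of each existing cluster (gene removed)
    log_likelihoods    -- assumed log-likelihood of the gene under each cluster
    log_likelihood_new -- assumed log-likelihood of the gene in a new cluster
    alpha              -- CRP concentration parameter (illustrative default)
    The last entry of the returned list is the new-table probability.
    """
    if len(cluster_sizes) != len(log_likelihoods):
        raise ValueError("one log-likelihood is required per cluster")
    # Unnormalized log weights: log(n_k) + log L_k for existing tables,
    # log(alpha) + log L_new for a new table.
    log_weights = [math.log(n) + ll
                   for n, ll in zip(cluster_sizes, log_likelihoods)]
    log_weights.append(math.log(alpha) + log_likelihood_new)
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def sample_seat(probabilities, rng=random):
    """Draw a table index according to the seating probabilities."""
    indices = range(len(probabilities))
    return rng.choices(indices, weights=probabilities, k=1)[0]


if __name__ == "__main__":
    # Two clusters of sizes 3 and 1; equal likelihoods reduce to the plain CRP.
    probs = seating_probabilities([3, 1], [0.0, 0.0], 0.0, alpha=1.0)
    print(probs)
    print(sample_seat(probs, random.Random(0)))
```

In a Gibbs sweep of this kind, each gene would in turn be removed from its cluster, reseated with these probabilities, and any cluster left empty would be dropped, which is how the number of clusters can change during sampling.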
Availability: A computer program named Chinese restaurant cluster (CRC) has been developed based on this algorithm. The program can be downloaded at http://www.sph.umich.edu/csg/qin/CRC/
Contact: qin{at}umich.edu
Supplementary information: http://www.sph.umich.edu/csg/qin/CRC/
Associate Editor: John Quackenbush
Received on February 16, 2006; revised on April 20, 2006; accepted on May 31, 2006