Bioinformatics Vol. 18 no. 3 2002
Pages 413-422
© 2002 Oxford University Press

A mixture model-based approach to the clustering of microarray expression data

G. J. McLachlan, R. W. Bean and D. Peel

Department of Mathematics, University of Queensland, Brisbane, Queensland 4072, Australia

Received on August 30, 2001; revised on October 26, 2001; accepted on November 2, 2001

Motivation: This paper introduces the software EMMIX-GENE, which has been developed specifically for a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. This is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is to first select a subset of the genes relevant for the clustering of the tissue samples, by fitting mixtures of t-distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. Imposing a threshold on the likelihood ratio statistic, in conjunction with a threshold on the size of a cluster, allows a relevant set of genes to be selected. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, so mixtures of factor analyzers are used to reduce effectively the dimension of the feature space of genes.
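The gene-selection step described above can be illustrated with a minimal sketch. It is not the EMMIX-GENE implementation: for brevity it assumes normal rather than t components, uses a simple median-split start for the EM algorithm, and all function names and threshold parameters (stat_threshold, min_size) are hypothetical. For each gene it computes the likelihood ratio statistic for one versus two components and keeps the gene only if the statistic and the smaller cluster size both exceed their thresholds.

```python
import math
import random


def _log_normal(x, mu, var):
    """Log density of N(mu, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def loglik_one(xs):
    """Maximised log-likelihood of a single normal component."""
    n = len(xs)
    mu = sum(xs) / n
    var = max(sum((x - mu) ** 2 for x in xs) / n, 1e-12)
    return sum(_log_normal(x, mu, var) for x in xs)


def _e_step(xs, pi, mu, var):
    """Return (responsibilities, log-likelihood) for the current parameters."""
    resp, loglik = [], 0.0
    for x in xs:
        logs = [math.log(pi[k]) + _log_normal(x, mu[k], var[k]) for k in range(2)]
        top = max(logs)
        weights = [math.exp(v - top) for v in logs]
        total = sum(weights)
        resp.append([w / total for w in weights])
        loglik += top + math.log(total)
    return resp, loglik


def fit_two(xs, n_iter=200):
    """EM for a two-component normal mixture on one gene (needs len(xs) >= 2).

    Returns (log-likelihood, size of the smaller hard-assigned cluster).
    """
    n = len(xs)
    ordered = sorted(xs)
    half = n // 2
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, 1e-12)
    floor = 0.01 * var_all  # assumption: variance floor to avoid degenerate fits
    pi = [0.5, 0.5]
    mu = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    var = [var_all, var_all]
    for _ in range(n_iter):
        resp = _e_step(xs, pi, mu, var)[0]
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-8)
            pi[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, floor
            )
    resp, loglik = _e_step(xs, pi, mu, var)
    sizes = [0, 0]
    for r in resp:
        sizes[0 if r[0] >= r[1] else 1] += 1
    return loglik, min(sizes)


def lr_statistic(xs):
    """Return (-2 log lambda for one versus two components, smaller cluster size)."""
    ll2, smallest = fit_two(xs)
    return 2.0 * (ll2 - loglik_one(xs)), smallest


def select_genes(expr, stat_threshold, min_size):
    """Keep genes passing both thresholds, largest statistic first."""
    scored = []
    for gene, xs in expr.items():
        stat, smallest = lr_statistic(xs)
        if stat > stat_threshold and smallest >= min_size:
            scored.append((stat, gene))
    return [gene for _, gene in sorted(scored, reverse=True)]


# Toy data: one clearly bimodal gene and one unimodal gene over 40 tissues.
random.seed(0)
expr = {
    "bimodal": [random.gauss(-2, 0.5) for _ in range(20)]
    + [random.gauss(2, 0.5) for _ in range(20)],
    "unimodal": [random.gauss(0, 1) for _ in range(40)],
}
print(select_genes(expr, stat_threshold=20.0, min_size=8))
```

The threshold values in the toy call are arbitrary illustrations, not values recommended by the paper. Swapping in t components would make the ranking less sensitive to outlying expression values, which is the reason the paper gives preference to t-mixtures at this stage.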

Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes can be selected that reveal interesting clusterings of the tissues, consistent either with the external classification of the tissues or with background and biological knowledge of these sets.

Availability: EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/

Contact: gjm{at}maths.uq.edu.au
