


Bioinformatics Advance Access published online on May 3, 2006

Bioinformatics, doi:10.1093/bioinformatics/btl165

© The Author (2006). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
Received February 15, 2006
Revised April 10, 2006
Accepted April 26, 2006

Article

A mixture model with random-effects components for clustering correlated gene-expression profiles

S. K. Ng 1, G. J. McLachlan 2 *, K. Wang 3, L. Ben-Tovim Jones 4, and S.-W. Ng 5

1 Department of Mathematics, University of Queensland, Brisbane, QLD 4072, Australia
2 Department of Mathematics, University of Queensland, Brisbane, QLD 4072, Australia; Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia; ARC Centre for Complex Systems, University of Queensland, Brisbane, QLD 4072, Australia
3 ARC Centre for Complex Systems, University of Queensland, Brisbane, QLD 4072, Australia
4 Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
5 Laboratory of Gynecologic Oncology, Department of Obstetrics, Gynecology and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA

* To whom correspondence should be addressed.
G. J. McLachlan, E-mail: gjm{at}maths.uq.edu.au


   Abstract

Motivation: The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries, and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes.
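As a rough illustration of the kind of model described above, a normal mixture with random-effects components might be written as follows. This is only a sketch: the abstract does not give the model's notation, so every symbol here (the design matrices X, U, V, the fixed effects β_h, the gene-specific random effects b_hj, the cluster-level random effects c_h shared by the genes in a component, and the error term ε_hj) is an assumption introduced for illustration.

```latex
% Sketch only: all notation is assumed, not taken from the abstract.
% Conditional on gene j belonging to component h (h = 1, ..., g):
\[
  \mathbf{y}_j = X\boldsymbol{\beta}_h + U\mathbf{b}_{hj} + V\mathbf{c}_h
               + \boldsymbol{\varepsilon}_{hj},
\]
% with mutually independent, zero-mean normal random effects and errors:
\[
  \mathbf{b}_{hj} \sim N(\mathbf{0}, B_h), \qquad
  \mathbf{c}_h \sim N(\mathbf{0}, C_h), \qquad
  \boldsymbol{\varepsilon}_{hj} \sim N(\mathbf{0}, A_h).
\]
```

Under this assumed form, covariates such as time or experimental class would enter through X, and the shared term c_h is what would induce correlation between the profiles of genes in the same cluster.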

Results: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm, for which the E (expectation) and M (maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real data sets, representing typical microarray experimental designs, covering time-course, repeated-measurement, and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic data set is also considered.
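To make the idea of closed-form E and M steps concrete, the following is a minimal Python sketch of EM for an ordinary normal mixture with diagonal covariances. It is not the random-effects model of this article and not the EMMIX-WIRE program; the function name `em_normal_mixture`, its parameters, and the random-row initialisation are all assumptions chosen for illustration.

```python
import numpy as np

def em_normal_mixture(y, g, n_iter=100, seed=0):
    """Fit a g-component normal mixture with diagonal covariances by EM.

    y is an (n, p) array with one row per gene profile. This is a plain
    mixture (no random effects, no covariates), used only to illustrate
    closed-form E and M steps.
    """
    rng = np.random.default_rng(seed)
    n, p = y.shape
    # Initialise: random rows as means, pooled variances, equal proportions.
    mu = y[rng.choice(n, size=g, replace=False)].astype(float)
    var = np.tile(y.var(axis=0) + 1e-6, (g, 1))
    pi = np.full(g, 1.0 / g)
    for _ in range(n_iter):
        # E-step: posterior probabilities of component membership.
        diff = y[:, None, :] - mu[None, :, :]                  # (n, g, p)
        log_dens = -0.5 * (np.sum(diff**2 / var, axis=2)
                           + np.sum(np.log(2 * np.pi * var), axis=1))
        log_w = np.log(pi) + log_dens                          # (n, g)
        log_norm = np.logaddexp.reduce(log_w, axis=1, keepdims=True)
        tau = np.exp(log_w - log_norm)
        # M-step: weighted proportions, means and variances (closed form).
        nk = tau.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (tau.T @ y) / nk[:, None]
        diff = y[:, None, :] - mu[None, :, :]
        var = np.einsum("ng,ngp->gp", tau, diff**2) / nk[:, None] + 1e-6
    return pi, mu, var, tau

# Usage on simulated profiles: two groups of 50 "genes" over 3 conditions.
rng = np.random.default_rng(1)
y = np.vstack([rng.normal(0.0, 1.0, size=(50, 3)),
               rng.normal(10.0, 1.0, size=(50, 3))])
pi, mu, var, tau = em_normal_mixture(y, g=2)
labels = tau.argmax(axis=1)  # hard cluster assignment for each profile
```

Each iteration uses only explicit formulas, so no Monte Carlo approximation is needed. In this simplified setting the rows of `tau` play the role of the posterior probabilities of cluster membership on which a model-based clustering is built.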

Availability: A Fortran program called EMMIX-WIRE (EM-based MIXture analysis WIth Random Effects) is available on request from the corresponding author.

Supplementary Information: http://www.maths.uq.edu.au/~gjm/bioinf0602_supp.pdf.


Associate Editor: Martin Bishop


This article has been cited by other articles:

F. Achcar, J.-M. Camadro, and D. Mestivier
AutoClass@IJM: a powerful tool for Bayesian classification of heterogeneous data in biology
Nucleic Acids Res., July 1, 2009; 37(suppl_2): W63 - W67.

C.-T. Li, Y. Yuan, and R. Wilson
An unsupervised conditional random fields approach for clustering gene expression time series
Bioinformatics, November 1, 2008; 24(21): 2467 - 2473.

B.-R. Kim, L. Zhang, A. Berg, J. Fan, and R. Wu
A Computational Approach to the Functional Clustering of Periodic Gene-Expression Profiles
Genetics, October 1, 2008; 180(2): 821 - 834.

Y. Lu, X. He, and S. Zhong
Cross-species microarray analysis with the OSCAR system suggests an INSR->Pax6->NQO1 neuro-protective pathway in aging and Alzheimer's disease
Nucleic Acids Res., July 13, 2007; 35(suppl_2): W105 - W114.

Y. Lai, B.-l. Adam, R. Podolsky, and J.-X. She
A mixture model approach to the tests of concordance and discordance between two large-scale experiments with two-sample groups
Bioinformatics, May 15, 2007; 23(10): 1243 - 1250.
