

Bioinformatics Advance Access originally published online on October 27, 2005
Bioinformatics 2006 22(1):58-67; doi:10.1093/bioinformatics/bti746

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm

Thomas Grotkjær 1,*, Ole Winther 2, Birgitte Regenberg 1, Jens Nielsen 1 and Lars Kai Hansen 2

1 Center for Microbial Biotechnology, BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
2 Informatics and Mathematical Modelling, Building 321, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark

*To whom correspondence should be addressed.

Motivation: Hierarchical clustering and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools for the display and analysis of whole-genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods give results that depend on the initialization of the algorithm. It is therefore difficult to assess the significance of the results. We have developed a consensus clustering algorithm in which the final result is averaged over multiple clustering runs, giving a robust and reproducible clustering that is capable of capturing small signal variations. The algorithm preserves valuable properties of hierarchical clustering that are useful for visualization and interpretation of the results.
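
One way to read "averaged over multiple clustering runs" is as a co-occurrence matrix: the fraction of runs in which two items land in the same cluster. The sketch below is a minimal illustration under stated assumptions, not the authors' ClusterLustre code. The toy data, the small K-means routine, the number of runs, the use of 1 minus co-occurrence as a distance, and the average-linkage merge rule are all hypothetical choices made for the example.

```python
import random

def kmeans_labels(points, k, seed, n_iter=20):
    """Plain Lloyd-style K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random start: the source of run-to-run variation
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def cooccurrence(runs):
    """co[i][j] = fraction of runs in which items i and j share a cluster."""
    n = len(runs[0])
    counts = [[0] * n for _ in range(n)]
    for labels in runs:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[counts[i][j] / len(runs) for j in range(n)] for i in range(n)]

def average_linkage(co, n_clusters):
    """Agglomerative merging on the distance 1 - co (assumed choice of linkage)."""
    clusters = [[i] for i in range(len(co))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(1.0 - co[i][j] for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters

# Toy data (hypothetical): two well-separated groups, deliberately over-clustered with k=3.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
runs = [kmeans_labels(points, k=3, seed=s) for s in range(25)]
co = cooccurrence(runs)
print(average_linkage(co, n_clusters=2))
```

Because the co-occurrence matrix is built only from label vectors, the individual runs could come from any clustering method; the agglomerative step on top of it is what yields a tree-like, multi-scale view in this sketch.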

Results: We show for the first time that one can take advantage of multiple clustering runs in DNA microarray analysis by collecting recurring clustering patterns in a co-occurrence matrix. Consensus clustering obtained from multiple runs of Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset. The method is flexible, and consensus clusters can be found from different clustering algorithms. Thus, the algorithm can be used as a framework for testing quantitatively the homogeneity of different clustering algorithms. We compare the method with a number of state-of-the-art clustering methods and show that it is robust and gives low classification error rates for a realistic, simulated dataset. The algorithm is also demonstrated on real datasets, where more biologically meaningful transcriptional patterns can be found without conservative statistical or fold-change exclusion of data.
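
The abstract reports classification error rates on a simulated dataset but does not say how they are computed. One common convention, assumed here, is the fraction of items that remain misassigned under the best one-to-one matching between cluster labels and true classes. The function name and the toy label vectors below are hypothetical, and the brute-force search over matchings is only practical for a small number of clusters.

```python
from itertools import permutations

def classification_error(true_labels, cluster_labels):
    """Fraction of items misassigned under the best cluster-to-class matching."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    # Pad with None so surplus clusters can be left unmatched (they count as errors).
    targets = classes + [None] * max(0, len(clusters) - len(classes))
    best_correct = 0
    for perm in permutations(targets, len(clusters)):
        mapping = dict(zip(clusters, perm))
        correct = sum(
            1 for t, c in zip(true_labels, cluster_labels) if mapping[c] == t
        )
        best_correct = max(best_correct, correct)
    return 1.0 - best_correct / len(true_labels)

# Hypothetical labels: a known two-class truth, one imperfect run, one perfect partition.
truth = [0, 0, 0, 1, 1, 1]
single_run = [1, 1, 0, 0, 0, 0]   # one item of the first class is misplaced
relabelled = [5, 5, 5, 9, 9, 9]   # same partition as the truth, different label names

print(classification_error(truth, single_run))
print(classification_error(truth, relabelled))
```

Matching labels before counting errors matters because cluster labels are arbitrary: two runs can describe the same partition with different label names.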

Availability: Matlab source code for the clustering algorithm ClusterLustre and the simulated dataset for testing are available upon request from T.G. and O.W.

Contact: tg{at}biocentrum.dtu.dk and owi{at}imm.dtu.dk

Supplementary information: http://www.cmb.dtu.dk/


Received on February 11, 2005; revised on October 13, 2005; accepted on October 25, 2005





