Bioinformatics Advance Access published online on September 14, 2007

Bioinformatics, doi:10.1093/bioinformatics/btm463
All versions of this article: 23/21/2888 (most recent); btm463v1

© The Author (2007). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Graph based Consensus Clustering for Class Discovery from Gene Expression Data

Zhiwen Yu (a), Hau-San Wong (a) and Hongqiang Wang (a)

(a) Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong

To whom correspondence should be addressed. Mr. Zhiwen Yu, E-mail: yuzhiwen{at}cs.cityu.edu.hk


Abstract

Motivation: Consensus clustering, also known as cluster ensemble, is an important technique for microarray data analysis and is particularly useful for class discovery from microarray data. Compared with traditional clustering algorithms, consensus clustering approaches integrate multiple partitions from different cluster solutions to improve the robustness, stability, scalability and parallelization of clustering. With consensus clustering, one can discover the underlying classes of the samples in gene expression data.
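The idea of integrating multiple partitions can be illustrated with a minimal Python sketch. It is not the authors' GCC algorithm: it assumes a simple co-association matrix (the fraction of partitions in which two samples share a cluster) and takes connected components of the thresholded similarity graph as the consensus. The function names `co_association` and `consensus_labels`, the threshold of 0.5 and the toy partitions are all hypothetical choices made for illustration.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of samples shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_labels(partitions, threshold=0.5):
    """Connected components of the graph linking samples whose
    co-association exceeds `threshold` (an illustrative stand-in,
    not the graph partitioning used in the paper)."""
    matrix = co_association(partitions)
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three toy partitions of six samples; the second is a relabeling of the
# first, and the third places sample 2 in the "wrong" cluster.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
print(consensus_labels(partitions))
```

Because the co-association matrix depends only on whether two samples share a label, arbitrary relabelings between partitions do not matter, and a single disagreeing partition is outvoted by the other two.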

Results: In addition to exploring a graph based consensus clustering algorithm to estimate the underlying classes of the samples in microarray data, we design a new validation index to determine the number of classes in microarray data. To our knowledge, this is the first time that graph based consensus clustering has been applied to class discovery for microarray data. Given a pre-specified maximum number of classes (denoted as Kmax in this paper), our algorithm can discover the true number of classes for the samples in microarray data according to a new cluster validation index called the Modified Rand Index. Experiments on gene expression data indicate that our new algorithm can (i) outperform most of the existing algorithms, (ii) identify the number of classes correctly in real cancer datasets, and (iii) discover the classes of samples with biological meaning.
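The abstract does not define the Modified Rand Index, so the sketch below uses the standard Rand index as a stand-in to show how a validation index could select the number of classes among candidates up to Kmax. The selection rule (highest mean agreement between a consensus partition and its base partitions), the function names `rand_index` and `pick_k`, and the toy data are assumptions for illustration only, not the authors' procedure.

```python
from itertools import combinations


def rand_index(a, b):
    """Standard Rand index: fraction of sample pairs on which two
    partitions agree (both together or both apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def pick_k(candidates):
    """`candidates` maps each K in 2..Kmax to a tuple
    (consensus partition, list of base partitions) computed elsewhere.
    Returns the K whose consensus agrees best, on average, with its
    base partitions (a hypothetical selection rule)."""
    def score(k):
        consensus, base = candidates[k]
        return sum(rand_index(consensus, p) for p in base) / len(base)
    return max(candidates, key=score)


# Toy input with Kmax = 3 and four samples.
candidates = {
    2: ([0, 0, 1, 1], [[0, 0, 1, 1], [1, 1, 0, 0]]),
    3: ([0, 1, 2, 2], [[0, 0, 1, 1], [0, 1, 2, 2]]),
}
print(pick_k(candidates))
```

In this toy input the base partitions for K = 2 agree perfectly with the consensus, while those for K = 3 do not, so K = 2 is selected.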

Availability: Matlab source code for the graph based consensus clustering algorithm (GCC) is available upon request from Zhiwen Yu.

Contact: cshswong{at}cityu.edu.hk and yuzhiwen{at}cs.cityu.edu.hk

Associate Editor: Prof. David Rocke


Received on April 11, 2007; revised on August 15, 2007; accepted on September 8, 2007
