Bioinformatics Advance Access published online on May 24, 2005

Bioinformatics, doi:10.1093/bioinformatics/bti517
All versions of this article: 21/15/3201 (most recent); bti517v1

© The Author (2005). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
Received March 24, 2005
Revised May 24, 2005
Accepted May 24, 2005

Review

Computational cluster validation in post-genomic data analysis

Julia Handl 1*, Joshua Knowles 1, and Douglas B. Kell 1

1 School of Chemistry, University of Manchester, Faraday Building, Sackville Street PO Box 88, Manchester M60 1QD, UK

* To whom correspondence should be addressed.
Julia Handl, E-mail: J.Handl@postgrad.manchester.ac.uk


Abstract

Motivation: The discovery of novel biological knowledge from the ab initio analysis of post-genomic data relies upon the use of unsupervised processing methods, in particular clustering techniques. Much recent research in bioinformatics has therefore focused on the transfer of clustering methods introduced in other scientific fields [26, 67, 74], and on the development of novel algorithms specifically designed to tackle the challenges posed by post-genomic data [33, 51]. The partitions returned by a clustering algorithm are commonly validated using visual inspection and concordance with prior biological knowledge -- whether the clusters actually correspond to real structure in the data is somewhat less frequently considered. Suitable computational cluster validation techniques are available in the general data-mining literature [15, 31, 36, 56], but have been given only a fraction of the same attention in bioinformatics [8, 9].

Results: This review paper aims to familiarize the reader with the battery of techniques available for the validation of clustering results, with a particular focus on their application to post-genomic data analysis. Synthetic and real biological data sets are used to demonstrate the benefits, and also some of the perils, of analytical cluster validation.

Availability: Enlarged colour plots are provided in the supplementary material. The software used in the experiments is made available at http://dbkweb.ch.umist.ac.uk/handl/clustervalidation/.


This article has been cited by other articles:


J. Song and M. Singh
How and when should interactome-derived clusters be used to predict functional modules and protein function?
Bioinformatics, December 1, 2009; 25(23): 3143 - 3150.


T. K. Lee, E. M. Denny, J. C. Sanghvi, J. E. Gaston, N. D. Maynard, J. J. Hughey, and M. W. Covert
A Noisy Paracrine Signal Determines the Cellular NF-κB Response to Lipopolysaccharide
Sci. Signal., October 20, 2009; 2(93): ra65 - ra65.


R. Giancarlo, D. Scaturro, and F. Utro
Textual data compression in computational biology: a synopsis
Bioinformatics, July 1, 2009; 25(13): 1575 - 1586.


A. Sharma, R. Podolsky, J. Zhao, and R. A. McIndoe
A modified hyperplane clustering algorithm allows for efficient and accurate clustering of extremely large datasets
Bioinformatics, May 1, 2009; 25(9): 1152 - 1157.


W. N. Van Wieringen, M. A. Van De Wiel, and B. Ylstra
Weighted clustering of called array CGH data
Biostat., July 1, 2008; 9(3): 484 - 500.


J. Tarraga, I. Medina, J. Carbonell, J. Huerta-Cepas, P. Minguez, E. Alloza, F. Al-Shahrour, S. Vegas-Azcarate, S. Goetz, P. Escobar, et al.
GEPAS, a web-based tool for microarray data analysis and interpretation
Nucleic Acids Res., July 1, 2008; 36(suppl_2): W308 - W314.


J. C. Trinidad, A. Thalhammer, C. G. Specht, A. J. Lynn, P. R. Baker, R. Schoepfer, and A. L. Burlingame
Quantitative Analysis of Synaptic Phosphorylation and Protein Expression
Mol. Cell. Proteomics, April 1, 2008; 7(4): 684 - 696.


Z. Yu, H.-S. Wong, and H. Wang
Graph-based consensus clustering for class discovery from gene expression data
Bioinformatics, November 1, 2007; 23(21): 2888 - 2896.


E. J. Ploran, S. M. Nelson, K. Velanova, D. I. Donaldson, S. E. Petersen, and M. E. Wheeler
Evidence Accumulation and the Moment of Recognition: Dissociating Perceptual Recognition Processes Using fMRI
J. Neurosci., October 31, 2007; 27(44): 11912 - 11924.


K. Jaqaman, J. F. Dorn, E. Marco, P. K. Sorger, and G. Danuser
Phenotypic clustering of yeast mutants based on kinetochore microtubule dynamics
Bioinformatics, July 1, 2007; 23(13): 1666 - 1673.


V. Pihur, S. Datta, and S. Datta
Weighted rank aggregation of cluster validation measures: a Monte Carlo cross-entropy approach
Bioinformatics, July 1, 2007; 23(13): 1607 - 1615.


G. Valentini
Mosclust: a software library for discovering significant structures in bio-molecular data
Bioinformatics, February 1, 2007; 23(3): 387 - 389.


F. J Doyle III and J. Stelling
Systems interface biology
J R Soc Interface, October 22, 2006; 3(10): 603 - 616.


A. Thalamuthu, I. Mukhopadhyay, X. Zheng, and G. C. Tseng
Evaluation and comparison of gene clustering methods in microarray analysis
Bioinformatics, October 1, 2006; 22(19): 2405 - 2412.


L.-C. Lai, A. L. Kosorukoff, P. V. Burke, and K. E. Kwast
Metabolic-State-Dependent Remodeling of the Transcriptome in Response to Anoxia and Subsequent Reoxygenation in Saccharomyces cerevisiae.
Eukaryot. Cell, September 1, 2006; 5(9): 1468 - 1489.


J. Liu, J. Mohammed, J. Carter, S. Ranka, T. Kahveci, and M. Baudis
Distance-based clustering of CGH data
Bioinformatics, August 15, 2006; 22(16): 1971 - 1978.


D. Huang and W. Pan
Incorporating biological knowledge into distance-based clustering analysis of microarray gene expression data
Bioinformatics, May 15, 2006; 22(10): 1259 - 1268.


A. Prelic, S. Bleuler, P. Zimmermann, A. Wille, P. Buhlmann, W. Gruissem, L. Hennig, L. Thiele, and E. Zitzler
A systematic comparison and evaluation of biclustering methods for gene expression data
Bioinformatics, May 1, 2006; 22(9): 1122 - 1129.


W. Pan
Incorporating gene functions as priors in model-based clustering of microarray gene expression data
Bioinformatics, April 1, 2006; 22(7): 795 - 801.


