Bioinformatics Advance Access originally published online on January 19, 2008
Bioinformatics 2008 24(5):682-688; doi:10.1093/bioinformatics/btn017
Using repeated measurements to validate hierarchical gene clusters


¹Méthodes et Algorithmes pour la Bioinformatique, LIRMM, CNRS - University Montpellier II, ²INRA, Unité protéomique, 2 Place Viala, 34060 Montpellier Cédex 1 and ³INRA, Unité Biostatistique et Processus Spatiaux, 84914 Avignon Cédex 9, France
*To whom correspondence should be addressed.
Abstract
Motivation: Hierarchical clustering is a common approach to studying protein and gene expression data. This unsupervised technique is used to find clusters of genes or proteins that are expressed in a coordinated manner across a set of conditions. Because both biological and technical variability affect the measurements, experiments are generally repeated. In this work, we propose an approach that evaluates the stability of clusters derived from hierarchical clustering by taking these repeated measurements into account.
Results: The method uses the bootstrap technique to obtain pseudo-hierarchies of genes from resampled datasets. Using a fast dynamic programming algorithm, we compare the original hierarchy to the pseudo-hierarchies and assess the stability of the original gene clusters. A shuffling procedure can then be used to assess the significance of the cluster stabilities. Our approach is illustrated on simulated data and on two microarray datasets. Compared to the standard hierarchical clustering methodology, it makes it possible to distinguish dubious clusters from stable ones, and thus to avoid misleading interpretations.
Availability: The programs were developed in C and R.
Contact: brehelin{at}lirmm.fr
Supplementary information: Supplementary Material and source code are available at http://www.lirmm.fr/~brehelin/Stability/
Both authors contributed equally to this work.
Received on October 2, 2007; revised on December 11, 2007; accepted on January 9, 2008
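The resampling idea summarized in the Results paragraph can be illustrated with a minimal sketch. It is not the authors' C/R implementation, and it makes several assumptions that the abstract does not confirm: the repetitions of each gene in each condition are resampled with replacement and averaged, the hierarchy is built by average-linkage clustering on Euclidean distances, and a cluster counts as recovered only when a pseudo-hierarchy contains exactly the same set of genes. This exact-match rule is a simplification that stands in for the dynamic programming comparison mentioned in the abstract, and the shuffling-based significance step is not covered. All function names and the toy dataset are hypothetical.

```python
import random

def average(values):
    return sum(values) / len(values)

def distance(a, b):
    # Euclidean distance between two expression profiles.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def hierarchy_clusters(profiles):
    """Average-linkage agglomerative clustering (assumed linkage).

    Returns the internal clusters of the hierarchy as frozensets of gene
    indices, excluding the root cluster that contains every gene.
    """
    active = [frozenset([i]) for i in range(len(profiles))]
    found = set()
    while len(active) > 1:
        best = None
        for i in range(len(active)):
            for j in range(i + 1, len(active)):
                d = average([distance(profiles[g], profiles[h])
                             for g in active[i] for h in active[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = active[i] | active[j]
        active = [c for k, c in enumerate(active) if k not in (i, j)]
        active.append(merged)
        if len(merged) < len(profiles):
            found.add(merged)
    return found

def mean_profiles(data):
    # data[g][c] is the list of repeated measurements of gene g in condition c.
    return [[average(reps) for reps in gene] for gene in data]

def resample(data, rng):
    # Assumed scheme: draw the repetitions of each (gene, condition) cell
    # with replacement, keeping the number of repetitions unchanged.
    return [[[rng.choice(reps) for _ in reps] for reps in gene] for gene in data]

def cluster_stability(data, n_boot=200, seed=0):
    """Fraction of pseudo-hierarchies that contain each original cluster."""
    rng = random.Random(seed)
    original = hierarchy_clusters(mean_profiles(data))
    counts = {cluster: 0 for cluster in original}
    for _ in range(n_boot):
        pseudo = hierarchy_clusters(mean_profiles(resample(data, rng)))
        for cluster in original:
            if cluster in pseudo:
                counts[cluster] += 1
    return {cluster: counts[cluster] / n_boot for cluster in original}

# Toy dataset (invented for illustration): 4 genes, 3 conditions, 3 repetitions.
example = [
    [[0.0, 0.1, 0.2], [1.0, 1.1, 0.9], [2.0, 2.1, 1.9]],
    [[0.1, 0.2, 0.0], [1.1, 0.9, 1.0], [2.1, 1.9, 2.0]],
    [[10.0, 10.1, 9.9], [8.0, 8.1, 7.9], [6.0, 6.1, 5.9]],
    [[10.1, 9.9, 10.0], [8.1, 7.9, 8.0], [6.1, 5.9, 6.0]],
]
stability = cluster_stability(example)
for cluster, score in sorted(stability.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(cluster), score)
```

In this sketch a score close to 1 means the cluster reappears in nearly every pseudo-hierarchy, while a low score flags a cluster that depends on which repetitions happened to be drawn. A similarity-based comparison would be less strict than exact matching, since it can give partial credit to clusters that differ by only a few genes.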
