
Bioinformatics Vol. 17 no. 4 2001
Pages 309-318
© 2001 Oxford University Press


Original Paper

Validating clustering for gene expression data

K. Y. Yeung 1,*, D. R. Haynor 2 and W. L. Ruzzo 1

1 Computer Science and Engineering, Box 352350, University of Washington, Seattle, WA 98195, USA
2 Radiology, Box 357115, University of Washington, Seattle, WA 98195, USA

Received on August 23, 2000; revised on November 23, 2000; accepted on December 1, 2000

Motivation: Many clustering algorithms have been proposed for the analysis of gene expression data, but little guidance is available to help choose among them. We provide a systematic framework for assessing the results of clustering algorithms. Clustering algorithms attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. Our methodology is to apply a clustering algorithm to the data from all but one experimental condition. The remaining condition is used to assess the predictive power of the resulting clusters—meaningful clusters should exhibit less variation in the remaining condition than clusters formed by chance.
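The leave-one-condition-out procedure described above can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: the variation in the left-out condition is measured here as the root-mean-square deviation from each cluster's mean, a small k-means routine stands in for "a clustering algorithm", and clusters formed by chance are modelled as uniformly random labels. All function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=20):
    """Small k-means stand-in (assumption: any clustering algorithm could be used).

    Uses deterministic farthest-point initialisation and returns one label per row.
    """
    centers = [list(rows[0])]
    while len(centers) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def left_out_spread(values, labels):
    """RMS deviation of left-out values from their cluster means (assumed measure)."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    total_sq = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        total_sq += sum((v - mean) ** 2 for v in members)
    return math.sqrt(total_sq / len(values))


def leave_one_out_score(data, cluster_fn):
    """Cluster on all but one condition, score on the left-out one, sum over conditions."""
    n_conditions = len(data[0])
    total = 0.0
    for e in range(n_conditions):
        training = [[x for j, x in enumerate(row) if j != e] for row in data]
        labels = cluster_fn(training)
        total += left_out_spread([row[e] for row in data], labels)
    return total


# Toy data (made up): 12 genes x 5 conditions, two expression patterns.
data = [[0.1 * ((i + j) % 3) for j in range(5)] for i in range(6)] + \
       [[5.0 + 0.1 * ((i + j) % 3) for j in range(5)] for i in range(6)]

rng = random.Random(0)
structured = leave_one_out_score(data, lambda rows: kmeans(rows, 2))
chance = leave_one_out_score(data, lambda rows: [rng.randrange(2) for _ in rows])
print("k-means score:", structured)
print("random-label score:", chance)
```

A lower score means that the clusters predict the left-out condition better. On this toy data the k-means clusters should score well below the random labels, which is the behaviour the methodology expects of meaningful clusters.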

Results: We successfully applied our methodology to compare six clustering algorithms on four gene expression data sets. We found our quantitative measures of cluster quality to be positively correlated with external standards of cluster quality.

Availability: The software is under development.

Contact: kayee@cs.washington.edu

Supplementary information: http://www.cs.washington.edu/homes/kayee/cluster or http://www.cs.washington.edu/homes/ruzzo/cluster

* To whom correspondence should be addressed.




