Bioinformatics Advance Access originally published online on September 20, 2005
Bioinformatics 2005 21(22):4148-4154; doi:10.1093/bioinformatics/bti681
Classification of microarrays to nearest centroids
Department of Biostatistics, University of Washington, Seattle 98195, USA
Motivation: Classification of biological samples by microarrays is a topic of much interest. A number of methods have been proposed and successfully applied to this problem. It has recently been shown that classification by nearest centroids provides an accurate predictor that may outperform much more complicated methods. The Prediction Analysis of Microarrays (PAM) approach is one such example, which the authors strongly motivate by its simplicity and interpretability. In this spirit, I seek to assess the performance of classifiers simpler than even PAM.
Results: Surprisingly, I show that the modified t-statistics and shrunken centroids employed by PAM tend to increase misclassification error when compared with their simpler counterparts. Based on these observations, I propose a classification method called Classification to Nearest Centroids (ClaNC). ClaNC ranks genes by standard t-statistics, does not shrink centroids and uses a class-specific gene-selection procedure. Because of these modifications, ClaNC is arguably simpler and easier to interpret than PAM, and it can be viewed as a traditional nearest centroid classifier that uses specially selected genes. I demonstrate that, for a given number of active genes, ClaNC error rates tend to be significantly lower than those for PAM.
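The general idea of a nearest centroid classifier restricted to selected genes can be illustrated with a minimal Python sketch. This is not the ClaNC software, and the abstract does not specify the details, so several choices here are assumptions: the t-statistic compares each class mean with the overall mean using a pooled within-class standard deviation, every class contributes its top `genes_per_class` genes (a hypothetical parameter), and distances are standardized squared differences over the active genes. The function names and the toy data are made up for illustration.

```python
from math import sqrt


def fit_nearest_centroid(X, y, genes_per_class):
    """Fit an unshrunken nearest-centroid classifier on top-ranked genes.

    X: list of samples, each a list of expression values (one per gene).
    y: list of class labels, one per sample (at least two classes).
    genes_per_class: top genes kept per class (hypothetical parameter).
    """
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    overall = [sum(col) / n for col in zip(*X)]

    # Plain (unshrunken) centroids: per-gene means within each class.
    centroids, sizes = {}, {}
    for k in classes:
        rows = [x for x, lab in zip(X, y) if lab == k]
        sizes[k] = len(rows)
        centroids[k] = [sum(col) / len(rows) for col in zip(*rows)]

    # Pooled within-class standard deviation per gene (assumed > 0).
    ss = [0.0] * p
    for x, lab in zip(X, y):
        for j in range(p):
            ss[j] += (x[j] - centroids[lab][j]) ** 2
    sd = [sqrt(s / (n - len(classes))) for s in ss]

    # Ordinary t-statistic of each class mean against the overall mean,
    # with no extra constant added to the denominator (assumed form).
    active = set()
    for k in classes:
        se_scale = sqrt(1.0 / sizes[k] - 1.0 / n)
        t = [(centroids[k][j] - overall[j]) / (sd[j] * se_scale)
             for j in range(p)]
        ranked = sorted(range(p), key=lambda j: abs(t[j]), reverse=True)
        active.update(ranked[:genes_per_class])

    return {"centroids": centroids, "sd": sd, "active": sorted(active)}


def predict(model, x):
    """Assign x to the class whose centroid is nearest on the active genes."""
    def distance(k):
        c = model["centroids"][k]
        return sum(((x[j] - c[j]) / model["sd"][j]) ** 2
                   for j in model["active"])
    return min(model["centroids"], key=distance)


# Made-up toy values (not data from the paper): only gene 0 separates A and B.
toy_X = [
    [5.0, 1.0, 0.2, 3.0],
    [5.2, 1.1, 0.1, 3.1],
    [4.8, 0.9, 0.3, 2.9],
    [1.0, 1.0, 0.2, 3.0],
    [1.2, 0.9, 0.1, 3.2],
    [0.8, 1.1, 0.3, 2.8],
]
toy_y = ["A", "A", "A", "B", "B", "B"]
toy_model = fit_nearest_centroid(toy_X, toy_y, genes_per_class=1)
print(toy_model["active"], predict(toy_model, [4.9, 1.0, 0.2, 3.0]))
```

In this sketch the class centroids are used exactly as estimated, so the only tuning choice is how many genes each class contributes. A shrinkage-based method would instead pull centroids toward the overall mean and keep the genes that survive.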
Availability: Point-and-click software is freely available at http://students.washington.edu/adabney/clanc
Contact: adabney{at}u.washington.edu
Supplementary Information: http://students.washington.edu/adabney/clanc/supplement.pdf
Received on August 15, 2005; revised on September 15, 2005; accepted on September 17, 2005




