Bioinformatics Vol. 19 no. 16 2003
pages 2072-2078
© 2003 Oxford University Press
Linear regression and two-class classification with gene expression data
Division of Biostatistics, School of Public Health, University of Minnesota, A460 Mayo Building (MMC 303), Minneapolis, MN 55455-0378, USA
Received on January 7, 2003; revised on April 13, 2003; accepted on May 6, 2003
Motivation: Using gene expression data to classify (or predict) tumor types has recently received much research attention. Because of some special features of gene expression data, several new methods have been proposed, including the weighted voting scheme of Golub et al., the compound covariate method of Hedenfalk et al. (originally proposed by Tukey), and the shrunken centroids method of Tibshirani et al. These methods appear different from one another and are more or less ad hoc.
Results: We point out a close connection between these three methods and a linear regression model. Casting the classification problem in the general framework of linear regression naturally leads to new alternatives, such as partial least squares (PLS) methods and penalized PLS (PPLS) methods. Using two real data sets, we show that our new methods perform competitively with the other three methods.
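To make the regression view concrete, the following is a minimal sketch of a two-class classifier built from a single PLS component. It is not the authors' exact procedure: the -1/+1 coding of the class labels, the zero decision threshold, the function names `fit_pls1_classifier` and `predict`, and the simulated data are all assumptions made for illustration. The sketch relies only on the standard fact that the first PLS direction is proportional to the covariance between each centered gene and the centered response.

```python
import numpy as np

def fit_pls1_classifier(X, y):
    """Fit a one-component PLS regression of class labels on expression.

    X: (samples x genes) array; y: labels coded -1/+1 (an assumed coding).
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                 # center each gene
    yc = y - y_mean                 # center the response
    w = Xc.T @ yc                   # first PLS direction: gene-label covariances
    w /= np.linalg.norm(w)          # scale to unit length
    t = Xc @ w                      # one score (linear combination) per sample
    b = (t @ yc) / (t @ t)          # least squares slope of y on the score
    return x_mean, y_mean, w, b

def predict(model, X_new):
    """Classify by the sign of the fitted regression value."""
    x_mean, y_mean, w, b = model
    fitted = y_mean + b * ((X_new - x_mean) @ w)
    return np.where(fitted >= 0, 1, -1)

# Simulated example (hypothetical data): 20 samples, 200 genes,
# of which the first 10 are shifted between the two classes.
rng = np.random.default_rng(0)
n, p = 20, 200
y = np.repeat([-1, 1], n // 2)
X = rng.normal(size=(n, p))
X[:, :10] += 1.5 * y[:, None]

model = fit_pls1_classifier(X, y)
train_acc = (predict(model, X) == y).mean()
print(f"training accuracy: {train_acc:.2f}")
```

Each gene enters the score with a weight given by its univariate association with the class label, so the classifier is a weighted sum of genes, in the same spirit as the weighted-sum methods named in the Motivation. Training accuracy is optimistic when genes far outnumber samples; a fair comparison would use cross-validation or an independent test set.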
Contact: weip@biostat.umn.edu
* To whom correspondence should be addressed.