Bioinformatics Advance Access published online on June 6, 2007
Bioinformatics, doi:10.1093/bioinformatics/btm305
Predicting survival from microarray data: a comparative study


1Department of Mathematics, University of Oslo; 2Department of Informatics, University of Oslo; 3Norwegian Computing Center; 4Institute of Basic Medical Sciences, Department of Biostatistics, University of Oslo and Statistics for Innovation (sfi)²
*To whom correspondence should be addressed. H.M. Bøvelstad, E-mail: hegembo{at}math.uio.no
Abstract
Motivation: Survival prediction from gene expression data and other high-dimensional genomic data has been the subject of much research in recent years. Such data pose the methodological problem of having many more gene expression values than individuals. In addition, the responses are censored survival times. Most of the proposed methods handle this by using Cox's proportional hazards model and obtaining parameter estimates through some dimension reduction or parameter shrinkage technique. Using three well-known microarray gene expression data sets, we compare the prediction performance of seven such methods: univariate selection, forward stepwise selection, principal components regression (PCR), supervised principal components regression, partial least squares regression (PLS), ridge regression, and the lasso.
Results: Statistical learning from subsets should be repeated several times in order to get a fair comparison between methods. Methods using coefficient shrinkage or linear combinations of the gene expression values have much better performance than the simple variable selection methods. For our data sets, ridge regression has the overall best performance.
Availability: Matlab and R code for the prediction methods is available at http://www.med.uio.no/imb/stat/bmms/software/microsurv/.
Associate Editor: Dr. Joaquin Dopazo
These two authors contributed equally to this work
Received on October 31, 2006; revised on May 24, 2007; accepted on May 28, 2007
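To illustrate the parameter shrinkage idea mentioned in the abstract, the following is a minimal sketch of ridge-penalized Cox regression in pure Python. It is not the authors' Matlab or R code: the function names (`cox_ridge_objective`, `fit_cox_ridge`), the toy data, the penalty weight `lam`, and the simple gradient descent with step halving are all assumptions made for this example. Ties are handled in the Breslow style, and in practice the penalty weight would be chosen by cross-validation.

```python
import math


def cox_ridge_objective(beta, X, time, event, lam):
    """Return (loss, gradient) of the ridge-penalized negative Cox partial
    log-likelihood:

        loss = -sum over events i of [eta_i - log(sum_{j in R_i} exp(eta_j))]
               + 0.5 * lam * ||beta||^2

    where eta = X beta and R_i = {j : time_j >= time_i} is the risk set.
    """
    n, p = len(X), len(beta)
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loss = 0.5 * lam * sum(b * b for b in beta)
    grad = [lam * b for b in beta]
    for i in range(n):
        if not event[i]:
            continue  # censored subjects enter only through risk sets
        risk = [j for j in range(n) if time[j] >= time[i]]
        m = max(eta[j] for j in risk)  # shift for a stable log-sum-exp
        w = [math.exp(eta[j] - m) for j in risk]
        total = sum(w)
        loss += m + math.log(total) - eta[i]
        for k in range(p):
            # weighted mean of covariate k over the risk set
            mean_k = sum(wj * X[j][k] for wj, j in zip(w, risk)) / total
            grad[k] -= X[i][k] - mean_k
    return loss, grad


def fit_cox_ridge(X, time, event, lam, iters=200):
    """Gradient descent from beta = 0; the step is halved until the loss drops."""
    beta = [0.0] * len(X[0])
    loss, grad = cox_ridge_objective(beta, X, time, event, lam)
    for _ in range(iters):
        step = 1.0
        while step > 1e-10:
            cand = [b - step * g for b, g in zip(beta, grad)]
            cand_loss, cand_grad = cox_ridge_objective(cand, X, time, event, lam)
            if cand_loss < loss:
                beta, loss, grad = cand, cand_loss, cand_grad
                break
            step *= 0.5
        else:
            break  # no improving step found: treat as converged
    return beta


# Hypothetical toy data: 4 individuals, 5 "genes" (more covariates than subjects).
X = [
    [0.5, -1.0, 0.2, 0.3, -0.4],
    [-0.3, 0.4, 1.0, -0.6, 0.1],
    [-1.2, 0.1, -0.5, 0.9, 0.7],
    [1.0, 0.8, -0.7, -0.2, -0.5],
]
time = [5.0, 3.0, 8.0, 1.0]   # observed follow-up times
event = [1, 0, 1, 1]          # 1 = event observed, 0 = censored

beta_hat = fit_cox_ridge(X, time, event, lam=1.0)
print(beta_hat)
```

With more covariates than individuals, the unpenalized partial likelihood typically has no finite maximizer; the ridge term keeps the estimate bounded, which is why shrinkage is needed in this setting.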


