Bioinformatics Advance Access published online on April 6, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti422
1 Department of Statistics, University of California, Davis, CA 95616
* To whom correspondence should be addressed.
Motivation: An important application of microarray technology is relating gene expression profiles to patients' clinical phenotypes. Success has been demonstrated in the molecular classification of cancer, in which gene expression data serve as predictors and cancer type serves as a categorical outcome variable. However, there has been less research on linking gene expression profiles to censored survival data such as patients' overall survival time or time to cancer relapse. Models for such data should combine good prediction accuracy with parsimony.

Results: We propose using L1 penalized estimation for the Cox model to select genes that are relevant to patients' survival and to build a model for predicting the survival of future patients. The computational difficulty of estimation in high-dimensional, low-sample size settings can be handled efficiently with the recently developed least angle regression (LARS) method. Our simulation studies and an application to a real data set on predicting survival after chemotherapy for patients with diffuse large B-cell lymphoma demonstrate that the proposed procedure, which we call the LARS-Cox procedure, can identify important genes related to time to death due to cancer and can build a parsimonious model for predicting the survival of future patients. LARS-Cox regression gives better predictive performance than L2 penalized regression and a few other dimension-reduction based methods.

Conclusions: We conclude that the proposed LARS-Cox procedure can be very useful for identifying genes relevant to survival phenotypes and for building a parsimonious predictive model that classifies future patients into clinically relevant high and low risk groups, based on the gene expression profiles and survival times of previous patients.

Supplementary Information: http://dna.ucdavis.edu/~hli/LARSCox-Appendix.pdf.
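To make the idea of L1 penalized Cox estimation concrete, the following is a minimal Python sketch, not the LARS-Cox procedure itself. It swaps in plain proximal gradient descent (soft-thresholding with backtracking) for the least angle regression solver named in the abstract, assumes there are no tied event times, and uses hypothetical function names (`cox_loss_grad`, `soft_threshold`, `lasso_cox`) and a hypothetical penalty parameter `lam` that do not come from the article.

```python
import numpy as np

def cox_loss_grad(beta, X, time, event):
    """Negative Cox log partial likelihood (scaled by the number of events)
    and its gradient. Assumes no tied event times."""
    order = np.argsort(-time)              # latest time first
    Xs, ds = X[order], event[order]
    eta = Xs @ beta
    eta = eta - eta.max()                  # shift for stability; the likelihood is unchanged
    w = np.exp(eta)
    cum_w = np.cumsum(w)                   # sum of exp(eta) over the risk set {j: t_j >= t_i}
    cum_wx = np.cumsum(w[:, None] * Xs, axis=0)
    n_events = max(ds.sum(), 1.0)
    loss = -np.sum(ds * (eta - np.log(cum_w))) / n_events
    grad = -np.sum(ds[:, None] * (Xs - cum_wx / cum_w[:, None]), axis=0) / n_events
    return loss, grad

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: shrinks each entry toward zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cox(X, time, event, lam, n_iter=500):
    """L1 penalized Cox fit by proximal gradient descent (illustrative only)."""
    beta = np.zeros(X.shape[1])
    loss, grad = cox_loss_grad(beta, X, time, event)
    step = 1.0
    for _ in range(n_iter):
        while True:                        # backtracking line search on the step size
            cand = soft_threshold(beta - step * grad, step * lam)
            diff = cand - beta
            c_loss, c_grad = cox_loss_grad(cand, X, time, event)
            if c_loss <= loss + grad @ diff + (diff @ diff) / (2 * step) + 1e-12:
                break
            step *= 0.5
        beta, loss, grad = cand, c_loss, c_grad
    return beta
```

Genes whose coefficients are shrunk exactly to zero drop out of the model, which is what gives the L1 penalty its gene-selection property. In this sketch, any `lam` at or above the largest absolute gradient entry at `beta = 0` returns the all-zero vector, and smaller values admit progressively more genes.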
Received November 17, 2004
Revised March 4, 2005
Accepted March 30, 2005
Article
Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data
2 Rowe Program in Human Genetics, University of California, Davis, CA 95616
Hongzhe Li, E-mail: hli{at}ucdavis.edu