Bioinformatics Advance Access published online on April 6, 2005

Bioinformatics, doi:10.1093/bioinformatics/bti422
All versions of this article: 21/13/3001 (most recent), bti422v1
© The Author (2005). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
Received November 17, 2004
Revised March 4, 2005
Accepted March 30, 2005

Article

Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data

Jiang Gui 1 and Hongzhe Li 2*

1 Department of Statistics, University of California, Davis, CA 95616
2 Rowe Program in Human Genetics, University of California, Davis, CA 95616

* To whom correspondence should be addressed.
Hongzhe Li, E-mail: hli@ucdavis.edu


   Abstract

Motivation: An important application of microarray technology is relating gene expression profiles to patients' clinical phenotypes. Success has been demonstrated in the molecular classification of cancer, where gene expression data serve as predictors and cancer type serves as a categorical outcome variable. However, there has been less research on linking gene expression profiles to censored survival data, such as patients' overall survival time or time to cancer relapse. Models that combine good prediction accuracy with parsimony are desirable.

Results: We propose using L1 penalized estimation for the Cox model to select genes that are relevant to patients' survival and to build a predictive model for future patients. The computational difficulty of estimation in the high-dimensional, low-sample size setting can be handled efficiently with the recently developed least angle regression (LARS) method. Our simulation studies and an application to a real data set on predicting survival after chemotherapy for patients with diffuse large B-cell lymphoma demonstrate that the proposed procedure, which we call the LARS-Cox procedure, can identify important genes related to time to death due to cancer and can build a parsimonious model for predicting the survival of future patients. The LARS-Cox regression gives better predictive performance than L2 penalized regression and a few other dimension-reduction based methods.
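
To make the L1 penalized criterion concrete, the sketch below evaluates a Cox partial log-likelihood and its L1 penalized (Lagrangian) form on toy data. It is only an illustration under stated assumptions: the function names, the tuning parameter `lam`, and the toy values are hypothetical and do not come from the article, tied event times are handled in the Breslow style, and the code only evaluates the objective. It does not implement the LARS-based solver.

```python
import math


def cox_partial_loglik(beta, X, time, event):
    """Cox partial log-likelihood (Breslow form; exact when no event times are tied).

    beta  : list of coefficients, one per gene
    X     : list of rows, one row of expression values per patient
    time  : observed follow-up time per patient
    event : 1 if death/relapse was observed, 0 if the time is censored
    """
    # Linear predictor x_i' beta for every patient.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, is_event in enumerate(event):
        if not is_event:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at patient i's event time.
        risk_sum = sum(
            math.exp(eta[j]) for j in range(len(time)) if time[j] >= time[i]
        )
        loglik += eta[i] - math.log(risk_sum)
    return loglik


def l1_penalized_objective(beta, X, time, event, lam):
    """Negative partial log-likelihood plus lam * sum(|beta_j|) (to be minimized)."""
    penalty = lam * sum(abs(b) for b in beta)
    return -cox_partial_loglik(beta, X, time, event) + penalty


# Toy data (hypothetical): 3 patients, 2 genes, all events observed.
X = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
time = [1.0, 2.0, 3.0]
event = [1, 1, 1]

print(l1_penalized_objective([0.0, 0.0], X, time, event, lam=0.5))
print(l1_penalized_objective([1.0, -2.0], X, time, event, lam=0.5))
```

Larger values of `lam` push more coefficients to exactly zero at the minimizer, which is how an L1 penalty yields a sparse gene set.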

Conclusions: The proposed LARS-Cox procedure can be very useful for identifying genes relevant to survival phenotypes and for building a parsimonious predictive model that classifies future patients into clinically relevant high- and low-risk groups based on the gene expression profiles and survival times of previous patients.
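
As a rough illustration of how a fitted sparse Cox model could split patients into risk groups, the sketch below scores each patient by the linear predictor and compares it with a cutoff. The median-of-training-scores cutoff, the function names, and the coefficient values are assumptions made for this example; the abstract does not state how the groups are defined.

```python
import statistics


def risk_scores(beta, X):
    """Linear risk score x' beta for each patient (one row of X per patient)."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


def assign_risk_groups(beta, X_train, X_new):
    """Label new patients 'high' or 'low' risk.

    Assumption: the cutoff is the median risk score of the training patients.
    """
    cutoff = statistics.median(risk_scores(beta, X_train))
    return ["high" if s > cutoff else "low" for s in risk_scores(beta, X_new)]


# Hypothetical sparse coefficients: the second gene has coefficient 0,
# so it plays no role in the prediction.
beta = [1.0, 0.0, -0.5]
X_train = [[0.0, 9.0, 0.0], [1.0, 9.0, 0.0], [2.0, 9.0, 0.0]]
X_new = [[3.0, 0.0, 0.0], [0.0, 0.0, 2.0]]

print(assign_risk_groups(beta, X_train, X_new))
```

Only genes with nonzero coefficients need to be measured for a new patient, which is the practical benefit of a parsimonious model.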

Supplementary Information: http://dna.ucdavis.edu/~hli/LARSCox-Appendix.pdf.




This article has been cited by other articles:


D. J. Raz, M. R. Ray, J. Y. Kim, B. He, M. Taron, M. Skrzypski, M. Segal, D. R. Gandara, R. Rosell, and D. M. Jablons
A Multigene Assay Is Prognostic of Survival in Patients with Early-Stage Lung Adenocarcinoma
Clin. Cancer Res., September 1, 2008; 14(17): 5565 - 5570.


F. Tai and W. Pan
Incorporating prior knowledge of gene functional groups into regularized discriminant analysis of microarray data
Bioinformatics, December 1, 2007; 23(23): 3170 - 3177.


F. Tai and W. Pan
Incorporating prior knowledge of predictors into penalized classifiers with multiple penalty terms
Bioinformatics, July 15, 2007; 23(14): 1775 - 1782.


M. Schumacher, H. Binder, and T. Gerds
Assessment of survival prediction models based on microarray data
Bioinformatics, July 15, 2007; 23(14): 1768 - 1774.


D. Huang and T. W. S. Chow
Identifying the biologically relevant gene categories based on gene expression and biological data: an example on prostate cancer
Bioinformatics, June 15, 2007; 23(12): 1503 - 1510.


Z. Wei and H. Li
Nonparametric pathway-based regression models for analysis of genomic data
Biostat., April 1, 2007; 8(2): 265 - 284.


S. Ma and J. Huang
Clustering threshold gradient descent regularization: with applications to microarray studies
Bioinformatics, February 15, 2007; 23(4): 466 - 472.


N. Rajicic, D. M. Finkelstein, D. A. Schoenfeld, and the Inflammation Host Response to Injury Research
Survival analysis of longitudinal microarrays
Bioinformatics, November 1, 2006; 22(21): 2643 - 2649.


N. Sha, M. G. Tadesse, and M. Vannucci
Bayesian variable selection for the analysis of microarray data with censored outcomes
Bioinformatics, September 15, 2006; 22(18): 2262 - 2268.


L. Li
Survival prediction of diffuse large-B-cell lymphoma based on both clinical and gene expression information
Bioinformatics, February 15, 2006; 22(4): 466 - 471.


