Bioinformatics Advance Access published online on March 22, 2006
Bioinformatics, doi:10.1093/bioinformatics/btl103
1 Cologne University Bioinformatics Center and Center for Applied Computer Science, Weyertal 80, 50931 Köln, Germany
* To whom correspondence should be addressed.
Motivation: DNA microarrays allow the simultaneous measurement of thousands of gene expression levels in any given patient sample. Gene expression data have been shown to correlate with survival in several cancers. However, analysis of the data is difficult, since typically at most a few hundred patients are available, resulting in severely underdetermined regression or classification models. Several approaches exist to classify patients into different risk classes, but relatively little has been done with respect to the prediction of actual survival times. We introduce CASPAR, a novel method to predict true survival times for the individual patient based on microarray measurements. CASPAR is based on a multivariate Cox regression model that is embedded in a Bayesian framework. A hierarchical prior distribution on the regression parameters is specifically designed to deal with the high-dimensionality (large number of genes) and low-sample-size settings that are typical for microarray measurements. This enables CASPAR to automatically select small, highly informative subsets of genes for prediction.
Results: Validity of the method is demonstrated on two publicly available datasets, one on diffuse large B-cell lymphoma (DLBCL) and one on adenocarcinoma of the lung. The method successfully identifies long and short survivors with high sensitivity and specificity. We compare our method to two alternative methods from the literature, demonstrating superior results for our approach. In addition, we show that CASPAR can further refine predictions made using clinical scoring systems such as the International Prognostic Index (IPI) for DLBCL and clinical staging for lung cancer, thus providing an additional tool for the clinician. An analysis of the genes identified confirms previously published results; furthermore, new candidate genes correlated with survival are identified.
Availability: The software is available upon request from the authors.
Supplementary material: http://www.zaik.uni-koeln.de/bioinformatik/caspar.html.
Received October 12, 2005
Revised January 26, 2006
Accepted March 16, 2006
Article
CASPAR: a hierarchical Bayesian approach to predict survival times in cancer from gene expression data
Lars Kaderali 1,*, Thomas Zander 2, Ulrich Faigle 1, Jürgen Wolf 2, Joachim L. Schultze 2 and Rainer Schrader 1
2 Department of Internal Medicine, University Clinic Cologne, Joseph-Stelzmannstr. 9, 50924 Köln, Germany
Lars Kaderali, E-mail: kaderali{at}zpr.uni-koeln.de
Associate Editor: Alvis Brazma
This article has been cited by other articles:
A. Schramm, J. Vandesompele, J. H. Schulte, S. Dreesmann, L. Kaderali, B. Brors, R. Eils, F. Speleman, and A. Eggert. Translating Expression Profiling into a Clinically Feasible Test to Predict Neuroblastoma Outcome. Clin. Cancer Res., March 1, 2007; 13(5): 1459-1465.