Bioinformatics Advance Access published online on July 17, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn374
A comparative study of survival models for breast cancer prognostication based on microarray data: does a single gene beat them all?
1Machine Learning Group, Department of Computer Science, Université Libre de Bruxelles.
2Functional Genomics Unit, Department of Medical Oncology, Institut Jules Bordet, Université Libre de Bruxelles.
*To whom correspondence should be addressed. Benjamin Haibe-Kains, E-mail: bhaibeka{at}ulb.ac.be
| Abstract |
|---|
Motivation: Survival prediction of breast cancer (BC) patients independently of treatment, also known as prognostication, is a complex task since clinically similar breast tumors, in addition to being molecularly heterogeneous, may exhibit different clinical outcomes. In recent years, the analysis of gene expression profiles by means of sophisticated data mining tools has emerged as a promising technology to bring additional insights into BC biology and to improve the quality of prognostication. The aim of this work is to assess quantitatively the accuracy of prediction obtained with state-of-the-art data analysis techniques for BC microarray data through an independent and thorough framework.
Results: Owing to the large number of variables, the small number of samples and the high degree of noise, complex prediction methods are highly exposed to performance degradation despite the use of cross-validation techniques. Our analysis shows that the most complex methods are not significantly better than the simplest one, a univariate model relying on a single proliferation gene. This result suggests that proliferation might be the most relevant biological process for BC prognostication and that the loss of interpretability deriving from the use of overly complex methods may not be sufficiently counterbalanced by an improvement in the quality of prediction.
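To illustrate how a single-gene univariate model of the kind described above could be scored, the sketch below computes a concordance index, using the expression level of one gene directly as the risk score. The abstract does not name the performance measure used in the study, so the choice of the concordance index is an assumption; the function name `concordance_index` and the toy data are hypothetical, and this is not the survcomp API.

```python
from itertools import combinations


def concordance_index(times, events, risk):
    """Fraction of comparable patient pairs in which the patient with the
    higher risk score has the shorter observed survival time.

    times  : follow-up times
    events : 1 if the event was observed, 0 if the patient was censored
    risk   : risk scores (here: expression of a single gene, by assumption)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times and pairs whose shorter time is censored:
        # the ordering of their true survival times is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5  # ties in risk count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study.
gene_expression = [2.1, 0.4, 1.7, 0.9, 1.2]  # single proliferation gene
survival_years = [1.5, 9.0, 3.0, 7.5, 4.0]
event_observed = [1, 0, 1, 0, 1]

c_index = concordance_index(survival_years, event_observed, gene_expression)
print(f"C-index of the single-gene model: {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so under this assumed metric a more complex model would have to exceed the single-gene score by a statistically meaningful margin to justify its loss of interpretability.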
Availability: The comparison study is implemented in an R package called survcomp and is available from http://www.ulb.ac.be/di/map/bhaibeka/software/survcomp/.
Contact: bhaibeka{at}ulb.ac.be
Supplementary information: Supplementary Data are available at Bioinformatics online.
Associate Editor: Dr. Joaquin Dopazo
Received on January 18, 2008; revised on May 30, 2008; accepted on July 15, 2008