Bioinformatics Advance Access originally published online on June 9, 2008
Bioinformatics 2008 24(15):1698-1706; doi:10.1093/bioinformatics/btn262
Microarray-based classification and clinical predictors: on combined classifiers and additional predictive value
1Sylvia Lawry Centre for MS Research, Hohenlindenerstr. 1, D-81677 Munich, 2Department of Statistics, Ludwig-Maximilians-University of Munich, Ludwigstr. 33, D-80539 Munich and 3Institute of Medical Biometry and Medical Informatics, University Hospital Freiburg, Stefan-Meier-Str. 26, D-79104 Freiburg, Germany
*To whom correspondence should be addressed.
Abstract
Motivation: In clinical bioinformatics, methods are needed for assessing the additional predictive value of microarray data compared with simple clinical parameters alone. Such methods should also provide an optimal prediction rule that makes full use of both types of data: ideally, they should capture subtypes that clinical parameters alone do not identify. Moreover, they should assess the additional predictive value of microarray data within a fair framework.
Results: We propose a novel but simple two-step approach based on random forests and partial least squares (PLS) dimension reduction. It embeds the idea of pre-validation suggested by Tibshirani and colleagues, which uses an internal cross-validation to avoid overfitting. Our approach is fast and flexible, and can be used both for assessing the overall additional significance of the microarray data and for building optimal hybrid classification rules. Its efficiency is demonstrated through simulations and an application to breast cancer and colorectal cancer data.
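To make the idea of pre-validation concrete, the following is a minimal sketch in plain Python, not the MAclinical implementation. It assumes a single PLS component, a binary response coded 0/1, and a simple K-fold split; the function names (`pls_first_weight`, `pls_score`, `prevalidated_pls_component`) and the toy data are hypothetical. Each sample's PLS score is computed from a weight vector fitted without that sample, so the score does not depend on the sample's own class label.

```python
import random

def pls_first_weight(X, y):
    """First PLS weight vector: proportional to centered X^T times centered y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    w = [sum((X[i][j] - x_mean[j]) * (y[i] - y_mean) for i in range(n))
         for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm > 0:
        w = [v / norm for v in w]
    return x_mean, w

def pls_score(row, x_mean, w):
    """Project one sample onto the PLS weight vector (using training means)."""
    return sum((row[j] - x_mean[j]) * w[j] for j in range(len(w)))

def prevalidated_pls_component(X, y, n_folds=5, seed=0):
    """Pre-validated PLS scores: each score comes from a fit that excludes the sample's fold."""
    n = len(X)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    scores = [None] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        x_mean, w = pls_first_weight([X[i] for i in train],
                                     [y[i] for i in train])
        for i in held_out:
            scores[i] = pls_score(X[i], x_mean, w)
    return scores

# Toy data (hypothetical): 10 samples, 4 "genes", class driven by the first gene.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(10)]
y = [1 if row[0] > 0 else 0 for row in X]
z = prevalidated_pls_component(X, y, n_folds=5, seed=0)
```

In a two-step scheme of the kind described above, the pre-validated scores `z` would then presumably be combined with the clinical parameters as input to a random forest; that second step is omitted here.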
Availability: Our method is implemented in the freely available R package MAclinical, which can be downloaded from http://www.stat.uni-muenchen.de/~socher/MAclinical
Contact: boulesteix{at}slcmsr.org
Associate Editor: Joaquin Dopazo
Received on February 8, 2008; revised on May 16, 2008; accepted on June 4, 2008