Bioinformatics Vol. 19 no. 9 2003
Pages 1061-1069
© 2003 Oxford University Press
Boosting for tumor classification with gene expression data
Seminar für Statistik, ETH Zürich, CH-8092, Switzerland
Received on February 28, 2002; revised on April 19, 2002; accepted on September 5, 2002
Motivation: Microarray experiments generate large datasets with expression values for thousands of genes but no more than a few dozen samples. Accurate supervised classification of tissue samples in such high-dimensional problems is difficult but often crucial for successful diagnosis and treatment. A promising way to meet this challenge is to use boosting in conjunction with decision trees.
Results: We demonstrate that the generic boosting algorithm needs some modification to become an accurate classifier in the context of gene expression data. In particular, we present a feature preselection method, a more robust boosting procedure and a new approach for multi-categorical problems. These modifications yield slight to drastic increases in performance and give competitive results on several publicly available datasets.
Availability: Software for the modified boosting algorithms as well as for decision trees is available for free in R at http://stat.ethz.ch/~dettling/boosting.html
Contact: dettling{at}stat.math.ethz.ch
* To whom correspondence should be addressed.


