Bioinformatics Vol. 18 no. 9 2002
Pages 1216-1226
© 2002 Oxford University Press
Multi-class cancer classification via partial least squares with gene expression profiles
1 Department of Statistics, Texas A&M University,
College Station, TX 77843, USA
2 Department of Applied Science, University of California,
Davis, CA 95616, USA
Received on July 17, 2001; revised on March 14, 2002; accepted on March 25, 2002
Motivation: Discrimination between two classes, such as normal and cancer samples or two types of cancer, based on gene expression profiles is an important problem. It has practical implications as well as the potential to further our understanding of gene expression in various cancer cells. Classification or discrimination of more than two groups or classes (multi-class) is also needed. The need for multi-class discrimination methodologies is apparent in many microarray experiments where various cancer types are considered simultaneously.
Results: In this paper we extend the classification methodology proposed earlier by Nguyen and Rocke (2002b; Bioinformatics, 18, 39-50) to classify cancer samples from multiple classes. The methodologies proposed in this paper are applied to four gene expression data sets with multiple classes: (a) a hereditary breast cancer data set with (1) BRCA1-mutation, (2) BRCA2-mutation and (3) sporadic breast cancer samples, (b) an acute leukemia data set with (1) acute myeloid leukemia (AML), (2) T-cell acute lymphoblastic leukemia (T-ALL) and (3) B-cell acute lymphoblastic leukemia (B-ALL) samples, (c) a lymphoma data set with (1) diffuse large B-cell lymphoma (DLBCL), (2) B-cell chronic lymphocytic leukemia (BCLL) and (3) follicular lymphoma (FL) samples, and (d) the NCI60 data set with cell lines derived from cancers of various sites of origin. In addition, we evaluated the classification algorithms and examined the variability of the error rates using simulations based on randomization of the real data sets. We note that other methods for multi-class prediction have been proposed recently; our approach follows the line of Nguyen and Rocke (2002b; Bioinformatics, 18, 39-50).
Contact: dnguyen@stat.tamu.edu; dmrocke@ucdavis.edu