Bioinformatics Vol. 18 no. 1 2002
Pages 39-50
© 2002 Oxford University Press

Tumor classification by partial least squares using microarray gene expression data

Danh V. Nguyen 1 and David M. Rocke 2,*

1 Center for Image Processing and Integrated Computing
2 Department of Applied Science, University of California, Davis, CA 95616, USA

Received on November 23, 2000; revised on March 22, 2001; accepted on June 6, 2001

Motivation: One important application of gene expression microarray data is the classification of samples into categories, such as the type of tumor. Microarrays allow simultaneous monitoring of thousands of gene expressions per sample. This ability to measure gene expression en masse has resulted in data in which the number of variables p (genes) far exceeds the number of samples N. Standard statistical methodologies for classification and prediction work poorly, or not at all, when N < p. Existing statistical methodologies must be modified, or new ones developed, for the analysis of microarray data.

Results: We propose a novel analysis procedure for classifying (predicting) human tumor samples based on microarray gene expressions. This procedure involves dimension reduction using Partial Least Squares (PLS) and classification using Logistic Discrimination (LD) and Quadratic Discriminant Analysis (QDA). We compare PLS to the well-known dimension reduction method of Principal Components Analysis (PCA). Under many circumstances PLS proves superior; we illustrate a condition under which PCA predicts particularly poorly relative to PLS. The proposed methods were applied to five different microarray data sets involving various human tumor samples: (1) normal versus ovarian tumor; (2) Acute Myeloid Leukemia (AML) versus Acute Lymphoblastic Leukemia (ALL); (3) Diffuse Large B-cell Lymphoma (DLBCLL) versus B-cell Chronic Lymphocytic Leukemia (BCLL); (4) normal versus colon tumor; and (5) Non-Small-Cell-Lung-Carcinoma (NSCLC) versus renal samples. The stability of the classification results and methods was further assessed by re-randomization studies.
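
As a rough illustration of why a response-guided reduction such as PLS can outperform PCA when N < p, the following Python sketch builds a small synthetic data set and compares the first PLS component with the first principal component. It is a minimal sketch under stated assumptions, not the authors' SAS procedure: the data are contrived (a few low-variance informative "genes" plus many "genes" driven by a high-variance nuisance factor unrelated to class), only one component is extracted, and a simple nearest-class-mean rule evaluated on the training samples stands in for LD and QDA. All function names are hypothetical.

```python
import math
import random


def center_columns(X):
    """Subtract each column (gene) mean so components use centered data."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def scores(X, w):
    """Project every sample (row of X) onto the direction w."""
    return [sum(x * a for x, a in zip(row, w)) for row in X]


def pls_direction(Xc, y):
    """First PLS weight vector: proportional to Xc^T (y - mean(y))."""
    ybar = sum(y) / len(y)
    p = len(Xc[0])
    return normalize([sum(row[j] * (yi - ybar) for row, yi in zip(Xc, y))
                      for j in range(p)])


def pca_direction(Xc, iterations=50):
    """First principal direction by power iteration on Xc^T Xc (ignores y)."""
    p = len(Xc[0])
    v = normalize([1.0] * p)
    for _ in range(iterations):
        t = scores(Xc, v)  # Xc v
        v = normalize([sum(row[j] * ti for row, ti in zip(Xc, t))
                       for j in range(p)])  # Xc^T (Xc v)
    return v


def nearest_mean_accuracy(t, y):
    """Training-set accuracy of a nearest-class-mean rule on one component."""
    m0 = sum(ti for ti, yi in zip(t, y) if yi == 0) / y.count(0)
    m1 = sum(ti for ti, yi in zip(t, y) if yi == 1) / y.count(1)
    pred = [1 if abs(ti - m1) < abs(ti - m0) else 0 for ti in t]
    return sum(pi == yi for pi, yi in zip(pred, y)) / len(y)


def make_data(n=20, p=50, informative=5, seed=0):
    """Synthetic, contrived data with N < p (an assumption for illustration)."""
    rng = random.Random(seed)
    y = [0] * (n // 2) + [1] * (n // 2)
    X = []
    for i, yi in enumerate(y):
        nuisance = 1.0 if i % 2 == 0 else -1.0  # balanced within each class
        row = []
        for j in range(p):
            # First few genes carry the class signal; the rest follow the
            # high-variance nuisance factor, which is unrelated to class.
            signal = (2 * yi - 1) if j < informative else 3.0 * nuisance
            row.append(signal + 0.1 * rng.gauss(0.0, 1.0))
        X.append(row)
    return X, y


X, y = make_data()
Xc = center_columns(X)
pls_acc = nearest_mean_accuracy(scores(Xc, pls_direction(Xc, y)), y)
pca_acc = nearest_mean_accuracy(scores(Xc, pca_direction(Xc)), y)
print(f"PLS component accuracy: {pls_acc:.2f}")
print(f"PCA component accuracy: {pca_acc:.2f}")
```

In this constructed case the first principal component follows the nuisance factor because it has the largest variance, whereas the first PLS direction is weighted by covariance with the class label. The accuracies are computed on the training samples only, so they say nothing about prediction error; an honest assessment would use held-out samples or resampling, in the spirit of the re-randomization studies mentioned above.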

Availability: The methodology can be implemented using a combination of standard statistical methods, available, for example, in SAS. Illustrative SAS code is available from the first author.

Contact: nguyen{at}wald.ucdavis.edu; dmrocke{at}ucdavis.edu

* To whom correspondence should be addressed.





