Bioinformatics Advance Access published online on November 5, 2004

Bioinformatics, doi:10.1093/bioinformatics/bti114
Bioinformatics © Oxford University Press 2004; all rights reserved
All versions of this article: 21/7/1104 (most recent), bti114v1
Received July 27, 2004
Revised October 5, 2004
Accepted October 22, 2004

Article

Classification using Partial Least Squares with penalized logistic regression

Gersende Fort 1* and Sophie Lambert-Lacroix 1

1 CNRS/LMC-IMAG, BP 53, 38041 Grenoble cedex 9, France

* To whom correspondence should be addressed.
Gersende Fort, E-mail: Gersende.Fort{at}imag.fr


Abstract

Motivation: One important aspect of data mining of microarray data is to discover the molecular variation among cancers. In microarray studies, the number n of samples is small relative to the number p of genes per sample (usually in the thousands). Standard statistical classification methods are known to be efficient (i.e. in the present case, to yield successful classifiers) mainly when n is (far) larger than p. This naturally calls for combining a dimension reduction procedure with the classification procedure.

Results: This paper addresses classification in such a high-dimensional setting. We view the classification problem as a regression problem with few observations and many predictor variables, and we propose a new method combining Partial Least Squares (PLS) and Ridge penalized logistic regression. We review the existing methods based on PLS and/or penalized likelihood techniques, outline their merits in some cases, and explain theoretically why they sometimes behave poorly. Our procedure is compared with these other classifiers. The predictive performance of the resulting classification rule is illustrated on three data sets: Leukemia, Colon and Prostate.

Availability: Software implementing the procedures, and the data sources on which this paper focuses, are freely available at http://www-lmc.imag.fr/SMS/membres/Gersende_Fort,Sophie_Lambert.html.
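
Illustration (not part of the original abstract): the abstract names two ingredients, PLS dimension reduction and Ridge penalized logistic regression, without giving the algorithm. The sketch below is not the authors' procedure or their software. It shows one simple, assumed way to chain the two on synthetic data with p much larger than n: extract a few PLS1 components with the 0/1 class label as response, then fit a ridge-penalized logistic regression on those components by Newton-Raphson. All function names, the toy data generator, and the values of `lam` and `n_components` are hypothetical choices made for illustration.

```python
import numpy as np

def pls1_projection(X, y, n_components):
    """PLS1 via NIPALS. Returns column means and a p x k matrix R
    such that the component scores are T = (X - mean) @ R."""
    mean = X.mean(axis=0)
    Xk = X - mean                      # centered, then deflated in the loop
    yc = y - y.mean()
    W, P = [], []
    for _ in range(n_components):
        w = Xk.T @ yc                  # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # scores of the current component
        load = Xk.T @ t / (t @ t)      # X loadings
        Xk = Xk - np.outer(t, load)    # deflate X
        W.append(w)
        P.append(load)
    W, P = np.array(W).T, np.array(P).T
    R = W @ np.linalg.inv(P.T @ W)     # maps centered X directly to scores
    return mean, R

def ridge_logistic(Z, y, lam, n_iter=50, tol=1e-8):
    """Ridge-penalized logistic regression fitted by Newton-Raphson.
    The intercept (first coefficient) is left unpenalized."""
    n, k = Z.shape
    A = np.hstack([np.ones((n, 1)), Z])
    D = lam * np.eye(k + 1)
    D[0, 0] = 0.0
    beta = np.zeros(k + 1)
    for _ in range(n_iter):
        eta = np.clip(A @ beta, -30.0, 30.0)   # guard against overflow
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = mu * (1.0 - mu)
        grad = A.T @ (y - mu) - D @ beta       # penalized score
        hess = A.T @ (A * w[:, None]) + D      # penalized information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def predict(X, mean, R, beta):
    """Classify rows of X as 0 or 1 from the fitted PLS + ridge model."""
    scores = (X - mean) @ R
    return (beta[0] + scores @ beta[1:] > 0).astype(float)

def make_toy_data(rng, n, p=200, n_informative=10, shift=1.5):
    """Synthetic stand-in for expression data (assumption, not real data)."""
    y = (np.arange(n) % 2).astype(float)
    X = rng.normal(size=(n, p))
    X[:, :n_informative] += shift * (2 * y[:, None] - 1)
    return X, y

def demo(seed=0):
    rng = np.random.default_rng(seed)
    X_train, y_train = make_toy_data(rng, 40)    # n = 40 samples, p = 200
    X_test, y_test = make_toy_data(rng, 200)
    mean, R = pls1_projection(X_train, y_train, n_components=2)
    beta = ridge_logistic((X_train - mean) @ R, y_train, lam=1.0)
    acc_train = np.mean(predict(X_train, mean, R, beta) == y_train)
    acc_test = np.mean(predict(X_test, mean, R, beta) == y_test)
    return beta, acc_train, acc_test

if __name__ == "__main__":
    print(demo())
```

In this arrangement the logistic model has only `n_components + 1` coefficients, so the fit stays well-posed even though p exceeds n. The ridge term keeps the Newton system positive definite when the training classes are perfectly separated in the component space.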






