Bioinformatics Advance Access originally published online on January 5, 2008
Bioinformatics 2008 24(3):412-419; doi:10.1093/bioinformatics/btm579

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Hybrid huberized support vector machines for microarray classification and gene selection

Li Wang 1, Ji Zhu 2,* and Hui Zou 3

1Ross School of Business, 2Department of Statistics, University of Michigan, Ann Arbor, MI 48109 and 3School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA

*To whom correspondence should be addressed.


Abstract

Motivation: The standard L2-norm support vector machine (SVM) is a widely used tool for microarray classification. Previous studies have demonstrated its superior performance in terms of classification accuracy. However, a major limitation of the SVM is that it cannot automatically select relevant genes for the classification. The L1-norm SVM is a variant of the standard L2-norm SVM that constrains the L1-norm of the fitted coefficients. Due to the singularity of the L1-norm, the L1-norm SVM has the property of automatically selecting relevant genes. On the other hand, the L1-norm SVM has two drawbacks: (1) the number of selected genes is bounded above by the number of training samples; (2) when there are several highly correlated genes, the L1-norm SVM tends to pick only a few of them and remove the rest.

Results: We propose a hybrid huberized support vector machine (HHSVM). The HHSVM combines the huberized hinge loss function and the elastic-net penalty. By doing so, the HHSVM performs automatic gene selection in a way similar to the L1-norm SVM. In addition, the HHSVM encourages highly correlated genes to be selected (or removed) together. We also develop an efficient algorithm to compute the entire solution path of the HHSVM. Numerical results indicate that the HHSVM tends to provide better variable selection results than the L1-norm SVM, especially when variables are highly correlated.
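The two ingredients named above can be illustrated with a minimal Python sketch. The abstract gives no formulas, so the piecewise form of the huberized hinge loss, the scaling of the elastic-net penalty, the default `delta=2.0`, and all function and parameter names below are assumptions made for illustration. The sketch only evaluates an objective of this kind; it is not the authors' R code and does not implement their solution-path algorithm.

```python
import numpy as np

def huberized_hinge(t, delta=2.0):
    """Huberized hinge loss of the margin t = y * f(x) (assumed form).

    Zero for t > 1, quadratic for 1 - delta < t <= 1, linear for t <= 1 - delta.
    The two non-zero pieces meet smoothly at t = 1 - delta.
    """
    t = np.asarray(t, dtype=float)
    quadratic = (1.0 - t) ** 2 / (2.0 * delta)
    linear = 1.0 - t - delta / 2.0
    return np.where(t > 1.0, 0.0, np.where(t > 1.0 - delta, quadratic, linear))

def elastic_net_penalty(beta, lam1, lam2):
    """Elastic-net penalty: lam1 * ||beta||_1 + (lam2 / 2) * ||beta||_2^2 (assumed scaling)."""
    beta = np.asarray(beta, dtype=float)
    return lam1 * np.abs(beta).sum() + 0.5 * lam2 * np.dot(beta, beta)

def hhsvm_objective(beta0, beta, X, y, lam1, lam2, delta=2.0):
    """Total loss plus penalty for labels y in {-1, +1} and a linear classifier."""
    margins = y * (beta0 + X @ beta)
    return huberized_hinge(margins, delta).sum() + elastic_net_penalty(beta, lam1, lam2)
```

In this sketch the L1 term is what can set coefficients exactly to zero (gene selection), while the squared L2 term is what tends to give highly correlated genes similar coefficients, so that they enter or leave the model together.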

Availability: R code is available at http://www.stat.lsa.umich.edu/~jizhu/code/hhsvm/

Contact: jizhu{at}umich.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Associate Editor: David Rocke


Received on June 15, 2007; revised on October 9, 2007; accepted on November 18, 2007
