Bioinformatics Advance Access published online on October 25, 2005

Bioinformatics, doi:10.1093/bioinformatics/bti736
All versions of this article: 22/1/88 (most recent); bti736v1

© The Author (2005). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
Received June 12, 2005
Revised October 20, 2005
Accepted October 20, 2005

Article

Gene selection using support vector machines with nonconvex penalty

Hao Helen Zhang 1*, Jeongyoun Ahn 2, Xiaodong Lin 3, and Cheolwoo Park 4

1 Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
2 Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA
3 Department of Mathematical Sciences, University of Cincinnati, OH 45221, USA
4 Department of Statistics, University of Georgia, Athens, GA 30602, USA

* To whom correspondence should be addressed.
Hao Helen Zhang, E-mail: hzhang{at}stat.ncsu.edu


Abstract

Motivation: With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. A current difficulty in interpreting microarray data comes from their "high dimension, low sample size" nature: the number of genes far exceeds the number of samples. Robust and accurate gene selection methods are therefore required to identify groups of genes that are differentially expressed across samples, e.g., between cancerous and normal cells. Successful gene selection will help to classify different cancer types, lead to a better understanding of genetic signatures in cancers, and improve treatment strategies. Although gene selection and cancer classification are closely related problems, most existing approaches handle them separately by selecting genes prior to classification. We provide a unified procedure for simultaneous gene selection and cancer classification, achieving high accuracy in both aspects.

Results: In this paper we develop a novel type of regularization for support vector machines (SVMs) to identify important genes for cancer classification. A nonconvex penalty, the smoothly clipped absolute deviation (SCAD) penalty, is added to the hinge loss function of the SVM. By systematically thresholding small coefficient estimates to zero, the new procedure eliminates redundant genes automatically and yields a compact and accurate classifier. A successive quadratic algorithm is proposed to convert the non-differentiable and nonconvex optimization problem into easily solved systems of linear equations. The method is applied to two real data sets and produces promising results.
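
As an illustration only, the kind of objective described above can be sketched in a few lines of Python. The abstract does not state the penalty formula, so the sketch assumes the commonly used three-piece form of the smoothly clipped absolute deviation (SCAD) penalty with the conventional default a = 3.7. The function names, the averaging of the hinge loss over samples, and the way the penalty is summed are assumptions made for this example, not the authors' MATLAB implementation.

```python
def scad_penalty(w, lam, a=3.7):
    """SCAD penalty for one coefficient w (standard three-piece form; assumed here)."""
    t = abs(w)
    if t <= lam:
        # Small coefficients: behaves like an L1 (lasso) penalty.
        return lam * t
    if t <= a * lam:
        # Intermediate range: quadratic piece joining the two regimes smoothly.
        return -(t * t - 2 * a * lam * t + lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return (a + 1) * lam * lam / 2


def hinge_loss(y, f):
    """Hinge loss for a label y in {-1, +1} and a decision value f."""
    return max(0.0, 1.0 - y * f)


def scad_svm_objective(X, y, b, w, lam, a=3.7):
    """Average hinge loss plus the summed SCAD penalty (hypothetical scaling)."""
    n = len(y)
    loss = 0.0
    for xi, yi in zip(X, y):
        f = b + sum(wj * xj for wj, xj in zip(w, xi))  # linear decision function
        loss += hinge_loss(yi, f)
    penalty = sum(scad_penalty(wj, lam, a) for wj in w)
    return loss / n + penalty
```

Because the penalty is flat beyond a·λ, large coefficients are not penalized more as they grow, while coefficients near zero are penalized like the lasso. This is the behavior that lets small estimates be thresholded to zero without heavily biasing the large ones.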

Availability: MATLAB codes are available upon request from the authors.

Supplementary information: http://www4.stat.ncsu.edu/hzhang/pub.html.






