Bioinformatics Advance Access originally published online on October 25, 2005
Bioinformatics 2006 22(1):88-95; doi:10.1093/bioinformatics/bti736

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Gene selection using support vector machines with non-convex penalty

Hao Helen Zhang 1,*, Jeongyoun Ahn 2, Xiaodong Lin 3 and Cheolwoo Park 4

1 Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
2 Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA
3 Department of Mathematical Sciences, University of Cincinnati, OH 45221, USA
4 Department of Statistics, University of Georgia, Athens, GA 30602, USA

*To whom correspondence should be addressed.

Motivation: With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. A current difficulty in interpreting microarray data is their inherently ‘high-dimensional, low sample size’ nature: the number of genes far exceeds the number of samples. Robust and accurate gene selection methods are therefore required to identify groups of genes that are differentially expressed across samples, e.g. between cancerous and normal cells. Successful gene selection will help to classify different cancer types, lead to a better understanding of genetic signatures in cancers and improve treatment strategies. Although gene selection and cancer classification are closely related problems, most existing approaches handle them separately by selecting genes prior to classification. We provide a unified procedure for simultaneous gene selection and cancer classification that achieves high accuracy in both.

Results: In this paper we develop a novel type of regularization in support vector machines (SVMs) to identify important genes for cancer classification. A special non-convex penalty, the smoothly clipped absolute deviation (SCAD) penalty, is added to the hinge loss function of the SVM. By systematically thresholding small coefficient estimates to zero, the new procedure eliminates redundant genes automatically and yields a compact and accurate classifier. A successive quadratic algorithm is proposed that converts the non-differentiable and non-convex optimization problem into easily solved systems of linear equations. The method is applied to two real datasets and produces very promising results.
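
As an illustration of the penalized objective described above, the following Python sketch evaluates a hinge loss combined with the smoothly clipped absolute deviation (SCAD) penalty. The abstract does not state the formula, so the sketch assumes the standard SCAD definition of Fan and Li (2001) with the conventional default a = 3.7. The function names, the averaging of the hinge loss over samples and the choice to leave the intercept unpenalized are assumptions made for this example and are not taken from the authors' MATLAB code.

```python
import numpy as np


def scad_penalty(w, lam, a=3.7):
    """Standard SCAD penalty (Fan and Li, 2001), evaluated elementwise on |w|.

    Assumed form (not quoted from the abstract):
      lam * |w|                                      if |w| <= lam
      -(|w|^2 - 2*a*lam*|w| + lam^2) / (2*(a - 1))   if lam < |w| <= a*lam
      (a + 1) * lam^2 / 2                            if |w| > a*lam
    """
    t = np.abs(np.asarray(w, dtype=float))
    linear = lam * t
    quadratic = -(t ** 2 - 2.0 * a * lam * t + lam ** 2) / (2.0 * (a - 1.0))
    constant = (a + 1.0) * lam ** 2 / 2.0
    return np.where(t <= lam, linear, np.where(t <= a * lam, quadratic, constant))


def scad_derivative(w, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |w|.

    Equals lam for small coefficients (an L1-like pull towards zero) and
    drops to 0 once |w| exceeds a*lam, so large coefficients are not
    shrunk any further.
    """
    t = np.abs(np.asarray(w, dtype=float))
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))


def scad_svm_objective(b, w, X, y, lam, a=3.7):
    """Mean hinge loss plus SCAD penalty on the weights (assumed formulation).

    X: (n_samples, n_genes) expression matrix; y: labels in {-1, +1};
    b: intercept (left unpenalized here, an assumption); w: gene weights.
    """
    margins = y * (X @ w + b)
    hinge = np.maximum(0.0, 1.0 - margins)
    return hinge.mean() + scad_penalty(w, lam, a).sum()
```

Under this definition the penalty grows linearly near zero, which is what drives small weights to exactly zero in the fitted classifier, and becomes flat beyond a*lam, so genes with large weights incur no additional shrinkage. Minimizing this objective is the non-differentiable, non-convex problem that the successive quadratic algorithm addresses; that algorithm is not reproduced here.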

Availability: MATLAB code is available upon request from the authors.

Contact: hzhang{at}stat.ncsu.edu

Supplementary information: http://www4.stat.ncsu.edu/~hzhang/research.html


Received on June 12, 2005; revised on October 20, 2005; accepted on October 20, 2005



