Bioinformatics Advance Access published online on December 20, 2006

Bioinformatics, doi:10.1093/bioinformatics/btl632
All versions of this article: 23/4/466 (most recent); btl632v1

© The Author (2006). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
Received May 23, 2006
Revised December 7, 2006
Accepted December 8, 2006

Article

Clustering Threshold Gradient Descent Regularization: with applications to microarray studies

Shuangge Ma 1 * and Jian Huang 2

1 Department of Epidemiology and Public Health, Yale University, New Haven, CT, USA
2 Department of Statistics, University of Iowa, Iowa City, IA, USA; Department of Actuarial Science, University of Iowa, Iowa City, IA, USA

* To whom correspondence should be addressed.
Shuangge Ma, E-mail: shuangge.ma{at}yale.edu


   Abstract

Motivation: An important goal of microarray studies is to discover genes that are associated with clinical outcomes such as disease status and patient survival. While a typical experiment surveys gene expressions on a global scale, there may be only a small number of genes that have significant influence on a clinical outcome. Moreover, expression data have cluster structures and the genes within a cluster have correlated expressions and coordinated functions, but the effects of individual genes in the same cluster may be different. Accordingly, we seek to build statistical models with the following properties. First, the model is sparse in the sense that only a subset of the parameter vector is non-zero. Second, the cluster structures of gene expressions are properly accounted for.

Results: For gene expression data without pathway information, we divide genes into clusters using commonly used methods such as K-means or hierarchical approaches. The optimal number of clusters is determined using the Gap statistic. We propose a Clustering Threshold Gradient Descent Regularization (CTGDR) method for simultaneous cluster selection and within-cluster gene selection. We apply this method to binary classification and censored survival analysis. Compared with the standard TGDR and other regularization methods, the CTGDR takes the cluster structure into account and carries out feature selection at both the cluster level and the within-cluster gene level. We demonstrate the CTGDR on two studies of cancer classification and two studies correlating survival of lymphoma patients with microarray expressions.
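
As a rough illustration of the two-level selection idea described in the Results, the following Python sketch applies a thresholded gradient step to logistic regression with features grouped into clusters. It is not the authors' R code and not necessarily their exact update rule: the function names (loglik_gradient, two_level_mask, fit_ctgdr_sketch), the thresholds tau_cluster and tau_gene, the use of the Euclidean norm of a cluster's gradient as the cluster-level score, and the step size are all assumptions made for illustration. The cluster assignment is taken as given (for example, from K-means).

```python
import math


def loglik_gradient(X, y, beta):
    """Gradient of the logistic log-likelihood with respect to beta."""
    grad = [0.0] * len(beta)
    for row, label in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        p = 1.0 / (1.0 + math.exp(-eta))
        for j, x in enumerate(row):
            grad[j] += (label - p) * x
    return grad


def two_level_mask(grad, clusters, tau_cluster, tau_gene):
    """Return a 0/1 mask marking which coefficients are updated.

    Assumed rule (illustrative only): a cluster is kept when the norm of its
    gradient is at least tau_cluster times the largest cluster norm; within a
    kept cluster, a gene is kept when its absolute gradient is at least
    tau_gene times the largest absolute gradient in that cluster.
    """
    norms = [math.sqrt(sum(grad[j] ** 2 for j in members))
             for members in clusters]
    max_norm = max(norms)
    mask = [0] * len(grad)
    if max_norm == 0.0:
        return mask  # flat gradient: nothing to update
    for members, norm in zip(clusters, norms):
        if norm < tau_cluster * max_norm:
            continue  # cluster-level selection
        max_abs = max(abs(grad[j]) for j in members)
        for j in members:
            if abs(grad[j]) >= tau_gene * max_abs:
                mask[j] = 1  # within-cluster gene selection
    return mask


def fit_ctgdr_sketch(X, y, clusters, tau_cluster=0.9, tau_gene=0.9,
                     step=0.01, n_iter=200):
    """Run a fixed number of thresholded gradient ascent steps from beta = 0."""
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        grad = loglik_gradient(X, y, beta)
        mask = two_level_mask(grad, clusters, tau_cluster, tau_gene)
        beta = [b + step * g * m for b, g, m in zip(beta, grad, mask)]
    return beta


if __name__ == "__main__":
    # Toy data: four "genes" in two clusters; only gene 0 carries signal.
    X = [[1, 0, 0, 0], [-1, 0, 0, 0], [1, 0, 0, 0], [-1, 0, 0, 0]]
    y = [1, 0, 1, 0]
    clusters = [[0, 1], [2, 3]]
    print(fit_ctgdr_sketch(X, y, clusters))
```

In this sketch, setting both thresholds to 0 updates every coefficient at each step (plain gradient ascent), whereas thresholds close to 1 move only the steepest clusters, and the steepest genes within them, away from zero. In practice the thresholds and the number of iterations would be tuning parameters, for instance chosen by cross-validation.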

Availability: R code is available upon request.


Associate Editor: Satoru Miyano


This article has been cited by other articles:


J. Huang, S. Ma, H. Xie, and C.-H. Zhang
A group bridge approach for variable selection
Biometrika, June 1, 2009; 96(2): 339 - 355.


