Bioinformatics Advance Access published online on August 16, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti631
1 Center for Cardiovascular Bioinformatics and Modeling, Whitaker Biomedical Engineering Institute, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA
* To whom correspondence should be addressed.
Motivation: Various studies have shown that cancer tissue samples can be successfully detected and classified by their gene expression patterns using machine learning approaches. One of the challenges in applying these techniques for classifying gene expression data is to extract accurate, readily interpretable rules providing biological insight as to how classification is performed. Current methods generate classifiers which are accurate but difficult to interpret. This is the trade-off between credibility and comprehensibility of the classifiers. Here, we introduce a new classifier in order to address these problems. It is referred to as k-TSP (k-Top Scoring Pairs) and is based on the concept of "relative expression reversals". This method generates simple and accurate decision rules that only involve a small number of gene-to-gene expression comparisons, thereby facilitating follow-up studies.
Results: In this study, we have compared our approach to other machine learning techniques for class prediction in 19 binary and multi-class gene expression data sets involving human cancers. The k-TSP classifier performs as well as PAM and SVM and outperforms other learning methods (decision trees, k-NN and naïve Bayes). Our approach is easy to interpret as the classifier involves only a small number of informative genes. For these reasons, we consider the k-TSP method to be a useful tool for cancer classification from microarray gene expression data.
Availability: The software and data sets are available at http://www.ccbm.jhu.edu.
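The abstract does not spell out how the gene-to-gene comparisons are chosen, so the following is only a minimal sketch of a k-TSP-style rule for the binary case, not the authors' software. It assumes the pair score commonly given for top scoring pairs, |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|, keeps the k best pairs that share no gene, and classifies by unweighted majority vote. The function names (`pair_scores`, `fit_ktsp`, `predict`), the fixed k, the tie-breaking by enumeration order, and the toy expression values are all assumptions made for illustration.

```python
from itertools import combinations


def pair_scores(samples, labels):
    """Score each gene pair (i, j) by how strongly the ordering
    x[i] < x[j] differs between class 0 and class 1."""
    n_genes = len(samples[0])
    counts = {0: labels.count(0), 1: labels.count(1)}
    scored = []
    for i, j in combinations(range(n_genes), 2):
        less = {0: 0, 1: 0}
        for x, y in zip(samples, labels):
            if x[i] < x[j]:
                less[y] += 1
        p0 = less[0] / counts[0]  # estimate of P(x_i < x_j | class 0)
        p1 = less[1] / counts[1]  # estimate of P(x_i < x_j | class 1)
        scored.append((abs(p0 - p1), i, j, p0 > p1))
    return scored


def fit_ktsp(samples, labels, k=3):
    """Greedily keep the k highest-scoring pairs that share no gene.
    Each rule (i, j, cls) reads: if x[i] < x[j], vote for cls."""
    ranked = sorted(pair_scores(samples, labels), key=lambda s: -s[0])
    used, rules = set(), []
    for score, i, j, favours_class0 in ranked:
        if score == 0 or i in used or j in used:
            continue
        rules.append((i, j, 0 if favours_class0 else 1))
        used.update((i, j))
        if len(rules) == k:
            break
    return rules


def predict(rules, x):
    """Unweighted majority vote over the pairwise comparisons."""
    votes = [cls if x[i] < x[j] else 1 - cls for i, j, cls in rules]
    return 1 if 2 * sum(votes) > len(votes) else 0


# Invented toy data: 6 genes, 3 samples per class (not from the paper).
samples = [
    [1, 5, 9, 2, 3, 4], [2, 6, 8, 3, 1, 7], [1, 4, 7, 2, 2, 5],  # class 0
    [6, 2, 1, 8, 9, 3], [7, 3, 2, 9, 8, 1], [5, 1, 3, 7, 6, 2],  # class 1
]
labels = [0, 0, 0, 1, 1, 1]

rules = fit_ktsp(samples, labels, k=3)
print(rules)  # [(0, 1, 0), (2, 3, 1), (4, 5, 0)]
print(predict(rules, [9, 1, 2, 7, 5, 6]))  # 1 (two of three pairs vote class 1)
```

Because each rule compares two expression values within the same sample, the decision depends only on their relative order, which is what makes the resulting classifier easy to read as a short list of "gene A lower than gene B" statements. An odd k avoids tied votes; in practice k would be selected from the data rather than fixed as it is here.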
Received May 9, 2005
Revised July 28, 2005
Accepted August 14, 2005
Article
Simple decision rules for classifying human cancers from gene expression profiles
2 Center for Cardiovascular Bioinformatics and Modeling, Whitaker Biomedical Engineering Institute, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA; Department of Applied Mathematics and Statistics, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA
Aik Choon Tan, E-mail: actan{at}jhu.edu