Bioinformatics Vol. 19 Suppl. 2 2003
pages ii93-ii102
© 2003 Oxford University Press
Discovery of significant rules for classifying cancer diagnosis data
Institute for Infocomm Research, 21, Heng Mui Keng Terrace, 119613, Singapore
Received on March 17, 2003; accepted on June 9, 2003
Methods and Results: We introduce a new method to discover many diversified and significant rules from high-dimensional profiling data. We also propose to aggregate the discriminating power of these rules for reliable predictions. The discovered rules are found to contain low-ranked features, and these features are sometimes necessary for classifiers to achieve perfect accuracy. The use of low-ranked but essential features in our method is in contrast to the prevailing practice of using an ad hoc number of only top-ranked features. On a wide range of data sets, our method displayed highly competitive accuracy compared to the best performance of other kinds of classification models. In addition to accuracy, our method also provides comprehensible rules that help elucidate the translation between raw data and useful knowledge.
Supplementary information: http://sdmc.i2r.a-star.edu.sg/GEDatasets/supplementaldata/eccb2003/ECCB2003.html.
Contact: jinyan{at}i2r.a-star.edu.sg
* To whom correspondence should be addressed.
This article has been cited by other articles:
P. Geurts, M. Fillet, D. de Seny, M.-A. Meuwis, M. Malaise, M.-P. Merville, and L. Wehenkel. Proteomic mass spectra classification using decision tree based ensemble methods. Bioinformatics, July 15, 2005; 21(14): 3138-3145.
