Bioinformatics Advance Access originally published online on November 28, 2007
Bioinformatics 2008 24(1):110-117; doi:10.1093/bioinformatics/btm486
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Monte Carlo feature selection for supervised classification

Michal Draminski 1, Alvaro Rada-Iglesias 2, Stefan Enroth 3, Claes Wadelius 2, Jacek Koronacki 1,† and Jan Komorowski 3,4,*,†

1Institute of Computer Science, Polish Academy of Science, Ordona 21, PL-01-237 Warsaw, Poland, 2Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, 3The Linnaeus Centre for Bioinformatics, Uppsala University and The Swedish University for Agricultural Sciences, Box 758, SE-751 24 Uppsala, Sweden and 4Interdisciplinary Centre for Mathematical and Computer Modelling, Warsaw University, Poland

*To whom correspondence should be addressed.


Abstract

Motivation: Pre-selection of informative features for supervised classification is a crucial, albeit delicate, task. Ideally, feature selection should identify the features that contribute most to the classification task per se, and that should therefore be used by any classifier subsequently employed to produce classification rules. In this article, a conceptually simple but computer-intensive approach to this task is proposed. The reliability of the approach rests on the repeated construction of tree classifiers on many training sets drawn at random from the original sample set, with the samples in each training set described by only a fraction of all the observed features.
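The resampling idea in the Motivation paragraph can be illustrated with a minimal Python sketch. It is not the authors' prototype: a one-split decision stump stands in for the tree classifier, and crediting the chosen feature with its held-out accuracy is an illustrative scoring choice, not the importance measure defined in the article. All function names, parameters and default values (`fit_stump`, `mc_feature_ranking`, `n_subsets`, `subset_size`, `n_splits`, `train_frac`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_stump(X, y, features):
    """Fit a one-split tree restricted to `features`.

    Returns (errors, feature, threshold, left_label, right_label) for the
    split with the fewest training errors, or None if no split exists.
    """
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = Counter(lab for row, lab in zip(X, y) if row[f] <= thr)
            right = Counter(lab for row, lab in zip(X, y) if row[f] > thr)
            left_lab, left_hits = left.most_common(1)[0]
            right_lab, right_hits = right.most_common(1)[0]
            errors = len(y) - left_hits - right_hits
            if best is None or errors < best[0]:
                best = (errors, f, thr, left_lab, right_lab)
    return best


def mc_feature_ranking(X, y, n_subsets=200, subset_size=2,
                       n_splits=5, train_frac=0.66, seed=0):
    """Rank features by accumulated held-out accuracy (assumed scoring rule)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    scores = defaultdict(float)
    for _ in range(n_subsets):
        # Each round sees only a random fraction of the features.
        features = rng.sample(range(d), subset_size)
        for _ in range(n_splits):
            # Random training/test split of the original samples.
            idx = list(range(n))
            rng.shuffle(idx)
            cut = int(train_frac * n)
            train, test = idx[:cut], idx[cut:]
            stump = fit_stump([X[i] for i in train],
                              [y[i] for i in train], features)
            if stump is None:
                continue
            _, f, thr, left_lab, right_lab = stump
            hits = sum((left_lab if X[i][f] <= thr else right_lab) == y[i]
                       for i in test)
            # Credit the feature used by the stump with its test accuracy.
            scores[f] += hits / len(test)
    # Best-scoring feature first.
    return sorted(range(d), key=lambda f: -scores[f])
```

Calling `mc_feature_ranking(X, y)` on a list of numeric feature rows `X` and class labels `y` returns feature indices ordered from most to least often useful under this simplified score. A deeper tree would credit every feature it splits on, not just one.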

Results: The resulting ranking of features may then be used to advantage for classification via a classifier of any type. The approach was validated using the Golub et al. leukemia data and the Alizadeh et al. lymphoma data. Not surprisingly, we obtained a list of genes that differs significantly from those obtained with other methods. Biological interpretation of the genes selected by our method showed that several of them are involved in precursors to different types of leukemia and lymphoma, rather than being genes that are common to several forms of cancer, as is the case for the other methods.
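To make the first sentence of the Results concrete, the following sketch shows one way a feature ranking could feed an arbitrary downstream classifier. It assumes the ranking is given as a list of column indices, best first. The nearest-centroid classifier is only a stand-in for "a classifier of any type", and the helper names (`select_top_k`, `nearest_centroid_fit`, `nearest_centroid_predict`) are hypothetical.

```python
import math


def select_top_k(X, ranking, k):
    """Keep only the k best-ranked feature columns, in ranking order."""
    keep = ranking[:k]
    return [[row[f] for f in keep] for row in X]


def nearest_centroid_fit(X, y):
    """Compute one mean vector (centroid) per class label."""
    sums, counts = {}, {}
    for row, lab in zip(X, y):
        acc = sums.setdefault(lab, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], row))
```

New samples need the same column selection before prediction, for example `nearest_centroid_predict(centroids, select_top_k([sample], ranking, k)[0])`.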

Availability: Prototype available upon request.

Contact: jan.komorowski@lcb.uu.se

Associate Editor: Joaquin Dopazo

†The authors wish it to be known that, in their opinion, the last two authors should be regarded as joint First Authors.


Received on December 13, 2006; revised on August 28, 2007; accepted on September 25, 2007
