Bioinformatics Advance Access published online on December 16, 2008

Bioinformatics, doi:10.1093/bioinformatics/btn644

© The Author (2008). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

A Genetic Programming Based Approach to the Classification of Multiclass Microarray Datasets

De-Shuang Huang 1,†, Kun-Hong Liu 2,3,* and Chun-Gui Xu 1,4,†

1Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, P.O. Box 1130, Hefei, Anhui, 230031, China.
2School of Software, Xiamen University, Xiamen, Fujian, 361005, China.
3Department of Automation, University of Science and Technology of China, Hefei, Anhui, 230026, China.
4School of Life Science, University of Science and Technology of China, Hefei, Anhui, 230026, China.

*To whom correspondence should be addressed. Dr. Kun-Hong Liu, E-mail: lkhqz@163.com, khliu1977@gmail.com


   Abstract

Motivation: Feature selection approaches have been widely applied to deal with the small sample size problem in the analysis of microarray datasets. For the multiclass problem, the methods proposed so far are based on the idea of selecting a gene subset to distinguish all classes. However, it can be more effective to solve a multiclass problem by splitting it into a set of two-class problems and solving each of them with its own classification system.

Results: We propose a genetic programming (GP) based approach to analyze multiclass microarray datasets. Unlike in traditional GP, each individual proposed in this paper consists of a set of small-scale ensembles, called sub-ensembles (SEs). Each SE consists of a set of trees. In application, a multiclass problem is divided into a set of two-class problems, each of which is first tackled by an SE. The SEs tackling the respective two-class problems are then combined to construct a GP individual, so each individual can deal with a multiclass problem directly. Effective methods are proposed to solve the problems arising in the fusion of SEs, and a greedy algorithm is designed to maintain high diversity within SEs. This GP is tested on five datasets. The results show that the proposed method effectively performs both the feature selection and classification tasks.
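
The structure of an individual described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the multiclass problem is split or how SE outputs are fused, so the sketch assumes a one-vs-one split and a simple count of pairwise wins. The names `se_predict` and `individual_predict`, the hand-written "trees", and the toy two-feature samples are all hypothetical; the GP evolution itself and the greedy diversity step are not shown.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: a "tree" is any function mapping a sample to a real score.
# A score >= 0 votes for the first class of the pair, otherwise the second.
Tree = Callable[[Sequence[float]], float]
SubEnsemble = List[Tree]
Pair = Tuple[int, int]


def se_predict(se: SubEnsemble, pair: Pair, x: Sequence[float]) -> int:
    """Majority vote of the trees in one SE on one two-class problem.

    Ties (possible only with an even number of trees) go to pair[0].
    """
    votes_first = sum(1 for tree in se if tree(x) >= 0)
    return pair[0] if 2 * votes_first >= len(se) else pair[1]


def individual_predict(individual: Dict[Pair, SubEnsemble],
                       x: Sequence[float]) -> int:
    """Fuse SE decisions by counting pairwise wins (assumed fusion rule).

    Ties between classes are broken in favour of the smallest label.
    """
    wins = Counter(se_predict(se, pair, x) for pair, se in individual.items())
    return max(sorted(wins), key=lambda label: wins[label])


# Toy 3-class problem: one SE of three hand-written trees per class pair.
pairs = list(combinations(range(3), 2))  # [(0, 1), (0, 2), (1, 2)]
sub_ensembles: List[SubEnsemble] = [
    # (0, 1): class 1 has a large first feature
    [lambda x: 1.0 - x[0], lambda x: 1.2 - x[0],
     lambda x: 0.9 - x[0] + 0.1 * x[1]],
    # (0, 2): class 2 has a large second feature
    [lambda x: 1.0 - x[1], lambda x: 1.1 - x[1], lambda x: 0.9 - x[1]],
    # (1, 2): class 1 if the first feature dominates the second
    [lambda x: x[0] - x[1], lambda x: x[0] - x[1] + 0.1,
     lambda x: x[0] - 1.0],
]
individual = dict(zip(pairs, sub_ensembles))

for sample in [(0.2, 0.1), (2.0, 0.1), (0.1, 2.0)]:
    print(sample, "->", individual_predict(individual, sample))
```

In this sketch each SE only ever sees its own two-class problem, and the individual as a whole returns a multiclass label, which mirrors the division of labour described in the abstract. In a GP setting the trees would be evolved expressions over gene expression values rather than fixed lambdas.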

†These authors contributed equally to this work.

Associate Editor: Prof. David Rocke


Received on March 23, 2008; revised on November 14, 2008; accepted on December 11, 2008
