Bioinformatics Advance Access originally published online on April 29, 2004
Bioinformatics 2004 20(16):2545-2552; doi:10.1093/bioinformatics/bth281
Bioinformatics vol. 20 issue 16 © Oxford University Press 2004; all rights reserved.
Class discovery and classification of tumor samples using mixture modeling of gene expression data: a unified approach
Department of Statistics, Ohio State University, 1958 Neil Avenue, Columbus, OH 43210, USA
Received on September 11, 2003; revised on March 4, 2004; accepted on April 19, 2004
Advance Access Publication April 29, 2004
Motivation: DNA microarray technology is increasingly used in cancer research. In the literature, the discovery of putative classes and the classification of samples into known classes based on gene expression data have largely been treated as separate problems. This paper offers a unified approach to class discovery and classification, which we believe is more appropriate and more widely applicable in practical situations.
Results: We model the gene expression profile of a tumor sample as arising from a finite mixture distribution, with each component characterizing the gene expression levels in one class. The proposed method was applied to a leukemia dataset with good results: with appropriate choices of genes and preprocessing method, the number of leukemia types and subtypes was correctly inferred, and all tumor samples were correctly classified into their respective type or subtype. The method was further evaluated on other variants of the leukemia data and on a colon dataset.
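The idea of inferring the number of classes and assigning samples with a single mixture model can be illustrated with a minimal sketch. It is not the authors' program (which is available at the supplementary URL): it assumes Gaussian components with diagonal covariance, EM fitting with a simple deterministic start, and BIC for choosing the number of components, none of which is stated in this abstract. All function names and the synthetic toy data are hypothetical.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density at x of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def e_step(samples, weights, means, variances):
    """Return (responsibilities per sample, total log-likelihood)."""
    resp, loglik = [], 0.0
    for x in samples:
        logs = [math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # log-sum-exp for numerical stability
        total = top + math.log(sum(math.exp(l - top) for l in logs))
        loglik += total
        resp.append([math.exp(l - total) for l in logs])
    return resp, loglik


def fit_mixture(samples, k, n_iter=50, var_floor=0.05):
    """Fit a k-component mixture by EM; return (BIC, hard class labels)."""
    n, d = len(samples), len(samples[0])
    # Assumption: evenly spaced samples serve as initial component means.
    means = [list(samples[j * n // k]) for j in range(k)]
    variances = [[1.0] * d for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp, _ = e_step(samples, weights, means, variances)
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-9  # guard against empty components
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[i] for r, x in zip(resp, samples)) / nj
                        for i in range(d)]
            variances[j] = [max(var_floor,
                                sum(r[j] * (x[i] - means[j][i]) ** 2
                                    for r, x in zip(resp, samples)) / nj)
                            for i in range(d)]
    resp, loglik = e_step(samples, weights, means, variances)
    n_params = (k - 1) + 2 * k * d  # mixing weights + means + variances
    bic = -2.0 * loglik + n_params * math.log(n)
    labels = [r.index(max(r)) for r in resp]  # most probable component
    return bic, labels


def choose_num_classes(samples, max_k=4):
    """Class discovery: pick the component count with the lowest BIC."""
    fits = {k: fit_mixture(samples, k) for k in range(1, max_k + 1)}
    best_k = min(fits, key=lambda k: fits[k][0])
    return best_k, fits[best_k][1], {k: f[0] for k, f in fits.items()}


def make_toy_data(n_per_class=30, n_genes=2, shift=6.0, seed=1):
    """Synthetic stand-in for expression profiles: two shifted Gaussian classes."""
    rng = random.Random(seed)
    return [[rng.gauss(c * shift, 1.0) for _ in range(n_genes)]
            for c in (0, 1) for _ in range(n_per_class)]


if __name__ == "__main__":
    best_k, labels, bics = choose_num_classes(make_toy_data())
    print("BIC by number of components:", bics)
    print("selected number of classes:", best_k)
```

In this sketch the same fitted model serves both purposes: the BIC comparison addresses class discovery, and the component with the highest posterior probability gives each sample's class.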
Supplementary information: The program implementing the method and additional details and figures are at http://www.stat.ohio-state.edu/~statgen/PAPERS/DNC-MIX.html.
Contact: shili{at}stat.ohio-state.edu
* To whom correspondence should be addressed.
This article has been cited by other articles:
W. Pan. Incorporating gene functions as priors in model-based clustering of microarray gene expression data. Bioinformatics, April 1, 2006; 22(7): 795-801.