Bioinformatics Advance Access published online on May 8, 2006
Bioinformatics, doi:10.1093/bioinformatics/btl174
1 Cancer Genomics Program, Department of Oncology, University of Cambridge, Hutchison-MRC Research Centre, Hills Road, Cambridge CB2 2XZ, UK
* To whom correspondence should be addressed.
Motivation: Elucidating the molecular taxonomy of cancers and finding biological and clinical markers from microarray experiments are problematic because of the large number of variables being measured. Feature selection methods that can identify relevant classifiers, or that can remove likely false positives prior to supervised analysis, are therefore desirable.
Results: We present a novel feature selection procedure based on a mixture model and a non-Gaussianity measure of a gene's expression profile. The method can be used to find genes that define either small outlier subgroups or major subdivisions, depending on the sign of kurtosis. The method can also be used as a filtering step, prior to supervised analysis, in order to reduce the false discovery rate. We validate our methodology using six independent data sets by rediscovering major classifiers in ER negative and ER positive breast cancer and in prostate cancer. Furthermore, our method finds two novel subtypes within the basal subgroup of ER negative breast tumours, associated with apoptotic and immune response functions respectively, and with statistically different clinical outcome.
Availability: An R-function pack that implements the methods used here has been added to vabayelMix, available from CRAN (www.cran.r-project.org).
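The role of the sign of kurtosis can be illustrated with a minimal Python sketch. It is not the authors' R implementation in vabayelMix and it omits the mixture-model step entirely; the function names `excess_kurtosis` and `label_profile` and the toy profiles are hypothetical. It relies only on the standard property that a profile split into two groups of similar size has negative excess kurtosis, whereas a profile with a small group of extreme values has positive excess kurtosis.

```python
import math


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardized moment minus 3.

    Returns NaN for a constant profile, where kurtosis is undefined.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        return float("nan")
    return m4 / (m2 ** 2) - 3.0


def label_profile(values):
    """Label one gene's expression profile by the sign of its excess kurtosis.

    Hypothetical helper for illustration only: the labels are assumptions
    of this sketch, not terminology taken from the paper.
    """
    k = excess_kurtosis(values)
    if math.isnan(k):
        return "undefined"
    if k < 0:
        return "major-subdivision candidate"
    if k > 0:
        return "outlier-subgroup candidate"
    return "no signal"


if __name__ == "__main__":
    # Toy profiles (made-up values): a balanced two-group split and
    # a single extreme sample among otherwise identical values.
    balanced_split = [0.0] * 10 + [1.0] * 10
    small_outlier_group = [0.0] * 19 + [10.0]
    for name, profile in [("balanced", balanced_split),
                          ("outlier", small_outlier_group)]:
        print(name, excess_kurtosis(profile), label_profile(profile))
```

In practice a gene would be labelled only after a clustering step has confirmed that its profile is genuinely multimodal; the sign test alone cannot distinguish a true subgroup from a heavy-tailed or flat unimodal distribution.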
Received April 1, 2006
Revised April 25, 2006
Accepted April 30, 2006
Article
PACK: Profile Analysis using Clustering and Kurtosis to find molecular classifiers in cancer
Andrew E. Teschendorff 1,*, Ali Naderi 1, Nuno L. Barbosa-Morais 2 and Carlos Caldas 1
2 Cancer Genomics Program, Department of Oncology, University of Cambridge, Hutchison-MRC Research Centre, Hills Road, Cambridge CB2 2XZ, UK; Institute of Molecular Medicine, Faculty of Medicine, University of Lisbon, 1649-028 Lisbon, Portugal
Andrew E. Teschendorff, E-mail: aet21{at}cam.ac.uk
Associate Editor: Martin Bishop
This article has been cited by other articles:
L. Li, A. Chaudhuri, J. Chant, and Z. Tang. PADGE: analysis of heterogeneous patterns of differential gene expression. Physiol Genomics, December 19, 2007; 32(1): 154-159.
