Improving molecular cancer class discovery through sparse non-negative matrix factorization
Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
*To whom correspondence should be addressed.
Motivation: Identifying different cancer classes or subclasses with similar morphological appearances is a challenging problem with important implications for cancer diagnosis and treatment. Clustering based on gene-expression data has been shown to be a powerful method for cancer class discovery. Non-negative matrix factorization is one such method and has been shown to have advantages over other clustering techniques, such as hierarchical clustering or self-organizing maps. In this paper, we investigate the benefit of explicitly enforcing sparseness in the factorization process.
Results: We report an improved unsupervised method for cancer classification from gene-expression profiles via sparse non-negative matrix factorization. We demonstrate the improvement by direct comparison with classic non-negative matrix factorization on three well-studied datasets. In addition, we illustrate how to identify a small subset of co-expressed genes that may be directly involved in cancer.
Contact: g1m1c1{at}receptor.med.harvard.edu, ygao{at}receptor.med.harvard.edu
Supplementary information: http://arep.med.harvard.edu/snmf/supplement.htm
Received on April 7, 2005; revised on July 27, 2005; accepted on August 30, 2005
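To make the idea in the abstract concrete, the following is a minimal sketch of clustering samples with a sparsity-penalized non-negative matrix factorization. It is an illustration under stated assumptions, not necessarily the formulation used in the paper: it assumes a Euclidean objective with an L1 penalty on the coefficient matrix and multiplicative updates, and the function name `sparse_nmf`, the penalty weight `beta`, and the synthetic matrix `V` are all hypothetical.

```python
import numpy as np

def sparse_nmf(V, k, beta=0.1, n_iter=500, seed=0, eps=1e-9):
    """Approximate V (genes x samples, non-negative) as W @ H.

    Assumed objective: 0.5 * ||V - W H||^2 + beta * sum(H).
    This is a generic L1-penalized variant chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # H update: beta in the denominator shrinks H toward zero (sparsity).
        H *= (W.T @ V) / (W.T @ W @ H + beta + eps)
        # W update: standard multiplicative rule for the Euclidean objective.
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Rescale so each column of W has unit norm; W @ H is unchanged.
        # This keeps the penalty on H from being evaded by inflating W.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H

# Synthetic data (an assumption for the demo): 20 genes, 10 samples,
# two sample groups that each over-express a different block of genes.
rng = np.random.default_rng(1)
V = 0.1 * rng.random((20, 10))
V[:10, :5] += 5.0
V[10:, 5:] += 5.0

W, H = sparse_nmf(V, k=2)

# Each sample is assigned to the factor with the largest coefficient.
labels = H.argmax(axis=0)

# Genes with the largest loadings on a factor form a candidate
# co-expressed subset for the corresponding sample group.
top_genes = np.argsort(W[:, labels[0]])[::-1][:5]
```

In this sketch each column of `H` gives a sample's weights over the `k` factors, so the largest entry acts as a cluster label, while the columns of `W` rank genes by their contribution to each factor. Larger values of `beta` push more entries of `H` toward zero at the cost of reconstruction accuracy.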