Bioinformatics Vol. 19 no. 10 2003
Pages 1259-1266
© 2003 Oxford University Press
Unsupervised feature selection via two-way ordering in gene expression analysis
NERSC Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720, USA
Received on August 7, 2002; revised on October 31, 2002 and January 17, 2003; accepted on January 22, 2003
Motivation: Selecting the genes most relevant and informative for certain phenotypes is an important aspect of gene expression analysis. Most current methods select genes based on known phenotype information. However, certain sets of genes may correspond to new phenotypes that are as yet unknown, so it is important to develop effective selection methods that can discover them without using any prior phenotype information.
Results: We propose and study a new method that selects relevant genes based on their similarity information only. The method relies on a mechanism for discarding irrelevant genes. A two-way ordering of gene expression data can force irrelevant genes towards the middle of the ordering, where they can be discarded. Mechanisms based on variance and principal component analysis are also studied. When applied to expression profiles of colon cancer and leukemia, the unsupervised method outperforms the baseline algorithm that simply uses all genes, and the genes it selects are close to those selected using supervised methods.
Supplement: More results and software are online: http://www.nersc.gov/~cding/2way
Contact: chqding{at}lbl.gov
* To whom correspondence should be addressed.
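The discarding mechanisms named in the Results can be illustrated with a minimal sketch. The abstract does not give the details of the two-way ordering, so the code below makes two loudly stated assumptions: the variance mechanism is read as "keep the k highest-variance genes", and the ordering is approximated by sorting genes on their score along the leading singular vector of the row-centered expression matrix, then keeping both ends and dropping the middle. The function names `select_by_variance` and `select_by_ordering`, the parameter `k`, and the genes-by-samples layout of `X` are all hypothetical; this is not the authors' algorithm or software.

```python
import numpy as np


def select_by_variance(X, k):
    """Return sorted indices of the k genes (rows of X) with the largest variance.

    Assumption: X has shape (n_genes, n_samples).
    """
    variances = X.var(axis=1)
    top = np.argsort(variances)[::-1][:k]  # highest variance first
    return np.sort(top)


def select_by_ordering(X, k):
    """Order genes along one axis and discard those in the middle.

    Assumption: the ordering is the gene score on the leading left singular
    vector of the row-centered matrix, a stand-in for the paper's two-way
    ordering. Genes with no consistent pattern score near zero and therefore
    land in the middle of the sorted list.
    """
    Xc = X - X.mean(axis=1, keepdims=True)  # center each gene across samples
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    order = np.argsort(U[:, 0])  # genes sorted by leading-component score
    n = len(order)
    low = k // 2  # number of genes kept from the low end
    high = k - low  # number of genes kept from the high end
    keep = np.concatenate([order[:low], order[n - high:]])
    return np.sort(keep)


# Synthetic illustration: 6 informative genes among 20, two sample groups.
rng = np.random.default_rng(0)
X_demo = rng.normal(0.0, 0.1, size=(20, 10))
X_demo[0:3, :5] += 4.0  # genes 0-2 are high in the first sample group
X_demo[3:6, 5:] += 4.0  # genes 3-5 are high in the second sample group
print(select_by_variance(X_demo, 6))
print(select_by_ordering(X_demo, 6))
```

In this toy setting both functions are expected to recover the six planted genes, because the noise genes have low variance and near-zero scores on the leading component. On real expression data the two mechanisms can disagree, which is presumably why the paper studies them separately.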