Bioinformatics Advance Access published online on February 5, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth035
Bioinformatics © Oxford University Press 2004; all rights reserved
1 IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598
* To whom correspondence should be addressed. E-mail: gustavo{at}us.ibm.com.
Motivation: Despite the growing literature devoted to finding differentially expressed genes in assays probing different tissue types, little attention has been paid to the combinatorial nature of feature selection inherent in large, high-dimensional gene expression data sets. New, flexible data analysis approaches capable of searching for relevant subgroups of genes and experiments are needed to understand multivariate associations of gene expression patterns with observed phenotypes.

Results: We present in detail a deterministic algorithm to discover patterns of multivariate gene associations in gene expression data. The patterns discovered are differential with respect to a control data set. The algorithm is exhaustive and efficient, reporting all existing patterns that fit a given input parameter set while avoiding enumeration of the entire pattern space. The value of the pattern discovery approach is demonstrated by finding a set of genes that differentiate between two types of lymphoma. Moreover, these genes are found to behave consistently in an independent data set produced in a different lab using different arrays, thus validating the genes selected using our algorithm. We show that the genes deemed significant in terms of their multivariate statistics would be missed by other methods.

Availability: Our set of pattern discovery algorithms, including a user interface, is distributed as a package called Genes@Work. This package is freely available to non-commercial users and can be downloaded from our website (http://www.research.ibm.com/FunGen).
Revised October 30, 2003
Accepted October 31, 2003
Genes@Work: an efficient algorithm for pattern discovery and multivariate feature selection in gene expression data






