Bioinformatics Vol. 19 no. 1 2003
Pages 79-86
© 2003 Oxford University Press

Mining gene expression databases for association rules

Chad Creighton 1,* and Samir Hanash 2

1 Bioinformatics Program
2 Pediatrics and Communicable Diseases, University of Michigan, Ann Arbor, MI 48109, USA

Received on April 19, 2002; revised on July 1, 2002; accepted on July 10, 2002

Motivation: Global gene expression profiling, both at the transcript level and at the protein level, can be a valuable tool in the understanding of genes, biological networks, and cellular states. As larger and larger gene expression data sets become available, data mining techniques can be applied to identify patterns of interest in the data. Association rules, used widely in the area of market basket analysis, can be applied to the analysis of expression data as well. Association rules can reveal biologically relevant associations between different genes or between environmental effects and gene expression. An association rule has the form LHS ⇒ RHS, where LHS and RHS are disjoint sets of items, the RHS set being likely to occur whenever the LHS set occurs. Items in gene expression data can include genes that are highly expressed or repressed, as well as relevant facts describing the cellular environment of the genes (e.g. the diagnosis of a tumor sample from which a profile was obtained).
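
To make the rule form concrete, the following is a minimal sketch in Python, not the authors' implementation. It treats each expression profile as a set of items and scores a candidate rule with support and confidence, the measures customarily used in market basket analysis; the abstract itself does not name them. The item names (`geneA_up`, `geneC_down`) and the four toy profiles are hypothetical.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy data: each profile is the set of items observed in one
# experiment. The item names are illustrative and not taken from the paper.
profiles: List[FrozenSet[str]] = [
    frozenset({"geneA_up", "geneB_up", "geneC_down"}),
    frozenset({"geneA_up", "geneB_up"}),
    frozenset({"geneA_up", "geneC_down"}),
    frozenset({"geneB_up"}),
]


def support(itemset: Iterable[str], profiles: List[FrozenSet[str]]) -> float:
    """Fraction of profiles that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= p for p in profiles) / len(profiles)


def confidence(lhs: Iterable[str], rhs: Iterable[str],
               profiles: List[FrozenSet[str]]) -> float:
    """Among profiles containing LHS, the fraction that also contain RHS."""
    lhs_set, rhs_set = frozenset(lhs), frozenset(rhs)
    if lhs_set & rhs_set:
        raise ValueError("LHS and RHS must be disjoint")
    lhs_count = sum(lhs_set <= p for p in profiles)
    if lhs_count == 0:
        return 0.0  # rule never applies in this data
    both_count = sum((lhs_set | rhs_set) <= p for p in profiles)
    return both_count / lhs_count


# Score the candidate rule {geneA_up} => {geneB_up} on the toy profiles.
rule_support = support({"geneA_up", "geneB_up"}, profiles)
rule_confidence = confidence({"geneA_up"}, {"geneB_up"}, profiles)
```

A rule would typically be reported only when both values exceed thresholds chosen by the analyst; the thresholds used in the paper are not given in this abstract.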

Results: We demonstrate an algorithm for efficiently mining association rules from gene expression data, using the data set from Hughes et al. (2000, Cell, 102, 109–126) of 300 expression profiles for yeast. Using the algorithm, we find numerous rules in the data. A cursory analysis of some of these rules reveals numerous associations between certain genes, many of which make sense biologically, while others suggest new hypotheses that may warrant further investigation. In a data set derived from the yeast data set, but with the expression values for each transcript randomly shifted with respect to the experiments, no rules were found, indicating that nearly all of the rules mined from the actual data set are unlikely to have occurred by chance.
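
The randomized control described above can be sketched as follows. This is one possible reading, assuming that "randomly shifted" means each transcript's values are independently permuted across the experiments; the abstract does not specify the exact procedure. The function name `shuffle_per_transcript` and the toy matrix are hypothetical.

```python
import random
from typing import Dict, List

# Hypothetical toy matrix: transcript name -> expression value per experiment.
expression: Dict[str, List[float]] = {
    "geneA": [2.1, -0.3, 0.0, 1.8],
    "geneB": [1.9, 0.1, -0.2, 2.2],
    "geneC": [-1.5, 0.4, 0.0, -1.7],
}


def shuffle_per_transcript(matrix: Dict[str, List[float]],
                           seed: int = 0) -> Dict[str, List[float]]:
    """Return a copy in which each transcript's values are permuted
    independently across experiments (assumed reading of the control)."""
    rng = random.Random(seed)
    shuffled: Dict[str, List[float]] = {}
    for transcript, values in matrix.items():
        permuted = list(values)  # copy so the input matrix is left untouched
        rng.shuffle(permuted)
        shuffled[transcript] = permuted
    return shuffled


control = shuffle_per_transcript(expression, seed=1)
```

Each transcript keeps its own distribution of values, but any co-occurrence between transcripts within an experiment is broken. Running the same rule mining on `control` and on the original matrix gives a rough sense of how many rules would arise by chance.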

Availability: An implementation of the algorithm using Microsoft SQL Server with Access 2000 is available at http://dot.ped.med.umich.edu:2000/pub/assoc_rules/assoc_rules.zip. Our results from mining the yeast data set are available at http://dot.ped.med.umich.edu:2000/pub/assoc_rules/yeast_results.zip.

Contact: ccreight{at}umich.edu

* To whom correspondence should be addressed.




