Bioinformatics Vol. 19 no. 4 2003
Pages 449-458
© 2003 Oxford University Press
An information theoretic approach for analyzing temporal patterns of gene expression
1 Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA 16802
2 Department of Pharmaceutical Sciences, State University of New York at Buffalo, Buffalo, NY 14260-1200, USA
Received on October 11, 2001; revised on August 21, 2002; accepted on October 1, 2002
Motivation: Arrays allow measurements of the expression levels of thousands of mRNAs to be made simultaneously. The resulting data sets are information rich but require extensive mining to enhance their usefulness. Information theoretic methods are capable of assessing similarities and dissimilarities between data distributions and may be suited to the analysis of gene expression experiments. The purpose of this study was to investigate information theoretic data mining approaches to discover temporal patterns of gene expression from array-derived gene expression data.
Results: The Kullback-Leibler (KL) divergence, an information-theoretic distance that measures the relative dissimilarity between two data distribution profiles, was used in conjunction with an unsupervised self-organizing map algorithm. Two published, array-derived gene expression data sets were analyzed. The patterns obtained with the KL clustering method were found to be superior to those obtained with the hierarchical clustering algorithm using the Pearson correlation distance measure. The biological significance of the results was also examined.
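As a rough illustration of the distance measure named above, the following Python sketch computes a Kullback-Leibler divergence between two temporal expression profiles. It is not the authors' implementation (their programs were written in ANSI C and Matlab). The normalization of each profile to sum to 1, the small `eps` smoothing term, the use of base-2 logarithms, and the symmetrized form are all assumptions made here for illustration, and the function names are hypothetical.

```python
import math


def to_distribution(profile, eps=1e-12):
    """Scale a non-negative expression profile so its values sum to 1.

    Assumption: eps is added to every time point so that zero
    expression values do not produce log(0) or division by zero.
    """
    if any(x < 0 for x in profile):
        raise ValueError("expression levels must be non-negative")
    shifted = [x + eps for x in profile]
    total = sum(shifted)
    return [x / total for x in shifted]


def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log2(p_i / q_i), measured in bits."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same number of time points")
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(profile_a, profile_b):
    """Average of D(p || q) and D(q || p).

    Assumption: the plain KL divergence is asymmetric, so this sketch
    symmetrizes it to obtain a value usable as a clustering distance.
    """
    p = to_distribution(profile_a)
    q = to_distribution(profile_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


if __name__ == "__main__":
    rising = [1.0, 2.0, 4.0, 8.0]     # hypothetical time course
    scaled = [2.0, 4.0, 8.0, 16.0]    # same shape, twice the magnitude
    falling = [8.0, 4.0, 2.0, 1.0]    # opposite trend
    print(symmetric_kl(rising, scaled))
    print(symmetric_kl(rising, falling))
```

Because each profile is normalized before comparison, two genes with the same temporal shape but different overall magnitudes come out as nearly identical under this sketch, whereas genes with opposite trends are far apart.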
Availability: Software code is available by request from the authors. All programs were written in ANSI C and Matlab (Mathworks Inc., Natick, MA).
Contact: jkasturi{at}cse.psu.edu acharya{at}cse.psu.edu
* To whom correspondence should be addressed.