Bioinformatics Vol. 18 no. 11 2002
Pages 1477-1485
© 2002 Oxford University Press
Statistical analysis of a small set of time-ordered gene expression data using linear splines
Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
Received on December 21, 2001; revised on March 18, 2002; accepted on April 17, 2002
Motivation: Recently, the temporal response of genes to changes in their environment has been investigated using cDNA microarray technology by measuring the gene expression levels at a small number of time points. Conventional techniques for time series analysis are not suitable for such a short series of time-ordered data. The analysis of gene expression data has therefore usually been limited to a fold-change analysis, instead of a systematic statistical approach.
Methods: We use the maximum likelihood method together with Akaike's Information Criterion to fit linear splines to a small set of time-ordered gene expression data in order to infer statistically meaningful information from the measurements. The significance of measured gene expression data is assessed using Student's t-test.
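The approach described in Methods can be illustrated with a minimal sketch. It is not the authors' extension module, and several details are assumptions made for illustration only: the measurement noise is taken to be Gaussian with a single variance (so that maximum likelihood reduces to least squares), the knots are restricted to the measured time points with the first and last always included, and the parameter count used in the AIC is the number of knots plus one for the variance. All function names (`hat_basis`, `fit_linear_spline`, `aic`, `select_knots`) and the example data are hypothetical.

```python
import itertools
import math


def hat_basis(knots, t):
    """Values at time t of the piecewise-linear 'hat' functions on the knots."""
    row = [0.0] * len(knots)
    for j in range(len(knots) - 1):
        left, right = knots[j], knots[j + 1]
        if left <= t <= right:
            w = (t - left) / (right - left)
            row[j] = 1.0 - w
            row[j + 1] = w
            break
    return row


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_linear_spline(knots, times, values):
    """Least-squares spline values at the knots, plus the residual sum of squares."""
    rows = [hat_basis(knots, t) for t in times]
    q = len(knots)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(q)] for i in range(q)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(q)]
    coef = solve(ata, aty)
    rss = sum((y - sum(c * b for c, b in zip(coef, r))) ** 2
              for r, y in zip(rows, values))
    return coef, rss


def aic(rss, n, n_params):
    """AIC for Gaussian noise with the variance estimated as rss / n."""
    sigma2 = max(rss / n, 1e-300)  # floor avoids log(0) for a perfect fit
    return n * math.log(2.0 * math.pi * sigma2) + n + 2 * n_params


def select_knots(times, values):
    """Try every subset of interior time points as knots; keep the lowest AIC."""
    points = sorted(set(times))
    first, last, interior = points[0], points[-1], points[1:-1]
    best = None
    for k in range(len(interior) + 1):
        for chosen in itertools.combinations(interior, k):
            knots = [first, *chosen, last]
            coef, rss = fit_linear_spline(knots, times, values)
            score = aic(rss, len(values), len(knots) + 1)
            if best is None or score < best[0]:
                best = (score, knots, coef)
    return best


# Made-up example: five time points, four replicate measurements at each.
# The underlying response is flat until t = 2 and then rises with slope 2.
offsets = [-0.05, 0.05, -0.02, 0.02]
times = [t for t in (0, 1, 2, 3, 4) for _ in offsets]
values = [max(0.0, 2.0 * (t - 2)) + o for t in (0, 1, 2, 3, 4) for o in offsets]

score, knots, coef = select_knots(times, values)
print("selected knots:", knots)
print("fitted values at knots:", coef)
```

Enumerating all knot subsets is only practical because the series is short; with the handful of time points typical of these experiments the number of candidate models stays small.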
Results: Previous gene expression measurements of the cyanobacterium Synechocystis sp. PCC6803 were reanalyzed using linear splines. The temporal response of many genes that had been missed by a fold-change analysis was identified. Based on our statistical analysis, we found that about four or more gene expression measurements are needed at each time point.
Availability: An extension module for Python to calculate linear spline functions is available at http://bonsai.ims.u-tokyo.ac.jp/~mdehoon. This software package (with patent pending) is free of charge for academic use only.
Contact: mdehoon{at}ims.u-tokyo.ac.jp