Bioinformatics Advance Access published online on May 10, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti465
1 Department of Biostatistics, University at Buffalo, Buffalo, NY, 14226
* To whom correspondence should be addressed.
Motivation: High dimensionality in microarray data has been, and remains, a central problem in statistical and computational analysis. Efficient gene filtering and differentiation approaches can reduce the dimensionality of the data, help remove redundant genes and noise, and highlight the most relevant genes that are major players in the development of certain diseases or in the effect of drug treatment. The purpose of this study is to investigate the efficiency of parametric (including Bayesian and non-Bayesian, linear and nonlinear), nonparametric, and semi-parametric gene filtering methods by applying them to time course microarray data from multiple sclerosis patients being treated with IFN-
Results: The presented methods performed significantly differently, but all adequately captured a small number of genes potentially relevant to the disease. The parametric methods, such as the mixed models and the two Bayesian approaches, proved to be more conservative. This may be because these methods are based on overall variation in expression across all time points. The semi-parametric (class dispersion) and nonparametric (Pareto) methods capture variation in expression from time point to time point, making them more suitable for investigating significant monotonic changes and trajectories of change in gene expression in time course microarray data. In addition, the nonlinear Bayesian model proved to be less conservative than the linear Bayesian correlated growth model in filtering out redundant genes, although the linear model showed a better fit than the nonlinear model (smaller DIC). We also report the trajectories of significant genes, since we were able to isolate trajectories of genes whose regulation appears to be inter-dependent.
Availability: SAS, R and WinBUGS code is available upon request from the authors.
Received December 13, 2004
Revised April 22, 2005
Accepted April 23, 2005
Article
Differential and trajectory methods for time course gene expression data
2 Department of Social and Preventive Medicine, University at Buffalo, Buffalo, NY, 14226
3 Department of Biostatistics, University at Buffalo, Buffalo, NY, 14226; Department of Computer and Information Sciences, Niagara University, Lewiston, NY 14109
Yulan Liang, E-mail: yliang{at}buffalo.edu
Abstract
-1a. Analysis of variance with bootstrapping (parametric), class dispersion (semi-parametric), and Pareto (nonparametric) methods with permutation are presented and compared for filtering and finding differentially expressed genes. A Bayesian linear correlated model, a Bayesian nonlinear model, and a non-Bayesian mixed-effects model with bootstrap were also developed to characterize the differential expression patterns. Furthermore, trajectory clustering approaches were developed to investigate the dynamic patterns and inter-dependency of drug treatment effects on gene expression.
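To make the idea of permutation-based gene filtering in time course data more concrete, the following is a minimal Python sketch. It is not the authors' implementation (their code is in SAS, R and WinBUGS and is not reproduced here). It assumes an expression array of shape (genes, subjects, time points), uses a simple one-way F statistic across time points that ignores within-subject correlation, and builds the null distribution by shuffling time labels within each subject. The function names `time_f_stat` and `permutation_filter`, the synthetic data, and all parameter values are hypothetical choices for illustration.

```python
import numpy as np


def time_f_stat(x):
    """One-way F statistic across time points, per gene.

    x has shape (genes, subjects, times). This simplified statistic
    treats observations as independent, which is an assumption of
    this sketch and not a property of real repeated-measures data.
    """
    n_subj, n_time = x.shape[1], x.shape[2]
    grand = x.mean(axis=(1, 2), keepdims=True)
    time_means = x.mean(axis=1, keepdims=True)  # shape (genes, 1, times)
    ss_between = n_subj * ((time_means - grand) ** 2).sum(axis=(1, 2))
    ss_within = ((x - time_means) ** 2).sum(axis=(1, 2))
    df_between = n_time - 1
    df_within = n_time * (n_subj - 1)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_filter(x, n_perm=500, seed=0):
    """Permutation p-value per gene for 'expression changes over time'."""
    rng = np.random.default_rng(seed)
    observed = time_f_stat(x)
    exceed = np.zeros(x.shape[0])
    for _ in range(n_perm):
        # Shuffle time labels independently within each subject;
        # the same shuffle is applied to every gene.
        perm = np.stack(
            [rng.permutation(x.shape[2]) for _ in range(x.shape[1])]
        )
        shuffled = np.take_along_axis(x, perm[None, :, :], axis=2)
        exceed += time_f_stat(shuffled) >= observed
    # Add-one correction keeps p-values strictly above zero.
    return (exceed + 1) / (n_perm + 1)


# Synthetic example: 20 genes, 8 subjects, 5 time points.
rng = np.random.default_rng(1)
data = rng.normal(size=(20, 8, 5))
data[0] += 3.0 * np.arange(5)  # gene 0 gets a strong upward trend
p_values = permutation_filter(data, n_perm=200)
print(p_values[0])
```

Genes with small permutation p-values would be retained as candidates for differential expression over time. In practice a multiple-testing correction and a statistic that respects within-subject correlation would be needed.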