Bioinformatics Advance Access originally published online on September 16, 2004
Bioinformatics 2005 21(4):502-508; doi:10.1093/bioinformatics/bti023
Bioinformatics vol. 21 issue 4 © Oxford University Press 2005; all rights reserved.
VarMixt: efficient variance modelling for the differential analysis of replicated gene expression data

1 Laboratoire MAS Ecole Centrale Paris, Grande Voie des vignes, 92295 Chatenay Malabry, France
2 UMR ENGREF/INAPG/INRA 518, INAPG 16, rue Claude Bernard, 75005 Paris, France
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: Identifying differentially regulated genes in experiments comparing two experimental conditions is often a key step in the microarray data analysis process. Many different approaches and methodological developments have been put forward, yet the question remains open.
Results: VarMixt is a powerful and efficient novel methodology for this task. It is based on a flexible and realistic variance modelling strategy. It compares favourably with other popular techniques (standard t-test, SAM and Cyber-T). The relevance of the approach is demonstrated with real-world and simulated datasets. The analysis strategy was successfully applied to both a two-colour cDNA microarray and an Affymetrix Genechip. Strong control of false positive and false negative rates is demonstrated in large simulation studies.
Availability: The R package is freely available at http://www.inapg.inra.fr/ens_rech/mathinfo/recherche/mathematique/outil.html
Contact: delmar{at}inapg.inra.fr
Supplementary information: http://www.inapg.inra.fr/ens_rech/mathinfo/recherche/mathematique/outil.html
| 1 INTRODUCTION |
|---|
Reliable identification of differentially regulated genes is a key issue in many molecular biology experiments using microarray technology for measuring mRNA levels in tissue samples.
Many different approaches and methodological developments have been described. After careful examination of existing methods and analysis of many datasets, we were able to form a list of functional requirements:
- A flexible variance model.
- Accurate variance estimates.
- Data-driven parameter estimation.
- Handling missing values.
- Strong control of the false positive rate.
- Optimal detection power.
It turns out that there is some room for a method that could reliably meet all these requirements, and this work will present a novel method designed to comply with them. Section 1.1 outlines our variance model and gives a brief overview of other existing methods. In Section 2 we formulate the statistical data model and testing procedure. Section 3 gives some algorithm details. Section 4 is an application of the proposed method to a real-world dataset, the Affymetrix U133 Latin square data (http://www.affymetrix.com). In Section 5, the results of a thorough simulation study comparing popular methods for analysis of microarray data are discussed.
1.1 Background
Many different sources of variability affect gene expression intensity measurements in microarray experiments (Schuchhardt et al., 2000; Kerr et al., 2002), including spatial effects, variability in probe sources and mean spot intensity level. Not all are well characterized or even identified. The generally limited number of replicates does not permit accurate estimation of the variance for each gene individually. Using a common variance for all the genes would provide a very powerful test and an accurate variance estimate, yet the homoscedastic hypothesis is clearly not acceptable. There is some room between these two extremes, and several methods have already been proposed.
ANOVA-type approach. Kerr et al. (2002), Draghici et al. (2003) and Wolfinger et al. (2001) have presented adaptations of the general linear model. These models generally try to encompass the different sources of systematic variability.
Non-parametric approach. Tusher et al. (2001) developed a non-parametric approach called SAM (significance analysis of microarrays). They add a constant to the gene-specific standard error to avoid false positives due to underestimation of the gene variance. They also define a non-parametric significance level using a permutation procedure.
Structural variance modelling approach. Wang and Ethier (2004) have proposed a likelihood ratio test based on the variance model of Rocke and Durbin (2003).
Bayesian approach. Baldi and Long (2001) developed a method from a fully Bayesian framework. Their solution, called Cyber-T, is a t-test with a regularized variance estimate and adapted degrees of freedom. Some empirical indications are provided for setting the values of the hyperparameters. Lönnstedt and Speed (2002) also proposed an empirical Bayes strategy and derived a logit model on the probability for a gene to be differentially expressed. Their solution uses a regularized estimate of the variance which requires user-defined hyperparameters.
Machine learning approach. Cole et al. (2003) proposed a completely different approach. They employed machine learning techniques and transformed the problem of identifying differentially regulated genes into a prediction problem. With a few example datasets, they showed that their method was more powerful than other existing statistical methods.
Variance mixture modelling approach. Heterogeneity of variance in microarray data can arise for various technical and biological reasons. For example, gene-specific dye effects or spotting effects (Mary-Huard et al., 2004) are commonly observed in two-colour microarray experiments. Because of different probe properties or other experimental conditions, the measure of the expression level of some genes has a high degree of variability while it is more reproducible for others. Since, with few replicates, it is difficult to assess the variability of each gene individually, more reliable inferences can be drawn if the genes are categorized into a few groups by variability level. Delmar et al. (2005) proposed a flexible model of the gene expression variance based on these ideas. Their model relies on the assumption that groups of genes can be identified based on a similar response to the various sources of variability. It does not make assumptions about specific sources of variability and does not assume a priori a relationship between variance level and mean gene log-intensity. Their strategy for differential analysis is the following. Groups of genes with homogeneous variance are identified. The variance of each group is then accurately estimated from a large number of observations. Using the group variance in place of the individual gene variance estimate, they use a test statistic of the form:
$$T_g = \frac{\hat{\delta}_g}{\sqrt{\omega_{C_g}}}$$
where $g$ is the gene index, $C_g$ is the group gene $g$ was assigned to, $\hat{\delta}_g$ is the measure of the differential expression of gene $g$ and $\omega_{C_g}$ is proportional to the variance of group $C_g$. The proposed method stems from the same modelling strategy. It provides a more general formulation of the variance model and proposes a more robust analysis technique.
| 2 SYSTEM AND METHODS |
|---|
2.1 Data model
Consider an experimental design with $R_1$ measurements in condition 1 and $R_2$ measurements in condition 2, with $G$ the total number of gene probes. Let $Y_{gcr}$ be the expression level of gene $g$ ($g = 1, \ldots, G$) in condition $c$ ($c = 1, 2$) and replicate $r$ ($r = 1, \ldots, R_c$). We assume throughout that the pre-processing and normalization steps have been properly completed.
2.1.1 Unpaired data
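The observations $Y_{gcr}$ are modelled with the simple linear model of Equation (1):
$$Y_{gcr} = \mu_{gc} + E_{gcr} \tag{1}$$
where $\mu_{gc}$ is the mean expression level of gene $g$ in condition $c$. By hypothesis, $E_{gcr}$ is normally distributed with mean 0 and variance $\sigma_g^2$. The model on the variance of $E_{gcr}$ is addressed in the next section.
The measure of differential expression $\hat{\delta}_g$ is defined in Equation (2) as the difference in mean intensity between conditions 1 and 2 for gene $g$:
$$\hat{\delta}_g = \bar{Y}_{g1\cdot} - \bar{Y}_{g2\cdot} \tag{2}$$
The variance of $\hat{\delta}_g$ is given as
$$\mathrm{Var}(\hat{\delta}_g) = \sigma_g^2 \left( \frac{1}{R_1} + \frac{1}{R_2} \right) \tag{3}$$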
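As an illustration of Equations (2) and (3), the following minimal sketch computes the difference in means and its estimated variance for a single gene. The function name, the use of `None` for missing values and the sign convention (condition 1 minus condition 2) are assumptions made for this example; they are not taken from the VarMixt R package.

```python
from statistics import mean

def unpaired_delta(cond1, cond2):
    """Per-gene differential expression for unpaired data.

    cond1, cond2: replicate values for one gene; None marks a missing value.
    Returns (delta_hat, pooled_var, var_delta, dof).
    """
    a = [v for v in cond1 if v is not None]
    b = [v for v in cond2 if v is not None]
    m1, m2 = mean(a), mean(b)
    delta_hat = m1 - m2  # assumed sign: condition 1 minus condition 2
    # Residual sum of squares around each condition mean.
    rss = sum((v - m1) ** 2 for v in a) + sum((v - m2) ** 2 for v in b)
    dof = len(a) + len(b) - 2
    pooled_var = rss / dof  # usual estimate of the gene variance
    var_delta = pooled_var * (1 / len(a) + 1 / len(b))
    return delta_hat, pooled_var, var_delta, dof

delta_hat, pooled_var, var_delta, dof = unpaired_delta(
    [1.0, 2.0, 3.0], [0.0, 1.0, None]
)
```

Missing values simply reduce the replicate counts used in place of $R_1$ and $R_2$, which is one straightforward way to handle them at this step.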
2.1.2 Paired data
In some instances, for each gene probe, the two measurements in the two different conditions must be treated as paired data. This case can be encountered with two-colour glass slide microarray or when the same patient is measured before and after some treatment. The model is fitted on the logarithm of the ratio of observed intensity (log-ratio).
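Let $Y_{gr}$ denote the log-ratio of gene $g$ ($g = 1, \ldots, G$) in replicate $r$ ($r = 1, \ldots, R$). The model for $Y_{gr}$ is described by
$$Y_{gr} = \delta_g + E_{gr} \tag{4}$$
where $E_{gr}$ is normally distributed with mean 0 and variance $\sigma_g^2$.
The measure of differential expression $\hat{\delta}_g$ is now defined in Equation (5) as the mean log-ratio for gene $g$:
$$\hat{\delta}_g = \bar{Y}_{g\cdot} = \frac{1}{R} \sum_{r=1}^{R} Y_{gr} \tag{5}$$
and
$$\mathrm{Var}(\hat{\delta}_g) = \frac{\sigma_g^2}{R} \tag{6}$$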
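For paired data, the same quantities reduce to the mean and variance of the log-ratios. The sketch below is a hypothetical helper (the name `paired_delta` and the `None` convention for missing values are assumptions) showing the mean log-ratio, the usual variance estimate, the variance of the mean and the degrees of freedom.

```python
from statistics import mean, variance

def paired_delta(log_ratios):
    """Per-gene differential expression for paired data (log-ratios).

    log_ratios: one value per replicate; None marks a missing value.
    Returns (delta_hat, gene_var, var_delta, dof).
    """
    vals = [v for v in log_ratios if v is not None]
    r = len(vals)
    delta_hat = mean(vals)      # mean log-ratio
    gene_var = variance(vals)   # unbiased estimate, divides by r - 1
    var_delta = gene_var / r    # variance of the mean
    return delta_hat, gene_var, var_delta, r - 1

delta_hat, gene_var, var_delta, dof = paired_delta([0.5, 1.5, 1.0, None])
```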
2.2 Variance model
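Consistently with the definition of $\hat{\delta}_g$ in Equation (2) or (5), we assume that $\hat{\delta}_g$ is normally distributed with mean $\delta_g$, for all $g$ in $[1, \ldots, G]$. The key assumption of the mixture model on the variance is that the set of all genes is divided into groups, with all the genes in a particular group having equal variance. The variance and the relative abundance of each group can be estimated with accuracy, yet we cannot assign individual genes to a group with certainty. Thus, the mixture model on the variance for the differential analysis is defined in Equations (7) and (8):
$$\hat{\delta}_g \mid g \in C_i \sim \mathcal{N}\left(\delta_g,\; c\,\sigma^2_{C_i}\right) \tag{7}$$
with
$$\Pr[g \in C_i] = \pi_i, \qquad \sum_{i=1}^{k} \pi_i = 1 \tag{8}$$
and $[\sigma^2_{C_1}, \ldots, \sigma^2_{C_k}]$ known. Here $k$ is the total number of variance groups, $C_1, C_2, \ldots, C_k$ denote the variance groups of the model, $\sigma^2_{C_i}$ is the variance of group $C_i$, and $c$ is the design factor that relates the variance of $\hat{\delta}_g$ to the gene variance ($c = 1/R_1 + 1/R_2$ for unpaired data, $c = 1/R$ for paired data).
Let $s_g^2$ denote the usual estimate of $\sigma_g^2$, and let $s^2$ denote the set of variance estimates of all genes, $s^2 = \{s_1^2, \ldots, s_G^2\}$. Let $\tau_{gi}$ be the posterior probability that the true variance of gene $g$ is $\sigma^2_{C_i}$, given the observed variance of gene $g$, $s_g^2$, and the observed variances of all the other genes, $s^2$:
$$\tau_{gi} = \Pr\left[g \in C_i \mid s_g^2, s^2\right] \tag{9}$$
It follows that, for all real $d$:
$$\Pr\left[\hat{\delta}_g \le d \mid s^2\right] = \sum_{i=1}^{k} \tau_{gi}\, \Pr\left[\hat{\delta}_g \le d \mid g \in C_i\right] \tag{10}$$
and from Equation (7):
$$\Pr\left[\hat{\delta}_g \le d \mid s^2\right] = \sum_{i=1}^{k} \tau_{gi}\, \Phi\!\left(\frac{d - \delta_g}{\sqrt{c\,\sigma^2_{C_i}}}\right) \tag{11}$$
where $\Phi$ is the standard normal cumulative distribution function. Equation (11) gives the conditional distribution of $\hat{\delta}_g$ given the observed variances of all the genes. The conditional variance of $\hat{\delta}_g$ is then derived as:
$$\mathrm{Var}\left(\hat{\delta}_g \mid s^2\right) = c \sum_{i=1}^{k} \tau_{gi}\, \sigma^2_{C_i} \tag{12}$$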
The variance of $\hat{\delta}_g$ is a weighted sum of the variances of all the groups of the model, with weights equal to the probabilities that gene $g$ belongs to each group. In contrast, in Delmar et al. (2005) each gene is totally assigned to a variance group according to the maximum posterior probability. Their solution for the variance of $\hat{\delta}_g$ is not a weighted sum of the variances of all the groups but the variance of the group gene $g$ was assigned to. Then, in Equation (11), instead of $\tau_{gi}$ they use $P(i \mid j) = \Pr[g \in C_i \mid g \text{ is assigned to } C_j]$, that is, the probability that gene $g$ truly belongs to group $i$ given that it was assigned to group $j$. In practice this approach, called VM2, can lead to spurious results for genes whose variance level is at the limit between two variance groups. We call the current method VM and note that it performs a partial assignment of genes to variance groups.
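The contrast between partial (VM) and total (VM2) assignment can be sketched as follows for a single gene. This is an illustrative sketch, not the authors' implementation: the posterior probabilities are computed from the Gamma model of Section 3, the mixture parameters are made-up values, and `design_factor` is a name introduced here for the factor that scales a gene variance to the variance of the differential expression measure (for example `1 / R` for paired data).

```python
import math

def gamma_logpdf(x, shape, scale):
    """Log-density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def posterior_probs(x_g, dof, pis, group_vars):
    """Posterior probability that a gene belongs to each variance group."""
    logs = [math.log(p) + gamma_logpdf(x_g, dof / 2, 2 * v)
            for p, v in zip(pis, group_vars)]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in logs]
    total = sum(weights)
    return [w / total for w in weights]

def vm_variance(tau, group_vars, design_factor):
    """Partial assignment: posterior-weighted sum of the group variances."""
    return design_factor * sum(t * v for t, v in zip(tau, group_vars))

def vm2_variance(tau, group_vars, design_factor):
    """Total assignment: variance of the most probable group only."""
    best = max(range(len(tau)), key=lambda i: tau[i])
    return design_factor * group_vars[best]

# Hypothetical two-group model and a gene lying between the two groups.
pis, group_vars, dof, n_rep = [0.7, 0.3], [0.05, 0.5], 3, 4
x_g = 0.6  # residual sum of squares of the gene
tau = posterior_probs(x_g, dof, pis, group_vars)
v_vm = vm_variance(tau, group_vars, 1 / n_rep)
v_vm2 = vm2_variance(tau, group_vars, 1 / n_rep)
```

For a gene near the boundary between two groups, `v_vm` moves smoothly with the posterior weights, whereas `v_vm2` jumps from one group variance to the other.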
2.3 Hypothesis testing
For each gene $g$, the null hypothesis is $H_0 = \{\delta_g = 0\}$ and the alternative hypothesis is $H_1 = \{\delta_g \neq 0\}$. The proposed test statistic is simply the value of $\hat{\delta}_g$. Therefore, under the null hypothesis, the distribution of the test statistic is known for all $g$. It is a mixture (i.e. a weighted sum) of normal distributions with mean 0, variance equal to the group variance and probability (i.e. weight) equal to the posterior probability that $g$ belongs to each of the variance groups, as described in Equation (11), with $\delta_g = 0$ for all $g$ in $[1, \ldots, G]$.
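Under this null distribution, a p-value can be obtained as a posterior-weighted sum of normal tail probabilities. The sketch below assumes a two-sided test (natural for the alternative stated above, but not spelled out in the text) and takes as input the variances of the differential expression measure in each group; the function name and example values are hypothetical.

```python
import math
from statistics import NormalDist

def mixture_pvalue(delta_hat, tau, delta_vars):
    """Two-sided p-value under a zero-mean mixture of normals.

    tau: posterior group probabilities for the gene (sum to 1).
    delta_vars: variance of the differential expression measure in each group.
    """
    std_normal = NormalDist()
    p = 0.0
    for t, v in zip(tau, delta_vars):
        z = abs(delta_hat) / math.sqrt(v)
        p += t * 2 * (1 - std_normal.cdf(z))
    return p

p_single = mixture_pvalue(1.96, [1.0], [1.0])          # one group: plain z-test
p_mixed = mixture_pvalue(1.0, [0.8, 0.2], [0.05, 0.5])  # two groups
p_null = mixture_pvalue(0.0, [0.8, 0.2], [0.05, 0.5])
```

With a single group the computation reduces to an ordinary two-sided z-test, which gives a simple sanity check.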
| 3 ALGORITHM |
|---|
Under the normal distribution assumption, for each gene $g$ the sum of squares of the residuals (the $E$'s), denoted $X_g$, is distributed according to a Gamma distribution with shape parameter equal to one half of the degrees of freedom $\nu$ ($\nu = R - 1$ in the paired data case and $\nu = R_1 + R_2 - 2$ in the unpaired data case) and scale parameter equal to two times the true variance of gene $g$. As in Delmar et al. (2005), the marginal distribution of $X_g$ is modelled with a mixture of Gamma distributions:
$$X_g \sim \sum_{j=1}^{k} \pi_j\, \mathrm{Gamma}\!\left(\frac{\nu}{2},\; 2\sigma^2_{C_j}\right)$$
The log-likelihood function $L$ is therefore given by
$$L = \sum_{g=1}^{G} \log \sum_{j=1}^{k} \pi_j\, f\!\left(X_g;\; \frac{\nu}{2},\; 2\sigma^2_{C_j}\right)$$
where $f(\cdot\,; a, b)$ is the Gamma density with shape $a$ and scale $b$, and $\pi_j$, $j \in [1, \ldots, k]$, is the probability that a gene belongs to the $j$-th group. The parameters (the $\pi$'s and the $\sigma^2_C$'s) are estimated via EM maximization of the log-likelihood function. Choosing the number of components in a mixture model is a difficult task. The BIC criterion has given satisfactory results in the analysis of many real datasets and simulation studies (data not shown). Defining the best method for estimating the number of components is one of our ongoing research projects.
| 4 IMPLEMENTATION |
|---|
Delmar et al. (2005) have illustrated the efficiency of the mixture model on the variance with cDNA microarray datasets of about 6000 gene probes. Microarray technology is evolving rapidly and new models with up to 40 000 gene probes are now readily available. We show that the variance model with our new testing procedure is appropriate and efficient on these formats. We report the results of the analysis of reference Affymetrix genechip datasets and simulated two-colour microarray data.
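The mixture fit described in Section 3 can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the VarMixt R package: all genes share the same degrees of freedom (no missing values), the initialisation by sorted chunks is an arbitrary choice, and the BIC is taken in its standard form, $-2L + p \log G$ with $p = 2k - 1$ free parameters. All function names are hypothetical.

```python
import math
import random

def gamma_logpdf(x, shape, scale):
    """Log-density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def e_step(x, shape, pis, variances):
    """Return posterior probabilities tau[g][j] and the log-likelihood."""
    tau, loglik = [], 0.0
    for xg in x:
        logs = [math.log(p) + gamma_logpdf(xg, shape, 2 * v)
                for p, v in zip(pis, variances)]
        top = max(logs)
        w = [math.exp(l - top) for l in logs]
        s = sum(w)
        loglik += top + math.log(s)
        tau.append([wi / s for wi in w])
    return tau, loglik

def fit_variance_mixture(x, dof, k, n_iter=100):
    """EM for a k-component Gamma mixture with common shape dof / 2."""
    n, shape = len(x), dof / 2
    xs = sorted(x)
    # Initialise group variances from k equal-sized chunks of the sorted data.
    chunks = [xs[j * n // k:(j + 1) * n // k] for j in range(k)]
    variances = [sum(c) / (dof * len(c)) for c in chunks]
    pis = [1 / k] * k
    for _iteration in range(n_iter):
        tau, _unused = e_step(x, shape, pis, variances)
        for j in range(k):
            tj = sum(t[j] for t in tau)
            if tj > 1e-8:  # leave components that have emptied untouched
                pis[j] = tj / n
                # Weighted maximum-likelihood update of the group variance.
                variances[j] = sum(t[j] * xg for t, xg in zip(tau, x)) / (dof * tj)
    _unused, loglik = e_step(x, shape, pis, variances)
    bic = -2 * loglik + (2 * k - 1) * math.log(n)
    return pis, variances, bic

# Synthetic check: two variance groups, sums of squares drawn from the model.
rng = random.Random(1)
dof = 4
true_vars = [0.05] * 600 + [2.0] * 600
x = [rng.gammavariate(dof / 2, 2 * v) for v in true_vars]
fits = {k: fit_variance_mixture(x, dof, k) for k in (1, 2, 3)}
best_k = min(fits, key=lambda k: fits[k][2])  # smallest BIC
```

The number of groups is chosen here by fitting several values of `k` and keeping the one with the smallest BIC, mirroring the model selection step mentioned in Section 3.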
4.1 Affymetrix genechip
4.1.1 Spike-in dataset
Affymetrix has provided a reference dataset for comparison and calibration of microarray analysis strategy. The Human Genome U133 Data Set consists of 3 technical replicates of 14 separate hybridizations of 42 spiked transcripts in a complex human background at concentrations ranging from 0.125 to 512 pM. Thirty of the spikes are isolated from a human cell line, four spikes are bacterial controls and eight spikes are artificially engineered sequences believed to be unique in the human genome (http://www.affymetrix.com).
4.1.2 Analysis
The data pre-processing steps were performed with the Bioconductor package (Gentleman and Carey, 2002). The data were normalized with the rma algorithm (Irizarry et al., 2003) without background subtraction. We performed the 13 pairwise comparisons for which spike-in genes had a true fold-change of two. Differential analysis was then performed with different methods.
- Gene specific. Standard t-test.
- Homoscedastic. A common variance for all genes.
- VM2 (Delmar et al., 2005). Mixture model on the variance with total assignment of genes to variance groups.
- VM. Mixture model on the variance with partial assignment of genes to variance groups.
- Cyber-T. The method of Baldi and Long (2001).
4.1.3 Results
The results must be analysed with caution. In particular, we found in our regulated gene lists probe names that were very similar to the probe names of spike-in genes. For example, AFFX-LysX-3_at is part of the spike-in gene list, and AFFX-LysX-5_at was often also part of the regulated gene list. We hypothesized that the probes AFFX-LysX-3_at and AFFX-LysX-5_at were designed to match the 3' and 5' ends of the same gene. We decided not to count those probes, and others in the same situation, as false positives. Still, some doubts remain about whether gene probes that are in our lists of regulated genes but not in the spike-in gene list are genuine false positives.
In addition, genes spiked at very low concentration (<1 pmol) are very difficult to detect (this point was also taken into account in the affycomp website http://affycom.biostat.jhsph.edu/). In all the comparisons, 30 (out of the 42) genes were spiked at concentration above 1 pmol in at least one of the two experiments. We report the analysis results relative to the full list of 42 spike-in genes, and to the reduced list of 30 genes spiked at sufficient concentration. Note that the list of 30 genes is different for each comparison. Finally, we excluded one comparison because of failed normalization (see Supplemental Material).
The results of differential analysis are reported in Table 1. Three- or four-component mixture models were fitted to these datasets according to the BIC criterion. The VM and VM2 models clearly outperform both the gene-specific and homoscedastic models. The VM model also slightly outperforms VM2. The difference in performance between VM and Cyber-T is less evident in this example. The VM method is more powerful than Cyber-T and seems to control the number of false negatives nearly as well. Moreover, the VM method does not use any user-defined parameters and is therefore expected to perform well across situations. It is shown in Section 5.3.2 that with four replicates the performance of Cyber-T is significantly altered. Incidentally, we compared the results obtained with the rma and Affymetrix MAS5 normalizations and found rma to be more efficient (data not shown).
| 5 SIMULATION STUDY |
|---|
5.1 Variance estimation
5.1.1 Dataset
Assessing the accuracy of the variance model is not easy: the true value of the variance is not known, nor do we have a satisfactory model for it. We found that the control dataset from Rosetta (Hughes et al., 2000) is very useful for estimating the accuracy and robustness of variance estimation methods. This dataset was compiled from 63 hybridizations of yeast samples on two-colour microarrays, with all the yeast samples grown in the same culture conditions. Thanks to the large sample size, for each gene, the usual variance estimate over the 63 replicates is close to the true variance. We will refer to it as the reference variance.
Consider a sample of n replicates drawn from the complete 63 replicate dataset. The distance between the reference variance and the variance estimated from the sample is a valuable measure of the accuracy of a variance model.
For a given n, 20 samples of size n were drawn at random from the full dataset. For each sample, the gene variance was estimated with different methods. The median absolute deviation (MAD) from the reference variance over all the genes in the dataset was then computed. Figure 1 shows the average MAD over the 20 samples at 12 different sample sizes.
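The evaluation procedure above can be sketched as follows. This is a simplified illustration on synthetic data, not the Rosetta dataset: the `estimator` argument is a hypothetical hook that works gene by gene (the default is the usual variance estimate), whereas methods such as VM or Cyber-T would need access to all genes at once.

```python
import random
from statistics import median, variance

def average_mad(data, n, estimator=variance, n_draws=20, seed=0):
    """Average, over random subsamples of n replicates, of the median
    absolute deviation between estimated and reference variances.

    data: one list of replicate values per gene (same length for all genes).
    """
    rng = random.Random(seed)
    n_rep = len(data[0])
    # Reference variance: usual estimate over all available replicates.
    reference = [variance(gene) for gene in data]
    mads = []
    for _ in range(n_draws):
        cols = rng.sample(range(n_rep), n)  # replicates kept in this draw
        estimates = [estimator([gene[c] for c in cols]) for gene in data]
        mads.append(median(abs(e - r) for e, r in zip(estimates, reference)))
    return sum(mads) / n_draws

# Synthetic stand-in for a 63-replicate control dataset with 50 genes.
rng = random.Random(42)
sds = [rng.uniform(0.1, 1.0) for _ in range(50)]
data = [[rng.gauss(0.0, sd) for _ in range(63)] for sd in sds]
mad_small, mad_large, mad_full = (average_mad(data, n) for n in (3, 30, 63))
```

Plotting `average_mad` against `n` for several estimators would give curves of the kind described for Figure 1.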
5.1.2 Results
As expected, for a very large number of replicates (>25) the usual gene-specific variance estimate is the closest to the reference variance. For more than 10 replicates, the Cyber-T method performs quite poorly compared with the other methods. The Cyber-T estimate of the variance is a weighted sum of an a priori variance and the usual variance estimate. The a priori part in the variance estimate of Cyber-T becomes a drawback when the sample size gets large. The VM method stands out, as its curve is below the others for both small and larger n. As n gets very large, the curves of the VM and VM2 methods stay very close to the curve of the gene-specific estimate.
5.2 Differential analysis
In a large simulation study, we compared the performance of the proposed method with that of popular existing methods for the analysis of microarray data: homoscedastic, gene-specific, VM, VM2 (Delmar et al., 2005), Cyber-T (Baldi and Long, 2001) and SAM (Tusher et al., 2001). The parameters of a simulation study are the vectors of mean log-intensity and SD (usually taken from real experiment estimates) and a vector of log-ratios characterizing a set of simulated genes. The log-ratio of each gene is independently simulated according to a normal distribution. The simulation parameters for each gene are its mean log-ratio, SD and an associated log-intensity value. One percent of the genes are true positives, with a non-zero simulated mean log-ratio. In each simulated dataset, the differentially expressed genes are randomly associated with a value of SD.
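A dataset of this kind can be generated with a few lines of code. The sketch below is an assumption-laden illustration: the SD vector and the list of effect sizes are made-up inputs (the study takes them from real experiments), the log-intensity values are omitted, and the random sign of the effect is a choice made for this example.

```python
import random

def simulate_log_ratios(sds, n_replicates, effect_sizes, frac_de=0.01, seed=0):
    """Simulate independent normal log-ratios for each gene.

    sds: per-gene standard deviations.
    effect_sizes: absolute mean log-ratios available for true positives.
    Returns (data, means, de_genes).
    """
    rng = random.Random(seed)
    n_genes = len(sds)
    n_de = int(frac_de * n_genes)
    # Picking the true positives at random associates them with random SDs.
    de_genes = set(rng.sample(range(n_genes), n_de))
    means = [0.0] * n_genes
    for g in sorted(de_genes):
        means[g] = rng.choice([-1, 1]) * rng.choice(effect_sizes)
    data = [[rng.gauss(means[g], sds[g]) for _ in range(n_replicates)]
            for g in range(n_genes)]
    return data, means, de_genes

# Hypothetical SD vector for 1000 genes, four replicates.
sds = [0.2 + 0.001 * g for g in range(1000)]
data, means, de_genes = simulate_log_ratios(sds, 4, [0.25, 0.5, 0.9])
```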
5.3 Simulation 1
5.3.1 Data
In this simulation set we used the same parameters described in Delmar et al. (2005). The parameters of the simulation are the vectors of mean log-intensity and SD estimated in a real experiment comparing the spleen of normal and irradiated mice on custom made two-colour glass slide microarrays. This setting respects the putative pattern of intensity-dependent variance. The simulated dataset had 4360 genes. One percent of the genes were simulated with a non-zero mean log-ratio. These 43 genes had a simulated absolute value of mean log-ratio (base 2) ranging from 0.25 to 0.9 (ratio between 1.18 and 1.83), 21 with positive values. This range of differences was chosen because (1) it seemed reasonable with regard to the observed value in the actual experiments; and (2) weak ratios are challenging for analysis methods and highlighted the differences between the methods and the number of replicates. The results were averaged over 50 simulated datasets and are reported in Table 2.
5.3.2 Results
Mixture models with four to seven components (10–11 in the 25 replicates case) are chosen to fit the simulated datasets. There are usually more components for more replicates so that, as expected, one gets a more accurate variance estimate. The VM and VM2 methods clearly outperform the homoscedastic and simple gene-specific models in all situations. VM slightly outperforms VM2 in almost all cases. The SAM method has a false positive rate well above the nominal level, showing that the method does not control the false discovery rate very efficiently. The competition between VM and Cyber-T is tighter. However, at four replicates, Cyber-T has a much higher false negative rate than expected. We hypothesize that this is caused by a discontinuity in the heuristic proposed in Cyber-T for setting the user-defined parameter as a function of the number of replicates. Interestingly, with a large number of replicates (25), Cyber-T also exhibits a much higher false negative rate than expected. This is consistent with the results of Section 5.1 and Figure 1, where the variance estimate of Cyber-T is seen to be much further away from the reference variance than that of the VM or even the VM2 method for a large number of replicates. Even at eight replicates, the false positive rate for Cyber-T is well above the nominal level and above the level of the VM method. These results, together with those of Section 5.1, are evidence of the relevance of our variance modelling approach.
5.4 Simulation 2
5.4.1 Data
In this second simulation study, the vector of simulated variances was the vector of gene-specific variances estimated from a typical dataset generated from Stanford Human microarrays. The goal of the experiment was to compare fatty tissues of four individuals before and after some treatment. Simulated datasets comprised 30 672 genes. A total of 307 genes had a simulated mean ratio ranging from 1.5 to 4; all others had a simulated mean ratio of 1. The results were averaged over 50 simulated datasets and are reported in Table 3.
5.4.2 Results
Mixture models with five to eight components are chosen to fit the simulated datasets (11–13 components in the 25 replicates case). The results are consistent with those discussed in Section 5.3.2. VM, VM2 and Cyber-T clearly outperform the other methods. VM slightly outperforms VM2 in all cases. Overall, VM does better than Cyber-T as it controls the FDR in all situations.
5.5 Simulation: missing values
Based on the parameters of Section 5.3.1, we conducted a large simulation study to investigate the effect of missing values (see Supplemental Material). Simulated datasets with 1, 5, 10 and 20% of missing values were analysed. Since SAM cannot deal directly with missing values, we chose to replace each missing value by the mean of the gene in the other experiments. The result is that VM and VM2 are very robust to missing values: their performance is not greatly affected even by a large number of missing values. Cyber-T also behaves correctly. The performance of SAM, however, is significantly affected by missing values. The hypothesis that all test statistics are identically distributed is seriously violated when a significant proportion of values is missing, yet this assumption is at the core of the SAM testing procedure.
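The mean imputation used for SAM can be illustrated with a short sketch. The helper name and the `None` convention are assumptions; the comparison of variances is included only to show one plausible way in which imputed genes end up with differently behaved statistics, since filling gaps with the gene mean adds no spread while increasing the apparent number of replicates.

```python
from statistics import variance

def impute_gene_mean(values):
    """Replace missing values (None) by the mean of the observed values."""
    observed = [v for v in values if v is not None]
    gene_mean = sum(observed) / len(observed)
    return [gene_mean if v is None else v for v in values]

row = [1.0, None, 3.0, None]
filled = impute_gene_mean(row)
var_observed = variance([v for v in row if v is not None])
var_filled = variance(filled)  # smaller: same sum of squares, more "replicates"
```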
| 6 DISCUSSION |
|---|
This paper has articulated a new statistical model for the differential analysis of gene expression data. This novel method, called VM or Varmixt, was shown to be at least as good as other existing ones and, in many instances, better (Table 4). It is very flexible as it does not force a variance model onto the data. In contrast to other (non-parametric) methods, it does not rely on the assumptions that the test statistics of all genes are identically distributed under the null hypothesis. Varmixt has fulfilled the list of functional requirements stated at the beginning of this work.
The proposed model is based on a discrete distribution of the gene variance, which could be seen as an approximation. In principle, extending the model to a continuous parametric distribution may lead to greater power and better performance. We are also looking into extending our variance model to more complex experimental designs and applications, such as comparing more than two conditions and handling two levels of variability (technical and biological).
| Acknowledgments |
|---|
We acknowledge Emmanuelle Lepin and Dominique Langin at INSERM U586, Toulouse, France for the parameters of the simulated dataset, which were derived from their experiments with Stanford Human microarrays. We thank Diana Tronik-Le Roux at Laboratoire de génomique fonctionnelle, CEA sciences de la vie, Evry, France for the parameters of the first set of simulations, which were derived from experiments in her laboratory. We also thank Julie Aubert at INAPG, Paris, France, for useful comments on the software package. Our thanks also go to the reviewers for their useful comments and discussion of the manuscript.
| Footnotes |
|---|
Present address: Laboratoires Fournier SA, 50, rue de Dijon, 21121 Daix, France.
Received on June 30, 2004; revised on July 29, 2004; accepted on September 7, 2004
| REFERENCES |
|---|
Baldi, P. and Long, A. (2001) A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics, 17, 509–519.
Cole, S.W., Galic, Z., Zack, J.A. (2003) Controlling false-negative errors in microarray differential expression analysis: a PRIM approach. Bioinformatics, 19, 1808–1816.
Delmar, P., Robin, S., Tronik-Leroux, D., Daudin, J. (2005) Mixture model on the variance for the differential analysis of gene expression data. J. R. Stat. Soc., Ser. C, 54, 31–50.
Draghici, S., Kulaeva, O., Hoff, B., Petrov, A., Shams, S., Tainsky, M.A. (2003) Noise sampling method: an ANOVA approach allowing robust selection of differentially regulated genes measured by DNA microarrays. Bioinformatics, 19, 1348–1359.
Gentleman, R. and Carey, V. (2002) Bioconductor. R News, 2, 11–16.
Hughes, T., Marton, M., Jones, A., Roberts, C., Stoughton, R., Armour, C., Bennett, H., Coffey, E., Dai, H., He, Y. (2000) Functional discovery via a compendium of expression profiles. Cell, 102, 109–126.
Irizarry, R.A., Bolstad, B.M., Collin, F., Cope, L.M., Hobbs, B., Speed, T.P. (2003) Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res., 31, e15.
Kerr, M., Afshari, C., Bennett, L., Bushel, P., Martinez, J., Walker, N., Churchill, G. (2002) Statistical analysis of a gene expression microarray experiment with replication. Stat. Sinica, 12, 203–218.
Lönnstedt, I. and Speed, T. (2002) Replicated microarray data. Stat. Sinica, 12, 31–46.
Mary-Huard, T., Daudin, J.-J., Robin, S., Bitton, F., Cabannes, E., Hilson, P. (2004) Spotting effect in microarray experiments. BMC Bioinformatics, 5, 63.
Rocke, D.M. and Durbin, B. (2003) Approximate variance-stabilizing transformations for gene-expression microarray data. Bioinformatics, 19, 966–972.
Schuchhardt, J., Beule, D., Malik, A., Wolski, E., Eickhoff, H., Lehrach, H., Herzel, H. (2000) Normalization strategies for cDNA microarrays. Nucleic Acids Res., 28, e41.
Tusher, V., Tibshirani, R., Chu, G. (2001) Significance analysis of microarrays applied to ionizing radiation response. Proc. Natl Acad. Sci. USA, 98, 5116–5121.
Wang, S. and Ethier, S. (2004) A generalized likelihood ratio test to identify differentially expressed genes from microarray data. Bioinformatics, 20, 100–104.
Wolfinger, R.D., Gibson, G., Wolfinger, E.D., Bennett, L., Hamadeh, H., Bushel, P., Afshari, C., Paules, R.S. (2001) Assessing gene significance from cDNA microarray expression data via mixed models. J. Comput. Biol., 8, 625–637.