Skip Navigation


Bioinformatics Advance Access originally published online on July 4, 2006
Bioinformatics 2006 22(17):2107-2113; doi:10.1093/bioinformatics/btl361
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
22/17/2107    most recent
btl361v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (8)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Liu, X.
Right arrow Articles by Rattray, M.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Liu, X.
Right arrow Articles by Rattray, M.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Probe-level measurement error improves accuracy in detecting differential gene expression

Xuejun Liu 1, Marta Milo 2, Neil D Lawrence 2 and Magnus Rattray 1,*

1 School of Computer Science, University of Manchester Oxford Road, Manchester M13 9PL, UK
2 Department of Computer Science, University of Sheffield Regent Court 211 Portobello Street, Sheffield S1 4DP, UK

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 RESULTS AND DISCUSSION
 4 CONCLUSION
 REFERENCES
 

Motivation: Finding differentially expressed genes is a fundamental objective of a microarray experiment. Numerous methods have been proposed to perform this task. Existing methods are based on point estimates of gene expression level obtained from each microarray experiment. This approach discards potentially useful information about measurement error that can be obtained from an appropriate probe-level analysis. Probabilistic probe-level models can be used to measure gene expression and also provide a level of uncertainty in this measurement. This probe-level measurement error provides useful information which can help in the identification of differentially expressed genes.

Results: We propose a Bayesian method to include probe-level measurement error into the detection of differentially expressed genes from replicated experiments. A variational approximation is used for efficient parameter estimation. We compare this approximation with MAP and MCMC parameter estimation in terms of computational efficiency and accuracy. The method is used to calculate the probability of positive log-ratio (PPLR) of expression levels between conditions. Using the measurements from a recently developed Affymetrix probe-level model, multi-mgMOS, we test PPLR on a spike-in dataset and a mouse time-course dataset. Results show that the inclusion of probe-level measurement error improves accuracy in detecting differential gene expression.

Availability: The MAP approximation and variational inference described in this paper have been implemented in an R package pplr. The MCMC method is implemented in Matlab. Both software are available from http://umber.sbs.man.ac.uk/resources/puma

Contact: magnus.rattray{at}manchester.ac.uk

Supplementary Information: Supplementary data are available at Bioinformatics Online.


    1 INTRODUCTION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 RESULTS AND DISCUSSION
 4 CONCLUSION
 REFERENCES
 
Microarrays (Schena et al., 1995; Lockhart et al., 1996) are widely used for large-scale measurements of gene expression levels. Microarray experiments are a complicated multiple step procedure and variability exists in every step of the experiment. The generated data are therefore notoriously noisy. In order to provide meaningful information from these data, various levels of analysis should be conducted. The first stage is probe-level analysis which provides a summary of the expression level for each gene. The type of downstream analysis to be performed then depends on the particular biological questions being addressed. Detecting differentially expressed genes is often the most basic aim of a microarray experiment. Affymetrix GeneChips are a popular choice of microarray technology and in this article we focus on detecting differential gene expression using these arrays.

Owing to the high variability in microarray data, replicate arrays are often used to obtain improved accuracy and reproducibility. Because of the high cost of the experiment, the number of chips for each condition is usually small, typically 2–4 replicates. There are two main reasons that make the detection of differential gene expression difficult in this context:

  1. Microarray experiments are associated with low precision probe-level measurements, especially for weakly expressed genes (probe-level measurement error).
  2. The small number of replicates makes it difficult to obtain an accurate variance estimate for each gene across replicates (between-replicate variance).

Many different approaches have been devised to address the second difficulty and obtain accurate between-replicate variance, such as the widely used methods Cyber-T (Baldi and Long, 2001) and SAM (Tusher et al., 2001). Most of these methods are based on single point estimates of gene expression values that can be obtained from probe-level analysis methods, such as MAS 5.0 (Affymetrix, 2001) or RMA (Irizarry et al., 2003) for Affymetrix arrays. A small number of replicates will often lead to an underestimated variance. To overcome this problem, Cyber-T implements a regularized t-test and SAM adds a regularizing constant to the gene-specific standard deviation. The reader is referred to Delmar et al. (2005) and Yang et al. (2005) for a detailed review of other recent methods for detecting differentially expressed genes.

Few methods have been proposed to include probe-level measurement error in finding differential gene expression. Each gene on the Affymetrix array has 11–20 probe-pairs and these provide useful information about probe-level measurement error that is typically discarded. Probe-level probabilistic models, gMOS (Milo et al., 2003), BGX (Hein et al., 2005), mgMOS and multi-mgMOS (Liu et al., 2005), have been developed and provide an estimate of this probe-level measurement error. Our most recent model, multi-mgMOS, provides accurate gene expression measurements along with the associated uncertainty in these measurement (Liu et al., 2005). It has been shown that the measurement variance obtained from probe-level analysis can be propagated through the downstream analysis using probabilistic methods, thereby improving the performance of the analysis (Sanguinetti et al., 2005).

In this contribution, we demonstrate the usefulness of the measurement error in finding differentially expressed genes. We propose a Bayesian hierarchical model to incorporate the probe-level measurement error into the variance estimate of gene expression levels. The probe-level measurements from multi-mgMOS are used in our work. The introduction of an additional variance term makes Bayesian hyper-parameter estimation intractable and we use a variational method for approximate parameter inference. We also compare the variational method with maximum a posteriori (MAP) and Markov chain Monte Carlo (MCMC) in terms of computational efficiency and accuracy. Results from an artificial spike-in dataset and a real mouse time-course dataset show that the inclusion of probe-level measurement error improves the accuracy of finding differentially expressed genes.


    2 METHODS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 RESULTS AND DISCUSSION
 4 CONCLUSION
 REFERENCES
 
Biological replicates are usually used in an experiment to obtain more confident gene expression measurements. For each gene g, a Gaussian distribution of logged expression level across replicates is commonly assumed (Baldi and Long, 2001; Delmar et al., 2005; Krohn et al., 2005), Formula, where i is the index of replicate and j is the index of condition. The parameter Formula is the mean logged expression level under condition j and Formula is the inverse of the between-replicate variance (we use this form because it is more natural to place a prior on Formula). We take the probe-level measurement error Formula into consideration, so the measured expression level Formula can be expressed as Formula. We assume a zero-mean Gaussian measurement noise, Formula, with variance Formula obtained from the probe-level analysis (Liu et al., 2005; Sanguinetti et al., 2005). After including the probe-level measurement variance in the model we have

Formula 1(1)

From the probabilistic probe-level analysis we can obtain measured logged gene expression level Formula 1 for each gene on each single chip together with a measurement variance Formula 1. The parameters Formula 1 are to be estimated from this data. Each gene is treated independently on the array and we focus on each individual gene, so we omit the index of the gene, g, in the following equations.

2.1 Likelihood and prior
We consider a full Bayesian approach for the combination of replicate signals. With the absence of probe-level measurement variances it has been widely accepted that Formula 1 and Formula 1 should be dependent variables. However, it is unclear whether this remains true when we account for probe-level measurement error. We therefore make a prior assumption that Formula 1 and Formula 1 are independent and select a Gaussian prior over Formula 1,

Formula 2(2)
where Formula 2 and Formula 2 are hyperparameters. In practice, there are tens of thousands of genes to be processed in each experiment. In order to make our model simple for efficient computation, we put non-informative hyperpriors on Formula 2 and Formula 2.

We assume that Formula 2 is shared across different conditions and measures the gene-specific variability. We denote the common variance as Formula 2. In practice many experiments consider only a small number of conditions with a small number of replicates. To reduce the effect of the small size of dataset, we select a conjugate gamma prior over {lambda},

Formula 3(3)
where Ga(·) denotes the Gamma distribution with hyperparameters {alpha} and ß.

The different contributions to the variance play similar roles to the contributions in Rocke and Durbin's two-component model (Rocke and Durbin, 2001). In their model the additive measurement noise dominates for weakly expressed genes, similar to our measurement noise variance Formula 3, while the multiplicative component in their model corresponds to our replicate variance parameter Formula 3, which dominates for highly expressed genes. The difference here is that we use the internal probe replication on Affymetrix arrays to estimate Formula 3.

We now have a hierarchical model with parameters Formula 3 and hyperparameters Formula 3, where µ represents the set of all Formula 3. We are interested in the posterior distribution Formula 3 which summarizes the information about the mean expression value of replicates under each condition. With the assumption of the independence of observations, the likelihood of the observed data D is

Formula 4(4)
where Formula 4, c is the number of conditions and Formula 4 is the number of replicates under condition j. The prior on parameters is given by

Formula 5(5)

2.2 Parameter estimation
With the introduction of measurement error in (1), Bayesian inference becomes intractable. We use various parameter estimation methods and we compare the different methods in terms of efficiency and accuracy. We use MAP for crude estimation, a variational method for approximate Bayesian inference and MCMC for more accurate Bayesian inference.

2.2.1 MAP approximation
With the selected priors on {theta} in (5) there is no closed form solution for the marginal likelihood Formula 5. We obtain rough inference based on MAP estimates, Formula 5, of the posterior Formula 5 (Gelman et al., 2004). The conditional posterior of Formula 5, Formula 5, given other parameters as fixed at their modal values, is

Formula 6(6)
where Formula 6. The crude estimates are calculated simply and efficiently (see Gelman et al., 2004, p. 276) but discard the variability in the parameters. For more details please refer to the Supplementary Material.

2.2.2 Variational Inference
In order to account for variability in parameters, we use the Expectation-Maximisation (EM) algorithm (Dempster et al., 1977) combined with a variational method (Ghahramani and Beal, 2001; Jordan et al., 1999) to optimise a lower bound on Formula 6 and work out the posterior distribution Formula 6.

The log marginal likelihood, Formula 6, of data D is a function of {phi},

Formula 7(7)

The integral is intractable, so we use a distribution Q({theta}) over {theta} and Jensen's inequality to get a lower bound on Formula 7:

Formula 8(8)

The lower bound of Formula 8 is known as the Kullback–Leibler (KL) divergence (except for an additive constant) which measures the discrepancy between Q({theta}) and Formula 8. If there are no constraints on the form of Q({theta}), optimizing the KL-divergence results in

Formula 9(9)

We use an EM algorithm to optimize (8) with respect to Q({theta}) and {phi} iteratively. Starting from some initial hyper-parameters {phi}0:

Formula 10(10)

At the E-step the posterior distribution Formula 10 is not tractable, however by introducing constraints on the form of Q({theta}) we can approximate the distribution. In variational inference a factorization constraint on distribution Q is often used. Instead of optimizing (8) over the whole Q, we optimize with respect to the distribution of disjoint subsets of {theta}. For our model, we have assumed that {lambda} is independent of Formula 10, so a reasonable approximation should be to assume Q({theta}) factorizes as

Formula 11(11)

We substitute the factorized Q distribution into (8) and optimize it with respect to Q(µ) and Q({lambda}) respectively. The optimal expressions of Q(µ) and Q({lambda}) are

Formula 12(12)

Formula 13(13)
where <·> denotes the expectation of a function with respect to Q({lambda}) or Q(µ) and f({lambda}) denotes Formula 13. In the variational approach, when there is no standard form for the Q distribution, importance sampling can be used to obtain the required expectations (Lawrence et al., 2004). Equation (13) suggests an importance sampling procedure to estimate an expectation of interest under the distribution Q({lambda}). If we take n samples {lambda}k from Ga({alpha}t, ßt), the approximation to a function g({lambda}) is

Formula 14(14)

At each E-step the distributions Q(µ) and Q({lambda}) are calculated iteratively until the parameters have converged. When the whole EM algorithm in (10) converges, Q(µ) in (12) is the approximated posterior distribution P(µ|D,{phi}) of mean gene expression level. From (12) we can see that as the number of replicates increases, we obtain more confidence in the mean expression level since the inverse variance grows with the number of replicates.

2.2.3 MCMC
Intractable hierarchical Bayesian models are usually solved by random sampling from the posterior distribution of model parameters. We apply standard MCMC to summarize the Bayesian inference of our model by assuming a flat uniform prior on µ0 and flat Gamma priors on {lambda} and {eta}0 with shape and inverse scale both 0.001. Gibbs sampling is used for updates of parameters µ, µ0 and {eta}0. The remaining parameter log({lambda}) is updated by the Metropolis algorithm using a Gaussian proposal distribution. For more details please refer to the Supplementary Material.

2.3 Significance of differential expression
Once the posterior distribution P(µ|D,{phi}) is obtained, it is possible to compute the significance of differential expression between any two conditions. Taking a treatment and a control (indicated by 1 and 2 respectively), for instance, the probability of positive log-ratio (PPLR), P1 > µ2|D, {phi}), can be calculated by

Formula 15(15)

Equation (15) gives the posterior probability of increased expression in a treatment compared with a control. One can find the up-regulated genes by setting a level of confidence, like an {alpha}-level in a conventional statistical test. Down-regulated genes can also be found using an equation similar to (15) by calculating the integral of P1 – µ2 | D, {phi}) over (–{infty},0).

2.4 Implementation and computation time
During the developmental stage of our models, we implement all three parameter estimation approaches in Matlab. For importance sampling in the variational method, we draw 1000 samples once at each EM iteration. At each E-step, Q(µ) and Q({lambda}) are considered to have converged when parameters change by <10–6 per iteration. For the implementation of MCMC we simulate 10 parallel sequences, each of length 200. We monitor the convergence of the iterative simulation by estimating the potential scale reduction Formula 15 (Gelman et al., 2004) for all parameters. If Formula 15, we draw another 200 samples for each sequence. After discarding the first half of the simulations and mixing the remaining simulated sequences, the length of the effective simulations are between 1000 and 3000 for most genes. We use the fast C program donlp2 (Spellucci et al., 1998) for parameter optimization.

To process the golden spike-in dataset (described in Section 3.1), MAP approximation, variational inference and MCMC take 5 min, 4 and 50 h, respectively, using a 1.8 GHz AMD Opteron machine with 512 MB RAM. We conclude that MCMC is too computationally expensive to use in practice although it is useful to compare with other methods (see Section 3.3). Variational inference is a good compromise between computational efficiency and accuracy. MAP estimation is very fast, but less accurate, so we recommend it for crude inference. For more accurate inference, one can choose to use the variational method. We have implemented both MAP estimation and variational inference in an R-package pplr for public use of our models. For MCMC we provide the Matlab implementation for comparison.


    3 RESULTS AND DISCUSSION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 RESULTS AND DISCUSSION
 4 CONCLUSION
 REFERENCES
 
3.1 Data sets
We compare our method with one of the most popular approaches, Cyber-T (Baldi and Long, 2001), to show the improvement obtained by including probe-level measurement variance. Two datasets are used to perform this comparison. One is a wholly defined spike-in dataset called the golden spike-in dataset (Choe et al., 2005). The other is a real-world mouse time-course dataset which was used to identify hair-cycle associated genes by (Lin et al., 2004).

The golden spike-in dataset includes two conditions each of which has three replicate chips. This dataset contains a large number of differentially expressed genes with known fold-change, 1.2- to 4-fold, and provides enough true positives to obtain adequate statistics. The two conditions, control and spike, are labelled as ‘C’ and ‘S’ respectively. The S sample contains the same cRNAs as the C sample except that some selected groups of cRNAs are present at a defined different increasing concentration in the S sample. The remaining cRNAs are at identical concentration. This leads to 1331 up-regulated genes and 12 679 invariant genes. For more details please refer to Choe et al. (2005).

The mouse time-course dataset is used by Lin et al. (2004) to discover regulators in hair-follicle morphogenesis and cycling by measuring gene expression patterns in mouse back skin at eight representative time points. Each time point has three or four biological replicates. Eight hair-growth associated genes are validated by quantitative real-time PCR (qr-PCR) experiments. The qr-PCR data at each time point have three replicate measurements. For the first hair-growth cycle microarray data includes five time points (day 1, 6, 14, 17 and 23) and qr-PCR data cover eight time points (day 1, 5, 8, 12, 15, 17, 19 and 23). They have three time points in common (day 1, 17 and 23). Please refer to Lin et al. (2004) for more information.

3.2 Making use of measurement error
Before we apply the Bayesian hierarchical model on replicated data, we would like to show the usefulness of the measurement error from multi-mgMOS in a single chip experiment. We select chip C1 and S1 from the golden dataset as the control and treatment chips. Since all spiked-in genes in S1 are up-regulated, we use (15) to calculate PPLR in S1 compared with C1 using the measured gene expression values and variances from multi-mgMOS. The histogram of PPLR is shown in Figure 1a. The PPLRs of most spike-in genes are close to 1 which shows the high confidence of the increasing gene expression in S1. This is consistent with the golden data where all spike-in genes are up-regulated. Most invariant genes have PPLR close to 0.5 and there is no obvious evidence for these genes to be differentially expressed. Figure 2 shows the top 50 differential expression ranking between chip C1 and S1 by (1) simple fold-change and (2) PPLR. Without considering the measurement variance, multi-mgMOS obtains a large number of false positives by fold-change ranking. After taking the credibility intervals into account, there are now no false positives in the top 50 positions.


Figure 1
View larger version (11K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 1 Histogram shows probability of positive log-ratio (PPLR) between (a) C1 and S1 and (b) replicated condition C and S in golden dataset. The histogram is a stack of non-spike-in genes and spike-in genes. The white is for non-spike-in genes and the shade is for the spike-in genes.

 

Figure 2
View larger version (16K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 2 A total of 5–95% credibility intervals of positive log-ratio between S1 and C1 in the golden dataset. The left figure shows the top 50 most significantly differentially expressed genes ranking by log-ratio and the right is the ranking by the PPLR between S1 and C1. Stars represent the mean of log-ratio. Spike-in genes are indicated by a square box. Without considering the measurement error, log-ratio ranking (a) obtains a much larger false positive rate than PPLR ranking (b).

 
The average receiver operator characteristic (ROC) curves for all nine possible single chip-pairs between condition C and S in the golden spike-in dataset are plotted in Figure 3s. As well as the two methods using multi-mgMOS, as shown in Figure 2, we also include a combined method suggested in Choe et al. (2005) which we denote as cMAS 5.0 because the major procedures in this method come from MAS 5.0. The results in Choe et al. (2005) show that cMAS 5.0 performs best among current statistical methods, including RMA (Irizarry et al., 2003) and GCRMA (Wu et al., 2004), on the golden spike-in dataset. We used the same form of loess normalization for all methods as used in Choe et al. (2005) to make results comparable. The area under ROC curves are 0.9226, 0.8062 and 0.7869 for PPLR of multi-mgMOS, fold-change of cMAS 5.0 and fold-change of multi-mgMOS, respectively. We notice that at the upper-right part of ROC curves PPLR results obtains slightly lower true positive rate than the other two methods. In practice we are often more concerned with obtaining a low false positive rate. For a reasonable number of false positives, on the left of the ROC curves, PPLR is much better than the other two alternatives. These results show that the uncertainty of the estimated expression level helps in detecting differential gene expression where there is only a single chip for each condition.


Figure 3
View larger version (12K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 3 ROC curves for (a) all nine possible single chip-pairs and (b) replicated conditions in golden dataset. Three methods are used in each case, (a) simple fold-change of multi-mgMOS and cMAS 5.0 gene expression, and probability of positive log-ratio (PPLR) of multi-mgMOS gene expression measurements together with measurement error, (b) multi-mgMOS combined with Bayesian hierarchical model and Cyber-T respectively, and cMAS 5.0 with Cyber-T. Curves from Cyber-T are obtained using a Bayesian t-test and the curve from the Bayesian hierarchical model is plotted by calculating PPLR between sample S and sample C.

 
3.3 Combining replicates
In practice, people usually use replicates to estimate a level of uncertainty with the estimated gene expression level. We use the proposed Bayesian hierarchical method to include the measurement error of replicates and improve the estimation of the uncertainty of gene expression measurements.

3.3.1 Comparison of estimation accuracy
MCMC simulation obtains reliable results when it converges. We use the results from MCMC as a gold standard to evaluate the accuracy of MAP estimation and variational inference. Figure 4 shows the distribution of signal log ratio between days 14 and 1 for two probe-sets in the mouse time-course dataset. The distribution shown in Figure 4a is for one probe-set of gene Dab2 which is a known hair-growth associated gene. The estimated expression levels of this probe-set are variable over different time points. MAP approximation and variational inference obtain similar accuracy for this probe-set. The probe-set shown in Figure 4b is randomly selected and is not obviously hair-growth related, so the signal log ratio is not as high as for the probe-set in Figure 4a. For this probe-set, the variational method is closer than the MAP approximation to MCMC. In general we find that the posterior estimates of the variational approach are closer than those provided by the MAP approximation to MCMC results. In the following examples, we therefore use the variational method.


Figure 4
View larger version (14K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 4 Distribution of log ratio of expression level between days 14 and 1 for (a) one probe-set of gene Dab2 and (b) a randomly selected probe-set in the mouse time-course dataset. Three different parameter estimation methods are used: MAP estimation, variational approximation and MCMC.

 
3.3.2 Performance on artifical dataset
Figure 1b shows the histogram of PPLR between replicated condition C and S in the golden spike-in dataset. In addition to more spike-in genes moving close to one, non-spike-in genes get tighter 0.5 compared with the histogram of PPLR between C1 and S1 in Figure 1a. We therefore obtain more confidence in the up-regulated genes and the invariant genes. We plot ROC curves (Fig. 3b) on the golden spike-in dataset to show the ability of our proposed combination method compared with the widely used approach Cyber-T which does not consider the measurement variance. From Figure 3b we can see that the inclusion of measurement error does improve the ability of multi-mgMOS to detect differentially expressed genes. The area under ROC curves for the Bayesian hierarchical method and Cyber-T on results from multi-mgMOS are 0.9431 and 0.9310, respectively, and 0.9306 for cMAS 5.0 combined with Cyber-T. More results from popular statistical methods combined with Cyber-T on the golden spike-in dataset are presented in Choe et al. (2005).

3.3.3 Performance on real dataset
In order to examine our method in an application on a real-world experiment, we applied the method to the mouse time-course dataset. Figure 5 is the temporal profile of one probe-set related to gene Dab2 in this dataset. The first hair-growth cycle is shown, which is covered by five time points. It can be seen that the combined signal obtains more confidence for each condition compared with the signal from each individual replicate and remains consistent with the qr-PCR profile.


Figure 5
View larger version (13K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 5 Temporal profile of one probe-set of gene Dab2 in the mouse time-course dataset. Thin solid lines are the estimated results from multi-mgMOS for three replicate chips and the thick solid line is the combined result from the hierarchical Bayesian model. At each time point 2.5–97.5% credibility intervals are shown. The dotted line is the quantitative real-time PCR profile.

 
The qr-PCR data in the mouse time-course dataset have three measurements at each time point. We make use of these replicate values and process them using Cyber-T at those time points that were also measured by microarray data (day 1, 17 and 23). Using the obtained statistical values of qr-PCR data, we validate the PPLR calculated from our methods and compare them with the results from Cyber-T using expression measurements obtained from cMAS 5.0 and the popular probe-level processing method GCRMA (Wu et al., 2004). The results generated from Cyber-T are associated with a P-value which does not have the same meaning as for PPLR and we cannot compare them dirrectly. In order to make them comparable, we select different credibility levels so that methods obtain a similar number of significant genes. For PPLR we set the significant level at 0.06, while the P-values for the other methods are set at 0.01. The number of significant genes at the different credibility levels for different methods are shown in the lower part of Table 1. A global scaling normalization is used for results from multi-mgMOS and cMAS 5.0 at the probe-set level, and GCRMA uses quantile normalisation at the probe level. The eight PCR validated genes have 14 associated probe-sets shown in Table 1. We find the differential gene expression between two pairs of time points, (day 1, 17) and (day 17, 23). We exclude day 1 of gene Crisp1 owing to the absence of replicate measurements of qr-PCR data at this time point. There are 27 tests altogether in this data. From these 27 tests PPLR obtains two inconsistent results with qr-PCR, while cMAS 5.0+Cyber-T and GCRMA+Cyber-T obtain three. Results from our method therefore show more consistency with results from qr-PCR data.


View this table:
[in this window]
[in a new window]

 
Table 1 Finding differential gene expression among eight qr-PCR validated genes in a mouse time-course dataset. qr-PCR data, cMAS 5.0 and GCRMA expression measurements are processed by Cyber-T to obtain P-values and multi-mgMOS estimates are processed by the Bayesian hierarchical model to calculate the probability of positive log-ratio (PPLR). For Cyber-T results we set a credibility level at {alpha} = 0.01. ‘S’ stands for significant differential expression, and ‘N’ not significant. For comparison with P-value, min(PPLR, 1–PPLR) is shown in the table. We set a comparable credibility for PPLR, {alpha} = 0.06, which means genes which have at least 94% probability of signal change are considered as significantly differentially expressed. The numbers of significant genes for different methods at different credibility levels are shown in the lower part of the table

 

    4 CONCLUSION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 RESULTS AND DISCUSSION
 4 CONCLUSION
 REFERENCES
 
We have presented an approach using probe-level measurement error in order to improve the detection of differential gene expression and compared three different computation methods, MAP approximation, a variational method and MCMC, to solve the intractability in the model owing to the incorporation of probe-level measurement error. This approach makes full use of information in microarray experimental data and performs relatively efficient and accurate inference based on this rich information. Results on a spike-in dataset and a real time-course dataset show that the inclusion of probe-level measurement error improves the accuracy of finding differentially expressed genes, especially when there are few replicate chips for conditions. Recent work on PCA demonstrates that probe-level uncertainties can also be incorporated into other probabilistic models, leading to an improved method (Sanguinetti et al., 2005). Where possible we recommend that probe-level measurement error should be included in the downstream analysis in order to obtain more reasonable biological conclusions from a microarray experiment.


    Acknowledgments
 
The authors gratefully acknowledge Padhraic Smyth and Bogi Andersen for providing the mouse time-course qr-PCR data, Leo Zeef for helpful suggestions on the software and three anonymous reviewers for constructive comments. M.R. and N.D.L. gratefully acknowledge a BBSRC award ‘Improved processing of microarray data with probabilistic models’. M.M. is supported by an Advanced Training Fellowship from the Wellcome Trust.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Martin Bishop

Received on March 30, 2006; revised on June 5, 2006; accepted on June 26, 2006

    REFERENCES
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 RESULTS AND DISCUSSION
 4 CONCLUSION
 REFERENCES
 

    Affymetrix. Microarray Suite User Guide Version 5.0, (2001) , Santa Clara, CA Affymetrix Inc.

    Baldi, P. and Long, A.D. (2001) A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics, 17, 509–519[Abstract/Free Full Text].

    Choe, S.E., et al. (2005) Preferrend analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset. Genome Biol, . 6, R16[CrossRef][Medline].

    Delmar, P., et al. (2005) VarMixt: efficient variance modelling for the differential analysis of replicated gene expression data. Bioinformatics, 21, 502–508[Abstract/Free Full Text].

    Dempster, A.P., et al. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat Soc, . B39, 1–38.

    Gelman, A., et al. Bayesian Data Analysis, (2004) 2nd , Boca Raton, FL Chapman & Hall/CRC.

    Ghahramani, Z. and Beal, M.J. (2001) Graphical models and variational methods. In Opper, M. and Saad, D. (Eds.). Advanced Mean Field Methods—Theory and Practice, , Cambridge, MA MIT Press, pp. 161–177.

    Hein, A.K., et al. (2005) BGX: a fully Bayesian integrated approach to the analysis of Affymetrix GeneChip data. Biostatistics, 6, 349–373[Abstract/Free Full Text].

    Irizarry, R.A., et al. (2003) Exploration, normalization and summaries of high density oligonucleotide array probe level data. Biostatistics, 4, 249–264[Abstract].

    Jordan, M.I., et al. (1999) An introduction to variational methods for graphical models. Mach. Learn, . 37, 183–233[CrossRef].

    Krohn, K., et al. (2005) Increased power of microarray analysis by use of an algorithm based on a multivariate procedure. Bioinformatics, 21, 3530–3534[Abstract/Free Full Text].

    Lawrence, N.D., et al. (2004) Reducing the variability in cDNA microarray image processing by Bayesian inference. Bioinformatics, 20, 518–526[Abstract/Free Full Text].

    Lin, K.K., et al. (2004) Identification of hair cycle-associated genes from time-course gene expression profile data by using replicate variance. Proc. Natl Acad. Sci. USA, 101, 15955–15960[Abstract/Free Full Text].

    Liu, X., et al. (2005) A tractable probabilistic model for Affymetrix probe-level analysis across multiple chips. Bioinformatics, 21, 3637–3644[Abstract/Free Full Text].

    Lockhart, D.J., et al. (1996) Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat. Biotechnol, . 14, 1675–1680[CrossRef][Web of Science][Medline].

    Milo, M., et al. (2003) A probabilistic model for the extraction of expression levels from oligonucleotide arrays. Biochem. Soc. Trans, . 31, 1510–1512[Web of Science][Medline].

    Rocke, D.M. and Durbin, B. (2001) A model for measurement error for gene expression arrays. J. Comput. Biol, . 8, 557–569[CrossRef][Web of Science][Medline].

    Sanguinetti, G., et al. (2005) Accounting for probe-level noise in principal component analysis of microarray data. Bioinformatics, 21, 3748–3754[Abstract/Free Full Text].

    Schena, M., et al. (1995) Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science, 270, 467–470[Abstract/Free Full Text].

    Spellucci, P. (1998) A SQP method for general nonlinear programs using only equality constrained subproblems. Math. Program, . 82, 413–448.

    Tusher, V.G., et al. (2001) Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl Acad. Sci. USA, 98, 5116–5121[Abstract/Free Full Text].

    Wu, Z., et al. (2004) A model based background adjustment for oligonucleotide expression arrays. J. Am. Stat. Assoc, . 99, 909–917[CrossRef][Web of Science].

    Yang, Y.H., et al. (2005) Identifying differentially expressed genes from microarray experiments via statistic synthesis. Bioinformatics, 21, 1084–1093[Abstract/Free Full Text].


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
Physiol. GenomicsHome page
D. G. Morris, S. M. Waters, S. D. McCarthy, J. Patton, B. Earley, R. Fitzpatrick, J. J. Murphy, M. G. Diskin, D. A. Kenny, A. Brass, et al.
Pleiotropic effects of negative energy balance in the postpartum dairy cow on splenic gene expression: repercussions for innate and adaptive immunity
Physiol Genomics, September 1, 2009; 39(1): 28 - 37.
[Abstract] [Full Text] [PDF]


Home page
Physiol. GenomicsHome page
I. M. Packham, C. Gray, P. R. Heath, P. G. Hellewell, P. W. Ingham, D. C. Crossman, M. Milo, and T. J. A. Chico
Microarray profiling reveals CXCR4a is downregulated by blood flow in vivo and mediates collateral formation in zebrafish embryos
Physiol Genomics, August 7, 2009; 38(3): 319 - 327.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
P. Geeleher, D. Morris, J. P. Hinde, and A. Golden
BioconductorBuntu: a Linux distribution that implements a web-based DNA microarray analysis server
Bioinformatics, June 1, 2009; 25(11): 1438 - 1439.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
B. Jayawardhana, D. B. Kell, and M. Rattray
Bayesian inference of the sites of perturbations in metabolic pathways via Markov chain Monte Carlo
Bioinformatics, May 1, 2008; 24(9): 1191 - 1197.
[Abstract] [Full Text] [PDF]


Home page
Toxicol SciHome page
J. B. Silkworth, E. A. Carlson, C. McCulloch, K. Illouz, S. Goodwin, and T. R. Sutter
Toxicogenomic Analysis of Gender, Chemical, and Dose Effects in Livers of TCDD- or Aroclor 1254-Exposed Rats Using a Multifactor Linear Model
Toxicol. Sci., April 1, 2008; 102(2): 291 - 309.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
22/17/2107    most recent
btl361v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (8)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Liu, X.
Right arrow Articles by Rattray, M.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Liu, X.
Right arrow Articles by Rattray, M.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?