Bioinformatics Advance Access originally published online on February 2, 2005
Bioinformatics 2005 21(9):1995-2000; doi:10.1093/bioinformatics/bti302
Published by Oxford University Press 2005.
Evaluation of the gene-specific dye bias in cDNA microarray experiments
1URGV UMR INRA 1165 - CNRS 8114 - UEVE, 2 rue Gaston Crémieux, CP 5708, 91057 Evry Cedex, France
2UMR INAPG/ENGREF/INRA MIA 518, 16 rue C. Bernard, 75231 Paris Cedex 05, France
3Laboratoire d'Immunologie Virale, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris, France
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: In cDNA microarray experiments all samples are labeled with either Cy3 or Cy5. Systematic and gene-specific dye bias effects have been observed in dual-color experiments. In contrast to systematic effects which can be corrected by a normalization method, the gene-specific dye bias is not completely suppressed and may alter the conclusions about the differentially expressed genes.
Methods: The gene-specific dye bias is taken into account using an analysis of variance model. We propose an index, named label bias index, to measure the gene-specific dye bias. It requires at least two self-self hybridization cDNA microarrays.
Results: After lowess normalization we have found that the gene-specific dye bias is the major source of experimental variability between replicates. The ratio (R/G) may exceed 2. As a consequence false positive genes may be found in direct comparison without dye-swap. The stability of this artifact and its consequences on gene variance and on direct or indirect comparisons are addressed.
Availability: http://www.inapg.inra.fr/ens_rech/mathinfo/recherche/mathematique
Contact: mlmartin{at}inapg.fr
| INTRODUCTION |
|---|
Many experimenters and statisticians (Kerr et al., 2002; Churchill, 2002) recommend using a dye-swap design in cDNA microarray experiments to correct gene-specific dye bias. This artifact is not suppressed by normalization procedures such as the lowess (Yang et al., 2002). For a reference design some experimenters claim that dye-swaps are not necessary (Sterrenburg et al., 2002) whereas others use a dye-swap design to preclude gene-specific dye bias (Pritchard et al., 2001; Brem et al., 2002). In direct comparison, even when the labeling artifact is better recognized, its consequences are often minimized. For example Yue et al. (2001) wrote "Any variation observed in differential expression was likely a result of real variations in experimental mRNA levels rather than an artifact of the labeling system." Tseng et al. (2001) described the gene*label interaction but concluded "Theoretically some degree of gene-label interaction may exist. However this interaction appears to be insignificant in magnitude compared to other sources of variation in the present experiment."
To our knowledge, few papers have investigated the influence of the gene-specific dye bias: Dombkowski et al. (2004) have shown that dye orientation can significantly influence results on differential analysis in a reference design. They have estimated that over 20% of the conclusions of their differential analysis may be inaccurate using an approach with single dye orientation. They did not identify the cause of the bias, but have urged the experimenters to use dye-swap until this artifact is better characterized. Rosenzweig et al. (2004) have investigated the nature of the gene-specific dye bias on a direct comparison experiment. Their analysis suggests that this artifact may concern the same probes. They proposed in their paper a new and less expensive design than the dye-swap, which attenuates the gene-specific dye bias but does not completely correct it.
In this paper, we propose an index to evaluate the magnitude of the gene-specific dye bias. The idea of the index comes from an analysis of two self-self hybridization slides. When we analyzed them, we were surprised to obtain many differentially expressed genes. The reason is that the mean log-ratio log2(R1R2/G1G2) was wrongly calculated in place of log2(R1G2/G1R2), where Ri and Gi denote respectively the red and green intensity on array i. With the mean log-ratio log2(R1G2/G1R2), no differentially expressed genes were obtained, as expected. We were struck by the size of the effect of simply reversing the dyes. To better understand the phenomenon we have written the corresponding statistical model, and deduced an index to estimate the magnitude of the gene-specific dye bias.
The paper is organized as follows. In the next section we present the statistical model taking gene-specific dye bias into account, and an index [label bias index (LBI)] to evaluate the magnitude of this artifact. Next the LBI is computed on experiments concerning several array types and organisms. We note that it is almost constant for each array type but varies from one to another. One array type seems to have low gene-specific dye bias. We are not able to explain the reasons, but this fact shows that it is possible to control this artifact. Finally we discuss the consequence of the gene-specific dye bias in direct and indirect comparisons, and try to give some insight into the mechanism of this bias.
| METHODS |
|---|
This section is devoted to the statistical model. We underline the importance of keeping the interaction between gene and dye in the model to take gene-specific dye bias into account in the differential analysis, and we evaluate the gene-specific dye bias.
Model allowing for gene-specific dye bias
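A dye-swap experiment consists of two replicate microarrays on which opposite dye orientations are used. Thus each RNA sample is labeled with each dye. We consider an experiment in which p dye-swaps are made. To study the data, we use the analysis of variance. Our notation follows that of Kerr et al. (2002). Let $Y_{ijkg}$ be the base-2 logarithm of the measurement for array i, dye j, RNA sample k and gene g. We consider the following model:

$$Y_{ijkg} = \mu + A_i + D_j + (AD)_{ij} + G_g + (AG)_{ig} + (DG)_{jg} + (VG)_{kg} + E_{ijkg} \qquad (1)$$

where $\mu$ is the overall mean, $A_i$ the effect of array i, $D_j$ the effect of dye j, $(AD)_{ij}$ the array-dye interaction, $G_g$ the effect of gene g, $(AG)_{ig}$ the array-gene interaction, $(DG)_{jg}$ the dye-gene interaction, $(VG)_{kg}$ the interaction between RNA sample k and gene g, and the $E_{ijkg}$ are independent random variates with mean 0.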
To remove systematic biases, we perform an array-by-array normalization using the lowess procedure (Yang et al., 2002). It suppresses the first four constant terms, and is supposed to alleviate the DG terms and not to alter the VG terms. We refer to the work of Kerr et al. (2002) for an explanation. After the normalization step, the observed difference of expression between the two RNA samples on array i, denoted $Z_{ig}$, equals

$$Z_{ig} = (VG)_{1g} - (VG)_{2g} + (-1)^{i+1}\left[(DG)'_{1g} - (DG)'_{2g}\right] + F_{ig},$$

where the arrays are indexed $i = 1, \dots, 2p$ so that odd and even arrays carry opposite dye orientations, $(DG)'$ denotes the DG terms after normalization and $F_{ig}$ is the error term after normalization; therefore the errors $F_{ig}$ are not independent by construction, and they verify that $\sum_{g=1}^{G} F_{ig} = 0$. It implies a weak structural dependence of order 1/G between the $F_{ig}$. In the following we assume that the $F_{ig}$ are independent. The departure from this assumption is too weak to have any practical importance provided that $G \geq 1000$. The difference $(VG)_{1g} - (VG)_{2g}$ is the true difference of expression between the two RNA samples. It is the difference of interest for identifying differentially expressed genes. When it is non-null, the gene is not transcribed in the same manner in the two RNA samples. The difference $(DG)'_{1g} - (DG)'_{2g}$ represents what is called the gene-specific dye bias. When it is non-null, the probe corresponding to gene g incorporates one of the dyes preferentially. To simplify the notation, we denote the difference $(VG)_{1g} - (VG)_{2g}$ by $\delta_g$ and the difference $(DG)'_{1g} - (DG)'_{2g}$ by $\beta_g$. The observed difference for gene g between the two RNA samples on array i is now rewritten:

$$Z_{ig} = \delta_g + (-1)^{i+1}\beta_g + F_{ig} \qquad (2)$$

where $\delta_g$ is the differential expression of gene g and $\beta_g$ the specific dye bias of gene g. From this model we can estimate for each gene the differential expression between the two RNA samples and the gene-specific dye bias by

$$\hat\delta_g = \frac{1}{2p}\sum_{i=1}^{2p} Z_{ig}, \qquad \hat\beta_g = \frac{1}{2p}\sum_{i=1}^{2p} (-1)^{i+1} Z_{ig}.$$

If several dye-swaps are made ($p \geq 2$), we can also estimate the variance of $F_{ig}$, say $\sigma_g^2$, by the empirical estimator defined by

$$\hat\sigma_g^2 = \frac{1}{2p-2}\sum_{i=1}^{2p}\left(Z_{ig} - \hat\delta_g - (-1)^{i+1}\hat\beta_g\right)^2.$$
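The per-gene estimators of model (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_gene` is hypothetical, the input is the list of normalized differences of one gene over the 2p arrays, and the arrays are assumed to be ordered so that consecutive arrays have opposite dye orientations.

```python
def estimate_gene(z):
    """Estimate (delta, beta, variance) for one gene under model (2).

    z: observed differences for the gene on the 2p arrays, ordered so that
       positions 0, 2, 4, ... share one dye orientation and positions
       1, 3, 5, ... the reversed one (an assumption of this sketch).
    The variance is None when only one dye-swap is available (p = 1).
    """
    n = len(z)
    if n == 0 or n % 2 != 0:
        raise ValueError("expected an even, non-zero number of arrays")
    signs = [1 if i % 2 == 0 else -1 for i in range(n)]
    # Differential expression: plain mean over the arrays.
    delta = sum(z) / n
    # Gene-specific dye bias: mean with alternating signs.
    beta = sum(s * x for s, x in zip(signs, z)) / n
    variance = None
    if n > 2:
        # Two parameters are estimated per gene, hence the divisor 2p - 2.
        rss = sum((x - delta - s * beta) ** 2 for s, x in zip(signs, z))
        variance = rss / (n - 2)
    return delta, beta, variance


if __name__ == "__main__":
    print(estimate_gene([2.0, 0.0, 1.0, 1.0]))
```

For a single dye-swap the function still returns the two means, but no variance, which mirrors the over-parametrization of model (2) when p = 1.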
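It is then possible to perform a differential analysis and also an analysis of the gene-specific dye bias. For the latter purpose, it suffices to test the null hypothesis $\{\beta_1 = \dots = \beta_G = 0\}$ against the alternative hypothesis {at least one gene is such that $\beta_g \neq 0$}. The associated test statistic can be viewed as a global index to evaluate the gene-specific dye bias. It is easily and quickly computed. We name it the LBI and it is defined by

$$\mathrm{LBI} = \frac{2p\sum_{g=1}^{G}\hat\beta_g^2 \,/\, (G-1)}{\sum_{g=1}^{G}\sum_{i=1}^{2p}\left(Z_{ig} - \hat\delta_g - (-1)^{i+1}\hat\beta_g\right)^2 \,/\, [(2p-2)(G-1)]} \qquad (3)$$

where $Z_{ig}$ is the observed difference for gene g on array i, and $\hat\delta_g$ and $\hat\beta_g$ are the estimators of the differential expression and of the dye bias of gene g. Under the null hypothesis, and if the $F_{ig}$ are Gaussian with a common variance, the LBI is distributed as a Fisher distribution with [G − 1, (2p − 2)(G − 1)] degrees of freedom. The null hypothesis is rejected as soon as the test statistic is greater than $F_{G-1,(2p-2)(G-1)}(1-\alpha)$, where $F_{a,b}(\alpha)$ denotes the $\alpha$-quantile of a Fisher distribution with (a, b) degrees of freedom. Note that in practice, the null hypothesis may often be rejected since the power of the test is high. So to decide whether the gene-specific dye bias is important, the LBI can also be compared with the expectation of a Fisher distribution, given by 1 + {1/[(p − 1)(G − 1) − 1]}.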
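As an illustration, the index for an experiment with at least two dye-swaps can be computed as a ratio of mean squares and compared with the expectation quoted above. The sketch below assumes a gene-by-array table of normalized differences with alternating dye orientation; the names `lbi_dye_swap` and `fisher_expectation` are hypothetical, and no quantile is computed because the Python standard library has no Fisher distribution.

```python
def lbi_dye_swap(z):
    """Label bias index for p >= 2 dye-swaps.

    z[g][i]: observed difference for gene g on array i (i = 0 .. 2p-1),
             consecutive arrays having opposite dye orientations
             (an assumption of this sketch).
    """
    n_genes = len(z)
    n_arrays = len(z[0])
    if n_arrays < 4 or n_arrays % 2 != 0 or n_genes < 2:
        raise ValueError("need at least two genes and two dye-swaps")
    reg_ss = 0.0  # regression sum of squares, carried by the dye bias
    rss = 0.0     # residual sum of squares
    for row in z:
        delta = sum(row) / n_arrays
        beta = sum(x if i % 2 == 0 else -x for i, x in enumerate(row)) / n_arrays
        reg_ss += n_arrays * beta ** 2
        rss += sum(
            (x - delta - (beta if i % 2 == 0 else -beta)) ** 2
            for i, x in enumerate(row)
        )
    df1 = n_genes - 1
    df2 = (n_arrays - 2) * (n_genes - 1)
    return (reg_ss / df1) / (rss / df2)


def fisher_expectation(p, n_genes):
    """Expectation of the reference Fisher distribution, as given in the text."""
    return 1 + 1 / ((p - 1) * (n_genes - 1) - 1)


if __name__ == "__main__":
    toy = [[2.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 2.0]]
    print(lbi_dye_swap(toy), fisher_expectation(2, 1001))
```

An index well above the expectation suggests a sizeable gene-specific dye bias, whereas a value near 1 suggests that the bias is small relative to the residual variability.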
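Although it is possible to take the gene-specific dye bias into account, in many studies the authors prefer to neglect it (e.g. Tseng et al., 2001; Comander et al., 2004). This amounts to setting $\beta_g = 0$ for g = 1, ..., G in model (2). The variance of $F_{ig}$ is then estimated by

$$\tilde\sigma_g^2 = \frac{1}{2p-1}\sum_{i=1}^{2p}\left(Z_{ig} - \hat\delta_g\right)^2,$$

where $Z_{ig}$ is the observed difference for gene g on array i and $\hat\delta_g$ is its mean over the 2p arrays. The estimator $\tilde\sigma_g^2$ is a biased estimator of $\sigma_g^2$ if $\beta_g$ differs from 0. To be precise, the bias equals $2p\,\beta_g^2/(2p-1)$. Therefore wrongly assuming that the $\beta_g$ are null leads to overestimating the variance $\sigma_g^2$. Hence the power of the test for detecting a difference of expression will be lower when $\tilde\sigma_g^2$ is used in place of $\hat\sigma_g^2$: some differentially expressed genes will not be detected.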
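The size of this bias can be checked with a short calculation. The derivation below is a sketch under the assumptions that the observed differences follow model (2) with alternating signs $(-1)^{i+1}$ over the $2p$ arrays, that the errors $F_{ig}$ of a given gene are independent across arrays with mean 0 and variance $\sigma_g^2$, and that the variance is estimated with the divisor $2p-1$; the notation $Z_{ig}$, $\bar F_g$ is introduced here for illustration.

```latex
% Centered observation: the mean over arrays removes delta_g but not beta_g,
% because the alternating signs sum to zero over the 2p arrays.
Z_{ig} - \hat\delta_g = (-1)^{i+1}\beta_g + \left(F_{ig} - \bar F_g\right),
\qquad \bar F_g = \frac{1}{2p}\sum_{i=1}^{2p} F_{ig}.

% Expected sum of squares: the cross term has expectation zero.
\mathbb{E}\!\left[\sum_{i=1}^{2p}\left(Z_{ig} - \hat\delta_g\right)^2\right]
  = 2p\,\beta_g^2 + (2p-1)\,\sigma_g^2 .

% Dividing by 2p - 1 gives the expectation of the estimator and its bias.
\mathbb{E}\!\left[\tilde\sigma_g^2\right]
  = \sigma_g^2 + \frac{2p}{2p-1}\,\beta_g^2 .
```

With a single dye-swap ($p = 1$) the inflation is $2\beta_g^2$, so a gene with a large dye bias is assigned a variance dominated by the bias rather than by the experimental noise.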
When only one dye-swap is made, model (2) is over-parametrized: the number of parameters is larger than the number of observations. It is thus impossible to estimate simultaneously the difference of expression ($\delta_g$), the gene-specific dye bias ($\beta_g$) and the variance ($\sigma_g^2$). Only two parameters per gene can be estimated. Since the major interest is the differential analysis, the parameter $\beta_g$ is usually supposed to be null. In the following section, we propose a method to assess this assumption.
Evaluation of the gene-specific dye bias from self-self hybridization slides
As noted above, when only one dye-swap is available, the statistical model (2) is no longer usable to study the observed difference of expression between two different RNA samples. Nevertheless, if we consider self-self hybridization slides, where the same RNA sample is hybridized against itself, the true difference of expression is guaranteed to be null ($\delta_g = 0$) and thus model (2) becomes a one-way ANOVA model:

$$Z_{ig} = (-1)^{i+1}\beta_g + F_{ig},$$

where $Z_{ig}$ is the observed difference for gene g on array i. The LBI is the test statistic of the null hypothesis $\{\beta_1 = \dots = \beta_G = 0\}$. If $p \geq 1$, it is defined by:

$$\mathrm{LBI} = \frac{2p\sum_{g=1}^{G}\hat\beta_g^2 \,/\, (G-1)}{\sum_{g=1}^{G}\sum_{i=1}^{2p}\left(Z_{ig} - (-1)^{i+1}\hat\beta_g\right)^2 \,/\, [(2p-1)(G-1)]} \qquad (4)$$

where $\hat\beta_g = \frac{1}{2p}\sum_{i=1}^{2p}(-1)^{i+1}Z_{ig}$, for all g = 1, ..., G. Under the null hypothesis, the LBI is distributed as a Fisher distribution with [G − 1, (2p − 1)(G − 1)] degrees of freedom. Consequently the null hypothesis is rejected as soon as the LBI is greater than $F_{G-1,(2p-1)(G-1)}(1-\alpha)$. It readily follows that for p = 1,

$$\mathrm{LBI} = \frac{\sum_{g=1}^{G}\left(Z_{1g} - Z_{2g}\right)^2}{\sum_{g=1}^{G}\left(Z_{1g} + Z_{2g}\right)^2} \qquad (5)$$

and the null hypothesis is rejected as soon as the LBI is greater than $F_{G-1,G-1}(1-\alpha)$. As previously, the null hypothesis is often rejected since the number of degrees of freedom is of the magnitude of G. Consequently, to decide whether the gene-specific dye bias is important, the LBI can be compared with the expectation of the Fisher distribution, which is equal to (G − 1)/(G − 3) ≈ 1. The LBI gives a global overview of the gene-specific dye bias. It is also interesting to have a gene-by-gene approach. For that purpose we propose to test $\{\beta_g = 0\}$ for each gene. As in the differential analysis, it is important to model the variance suitably. We have chosen to use the mixture model of Delmar et al. (2004). This method identifies clusters of genes with equal variance, keeps good control of false positive genes and has good detection power. We use the Bonferroni method (with a type I error equal to 5%) in order to keep a strong control of the false positives in a multiple comparison context (Benjamini and Hochberg, 1995).
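For two self-self slides the index reduces to a ratio of two sums of squares, which can be sketched as follows. The example assumes that the input is the normalized log-ratio log2(R/G) of each probe on each slide, so that reversing the sign of the second slide gives the dye-swap-like differences of model (2); the function names are hypothetical.

```python
def lbi_self_self(m1, m2):
    """Label bias index for two self-self slides (p = 1).

    m1, m2: normalized log2(R/G) values of the same probes on slide 1 and
            slide 2 (assumed input layout). Treating the pair as a dye-swap
            means the differences are m1 and -m2, so the dye bias is carried
            by m1 + m2 and the residual by m1 - m2.
    """
    if len(m1) != len(m2) or len(m1) < 2:
        raise ValueError("both slides must contain the same probes")
    reg_ss = sum((a + b) ** 2 for a, b in zip(m1, m2))
    rss = sum((a - b) ** 2 for a, b in zip(m1, m2))
    return reg_ss / rss


def fisher_expectation_p1(n_genes):
    """Expectation (G - 1)/(G - 3) of the reference distribution for p = 1."""
    return (n_genes - 1) / (n_genes - 3)


if __name__ == "__main__":
    slide1 = [1.0, -1.0, 0.5]
    slide2 = [0.5, -0.5, 0.5]
    print(lbi_self_self(slide1, slide2))
```

When probes are spotted in duplicate, each spot can be passed as a separate entry rather than averaging the duplicates first.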
| RESULTS |
|---|
Data
We calculate the LBI from several self-self hybridization arrays of human and Arabidopsis thaliana cells.
Experiments from human cells
Resting CD4+ T cells isolated from peripheral blood mononuclear cells of healthy donors were stimulated by the SDF-1α chemokine (SDF), infected by the NL4-3 wild-type strain of HIV-1 (WT) or left untreated (control). For each treatment, an aliquot was removed from the cell culture at six different time-points over a 24 h period (30 min, 2, 4, 8, 12 and 24 h) and RNA was extracted using the RNeasy mini kit (Qiagen) according to the manufacturer's recommendations. Samples of mRNA were submitted to the T7 amplification procedure described by Phillips and Eberwine (1996), in a very similar way as previously reported (Wang et al., 2000). An aliquot of 4 µg of amplified RNA from a given condition (SDF, wild type or control) at a chosen time (Table 1) was used for reverse transcription and aminoallyl coupling (for details see http://cmgm.stanford.edu/pbrown/protocols/amino-allyl.htm and http://www.microarrays.org/pdfs/amino-allyl-protocol.pdf). The two halves of each aminoallyl-cDNA were coupled to NHS-Cy3 and NHS-Cy5, then purified together and hybridized onto the same array to produce a self-self hybridization.
For the first six experiments of Table 1, duplicate experiments using cells from two independent donors (RNA from same time and condition) were performed on the same day. For the next two experiments, the procedures remained the same except that the amount of starting material was doubled in order to hybridize a couple of arrays (same sample duplication).
All samples were hybridized on the same type of array consisting of 11 520 clones except for the seventh dye-swap, which was hybridized on another array of 11 616 clones spotted in duplicate. These experiments are part of a larger study that will be published elsewhere.
The arrays were scanned on a GenePix 4000A scanner (Axon Instruments, Foster City, USA) and images were analyzed by the GenePix Pro 4.0 software (Axon Instruments, Foster City, USA). For each array, the raw data comprised the logarithm base 2 of median feature pixel intensity at wavelength 635 nm (red) and 532 nm (green). No background was subtracted. The array-by-array normalization was performed to remove systematic biases. First, we excluded spots that were considered badly formed features. Then we performed a global intensity-dependent normalization using the lowess procedure (Yang et al., 2002). Finally, for each block, the log-ratio median calculated over the values for the entire block was subtracted from each individual log-ratio value.
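The last normalization step, subtracting the block median from each log-ratio, can be sketched as below. This is a minimal illustration that assumes the lowess step has already been applied and that each spot comes with a block identifier; the function name `center_by_block` and the input layout are hypothetical, and the lowess fit itself is not reproduced here.

```python
from collections import defaultdict
from statistics import median


def center_by_block(log_ratios, blocks):
    """Subtract from each log-ratio the median log-ratio of its block.

    log_ratios: log2(R/G) values of one array after the intensity-dependent
                normalization (assumed to be done beforehand).
    blocks:     block identifier of each spot, in the same order.
    """
    if len(log_ratios) != len(blocks):
        raise ValueError("one block identifier is needed per spot")
    values_by_block = defaultdict(list)
    for value, block in zip(log_ratios, blocks):
        values_by_block[block].append(value)
    block_median = {b: median(v) for b, v in values_by_block.items()}
    return [value - block_median[block] for value, block in zip(log_ratios, blocks)]


if __name__ == "__main__":
    print(center_by_block([1.0, 2.0, 3.0, 10.0, 12.0], [1, 1, 1, 2, 2]))
```

Spots excluded as badly formed features would simply be left out of both input lists before the call.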
Experiments from A.thaliana cells
Four sets of 100 A.thaliana Col-0 plants were grown on horticultural potting soil (Tref substrate with NFU 44-571 fertilizer, BAAN SA, Vulaines, France) under cool white light at 100 µmol m⁻² s⁻¹ with a 16 h photoperiod at 22°C and 50% humidity. Pooled samples of the flowers or the buds were harvested. The RNA extraction and target labeling were performed as described in Lurin et al. (2004).
All samples were hybridized on CATMA array containing 24 576 Gene Specific Tags from A.thaliana (Crowe et al., 2003).
The arrays were scanned on a GenePix 4000A scanner (Axon Instruments, Foster City, USA) and images were analyzed by GenePix Pro 3.0 (Axon Instruments, Foster City, USA). For each array, the raw data and array-by-array normalization were respectively defined and performed as for the slides of the human cell experiments.
LBI
Table 1 summarizes the LBI computed for the 11 experiments. The LBI is the ratio between the regression sum of squares (RegSS) and the residual sum of squares (RSS). The RegSS, RSS and LBI values are respectively presented in the first, second and third columns of Table 1. We note that the RegSS is always greater than the RSS, so the LBI is always >1. The LBI shows that the RegSS is more than three times as high as the RSS in arrays 1 and 2 and less than twice as high as the RSS in the CATMA array. So the dye bias is more important in the human experiments than in the A.thaliana experiments. We recall that the ideal LBI (no gene-specific dye bias) is close to 1. In the experiments from A.thaliana cells, we have at our disposal four CATMA slides on which the same sample of buds has been hybridized against itself. We use these four slides to evaluate the robustness of the LBI by calculating it on the six possible pairs of slides. The associated LBI varies between 1.12 and 1.26, which indicates that it is robust. We point out that the robustness has not been evaluated for arrays with a relatively high LBI because the necessary data were not available.
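The robustness check on the four CATMA slides amounts to computing the index on every pair of slides. A minimal sketch is given below, assuming each slide is a list of normalized log2(R/G) values over the same probes and that the index for a pair is the ratio of the sum of squared sums to the sum of squared differences; the function names and the toy values are hypothetical.

```python
from itertools import combinations


def lbi_pair(m1, m2):
    """LBI of two self-self slides given their normalized log2(R/G) values."""
    reg_ss = sum((a + b) ** 2 for a, b in zip(m1, m2))
    rss = sum((a - b) ** 2 for a, b in zip(m1, m2))
    return reg_ss / rss


def pairwise_lbi(slides):
    """Return {(i, j): LBI} for every unordered pair of slides."""
    return {
        (i, j): lbi_pair(slides[i], slides[j])
        for i, j in combinations(range(len(slides)), 2)
    }


if __name__ == "__main__":
    # Toy values only; they are not data from the experiments.
    slides = [[1.0, -1.0], [0.5, -0.5], [1.0, 0.0], [0.0, -1.0]]
    for pair, value in pairwise_lbi(slides).items():
        print(pair, round(value, 2))
```

Four slides give six pairs, and a narrow spread of the six values is what the text interprets as robustness of the index.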
To further illustrate the impact of the gene-specific dye bias, we plot the log-ratios log2(R/G) for the two slides of the same dye-swap, for all the experiments (Fig. 1). As we have two replicates of self-self hybridization slides, no structure is expected. However one can see that there is a positive correlation between the two replicates. The only possible cause for such a correlation is the dye bias. Some genes have a higher intensity when labeled with one dye than with the other. Therefore the log-ratio log2(R/G) is repeatedly higher (or lower) than it should be. This dye effect is higher in the human experiments (correlation between 0.61 and 0.73) than in the A.thaliana experiments (correlation between 0.08 and 0.33). This confirms that the dye bias plays an important role in the experimental variability in the human experiments. In contrast, the dye bias seems to be better controlled in the A.thaliana experiments.
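The link between this correlation and the LBI can be made explicit for a pair of self-self slides. The calculation below is a sketch under assumptions that are not stated in the text: $M_{ig}$ denotes the normalized log-ratio $\log_2(R/G)$ of gene $g$ on slide $i$, the log-ratios of each slide are centered ($\sum_g M_{ig} = 0$), and the index for $p = 1$ is taken as the ratio of the sum of squared sums to the sum of squared differences of the two slides.

```latex
% Empirical spread of each slide and correlation between the two slides.
s_i^2 = \frac{1}{G}\sum_{g=1}^{G} M_{ig}^2 \quad (i = 1, 2),
\qquad
r = \frac{\sum_{g=1}^{G} M_{1g} M_{2g}}{G\, s_1 s_2}.

% Expanding the two sums of squares.
\sum_{g=1}^{G}\left(M_{1g} \pm M_{2g}\right)^2
  = G\left(s_1^2 + s_2^2 \pm 2\, r\, s_1 s_2\right).

% Resulting index, and its simplification when both slides have the same spread.
\mathrm{LBI} = \frac{s_1^2 + s_2^2 + 2\, r\, s_1 s_2}{s_1^2 + s_2^2 - 2\, r\, s_1 s_2}
\;\xrightarrow{\; s_1 = s_2 \;}\; \frac{1 + r}{1 - r}.
```

Under these assumptions an uncorrelated pair ($r = 0$) gives an index of 1, and a correlation of 0.5 already gives an index of 3, which is why a positive correlation between replicates and an index above 1 are two views of the same artifact.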
We also calculate the correlations between the estimated dye biases $\hat\beta_g$ of the human/array 1 experiments. These correlations lie between 0.45 and 0.81 (Table 2). As the array type is the same but experimental conditions vary, these correlations suggest that the dye bias may be attributed to the gene. Note that the possible gene effect is confounded with its position on the slide. Therefore it is impossible to separate the two possible causes of the labeling bias, which are the nucleic composition of the probe and the spotting effect (Mary-Huard et al., 2004).
Identification of genes having a specific dye bias
After a global analysis of the gene-specific dye bias we identify the genes which are concerned. However, to begin with, we assess the quality of the self-self hybridization slides by testing that each $\delta_g$ is null. As for the test of $\{\beta_g = 0\}$ for each gene, we use the mixture model of Delmar et al. (2004). The control of the false positives is done with the Bonferroni method at a level of 5%. No gene is found to be regulated (column (a) in Table 1). Then, in order to identify genes with a significant dye bias, we test the labeling artifact, again using the mixture model of Delmar et al. (see Methods section). Column (b) of Table 1 shows that between 0 and 189 genes have a significant gene-specific dye bias. This artifact is important in the human experiments and does not appear in the A.thaliana experiments. These results are in agreement with the LBI calculated in the previous section. Furthermore, all the genes having a significant dye bias are classified in the highest variance group from the differential analysis. This suggests that many genes from the highest variance group could not be detected as differentially expressed only because their pure experimental variability is increased by a specific dye bias effect. This confirms that the presence of gene-specific dye bias can increase the false negative rate and so decrease the power of detection.
Table 1 contains the mean, minimal and maximal values of the $\hat\beta_g$ for the detected genes. One can see that the gene-specific dye bias may multiply or divide the ratio by a factor >2, which is sizeable. An analysis of the intensity level of the genes with a high specific dye bias (data not shown) shows that the intensity of these genes covers a large range, between 5.5 and 15.7, with a median value between 9.5 and 10.2. Figure 2 plots the specific dye bias according to the intensity level for the first human/array 1 experiment. We can see that the magnitude of the artifact is near 0 when the intensity level is not very far from the background level. This confirms that a gene needs to be transcribed in order to reveal its specific dye bias. For higher values of the intensity level, no dependence is observed between specific dye bias and intensity level. As shown before, all expressed genes can be affected by a specific dye bias whatever their intensity level.
| DISCUSSION |
|---|
Consequences of the gene-specific dye bias on direct comparison experiments
In direct comparison, two RNA samples are simultaneously hybridized on the same slide. Each sample is labeled with a dye, and it is well known that the two dyes do not have the same incorporation efficiency. Moreover it appears that some genes are systematically badly labeled by Cy5 or Cy3 (the gene-specific dye bias). For all these reasons a dye-swap design is strongly recommended, although it is costly. Moreover, in the Methods section we have shown that the gene*label interaction increases the experimental variability even in dye-swap experiments and thus decreases the power of the tests for detecting the differentially expressed genes.
In this paper we have proposed the LBI, which is a global index to evaluate the magnitude of the gene-specific dye bias. The LBI is easily and quickly computed, and requires at least two self-self hybridization slides. After the LBI calculation we advise carrying out a gene-by-gene analysis. Even if we cannot completely describe the biochemical mechanisms of this bias, it seems that it is an artifact which involves the probes and the labeled targets, since the gene-specific dye bias can be seen only when the gene corresponding to the probe is transcribed. Consequently we advocate using a sample which hybridizes against as many probes as possible. Moreover, if the LBI is calculated on an array where the probes are duplicated, we think that it is better to work from the individual probes and not from the mean of the duplicated probes, since the gene-specific dye bias is probe-dependent. All these remarks lead us to think that the method proposed by Rosenzweig et al. (2004) is questionable. A condition where all genes would be transcribed simultaneously would be necessary to obtain an effective correction.
In order to investigate the gene-specific dye bias in more detail, it could be interesting for the platforms to include the LBI in their quality-control procedures, because the identification of genes which have specific dye bias is important supplementary information for the differential analysis. Moreover it could help to explain the nature of the phenomenon. According to the result of the A.thaliana experiment, this artifact is not an inevitability and can be well controlled. The elimination of the gene-specific dye bias could dramatically decrease the experiment cost by removing the necessity of systematic dye-swap design.
Note that the genes can be clustered either in a group without specific dye bias ($\beta_g = 0$) or in a group with specific dye bias ($\beta_g \neq 0$). The former group has a lower experimental variability than the latter in dye-swap experiments. This explains why the mixture model on gene variances is well suited to microarray experiments (Delmar et al., 2004).
Consequences of the gene-specific dye bias on indirect comparison experiments
In indirect comparison an RNA sample is hybridized against a control sample. The associated design is called the reference design. As we mentioned in the introduction, it is widely assumed that a reference design does not require dye-swaps. The paper of Dombkowski et al. (2004) demonstrated from a microarray data analysis that this assumption is not reliable. By writing the statistical model, we confirm their findings. We keep the notation used throughout the paper. To take into account that the gene-specific dye bias appears only when there is transcription, we include in model (1) the interaction between the RNA sample, the dye and the gene, say (VDG). Let us assume that the dye j = 1 is associated with the control sample k = 0; then the observed difference of expression between the i-th RNA sample and the control sample is equal to

$$Z_{ig} = (VG)_{ig} - (VG)_{0g} + (DG)'_{2g} - (DG)'_{1g} + (VDG)_{i2g} - (VDG)_{01g} + F_{ig},$$

and the difference between two RNA samples i and i', each compared with the control, is

$$Z_{ig} - Z_{i'g} = (VG)_{ig} - (VG)_{i'g} + (VDG)_{i2g} - (VDG)_{i'2g} + F_{ig} - F_{i'g},$$

where the $F_{ig}$ are random variates with mean 0. The gene*label interaction terms vanish but the interactions between the RNA sample, the dye and the gene remain. This is the reason why a dye-swap design is recommended even in indirect comparison.
Received on July 7, 2004; revised on January 25, 2005; accepted on January 27, 2005
| REFERENCES |
|---|
Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B, 57, 289–300.
Brem, R., et al. (2002) Genetic dissection of transcriptional regulation in budding yeast. Science, 296, 752–755.
Churchill, G. (2002) Fundamentals of experimental designs for cDNA microarray. Nat. Genet., 32, 490–495.
Comander, J., et al. (2004) Improving the statistical detection of regulated genes from microarray data using intensity-based variance estimation. BMC Genomics, 5, 17.
Crowe, M., et al. (2003) CATMA: a complete Arabidopsis GST database. Nucleic Acids Res., 31, 156–158.
Delmar, P., et al. (2004) Varmixt: efficient variance modelling for the differential analysis of replicated gene expression data. Bioinformatics, 21, 502–508.
Dombkowski, A., et al. (2004) Gene-specific dye bias in microarray reference designs. FEBS Lett., 560, 120–124.
Kerr, M.K., et al. (2002) Statistical analysis of a gene expression microarray experiment with replication. Statist. Sin., 12, 203–217.
Lurin, C., et al. (2004) Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis. Plant Cell, 16, 2089–2103.
Mary-Huard, T., et al. (2004) Spotting effect in microarray experiments. BMC Bioinformatics, 5, 63.
Phillips, J. and Eberwine, J.H. (1996) Antisense RNA amplification: a linear amplification method for analysing the mRNA population from single living cells. Methods, 10, 283–288.
Pritchard, C., et al. (2001) Project normal: defining normal variance in mouse gene expression. Proc. Natl Acad. Sci. USA, 98, 13266–13271.
Rosenzweig, B., et al. (2004) Dye-bias correction in dual-labeled cDNA microarray gene expression measurements. Environ. Health Perspect., 112, 480–487.
Sterrenburg, E., et al. (2002) A common reference for cDNA microarray hybridizations. Nucleic Acids Res., 30, e116.
Tseng, G., et al. (2001) Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucleic Acids Res., 29, 2549–2557.
Wang, E., et al. (2000) High-fidelity amplification for gene profiling. Nat. Biotechnol., 18, 457–459.
Yang, Y., et al. (2002) Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res., 30, e15.
Yue, H., et al. (2001) An evaluation of the performance of cDNA microarrays for detecting changes in global mRNA expression. Nucleic Acids Res., 29, E41–1.