Bioinformatics Advance Access originally published online on March 7, 2006
Bioinformatics 2006 22(10):1251-1258; doi:10.1093/bioinformatics/btl068
A calibration method for estimating absolute expression levels from microarray data
1 BIOI@SCD, Department of Electrical Engineering K.U.Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
2 ISLab, Department of Mathematics and Computer Science, University of Antwerp Middelheimlaan 1, B-2020 Antwerpen, Belgium
3 CMPG, Department of Microbial and Molecular Systems K.U.Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Motivation: We describe an approach to normalize spotted microarray data, based on a physically motivated calibration model. This model consists of two major components, describing the hybridization of target transcripts to their corresponding probes on the one hand, and the measurement of fluorescence from the hybridized, labeled target on the other hand. The model parameters and error distributions are estimated from external control spikes.
Results: Using a publicly available dataset, we show that our procedure is capable of adequately removing the typical non-linearities of the data, without making any assumptions on the distribution of differences in gene expression from one biological sample to the next. Since our model links target concentration to measured intensity, we show how absolute expression values of target transcripts in the hybridization solution can be estimated up to a certain degree.
Contact: kathleen.marchal{at}biw.kuleuven.be
Supplementary information: Supplementary data are available at Bioinformatics online.
| INTRODUCTION |
|---|
Normalization of microarray measurements, the first step in a microarray analysis trajectory, aims at removing consistent and systematic sources of variations to allow mutual comparison of measurements acquired from different slides and experimental settings. Obviously, normalization largely influences the results of all subsequent analyses (such as clustering), and therefore is a crucial phase in the analysis of microarray data. For normalization of spotted microarrays, different methods have been described [for overviews, see for instance Leung and Cavalieri (2003); Quackenbush (2002) and Bilban et al. (2002)]. In general, preprocessing of spotted microarrays largely depends on the calculation of the log-ratios of the measured intensities. For complex designs, using ratios complicates the comparison of different experimental conditions, especially when they are not measured with the same reference condition. To cope with this, some approaches inherently work with absolute intensities [e.g. ANOVA (Wolfinger et al., 2001; Kerr et al., 2000)], or use a universal reference to estimate absolute expression levels from the ratios (Dudley et al., 2002). A common ratio-normalization step consists of the linearization of the Cy3 versus Cy5 intensities [e.g. LOESS (Yang et al., 2002)], sometimes followed by, or inherently combined with, techniques for variance stabilization (Durbin et al., 2002; Huber et al., 2002). These methods assume that the distribution of gene expression shows little overall change and is balanced between the biological samples tested (from here on referred to as the Global Normalization Assumption). If this assumption is violated, for instance when comparing two drastically different biological conditions or when working with dedicated arrays, using such a normalization may yield erratic results. 
Normalization algorithms that do not require this Global Normalization Assumption have been proposed (Wang et al., 2005; Zhao et al., 2005), but a more reliable strategy to avoid making any assumptions regarding the distribution of the gene expression is to use external control spikes (exogenous RNA species that are added to the hybridization solution in known concentrations, prior to labeling) to estimate normalization parameters. Other types of experimental normalization controls, such as housekeeping genes, spotted clone pools or spotted genomic DNA, have also been proposed [for an overview, see Kroll and Wölfl (2002)], but none of these are able to compensate for unbalanced gene expression changes. By using external control spikes, it has been shown that global mRNA changes, resulting in an uneven distribution of expression changes, occur more frequently than what was previously believed (van Bakel and Holstege, 2004; van de Peppel et al., 2003), and that these changes can have a significant impact on the interpretation of data normalized according to the Global Normalization Assumption (Radonjic et al., 2005).
External control spikes have previously been employed for quality control and normalization (Radonjic et al., 2005; van de Peppel et al., 2003; Badiee et al., 2003; Wang et al., 2003; Benes and Muckenthaler, 2003; Hughes et al., 2001; Girke et al., 2000; Eickhoff et al., 1999), but have seldom (Carter et al., 2005) been exploited to their full potential. In fact, spikes are genuine calibration points, in that they relate the measured intensity to the actual RNA concentration in the hybridization solution. In this paper, we propose a normalization procedure that can be used to estimate absolute expression levels, and is based on spike measurements and a calibration model. This procedure is capable of adequately removing the typical non-linearities of the data, without making any assumptions on the distribution of gene expression from one biological sample to the next. Moreover, estimates of absolute expression levels, instead of expression ratios, can greatly simplify inter-platform comparisons and the analysis of large, complex designs comparing multiple biological conditions.
| MODELS AND ALGORITHMS |
|---|
The proposed normalization procedure is straightforward in principle: intensity measurements of external control spikes serve to estimate the parameters of a calibration model. These parameters can then be used to obtain absolute expression levels for every gene in each of the tested biological conditions. The calibration model consists of two components, a hybridization reaction and a dye saturation function. In the following sections a more detailed description of this model is given, along with its corresponding parameters and error distributions.
Hybridization reaction
This component of the model takes spot related errors into account, which have been shown to have a large effect on the final, observed signal (Rocke and Durbin, 2001). How these errors manifest themselves in the measured intensities becomes clear when comparing the behavior of the data in Figure 1. A plot of the Cy3 versus Cy5 spike intensities (Fig. 1, panel A) illustrates the relatively small scanner errors: ratios of these controls seem highly conserved, especially at upper intensity levels. Figure 1, panel B, on the other hand, displays the relation between the measured intensities of these external control spikes and their actual concentration in the hybridization solution. A large variation in intensity for a single spike concentration can be observed. In view of the relatively small scanner errors, the level of variation seen in this plot is remarkable. Heterogeneous spot capacities, in terms of the available quantity of probe, offer an explanation: imperfections in the spotting process allow distinct spots to bind different amounts of target from the hybridization solution. Whether the main source of this variation in spot capacity can be attributed to the actual amount of deposited cDNA, or to a measure of spot quality [e.g. probe density (Peterson et al., 2001), cDNA probe length (Stillman and Tonkinson, 2001), etc.], the implications are equivalent.
To explain these large variations of absolute intensities observed for a single spike concentration, a hybridization component was included in our model to account for these spot errors. The relation between the amount of hybridized target (xs) and the concentration of the corresponding transcript in the hybridization solution (x0) is modeled by the steady state of the following reaction:
$$x_0 + s \;\overset{K_A}{\rightleftharpoons}\; x_s \tag{1}$$
A second assumption underlying our model is that the hybridization is a first order reaction, and that x0 is in excess (i.e. x0 is constant). The latter assumption ensures that the amount of hybridized target at the end of the reaction only depends on the initial concentration in the hybridization solution. The amount of probe of a spot (s) available for hybridization will decrease with an increasing amount of hybridized target xs (s = s0 − xs, s0 being the spot size or maximal amount of available probe), so that we can write at thermodynamic equilibrium:
$$K_A = \frac{x_s}{x_0\, s} = \frac{x_s}{x_0\,(s_0 - x_s)} \tag{2}$$
Spot capacities s0 are modeled as either s0 = µs + εs (i.e. additive spot error) or s0 = µs·exp(εs) (i.e. multiplicative spot error) with εs ∼ N(0, σs). Whichever distribution is more appropriate in any particular case will depend largely on the type of microarray slide and spotting procedure used, and should be evaluated after performing the normalization procedure, e.g. by testing the normality assumptions of the spot error distribution. The distribution parameters µs and σs can be considered equal for all measurements of a single array, or treated differently on a per pin group basis to compensate for spotting pin related variations. Finally, we assume that the presence of distinct labels (Cy3 and Cy5) does not influence the hybridization efficiency of the differentially labeled target transcripts, i.e.
![]() | (3) |
A more complete model would also take the amount of non-labeled target into account and include parameters for labeling efficiencies. However, since the external control spikes are added to the hybridization solution before the actual labeling reaction, effects attributed to labeling efficiency are accounted for in the dye saturation function, described below.
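As a rough numerical illustration of the hybridization component, the sketch below solves the equilibrium condition for the amount of hybridized target. It assumes mass action with x0 in excess, K_A = x_s / (x_0 (s_0 − x_s)), which rearranges to x_s = K_A x_0 s_0 / (1 + K_A x_0). The closed form, the function names and all parameter values are illustrative assumptions, and the sketch treats a single labeled target per spot rather than two dyes sharing one spot.

```python
import math
import random


def hybridized_target(x0, k_a, s0):
    """Equilibrium amount of hybridized target x_s on one spot.

    Assumption: K_A = x_s / (x0 * (s0 - x_s)) with x0 constant,
    hence x_s = K_A * x0 * s0 / (1 + K_A * x0).
    """
    return k_a * x0 * s0 / (1.0 + k_a * x0)


def spot_capacity(mu_s, sigma_s, rng, multiplicative=True):
    """Draw a spot capacity s0 with a normally distributed spot error."""
    eps_s = rng.gauss(0.0, sigma_s)
    if multiplicative:
        return mu_s * math.exp(eps_s)  # s0 = mu_s * exp(eps_s)
    return mu_s + eps_s                # s0 = mu_s + eps_s


if __name__ == "__main__":
    rng = random.Random(0)
    k_a, mu_s, sigma_s = 1e-3, 1000.0, 0.3  # hypothetical values
    for x0 in (1.0, 100.0, 10000.0):
        s0 = spot_capacity(mu_s, sigma_s, rng)
        print(x0, round(s0, 1), round(hybridized_target(x0, k_a, s0), 2))
```

Under these assumptions the hybridized amount grows almost linearly with x0 at low concentration and levels off at the spot capacity s0 at high concentration, so two spots with different capacities give different signals for the same target concentration.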
Dye saturation function
A second component of our model is the dye saturation function, which describes the relationship between the measured intensity y and the amount of labeled target xs, hybridized to a single spot on the microarray:
$$y = p_1\, x_s\, e^{\varepsilon_m} + p_2 + \varepsilon_a \tag{4}$$
with εa ∼ N(0, σa) and εm ∼ N(0, σm). This type of function has already been used in other normalization strategies (Durbin et al., 2002; Rocke and Durbin, 2001).
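The measurement step can be sketched in a few lines, assuming the two-component error form of Rocke and Durbin (2001): a slope p1, an intercept p2, a multiplicative error εm and an additive error εa. The function name and the numerical values are hypothetical and serve only to show how the two error terms act on the signal.

```python
import math
import random


def measured_intensity(x_s, p1, p2, sigma_m, sigma_a, rng):
    """Simulate one intensity y = p1 * x_s * exp(eps_m) + p2 + eps_a."""
    eps_m = rng.gauss(0.0, sigma_m)  # multiplicative intensity error
    eps_a = rng.gauss(0.0, sigma_a)  # additive intensity error
    return p1 * x_s * math.exp(eps_m) + p2 + eps_a


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical parameter values for a single array and dye.
    for x_s in (0.0, 1.0, 100.0, 10000.0):
        y = measured_intensity(x_s, p1=2.0, p2=50.0,
                               sigma_m=0.1, sigma_a=5.0, rng=rng)
        print(x_s, round(y, 1))
```

With this form the additive term dominates near the intercept p2, while the multiplicative term dominates at high signal, which is why the two variances have to be estimated from different intensity ranges.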
In all, there are three different error distributions that are assumed to influence intensity measurements: additive intensity error εa, multiplicative intensity error εm and spot capacity error εs. The parameters of the saturation function and the variances of the intensity error distributions are considered specific for all measurements of a single array and dye combination. The parameters of the hybridization reaction and variance of the spot error on the other hand apply to all measurements of a single array. As such, Cy3 and Cy5 intensities obtained from the same array element are modeled with different saturation parameters and intensity errors, but will share the same hybridization parameters and spot error. Based on Equations (2)–(4), the intensities yCy3 and yCy5, measured on a single spot s0 of the array, are related to the amount of corresponding target x0,Cy3 and x0,Cy5 in the hybridization solution as
![]() | (5) |
![]() | (6) |
Parameter estimation
The model parameters are estimated separately for each microarray, based on the measured intensities y of the external control spikes and their known concentration in the hybridization solution x0. In order to determine these model parameters, it is important to have initial, reliable values for σm and σa. Estimates for σa,Cy3 and σa,Cy5 can easily be obtained by computing the standard deviation of the intensities for the negative control spikes (not present in the hybridization solution). Finding a reliable measure for σm,Cy3 and σm,Cy5 is less straightforward. At high intensity levels the additive intensity error can be neglected, but the multiplicative errors are still confounded with the influence of spot errors, so estimating σm,Cy3 and σm,Cy5 independently for both channels from these higher intensity replicate measurements is not feasible. Obtaining an adequate approximation is nevertheless possible. In the higher intensity range where the calibration controls (ratio 1:1) exhibit a log-linear behavior in a yCy3 versus yCy5 plot (Supplementary Figure S1), the main contribution to the observed variation can be assigned to the multiplicative intensity error: in this range, differences in spot size affect both channels equally and therefore cancel out, and the additive intensity error can be neglected. If we then assume that σm,Cy3 and σm,Cy5 contribute equally to the observed variation (σm = σm,Cy3 = σm,Cy5), a value for σm can be obtained (Supplementary Figure S1). Performing an orthogonal regression of Cy5 versus Cy3 intensities on the selected data points will yield an error distribution of which the standard deviation is an estimate of σm√2.
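The two initial estimates can be illustrated with a simplified sketch. It takes the additive error as the standard deviation of negative-control intensities, as described above. For the multiplicative error it does not perform the orthogonal regression; instead it assumes a slope of one between the log intensities of the 1:1 calibration spikes, so that the log-ratio of the two channels has standard deviation σm√2 when both channels carry independent multiplicative errors of equal size. The function names and this shortcut are assumptions made for illustration.

```python
import math
import statistics


def estimate_sigma_a(negative_control_intensities):
    """Additive error estimate: sample standard deviation of the
    intensities measured on negative control spots (one dye)."""
    return statistics.stdev(negative_control_intensities)


def estimate_sigma_m(y_cy3, y_cy5):
    """Multiplicative error estimate from high-intensity 1:1 spikes.

    Simplification (not an orthogonal regression): with a shared spot
    capacity and negligible additive error, log(y_cy5) - log(y_cy3)
    has standard deviation sigma_m * sqrt(2).
    """
    log_ratios = [math.log(b) - math.log(a) for a, b in zip(y_cy3, y_cy5)]
    return statistics.stdev(log_ratios) / math.sqrt(2.0)
```

The shortcut only makes sense in the log-linear, high-intensity range; at low intensities the additive error would inflate the log-ratio spread.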
Obtaining a solution for the remaining parameters (dye saturation and hybridization parameters p1,Cy3, p1,Cy5, p2,Cy3, p2,Cy5 and KA respectively; µs is kept constant at an arbitrary value) is done in a least squares sense. The error sum of squares that is minimized is that of spot capacity errors, i.e.
$$SSE_s = \sum_i \varepsilon_s(i)^2 \tag{7}$$
The minimization of SSEs is done numerically. The individual spot errors εs(i), necessary to calculate the SSEs in every iteration (i.e. for any given set of parameter values), are of course unknown. For every spot on the microarray, they are estimated by comparing the expected intensity [a function of target concentration x0,Cy3 and x0,Cy5, and a set of parameter values as indicated by (5) and (6)] to the measured intensity values (yCy3 and yCy5) for both channels, and scoring the difference based on the estimators of additive and multiplicative intensity variances. More precisely, for each pair of measurements obtained from a single spot, the following objective function is minimized with respect to that spot's error εs(i), i.e.
![]() | (8) |
εs(i), where
![]() | (9) |
The values of σm and σa determine the spread of measurements around the Cy3 and Cy5 saturation curves. The gray dots in Figure 2 depict the relation between measured intensity and amount of hybridized target under the assumption of equal spot sizes [i.e. all εs(i) are zero]. Most of these are localized in regions of high intensity error and are therefore very unlikely. However, by allowing errors εs(i) on individual spot capacities, and thus altering the amount of hybridized target per spot for both dyes (xs,Cy3 and xs,Cy5), a good correspondence between intensities and saturation curves can be obtained for both channels, and across the entire measurement range (indicated by the black dots). It is notable how well the Cy3 and Cy5 intensities, and the relationships between them, can be explained by our model. For instance in the example given, at lower intensities, Cy3 intensities are persistently higher than Cy5 for equal amounts of hybridized target, while the opposite is true for higher levels, a trend that is nicely reflected by the fitted model. Notice also that, while the ratios between Cy3 and Cy5 intensities are highly conserved (at least at higher intensity levels), absolute intensities may vary to a large extent for transcripts with the same target concentration x0 owing to spot inhomogeneities.
Normalization: estimation of target expression levels
The obtained parameter values can be used to estimate a single x0(t,u) (i.e. the absolute expression level of a single gene t in a single biological condition u) based on all measurements that were obtained for this combination of gene and condition. Although each array and dye combination has its own set of parameters, the normalization can be considered a global one: for each combination of a gene and a tested biological condition, a single expression level is estimated, irrespective of the number of microarray slides, or the number of replicate spots on a slide, for which this gene-condition combination was measured. In this sense, the results format of this normalization is comparable with the Variety × Gene interaction factor effects in the models of Kerr et al. (2000), or similar factors in other ANOVA-models.
Although this procedure can be applied to any design, its complexity does depend on the experimental setup used. For a single gene, it requires the estimation of expression values for all the biological conditions at once. These x0(t,u) can be estimated by minimizing the following objective function (an extension of the one used to estimate the model parameters):
![]() | (10) |
![]() | (11) |
The subscript C indicates the set of biological conditions under survey; it applies to all conditions that are present in the experimental design. The set of intensities, and the relevant array-dye combinations of parameters, that measure an expression value x0(t,u), is represented by Su [a single measured intensity belonging to this set is designated by Su(k)]. So for a single gene t, expression values for all of the biological conditions present in the experiment are estimated simultaneously (and together with all the relevant spot errors), and in such a way that the total contribution of the three random errors (i.e. the combined spot errors and additive and multiplicative intensity errors for all intensity data points that are a measure of gene t) is minimized as dictated by the cost function in (10).
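To give a feel for how a fitted calibration curve maps an intensity back to a concentration, the sketch below inverts the model for a single measurement while ignoring all three random errors. This is a deliberate simplification and not the joint minimization described above: it assumes y = p1·xs + p2 and the mass-action steady state xs = K_A·x0·s0 / (1 + K_A·x0), and the function name and parameter values are hypothetical.

```python
def invert_calibration(y, p1, p2, k_a, s0):
    """Noise-free inversion of the calibration curve for one spot.

    Returns the target concentration x0 implied by intensity y, or
    None when y lies outside the invertible range of the curve.
    """
    x_s = (y - p2) / p1           # undo the dye saturation function
    if x_s <= 0.0 or x_s >= s0:   # below the intercept or above capacity
        return None
    return x_s / (k_a * (s0 - x_s))  # undo the hybridization steady state


if __name__ == "__main__":
    # Hypothetical parameters for one array and dye combination.
    print(invert_calibration(220.0, p1=3.0, p2=20.0, k_a=0.01, s0=200.0))
```

Near the intercept and near the spot capacity the curve is flat, so small intensity errors translate into large concentration errors; pooling all measurements of a gene and weighting them by their error variances is intended to reduce that sensitivity.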
| RESULTS |
|---|
A publicly available dataset (Hilson et al., 2004), specifically designed for quality control and the assessment of experimental variation (Allemeersch et al., 2005; Hilson et al., 2004), was chosen to illustrate the workings of our normalization method. This experiment was ideally suited to validate our procedure for three reasons. First, it contained the necessary spots for measuring external control spikes, which are required for estimating the parameters of our model. A series of external controls (Lucidea Universal Scorecard; Amersham Biosciences), consisting of 10 calibration spikes (added to the hybridization solution in a ratio 1:1 and spanning up to 4.5 orders of magnitude), eight ratio spikes provided at both low and high concentration and two negative controls, was spotted once per pin group, resulting in a total of 24 repeats of each spike probe per array. Second, the experimental design included only a single biological condition (self-self experiments; all hybridizations were conducted with the same RNA sample, extracted from aerial parts of germinating Arabidopsis thaliana seedlings), which allows assessing the performance of our normalization method in removing non-linear tendencies present in microarray data. Finally, the arrays were outfitted with an additional set of control spikes that could be used to verify to what extent our method was capable of approximating the absolute target concentrations.
The results presented in this paper were obtained from non-background corrected measurements, since no marked improvements were observed after performing a background subtraction (data not shown). The distribution of spot capacities s0 was modeled as [...] with εs ∼ N(0, σs). The distribution parameters µs and σs were assumed to be equal for all measurements of a single array.
Removal of non-linear artifacts
Figure 3 illustrates the result of applying our method on a selection of two arrays from the 14-array experiment. As this is a self-self design, the same biological sample was measured four times on these two arrays (twice labeled with Cy3 and twice with Cy5). For the purpose of our test, we treated this self-self experiment as a dye swap design with two hypothetically different samples (designated C1 and C2). Estimated expression levels x0 of the approximately 19,000 genes are plotted in Figure 3 for C1 versus C2. Because in reality C1 and C2 represent the same biological condition, the fact that all estimates are centered along the bisector indicates that our model adequately accounts for the major sources of non-linear variation in the data. The increased variance of the estimates observed at lower target levels is inherent to microarray technology. This range of expression corresponds to the saturation observed in the lower intensity region, i.e. where the additive error has a significant influence, considerably blurring the relationship between measured intensity y and target expression level x0. Because of these saturation effects, estimates of lower concentration are prone to be less reliable.
As mentioned previously, our method is not bound by experimental design. To illustrate that these results are not only achievable with simple experimental setups, such as a color flip, we normalized a set of four arrays as if it concerned a loop design with four different biological conditions. A comparison of the estimated expression levels is shown in Figure 4.
Evaluation of target expression level estimates
Although we have shown that our method is capable of estimating absolute expression levels that respect true ratios between the different conditions compared, the previous experiment does not reveal anything about the accuracy of these absolute estimates, i.e. it does not show to what extent these absolute expression levels approximate the actual concentrations of target in the hybridization solution.
To verify the accuracy of estimated target concentrations, they should be compared with their actual concentrations in the hybridization solution. Doing this for the entire population of transcripts is impossible, as for most of the genes this concentration is unknown. However, the dataset contains an additional set of non-commercial spikes for which the absolute concentrations in the hybridization solution are known. The extracted RNA samples were complemented with 14 external controls at amounts of 10^4, 10^3, 10^2, 10, 1, 0.1 or zero copies per cell. In all 14 hybridizations, these controls were compared with a unique reference RNA, capable of binding to all of the 14 spike cDNA probes, always added at a concentration of 100 copies per cell. The experimental design for these control spikes is summarized in Table 1. Results obtained after performing our normalization are shown in Figure 5 [one spike was omitted from analysis because of quality issues (Allemeersch et al., 2005)]. Because the estimated target concentrations, expressed in pg/ml, were not directly comparable with the units of copy number per cell, a linear rescaling of these values by a factor that set our estimate of the unique reference RNA to 100 (copies per cell) was performed. Figure 5 shows that, except for the lowest concentrations, estimated values correspond fairly well to the true target concentrations as present in the hybridization solution. As explained above, estimates of the lowest concentrations again show a higher error variance.
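The linear rescaling described above amounts to a single multiplicative factor. A minimal sketch, with hypothetical names and made-up input values:

```python
def rescale_to_copies_per_cell(estimates, reference_estimate,
                               reference_copies=100.0):
    """Rescale concentration estimates so that the reference RNA
    estimate maps onto its known copy number per cell."""
    if reference_estimate <= 0.0:
        raise ValueError("reference estimate must be positive")
    factor = reference_copies / reference_estimate
    return {name: value * factor for name, value in estimates.items()}


if __name__ == "__main__":
    # Made-up estimates in pg/ml, for illustration only.
    estimates = {"spike_a": 2.0, "spike_b": 0.04}
    print(rescale_to_copies_per_cell(estimates, reference_estimate=4.0))
```

Because the factor is shared by all spikes, the rescaling changes the units but not the ratios between estimates.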
Comparison of target concentrations between genes
Although Figure 5 shows that concentrations can be accurately estimated, there are several gene-dependent factors that could influence the obtained results, possibly hampering the comparison of estimated concentrations between different genes. Gene specific hybridization efficiencies, for instance, are not taken into account by our model. Consistent spot errors are another factor for which it is theoretically impossible to compensate. Microarrays are usually spotted in batch: experimental errors that influence the DNA probe solutions used for spotting will affect an entire set of microarrays in a similar way. This type of consistent spot error will manifest itself on individual spots across multiple microarray slides, contrary to e.g. variations related to the spotting pins themselves, which would also affect multiple spots on a single array. The particular setup of the 13 external controls, used for assessing the accuracy of estimated target levels, can provide some insight. Because the universal reference RNA can hybridize to all the probes of these spikes, it couples the spot errors of all probes during the estimation of target concentrations. As a consequence of this coupling, consistent spot errors could partially be compensated for, as illustrated in Figure 6. For certain spikes (e.g. Dil2a), estimated spot capacities were persistently above or below the average spot capacity µs, a feature that was only detectable through the presence of the universal reference RNA. As a result, estimated target concentrations can be subject to gene specific rescaling, hampering the comparison of these concentrations between genes. They can nevertheless be interpreted as absolute values of expression when comparing different concentrations for a single gene.
Influence of background corrections
In our model the combination of the additive intensity error εa and the intercept of the dye saturation function p2 can be regarded as an elementary model for the entire slide's background. Having a single background for all spots is different from the spot specific background corrections performed during standard microarray analysis, which estimate a spot specific background from pixels corresponding to the area of the glass slide surrounding the spotted probe. This background model is by no means a restriction concerning the use of background corrected values; our normalization can be applied to both raw and background corrected intensities. Moreover, our method is perfectly capable of working with negative intensity values that may arise when measurements are below background. Whether or not using background corrected measurements is advisable depends largely on the data quality. This is illustrated in Supplementary Figure S2. Performing a spot specific background correction prior to applying our model would ideally result in the lower saturation limit of our model (p2) becoming zero. In reality, the estimate for p2 will indeed be lower, but never reaches a zero level. In general, we have observed a trade-off: background corrected measurements have a larger linear range, but at the expense of increased measurement errors for lower concentrations.
| DISCUSSION |
|---|
In this paper we present an approach for normalizing microarray data using external control spikes to fit a calibration model. This model incorporates parameters and error distributions representing both the hybridization of labeled target to complementary probes and the subsequent measurement of fluorescence intensities. External control spikes serve to estimate the model parameters. The obtained parameter values are then employed to estimate absolute levels of expression for the remaining genes. For each combination of a gene and a tested biological condition, a single absolute target level is estimated, taking into account the specificities of the design.
The model in itself is fairly basic, in that, with the exception of spot size errors, it is aimed at capturing the global characteristics of an experiment and their overall influence on intensity measurements, generalizing on hard to quantify local sources of variation. The combination of the additive intensity error εa and the intercept of the dye saturation function p2, for instance, can be regarded as a global model for the entire slide's background.
The array specific hybridization constant KA, another global factor, obviously does not account for transcript specific hybridization efficiencies. Therefore, care should be taken when interpreting the estimated expression levels as actual concentrations or when comparing estimated target levels between genes. On the other hand, probe sequences for spotted microarrays are often specifically selected to have properties that obviate large differences in transcript specific hybridization effects. Besides these gene specific hybridization effects, comparison of estimated target levels between genes is also complicated by consistent spot errors across multiple slides. These errors, resulting from experimental inaccuracies in the probe preparation, can arise when microarray slides are spotted in batch. Owing to the characteristics of microarray technology, they cannot be dealt with modelwise.
Although our model is a simplification of physical reality dealing with errors in a global, non-gene specific way, results show that our method is capable of adequately linearizing and normalizing microarray data. An important difference from most existing normalization methods is that our procedure does not rely on any assumptions on the distribution of gene expression levels from one biological sample to the next. Hence, our procedure is particularly well-suited to normalize experiments for which the Global Normalization Assumption may not be entirely valid, i.e. experiments for which there is no symmetry in the number of genes that are up-regulated versus down-regulated. Such is typically the case with experiments comparing drastically contrasting biological conditions or with dedicated microarrays, containing only a limited number of probes, representing genes involved in the studied biological process.
In contrast to other normalization methods that use spikes to circumvent the Global Normalization Assumption (van de Peppel et al., 2003), our procedure computes absolute expression levels, avoiding the use of ratios. Moreover, for the described experiment, the estimated absolute expression levels approximate the actual concentrations fairly well. Some caution is nevertheless advised when interpreting estimated concentrations as such. This is only problematic when comparing expression levels between different genes; the points discussed above have little or no consequence if a comparison is made between estimated target levels across biological conditions for a single gene. In conclusion, our method offers a novel approach to normalize spotted microarrays that combines the advantages of some ANOVA based approaches, which also estimate absolute expression levels, and methods that perform data linearization (e.g. LOESS). The procedure offers independence of assumptions concerning the distribution of gene expression and retains much of the inherent calibration information of external control spike measurements.
| Acknowledgments |
|---|
K.E. is a research assistant of the IWT; B.N. was a postdoctoral researcher of the FWO-Vlaanderen for a major part of this work. This work is partially supported by (1) IWT projects: GBOU-SQUAD-20160, GBOU-ANA; (2) Research Council KULeuven: GOA Mefisto-666, GOA-Ambiorics, IDO genetic networks, EF/05/007 SymBioSys; (3) FWO projects: G.0115.01, G.0241.04 and G.0413.03 and (4) IUAP V-22 (2002-2006), 4. FP5 CAGE.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Joaquin Dopazo
Received on November 10, 2005; revised on February 13, 2006; accepted on February 21, 2006
| REFERENCES |
|---|
Allemeersch, J., et al. (2005) Benchmarking the CATMA microarray. A novel tool for Arabidopsis transcriptome analysis. Plant Physiol., 137, 588–601.
Badiee, A., et al. (2003) Evaluation of five different cDNA labeling methods for microarrays using spike controls. BMC Biotechnol., 3, 23.
Benes, V. and Muckenthaler, M. (2003) Standardization of protocols in cDNA microarray analysis. Trends Biochem. Sci., 28, 244–249.
Bilban, M., et al. (2002) Normalizing DNA microarray data. Curr. Issues Mol. Biol., 4, 57–64.
Carter, M.G., et al. (2005) Transcript copy number estimation using a mouse whole-genome oligonucleotide microarray. Genome Biol., 6, R61.
Dudley, A.M., et al. (2002) Measuring absolute expression with microarrays with a calibrated reference sample and an extended signal intensity range. Proc. Natl Acad. Sci. USA, 99, 7554–7559.
Durbin, B.P., et al. (2002) A variance-stabilizing transformation for gene-expression microarray data. Bioinformatics, 18, Suppl. 1, S105–S110.
Eickhoff, B., et al. (1999) Normalization of array hybridization experiments in differential gene expression analysis. Nucleic Acids Res., 27, e33.
Girke, T., et al. (2000) Microarray analysis of developing Arabidopsis seeds. Plant Physiol., 124, 1570–1581.
Hilson, P., et al. (2004) Versatile gene-specific sequence tags for Arabidopsis functional genomics: transcript profiling and reverse genetics applications. Genome Res., 14, 2176–2189.
Huber, W., et al. (2002) Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics, 18, Suppl. 1, S96–S104.
Hughes, T.R., et al. (2001) Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat. Biotechnol., 19, 342–347.
Kerr, M.K., et al. (2000) Analysis of variance for gene expression microarray data. J. Comput. Biol., 7, 819–837.
Kroll, T.C. and Wölfl, S. (2002) Ranking: a closer look on globalisation methods for normalisation of gene expression arrays. Nucleic Acids Res., 30, e50.
Leung, Y.F. and Cavalieri, D. (2003) Fundamentals of cDNA microarray data analysis. Trends Genet., 19, 649–659.
Peterson, A.W., et al. (2001) The effect of surface probe density on DNA hybridization. Nucleic Acids Res., 29, 5163–5168.
Quackenbush, J. (2002) Microarray data normalization and transformation. Nat. Genet., 32, Suppl., 496–501.
Radonjic, M., et al. (2005) Genome-wide analyses reveal RNA polymerase II located upstream of genes poised for rapid response upon S.cerevisiae stationary phase exit. Mol. Cell, 18, 171–183.
Rocke, D.M. and Durbin, B. (2001) A model for measurement error for gene expression arrays. J. Comput. Biol., 8, 557–569.
Stillman, B.A. and Tonkinson, J.L. (2001) Expression microarray hybridization kinetics depend on length of the immobilized DNA but are independent of immobilization substrate. Anal. Biochem., 295, 149–157.
van Bakel, H. and Holstege, F.C. (2004) In control: systematic assessment of microarray performance. EMBO Rep., 5, 964–969.
van de Peppel, J., et al. (2003) Monitoring global messenger RNA changes in externally controlled microarray experiments. EMBO Rep., 4, 387–393.
Wang, D., et al. (2005) A robust two-way semi-linear model for normalization of cDNA microarray data. BMC Bioinformatics, 6, 14.
Wang, H.Y., et al. (2003) Assessing unmodified 70mer oligonucleotide probe performance on glass-slide microarrays. Genome Biol., 4, R5.
Wolfinger, R.D., et al. (2001) Assessing gene significance from cDNA microarray expression data via mixed models. J. Comput. Biol., 8, 625–637.
Yang, Y.H., et al. (2002) Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res., 30, e15.
Zhao, Y., et al. (2005) An adaptive method for cDNA microarray normalization. BMC Bioinformatics, 6, 28.