Skip Navigation

Bioinformatics 2008 24(16):i70-i75; doi:10.1093/bioinformatics/btn278
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Gao, P.
Right arrow Articles by Lawrence, N. D.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Gao, P.
Right arrow Articles by Lawrence, N. D.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Gaussian process modelling of latent chemical species: applications to inferring transcription factor activities

Pei Gao 1, Antti Honkela 2, Magnus Rattray 1 and Neil D. Lawrence 1,*

1School of Computer Science, University of Manchester, Kilburn Building, Oxford Road, Manchester, M13 9PL and 2Adaptive Informatics Research Centre, Helsinki University of Technology, PO Box 5400, FI-02015 TKK, Finland

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS AND RESULTS
 3 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 

Motivation: Inference of latent chemical species in biochemical interaction networks is a key problem in estimation of the structure and parameters of the genetic, metabolic and protein interaction networks that underpin all biological processes. We present a framework for Bayesian marginalization of these latent chemical species through Gaussian process priors.

Results: We demonstrate our general approach on three different biological examples of single input motifs, including both activation and repression of transcription. We focus in particular on the problem of inferring transcription factor activity when the concentration of active protein cannot easily be measured. We show how the uncertainty in the inferred transcription factor activity can be integrated out in order to derive a likelihood function that can be used for the estimation of regulatory model parameters. An advantage of our approach is that we avoid the use of a coarsegrained discretization of continuous time functions, which would lead to a large number of additional parameters to be estimated. We develop exact (for linear regulation) and approximate (for non-linear regulation) inference schemes, which are much more efficient than competing sampling-based schemes and therefore provide us with a practical toolkit for model-based inference.

Availability: The software and data for recreating all the experiments in this paper is available in MATLAB from http://www.cs.man.ac.uk/~neill/gpsim.

Contact: neill{at}cs.man.ac.uk


    1 INTRODUCTION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS AND RESULTS
 3 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
Ordinary differential equations (ODEs) are the most common framework in use for modelling biological sub-systems (Alon, 2006). Well established methodologies have been developed for estimating the parameters of these equations in the context of a particular experiment or set of experiments, using e.g. least squares and maximum likelihood combined with an appropriate optimization algorithm (Mendes and Kell, 1998). More recently, significant progress has been made on Bayesian parameter estimation in the context of ODEs (Coleman and Block, 2006). Through the use of advanced Monte Carlo techniques it is even possible to, given a specific data set, rank model structures through the use of Bayes factors (Vyshemirsky and Girolami, 2008). This shows the potential for ODE models to be closely integrated with biological investigations, informing the process of biological experimental design.

A challenging problem for parameter estimation in ODE models occurs where one or more chemical species influencing the dynamics are controlled outside of the sub-system being modelled. For example, a signalling pathway can be triggered by a signal external to the pathway itself. In a regulatory sub-system, one or more transcription factors (TFs) may influence the expression of a set of target genes, but these TFs may not be regulated at the transcriptional level, instead being activated by another sub-system such as a signalling pathway. Similarly, in a metabolic pathway external metabolites and enzymes will influence the dynamics of the pathway. If these external chemical species have a constant influence, e.g. as in the case of steady state behaviour of a metabolic pathway, then they can simply be treated as additional parameters of the model and their effect can be estimated along with the other model parameters. However, more often these external factors are time-varying quantities. In this case, they are functional parameters and cannot be estimated by the standard methods discussed above. One approach for dealing with this is to discretize in time, treating the time-varying function as a sequence of discrete parameters. However, this leaves the problem of choosing the correct granularity for the discretization and either ignoring temporal continuity, or assuming a simple Markovian relationship and thereby introducing further parameters and assumptions. Here, we propose an alternative approach. We deal with these parameters as continuous functions of time, avoiding the need for arbitrary discretization.

To further compound the problem of dealing with the time-varying effects of these chemical species, their concentration is often not directly observable and their dynamics must therefore be inferred indirectly according to their influence on measured elements of the system. This is a common problem and it is a natural consequence of the fact that some quantities are relatively easy to measure in a high throughput manner (e.g. mRNA concentrations with a microarray), whereas others are much more difficult to measure (e.g. the concentration of TFs located in the nucleus). In this article, we advocate the use of Gaussian processes (GPs) to define prior distributions over these latent chemical species. This allows us to marginalize their contributions in the interaction network of interest. We present a basic toolkit of algorithms based on GPs which allow us to consider different response models (Michaelis Menten kinetics, repression responses) and cascades of interactions in which chemical species of interest are missing. The application domain we consider is inference of TF activity in both developmental and signalling networks.

Inference of TF activity in a given network is a well studied problem with both genome wide approaches (Liao et al., 2003; Sanguinetti et al., 2006a,b) and algorithms designed for a subset of genes (Barenco et al., 2006; Khanin et al., 2006; Nachman et al., 2004; Rogers et al., 2006). Our approach is most directly inspired by Barenco et al. (2006) who infer TF activity in a single input module network motif through a differential equation with a linear response. We build on their work to consider simple cascade networks and non-linear response models such as Michaelis Menten kinetics and repression responses (Alon, 2006; Khanin et al., 2006).

GPs (Rasmussen and Williams, 2006) are probabilistic models of functions that encode particular assumptions about the function such as smoothness and timescale. They are commonly applied in the context of regression and interpolation. GP modelling provides not only the inference of continuous quantities without discretization but also the natural capability of handling uncertainty. Another attractive characteristic of the GP is that the result of any linear operator on the function leads to another GP and we will exploit this when applying GPs in the context of differential equations.

Our focus in this article will be the inference of TF activities given mRNA concentrations. We start by considering the model given in Barenco et al. (2006) and reviewing the work done by Lawrence et al. (2007) who provided the first treatment of this model with GPs. Then, in Section 2.3, we extend our model to the case of repression. We follow Khanin et al. (2006) and apply the model in the context of the SOS system in Escherichia coli where genes are controlled by the transcriptional repressor protein LexA. In both these cases, no model of TF translation is included as the proteins are post-translationally regulated. In our final example (Section 2.4) we show how, in the context of Drosophila mesoderm development, a model of translation can also be incorporated to improve the quality of the TF inference in cases where TFs are primarily regulated at the transcriptional level.


    2 METHODS AND RESULTS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS AND RESULTS
 3 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
The following linear model of gene activation was considered by Barenco et al. (2006),


Formula 1

(1)
where the changing level of a gene j's expression, xj(t), is given by a combination of a basal transctiption rate, Bj, a sensitivity, Sj, to its governing TF's activity, f(t), and the decay rate of the mRNA, Dj. The differential equation can be solved for xj(t) giving


Formula 2

(2)
where we have ignored any transient terms (we can assume these are zero if the TF has zero concentration at times before t=0).

2.1 GP inference
Our general approach is to assume that f(t) was drawn from a GP. A GP is a prior distribution over functions that leads to highly flexible non-linear functions in which characteristics such as the stationarity, the roughness and the timescale of the signal can be controlled. GPs are the functional analogue of the Gaussian distribution. They are fully specified by a mean function, µ(t), and a covariance function, k(t,t'). The mean function is an unconstrained function of time, whilst the covariance function is constrained to be a positive definite function. Various covariance functions can be used but we will predominantly make use of the squared exponential covariance (sometimes known as the Gaussian or RBF covariance),


Formula 3

(3)
An important characteristic of a GP is that any linear operation applied to the function drawn from a GP leads to a function that is drawn from a related GP. This is the functional analogue of linear operations applied to samples from a Gaussian distribution. We note that Equation (2) is a linear operation on f(t) since the integral is the functional analogue of a weighted sum. The properties of the GP tell us that if f(t) was drawn from a GP with covariance function kf,f(t,t') then xj(t) will also be drawn from a GP with a covariance function given by


Formula 4

(4)
Furthermore, the cross covariances between the xj(t) and f(t) will be given by


Formula 5

(5)
Finally, if our data consist of several genes, {xj(t)}Formula, cross covariances between the genes can be computed,


Formula 6

(6)
The basal transcription rate then appears in the mean function to define the mean of the GP for xj(t) as a constant Bj/Dj.

As was shown in Lawrence et al. (2007), if f(t) is drawn from a GP with a squared exponential covariance function, we can analytically solve the integrals in Equations (456) and thereby define a probabilistic model of the expression data for which the parameters of the differential equation, Bj, Sj, Dj and the scale l can all be determined, either through maximum likelihood or Bayesian sampling.

A likelihood function for the model parameters {theta}={Bj, Sj, Dj}Formula and GP scale l is obtained by integrating out the latent function f(t)


Formula 7

(7)
where the data xj=[xj(ti)] are collected at discrete times ti and modelled using Gaussian observation noise with either known or estimated variance (this is for a single replicate). Integrating out f(t) avoids having to carry out maximum likelihood for this infinite dimensional parameter which could be problematic as estimates based on limited data would be associated with large uncertainty. Approaches based on carrying out maximum likelihood over a discretized f(t) suffer from the same problem, i.e. the discretization introduces many new parameters which have to be estimated and therefore only a very coarse discretization will be tractable (Khanin et al., 2006). The associated error in these parameters confounds the estimation of other parameters by maximum liklihood and while Bayesian approaches avoid this problem there is a large associated computational bottleneck in using Markov chain Monte Carlo (MCMC) on these extra parameters (Barenco et al., 2006; Rogers et al., 2006).

Once the model parameters have been estimated then the model can be used to estimate the predictive GP distribution for f(t) and for each target gene xj(t). The mean and covariance of this predictive distribution for f(t) are given by,


Formula 8

(8)


Formula 9

(9)
where x and µ have the dimension of N x T with the elements of xj,i=xj(ti) (for j=1,...,N and i=1,...,T) and µj,i=Bj/Dj (for all i) respectively, (Kff)p,q=kff(tp,tq), (Kfx)p,rj=kf,xj(tp,tr) and (Kxx)ri,mj=kxi,xj(tr,tm)+{delta}ij{delta}mr{sigma}Formula where {sigma}Formula is the measurement noise associated with xi(tr). Here tp and tq denote any choice of points over which one wishes to evaluate the GP, whereas tr and tm are specifically the times at which the data are measured. This equation was used to plot the credible intervals for f in the figures presented in this article. It is similarly possible to compute the predictive distribution for the data by replacing f with xi in the above expression. The predictive distribution can be used for ranking genes according to their fit to the model [as in the application considered by Barenco et al. (2006)].

We illustrate our results from applying the method below, first on the previously studied linear activation model and then showing how we can extend the approach to non-linear response functions for activation and repression, followed by a two-layer activation model.

2.2 Activation
In this section, we consider a system in which a single TF activates a number of targets. The example we consider is the TF p53 which is a tumour repressor activated during DNA damage. According to Barenco et al. (2006), irradiation is performed to disrupt the equilibrium of the p53 network stimulating transcription of p53 target genes. Seven samples of the expression levels of the target genes in three replicas are collected as the raw time course data.

2.2.1 Linear activation
In this section, we recreate the results presented by Lawrence et al. (2007), for the linear model with several key differences. First, in their original paper Barenco et al. (2006) constrained f(0) to be zero, forcing the basal transcription rate to account for all transcription at time t=0. This constraint was not included in Lawrence et al. (2006), but is included here. This allows us to incorporate the prior information of the latent TF profiles as much as possible. Second, Lawrence et al. (2007) used an unnormalized version of the Affymetrix array data. We found that simple median based normalization removed the effect of a couple of repeats that were anomalously high. Inspection of the processed data used by Barenco et al. (2006) showed that they had also dealt with these anomalies.

In Figure 1, we show results of the TF inference as well as the model parameter estimations. The latent TF activity profiles are reconstructed with 95% credible intervals. The result resembles the activity profile of p53 measured by western blot (Barenco et al., 2006) and the kinetic parameters in the model are also closely matched with the results there (we followed Barenco et al. in fixing kinetic parameters for p21 to improve identifiability).


Figure 1
View larger version (15K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 1. Results for p53 using the linear model and a GP with RBF kernel. Results are shown for one of the replicates: (a) predicted protein concentration; (b) predicted expression level for DDB2; (c) predicted expression level for BIK; (d) predicted expression level for TNFRSP20b; (e) predicted expression level for p21; (f) predicted expression level for hPA26. Solid lines represent the mean inference, dashed lines show the 95% credible intervals, and the crosses are the observed gene expression data with error bars showing the technical error from each individual Affymetrix microarray processed using the puma package (Liu et al., 2005). Maximum likelihood model parameters, estimated using all replicates, are shown for each target gene. Data and reconstructed profiles are shown on an unlogged normalised scale. Time is measured in hours.

 
2.2.2 MAP-Laplace approximation
The differential equation with a linear response is an attractive model to use in the context of GPs as it allows the joint distribution over the gene expression and TF activity to be determined analytically, given the model parameters. However, as a model, it has some shortcomings. First, it treats both the gene expression and the TF activity as GPs. Since a GP cannot encode the information that a function is constrained positive, this means that the concentrations are a priori allowed to be negative. Whilst the posteriors, in the region where there is data, tend to stay positive (Fig. 1), when the predictions move away from the data they allow the TF activity to become negative. In Figure 2a, we show the results using the linear model for a different replicate to the one used in Figure 1. In this case it can be observed that, although the mean inferred profile remains positive or very close to zero, the inferred distribution of profiles summarized by the credible intervals includes profiles with negative concentrations. This contradicts our prior knowledge that concentrations are positive quantities.


Figure 2
View larger version (9K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 2. Inferring p53 activity for a different replicate than Figure 1. (a) using the linear model; (b) using the non-linear model with Michalis Menten kinetics and a positively constrained TF concentration.

 
A potential solution to this problem is to place a GP prior over, for example, the log of the TF activity. However, this, in effect, is a non-linear response in the differential equation. The non-linear response means that it is no longer possible to construct the joint distribution over gene expression and TF activity in a closed form. We must, necessarily, turn to approximations to make progress. In Lawrence et al. (2007) the use of a MAP-Laplace approximation is suggested. They demonstrate how the concentration of the TF activity can be constrained positive by placing the GP in log space.

Consider the following modification to the model,


Formula 10

(10)
where g(·) is a non-linear function. The differential equation can still be solved,


Formula 11

(11)
but there is now a non-linear operation on f(t) within the integral. The gene expression level is therefore no longer a GP. The MAP-Laplace approach involves finding a maximum a posteriori (MAP) estimate for the function f(t) and making a second-order Taylor approximation at that point to the log likelihood. This approximation is itself a GP and leads to an approximation to the marginal likelihood (Rasmussen and Williams, 2006). Derivatives can then be taken with respect to the model parameters and the approximation maximized. The MAP-Laplace's approximation becomes exact in the case where g(·) is linear. This is also useful in practice, as naive implementation of the linear response model for a general covariance function would require, in general, numerical evaluation of the double integrals in Equations (4) and (6). The use of the MAP-Laplace approach involves only a single numerical integral.

We now build on this work in two major ways. We go beyond the simple exponential response model considered in Lawrence et al. (2007) by exploring a Michaelis Menten kinetics inspired response and a response that has been suggested as appropriate for repression (Alon, 2006). We also extend the algorithm to learn the parameters of the differential equation by maximizing the approximation to the log likelihood.

2.2.3 Michaelis Menten kinetics
Michaelis Menten reaction kinetics were designed for the case where a chemical reaction is enabled through an enzyme. If the concentration of the enzyme is much lower than the concentration of the substrate the reaction rate becomes limited by the reduced enzyme availability. The justification in the context of modelling transcription is that there are a limited number of binding sites on the genome for the TF. This ‘bottle neck’ has the same effect as a reduced availability of enzyme and Michaelis Menten kinetics is therefore also appropriate as a model of transcription.

We implemented Michaelis Menten kinetics for the p53 data by taking the non-linearity to have the following form


Formula 12

(12)
where the Michaelis constant for the j-th gene is given by {gamma}j and we are using a GP f(t) to model the log of the TF activity.

Figure 2b shows the results of applying the above kinetic model with positively constrained TF concentration to the p53 data set. It can be seen that the inferred distribution of TF profiles no longer includes negative concentrations and the result is now closer in shape to the replicate shown in Figure 1.

2.3 Repression
The same framework can easily be adapted to the case of a repressor by using an analogous Michaelis Menten model of repression,


Formula 13

(13)
again using f(t) to represent the log TF concentration. Khanin et al. (2006) developed a method to infer the TF profile and model parameters for this model. This involved a coarse discretization of the TF profile, treating f(t) as a piecewise-constant function of time. They acknowledge that more flexible parametric models for f(t) will lead to an increase in the number of parameters, which are therefore impossible to estimate by maximum likelihood. Our approach avoids this explosion of parameters by treating f(t) as a functional parameter which is integrated out before carrying out maximum likelihood for the other model parameters [recall equation (7)].

We applied our method to the same microarray data as Khanin et al. (2006). They identify 14 target genes which are repressed by the TF LexA in E.Coli and mRNA measurements for these genes are collected over six time points. The amount of LexA is reduced after UV irradiation, decreasing for a few minutes and then recovering to its normal level.

In the case of repression, we have to include the transient terms in Equation (2),


Formula 14

(14)
where {{alpha}j} are an additional set of model parameters that have to be inferred.

The result for the inferred TF profile along with predictions for a selection of target genes are shown in Figure 3. Our inferred TF profile and reconstructed target gene profiles are similar to those obtained by Khanin et al. (2006), showing how different model parameters can lead to very different target gene expression given the same stimulus. However, whereas Khanin et al. (2006) had to use interpolation to go from a discretized model to a continuous profile, our GP representation avoids any need for discretization or interpolation. The model parameters shown in the figure are estimated by maximum likelihood.


Figure 3
View larger version (13K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 3. Results for the repressor LexA. (a) predicted LexA concentration; (b) predicted expression level for dinI; (c) predicted expression level for recN; (d) predicted expression level for lexA; (e) predicted expression level for ruvB; (f) predicted expression level for umuC. Parameters shown were obtained by maximum likelihood. SEs were not available for this two-dye microarray data set and therefore a gene-specific noise variance parameter was estimated for each target gene. Data and reconstructed profiles are given in the original unlogged scale. Time is measured in minutes.

 
2.4 Cascaded differential equations
As a final example, we consider a simple cascade of differential equations. Returning to the framework of the linear model, we consider the case where the TF protein is regulated at the level of transcription. In other words, we model the process of translation from mRNA to protein for the TF, but we are only able to measure the mRNA of the TF and of its targets. The simple model we consider would only be appropriate in the case where the TF does not require activation, e.g. by phosphorolation, after translation. The model is, therefore, not appropriate for many signalling pathways, but seems sensible in the context of development where TFs are often functional directly after translation. We therefore considered an example of the development of the mesoderm in Drosophila combining target gene predictions from Chromatin immunoprecipitation (ChIP–chip) data (Sandmann et al., 2006) with microarray time-course data from wild-type development (Tomancak et al., 2002).

We take the production rate of active TF to be given by,


Formula 15

(15)
where {delta} gives the decay rate of the active TF protein, {sigma} gives a rate of translation and y(t) is the concentration of the TF's mRNA. The solution for f(t), setting transient terms to zero, is


Formula 16

(16)
If we assume that y(t) was drawn from a GP with the squared exponential covariance function, then f(t) is also a GP with the kernel being a special case of the one given by Equations (4) and (5). This gives us a new covariance function governing f(t). It turns out that we can use this prior over f(t) in the linear activation model [Equation (1)] and derive the associated prior distribution for the target genes of the TF, {xj(t)}Formula. The derivation and form of the resulting covariance function are rather involved, but remarkably all the integrals required can be calculated analytically for the squared exponential covariance function given by Equation (3).

We applied the above driven input model to a simple cascade in Drosophila mesoderm development, focusing on the TF Mef2 (Myocyte enhancing factor 2). We selected six targets of Mef2 identified by ChIP–chip assays (Sandmann et al., 2006) and which were observed to be up-regulated after Mef2 is expressed. Affymetrix time course microarray data from wild-type embryos (Tomancak et al., 2002) provide us with observations of Mef2 expression (y(t)) and the expression of the target genes xj(t) at hourly intervals. The microarray time course was replicated three times, and we used all three replicates to fit the model parameters {delta},{Bj,Dj,Sj} and the kernel scale l by maximum likelihood (we set {sigma}=1 without loss of generality since the scale of the TF protein is arbitrarily determined by the Sj parameters).

Figure 4 shows the results for one of the replicates. Error bars associated with the mRNA data are obtained from the Affymetrix probe-level processing of each replicate (Liu et al., 2005). In Figure 4a, we show the inferred mRNA profile for Mef2 along with the microarray data. In Figure 4b, we show the inferred TF profile for Mef2. Figure 4c and d shows the model predictions for two of the targets, along with the associated expression data, showing again the difference in response possible given different model parameters.


Figure 4
View larger version (17K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 4. Results for Mef2 from one of the replicates: (a) driving input mef2 mRNA concentration; (b) inferred Mef2 protein concentration; (c–d) reconstructed mRNA profiles of two example genes from the six used to train the model. Data points and error bars are obtained from probe-level analysis of the Affymetrix data using the puma package (Liu et al., 2005).

 
We observe that using the TF mRNA data results in the model obtaining tighter credibility intervals for the inferred TF. The fit to the target gene expression looks reasonable, although the response in Figure 4c appears to be sharper than predicted by the model. It also appears that the peak of the inferred Mef2 mRNA profile precedes the peak in the data. The model here highlights the fact that some of our simplifying assumptions are being violated by the biology. The most critical assumption that we have made is that all of the targets used to train the model are unique targets of Mef2. In reality it is likely that other TFs are co-bound with Mef2. In future work, we expect to extend the model to account for these other TFs. We are currently using unpublished ChIP–chip data (obtained from E. Furlong, EMBL, Heidelberg) for other TFs involved in mesoderm development in order to refine our model and selection of targets.


    3 DISCUSSION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS AND RESULTS
 3 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
GPs allow for inference over chemical species concentrations in a continuous time manner. In this article, we have laid out the basic technologies required for this inference and highlighted some of the issues that arise: e.g. dealing with non-linear response models and layered systems. Barenco et al. (2006) have already shown that these models can be used for ranking targets of TFs. Such rankings can also be given in the context of our models, both for linear and non-linear response models.

Another important direction of future research will be scaling the models used to much larger systems with multiple interacting TFs. Larger systems will lead to an increase in computational requirements, particularly if it is necessary to account for correlations between multiple interacting latent chemical species.

The non-linear response models require particular approximations as exact inference in these models is not tractable. One issue with the Laplace approximation we have applied is that it is only responsive to the mode of the posterior distribution. Variational approximations (Jordan et al., 1998) provide an alternative approach which can lead to a rigorous lower bound on the likelihood. Finally, it also makes sense to validate the quality of the approximation through MCMC methods. We are working towards efficient MCMC algorithms for validating the quality of our approximate inference.

All the experiments we have shown have been in the context of transcription networks. However, the technologies are equally applicable for other biological systems, such as metabolic networks.


    ACKNOWLEDGEMENTS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS AND RESULTS
 3 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
For data provision and assistance with our questions we would like to thank Charles Girardot and Eileen Furlong of EMBL in Heidelberg (mesoderm development in D.Melanogaster), Martino Barenco and Mike Hubank at the Institute of Child Health in UCL (p53 pathway) and Raya Khanin and Ernst Wit of the University of Glasgow and the University of Lancaster (E.coli repressor system).

Funding: This work is funded by BBSRC Grant. No BBS/B/0076X ‘Improved processing of microarray data with probabilistic models’, EPSRC Grant. No EP/F005687/1 ‘Gaussian Process Models for Systems Identification with Applications in Systems Biology’ and the EU FP6 Network of Excellence PASCAL.

Conflict of Interest: none declared.


    REFERENCES
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS AND RESULTS
 3 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 

    Alon U. An Introduction to Systems Biology: Design Principles of Biological Circuits. (2006) London: Chapman and Hall/CRC.

    Barenco M, et al. Genome Biol (2006) 7:R25.[CrossRef][Medline]

    Coleman MC, Block DE. Am. Inst. Chem. Eng. J (2006) 52:651–667.

    Jordan MI, et al. Jordan Michael I, ed. (1998) Kluwer, Dordrecht, The Netherlands. 105–162. In Learning in Graphical Models, Vol. 89 of Series D: Behavioural and Social Sciences.

    Khanin R, et al. Proc. Natl Acad. Sci. USA (2006) 103:18592–18596.[Abstract/Free Full Text]

    Lawrence ND, et al. Modelling transcriptional regulation using gaussian processes. In: Advances in Neural Information Processing Systems.—Schölkopf B, Platt JC, Hofmann T, eds. (2007) 19. MA: MIT Press. Cambridge.

    Liao JC, et al. Proc. Natl Acad. Sci. USA (2003) 100:15522–15527.[Abstract/Free Full Text]

    Liu X, et al. Bioinformatics (2005) 21:3637–3644.[Abstract/Free Full Text]

    Mendes P, Kell D. Bioinformatics (1998) 14:869–883.[Abstract/Free Full Text]

    Nachman I, et al. Bioinformatics (2004) 20(Suppl. 1):248–256.[CrossRef]

    Rasmussen CE, Williams CKI. Gaussian Processes for Machine Learning. (2006) Cambridge, MA: MIT Press.

    Rogers S, et al. Probabilistic Modeling and Machine Learning in Structural and Systems Biology. (2006) Finland: Tuusula. June 17, 18, 2006.

    Sandmann T, et al. Dev. Cell (2006) 10:797–807.[CrossRef][Web of Science][Medline]

    Sanguinetti G, et al. Bioinformatics (2006a) 22:2275–2281.

    Sanguinetti G, et al. Bioinformatics (2006b) 22:1753–1759.[Abstract/Free Full Text]

    Tomancak P, et al. Genome Biol (2002) 3. RESEARCH0088.

    Vyshemirsky V, Girolami M. Bioinformatics (2008) 24:833–839.[Abstract/Free Full Text]


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
BioinformaticsHome page
T. Aijo and H. Lahdesmaki
Learning gene regulatory networks from gene expression measurements using non-parametric molecular kinetics
Bioinformatics, November 15, 2009; 25(22): 2937 - 2944.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
G. Sanguinetti, A. Ruttor, M. Opper, and C. Archambeau
Switching regulatory models of cellular stress response
Bioinformatics, May 15, 2009; 25(10): 1280 - 1286.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
P. D. W. Kirk and M. P. H. Stumpf
Gaussian process regression bootstrapping: exploring the effects of uncertainty in time course data
Bioinformatics, May 15, 2009; 25(10): 1300 - 1306.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Gao, P.
Right arrow Articles by Lawrence, N. D.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Gao, P.
Right arrow Articles by Lawrence, N. D.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?