Bioinformatics Advance Access originally published online on September 5, 2008
Bioinformatics 2008 24(21):2554-2556; doi:10.1093/bioinformatics/btn471
© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

DESHARKY: automatic design of metabolic pathways for optimal cell growth

Guillermo Rodrigo 1, Javier Carrera 1,2, Kristala Jones Prather 3 and Alfonso Jaramillo 4,5,*

1Instituto de Biologia Molecular y Celular de Plantas, CSIC, 2Instituto de Aplicaciones en Tecnologias de la Informacion y las Comunicaciones Avanzadas (ITACA), Universidad Politecnica de Valencia, Camino de Vera s/n, 46022 Valencia, Spain, 3Department of Chemical Engineering, Massachusetts Institute of Technology, Massachusetts Avenue 77, Cambridge MA 02139, USA, 4Laboratoire de Biochimie, Ecole Polytechnique - CNRS, Route de Saclay, 91128 Palaiseau Cedex and 5Epigenomics Project, Genopole, 523 Terrasses de l'Agora, 91034 Evry Cedex, France

*To whom correspondence should be addressed.


    ABSTRACT

Motivation: The biological synthesis or remediation of organic compounds using living organisms, particularly bacteria and yeast, has been promoted because it is cheaper than purely chemical (non-living) approaches. Computational frameworks for designing such processes can exploit the knowledge stored in large databases of compounds, enzymes and reactions. In addition, cell behavior can be studied by modeling the cellular context.

Results: We have implemented a Monte Carlo algorithm (DESHARKY) that, starting from a target compound, finds a metabolic pathway by exploring a database of enzymatic reactions. DESHARKY outputs a biochemical route connecting the target to the host metabolism, together with its impact on the cellular context, which is estimated using mathematical models of the cell resources and metabolism. Furthermore, for each enzyme in the route we provide the amino acid sequence from the organism phylogenetically closest to the host. We provide examples of designed metabolic pathways with their genetic load characterizations. Here, we have used Escherichia coli as the host organism. In addition, our bioinformatic tool can be applied to biodegradation or biosynthesis, and its performance scales with the database size.

Availability: Software, a tutorial and examples are freely available and open source at http://soft.synth-bio.org/desharky.html

Contact: alfonso.jaramillo@polytechnique.fr

Supplementary information: Supplementary data are available at Bioinformatics online.


    1 INTRODUCTION
Biotechnology process development is frequently equated with the production of biologics, such as proteins and viral vaccines (Nielsen, 2001). Yet the use of biological systems for the production of small molecules goes back thousands of years and has been increasing since the discipline of metabolic engineering was defined 15 years ago (Bailey, 1991). Initially, metabolic engineering efforts focused primarily on improving the productivity of naturally occurring metabolites within an organism, for example by overexpressing glycolytic enzymes in yeast (Schaaff et al., 1989). More recently, the field has expanded to include the introduction of new enzyme activities into a host cell in order to produce non-natural products (Martin et al., 2003; Ro et al., 2006) or to engineer the degradation of toxic compounds (Haro and de Lorenzo, 2001).

The use of automated techniques to design biological systems constitutes a breakthrough in biotechnology, and it has previously been applied to predict biodegradation pathways (Hou et al., 2003; Pazos et al., 2005). Interestingly, functional approaches (Hatzimanikatis et al., 2005; Hou et al., 2003; Li et al., 2004) could reveal novel pathways, but these are ultimately limited by the availability of naturally occurring enzymes. In that sense, recent work shows how to construct biochemical pathways using atomic information (Arita, 2003, 2004), and this approach could be used to enlarge our enzyme database by adding abstract reactions corresponding to functional enzymes. This would allow the design of metabolic pathways that incorporate enzymes not found in nature but which could be engineered by directed evolution or by computational design (Rothlisberger et al., 2008). In this work we propose to go further by extending the design to biosynthesis and by predicting the cell behavior when a pathway is implemented in a given host using plasmids (Jones et al., 2000).

One of the major challenges in synthetic biology is to engineer systems that are as orthogonal as possible (Sprinzak and Elowitz, 2005). Quantitative models provide useful insights here. We propose the use of two different models to quantify the readjustment of fluxes (Varma and Palsson, 1994) and the consumption of cellular resources (Bremer and Dennis, 1996) that result from the expression of heterologous pathways. We select the growth rate as the control parameter for evaluating the cellular behavior. For the transcriptional approach, we consider a dynamical model involving RNAs, RNA polymerases, proteins and ribosomes (Carrera J. et al., manuscript in preparation). Accordingly, we compute the reduction in the growth rate due to the sequestration of RNA polymerases and ribosomes. In addition, since the cell is metabolically altered, we use Flux Balance Analysis (FBA) to predict the new growth rate. These two strategies give different predictions of the cell behavior, but they constitute two scores to be considered when implementing a designed pathway. Future approaches will use more complex models that integrate the metabolic and transcriptomic systems and that take advantage of databases of Gibbs free energies for all enzymatic reactions (Mavrovouniotis, 1991). Importantly, as the desired route may not be unique, we provide a methodology to rank different pathways according to their genetic loads.


    2 METHODS
2.1 Algorithm
We have developed a Monte Carlo algorithm (DESHARKY) with the aim of designing metabolic pathways. The purpose is to find a possible route connecting a given compound of interest with a metabolite of the chosen host organism. These routes can be for biodegradation (the compound acts as a reactant) or biosynthesis (the compound acts as a product). For the source compound, we find the possible enzymatic reactions and select one of them with equal probability. We repeat this process for the new source compound. Moreover, with a given probability we consider a move that goes back by removing the previous reaction, which improves convergence and avoids long pathways. This probability is a function of the number of steps already introduced: the longer the pathway, the higher the probability of going back. Here we have used a sigmoid function. We do not consider metabolic steps involving many compounds that are not specific to the host organism (here, at most one non-specific reactant and one non-specific product).
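The random walk with length-dependent backtracking described above can be illustrated with a minimal Python sketch. It is not the DESHARKY implementation (which is written in C/C++): the toy reaction table, the compound and enzyme names, and the sigmoid midpoint and steepness are all hypothetical values chosen for illustration, and the filter on non-specific compounds is omitted.

```python
import math
import random

# Hypothetical toy reaction database: compound -> list of (enzyme, next compound).
REACTIONS = {
    "target": [("E1", "A"), ("E2", "B")],
    "A": [("E3", "C"), ("E4", "host_met")],
    "B": [("E5", "target")],
    "C": [("E6", "D")],
    "D": [],  # dead end
}
HOST = {"host_met"}  # hypothetical set of host metabolites


def backtrack_probability(n_steps, midpoint=5.0, steepness=1.0):
    """Sigmoid in the pathway length; midpoint and steepness are assumed values."""
    return 1.0 / (1.0 + math.exp(-steepness * (n_steps - midpoint)))


def find_pathway(source, reactions, host, rng, max_moves=10_000):
    """Random walk from `source` until a host metabolite is reached.

    Returns a list of (compound, enzyme, next compound) steps, or None if no
    route is found within `max_moves` moves.
    """
    path = []
    current = source
    for _ in range(max_moves):
        if current in host:
            return path
        options = reactions.get(current, [])
        # Go back with a length-dependent probability, or when stuck at a dead end.
        if path and (not options or rng.random() < backtrack_probability(len(path))):
            current = path.pop()[0]
            continue
        if not options:
            return None  # the source itself has no reactions
        enzyme, nxt = rng.choice(options)  # equal probability for each reaction
        path.append((current, enzyme, nxt))
        current = nxt
    return None


if __name__ == "__main__":
    route = find_pathway("target", REACTIONS, HOST, random.Random(0))
    for compound, enzyme, nxt in route or []:
        print(f"{compound} --{enzyme}--> {nxt}")
```

Because each run is stochastic, repeated runs with different seeds can return different routes, which is one way to sample alternative pathways in this toy setting.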

2.2 Transcription–translation model
The microbial production or degradation of chemical compounds usually requires the expression of foreign enzymes. This expression consumes cellular resources such as RNA polymerases and ribonucleotides for transcription, and ribosomes and amino acids for translation. Using previous knowledge on heterologous expression, we assume that RNA polymerases and ribosomes are the two critical pools. Using the experimental measurements of these resources in Escherichia coli (Bremer and Dennis, 1996), we have constructed a chassis model (Carrera et al., manuscript in preparation), fitting those data with exponential equations (see caption of Fig. 1). Furthermore, we have modeled the total heterologous expression of RNA (RNAh) by


[Formula 1]    (1)

and enzymes (ENZh) following

[Formula 2]    (2)
where φ is the average transcription rate, C the number of copies of external DNA, ψ the average translation rate, and δr and δe the degradation rates of the RNA and enzymes, respectively. Hence, a first-order approach is to compute the consumption of cellular resources by the heterologous system (RNAPh = φ C tr and RIBh = ψ RNAh tp, where tr is the transcription time and tp the translation time) and then to recompute the growth rate using the phenomenological chassis model (Fig. 1). We take the minimum value of µ over these two resources.
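As a worked illustration of this first-order estimate, the following Python sketch computes RNAPh and RIBh and then recomputes µ. It uses the exponential fits given in the caption of Fig. 1 and the parameter values listed in Section 2.4. Several points are assumptions rather than the authors' procedure: the growth rate is recomputed by subtracting the sequestered resources from the pool at µ0 and inverting the fit, tr and tp are taken as sequence length divided by elongation velocity, and the gene length, protein length and RNAh value are hypothetical inputs.

```python
import math

# Exponential fits from the caption of Fig. 1 (E. coli; mu in doublings/h).
RNAP_FIT = (910.0, 1.06)   # RNAP = 910 exp(1.06 mu)
RIB_FIT = (3690.0, 1.23)   # RIB  = 3690 exp(1.23 mu)


def pool(mu, fit):
    """Resource pool size predicted by the chassis fit at growth rate mu."""
    prefactor, exponent = fit
    return prefactor * math.exp(exponent * mu)


def growth_rate_after_load(mu0, used, fit):
    """Assumed scheme: remove the sequestered resources from the pool at mu0,
    then invert pool = prefactor * exp(exponent * mu) to obtain the new mu."""
    prefactor, exponent = fit
    free = pool(mu0, fit) - used
    if free <= 0:
        raise ValueError("heterologous load exceeds the available pool")
    return math.log(free / prefactor) / exponent


def genetic_load(mu0, phi, psi, copies, gene_nt, protein_aa, rna_h,
                 v_tx=45.0, v_tl=16.0):
    """Return (RNAPh, RIBh, new growth rate).

    gene_nt, protein_aa and rna_h are hypothetical inputs; v_tx (nt/s) and
    v_tl (aa/s) default to the velocities given in Section 2.4.
    """
    t_r = gene_nt / v_tx        # transcription time per transcript (s)
    t_p = protein_aa / v_tl     # translation time per protein (s)
    rnap_h = phi * copies * t_r  # RNAPh = phi C tr
    rib_h = psi * rna_h * t_p    # RIBh = psi RNAh tp
    mu_tx = growth_rate_after_load(mu0, rnap_h, RNAP_FIT)
    mu_tl = growth_rate_after_load(mu0, rib_h, RIB_FIT)
    return rnap_h, rib_h, min(mu_tx, mu_tl)


if __name__ == "__main__":
    rnap_h, rib_h, mu = genetic_load(mu0=2.0, phi=0.1, psi=0.4, copies=100,
                                     gene_nt=1000, protein_aa=333, rna_h=500)
    print(f"RNAPh = {rnap_h:.1f}, RIBh = {rib_h:.1f}, mu = {mu:.3f} doublings/h")
```

With zero heterologous expression the function returns µ0, which is a quick sanity check on the inversion.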


Fig. 1. Genetic load characterization of the glucaric acid biosynthesis pathway (see Table 1). (a) Transcription load, assuming a plasmid copy number of 100. Inset: number of RNA polymerases as a function of the cell growth rate, given by RNAP = 910 exp(1.06 µ) (diamonds are experimental measurements). (b) Translation load. Inset: number of ribosomes as a function of the cell growth rate, given by RIB = 3690 exp(1.23 µ) (diamonds are experimental measurements). (c) Metabolic load: shadow prices for all cofactors required in the pathway and for the source compound (D-Glucose-6-Phosphate).

 


 
Table 1. Examples of metabolic pathways designed with DESHARKY

 
2.3 Metabolic model
We have addressed the metabolic burden with FBA (Varma and Palsson, 1994). This linear program, in which we maximize the cell growth rate (µ), can be written as


$$\max_{v}\ \mu = c\,v \quad \text{subject to} \quad S\,v = b \qquad (3)$$
where v are the cell metabolic fluxes, c their contributions to the growth rate, S the stoichiometry matrix, and b the uptake fluxes. Then, we have constructed the corresponding dual problem (Schrijver, 1998), which has the same optimal value as the primal and is given by min µ = λ b, subject to λ S = c, where λ, usually called shadow prices, are the contributions to the growth rate when perturbing the uptake fluxes (Δµ = λ Δb). Therefore, we can precompute λ, since it is a property of the host organism. In that way, introducing a new metabolic route into the host can be treated perturbatively: Δb = S* j, where S* is the stoichiometry matrix of the pathway and j its flux.
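The perturbative estimate Δµ = λ Δb with Δb = S* j amounts to a weighted sum over the metabolites touched by the pathway. The short Python sketch below shows the arithmetic for a single pathway flux j. The metabolite names, shadow prices and net stoichiometric coefficients are hypothetical numbers, and the sign convention (negative coefficient for a metabolite drained from the host) is an assumption.

```python
def pathway_perturbation(s_star, j):
    """Delta b = S* j for one pathway flux j.

    s_star maps each host metabolite to its net stoichiometric coefficient
    in the heterologous pathway (assumed: negative means consumed).
    """
    return {met: coeff * j for met, coeff in s_star.items()}


def growth_rate_change(shadow_prices, delta_b):
    """Delta mu = lambda . Delta b, using precomputed host shadow prices."""
    return sum(shadow_prices.get(met, 0.0) * db for met, db in delta_b.items())


if __name__ == "__main__":
    # Hypothetical shadow prices (precomputed once for the host).
    shadow_prices = {"g6p": 0.1, "nadh": 0.02}
    # Hypothetical pathway: drains one g6p and releases one nadh per unit flux.
    s_star = {"g6p": -1.0, "nadh": 1.0}
    delta_b = pathway_perturbation(s_star, j=1.0)  # j in mmol/gDW/h
    print(f"Delta mu = {growth_rate_change(shadow_prices, delta_b):+.3f}")
```

Since λ depends only on the host, ranking many candidate pathways only requires recomputing this sum for each S*, without solving the linear program again.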

2.4 Implementation
DESHARKY is implemented in C/C++, compiles easily, and runs in UNIX environments (e.g. in Linux, or in Windows using Cygwin). Here we have taken E. coli as the cell model. We have used an extended description of E. coli metabolism involving 1039 compounds, including extracellular compounds, and 2381 biochemical reactions (Schuetz et al., 2007). We provide the KEGG (Kanehisa and Goto, 2000) databases for chemical compounds and enzymatic reactions in a cleaned format. There are 14 965 chemical compounds, of which 826 are present in the host; 4942 enzymes, of which 2350 have an available sequence; and 7400 enzymatic reactions from 650 organisms. We also consider a set of compounds that may be present in the medium and can be used as substrates by the cell. To extend the capabilities of the algorithm, reactions can be assumed to be reversible. In addition, reactions that are not found in KEGG can be introduced. The input of our algorithm is the target compound. The output is the designed metabolic pathway together with the quantification of the transcription, translation and metabolic load. In addition, we provide the amino acid sequences of the enzymes involved in the pathway. These sequences are the phylogenetically closest to E. coli according to the KEGG classification of organisms.

Here we have assumed an initial growth rate of µ0 = 2 doublings/h, a transcription kinetics of φ = 0.1 RNA polymerases/s, a translation kinetics of ψ = 0.4 ribosomes/s, a number of DNA copies for the enzymes of C = 100, a transcription velocity of 1/tr = 45 nt/s, a translation velocity of 1/tp = 16 aa/s, and a metabolic pathway flux of j = 1 mmol/gDW/h.


    3 RESULTS AND DISCUSSION
We have applied DESHARKY to design several metabolic pathways including biodegradation of toluene or phenol and bioproduction of sorbitol and glucaric acid (Table 1). For instance, the microbial production of glucaric acid is important for therapeutic purposes including cholesterol reduction and cancer chemotherapy, and for the synthesis of new nylons and hyperbranched polyesters. In Figure 1 we show the transcription, translation and metabolic load for this pathway, and in the Supplementary Figure S1 we depict the biochemical transformations and the list of genes encoding the corresponding enzymes. In addition, in the Supplementary Material we have compared the biodegradation pathways we found with those obtained from UM-BBD (Hou et al., 2003) showing alternative routes.

Our tool uses a Monte Carlo heuristic to find a possible route connecting a specified target metabolite with the host metabolism, instead of selecting a pathway by enumerating all possible metabolic routes (Arita, 2003; Eppstein, 1998). DESHARKY finds a feasible pathway and computes its associated genetic load in a few seconds. In addition, our software can be run in a distributed computing setting to sample most of the solution space. For illustration purposes, we show in the Supplementary Material all possible biodegradation routes for phenol. Here, we have assumed non-weighted reactions for the heuristic procedure, and we compute the genetic load a posteriori using the transcription and metabolic models. Alternatively, a global optimization could be addressed by considering the load of each reaction during the heuristic procedure (Croes et al., 2006).


    ACKNOWLEDGEMENTS
We thank T. S. Moon for his help, and B. Canton and D. Endy for their fruitful comments on the chassis model.

Funding: Generalitat Valenciana (BFPI 2007/160 to G.R.); Spanish Ministry of Education (TIN 2006-12860); MIT-France program, Structural Funds ERDF; EU grant BioModularH2 (FP6-NEST-043340).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Trey Ideker

Received on May 28, 2008; revised on August 14, 2008; accepted on September 2, 2008

    REFERENCES

    Arita M. In silico atomic tracing by substrate-product relationships in Escherichia coli intermediary metabolism. Genome Res. (2003) 13:2455–2466.

    Arita M. The metabolic world of Escherichia coli is not small. Proc. Natl Acad. Sci. USA (2004) 101:1543–1547.

    Bailey JE. Toward a science of metabolic engineering. Science (1991) 252:1668–1675.

    Bremer H, Dennis PP. Modulation of chemical composition and other parameters of the cell by growth rate. In: Neidhardt FC, et al., eds. Escherichia coli and Salmonella (1996) Vol. 2, 2nd edn. Washington, DC: ASM Press. 1553–1569.

    Croes D, et al. Inferring meaningful pathways in weighted metabolic networks. J. Mol. Biol. (2006) 356:222–236.

    Eppstein D. Finding the k shortest paths. SIAM J. Comput. (1998) 28:652–673.

    Haro M-A, de Lorenzo V. Metabolic engineering of bacteria for environmental applications: construction of Pseudomonas strains for biodegradation of 2-chlorotoluene. J. Biotechnol. (2001) 85:103–113.

    Hatzimanikatis V, et al. Exploring the diversity of complex metabolic networks. Bioinformatics (2005) 21:1603–1609.

    Hou BK, et al. Microbial pathway prediction: a functional group approach. J. Chem. Inf. Comput. Sci. (2003) 43:1051–1057.

    Jones KL, et al. Low-copy plasmids can perform as well as or better than high-copy plasmids for metabolic engineering of bacteria. Metabolic Engineering (2000) 2:328–338.

    Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. (2000) 28:27–30.

    Li C, et al. Computational discovery of biochemical routes to specialty chemicals. Chem. Eng. Sci. (2004) 59:5051–5060.

    Martin JJ, et al. Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nat. Biotech. (2003) 21:796–802.

    Mavrovouniotis ML. Estimation of standard Gibbs energy changes of biotransformations. J. Biol. Chem. (1991) 266:14440–14445.

    Nielsen J. Metabolic engineering. Appl. Microbiol. Biotechnol. (2001) 55:263–283.

    Pazos F, et al. MetaRouter: bioinformatics for bioremediation. Nucleic Acids Res. (2005) 33:D588–D592.

    Ro DK, et al. Production of the antimalarial drug precursor artemisinic acid in engineered yeast. Nature (2006) 440:940–943.

    Rothlisberger D, et al. Kemp elimination catalysts by computational enzyme design. Nature (2008) 453:190–195.

    Schaaff I, et al. Overproduction of glycolytic enzymes in yeast. Yeast (1989) 5:285–290.

    Schrijver A. Theory of Linear and Integer Programming. (1998) New York: John Wiley & Sons.

    Schuetz R, et al. Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Molec. Syst. Biol. (2007) 3:119.

    Sprinzak D, Elowitz MB. Reconstruction of genetic circuits. Nature (2005) 438:443–448.

    Varma A, Palsson BO. Metabolic flux balancing: Basic concepts, scientific and practical use. Bio/Technology (1994) 12:994–998.

