Skip Navigation


Bioinformatics Advance Access originally published online on November 11, 2004
Bioinformatics 2005 21(7):1203-1210; doi:10.1093/bioinformatics/bti127
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Supplementary data
Right arrow All Versions of this Article:
21/7/1203    most recent
bti127v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (14)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Urbanczik, R.
Right arrow Articles by Wagner, C.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Urbanczik, R.
Right arrow Articles by Wagner, C.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

Published by Oxford University Press 2004.

An improved algorithm for stoichiometric network analysis: theory and applications

R. Urbanczik * and C. Wagner *

Institute of Pharmacology, University of Bern Friedbuehlstrasse 49, CH-3010 Bern, Switzerland

*To whom correspondence should be addressed.


    Abstract
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 

Motivation: Genome scale analysis of the metabolic network of a microorganism is a major challenge in bioinformatics. The combinatorial explosion, which occurs during the construction of elementary fluxes (non-redundant pathways) requires sophisticated and efficient algorithms to tackle the problem.

Results: Mathematically, the calculation of elementary fluxes amounts to characterizing the space of solutions to a mixed system of linear equalities, given by the stoichiometry matrix, and linear inequalities, arising from the irreversibility of some or all of the reactions in the network. Previous approaches to this problem have iteratively solved for the equalities while satisfying the inequalities throughout the process. In an extension of previous work, here we consider the complementary approach and derive an algorithm which satisfies the inequalities one by one while staying in the space of solution of the equality constraints. Benchmarks on different subnetworks of the central carbon metabolism of Escherichia coli show that this new approach yields a significant reduction in the execution time of the calculation. This reduction arises since the odds that an intermediate elementary flux already fulfills an additional inequality are larger than when having to satisfy an additional equality constraint.

Availability: The code is available upon request.

Supplementary information: Pseudo code and a Mathematica implementation of the algorithm is on the OUP server.

Contact: robert.urbanczik{at}pki.unibe.ch; clemens.wagner{at}pki.unibe.ch


    1 INTRODUCTION
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 
Due to recent advances in genomics the reconstruction of metabolic networks of microorganisms has become feasible. This rebuilding task can be solved using biochemical knowledge and information from genetic databases. However, the analysis of such large-scale systems remains a major challenge in computational biology.

Metabolic networks are assumed to operate in the steady state, so that no accumulation of species occur in the system. The stationary state condition allows for detecting routes in the system, which are inherently coupled due to the stoichiometric constraints. These so-called elementary fluxes represent non-redundant subnetworks of the system, which either properly connect inputs to outputs or represent internal cycles.

The convex analysis of biochemical networks was founded in the seminal paper by Clarke (1980). Although his main focus was on the stability of systems, in this work he outlined the theoretical basis of the stoichiometric network analysis. In the biochemical systems he considered, all reactions are unidirectional by splitting reversible reactions into forward and reverse velocities. In this case all admissible steady states of the system form a convex cone in the semi-positive orthant of the flow space. The normalized generating vectors of this current cone, which lie on its edges, are named extremal currents. These extremal currents have the property that they are as simple as possible. However, the determination of simple routes in chemical networks does not require the semi-positivity condition. As shown in Schuster et al. (2002) reversible reactions can also be taken into account and the dimension of the space of flows need not be augmented artificially as in Clarke's approach. Therefore, simple pathways can also contain negative entries when the flow of a reaction runs against its initially assumed direction. But these simple fluxes do not necessarily lie on the edges of the flux cone and were thus called elementary. Thus, extremal currents and elementary fluxes are in general elements of different vector spaces (except when all reactions are irreversible).

Several different algorithms have been proposed to determine the extremal currents of a chemical reaction systems (Clarke, 1980; Happel and Sellers, 1982; von Hohenbalken et al., 1987). However, large-scale systems meet the problem of combinatorial explosion and these methods are not applicable to metabolic networks of microorganisms. Recently, Schuster et al. (2002) presented an algorithm, which can be applied to larger systems. In the irreversible case the method has a simple geometrical interpretation since each row of the stoichiometry matrix represents a hyperplane and the mutual intersection of these hyperplanes, which divide the semi-positive orthant, shapes the cone of the system. One starts with the edges of the semi-positive orthant and after processing a hyperplane three sets of intersection points are obtained—first, points that lie in the interior of the current cone, second, points on the edges of the final cone and, third, points that are on an edge now but will be chopped off when cutting with the next hyperplane. The achievement of Schuster et al. (Schuster and Hoefer, 1991; Heinrich and Schuster, 1996; Schuster et al., 2002) was to infer simple rules to eliminate the first set of points which also work in the presence of reversible reactions. This reduces the computational expenditure and makes the elementary flux analysis of large networks feasible.

The convex analysis of metabolic networks has found interesting applications in biotechnology. (Edwards and Palsson, 2000) reconstructed the major part of the metabolic map of Escherichia coli. They determined the feasible metabolic states of the in silico system and used a target function to optimize growth. In an experiment performed in parallel the bacterium indeed undergoes adaptive evolution to achieve the predicted growth (Ibarra et al., 2002). In a different study, Stelling et al. (2002) investigated the central carbon metabolism of wild-type and mutated E.coli and calculated the elementary fluxes for subnetworks utilizing representative substrates (acetate, succinate, glycerol, glucose). Based on this, they determined in silico the survival rate when a set of enzymes is knocked out and verified it in the bacterium. In the majority of cases the computational and the experimental outcome coincides nicely. Extending this work, in Klamt and Stelling (2002) it was also possible to determine the elementary fluxes when the subnetworks were combined by using a reduced stoichiometry matrix of 31 species and 51 reactions. This calculation, however, required approximately 50 h computing time. Therefore, more efficient algorithms are needed in order to process complete networks of higher organisms.

We have recently presented an algorithm, which determines the elementary fluxes via a linear combination of null-space basis vectors of the stoichiometry matrix (Wagner, 2004). Since all of the flows considered by this method already lie in the intersection of all hyperplanes, this has the advantage that the computation is performed in a proper subspace of the whole space of flows. We exploited the special representation of the null-space matrix, which can be always put into the form that the identity matrix appears in the lower part. Starting with the basis vectors, the current polytope is then obtained, from linear combinations of pairs of fluxes which cancel at least one flow.

In contrast to this earlier phenomenological work, we here present a mathematically rigorous analysis of this approach. Further, the analysis shows that one does not have to wait till the very end of the calculation with enforcing the irreversibility constraints, but that these can be used to prune the set of possible solution during the course of the iteration. This results in an exponential reduction in both the space and time complexity of the procedure. In order to demonstrate the benefits of the new algorithm we benchmark it on some of the networks used in the above mentioned previous study (Klamt and Stelling, 2002).

Our analysis builds on quite a few basic properties of elementary fluxes and their relationship to polyhedral cones. While some of these have already been established before (Schuster et al., 2002; Wagner, 2004), to keep the presentation self-contained, we shall start from scratch, formally stating even simple properties as mathematical propositions—proofs to all proposition are given in the Appendix. To make the technical presentation starting with Section 3 more accessible, the next section gives a rough sketch of the basic strategies used in Schuster's and in the new null-space algorithm.


    2 TWO STRATEGIES
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 
In a metabolic network linking m species by n reactions with the stoichiometry given via the m by n matrix N, a flow x through the reactions changes the concentration of the species by Nx. In a steady state, there can be no accumulation of species, so Nx = 0, and, further, the flow xi through any irreversible reaction i must be positive. We assume that nr of the n reactions are reversible and that they come first in the numbering of the reactions. So any steady-state flux has to satisfy the following system of linear equalities and inequalities:

(1)
The set of solutions of the system is a cone which is invariant under elementary operations on the rows of N. Using such operations N can be transformed into a matrix where the first d ≤ m rows are linearly independent and the remaining rows vanish. In the sequel, we assume that this simplification of the problem has already been carried out, so d = m and, in particular, N is of full rank.

The goal of stoichiometric network analysis is to gain a complete overview over the possible steady-state behavior, by computing a small generating set for the solutions of Equation (1). For nr > 0 there are different notions of what a suitable generating set is and we shall discuss some of these in the next section. In this section, to focus on the main ideas, we shall assume nr = 0 since in this case the notions are equivalent and coincide with standard ideas of convex analysis. A generating set E is obtained by picking one non-zero vector from each edge, i.e. one-dimensional face, of the cone of solutions. Then, any solution of (1) is a linear combination of vectors in E with non-negative scalar coefficients, and such linear combinations always yield solutions of (1).

To compute this generating set, we denote by Ni, i = 1, ..., m, the row vectors of N. For m' ≤ m then consider the system

(2)
For m' = m the system is equivalent to (1) since we have assumed nr = 0. But it is very simple to find a generating set for (2) if m' = 0. Then the vectors of the standard basis of Rn lie on the edges of the cone x ≥ 0. So all one has do is to figure out a way to compute the edges of the m' + 1 case from those of the m' problem. Now, as a simple consequence of standard dimension arguments in convex analysis (Rockafellar, 1970) one finds that:

  1. Any edge vector x of the m' problem which happens to satisfy the new constraint Nm'+1 x = 0 lies on an edge of the m' + 1 problem.
  2. Any other edge of the m' + 1 problem can be written as a linear combination of two edges of the m' problem.
Hence, to obtain the new edges for m' + 1 we have to consider all pairs of old edges and check whether a vector x ≥ 0 which is a linear combination of the pair can satisfy Nm'+1 x = 0. In this manner, we obtain a list of candidates, and any new edge is found among this list. However, the list will also contain many vectors which do not lie on an edge of the m' + 1 problem. So one has to separate the wheat from the chaff, and quite a few approaches to this task have been discussed in the literature (von Hohenbalken et al., 1987; Schuster et al., 2002).

The algorithm sketched above exploits the fact that (1) is trivial if one first ignores the equality constraints. Starting with such a solution, we then satisfy the equality constraints one by one. But (1) is also trivial if one ignores the inequality constraints. So we can consider the dual approach of ignoring them first and then satisfying the inequality constraints one by one.

To this end, let K be a full rank n by nm matrix satisfying NK = 0. Then the vectors of the form Kß, with ß Rnm are the solution of the system Nx = 0. Hence, we consider the system of inequalities

(3)
and the mapping ß -> Kß yields a one-to-one correspondence between the solution of (3) and the solutions of (1).

The inequalities in (3) become particular simple, if we chose K such that an nm by nm sub-matrix of K is the identity matrix. For brevity, let us assume that in fact the last nm rows of K form this identity matrix. Then the last nm inequalities in Equation (3) simply read ß ≥ 0. We further denote by Ki, i = 1, ..., m the first rows of K and then, similarly to the first approach, for m' ≤ m, we study the system

(4)
Again, due to the specific form of K, for m' = m the system is equivalent to (3) and it is trivial for m' = 0. In the latter case the standard basis of Rnm gives the edges of the cone ß ≥ 0. For the induction step, similarly to the primal approach, any edge of the m'-problem which satisfies the new constraint Km'+1ß ≥ 0 is a new edge. The other edges of the m' + 1 cone are linear combinations of two edges of the old problem. Again, in calculating these new edges from pairs of old edges we have to separate the wheat from the chaff and one will expect that the approaches used in the primal problem can be adapted to the present situation.


    3 ELEMENTARY FLUXES
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 
We now turn to the general case that some reactions can be reversible, nr ≥ 0, and first introduce a name for the set of solutions of the system (1) by defining

where c(N) denotes the number of columns of N. Of course c(N) just equals n here. But we will also use matrices of different shapes later on.

An x F(N, nr) will be called reversible iff xj = 0 for j > nr, in this case also –x F(N, nr). We shall often need to consider the set of zeroes of a vector x and set S(x) = {j | xj = 0}. We shall say that a non-zero vector y is simpler than a vector x of same dimensionality if and only if S(y) S(x).

We are now in a position to introduce the basic concept: a non-zero vector x F(N, nr) is elementary in F(N, nr) if and only if there is no simpler vector in F(N, nr), i.e. there is no y F(N, nr)/{0} with S(y) S(x). We shall often use phrases such as ‘x is elementary’ or ‘x is an elementary flux’ when the cone we are referring to is obvious from the context.

Many important properties of elementary fluxes derive from the following two simple propositions.

PROPOSITION 1
If a non-zero x F(N, nr) is not elementary, we can find simpler vectors y, z F(N, nr) such that x = y + z.

PROPOSITION 2
Assume that x is elementary and Ny = 0 for some vector y Rn/{0}. Then S(x) {subseteq} S(y) implies x = {lambda}y.

As a corollary to Proposition 2 we obtain that elementary fluxes are characterized by their zero-sets in the following sense.

PROPOSITION 3
If x and y are elementary, S(x) = S(y) implies x = {lambda}y. If, in addition, x is irreversible, {lambda} > 0.

As a consequence of Propostion 1 any non-elementary x F(N, nr)/{0} can be decomposed as a sum of elementary elements yk with S(x) S(yk). Hence, in the case nr = 0 the elementary fluxes lie on the edges of the cone F(N,0).

Due to the above decomposition property, elementary fluxes can be used to construct generating sets for F(N, nr). We shall call a subset E of F(N, nr) an elementary cover of F(N, nr), if and only if

  1. For any elementary x there is a y in E with x = {lambda} y, and {lambda} is positive for irreversible x.
  2. If y1 and y2 are different elements of E: S(y1) S(y2).
While according to the above definition F(N, nr) can have more than one elementary cover any two covers are of course closely related. If E' is also an elementary cover, then x E' implies that a multiple {lambda}x of x lies in E and, in particular, E' and E have the same number of elements.

It is important to note that one can sensibly define more parsimonious descriptions of F(N, nr) than an elementary cover. We shall call a subset G of F(N, nr) an generating set if

While an elementary cover obviously is a generating set, it will in general not be the smallest generating set if nr > 0. Indeed, an advantage of the null-space algorithm presented below is that as an intermediate result in the computation of an elementary cover, we find a smallest generating set.

But, as already proven before in a different manner by Schuster et al. (2002) any non-elementary x F(N, nr)/{0} can be decomposed as a linear combination of the fluxes in an elementary cover such that each elementary flux occurring in the decomposition is simpler than x (and the scalar pre-factor is positive if the flux is irreversible). Hence, the concept of an elementary cover is uniquely suited to answer some functional questions about biochemical networks. For example, assume we want to block reaction j, e.g. because it produces some undesirable output, but that it is easier to block reaction i. Now, in the steady state, blocking i will be tantamount to blocking j if for any x F(N, nr) the following conditions holds: xi = 0 implies xj = 0. But having found an elementary cover it suffices to check whether the condition holds for all x E because any other steady state can be decomposed into simpler elementary fluxes. Of course, in case blocking i is not sufficient to impede j, we can go on to ask whether j is blocked if in addition to i we block some reaction i'. Then we just have to check if implies xj = 0 for all x E.

To round off this section, we give a simple criterion to check directly whether a flux is elementary.

PROPOSITION 4
Assume that x F(N, nr)/{0}. Then x is elementary if and only if the sub-matrix of N formed by its columns {nu}i with i {notin} S(x) is of rank equal to: n – card S(x) –1.


    4 ELEMENTARY FLUXES AND CONE EDGES
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 
If some reactions are reversible the concept of an elementary flux does not have a simple interpretation in terms of purely geometric properties of F(N, nr). But, by considering the equivalent system obtained by treating any reversible reaction as two irreversible ones, we do arrive at a simple geometric interpretation in an extended space.

It will be convenient to write N in block form as N = (Nr Nir) where Nr is the m by nr sub-matrix of reversible reactions and Nir represent the irreversible reactions. Likewise we decompose a vector x in Rn as where of course xr has nr rows. To treat a reversible reaction as a pair of irreversible ones we set and decompose vectors in Rn+nr as where both xrp and xrn have nr rows. A solution to the irreversible problem given by easily yields a solution to the partly reversible problem given by (N, nr). Indeed, one immediately sees that the linear mapping

(5)
maps the cone into F(N, nr). On the other hand, by setting

(6)
for y Rnr, we obtain a mapping from F(N, nr) into . Obviously {psi}({varphi}(x)) = x and thus {psi} in fact maps onto F(N, nr).

We now turn to the relationship between the elementary fluxes of , which are just the edges of this cone, and elementary fluxes F(N, nr). Assume that an is of the form . Then is zero and thus not elementary in F(N, nr). But, if e has exactly one non-zero entry and if this entry is positive, one easily sees that is an elementary flux in . We call such a flux a spurious cycle, since it amounts to running the same reaction forwards and backwards. The following proposition shows the elementary fluxes of , which are not spurious cycles, correspond one to one to those of F(N, nr).

PROPOSITION 5
If x is elementary in F(N, nr), {varphi}(x) is elementary in . If is elementary in , is elementary in F(N, nr) unless is a spurious cycle.


    5 SCHUSTER'S ALGORITHM
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 
Schuster's algorithm uses as starting point that the standard basis of Rn is an elementary cover if there are no equality constraints and then proceeds by satisfying the equality constraints one by one. So, assume that in addition to being in F(N, nr) a vector x should satisfy an equality constraint cx = 0 for some n-dimensional row vector c. We can then define the augmented matrix and in terms of this matrix say that x should lie in F(N+, nr). The next proposition shows how to obtain fluxes elementary in F(N+, nr) from those in F(N, nr).

PROPOSITION 6
  1. Any x which is elementary in F(N, nr) and happens to lie in F(N+, nr), is also elementary in F(N+, nr).
  2. If x is elementary in F(N+, nr), we can find y1 and y2, elementary in F(N, nr), such that x = y1 + y2. Furthermore S(x) = S(y1) {cap} S(y2).

This proposition provides the theoretical basis of Schuster's algorithm for computing elementary covers. Assuming we already have an elementary cover E of F(N, nr) we basically have to do the following to find a cover E_ + of F(N+, nr). Considering all pairs (y1, y2) of elements of E, check if there is a ty1, y2, non-negative if y2 is irreversible, such that c (y1 + ty1,y2 y2) = 0. From all such pairs construct the list of the y1 + ty1,y2 y2. If for any z1 in there is a z2 with S(z1) {subseteq} S(z2), remove z1 from . The finally resulting E+ is an elementary cover of F(N+, nr).

There are few embellishments to this basic procedure. First note that in principle Proposition 4 allows us to check directly if a flux in is elementary. Now, calculating the rank of the sub-matrix needed for the test is computationally too expensive to be useful. But we know that a must fail this test if z has too many non-zero elements. In particular, if card S(z) < nm – 2, z cannot be elementary in F(N+, nr). This yields a simple way to weed out many of the candidates before carrying out the above subset tests. While pre-screening the candidates in this manner is not part of the original version of Schuster's algorithm, the benchmark results given below indicate that this can substantially reduce the number of candidates which have to be considered further. In any case, the computational overhead of this simple inspection of the single candidates is negligible compared to the quadratic cost entailed by checking whether S(z1) {subseteq} S(z2) holds for any pair.

A second point is that the floating point operations needed to compute y1 + ty1,y2 y2 are quite time consuming. But, in checking for being elementary, we just need S(y1 + ty1,y2 y2). Obviously S(y1 + ty1,y2 y2) ⊇ S(y1) {cap} S(y2) and from Proposition 6 we know that this relation holds as an equality if y1 + ty1,y2 y2 is indeed elementary. But this means that, for the purpose of carrying out the above tests, we may simply assume that S(y1 + ty1,y2 y2) = S(y1) {cap} S(y2) because we can only err on the safe side: any non-elementary vector in will be removed a fortiori, and such a vector will never erroneously cause an elementary flux to be deleted from . By virtualizing the generation of candidates in this manner, we just have to compute y1 + ty1,y2 y2 if this yields a new elementary flux. For all the other candidates one only has to calculate set intersections, which can be efficiently implemented by using packed bitmaps.


    6 THE NULL-SPACE ALGORITHM
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 
As in the introduction, let K be a kernel matrix of N, i.e. K is an n by nm matrix of full rank with NK = 0. Let Kj, j = 1, ..., n be the row vectors of K.

With K and nr we can associate the cone

where r(K) and c(K) denote the number of rows and, respectively, columns of K, here n and nm.

Since the mapping ß -> K ß yields a bijection from C(K, nr) onto F(N, nr) we call C(K, nr) a coordinate system for F(N, nr). The mapping is also a bijection of the edges of C(K, nr) onto the edges of F(N, nr) and, if nr = 0 the latter are just the elementary fluxes. Unfortunately for nr > 0 one cannot determine if Kß is an elementary flux by inspecting the location of ß in C(K, nr).

We are thus forced to introduce the following somewhat round-about definition: ß Rnm is an elementary coordinate w.r.t. (K, nr) if Kß is elementary in F(N, nr). Note that this definition does indeed make sense, because for any regular m by n matrix with N'K = 0, we have F(N', nr) = F(N, nr). Also, if ß is an elementary coordinate w.r.t. (K, nr), this does indeed mean that ß is an element of C(K, nr). Further, we shall call a row vector Kl irreversible if l > nr.

Now consider augmenting K by a irreversible row vector k+ and let K+ be the n + 1 by nm matrix . The following two propositions then deal with the transition from the (K, nr) to the (K+, nr) system.

PROPOSITION 7
If ß C(K+, nr) is elementary w.r.t to (K, nr) it is also elementary w.r.t. (K+, nr).

PROPOSITION 8
Assume that ß is an elementary coordinate w.r.t to (K+, nr) but not an elementary coordinate w.r.t. to (K, nr). Then:
  1. k+ ß = 0.
  2. ß = ß1 + ß2, where ß1 and ß2 are elementary coordinates w.r.t to (K, nr). Further S(Kß) = S(K ß1) {cap} S(K ß2).

We next deal with adding a reversible row vector, so consider the matrix obtained by prepending a row k* to K. We are now interested in the transition from the (K, nr) to the (K*, nr + 1) system. Note that while C(K*, nr + 1) = C(K, nr), a coordinate elementary w.r.t. (K*, nr + 1) may not be elementary w.r.t. (K, nr). Nevertheless, the following two propositions describing how to go from (K, nr) to (K*, nr + 1) are very similar to the above statements (Propositions 7 and 8) for the (K, nr) to (K+, nr) transition.

PROPOSITION 9
If ß is elementary w.r.t to (K, nr) it is also elementary w.r.t. (K*, nr + 1).

PROPOSITION 10
Assume that ß is an elementary coordinate w.r.t to (K*, nr + 1) but not an elementary coordinate w.r.t. to (K, nr). Then:
  1. k*ß = 0.
  2. ß = ß1 + ß2, where ß1 and ß2 are elementary coordinates w.r.t to (K, nr). Further S(Kß) = S(K ß1) {cap} S(K ß2).

The preceeding analysis provides us with an alternative algorithm to calculate the elementary cover of F(N, nr).

In analogy to our previous definitions, we call a set of elementary coordinates E an elementary cover w.r.t to (K, nr), if K E is an elementary cover of F(N, nr). We shall assume that the kernel matrix K is such that for a subset of rows with indices j1, ..., jnm the matrix K0 = (Kj1, ..., Kjnm) is the identity matrix. This can always be achieved by elementary operations on the columns of K which do not change the image of K.

Our starting point is an elementary cover E for (K0, u0) where u0 is the number of reversible rows in K0, i.e. the number of indices jl ≤ nr. Since K0 is the identity matrix, the vectors of the standard basis of Rnm can simply be used for E. We next augment K0 by a new row from K, obtaining the system (K1, u1) where u1 = u0 + 1, if the new row is reversible, and u1 = u0 if it is reversible. Using Propositions 7–10 we pick the appropriate elements of E and linear combinations of two of these elements to obtain a candidate set , such that a subset of forms an elementary cover w.r.t (K1, u). Thus all we need to do to obtain this cover is to weed out the redundant coordinates in . Again, this can be done by removing from , one after another, any ß for which S(K1{alpha}) {subseteq} S(K1ß) for some {alpha} in . Having thus obtained an elementary cover for (K1, u1), we can add further rows by re-applying the above prescription until we have a cover for (K, nr).

Note that essentially the same embellishments to the basic loop of this procedure are possible as in Schuster's algorithm. We can pre-check that card S(Kiß) is sufficiently large for a candidate ß and, based on Propositions 8 and 10 we can also virtualize the generation of candidates.

The main difference between this procedure and the early version presented in Wagner (2004) is that now just the cover of F(N, nr) is computed. In cases where nr < m, and these seem to be the norm for metabolic networks, the early version first found a cover of F(N, m) and then removed fluxes which violate one of the p = mnr irreversibility constraints to obtain a cover of F(N, nr). Generically, one will expect the cover of F(N, m) to be larger than the one of F(N, nr) by a factor of some 2p, with the corresponding performance penalty. Other differences include: (a) The intermediate elementarity tests are more strict since at the i-th stage they operate on the nm + i dimensional vectors Kiß instead of the final Kß. (b) The virtualization of the candidate generation. (c) Pre-checking that card S(Kiß) is sufficiently large.


    7 PERFORMANCE COMPARISON
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 
While the main purpose of this section is to compare the performance of the two algorithms, a few words on the choice of representation of the problem are in order. For Schuster's algorithm, while the final cover is invariant to elementary operation on the rows of N, this is not true for the intermediate covers generated by the algorithm. Hence, we first transform N to a matrix N' such that a permutation of N' is in row-reduced echelon form. This will tend to increase the number of zeroes in the matrix and help to reduce the size of the intermediate covers. Also, we chose the permutation such that the number of zeroes per row of N' decreases with the row number, since this helps to keep the intermediate covers small in the initial stages of the algorithm.

For null-space, we choose K such that a permutation of KT is in row-reduced echelon form. The permutation is chosen such that as many rows as possible of the diagonal sub-matrix of K are irreversible, and we process any remaining irreversible rows before the reversible ones. Then, on the one hand, the intermediate cover we obtain after having processed all of the irreversible rows, yields the coordinates of a minimal generating set of F(N, nr). On the other hand, when processing a reversible row the size of the intermediate cover cannot decrease. So, when processing the reversible rows in the last iterations of the algorithm, we know that the intermediate covers we are generating are not larger than the final cover.

It is important to note that the above considerations do not uniquely define the representation. Hence, the observed performance of the algorithms will still depend on trivia such as the numbering of the reactions but it seems difficult to remove this arbitrariness in a sensible way.

We benchmarked the algorithms on the central carbon metabolism of E.coli and on three sub-networks thereof. The reduced stoichiometry matrices used are the ones described in more detail in Klamt and Stelling (2002). They consist of 45–51 reactions, 18 of which are reversible, involving 29–30 metabolites.

The comparison of computation times, summarized in Figure 1, show that null-space is by a substantial factor faster than Schuster's algorithm. There is even indication in the plot that this factor might increase somewhat with the size of the final cover, but this could be just accidental.



View larger version (15K):
[in this window]
[in a new window]
 
Fig. 1 Double logarithmic plot of the execution time of the two algorithms versus the final cover size, showing an approximately quadratic increase of the former with the latter. The upper curve is for Schuster's algorithm; the lower one for null-space. The four networks (in order of increasing cover size) are: succinate, glycerol, glucose and E.coli. The algorithms are basically implemented in Matlab but the operations on the packed bitmaps were written in C. The computation times are for our workstation with a 2.66 GHz P4 CPU and 1 GB of RAM. The values for our implementation of Schuster's algorithm are compatible with the ones reported in Klamt and Stelling (2002). Inset: Performance of null-space when the algorithm is just computing a minimal generating set. The CPU time (s) is plotted against the size of the minimal generating set for, in this order, succinate, glycerol, glucose and E.coli.

 
There does not seem to be one single factor quantitatively explaining the observed differences in computation time. In some cases, the intermediate covers generated by Schuster's algorithm are larger, as shown in Figure 2. Indeed, for the glycerol and glucose networks the last intermediate cover computed by Schuster's algorithm is larger than the final one. This cannot happen for null-space since, in the final stages, we are processing reversible rows. But even when there is only a small difference in the sizes of the intermediate covers, there is a marked difference in the way these are generated. As shown in Figure 3, Schuster's algorithm consistently generates more candidates than null-space. This is rather intuitive since Schuster's algorithm at each iteration will keep fewer vectors from the old cover, testing for an equality and not an inequality, and thus, as also shown in Figure 3, has to compute more replacements. For the three subnetworks the differences in the number of generated candidates seem sufficient to account for the difference in computation time observed in Figure 1. However, for the entire network E.coli this difference is smaller, albeit still marked. While even in this case null-space generates only half as many candidates than Schuster's algorithm, another factor is also at work: For null-space, a smaller fraction of candidates pass the simple test for sufficiently many zeroes and have to be checked by the more involved subset tests (cf. Fig. 3).



View larger version (11K):
[in this window]
[in a new window]
 
Fig. 2 Logarithmic bar chart showing the sum of the sizes of the intermediate covers generated by the two algorithms. The white bars are for Schuster's algorithm; the grey bars for null-space.

 


View larger version (18K):
[in this window]
[in a new window]
 
Fig. 3 Logarithmic bar chart showing the number of candidates generated by the two algorithms. The white bars are for Schuster's algorithm; the grey bars for null-space. The outermost bars refer to the total number of candidates. The middle bars show the number of candidates which pass the test for sufficiently many zeroes. The innermost bars give the number of candidates passing all test which are at some point added to a new cover.

 

    8 DISCUSSION
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 
We have presented the null-space algorithm to calculate elementary fluxes of a chemical reaction system together with a rigorous mathematical derivation of this procedure. Benchmark results show that our procedure is faster, upto 20 times, than Schuster's algorithm, which to date has been the state of the art approach to this problem.

Our basic idea is to sculpt a simplex spanned by basis vectors of the null-space by intersecting it with half-spaces to satisfy the irreversibility constraints. This is in contrast to previous approaches which work in the full space of flows and intersect the unit simplex in this space with hyperplanes to satisfy the stoichiometric constraints. Compared to these, the null-space approach has the advantage of reducing the dimensionality of the problem from the outset. Further, when intersecting with a hyperplane as in Schuster's procedure, the space spanned by the surviving points generically is of only zero measure in the space spanned by the original points. But typically, this will not be the case when intersecting with a half-space. Thus one will expect far more points to be retained from one iteration to the next in the null-space algorithm than in Schuster's procedure. In the presence of reversible reactions this advantage is even greater, since then for null-space all elements of the old cover become part of the new cover. Since both algorithms need m iterations and the size of the final cover is the same, the fact that null-space retains more elements should result in fewer candidates being generated and thus in a faster computation. The results of the previous section confirm this expectation and show that the reduction in the number of candidates qualitatively explains the observed speedup.

But even when using null-space, for the most complex network we considered the computational demands are substantial due to the unavoidable fact that there are very many fluxes in the final elementary cover. In contrast to this, as shown in the inset of Figure 1, just computing a generating set is computationally trivial even in the worst case we studied. Hence, when attempting to extend this style of analysis to genome scale networks, one will have to carefully consider whether the exhaustive description provided by an elementary cover is really needed or if one can make do with the more parsimonious description of the flux cone provided by a generating set.


    APPENDIX
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 

Proof of Proposition 1.
Since x is not elementary there is a simpler . We are done if we can find a t > 0 such that: is in F(N, nr) and S(z) S(x).
Case 1: is irreversible. For t choose the minimal value in .
Case 2: is reversible. For t choose the minimal positive value in . If there is no positive value, replace with .

Proof of Proposition 2
Because S(x) {subseteq} S(y), a linear combination x + ty is in F(N, nr) even for non-zero t as long as t is sufficiently small. Also, we can find a t with the property xj + tyj = 0 for some j {notin} S(x). If, in addition, t is of minimal magnitude with this property we have x + ty F(N, nr) and S(x) S(x + ty). Since x is elementary, x + ty = 0 and the rest is obvious.

Proof of Proposition 3
Direct consequence of Proposition 2.

Proof of Proposition 4
A direct consequence of Proposition 2 is that for x F(N, nr) the following two statements are equivalent.
  1. x is elementary.
  2. The set of solutions y of the linear system of equations Ny = 0, S(y) ⊇ S(x) is a one-dimensional space.

Now, statement II is obviously equivalent to the rank condition given in the proposition.

Proof of Proposition 5
We first set:

(7)
Now and for we have . Furthermore, any elementary flux of which is not a spurious cycle must lie in .

Hence Proposition 5 is equivalent to the statement: x is elementary in F(N, nr) if and only if {varphi}(x) is elementary in . This statement is a direct consequence of Proposition 4 because the matrices involved in the rank test for x and {varphi}(x) are—up to sign changes of the columns—the same.

Proof of Proposition 6
(a) This is obvious. For (b), first note that we are done if x is elementary in F(N, nr). Otherwise, decompose x as , with all elementary in F(N, nr) and simpler than x. The cannot lie in F(N+, nr) because x is elementary in F(N+, nr), thus , for all k. Because the tk have to sum to zero since cx = 0, we can find i, j such that ti tj < 0. Now set . So and is in F(N+, nr). By construction and since x is elementary in F(N+, nr), we have . So , proving the first part of 5b.

We still need to show that S(x) = S(y1) {cap} S(y2). Otherwise, there is an index i ≤ nr with . We now consider the equivalent irreversible problem discussed in Proposition 5 and denote by the matrix associated to (N+, nr) in this problem.

For the condition implies . This means that does not lie in , in contradiction to the fact that is elementary in F(N+, nr).

Proof of Proposition 7
Consider any {alpha} C(K+, nr) satisfying S(K+ß) {subseteq} S(K+{alpha}). Then S(Kß) {subseteq} S(K{alpha}) but this inclusion cannot be strict since ß is elementary w.r.t to (K, nr). So S(Kß) = S(K{alpha}) and by Proposition 2 this implies that {alpha} = {lambda} ß and thus S(K+ß) = S(K+{alpha}).

Proof of Proposition 8
(a) The premise guarantees the existence of an {alpha} C(K, nr) with S(K{alpha}) S(Kß) and that S(K+{alpha}) ⊅ S(K+ß). Hence k+ß = 0.

(b) By decomposing Kß into elementary fluxes in F(N, nr), we see that ß can be written as ß = {sum}l ßl where the ßl are elementary coordinates w.r.t. (K, nr) satisfying S(Kß) S(Kßl). For brevity we set tl = k+ ßl. We cannot have tl = 0 even for a single l, because this would mean S(K+ß) S(K+ßl). However, by the preceeding proposition k+ß = 0, so we can find indices i and j with ti tj < 0. Then, setting as usual, we have and this proves the first part of (b).

We still need to show S(Kß) = S(Kß1) {cap} S(Kß2). Denote by N+ a full rank m by n + 1 matrix with N+ K+ = 0 and by the matrix of the irreversible problem associated with (N+, nr). The analogs for the augmented system to the mappings {psi} and {varphi} defined in Equations (5) and (6) will be denoted by {psi}+ and {varphi}+ since they map forth and back between n + 1 + nr dimensional and the n + 1 dimensional F(N+, nr).

Now, assuming S(Kß) S(Kß1) {cap} S(Kß2) there is an i with Kiß1 = – Kiß2 != 0, thus {varphi}(Kß1) + {varphi}(Kß2) cannot not lie in , defined in Equation (7). We then also have , contradicting the fact that K+ß = {psi}+(s) is elementary in F(N+, nr).

Proof of Propositions 9 and 10
The proofs of (9), (10a) as well as the first part of (10b) are verbatim the same as for the corresponding statements of the (K, nr) to (K+, nr) transition (Proposition 7 and 8) if one makes the substitutions: (K+, nr) becomes (K*, nr + 1), K+ becomes K* and k+ becomes k*.

Also, to prove the second part of (10b), i.e. S(Kß) = S(K ß1) {cap} S(K ß2), we can argue analogously as in Proposition (8): denote by N* a full rank m by n + 1 matrix with N* K* = 0 and by the matrix of the irreversible problem associated with (N*, nr + 1). The analogs for the augmented system to the mappings defined in Equations (5) and (6) will be denoted by {psi}* and {varphi}*. They map forth and back between n + 2 + nr dimensional and the n + 1 dimensional F(N*, nr + 1).

Again, assuming there is an i with Kiß1 = –Kiß2 != 0, implies . We then also have , contradicting the fact that K*ß = {psi}* (s) is elementary in F(N*, nr + 1).


    Acknowledgments
 
We thank Jörg Stucki for fruitful discussions and Steffen Klamt for providing us the matrices of the E.coli metabolic network. This work was supported by the Swiss National Science Foundation, Grant No. 3100A0-102269.

Received on August 23, 2004; revised on October 11, 2004; accepted on October 20, 2004

    REFERENCES
 TOP
 Abstract
 1 INTRODUCTION
 2 TWO STRATEGIES
 3 ELEMENTARY FLUXES
 4 ELEMENTARY FLUXES AND...
 5 SCHUSTER'S ALGORITHM
 6 THE NULL-SPACE ALGORITHM
 7 PERFORMANCE COMPARISON
 8 DISCUSSION
 APPENDIX
 REFERENCES
 

    Clarke, B. (1980) Stability of complex reaction networks. In Prigogine, I. and Rice, S. (Eds.). Advances in Chemical Physics, , New York Wiley, pp. 1–216.

    Edwards, J. and Palsson, B. (2000) The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc. Natl Acad. Sci. USA, 97, 5528–5533[Abstract/Free Full Text].

    Happel, J. and Sellers, P. (1982) Multiple reaction mechanism in catalysis. Ind. Eng. Chem. Fundam., 21, 67–76[CrossRef].

    Heinrich, R. and Schuster, S. The Regulation of Cellular Systems, (1996) , New York Chapman and Hall.

    Ibarra, R., Edwards, J., Palsson, B. (2002) Escherichia coli k-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature, 420, , pp. 186–189[CrossRef][Medline].

    Klamt, S. and Stelling, J. (2002) Combinatorial complexity of pathway analysis in metabolic networks. Mol. Biol. Rep., 2002, 233–236.

    Rockafellar, R. Convex Analysis, (1970) , Princeton Princeton University Press.

    Schuster, S. and Hoefer, T. (1991) Determining all extreme semi-positive conservation relations in chemical reaction systems: a test criterion for conservativity. J. Chem. Soc. Faraday Trans., 87, , pp. 2561–2566[CrossRef].

    Schuster, S., Hilgetag, C., Woods, J., Fell, D. (2002) Reaction routes in biochemichal reactions systems: algebraic properties, validated calculation procedure and example from nucleotide metabolism. J. Math. Biol., 45, 153–181[CrossRef][Web of Science][Medline].

    Stelling, J., Klamt, S., Bettenbrock, K., Schuster, S., Gilles, E. (2002) Metabolic network structure determines key aspects of functionality and regulation. Nature, 420, 190–193[CrossRef][Medline].

    von Hohenbalken, B., Clark, B., Lewis, J. (1987) Least distance methods for the frame of homogeneous equation systems. J. Comput. Appl. Math., 19, 231–241.

    Wagner, C. (2004) Nullspace approach to determine elementary modes of chemical reaction systems. J. Phys. Chem. B, 108, 2425–2431[CrossRef].


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
BioinformaticsHome page
F. J. Planes and J. E. Beasley
An optimization model for metabolic pathways
Bioinformatics, October 15, 2009; 25(20): 2723 - 2729.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
L. F. de Figueiredo, S. Schuster, C. Kaleta, and D. A. Fell
Can sugars be produced from fatty acids? A test case for pathway analysis tools
Bioinformatics, January 1, 2009; 25(1): 152 - 158.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
L. F. de Figueiredo, S. Schuster, C. Kaleta, and D. A. Fell
Can sugars be produced from fatty acids? A test case for pathway analysis tools
Bioinformatics, November 15, 2008; 24(22): 2615 - 2621.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
A. v. Kamp and S. Schuster
Metatool 5.0: fast and flexible elementary modes analysis
Bioinformatics, August 1, 2006; 22(15): 1930 - 1931.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
R. Urbanczik and C. Wagner
Functional stoichiometric analysis of metabolic networks
Bioinformatics, November 15, 2005; 21(22): 4176 - 4180.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Supplementary data
Right arrow All Versions of this Article:
21/7/1203    most recent
bti127v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (14)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Urbanczik, R.
Right arrow Articles by Wagner, C.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Urbanczik, R.
Right arrow Articles by Wagner, C.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?