**Abbreviations:**

**FBA**, flux balance analysis; **FVA**, flux variability analysis; **EFMs**, elementary flux modes; **ExPas**, extreme pathways; **CoPE-FBA**, comprehensive polyhedra enumeration flux balance analysis


- Open Access

### Timo R. Maarleveld,

- Life Sciences, Center for Mathematics and Computer Science, Amsterdam, The Netherlands
- Systems Bioinformatics, Amsterdam Institute for Molecules Medicines and Systems, VU University Amsterdam, Amsterdam, The Netherlands
- BioSolar Cells, Wageningen, The Netherlands

### Ruchir A. Khandelwal,

- Systems Bioinformatics, Amsterdam Institute for Molecules Medicines and Systems, VU University Amsterdam, Amsterdam, The Netherlands

### Brett G. Olivier,

- Systems Bioinformatics, Amsterdam Institute for Molecules Medicines and Systems, VU University Amsterdam, Amsterdam, The Netherlands

### Bas Teusink,

- Kluyver Centre for Genomics of Industrial Fermentation/NCSB, Delft, The Netherlands

### Prof. Frank J. Bruggeman

Corresponding author. E-mail address: f.j.bruggeman@vu.nl

- Kluyver Centre for Genomics of Industrial Fermentation/NCSB, Delft, The Netherlands

Systems Bioinformatics, VU University, De Boelelaan 1087, 1081 HV, Amsterdam, The Netherlands

- DOI: 10.1002/biot.201200291
- Cited by (CrossRef): 12 articles

Metabolic networks supply the energy and building blocks for cell growth and maintenance. Cells continuously rewire their metabolic networks in response to changes in environmental conditions to sustain fitness. Studies of the systemic properties of metabolic networks give insight into metabolic plasticity and robustness, and the ability of organisms to cope with different environments. Constraint-based stoichiometric modeling of metabolic networks has become an indispensable tool for such studies. Herein, we review the basic theoretical underpinnings of constraint-based stoichiometric modeling of metabolic networks. Basic concepts, such as stoichiometry, chemical moiety conservation, flux modes, flux balance analysis, and flux solution spaces, are explained with simple, illustrative examples. We emphasize the mathematical definitions and their network topological interpretations.


Metabolism is essentially a large network of coupled chemical conversions (reactions) catalyzed mostly by enzymes. In this process, nutrients are converted into building blocks, such as nucleotides, fatty acids, lipids, amino acids, and free-energy carriers, for the synthesis of macromolecules, such as DNA, RNA, and proteins. These macromolecules are required for the maintenance of cellular integrity and formation of new cells. Fundamental processes in metabolism are enzyme-catalyzed reactions. In a single reaction, substrates are converted into products and the number of atoms of a given type, such as C, H, O, N, P, or S, and the net charge should balance on each side of the equation [1]. These balancing principles are followed in genome-scale metabolic reconstructions [2]. Some aspects of balancing remain ambiguous, such as the protonation states of some of the metabolites, because these may depend on intracellular properties, such as pH and ionic strength. Every reaction occurs at a rate that depends on the concentrations of the enzyme, its reactants, and possibly a few effectors, and on the kinetic properties of the enzyme. Any reaction *j* can be written as Eq. (1):

$$\sum_{i=1}^{m} n^{-}_{ij}\, x_{i} \;\xrightarrow{\;v_{j}\;}\; \sum_{i=1}^{m} n^{+}_{ij}\, x_{i} \tag{1}$$

in which we consider a network with a total of *m* metabolites (reactants), denoted by x_{i}. The n^{+}_{ij} and n^{–}_{ij} coefficients denote product and substrate stoichiometric coefficients, respectively, and equal the number of molecules produced and consumed per unit reaction rate. The reaction rate is denoted by *v*_{j} and typical units are mM min^{–1} or mmol h^{–1} (g biomass)^{–1}. The net stoichiometric coefficient of metabolite *i* in reaction *j* is defined as *n*_{ij} = n^{+}_{ij} – n^{–}_{ij}.

The rates of change of the concentration of every metabolite can be equated in terms of reaction rates and net stoichiometric coefficients, which gives rise to the set of ordinary differential equations given by Eq. (2):

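$$\frac{d\mathbf{x}}{dt} = \mathbf{N}\,\mathbf{v} \tag{2}$$

The metabolite or state vector **x** is the column vector of the *m* metabolite concentrations x_{i}, the rate vector **v** collects the reaction rates *v*_{j}, and the stoichiometric matrix **N** has the net stoichiometric coefficients *n*_{ij} as its entries, with one row per metabolite and one column per reaction.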

Because our interest is in stoichiometric models, we will not further discuss the enzyme kinetics that enter the rate vector **v**; see [3]. The stoichiometric matrix is the principal object of study in stoichiometric modeling. Herein, we discuss basic analyses of the stoichiometric matrix.
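As a minimal sketch of how a stoichiometric matrix is assembled and used in Eq. (2), consider the following example. The three-reaction adenylate network, the reaction names, and the rate values are assumptions chosen for illustration; they are not taken from the figures of this review.

```python
# Hypothetical three-reaction adenylate network (illustrative assumption).
# Each reaction maps a metabolite to its net stoichiometric coefficient n_ij.
reactions = {
    "adenylate_kinase": {"ADP": -2, "ATP": 1, "AMP": 1},  # 2 ADP -> ATP + AMP
    "atpase":           {"ATP": -1, "ADP": 1, "P": 1},    # ATP -> ADP + P
    "atp_synthesis":    {"ADP": -1, "P": -1, "ATP": 1},   # ADP + P -> ATP
}
metabolites = ["ATP", "ADP", "AMP", "P"]


def stoichiometric_matrix(metabolites, reactions):
    """Return N as a list of rows: one row per metabolite, one column per reaction."""
    return [[rxn.get(met, 0) for rxn in reactions.values()] for met in metabolites]


def rates_of_change(N, v):
    """Evaluate dx/dt = N v for a given rate vector v."""
    return [sum(n_ij * v_j for n_ij, v_j in zip(row, v)) for row in N]


N = stoichiometric_matrix(metabolites, reactions)
v = [0.5, 2.0, 1.0]          # assumed reaction rates, e.g. in mM/min
dxdt = rates_of_change(N, v)
print(dict(zip(metabolites, dxdt)))
```

With these assumed rates, the rates of change of ATP, ADP, and AMP sum to zero, which anticipates the moiety conservation discussed next.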

In metabolism, some metabolites are solely recycled. Examples of such metabolites include ATP, NAD(P)H, and coenzyme A. As a consequence of recycling, the maximum concentration of those metabolites is constrained by the total concentration of a chemical moiety. For instance, in the case of phosphate and adenosine moiety conservation, the relationships given by Eq. (3) hold true at any moment in time:

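$$P_{T} = 3\,\mathrm{ATP} + 2\,\mathrm{ADP} + \mathrm{AMP} + \mathrm{P}, \qquad A_{T} = \mathrm{ATP} + \mathrm{ADP} + \mathrm{AMP} \tag{3}$$

with total phosphate and adenosine levels given as P_{T} and A_{T}. Taking the derivative of Eq. (3) with respect to time gives Eq. (4):

$$3\,\frac{d\,\mathrm{ATP}}{dt} + 2\,\frac{d\,\mathrm{ADP}}{dt} + \frac{d\,\mathrm{AMP}}{dt} + \frac{d\,\mathrm{P}}{dt} = 0, \qquad \frac{d\,\mathrm{ATP}}{dt} + \frac{d\,\mathrm{ADP}}{dt} + \frac{d\,\mathrm{AMP}}{dt} = 0 \tag{4}$$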

Equation (4) indicates linear relationships between the rows of the stoichiometry matrix and allows for the expression of the rate of change of one metabolite in terms of other rates of change [4]. In matrix form, we can write this as Eq. (5) for the general case of metabolites of a metabolic network:

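$$\frac{d\mathbf{x}^{D}}{dt} = \mathbf{L}_{0}\,\frac{d\mathbf{x}^{I}}{dt} \tag{5}$$

The vectors **x^{D}** and **x^{I}** contain the concentrations of the dependent and independent metabolites, respectively. Integration of Eq. (5) gives Eq. (6):

$$\mathbf{x}^{D} = \mathbf{L}_{0}\,\mathbf{x}^{I} + \mathbf{t} \tag{6}$$

with **t** as the vector of total concentrations of chemical moieties. By way of illustration, the vectors **x^{D}** and **x^{I}**, the matrix *L*_{0}, and the vector **t** for the example of Eq. (3), with ATP and ADP chosen as independent metabolites, are given by Eq. (7):

$$\mathbf{x}^{D} = \begin{bmatrix}\mathrm{AMP}\\ \mathrm{P}\end{bmatrix}, \quad \mathbf{x}^{I} = \begin{bmatrix}\mathrm{ATP}\\ \mathrm{ADP}\end{bmatrix}, \quad \mathbf{L}_{0} = \begin{bmatrix}-1 & -1\\ -2 & -1\end{bmatrix}, \quad \mathbf{t} = \begin{bmatrix}A_{T}\\ P_{T} - A_{T}\end{bmatrix} \tag{7}$$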

Using this *L*_{0} matrix, the dynamics of all metabolites (ATP, ADP, AMP, and P) can be obtained from the dynamics of the independent metabolites (ATP, ADP). In other words, the dependent species are redundant for determining the species dynamics. Note that different combinations of independent metabolites can be chosen. This can intuitively be seen from the relationships given in Eq. (3). For example, by choosing ATP and AMP as independent metabolites, the concentration of ADP can be determined from A_{T} = ATP + ADP + AMP. Subsequently, the concentration of P can be determined from P_{T} = 3ATP + 2 ADP + AMP + P.
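The recovery of the dependent metabolites from the independent ones can be sketched in a few lines. The entries of `L0` and `t` below are derived here from the two conservation relationships restated above (A_{T} = ATP + ADP + AMP and P_{T} = 3ATP + 2ADP + AMP + P); the concentration values are arbitrary assumptions.

```python
# Independent metabolites: (ATP, ADP); dependent metabolites: (AMP, P).
# Derived from A_T = ATP + ADP + AMP and P_T = 3 ATP + 2 ADP + AMP + P.
L0 = [[-1, -1],   # AMP = A_T - ATP - ADP
      [-2, -1]]   # P   = (P_T - A_T) - 2 ATP - ADP


def dependent_concentrations(x_indep, A_T, P_T):
    """Return [AMP, P] from [ATP, ADP] via x_D = L0 x_I + t."""
    t = [A_T, P_T - A_T]
    return [sum(l * x for l, x in zip(row, x_indep)) + t_k
            for row, t_k in zip(L0, t)]


# Assumed example values (arbitrary units; integers keep the arithmetic exact).
ATP, ADP = 3, 1
A_T, P_T = 5, 14
AMP, P = dependent_concentrations([ATP, ADP], A_T, P_T)
print(AMP, P)
```

Choosing a different pair of independent metabolites would change `L0` and `t`, but not the recovered concentrations.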

The relationship given in Eq. (5) dictates the decomposition of the stoichiometric matrix into two blocks, given by Eq. (8):

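$$\mathbf{N} = \begin{bmatrix}\mathbf{N}_{R}\\ \mathbf{N}_{0}\end{bmatrix} = \begin{bmatrix}\mathbf{I}\\ \mathbf{L}_{0}\end{bmatrix}\mathbf{N}_{R} \tag{8}$$

in which **N** is decomposed into blocks of independent rows, **N_{R}**, and dependent rows, **N_{0}** = *L*_{0}**N_{R}**, and **I** is the identity matrix. Equation (8) can be rearranged to Eq. (9):

$$\begin{bmatrix}-\mathbf{L}_{0} & \mathbf{I}\end{bmatrix}\mathbf{N} = \mathbf{0} \tag{9}$$

and indicates that the moiety conservation matrix in Eq. (10):

$$\begin{bmatrix}-\mathbf{L}_{0} & \mathbf{I}\end{bmatrix} \tag{10}$$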

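can be derived from the left null-space of the stoichiometry matrix [7, 8]. Typically, **N_{R}** is identified in **N** by row reduction (for example, Gaussian elimination), which selects a set of linearly independent rows.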

In genome-scale models, a biomass reaction is typically used to describe cell growth. This biomass reaction is used as a sink for biomass precursors (e.g. DNA, RNA, proteins, lipids) that together define the biomass composition of the cell. These biomass precursors contain moieties, such as adenosine, which therefore have to be synthesized continuously. A non-zero flux through the biomass reaction thus drains these moieties. Hence, in such a genome-scale metabolic model, a strict application of moiety conservation detection along the lines detailed above will detect fewer conserved moieties. Yet these moieties remain relevant for understanding the dynamics of metabolic pathways, because the turnover of ATP is much larger than the rate at which the adenosine moiety is synthesized.
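The statement that conservation relations lie in the left null-space of the stoichiometry matrix can be checked numerically. The sketch below assumes a hypothetical adenylate network without a biomass drain (three reactions, rows ordered ATP, ADP, AMP, P) and the adenosine/phosphate choice of independent metabolites (ATP, ADP); neither is taken from the figures.

```python
# Rows: ATP, ADP, AMP, P. Columns: hypothetical reactions
# 2 ADP -> ATP + AMP, ATP -> ADP + P, ADP + P -> ATP.
N = [[ 1, -1,  1],
     [-2,  1, -1],
     [ 1,  0,  0],
     [ 0,  1, -1]]

# L0 for independent (ATP, ADP) and dependent (AMP, P) metabolites.
L0 = [[-1, -1],
      [-2, -1]]

# Conservation matrix [-L0  I]: one row per conserved moiety.
conservation = [[-l for l in row] + [int(k == i) for k in range(len(L0))]
                for i, row in enumerate(L0)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


product = matmul(conservation, N)
print(conservation)  # [[1, 1, 1, 0], [2, 1, 0, 1]]
print(product)       # all zeros: every row is a left null-space vector of N
```

Adding a column that consumes ATP without regenerating the adenosine moiety (as a biomass reaction would) makes the first product row non-zero, which is the loss of conserved moieties described above.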

By definition, at a steady state of the metabolic network, Eq. (11) holds:

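$$\mathbf{N}_{R}\,\mathbf{J} = \mathbf{0} \tag{11}$$

Here we used the convention that the reaction rate vector, **v**, at steady state is denoted by **J**, the flux vector. Partitioning **J** into dependent fluxes, **J^{D}**, and independent fluxes, **J^{I}**, and partitioning the columns of **N_{R}** accordingly into an invertible block **N_{R}^{D}** and a block **N_{R}^{I}**, gives Eq. (12):

$$\mathbf{N}_{R}\,\mathbf{J} = \begin{bmatrix}\mathbf{N}_{R}^{D} & \mathbf{N}_{R}^{I}\end{bmatrix}\begin{bmatrix}\mathbf{J}^{D}\\ \mathbf{J}^{I}\end{bmatrix} = \mathbf{0} \quad\Rightarrow\quad \mathbf{J}^{D} = -\left(\mathbf{N}_{R}^{D}\right)^{-1}\mathbf{N}_{R}^{I}\,\mathbf{J}^{I} \tag{12}$$

Thus, the right null-space of the (reduced) stoichiometry matrix equals the kernel matrix, given by Eq. (13):

$$\mathbf{K} = \begin{bmatrix}-\left(\mathbf{N}_{R}^{D}\right)^{-1}\mathbf{N}_{R}^{I}\\ \mathbf{I}\end{bmatrix}, \qquad \mathbf{N}_{R}\,\mathbf{K} = \mathbf{0} \tag{13}$$

Note that the columns of **N_{R}** may have to be reordered to write the null-space in this form. In addition, each column of **K** is itself a steady-state flux vector.

If we denote the *i*th column of **K** by *k*_{i}, any steady-state flux vector can be written as Eq. (14):

$$\mathbf{J} = \sum_{i} \alpha_{i}\,\mathbf{k}_{i} \tag{14}$$

in which the weighting coefficients, *α*_{i}, can take any value. The set of all flux vectors of the metabolic network is contained within the null-space of **N_{R}**.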

The vectors *k*_{i} have a network topological interpretation. They represent routes through the network along which every metabolite is at steady state, if the fluxes carry the values dictated by *k*_{i}. This is why the *k*_{i} terms are often called flux modes. In Fig. 1A, a toy metabolic network is shown. It contains 26 reactions and 23 metabolites. The external metabolites T, U, X, and Y are considered to be fixed; the other 19 metabolites are considered to be variable. The stoichiometry matrix has full rank and, therefore, no conserved chemical moieties occur. The number of independent fluxes equals seven (= 26 – 19). Thus, seven flux modes exist and they are displayed in Fig. 1B. The color codes of the reactions indicate reaction rate values and it can be easily verified that all variable metabolites are at steady state for all flux modes. Because all metabolites along a flux mode are required to be at steady state, the flux modes have to be either cycles or routes from source to sink metabolites.
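The counting argument (number of flux modes = number of reactions minus the rank of the stoichiometry matrix) can be illustrated with a small null-space computation. The five-reaction branched network below is a hypothetical stand-in for the toy network of Fig. 1A, and the Gauss-Jordan routine is a simple sketch using exact fractions rather than a production-grade solver.

```python
from fractions import Fraction


def null_space(N):
    """Return a basis of {J : N J = 0} as a list of vectors (Gauss-Jordan elimination)."""
    rows = [[Fraction(x) for x in row] for row in N]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in row order
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        # Find a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free:
        k = [Fraction(0)] * n_cols
        k[fc] = Fraction(1)            # one independent flux set to one
        for row_idx, pc in enumerate(pivots):
            k[pc] = -rows[row_idx][fc]  # dependent fluxes follow from steady state
        basis.append(k)
    return basis


# Hypothetical network: R1: -> A, R2: A -> B, R3: A -> C, R4: B ->, R5: C ->
N = [[1, -1, -1,  0,  0],   # A
     [0,  1,  0, -1,  0],   # B
     [0,  0,  1,  0, -1]]   # C
K = null_space(N)
for k in K:
    print([int(x) for x in k])
```

Here five reactions and three independent metabolites give two flux modes, each a route from source to sink (via B or via C).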

Note that the flux modes do not necessarily agree with thermodynamics, since several irreversible reactions are forced to have a negative flux. In addition, the flux mode in the upper-right part of Fig. 1B is complex and could be swapped for a simpler flux mode if desired. This indicates the problems associated with analysis of the null-space and explains why alternative definitions for steady-state flux routes in metabolic networks have been developed; these are introduced and discussed in sections 4 to 9. These alternative definitions are unique representations of the null-space and agree with the thermodynamic preference of the reactions.

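In Section 3, we discussed the steady-state relationship **N_{R}J** = **0**. Flux balance analysis (FBA) searches the steady-state flux space, restricted by lower and upper flux bounds, for a flux distribution that optimizes a linear objective function, which gives the linear program in Eq. (15):

$$Z_{obj} = \max_{\mathbf{J}}\ \mathbf{c}^{T}\mathbf{J} \quad \text{subject to} \quad \mathbf{N}_{R}\,\mathbf{J} = \mathbf{0}, \quad \mathbf{J}^{lb} \le \mathbf{J} \le \mathbf{J}^{ub} \tag{15}$$

in which vector **c** dictates a linear relationship between fluxes of **J** that defines the objective function, and **J^{lb}** and **J^{ub}** are the vectors of lower and upper flux bounds.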

Performing FBA on the toy metabolic model shown in Fig. 1A, with the input flux constrained to R01 ≤ 1 and maximization of the flux through R26 as the objective function, yields an optimal flux distribution; one such distribution is presented in the upper-left part of Fig. 1B. This example indicates that the flux distribution that results from FBA can be a flux mode of the stoichiometric matrix, or a linear combination of flux modes.

We also performed FBA on the iAF1260 model of *Escherichia coli* with input fluxes representing a mineral medium [12] supplemented with glucose (uptake flux of 8 mmol g^{–1}h^{–1}), in the presence of oxygen (uptake flux of 18.5 mmol g^{–1}h^{–1}) and free exchange of ammonia, water, carbon dioxide, protons, phosphate, sulfate, and other metal ions. As the metabolic objective, we considered maximization of the rate of biomass synthesis. This optimization predicts a growth rate of 0.73 h^{–1} and a flux distribution in which 82% of the reactions are inactive; only 18% of the reactions of the whole metabolic network are used (Fig. 2B).
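To make the FBA linear program concrete, the sketch below solves it for a very small network. Everything in it is an assumption made for illustration: the five-reaction network is hypothetical (it is neither the toy model of Fig. 1A nor iAF1260), and the solver is a naive vertex enumeration with exact fractions, which is only practical for tiny problems with finite bounds and a stoichiometry matrix of full row rank. Genome-scale FBA relies on dedicated linear programming solvers instead.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_square(A, b):
    """Solve the square system A x = b exactly; return None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            return None
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * p_c for a, p_c in zip(M[i], M[c])]
    return [row[n] for row in M]


def fba(N, lb, ub, c):
    """Maximise c.J subject to N J = 0 and lb <= J <= ub by enumerating vertices."""
    n, m = len(lb), len(N)
    best = None
    for fixed in combinations(range(n), n - m):           # fluxes pinned to a bound
        for values in product(*[(lb[j], ub[j]) for j in fixed]):
            A = [list(row) for row in N]
            b = [0] * m
            for j, val in zip(fixed, values):
                A.append([int(k == j) for k in range(n)])
                b.append(val)
            J = solve_square(A, b)
            if J is None or any(x < l or x > u for x, l, u in zip(J, lb, ub)):
                continue                                   # singular or infeasible
            Z = sum(ci * x for ci, x in zip(c, J))
            if best is None or Z > best[0]:
                best = (Z, J)
    return best


# Hypothetical network:
# R1: -> A, R2: A -> B, R3: A -> B, R4: 2 A -> B (lower yield), R5: B ->
N = [[1, -1, -1, -2,  0],   # A
     [0,  1,  1,  1, -1]]   # B
lb = [0, 0, 0, 0, 0]
ub = [1, 10, 10, 10, 10]    # uptake flux R1 <= 1
c = [0, 0, 0, 0, 1]         # objective: flux through R5
Z, J = fba(N, lb, ub, c)
print(Z, [str(x) for x in J])
```

The optimum uses the full uptake flux and avoids the lower-yield reaction R4. R2 and R3 are interchangeable, so the returned flux distribution is only one of several optimal solutions.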

FVA maximizes and minimizes each flux of the metabolic network, while satisfying all given constraints at the optimal objective function value [13]. It is a useful tool to gain an insight into network flexibility. It gives the span of the fluxes that exist within the optimal solution space defined by the linear program given in Eq. (15). The linear program for FVA can be written by Eq. (16):

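$$\max_{\mathbf{J}} / \min_{\mathbf{J}}\ J_{i} \quad \text{subject to} \quad \mathbf{N}_{R}\,\mathbf{J} = \mathbf{0}, \quad \mathbf{c}^{T}\mathbf{J} = Z_{obj}, \quad \mathbf{J}^{lb} \le \mathbf{J} \le \mathbf{J}^{ub} \tag{16}$$

in which *Z*_{obj} is the objective function value of the previous FBA program, and **J^{lb}** and **J^{ub}** are the lower and upper flux bounds. By fixing the objective function value at the value obtained from the FBA optimization, FVA determines the range for each flux within which all numerical values are valid FBA solutions. Using the results of FVA optimization, we can determine the spans (= |*J*_{i}^{max} – *J*_{i}^{min}|) of all fluxes, in which *J*_{i}^{max} and *J*_{i}^{min} are the maximal and minimal values of flux *i* found by FVA.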

In Fig. 2A, the absolute spans resulting from FVA for all fluxes in the toy metabolic model are shown. The constraints for this FVA are identical to those of the FBA calculations for this model discussed above, and infinite flux bounds are represented by a large value of 1000. This analysis shows reactions that have a fixed flux and a span of zero (gray arrows) and reactions that have variable fluxes with a span of one (purple arrows). Here, reactions with a span of zero are either active (essential) reactions, because alternative optimal paths are not present, or inactive (non-essential) reactions, because they yield a sub-optimal FBA solution (R25). Some reactions (red arrows) have large spans because they are part of metabolic cycles (R02–R04; R14 and R19–R21; and R23–R24). In any optimal FBA solution with a maximal flux of one through R01, a net flux of one is required from R01 to R05. Therefore, the allowed flux values of R02–R04 are between –999 and 1000 (span = 1999); a flux of –999 through R02 results in a flux of 1000 through R03 and R04 and vice versa. In contrast, no net flux is required through the second metabolic cycle (R14 and R19–R21) to obtain an optimal FBA solution. For example, a flux of one through R16–R18 results in no flux through R13–R15 in any optimal FBA solution. As a result, the reactions of this metabolic cycle can operate at their maximum bounds in both directions without violating optimal metabolic functioning (span = 2000). Also, a net flux of one is required from R22 to R26. Because R24 is irreversible, the maximal flux that R23 can attain in any optimal solution is one (with *J*_{R24} = 0). Since R23 is reversible, an optimal FBA solution can also be obtained if the fluxes through R23 and R24 are –999 and 1000, respectively. As a consequence, the spans of R23 and R24 are 1000 (R23: –999 to 1 and R24: 0 to 1000).

We also performed FVA on the *E. coli* model iAF1260 with the same constraints and objective function value as the FBA optimization described above. Analyzing the spans of all fluxes revealed that only 6% of the reactions could vary in the optimal solution space. Of the 94% fixed reactions, 16% carry a non-zero flux and the remaining 84% are inactive (Fig. 2C). This means that 21.04% (6% + 0.94 × 16%) of the reactions can have a non-zero flux. The percentage of non-zero fluxes in an FBA outcome (Fig. 2B) will be equal or lower, because some variable fluxes can also be zero in an FBA solution.
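The percentage quoted above follows from a short calculation, reproduced here with the rounded fractions given in the text (variable names are ours):

```python
variable = 0.06                 # fraction of reactions with a non-zero span
fixed = 1 - variable            # fraction of reactions with a fixed flux (0.94)
fixed_active = 0.16 * fixed     # fixed reactions that carry a non-zero flux
can_be_nonzero = variable + fixed_active

fba_active = 0.18               # fraction of active reactions in the FBA solution
print(f"{can_be_nonzero:.2%}")  # upper limit on active reactions in any optimum
print(fba_active <= can_be_nonzero)
```

The 18% of active reactions found in the single FBA solution is consistent with this upper limit.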

Further analysis of the variable fluxes (6% of total fluxes) revealed that 49% of them had a finite span. The remainder (51% of fluxes with infinite spans) get their variability due to metabolic cycles in the metabolic network. In section 9 we explain how these cycles can be interpreted. The spans of some of the reactions in the model, as obtained by FVA, are shown in Fig. 2D.
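The FVA procedure can be sketched on a small example as well. The code below is an illustration under stated assumptions: the five-reaction network is hypothetical, and the linear programs are solved by naive vertex enumeration with exact fractions (adequate only for tiny problems with finite bounds and linearly independent equality constraints), not by the solvers used for genome-scale models.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_square(A, b):
    """Solve the square system A x = b exactly; return None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            return None
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * p_c for a, p_c in zip(M[i], M[c])]
    return [row[n] for row in M]


def optimum(A_eq, b_eq, lb, ub, c, sign):
    """Return the maximum (sign=+1) or minimum (sign=-1) of c.J over the polytope."""
    n, m = len(lb), len(A_eq)
    best = None
    for fixed in combinations(range(n), n - m):
        for values in product(*[(lb[j], ub[j]) for j in fixed]):
            A = [list(row) for row in A_eq]
            b = list(b_eq)
            for j, val in zip(fixed, values):
                A.append([int(k == j) for k in range(n)])
                b.append(val)
            J = solve_square(A, b)
            if J is None or any(x < l or x > u for x, l, u in zip(J, lb, ub)):
                continue
            value = sum(ci * x for ci, x in zip(c, J))
            if best is None or sign * value > sign * best:
                best = value
    return best


# Hypothetical network:
# R1: -> A, R2: A -> B, R3: A -> B, R4: 2 A -> B (lower yield), R5: B ->
N = [[1, -1, -1, -2,  0],
     [0,  1,  1,  1, -1]]
lb, ub = [0, 0, 0, 0, 0], [1, 10, 10, 10, 10]
c = [0, 0, 0, 0, 1]

Z_obj = optimum(N, [0, 0], lb, ub, c, +1)           # FBA step
A_fva, b_fva = N + [c], [0, 0, Z_obj]               # add the constraint c.J = Z_obj
spans = []
for i in range(len(lb)):
    e_i = [int(k == i) for k in range(len(lb))]
    spans.append(optimum(A_fva, b_fva, lb, ub, e_i, +1)
                 - optimum(A_fva, b_fva, lb, ub, e_i, -1))
print([str(s) for s in spans])
```

In this example the uptake and output reactions are fixed and active (span zero), the lower-yield reaction R4 is fixed and inactive (span zero), and the two interchangeable reactions R2 and R3 each have a span of one, which mirrors the classes of reactions described for the toy model.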

Two sensitivity parameters – reduced costs and shadow prices – are associated with an FBA solution. A reduced cost (*r*_{i}) can be interpreted as the sensitivity of the objective function with respect to the change in the flux through reaction *i*, as given by Eq. (17):

$$r_{i} = \frac{\partial Z_{obj}}{\partial J_{i}} \tag{17}$$

A shadow price (*γ _{i}*) is the sensitivity of the objective function with respect to the change in a constraint [17]. Consequently, if metabolite

Elementary flux modes (EFMs) and extreme pathways (ExPas) were developed to uniquely characterize the right null-space **K** of a stoichiometry matrix. In contrast to FBA-related techniques, EFM and ExPa analyses are only based on network stoichiometry and therefore allow an unbiased analysis without imposing an optimization principle. These definitions rely on a convex set of flux vectors [19, 20]. By taking a convex combination of these flux vectors (*e*_{i}), any possible steady-state flux distribution (**J**) can be generated. Assuming that we have *n* such flux vectors, **J** can be written as Eq. (18):

$$\mathbf{J} = \sum_{i=1}^{n} \alpha_{i}\,\mathbf{e}_{i} \tag{18}$$

Here *α*_{i} are non-negative weighting coefficients that total one (Eq. 19):

$$\sum_{i=1}^{n} \alpha_{i} = 1, \qquad \alpha_{i} \ge 0 \tag{19}$$

Both EFMs and ExPas can be exploited to evaluate, for instance, pathway redundancy, to find (sub-)optimal pathways for the investigation of pathway properties, such as cost and length, and to study the effect of gene deletions [21–23]. Unfortunately, both approaches suffer from excessive running times, because characterizing the right null-space of the stoichiometric matrix in this way is a non-deterministic polynomial-time (NP)-hard computational problem.

EFMs [19] fulfill three conditions: (i) (pseudo-)steady state, (ii) thermodynamic feasibility, and (iii) non-decomposability. These conditions have several consequences. First, internal metabolites of an EFM are neither net consumed nor net produced, due to the steady-state condition. Second, all flux rates of an EFM are thermodynamically feasible, in contrast to the flux modes, *k*_{i}. Third, no proper subset of the reactions of an EFM can by itself fulfill the first two conditions. The complete set of EFMs can be partitioned into three types: (I) all optimal yield pathways converting one or more substrates into a product (e.g. biomass), (II) all sub-optimal yield pathways converting one or more substrates to a product, and (III) internal loops in the metabolic network.

The toy metabolic network shown in Fig. 1A contains 28 EFMs, as shown in Fig. 3. In this toy model, we can identify all three types of EFMs earlier defined. To begin with, there are 24 type I EFMs that characterize all optimal pathways, which are shown in Fig. 3A–X. Any route from metabolite X to Y, ignoring reaction R25, is a type I EFM. In this network, there is only one type II EFM that gives a sub-optimal yield (Fig. 3Y). Thus, any route from metabolite X to Y, involving reaction R25, results in a sub-optimal yield because R25 has stoichiometry . Finally, there are three type III EFMs that characterize the internal loops of this metabolic network, as shown in Fig. 3Z–AB. Generally, these cycles are responsible for the large number of EFMs. To illustrate this point, without these three cycles the toy metabolic network has only five EFMs. Then, there would be four EFMs to characterize all optimal pathways, one EFM to characterize the sub-optimal pathway, and zero EFMs to characterize internal loops.

The alternative approach, ExPas, determines the edges of the cone that describe the steady-state solution space and the thermodynamic preference of reactions [20]. The set of ExPas does not have to contain all pathways with an optimal and sub-optimal yield, in contrast to the EFMs. Convex combinations of ExPas that satisfy the three EFM conditions, however, can be used to obtain all optimal and sub-optimal pathways. In addition to the three conditions of EFMs, ExPas require two additional conditions: (iv) network reconfiguration and (v) systemic independence.

Network reconfiguration results in a classification of each reaction as an internal or exchange reaction. Moreover, each internal reversible reaction is split into two irreversible reactions: a reaction describing the forward reaction and a reaction describing the backward reaction. Systemic independence guarantees that an ExPa cannot be represented by a non-negative linear combination of other ExPas. Because of the systemic independence condition, ExPas are always a subset of the EFMs. In other words, each ExPa is also an EFM, but not necessarily vice versa. This can result in fewer ExPas than EFMs for the same metabolic network. For a metabolic model of the human red blood cell, the average number of EFMs used for a given ExPa was about four [24]. However, if all exchange reactions in a metabolic network are irreversible, the sets of relevant EFMs and ExPas are identical. This is a general property of EFMs and ExPas [24, 25]. Note that each originally reversible internal reaction, split into two irreversible reactions, fulfills all ExPa and EFM conditions, resulting in additional ExPas and EFMs. These ExPas and EFMs can be considered irrelevant [24, 25] because they only redefine reversibility. In more mathematical terms, the EFMs and ExPas are the extreme rays that span the flux cone, *C*, defined by $C = \{\mathbf{J} \mid \mathbf{N}_{R}\,\mathbf{J} = \mathbf{0},\ J_{i} \ge 0 \text{ for every irreversible reaction } i\}$.

Because the exchange reactions (R01 and R26) in this toy metabolic network are irreversible, the sets of relevant EFMs and ExPas are identical. Nevertheless, the network contains 38 ExPas and 28 EFMs. The additional 10 ExPas arise because of the network reconfiguration condition. Each of these 10 ExPas (not shown) is also an EFM that will be detected if they are determined after reconfiguring the network.

FBA can be exploited to calculate the maximum yield of a product on a certain substrate. FBA simulation provides a steady-state flux distribution, which corresponds to a point in the (optimal) solution space. Typically, a unique optimal steady-state flux distribution through the metabolic network cannot be guaranteed because the constraints defined by the stoichiometric network are insufficient. Accordingly, a solution space of optimal steady-state flux distributions exists, each of which gives rise to the maximal yield. This solution space represents a polyhedron [26] and is considerably smaller than the entire steady-state solution space characterized by flux modes, EFMs, or ExPas. This reduction in solution space is achieved in FBA by the consideration of additional constraints, a particular nutrient environment, and the demand for flux distributions that optimize a metabolic objective. Characterizing the optimal solution space of FBA remains an NP-hard computational problem. Above, we characterized the variability of the flux values within the optimal solution space. Next, we characterize this solution space in network topological terms.

In contrast to both the EFMs and ExPas approaches, comprehensive polyhedra enumeration flux balance analysis (CoPE-FBA) characterizes only the optimal solution space, which is done in terms of a compact set of sub-networks [27]. These sub-networks account for all alternative flux distributions in the optimal steady state predicted by FBA. CoPE-FBA therefore provides the topological structure underneath flux variability, at least in the optimal solution. The solution space of optimal flux distributions contains three topological features: (i) vertices, (ii) rays, and (iii) linealities.

Vertices are optimal paths of the metabolic network, including reactions with fixed and variable fluxes. Rays are irreversible, thermodynamically infeasible cycles and linealities are reversible cycles in the metabolic network. No net conversion occurs in either the rays or linealities of the FBA polyhedron. The toy metabolic network has four vertices (Fig. 4B–E), one ray, and two linealities (Fig. 4A). Note that the EFMs and ExPas also consist of these three topological features. Therefore, rays and linealities are responsible for the increase in the number of type I and II EFMs and ExPas. In contrast, rays and linealities do not influence the number of CoPE-FBA sub-networks.

Any optimal flux distribution that satisfies the metabolic optimum obtained in the FBA calculation can be written in terms of the vertices, rays, and linealities using the Minkowski sum given in Eq. (20) [26, 27],

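$$\mathbf{J} = \sum_{i} \alpha_{i}\,\boldsymbol{\phi}_{i} + \sum_{i} \beta_{i}\,\boldsymbol{\varphi}_{i} + \sum_{i} \gamma_{i}\,\boldsymbol{\psi}_{i} \tag{20}$$

in which the vectors *ϕ*_{i}, *φ*_{i}, and *ψ*_{i} represent the vertices, rays, and linealities, respectively. The weighting coefficients obey the following restrictions: $\sum_{i} \alpha_{i} = 1$, *β*_{i} ≥ 0, *α*_{i} ≥ 0, and *γ*_{i} can take any real value.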
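As an illustration of the Minkowski sum, the sketch below combines vertices with convex weights, rays with non-negative weights, and linealities with unrestricted weights. The vectors are hypothetical and chosen only to show the bookkeeping; they are not the vertices, rays, or linealities of Fig. 4.

```python
from fractions import Fraction


def minkowski_point(vertices, alpha, rays, beta, linealities, gamma):
    """Return sum(alpha_i * vertex_i) + sum(beta_i * ray_i) + sum(gamma_i * lineality_i)."""
    if sum(alpha) != 1 or any(a < 0 for a in alpha):
        raise ValueError("vertex weights must be non-negative and sum to one")
    if any(b < 0 for b in beta):
        raise ValueError("ray weights must be non-negative")
    J = [0] * len(vertices[0])
    for weights, vectors in ((alpha, vertices), (beta, rays), (gamma, linealities)):
        for w, vec in zip(weights, vectors):
            J = [j + w * x for j, x in zip(J, vec)]
    return J


# Hypothetical vectors over five reactions (illustrative only).
vertices = [[1, 1, 0, 0, 0], [1, 0, 1, 0, 0]]   # two alternative optimal routes
rays = [[0, 0, 0, 1, 1]]                        # one irreversible two-reaction cycle
linealities = []                                # no reversible cycle in this example

J = minkowski_point(vertices, [Fraction(1, 4), Fraction(3, 4)],
                    rays, [5], linealities, [])
print([str(x) for x in J])
```

The cycle flux (weight 5 here) can be scaled freely without changing the net conversion carried by the two routes, which is why rays and linealities inflate flux spans without adding new optimal routes.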

The sub-networks that can be identified with CoPE-FBA and explain the numbers of vertices for a given FBA problem satisfy three conditions: (i) only reactions belonging to a specific sub-network display correlation in flux values across the optimal solution space, (ii) fixed net input–output stoichiometry of reactants, and (iii) thermodynamic feasibility.

As a result, these sub-networks contain reactions that vary independently across all vertices of the optimal solution space. Therefore, without violating the optimality condition, sub-networks with alternative internal flux distributions can be independently chosen. For this reason, the number of vertices can be determined by multiplying the number of alternative internal flux distributions for each sub-network [27]. This illustrates the likely combinatorial explosion for the number of vertices of the optimal solution space for larger metabolic networks.

The toy metabolic network contains two CoPE-FBA sub-networks, given in Fig. 4F. Each sub-network has two alternative internal flux distributions: the top and bottom branch. Multiplying the number of alternative flux distributions for each sub-network, 2 × 2 = 4, gives the number of vertices. Larger metabolic models tend to contain more vertices, while the number of sub-networks stays small. For instance, the genome-scale metabolic model iAF1260, consisting of 2374 reactions and 1668 metabolites, has about 1.7 × 10^{6} vertices when studied under glucose growth conditions [27]. Still, only four sub-networks, which contain about 5% of the total number of reactions in this model, are enough to characterize the optimal solution space [27]. Comparing the number of EFMs (*i*), ExPas (*j*), vertices (*k*), and CoPE-FBA sub-networks (*m*) gives an indication of the level of compactness of these approaches. Typically, for larger models the number of EFMs, ExPas, and vertices will explode, which gives *i* ≥ *j* > *k* >> *m*.
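The multiplication rule for the number of vertices is easy to state in code. The sketch below uses the toy network counts from the text; the helper name is ours.

```python
from math import prod


def number_of_vertices(alternatives_per_subnetwork):
    """Vertices of the optimal solution space = product of alternatives per sub-network."""
    return prod(alternatives_per_subnetwork)


toy = [2, 2]   # two sub-networks, each with a top and a bottom branch
print(number_of_vertices(toy))
```

Because the count is a product, adding one more sub-network with two alternatives doubles the number of vertices, while the number of sub-networks grows by only one.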

The predictions made by any mathematical model depend heavily on the underlying assumptions. The definitions of ExPas, EFMs, and those related to FBA have the steady-state assumption in common. In general, the steady-state assumption is assumed to be valid because of the timescale separation between (fast) intracellular metabolic conversions and (slow) genetic regulation [28, 29].

In addition to the steady-state assumption, FBA assumes optimization of an objective function, which could, in some cases, be debatable from a biological perspective. Typical objective functions are the yield of the biomass reaction or ATP production. Optimization of these objectives is always bounded by capacity constraints of other reactions that ultimately bound the steady-state solution space. In other words, FBA optimizes an objective function relative to a limiting input flux. Thus, optimization of any reaction rate in FBA is always the optimization of a yield defined as the objective reaction rate divided by the limiting input. Optimization of growth rate rather than growth yield is a completely different strategy; this can be easily understood because the yield does not fix the rates of the metabolic processes. Selection for yield only occurs in the absence of competition for nutrients, which is an unlikely scenario in biology. The assumption of one objective may actually not always reflect reality: the occurrence of trade-offs between two metabolic objectives may cause cells to optimize both of them simultaneously (possibly, with different weights), leading to Pareto optimization problems [30].

We provided an overview of the most common mathematical techniques used in the stoichiometric analysis of metabolic networks. We have not described in any detail the application of these techniques to biological problems, which can be found elsewhere [31–33]. These applications to biology are the reasons for the existence of pathway analysis, and there are a number of success stories [34, 35]. Yet, the simplifications and subsequent limitations of the described techniques are also clear, and extensions to pathway analysis methods include the incorporation of dynamics (such as in dynamic FBA [36]), additional constraints (such as space or resource limitations [37–39]), multidimensional optimality [30], and extensions to multi-species FBA [40–42]. It is therefore to be expected that such analysis will penetrate biology in increasingly many ways to provide rigorous and quantitative hypotheses and fundamental understanding.

T.R.M. acknowledges funding from the project BioSolar Cells, co-financed by the Dutch Ministry of Economic Affairs. B.O., R.A.K., and F.J.B. acknowledge the NWO funded project MEMESA (number: 632100021). B.T. and B.O. further acknowledge the Netherlands Genomics Initiative and ZonMW for Zenith grant 40-41009-98-10038.

The authors declare no conflict of interest.


Copyright © 1999 - 2017 John Wiley & Sons, Inc. All Rights Reserved