Detecting and investigating substrate cycles in a genome-scale human metabolic network


  • Juliane Gebauer,

    1.  Department of Bioinformatics, School of Biology and Pharmaceutics and JenAge Research Core, Friedrich Schiller University of Jena, Germany
    2.  Research Group Theoretical Systems Biology, School of Biology and Pharmaceutics, Friedrich Schiller University of Jena, Germany
  • Stefan Schuster,

    1.  Department of Bioinformatics, School of Biology and Pharmaceutics and JenAge Research Core, Friedrich Schiller University of Jena, Germany
  • Luís F. de Figueiredo,

    1.  Department of Bioinformatics, School of Biology and Pharmaceutics and JenAge Research Core, Friedrich Schiller University of Jena, Germany
    • These authors contributed equally to this work

    • Present address
      Cheminformatics and Metabolism, European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Cambridge, UK

  • Christoph Kaleta

    1.  Department of Bioinformatics, School of Biology and Pharmaceutics and JenAge Research Core, Friedrich Schiller University of Jena, Germany
    2.  Research Group Theoretical Systems Biology, School of Biology and Pharmaceutics, Friedrich Schiller University of Jena, Germany
    • These authors contributed equally to this work

C. Kaleta, Research Group Theoretical Systems Biology, School of Biology and Pharmaceutics, Leutragraben 1, Friedrich Schiller University of Jena, D-07743 Jena, Germany
Fax: +49 3641 949595
Tel: +49 3641 949590


Substrate cycles, also known as futile cycles, are cyclic metabolic routes that dissipate energy by hydrolysing cofactors such as ATP. They were first described to occur in the muscles of bumblebees and brown adipose tissue in the 1970s. A popular example is the conversion of fructose 6-phosphate to fructose 1,6-bisphosphate and back. In the present study, we analyze a large number of substrate cycles in human metabolism that consume ATP and discuss their statistics. For this purpose, we use two recently published methods (i.e. EFMEvolver and the K-shortest EFM method) to calculate samples of 100 000 and 15 000 substrate cycles, respectively. We find an unexpectedly high number of substrate cycles in human metabolism, with up to 100 reactions per cycle, utilizing reactions from up to six different compartments. An analysis of tissue-specific models of liver and brain metabolism shows that there is selective pressure that acts against the uncontrolled dissipation of energy by avoiding the coexpression of enzymes belonging to the same substrate cycle. This selective force is particularly strong against futile cycles that have a high flux as a result of thermodynamic principles.


Abbreviations: CA, cytosol of astrocytes; CN, cytosol of neurones; EFM, elementary flux mode


Substrate cycles, also known as futile cycles, comprise pathways with a cyclic flow that result in the dissipation of energy but do not perform any anabolic or catabolic transformation. In the simplest case, substrate cycles correspond to the interconversion between two substrates. The driving force of these cycles is an exergonic process usually involving the permanent hydrolysis of energy-rich cofactors such as ATP. The function of these cycles is not yet completely understood, although the experimental observation of these cycles has allowed various hypotheses to be put forward: thermogenesis [1] in brown adipose tissue [2] as well as in muscles of bumble-bees [3], improved regulation [4], and substrate cycles as a buffering mechanism in metabolic precursor supply [5] as well as in thermodynamic buffering [6].

Typical examples of substrate cycles occur in carbohydrate catabolism and can be found in important pathways such as glycolysis [7] and anaplerotic reactions [8,9]. Pairs of kinase and phosphatase reactions such as hexokinase/glucose 6-phosphatase (glucose 6-phosphate cycle) and phosphofructokinase-1/fructose 1,6-bisphosphatase (fructose 1,6-bisphosphate cycle) are often found in many organisms and cell types, including human hepatocytes and adipocytes [7,10–14]. Interestingly, an analogous cycle involving fructose 2,6-bisphosphate, which is an important effector in the control of carbohydrate metabolism, is catalyzed by a single, bifunctional enzyme [15]. Another important substrate cycle in Escherichia coli, the pyruvate-phosphoenolpyruvate (pyr-pep) cycle, involves three enzymes, namely pyruvate kinase, pyruvate carboxylase and phosphoenolpyruvate carboxykinase [8,16,17], and functions as a sensitive mechanism to modulate carbohydrate metabolism and energy supply [18,19].

To date, only a limited number of substrate cycles are known in human metabolism and these are relatively short. However, some of these involve more than three reactions. Whittaker and Botha [20] have shown experimentally that a cycle of sucrose synthesis and degradation exists in sugar cane, involving up to five reactions. A theoretical analysis of this metabolic network revealed several substrate cycles, some of them involving up to five reactions [21].

In a network of human nucleotide metabolism, the so-called oxypurine cycle [22] exists, which involves six enzymes. This cycle can also be revealed by computer simulation [23] (see below). Berman and Human [22] proposed that the main function of the oxypurine cycle is regulation of the 5-phosphoribosyl 1-phosphate level in human erythrocytes and the control of purine base uptake and hypoxanthine release. The sensitive regulation of this cycle is achieved at the expense of ATP hydrolysis, and therefore small changes in pH, concentration of inorganic phosphate or oxygen tension can result in a flux change.

Substrate cycles were proposed as biochemical markers of ageing several decades ago, on the grounds that a defect in the regulation of these cycles could cause an aberrant carbohydrate metabolism [24]. Substrate cycles have also been linked to ageing through their role in decreasing reactive oxygen species [25,26]. One example is a substrate cycle that consists of the leak of protons through the mitochondrial membrane, thus depleting the proton gradient established by NADH oxidation. The proton leak is responsible for up to 25% of the energy dissipation through respiration in hepatocytes [26] and it is assumed to favour the oxidation state of ubiquinone and, consequently, reduce reactive oxygen species production through free radical semiquinone anion species [27]. A futile cycle with relevance in nutrition is the breakdown and resynthesis of triglycerides from glycerol and free fatty acids in human adipocytes, as identified recently [28]. Furthermore, the Cori cycle, which produces lactate from glucose anaerobically in muscle cells and converts it back to glucose in the liver, could be considered as a large substrate cycle [29].

A promising approach for identifying substrate cycles is provided by the concept of elementary flux modes (EFMs). An EFM comprises a minimal set of enzymes that can operate at steady-state and fulfil the irreversibility constraints associated with some reactions [30,31]. This method has been successfully used for detecting substrate cycles of various lengths in small- and medium-size metabolic networks [21,23]. A related method, in the framework of Petri net theory, is based on minimal T-invariants [32]. Using this method, substrate cycles have been found in models of sucrose metabolism in the potato tuber [33] and riboflavin metabolism [34].

With the advent of genome-scale metabolic reconstructions [35], the intriguing question arises whether it is feasible to detect the complete set of substrate cycles in a living cell. Teusink et al. [36] identified 28 futile cycles in a relatively small genome-scale metabolic model of Lactobacillus plantarum. However, the number of EFMs grows exponentially with network size [37,38] and an initial analysis showed that it is not possible to compute all substrate cycles in the genome-scale metabolic model of humans reconstructed by Duarte et al. [39]. Nevertheless, for several applications, it is sufficient to identify a representative subset of metabolic pathways. Recently, we proposed a method for identifying a subset of the shortest pathways in genome-scale networks [40]. This allows the identification of all pathways up to a certain length. An alternative method that combines a genetic algorithm with linear programming was developed to sample EFMs from the genome-scale network of E. coli [41]. The additional flexibility of the genetic algorithm allows an exploration of the solution space of all possible flux distributions.

In the present study, we use these two methods to extract a subset of substrate cycles present in a genome-scale metabolic network of humans, many of which are longer than the cycles known previously. We only consider substrate cycles that are consuming ATP and characterize them with respect to their length, compartmentation and ATP consumption. Analysis of the compartmental distribution of substrate cycles shows that a particular compartmentation, as extensively used in eukaryotes, can lead to a high abundance of substrate cycles. Moreover, by assuming that the ATP consumption of a substrate cycle is related to the flux through it, we show that there is a selective pressure that acts on futile cycles with a high energy consumption. This selective force is particularly visible in tissue-specific models of human metabolism that show a lower ATP consumption through substrate cycles compared to their generic counterpart encompassing the entire known metabolism of humans.

Results and Discussion

To sample substrate cycles from the genome-scale metabolic network of humans [39], we defined them as a subset of EFMs and used two recently published methods for their computation. The first one comprises the K-shortest EFM method [40], which calculates EFMs of increasing length, beginning with the shortest. The second method, EFMEvolver [41], is based on an evolutionary algorithm that randomly samples EFMs from a given network. Because substrate cycles are closed systems, which only dissipate energy, we added an ADP phosphorylation reaction to the metabolic model. This reaction was used as the target reaction for the identification of EFMs by both methods.

In total, 15 000 shortest substrate cycles were computed from the human genome-scale network with the K-shortest EFM method, reaching a maximal length of six reactions (not including the ADP phosphorylation reaction). Because the K-shortest EFM method uses a mixed integer linear programming formulation in which the number of constraints increases with the number of calculated EFMs, it is unsuitable for enumerating a large set of EFMs. Given this limitation, we stopped the calculations with the K-shortest EFM method after reaching a reasonable number of EFMs. Using EFMEvolver, 100 000 EFMs were calculated. As expected, all futile cycles with five or fewer reactions found by EFMEvolver were also found by the K-shortest EFM method, thus confirming the validity of the latter method. A histogram of the number of EFMs with up to six reactions is shown in Fig. 1A. The K-shortest EFM method calculated all substrate cycles with up to five reactions and 12 028 cycles with a length of six. By contrast, EFMEvolver found only a fraction of the cycles with up to five reactions: 88.2% of the cycles with length three but only 28.6% of the cycles with length four and 9.3% of those with length five. For substrate cycles of length six, EFMEvolver found 587 cycles, of which 74 were not present in the set of 12 028 cycles calculated by the K-shortest EFM method. The 100 smallest substrate cycles are presented in Table S1.

Figure 1.

 Length distribution of substrate cycles calculated with EFMEvolver and the K-shortest EFM method. (A) Frequency of small substrate cycles up to a length of six reactions, computed with the K-shortest EFM method and EFMEvolver. The black bars represent the frequency of cycles calculated with the K-shortest EFM method, whereas the grey bars show cycles that have been calculated with EFMEvolver. The K-shortest EFM method computed all substrate cycles up to a given length, whereas EFMEvolver computed a sample of EFMs. Please note that, because of computational restrictions, not all substrate cycles of length six could be computed. (B) Length distribution of substrate cycles computed by EFMEvolver.

Short substrate cycles are not representative for pathways of futile cycling

Figure 1B shows the length distribution of the EFMs found by EFMEvolver. The distribution has three maxima: the first one at a length of ten with 2384 substrate cycles, the second one at a length of 17 with 3032 cycles, and the last one at a length of 40 reactions with 2154 cycles. As can be seen, futile cycles of intermediate lengths of ∼ 40 reactions are the most frequent. The median of futile cycle lengths is 35. Thus, very short futile cycles such as the prominent glucose 6-phosphate/fructose-bisphosphate cycle appear to be an exception rather than the rule. At higher lengths, the distribution is monotonically decreasing, such that very long cycles are rare.

To validate the length distribution, we compared six runs of 100 000 substrate cycles calculated with EFMEvolver for the genome-scale metabolic network (Fig. S1). Furthermore, we compared the substrate cycles of the first run with runs two to six. The comparison shows that, on average, only 9 ± 0.5% of the substrate cycles are identical between two runs.
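The comparison between runs can be illustrated with a minimal sketch. It assumes that each substrate cycle is stored as the set of reaction IDs it uses (an EFM is determined by its support up to scaling); the function name and the toy reaction IDs are hypothetical and are not taken from the EFMEvolver implementation.

```python
def shared_fraction(run_a, run_b):
    """Fraction of the cycles in run_a that also occur in run_b.

    Each cycle is a collection of reaction IDs; the order of the
    reactions within a cycle is irrelevant, hence the frozensets.
    """
    a = {frozenset(cycle) for cycle in run_a}
    b = {frozenset(cycle) for cycle in run_b}
    if not a:
        raise ValueError("run_a contains no cycles")
    return len(a & b) / len(a)


# Toy data (hypothetical reaction IDs): one of three cycles is shared.
run_1 = [["R1", "R2"], ["R1", "R3", "R4"], ["R5", "R6"]]
run_2 = [["R2", "R1"], ["R7", "R8"]]
overlap = shared_fraction(run_1, run_2)
```

Applied to pairs of real runs, the mean and standard deviation of this fraction would correspond to the kind of overlap statistic reported above.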

Interestingly, the length distribution shows a higher number of substrate cycles of length two compared to those with three reactions. Many cycles of length two comprise the phosphorylation and dephosphorylation or concomitant import and export of a metabolite. Apparently, substrate cycles of length three are more complicated to realize. An illustrative example for such a substrate cycle is shown in Fig. 2A. Figure 2B depicts a substrate cycle of length 13 that involves reactions from several metabolic subsystems including glycerophospholipid and pyrimidine metabolism. The EFM has an overall consumption of six ATP and a normalized ATP consumption of 0.4.

Figure 2.

 Two examples of small substrate cycles. (A) Substrate cycle of length three involving 3-phospho-d-glycerate (3PG), 3-phospho-d-glyceroyl phosphate (1,3DPG) and 2,3-bisphospho-d-glycerate (2,3DPG). (B) Substrate cycle comprising 13 reactions and consuming six ATP assuming a net flux of one in each reaction. CDP-Chol, CDP choline; CHOL, choline; CholP, choline phosphate; DAG, diacylglycerol; Gln, glutamine; Glu, glutamate; NH4, ammonium; PA, phosphatidic acid; Pcho, phosphatidylcholine; Pi, phosphate; ppi, diphosphate. Enzyme abbreviations are given in Table S2.

Most substrate cycles are distributed across several compartments

We analyzed substrate cycles with respect to the number of compartments in which they occur. In a first step, we determined all substrate cycles that occur within individual compartments by blocking the substrate exchange between compartments in the model. We found 17 substrate cycles in the nucleus (Table S3), 33 in the mitochondrion (Table S4) and more than 200 000 in the cytosol. The EFMs in the nucleus and in the mitochondrion were calculated using efmtool [42]. efmtool was not able to calculate all EFMs in the cytosol. For this reason, we computed the 2000 shortest substrate cycles for the cytosol with the K-shortest EFM method and a sample of more than 200 000 substrate cycles with EFMEvolver. Comparing the EFMs that were found by the K-shortest EFM method and EFMEvolver in the cytosol with the initial sample of 100 000 EFMs mentioned above, we found that only 439 EFMs occur in all three samples (Fig. 3). This small intersection gives an impression of the large number of substrate cycles in human metabolism.

Figure 3.

 Overlap between cytosolic substrate cycles calculated with EFMEvolver, the K-shortest EFM method and substrate cycles computed with EFMEvolver in the entire network. The red circle represents the 2000 shortest cycles in the cytosol enumerated with the K-shortest EFM method, the green circle displays the 200 000 cycles in the cytosol calculated with EFMEvolver and the blue circle represents the 100 000 substrate cycles calculated in the genome-scale network using EFMEvolver.

A comparison of the EFMs found by EFMEvolver shows that most substrate cycles span several compartments. The human metabolic model [39] contains the following compartments: cytosol, extra organism, endoplasmic reticulum, Golgi apparatus, lysosome, mitochondria, nucleus and peroxisome. A histogram of the number of compartments spanned by each substrate cycle (Fig. 4) shows that most substrate cycles use reactions from two compartments, with a maximum of six compartments spanned by a substrate cycle. This is of particular relevance for eukaryotic organisms as a result of their extensive compartmentation. Energy dissipation through transport reactions has been described in several studies. For example, McClelland et al. [43] investigated a substrate cycle between the peroxisome and the cytosol in the liver of rats that consumes NADH. Consequently, and as the large number of futile cycles involving transport reactions suggests, the energy dissipation through futile cycling between compartments could be comparatively high in eukaryotic organisms.
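A histogram of this kind can be obtained with a few lines of code. The following sketch assumes that metabolite IDs carry a compartment suffix in square brackets (for example 'atp[c]' for cytosolic ATP) and that each cycle is given as a collection of reaction IDs; the naming convention, the function names and the toy reactions are illustrative assumptions, not the conventions of the model in [39].

```python
from collections import Counter


def compartment_of(met_id):
    """Return the compartment tag of a metabolite ID such as 'atp[c]'."""
    return met_id[met_id.rindex("[") + 1 : -1]


def compartments_spanned(cycle, reactions):
    """Set of compartments touched by the reactions of one cycle."""
    return {compartment_of(m) for rid in cycle for m in reactions[rid]}


def compartment_histogram(cycles, reactions):
    """Map 'number of compartments spanned' to the number of cycles."""
    return Counter(len(compartments_spanned(c, reactions)) for c in cycles)


# Toy network (hypothetical IDs): reaction -> {metabolite: coefficient}.
reactions = {
    "PFK": {"f6p[c]": -1, "atp[c]": -1, "fdp[c]": 1, "adp[c]": 1},
    "FBPase": {"fdp[c]": -1, "h2o[c]": -1, "f6p[c]": 1, "pi[c]": 1},
    "X_in": {"x[c]": -1, "atp[c]": -1, "x[m]": 1, "adp[c]": 1, "pi[c]": 1},
    "X_out": {"x[m]": -1, "x[c]": 1},
}
cycles = [{"PFK", "FBPase"}, {"X_in", "X_out"}]
hist = compartment_histogram(cycles, reactions)
```

In this toy case, one cycle is confined to the cytosol and one cycle spans the cytosol and the mitochondrion.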

Figure 4.

 Distribution of substrate cycles and the number of compartments.

Energy consumption of substrate cycles

Because energy dissipation is an important characteristic of substrate cycles, we determined the flux of the ATP-generating reaction for each of them relative to the sum of fluxes of the futile cycle. The results of this analysis are shown in Fig. 5. Because we only detected cycles with a net consumption of ATP and all other metabolites are balanced, the relative ATP consumption of a substrate cycle is proportional to the change of free energy, ΔG, along this pathway for a normalized sum of fluxes of one. The relative ATP consumption of a substrate cycle is defined by the flux of the ATP generating reaction relative to total flux (without the ATP generating reaction). Because a high ΔG of a pathway allows for a high flux of that pathway [44], a high relative ATP-generating flux corresponds to a high ΔG and hence to a high potential net flux.

Figure 5.

 Density distribution and histogram of ATP consumption of the 100 000 substrate cycles calculated with EFMEvolver. The ATP consumption ranges from 0.0172 to 0.55, with a median of 0.1491.

Because futile cycles dissipate energy, evolutionary pressure is expected to be particularly high on substrate cycles with a high ΔG and hence a high relative consumption of ATP. This evolutionary pressure is one explanation for the high frequency of substrate cycles with a low relative ATP consumption (Fig. 5) because evolution acts in particular against substrate cycles with a high relative ATP consumption.

Substrate cycles from tissue-specific models have lower energy consumption

The genome-scale model of human metabolism that we used for our analysis corresponds to a library of all reactions encoded in the human genome. However, each cell type has a specific metabolic make-up and, thus, it cannot be expected that all the reactions present in the generic model would be present at the same time in one particular tissue. In accordance with the argument outlined above with respect to selection acting against futile cycles, we would expect that there should be a driving force against the expression of enzymes belonging to a futile cycle. More precisely, this selective force should be particularly strong against the concomitant expression of enzymes belonging to a futile cycle with a high relative ATP consumption. For the analysis of the ATP consumption of EFMs, we normalized EFMs such that the sum of fluxes of each EFM, excluding the ADP phosphorylation reaction, equals one (for further information, see the Materials and methods).

To test this hypothesis, we investigated the ATP consumption of futile cycles of several tissue-specific models. These models were derived from the human genome-scale metabolic model by the integration of transcription data of the specific tissue. Conceptually, tissue-specific models are derived by assuming that weakly expressed enzymes and enzymes that are completely absent from a particular tissue do not contribute to the metabolism of this cell type. Hence, the reactions catalyzed by such enzymes need not be considered in the metabolic model of the corresponding cell type.

Figure 6A shows the ATP consumption of substrate cycles in the brain [45] and liver [46], as well as in the human genome-scale model. Lewis et al. [45] reconstructed the brain network for normal and aged brains, as well as for brains with Alzheimer’s disease. The ATP consumption of the three brain models does not show statistically significant differences (Fig. 6A). Consequently, we do not observe differences in the energy dissipation of futile cycles between different age states. Hence, we only considered the normal network in our further investigations. The brain model contains two different cell types, astrocytes (CA, cytosol of astrocytes) and neurones (CN, cytosol of neurones). In accordance with our hypothesis, the ATP consumption decreases from a median of 0.159 in the generic human genome-scale model to 0.128 in the liver, 0.023 in CN and 0.015 in CA. These decreases are statistically significant (Mann–Whitney–Wilcoxon test for each case, P < 10^−16). Interestingly, the value was lower for brain cells than for hepatocytes. This might be a result of the higher energy consumption of the brain and a resulting stronger pressure against energy dissipation. Moreover, our results show that futile cycles are also abundant in tissue-specific models and not just in the compendium model of human metabolism.

Figure 6.

 ATP consumption in several metabolic models. (A) Comparison between the ATP consumption of the genome-scale metabolic model and the derived brain and liver models. For clarity, only the density distributions of the ATP consumption are depicted. (B) Comparison of ATP consumption of tissue-specific models and the human metabolic network with random reaction knockouts. Each point in the plot represents one run of 100 000 calculated EFMs. The grey dots represent the test runs of the genome-scale network in which a specific number of reactions were blocked during each run. The x-axis represents the number of used reactions in one run and the y-axis represents the median ATP consumption.

Because the genome-scale model was the basis for both tissue-specific models, we tested whether the decreasing ATP consumption could simply be a result of reaction deletion from the genome-scale model. Accordingly, reactions of the genome-scale model used by futile cycles were blocked randomly and sets of 100 000 futile cycles were calculated. For each run, we blocked a specific number of reactions between one and 400. The result of this test is shown in Fig. 6B. Each point represents one run that identified 100 000 substrate cycles. The x-axis shows the number of used reactions for all substrate cycles in one run and the y-axis shows the median ATP consumption. For each tissue-specific network, we searched for those test runs that have the same number of used reactions and found that the tissue-specific models still had a significantly decreased ATP consumption (Mann–Whitney–Wilcoxon test: liver, P < 10^−9; aged and non-aged brain, P < 10^−10).

Thus, blocking reactions randomly in the genome-scale network does not lead to the same degree of decrease in ATP consumption as in the tissue-specific models described above. This supports our hypothesis that the reduced relative energy consumption is the effect of an evolutionary pressure against futile cycles with a particularly high flux.


In the present study, we used two recently published methods for enumerating EFMs to calculate samples of substrate cycles and detected an unexpectedly large number of them in human metabolism. Besides the commonly known shorter cycles of length two to approximately five reactions, there are much longer cycles. In most cases, it is not clear whether these cycles have a biological function or are just ‘by-products’ of the complex and intertwined network of metabolism. It can be assumed that many of the longer cycles are not operative as a result of the down-regulation of at least one enzyme involved. However, even if all enzymes of these long cycles are present and operative, the flux through these cycles is likely to be small or even negligible as a result of low metabolite concentrations and/or slow kinetics. As our analysis shows, the majority of substrate cycles use reactions of several compartments. Thus, the extensive compartmentation of eukaryotes and, consequently, the large number of transporters, makes them particularly prone to energy dissipation through substrate cycles, and hence represents an evolutionary disadvantage of the utilization of several cellular compartments.

We found that the distribution of substrate cycles peaks at a small relative ATP consumption and then decreases quickly. We attributed this non-uniform distribution partially to an evolutionary pressure against substrate cycles with a particularly high relative ATP consumption because this entails a high relative flux as a result of the laws of thermodynamics. In support of this hypothesis, we found that the relative ATP consumption decreases when tissue-specific models of metabolism are considered because they take into account only reactions that are present at the same time in a particular type of cell.

In summary, the present study shows that, in addition to the (relatively short) cycles known previously, there is a large number of substrate cycles present in the human metabolic network. The idea that most substrate cycles matter to human metabolism is supported by the evolutionary pressure acting against those substrate cycles that are expected to have a particularly high flux and thus contribute most to unneeded energy dissipation.

Materials and methods

Metabolic network model

A metabolic network limited by a system boundary can be described using the stoichiometric matrix, N, a representation of metabolite interconversions by the reactions in the system. In constraint-based modelling [47,48], the metabolic system is described by applying mass conservation principles and it is often assumed that the system is at steady state. Moreover, some reactions are constrained in the direction of flux if they are practically irreversible under physiological conditions. EFMs [30] can be used to describe all fluxes that are feasible under these constraints. More precisely, an EFM corresponds to a flux vector that fulfils the steady-state and the irreversibility condition. Moreover, the property of elementarity requires that an EFM cannot be further decomposed into flux vectors also fulfilling both conditions [30]. As a result of their definition, EFMs can be considered as the mathematical representation of the biochemical concept of metabolic pathways.
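The two defining conditions, together with elementarity, can be checked for a given flux vector as sketched below. The sketch uses the standard rank test (a feasible flux vector is elementary if the columns of N belonging to its nonzero fluxes have a one-dimensional null space); this test is not described in the present study and is used here only for illustration. The toy network of two kinase/phosphatase pairs coupled to an ADP phosphorylation reaction, with water and protons omitted, is likewise an assumption.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_steady_state(N, v):
    """True if N v = 0, i.e. every metabolite is balanced."""
    return all(sum(n * x for n, x in zip(row, v)) == 0 for row in N)


def is_elementary(N, v, irreversible):
    """Rank test for an EFM; `irreversible` lists column indices."""
    if not any(v) or not is_steady_state(N, v):
        return False
    if any(v[j] < 0 for j in irreversible):
        return False
    support = [j for j, x in enumerate(v) if x != 0]
    sub = [[row[j] for j in support] for row in N]
    return len(support) - rank(sub) == 1


# Columns: PFK, FBPase, HK, G6Pase, ADP phosphorylation (all irreversible).
# Rows: F6P, FBP, GLC, G6P, ATP, ADP, Pi.
N = [
    [-1, 1, 0, 0, 0],
    [1, -1, 0, 0, 0],
    [0, 0, -1, 1, 0],
    [0, 0, 1, -1, 0],
    [-1, 0, -1, 0, 1],
    [1, 0, 1, 0, -1],
    [0, 1, 0, 1, -1],
]
irreversible = [0, 1, 2, 3, 4]
single_cycle = [1, 1, 0, 0, 1]   # fructose 1,6-bisphosphate cycle only
both_cycles = [1, 1, 1, 1, 2]    # superposition of two cycles
```

Here `single_cycle` passes the test, whereas `both_cycles` is a steady-state flux that can be decomposed into two smaller cycles and is therefore not elementary.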

There are different tools for enumerating the complete set of EFMs in a metabolic network [42,49]. However, the exponential growth of the number of EFMs with network size limits the use of these tools to small or moderate size networks [50]. In the present study, we use two recently published methods for EFM analysis in genome-scale networks: the K-shortest EFM method [40] and EFMEvolver [41]. The K-shortest EFM method uses a mixed integer linear programming formulation to enumerate a subset of EFMs with an increasing number of reactions. However, this method is unsuitable for enumerating a large number of EFMs because mixed-integer linear programming needs to be optimized to find each EFM. EFMEvolver overcomes this limitation by combining a genetic algorithm with a linear programming approach. Thus, EFMEvolver allows a more comprehensive exploration of the solution space and the sampling of a larger set of EFMs. The source code of EFMEvolver and the K-shortest EFM method is available from the authors upon request.

Currently, there are two whole-cell models of human metabolism available [39,51]. Furthermore, several tissue-specific models, including two models of human liver metabolism [46,52], a model of brain energy metabolism [45] and a kidney network [53], have been presented recently. In the present study, we used the whole-cell model of human metabolism as reported previously [39], which has been extensively analyzed in systems biology, and was also the basis for the reconstruction of the tissue-specific networks of the brain [45] and the liver [46,52]. The tissue-specific networks used in the present study were derived from the genome-scale metabolic network by the integration of transcription data of the specific tissue, such that reactions of down-regulated genes are absent from the models, whereas reactions of up-regulated genes are present.

As noted above, substrate cycles are cyclic pathways with an overall net conversion of cofactors but without any conversion of metabolic substrates. To enumerate all the potential substrate cycles in a metabolic network, one has to consider, in the first step, a closed system (i.e. no mass transfer across the system boundary). Consequently, we blocked all reactions allowing the exchange of metabolites across the system boundary. Additionally, a substrate cycle is coupled with an exergonic process, usually the hydrolysis of ATP molecules. Thus, a simple way of centring our search on substrate cycles is to create an artificial reaction for ADP phosphorylation, which can be coupled with ATP hydrolysing substrate cycles: ADP + Pi + H+ → ATP + H2O.
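The preparation of the closed system can be sketched as follows. The sketch assumes that a reaction is stored as a mapping from metabolite IDs to stoichiometric coefficients and that exchange reactions can be recognized because they only consume or only produce metabolites; both the data layout and this recognition rule are simplifying assumptions, and all identifiers are hypothetical.

```python
# Artificial target reaction: ADP + Pi + H+ -> ATP + H2O (cytosolic).
ADP_PHOSPHORYLATION = {
    "adp[c]": -1, "pi[c]": -1, "h[c]": -1, "atp[c]": 1, "h2o[c]": 1,
}


def is_exchange(stoich):
    """Assumed rule: an exchange reaction has metabolites on one side only."""
    return len({coef > 0 for coef in stoich.values()}) == 1


def close_system(reactions, target_id="ADP_phosphorylation"):
    """Block all exchange reactions and add the target reaction."""
    closed = {rid: dict(s) for rid, s in reactions.items()
              if not is_exchange(s)}
    closed[target_id] = dict(ADP_PHOSPHORYLATION)
    return closed


# Toy network with one exchange reaction and one kinase/phosphatase pair.
reactions = {
    "EX_glc": {"glc[e]": -1},
    "PFK": {"f6p[c]": -1, "atp[c]": -1, "fdp[c]": 1, "adp[c]": 1, "h[c]": 1},
    "FBPase": {"fdp[c]": -1, "h2o[c]": -1, "f6p[c]": 1, "pi[c]": 1},
}
closed = close_system(reactions)
```

In the closed toy network, the only steady-state flux through the target reaction is the fructose 1,6-bisphosphate cycle, which regenerates the ADP, phosphate and proton consumed by the artificial reaction.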

This reaction is used as the target reaction in the search algorithms and is located in the cytosol. By initial simulations, we found that the human network contained other reactions that allowed ADP phosphorylation without the consumption of any substrate. In the model, these reactions have the IDs: ‘R_BILGLCURte’ (backward direction), ‘R_BILDGLCURte’ (backward direction) and ‘R_ATPS4m’ (forward direction). The last reaction is catalysed by the ATP synthase, the major source of ATP generation under aerobic conditions. The chemical equations of these reactions are given in Table S5. These reactions were blocked in the indicated directions. Blocking the ATP synthase reaction does not pose a problem to our analysis because, in vivo, this reaction requires a proton gradient generated through the oxidation of a substrate.

We also performed a compartment-specific substrate cycle analysis. This analysis focused on the identification of substrate cycles in the nucleus, the mitochondrion and the cytosol. From the reaction set associated with each of these compartments, we removed all the reactions that used metabolites from compartments other than the one under consideration. In this way, we defined a system boundary for each compartment by using the different membranes that separate the various organelles. To find noncytosolic substrate cycles, we added additional ADP phosphorylation reactions in the nucleus and the mitochondrion.
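The restriction to a single compartment amounts to a simple filter over the reaction set. The sketch below assumes that metabolite IDs end with a compartment tag in square brackets (such as 'atp[m]' for mitochondrial ATP); this convention, the function names and the toy reactions are illustrative assumptions.

```python
def compartment_of(met_id):
    """Return the compartment tag of a metabolite ID such as 'atp[m]'."""
    return met_id[met_id.rindex("[") + 1 : -1]


def restrict_to_compartment(reactions, compartment):
    """Keep only reactions whose metabolites all lie in `compartment`."""
    return {
        rid: dict(stoich)
        for rid, stoich in reactions.items()
        if all(compartment_of(m) == compartment for m in stoich)
    }


# Toy reactions (hypothetical IDs): two mitochondrial, one cytosolic
# and one transport reaction that crosses the mitochondrial membrane.
reactions = {
    "A_kin_m": {"a[m]": -1, "atp[m]": -1, "ap[m]": 1, "adp[m]": 1},
    "A_pase_m": {"ap[m]": -1, "a[m]": 1, "pi[m]": 1},
    "PFK": {"f6p[c]": -1, "atp[c]": -1, "fdp[c]": 1, "adp[c]": 1},
    "ATPtm": {"atp[c]": -1, "adp[m]": -1, "atp[m]": 1, "adp[c]": 1},
}
mito = restrict_to_compartment(reactions, "m")
```

The transport reaction is removed because it uses cytosolic metabolites, so that the mitochondrial membrane acts as the system boundary; a compartment-specific ADP phosphorylation reaction would then be added to `mito` as the target.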

Identification of futile cycles

To enumerate EFMs with the K-shortest EFM method and EFMEvolver, we split reversible reactions into irreversible forward and backward steps. Moreover, we reduced the metabolic network by (a) removing blocked reactions and (b) setting structurally irreversible reactions to irreversible status. Blocked reactions correspond to reactions that cannot carry any flux at steady-state and can be identified using linear programming [54]. Structurally irreversible reactions correspond to reversible reactions for which the flux in one of the directions is blocked at steady-state. The blocked direction of structurally irreversible reactions can also be determined by linear programming. Given a reversible reaction, we check, for each direction of this reaction, whether it can have a nonzero flux when the other direction is blocked. The model that contains all the changes described above is provided in Model S1.
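The splitting step can be sketched as shown below; the detection of blocked reactions by linear programming is not covered. The sketch assumes that each reaction is stored as a pair of a stoichiometry mapping and a reversibility flag, and the suffixes '_fwd' and '_bwd' are hypothetical names chosen for illustration.

```python
def split_reversible(reactions):
    """Split each reversible reaction into two irreversible steps.

    `reactions` maps a reaction ID to (stoichiometry, reversible), where
    the stoichiometry maps metabolite IDs to coefficients. Irreversible
    reactions are kept unchanged.
    """
    split = {}
    for rid, (stoich, reversible) in reactions.items():
        if reversible:
            split[rid + "_fwd"] = dict(stoich)
            split[rid + "_bwd"] = {m: -c for m, c in stoich.items()}
        else:
            split[rid] = dict(stoich)
    return split


# Toy input (hypothetical IDs): one reversible and one irreversible reaction.
reactions = {
    "PGI": ({"g6p[c]": -1, "f6p[c]": 1}, True),
    "PFK": ({"f6p[c]": -1, "atp[c]": -1, "fdp[c]": 1, "adp[c]": 1}, False),
}
split = split_reversible(reactions)
```

After the split, all fluxes are nonnegative. A search procedure working on the split network has to exclude the trivial two-reaction loops formed by the forward and backward step of the same reaction.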

For the enumeration of the K-shortest EFMs, we used a modified version of the original method [40]. First, mixed integer linear programming similar to a recently published method [55] that handles fluxes as real numbers was used. Second, instead of forcing the production or consumption of a metabolite, we searched for EFMs using a particular target reaction, namely the ADP phosphorylation reaction described above. We achieved this through the addition of a constraint to the mixed-integer linear programming formulation, forcing the flux through this reaction to a nonzero value. In the case of EFMEvolver, the genetic algorithm was set up with a population size of 5000 individuals, a mutation rate (i.e. the probability of a bit change in the genome/alphabet) of 0.01 and a recombination rate (the probability of recombining two genomes/alphabet) of 0.2. The target reaction (i.e. the reaction through which EFMs were searched) was the ADP phosphorylation reaction as described above. Additional details are provided by Kaleta et al. [41]. For the analysis of the ATP consumption of EFMs, we normalized EFMs as shown in Eqn (1):


$$NF_{ATP} = \frac{F_{ATP}}{\sum_{i=1}^{n} F_i} \tag{1}$$

where $NF_{ATP}$ is the normalized ATP consumption (flux through the ADP phosphorylation reaction) and $F_{ATP}$ is the unnormalized ATP consumption. Furthermore, $n$ represents the number of reactions in one substrate cycle, excluding the ADP phosphorylation reaction, and $F_i$ is the corresponding flux through the $i$th reaction with $i = 1, \ldots, n$.
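As an illustration of this normalization, the following sketch computes the normalized ATP consumption of a single cycle. It assumes that the cycle is given as a mapping from reaction IDs to nonnegative fluxes (reversible reactions having been split beforehand); the function name, the reaction IDs and the flux values are hypothetical.

```python
def normalized_atp_consumption(fluxes, target_id="ADP_phosphorylation"):
    """Flux through the target reaction divided by the sum of all other fluxes."""
    total = sum(f for rid, f in fluxes.items() if rid != target_id)
    if total <= 0:
        raise ValueError("cycle carries no flux besides the target reaction")
    return fluxes[target_id] / total


# Toy cycle: one kinase, one phosphatase, one ATP hydrolysed per turn.
fbp_cycle = {"PFK": 1.0, "FBPase": 1.0, "ADP_phosphorylation": 1.0}
nf_atp = normalized_atp_consumption(fbp_cycle)

# Scaling all fluxes by a common factor leaves the value unchanged.
scaled = {rid: 3.0 * f for rid, f in fbp_cycle.items()}
nf_atp_scaled = normalized_atp_consumption(scaled)
```

Because an EFM is only defined up to a positive scaling factor, the normalized value is the same for any rescaling of the cycle, which makes cycles of different lengths comparable.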


This work was supported by grants from the German Ministry for Education and Research (Bundesministerium für Bildung und Forschung – BMBF) under grant number FKZ 0315581D (JenAge Research Core) and grant number 0315758 (Virtual Liver). L.F.F. was funded by Fundação Calouste Gulbenkian, Fundação para a Ciência e a Tecnologia (FCT) and Siemens SA Portugal (PhD grant number SFRH/BD/32961/2006) of Portugal.