Deciphering the physiological response of Escherichia coli under high ATP demand

Abstract One long‐standing question in microbiology is how microbes buffer perturbations in energy metabolism. In this study, we systematically analyzed the impact of different levels of ATP demand in Escherichia coli under various conditions (aerobic and anaerobic, with and without cell growth). One key finding is that, under all conditions tested, the glucose uptake increases with rising ATP demand, but only to a critical level beyond which it drops markedly, even below wild‐type levels. Focusing on anaerobic growth and using metabolomics and proteomics data in combination with a kinetic model, we show that this biphasic behavior is induced by the dual dependency of the phosphofructokinase on ATP (substrate) and ADP (allosteric activator). This mechanism buffers increased ATP demands by a higher glycolytic flux but, as shown herein, it collapses under very low ATP concentrations. Model analysis also revealed two major rate‐controlling steps in the glycolysis under high ATP demand, which could be confirmed experimentally. Our results provide new insights on fundamental mechanisms of bacterial energy metabolism and guide the rational engineering of highly productive cell factories.

Thank you again for submitting your work to Molecular Systems Biology. We have now heard back from the three reviewers who agreed to evaluate your study. As you will see below, the reviewers raise substantial concerns on your work, which unfortunately preclude the publication of the study in its current form.
While Reviewers #1 and #2 are relatively more supportive, Reviewer #1 still raises a series of concerns about the model. In particular, Reviewer #3 points to substantial issues with respect to the model and thinks that due to these shortcomings, the major conclusions of the study do not seem to be fully supported by the data and analyses. This reviewer rated the "validity of conclusions" and "suitability of publication" as "low".
During our pre-decision cross-commenting process (in which the reviewers are given the chance to make additional comments, including on each other's reports), Reviewer #3 added, "The model is huge: The first question is whether the model structure is correct. Second, the model contains lots of unknown parameters. Just to mention two issues with the first point: (i) The authors use a fixed summed concentration for ATP and ADP as a constraint. Is there no AMP? Is it not likely that under strong perturbation of energy homeostasis also more AMP is formed? Indeed, this is what even their measurements show. (ii) The authors used (as far as I can tell) for all conditions the same vmax values for the enzymes. Vmax values are influenced by changes in enzyme levels. However, by using one condition-independent vmax value this essentially assumes that enzyme levels do not change. However, their experiments show that enzyme levels change. Of course, when plotting log2 changes of enzyme levels, one only sees the strong changes, but I would consider a 2-fold increase of at least some enzymes to have an effect on the system. This is not considered. And even if these model structure questions are not an issue, then there is still the uncertainty of the model parameters. The number of model parameters is huge, the estimation process seems to have involved quite a bit of manual tweaking as described in the SText, and the data that went in there was just the physiological data. From my understanding the proteomics and metabolomics data was NOT used in the estimation. Physiological data is surely not sufficient from a practical identifiability point of view to identify the model parameters. Evidence for this point comes from the following: Fitting physiological data to a stoichiometric model (which has much fewer parameters, i.e. namely only the fluxes) is already an issue and fluxes can usually not be identified. Now, here, the authors even have a model with many more parameters (namely also kinetic parameters). Thus, it simply cannot be that they can identify the parameters. And as such, I have difficulty in trusting conclusions that were derived from such a model developed in this manner."
Under these circumstances and given that the concerns raised by the reviewers are substantial and are unlikely to be addressable within the scope of a major revision, we see no other choice than to return the manuscript with the message that we cannot offer to publish it. I am sorry that the review of your work did not result in a more favorable outcome on this occasion, but I hope that you will not be discouraged from sending your work to Molecular Systems Biology in the future. In any case, thank you for the opportunity to examine this work.

_______________________ Reviewer #1:
This study looks at the role of ATP demand on glycolytic flux, growth rate and product formation, and provides many important new insights into the regulation of central metabolism. It is a well-designed, clearly described study that combines experimental data with a kinetic model. I think this is an important paper for the field. I mainly have a number of suggestions for improvements.
Line 51: holistic is not a word to be used in this context. The understanding is not gained by looking only at the whole.
Line 81: the terminology of rate-limiting steps should perhaps be avoided in a systems biology journal, and replaced by rate-controlling, especially in the context of this topic.
Line 210: lower ATPase -> higher I suppose
Line 239: why is AMP not in the model? Is that not an even more potent activator of PFK than ADP?
Line 252: Of course the accumulation of FBP does not have to correlate with glycolytic flux if you stretch the system beyond its normal regulatory boundaries, which you clearly do in HC. Also in line 485, it is unfair to compare your highly artificial strain with expectations of optimality. To explain accumulation of FBP in HC I miss a reference to the redox state and GAPDH as a clear potential bottleneck of glycolysis. Your NAD levels are lower (but also in HC control?), so possibly the NADH/NAD ratio is higher, inhibiting GAPDH (and explaining high lactate by higher NADH for LDH?). The redox state is notoriously difficult to model properly, but instead of your changes in PFL and PYK, could changes in NADH/NAD ratios (e.g. inhibiting GAPDH instead of PFL and PYK) also improve the model?
Line 373: lactate yield dropped even more in PflB (which makes much more sense), why is this not highlighted? I find the description and especially the explanation of the Table 1 results in general rather poor, and I am missing a direct correspondence with the modeling results. Does the model explain the changes in flux as well? That would be quite impressive. Fig. 1 and 2. Panels C and D are not very insightful, please change into a Table with the actual numbers. I also suggest to provide the data of A and B on log scale, and I also suggest to additionally plot glucose against biomass: this gives a direct readout of the stoichiometry (yield) of growth, which should change if ATP expenditure increases.
Supplements: ATPase reaction (ATPM) in the model seems crucial. Should it not be a function of demand processes, and relatively insensitive to ATP? I think the Hill function sort of achieves that (and I would prefer to give that rationale with the equation, perhaps citing Hofmeyr or Reich and Sel'kov), but with the parameters used the line is close to k*ATP. Based also on the data of ATP in Fig S4, I would suspect you can describe your data much better by a lower Km and a higher Hill coefficient of the ATPase reaction. By plotting the ATPM flux as a function of the measured ATP (or ATP/ADP, or energy charge, many options), you may be able to get the apparent rate equation. Have you tried? Moreover, what happens if you make the ATPM also a function of mu, as would be biologically relevant? Would you be able to compare growth versus non-growth?

Reviewer #2:
Summary
This study analyzed the impact of varying levels of ATP turnover in E. coli by overexpressing genes of ATP hydrolyzing ATPase under anaerobic and aerobic conditions. The growth results and the metabolomic and proteomic data were compared to a kinetic model of E. coli central metabolism. The authors observed a biphasic behavior of glucose uptake in response to ATP demand. They show that this biphasic behavior is induced by the dependency of the phosphofructokinase on ATP and ADP.
General remarks
This is a well designed study and the results were convincing and revealing of mechanistic and metabolic principles that can guide further metabolic engineering of E. coli. The ability to directly test a concept with both extensive experimental and modeling approaches is unique and compelling. The authors also observed that their first few versions of their kinetic model didn't match the observations and likely lacked regulatory elements. They were able to make modifications that qualitatively matched with their findings. The findings will be directly applicable to researchers focused on efficient design of E. coli and other chassis strains for biomanufacturing and pathway expression systems. These results will also be interesting to a growing community of researchers involved with metabolic modeling. I would like to recommend that this manuscript be published without revisions.
Minor points
Table 1 could potentially be reformatted into a graphical display instead of a table. I found the table useful as a reference but it doesn't directly communicate a clear take-away message. The authors should consider converting this into a figure or putting it into the supplemental information.
Reviewer #3:
The authors found that upon an increased ATP hydrolysis, E. coli responds with an increased glycolysis, but only up to a point at which the rate through glycolysis collapses and the growth rate as well. The authors find that allosteric regulation of the glycolytic enzyme phosphofructokinase is responsible for this "collapse". Beyond, with a kinetic model, they provide some further insights on E. coli metabolic regulation. The findings (i.e. the response of E. coli upon increased ATP demand and how cells compensate this (to some extent) by increase in glycolysis) are interesting for researchers studying and working with carbon/energy metabolism of E. coli.
Main concerns:
KINETIC MODEL/PARAMETER ESTIMATION: My main concern refers to the kinetic model, which is key for the finding on the PFK and the other insights generated. This model contains a huge number of unknown parameters. Unfortunately, the whole model building approach and the approach that was used to fit the model are incompletely described. For instance, what is the number of unknown parameters, how was the parameter estimation done, were all experimental conditions fitted together, were the v_max values allowed to vary across conditions, are the parameters actually identifiable from a structural/practical point of view, etc? What is described makes me somewhat worried (i.e. "first model calibration was conducted by hand"; p. 20 of supplementary text). The data to fit the many model parameters is very scarce (i.e. only physiological data), compared to the number of parameters. I simply can't see how such limited data can result in a reliable fitting of the many unknown parameters (overfitting?) and whether I at all can trust this kinetic model, which is the basis on which the main conclusions were drawn (i.e. the one on PFK and the ones presented on page 14). Could other models also explain the data? Also, several choices had to be made to assemble this model.
One choice refers to constraining the sum of ATP and ADP to a particular value (cf. Supplemental text p. 5). I would be fine with this assumption if AMP concentrations were identical over the different perturbations applied. However, this is not the case (cf. Fig S3A). Also, I do not understand why the authors, who had measured proteome and metabolome data for the strains here, did not use this data in the fitting. Wouldn't this have delivered a much more trustworthy model? Overall, I have doubts on the soundness of the conclusions that can be drawn with this kinetic model.
PROTEOME ALLOCATION: According to the concept of 'protein allocation', protein overexpression and also changes in growth rates should lead to changes in protein expression. On the basis of their proteome analyses, the authors argue that there are only very few small changes in protein abundances (p. 13), even though they overexpress proteins and also have strains with largely different growth rates. Does this mean that the protein allocation concept is not correct? Or are the proteome measurements inaccurate? Yet, the authors observe small expression changes, but as far as I can see these expression changes were not used in the model. I would expect a 2-fold change of some enzymes (as the authors have found them in the proteomics analysis) to have some effect on the outcome of the model. I am puzzled why this has not been taken along (see previous point) and whether I then can trust the model.
DETERMINATION OF PHYSIOLOGICAL PARAMETERS: The physiological parameters are the key input data for their work. Yet, I have some concerns about the manner in which these data have been determined. First, the authors only determine the uptake/excretion rates on the basis of two measurement time points. This approach is fine if the two time points really originate from the exponential growth phase AND if the measurement data does not have any measurement noise. However, looking at Fig. 1a, I am not sure whether over the full time period cells are in an exponential phase and I do not know at which two time points the authors took their samples to determine the uptake and excretion rates. I think for a work that builds so heavily on physiological data, it would have been much better if the authors had acquired time courses on the extracellular metabolites and had fitted this data to an ODE-based model to estimate the rates. This approach would also have dealt with the issue of measurement noise much better. Second, I am worried about the fact that the authors have not directly measured the biomass dry weight, but have only inferred this from OD measurements by assuming a constant conversion factor between OD and dry weight. At different conditions, it is possible that cell size changes or that otherwise an assumption of a constant conversion factor is incorrect. While these two points seem minor, I would like to highlight again that the model only builds on the physiological data, and as such it is even more crucial that the data is 100% solid.
Minor: 1) For quite some time during the reading, it wasn't clear to me whether the observations you describe are observations after dynamic perturbations (i.e. in the abstract, i.e. "increases", "biphasic response curve"; these terms all contain some elements of "time", which would rather be consistent with a dynamic perturbation). You might want to change the wording (here and throughout the manuscript) such that the reader directly is put on the right track, i.e. on the track on steady-state perturbations, i.e. no DYNAMICS! 2) If I remember correctly, the long-known short term glycolytic oscillations also involved PFK and its regulation by ADP. Could this be connected? 3) Line 149: Why lactate? Is there an explanation for this? Redox? 4) Fig. 2a: It seems that in some of the strains there is still an increase in the biomass, even though cells do not have any nitrogen source. How can this be explained?
5) Figures are not mentioned in the order as they indeed appear in the figure panels. E.g. Fig. 4a is mentioned before Fig. 3a. This is confusing. 6) Fig. 4C: This lumped reaction with ICDH shows NADPH as cofactor. Is this correct? Isn't this enzyme not NADH dependent? 7) Line 353/354: I don't get the statement. 8) I strongly advise the authors to display their results as shown in Table 1 as a figure. 9) While reading the result section I always wondered whether protein allocation changes could also play a role. The authors discuss this (and in fact even show new results about this) in the discussion section. I strongly suggest that the authors move this part into the result section. 10) Why are the ATP/ADP levels only shown in the supplement? Isn't this a key piece of information for the story here? 11) The description of the MFA M&M section is short. Information such as on the size of the metabolic model is missing as well as info on what is actually optimized. Sum of fluxes or ATP hydrolysis? If it is the latter, I think the authors should then rather refer to MAXIMAL ATP turnover rates in their text. Also, it wasn't clear to me what was meant with lines 661-665. 12) The CSICD involves production of CO2. While several reactions in the kinetic model, involve CO2 as reactant, the CSICD reaction does not have CO2 as product.
30th Jul 2021 Appeal

Thank you very much for handling our manuscript and for sending us your decision letter. Based on the comments of the reviewers, you decided to reject the manuscript. When reading the reviewer comments in detail, we felt that this decision is based on some misunderstandings and misinterpretations and we therefore kindly ask you to consider our appeal.
The preliminary response letter with point-by-point responses is provided in a separate file.
We are grateful for your consideration.
Sincerely,

Thank you for your message asking us to reconsider our decision regarding your manuscript MSB-2021-10504. I have carefully read your manuscript and the referee reports once again and have also discussed your preliminary point-by-point responses with the editorial team. Based on the outline you provide, we think that the proposed revisions sound reasonable. As such, we would welcome the submission of a revised version for further consideration.
Without reiterating all the points raised in the reviews and your preliminary point-by-point responses, some of the more substantial issues are the following: -The model construction, as well as the scope and limitations of the model, should be better described and discussed.
-Reviewer #3's concern about the unknown parameters, and other raised issues about the model, need to be carefully clarified and/or addressed.
As you may already know, our editorial policy allows in principle a single round of major revision. It is therefore essential to provide responses to the reviewers' comments that are as complete as possible.
On a more editorial level, we would ask you to address the following issues: [...]
Line 252: Of course the accumulation of FBP does not have to correlate with glycolytic flux if you stretch the system beyond its normal regulatory boundaries, which you clearly do in HC. Also in line 485, it is unfair to compare your highly artificial strain with expectations of optimality. To explain accumulation of FBP in HC I miss a reference to the redox state and GAPDH as a clear potential bottleneck of glycolysis. Your NAD levels are lower (but also in HC control?), so possibly the NADH/NAD ratio is higher, inhibiting GAPDH (and explaining high lactate by higher NADH for LDH?). The redox state is notoriously difficult to model properly, but instead of your changes in PFL and PYK, could changes in NADH/NAD ratios (e.g. inhibiting GAPDH instead of PFL and PYK) also improve the model?
Author response: First of all, we agree that it is "unfair" to expect that the HC ATPase strain would behave normally and show optimal behavior under these extreme conditions. We removed the sentence from line 252 and changed the wording in the referred sentence in the Discussion section to weaken this statement.
Regarding the possible explanation of the reviewer for the high FBP accumulation: in principle, yes, increasing NADH/NAD ratios might inhibit the GAPDH (unfortunately, as the reviewer mentioned, we could not measure NADH as this is notoriously difficult). However, this could then not explain the accumulation of PEP and pyruvate (and possibly also of other glycolytic metabolites between DHAP and PEP, for which we do not have measurements). Our model analysis suggests that there should be regulations acting on PFL and PYK with which (i) the metabolite accumulations up to FBP, (ii) the high lactate production (due to high levels of pyruvate, the substrate of the LDH), and (iii) the reduced production of acetate and ethanol in the HC strain can be explained.
Reviewer #1 comment 6: Line 373: lactate yield dropped even more in PflB (which makes much more sense), why is this not highlighted? I find the description and especially the explanation of the Table 1 results in general rather poor, and I am missing a direct correspondence with the modeling results. Does the model explain the changes in flux as well? That would be quite impressive.
Author response: Yes, the drop in lactate yield was indeed to be expected (and also predicted by the model). We extended the discussion of the results (and included also a new Figure, as suggested by Reviewer #2) to highlight this point. We now also mention that the model predicted the changes in the product yields qualitatively (it is difficult to make a direct quantitative comparison of measured yields and model predictions since we do not know how much the PFL level increased in the strain overexpressing the pflB gene compared to the standard HC ATPase strains).
Reviewer #1 comment 7: Fig. 1 and 2. Panels C and D are not very insightful, please change into a Table with the actual numbers. I also suggest to provide the data of A and B on log scale, and I also suggest to additionally plot glucose against biomass: this gives a direct readout of the stoichiometry (yield) of growth, which should change if ATP expenditure increases.
Author response: We have now moved the tables with the exact values of rates and yields for growth and growth arrest under anaerobic conditions from the Appendix to the main manuscript (now Table 1). As Reviewers #2 and #3 seem to prefer a graphic (for example, instead of Table 1 (now Table 2)), we decided to include both, Figure and Table, in the main manuscript.
Reviewer #1 comment 8: Supplements: ATPase reaction (ATPM) in the model seems crucial. Should it not be a function of demand processes, and relatively insensitive to ATP? I think the Hill function sort of achieves that (and I would prefer to give that rationale with the equation, perhaps citing Hofmeyr or Reich and Sel'kov), but with the parameters used the line is close to k*ATP. Based also on the data of ATP in Fig S4, I would suspect you can describe your data much better by a lower Km and and higher Hill coefficient of the ATPase reaction. By plotting the ATPM flux as a function of the measured ATP (or ATP/ADP, or energy charge, many options), you may be able to get the apparent rate equation. Have you tried? Moreover, what happens if you make the ATPM also a function of mu, as would be biologically relevant? Would you be able to compare growth versus non-growth?
Author response: Although it will be clear to the reviewer, let us first explain the role of the ATPM reaction. In many (also constraint-based) models, the ATPM (or NGAM) reaction reflects the consumption of ATP for non-growth associated maintenance (NGAM) processes. While this can be fixed to a single flux value (mmol/gDW/h) in constraint-based models, this cannot be done in kinetic models: if we would fix that flux then it might happen that ATP is consumed faster than it is produced by other reactions in the model and the ATP concentration may become negative. Therefore, in kinetic models, the ATPM reaction must somehow depend on the ATP concentration and it must become zero if there is no ATP (this would also correspond to what happens in the cell). If we would not consider expression of the ATPase, then we could use a kinetics for the ATPM reaction as suggested by the reviewer (namely as a function of demand processes, i.e. with relative insensitivity against ATP for a large range of ATP values which could be achieved with a low Km value and a Hill function with high exponents). Furthermore, it should also be noted that the growth-related consumption of ATP is accounted for by a fixed ATP stoichiometry per gram biomass produced in the growth reaction (see ATP balance in the model description). This growth-related demand of ATP was computed from a stoichiometric model as explained in the Appendix. It thus would make no sense to make the ATPM flux a function of mu. (Note also that, again, the growth rate kinetics depends explicitly on ATP and the growth rate will become zero if the ATP concentration is zero).
So far, we have described the situation without overexpressing the ATPase. In our original model we have "abused" the ATPM reaction in a sense that it aggregates both the actual non-growth associated ATP demand as well as the ATP consumed by the overexpressed ATPase levels (both reactions have the same net stoichiometry). The ATPase activity was simulated by increasing the vmax of the ATPM flux: each strain had a characteristic vmax level for the ATPM reaction. For the wild type we adjusted the vmax value of the ATPM reaction such that the ATPM flux comes close to the known wild type ATPM demand (~ 3mmol/gDW/h; this corresponds to the case where no ATPase is expressed). Now, for the aggregated processes, we cannot use the suggested kinetics of a demand process since this would not correctly describe the activity of the ATPase.
However, in the revised model version what we now did is to split the ATPM reaction into two reactions: one for the NGAM demand of ATP (= classical ATPM) and one for the ATPase activity (which would be zero in the wild type). This allows us to really describe the (new) NGAM reaction as a demand process, as suggested by the reviewer.
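The difference between a near-linear rate law and a demand-like one can be illustrated with a minimal sketch. This is not the kinetics used in the model: the function name hill_rate and all parameter values below are hypothetical and chosen only to show the qualitative effect of a low Km combined with a high Hill coefficient. In both cases the rate is zero when no ATP is present.

```python
def hill_rate(atp, vmax, km, n):
    """Hill-type rate law: v = vmax * ATP^n / (Km^n + ATP^n)."""
    return vmax * atp**n / (km**n + atp**n)

# Hypothetical parameter sets in arbitrary units, for illustration only.
near_linear = dict(vmax=30.0, km=10.0, n=1)  # high Km: v is roughly k*ATP
demand_like = dict(vmax=3.0, km=0.1, n=4)    # low Km, high n: v stays near vmax

# Compare both rate laws over a range of ATP concentrations.
atp_levels = [0.0, 0.5, 1.0, 2.0]
rates = [
    (atp, hill_rate(atp, **near_linear), hill_rate(atp, **demand_like))
    for atp in atp_levels
]
for atp, v_linear, v_demand in rates:
    print(f"ATP={atp:.1f}  near-linear v={v_linear:.3f}  demand-like v={v_demand:.3f}")
```

With the demand-like parameters the rate barely changes between ATP = 0.5 and ATP = 2.0, whereas the near-linear rate almost doubles when ATP doubles.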
Reviewer #2 general comment: This is a well designed study and the results were convincing and revealing of mechanistic and metabolic principles that can guide further metabolic engineering of E. coli. The ability to directly test a concept with both extensive experimental and modeling approaches is unique and compelling. The authors also observed that their first few versions of their kinetic model didn't match the observations and likely lacked regulatory elements. They were able to make modifications that qualitatively matched with their findings. The findings will be directly applicable to researchers focused on efficient design of E. coli and other chassis strains for biomanufacturing and pathway expression systems. These results will also be interesting to a growing community of researchers involved with metabolic modeling. I would like to recommend that this manuscript be published without revisions.
Author response: We thank the reviewer for the positive general evaluation of our manuscript.
Reviewer #2 comment 1: Table 1 could potentially be reformatted into a graphical display instead of a table. I found the table useful as a reference but doesn't directly communicate a clear take-away message. The authors should consider converting this into a figure or putting it into the supplemental information.
Author response: We have now added a graphical display in addition to the Table. As Reviewer #1 prefers a Table with the exact values, we decided to include both, Figure and Table, in the main manuscript.

Reviewer #3:
Reviewer #3 general comment: The authors found that upon an increased ATP hydrolysis, E. coli responds with an increased glycolysis, but only up to a point at which the rate through glycolysis collapses and the growth rate as well. The authors find that allosteric regulation of the glycolytic enzyme phosphofructokinase is responsible for this "collapse". Beyond, with a kinetic model, they provide some further insights on E. coli metabolic regulation. The findings (i.e. the response of E. coli upon increased ATP demand and how cells compensate this (to some extent) by increase in glycolysis) are interesting for researchers studying and working with carbon/energy metabolism of E. coli.
Author response: We thank the reviewer for his comments and give a detailed response below.
Reviewer #3, Comments from pre-decision cross-commenting: "The model is huge: The first question is whether the model structure is correct. Second, the model contains lots of unknown parameters. Just to mention two issues with the first point: (i) The authors use a fixed summed concentration for ATP and ADP as a constraint. Is there no AMP? Is it not likely that under strong perturbation of energy homeostasis also more AMP is formed? Indeed, this is what even their measurements show. (ii) The authors used (as far as I can tell) for all conditions the same vmax values for the enzymes. Vmax values are influenced by changes in enzyme levels. However, by using one condition-independent vmax value this essentially assumes that enzyme levels do not change. However, their experiments show that enzyme levels change. Of course, when plotting log2 changes of enzyme levels, one only sees the strong changes, but I would consider a 2-fold increase of at least some enzymes to have an effect on the system. This is not considered. And even if these model structure questions are not an issue, then there is still the uncertainty of the model parameters. The number of model parameters is huge, the estimation process seems to have involved quite a bit of manual tweaking as described in the SText, and the data that went in there was just the physiological data. From my understanding the proteomics and metabolomics data was NOT used in the estimation. Physiological data is surely not sufficient from a practical identifiability point of view to identify the model parameters. Evidence for this point comes from the following: Fitting physiological data to a stoichiometric model (which has much fewer parameters, i.e. namely only the fluxes) is already an issue and fluxes can usually not be identified. Now, here, the authors even have a model with many more parameters (namely also kinetic parameters). Thus, it simply cannot be that they can identify the parameters.
And as such, I have difficulty in trusting conclusions that were derived from such a model developed in this manner."
Author response: Arguably, the most serious concerns of the reviewer are related to the identifiability of the parameters of the model and the value of the model (and its conclusions) under non-identifiability and parameter uncertainty, as reflected by this statement: "Thus, it simply cannot be that they can identify the parameters. And as such, I have difficulty in trusting conclusions that were derived from such a model developed in this manner." This is also reflected by the following statement taken from the main concern 1 of this Reviewer (see below): "I simply can't see how such limited data can result in a reliable fitting of the many unknown parameters (overfitting?) and whether I at all can trust this kinetic model, which is the basis on which the main conclusions were drawn ..."
The statements of the reviewer express the concern that a model is generally only useful if all parameters are identifiable. But this is not true and overlooks that models with non-identifiable parameters can be very valuable! Let us start with a very simple example to illustrate this point: assume there is an (algebraic) model that states Y=a*b*X with unknown parameters a and b. Given measurements of Y and X we might be able to demonstrate that this model is able to reasonably reflect the measured data and we can then find a unique estimate for the product a*b. Clearly, the individual values for a and b are not identifiable (the estimation will deliver values for a and b such that a*b gives an optimal fit; but a and b are clearly non-unique). But the model still has a great value: it could be falsified if there is no linear relationship between X and Y, and, if there is a linear relationship, the (non-unique) estimated values of a and b will lead to a model that can predict the behavior of Y given X.
The non-identifiability of the parameters a and b is simply not relevant for this procedure.
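The Y=a*b*X argument can be illustrated numerically. The following is a minimal sketch on synthetic data; the "true" product of 6, the noise level and the candidate values of a are arbitrary assumptions chosen for illustration and have nothing to do with the kinetic model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from Y = a*b*X with a hypothetical "true" product a*b = 6
x = np.linspace(1.0, 10.0, 20)
y = 6.0 * x + rng.normal(0.0, 0.1, size=x.size)

# Least-squares estimate of the product p = a*b (line through the origin)
p_hat = float(x @ y / (x @ x))

def sse(a, b):
    """Sum of squared residuals of the model Y = a*b*X."""
    return float(np.sum((y - a * b * x) ** 2))

# Any pair (a, b) with a*b = p_hat fits equally well: a and b are not identifiable
pairs = [(a, p_hat / a) for a in (0.5, 1.0, 2.0, 12.0)]
errors = [sse(a, b) for a, b in pairs]

# ...yet every such pair gives the same prediction for a new X
x_new = 25.0
predictions = [a * b * x_new for a, b in pairs]

print(f"estimated a*b = {p_hat:.3f}")
print("SSE per (a, b) pair:", np.round(errors, 6))
print("predictions at X = 25:", np.round(predictions, 3))
```

All four pairs give the same residual and the same prediction, which is the sense in which the product is identifiable while its factors are not.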
In our study, we built, in a bottom-up manner, a kinetic model of the central metabolism of E. coli for anaerobic conditions and used experimental data to calibrate it. The reviewer is right that, with the relatively large number of parameters (179 in model version 2, of which 101 are unknown and must be estimated) and the relatively small set of experimental data, there will be parameters that are not identifiable, as in most other kinetic models of this size. This would indeed be crucial for some model purposes, for example, if the aim is to uniquely identify certain parameters (e.g., a Km value) or for model selection, where the concrete level of a certain parameter may decide about the existence of certain interactions. However, this was not relevant for our study, since the main goals of our model were instead (i) to show that we can build, in a bottom-up manner, a kinetic model that includes basic mechanisms and biological knowledge of the central metabolism of E. coli and that this model can reproduce the experimental data; (ii) to use this model to gain insights regarding the biphasic (steady-state) response curve of the glucose uptake rate and to possibly falsify our hypothesis (if the biphasic response curve cannot be reproduced); and (iii) to make meaningful predictions that can be verified with experiments (and thereby to strengthen the confidence in the model). We (and apparently also reviewers #1 and #2) think that our model indeed fulfilled these goals. It was never our intention to obtain a model in which all parameters are identifiable with the given measurements (and we never claimed that, but maybe we should make this point clearer; see also below). In our opinion, the reviewer underestimates the value of a model with non-identifiable parameters and demands too much (even to exclude that other models can explain the data, which is, in general, impossible, even for models where all parameters are identifiable). 
We would like to convince the reviewer to take another perspective: we HAVE a model that is consistent with our hypotheses and can explain the observed behavior and its predictive capabilities could be clearly demonstrated (see MCA analysis with experimental validation). To say it in other words: would the reviewer trust our argumentation (e.g., the effect of the PFK mechanism on the glycolytic flux) more if we would discuss our hypotheses only verbally without providing an integrated model that demonstrates feasibility of our mechanistic explanation while taking into account a large set of metabolic interactions and regulations? The model proves that our argumentation is a possible mechanistic interpretation (of course, this cannot exclude that there might be other models explaining the observations -but, again, NO model can do that!). Clearly, one may argue that a model is only useful if it cannot reproduce arbitrary datasets and if its predictions are, at least qualitatively, robust against parameter variations. Regarding the first point, the fact that the (first version of the) kinetic model (also after several rounds of parameter fitting) could not reflect the measured high concentrations of glycolytic metabolites already demonstrates that our model cannot be calibrated to arbitrary data and observations! The latter point has now been addressed in the revised manuscript by a Monte-Carlo sampling of parameters (see below).
However, having said this, we agree with the criticism of the reviewer that the model calibration as well as the general role of the model in our study and its limitations should be described and discussed in more detail to avoid such misunderstandings. For example, the reviewer assumes that we did not use the metabolome data for parameter fitting, but we did so! We address these issues under the next (more specific) point of the reviewer.
Regarding the issues with (a) the inclusion of AMP in the model and (b) the proteomics data and the used vmax values in the model, see the explanations given below under main concern 1 and main concern 2, respectively.
Reviewer #3 main concern 1: KINETIC MODEL/PARAMETER ESTIMATION: My main concern refers to the kinetic model, which is key for the finding on the PFK and the other insights generated. This model contains a huge number of unknown parameters. Unfortunately, the whole model building approach and the approach that was used to fit the model is incompletely described. For instance, what is the number of unknown parameters, how was the parameter estimation done, were all experimental conditions fitted together, were the v_max values allowed to vary across conditions, are the parameters actually identifiable from a structural/practical point of view, etc? What is described makes me somewhat worried (i.e. "first model calibration was conducted by hand"; p. 20 of supplementary text). The data to fit the many model parameters is very scarce (i.e. only physiological data), compared to the number of parameters. I simply can't see how such limited data can result in a reliable fitting of the many unknown parameters (overfitting?) and whether I at all can trust this kinetic model, which is the basis on which the main conclusions were drawn (i.e. the one on PFK and the ones presented on page 14). Could other models also explain the data? Also, several choices had to be made to assemble this model. One choice refers to constraining the sum of ATP and ADP to a particular value (cf. Supplemental text p. 5). I would be fine with this assumption if AMP concentrations were identical over the different perturbations applied. However, this is not the case (cf. Fig S3A). Also, I do not understand why the authors, who had measured proteome and metabolome data for the strains here, did not use this data in the fitting. Wouldn't this have delivered a much more trustworthy model? Overall, I have doubts on the solidness of the conclusions that can be drawn with this kinetic model.

Author response:
Parameter identifiability: Regarding the main issue of identifiability of the parameters and the value of a model if parameters are not identifiable, please see our response to the previous point above.
However, we agree that we should discuss in the manuscript that the model has a relatively large number of parameters and that many of them will not be uniquely identifiable, but that the model still has its value (as detailed under the previous point). We therefore now write in the Discussion section: "The kinetic model played an important role in this study to identify and analyze potential mechanisms in the metabolism of E. coli that led to the observed phenotypes under high ATP demand. It has to be noted that the model is relatively large and comprises more than 100 unknown parameters, many of which will not be uniquely identifiable, despite fitting the model against a considerable set of data. However, the model is based on well-known biological knowledge of E. coli's central metabolism, it is able to reproduce measurements of the different strains reasonably well and it gave predictions that could be successfully verified. Moreover, the results of the Monte-Carlo sampling of kinetic parameters showed that key properties of the kinetic model and its predictions are robust over a wide range of parameter variations. Hence, despite potential parameter identifiability issues, the model could demonstrate its predictive power and thus represents a solid and plausible basis that supports our hypotheses and explains major findings of this study. However, as is true for every model, we can neither prove its correctness nor exclude that other models with alternative mechanisms may exist that reproduce the observed phenomena equally well."
Parameter estimation procedure: Again, we agree with the reviewer that we needed to extend the description of how we calibrated the model and fitted the unknown parameters. For example, the reviewer assumes that we did not use the metabolome data for parameter fitting, but we did. Accordingly, we included more information in the main document and extended the description of the parameter identification process in the Supplementary Text in the Appendix.
Monte-Carlo sampling of kinetic parameters and robustness of model predictions: Furthermore, to increase the trust of the reviewer in the model and its predictions and to analyze the robustness of the model with respect to parameter uncertainties, we employed the Monte-Carlo approach of Murabito et al. (2014). This approach samples the kinetic parameters of a kinetic model such that each sampled parameter set (of one run) gives rise to the known reference (steady) state (characterized by the measured steady-state metabolic fluxes and metabolite concentrations for that specific scenario). For each specific parameter set, one can then determine important system features, in particular, flux control coefficients (FCCs). In this way, one obtains a probability distribution of certain model features (such as flux control coefficients) over all sampled parameter sets. We used this approach to analyze the robustness of our model-based FCC predictions for the HC strain and to analyze parameter uncertainties. The results of the Monte-Carlo sampling of kinetic parameters showed that key properties of the kinetic model are robust over a wide range of parameter variations (two orders of magnitude). Hence, despite potential parameter identifiability issues, the model could demonstrate predictive value also under parameter uncertainty and thus represents a solid and plausible basis that supports our hypotheses and explains major findings of this study.
A section on the Monte-Carlo study has been added to the main manuscript and a more detailed description of the entire procedure and its results is given in a new section (4) in the Appendix.
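The idea of sampling parameters around a fixed reference state and collecting the resulting distribution of flux control coefficients can be sketched on a toy system. The example below is not the E. coli model and not the exact sampling scheme of Murabito et al.; it assumes a hypothetical two-step pathway X0 -> S -> X1 and samples the elasticities directly (log-uniformly over two orders of magnitude), then derives the FCCs from the standard summation and connectivity theorems of metabolic control analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 10_000

# Toy pathway X0 -> S -> X1 with a fixed reference steady state.
# eps1 = d ln v1 / d ln S (product inhibition of step 1, negative)
# eps2 = d ln v2 / d ln S (substrate dependence of step 2, positive)
# Magnitudes are sampled log-uniformly over two orders of magnitude (0.1 .. 10).
eps1 = -(10.0 ** rng.uniform(-1.0, 1.0, n_samples))
eps2 = 10.0 ** rng.uniform(-1.0, 1.0, n_samples)

# Flux control coefficients of the two steps for every sampled set
c1 = eps2 / (eps2 - eps1)
c2 = -eps1 / (eps2 - eps1)

# Sanity checks: summation theorem (C1 + C2 = 1) and connectivity theorem
summation_ok = bool(np.allclose(c1 + c2, 1.0))
connectivity_ok = bool(np.allclose(c1 * eps1 + c2 * eps2, 0.0))

# Distribution of the control exerted by step 1 over all sampled sets
q05, q50, q95 = np.percentile(c1, [5, 50, 95])
print(f"summation theorem holds: {summation_ok}, connectivity holds: {connectivity_ok}")
print(f"C1: median {q50:.2f}, 5-95% range {q05:.2f}-{q95:.2f}")
```

A prediction would be called robust in this sense if the sampled distribution of a control coefficient stays concentrated around the value obtained with the fitted parameter set.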
Inclusion of AMP: Finally, regarding AMP, in the original model we did not include AMP because it is not involved in any reaction considered in the network (also not as an allosteric regulator, since PFK does not depend on AMP in E. coli). We therefore used the ratio ATP/(ADP+ATP) as a proxy for the energy charge. The reviewer is right that the AMP concentration is largely increased in the HC ATPase strain, which is most likely a consequence of the operation of the adenylate kinase (2 ADP ↔ AMP + ATP) and an(other) indicator that the energy charge is low. However, one can also see that the total concentration of ATP+ADP in the HC ATPase strain is still close to that of the WT strain. In fact, it is the overall adenosine phosphate pool (AMP+ADP+ATP) that varies between the strains and increases with increasing ATPase activity. This could be a consequence of the reduced growth rate, which might lead to a higher accumulation of these central metabolites. However, the largest increase of this pool is not more than 2-fold (HC ATPase vs. WT). Now, to address the reviewer's comment, we introduced the following model change to account also for AMP and to enable a better comparison between the energy charges of the model and the measurements. We included AMP as a metabolite and the adenylate kinase as a reaction, which connects ADP and ATP levels with AMP. Still, we do not know the precise mechanism that adjusts the size of the total adenosine phosphate pool and, as an approximation, we therefore assumed a conservation relation AMP+ADP+ATP=const. The assumption of a constant overall pool (as made in almost all kinetic models of E. coli) should be acceptable, since the maximum change in the size of the pool was only ~2-fold and because it is the relative ratio of the three metabolites that matters for many processes.
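How the conservation relation and the adenylate kinase reaction together fix ADP and AMP for a given ATP level can be sketched as follows. This is a minimal illustration that assumes the adenylate kinase is at equilibrium; the equilibrium constant `k_adk = 1.0`, the pool size and the ATP values are placeholders, not values from the model, and the function names are hypothetical. The energy charge is computed with the usual definition (ATP + ADP/2)/(ATP + ADP + AMP) and compared with the ATP/(ATP+ADP) proxy.

```python
import math

def adenylate_pool(atp, total, k_adk=1.0):
    """Return (ADP, AMP) at adenylate kinase equilibrium (2 ADP <-> AMP + ATP).

    Solves K = AMP*ATP / ADP**2 together with AMP + ADP + ATP = total.
    k_adk = 1.0 is a placeholder, not a value from the study.
    """
    if not 0.0 < atp <= total:
        raise ValueError("ATP must be positive and not exceed the total pool")
    # Substituting AMP = total - ATP - ADP gives
    # K*ADP^2 + ATP*ADP - ATP*(total - ATP) = 0; take the non-negative root.
    disc = atp ** 2 + 4.0 * k_adk * atp * (total - atp)
    adp = (-atp + math.sqrt(disc)) / (2.0 * k_adk)
    amp = total - atp - adp
    return adp, amp

def energy_charge(atp, adp, amp):
    """Adenylate energy charge (ATP + ADP/2) / (ATP + ADP + AMP)."""
    return (atp + 0.5 * adp) / (atp + adp + amp)

total_pool = 5.0  # arbitrary units, hypothetical
for atp in (4.0, 2.5, 1.0):
    adp, amp = adenylate_pool(atp, total_pool)
    print(f"ATP={atp:.1f} ADP={adp:.2f} AMP={amp:.2f} "
          f"EC={energy_charge(atp, adp, amp):.2f} "
          f"ATP/(ATP+ADP)={atp / (atp + adp):.2f}")
```

In this sketch AMP rises as ATP falls at constant total pool, which is qualitatively the behavior described for the HC ATPase strain.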
Regarding the treatment of proteome measurements and vmax values see our response to the next point (main concern 2).
Reviewer #3 main concern 2: PROTEOME ALLOCATION: According to the concept of 'protein allocation', protein overexpression and also changes in growth rates should lead to changes in protein expression. On the basis of their proteome analyses, the authors argue that there are only very few small changes in protein abundances (p. 13), even though they overexpress proteins and also have strains with largely different growth rates. Does this mean that the protein allocation concept is not correct? Or are the proteome measurements inaccurate? Yet, the authors observe small expression changes, but as far as I can see these expression changes were not used in the model. I would expect a 2-fold change of some enzymes (as the authors have found them in the proteomics analysis) to have some effect on the outcome of the model. I am puzzled why this has not been taken along (see previous point) and whether I then can trust the model.
Author response: First of all, over the entire proteome, there are indeed some significant changes in the protein abundances, and this is most likely a consequence of protein reallocation. However, as stated in the manuscript, it is also true that glycolytic enzymes are much less affected; the fold-change is less than 1.6 for all these enzymes (this is also consistent with the fact that these enzymes are known to be much less regulated at the gene expression level). In fact, ALL enzyme levels of reactions contained in the model exhibit a less than 1.6-fold change in the HC strain, except for PPC, PCK and MDH (malate dehydrogenase). The case of PPC and PCK was discussed in detail in the manuscript, and the case of MDH actually fits nicely into this discussion (it seems that the cell seeks to increase the flux from PEP to malate via PCK (instead of PPC) to gain one extra ATP). However, the overall flux from PPC/PCK via MDH is apparently still relatively low, as the succinate production remains low also in the HC ATPase strain. For this reason, we think that it is justified to approximate the behavior with constant vmax levels of the enzymes in the model to avoid consideration of different models for each strain (and each experiment). Furthermore, we tested how the fluxes and metabolite concentrations would change if we adapted the vmax values for the HC ATPase strain relative to the changes in protein abundances compared to the wild type. We see only small changes in the fluxes; for example, the simulated glycolytic flux in the HC ATPase strain would in this case decrease slightly to 91 %.
We slightly extended the text in the manuscript to justify the use of constant vmax values.
Reviewer #3 main concern 3: DETERMINATION OF PHYSIOLOGICAL PARAMETERS: The physiological parameters are the key input data for their work. Yet, I have some concerns about the manner in which these data have been determined. First, the authors only determine the uptake/excretion rates on the basis of two measurement time points. This approach is fine if the two time points really originate from the exponential growth phase AND if the measurement data does not have any measurement noise. However, looking at Fig. 1a, I am not sure whether over the full time period cells are in an exponential phase and I do not know at which two time points the authors took their samples to determine the uptake and excretion rates. I think for a work that builds so heavily on physiological data, it would have been much better if the authors had acquired time courses on the extracellular metabolites and had fitted this data to an ODE-based model to estimate the rates. This approach would also have dealt with the issue of measurement noise much better. Second, I am worried about the fact that the authors have not directly measured the biomass dry weight, but have only inferred this from OD measurements by assuming a constant conversion factor between OD and dry weight. At different conditions, it is possible that cell size changes or that an assumption of a constant conversion factor is otherwise incorrect. While these two points seem minor, I would like to highlight again that the model only builds on the physiological data, and as such it is even more crucial that the data is 100% solid.
Author response: Regarding the determination of the uptake and excretion rates of cells cultivated with nitrogen source, only the exponential growth phases were taken into account, as demanded by the reviewer. The time points considered for exponential growth and used for determining the growth rate are now provided in the Source data (relevant for Figures 1, 5 and EV1). Furthermore, we had indeed determined time courses for all (external) metabolite concentrations from which we used the first and the last data point within the exponential phase to determine the exchange rates (separately for each biological replicate from which then the mean and standard deviations were determined for each rate). As indicated in Figures 1 and 2 and in Tables 1 and 2, we observed generally relatively low standard deviations in the calculated growth and exchange rates indicating a low uncertainty in the determined rates. We now included a paragraph in the Methods section describing in more detail how the growth rate was determined for exponentially growing cells.
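For illustration, a two-point rate calculation of this kind can be sketched as below. The response does not spell out the formula used, so the expression here is an assumption: it is the common one for balanced exponential growth, in which the specific exchange rate follows from integrating dc/dt = q·X(t). The function name and all numerical inputs are hypothetical; only the conversion factor of 0.22 gDW/L/OD is taken from the text.

```python
import math

OD_TO_GDW = 0.22  # gDW/L per OD unit, the conversion factor quoted in the response

def two_point_rates(t1, t2, od1, od2, c1, c2):
    """Growth rate (1/h) and specific exchange rate (mmol/gDW/h) from two samples.

    Assumes balanced exponential growth between t1 and t2 (hours); c1 and c2 are
    extracellular concentrations in mmol/L. Negative rates denote uptake.
    """
    x1, x2 = od1 * OD_TO_GDW, od2 * OD_TO_GDW
    mu = math.log(x2 / x1) / (t2 - t1)
    # Integrating dc/dt = q * X(t) with exponentially growing X(t)
    # gives c2 - c1 = q * (x2 - x1) / mu
    q = mu * (c2 - c1) / (x2 - x1)
    return mu, q

# Hypothetical numbers, not data from the study
mu, q_glc = two_point_rates(t1=2.0, t2=6.0, od1=0.2, od2=0.8, c1=20.0, c2=12.0)
print(f"mu = {mu:.3f} 1/h, q_glucose = {q_glc:.2f} mmol/gDW/h")
```

Applying the function to each biological replicate separately and then taking the mean and standard deviation mirrors the per-replicate procedure described above.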
Regarding the OD measurements: Because direct dry-weight measurements require significant additional experimental effort, using a constant conversion factor for estimating the biomass dry weight from OD is common practice in our lab as well as in many other labs, also for different conditions/mutant strains (e.g. doi:10.1038/msb.2013.66; doi:10.1016/j.ymben.2020.03.004; doi:10.1038/npjsba.2016.35). However, to verify that there are no significant differences in the conversion factor between the different strains, we have now repeated the experiments for determining the conversion factor for the WT MG1655 and the HC ATPase strain (as these two strains show the biggest difference in their growth behavior) for anaerobic growth. We detected no significant difference between these two strains (MG1655: 0.189 gDW/L/OD420; 0.191 gDW/L/OD420; 0.219 gDW/L/OD420. HC ATPase: 0.196 gDW/L/OD420; 0.212 gDW/L/OD420; 0.217 gDW/L/OD420. P value = 0.4929) and no large discrepancy from the conversion factor used in this study (0.22), which was previously determined in our lab.
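The comparison of the two sets of conversion factors can be retraced with a short calculation. The response does not name the statistical test, so the sketch below assumes a two-sample Student's t-test with pooled variance; it only computes the group means and the t statistic from the six values quoted above.

```python
import math
from statistics import mean, variance

# Conversion factors (gDW/L/OD420) quoted in the response
wt = [0.189, 0.191, 0.219]  # MG1655
hc = [0.196, 0.212, 0.217]  # HC ATPase

def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance, and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(b) - mean(a)) / se, df

t_stat, df = pooled_t(wt, hc)
print(f"mean WT = {mean(wt):.3f}, mean HC = {mean(hc):.3f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

Under this assumed test, the t statistic is far below the two-sided 5 % critical value for 4 degrees of freedom (2.776), in line with the reported absence of a significant difference.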
Reviewer #3: I understand the authors' point. Having re-read the manuscript, I think I now better understand what my problem with the story is/was. I think my problem is/was that after the finding of the biphasic behavior, the authors directly jumped into the large kinetic model. Instead, I feel that the authors could take the material that they have, reorder it a bit and develop their story in a more step-wise manner, where the big kinetic model would just come after the metabolomics and proteomics data and the small PFK model. For instance:
- Start with the observation on the biphasic behavior (as you do now)
- Mention your hypothesis on what could explain the behavior (i.e. the dependency of PFK on ADP and ATP), i.e. lines 220ff
- State what, for the hypothesis to be true, your expectation is on ADP and ATP levels in the different perturbation strains; test this with the metabolomics experiment, i.e. bring lines 262ff; this data shows that the metabolomics data would be consistent with the hypothesis
- Test the hypothesis further with the small model of the PFK reaction only, i.e. lines 239; this also shows that the hypothesis would be consistent
- However, for a further testing of the hypothesis a bigger kinetic model is needed, because ....; towards developing this model, we needed to know whether changes in protein concentrations needed to be modeled as well, and this is why we did proteomics. Here, we found that protein levels don't change much, and thus we considered them to be constant

- It goes from simple to more complex (i.e. kinetic model), from experimental validation of observation (i.e. metabolomics, ATP, ADP levels) to model-based validation of hypothesis
- It brings the metabolomics data early (now you need the data for the kinetic model but in fact only later show the data)
- It avoids reactions of readers such as my earlier one on the huge kinetic model, as in the above-mentioned suggested structure the model only comes at the end as "final" confirmation
- Also, as you would bring the proteomics data before the development of the kinetic model, a reader has seen the proteomics data and thus is "convinced" that protein expression changes would not need to be taken into account in the model

I sincerely hope that the authors consider such an alternative structure. I recently had a similar experience with a manuscript of my lab, where a reviewer had "forced" us to rethink the story line. After initial significant grinding of teeth, I have to confess that the re-write was absolutely worth it and the story that we resubmitted was much better than the one we had before. I hope I can help the authors of this manuscript to rethink their story line as well. Ultimately, if the story line is improved, the manuscript will be appreciated more by readers and thus will have increased impact.

Reviewer #1:
I am content with the responses to my comments and believe this paper is an important contribution to the field. I also support the position of the authors wrt the criticism of reviewer 3 and fully agree that a model can be useful and provide important insights without the structure and all parameters being known. It depends on the research question. They provide a parameter scan that provides confidence in the robustness of their predictions against local parameter uncertainty. In view of the question that they pose, and the experimental follow up of their predictions, I believe their modeling approach is well justified.
We thank the reviewer for this very positive final statement.

Reporting Checklist For Life Sciences Articles (Rev. June 2017)
This checklist is used to ensure good reporting standards and to improve the reproducibility of published results. These guidelines are consistent with the Principles and Guidelines for Reporting Preclinical Research issued by the NIH in 2014. Please follow the journal's authorship guidelines in preparing your manuscript.

B-Statistics and general methods
- the assay(s) and method(s) used to carry out the reported observations and measurements
- an explicit mention of the biological and chemical entity(ies) that are being measured
- an explicit mention of the biological and chemical entity(ies) that are altered/varied/perturbed in a controlled manner
- a statement of how many times the experiment shown was independently replicated in the laboratory
Any descriptions too long for the figure legend should be included in the methods section and/or with the source data.
In the pink boxes below, please ensure that the answers to the following questions are reported in the manuscript itself. Every question should be answered. If the question is not relevant to your research, please write NA (non applicable). We encourage you to include a specific subsection in the methods section for statistics, reagents, animal models and human subjects.

definitions of statistical methods and measures:
a description of the sample collection allowing the reader to understand whether the samples represent technical or biological replicates (including how many animals, litters, cultures, etc.).

The data shown in figures should satisfy the following conditions:
Source Data should be included to report the data underlying graphs. Please follow the guidelines set out in the authorship guidelines on Data Presentation.
a specification of the experimental system investigated (eg cell line, species name).
No statistical methods were used to pre-determine the sample size. Most experiments were done in biological triplicates, which is a normally employed sample size to ensure biological reproducibility, unless otherwise stated. No data was excluded from the study.
To avoid potential batch effects, flasks were distributed randomly on different shakers and bacterial colonies were picked randomly from the plates after transformation. There were no other variables to randomize in this study.

Data
the data were obtained and processed according to the field's best practice and are presented to reflect the results of the experiments in an accurate and unbiased manner. figure panels include only data points, measurements or observations that can be compared to each other in a scientifically meaningful way.