Toward a modeling, optimization, and predictive control framework for fed‐batch metabolic cybergenetics

Biotechnology offers many opportunities for the sustainable manufacturing of valuable products. The toolbox to optimize bioprocesses includes extracellular process elements such as the bioreactor design and mode of operation, medium formulation, culture conditions, feeding rates, and so on. However, these elements are frequently insufficient for achieving optimal process performance or precise product composition. One can use metabolic and genetic engineering methods for optimization at the intracellular level. Nevertheless, those are often of static nature, failing when applied to dynamic processes or if disturbances occur. Furthermore, many bioprocesses are optimized empirically and implemented with little‐to‐no feedback control to counteract disturbances. The concept of cybergenetics has opened new possibilities to optimize bioprocesses by enabling online modulation of the gene expression of metabolism‐relevant proteins via external inputs (e.g., light intensity in optogenetics). Here, we fuse cybergenetics with model‐based optimization and predictive control for optimizing dynamic bioprocesses. To do so, we propose to use dynamic constraint‐based models that integrate the dynamics of metabolic reactions, resource allocation, and inducible gene expression. We formulate a model‐based optimal control problem to find the optimal process inputs. Furthermore, we propose using model predictive control to address uncertainties via online feedback. We focus on fed‐batch processes, where the substrate feeding rate is an additional optimization variable. As a simulation example, we show the optogenetic control of the ATPase enzyme complex for dynamic modulation of enforced ATP wasting to adjust product yield and productivity.


1 | INTRODUCTION
The demand for sustainable biotechnological products has grown significantly in recent years (Hughes & Jones, 2020; Wohlgemuth et al., 2021). Although several bioprocesses are commercially successful (Jullesson et al., 2015; Sanford et al., 2016), many are still discarded at early stages because they are not as competitive as traditional technologies. A natural question that arises is how the efficiency of bioprocesses can be optimized.
The toolbox of bioprocess optimization at the extracellular or macro-level includes selection of the bioreactor mode of operation, bioreactor design, optimization of cultivation conditions (pH, temperature, etc.), formulation of culture media, and determination of optimal feeding profiles and initial concentrations, among others (cf., e.g., Azimi et al., 2019; Behera et al., 2019; Vandermies & Fickers, 2019).
These optimization strategies can influence the overall cell metabolism. Still, on their own they tend to fail at targeting specific metabolic elements, such as key metabolic fluxes, without affecting other cell functionalities.
Dynamic model-based optimization and predictive control strategies can be used to exploit the dynamic potential of bioprocesses. Dynamic optimization allows finding the optimal dynamic operating conditions; see, for example, del Rio-Chanona et al. (2019), Jabarivelisdeh et al. (2018), Jabarivelisdeh and Waldherr (2016), Nimmegeers et al. (2018), and Ryu et al. (2019). Feedback control schemes, especially predictive control approaches as in Jabarivelisdeh et al. (2020), Jabarivelisdeh and Waldherr (2018), and Morabito et al. (2019, 2021, 2022), allow one to counteract unknown disturbances such as changes in feed conditions or nonmodeled dynamics, while maximizing the production efficiency and rendering a consistent process performance.
At the intracellular or micro-level, the bioprocess optimization toolbox includes metabolic and genetic engineering methods for rewiring metabolic pathways. Classical static metabolic engineering aims at increasing the cell's product yield, often at the expense of a lower biomass yield, as the substrate flux is diverted from biomass-producing reactions to the product-of-interest pathway. This inevitably decreases the volumetric productivity rates in batch-type bioreactors (Lalwani et al., 2018; Venayak et al., 2015). Furthermore, designing dynamic processes based on static metabolic control principles, usually derived under steady-state assumptions, can lead to metabolic imbalances (Cui et al., 2021).
Inducible expression of metabolism-relevant proteins via external inputs has emerged as a promising dynamic degree of freedom for bioprocess optimization at the micro-level (Hartline et al., 2021; Lalwani et al., 2018; Shen et al., 2019). Of increasing popularity is the application of optogenetics, the use of light to modulate gene expression (Hoffman et al., 2022). With optogenetics, one can switch on/off fluxes along metabolic pathways via modulation of enzyme expression (Lalwani, Ip, et al., 2021; Tandar et al., 2019; Zhao et al., 2021). One can also directly influence cell growth via modulation of the expression of (anti)toxin proteins (Lalwani, Kawabe, et al., 2021). In the latter optogenetic examples, the optimal light input values were determined using factorial experiments and similar heuristic approaches, resulting in, for example, two-stage or three-stage fermentations. Furthermore, the inputs were applied in an open-loop fashion, that is, without online feedback or corrective actions. Considering the frequent stochasticity of gene expression (De Vrieze et al., 2020) and the possible presence of process disturbances and batch-to-batch variability, bioprocesses operated in this manner may exhibit poor reproducibility, moderate-to-poor product quality, and a higher risk of failure.
Motivated by these challenges, some authors have proposed cybergenetic schemes whereby computer-aided feedback control is used to compensate for uncertainties. In such cases, the corrective actions are calculated outside the cell, for example, by a computer-aided controller (Hsiao et al., 2018; Khammash, 2022). To the best of our knowledge, the biotechnological applications of cybergenetics have so far been limited to, for instance, controlling the expression of fluorescence proteins and growth-regulatory proteins (e.g., enzymes involved in essential amino acid synthesis or antibiotic-resistance conferring proteins) via optogenetics (Gutiérrez Mena et al., 2022; Milias-Argeitis et al., 2016).
It is believed that the next step in this direction, bearing considerable potential, is to implement metabolic cybergenetic systems, that is, emphasizing dynamic metabolic engineering applications (Carrasco-López et al., 2020). Therefore, we seek to extend the scope of cybergenetics to scenarios where metabolic fluxes are to be dynamically manipulated (e.g., toward maximizing the volumetric productivity, achieving a target product yield, rendering a given ratio of products, etc.), while being able to compensate for disturbances and process changes. We aim to use model-based optimization and predictive control methods to exploit the full potential of metabolic cybergenetic systems, considering both cybergenetic inputs and traditional process inputs such as feeding rates simultaneously.
A reasonably good model capable of relating inducible gene expression to changes in the metabolic flux distribution and potential resource burden thus becomes fundamental for advancing in our quest. Unfortunately, so far only very simple models have been used in the context of cybergenetics, often based on phenomenological relations (cf., e.g., Gutiérrez Mena et al., 2022; Lovelett et al., 2021; Milias-Argeitis et al., 2016). In our opinion, these models do not allow capturing all the important phenomena required for model-based control of metabolic cybergenetic systems.
Thus, as the core contribution of this work, we propose a modeling framework for metabolic cybergenetics, which is combined with model-based optimization and predictive control to dynamically modulate intracellular metabolic fluxes for bioprocess optimization.
Without loss of generality, we focus on fed-batch processes due to their advantages compared to pure batch setups. Fed-batch processes include a concentrated feed that supplies fresh medium to the bioreactor, thus extending the production phase. This provides additional dynamic inputs (feed rates) to the system, and allows for higher productivity and more concentrated product streams. It furthermore provides an efficient way to handle processes with substrate inhibition (Doran, 2013; Liu, 2020). The proposed fed-batch metabolic cybergenetic platform comprises four major components (see Figure 1): (1) a cybergenetic input capable of inducing gene expression dynamically, (2) a manipulable substrate feeding stream, (3) online (bio)sensors and state estimators that monitor and estimate the state of the process, and (4) model-based optimization that operates in a closed-loop and fully automated fashion.
The remainder of the paper is structured as follows. In Section 2 we outline a dynamic constraint-based cybergenetic modeling approach that integrates metabolism, resource allocation, and inducible gene expression. The derived model is used to support model-based optimization, feedback control, and state estimation of metabolic cybergenetics (Sections 3 and 4). In Section 5, we evaluate our framework considering the optogenetic modulation of the ATPase F₁-subunit¹ in the anaerobic lactate fermentation by Escherichia coli for improved yield and productivity. We consider a fed-batch regime with nonhomogeneous light penetration, applying both open-loop and feedback control. Note that in a previous study, we covered only open-loop optimization in classic batch processes with homogeneous light penetration (Espinel-Ríos, Morabito, Pohlodek, et al., 2022).

2 | MODELING FOR DYNAMIC OPTIMIZATION AND CONTROL OF FED-BATCH METABOLIC CYBERGENETICS
We use an extended constraint-based modeling approach for capturing the combined dynamics of metabolism, resource allocation, and inducible gene expression. Constraint-based models are usually underdetermined (Gottstein et al., 2016; Klamt & von Kamp, 2022). Therefore, they are often formulated as optimization problems with biologically sound objective functions and subject to constraints.
Without loss of generality, we consider that cells are composed of metabolic enzymes, ribosomes, and quota elements. These biomass components are split into regulated proteins $p_{\mathrm{reg}}$ and unregulated components $p_{\mathrm{unr}}$, stacked as $p = [p_{\mathrm{reg}}^{T}, p_{\mathrm{unr}}^{T}]^{T}$. In this text, the term "regulated" refers to the fact that the protein expression is under cybergenetic control, externally modulated via a suitable genetic system such as a light-inducible gene expression system (Lindner & Diepold, 2021; Liu et al., 2018). Note that the regulated proteins can comprise enzymes directly involved in metabolic pathways, which is the main focus of this paper, but can in principle also include (anti)toxin proteins or antibiotic-resistance conferring proteins for growth modulation.

In Section 5, we will derive $u_c$ to account for light penetration in the context of optogenetics. For the sake of generality, we consider that the input perceived by the cells is given by a function $f_u$ which maps the input at the source $u_s$ to an average input $\bar{u}_c$. Hence, $\bar{u}_c = f_u(u_s, x, \theta_u)$, where $x$ can include in principle all the model states, $\theta_u \in \mathbb{R}^{n_{\theta_u}}$ comprises possible parameters of $f_u(\cdot)$, and $\bar{u}_c$ is the average value of $u_c$ in the bioreactor following well-mixed conditions. Introducing $\bar{u}_c$ simplifies the model, since the input then changes only in time and not in space, while still accounting for input gradients on average.
We describe the resulting change in the amount of regulated protein as the balance between a production term and a degradation term, $\dot{p}_{\mathrm{reg}} = F_{\mathrm{reg}}(\cdot) - D_{\mathrm{reg}}(\cdot)$. Ribosomes catalyze the translation of the messenger ribonucleic acid, resulting from the transcription process, into proteins. In bacteria such as E. coli, transcription and translation are highly coupled, meaning that translation occurs at the same time as active transcription (Scull et al., 2021; Yang et al., 2019). Therefore, we propose to combine these two processes into lumped dose-response functions $\eta(\cdot)$. For the degradation of regulated proteins, we consider both the effect of cell dilution due to growth and intrinsic protein turnover.
The latter is captured by $D_{\mathrm{reg}}$. On the other hand, the dilution of the regulated proteins due to cell growth is implicitly considered in our modeling framework. That is, as cellular components are modeled separately, the relative concentration of regulated proteins per biomass dry weight is reduced by the production of the remaining biomass components (such as unregulated enzymes, ribosomes, and quota elements).
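The implicit dilution effect can be illustrated with a small numerical sketch. The component names, molecular weights, and amounts below are hypothetical and chosen only for illustration; the point is that the mass fraction of a regulated protein falls when the other biomass components are produced, even if its own amount stays constant.

```python
def mass_fraction(amounts_mol, weights_g_per_mol, key):
    """Mass fraction of one component relative to the total biomass dry weight."""
    total = sum(amounts_mol[k] * weights_g_per_mol[k] for k in amounts_mol)
    return amounts_mol[key] * weights_g_per_mol[key] / total


# Hypothetical molecular weights (g/mol) and amounts (mol); not taken from the paper.
weights = {"p_reg": 5.0e4, "enzymes_unr": 4.0e4, "ribosomes": 2.5e6, "quota": 3.0e2}
before = {"p_reg": 1.0e-6, "enzymes_unr": 5.0e-6, "ribosomes": 1.0e-7, "quota": 1.0e-3}

# The cell doubles every unregulated component while p_reg is neither made nor degraded.
after = {k: (v if k == "p_reg" else 2.0 * v) for k, v in before.items()}

frac_before = mass_fraction(before, weights, "p_reg")
frac_after = mass_fraction(after, weights, "p_reg")
print(f"p_reg mass fraction: {frac_before:.4f} -> {frac_after:.4f}")
```

Because amounts rather than concentrations are tracked, no explicit dilution term is needed in this sketch: the drop in the mass fraction follows directly from the growth of the denominator.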
We connect the dynamics of the regulated proteins to the overall metabolism and cell resource allocation via dynamic enzyme-cost flux balance analysis (deFBA), a constraint-based metabolic framework that considers resource allocation constraints (Jabarivelisdeh et al., 2020;Jabarivelisdeh & Waldherr, 2018;Waldherr et al., 2015).
The amount of extracellular metabolites, collected in the molar vector $z$, is modeled by a dynamic mass balance. We consider quasi-steady-state dynamics for the amount of intracellular metabolites, collected in the molar vector $m$; this balance involves the stoichiometric matrix of $m$ and a function that describes the degradation of $m$.
The metabolic fluxes of reactions catalyzed by enzymes in $p_{\mathrm{unr}}$ are constrained by the corresponding catalytic enzyme concentration and catalytic constant ($k_{\mathrm{cat}}$). We consider enzyme saturation conditions. Thus, Equation (9) takes the product of the enzyme concentrations and the catalytic constants as upper bounds for the metabolic fluxes. While Equation (9) is an inequality constraint, we use an equality constraint in Equation (10) under the assumption that we have control over the fluxes catalyzed by $p_{\mathrm{reg},i}$. In other words, we shift this degree of freedom from the cell to an external controller.
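As an illustration of the capacity constraints described above, the following sketch checks whether a single flux is compatible with the available enzyme amount under saturation. The function name, the tolerance, and all numbers are hypothetical. The inequality is applied to unregulated enzymes and the equality to regulated ones, mirroring the distinction the text draws between Equations (9) and (10); applying the equality to the absolute flux is an assumption of this sketch.

```python
def flux_is_feasible(flux, k_cat, enzyme_amount, regulated=False, tol=1e-9):
    """Check the enzyme capacity constraint under saturation conditions.

    Unregulated enzymes: |flux| <= k_cat * enzyme_amount (upper bound).
    Regulated enzymes:   |flux| == k_cat * enzyme_amount (flux fixed externally).
    """
    capacity = k_cat * enzyme_amount
    if regulated:
        return abs(abs(flux) - capacity) <= tol
    return abs(flux) <= capacity + tol


# Hypothetical values: k_cat in 1/h, enzyme amount in mol, flux in mol/h.
ok_unr = flux_is_feasible(flux=-0.8, k_cat=100.0, enzyme_amount=0.01)
too_high = flux_is_feasible(flux=1.2, k_cat=100.0, enzyme_amount=0.01)
ok_reg = flux_is_feasible(flux=1.0, k_cat=100.0, enzyme_amount=0.01, regulated=True)
print(ok_unr, too_high, ok_reg)
```

In the full model these relations enter the optimization as constraints rather than as after-the-fact checks; the sketch only makes the two cases concrete.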
[...] of the biomass dry weight corresponds to a [...]. The bioreactor liquid volume $v_L$ changes over time due to the substrate feeding rate. Metabolic fluxes are constrained by biologically feasible lower and upper bounds. Similarly, we consider feasible bounds for the dynamic states. The conditions of the system at the initial process time $t_0$ are specified as initial conditions. Summarizing, we express the resulting dynamic constraint-based model for fed-batch metabolic cybergenetic systems in terms of the general dynamic optimization problem (16),³ where $F_V(\cdot)$ is the objective function that the cell optimizes (usually assumed to be the maximization of cell growth) and $V(\cdot)$ is a function of the resulting metabolic flux distribution. Solving this dynamic optimization problem allows one to simulate and predict the cell's behavior, as demonstrated in Section 5. We will use this dynamic constraint-based model as a basis for optimizing and controlling the process.

3 | OPTIMAL CONTROL FOR METABOLIC CYBERGENETICS
Based on the derived dynamic constraint-based model, we aim to find the optimal input trajectories to drive the cell metabolism toward maximizing a desired performance criterion described by a cost function $J(\cdot)$. Let us collect all the process inputs (manipulated variables) in one vector and all the model parameters in the vector $\theta$. Recall that $x$ contains all the dynamic states. To find the optimal inputs to the plant, we formulate the optimal control problem (17), in which $J(\cdot)$ is optimized over the input trajectories subject to the dynamic constraint-based model (16) (Equation 17b) and to additional constraints (Equation 17c). Problem (17) is a bilevel optimal control problem, as the dynamic constraint-based model in (16) involves an optimization on its own. Equation (17c) captures additional state and input constraints.

$J(\cdot)$ can be defined in several ways based on specific goals. One may want to maximize production, maintain a desired set-point, or follow a reference trajectory, among other possibilities. Equation (17c) can include, for example, physical-, safety-, or economic-related process constraints. If the process is run in batch mode, then $u_s(\cdot)$ can be set as the only optimization degree of freedom. The optimal control problem in (17) is an open-loop optimization, as only the initial conditions of the process states are used to compute an optimal input trajectory, which is then applied to the plant without feedback. Doing so would not allow reacting to unknown disturbances or model-plant mismatch.

4 | MODEL PREDICTIVE CONTROL FOR METABOLIC CYBERGENETICS
As we consider fed-batch processes, we use shrinking horizon model predictive control (MPC) to mitigate the effects of process uncertainty such as model-plant mismatch and disturbances (Findeisen & Allgöwer, 2002;Rawlings et al., 2020), that is, to mitigate the challenges of open-loop control.

4.1 | Shrinking horizon model predictive control
In MPC, the optimal control problem is solved repeatedly at given sampling times. At these sampling instants, the states of the system are measured or estimated with an observer. This introduces feedback, since the information on the current system states is passed to the controller and corrective control actions can be taken.
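Let $t_k$ be the sampling times at which measurements are taken.
Without loss of generality, we assume that state measurements are available at equidistant sampling times, that is, the $t_k$ are separated by a constant sampling interval $h$. At each $t_k$, the shrinking horizon MPC problem (18) is solved subject to Eqs. (1)-(14) and the additional constraints (18c), where $t \in [t_k, t_f]$ and $\tilde{x}_k$ indicates the measured value of $x$.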
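The shrinking horizon mechanism can be sketched on a toy problem. The example below does not use the dynamic constraint-based model; it replaces it with a scalar integrator and a closed-form planner, and all function names and numbers are hypothetical. It only illustrates the loop structure: re-plan at every sampling time over the remaining horizon, apply the first input, and measure again.

```python
def plan_inputs(x_now, x_target, steps_left):
    """Toy optimal control problem: reach x_target with equal input increments.

    For the integrator x[k+1] = x[k] + u[k], spreading the remaining distance
    evenly minimizes the sum of squared inputs.
    """
    return [(x_target - x_now) / steps_left] * steps_left


def plant(x, u, disturbance):
    """'True' process: integrator with a constant disturbance unknown to the controller."""
    return x + u + disturbance


def run_open_loop(x0, x_target, n_steps, disturbance):
    x = x0
    for u in plan_inputs(x0, x_target, n_steps):  # plan once, never update
        x = plant(x, u, disturbance)
    return x


def run_shrinking_horizon(x0, x_target, n_steps, disturbance):
    x = x0
    for k in range(n_steps):
        steps_left = n_steps - k  # horizon shrinks toward the fixed end time
        u = plan_inputs(x, x_target, steps_left)[0]  # apply only the first input
        x = plant(x, u, disturbance)  # new "measurement" for the next re-plan
    return x


x_ol = run_open_loop(0.0, 10.0, 12, disturbance=-0.1)
x_cl = run_shrinking_horizon(0.0, 10.0, 12, disturbance=-0.1)
print(f"open loop: {x_ol:.2f}, shrinking horizon: {x_cl:.2f}, target: 10.00")
```

In this toy setting the open-loop plan accumulates the disturbance over all steps, whereas the re-planned inputs leave only the effect of the last, uncorrected step.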
It is well known that introducing feedback increases the robustness of the controlled system even though the uncertainties are not explicitly taken into account in the controller (Findeisen & Allgöwer, 2005; Yu et al., 2014). More advanced robust MPC approaches (Mayne, 2014), such as stochastic MPC (Heirung et al., 2018; Mesbah et al., 2014), can explicitly take the system uncertainty into account. For simplicity, in this paper, we do not elaborate further on these approaches.
We assume that the culture volume can be monitored straightforwardly based on the applied feeding rate, and there is a range of online sensors available for the extracellular metabolite concentrations (Fung Shek & Betenbaugh, 2021; Reardon, 2021; Reyes et al., 2022). Therefore, monitoring $v_L$ and $z$ is technically possible with the present technologies. However, typically there are no commercial sensors for the complete intracellular biomass composition. To circumvent this challenge, some state estimators have been proposed for reconstructing the biomass composition (Espinel-Ríos, Morabito, Bettenbrock, et al., 2022; Jabarivelisdeh et al., 2020). In the next section, we briefly describe the use of a full information estimator, that is, an optimization-based estimator that considers the process dynamics and process constraints, as well as past and current measurements. For more details, we refer the reader to Espinel-Ríos, Morabito, Bettenbrock, et al. (2022).

4.2 | Reconstructing unmeasured cell components
Let $(\cdot)_i$ be a general optimization variable calculated at time $t_i$. We collect the dynamic Equations (3), (5), (6), and (12) in one vector. We indicate with $(\cdot)^*$ the solution of the full information estimation problem and with $\hat{(\cdot)}$ the prior information of a variable. With $x_0^*$, $w^*$, and $\theta^*$ we reconstruct the states at $t_k$, which can be used in the MPC.
The full information estimator considers all the measurements; instead, if only the measurements in a given time window are used, one refers to a moving horizon estimator (Elsheikh et al., 2021;Rawlings et al., 2020).
It is worth noting that other state estimation methods, such as Kalman filters, have been proposed in the literature for inferring unmeasured states (Elsheikh et al., 2021; Haseltine & Rawlings, 2005; Tuveri et al., 2021, 2022). Kalman filters, however, are most effective with unconstrained systems. Consequently, they might not be well-suited for dynamic constraint-based models like the one outlined in this work. In contrast, our optimization-based soft sensor naturally accommodates constraints on states and inputs.
Furthermore, Kalman filters have a memory of just one time step, while our soft sensor considers a trajectory of states and inputs, which theoretically offers improved state estimation performance and robustness.

5 | EXAMPLE: OPTOGENETIC CONTROL OF ATPase IN ANAEROBIC LACTATE FERMENTATION BY E. coli
We consider the anaerobic lactate fermentation by E. coli using glucose as substrate, with optogenetic control of the ATPase enzyme complex, cf. Figure 2. We only have one regulated protein; hence, $p_{\mathrm{reg}}$ contains only the ATPase. This enzyme is responsible for catalyzing the hydrolysis reaction of ATP into ADP. We focus on the E. coli KBM10111 strain, engineered with gene deletions of adhE (aldehyde-alcohol dehydrogenase), ackA (acetate kinase), and pta (phosphate acetyltransferase) (Hädicke et al., 2015). Under these conditions, lactate synthesis from pyruvate is required to balance the redox cofactors generated during glycolysis. Since glycolysis renders a net ATP gain, lactate production is linked to ATP synthesis (see Figure 3).
In such cases, where the product pathway is linked to net ATP formation, it has been shown that an enforced ATP turnover or wasting can lead to an increase in the substrate uptake and the metabolic flux through the ATP-producing pathway as a way to counterbalance the ATP loss (cf., e.g., Boecker et al., 2021; Hädicke et al., 2015; Zahoor et al., 2020). Dynamic manipulation of the ATPase expression, and thereby the ATPase flux, can thus be exploited to modulate the product yield and volumetric productivity in bioprocesses (Espinel-Ríos, Bettenbrock, et al., 2022; Espinel-Ríos, Morabito, Bettenbrock, et al., 2022).
We consider the CcaS/CcaR optogenetic system (Olson et al., 2014) for modulating the ATPase expression. CcaS is a sensor histidine kinase that is activated with green light (wavelength $\lambda$ of 535 nm).
Active CcaS phosphorylates the cognate response regulator CcaR.

5.1 | Model and process considerations
We assume that the dose-response function for ATPase expression follows a Hill function of the light input (Olson et al., 2014), where $\alpha$ is an input-independent basal rate of production (e.g., due to promoter leakage or constitutive expression), $\beta$ is an input-dependent maximum rate of production, $K$ is a saturation constant, and $\delta$ is the Hill coefficient.
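A minimal sketch of such a dose-response function is given below. It assumes the common Hill form $\alpha + \beta\, I^{\delta} / (K^{\delta} + I^{\delta})$, which matches the parameter descriptions above but is not reproduced from the original equation; the function name and all parameter values are hypothetical.

```python
def hill_expression_rate(light, alpha, beta, K, delta):
    """Hill-type dose-response: basal rate plus a saturating light-dependent term."""
    if light < 0:
        raise ValueError("light input must be non-negative")
    return alpha + beta * light**delta / (K**delta + light**delta)


# Hypothetical parameter values, for illustration only.
params = dict(alpha=0.1, beta=2.0, K=5.0, delta=2.0)

rate_dark = hill_expression_rate(0.0, **params)    # basal expression only
rate_half = hill_expression_rate(5.0, **params)    # light == K: half of beta is active
rate_bright = hill_expression_rate(1e6, **params)  # approaches alpha + beta
print(rate_dark, rate_half, rate_bright)
```

The three evaluations show the roles of the parameters: $\alpha$ sets the leakage in the dark, $K$ is the input at which half of the inducible capacity is reached, and $\alpha + \beta$ is the saturation level.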
We assume that light penetration inside the bioreactor is not homogeneous, as the cells interfere with the light beam. Let $l$ be the length between the two plates of the bioreactor. We set up a balance over an infinitesimally small distance $dl$, assuming that the light hits perpendicularly with respect to the illuminated flat surface and that the culture is well-mixed. After integrating from $l = 0$ to $l$, we obtain an expression for the local light intensity $I(l, B)$, in which $a_\lambda$ is a lumped biomass-specific constant that accounts for light scattering and absorption effects. Note that this expression follows a similar derivation as the Lambert-Beer law (Hofmann et al., 2014). We obtain $\bar{I}_c$ from the mean integral of $I(l, B)$ over the light path. Additionally, we consider negligible degradation in Equations (5), (6), and (8), while for the ATPase enzyme we assume a degradation term with a constant ATPase degradation rate $d_{\mathrm{ATPase}}$.
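The averaging step can be sketched as follows, assuming an exponential, Lambert-Beer-type attenuation $I(l) = I_0 \exp(-a_\lambda\, \rho\, l)$ with biomass concentration $\rho$. This functional form, the symbol $\rho$, the function names, and all numbers are assumptions made for illustration; they are not taken from the original equations. Under this assumption the mean integral over the light path has the closed form $I_0\,(1 - e^{-k})/k$ with $k = a_\lambda\, \rho\, L$.

```python
import math


def local_light(I0, a_lambda, biomass_conc, l):
    """Lambert-Beer-type attenuation at depth l (assumed exponential form)."""
    return I0 * math.exp(-a_lambda * biomass_conc * l)


def mean_light(I0, a_lambda, biomass_conc, path_length):
    """Average of local_light over 0..path_length (closed-form mean integral)."""
    k = a_lambda * biomass_conc * path_length
    if k == 0.0:
        return I0  # no cells or zero path length: no attenuation
    return I0 * (-math.expm1(-k)) / k  # expm1 keeps accuracy for small k


# Hypothetical numbers: incident intensity, attenuation constant, biomass, path length.
I_bar = mean_light(I0=100.0, a_lambda=0.5, biomass_conc=2.0, path_length=0.05)

# Numerical cross-check of the closed form with a midpoint rule.
n = 10_000
numeric = sum(local_light(100.0, 0.5, 2.0, (i + 0.5) * 0.05 / n) for i in range(n)) / n
print(I_bar, numeric)
```

The average input decreases as the biomass concentration grows, which is why the light perceived by the cells depends on the state of the culture and not only on the intensity set at the source.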
In Table 1, we summarize the model parameters for the CcaS/CcaR module and flat-panel bioreactor, as well as the process initial conditions. The cost function is chosen to maximize the lactate concentration at the end time of the process, with $t_f = 30$ h and 12 control actions ($N = 12$). We furthermore consider box constraints for the inputs. We add a further constraint to the optimization to ensure that all glucose, the feeding substrate, is fully consumed. Finally, the bioreactor volume should not surpass the maximum working volume capacity $v_{L\max}$; thus, we add the constraint $v_L \leq v_{L\max}$.

Remark on the numerical solution of the optimization problem: the solutions to problems (16)-(18) are optimal input functions. This renders these problems infinite-dimensional, hence generally impractical to solve. One way to obtain a solution is via a finite-dimensional approach (Findeisen & Allgöwer, 2002; Rawlings et al., 2020). In our case, this is achieved by assuming piecewise constant inputs and by discretizing the ordinary differential equations using orthogonal collocation based on Lagrange interpolation polynomials, as motivated by Waldherr et al. (2015).

TABLE 1 Relevant parameters and initial conditions of the nominal model.
Notes: (2) [...], where $t_{0.5}$ is the ATPase protein half-life time (Benito et al., 1991). (3) Based on a pilot-scale flat-panel photobioreactor design (Koller et al., 2018). (4) Assumed biologically sound order of magnitude. Estimated as ca. 1/30 of typical parameter values for microalgae (Blanken et al., 2016). (5) Estimated from $B(0)$ using resource balance analysis (Jabarivelisdeh et al., 2020).
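The piecewise constant input parameterization mentioned in the remark on the numerical solution can be sketched as follows. Only $t_f = 30$ h and $N = 12$ come from the text; the start time of zero, the feed values, the initial volume, and the maximum volume are hypothetical, and the helper names are illustrative. The sketch also shows that the volume constraint is easy to evaluate for such inputs, assuming the volume changes only through the feed.

```python
def piecewise_constant(u_values, t0, tf):
    """Return u(t) that holds u_values[k] on the k-th of len(u_values) equal intervals."""
    n = len(u_values)
    h = (tf - t0) / n

    def u(t):
        if not t0 <= t <= tf:
            raise ValueError("t outside the process time")
        k = min(int((t - t0) / h), n - 1)  # t == tf belongs to the last interval
        return u_values[k]

    return u


def final_volume(v0, feed_values, t0, tf):
    """Volume after integrating dv/dt = feed(t) exactly for a piecewise constant feed."""
    h = (tf - t0) / len(feed_values)
    return v0 + h * sum(feed_values)


# N = 12 control actions over 30 h; feed values (L/h) and volumes (L) are hypothetical.
feed = [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.05, 0.04, 0.03, 0.02, 0.01, 0.0]
u_feed = piecewise_constant(feed, t0=0.0, tf=30.0)
v_end = final_volume(v0=1.0, feed_values=feed, t0=0.0, tf=30.0)
volume_ok = v_end <= 2.0  # hypothetical maximum working volume
print(u_feed(6.0), v_end, volume_ok)
```

With this parameterization, the decision variables of the optimal control problem reduce to the $N$ input values per manipulated input, which is what makes the problem finite-dimensional.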

5.2 | Open-loop optimal optogenetic control
Figure 4 shows the open-loop optimization results for the fed-batch fermentation considering no model-plant mismatch. We depict four scenarios:
1. S1: high-strength inducible CcaS/CcaR system (high $\beta$ value).
The NI case rendered a final lactate concentration of 1434.3 mM, whereas S1 achieved 1572.3 mM (↑10%), S2 1538.5 mM (↑7%), and S3 1498.4 mM (↑4%).Note that, by the end of all fermentations, the maximum allowed bioreactor volume was reached and all the glucose was fully depleted.This implies that overall the same net amount of glucose was fed and consumed.Consequently, in the previous scenarios, the relative gains in product titer also correspond, proportionally, to increments in product yield and volumetric productivity.
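The reported relative gains follow directly from the final titers. The short check below recomputes them from the values quoted above; the variable and function names are illustrative.

```python
titers_mM = {"NI": 1434.3, "S1": 1572.3, "S2": 1538.5, "S3": 1498.4}


def relative_gain_percent(value, reference):
    """Relative change of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


gains = {name: relative_gain_percent(titer, titers_mM["NI"])
         for name, titer in titers_mM.items() if name != "NI"}
rounded = {name: round(gain) for name, gain in gains.items()}
print(rounded)  # {'S1': 10, 'S2': 7, 'S3': 4}
```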
Furthermore, in S1 the maximum ATPase enzyme concentration was 12.1% of the cell dry weight, in contrast to 8.5% and 3.8% in S2 and S3, respectively.As foreseeable, the higher the induction capacity strength of the CcaS/CcaR system, the higher the net ATPase enzyme expression, and therefore the higher the net increase in product yield.In previous works dealing with dynamic ATP turnover in one-stage batch fermentations, the increase in product yield was correlated with a loss in volumetric productivity (Espinel-Ríos, Bettenbrock, et al., 2022;Espinel-Ríos, Morabito, Bettenbrock, et al., 2022b).Here, we show that with a fed-batch system, it is possible to increase both the product yield via the ATP turnover mechanism and the volumetric productivity via the introduction of a feeding rate.
Compared to NI, scenarios S1-S3 resulted in 63%, 46%, and 28% lower final biomass concentrations.This can be explained by the combined effect of the lower biomass yields due to the ATP turnover and the potential resource burden related to the cost of producing the ATPase enzyme.Note that there is also a dilution effect from the feeding of the substrate.Overall, the increased ATP turnover rates managed to enhance the final lactate titer despite the lower biomass growth rates.
In all induction scenarios, there was at first a gradual increase in the feeding rate, followed by a continuous decrease after around the midterm of the fermentation.This allowed for making up sufficient biomass while keeping low induction levels of the ATPase enzyme.
Then, to avoid excessive dilution of the biomass, the feeding rate decreased at increasing ATPase expression levels. A benefit of our model-based optimization is that it takes into account resource allocation constraints. The resource allocation phenomena associated with the expression of the ATPase enzyme are presented in Figure 5, where we show the dynamic enzyme composition profiles throughout the open-loop fermentations. Note that the induction of the ATPase enzyme led to a reallocation of the unregulated enzymes. For instance, let us compare the profiles of enzymes frdABCD, fumB, mdh, ppc, ldhA, tpiA, and pgi. While they seem to slightly accumulate in the NI fermentation, they are kept at lower concentrations in the ATPase induction cases. The effect is clearer in scenario S1 because there the ATPase was expressed at higher levels.

5.3 | Counteracting model uncertainties and disturbances: optogenetic closed-loop control
Nominal open-loop control does not account for model uncertainties, unforeseen disturbances, and process changes. Thus, we now evaluate the performance of shrinking horizon MPC for addressing system uncertainty. We limit our analysis to the high-strength inducible CcaS/CcaR system, which provided the best results. We [...] to the open-loop optimization. Also, with MPC 1 there was no unconsumed glucose by the end of the process. Note that the MPC 1 scenario is very optimistic, as full state measurement is assumed and no measurement noise is present.
The MPC 2 results are more realistic regarding practical implementation, that is, with state estimation and measurement noise.
The estimation of the cell composition for selected species at the different sampling times is presented in Figure 7. We also calculated the standard error (SE_FIE) of the estimates.⁶ Overall, the full information estimator tracked the concentration trends of the biomass components well. However, it should be noted that, in general, the estimation improved as the process proceeded. That is, the estimations were less accurate during the first third of the process (cf., e.g., the estimation profiles of enzymes pfkA_fbaA, gltA_acnB_icd, and gdhA_glnA).

The potential of this technology is highlighted considering the dynamic control of the cellular ATP turnover via optogenetic regulation of the ATPase gene expression. We show that optimal control of the light intensity and the substrate feeding rate can enhance the process performance in terms of product titer and volumetric productivity. Furthermore, we demonstrated that introducing feedback via model predictive control can help to counteract system uncertainty.
We believe that the outlined metabolic cybergenetic framework opens the door to new and more advanced biotechnological applications where manipulating metabolic fluxes throughout the process is required.This is actually in line with dynamic metabolic engineering approaches and goes beyond traditional cybergenetic schemes where the regulated proteins (e.g., fluorescence reporters) are not directly involved in metabolic pathways.Moreover, the model-based feature of the presented framework can contribute to shortening and reducing the cost of process development, and obtaining a more robust, consistent, and flexible operation.
Note that the constraint-based dynamic model outlined in this work constitutes an optimization problem on its own.Consequently, model-based optimization using such models is of a bilevel nature.
Here, we applied, for simplicity of presentation, the Karush-Kuhn-Tucker conditions to the inner optimization problem to obtain a single-level optimization that can be solved with conventional nonlinear solvers. However, this transformation yields nonconvex mathematical programs with complementarity constraints, which can be difficult to solve due to the nonlinearity of these constraints.
Furthermore, the inclusion of Lagrange multipliers (dual variables) increases the size of the optimization problem. One might employ more tailored optimization approaches to overcome this challenge.
We are currently working on simpler mathematical modeling approaches for metabolic cybergenetic systems, augmented with machine learning, to reduce the complexity associated with bilevel optimizations and facilitate practical implementations.Ultimately, our goal is to experimentally validate the proposed framework considering the presented case study alongside other relevant bioprocesses.
We are also working on extending the scope of the presented metabolic cybergenetic framework to include synthetic microbial communities (cf., e.g., Espinel-Ríos et al., 2023, n.d.).
¹ From now on, we will refer to the F₁-subunit of the ATPase enzyme/gene as the "ATPase enzyme/gene."
² Quota elements include, for example, DNA, lipids, carbohydrates, noncatalytic proteins plus other small molecules.
³ The model allows describing batch systems by setting $\dot{v}_L = 0$. For continuous processes, one can include an additional flow rate leaving the bioreactor.
[...] second term of the objective function in (19a). We neglect the state noise and assume constant model parameters, hence they are not estimated. The matrix $R$ is chosen as the identity matrix.
[...] $E_i$: total number of $p_i$ estimates.
"unregulated" biomass components typically contain, for example, the metabolic enzymes that are not under cybergenetic control, ribosomes, and quota elements. Hereafter, we will omit writing the dependency of the variables with respect to time when clear from the context.

FIGURE 1 Overview of metabolic cybergenetics in fed-batch regime. Key metabolism-relevant proteins such as enzymes $p_{\mathrm{reg}} \in \mathbb{R}^{n_{p_{\mathrm{reg}}}}$ are under the regulation of inducible gene expression systems to enable different metabolic modes over time using external inputs (e.g., light intensity). Model-based optimization finds the optimal inputs to the plant. The process outcome can be monitored via (bio)sensors and state estimators. Repeated solution of the optimization leads to feedback control.

With regard to the tunable gene expression systems, we differentiate between the inputs manipulated by the controller

hold. This is especially relevant in large-scale setups where conditions tend to be less homogeneous, or where the input values received by the cells might depend on the cell density, for example, due to turbidity.
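As a rough illustration of why the input perceived by the cells can differ from the input set by the controller, the sketch below assumes Beer-Lambert attenuation of light across a culture layer. This attenuation law, the function name `average_light_intensity`, the attenuation coefficient, and all numbers are assumptions made here for illustration; they are not taken from the model in this work.

```python
import math


def average_light_intensity(incident, biomass, depth, attenuation):
    """Depth-averaged intensity for I(l) = incident * exp(-attenuation * biomass * l).

    incident:    intensity at the illuminated surface (controller-side input)
    biomass:     biomass concentration (hypothetical units: g/L)
    depth:       light path length through the culture (hypothetical units: m)
    attenuation: lumped attenuation coefficient (hypothetical units: L/(g*m))
    """
    optical_depth = attenuation * biomass * depth
    if optical_depth == 0.0:
        return incident  # clear medium: cells perceive the full incident light
    # Analytical mean of the exponential profile over 0 <= l <= depth.
    return incident * (-math.expm1(-optical_depth)) / optical_depth


# Hypothetical numbers, for illustration only.
for biomass in (0.0, 1.0, 5.0, 20.0):
    mean_intensity = average_light_intensity(
        incident=100.0, biomass=biomass, depth=0.02, attenuation=50.0
    )
    print(f"biomass = {biomass:5.1f} g/L -> average intensity = {mean_intensity:6.2f}")
```

Under these assumptions, the same controller-side intensity yields a lower average intensity inside the culture as the biomass concentration grows, which is the kind of cell-density dependence mentioned in the text.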
and degradation, respectively, in units of mole per time. Cells possess transcription factors that can switch between active and inactive states at a rate dictated by a specific signal. When active transcription factors bind the promoter region of a regulated gene, they can activate or repress the transcription process.
the fluxes for transport, metabolic, and biomass-producing reactions in molar amount per time. In the model, we assume that the feed contains only substrates. The cell needs to invest resources to manufacture its components. Thus, changing the expression of a regulated protein is expected to influence the production rate of other biomass components and the resulting metabolic flux distribution since resources are limited and shared within the cell. Note that including biomass-producing reactions in the network is a way to capture the resource cost because we explicitly consider the required stoichiometric precursor and energy equivalents for the synthesis of all biomass components. With this in mind, the amount of unregulated reactions catalyzed by an enzyme $p_{\mathrm{unr},i}$ and $|\cdot|$ refers to the absolute value operator. The metabolic fluxes of reactions catalyzed by enzymes in $p_{\mathrm{reg}}$ of reactions catalyzed by $p_{\mathrm{reg},i}$.
$h_s$ is a fixed sampling interval. Furthermore, we assume that the controller predicts up to the final time $t_f := N h_s$, where $N \in \mathbb{N}$ is the number of steps in the horizon. Therefore, the prediction horizon shrinks at every sampling time. The shrinking horizon MPC at time $t_k$
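The shrinking-horizon idea can be illustrated with a deliberately simple sketch. It assumes a hypothetical scalar integrator plant, a gain mismatch between model and plant, and a closed-form minimum-effort optimal control problem. None of this is the bioprocess model or the optimal control problem of this work; the function names and numbers are illustrative only.

```python
def solve_ocp(x_now, steps_left, target, gain, h_s):
    """Toy optimal control problem on the nominal model x+ = x + gain * h_s * u.

    Returns the minimum-effort input sequence that reaches `target` in exactly
    `steps_left` steps (the solution is a constant input over the horizon).
    """
    u = (target - x_now) / (gain * h_s * steps_left)
    return [u] * steps_left


def plant_step(x, u, gain, h_s):
    """One sampling interval of the (possibly mismatched) plant."""
    return x + gain * h_s * u


def shrinking_horizon_mpc(x0, target, n_steps, model_gain, plant_gain, h_s):
    """Re-solve the problem at every t_k over the remaining N - k steps."""
    x, horizons = x0, []
    for k in range(n_steps):
        steps_left = n_steps - k          # horizon shrinks at every sample
        horizons.append(steps_left)
        u_seq = solve_ocp(x, steps_left, target, model_gain, h_s)
        x = plant_step(x, u_seq[0], plant_gain, h_s)  # apply first input only
    return x, horizons


def open_loop(x0, target, n_steps, model_gain, plant_gain, h_s):
    """Solve once at t_0 and apply the whole sequence without feedback."""
    x = x0
    for u in solve_ocp(x0, n_steps, target, model_gain, h_s):
        x = plant_step(x, u, plant_gain, h_s)
    return x


# Hypothetical setup: the plant gain is 2% lower than the model gain.
x_mpc, horizons = shrinking_horizon_mpc(0.0, 10.0, 5, 1.0, 0.98, 1.0)
x_ol = open_loop(0.0, 10.0, 5, 1.0, 0.98, 1.0)
print("horizon lengths:", horizons)
print("terminal error, open loop:", abs(10.0 - x_ol))
print("terminal error, shrinking-horizon MPC:", abs(10.0 - x_mpc))
```

In this toy setting, re-solving over the remaining steps lets the controller correct for the mismatch at every sampling time, so the terminal error is smaller than with the open-loop input sequence.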
As with the MPC, we assume for simplicity of presentation equidistant sampling times for the estimator, although nonequidistant sampling times are also possible. At time $t_k$, a full information estimator can be formulated by solving the

Phosphorylated (active) CcaR enables the transcription of the target genes. In contrast, CcaS is inactivated with red light (λ 650 nm), thereby blocking transcription. From now on, let $u_I$ be the green light intensity manipulated by the controller and $\bar{I}_c$ the average of $I_c$, that is, of the green light intensity perceived by the cells inside the bioreactor. Therefore, the process inputs comprise one cybergenetic input plus the substrate feeding rate. Furthermore, we consider a flat-panel photobioreactor, consisting of two flat surfaces joined by a thin gap, thereby creating a rectangular channel (Chanquia et al., 2021). The bioreactor is illuminated from one side by a green light source. This geometry is known to maximize the illumination area per culture volume, hence it is appealing for optogenetics.

FIGURE 2 Overview of the considered example using the CcaS/CcaR optogenetic system for modulating the expression of the ATPase. The left side shows the considered flat-panel (photo)bioreactor and the metabolic cybergenetic control scheme, including the average light intensity inside the culture. On the right side, we show the effect of different expression levels of ATPase on the cell's metabolism. Picture of the bioreactor adapted from Pfaffinger et al. (2016).

FIGURE 3 Scheme of the resource allocation model for the anaerobic lactate fermentation by Escherichia coli KBM10111. Catalytic species are shown in italics (e.g., atpAGD refers to the genes of the ATPase enzyme). Some enzymes are lumped via underscore symbols. Blocked pathways are depicted in gray. Adapted from Espinel-Ríos, Bettenbrock, et al. (2022).
Figure 3 for a summary of the resource allocation model. In general, the model contains 34 fluxes: 16 metabolic reactions and 18 biomass-producing reactions. Of the latter, 16 reactions are for the synthesis of catalytic enzymes, one for ribosomes, and one for a lumped quota compound. It considers five species in z (glucose, lactate, formate, succinate, and carbon dioxide), 18 species in m, and 18 species in p. The cell composition (g/g biomass) is 0.06 catalytic enzymes, 0.38 noncatalytic enzymes, 0.27 ribosomes, and 0.29 other components (DNA, lipids, carbohydrates, etc.), hence $\phi_Q = 0.38 + 0.29$

(1) Assumed biologically sound values inferred from feasible deFBA simulations (Espinel-Ríos, Bettenbrock, et al., 2022) for different induction strength scenarios ($S_i$).
(2) Estimated as d

introduced model-plant mismatch by scaling the catalytic constants of the enzymes pfkA_fbaA, gpmA_eno, gapA_pgk, gltA_acnB_icd, and gdhA_glnA by a factor of 0.98, which slightly decreases the fermentation rates. We also scaled down δ and $d_{\mathrm{ATPase}}$ by 0.97 and 0.98, respectively; the latter decreases the steepness of the Hill function and the former makes the ATPase enzyme slower-degrading. The modified model was used for the plant simulations while the nominal model was given to the controller. Two MPC cases are considered:
1. MPC 1: all the states can be measured online without measurement noise.
2. MPC 2: the concentrations of the ATPase enzyme, biomass dry weight, and extracellular metabolites can be measured online. Gaussian white noise (1% standard deviation) is added to the measurements. The cell composition is estimated via full information estimation.
The reconstructed cell composition, along with the online measurements, is passed to the MPC. The MPC simulations are shown in Figure 6. We also plot the open-loop scenario (with model-plant mismatch) as a reference case. The open-loop controller resulted in a final lactate concentration of 1449.5 mM and 64.3 mM of net unconsumed glucose. The applied light intensity brought the ATPase enzyme concentration up to 12.6% of the cell dry weight, but then it decreased slightly to 11.9%. The applied light intensity in MPC 1 allowed the ATPase enzyme fraction in the cell to eventually surpass the value achieved in the open-loop fermentation. The combined effect of the corrected light intensities and feeding rates rendered a final lactate titer of 1567.8 mM. The latter represents an 8% improvement with respect to the open-loop case.

progressive improvement of the estimation is explained by the growing estimation horizon and thus the increasing number of available measurements. This furthermore explains why at the beginning of the process the controller's predictions were comparatively off with respect to MPC 1 and the open-loop optimization. MPC 2 reached an intracellular ATPase enzyme concentration of about 10.2%, leading to more biomass accumulation. The controller adjusted the feeding rates to avoid having unconsumed glucose by the end of the process. MPC 2 rendered a final lactate concentration of 1544.4 mM, very close to the value achieved in the MPC 1 scenario.

FIGURE 5 Enzyme expression heat map relative to the biomass dry weight for the fed-batch fermentations in Figure 4. From left to right: NI, no induction; S1, high-strength; S2, medium-strength; and S3, low-strength inducible CcaS/CcaR systems.

FIGURE 6 Closed-loop fed-batch simulations with model uncertainty for the high-strength inducible CcaS/CcaR system. MPC 1: with full state measurement and no measurement noise. MPC 2: with measurements of $p_{\mathrm{ATPase}}$, B, and z in the presence of measurement noise; full information estimator used for estimating p. The open-loop case (without online corrective actions) is also shown.

FIGURE 7 Online estimation of the cell components in percentage of cell dry weight. Only the eight most abundant species are shown. Filled circle: exact value. Empty circle: estimated state. The standard error of the estimate (SE_FIE) is presented.

The MPC simulations demonstrate that using model-based feedback control, optionally coupled to state estimation methods, can improve the process performance of metabolic cybergenetic systems in the presence of system uncertainty.

6 | CONCLUSIONS AND OUTLOOK

We propose to fuse cybergenetics with model-based optimization and predictive control for dynamic metabolic engineering applications. The proposed metabolic cybergenetic framework exploits the concept of online metabolic regulation by dynamically modulating the gene expression of metabolism-relevant intracellular proteins. To do so, we developed a dynamic constraint-based modeling framework that integrates the dynamics of metabolic reactions, resource allocation, and external gene expression regulation. The model is combined with model-based optimization, predictive control, and estimation methods to facilitate the implementation of metabolic cybergenetic systems.
2 These biomass components are contained in the molar