Stage‐cost design for optimal and model predictive control of linear port‐Hamiltonian systems: Energy efficiency and robustness

We consider singular optimal control of port‐Hamiltonian systems with minimal energy supply. We investigate the robustness of different stage‐cost designs w.r.t. time discretization and show that alternative formulations that are equivalent in continuous time differ strongly after discretization. Furthermore, we consider the impact of additional quadratic control regularization and demonstrate that this leads to a considerable increase in energy consumption. Then, we extend our results to the tracking problem within model predictive control and show that the intrinsic but singular choice of the cost functional as the supplied energy leads to a substantial improvement of the closed‐loop performance.


OPTIMAL CONTROL WITH MINIMAL ENERGY SUPPLY
We consider port-Hamiltonian (pH) systems given by

$$\dot{x}(t) = (J - R)Q\,x(t) + B\,u(t), \qquad y(t) = B^\top Q\,x(t), \tag{1}$$

with skew-symmetric structure matrix $J \in \mathbb{R}^{n \times n}$, symmetric positive semi-definite dissipation matrix $R \in \mathbb{R}^{n \times n}$, symmetric positive definite matrix $Q \in \mathbb{R}^{n \times n}$, and input matrix $B \in \mathbb{R}^{n \times m}$. On a time interval $[0, T]$ with $T > 0$, the total energy of the system is given by the Hamiltonian $H(x) = \tfrac{1}{2} x^\top Q x$, which satisfies the energy balance

$$H(x(T)) - H(x(0)) = \int_0^T u(t)^\top y(t)\,\mathrm{d}t - \int_0^T \|R^{1/2} Q\,x(t)\|^2\,\mathrm{d}t. \tag{2}$$

In view of this energy balance, the control task of a state transition from an initial state $x^0 \in \mathbb{R}^n$ to a terminal state $x^T \in \mathbb{R}^n$ with minimal energy supply subject to control constraints $\mathbb{U} \subset \mathbb{R}^m$ leads to the optimal control problem (OCP)

$$\min_{u \colon [0,T] \to \mathbb{U}} J_T(x, u) := \int_0^T u(t)^\top B^\top Q\,x(t)\,\mathrm{d}t \quad \text{s.t. the dynamics (1), } x(0) = x^0, \text{ and } x(T) = x^T. \tag{3}$$

Due to the linear dependence of the cost functional on the control, the OCP (3) is singular. However, using the pH structure, this problem was analyzed in Refs. [7,8] in view of existence of solutions, hidden regularities, and turnpike properties. If $x^T$ is reachable from $x^0$, which we tacitly assume in this work, existence of optimal controls is guaranteed by Schaller et al. [8, Proposition 7]. We denote the optimal control by $u_T^\star$ and the associated optimal state trajectory by $x_T^\star = x(\cdot\,; x^0, u_T^\star)$. Since the boundary values are fixed, rearranging the terms in Equation (2) yields

$$J_T(x, u) = H(x^T) - H(x^0) + \int_0^T \|R^{1/2} Q\,x(t)\|^2\,\mathrm{d}t. \tag{4}$$

Thus, replacing the cost $J_T$ in the OCP (3) by the dissipated energy $\int_0^T \|R^{1/2} Q\,x(t)\|^2\,\mathrm{d}t$ yields the same optimal control-state pair $(u_T^\star, x_T^\star)$. Intuitively speaking, this means that a state transition with minimal energy supply is a state transition with minimal dissipated energy and vice versa. This leads to a particular long-term behavior of optimal solutions, the turnpike property: optimal states are close to the conservative subspace $\ker(R^{1/2} Q)$ of the pH dynamics (1) for the majority of the time.
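The energy balance can be checked numerically on a small example. The following is a minimal sketch, not the authors' code: the matrices, the input signal, the initial state, and the helper names (`rhs`, `rk4`, `hamiltonian`) are hypothetical choices made only for illustration. The state is augmented by the accumulated supplied and dissipated energy, and the residual of the balance is evaluated at the final time.

```python
import numpy as np

# Hypothetical 3-state pH system (illustrative values, not from the paper).
J = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 1.0],
              [0.0, -1.0, 0.0]])   # skew-symmetric structure matrix
R = np.diag([0.0, 0.0, 1.0])       # symmetric positive semi-definite
Q = np.diag([1.0, 2.0, 1.0])       # symmetric positive definite
B = np.array([[1.0], [0.0], [0.0]])


def u_of_t(t):
    """Arbitrary smooth test input (assumption of this sketch)."""
    return np.array([np.sin(t)])


def rhs(t, z):
    """Augmented dynamics: state x, supplied energy, dissipated energy."""
    x = z[:3]
    u = u_of_t(t)
    y = B.T @ Q @ x
    dx = (J - R) @ Q @ x + B @ u
    supply_rate = float(u @ y)
    dissipation_rate = float(x @ Q @ R @ Q @ x)
    return np.concatenate([dx, [supply_rate, dissipation_rate]])


def rk4(z0, T, n_steps):
    """Classical fourth-order Runge-Kutta integration on [0, T]."""
    h = T / n_steps
    z, t = z0.copy(), 0.0
    for _ in range(n_steps):
        k1 = rhs(t, z)
        k2 = rhs(t + h / 2, z + h / 2 * k1)
        k3 = rhs(t + h / 2, z + h / 2 * k2)
        k4 = rhs(t + h, z + h * k3)
        z = z + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return z


def hamiltonian(x):
    return 0.5 * x @ Q @ x


x0 = np.array([1.0, 0.0, -1.0])
zT = rk4(np.concatenate([x0, [0.0, 0.0]]), T=5.0, n_steps=5000)
xT, supplied, dissipated = zT[:3], zT[3], zT[4]

# Residual of H(x(T)) - H(x(0)) = supplied - dissipated; small up to
# the integration error of the scheme.
balance_residual = hamiltonian(xT) - hamiltonian(x0) - (supplied - dissipated)
print(balance_residual)
```

Because the terminal and initial energies are fixed in the state-transition problem, any control that lowers `supplied` must lower `dissipated` by the same amount, which is the equivalence used in the text.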

DISCRETIZED OCPS: STAGE-COST DESIGN
The equivalence of the two cost functionals, supplied energy and dissipated energy, is not guaranteed for their discretized counterparts, as the energy balance may not be preserved by the discretization. In this section, we conduct numerical experiments with different sampling times $\Delta t \in \{10^{-1}, 5 \cdot 10^{-2}, 10^{-2}, 5 \cdot 10^{-3}, 10^{-3}, 5 \cdot 10^{-4}\}$ with particular focus on the trajectory error, optimality, and the turnpike property.
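Whether a discrete energy balance holds depends on the integrator. The sketch below is an illustration under stated assumptions, not the paper's discretization (which is not reproduced in this text): it compares the implicit midpoint rule, for which the balance holds exactly when supply and dissipation are evaluated at the midpoint state, with explicit Euler, for which it does not. The system matrices, the input sequence, and the function name `balance_residual` are hypothetical.

```python
import numpy as np

# Hypothetical pH data (illustrative values, not from the paper).
J = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 1.0],
              [0.0, -1.0, 0.0]])
R = np.diag([0.0, 0.0, 1.0])
Q = np.diag([1.0, 2.0, 1.0])
B = np.array([[1.0], [0.0], [0.0]])
A = (J - R) @ Q
n = A.shape[0]


def balance_residual(h, x0, u_seq, scheme):
    """Residual of the discrete energy balance for piecewise-constant controls.

    scheme = "midpoint": implicit midpoint rule, quantities evaluated at
                         the midpoint state (x_k + x_{k+1}) / 2.
    scheme = "euler":    explicit Euler, quantities evaluated at x_k.
    """
    I = np.eye(n)
    x = x0.copy()
    supplied = 0.0
    dissipated = 0.0
    for u in u_seq:
        if scheme == "midpoint":
            x_next = np.linalg.solve(I - 0.5 * h * A,
                                     (I + 0.5 * h * A) @ x + h * (B @ u))
            x_eval = 0.5 * (x + x_next)
        else:
            x_next = x + h * (A @ x + B @ u)
            x_eval = x
        supplied += h * float(u @ (B.T @ Q @ x_eval))
        dissipated += h * float(x_eval @ Q @ R @ Q @ x_eval)
        x = x_next
    energy_change = 0.5 * x @ Q @ x - 0.5 * x0 @ Q @ x0
    return energy_change - (supplied - dissipated)


h, n_steps = 0.1, 50
x0 = np.array([1.0, 0.0, -1.0])
u_seq = np.sin(h * np.arange(n_steps)).reshape(n_steps, 1)

res_midpoint = balance_residual(h, x0, u_seq, "midpoint")
res_euler = balance_residual(h, x0, u_seq, "euler")
print(res_midpoint, res_euler)
```

With a scheme that violates the balance, the discretized supplied-energy cost and the discretized dissipation cost differ by this residual, so the two discretized OCPs need not share their minimizers.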
For each sampling time, we take the optimal controls of the OCPs (6) and (7) and compute the corresponding state responses using Equation (5) with the fine step size $\Delta t = 5 \cdot 10^{-4}$. The resulting cost values are compared in Figure 1. While the cost attained with the controls of the OCP (7) varies only slightly, the cost attained with the controls of the OCP (6) changes strongly and converges only as the step size decreases. We conclude that the OCP (7) is more robust w.r.t. larger sampling times than the OCP (6).
In Figure 2, we show the $L^2(0,T)$ error between two sets of solutions. The first consists of the optimal controls of the OCPs (6) and (7) for $\Delta t \ge 5 \cdot 10^{-4}$, together with the corresponding states obtained from simulation with the fine step size $\Delta t = 5 \cdot 10^{-4}$. The second consists of the reference optimal trajectory $x^\star$ and control $u^\star$ obtained by solving either formulation of the OCP (3) for the finest sampling time $\Delta t = 5 \cdot 10^{-4}$. The optimal solutions of the OCP (7) for larger sampling times are always much closer to $x^\star$ and $u^\star$, which means that the OCP (7) is less affected by larger sampling times.
The turnpike property of the continuous-time OCP (3) (or its equivalent reformulation via the dissipated energy) with respect to the conservative (or nondissipative) subspace $\ker(R^{1/2} Q)$ has been proven in Refs. [7,8]. In Figure 3, we illustrate this property for different sampling times. We observe that whereas the optimal trajectories for the OCP (7) are already very accurate for $\Delta t = 10^{-1}$, the alternative formulation OCP (6) gives poor results for larger sampling times even though the state is contained in the subspace $\ker(R^{1/2} Q)$. Thus, we conclude that in view of the turnpike property w.r.t. the conservative subspace, the formulation (7) is preferable.

WHAT ABOUT QUADRATIC CONTROL PENALIZATION?
A common choice in optimal control is to penalize the control by means of a quadratic term. Whereas in some applications this reflects the control effort in the underlying physical system, the main motivation for this choice is mostly convenience: for standard quadratic costs, existence results based on radially unbounded cost functionals can be readily applied, the control is directly characterized by the optimality conditions, and in the absence of control constraints, Riccati theory is applicable. However, for the pH dynamics (1) and their intrinsic energy balance (2), the control effort is given by the product of input and output. We compare the intrinsic (but singular) cost in Equation (3) with a cost involving a quadratic control penalization, Equation (9). Discretization is performed as outlined in Section 3, yielding the OCP (10).

In Figure 4, optimal trajectories and controls are shown for the time horizons $T = 10$ and $T = 20$. Both optimal trajectories oscillate around zero. However, the oscillations of the solutions of the OCP (7) have a higher amplitude than those of the OCP (10). The control bounds are active for the optimal control of the OCP (7) at the beginning and the end of the time period, while the control of the OCP (10) behaves more moderately due to the quadratic control penalization.
In Figure 5, the distance to the nondissipative subspace is depicted. Whereas the optimal trajectories of both OCPs approach the subspace for increasing time horizon, the states of the OCP (7) are always closer than their counterparts of the OCP (10). In Figure 6, we observe that the OCP (10) also enjoys a turnpike property towards the conservative subspace. The following proposition provides a theoretical justification for this behavior by means of an exponential turnpike property. Lastly, the ratio of the energy supplied by the two optimal solutions is plotted over the time horizon in Figure 7 for $\Delta t = 10^{-1}$ and $\Delta t = 10^{-2}$. Even though the ratio oscillates with respect to the time horizon, the solution of the OCP (7) is two to three times more energy efficient than the solution of the OCP (10). For each time horizon, the optimal trajectory leaves the conservative subspace at a different point, which changes the ratio of the cost functionals. Although there is a small change w.r.t. the sampling time, our examinations show that both OCPs are robust to discretization with larger sampling times.

TRAJECTORY TRACKING AND MODEL PREDICTIVE CONTROL
In this part, we consider output tracking. Again, we compare the performance of the intrinsic control cost describing the supplied energy in Equation (3) with the quadratic penalization of Equation (9). To this end, we introduce a tracking functional based on an output matrix $C$, where we again use the energy balance (2) to obtain the counterpart of the reformulation (4). The terminal cost is due to the final state being free. Analogously, we endow the quadratic control cost of Equation (9) with the tracking cost.
In Figure 8A, we depict results for Setting 2. We observe that in view of tracking performance, the OCP (13) outperforms the OCP (14). Comparing tracking performance and energy supply, the ratio of the costs attained by the solutions of the OCP (13) and the OCP (14) is 0.37, showing that the open-loop controls of the OCP (13) achieve tracking with significantly higher energy efficiency.
For vanishing control $u = 0$, the pH system (8) has only one equilibrium, namely $(0, 0, 0)^\top$. To examine a problem with several equilibria, we introduce another example by modifying the dissipation matrix and the initial state.
In Figure 8B, we observe that when forcing the first state $x_1(t)$ to track the reference $y_{\mathrm{ref}}(t) \equiv 0.5$, all states converge to the corresponding equilibrium point $(0.5, 0.5, 0)^\top$ in the conservative subspace $\ker(R^{1/2} Q)$. However, due to the terminal cost $H(x(T))$ in Equation (13), optimal trajectories exhibit a leaving arc, that is, they leave the desired state towards the end of the time horizon.
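The origin of the terminal cost can be made explicit with a short derivation. The following is a sketch under stated assumptions: the tracking term $\|Cx(t) - y_{\mathrm{ref}}(t)\|^2$ and its unit weight are illustrative choices, since the exact tracking functional is not reproduced here; the energy balance is the standard one for the pH dynamics (1).

```latex
% Energy balance of the pH system over [0, T] with free terminal state:
\int_0^T u(t)^\top y(t)\,\mathrm{d}t
  = H(x(T)) - H(x^0) + \int_0^T \|R^{1/2} Q\,x(t)\|^2\,\mathrm{d}t .

% Adding an (assumed) quadratic tracking term and dropping the constant
% H(x^0), minimizing tracking error plus supplied energy,
\int_0^T \Big( \|C x(t) - y_{\mathrm{ref}}(t)\|^2 + u(t)^\top y(t) \Big)\,\mathrm{d}t ,

% is equivalent to minimizing tracking error plus dissipated energy plus
% the terminal energy:
\int_0^T \Big( \|C x(t) - y_{\mathrm{ref}}(t)\|^2
  + \|R^{1/2} Q\,x(t)\|^2 \Big)\,\mathrm{d}t + H(x(T)) .
```

In the state-transition problem $H(x(T))$ is a constant because $x(T)$ is prescribed; here it depends on the control and therefore remains in the cost. Since $H$ is minimal at the origin, it rewards moving away from a nonzero equilibrium near the end of the horizon, which is consistent with the leaving arc described above.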

Closed-loop: Model predictive control
We now extend our tracking results to MPC. Considering Setting 3, we apply the MPC algorithm with prediction horizon $T = 3.3$. In Figure 9A, we observe that the leaving arc corresponding to the terminal cost $H(x(T))$ is not present in closed loop due to the receding-horizon structure of MPC. Further, the tracking performance of the OCP (13) is much better than that of the OCP (14). In Table 1, the closed-loop costs for different time horizons are given. For Setting 2, the results show that the closed-loop cost of the OCP (13) is not affected by the choice of the prediction horizon and is always much smaller than that of the OCP (14). For Setting 3, since the term $H(x(T))$ is not zero in the predictions, the closed-loop cost is affected by the choice of the time horizon. Nevertheless, for both examples, the closed-loop cost of the OCP (13) is always smaller than that of the OCP (14), which means that the OCP (13) also reduces the energy consumption significantly for closed-loop tracking problems.
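The receding-horizon mechanism can be sketched in a few lines. This is an illustration under several assumptions, not the authors' implementation: the system matrices, control bound, weights, and the implicit-midpoint discretization are hypothetical, and the stage cost (tracking error plus dissipation, with terminal energy and no quadratic control penalty) is only modeled on the tracking formulation discussed above. With these choices each open-loop problem is a box-constrained linear least-squares problem, solved here with `scipy.optimize.lsq_linear`.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Hypothetical pH data; diagonal R and Q keep matrix square roots trivial.
J = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0], [-1.0, 1.0, 0.0]])
R = np.diag([0.0, 0.0, 1.0])
Q = np.eye(3)
B = np.array([[1.0], [0.0], [0.0]])
C = np.array([[1.0, 0.0, 0.0]])          # track the first state
n, m = B.shape

h, N = 0.1, 33                            # sampling time; horizon N * h = 3.3
u_max, y_ref = 1.0, 0.5                   # assumed control bound and reference

# Implicit-midpoint discretization (an assumption of this sketch).
A = (J - R) @ Q
Minv = np.linalg.inv(np.eye(n) - 0.5 * h * A)
Ad = Minv @ (np.eye(n) + 0.5 * h * A)
Bd = h * Minv @ B

# Condensed prediction X = Phi x0 + Gamma U with X = (x_1, ..., x_N).
Phi = np.vstack([np.linalg.matrix_power(Ad, k) for k in range(1, N + 1)])
Gamma = np.zeros((N * n, N * m))
for k in range(N):
    for j in range(k + 1):
        Gamma[k * n:(k + 1) * n, j * m:(j + 1) * m] = (
            np.linalg.matrix_power(Ad, k - j) @ Bd)

# Least-squares rows: tracking error and dissipation per stage,
# plus the terminal energy H(x_N) = 0.5 * x_N^T Q x_N.
stage = np.sqrt(h) * np.vstack([C, np.sqrt(R) @ Q])
W = np.kron(np.eye(N), stage)
terminal = np.zeros((n, N * n))
terminal[:, -n:] = np.sqrt(0.5) * np.sqrt(Q)
W = np.vstack([W, terminal])
target_stage = np.concatenate([[np.sqrt(h) * y_ref], np.zeros(n)])
target = np.concatenate([np.tile(target_stage, N), np.zeros(n)])
WG, WP = W @ Gamma, W @ Phi


def solve_ocp(x0):
    """Open-loop problem on the prediction horizon, box-constrained controls."""
    res = lsq_linear(WG, target - WP @ x0, bounds=(-u_max, u_max))
    return res.x.reshape(N, m)


def mpc(x0, n_steps):
    """Receding horizon: apply only the first control, then re-solve."""
    x, xs, us = x0.copy(), [x0.copy()], []
    for _ in range(n_steps):
        u = solve_ocp(x)[0]
        x = Ad @ x + Bd @ u
        xs.append(x)
        us.append(u)
    return np.array(xs), np.array(us)


x_cl, u_cl = mpc(np.zeros(n), n_steps=60)
print(x_cl[-1], u_cl[-1])
```

Only the first element of each open-loop solution is applied, so the end-of-horizon behavior of the predictions is recomputed before it is ever implemented; this is the mechanism behind the absence of the leaving arc in closed loop.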
Finally, in Figure 9B, we depict the tracking performance for a staircase reference for Setting 4 by applying the MPC algorithm with $T = 3.3$. If the desired output corresponds to an equilibrium point in the conservative subspace, both OCPs approach the reference signal. However, the OCP (13) always gives better tracking performance.

CONCLUSIONS
We studied optimal control of pH systems with minimal energy supply. First, we considered a state-transition problem with control and terminal state constraints and showed that discretization may change the cost values, so that theoretically equivalent cost functionals can give different results for larger step sizes. Then, we compared two cost functionals, the supplied energy and a cost with a quadratic control term, and showed that the OCP based on the supplied energy is considerably more energy-efficient. Finally, we extended our results to a trajectory-tracking problem within MPC and showed that we obtain better and more energy-efficient tracking performance without quadratically penalizing the control. Future work will utilize discrete energy balances [10] and discrete Hamiltonians [14] for the reformulation of the OCP, in particular in the context of nonlinear dynamics.