Robust Regret Optimal Control

This paper presents a synthesis method for robust, regret optimal control. The plant is modeled in discrete time by an uncertain linear time-invariant (LTI) system. An optimal non-causal controller is constructed using the nominal plant model and given full knowledge of the disturbance. Robust regret is defined relative to the performance of this optimal non-causal controller. It is shown that a controller achieves robust regret if and only if it satisfies a robust $H_\infty$ performance condition. DK-iteration can be used to synthesize a controller that satisfies this condition and hence achieves a given level of robust regret. The approach is demonstrated on three examples: (i) a simple single-input, single-output classical design, (ii) longitudinal control for a simplified Boeing 747 model, and (iii) an active suspension for a quarter-car model. All examples compare the robust regret optimal controllers against regret optimal controllers designed without uncertainty.

It is shown that this problem can be converted to an equivalent $H_\infty$ synthesis problem with a scaled plant model. Subsequent work considered the multiplicative ratio in performance, called the competitive ratio, achieved by a given controller relative to the non-causal controller [9], [10].
Additional related work considers state and input constraints using system level synthesis [11] and regret bounds for $H_\infty$ controllers [12]. A second thread measures regret of a given controller relative to the best static state feedback with full, non-causal knowledge of the disturbance sequence [13]. Online convex optimization techniques [14] are used to optimize over a class of disturbance action policies that depend on a finite history of the disturbance.
The key contribution of our paper is to provide a solution to a robust regret optimal control problem. We formulate the output feedback control problem in Section III-A with an uncertain plant and the general interconnection used in the robust control literature [1], [2]. We define robust regret relative to the performance achieved by the optimal non-causal controller on the nominal plant (with no uncertainty). We use a definition of regret that includes as special cases $H_\infty$ control [1], [2], [15], (additive) regret [3], [4], [5], [6], [7], [8], and (multiplicative) competitive ratio [9], [10]. Our definition of regret is a special case of regret-optimal control with weights [10] (although we include model uncertainty). It is important to note that model uncertainty is not equivalent to an exogenous disturbance. For example, model uncertainty can cause instability and unbounded signals, but this is not possible with bounded disturbances. This distinct feature of model uncertainty motivates the importance of considering robustness in addition to disturbance rejection.
We solve the robust regret problem using a similar solution approach as in these prior works. First, we derive the optimal non-causal controller from [16] in our more general setting (Section II-C). Second, the robust regret problem is converted, via a spectral factorization of the optimal non-causal cost, to an equivalent robust synthesis problem (Section III-B). The robust synthesis problem is non-convex, but a sub-optimal controller can be computed via a coordinatewise search known as DK-iteration or µ-synthesis [17], [18], [19], [20], [21]. As an intermediate step, we solve the nominal regret problem in Section II for the case of known plant dynamics.
As noted above, the nominal case generalizes the cost function and definition of regret compared to previous works. The nominal case also forms the foundation for our main results on robust regret with model uncertainty. Section IV provides examples to compare the proposed method against existing robust control methods, e.g. $H_\infty$ control, and (nominal) regret-based methods, e.g. additive regret and competitive ratio.

DRAFT
It is important to note that regret is typically defined using an omniscient control design as a benchmark. Our nominal regret definition applies this convention to the disturbance, i.e. the benchmark is the optimal non-causal controller with full knowledge of the disturbance. However, our robust regret definition does not apply this convention to the model uncertainty. Specifically, our benchmark for robust regret is the optimal controller designed on the nominal model with $\Delta = 0$. Thus the method in this paper designs controllers that have small regret with respect to the disturbance and are robust to model variations. It may be possible to design robust regret optimal controllers where the baseline has full knowledge of the disturbance and the specific value of the uncertain plant, e.g. by adapting gain-scheduling results such as [22]. This will be considered in future work.

A. Notation
This subsection reviews basic notation regarding vectors, matrices, signals, and systems. This material can be found in most standard texts on signals and systems, e.g. [1], [2].
Let $\mathbb{R}^n$ and $\mathbb{R}^{n \times m}$ denote the sets of real $n \times 1$ vectors and $n \times m$ matrices, respectively. Similarly, $\mathbb{C}^n$ and $\mathbb{C}^{n \times m}$ denote the sets of complex vectors and matrices of the given dimensions. The superscripts $\top$ and $*$ denote the transpose and complex conjugate (Hermitian) transpose of a matrix. Moreover, if $M \in \mathbb{C}^{n \times n}$ then $M^{-\top}$ denotes $(M^\top)^{-1} = (M^{-1})^\top$. The Euclidean norm (2-norm) for a vector $v \in \mathbb{C}^n$ is defined to be $\|v\|_2 := \sqrt{v^* v}$. The induced 2-norm for a matrix $M \in \mathbb{C}^{n \times m}$ is defined to be $\|M\|_{2 \to 2} := \max_{v \ne 0} \frac{\|Mv\|_2}{\|v\|_2}$. The induced 2-norm of a matrix $M$ is equal to its maximum singular value, i.e. $\|M\|_{2 \to 2} = \bar{\sigma}(M)$ (Section 2.8 of [1]). The same definitions for the vector 2-norm and matrix induced 2-norm hold for real vectors and matrices.
The sets of integers and nonnegative integers are denoted by $\mathbb{Z}$ and $\mathbb{N}$. Let $v : \mathbb{Z} \to \mathbb{R}^n$ and $w : \mathbb{Z} \to \mathbb{R}^n$ be real, vector-valued sequences. Note that we will mainly use two-sided sequences defined from $t = -\infty$ to $t = +\infty$. Define the inner product $\langle v, w \rangle := \sum_{t=-\infty}^{\infty} v_t^\top w_t$. The set $\ell_2$ is an inner product space consisting of sequences $v$ that satisfy $\langle v, v \rangle < \infty$. The corresponding norm is $\|v\|_2 := \sqrt{\langle v, v \rangle}$. Finally, define the truncation operator $P_T$ as a mapping from a sequence $v$ to another sequence $w = P_T v$ defined by $w_t = v_t$ for $t \le T$ and $w_t = 0$ for $t > T$. Next, consider a discrete-time, LTI system $G$ with the following state-space model:
$$x_{t+1} = A x_t + B d_t, \qquad e_t = C x_t + D d_t, \tag{1}$$
where $x_t \in \mathbb{R}^{n_x}$ is the state, $d_t \in \mathbb{R}^{n_d}$ is the input, and $e_t \in \mathbb{R}^{n_e}$ is the output. A system $G$ is said to be causal if $P_T G d = P_T G (P_T d)$, i.e. the output up to time $T$ only depends on the input up to time $T$. The system is said to be non-causal if it is not causal, i.e. the output can possibly depend on future values of the input.
The matrix $A \in \mathbb{R}^{n_x \times n_x}$ is said to be Schur stable if its spectral radius is $< 1$. If $A$ is Schur stable then $G$ is a causal, stable system. Hence $G$ maps an input $d \in \ell_2$ to an output $e \in \ell_2$ starting from the initial condition $x_{-\infty} = 0$. The induced $\ell_2$-norm for a stable system $G$ is defined to be $\|G\|_{2 \to 2} := \sup_{0 \ne d \in \ell_2} \frac{\|Gd\|_2}{\|d\|_2}$. Finally, the transfer function for (1) is $G(z) = C(zI_{n_x} - A)^{-1}B + D$. The $H_\infty$ norm for a stable system $G$ is $\|G\|_\infty := \max_{\theta \in [0, 2\pi]} \bar{\sigma}\big(G(e^{j\theta})\big)$.
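As a concrete illustration (not from the paper), the $H_\infty$ norm of a small discrete-time system can be estimated by gridding the frequency response $G(e^{j\theta})$ and taking the largest maximum singular value. A minimal sketch, using an illustrative first-order system:

```python
# Sketch: estimate ||G||_inf for a stable discrete-time LTI system by
# gridding G(e^{j theta}) = C (e^{j theta} I - A)^{-1} B + D and taking the
# peak maximum singular value. The matrices below are illustrative.
import numpy as np

def hinf_norm_grid(A, B, C, D, n_grid=2000):
    """Grid-based estimate of ||G||_inf for a Schur-stable A."""
    n = A.shape[0]
    peak = 0.0
    for theta in np.linspace(0.0, 2 * np.pi, n_grid, endpoint=False):
        G = C @ np.linalg.solve(np.exp(1j * theta) * np.eye(n) - A, B) + D
        peak = max(peak, np.linalg.svd(G, compute_uv=False)[0])
    return peak

# Example: x_{t+1} = 0.5 x_t + d_t, e_t = x_t, so G(z) = 1/(z - 0.5) and
# ||G||_inf = 1/(1 - 0.5) = 2, attained at theta = 0.
A = np.array([[0.5]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
print(hinf_norm_grid(A, B, C, D))  # close to 2.0
```

A frequency grid only lower-bounds the true supremum, but it is adequate for smooth responses; dedicated bisection algorithms exist for certified computation.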

B. Problem Formulation
Consider the feedback interconnection shown in Figure 1. The interconnection, denoted $F_L(P, K)$, consists of a controller $K$ in feedback around the lower channels of the plant $P$. This is a standard feedback diagram for optimal control formulations in the robust control literature [1], [2]. The plant $P$ is a discrete-time, linear time-invariant (LTI) system with the following state-space representation:
$$x_{t+1} = A x_t + B_d d_t + B_u u_t, \qquad e_t = C_e x_t + D_{eu} u_t, \qquad y_t = C_y x_t + D_{yd} d_t, \tag{2}$$
where $x_t$ is the state, $d_t$ is the disturbance, $u_t$ is the control input, $e_t$ is the error, and $y_t$ is the measurement. The goal is to design an output-feedback controller $K$ to stabilize the plant and ensure the error remains "small". The cost achieved by a controller $K$ on a disturbance $d \in \ell_2$ (assuming $x_{-\infty} = 0$) is $J(K, d) := \|e\|_2^2$. Here the cost is defined using a two-sided disturbance. It is common to formulate optimal control problems using one-sided $\ell_2$ signals starting from $t = 0$ with the initial condition $x_0 = 0$. However, a non-causal controller will be introduced later. Two-sided signals are used to avoid nonzero initial conditions arising from this non-causal controller.
To simplify notation, define $Q := C_e^\top C_e$, $S := C_e^\top D_{eu}$, and $R := D_{eu}^\top D_{eu}$. The cost can be re-written as:
$$J(K, d) = \sum_{t=-\infty}^{\infty} \left( x_t^\top Q x_t + 2 x_t^\top S u_t + u_t^\top R u_t \right). \tag{4}$$
Equation (4) is often used as the starting point for optimal control problems with a linear quadratic cost. If $(Q, S, R)$ are given then we can convert to equivalent output matrices $(C_e, D_{eu})$. For example, assume $Q \succeq 0$, $R \succ 0$, and $S = 0$ are given. Then the corresponding error $e$ is obtained with $C_e := \begin{bmatrix} Q^{1/2} \\ 0 \end{bmatrix}$ and $D_{eu} := \begin{bmatrix} 0 \\ R^{1/2} \end{bmatrix}$.

We will use the optimal non-causal controller, denoted $K_o$, as a baseline for comparison, following along the lines of [9], [3], [5], [6], [7], [8], [10]. The controller $K_o$ depends directly on past, present, and future values of $d$. This controller is described in detail in Section II-C below with an explicit state-space realization for $K_o$ given in Theorem 1. In contrast, a causal, output-feedback controller depends only on past and present values of the disturbance, indirectly via the effect of $d$ on the measurements. We define the performance of any (causal, output-feedback) controller $K$ relative to the baseline, non-causal controller $K_o$ as follows:

Definition 1. Let $\gamma_d \ge 0$ and $\gamma_J \ge 0$ be given. A controller $K$ achieves $(\gamma_d, \gamma_J)$-regret relative to the optimal non-causal controller $K_o$ if $F_L(P, K)$ is stable and:
$$J(K, d) < \gamma_d^2 \|d\|_2^2 + \gamma_J^2 J(K_o, d) \quad \text{for all nonzero } d \in \ell_2.$$

Section II-D provides a method to solve the following $(\gamma_d, \gamma_J)$-regret feasibility problem: Given $(\gamma_d, \gamma_J)$, find a causal, output-feedback controller $K$ that achieves $(\gamma_d, \gamma_J)$-regret relative to $K_o$ or verify that this level of regret cannot be achieved. This feasibility problem includes several existing synthesis methods as special cases:
• $H_\infty$ Synthesis: The $H_\infty$ feasibility problem is: Given $\gamma_\infty$, find a controller $K$ such that $\|F_L(P, K)\|_\infty < \gamma_\infty$ or verify that this level of performance cannot be achieved [1], [2], [15]. There are a variety of existing methods to solve the $H_\infty$ feasibility problem in discrete time including the use of: (i) Riccati equations [24], [25], [26], (ii) linear matrix inequalities
[27], or (iii) bilinear transformations combined with solutions for the continuous-time problem [15], [1]. The objective in $H_\infty$ synthesis is to minimize the closed-loop $H_\infty$ norm: $\inf_K \|F_L(P, K)\|_\infty$. This optimization can be solved to within any desired tolerance using bisection and the solution of the $H_\infty$ feasibility problem. To connect this to regret, recall that the $H_\infty$ norm of an LTI system is equal to the induced $\ell_2$ norm.
Hence the closed-loop with a controller $K$ satisfies $\|F_L(P, K)\|_\infty < \gamma_\infty$ if and only if $K$ achieves $(\gamma_\infty, 0)$-regret with respect to $K_o$.
• Competitive Ratio Synthesis: The competitive ratio feasibility problem is: Given $\gamma_C$, find a controller $K$ such that $\frac{J(K, d)}{J(K_o, d)} < \gamma_C^2$ for all nonzero $d \in \ell_2$ or verify that this level of performance cannot be achieved [9], [10]. Thus a controller yields a competitive ratio of $\gamma_C$ if and only if it achieves $(0, \gamma_C)$-regret with respect to $K_o$. Again, the competitive ratio can be minimized to within any desired tolerance using bisection and the solution of the $(0, \gamma_C)$-regret feasibility problem.
• Additive Regret Synthesis: The additive regret feasibility problem is: Given $\gamma_R$, find a controller $K$ such that $J(K, d) - J(K_o, d) < \gamma_R^2 \|d\|_2^2$ for all nonzero $d \in \ell_2$ or verify that this level of performance cannot be achieved [7], [8], [3], [5]. Thus a controller yields $\gamma_R$-regret if and only if it achieves $(\gamma_R, 1)$-regret with respect to $K_o$. Again, the $\gamma_R$-regret can be minimized to within any desired tolerance using bisection with the $(\gamma_R, 1)$-regret feasibility problem. The $\gamma_R$-regret is additive in the sense that $J(K, d)$ is within an additive factor $\gamma_R^2 \|d\|_2^2$ of the optimal non-causal cost.

In general we can use the $(\gamma_d, \gamma_J)$-regret feasibility problem to solve for the Pareto optimal front of values for $(\gamma_d, \gamma_J)$. Specifically, the Pareto front can be characterized by the following optimization parameterized by $\theta \in [0, 1]$:
$$\gamma^*(\theta) := \inf_K \left\{ \gamma : J(K, d) < \gamma^2 \left[ (1-\theta) \|d\|_2^2 + \theta\, J(K_o, d) \right] \text{ for all nonzero } d \in \ell_2 \right\}.$$
The constraint with $\theta = 0$ corresponds to the performance bound $J(K, d) < \gamma^2 \|d\|_2^2$. This yields $H_\infty$ synthesis, i.e. $\gamma^*(0) = \gamma_\infty$. Similarly, the constraint with $\theta = 1$ is $J(K, d) < \gamma^2 J(K_o, d)$ and this yields competitive ratio synthesis, i.e. $\gamma^*(1) = \gamma_C$. These are the extreme endpoints on the Pareto front. Any other value of $\theta \in (0, 1)$ corresponds to the bound $J(K, d) < \gamma^2 (1-\theta) \|d\|_2^2 + \gamma^2 \theta\, J(K_o, d)$. Thus all other points on the Pareto front can be viewed as optimizing with respect to a convex combination of the $H_\infty$ and competitive ratio costs. The special case of additive regret synthesis gives the specific point $(\gamma_R, 1)$ on the Pareto front. This corresponds to the value of $\theta$ such that $\sqrt{\theta}\,\gamma^*(\theta) = 1$. (Such a value of $\theta$ exists under mild technical conditions.*) Finally, we note that $(\gamma_d, \gamma_J)$-regret is a special case of regret-optimal control with weights (Section 4.2 of [10]).
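Each of the synthesis problems above is minimized by bisecting on the performance level and repeatedly calling the corresponding feasibility test. The scheme can be sketched generically; the oracle below is a stand-in for an $H_\infty$ or regret feasibility solver, and the "optimal" level of 2.5 is purely illustrative:

```python
# Sketch: bisection on a performance level gamma, given a monotone
# feasibility test (feasible(g) implies feasible(g') for g' > g).
def bisect_gamma(feasible, lo=0.0, hi=100.0, tol=1e-6):
    """Return the smallest gamma (within tol) such that feasible(gamma) holds."""
    assert feasible(hi), "upper bound must be feasible to start"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid      # feasible: tighten from above
        else:
            lo = mid      # infeasible: raise the lower bound
    return hi

# Toy oracle: pretend the optimal achievable level is gamma* = 2.5.
print(bisect_gamma(lambda g: g >= 2.5))  # close to 2.5
```

The same loop minimizes $\gamma_\infty$, $\gamma_C$, or $\gamma_R$ depending on which feasibility problem the oracle solves.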

C. Optimal Non-Causal Controller
The optimal non-causal controller is assumed to have full knowledge of the plant dynamics, plant state, and the (past, current, and future) values of the disturbance. $K_o$ is optimal in the sense that it minimizes $J(K, d)$ for each $d \in \ell_2$. A solution for the optimal non-causal controller is given in Theorem 11.2.1 of [16]. The controller is expressed as an operator, with similar results used in [7], [8], [10]. An explicit state-space model for the finite-horizon, non-causal controller is constructed in [3], [5] using dynamic programming. These prior results were given for the case where $S = 0$ in the cost function (4). The corresponding infinite-horizon result is given below as Theorem 1, allowing for $S \ne 0$. An independent proof is given here for completeness using a completion-of-the-squares argument.
The state-space model for $K_o$ will be constructed using the stabilizing solution $X$ of a discrete-time algebraic Riccati equation (DARE). The next lemma gives sufficient conditions for the existence of this stabilizing solution of the DARE.

Lemma 1. Assume that: (i) $R = D_{eu}^\top D_{eu} \succ 0$, (ii) $(A, B_u)$ is stabilizable, (iii) $A$ is nonsingular, and (iv) $\begin{bmatrix} A - e^{j\theta} I & B_u \\ C_e & D_{eu} \end{bmatrix}$ has full column rank for all $\theta \in [0, 2\pi]$. Then there is a unique stabilizing solution $X \succeq 0$ such that:
1) $X$ satisfies the following DARE:
$$X = A^\top X A - (A^\top X B_u + S)(R + B_u^\top X B_u)^{-1}(B_u^\top X A + S^\top) + Q. \tag{7}$$
2) The gain $K_x := (R + B_u^\top X B_u)^{-1}(B_u^\top X A + S^\top)$ is such that $A - B_u K_x$ is Schur stable.
3) $A - B_u K_x$ is nonsingular.

* The non-causal controller provides a lower bound on the performance of any causal controller: $J(K_o, d) \le J(K, d)$ for any causal controller $K$. Hence the competitive ratio must satisfy $\gamma_C \ge 1$, which further implies $\sqrt{\theta}\,\gamma^*(\theta) \ge 1$ at $\theta = 1$. Moreover, $\gamma^*(\theta)$ is bounded if the plant is stabilizable and observable. Finally, if $\gamma^*(\theta)$ is a continuous function then $\sqrt{\theta}\,\gamma^*(\theta)$ will cross a value of 1 for some value of $\theta \in [0, 1]$.

Proof. Statements 1)–2) follow from Corollary 21.13 and Theorem 21.7 of [1] (after aligning the notation). Note that the stabilizing solution $X$, if it exists, is such that $(R + B_u^\top X B_u)$ is nonsingular. The matrix inversion lemma can be used to show:
Statement 3) follows from this equation and the assumption that $A$ is nonsingular.

The special case $S = 0$ corresponds to $C_e := \begin{bmatrix} Q^{1/2} \\ 0 \end{bmatrix}$ and $D_{eu} := \begin{bmatrix} 0 \\ R^{1/2} \end{bmatrix}$. In this case the conditions (i)–(iv) of Lemma 1 simplify to: (i) $R \succ 0$, (ii) $(A, B_u)$ stabilizable, (iii) $A$ nonsingular, and (iv) $(A, Q)$ has no unobservable modes on the unit circle. The next theorem constructs the optimal non-causal controller using the stabilizing solution of the DARE (again allowing for $S \ne 0$).

Theorem 1. Assume $(A, B_u, C_e, D_{eu})$ satisfy conditions (i)–(iv) in Lemma 1. Let $X \succeq 0$ be the unique stabilizing solution of the DARE with corresponding gain $K_x$. Define a non-causal controller $K_o$ with inputs $(x_t, d_t)$ and output $u^o_t$ by the following update equations:
Then $J(K_o, d) \le J(K, d)$ for any stabilizing controller $K$ and disturbance $d \in \ell_2$.
Proof.The optimality of the non-causal controller will be shown via completion of the squares.
First, define $H := R + B_u^\top X B_u \succ 0$ and express the DARE as:
Substitute for $Q$ using the DARE to show:
Thus the per-step cost achieved by any controller $K$ is:
The per-step cost can be rewritten in terms of the input $u^o$ generated by the non-causal controller. The terms on the last two lines can be combined and simplified after some algebra. There are two key steps in this simplification. First, use the dynamics for $x_{t+1}$ to show:
Next, use the dynamics for $x_{t+1}$ and $v_t$ to show:
Use these two results to simplify the last two lines of (10), thus yielding:
Finally, we obtain the cost by summing from $t = -\infty$ to $\infty$. The two terms on the last line of (11) form telescoping sums. These telescoping sums equal zero due to the assumption that $x_{-\infty} = 0$ and the controller $K$ is stabilizing so that $x_t \to 0$ as $t \to \infty$. Thus the cost is:
The terms on the second and third lines only depend on $d$ and not on the choice of $u$. Moreover, $H \succ 0$ and the first term is minimized by $u_t = u^o_t$. Hence the non-causal controller $K_o$ minimizes the cost.
The minimal cost achieved by the non-causal controller is obtained by setting $u_t = u^o_t$ in (12):
The non-causal dynamics for $v_t$ in (8) are stable when iterated backward from $v_\infty = 0$. If $d \in \ell_2$ then $v \in \ell_2$ and this cost is finite.
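The stabilizing DARE solution and gain of Lemma 1 can be computed numerically. A minimal sketch using SciPy's solver, which accepts the cross term $S$; the system matrices below are illustrative, not from the paper:

```python
# Sketch: stabilizing DARE solution X and gain K_x, with cross term S.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.1, 0.2], [0.0, 0.9]])
Bu = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
S = np.array([[0.1], [0.0]])          # cross term C_e^T D_eu (illustrative)

X = solve_discrete_are(A, Bu, Q, R, s=S)
H = R + Bu.T @ X @ Bu                 # H := R + Bu' X Bu, positive definite
Kx = np.linalg.solve(H, Bu.T @ X @ A + S.T)

# The stabilizing solution makes A - Bu*Kx Schur stable (spectral radius < 1).
rho = max(abs(np.linalg.eigvals(A - Bu @ Kx)))
print(rho < 1.0)
```

The returned $X$ satisfies the DARE residual to numerical precision, and the spectral-radius check mirrors the Schur-stability claim of Lemma 1.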

D. Output Feedback Control Design
We now return to the $(\gamma_d, \gamma_J)$-regret feasibility problem: Given $(\gamma_d, \gamma_J)$, find a causal, output-feedback controller $K$ that achieves $(\gamma_d, \gamma_J)$-regret relative to $K_o$ or verify that this level of regret cannot be achieved. This section provides a solution to this feasibility problem. Again, we follow the basic procedure in [9], [3], [5], [6], [7], [8], [10] and use a spectral factorization to reduce the problem to an equivalent $H_\infty$ feasibility problem.
The regret in Definition 1 involves bounding the cost $J(K, d)$ achieved by a causal $K$ by the following:
Here $e^o$ is the error generated by the closed-loop dynamics with the non-causal controller $K_o$ in (8). These closed-loop dynamics are given by:
where $\hat{A}_{11} := A - B_u K_x$ is a Schur, nonsingular matrix by Lemma 1.
We can explicitly solve for $v_{t+1}$ in terms of $(v_t, d_t)$ by inverting $\hat{A}_{11}^\top$ in the second equation. This allows the closed-loop dynamics with $K_o$ to be expressed in a simpler form. Specifically, define an augmented state and error as $\tilde{x}_t := \begin{bmatrix} x_t \\ v_t \end{bmatrix}$ and $\hat{e}_t := \begin{bmatrix} \gamma_J e^o_t \\ \gamma_d d_t \end{bmatrix}$. The regret in Definition 1 can be written as:
where the closed-loop dynamics with $K_o$ have the form
with the state matrices:
This relation is used to obtain these simplified expressions. Let $P$ denote the closed-loop dynamics in (14).
The eigenvalues of $\hat{A}_{11}^{-\top}$ lie outside the unit disk. These eigenvalues are associated with the stable, non-causal dynamics of the controller $K_o$. Specifically, an input $d \in \ell_2$ generates $v \in \ell_2$ when iterating the non-causal controller backward from $v_\infty = 0$. The plant dynamics then generate $x \in \ell_2$ when iterating forward from $x_{-\infty} = 0$. Moreover, $v, x \in \ell_2$ imply that $v_t \to 0$ as $t \to -\infty$ and $x_t \to 0$ as $t \to \infty$. Hence the augmented state satisfies the boundary conditions $\tilde{x}_{-\infty} = 0$ and $\tilde{x}_\infty = 0$.† In summary, the system (14) is stable but with causal and non-causal dynamics. A spectral factorization can be used to re-write $\|\hat{e}\|_2^2$ using only stable, causal dynamics.
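The reciprocal relation behind the first claim above is easy to check numerically: the eigenvalues of $A^{-\top}$ are the reciprocals of those of $A$, so a Schur-stable matrix has an inverse transpose whose spectrum lies outside the unit disk. A small sketch with an illustrative matrix:

```python
# Sketch: for Schur-stable A_hat, the eigenvalues of inv(A_hat)^T are the
# reciprocals of those of A_hat, hence all outside the unit disk
# (the stable, anti-causal direction). A_hat below is illustrative.
import numpy as np

A_hat = np.array([[0.5, 0.2],
                  [0.0, 0.8]])                  # eigenvalues 0.5 and 0.8

eigs_inv = np.linalg.eigvals(np.linalg.inv(A_hat).T)

print(all(abs(e) > 1 for e in eigs_inv))        # True: all outside the unit disk
```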
Lemma 2. Assume that $(A, B_u, C_e, D_{eu})$ satisfy conditions (i)–(iv) in Lemma 1 so that the DARE (7) has a stabilizing solution $X \succeq 0$. This stabilizing solution is used to define the dynamics $P$ from $d_t$ to $\hat{e}_t$ in (14).
Let $\gamma_d > 0$. Then there exists a system $F$ such that: (i) $\|\hat{e}\|_2^2 = \|F d\|_2^2$ for all $d \in \ell_2$, (ii) $F$ is stable, causal, and invertible, and (iii) $F^{-1}$ is square, stable, and causal. Moreover, it is possible to construct $F$ with state dimension $n_x$.
It is important to emphasize that $K_o$ is non-causal. Hence the closed-loop (14) from $d$ to $\hat{e}$ has both causal and non-causal dynamics. The spectral factorization theorem does not state that $\hat{e}$ can be computed causally. Rather, it states that the cost $\|\hat{e}\|_2^2$ can be equivalently computed as $\|F d\|_2^2$ where $F$ is stable and causal. The result is a restatement of Lemma 5, which is stated and proved in Appendix B. The proof mostly follows from existing spectral factorization results that are reviewed in Appendix A.
The one new aspect of Lemma 2 is to show that the spectral factor $F$ can be constructed with state dimension $n_x$. This requires some additional technical arguments as $P$ in (14) has state dimension $2n_x$. An explicit state-space construction for the spectral factorization is given by Lemma 4 in Appendix A and Lemma 5 in Appendix B. Lemma 2 assumes $\gamma_d > 0$. This assumption is not satisfied by competitive ratio control, which uses $\gamma_d = 0$. In fact, a spectral factorization also exists in the case of competitive ratio control (under a weaker set of assumptions). This requires additional technical details because $\gamma_d = 0$ corresponds to a singular control problem ($\bar{R} = \bar{D}^\top \bar{D} = 0$). These additional details are omitted.
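The essential idea of spectral factorization can be illustrated on a scalar example; this is not the paper's state-space construction, just the underlying identity. The spectrum $5 + 4\cos\theta$ factors as $|F(e^{j\theta})|^2$ with $F(z) = 2 + z^{-1}$, which is causal, stable, and has a stable causal inverse (its zero $z = -1/2$ lies inside the unit circle):

```python
# Sketch: scalar spectral factorization. Phi(e^{j t}) = 5 + 4 cos(t) equals
# |F(e^{j t})|^2 for the minimum-phase factor F(z) = 2 + z^{-1}.
import numpy as np

theta = np.linspace(0, 2 * np.pi, 512, endpoint=False)
z = np.exp(1j * theta)

Phi = 5 + 4 * np.cos(theta)       # spectrum to be factored
F = 2 + z**-1                     # candidate minimum-phase spectral factor

print(np.allclose(np.abs(F)**2, Phi))  # True: |F|^2 matches Phi on the circle
```

By Parseval's relation, replacing a system by such a factor preserves the $\ell_2$ cost while making the dynamics causal, which is exactly the role Lemma 2 assigns to $F$.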
We now show that the $(\gamma_d, \gamma_J)$-regret feasibility problem can be reduced to an $H_\infty$ feasibility problem. The regret in Definition 1 can be written as in (13): $J(K, d) < \|\hat{e}\|_2^2$ for all nonzero $d \in \ell_2$. The left side $J(K, d)$ is equal to the closed-loop cost $\|e\|_2^2 = \|F_L(P, K) d\|_2^2$ achieved with controller $K$ (Figure 1). The right side $\|\hat{e}\|_2^2$ is equal to $\|F d\|_2^2$ by Lemma 2. The spectral factor $F$ depends on the choice of $(\gamma_d, \gamma_J)$ and is stable with a stable inverse. Define $\hat{d} := F d$ so

† The use of two-sided signals ensures that the augmented state satisfies zero boundary conditions. This would not hold if we used one-sided signals. In particular, iterating $K_o$ (8) backward from $v_\infty = 0$ could yield $v_0 \ne 0$ and hence $\tilde{x}_0 = \begin{bmatrix} 0 \\ v_0 \end{bmatrix} \ne 0$. Such nonzero initial conditions create additional technical issues that are avoided by using two-sided signals.
DRAFT that d = F −1 d.The set of all d ∈ ℓ 2 maps 1-to-1 on the set of all d ∈ ℓ 2 .Thus we can rewrite the regret bound as The system F L (P, K)F −1 corresponds to the closed-loop F L (P, K) weighted by the spectral factor inverse F −1 .This weighted closed-loop is shown graphically in Figure 2. Equation 15corresponds to a unit bound on the closed-loop H ∞ norm, i.e. ∥F L (P, K)F −1 ∥ ∞ < 1.This yields the following main result connecting the regret and H ∞ feasibility problems.
Theorem 2. A controller $K$ achieves $(\gamma_d, \gamma_J)$-regret relative to the optimal non-causal controller $K_o$ if and only if $F_L(P, K) F^{-1}$ is stable and $\|F_L(P, K) F^{-1}\|_\infty < 1$.

Fig. 2. Feedback interconnection $F_L(P, K) F^{-1}$ for regret feasibility.
Note that $F^{-1}$ is stable. Hence the controller $K$ stabilizes $F_L(P, K) F^{-1}$ if and only if $K$ stabilizes $F_L(P, K)$. There are well-known results for (sub-optimal) $H_\infty$ synthesis, as summarized in Section II-B. For example, Theorem 5.2 in [27] gives necessary and sufficient conditions to synthesize a stabilizing controller $K$ such that $\|F_L(P, K) F^{-1}\|_\infty < 1$. These conditions are given in terms of linear matrix inequalities (LMIs). No such controller exists if the LMIs are infeasible. The only additional assumptions on the plant $P$ in (2) are that $(A, B_u)$ is stabilizable and $(A, C_y)$ is detectable. These are the minimal assumptions required for a stabilizing controller to exist. Riccati equation solutions can be developed either directly in discrete time [24], [25], [26] or mapped, via the bilinear transform, to solutions for the continuous-time problem [15], [1]. Both of these Riccati solutions require additional technical assumptions on the plant beyond stabilizability and detectability of $(A, B_u, C_y)$.
In summary, we can construct a controller that achieves $(\gamma_d, \gamma_J)$-regret relative to the non-causal controller $K_o$ (or determine that one does not exist) by the following steps:
1) Construct the optimal non-causal controller as in Theorem 1.
2) Construct the spectral factor $F$ for the given $(\gamma_d, \gamma_J)$ based on Lemma 2 and Appendix A, and
3) Use existing $H_\infty$ synthesis methods [28] to find a stabilizing controller $K$ such that $\|F_L(P, K) F^{-1}\|_\infty < 1$ (or determine that one does not exist).
The resulting controller $K$ is dynamic, in general, and takes measurements $y$ to produce control commands $u$. The weighted plant used for synthesis has state dimension $2n_x$ because both $P$ and $F^{-1}$ have state dimension $n_x$. It is known that if the $H_\infty$ problem is feasible then the dynamic controller $K$ can be constructed with the same state dimension as the weighted plant, i.e. $K$ can be constructed with order $2n_x$.
Finally, we comment on the special case of full-information control. This corresponds to the case where the controller can directly measure the disturbance and plant state. Full-information regret-based control has been studied extensively, e.g. [16], [7], [8], [9], [29].‡ The controller directly measures $d_t$ in the full-information problem. The measurement of $d_t$ can be used by the controller to perfectly reconstruct $\hat{d}_t$ via $\hat{d} = F d$. Let $\tilde{x}_t$ denote the state of $F$ evolving with input $d$. This is equal to the state of $F^{-1}$ evolving with input $\hat{d}$. Thus the full-information $H_\infty$ synthesis problem can be equivalently solved using $[x_t^\top \ \tilde{x}_t^\top \ \hat{d}_t^\top]^\top$ as the measurement. Full-information $H_\infty$ synthesis is solved by a static (memoryless) controller. Details on the construction of this controller for the discrete-time case can be found in Chapter 9 of [26] or Appendix B of [30]. This controller can be implemented, in real time, by using the measurement of $d$ to compute $\hat{d} = F d$ and the state $\tilde{x}_t$. Thus the full-information $(\gamma_d, \gamma_J)$-regret problem is solved by a controller that combines the dynamics $F$ with the static update in (16). This is a dynamic controller, due to the dependence on $F$, with state dimension $n_x$.
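The real-time implementation described above (filter the measured disturbance through a realization of $F$, then apply a static gain) can be sketched as a per-step update. The realization of $F$, the gains, and the partition of the static update into three terms are all hypothetical placeholders, since the explicit formulas in (16) are not reproduced here:

```python
# Sketch of the real-time full-information controller: one state-space filter
# step for F producing (x_tilde, d_hat), followed by a static feedback.
# All matrices below are hypothetical placeholders, not synthesized values.
import numpy as np

# Hypothetical state-space realization of the spectral factor F
Af = np.array([[0.5]]); Bf = np.array([[1.0]])
Cf = np.array([[0.3]]); Df = np.array([[1.0]])

# Hypothetical static gains: K1 on the plant state, K2 on the filter state,
# K3 on the filtered disturbance
K1 = np.array([[0.4]]); K2 = np.array([[0.1]]); K3 = np.array([[0.2]])

def controller_step(x_plant, x_tilde, d):
    """One step of the dynamic controller: update F's state, output u."""
    d_hat = Cf @ x_tilde + Df @ d          # filtered disturbance d_hat = F d
    u = -(K1 @ x_plant + K2 @ x_tilde + K3 @ d_hat)
    x_tilde_next = Af @ x_tilde + Bf @ d   # state of F (dimension n_x)
    return u, x_tilde_next

x_tilde = np.zeros((1, 1))
u, x_tilde = controller_step(np.array([[1.0]]), x_tilde, np.array([[0.5]]))
print(u.shape)  # (1, 1)
```

The only dynamics are those of $F$, matching the claim that the full-information controller has state dimension $n_x$.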

A. Problem Formulation
Consider the feedback interconnection shown in Figure 3. This interconnection, denoted $CL(P, K, \Delta)$, includes an uncertainty $\Delta$ and controller $K$ wrapped around the upper and lower channels of the plant $P$, respectively. This is a standard feedback diagram in the robust control literature for the case of uncertain systems, e.g. see Chapter 11 of [1] or Chapter 8 of [2]. The plant $P$ is a discrete-time, LTI system with additional input/output channels $(w, v)$ to incorporate the effect of the uncertainty:
Let $\mathbf{\Delta}$ denote the set of $n_w \times n_v$, discrete-time LTI systems that are causal, stable, and have $\|\Delta\|_\infty \le 1$. We only assume that each $\Delta \in \mathbf{\Delta}$ has a finite-dimensional state, but the state dimension is arbitrary. This is referred to as unstructured uncertainty in the robust control literature [1]. The results in this paper can be extended to LTI uncertainty with infinite state dimension, but this requires additional technical machinery.

‡ These previous works used the per-step cost $x_t^\top Q x_t + u_t^\top R u_t$. This corresponds to the case $S = 0$ in the cost function (4). Theorem 2 can be applied for more general per-step costs with $S \ne 0$.
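A member of a set like $\mathbf{\Delta}$ can be built by normalizing an arbitrary causal, stable system by its peak gain; a sketch with an illustrative FIR impulse response (the coefficients are arbitrary, and the grid peak only approximates the true $H_\infty$ norm):

```python
# Sketch: construct a causal, stable LTI system normalized so its peak
# frequency-response gain on a grid is 1, approximating ||Delta||_inf <= 1.
import numpy as np

h = np.array([0.8, -0.3, 0.2])    # arbitrary FIR impulse response (causal, stable)

theta = np.linspace(0, 2 * np.pi, 1024, endpoint=False)
H = sum(hk * np.exp(-1j * k * theta) for k, hk in enumerate(h))
peak = np.max(np.abs(H))          # grid estimate of the H-infinity norm

h_norm = h / peak                 # normalized impulse response
H_norm = H / peak
print(np.max(np.abs(H_norm)))     # 1.0 on the grid
```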
Let $F_U(P, \Delta)$ denote the system obtained by closing $\Delta \in \mathbf{\Delta}$ around the top channels of $P$.§

§ The system $F_L(F_U(P, \Delta), K)$ corresponds to closing a controller $K$ around the bottom channels of $F_U(P, \Delta)$. Thus $F_L(F_U(P, \Delta), K)$ is equivalent to the shorter notation $CL(P, K, \Delta)$.

The "nominal" (or best-guess) model is obtained with $\Delta = 0$. The results in Section II can be
used to construct a controller that achieves $(\gamma_d, \gamma_J)$-regret on the nominal model $F_U(P, 0)$. In this section we focus on designing a single controller that achieves a certain level of regret for all possible models in the set $\mathcal{M} := \{F_U(P, \Delta) : \Delta \in \mathbf{\Delta}\}$.
We first need to concretely define several notions used in this robust design. The closed-loop system $CL(P, K, \Delta)$ in Figure 3 is said to be well-posed if the dynamics have a unique solution for any disturbance $d \in \ell_2$ starting from zero initial conditions for $P$ and $\Delta$, i.e. $x_{-\infty} = 0$ and $x^\Delta_{-\infty} = 0$. In this case, the cost achieved by a controller $K$ with a disturbance $d \in \ell_2$ and uncertainty $\Delta \in \mathbf{\Delta}$ is $J(K, d, \Delta) := \|e\|_2^2$. Our baseline for performance will be the optimal non-causal controller $K_o$ designed for the nominal model $F_U(P, 0)$. This baseline controller can be constructed from the results in Section II-C. The cost achieved by $K_o$ on the nominal model with disturbance $d$ is $J(K_o, d, 0)$. We define the regret relative to this baseline cost.
Definition 2. Let $\gamma_d \ge 0$ and $\gamma_J \ge 0$ be given. A controller $K$ achieves robust $(\gamma_d, \gamma_J)$-regret relative to the optimal non-causal controller $K_o$ (designed on the nominal model) if $CL(P, K, \Delta)$ is well-posed and stable for all $\Delta \in \mathbf{\Delta}$, and:
$$J(K, d, \Delta) < \gamma_d^2 \|d\|_2^2 + \gamma_J^2 J(K_o, d, 0) \quad \text{for all nonzero } d \in \ell_2 \text{ and all } \Delta \in \mathbf{\Delta}. \tag{20}$$

Section III-B provides a method to solve the following robust $(\gamma_d, \gamma_J)$-regret feasibility problem: Given $(\gamma_d, \gamma_J)$, find a causal, output-feedback controller $K$ that achieves robust $(\gamma_d, \gamma_J)$-regret relative to $K_o$ or verify that this level of robust regret cannot be achieved. Robust $(\gamma_d, 0)$-regret is a special case of the µ-synthesis feasibility problem [17], [18], [19], [20], [21].
The synthesis method in Section III-B can be adapted to these more general block-structured uncertainties. We will comment later on the changes required for more general block structures.

B. Output Feedback Control Design
We first show that the robust $(\gamma_d, \gamma_J)$-regret feasibility condition can be reduced to a robust $H_\infty$ condition. The bound on the right side of the robust regret condition (20) does not depend on the uncertainty. Following the same arguments as in Section II-D, this bound can be written as $\|\hat{e}\|_2^2$ where $\hat{e}$ is the output of the system (14). Moreover, $\|\hat{e}\|_2^2$ is equal to $\|F d\|_2^2$ by Lemma 2, where $F$ is a spectral factor that depends on the choice of $(\gamma_d, \gamma_J)$. Thus the robust $(\gamma_d, \gamma_J)$-regret in Definition 2 can be written as:
$$J(K, d, \Delta) < \|F d\|_2^2 \quad \text{for all nonzero } d \in \ell_2 \text{ and all } \Delta \in \mathbf{\Delta}.$$
The left side $J(K, d, \Delta)$ is equal to the closed-loop cost $\|e\|_2^2 = \|CL(P, K, \Delta) d\|_2^2$ achieved with controller $K$ and uncertainty $\Delta \in \mathbf{\Delta}$ (Figure 3). The spectral factor $F$ is stable with a stable inverse. Define $\hat{d} := F d$ so that $d = F^{-1} \hat{d}$. The set of all $d \in \ell_2$ maps 1-to-1 onto the set of all $\hat{d} \in \ell_2$. Thus we can rewrite the robust regret bound as
$$\|CL(P, K, \Delta) F^{-1} \hat{d}\|_2^2 < \|\hat{d}\|_2^2 \quad \text{for all nonzero } \hat{d} \in \ell_2 \text{ and all } \Delta \in \mathbf{\Delta}. \tag{21}$$
The system $CL(P, K, \Delta) F^{-1}$ corresponds to the closed-loop $CL(P, K, \Delta)$ with the disturbance channel weighted by $F^{-1}$. Thus (21) corresponds to a robust performance condition on $CL(P, K, \Delta) F^{-1}$ that must hold for all uncertainties $\Delta \in \mathbf{\Delta}$. This yields the following main result connecting the robust regret to the robust $H_\infty$ condition.
Theorem 3. A controller $K$ achieves robust $(\gamma_d, \gamma_J)$-regret relative to the optimal non-causal controller $K_o$ if and only if $CL(P, K, \Delta) F^{-1}$ is well-posed, stable, and satisfies $\|CL(P, K, \Delta) F^{-1}\|_\infty < 1$ for all $\Delta \in \mathbf{\Delta}$.

The robust performance condition in Theorem 3 depends on the uncertainty. We next show that this is equivalent to a condition that does not depend on the uncertainty. Consider the feedback interconnection with the system $M$ constructed from $(P, K, F^{-1})$; this dependence is suppressed to simplify the notation.
Note that $CL(P, K, \Delta) F^{-1} = F_U(M, \Delta)$. The next theorem uses $M$ to state a necessary and sufficient condition for $K$ to achieve robust $(\gamma_d, \gamma_J)$-regret. This is a special case of results for the structured singular value, also known as µ (see [33] or Chapter 11 of [1]). Technical details are given in Appendix C.
Theorem 4. A controller $K$ achieves robust $(\gamma_d, \gamma_J)$-regret relative to the optimal non-causal controller $K_o$ if and only if (i) $M$ is stable, and (ii) there exists $D : [0, 2\pi] \to (0, \infty)$ such that
$$\bar{\sigma}\left( \begin{bmatrix} D(\theta) I & 0 \\ 0 & I \end{bmatrix} M(e^{j\theta}) \begin{bmatrix} D(\theta)^{-1} I & 0 \\ 0 & I \end{bmatrix} \right) < 1 \quad \text{for all } \theta \in [0, 2\pi]. \tag{23}$$

Proof. By Theorem 3, a controller $K$ achieves robust $(\gamma_d, \gamma_J)$-regret relative to $K_o$ if and only if $F_U(M, \Delta)$ is well-posed, stable, and satisfies $\|F_U(M, \Delta)\|_\infty < 1$ for all $\Delta \in \mathbf{\Delta}$. It is shown in Appendix C that $F_U(M, \Delta)$ satisfies these conditions for all $\Delta \in \mathbf{\Delta}$ if and only if (i) $M$ is stable, and (ii) there exists $D$ satisfying (23).
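With the controller frozen, searching for the scaling in condition (ii) at a single frequency reduces to a scalar minimization of a scaled maximum singular value. A sketch using an arbitrary frozen $2 \times 2$ frequency response (one uncertainty channel, one performance channel), purely illustrative:

```python
# Sketch: per-frequency search for a scalar scaling d > 0 minimizing the
# scaled maximum singular value of a frozen response M. M is illustrative.
import numpy as np

M = np.array([[0.5, 2.0],
              [0.1, 0.4]])

def scaled_sv(d):
    """Max singular value of diag(d, 1) * M * diag(1/d, 1)."""
    Dl = np.diag([d, 1.0])
    Dr = np.diag([1.0 / d, 1.0])
    return np.linalg.svd(Dl @ M @ Dr, compute_uv=False)[0]

ds = np.logspace(-2, 2, 400)              # coarse scalar search over d
best = min(ds, key=scaled_sv)

print(scaled_sv(best) < scaled_sv(1.0))   # True: scaling reduces the peak
```

For this $M$, the unscaled peak is above 2 while a good scaling balances the off-diagonal entries and brings the scaled peak below 1, illustrating how the $D$ scales tighten the robust performance test.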
The robust (γ d , γ J )-regret feasibility problem requires the construction of a controller K and scaling D that satisfy (23) (where the dependence of M on K has been suppressed).
Unfortunately (23) is, in general, a non-convex constraint on $(D, K)$. A common, pragmatic approach is to perform a coordinate-wise search known as DK-iteration or µ-synthesis [17], [18], [19], [20], [21]. The $D$ scaling is restricted to be a SISO, LTI, stable system with $D^{-1}$ stable. In this case, Equation (23) can be written explicitly in terms of $(P, K, F^{-1}, D)$ as follows:
The DK-iteration corresponds to alternately minimizing the $H_\infty$ norm in (24) over $K$ with $D$ held fixed and vice versa. The $K$-step can be solved as a standard $H_\infty$ synthesis problem. The $D$-step requires additional details [17], [18], [19], [20], [21].

This section presents a simple, SISO design example. This example is useful to gain some additional insights about the performance of the various nominal and robust regret controllers discussed in this paper. Consider the classical feedback diagram in Figure 5 with the (continuous-time) plant $G(s) = \frac{15}{s + 5.6}$. The actuator dynamics $A(s)$ are assumed to be uncertain:
where $W_{unc}(s) = \frac{3s + 4.62}{s + 23.1}$ and $\Delta(s)$ is a stable, LTI system satisfying $\|\Delta\|_\infty \le 1$. The weight $W_{unc}(s)$ has a DC gain of 0.2, $|W_{unc}(j\infty)| = 3$, and a zero near $-1.5$ rad/sec. This represents a relatively small (20%) uncertainty at low frequencies with increasing uncertainty above 1.5 rad/sec. The continuous-time plant and actuator models are discretized using a zero-order hold with a sample time $T_s = 0.001$ sec.
W_e(s) = (0.5s + 6.93)/(s + 0.693). The weight W_d(s) represents reference commands with dominant low frequency content below 8 rad/sec. The weight W_e has a DC gain of 10, crosses a gain of one at 8 rad/sec, and transitions to a gain of 0.5 at high frequencies. This weight emphasizes the tracking error at low frequencies.
Conversely, the weight W_u(s) has a DC gain of 0.1, crosses a gain of one at 8 rad/sec, and transitions to a gain of 1000 at high frequencies. This weight penalizes high frequency control effort. These weights are discretized using a zero-order hold with a sample time T_s = 0.001 sec. The design problem can be expressed in the form of the robust synthesis interconnection CL(P, K, ∆) shown in Figure 3.
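As a concrete numerical check of the weights and discretization described above, the sketch below uses scipy; the transfer-function coefficients follow the reconstructed form W_unc(s) = (3s + 4.62)/(s + 23.1), which is an assumption inferred from the stated DC gain, high-frequency gain, and zero location:

```python
import numpy as np
from scipy import signal

# Reconstructed uncertainty weight W_unc(s) = (3s + 4.62)/(s + 23.1) (assumed form)
num, den = [3.0, 4.62], [1.0, 23.1]
dc_gain = num[1] / den[1]   # |W_unc(0)|   -> ~0.2 (20% uncertainty at low frequency)
hf_gain = num[0] / den[0]   # |W_unc(j*inf)| -> 3
zero = -num[1] / num[0]     # zero near -1.54 rad/sec

# Zero-order-hold discretization of the plant G(s) = 15/(s + 5.6), Ts = 0.001 sec
numd, dend, dt = signal.cont2discrete(([15.0], [1.0, 5.6]), dt=0.001, method='zoh')

# ZOH preserves the DC gain exactly: Gd(1) = G(0) = 15/5.6
dc_discrete = float(np.sum(numd)) / float(np.sum(dend))
```

The check confirms the stated 20%/300% uncertainty levels and that the discretization leaves the steady-state gain unchanged.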
We first consider the properties of the optimal non-causal controller K_o constructed in Section II-C. By Parseval's theorem, the cost can be equivalently computed in the frequency domain: The simplification in the right-most expression relies on the fact that d is scalar so that T_o(jω) is a 2 × 1 column vector. Figure 6 shows the response ∥T_o(jω)∥_2 vs. ω. The gain is small at both very low and very high frequencies. This indicates that the optimal non-causal controller achieves a small cost for disturbances (reference commands) with energy predominantly at low and high frequencies. However, the closed-loop response has a large gain in the range of 1 to 10 rad/sec. Thus the optimal non-causal controller will have (relatively) poor performance for disturbances with energy predominantly in these mid-frequencies. Step 2 was performed using the method described in Section II. The bisection was stopped when the interval of lower/upper bounds on γ_J, denoted [γ_J, γ̄_J], satisfied γ̄_J − γ_J ≤ 10^{-4} + 10^{-4} γ̄_J. The twenty points were computed in ≈ 6.4 sec on a standard laptop, i.e., each point on the nominal regret curve took ≈ 0.32 sec to compute. The left side of Figure 8 shows the frequency responses for the nominal regret-optimal controllers along the Pareto curve. The H∞ and competitive ratio controllers are highlighted in red dashed and blue solid, respectively. These two controllers lie at the endpoints of the Pareto curve. The other Pareto controllers (black dotted) transition between these two extremes for this example. It is notable that the competitive ratio controller has a larger low frequency gain and smaller high frequency gain than the other controllers.
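The frequency-domain computation of the cost rests on the Plancherel (Parseval) theorem. A minimal discrete-time sanity check, with an arbitrary example signal standing in for the error e, is:

```python
import numpy as np

# Plancherel/Parseval check for a finite-energy discrete-time signal:
# the time-domain energy equals the average of |DFT|^2 over the bins.
rng = np.random.default_rng(0)
e = rng.standard_normal(256)                # arbitrary example "error" signal
energy_time = float(np.sum(e**2))           # ||e||_2^2 in the time domain
E = np.fft.fft(e)
energy_freq = float(np.mean(np.abs(E)**2))  # (1/N) * sum_k |E_k|^2
```

The two energies agree to machine precision, which is the discrete analogue of the frequency-domain cost expression used above.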
To further interpret these results, let T∞ and T_C denote the closed-loop systems from d to e obtained with the H∞ and competitive ratio controllers, respectively. These are special cases of nominal regret-optimal designs. The closed loop with the competitive ratio controller has characteristics similar to those of T_o (although degraded by a factor of γ_C = 3.29 relative to K_o). This is the reason that K_C has a larger low frequency gain and rolls off more rapidly at high frequencies than K∞, as shown on the left side of Figure 8.
Finally, we consider robust regret-optimal controllers. Figure 7 also shows the Pareto optimal curve for feasible values (γ_d, γ_J) of robust regret (red dotted curve). The curve was generated via bisection on γ_J using the same grid of 20 values for γ_d. The bisection was performed using the method described in Section III with the same stopping conditions. The twenty points were computed in ≈ 998 sec on a standard laptop, i.e., each point on the robust regret curve took ≈ 50 sec to compute. The robust regret is more computationally intensive as DK-synthesis must be used at each bisection step to synthesize a controller that satisfies the robust performance condition in Theorem 4.
We first consider the competitive ratio controllers to study the effect of the robust design. The left side of Figure 9 shows the nominal and robust competitive ratio controllers. The nominal competitive ratio controller (blue solid) has large gain at low frequencies and a steep transition to small gain at high frequencies. The corresponding nominal loop G(s)A_0(s)K_C(s) has a phase margin of 36° at 3.7 rad/sec. The robust competitive ratio controller (red dashed) has a more shallow slope near 3.7 rad/sec. This corresponds to a less negative phase (not shown) by the Bode gain-phase theorem. In fact, the nominal loop with the robust competitive ratio controller has a larger phase margin of 50° at a crossover frequency of 3.5 rad/sec. The right side of Figure 9 provides additional insight. The top subplot shows the closed-loop response with the nominal competitive ratio controller. The blue curve is generated using the nominal plant and is the same as shown on the right side of Figure 8. The red dashed curves are generated on samples of the uncertain plant. Specifically, twenty stable, fifth-order, LTI systems {∆_i}_{i=1}^{20} are randomly sampled and scaled to satisfy ∥∆_i∥∞ ≤ 1. These are substituted into the uncertain actuator model (25) and used to generate samples of the uncertain closed loop.
The performance of the nominal competitive ratio controller degrades significantly in the middle frequency range due to this uncertainty. The lower subplot is similar but is generated using the robust competitive ratio controller. This controller is much less sensitive to the model uncertainty, as expected.
Other controllers along the nominal and robust Pareto fronts can be compared. For example, the points along the horizontal axis of Figure 7 correspond to the nominal and robust H∞ controllers. The robust H∞ controller is a variant of µ-synthesis that minimizes the worst-case gain from d to e over all norm-bounded uncertainties ∥∆∥∞ ≤ 1. The nominal and robust H∞ controllers are similar for this particular example (although the robust cost is larger). The reason is that the nominal H∞ controller, shown on the left of Figure 8, has relatively smaller gain at low frequencies and relatively larger gain at high frequencies. Hence this controller is already relatively robust. For example, the loop with the nominal H∞ controller has a phase margin of 78° with a crossover frequency of 3.6 rad/sec. The robust H∞ controller is similar with a slightly larger phase margin of 85° and a slightly lower crossover frequency of 3.2 rad/sec. Both the nominal and robust H∞ controllers achieve robustness but at the expense of sacrificing performance relative to the nominal and robust competitive ratio controllers.

B. Boeing 747
A simplified (linearized) model for the longitudinal dynamics of a Boeing 747 at one steady, level flight condition is given by (Problem 17.6 in [34]): where the state matrices are:

We will assume there is a multiplicative uncertainty at the plant input: where u_t ∈ R² is the (nominal) command from the controller and ∆ is a 2 × 2, stable, LTI uncertainty such that ∥∆∥∞ ≤ 1. Multiplicative uncertainty is commonly used to account for non-parametric (dynamic) modeling errors [1]. No assumptions are made regarding the state dimension of ∆. The constant factor of 0.6 implies that the effect of the uncertainty can be up to 60% of the size of the nominal control at each frequency. This factor could be made frequency-dependent to account for increased non-parametric errors at high frequencies. We'll use the constant factor for simplicity.
Define the generalized error as e_t = [x_t; u_t]. The corresponding per-step cost is e_t⊤e_t = x_t⊤x_t + u_t⊤u_t. We'll also assume the controller has access to full information measurements, i.e., y_t = [x_t; d_t]. Based on this information, we can formulate the robust regret optimal control design as in Figure 4. Specifically, define (v, w) as the input/output signals of the uncertainty in (31), i.e., w = ∆v with v = u. Then substitute ũ = u + 0.6w into (30): This has the form of the state update equation for the uncertain plant P in (17) with B_w = 0.6B and B_u = B. The output equations for (v, e, y) in (17) can be derived similarly. Note that the nominal dynamics are given by ∆ = 0 and neglecting the (v, w) channels.
Figure 10 shows these three points along with the Pareto optimal curve for feasible values (γ_d, γ_J) of nominal regret (blue dashed curve). The curve was generated by: (i) selecting 20 values of γ_d in the range [0.001, 0.999] × γ∞ where γ∞ = 28.47, and (ii) bisecting to find the minimal value of γ_J for each value of γ_d in the grid. Step 2 was performed using the method described in Section II. The bisection was stopped when the interval of lower/upper bounds on γ_J, denoted [γ_J, γ̄_J], satisfied γ̄_J − γ_J ≤ 10^{-2} + 10^{-3} γ̄_J. The twenty points were computed in ≈ 4.5 sec on a standard laptop, i.e., each point on the nominal regret curve took ≈ 0.45 sec to compute.
Figure 10 also shows the Pareto optimal curve for feasible values (γ_d, γ_J) of robust regret (red dotted curve). The curve was generated via bisection on γ_J using the same grid of 20 values for γ_d. The bisection was performed using the method described in Section III with the same stopping conditions. The twenty points were computed in ≈ 789 sec on a standard laptop, i.e., each point on the robust regret curve took ≈ 39.4 sec to compute. The robust regret is more computationally intensive as DK-synthesis must be used at each bisection step to synthesize a controller that satisfies the robust performance condition in Theorem 4.
There are several key points from this example. First, our notion of nominal regret (Section II) contains H∞, competitive ratio, and additive regret as special cases. Second, robust regret (Section III) can be used to synthesize robust controllers and assess the impact of model uncertainty.
For example, the gap between the red dotted and blue dashed curves in Figure 10 quantifies the performance that is sacrificed to guarantee robustness to the model uncertainty.

C. Quarter Car Example
In this section we consider the design of an active suspension controller for a quarter car model. The design is based on the Matlab demo [35] and a summary of the objectives can be found in the corresponding tutorial video [36]. A model for the quarter car is given by: where r(t) ∈ R is the road disturbance (m), f_s(t) ∈ R is the active suspension force (kN), and x(t) ∈ R⁴ is the state. The state-space data is: A hydraulic actuator generates the force based on a control input u. The nominal hydraulic actuator model is first-order with a time constant of 1/60 sec, i.e., A_0(s) = 60/(s + 60). The maximum actuator displacement is 0.05 m. This places a constraint on the suspension displacement to be less than 0.05 m. The actuator dynamics are assumed to be uncertain. This is modeled with an input multiplicative uncertainty: where W_unc(s) = (3s + 18.5)/(s + 46.3) and ∆(s) is a stable, LTI system satisfying ∥∆∥∞ ≤ 1. The weight W_unc(s) has a DC gain of 0.4, |W_unc(j∞)| = 3, and a zero near −6.2 rad/sec. This represents a relatively small uncertainty (40%) at low frequencies with increasing uncertainty above 6.2 rad/sec. The continuous-time quarter car and hydraulic actuator models are discretized using a zero-order hold and sample time T_s = 0.002 sec.|| These weights are also discretized using a zero-order hold with T_s = 0.002 sec. These weights correspond to a "balanced" objective between comfort and handling. Additional details on the design objectives can be found in [35], [36].
The active suspension design (Figure 11) falls within the general nominal design interconnection framework (Figure 3). We first designed a nominal additive regret controller. This design was performed on the interconnection with no uncertainty so that the actuator is at the nominal dynamics, A(s) = A_0(s). The optimal additive regret controller achieved (within a bisection tolerance) a regret of (γ_d, γ_J) = (0.43, 1).** We also designed an optimal robust, additive regret controller.

|| A_0(s) and W_unc(s) are discretized with sample time T_s. The normalized uncertainty is replaced by a discrete-time, stable LTI system ∆(z) with sample time T_s satisfying ∥∆∥∞ ≤ 1.
** The optimal nominal H∞ controller achieved γ∞ = 0.66.

V. CONCLUSIONS

This paper presents a synthesis method for robust regret optimal control for uncertain, discrete-time, LTI systems. The baseline for performance is the optimal non-causal controller designed on the nominal plant model and with knowledge of the disturbance. Robust regret is defined as the performance of a given controller relative to this optimal non-causal control. Our definition of regret includes previous definitions of (additive) regret, (multiplicative) competitive ratio, and H∞ performance. We show that a controller achieves robust regret if and only if it satisfies a robust H∞ performance condition. DK-iteration can be used to synthesize a controller that satisfies this condition and hence achieve a given level of robust regret. The approach is demonstrated via three examples. Future work will consider the design of robust regret controllers where the baseline depends on the specific value of the model uncertainty, e.g., possibly using results from linear parameter varying design. Future work will also include comparisons of this approach with optimal regret controllers designed via online convex optimization and disturbance-action controllers.
The input/output dimensions for P∼ are n_d × n_e. If ē_t ∈ ℓ2 then the state of P∼ satisfies the boundary conditions x̄_{-∞} = 0 and x̄_∞ = 0. The para-Hermitian conjugate P∼ also satisfies the adjoint property stated in the next lemma.
Proof. First, substitute for the output equation of P: Next, substitute for Ĉ⊤ē_t using the dynamics of P∼: Finally, substitute for Âx_t using the dynamics of P: The first two terms are a telescoping sum. These terms sum to zero by the boundary conditions x̄_{-∞} = x̄_∞ = 0. Substitute for d_t using the output equation for P∼: The second ingredient used to rewrite the cost Ĵ is the spectral factorization, as defined below.
Definition 3. A square n_d × n_d system F is a spectral factor of P∼P if: (i) F∼F = P∼P, (ii) F is stable, causal, and invertible, and (iii) F^{-1} is stable and causal. Moreover, F∼F is called a spectral factorization of P∼P.
We can now use the two ingredients to rewrite the cost Ĵ := ⟨ê, ê⟩. First, use the adjoint property in Lemma 3 to write the cost as Ĵ = ⟨d, P∼P d⟩. Next, if a spectral factor F exists then P∼P = F∼F. Thus another application of the adjoint property in Lemma 3 yields Ĵ = ⟨F d, F d⟩. In other words, we can compute Ĵ using the spectral factor F. The dynamics of F are stable/causal and with a stable/causal inverse. The next lemma provides conditions for the existence of a spectral factorization of P∼P. An explicit, state-space construction is also given for the spectral factor (when it exists). This is essentially a restatement of Theorems 4.1 and 4.2 in [38].

1) There is a unique stabilizing solution X ⪰ 0 to the following DARE: where Ĥ := R + B̂⊤XB̂ ≻ 0. The gain K_x := Ĥ^{-1}(Â⊤XB̂ + Ŝ)⊤ is stabilizing, i.e., Â − B̂K_x is a Schur matrix. Moreover, (Â − B̂K_x) is nonsingular.

Â⊤ − K_x⊤K_y is a Schur matrix.

3) Define C_F := Ŵ^{1/2}K_x, and D_F := Ŵ^{1/2}. A spectral factorization of P∼P is F∼F where the spectral factor F is:

Proof. Statement 2 follows from assumptions (i)-(vi) and Lemma 1.
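Statement 1 can be exercised numerically with scipy's DARE solver; the system data below is made up purely for illustration of the X ⪰ 0, Ĥ ≻ 0, and Schur-stability properties:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical system data (illustrative only, not from the paper)
A = np.array([[0.9, 0.2],
              [0.0, 0.7]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
S = np.zeros((2, 1))

# Stabilizing solution of X = A'XA - (A'XB + S)(R + B'XB)^{-1}(A'XB + S)' + Q
X = solve_discrete_are(A, B, Q, R, s=S)

H = R + B.T @ X @ B                            # H := R + B'XB, must be > 0
Kx = np.linalg.solve(H, (A.T @ X @ B + S).T)   # Kx := H^{-1} (A'XB + S)'

# A - B*Kx should be Schur (all eigenvalues strictly inside the unit circle)
eigs = np.linalg.eigvals(A - B @ Kx)
```

This mirrors the ingredients of the lemma: the stabilizing DARE solution, the positive definite Ĥ, and the Schur closed-loop matrix.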
For Statement 3, we first show that P∼P = F∼F. Use the output equation of P to express the cost as: Substitute for Q using the DARE (37) to show: Thus the cost is equal to: The first two terms are a telescoping sum. These terms sum to zero by the boundary conditions x̄_{-∞} = x̄_∞ = 0. Thus the cost is equal to Ĵ = ⟨F̄d, F̄d⟩ where F̄ is the following intermediate system: Note that F̄ is a square n_d × n_d system and invertible. The state matrix of F̄^{-1} is a Schur matrix by Statement 1. Hence F̄^{-1} is stable and causal. However, Â may not necessarily be a Schur matrix, so one additional step is required to obtain the spectral factor F.
By Lemma 3, the cost is equal to Ĵ = ⟨d, F̄∼F̄d⟩. The matrix Â is assumed to be nonsingular.
This can be used to express the para-Hermitian conjugate F̄∼ as: The system F̄∼F̄ corresponds to the serial connection of F̄ and F̄∼, i.e., combine (40) and (41) with ē_t = ẽ_t. This yields a state-space model for F̄∼F̄: Next, apply the following coordinate transformation: We can simplify the results of the coordinate transformation by using the DARE (38) to show: Similarly, we can use the DARE (38) to show Ŷ Â^{-⊤} = A_F Ŷ. Apply the transformation T to the state matrix of F̄∼F̄ using these relations: The (2, 2) block can be simplified using: Combining these two expressions gives A_F(I + Ŷ K_x⊤Ĥ K_x) = Â. Hence the lower right block is A_F^{-⊤}. This can also be used to simplify the (1, 2) block as follows: The second equality makes use of the matrix inversion lemma. Similar simplifications can be made after applying T to the remaining state matrices of F̄∼F̄.

Thus the coordinate transformation gives the following realization for F̄∼F̄: It can be shown that this is also a realization for F∼F, where F is the system in Equation (39).
Hence we have shown that P∼P = F∼F. The proof is concluded by showing that F satisfies the other properties required of a spectral factor: F is a square n_d × n_d system that is stable/causal with a stable/causal inverse. The input and output dimensions of F are equal to n_d so F is square. Moreover, A_F := Â − K_y⊤K_x is a Schur matrix by Statement 2. Hence F is stable and causal. Next, F^{-1} exists because D_F is invertible, and F^{-1} is given by: The poles of F^{-1} are given by the eigenvalues of: Thus A_F − B_F D_F^{-1}C_F is a Schur matrix by Statement 1 and hence F^{-1} is stable/causal. Thus F is a spectral factor of P∼P as claimed.
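The state-space inverse used in this step, F^{-1} = (A_F − B_F D_F^{-1}C_F, B_F D_F^{-1}, −D_F^{-1}C_F, D_F^{-1}), is a standard construction and can be verified numerically on made-up data:

```python
import numpy as np

def tf_eval(A, B, C, D, z):
    """Evaluate the transfer matrix C (zI - A)^{-1} B + D at the point z."""
    n = A.shape[0]
    return C @ np.linalg.solve(z * np.eye(n) - A, B) + D

# Hypothetical realization (A, B, C, D) with D invertible
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[2.0]])

# Inverse realization: (A - B D^{-1} C,  B D^{-1},  -D^{-1} C,  D^{-1})
Di = np.linalg.inv(D)
Ai, Bi, Ci, Dinv = A - B @ Di @ C, B @ Di, -Di @ C, Di

z = np.exp(1j * 0.7)  # sample point on the unit circle (not a pole)
prod = tf_eval(A, B, C, D, z) @ tf_eval(Ai, Bi, Ci, Dinv, z)
```

The product F(z)F^{-1}(z) evaluates to the identity, confirming the inverse realization used for the spectral factor.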

B. Spectral Factorization for Regret Bound
This appendix gives a spectral factorization result tailored for the regret bound in Section II-D.
The closed-loop dynamics with K o , given in (14), have the form: In summary, the matrices satisfy assumptions (i)-(vi) in Lemma 4. This ensures that P ∼ P has a spectral factor F .
The spectral factor in Lemma 4 has the same state dimension as Â, which is 2n_x. The remainder of the proof shows that a minimal realization for the spectral factor can be constructed with dimension n_x. The key point is to verify that the unique solution to the DARE (37) in Statement 1 of Lemma 4 has the following special structure: This form assumes that X is nonsingular, which will be verified below. The variable V is defined to be the unique positive semidefinite solution to the following DARE: where: It can be shown that Q is positive semidefinite using the DARE for X. Thus Equation (45) for V corresponds to the DARE in Lemma 1 with (A, B_u, C_e, D_eu, Q, R, S) replaced by (Â^{-⊤}, XB_d, …). It can be verified that the DARE (45) satisfies conditions (i)-(iv) in Lemma 1 and hence there exists a stabilizing solution V ⪰ 0. It can also be shown, by direct substitution, that X given in (44) is positive semidefinite and satisfies the DARE (37).
Hence this choice for X is the unique stabilizing solution.
The stabilizing gain resulting from the special structure of Â, B, and X is:

As a result, Statement 3 of Lemma 4 gives a spectral factorization with the form: The first block of states is unobservable in the output and hence can be removed. Thus the spectral factorization has a minimal realization with state dimension n_x as claimed.
Finally, we will verify the assumption above that X is nonsingular. Assume, to the contrary, that X is singular, i.e., there is a nonzero vector v such that Xv = 0. Rewrite the DARE (7) as: We have R ≻ 0, and hence each term on the right is zero, so that X(A − B_u K_x)v = 0. In summary, Xv = 0 implies X(A − B_u K_x)v = 0, i.e., the null space of X is (A − B_u K_x)-invariant. It follows that there exists an eigenvector of (A − B_u K_x) in the null space of X (Proposition 4.3 of [39]). In other words, there is a nonzero vector v and scalar λ such that (A − B_u K_x)v = λv, where A − B_u K_x is stable and nonsingular. This gives:

[1] gives the PBH test for stabilizability of a continuous-time LTI system. The discrete-time PBH test is analogous.
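The discrete-time PBH test mentioned above can be stated computationally: (A, B_u) is stabilizable iff rank[λI − A, B_u] = n for every eigenvalue λ of A with |λ| ≥ 1. A minimal sketch with hypothetical data:

```python
import numpy as np

def pbh_stabilizable(A, B, tol=1e-9):
    """Discrete-time PBH test: (A, B) is stabilizable iff
    rank([lam*I - A, B]) = n for every eigenvalue lam of A with |lam| >= 1."""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if abs(lam) >= 1.0 - tol:  # only the non-Schur modes matter
            Mtx = np.hstack([lam * np.eye(n) - A, B])
            if np.linalg.matrix_rank(Mtx, tol=1e-8) < n:
                return False       # an unstable mode is uncontrollable
    return True

# Unstable mode 2.0 is reachable through the first input channel...
ok = pbh_stabilizable(np.diag([2.0, 0.5]), np.array([[1.0], [0.0]]))
# ...but not through the second, so this pair is not stabilizable.
bad = pbh_stabilizable(np.diag([2.0, 0.5]), np.array([[0.0], [1.0]]))
```

Only eigenvalues on or outside the unit circle are tested, matching the stabilizability (rather than controllability) version of the test.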

C. Robust Performance
This section summarizes existing results on robust performance.These are special cases of results for the structured singular value, µ.Details, including more general results, can be found in [33] or Chapter 11 of [1].All systems in this appendix are assumed to be causal.
where the matrix M has been block partitioned conformably with the input/output signals. The interconnection F_U(M, ∆) is said to be well-posed if I_{n_v} − M_{11}∆ is nonsingular.
If the interconnection is well-posed then for each d ∈ C^{n_d} there exist unique (v, e, w) that satisfy (47). Moreover, the output e is given by: The first main result concerns the well-posedness and gain of F_U(M, ∆) when the uncertainty satisfies σ̄(∆) ≤ 1. The proof uses the equivalence of ∥·∥_{2→2} and σ̄(·) for matrices.
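The matrix-case map from d to e is the upper linear fractional transformation M_{22} + M_{21}∆(I − M_{11}∆)^{-1}M_{12}. A small numerical sketch, with randomly generated norm-bounded data, checks this formula against directly solving the loop equations (47):

```python
import numpy as np

def upper_lft(M, Delta, nv, nw):
    """F_U(M, Delta) = M22 + M21 @ Delta @ (I - M11 @ Delta)^{-1} @ M12.
    M is partitioned with v (first nv rows) over e, and w (first nw cols) before d."""
    M11, M12 = M[:nv, :nw], M[:nv, nw:]
    M21, M22 = M[nv:, :nw], M[nv:, nw:]
    return M22 + M21 @ Delta @ np.linalg.solve(np.eye(nv) - M11 @ Delta, M12)

rng = np.random.default_rng(1)
nv, nw, ne, nd = 2, 2, 1, 1
M = rng.standard_normal((nv + ne, nw + nd))
M[:nv, :nw] *= 0.4 / np.linalg.svd(M[:nv, :nw])[1].max()   # sigma_bar(M11) = 0.4
Delta = rng.standard_normal((nw, nv))
Delta /= max(1.0, np.linalg.svd(Delta)[1].max())            # sigma_bar(Delta) <= 1

# Well-posed since sigma_bar(M11 @ Delta) <= 0.4 < 1; solve (47) directly:
d = rng.standard_normal((nd, 1))
v = np.linalg.solve(np.eye(nv) - M[:nv, :nw] @ Delta, M[:nv, nw:] @ d)
w = Delta @ v
e_direct = M[nv:, :nw] @ w + M[nv:, nw:] @ d
e_lft = upper_lft(M, Delta, nv, nw) @ d
```

The two computations of e agree, which is exactly the closed-form output expression referenced above.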
Proof. Equivalence of A) and B) follows from the fact that the structured singular value (µ) is equal to its upper bound for the case of two full, complex blocks. This was originally shown in [40]; a more recent reference is Theorem 9.1 of [33]. Equivalence of B) and C) is a special case of the main-loop theorem (Theorem 4.3 of [33]).
We'll sketch the proof of sufficiency, i.e., A) → B) → C). This direction is used to verify the bound on robust regret and can be shown with only linear algebra facts.
A) → B): Assume A) is true and that ∆, ∆_P satisfy (49). We will show that max(σ̄(∆), σ̄(∆_P)) > 1. This can be rewritten, by the block matrix determinant formula, as: This condition is equivalent to nonsingularity of I_{n_v} − D_{vw}D_∆. Hence if the system is well-posed then (52) has a unique solution (v_t, w_t) for any (x_t, x_t^∆, d_t). In addition, F_U(M, ∆) has a well-defined state-space representation for the dynamics from d to e: where: and (B̂, Ĉ, D̂) can be defined similarly.
If the interconnection is well-posed then for each d ∈ ℓ2 there exist unique signals (v, e, w, x, x^∆) that satisfy the (M, ∆) dynamics from x_{-∞} = 0 and x^∆_{-∞} = 0. The small gain theorem, stated next, provides a necessary and sufficient condition for well-posedness and stability of the system F_U(M, ∆) when the uncertainty satisfies ∥∆∥∞ ≤ 1. The condition is stated in terms of the (1, 1) block of M, denoted M_{11}. This corresponds to the system with input w and output v defined by the state matrices (A, B_w, C_v, D_{vw}).

B) F_U(M, ∆) is well-posed and stable for all n_w × n_v stable, LTI systems ∆ with ∥∆∥∞ ≤ 1.

Proof. The sufficiency of the small gain condition in more general settings is due to Zames [41], [42]. Condition A) is actually necessary and sufficient for LTI systems and uncertainties. Theorem 9.1 in [1] provides a proof for the continuous-time version of this result. We sketch a proof for discrete-time systems for completeness.
This equality holds when: (i) F U (M, ∆) is well-posed and (ii) z is not a pole of M 11 or ∆.
These conditions ensure that the various determinants are well-defined.We can now complete the proof.
In either possibility, Condition B) of Lemma 7 fails to hold.

Lemma 1 .
Let (A, B u , C e , D eu ) be given and define Q := C ⊤ e C e , S := C ⊤ e D eu , and R := D ⊤ eu D eu .Assume:

Figure 4 .
This corresponds to the closed-loop CL(P, K, ∆) with the uncertainty channels "open" and with the disturbance channel weighted by F^{-1}. The system from inputs [w; d] to outputs [v; e] is denoted by:

Figure 5
Figure 5 also includes weights to describe the performance objectives. There is one generalized disturbance d that accounts for the effect of the reference command. The two generalized errors e_1 and e_2 represent the competing objectives between reference tracking and control effort. The corresponding weights are:

To simplify notation, let T_o denote the (non-causal) closed-loop system from d to e obtained with K_o. A given disturbance d ∈ ℓ2 generates an error e with resulting cost J(K_o, d) = ∥e∥²_2. Let d̂ and ê denote the Fourier transforms of d and e, respectively, so that ê(jω) = T_o(jω)d̂(jω). It follows from the Plancherel (Parseval's) theorem (Section 3.3 of [2])

¶ A_0(s) and W_unc(s) are discretized with sample time T_s. The normalized uncertainty is replaced by a discrete-time, stable LTI system ∆(z) with sample time T_s satisfying ∥∆∥∞ ≤ 1.

Figure 7 shows these three points along with the Pareto optimal curve for feasible values (γ_d, γ_J) of nominal regret (blue dashed curve). Equation (6) in Section II-B characterized the Pareto front using a parameterized optimization. Here we use a different numerical implementation and generate the curve by: (i) selecting 20 values of γ_d in the range [0.001, 0.999] × γ∞ where γ∞ = 1.82, and (ii) bisecting to find the minimal value of γ_J for each value of γ_d in the grid.

Fig. 9. Left: Frequency responses of the nominal and robust competitive ratio controllers. Right: Closed-loop responses with the uncertain plant using the nominal (top) and robust (bottom) competitive ratio controllers.

where m_b = 300 kg, m_w = 60 kg, b_s = 1000 N/m/s, k_s = 16000 N/m, and k_t = 1.9 × 10⁵ N/m. The model has lightly damped complex poles at −1.43 ± 6.91j and −8.57 ± 57.6j. The goal is to design the active suspension controller to balance driver comfort (small a_b) and road handling (small s_d).
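The quoted poles can be reproduced from these parameters. The state-space form below is an assumption: it follows the standard quarter-car arrangement used in the Matlab demo [35], since the numeric matrices are not reproduced here. Its eigenvalues are then checked against the quoted pole locations:

```python
import numpy as np

# Parameters from the text (standard quarter-car form assumed)
mb, mw = 300.0, 60.0                   # body and wheel mass (kg)
bs, ks, kt = 1000.0, 16000.0, 1.9e5    # damper, spring, tire stiffness

# States: body position/velocity, wheel position/velocity
A = np.array([
    [0.0,       1.0,    0.0,              0.0],
    [-ks / mb, -bs / mb, ks / mb,         bs / mb],
    [0.0,       0.0,    0.0,              1.0],
    [ks / mw,   bs / mw, -(ks + kt) / mw, -bs / mw],
])
poles = np.linalg.eigvals(A)
```

The trace of A equals the sum of the quoted pole real parts (−20.0 exactly), and the pole moduli land near 7.06 and 58.2 rad/sec, matching the quoted body and wheel-hop modes.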

Figure 11
Figure 11 shows the interconnection for the control design. There are three disturbances associated with the road input and measurement noise. Each of these has a constant weight that (roughly) models the amplitude of the corresponding input disturbance: W_road = 0.07, W_{d2} = 0.01, and W_{d3} = 0.5. There are three generalized errors that model the competing objectives of actuator commands, body acceleration, and suspension travel. The corresponding performance weights are:

Fig. 12 .
Fig. 12. Bode magnitude plots from road input to body acceleration output for the open-loop (OL) and closed-loop (CL) systems with the nominal additive regret (AR) and robust additive regret (Rob) controllers.

Fig. 13.
Figure 13 shows a closed-loop, time-domain response with the nominal additive regret controller. The road disturbance is r(t) = 0.025(1 − cos(8πt)) for 0 ≤ t ≤ 0.2 and r(t) = 0 for t > 0.2. The simulations are performed with the nominal actuator model and 50 samples of the uncertain actuator model. The nominal responses are good for both the body acceleration and the suspension travel.

Fig. 14.
Figure 14 shows the closed-loop, time-domain responses with the robust additive regret controller. The responses with the nominal actuator model are similar, and even slightly better for the body acceleration. The main benefit of the robust additive regret controller is that the simulations with the uncertain actuator model show much less variability. Thus this controller is more robust to the actuator uncertainty, as expected.
by the Schur complement lemma. Multiply (46) on the left and right by v⊤ and v. The left side is v⊤Xv = 0 while all three terms on the right are ≥ 0.

Consider the uncertain interconnection shown in Figure 15 .
The interconnection, denoted F_U(M, ∆), consists of an uncertainty ∆ wrapped around the upper channels of a system M. We will initially consider the case where w ∈ C^{n_w}, d ∈ C^{n_d}, v ∈ C^{n_v}, and e ∈ C^{n_e} are complex vectors. Moreover, M ∈ C^{(n_v+n_e)×(n_w+n_d)} and the uncertainty ∆ ∈ C^{n_w×n_v} are complex matrices. The interconnection in Figure 15 corresponds to the following algebraic equations:

Theorem 5 .
Let M be a given (n v + n e ) × (n w + n d ) LTI system and assume M is stable.The following are equivalent: A) ∥M 11 ∥ ∞ < 1.

A key step is to connect the eigenvalues of Ã to the transfer function M_{11}(z)∆(z). Use the block-matrix determinant formula and the definition of the transfer functions to obtain an expression for det(I_{n_v} − M_{11}(z)∆(z)). Apply the block-matrix determinant formula again to the last line to obtain: det(I_{n_v} − M_{11}(z)∆(z)) = det(zI − Ã).

∥M_{11}(z)∆(z)v∥_2 < ∥v∥_2 for any nonzero vector v. Hence (I − M_{11}(z)∆(z))v ≠ 0 for any v ≠ 0. Therefore det(I_{n_v} − M_{11}(z)∆(z)) ≠ 0 for all z ∈ D^c. One consequence is that I_{n_v} − D_{vw}D_∆ is nonsingular because M_{11}(∞) = D_{vw} and ∆(∞) = D_∆.

The D-step need not be solved exactly; efficient algorithms exist to compute sub-optimal scalings. The algorithm is terminated if the H∞ norm goes below 1. If the algorithm terminates for this reason then we have a controller that achieves the robust regret bound. Finally, note that Theorem 4 gives a necessary and sufficient condition to achieve robust (γ_d, γ_J)-regret. However, the DK-iteration need not find the globally optimal pair (D, K). Hence, successful termination of this algorithm is only sufficient for achieving robust regret.
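The coordinate-wise logic of DK-iteration can be illustrated on a toy problem. In the sketch below, the scalar objective is entirely made up, standing in for the H∞ norm in (24); this illustrates the alternating-minimization pattern, not an actual µ-synthesis:

```python
def dk_style_alternation(f, d0, k0, argmin_d, argmin_k, iters=20):
    """Coordinate-wise search: alternately minimize f over k with d held
    fixed (the 'K-step') and over d with k held fixed (the 'D-step').
    Each exact coordinate minimization cannot increase f, so the recorded
    objective values are non-increasing."""
    d, k = d0, k0
    history = [f(d, k)]
    for _ in range(iters):
        k = argmin_k(d)      # K-step: exact minimizer over k, d fixed
        d = argmin_d(k)      # D-step: exact minimizer over d, k fixed
        history.append(f(d, k))
    return d, k, history

# Made-up non-convex objective (bilinear coupling, like the (D, K) pair):
f = lambda d, k: (d * k - 2.0) ** 2 + 0.1 * (d - 1.0) ** 2
argmin_k = lambda d: 2.0 / d                              # exact K-step
argmin_d = lambda k: (2.0 * k + 0.1) / (k * k + 0.1)      # exact D-step
d, k, hist = dk_style_alternation(f, 1.5, 1.0, argmin_d, argmin_k)
```

As with DK-iteration, the objective decreases monotonically but the scheme is only guaranteed to reach a coordinate-wise (not global) optimum.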
IV. EXAMPLES

This section presents three examples to illustrate the proposed synthesis method for robust, regret optimal control. Code to reproduce the results for all examples is available at: https://github.com/jliu879/RobustRegret Optimal Control

A. Simple SISO Design