Multivariable nonparametric learning: A robust iterative inversion‐based control approach

Learning control enables significant performance improvement for systems by utilizing past data. Typical design methods aim to achieve fast convergence by using prior system knowledge in the form of a parametric model. To ensure that the learning process converges in the presence of model uncertainties, it is essential that robustness is appropriately introduced, which is particularly challenging for multivariable systems. The aim of the present article is to develop an optimization‐based design framework for fast and robust learning control for multivariable systems. This is achieved by connecting robust control and nonparametric frequency response function identification, which results in a design approach that enables the synthesis of learning and robustness parameters on a frequency‐by‐frequency basis. Application to a multivariable benchmark motion system confirms the potential of the developed framework.

In ILC, these requirements are typically addressed by designing a learning and robustness filter as in the following archetypal approach. 15 Requirement R1 satisfied if the input u i converges monotonically to a fixed point u ∞ , which is achieved if there exists a ∈ (0, 1] such that The monotonic convergence condition ensures the absence of poor learning transients and determines the rate of convergence in the signal norm ||.||. 16 A small can be achieved by designing a learning filter that closely approximates the inverse plant dynamics. 17( §5.2) A typical approach is to first estimate a parametric plant model 18 and subsequently invert this model using dedicated techniques in case the model is nonminimum phase. 19,20 Both steps commonly introduce errors which may hamper convergence at specific frequencies. Establishing convergence under model uncertainties can be achieved by incorporating a robustness filter. Introducing this filter changes the fixed point u ∞ , which in turn affects the asymptotic performance such that perfect tracking can no longer be achieved. Satisfying requirement R2 for a large range of frequencies thus requires careful design of the robustness filter. 21 Methods to design robustness filters are intuitive and well developed for single-input single-output (SISO) systems, whereas designing such filters is substantially more involved for multiinput multioutput (MIMO) systems. For SISO systems, the monotonic convergence condition establishes a bound on the magnitude of the inversion error transfer function, which enables an intuitive manual tuning approach. 22 For MIMO systems, the same condition establishes a bound on the maximum singular value of the matrix transfer function, 16 which makes it challenging to manually design a robustness filter in a way that addresses the performance-robustness trade-off in a nonconservative manner. Robust control methods 23 offer a formal framework for MIMO robustness filter design [24][25][26] but require accurate parametric models of the plant and of the uncertainties, which may be costly and challenging to obtain. 27,28 In Reference 29, the modeling requirements are reduced by developing a framework to design diagonal robustness filters based on convergence criteria that can be evaluated using nonparametric frequency response function (FRF) models, yet the performance and convergence speed are not explicitly taking into consideration. Existing results indicate a trade-off between modeling-effort and performance, and the aim of this article is to go beyond this trade-off by enabling performance optimization using nonparametric FRF models.
IIC 4 avoids the use of parametric models and instead relies on nonparametric FRF models. In IIC, input updating is performed by explicitly transforming the signals to the frequency domain, such that the filtering operations become multiplications between the signal spectra and the FRF representations of the learning and robustness filters, which are complex coefficient matrices. The learning coefficients are typically obtained by inverting an FRF estimate of the plant, and the robustness coefficients are tuned according to an uncertainty model of that FRF. 6,30 These FRF models are generally considered to be accurate and require less user intervention to obtain compared to parametric models, see for example 18( §2) for a systematic procedure. However, the preexisting IIC approach for MIMO systems 31 does not include a robustness matrix and leaves the associated design freedom unexplored.
Although important developments have been made to enable high-performance control through fast and robust learning, presently available techniques do not scale well to complex large-scale multivariable systems. This article aims to overcome the limitations that are imposed by the requirement for parametric multivariable plant and uncertainties models in learning control. This is achieved by developing an FRF-based IIC framework for fast and robust learning. The main challenge in this framework is the optimal design of the frequency-dependent robustness and learning matrices Q( k ) and L( k ).
The main contribution of this article is a design framework for these matrices, which is developed in two parts. In the first part, a deterministic robust control approach is taken to optimally design Q( k ) and L( k ), based on a nominal FRF model with additive norm-bounded uncertainties. In the second part, an approach is developed to identify the required FRF model set in the presence of stochastic disturbances. The combination of these approaches results in a learning method that meets requirements R1 and R2 with a probability that is chosen by the user. Specifically, the outline and contributions of this article are as follows. In Section 2, the iterative tracking control problem is introduced and a MIMO FRF-based IIC method is proposed as a solution to this problem. To design the Q( k ) and L( k ) matrices that play a central role in this approach, a deterministic convergence and performance analysis is developed in Section 3. These results enable an optimization-based approach to Q( k ) and L( k ) design, resulting in the following contribution.
(C1) The Q( k )-matrix is optimized through a semidefinite program (SDP) to guarantee robust convergence and to achieve maximum nominal performance, for the case where L( k ) is taken as the inverse of the system FRF (Section 5).
This approach is the main contribution of this article, and the remaining sections aim to motivate and support the particular design choices. The choice to take L( k ) equal to the inverse of the system FRF is motivated by the second contribution.
(C2) It is shown that if L( k ) equals the inverse system FRF, then the robust convergence rate is optimal (Section 6).
Furthermore, the choice to optimize Q( k ) for nominal performance, instead of robust performance, is motivated by the third contribution.
(C3) It is shown that optimizing Q( k ) for robust performance can be prohibitively conservative, since it results either in perfect tracking, that is, Q( k ) = I, or in no learning at all, that is, Q( k ) = 0 (Section 7).
The following contributions support the application the developed design method in practice.
(C4) An identification method is developed to estimate the nominal model and the set of norm-bounded uncertainties (Section 8) and the combination with the L( k ) and Q( k ) design approach is formulated as a stepwise procedure (Section 9). (C5) The potential of the developed framework is confirmed by application to a benchmark motion system (Section 10).
Contributions C1 to C3 take a deterministic robust control approach to learning as in References 26,32. Contribution C1 provides a means to optimally design Q( k ) in a way that is numerically efficient, and that can readily be parallelized. The results developed in C2 and C3 establish conditions under which the design choices articulated in C1 are optimal. Contribution C4 establishes a connection between stochastic FRF identification and deterministic robust control, which closely connects to References 33,34. In contribution C5, it is illustrated that the performance can be significantly improved by using fully parametrized robustness matrices instead of diagonal matrices, thereby considerably extending the results in References 29,31. In relation to the initial results presented in Reference 35, contribution C1 is established in a rigorous mathematical manner, and C3 is new. Furthermore, contributions C2 and C3 are extended by providing detailed proofs, and C4 is enhanced by explicitly treating real uncertainties that arise at specific frequencies, whereas C5 is extended with additional simulation results.

Preliminaries
The sets of real numbers, complex numbers, and integers are denoted by R, C, and Z. For a matrix A ∈ C n×m , [A] ij denotes the ijth element, A H is the conjugate transpose, and (A) and (A) represent the largest eigenvalue and largest singular value. For a ∈ C n , ||a|| 2 denotes the 2-norm, whereas for A ∈ C n×m , ||A|| 2, 2 denotes the induced 2,2-norm. The discrete Fourier transform (DFT) and its inverse are defined as . 36( §2.21) MIMO LTI DT systems G • : u  → y are considered, which are represented by their transfer function matrices G • (z) ∈ C n×n , z ∈ C, and G • (e j k ) is the FRF of G • on the DFT grid Ω N .

PROBLEM FORMULATION
In this section, the iterative tracking problem is introduced and a MIMO FRF-based IIC approach is presented. The first challenge considered this article is to design the associated robustness and learning matrices to ensure that requirements R1 and R2 are satisfied in the presence of uncertainties in the FRF model. The second challenge is to identify such FRF models from data.

F I G U R E 1
The control configuration with exogenous output disturbances

The iterative tracking problem
The control objective is to achieve accurate tracking and disturbance rejection for the following class of systems.
Assumption 1. The system G • is stable, square, and its FRF is invertible on the DFT grid, that is, det (G • (e j k )) ≠ 0, This assumption is standard in system inversion and ensures that the response of G • is bounded, the transient response decays exponentially, and the inverse response is well defined. When the system is nonsquare, squaring down 37 or optimization-based methods 38 can be used to obtain a square system. Furthermore, it is assumed that the system performs identical tasks, while the output is corrupted by trial-varying disturbances and trial-invariant disturbances.
Assumption 2. The output as measured during the ith trial is disturbed by v i (t) and d(t), such that , as shown in Figure 1. Here, d(t) represents the disturbances that are identical for each trial and v i (t) represents the trial-varying disturbances. Furthermore, r(t) is a trial-invariant bounded reference signal.
Since ||X( k )|| 2 = ||x(t)|| 2 by Parseval's theorem, 39 the iterative tracking problem can be formulated in the frequency domain, aided by the following definition of monotonic convergence for DFT vectors.
, X ∈ C n , is said to convergence monotonically in the 2-norm to a fixed point is referred to as the convergence rate.
converges monotonically in the 2-norm to a fixed point U ∞ ( k ) ∈ C n at a given rate ( k ) ∈ (0, 1], and Note that Problem 1.(a) and 1.(b) directly reflect requirements R1 and R2.

Robust finite-time IIC
IIC algorithms, including References 4,6,31, provide an approach to Problem 1 by using the following update law where L( k ), Q( k ) ∈ C n×n are the learning matrix and robustness matrix, respectively. Note that in (1), U i+1 ( k ) is determined by multiplication of the DFT coefficient vectors U i ( k ) and E i ( k ) by the matrices L( k ) and Q( k ) for each frequency k . This enables a frequency-by-frequency approach to design L( k ) and Q( k ). By assuming that the transient response is negligible, a decoupled approach can be formulated that significantly reduces the design complexity and enables parallelization. To see this, consider the DFT of y i (t), Here, T U i ( k ) represents the transient response, which vanishes exponentially for stable G • . 40 In steady state, that is, for T U i ( k ) = 0, the relation between Y i ( k ) and U i ( k ) is decoupled for each k , such that Q( k ) and L( k ) can be designed for each k individually. Practically, negligible transients can be achieved by ensuring that the task is long enough or by using a periodic implementation, as is presented in Reference 6. The following algorithm encompasses many IIC approaches including References 4,12.

Algorithm 1. Robust finite-time IIC
Select an initial input u 0 (t).
1. Apply the input u i (t) and record the tracking error e i (t) = r(t) − y i (t) when y i is in steady state. 2. (a) Obtain U i ( k ) and E i ( k ) by applying the DFT.
(b) Update U i ( k ) ∀ k by means of (1). (c) Obtain u i+1 (t) by applying the IDFT. 3. Set i ← i + 1 and repeat from step (1.) until e i has converged.
The main objective of this article is to solve Problem 1 using Algorithm 1 by suitable design of Q( k ) and L( k ). This is achieved using a two-step approach where (i) L( k ) and Q( k ) are optimized based on a deterministic FRF model set containing G • (e j k ) and (ii) the deterministic model set is estimated from probabilistic data.

A deterministic FRF-based design approach for L and Q
The frequency-by-frequency updating approach in IIC enables the design of L( k ) and Q( k ) based on an FRF model of G • . This is in contrast to filter-based ILC approaches which typically require a MIMO parametric plant model instead. To facilitate the FRF-based design approach, it is assumed that the following FRF model with additive uncertainty is given, where the argument k is omitted in the forthcoming sections to facilitate a transparent exposition.

Definition 2.
Let (Δ) ∈ C n×n be an FRF model set where W y , W u ∈ C n×n are weighting matrices.
Here, G n is referred to as the nominal model and is a set of unstructured, norm-bounded perturbations.
This assumption ensures that G n corresponds to an FRF of a system whose inverse response is well defined. In the next sections, L( k ) and Q( k ) are designed for optimal performance, and to ensure convergence for all possible models, that is, ∀Δ ∈ .

2.4
Identifying a deterministic FRF model set from probabilistic data In Section 8, a method is developed to estimate the model set (3). It is assumed that the disturbance v(t) is Gaussian during the identification experiments, and an FRF matrix estimatorĜ is formulated that inherits the Gaussian probabilistic properties. By explicit analysis of the probability, an approach is developed to select the matrices W u , W y , and the uncertainty bound , such that the true FRF G • is captured by the model (Δ) with probability . This approach is illustrated for the SISO case in Figure 2. Let the estimate be given byĜ = G • + V, where V is a complex Gaussian variable. Clearly, for G n =Ĝ and W u = W y = 1, The left plot shows the probability density function (PDF) of V, where V is circularly complex Gaussian distributed 18 §2.4 , and the right plot shows the PDF of |V|, denoted by f (|V|), which is a Rayleigh distribution. If the true system should be captured with probability , then must be taken such that Prob(|V| ≤ ) = ∫ 0 f (|V|)d|V| = , which corresponds to the blue area in Figure 2. This approach enables the construction of norm-bounded model sets , and the outspoken unimodal PDF of |V| motivates an L and Q design approach that optimizes the performance associated to the nominal model G n , while ensuring convergence ∀Δ ∈ . In this section, a robust MIMO IIC approach is presented as a solution to the iterative tracking problem, and the two-step Q and L design approach is outlined. This approach builds on the analysis results regarding robust convergence and asymptotic performance that are developed in the next section.

ROBUST CONVERGENCE AND ASYMPTOTIC PERFORMANCE ANALYSIS
In this section, the convergence properties of the update law (1) are analyzed. The analysis provides fundamental relations that enable the deterministic optimization results developed in the subsequent sections. A deterministic robustness analysis is facilitated by assuming that the true model is captured by the model set.

Assumption 4. Let (Δ) be given by Definition 2. There exists a Δ
The following assumption enables a deterministic analysis of convergence. 6,12 For the case with stochastic disturbances during the learning procedure, see Reference 21. This assumption formalizes that the transient response and the trial-varying disturbances are zero during learning. Under these assumptions, it follows readily that Problem 1.(a) is solved if (1) is robustly monotonically convergent, that is, if (1) yields monotonically converging sequences {U i } ∞ i=0 ∀Δ ∈ . The following lemma presents a sufficient condition for robust monotonic convergence and it is emphasized that Δ is independent of i.

Lemma 1. Consider (1) under Assumptions 1 to 5 and let (Δ) be given by Definition 2. If for a given ∈ (0, 1] it holds that
with then for any R, D ∈ C n , the sequences A proof of this lemma is provided in Appendix A. This lemma shows that the matrix S • (L, Q) determines the asymptotic performance and the matrix (L, Q, Δ • ) determines if the IIC input converges monotonically. This result enables the formulation of a constrained optimization problem to design L and Q.

OPTIMIZATION-BASED L AND Q DESIGN
Problem 1, as formulated in Section 2, can be reformulated as an optimization problem in L and Q. In particular, Problem 1.(a) is solved if L and Q are such that (4) holds for a desired convergence rate ∈ (0, 1]. Problem 1.(b) is solved by minimizing ||S • || 2, 2 , with S • as given by (7). However, ||S • || 2, 2 cannot be minimized directly, since G • is unknown. Instead, the model set (Δ) is used to determine a model set of S • , Note that if the convergence condition is satisfied, then (8) is well-posed, that is, the inverse in (8) exists.

Lemma 2. If (4) holds for any
A proof to this lemma is provided in Appendix B. The uncertain matrices (L, Q, Δ) and (L, Q, Δ) naturally enable an optimization-based L and Q design approach. More specifically, the performance can be optimized with respect to the nominal model, or with respect to the worst-case model, under the constraint of robust convergence. Particularly, consider the following general optimization problems for a given ∈ (0, 1].
• Robust convergence and nominal performance inf L,Q∈C n×n • Robust convergence and robust performance These optimization problems are generally nonconvex since (L, Q, Δ) and (L, Q, Δ) are nonlinear in the matrix-parameters L and Q. This makes determination of the global optimizer practically impossible, and an alternative approach is proposed.

The proposed optimization approach
The following approach is proposed to obtain a numerically efficient design method. Consider (9) and fix L as L = G −1 n , which results in the following problem.

Problem 2.
Let (Δ) be given by Definition 2. For a given ∈ (0, 1], determine This approach is motivated in the remainder of the article in 5-fold as follows. 1. It is shown in Section 5 that Problem 2 can be reformulated as an SDP, whose global optimum can be obtained in a numerically efficient manner. 2. It is shown in Section 6 that L = G −1 n optimizes the robust convergence rate in case of perfect tracking, that is, for Q = I. 3. It is shown in Section 7 that optimizing Q with respect to the worst-case performance can be relatively conservative since (10) results in either Q = I or Q = 0, when L = G −1 n . 4. It is shown in Section 7 that taking < 1 can be used to bound the worst-case performance, which confirmed by demonstration in Section 10. 5. It is similarly demonstrated in Section 10 that the nominal performance closely approximates the true performance, such that proposed approach provides and effective way to solve the iterative tracking problem presented in Section 2.
Furthermore, the framework to identify the required FRF model set is developed in Section 8, and the resulting overall design framework is summarized in Section 9. It is remarked that Sections 6 and 7 provide the theoretical motivations for the proposed design approach and can be skipped by the practically oriented reader.

SYNTHESIZING Q FOR OPTIMAL NOMINAL PERFORMANCE
Motivated by the results summarized in the previous section, Q is optimized for nominal performance and robust convergence for L = G −1 n . More specifically, Problem 2 is reformulated as an SDP, which constitutes contribution C1. The following theorem enables a reformulation of Problem 2 as an SDP. Theorem 1. Let (Δ) be given by Definition 2 and satisfy Assumptions 3 and 6. The optimization problem specified by (11) and (12) in Problem 2 is identical to This theorem enables a numerically efficient way to design Q, since (13) can be solved by using various algorithms that are known to converge globally in polynomial time. 41 Furthermore, (13) is solved for each frequency k independently, such that the optimization of the entire Q( k ) is readily parallelized. The performance associated to the resulting Q * can be analyzed by noting that if Q * ≠ I and Q * ≠ 0, then (G −1 n , Q * , Δ) is a nonconstant matrix function of Δ. In this case, the worst-case performance sup Δ∈ ||(G −1 n , Q * , Δ)|| 2,2 can be determined numerically as is described in Appendix F. A proof to Theorem 1 requires the following result.

Lemma 3. Let be given by (3), and A, B ∈ C n×n , then it holds that
if and only if there exists a real x > 0 such that A proof to this lemma is provided in Appendix C. A proof to Theorem 1 is the following.
Proof. Consider the following implications where the second implication follows by the Schur complement. 42 Applying the Schur complement 42(A.5.5) to the lower right matrix entry results in (13c), which concludes the proof. ▪ In this section, a numerically efficient approach is developed to optimize Q for nominal performance when L = G −1 n . In the next section, it analyzed under what conditions perfect tracking can be achieved, and conditions are established under which the design choice L = G −1 n is optimal.
as is the case for (1.a), then the set that determines convergence, ( G −1 n , 1, Δ), will always contain the unit circle, that is, Consequently, convergence cannot be achieved and the worst-case minimum is obtained for * = 0, which implies that L * = 0 is the optimal solution (1.c). If (Δ) does not contain 0, as is the case for (2.a), then convergence can be achieved, and the worst-case minimum is obtained for * = 1, which implies that L * = G −1 n is the optimal solution (2.c) [Colour figure can be viewed at wileyonlinelibrary.com]

OPTIMIZING L FOR FAST CONVERGENCE AND PERFECT TRACKING
This section provides the main motivation for taking L = G −1 n in the approach proposed in Section 4.1. Namely, it is shown that perfect tracking can only be achieved if the uncertainties are small enough, and in this case L = G −1 n achieves the best possible robust convergence rate. This result constitutes contribution C2.
Perfect tracking is obtained if Q = I is feasible, that is, if it satisfies the convergence constraint as given by (4), since for Q = I, (L, I, Δ) = S • = 0. If Q is taken as Q = I in the optimization problems posed by (9) and (10), then only the convergence constraint (4) needs to be satisfied. Hence, (9) and (10) reduce to a feasibility problem in L. If a set of feasible L exists, then a unique solution can be determined by minimizing in (4). This problem is formalized as follows.

Problem 3. Let (Δ) be given by Definition 2. Determine
The solution is first illustrated for the SISO case and is followed by a formal treatment for MIMO systems in Section 6.2.

Optimizing L: Illustration of the SISO case
Consider (18) for the SISO case and parametrize L identically as L = G −1 n , where can take any value, that is, Note that ( G −1 n , 1, Δ) represents a circular set in C with center (1 − ) and radius | G −1 n W y W u |, as is illustrated in Figure 3. The worst-case magnitude sup Δ∈C,|Δ|≤ |( G −1 n , 1, Δ)| is obtained for Δ = e j , where equals the phase of (1− ) Clearly, the worst-case magnitude equals the distance to the center point plus the radius of the circle, as is shown in Figure 3. Consequently, * = arg inf The solution to (20) follows from the following lemma. A proof to this lemma is provided in Appendix D. Lemma 4 implies that the solution to (19) depends on the radius |G −1 n W y W u |, which can be recognized as the relative size of the uncertainty. More specifically, if |G −1 n W y W u | ≥ 1, then it cannot be fitted inside the unit circle. In this case, * = 0 is optimal, since it shrinks the circular set to 0 at the point 1, resulting in |(1, 0, Δ)| = 1, ∀Δ. Hence, if the relative uncertainty is too large, then setting the learning parameters to 0 is optimal in the worst-case sense. An alternative interpretation follows by noting that |G −1 n W y W u | ≥ 1 implies that 0 is an element of the model set (Δ), for which it is immediate that learning cannot be achieved. Conversely, if |G −1 n W y W u | < 1 then smallest distance to the origin is obtained by centering the circular set at the origin, such that * = 1, with |(1, 0, Δ)| = |G −1 n W y W u | < 1. To conclude, L * = G −1 n if |G −1 n W y W u | < 1 and L * = 0 if |G −1 n W y W u | ≥ 1. Next, these results are generalized to the MIMO case.

6.2
Optimizing L: The MIMO case The following theorem generalizes the solution to Problem 3 to the MIMO case under the following assumption.
This assumption is readily satisfied by using the identification method as is developed in Section 8. The essential aspect of this assumption is to render ΔW u unstructured, such that any input vector U i can be arbitrarily rotated without affecting its norm.

Theorem 2.
Let (Δ) be given by Definition 2 and satisfy Assumptions 3 and 6. The solution to (18) in Problem 3 satisfies Similar to the SISO case, Theorem 2 reveals that the solution depends on the relative uncertainty size, ||G −1 n W y || 2,2 . More specifically, (21a) states that if the uncertainty is too large, then there does not exist an L that achieves convergence ∀Δ ∈ and Q = I, such that perfect tracking cannot be achieved. Conversely, if the uncertainty is sufficiently bounded, then perfect tracking is achieved at a convergence rate that is optimal in a robust sense if L = G −1 n . This insight motivates the design choice of setting L = G −1 n , even though this may be a suboptimal solution when Q ≠ I is required to achieved convergence. A proof of Theorem 2 is the following.
Proof. Parametrize L identically as L * = G −1 n , where is any complex matrix ∈ C n×n , and consider (18), which yields * = arg inf This problem can be solved by reducing it to the SISO case. This is achieved by considering the direction along which the effect of uncertainty is maximal, followed by considering the specific unity input along this direction to evaluate the 2,2-induced matrix norm. The worst-case uncertainty is determined by taking the singular value decomposition (SVD) of G −1 n W y , that is, G −1 n W y = UΣV H , which results in wherẽ≜ U H U andΔ ≜ V H ΔW u U, where it is used that U is a unitary matrix. Furthermore, since V and W u are also unitary,Δ ∈ , such that the transformed decision variablesΔ and̃can be identically considered. Next, consider the particular perturbationΔ * = − e j I, which is admissible since ||Δ * || 2,2 ≤ . This perturbation provides a lower bound to the worst case Next, by definition of the induced 2,2-norm, it holds that for any z such that ||z|| 2 = 1. Taking z = [ 1 · · · 0 ] ⊤ , and noting that by convention [Σ] 11 = (G −1 n W y ), the right-hand side of (25) can be written as ( which is similar to the SISO case discussed in Section 6.1. By similarly taking equal to the phase of In conclusion, with respect to (22), it holds that which enables a proof of Theorem 2 by contradiction. First, consider (21a), that is, let (G −1 n W y ) ≥ 1, and note that ||(I, 0, Δ)|| 2,2 = 1. Assume that ∃ , ≠ 0, that results in a strictly lower costs than = 0. In this case, it must hold that By Lemma 4, the right-hand side equals 1. This contradicts the above statement, implying that = 0 is optimal. Similarly, consider (21b), that is, let (G −1 n W y ) < 1, and note that ||(I, G −1 n , Δ)|| 2,2 = (G −1 n W y ). Assume that ∃ , ≠ I, that results in a smaller costs than = I. In this case, it must hold that By Lemma 4, the right-hand side equals (G −1 n W y ). This contradicts the above statement, implying that = I is optimal. ▪ In this section, it is shown that perfect tracking can be obtained if the relative uncertainty is sufficiently bounded, and in that case, L = G −1 n optimizes the robust convergence rate. The next section considers the optimization of Q for robust performance.

OPTIMIZING Q FOR ROBUST PERFORMANCE
This section provides the main motivation for optimizing Q for nominal performance as is advocated by the approach proposed in Section 4.1. Namely, it is shown that optimizing Q for robust performance can be relatively conservative, since it results either in perfect tracking, or in not applying learning, that is, Q = I or Q = 0, when L = G −1 n . It is also shown that taking < 1 can be used to bound the worst-case performance. These insights constitute contribution C3.
Consider (10) for L = G −1 n , which is formulated as follows.
The solution is first illustrated for the SISO case and is followed by a formal treatment for MIMO systems in Section 7.2.
F I G U R E 4 (1) Displays the case for a model set that achieves convergence for Q = 1 and = 1, that is, |G −1 n W y W u | < 1. (1.a) The circular set (G −1 n , Q, Δ) = QG −1 n W y ΔW u that results by varying Q in the feasible range, that is, the range for which convergence is achieved.
n , Q, Δ) that results by varying Q in the feasible range. This shows that for Q = 1, perfect performance is achieved, and the worst-case performance is unbounded if the convergence constraint is active. (2) Displays the case for a model set that does not achieve convergence for Q = 1, that is, |G −1 n W y W u | ≥ 1.

Optimizing Q: Illustration of the SISO case
Consider the optimization of Q for robust performance and robust convergence for the SISO case, such that (30) and (31) yield Note that the performance function is identical to (G −1 n , Q, Δ) = (1 − Q)(1 + QG −1 n W y ΔW u ) −1 , which directly shows that Q = 1 is optimal if it feasible, that is, if it satisfies the convergence constraint. If the constraint is not satisfied for Q = 1 and = 1, then sup Δ∈ |(G −1 n , Q, Δ)| ≥ 1 for all feasible Q, implying that the worst-case performance cannot be improved. To see this, note that if ∃Δ such that |G −1 n W y ΔW u | ≥ 1, then |QG −1 n W y ΔW u | < 1 does not hold for Q = 1. This is equivalent to stating that there exists a Δ * such that (G n + W y Δ * W u ) = 0, which implies (G −1 n , Q, Δ * ) = 1. Consequently, sup Δ∈ |(G −1 n , Q, Δ)| ≥ 1 ∀Q. Hence 1 presents a lower bound, and this lower bound is attained ∀Δ ∈ for Q = 0, implying that Q * = 0. Now consider the case where a certain Q is such that the convergence constraint is active for = 1, that is, |QG −1 n W y ΔW u | < 1, ∀Δ ∈ . For such a Q, ∃Δ * ∈ such that the denominator in (G −1 n , Q * , Δ * ) approaches 0, resulting in an unbounded worst-case performance. By taking < 1 this situation can be relieved. These mechanisms are illustrated in Figure 4. Next, these results are generalized to the MIMO case.

Optimizing Q: The MIMO case
The following theorem presents the general solution to Problem 4 for MIMO systems. (30) and (31) in Problem 4 satisfies

Theorem 3. Let (Δ) be given by Definition 2 and satisfy Assumptions 3 and 6. The solution to
Similar to the SISO case, (33a) states that if the relative uncertainty is too large, then there does not exist a Q that increases the performance ∀Δ ∈ , such that the optimal solution is to not apply IIC, that is, Q = 0. Conversely by (33b), if the relative uncertainty small compared , then the optimal robust performance is attained for Q = I, which implies perfect tracking. This all-or-nothing solution is indicative of conservatism since it leaves no room to partially improve the performance when = 1.
Any Q * ≠ I that optimizes the nominal performance, that is, is a solution to the SDP posed by (13), is such that the constraint ||(G −1 n , Q * , Δ)|| 2,2 < is active. For = 1, this implies that ∃Δ * ∈ for which the factor I + Q * G −1 n W y Δ * W u in (G −1 n , Q * , Δ), given by (31), approaches singularity, yielding unbounded worst-case performance. This observation indicates that the worst-case performance can be moderated by taking < 1, as is demonstrated in Section 10. A proof of Theorem 3 requires the following auxiliary result. Lemma 5. Let be given by (3), and A, B ∈ C n×n , with B unitary, then it holds that A proof of this lemma is provided in Appendix E. A proof to Theorem 3 is the following.

IDENTIFICATION OF THE MODEL SET
In this section, a method is developed to identify the model set (Δ), as given by (3), which constitutes contribution C4. A connection is established between the stochastic disturbances that corrupt the output during the identification process and the resulting uncertainty of the nominal model. This connection results in an approach to determine the uncertainty bound .

MIMO FRF identification
Consider the configuration in Figure 1. The MIMO FRF G • ( k ) can be estimated from n input-output data pairs as 18 where it is assumed that U( k ) is regular ∀ k , which is achieved appropriate input signal selection. 43 To quantify the uncertainty ofĜ( k ), it is assumed that the output is corrupted by the disturbances as follows, where a distinction is made between the DFT coefficients that are real and complex by property.

Definition 3. The real DFT coefficients correspond to the frequency set
and the complex DFT coefficients correspond to the remaining set of frequencies, Ω ℑ ≜ Ω N ⧵ Ω ℜ .
is Hermitian positive definite, and different realizations V i ( k ) are mutually independent. 44 Remark 1. Assumption 7 can be approximately satisfied in practice as follows.
• For a stable LTI system, T U i ( k ) → 0 as N → ∞. Hence, the transient response becomes negligible by taking a long measurement interval, or by applying periodic excitations 18( §2.4) and excluding the first periods from the measurement.
• The trial-invariant deterministic disturbance can be eliminated, that is, D( k ) = 0, by taking the difference between two measurements with different input signals. Specifically, let , which can be redefined as Y = GU + V , which is a measurement with D = 0.
with H(z) a stable filter and i (t) independent, identically distributed noise with existing moments of any order. 18( §2.5, §7, §16. 16) This condition is considered mild since it does not assume a specific distribution in the time domain.
Assumption 7 motivates the following the model set that establishes a direct connection between the disturbances and . Proposition 1. Under Assumption 7, set G n =Ĝ, withĜ given by (36), set W u = u U −1 , where u ≜ (U) and where U is given by (37), and set W y = 1

is the Cholesky decomposition, which exists since C V is Hermitian positive definite. For these settings, (3) results in
V is such thatṼṼ H a complex Wishart matrix, as is described in Reference 45 and Appendix G.
Proof. Under Assumption 7, (36) can be written aŝ where V is a random matrix whose columns V i satisfy Assumption 7, such thatṼ ≜ C −1∕2 V V is a Wishart matrix. For the proposed settings, it holds that (38). ▪ Considering (38), Proposition 1 implies that if there exists Δ * ∈ such that Δ * = −Ṽ, then the true system G • is contained in the model set (Δ). Consequently, can be used to tune the probability thatṼ ∈ , as is considered in the next section.

Determining the uncertainty bound
The section treats the problem of selecting such that G • is contained in (Δ) with probability , as is illustrated for the one-dimensional case in Section 2.3. By definition of , as given by (3), a Δ ∈ can attain −Ṽ if (Ṽ) ≤ . Hence, the probability that G • is contained in (Δ) equals the cumulative distribution function (CDF) Pr( (Ṽ) ≤ ). The following lemma provides explicit expressions of this CDF. Lemma 6. Let Γ(x) denote the gamma function and let (a, x) denote the regularized lower incomplete gamma function, that is, (a, x) = 1 Γ(a) ∫ x 0 z a−1 e −z dz, Γ(a) = ∫ ∞ 0 z a−1 e −z dz. If V( k ), as given by (39), satisfies Assumption 7, then the CDF of (Ṽ( k )), withṼ where − 1, x), and related to F ℜ ( ), n m = n if n is even, and n m = n + 1 if n is odd. When n is even, then elements of the skew symmetric matrix function satisfy If n is odd, Φ is an n m × n m matrix function with additional elements A proof of this lemma is provided in Appendix G. Lemma 6 enables the following key result, which establishes a connection between FRF identification and the developed L and Q design approach. (3), under Assumptions 2 and 7, and let G n ( k ), W y ( k ) and W u ( k ) be set as in Proposition 1. If is such that F( , k ) ≥ , with F( , k ) as presented in Lemma 6, then Pr(G • ( k ) ∈ (Δ, k )) ≥ . This theorem establishes the implicit relation between and the probability that G • is contained in (Δ). The inverse relation exists since the CDF is a nondecreasing function of , and hence, can be determined numerically by means of a root-finding algorithm to achieve a desired probability . Note that Theorem 4 considers a single frequency k . Consequently, the probability that G • ( k ) is contained in (Δ) for a set of frequencies equals , with = N ℜ + 1 2 N ℑ , where N ℜ and N ℑ are the numbers of frequencies corresponding to real and imaginary DFT coefficients.
Remark 4. Practically, the covariance matrix C V may be unknown and can be replaced by the sample covariance C Y of the sample means Y i that can be obtained from additional observations, for example by using periodic repetitions 18( §2.5) . This reduces the uncertainty by averaging, although technically (41) and (42) no longer hold, since the elements of V satisfy a Student t-distribution. 33 In this section, a method is developed to identify the model set (Δ), which includes an approach to determine the norm bound such that the true system is contained in the model set with a desired probability . In conjunction with the results from the previous section, a complete design approach is obtained that is concisely formulated in the next section.

DESIGN FRAMEWORK
In this section, the L and Q design framework summarized.

Initialization
Given a system G • that satisfies Assumption 2.
1. Determine the desired probability ∈ (0, 1) at which U( k ) should converge at an individual frequency k . 2. Determine the desired upper bound on the convergence rate ∈ (0, 1], as is given by Definition 1.
where Ω ℑ and Ω ℜ are specified in Definition 3.

Invert the nominal model and set
n W y || 2,2 < , then set Q = I, else, compute Q by solving (13). Exploit the symmetry of the DFT, that is, Q( k ) = Q( −k ), Q( k+pN ) = Q( k ), p ∈ Z, to reduce the number of SDPs.
Recall that the worst-case performance sup Δ∈ ||(G −1 n , Q * , Δ)|| 2,2 can be computed as is discussed in Appendix F. In case the nominal performance is only slightly smaller 1, while the worst-case performance far exceeds 1, consider setting Q = 0 or recompute Q for a smaller .
In this section, the results from the previous sections are combined into a complete design approach. In the next section, this approach is illustrated by application.

APPLICATION TO A FLEXIBLE MOTION SYSTEM
In this section, the developed design framework, as formulated in Section 9, is illustrated by designing L and Q matrices for a benchmark flexible beam system. Both a diagonal and full Q matrix are synthesized and used to apply MIMO IIC, as presented in Section 2. The results indicate high tracking performance for both cases, yet superior performance is achieved for the fully parametrized Q matrix, thereby confirming the potential of the developed framework, which constitutes contribution C5.

The plant and initialization
The plant is simulated by 16th-order DT LTI model with input-output dimension n = 2 with a sample rate of 1 kHz, whose FRF G • (e j k ) is shown in Figure 5. This system represents the benchmark flexible beam as discussed in Reference 47. The frequency grid of the FRF model is made to match the DFT grid of a reference signal consisting of N = 2 10 samples. It is desired that the IIC algorithm converges with a probability ≥0.95, that is, the probability that convergence occurs at all frequency bins simultaneously must be at least 0.95. This is achieved if an individual frequency converges with probability = 0.95 , where is the number of independent frequency bins, that is, = 1 2 N ℑ + N ℜ = 1 2 (2 10 − 2) + 2. Furthermore, the upper bound on the convergence rate is set to = 0.2.

10.2
Identifying the model set is obtained by performing two experiments, during which v(t) is taken such that V( k ) asymptotically satisfies Assumption 7 with C V = 2 ∀ k , where |A| denotes the elementwise absolute value of A ∈ C n×m . This structure is achieved by taking the ith input during the ith experiment as random-phase multisine with unit spectrum. This makes U is unitary, such that the results in Section 6 hold. The nominal modelĜ n is determined as (36), and is shown in Figure 5. The norm bounds ℑ and ℜ are determined by solving F ℑ ( ) − = 0 and F ℜ ( ) − = 0 using the MATLAB-routine fzero.m.

Determining L and Q
Since G n results from measured data, it is generically invertible. Hence, L is readily determined as L = G −1 n . In the following, Q is designed as a full matrix, Q f ∈ C 2×2 , and as diagonal matrix, Comparing the performance of both matrices provides insight about the performance gap that potentially exist in preexisting approaches. 29 For both cases, Q is optimized for nominal performance, that is, steps 2(a) and 2(b) in Section 9 are performed, where (13) is readily solved using either a full or diagonal parametrization by using YALMIP 48 in combination the semidefinite program (SDP) solver SeDuMi.
The resulting Q-matrices are shown in Figure 6, which displays the magnitude and phase. Three key observations are made. First, both Q matrices satisfy Q = I up to 55 Hz, which corresponds to the perfect tracking solution. Second, beyond 60 Hz, the freedom to assume any full matrix in C 2×2 is exploited by Q f , that is, the off-diagonals are nonzero, Q f is clearly not Hermitian, and all entries are complex. By contrast, Q d turns out to be a real matrix, even though its parametrization is complex. This is in agreement with the well-adopted notion of phase-less robustness filters. 11 Third, and most remarkably, beyond 55 Hz, Q f displays relatively large peaks. This goes against common SISO intuition which requires that |Q|<1 to increase robustness, 6 as indeed turns out to be the optimal solution for Q d .

Validation
The performance of both Q-matrices is shown in Figure 7. In the top plot, the optimized nominal performance bounds are displayed, which correspond to in (13). These bounds indicate perfect tracking up to 55 Hz, and in the range [100, 200] Hz, a significant performance improvement is achieved, that is, < 1. The difference between Q f and Q f is apparent in the range [150, 250] Hz, where Q f performs significantly better. In the second plot, the actually achieved performance is displayed, which shows great similarity with the first plot. This validates that optimizing for nominal performance indeed provides an effective way to optimize the true performance. In the bottom plot, the worst-case performance is displayed, which paints a relatively conservative picture in which Q d significantly outperforms Q f . Before the designed L and Q matrices are employed to increase the tracking performance through IIC in the next section, the effect of is illustrated by optimizing Q f for ∈ {0.2, 0.5, 0.7, 0.9}. The resulting nominal and worst-case performance are shown in Figure 8. Increasing makes the convergence constraint less stringent, such that that the nominal performance improves for all frequencies. In contrast, the worst-case performance improves for those frequencies where it is <1, and it deteriorates significantly for those frequencies it is >1. The latter illustrates that in the limit → 1, the worst-case performance becomes unbounded as is described in Section 7.

Applying iterative inversion-based control
The control goal is to track a high-frequency triangle wave with both outputs. To achieve this, IIC is applied with the designed L and Q matrices, as is presented in Algorithm 1 in Section 2.1. The error is measured after approximately 3 seconds to achieve steady state, and the disturbances are as during the identification experiments. The time domain performance is shown in Figure 9, which shows the initial tracking error, and the converged error obtained by using Q d and Q f . Convergence is approximately attained after a single iteration, which results in a significant performance increase, as is shown in Figure 10. These observation confirm that Problem 1 is solved by applying robust IIC, where L and Q are designed by using the approach developed in this article. The performance difference, as is especially clear in in the second output, confirms there is a benefit to using full robustness matrices over diagonal matrices in IIC. F I G U R E 9 The initial tracking error e(t) ( ) and the error after one IIC iteration (Algorithm 1) for Q f ( ) and Q d ( ). This reveals that significant performance improvement is obtained, and that Q f outperforms Q d in output 2 since the achieved error is significantly smaller [Colour figure can be viewed at wileyonlinelibrary.com]

CONCLUSION
An iterative inversion-based control framework is developed for multivariable systems that enables high performance and fast convergence in the presence of unknown disturbances. This is achieved by developing a design framework for the learning and robustness matrices by connecting robust control and nonparametric FRF identification. In the developed framework, the learning matrix is taken as the inverse of the system FRF and the robustness matrix is optimized to achieve optimal nominal performance while ensuring robust monotonic convergence in the presence model mismatches that are induced by stochastic disturbances that corrupt identification experiments. The resulting design process is computationally efficient and can be parallelized by virtue of the frequency-by-frequency design approach. The learning framework is developed for applications that readily allow estimation of FRF models and that perform repetitive tasks during which the system is approximately in steady state. Furthermore, since the developed approach does not require a parametric model, the dynamic order of the system does not impose any computational limitations. This makes the approach especially suitable for systems whose dynamics are predominantly described by LTI flexible dynamics, such as mechatronic motion systems. The efficacy of the developed framework is illustrated through application to a simulated benchmark flexible beam system, which shows that superior tracking performance can be achieved by exploiting the design freedom that is offered by considering full-MIMO robustness matrices. Future research is aimed at the experimental application of the developed framework to industrial systems to further motivate the adoption of the learning control framework in practice.

ACKNOWLEDGEMENT
This work is supported by Canon Production Printing and the research program VIDI (no 15698), which is (partly) financed by the Netherlands Organization for Scientific Research (NWO).