Performance‐enhanced iterative learning control using a model‐free disturbance observer

Funding information: National Natural Science Foundation of China, Grant/Award Numbers: 51805496, 51805192; Natural Science Foundation of Hubei Province of China, Grant/Award Number: 2019CFB206

Abstract

This paper proposes a novel performance-enhanced iterative learning control (ILC) scheme using a model-free disturbance observer (DOB) to achieve high performance for precision motion systems that encounter non-repetitive disturbances. As is well known, the performance of the standard ILC (SILC) is severely degraded by non-repetitive disturbances. By introducing DOB into SILC, this paper improves the robustness of the ILC system against non-repetitive disturbances. In the proposed enhanced ILC (EILC), SILC learns the feedforward signals for a specific reference, while DOB compensates for external disturbances. Little or no plant model knowledge is required for SILC. To maintain this advantage after introducing DOB, a model-free design method for DOB is proposed that removes the need for a plant model. Based only on a specific reference and the corresponding feedforward signals learned by SILC, the filter of DOB is optimized via an instrumental-variable estimate method. Numerical simulation is performed to illustrate the effectiveness and enhanced performance of the proposed control approach.

A detailed overview of the main developments in ILC research can be found in, e.g., [11][12][13]. Design frameworks in ILC include frequency-domain design and time-domain design [11], and a direct relation exists between frequency-domain and time-domain ILC schemes (see [14]). The key idea behind ILC is that, during repetitive tasks, the control signal is iteratively updated from the measured data of previous tasks, so that performance improves from one task to the next. Many ILC algorithms do not require a parametric plant model in the design of the robustness filter and learning filter. In [5,15,16], the frequency response function is utilized to satisfy the ILC convergence condition. Compared to a parametric model, the frequency response function is often easy, inexpensive, and accurate to obtain for mechatronic systems. In data-driven inversion-based ILC [17,18], the modelling requirement is further reduced by removing the need for the frequency response function. In particular, the control signal is updated explicitly in the frequency domain by using the Discrete Fourier Transform. In [19,20], the ILC law is implemented in the time-domain lifted framework, where the impulse response is employed. Furthermore, for the widely used proportional-derivative (PD)-type ILC, the gains of the PD-type learning filter are most commonly selected by on-site tuning [11], similar to the tuning of a proportional-integral-derivative (PID) feedback controller.
The servo error induced by the repeating reference as well as repetitive disturbances can be completely eliminated by ILC. However, if the motion system suffers from non-repetitive, or task-varying, disturbances, the performance of ILC schemes deteriorates severely [21][22][23], especially when the dominant frequencies of the non-repetitive disturbances overlap with those of the reference. Therefore, the achievable performance of ILC is limited by non-repetitive disturbances. Traditionally, the rejection of non-repetitive disturbances relies on the feedback control of the closed-loop system, while ILC serves as the feedforward control [24,25]. To enhance the robustness of ILC against non-repetitive disturbances, a significant research effort has focused on filtering out the effect of non-repetitive disturbances from the learning loop. Wang et al. [26] introduce a variable forgetting factor into high-order ILC such that the control variation induced by non-repetitive disturbances is weakened. In [27], a wavelet-based filter is employed that identifies and removes the tracking error induced by non-repetitive disturbances for each iteration of ILC. In [28] and [29], ILC with an iteration-domain filter is proposed to address the non-repetitiveness, which can be regarded as a special high-order ILC.
Compared to the filtering, observer-based methods seem more intuitive and natural to deal with non-repetitive disturbances of the ILC systems. Extension of ILC with a time-domain disturbance observer (DOB) is studied in [30][31][32]. The estimated disturbance information by DOB is added into the standard ILC (SILC) law to compensate for external disturbances including both repetitive and non-repetitive disturbances. Tang et al. [33] achieve performance improvement of ILC by incorporating a time-domain extended state observer (ESO) that can estimate external disturbances and unknown uncertainties. In [34] and [35], SILC is combined with an iteration-domain ESO that iteratively estimates disturbances.
The performance of observer-based ILC schemes depends critically on the accuracy with which the DOB estimates the disturbances. In the traditional disturbance observer approaches, a parametric model of the controlled plant is first obtained, for instance, via system identification, and then a DOB or ESO is designed on the basis of the parametric model.
ILC extended with a model-based DOB or ESO indeed enhances robustness against non-repetitive disturbances. However, for a model-based DOB or ESO, the troublesome modelling or identification is inevitable, and the actual estimate accuracy of disturbances is unavoidably degraded by the modelling error. Hence, the advantages of the original data-driven (or model-free) design of the ILC law are lost once a model-based DOB or ESO is introduced. That is, the dependence of disturbance observation methods on the plant model limits the practical applicability of the traditional observer-based ILC. To the best of the authors' knowledge, little effort has been made to remove this dependence. Like the feedforward controller in the two-degrees-of-freedom control configuration [36], DOB aims at accurately approximating the inverse plant model. The successful application of data-driven feedforward tuning [37][38][39] suggests that a model-free design of DOB based on the measured data from the tracking tasks is possible.
The aim of this paper is to propose a novel enhanced ILC (EILC) approach using a model-free DOB for precision motion systems that perform repeating tasks but encounter non-repetitive disturbances. The contributions of this paper are threefold. First of all, a new model-free design method is proposed for DOB. The filter of DOB, which aims at approximating the inverse dynamics of the plant, is optimized via an instrumental-variable (IV) estimate method [40] based only on a specific reference and the corresponding feedforward signals learned by SILC. Second, by introducing the model-free DOB into SILC, an EILC scheme is developed to enhance the robustness against non-repetitive disturbances. In the proposed EILC, SILC iteratively learns the feedforward signals for a specific reference and thus eliminates the reference-induced error, while DOB is to compensate for external disturbances. Third, numerical simulation is performed to confirm that the proposed approach achieves improved performance compared to SILC.
The remainder of the paper is organized as follows. The SILC and traditional DOB are described in Sections 2 and 3, respectively. Then, in Section 4, the model-free design method for DOB is developed based only on a specific reference and the corresponding feedforward signals learned by SILC. In Section 5, an EILC scheme using the model-free DOB is presented. In Section 6, numerical simulation is provided to verify the effectiveness and enhanced performance of the proposed approach. Finally, Section 7 presents a brief conclusion to this paper.
Notation: This paper considers linear time-invariant discrete-time single-input single-output systems. The discrete-time transfer function of a system is denoted as $K(z)$ with the complex variable $z$; the argument $z$ is omitted where this causes no confusion. For a sampled signal $x$, $x(k)$ denotes the value of $x$ at the time instant $kT_s$, with the sampling interval $T_s$. Furthermore, the expected value $\bar{E}x$ is defined as
\[
\bar{E}x = \lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} E\{x(k)\},
\]
where $N$ is the number of the sampled data.

STANDARD ITERATIVE LEARNING CONTROL
ILC serves as the feedforward control, and the ILC configuration is depicted in Figure 1. After each repetition, ILC updates the feedforward signals for the next repetition by learning from past tasks. The system, consisting of an unknown plant $P$ and a stabilizing feedback controller $C$, is repeatedly excited by a reference signal $r$. Each repetition of $r$ is called a task, or iteration, denoted by the subscript $j$. Furthermore, $u_j$, $y_j$, and $e_j$ ($e_j = r - y_j$) denote the control input, the output, and the tracking error at iteration $j$, respectively. $u_j$ is composed of the feedforward signal $f_j$ and the feedback signal $c_j$. $d_j$ denotes the unmeasured iteration-varying disturbance at iteration $j$, i.e. $d_{j+1} \neq d_j$. Assume that $d_j$ is uncorrelated with $r$. From Figure 1, $e_j$ is given by
\[
e_j = S r - T f_j - T d_j, \tag{1}
\]
with sensitivity function $S = 1/(1 + PC)$ and process sensitivity function $T = P/(1 + PC)$.
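A typical SILC algorithm is formulated as
\[
f_{j+1} = Q\,(f_j + L e_j), \tag{2}
\]
where $Q$ is a robustness filter and $L$ is a learning filter. $Q$ and $L$ are possibly non-causal. Given bounded $d_j$, a sufficient condition for the asymptotic stability of the SILC system is given by [14,41]
\[
\| Q(1 - LT) \|_\infty < 1. \tag{3}
\]
Substituting (1) into (2) yields
\[
f_{j+1} = Q(1 - LT) f_j + QLS\, r - QLT\, d_j, \tag{4}
\]
which further gives
\[
f_{j+1} = [Q(1 - LT)]^{j+1} f_0 + \sum_{i=0}^{j} [Q(1 - LT)]^{i}\, QLS\, r - \sum_{i=0}^{j} [Q(1 - LT)]^{i}\, QLT\, d_{j-i}. \tag{5}
\]
If the asymptotic stability condition (3) is satisfied, the asymptotic feedforward signal $f_\infty \triangleq \lim_{j\to\infty} f_{j+1}$ is obtained as
\[
f_\infty = \frac{QLS}{1 - Q(1 - LT)}\, r - \delta_\infty, \qquad \delta_\infty \triangleq \lim_{j\to\infty} \sum_{i=0}^{j} [Q(1 - LT)]^{i}\, QLT\, d_{j-i}. \tag{6}
\]
The uncertain term $\delta_\infty$ depends on all previous non-repetitive disturbances. Due to $\|Q(1 - LT)\|_\infty < 1$, the disturbance of the last iteration dominates $\delta_\infty$ [27], that is,
\[
\delta_\infty \approx \lim_{j\to\infty} QLT\, d_j.
\]
By using (1) and (2), we have
\[
e_{j+1} = S r - TQ\,(f_j + L e_j) - T d_{j+1}. \tag{7}
\]
From (1), it follows that
\[
T f_j = S r - e_j - T d_j. \tag{8}
\]
Then we have from (7) and (8) that
\[
e_{j+1} = Q(1 - LT)\, e_j + (1 - Q) S\, r + T\,(Q d_j - d_{j+1}). \tag{9}
\]
Similar to $f_\infty$, the asymptotic error $e_\infty \triangleq \lim_{j\to\infty} e_{j+1}$ can be obtained from (9), again keeping only the dominant disturbance term:
\[
e_\infty \approx \frac{(1 - Q) S}{1 - Q(1 - LT)}\, r + \lim_{j\to\infty} T\,(Q d_j - d_{j+1}). \tag{10}
\]
Design considerations for $Q$ and $L$ are well known for ILC algorithms (see, e.g., [11]). Ideally, $Q = 1$ is selected to achieve high tracking performance, and $L = T^{-1}$ for a fast convergence rate. To avoid parametric modelling of $P(z)$ (or $T(z)$), the frequency response function $T(e^{j\omega})$ [5,15,16] or the impulse response of $T(z)$ [19,20] can be utilized to design $Q$ and $L$ satisfying (3). Another method to achieve the model-free design of ILC is to implement (2) in the frequency domain by using the Discrete Fourier Transform, which is referred to as data-driven inversion-based ILC [17,18].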
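The learning update described above can be illustrated with a minimal time-domain sketch. Everything numeric here is an assumption made for illustration and is not taken from the paper: a hypothetical first-order plant, a proportional feedback gain `kp`, $Q = 1$, and a P-type learning filter $L = \gamma z$ (a gain with one-sample advance). With these values the contraction factor $\|1 - LT\|_\infty$ is about 0.82, so the error should shrink from task to task when no disturbance is present, and stall at a noise floor when a fresh random disturbance is drawn in every task.

```python
import numpy as np

def run_trial(f, r, d, a=0.9, b=0.1, kp=2.0):
    """Simulate one task of the loop y(k+1) = a*y(k) + b*(u(k) + d(k))."""
    n = len(f)
    y = np.zeros(n + 1)
    for k in range(n):
        u = f[k] + kp * (r[k] - y[k])        # feedforward + proportional feedback
        y[k + 1] = a * y[k] + b * (u + d[k])  # disturbance enters at the plant input
    return r - y                              # tracking error, length n + 1

def run_silc(n_iter=20, dist_std=0.0, gamma=3.0, n=200, seed=0):
    """Return the RMS tracking error recorded at each iteration."""
    rng = np.random.default_rng(seed)
    r = np.sin(2 * np.pi * np.arange(n + 1) / n)  # repeating reference
    f = np.zeros(n)                                # initial feedforward f_0 = 0
    rms = []
    for _ in range(n_iter):
        d = dist_std * rng.standard_normal(n)      # new disturbance in every task
        e = run_trial(f, r, d)
        rms.append(float(np.sqrt(np.mean(e[1:] ** 2))))
        f = f + gamma * e[1:]                      # f_{j+1} = f_j + gamma * e_j(k+1)
    return rms

clean = run_silc(dist_std=0.0)
noisy = run_silc(dist_std=0.05)
print(f"no disturbance:   first {clean[0]:.3e}, last {clean[-1]:.3e}")
print(f"with disturbance: first {noisy[0]:.3e}, last {noisy[-1]:.3e}")
```

The one-sample advance in the update compensates for the one-sample delay of the assumed plant; other plants would need a different learning filter.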
To clearly illustrate the effect of non-repetitive disturbances on the performance of SILC, set $Q = 1$, and thus we have
\[
e_\infty \approx \lim_{j\to\infty} T\,(d_j - d_{j+1}). \tag{11}
\]
From (1), if there is no SILC serving as the feedforward control in the closed-loop system, the servo error induced by disturbances is given by
\[
e_j = -T d_j. \tag{12}
\]
By comparing (11) and (12), it is observed that SILC amplifies the error induced by the non-repetitive disturbances, since (11) contains the disturbances of two consecutive iterations rather than one. Hence, the existence of non-repetitive disturbances degrades the performance of the SILC system.
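The amplification can be made quantitative under additional assumptions that are not stated in the paper: take the asymptotic SILC error for $Q = 1$ as $T(d_j - d_{j+1})$, the feedback-only error as $-T d_j$, and assume that the disturbances of consecutive iterations are zero-mean, mutually uncorrelated, and share the same power spectrum $\Phi_d(\omega)$. A sketch of the resulting error spectra:

```latex
% Assumptions (illustrative, not from the paper): d_j and d_{j+1} are zero-mean,
% mutually uncorrelated, and have the same power spectrum \Phi_d(\omega).
\begin{align}
\Phi_e^{\mathrm{SILC}}(\omega)
  &= |T(e^{j\omega})|^2 \left[ \Phi_d(\omega) + \Phi_d(\omega) \right]
   = 2\,|T(e^{j\omega})|^2\, \Phi_d(\omega), \\
\Phi_e^{\mathrm{FB}}(\omega)
  &= |T(e^{j\omega})|^2\, \Phi_d(\omega).
\end{align}
```

Under these assumptions the disturbance-induced error variance doubles with SILC, i.e. the RMS error grows by a factor of $\sqrt{2} \approx 1.41$ compared with feedback control alone.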

DISTURBANCE OBSERVER
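DOB is a promising approach to compensate for the non-repetitive disturbances of the ILC system. As shown in Figure 2, DOB estimates the disturbance $d$ by using the control input $u$ and the output $y$:
\[
\hat{d} = z^{-1}\Theta\,(F y - u). \tag{13}
\]
From Figure 2, we have
\[
y = P\,(u + d). \tag{14}
\]
Substituting (14) into (13) gives
\[
\hat{d} = z^{-1}\Theta\,[(FP - 1)\,u + FP\,d]. \tag{15}
\]
The disturbance estimate error $\tilde{d}$ is given by
\[
\tilde{d} = d - \hat{d} = (1 - z^{-1}\Theta FP)\,d - z^{-1}\Theta\,(FP - 1)\,u. \tag{16}
\]
Equation (16) reveals that if $F = P^{-1}$ and $\Theta = 1$, $d$ can be accurately estimated by DOB. The one-step delay $z^{-1}$ in (13) guarantees the implementability of the discrete-time DOB. The $\Theta$ filter is ideally designed to be close to 1 over the whole frequency range, but probably at the expense of amplifying the measurement noise. Considering that $d$ is usually of low frequency whereas the measurement noise is of high frequency, $\Theta$ is nominally selected as a low-pass filter. The cut-off frequency $\omega_\Theta$ of $\Theta$ is chosen as large as possible on the premise of ensuring system stability. From Figure 2, after introducing DOB into the closed-loop system, i.e. with the plant input $u = f + c - \hat{d}$, the output $y$ is given by
\[
y = P'\,(f + c) + G'_{dy}\, d, \tag{17}
\]
with
\[
P' = \frac{P}{1 + z^{-1}\Theta\,(FP - 1)}, \qquad G'_{dy} = \frac{(1 - z^{-1}\Theta)\,P}{1 + z^{-1}\Theta\,(FP - 1)}. \tag{18}
\]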
Since $\Theta \approx 1$ in the frequency range $[0, \omega_\Theta]$ below the cut-off frequency $\omega_\Theta$ of $\Theta$, we have $G'_{dy} \approx 0$ at low frequency. Therefore, the effect of the low-frequency $d$ can be well compensated for by DOB. Furthermore, if $F$ closely approximates $P^{-1}$, then $P' \approx P$, that is, the system dynamics are nearly unchanged after introducing DOB. As a result, the closed-loop stability will not be degraded by DOB.
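These two properties can be checked numerically with a small frequency-response sketch. It assumes the common DOB structure $\hat{d} = z^{-1}\Theta(Fy - u)$ with the compensated input $u = \bar{u} - \hat{d}$, which gives $P' = P/(1 + z^{-1}\Theta(FP - 1))$ and $G'_{dy} = (1 - z^{-1}\Theta)P/(1 + z^{-1}\Theta(FP - 1))$. The first-order plant, the first-order low-pass $\Theta$, and the `mismatch` parameter (a relative gain error of $F$ with respect to $P^{-1}$) are hypothetical choices made for illustration only.

```python
import numpy as np

def dob_responses(w, alpha=0.5, mismatch=0.0, a=0.9, b=0.1):
    """Frequency responses of P, P' and G'_dy at frequencies w (rad/sample)."""
    zi = np.exp(-1j * w)                     # z^{-1} evaluated on the unit circle
    P = b * zi / (1 - a * zi)                # hypothetical first-order plant
    F = (1 + mismatch) / P                   # F = P^{-1} up to a relative gain error
    Theta = (1 - alpha) / (1 - alpha * zi)   # low-pass filter with unit DC gain
    den = 1 + zi * Theta * (F * P - 1)
    P_new = P / den                          # plant as seen by the outer loop
    G_dy = (1 - zi * Theta) * P / den        # disturbance-to-output path with DOB
    return P, P_new, G_dy

w = np.array([1e-3, 1.0, 3.0])
P, P_new, G_dy = dob_responses(w)
_, P_mis, _ = dob_responses(w, mismatch=0.1)

print("|G'_dy / P| :", np.abs(G_dy / P))     # small at low frequency
print("|P' / P|, exact F   :", np.abs(P_new / P))
print("|P' / P|, 10% error :", np.abs(P_mis / P))
```

In this sketch the disturbance path is strongly attenuated only at low frequency, which is the behaviour the text attributes to a low-pass $\Theta$, and a 10% gain error in $F$ changes $|P'|$ by roughly 10% at most.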
The effectiveness of DOB has been well validated (see, e.g., [30]). As discussed above, the design objective of $F$ is essentially to approximate $P^{-1}$ accurately. In the traditional DOB, a plant model $P$ is first obtained, for instance, via system identification, and then a stable $F$ is designed to approximate $P^{-1}$. Hence, the troublesome modelling is inevitable, and the actual estimate accuracy of disturbances is unavoidably degraded by the modelling error. Furthermore, the advantages of model-free design will be lost if SILC is extended with the traditional model-based DOB. Hence, it is indispensable to develop a model-free design method for DOB.

Remark 1. The selection of $\Theta$ as a low-pass filter limits the performance of DOB with respect to disturbances containing high-frequency components. This selection can be further justified from two engineering perspectives [42]: (1) the high-frequency components of disturbances are normally filtered out by the inertia of the physical system; (2) even if the high-frequency components of disturbances could be well estimated, it is quite difficult to attenuate their impact due to the bandwidth constraints of actuators.

MODEL-FREE DOB BASED ON SILC
In this section, a model-free design method for DOB is proposed based only on a specific reference and the corresponding feedforward signals learned by SILC. This constitutes the first contribution of this paper. The design objective of the F filter of DOB is to approximate P −1 as closely as possible. Hence, the parametrization of F should be capable of capturing the dominant dynamics of P −1 . P (z ) is usually described as with n ≥ n . Therefore, F is parametrized with the following model structure: Here, n = n a + n b + 2. Ideally, n a = n − 1, and n b = n − 1.
The $Q$ filter of SILC is typically selected as $Q = 1$ such that $e_\infty$ is independent of $r$ and repetitive disturbances. Accordingly, $f_\infty$ of SILC in (6) reduces to
\[
f_\infty = P^{-1} r - \delta_\infty, \tag{21}
\]
with the uncertain term $\delta_\infty \approx \lim_{j\to\infty} LT\, d_j$. Suppose that $P^{-1}(z)$ belongs to the model set of $F(\theta, z)$. The optimal parameters $\theta^*$ of $F(\theta, z)$ satisfy
\[
f_\infty(k) = F(\theta^*, z)\, r(k) - \delta_\infty(k), \tag{22}
\]
with $F(\theta^*, z) = P^{-1}(z)$. Inserting (20) into (22) gives (23), where $k$ denotes the sampling index. Equation (22) suggests that $F$ can be designed based on a specific $r$ and the corresponding $f_\infty$ learned by SILC. As a result, no parametric model is required for DOB. Like $F$, the feedforward controller of two-degree-of-freedom control is aimed at approximating $P^{-1}$ [36]. In iterative feedforward tuning, $r$ and the corresponding tracking error are exploited to iteratively tune the feedforward controller [37][38][39]. The effectiveness of iterative feedforward tuning indicates that $r$ is generally persistently exciting for the model-free design of $F$, which is detailed in the sequel.
Assume that d j is given by d j = H (z ) j , where H (z ) is stable, and is white noise with zero mean and variance 2 . Given the parametrization (20), the parameters of F are determined according tô * where the optimization criterion J ( ) is defined as with instrumental variables (k) ∈ ℝ n and a stable pre-filter R(z ). N denotes the number of the sampled data, and depends on the sampling length of r. As J ( ) is a quadratic function in , the optimization problem (25) can be analytically solved, which giveŝ *  (23) is not white noise. If̄(k) T R (k) is non-singular, and (k) ∞ (k) = 0, we havê * → * as N → ∞ [40]. The asymptotic distribution of̂ * is given by with the covariance matrix P . P depends on the choices of (k) and R(z ). The lower bound of P , i.e. P opt , is given by with (k, * ) The choices of (k) and R(z ) in the IV estimate method primarily affect the asymptotic variance, while the unbiasedness and consistency properties are generally secured. However, H (z ), P −1 (z ) and thus F ( * , z) are unknown. Therefore, the optimal opt (k) and R opt (z ) cannot be directly designed from H (z ) and * . We suggest the following four-step IV estimate method to achieve an unbiased estimate of * with approximately minimum variance.
Step 1: Estimate with the least-squares (LS) method. That is, the estimatêI are determined by minimizing the following criterion: and is given bŷ Since A( * , z) ∞ in (23) is not white noise,̂I is a biased estimate of * , but is used to construct (k) in the following Step 2.
Step 3: Let We have from (22) that with ∞ (k) Δ = lim j →∞ j . Postulate an autoregression model of order n R for II (k):R wherê(t ) is zero-mean white noise. Estimate $\hat{R}(z)$ using, for instance, the Burg method [43]. The resulting $\hat{R}(z)$ is a good approximation of $R_{opt}(z)$.
Step 4: Set $x_{II}(k) = F(\hat{\theta}_{II}, z)\, r(k)$, and form the corresponding instrumental variables. Since $\hat{\theta}_{II}$ is an unbiased estimate of $\theta^*$, these instrumental variables approximate the optimal ones. Using them together with $\hat{R}(z)$, determine the final estimate $\hat{\theta}^*$.

Remark 2. In order to decrease the effect of $d$ on the variance of the estimate $\hat{\theta}^*$, $f_\infty$ can be obtained by averaging the feedforward signals from multiple iterations after SILC converges. Furthermore, the resulting $F(\hat{\theta}^*, z)$ is non-causal (see (20)). $F(\hat{\theta}^*, z)$ is therefore replaced with $z^{-1}F(\hat{\theta}^*, z)$ to ensure a causal implementation of the $F$ filter. Correspondingly, the disturbance estimate error $\tilde{d}$ increases slightly due to the one-step delay of $F$.
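Why the least-squares estimate of Step 1 is biased, and how instrumental variables remove that bias, can be seen in a small numerical sketch. It is not the four-step method itself: it covers only a least-squares step followed by one basic IV step, uses no pre-filter $R(z)$, and assumes a hypothetical first-order rational filter with made-up coefficients `theta_true`, a white reference, and white noise added to the learned feedforward signal. The instruments are built from a noise-free simulation driven by the reference, which is one common choice and is assumed here.

```python
import numpy as np

def simulate(theta, r):
    """Noise-free output of x(k) = a*x(k-1) + b0*r(k) + b1*r(k-1)."""
    a, b0, b1 = theta
    x = np.zeros(len(r))
    x[0] = b0 * r[0]
    for k in range(1, len(r)):
        x[k] = a * x[k - 1] + b0 * r[k] + b1 * r[k - 1]
    return x

rng = np.random.default_rng(1)
n = 20000
theta_true = np.array([0.5, 1.0, -0.8])   # hypothetical filter coefficients
r = rng.standard_normal(n)                 # reference, assumed white for excitation
f = simulate(theta_true, r) + 0.5 * rng.standard_normal(n)  # noisy learned signal

# Linear regression f(k) = phi(k)^T theta + v(k), phi(k) = [f(k-1), r(k), r(k-1)]
Phi = np.column_stack([f[:-1], r[1:], r[:-1]])
y = f[1:]

# Least squares: biased, because f(k-1) is correlated with the equation error v(k)
theta_ls, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Basic IV: instruments use a simulated, noise-free signal instead of f(k-1)
x = simulate(theta_ls, r)
Z = np.column_stack([x[:-1], r[1:], r[:-1]])
theta_iv = np.linalg.solve(Z.T @ Phi, Z.T @ y)

print("true:", theta_true)
print("LS:  ", np.round(theta_ls, 3))
print("IV:  ", np.round(theta_iv, 3))
```

In this setup the least-squares estimate of the first coefficient is clearly off, while the IV estimate lands close to the true values; the pre-filtering and refinement steps of the text target the remaining variance.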

EILC BY USING MODEL-FREE DOB
In Section 4, a model-free DOB is obtained by using the reference $r$ and the corresponding asymptotic feedforward signal $f_\infty$ learned by SILC. Considering the effects of non-repetitive disturbances, EILC consisting of SILC and a model-free DOB is developed in this section, constituting the second contribution of this paper. In the proposed EILC, SILC learns the feedforward signals for a specific $r$ and hence eliminates the tracking error induced by $r$, while the model-free DOB compensates for external disturbances. The overall control diagram of the EILC system is depicted in Figure 3. As DOB is introduced into the SILC system, the system dynamics are slightly changed. Therefore, $f_\infty$ learned by SILC is no longer optimal for the EILC system. Hence, the feedforward signals should be relearned by the proposed EILC:
\[
f'_{j+1} = Q\,(f'_j + L e'_j).
\]
The superscript $'$ means that the corresponding signals come from the EILC system. Similar to the SILC system, the sufficient condition for the asymptotic stability of the EILC system is given by
\[
\| Q(1 - LT') \|_\infty < 1,
\]
with $T' = P'/(1 + P'C)$. Since $F(\hat{\theta}^*)$ obtained by the four-step IV estimate method accurately matches $P^{-1}$, we have from (18) that $P' \approx P$. Therefore, the robustness filter and learning filter of the EILC system can be selected the same as for the SILC system. Furthermore, due to $P' \approx P$, the asymptotic feedforward signals $f'_\infty$ of EILC are expected to differ little from $f_\infty$ of SILC. Thus, the initial feedforward $f'_0$ of EILC can be selected as $f_\infty$, i.e. $f'_0 = f_\infty$, such that a fast convergence rate can be achieved.

FIGURE 3
Enhanced ILC configuration
The design procedure of the proposed EILC scheme is summarized as follows:
1) Run the SILC system as shown in Figure 1 until convergence, and record its asymptotic feedforward signals $f_\infty$;
2) Given $f_\infty$ and $r$, obtain $F(\hat{\theta}^*)$ by the four-step IV estimate method;
3) With the DOB introduced into the closed-loop system, run the EILC system (see Figure 3) with $f'_0 = f_\infty$ until convergence, which yields the optimal feedforward signals $f'_\infty$.
Next, the performance of the proposed EILC scheme is discussed. Note that z −1 F (̂ * , z) is practically used for the disturbance observation in the proposed EILC system. Analogous to e ∞ of the SILC system, the asymptotic error e ′ ∞ of the EILC system is derived as Therefore, S ′ ≈ S and T ′ ≈ T . These subsequently lead to G ′ re ≈ G re , It is observed that the reference-induced error of the proposed EILC is comparable to that of SILC. In case of Q = 1, the asymptotic error of both EILC and SILC are independent of r. As 1 − z −1 Θ is close to zero at low frequency, the capacity for suppressing the low-frequency disturbances is dramatically improved by EILC in comparison with SILC.
The performance improvement of EILC can be analysed from another perspective. From (16), the disturbance estimate error $\tilde{d}'$ of the EILC system is rewritten as which approaches the disturbance-induced part of $e_\infty$ when the SILC system is disturbed by $(1 - z^{-1}\Theta)d$. In conclusion, the proposed EILC enhances the capacity to suppress non-repetitive disturbances in comparison with SILC. Furthermore, owing to the ability to compensate for $\tilde{d}'_{INV}$ induced by the approximation error $(F - P^{-1})$, the EILC system outperforms the closed-loop system equipped with only DOB but without ILC.
Remark 3. For a specific reference $r$, a model-free DOB and the feedforward signals $f'_\infty$ can be obtained by following the EILC design procedure. $f'_\infty$ depends on $r$. When the EILC system is employed to track another reference $\bar{r}$, the feedforward signals should be relearned. From (21), the relearned $\bar{f}'_\infty$ for $\bar{r}$ satisfies $\bar{f}'_\infty \approx (P')^{-1}\bar{r}$, up to the disturbance-induced term. Since $F(\hat{\theta}^*)$ accurately approximates $P^{-1}$, we have $P' \approx P$, and thus $F(\hat{\theta}^*) \approx (P')^{-1}$, which yields $\bar{f}'_\infty \approx F(\hat{\theta}^*, z)\bar{r}$. Therefore, in the feedforward relearning for $\bar{r}$, we can select the initial feedforward $\bar{f}'_0 = F(\hat{\theta}^*, z)\bar{r}$ to speed up the convergence. This is an additional advantage of EILC.

FIGURE 4
Industrial three-axis mechatronic motion system

NUMERICAL SIMULATION
In this section, numerical simulation is performed to illustrate the performance enhancement of the proposed EILC approach for motion systems executing repetitive tasks but suffering from task-varying disturbances. This constitutes the last contribution of this paper.

Simulation setup
Numerical simulation is performed on the X-direction of an industrial three-axis ball-screw-driven system [44], as shown in Figure 4. Driven by three Panasonic servo motors, the end operator can move along three logical axes: X, Y, and Z. The nominal model of X -direction is given by The corresponding discrete-time transfer function is given by with the sampling interval $T_s = 0.2$ ms. $C(z)$ is selected as a PID controller. By on-site tuning, the proportional, integral, and derivative gains are selected as 62.5, 500, and 0.1, respectively. Data-driven feedback control methods, including iterative feedback tuning (IFT) [45,46], virtual reference feedback tuning (VRFT) [47], and correlation-based tuning (CbT) [48], can be employed to further optimize the parameters of $C$ with respect to a performance-relevant cost function without modelling the plant; this is beyond the scope of this paper. The motion system is excited by a fourth-order reference $r$. Figure 5 shows the position, velocity, and acceleration setpoints of $r$. The range of $r$ is 55 mm. A task-varying disturbance d j = H (z ) j is added to the The magnitude of $d$ is within ±0.025. First of all, SILC is performed on the system. With the initial feedforward $f_0 = 0$, a total of 10 iterations are conducted. The iteration history of SILC in terms of Figure 6. It is shown that SILC converges after four iterations. After $V_j$ decreases from $1.3 \times 10^{-4}$ mm$^2$ to about $2.6 \times 10^{-8}$ mm$^2$, the performance of the SILC system cannot be further improved due to the existence of task-varying disturbances.
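For readers who want to reproduce the feedback part of this setup, a discrete PID controller with the stated gains and sampling interval might be sketched as follows. The paper gives only the three gains and $T_s$; the discretization used here (rectangular integration and a backward-difference derivative, with no derivative filter) and the helper name `make_pid` are assumptions for illustration.

```python
def make_pid(kp=62.5, ki=500.0, kd=0.1, ts=0.2e-3):
    """Return a stateful PID step function (assumed discretization, see text)."""
    state = {"integral": 0.0, "prev_error": 0.0}

    def step(error):
        state["integral"] += error * ts                    # rectangular integration
        derivative = (error - state["prev_error"]) / ts    # backward difference
        state["prev_error"] = error
        return kp * error + ki * state["integral"] + kd * derivative

    return step

pid = make_pid()
outputs = [pid(1.0) for _ in range(3)]   # response to a constant unit error
print(outputs)
```

With a unit error applied from rest, the first output is dominated by the derivative term ($k_d / T_s = 500$), after which only the proportional and slowly growing integral terms remain.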
f ∞ of SILC is computed by averaging the feedforward signals f 5 ∼ f 10 , and shown in the top figure of Figure 7. The F filter of DOB is parametrized as in (20) with n a = 1 and n b = 1. Straightforward computations reveal that the optimal parameters * of F ( , z) are given by * =

Figure 8. It is clear that $\hat{\theta}_I$ obtained by the LS method in Step 1 is biased, and $F(\hat{\theta}_I, z)$ cannot capture the dynamics of $P^{-1}(z)$ above 150 Hz. Based on $\hat{\theta}_I$, $\hat{\theta}_{II}$ is obtained by the common IV method in Step 2. Figure 8 shows that $F(\hat{\theta}_{II}, z)$ approximates $P^{-1}(z)$ better than $F(\hat{\theta}_I, z)$, but there is still a large phase deviation between $F(\hat{\theta}_{II}, z)$ and $P^{-1}(z)$ at high frequency. The extra work of Steps 3 and 4 of the four-step IV method serves to decrease the variance of $\hat{\theta}^*$ in comparison with $\hat{\theta}_{II}$. As a result, $\hat{\theta}^*$ is a better estimate of $\theta^*$. Figure 7 shows that $F(\hat{\theta}^*, z)r$ fits $f_\infty$ of SILC well, and the deviation $(f_\infty - F(\hat{\theta}^*, z)r)$ is within ±0.0065. This illustrates the high estimate accuracy of $\hat{\theta}^*$ from another perspective (see (21) and (22)). In summary, the four-step IV method achieves the best estimate of $\theta^*$ in comparison with the LS method and the common IV method. The accurate match between $F(\hat{\theta}^*, z)$ and $P^{-1}(z)$ verifies the effectiveness of the proposed model-free design method for DOB.

After introducing the model-free DOB, the proposed EILC is performed on the system. The initial feedforward signals $f'_0$ of EILC are selected as $f'_0 = f_\infty$. As shown in Figure 6, $e'_0$ of EILC is much smaller than $e_0$ of SILC, and EILC converges faster than SILC. This demonstrates that the selection of $f'_0 = f_\infty$ for EILC is appropriate. Next, the disturbance estimate results of the EILC system are analysed. Taking iteration 10 as an example, Figure 9 shows that $d'_{10}$ can be well estimated by the DOB of EILC, since $P^{-1}(z)$ is accurately described by $F(\hat{\theta}^*, z)$. The disturbance estimate error d In the proposed EILC approach, the iteration-varying $d'$ is estimated and mostly compensated for by the model-free DOB. As shown in Figure 10, the disturbance estimate error $\tilde{d}'_j$ of DOB is dominated by the iteration-invariant part $\tilde{d}'_{INV}$, which is further compensated for by the iterative learning process. Therefore, EILC can address the iteration-varying disturbances well.
Figure 11 shows the magnitude-frequency responses of $G_{de}$ and $G'_{de}$, which represent the transfer functions from the disturbance to the asymptotic tracking error of the SILC system and the EILC system, respectively. In the frequency range below 200 Hz, the disturbance rejection capability is dramatically improved by EILC in comparison with SILC, and EILC does not degrade the high-frequency disturbance rejection characteristics. The improved disturbance rejection capability helps enhance the performance of EILC. As shown in Figure 6, $V_j$ of the converged EILC and the converged SILC are about $5.5 \times 10^{-10}$ mm$^2$ and $2.6 \times 10^{-8}$ mm$^2$, respectively. Hence, contrary to SILC, EILC overcomes the performance limitations imposed by the iteration-varying disturbances. The enhanced performance of EILC is also illustrated by the time-domain tracking error at iteration 10 (see Figure 12). The peak magnitude of the tracking error of EILC, which is $5.9 \times 10^{-5}$ mm, equals 14.1% of that of SILC.
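The reported figures can be put into relative terms with a short calculation. The four input numbers below are taken from the text above; the two derived quantities (the ratio of converged $V_j$ values and the SILC peak error implied by the 14.1% figure) are simple arithmetic on them and are not values stated in the paper.

```python
v_silc = 2.6e-8       # converged V_j of SILC, mm^2 (from the text)
v_eilc = 5.5e-10      # converged V_j of EILC, mm^2 (from the text)
peak_eilc = 5.9e-5    # peak tracking error of EILC at iteration 10, mm (from the text)
peak_ratio = 0.141    # EILC peak as a fraction of the SILC peak (from the text)

v_ratio = v_eilc / v_silc             # relative size of the converged V_j
peak_silc = peak_eilc / peak_ratio    # SILC peak error implied by the ratio

print(f"V_j ratio EILC/SILC: {v_ratio:.3f}")
print(f"implied SILC peak error: {peak_silc:.2e} mm")
```

In other words, the converged $V_j$ of EILC is roughly 2% of that of SILC, and the implied SILC peak error is roughly $4.2 \times 10^{-4}$ mm.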

CONCLUSION
This paper has proposed a novel EILC approach using a model-free DOB for precision motion systems that perform repeating tasks but encounter non-repetitive disturbances. The proposed approach exhibits the following advantages: (1) no need for a parametric model to obtain an unbiased estimate of the F filter of DOB; Simulation results on a three-axis ball-screw-driven system illustrate the effectiveness of the model-free design method for DOB, and the improved performance of the proposed EILC in comparison with SILC. We envision that the ease of implementation and effectiveness make the proposed approach highly suitable for industrial applications.