Critical parameters for the robust stabilization of the inverted pendulum with reaction delay: State feedback versus predictor feedback

The critical length that limits stabilizability for delayed proportional-derivative-acceleration (PDA) feedback and for predictor feedback (PF) is analyzed for the inverted pendulum paradigm. The aim of this work is to improve the understanding of human balancing tasks such as stick balancing on the fingertip, which can be modeled as a pendulum-cart system. The relation between the critical length of the balanced stick and the reaction time delay in the presence of sensory uncertainties, which are modeled as static parameter perturbations in the control gains, is investigated rigorously. Robust stabilizability analysis is performed using the real structured stability radius. Performance is assessed by the length of the shortest pendulum (critical length) that can still be balanced for a fixed reaction delay. For both PDA feedback and PF control with delay mismatch, it is observed that the relation between the critical length and the reaction delay remains quadratic in the presence of perturbations of fixed size on the control gains. Numerical comparison shows that predictor feedback is superior to PDA feedback in terms of critical length: a shorter pendulum can be balanced by PF than by PDA feedback for the same reaction delay and for the same static parameter perturbation. Furthermore, it is found that both control concepts are more sensitive to a change in the feedback delay than to the same relative change in the parameter uncertainties. Interpreted for human balancing, this suggests that it is more challenging for the nervous system to cope with reaction delay than with sensory uncertainties.


MOTIVATION AND PROBLEM STATEMENT
The mechanical model for stick balancing on the fingertip is a pendulum-cart system, where the cart incorporates the inertia of the human arm segments during motion. The equation governing the motion of the pendulum reads
$$\ddot{\theta}(t) - b\,\theta(t) = -u(t-\tau), \tag{1}$$
where $\theta$ is the angular displacement of the pendulum with respect to the vertical axis, $b$ is the system parameter, $u(t)$ is the control action, and $\tau$ is the reaction time delay. When the mass of the cart is significantly larger than the mass of the pendulum (which is the case in stick balancing on the fingertip), then $b = \frac{3g}{2L}$, where $L$ is the length of the pendulum and $g$ is the gravitational acceleration. 29,34 Note that (1) is equivalent to the governing equation of a pinned pendulum. The control action in case of delayed PD feedback reads
$$u(t) = p\,\theta(t) + d\,\dot{\theta}(t), \tag{2}$$
where $p$ and $d$ are the proportional and the derivative feedback gains, respectively. It can be shown that stabilization of (1) with (2) is not possible if the feedback delay is larger than
$$\tau_{\mathrm{crit}} = \sqrt{\frac{2}{b}} = \sqrt{\frac{4L}{3g}}. \tag{3}$$
The same phenomenon can be phrased in an alternative way: for a fixed feedback delay $\tau$, a stick shorter than
$$L_{\mathrm{crit}} = \frac{3}{4}\,g\,\tau^2 = c_{0,\mathrm{PD}}\,\tau^2 \tag{4}$$
cannot be stabilized by delayed PD feedback. 2,7 The regression coefficient $c_{0,\mathrm{PD}} = \frac{3}{4}g$ will be used as a reference value for further analysis.
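The nominal PD relation $L_{\mathrm{crit}} = \frac{3}{4} g \tau^2$ can be checked numerically with a few lines of Python. This is only an illustrative sketch: the value $g = 9.81$ m/s$^2$ and the function names are assumptions made here, not part of the original analysis.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def critical_length_pd(delay_s: float) -> float:
    """Shortest stick (in m) that ideal delayed PD feedback can balance."""
    return 0.75 * G * delay_s ** 2


def critical_delay_pd(length_m: float) -> float:
    """Largest delay (in s) for which a stick of the given length is stabilizable."""
    b = 3.0 * G / (2.0 * length_m)  # system parameter b = 3g / (2L)
    return math.sqrt(2.0 / b)


for tau in (0.20, 0.25):
    print(f"tau = {tau:.2f} s -> L_crit = {100 * critical_length_pd(tau):.1f} cm")
```

For reaction delays of 0.20 s and 0.25 s the sketch gives critical lengths of roughly 29 cm and 46 cm.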
In human stick balancing, the reaction time delay is around 200-250 ms. Equation (4) implies that sticks shorter than 29-46 cm cannot be stabilized by control law (2). Indeed, humans typically cannot balance sticks shorter than 30 cm on their fingertip. Still, this argument does not establish that the human nervous system employs a delayed PD feedback mechanism, since (4) is valid only for an ideal feedback without any parameter uncertainty and without any noise. One would expect that parameter uncertainties increase the difficulty of the balancing task, that is, increase the critical length. However, determining the relation between the reaction delay and the stick length experimentally is not trivial. In Reference 35, 27 human subjects were involved in a series of virtual stick balancing experiments. Subjects had to balance a pendulum on a computer screen using the computer mouse as manipulator. The setup for the experiments is described in Reference 36. The reaction time delay was changed by adding extra delays in steps of 50 ms in the computer code. Subjects' own reaction time delay was measured by a complex reaction time tester instrument. 37 Subjects balanced sticks of different lengths, and the shortest stick that they were able to balance for a given added delay was determined in a systematic manner. The critical lengths versus the overall delay (own delay + added delay) are shown in Figure 1. The quadratic function
$$L_{\mathrm{crit}} = c\,\tau^2 \tag{5}$$
was fitted for each subject individually by the least-squares method. Three sample fitted curves are shown in Figure 1(A) in red, blue, and black. The black dashed curve shows the critical length (4) associated with the ideal feedback. The red solid line in Figure 1(B) shows the average of the curves fitted to the measurements of 26 subjects. Note that one outlier subject was removed from the set compared with Reference 35.
It can be seen that the measured critical length is above the theoretical curve, which may imply that parameter uncertainties (due to sensory uncertainties, for instance) play a crucial role in the stabilization process. The average regression coefficient in (5) was obtained to be $c = 25.2342$ m/s$^2$, which is larger than the reference regression coefficient $c_{0,\mathrm{PD}} = \frac{3}{4}g = 7.36$ m/s$^2$ in (4). The goal of this article is to determine whether parameter uncertainties can link the two critical lengths in (4) and in (5) via the constant $c$. Parameter uncertainties in stick balancing primarily originate from the imperfections of the sensory system. In case of $\Delta\theta = \delta_p\,\theta$ error in the perception of the angular position and $\Delta\dot{\theta} = \delta_d\,\dot{\theta}$ error in the perception of the angular velocity, the control action changes to
$$u(t) = p\,\big(\theta(t) + \Delta\theta(t)\big) + d\,\big(\dot{\theta}(t) + \Delta\dot{\theta}(t)\big) = (1+\delta_p)\,p\,\theta(t) + (1+\delta_d)\,d\,\dot{\theta}(t). \tag{6}$$
Thus, uncertainties in the sensory perception can also be represented as uncertainties in the control gains. Here, we restrict our analysis to static uncertainties. Brute-force worst-case analyses over a predefined grid of control and system parameters were performed numerically in Reference 30 and in Reference 38 for delayed PD, delayed PDA, and predictor feedback. It was found that predictor feedback is superior to PD feedback even for large (>50%) static uncertainties and superior to PDA feedback for uncertainties less than ≈ 12%, in the sense that the corresponding critical length is smaller. Here, a similar analysis is performed using the concept of stability radius as a more precise and sophisticated measure of robust stability to real structured perturbations, according to Reference 39. First, delayed PDA feedback is discussed; then predictor feedback is investigated.

PROPORTIONAL-DERIVATIVE-ACCELERATION FEEDBACK
The control action in case of delayed PDA feedback reads
$$u(t) = p\,\theta(t) + d\,\dot{\theta}(t) + a\,\ddot{\theta}(t), \tag{7}$$
where $a$ is the acceleration control gain. Although this controller is often considered to be a kind of predictor feedback in the sense that the position is predicted as a linear combination of its first and second derivatives, it still cannot eliminate the delay in the feedback loop. On the other hand, it is possible to design a stable controller if the stick is sufficiently long, by choosing proper values for the feedback gains $p$, $d$, $a$. The closed loop of (1) with (7) forms a neutral delay differential equation (NDDE), and the necessary condition for stability is $|a| < 1$. 15 Robust stability analysis to real structured perturbations of a similar system modeling postural sway in the frontal plane during quiet standing was performed in Reference 39. Here we employ the same steps and adapt the method to the robust stabilizability analysis of (1) with (7). The characteristic equation of the perturbed system can be written in the form
$$\tilde{D}(\lambda) = \lambda^2 - (\hat{b} + \delta b) + \Big( (\hat{p} + \delta p) + (\hat{d} + \delta d)\,\lambda + (\hat{a} + \delta a)\,\lambda^2 \Big) e^{-\lambda\tau} = 0,$$
where $\hat{\square}$ are the nominal values and $\delta\square$ are the uncertainties of the parameters $b$, $p$, $d$, $a$, respectively. The uncertainties $\delta a$, $\delta d$, $\delta p$, $\delta b$ are collected in the perturbation vector $\delta$ after scaling by the nominal values and by the weights $w_\square$ of the uncertainty. In this article, all the weights of the control gains are taken to be one, $w_p = w_d = w_a = 1$, that is, the uncertainties in the control gains relative to the nominal values are the same. As explained by (6), these perturbations describe the uncertainties in the sensory perception of the position, velocity, and acceleration. During stick balancing on the fingertip, the length of the stick is a well-defined parameter; thus, no perturbation is allowed in $b$, and the corresponding perturbation weight is set as $1/w_b = 0$. The stability of the nominal system can be investigated analytically via the roots of the nominal characteristic equation $D(\lambda) = 0$ by using the D-subdivision method.
For a given $\tau$ and $b$, this gives a set of conditions on $p$, $d$, and $a$ for which the nominal system is stable. As one increases the time delay in case of a fixed system parameter $b$ (i.e., fixed length of the pendulum), the stable domain in the three-dimensional space $(p, d, a)$ of the control parameters shrinks. If the delay reaches a critical value, then the stable region becomes a single parameter point, where the system has a zero root with triple multiplicity. This happens when
$$\hat{p} = b, \qquad \hat{d} = b\,\tau, \qquad b\,\tau^2 = 2\,(1 + \hat{a}).$$
The corresponding minimal admissible length for the inverted pendulum that can be stabilized by PDA feedback with a given delay is
$$L_{\mathrm{crit}} = \frac{3g}{4\,(1+\hat{a})}\,\tau^2 = \frac{c_{0,\mathrm{PD}}}{1+\hat{a}}\,\tau^2. \tag{9}$$
For $0 < \hat{a} < 1$, PDA feedback outperforms PD feedback by a factor $\frac{1}{1+\hat{a}}$ in terms of the critical length that can be stabilized for a given feedback delay. Uncertainties in the system and control parameters decrease the performance of the controller. When real perturbations are allowed to the parameters $b$, $p$, $d$, and $a$, then the real structured stability radius can be used as a measure of robustness. Following Reference 39, the real structured stability radius $r_{\mathbb{R}}$ can be defined and evaluated by (12), where $\sigma_2(\cdot)$ is the second largest singular value of the resolvent matrix. For each stable operating point, one can calculate a stability radius using (12). Figure 2(A) shows an example of the stability radius in the plane of the control gains $(p, d)$. The curve associated with $r_{\mathbb{R}} = 0$ corresponds to the stability boundary. The contour curves $r_{\mathbb{R}} = \mathrm{const}$ bound the regions in the parameter space where the system is stable for any perturbation satisfying $\|\delta\| < r_{\mathbb{R}}$. In other words, $r_{\mathbb{R}}$ is the size of the smallest perturbation that may cause a loss of stability. Figure 2(B) shows the pseudo-spectra for a given control gain triplet $(p, d, a) = (25, 6, 0.2)$. The corresponding stability radius is $r_{\mathbb{R}} = 0.2941$. Since the weights of the control gains were set to $w_p = w_d = w_a = 1$, the stability radius directly gives the allowed relative perturbation in the control gains.
This means that for $(p, d, a) = (25, 6, 0.2)$, loss of stability may arise if the control gains $p$, $d$, and $a$ are perturbed by 29.41%. The contour curve of the corresponding pseudo-spectrum is shown by a red line.
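The triple zero root that marks the nominal critical point of the PDA loop can be verified with a short calculation. The sketch below assumes the nominal characteristic function $D(\lambda) = \lambda^2 - b + (p + d\lambda + a\lambda^2)e^{-\lambda\tau}$, that is, a sign convention in which positive gains are stabilizing; this convention and the helper names are assumptions for illustration. It expands $D$ around $\lambda = 0$ and checks that the first three Taylor coefficients vanish at the critical point.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def char_taylor_coeffs(b, p, d, a, tau):
    """Coefficients of 1, lam, lam^2 in the expansion of
    D(lam) = lam^2 - b + (p + d*lam + a*lam^2) * exp(-lam*tau) at lam = 0."""
    c0 = p - b
    c1 = d - p * tau
    c2 = 1.0 + a - d * tau + 0.5 * p * tau ** 2
    return c0, c1, c2


def pda_critical_point(b, a):
    """Gains (p, d) and delay tau at which all three coefficients vanish."""
    tau = math.sqrt(2.0 * (1.0 + a) / b)
    return b, b * tau, tau


def critical_length_pda(tau, a):
    """Nominal critical length for PDA feedback, using b = 3g / (2L)."""
    return 0.75 * G * tau ** 2 / (1.0 + a)


L_crit = critical_length_pda(0.2, 0.2)  # illustrative delay and acceleration gain
b = 1.5 * G / L_crit
p, d, tau = pda_critical_point(b, 0.2)
print(L_crit, tau, char_taylor_coeffs(b, p, d, 0.2, tau))
```

With $a = 0$ the same functions reproduce the delayed PD result, so the factor $1/(1+a)$ between the two critical lengths can be read off directly.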
Similarly to the nominal model, one can determine the critical length in the case when uncertainties are present. The stability radii have a clear maximum within the region of stabilizing control gain parameters. First, the stable region was determined numerically using the semidiscretization method. Then the maximum of the stability radius for a given time delay and pendulum length was determined by evaluating (12) over a domain of control gains. The procedure was repeated for a series of time delays and pendulum lengths. It should be noted that stabilizability is lost only once when the delay is swept from zero to infinity, that is, once the stable region disappears at the critical delay, no stable region emerges for larger delays. The critical length associated with different stability radii is shown as a function of the delay in Figure 3(A,C) for $a = 0$ (PD feedback) and $a = 0.2$ (PDA feedback), respectively. It can be seen that the critical length is smaller, independently of the size of the perturbation, when acceleration is fed back. The case $r_{\mathbb{R}} = 0$ corresponds to the nominal case where the relation between $L_{\mathrm{crit}}$ and $\tau$ is given by (9). When $r_{\mathbb{R}} > 0$, the relation between $L_{\mathrm{crit}}$ and $\tau$ remains quadratic, and the function
$$L_{\mathrm{crit}} = c_{\mathrm{PDA}}(r_{\mathbb{R}})\,\tau^2 \tag{14}$$
can be fitted to the numerical results with a coefficient of determination $R^2 > 0.993$; the small residual error is attributed to the limited precision of the computation. Figure 3(B) shows the ratio $c_{\mathrm{PDA}}(r_{\mathbb{R}})/c_{0,\mathrm{PD}}$ for different types of perturbations: when only $p$ is perturbed ($w_p = 1$, $1/w_d = 1/w_a = 0$, red), when only $d$ is perturbed ($w_d = 1$, $1/w_p = 1/w_a = 0$, green), and when both $p$ and $d$ are perturbed ($w_p = w_d = 1$, $1/w_a = 0$, blue). For the combined perturbation of $p$ and $d$, the critical length is larger compared with the case when only one of the gains is perturbed. The same tendency can be observed for PDA feedback with different acceleration gains $a$, as shown in Figure 3(D).
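The quadratic regression used for the critical length curves can be sketched as a one-parameter least-squares fit of $L_{\mathrm{crit}} = c\,\tau^2$. The data below are synthetic and noise-free, generated from the nominal PD law purely for illustration; they are not the computed results of this study. The function name and the use of the usual centered definition of $R^2$ are assumptions as well.

```python
def fit_quadratic_through_origin(delays, lengths):
    """Least-squares fit of L = c * tau**2; returns (c, R^2)."""
    tau_sq = [t * t for t in delays]
    # Closed-form minimizer of sum (L_i - c * tau_i^2)^2
    c = sum(x * y for x, y in zip(tau_sq, lengths)) / sum(x * x for x in tau_sq)
    mean_len = sum(lengths) / len(lengths)
    ss_res = sum((y - c * x) ** 2 for x, y in zip(tau_sq, lengths))
    ss_tot = sum((y - mean_len) ** 2 for y in lengths)
    return c, 1.0 - ss_res / ss_tot


# Synthetic data from L = (3/4) g tau^2 with g = 9.81 m/s^2 (assumed value)
G = 9.81
delays = [0.10, 0.15, 0.20, 0.25, 0.30]
lengths = [0.75 * G * t * t for t in delays]

c_fit, r_squared = fit_quadratic_through_origin(delays, lengths)
print(f"c = {c_fit:.4f} m/s^2, R^2 = {r_squared:.6f}")
```

Dividing a fitted coefficient by $c_{0,\mathrm{PD}} = \frac{3}{4}g$ gives the kind of ratio plotted in Figure 3(B,D).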
Similarly to the nominal system, the critical length for the PDA controller is smaller than that of the PD controller. However, this benefit vanishes when the acceleration gain increases, due to the strong stability condition $|\hat{a} + \delta a| < 1$. When $\hat{a} = 0.8$, a 25% perturbation ($r_{\mathbb{R}} = 0.25$) already results in an unstable system due to the violation of the strong stability condition; this explains the vertical jump at $r_{\mathbb{R}} = 0.25$. In general, PDA feedback will become unstable if $r_{\mathbb{R}} \ge \frac{1-|\hat{a}|}{|\hat{a}|}$. Figure 3(B,D) both show that $c_{\mathrm{PDA}}(r_{\mathbb{R}})/c_{0,\mathrm{PD}}$ is a monotone increasing function of the stability radius $r_{\mathbb{R}}$. Hence, parameter uncertainties indeed impair the performance of the control system (i.e., increase the critical length).
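The bound imposed by the strong stability condition is simple to evaluate. The sketch below assumes that the acceleration gain is perturbed relative to its nominal value, so that the worst case is $|\hat{a}|(1 + r) < 1$; the function name is illustrative.

```python
import math


def max_relative_perturbation(a_nominal: float) -> float:
    """Supremum of r for which |a_nominal| * (1 + r) < 1 still holds."""
    a_abs = abs(a_nominal)
    if a_abs == 0.0:
        return math.inf  # pure PD feedback: the condition never binds
    return (1.0 - a_abs) / a_abs


for a_hat in (0.2, 0.5, 0.8):
    print(f"a = {a_hat}: r < {max_relative_perturbation(a_hat):.2f}")
```

For $\hat{a} = 0.8$ the bound evaluates to 0.25, which matches the location of the vertical jump mentioned above.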

PREDICTOR FEEDBACK
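Predictor feedback (also known as finite spectrum assignment or modified Smith predictor) uses an internal model to predict the behavior of the system over the delay interval. In case of a perfectly matching internal model and a perfect prediction, the delay can be compensated and the system becomes finite dimensional. For the description of predictor feedback, the system is written in the first-order form
$$\dot{x}(t) = A\,x(t) + B\,u(t-\tau), \tag{15}$$
where $x = [\theta \;\; \dot{\theta}]^T$ is the state and $A$, $B$ are the system and input matrices following from (1). Prediction is based on the internal model, which is written in the form
$$\dot{y}(s) = \tilde{A}\,y(s) + \tilde{B}\,u(s-\tilde{\tau}),$$
where $\tilde{A}$, $\tilde{B}$ are the estimated system and input matrices and $\tilde{\tau}$ is the estimated feedback delay. This equation with initial value $y(t) = x(t)$ can be solved over the interval $[t, t+\tilde{\tau}]$, which gives the prediction of the state $x$ at time instant $t+\tilde{\tau}$ as
$$x_p(t+\tilde{\tau}) = e^{\tilde{A}\tilde{\tau}}\,x(t) + \int_{-\tilde{\tau}}^{0} e^{-\tilde{A}s}\,\tilde{B}\,u(t+s)\,\mathrm{d}s.$$
The predictor feedback action reads
$$u(t) = K\,x_p(t+\tilde{\tau}), \tag{19}$$
where $K = [p \;\; d]$ is the matrix of the control gains. In the next steps, different stability concepts are described.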
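The prediction step can be illustrated with a small numerical sketch. It assumes the internal model $\dot{y} = \tilde{A}y + \tilde{B}u$ with $\tilde{A} = [[0, 1], [\tilde{b}, 0]]$ and $\tilde{B} = [0, -1]^T$ (the sign of the input matrix is an assumption tied to a convention in which positive gains are stabilizing), evaluates the variation-of-constants formula with the trapezoidal rule over stored input samples, and uses the closed-form matrix exponential of the pendulum model. All names are hypothetical.

```python
import math


def expm_pendulum(b, t):
    """exp(A t) for A = [[0, 1], [b, 0]] with b > 0 (closed form)."""
    w = math.sqrt(b)
    ch, sh = math.cosh(w * t), math.sinh(w * t)
    return [[ch, sh / w], [w * sh, ch]]


def predict_state(x, u_hist, b_est, tau_est, input_sign=-1.0):
    """Predict the state at t + tau_est from x = [theta, theta_dot] at time t.

    u_hist holds equally spaced samples u(t - tau_est), ..., u(t).
    input_sign encodes the assumed input matrix B = [0, input_sign]^T.
    """
    n = len(u_hist) - 1
    h = tau_est / n
    E = expm_pendulum(b_est, tau_est)
    pred = [E[0][0] * x[0] + E[0][1] * x[1],
            E[1][0] * x[0] + E[1][1] * x[1]]
    for k, u in enumerate(u_hist):
        s = -tau_est + k * h                    # integration variable in [-tau_est, 0]
        weight = 0.5 * h if k in (0, n) else h  # trapezoidal weights
        Es = expm_pendulum(b_est, -s)           # exp(-A s)
        pred[0] += weight * Es[0][1] * input_sign * u
        pred[1] += weight * Es[1][1] * input_sign * u
    return pred


# Illustrative values: 30 cm stick (b = 3g / 2L with g = 9.81), 0.2 s estimated delay
b_est, tau_est = 49.05, 0.2
x_now = [0.05, -0.1]
x_pred = predict_state(x_now, [0.3] * 2001, b_est, tau_est)
print(x_pred)
```

The predictor feedback action would then be formed from the predicted state and the gains, for example `u = p * x_pred[0] + d * x_pred[1]`.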

Ideal stability
If the internal model perfectly matches the actual system (i.e., if $A = \tilde{A}$, $B = \tilde{B}$, and $\tau = \tilde{\tau}$), the implementation is perfect, and there is no noise or other perturbation, then the system reduces to the ordinary differential equation
$$\dot{x}(t) = (A + BK)\,x(t).$$
For this system, arbitrary pole placement is possible by tuning the control gains $p$ and $d$. Stability requires $p > b$ and $d > 0$; therefore, the selection of large control gains results in a system that is robust to large perturbations in $p$ and $d$.
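In the ideal case the closed loop is a second-order system, so the stability conditions and the pole placement can be checked directly. The sketch below assumes the characteristic polynomial $\lambda^2 + d\lambda + (p - b) = 0$, which is consistent with the conditions $p > b$ and $d > 0$; the function names are illustrative.

```python
import cmath


def closed_loop_poles(b, p, d):
    """Roots of lam^2 + d*lam + (p - b) = 0."""
    disc = cmath.sqrt(d * d - 4.0 * (p - b))
    return (-d + disc) / 2.0, (-d - disc) / 2.0


def is_stable(b, p, d):
    """True if both poles lie strictly in the open left half-plane."""
    return all(z.real < 0.0 for z in closed_loop_poles(b, p, d))


def gains_for_poles(b, lam1, lam2):
    """Gains (p, d) that place the poles at lam1 and lam2 (real, or a conjugate pair)."""
    d = -(lam1 + lam2)
    p = b + lam1 * lam2
    return p.real, d.real


b = 49.05  # illustrative value: 30 cm stick with g = 9.81 m/s^2
p, d = gains_for_poles(b, complex(-3.0), complex(-5.0))
print(p, d, closed_loop_poles(b, p, d), is_stable(b, p, d))
```

Any pair of left half-plane poles can be assigned in this way, which is what makes the ideal case robust to large gain perturbations.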

Theoretical stability
In case of parameter mismatch, finite spectrum assignment is not possible. In this case, (15) with (19) defines a retarded functional differential equation (RFDE). 38 The characteristic equation of the closed-loop system can be obtained by substituting the trial solutions $x(t) = X_0 e^{\lambda t}$, $u(t) = U_0 e^{\lambda t}$ to get $T(\lambda)\,[X_0^T \;\; U_0]^T = 0$, where the matrix $T(\lambda)$ is assembled from (15) and (19). The characteristic equation of the closed-loop system is
$$\det\big(T(\lambda)\big) = 0. \tag{24}$$
In order to determine the stability boundaries, one can use the D-subdivision method for (24). Here, it is assumed that the system parameters in the internal model are accurate, that is, $\tilde{A} = A$, $\tilde{B} = B$, and the main uncertainty arises in the estimation of the delay. This is a plausible assumption for expert stick balancers, where the dynamics of the stick are known as a result of a sufficiently long learning process, while the reaction delay still remains an uncertain parameter. The extreme case where $\tilde{\tau} = 0$ corresponds to the traditional delayed PD controller. The critical length can be determined by analyzing the stable region bounded by the D-curves for decreasing length $L$. The stable region disappears when one of the following root configurations occurs: (i) there is a zero characteristic root with triple multiplicity at the origin of the complex plane (triple zero multiplicity, TZM), or (ii) there is a zero characteristic root with double multiplicity at the origin of the complex plane and a complex pair of characteristic roots on the imaginary axis (double zero multiplicity with complex pair, DZMC). The transition between the two conditions happens at a single point, where a zero characteristic root with triple multiplicity coexists at the origin with a complex pair of characteristic roots on the imaginary axis (triple zero multiplicity with complex pair, TZMC). Using these conditions, the critical length can be determined by investigating the multiplicity of the roots. Figure 4(A) shows the critical length as a function of the actual delay $\tau$ for different delay estimates $\tilde{\tau}$. Similarly to the delayed PD case, the quadratic form
$$L_{\mathrm{crit}} = c_{\mathrm{PF}}(\tilde{\tau})\,\tau^2 \tag{25}$$
can be fitted to the numerical results perfectly (with coefficient of determination $R^2 = 1$, which suggests that (25) is an exact form). The ratio $c_{\mathrm{PF}}(\tilde{\tau})/c_{0,\mathrm{PD}}$ of the regression coefficients can be used to assess the performance of the system compared with the perturbation-free delayed PD feedback.
This ratio is shown in Figure 4(B) for different estimated delays. If $0 < \tilde{\tau} < 0.54\,\tau$ or $0.72\,\tau < \tilde{\tau} < 1.76\,\tau$, then $c_{\mathrm{PF}}(\tilde{\tau})/c_{0,\mathrm{PD}} < 1$, that is, shorter sticks can be balanced by predictor feedback, even with significant delay mismatch, than by delayed PD feedback. When $\tilde{\tau} = \tau$, we have finite spectrum assignment; hence the critical length theoretically drops to 0. Surprisingly, if $0.54\,\tau < \tilde{\tau} < 0.72\,\tau$ or $1.76\,\tau < \tilde{\tau}$, then $c_{\mathrm{PF}}(\tilde{\tau})/c_{0,\mathrm{PD}} > 1$; in this case PD feedback is superior to predictor feedback. Figure 4(B) shows that the tendency of the regression ratio $c_{\mathrm{PF}}(\tilde{\tau})/c_{0,\mathrm{PD}}$ is symmetric in the vicinity of perfect delay prediction $\tilde{\tau} = \tau$. Hence, a 10% underestimation of the delay leads to a similar effect as a 10% overestimation.

Robust stability
Similarly to PDA feedback, robustness to perturbations in the control gains can be investigated by the pseudo-spectrum of the characteristic equation (24). For this purpose, the gain vector is written as the sum of its nominal value and a perturbation. Using (24) with $\tilde{A} = A$ and $\tilde{B} = B$, the characteristic equation of the nominal system can be expressed in terms of the delay mismatch $\tilde{\tau} - \tau$, and weights are assigned to the entries of the perturbation vector. The stability radius for a stable system can be calculated using definition (12). The critical length for a fixed triplet $(b, \tau, \tilde{\tau})$ can be calculated by analyzing stability radius contours for decreasing pendulum length. Figure 5(A) shows the critical length as a function of the feedback delay for different stability radii in case of a 10% overestimation of the delay in the predictor model. (Note that for a perfectly matching delay $\tilde{\tau} = \tau$, finite spectrum assignment is possible and theoretically $L_{\mathrm{crit}} \to 0$.) Similarly to (14) for PDA feedback, the relation between $L_{\mathrm{crit}}$ and $\tau$ for the predictor feedback can also be described well by the quadratic form
$$L_{\mathrm{crit}} = c_{\mathrm{PF}}(r_{\mathbb{R}}, \tilde{\tau})\,\tau^2. \tag{30}$$
Quadratic fitting on the calculated points gave a coefficient of determination $R^2 > 0.98$. Figure 5(B) shows the ratio $c_{\mathrm{PF}}(r_{\mathbb{R}}, \tilde{\tau})/c_{0,\mathrm{PD}}$ of the regression coefficients. As can be seen, the regression coefficient for the predictor feedback is significantly smaller than that of the delayed PD feedback. A contour plot of the critical lengths is shown in Figure 6 as a function of the feedback delay and the stability radius for delayed PD feedback and for predictor feedback. It can be seen that the same change in $\tau$ or in $r_{\mathbb{R}}$ results in a larger change in $L_{\mathrm{crit}}$ for the PD feedback than for the predictor feedback. This means that predictor feedback is more robust to control gain perturbations than delayed PD feedback.
Since the delay and the stability radius have different dimensions, direct comparison of their effect on the critical length is equivocal. Still, the effect of changes in the delay and in the stability radius can be compared for some fixed parameter values numerically as a case study.

Delayed PD feedback
When the feedback delay is increased from $\tau = 0.2$ to 0.4 s while keeping $r_{\mathbb{R}} = 0.2$ constant, the critical length increases from 1.06 to 4.08 m. When the stability radius is increased from $r_{\mathbb{R}} = 0.2$ to 0.4 while keeping $\tau = 0.2$ s constant, the critical length increases from 1.06 to 2.12 m. Thus, a 100% relative change in the delay results in a larger increase in the critical length (by about 285%) than the same relative change in the stability radius does (which gives about a 100% increase in $L_{\mathrm{crit}}$). In this sense, the performance of delayed PD feedback depends more sensitively on the time delay than on parameter perturbations.

Predictor feedback
When the feedback delay is increased for the model discussed above from $\tau = 0.2$ to 0.4 s while keeping $r_{\mathbb{R}} = 0.2$ constant, the critical length increases from 0.24 to 0.93 m. When the stability radius is increased from $r_{\mathbb{R}} = 0.2$ to 0.4 while keeping $\tau = 0.2$ s constant, the critical length increases from 0.24 to 0.34 m. Thus, a 100% relative change in the delay results in a larger increase in the critical length (by about 290%) than the same relative change in the stability radius does (which gives about a 42% increase in $L_{\mathrm{crit}}$). This highlights that although predictor feedback inherently involves delay compensation, it depends more sensitively on the time delay than on parameter perturbations, similarly to delayed PD feedback.
The above case study also shows that the sensitivity to a change in the delay is about the same for delayed PD and predictor feedback, while predictor feedback is less sensitive to a change in the stability radius. Overall, even with parameter perturbations, predictor feedback seems to be superior to PD feedback in terms of the critical pendulum length. Note, however, that this numerical analysis is valid only for the investigated parameters; in other regions of the parameter space, the dependency and sensitivity on changes in $\tau$ and in $r_{\mathbb{R}}$ might be different.
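The relative changes in this case study follow directly from the listed critical lengths, and the arithmetic can be reproduced as follows. The dictionary keys and the function name are illustrative; the lengths are the values quoted in the two case studies.

```python
def relative_increase_percent(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Critical lengths in metres, (before, after), as quoted in the case study
cases = {
    "PD, delay 0.2 -> 0.4 s": (1.06, 4.08),
    "PD, radius 0.2 -> 0.4": (1.06, 2.12),
    "PF, delay 0.2 -> 0.4 s": (0.24, 0.93),
    "PF, radius 0.2 -> 0.4": (0.24, 0.34),
}

increases = {name: relative_increase_percent(*pair) for name, pair in cases.items()}
for name, value in increases.items():
    print(f"{name}: +{value:.1f}%")
```

The two delay-driven increases come out within a few percentage points of each other, while the radius-driven increase is clearly smaller for predictor feedback.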

CONCLUSIONS AND APPLICATION TO HUMAN STICK BALANCING
Stabilization of unstable systems in the gravitational field can in most cases be reduced to the stabilization of an inverted pendulum. Two main factors that limit the performance of the stabilization process are the feedback delay and parameter uncertainties. Here, two main types of feedback mechanism, namely delayed PDA feedback and predictor feedback, were analyzed and compared in terms of the shortest pendulum that can be balanced, as a measure of control performance. Analysis of real structured stability radii showed that for PDA feedback the critical length remains a function of the square of the delay even in the presence of parameter perturbations, as shown by (14). By comparing PD and PDA feedback, it was found that acceleration feedback helps to decrease the critical length if the acceleration control gain is small enough, typically if $|\hat{a}| < 0.5$. For the comparison of PD and predictor feedback, it was found that predictor feedback is superior to delayed PD feedback in the sense that significantly shorter pendulums can be stabilized with the same stability radius. When delay mismatch is present in the predictor feedback, the performance is still better (i.e., the critical length is still shorter) than for delayed PD feedback if $0 < \tilde{\tau} < 0.54\,\tau$ or $0.72\,\tau < \tilde{\tau} < 1.76\,\tau$. Another important feature is that the relation between the critical length and the time delay for predictor feedback with parameter uncertainties is quadratic according to (30).
A main question in human balancing tasks is whether the underlying control mechanism is based on delayed state feedback or on predictor feedback. The typical time delay in stick balancing on the fingertip is between 200 and 250 ms. 24,29,30,32,35 Even expert stick balancers cannot balance sticks shorter than 30 cm, and the typical range of the critical length is between 30 and 50 cm. Based on these values, an estimate can be given of the corresponding stability radius, which can be interpreted as the parameter uncertainty in the control gains, that is, in the visual perception of the angular position and the angular velocity. In Figure 6, the range of the time delay associated with stick balancing is indicated by red lines. It can be seen that a static perturbation larger than 4.2% can already destabilize the delayed PD feedback in case of $\tau = 0.2$ s and $L = 0.5$ m. The uncertainty in visual perception is typically estimated to be larger than this value. 29,35 This means that pure PD feedback may not be the proper model for skilled stick balancers. For predictor feedback, the critical lengths are significantly smaller. For $\tau = 0.25$ s, a stick of length 30 cm can be balanced even in case of a 13% static perturbation. This critical length is much shorter than what human subjects can balance, which implies that predictor feedback can be a possible control concept in human stick balancing even in the presence of other uncertainties that were not modeled here, for example, noise, actuation imperfection, and further mismatches in the internal model.
It should be noted that although the presented analysis used the concept of static uncertainties according to References 17-20,39, the results can also be interpreted for stochastic parameter uncertainties, for example, noise in the sensory perception or in the motor control. As shown in References 40-42, stochastic perturbation has a similar effect on the performance of the control process: the stable parameter region in the presence of noise is typically smaller than the stable region for the nominal (noise-free) system. In this sense, the stability radii can be used to demonstrate the robustness of the system against noise, too.
The calculations for the models were performed in the MATLAB environment. The codes used to generate the results can be downloaded from https://github.com/koviub/PDAPFRobust.