Robust static output feedback Nash strategy for uncertain Markov jump linear stochastic systems

The authors declare that they have no conflict of interest. No materials requiring a permission statement for reproduction from other sources are used. The data that support the findings of this study are available from the corresponding author upon reasonable request.
ABSTRACT
In this article, robust static output feedback (SOF) Nash games for a class of uncertain Markovian jump linear stochastic systems (UMJLSSs) are investigated, in which each player may have access to local/private SOF information. It is proved that the robust SOF Nash strategy set can be obtained by minimizing the upper bounds of the cost functions based on a guaranteed cost control mechanism. By using the Karush–Kuhn–Tucker (KKT) condition, the necessary conditions for the existence of the robust SOF Nash strategy set are established in terms of the solvability conditions of nonlinear simultaneous algebraic equations (NSAEs). A heuristic algorithm is developed to solve the NSAEs. Particularly, it is shown that the robust convergence of the heuristic algorithm is guaranteed by combining


INTRODUCTION
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2021 The Authors. IET Control Theory & Applications published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.
Uncertain Markov jump linear stochastic systems (UMJLSSs) have received considerable attention because they can describe many real-world systems that are characterized by a stochastic process and are subject to random abrupt changes caused by component failures and unmodeled dynamics [1]. Usually, deterministic uncertainties refer to uncertainties in system matrices. In the UMJLSSs, deterministic uncertainties are independent of the stochastic processes of Markov jump and Brownian motion. Therefore, UMJLSSs can be regarded as a generalization of Markov jump linear stochastic systems (MJLSSs). In recent years, various efforts have been made to deal with system uncertainties in both stochastic and deterministic systems. By using the Lyapunov stability theory, robust stability and stabilization issues of the uncertain Markov jump delay stochastic systems (UMJDSSs) have been investigated [2,3]. The robust exponential stabilization problem for the UMJDSSs with mode-dependent state variable delay has also been considered by using a memory state feedback controller [4]. The dissipative control problems for continuous and discrete-time nonlinear Markovian jump time-delay systems have been addressed [5,6]. Sliding mode control for a class of delayed discrete-time nonlinear Markovian jump systems has been handled [7]. Various robust stability and stabilization conditions have been reported for the UMJLSSs [1]. Moreover, some efforts have also been made to deal with dynamic games for the MJLSSs and the UMJLSSs. For example, the state feedback Nash strategies and Pareto suboptimal strategies for the MJLSSs and the robust state feedback Nash strategies and Pareto suboptimal strategies for the UMJLSSs are studied in [8][9][10][11][12]. The existing studies have provided effective solutions to the corresponding problems under the assumption of using state feedback strategies. However, in controller/strategy design, full state information is not always available in reality. In most practical applications, control/game designers can only rely on local or partial state information because full state information is difficult, if not impossible, to obtain. As a result, it is important to study static output feedback (SOF) controllers/strategies in the control/game field. Furthermore, a game setting in which all the players have the same SOF information structure does not appear reasonable when considering Nash game problems.
In fact, it is essential that each player be allowed to have access to local or private SOF information in a non-cooperative Nash game. To the best of our knowledge, no studies have tackled this important challenge.
In the past two decades, much attention has been devoted to the SOF control/game problems for a variety of systems. The SOF control problems have been studied for Markov jump linear deterministic systems [13][14][15]. The finite-horizon H∞ SOF control problems for Markov jump systems have also been investigated, and sufficient conditions for the existence of the SOF controller have been derived in the form of relaxed linear matrix inequalities (LMIs) [16]. There exist some studies on SOF strategies in dynamic games for the MJLSSs [17][18][19]. While many theoretical results have been established to demonstrate the existence of the SOF controller/strategy, more research effort is required on numerical methods for SOF control/game problems because these problems involve solving complex equations and inequalities that are usually high-order and cross-coupled. The associated matrix inequality problems are NP-hard [20].
In this article, we study a robust SOF Nash game for the UMJLSSs with multiple players. This is an extension of the previous studies on deterministic systems [21,23] to Markov jump stochastic systems. Moreover, although robust SOF strategies for UMJLSSs in cooperative Pareto games have been dealt with in [22], robust SOF strategies for UMJLSSs in non-cooperative Nash games are studied here for the first time. The difference is that, in Nash games, each player is allowed to have its own SOF information structure. Designing the robust SOF Nash strategies for the UMJLSSs relies on solving the nonlinear simultaneous algebraic equations (NSAEs). As the number of players N grows, the number of NSAEs increases dramatically, and such large-scale NSAEs become very hard to solve.
The contributions of this paper are multi-fold. First, based on the guaranteed cost control mechanism [24], the closed-loop stochastic system obtained by using the robust SOF Nash strategy set is proven to be exponentially mean-square stable (EMSS), and the upper bounds of the cost functions are established. Second, we prove that the robust SOF Nash strategy set can be designed by minimizing the upper bounds of the cost functions. By using the KKT condition [25], the necessary conditions for the existence of a robust SOF Nash strategy set are established in terms of the solvability conditions of the NSAEs. In contrast to the existing results based on the LMI-based algorithm [17,21,23], we solve the NSAEs rather than the LMI strict constraint optimization problem, which guarantees optimality for the bounds of the cost functions. Third, we propose a heuristic algorithm to compute the solution set of the NSAEs. We improve the convergence robustness of the proposed algorithm by combining the Krasnoselskii-Mann (KM) iterative algorithm (see [26,27] and references therein) with a new convergence condition. Note that, although the robust SOF Nash strategy set is considered, the NSAEs are solved explicitly using the proposed iterative techniques to overcome computational difficulty. Finally, we present a practical example to demonstrate the effectiveness of the proposed algorithms. In particular, we show that a robust Nash strategy can be designed using only one state value for each player in a Williams-Otto process example [30][31][32].
Throughout this paper, some notations are used: I r ∈ ℝ r×r denotes the identity matrix; ‖ ⋅ ‖ denotes the Euclidean norm of a matrix; [ ⋅ | (t ) = i] denotes the conditional expectation operator; d n,m denotes the space of all , P) be a given filtered probability space where there exist a standard one-dimensional Wiener process, w(t ), t ≥ 0, and a right continuous homogeneous Markov process, (t ), t ≥ 0, with state space  = {1, 2, … , d }. It is assumed that {w(t )} t ≥0 and { (t )} t ≥0 are independent stochastic processes. Furthermore, Markov process (t ) has the transition probabilities given by Consider the following UMJLSS where x(t ) ∈ ℝ n denotes the state vector, u(t ) ∈ ℝ m denotes the control input, and y(t ) ∈ ℝ r denotes the output vector. Matrices A( (t ), t ), A p ( (t ), t ) ∈ ℝ n×n and B( (t ), t ), B p ( (t ), t ) ∈ ℝ n×m have the following forms [1,11,12,22]: Coefficients A, A p ∈ d n,n , and B, Real matrices Δ( (t ), t ) ∈ ℝ n p ×n a and Δ p ( (t ), t ) ∈ ℝ n q ×n b are unknown, representing deterministic uncertainties.
To study the robust SOF Nash strategy for a class of UMJLSSs, we need to give some related definition and lemmas. Definition 1. [18,22] The UMJLSS in (2) or (A, B, A p , B p ) is stochastic stabilizable in mean-square sense, if there exists an SOF control with K (1), K (2),…,K (d ) being constant matrices, such that for any initial state x(0) = x 0 , (0) = i, the closed-loop system is exponentially mean-square stable (EMSS), that is, for some  > 0 and  > 0.
Consider the following UMJLSS where v(t ) ∈ ℝ m v denotes the external disturbance, z(t ) ∈ ℝ n z denotes the controlled output, and coefficients B v ∈ d n,m v with B v (k), i ∈  , being constant matrices.
The following result, related to the disturbance attenuation problem under consideration, was established as an extended version of the bounded real lemma in [1]. Lemma 1. [12,22] Let γ denote the required disturbance attenuation level. Consider a set of symmetric positive definite matrices W > 0 and positive scalar parameters such that the following matrix inequalities hold for every mode i: Then, we have ii) the following inequality holds: where iii) the worst-case disturbance is given by Although the H∞ performance is closely related to the initial value x(0), the initial value does not affect the derivation of the worst-case disturbance. In H∞ control theory, the initial value x(0) of the system is usually assumed to be zero.
The following result is established to facilitate derivations in the main contributions. Lemma 2. [12,22] Consider the following autonomous UMJLSS and the cost function Suppose that a set of symmetric positive definite matrices X and positive scalars (k) and (k) exist such that the following matrix inequalities hold for every i ∈  :

Then, (i) the UMJLSS in (12) is EMSS; (ii) the cost function has the following upper bound
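As a point of reference for the cost bound in Lemma 2, it may help to look at the uncertainty-free special case with a scalar state, where the cost can be computed exactly. For dx = a(i) x dt + a_p(i) x dw with running cost q(i) x², and provided the system is mean-square stable, the cost starting from mode i equals X(i) x_0², where the X(i) solve the coupled Lyapunov-type equations (2 a(i) + a_p(i)²) X(i) + Σ_j π_ij X(j) + q(i) = 0. The sketch below solves these equations for two modes. The function name, the transition-rate matrix `pi`, and all numerical values are hypothetical illustrations and are not taken from this paper.

```python
def coupled_lyapunov_two_modes(a, a_p, q, pi):
    """Solve (2*a[i] + a_p[i]**2) * X[i] + sum_j pi[i][j] * X[j] + q[i] = 0, i = 0, 1.

    a, a_p, q : per-mode drift, diffusion and cost weights (scalars, hypothetical).
    pi        : 2x2 transition-rate matrix with zero row sums (hypothetical).
    """
    # Coefficients of the 2x2 linear system M @ X = -q.
    m00 = 2 * a[0] + a_p[0] ** 2 + pi[0][0]
    m01 = pi[0][1]
    m10 = pi[1][0]
    m11 = 2 * a[1] + a_p[1] ** 2 + pi[1][1]
    det = m00 * m11 - m01 * m10
    if abs(det) < 1e-12:
        raise ValueError("coupled equations are singular")
    # Cramer's rule with right-hand side (-q[0], -q[1]).
    x0 = (-q[0] * m11 + m01 * q[1]) / det
    x1 = (-m00 * q[1] + m10 * q[0]) / det
    return [x0, x1]


# Hypothetical two-mode data.
a = [-1.0, -2.0]
a_p = [0.5, 0.3]
q = [1.0, 1.0]
pi = [[-0.4, 0.4], [0.6, -0.6]]
X = coupled_lyapunov_two_modes(a, a_p, q, pi)
```

With deterministic uncertainties present, the matrix inequalities of Lemma 2 replace these equalities, and the resulting value is an upper bound rather than the exact cost.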
Next, we propose a robust Nash strategy for a class of UMJLSSs.

ROBUST NASH STRATEGY
Consider a UMJLSS with multiple players, defined by The cost functions are defined by where The robust SOF Nash strategy in non-cooperative dynamic games with multiple players is investigated here, which is an extension of the existing results reported in [11,12,22]. The problem under consideration is formulated as follows.
For a given disturbance attenuation level, > 0, find a robust SOF Nash strategy set (u * 1 , … , u * N ) and a worst case disturbance v * (t ) where x i (t ) denotes the state of each player with the following structure: where i = 1, … , N , and The robust Nash equilibrium refers to that the upper bound of the cost functions, (20) against unmodeled deterministic uncertainties and stochastic uncertainties.

Disturbance attenuation conditions
Consider the following closed-loop UMJLSS and the cost function The following matrix inequalities and the worst case disturbance are obtained based on Lemma 1: where W (k) > 0, k = 1, … , d , and Note the following matrix substitutions in UMJLSS (8): , It should be noted that the following inequality are satisfied

Robust Nash strategy set
Different from the Pareto strategy for cooperative games investigated in [22], the Nash strategy for non-cooperative games is considered here. The Nash strategy can be decided independently by each player, which leads to a more complex strategic structure than that of the Pareto strategy. The cost function in (18a) is changed accordingly as follows: Hence, the following matrix inequalities can be established based on the previous result in [11,12,22] and Lemma 2: where k = 1, … , d , and Furthermore, the following cost upper bounds can be obtained: where The robust SOF Nash strategy can be obtained by solving the upper bound minimization problem of the cost functions (26) as follows.
i and * i are the solutions of the upper bound minimization problem of the cost functions (26), then there exist constants S i > 0, i = 1, … , N such that P * i > 0, K * i , * i and * i satisfy the following NSAEs in (27): where In this case,

constitutes a robust SOF Nash strategy set. Furthermore, the upper bound of the cost functions under the robust SOF Nash strategy set (28) is minimized; inequality (20) is satisfied.
Proof. First, we study the following upper bound optimization problem of cost functions (26): where u * m , m = 1,…,i − 1, i + 1,…,N , are the fixed strategy. Under inequality constraint (25a), the existence conditions can be established using the KKT condition [25]. The Lagrangian,  i (k), is defined as where S i ( j ) is the symmetric matrix of the Lagrange multiplier, and we set (0) = k.
It is clear that Tr [P i (k)] and G i are continuously differentiable at point q * i (k). Using the KKT conditions results in Based on (31c) which are generalized cross-coupled stochastic Sylvester equations (GCCSSEs), (27b) can be obtained. Under the assumption of (27), the GCCSSEs in (31c) have unique solutions S i (k) = S * i (k) > 0. Therefore, (27a) holds, from (31b). Furthermore, from (31d), the strategy set of (27c) can be computed. Next, (27d) and (27e) can be derived, respectively, from (31e) and (31f). Hence, inequality (20) can be obtained since the following inequality holds Second, for the H ∞ constraint, if matrix inequalities (22a) and (22b) hold, we can obtain inequality (28) by applying Lemma 1. This completes the proof of Theorem 1. □ Remark 1. Notably, it is not easy to confirm the feasibility of the derived conditions (27). It is also very difficult to solve the NSAEs. Therefore, an effective and efficient computational procedure needs to be developed.

HEURISTIC ALGORITHM
In order to determine the robust SOF Nash strategy set in (28) and obtain the worst case disturbance in (22c), we need to solve the high-order and complex NSAEs in (27) and the matrix inequality in (22a). In general, equations (27) need to be solved using an algorithm, for example, Newton's method.
In this paper, we reduce the computation required to obtain a solution set by using LMIs. With the LMI optimization technique, there is no need to solve equations for optimization variables, such as (27d) and (27e), and the algorithm is simplified. This is very useful for the multivariable optimization problems in this paper. First, the following optimization problem is solved.
Let us define the Lagrangian, (k), is defined as Tr [T (l ), (34) where T ( j ) is the symmetric matrix of the Lagrange multiplier.
Using the KKT condition, the following NSAEs are established as the necessary condition. where Hence, the LMI-based iterative algorithm is given as follows.
Step 1. Set the initial values: choose K is EMSS, and select appropriate constants i (k) and (k).
We are now in a position to state the following theorem.
Proof. It is easy to observe that the algorithm always generates a non-increasing sequence of the cost. That is, the following inequalities hold.
Therefore, since the cost sequence is non-increasing and a lower bound exists by (36), the sequence is bounded and monotonic and hence convergent. □ The procedure of (43) in Step 4 is based on the KM iteration [26,27]. The proposed algorithm achieves robust convergence by combining the non-increasing cost sequence with the KM iteration. However, if the iterative procedure in (43) is omitted, the algorithm reduces to the Picard iteration scheme, which may not converge.
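The contrast between the Picard and KM schemes can be seen on a toy fixed-point problem. The following is a minimal sketch of the generic KM update x ← (1 − α) x + α T(x), not of the update (43) itself. The map T (a rotation by 90 degrees, which is nonexpansive with the origin as its only fixed point), the averaging weight α = 0.5, and all function names are assumptions chosen only for illustration.

```python
import math


def rotate90(p):
    """Nonexpansive map T: rotation by 90 degrees; its only fixed point is (0, 0)."""
    x, y = p
    return (-y, x)


def picard(T, x0, steps):
    """Plain fixed-point (Picard) iteration x <- T(x)."""
    x = x0
    for _ in range(steps):
        x = T(x)
    return x


def krasnoselskii_mann(T, x0, steps, alpha=0.5):
    """KM iteration x <- (1 - alpha) * x + alpha * T(x) with a constant weight alpha."""
    x = x0
    for _ in range(steps):
        tx = T(x)
        x = tuple((1 - alpha) * xi + alpha * ti for xi, ti in zip(x, tx))
    return x


def norm(p):
    return math.hypot(*p)


start = (1.0, 0.0)
picard_result = picard(rotate90, start, 100)          # keeps circling on the unit circle
km_result = krasnoselskii_mann(rotate90, start, 100)  # contracts towards the fixed point
```

In this example the Picard iterates never leave the unit circle, whereas each KM step shrinks the norm by a factor of 1/√2, so the averaged iteration approaches the fixed point.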
Remark 2. Setting the initial condition in both algorithms is a difficult task. In fact, trial and error cannot be avoided when selecting an initial condition such that the closed-loop system is stable. As reported in [28], most existing algorithms require the determination of an initial stabilizing gain, which can be extremely challenging. Therefore, further investigation is required to allow a less restrictive choice of the initial gain.
To overcome such difficulty, the optimality for the upper bounds of the cost functions in Theorem 1 is attained by solving very large-scale NSAEs resulting from the KKT condition. Furthermore, to avoid deriving Newton's method for solving the NSAEs as a batch process, we have applied the KM iteration, which guarantees convergence. Moreover, the proposed numerical scheme is reasonable to implement compared with the LMI-based iterative technique because the optimal cost bound is computed.
Remark 3. In recent years, an SOF framework for a class of networks of switched heterogeneous linear vehicle systems with asynchronous switching has been discussed in terms of an LMI-based control protocol design scheme [29]. However, the established consensus strategy depends on slack matrices, and how to choose them was not discussed in detail. Another important difference from [29] is that our problem formulation includes the cost functions of all players. As a result, by choosing the weights of the cost functions, the control designer can obtain an appropriate transient response and the optimal bound of the cost. Conversely, the SOF control strategy in [29] only guarantees exponential stability with an H∞ disturbance attenuation.

NUMERICAL EXAMPLE
To demonstrate the effectiveness of the preceding theoretical results, a practical numerical example based on the Williams-Otto process [30,31] is provided in this section. The Williams-Otto process is widely known as a non-isothermal continuous stirred-tank reactor (CSTR) that includes three parallel reactions. It is also a typical chemical process widely adopted in the control engineering literature. For example, it was used in [31,32] as a delay framework, where the detailed physical characteristics and related deterministic equations of the system can be found. The details of the CSTR are given below. The reactor is fed by two reactant feed rates F_A and F_B. Upon entering the chemical reactor, raw materials A and B take part in three chemical reactions that produce a product P along with some other by-products. Product P is obtained after some impurities are separated. In this situation, the reactor is required to be controlled by the catalyst feeds u_1(t) := F_A/6V_R and u_2(t) := F_B/6V_R, which are related to raw materials A and B, respectively. Moreover, it is assumed that control input u_2(t) is subject to failures with the following two modes of operation: i) undamaged (mode 1) and ii) opening reduced to 40% of the desired value due to probabilistic damage or failure (mode 2) [33]. Thus, the system can be represented as a stochastic system governed by a UMJLSS, in which the failure mode is regarded as the jumping mode.
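The two-mode failure model described above can be pictured as a continuous-time Markov chain that alternates between the undamaged mode and the degraded mode. The following is a minimal sketch of how such a mode path could be sampled. The leaving rates and the function name are hypothetical and are not the values used in this example; only the 40% effectiveness of mode 2 comes from the description above.

```python
import random

# Actuator effectiveness of u_2 in each mode: undamaged (1) and reduced to 40% (2).
EFFECTIVENESS = {1: 1.0, 2: 0.4}


def simulate_modes(leave_rate, t_end, mode0=1, seed=0):
    """Sample the jump times of a two-mode continuous-time Markov chain.

    leave_rate[m] is the (hypothetical) rate of leaving mode m; holding times
    are exponentially distributed with that rate.
    Returns a list of (jump_time, new_mode) pairs starting with (0.0, mode0).
    """
    rng = random.Random(seed)
    t, mode = 0.0, mode0
    path = [(t, mode)]
    while True:
        t += rng.expovariate(leave_rate[mode])  # exponential holding time
        if t >= t_end:
            break
        mode = 2 if mode == 1 else 1            # with two modes, a jump always switches
        path.append((t, mode))
    return path


# Hypothetical rates: failures occur at rate 0.5, repairs at rate 1.0.
path = simulate_modes({1: 0.5, 2: 1.0}, t_end=20.0, seed=1)
gains = [EFFECTIVENESS[m] for _, m in path]
```

Each entry of `gains` is the factor by which the second input channel is scaled during the corresponding holding interval.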
In addition, stochastic perturbations are modeled by letting 5% of the state coefficient be driven by a Wiener process. It is also assumed that the uncertainties in (3) cause a 1% variation in part of the nominal matrices.
It can be assumed that only the deviation of x_4(t) in the weight composition of the product P is measured in the CSTR example [31]. Because of this restriction, the SOF strategy is considered, and the strategy can be easily implemented in practice. It is worth pointing out that, although each player can only access the current state x_4(t) owing to the form of the output matrices C_i, i = 1, 2, the proposed SOF strategy can be designed from a practical point of view, unlike [18,22]. Moreover, the problem and the algorithm in this paper are more complex than those in [19], since [19] has no deterministic uncertainties and involves fewer optimized parameters than this paper. The weight matrices of the cost functions are given by R_1 = R_2 = 1 in both modes, The control problem for the feed rates F_A and F_B can be regarded as a Prisoner's Dilemma, for which the Nash equilibrium [34] can be used. Namely, the problem can be formulated as a dynamic Nash game, and determining the complicated decision scenarios in the CSTR is a novel and challenging task.
Next, we select the disturbance attenuation level γ = 2 and set the positive scalar parameters to 0.1 for every mode k = 1, 2. Using the proposed recursive algorithms with the initial conditions: we obtain the following robust Nash H∞ constrained strategy set in (19): It should be noted that the proposed algorithm in Section 4 converges after 52 iterations with an accuracy of 10^{-13}, with the cost decreasing strictly at every iteration n, and the strategy set was computed. On the other hand, although the existing approach based on the LMI strict constraint optimization problem [17,21,23] converged with an accuracy of 10^{-6}, the same algorithm did not converge with an accuracy of 10^{-7}. It is worth pointing out that, unlike the LMI-based iterative scheme [17,21,23], the proposed iterative algorithm based on the NSAEs achieves robust convergence, which is a novel contribution.
Second, the cost bounds obtained by using the proposed KM iteration technique are as follows: From the numerical results, it is easily found that the KM iteration technique yields more precise cost bounds than the iterative technique based on the LMI [17,21,23]. Finally, Figure 1 shows how the system states and the mode vary with time. For the special case in which the deterministic time-varying uncertainty is Δ(t) = sin t, all the states converge to the equilibrium point.
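A trajectory of the kind plotted in Figure 1 can be generated with the Euler-Maruyama scheme. The following is a minimal sketch for a scalar stand-in of the closed-loop system, not the CSTR model itself. The per-mode drift coefficients, the mode leaving rates, the step size, and the function name are hypothetical; the 1% uncertainty with Δ(t) = sin t and the 5% Wiener-driven share of the state coefficient follow the description of this example.

```python
import math
import random


def simulate_closed_loop(x0=1.0, t_end=20.0, dt=0.01, seed=0):
    """Euler-Maruyama simulation of a scalar Markov jump linear stochastic system."""
    rng = random.Random(seed)
    drift = {1: -1.0, 2: -0.4}      # hypothetical closed-loop coefficient in each mode
    leave_rate = {1: 0.5, 2: 1.0}   # hypothetical rates of leaving each mode
    x, mode, t = x0, 1, 0.0
    xs, modes = [x], [mode]
    for _ in range(int(round(t_end / dt))):
        # Nominal coefficient perturbed by a 1% uncertainty with Delta(t) = sin t.
        a = drift[mode] * (1.0 + 0.01 * math.sin(t))
        # Wiener increment over one step; 5% of the coefficient multiplies dw.
        dw = rng.gauss(0.0, math.sqrt(dt))
        x += a * x * dt + 0.05 * drift[mode] * x * dw
        # First-order approximation of the mode jump probability over one step.
        if rng.random() < leave_rate[mode] * dt:
            mode = 2 if mode == 1 else 1
        t += dt
        xs.append(x)
        modes.append(mode)
    return xs, modes


xs, modes = simulate_closed_loop()
```

Plotting `xs` and `modes` against time gives a state trajectory and a mode path; in this stand-in the state decays towards the origin because both hypothetical drift coefficients are stabilizing.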
Remark 4. It should be noted that the lower bound of the disturbance attenuation level strongly depends on the system parameters. Likewise, the positive scalar parameters must be selected by trial and error, or relatively large values may be chosen. In general, the lower bound for the existence of a strategy set is obtained in advance by a bisection method with the non-targeted parameters fixed. The control designer can then choose values above these bounds. In this paper, when the disturbance attenuation level is fixed at γ = 2, the lower bound of the scalar parameters is 0.0078787. Conversely, when the scalar parameters are fixed at 0.01, the lower bound of the disturbance attenuation level is γ = 0.25388 in this example. Here, the lower bound of the disturbance attenuation level means the infimum of γ for which a strategy set exists.
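The bisection search mentioned in Remark 4 can be sketched as follows, assuming that feasibility is monotone in the parameter, that is, once a strategy set exists for some value it also exists for every larger value. The function name and the feasibility oracle are hypothetical: in practice the oracle would run the proposed algorithm and report whether a strategy set was found, whereas here it is replaced by a simple threshold test at the value reported in this example.

```python
def bisect_min_feasible(is_feasible, lo, hi, tol=1e-6):
    """Return an approximation of the smallest feasible value in [lo, hi].

    Assumes is_feasible is monotone: False below some threshold, True above it.
    """
    if not is_feasible(hi):
        raise ValueError("upper end of the search interval is not feasible")
    if is_feasible(lo):
        return lo
    # Invariant: lo is infeasible and hi is feasible.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi


def toy_oracle(gamma):
    """Stand-in for the solvability check; the threshold is the bound reported above."""
    return gamma >= 0.25388


gamma_min = bisect_min_feasible(toy_oracle, 0.0, 2.0)
```

The same routine applies to the scalar parameters when the disturbance attenuation level is held fixed, with the oracle swapped accordingly.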

CONCLUSION
In this article, the robust SOF Nash strategies have been studied for the UMJLSSs. It is ensured that a robust SOF Nash strategy set can be established even if each player can only use his/her local or private SOF information. The influence of deterministic uncertainties and external disturbances can be attenuated simultaneously using the proposed strategy set, which can be computed by solving the NSAEs. Different from the recent studies in [11,12], robust SOF Nash strategies are considered. Moreover, the robust Nash strategy of non-cooperative games is derived, in contrast to the Pareto suboptimal strategy in [22]. A heuristic algorithm is developed to solve the NSAEs. Although the corresponding NSAEs are highly complex and nonlinear, their solutions can be computed by using the proposed algorithms in a relatively straightforward manner. A novel convergence condition is combined with the KM iteration to attain robust convergence of the proposed algorithms. A simple example is provided to validate the effectiveness of the proposed algorithms.
In H∞ control theory, it is assumed implicitly that the initial condition of the system is zero. Therefore, the possibility of a nonzero initial condition in Lemma 1 has not been considered. This issue, despite its importance, is not discussed in this paper, and the robustness of the controller performance in such a situation remains a challenging problem. In addition, the computational cost of the proposed method increases drastically with the number of modes. In this case, a fundamentally different design concept is needed, for example, a mode-independent SOF strategy. These issues will be studied in future work.
Finally, the proposed approach is expected to be extended to UMJLSSs in which the transition probabilities of the Markovian process include partially uncertain and/or unknown terms [35]. The proposed control strategy cannot be applied to this more general case, and the extended problem is more challenging than the one for the UMJLSSs considered in this paper. This issue will be addressed in future investigations.