Direct data-driven design of switching controllers

Summary Switching linear models can be used to represent the behavior of hybrid, time-varying, and nonlinear systems, while generally providing a satisfactory trade-off between accuracy and complexity. Although several control design techniques are available for such models, the effect of modeling errors on the closed-loop performance has not yet been formally evaluated. In this paper, a data-driven synthesis scheme is thus introduced to design optimal switching controllers directly from data, without needing a model of the plant. In particular, the theory is developed for piecewise affine controllers, which have proven to be effective in many real-world engineering applications. The performance of the proposed approach is illustrated on some benchmark simulation case studies.


INTRODUCTION
Switching models are well-established, powerful tools that can be used to describe hybrid, nonlinear, and time-varying phenomena occurring in modern engineering applications, eg, when a system may work in different operating conditions or the dynamics may vary according to specific physical limits or switches.1 In particular, piecewise affine (PWA) models are collections of linear time-invariant (LTI) submodels, defined over nonoverlapping polyhedral regions of the input/state domain and indexed by a discrete-valued switching variable. PWA models have been extensively used and studied within the control community for their universal approximation properties,2 which make them suited for the analysis and control of several classes of nonlinear systems,3 and for their equivalence with other classes of hybrid models.4 The advantages of working with such model structures are many; most importantly, they provide an excellent trade-off between descriptive power and complexity (the local submodels may be very simple even if the overall system is highly complex).
In the last decades, several control methods have been proposed to handle systems characterized by the interaction between continuous and discrete dynamics, eg, based on Lyapunov arguments5 or on the solution of linear matrix inequalities (LMIs)6-8 (see the work of De Schutter et al9 for an overview of existing approaches). Under additional operating constraints, PWA systems can be effectively controlled via hybrid model predictive control (MPC).10,11 However, depending on the model dynamics, the controller may turn out to be highly complex and of high order, leading to robustness and implementation problems.12 Moreover, hardware computational limits may prevent or restrain the use of MPC or other optimization-based solutions. Fixed-structure control design, eg, switching proportional-integral-derivative (PID) control laws with PWA variations of the gains, could instead represent an effective alternative in all the above cases. Unfortunately, unlike in the LTI case,13 little work has been done on fixed-order control design for switching systems: eg, in the work of Kong et al,14 a switching PID is tuned based on an accurate model of the system, using genetic algorithms to control a nonlinear DC motor, while the switching logic is dictated by ad hoc heuristics in the work of Marchetti et al.15 In the general case where no accurate model is available, the design of a fixed-order switching controller is a tough undertaking, in that it requires three critical design steps: (i) the parameterization and identification from experimental data of a switching model of the nonlinear/time-varying/hybrid system, (ii) a model-reduction step to compute a model of the desired complexity, and (iii) a control design step. While the latter is a critical but feasible task, the first two steps raise several issues. First of all, the problem of identifying PWA models is known to be NP-hard.16 For this reason, several heuristics17,18 have been introduced to learn both the parameters of each affine submodel and the polyhedral partition of the input/state domain. Among them, the approaches proposed in 19-23 first tackle the problem of identifying the local models and, afterwards, that of learning the polyhedral partition. Among these techniques, only the one proposed in 23 can efficiently handle large data sets, thanks to its computationally efficient solution of the multicategory discrimination problem. Secondly, model reduction for switching systems is a largely unexplored field. Gosea et al24 rely on the assumption that the switching signal is known. The same holds for the method of Shaker and Wisniewski,25 which also requires the solution of a set of LMIs, thus making it computationally expensive. The work of Papadopoulos and Prandini26 presents an interesting technique for the case of an endogenous switching signal; however, such a method is conceived for continuous-time systems, whereas most existing identification algorithms compute models in discrete time.18,27 It follows that the continuous-time counterpart of the model would need to be found, eg, as explained in 28, or, alternatively, the system needs to be identified directly in continuous time as described in 29. Nevertheless, notice that in 29 the switching sequence is supposed to be known.
Regardless of the specific technique used for identification or model reduction, the above control design approach strongly relies on the availability of an accurate model of the system. Since the controller complexity is specified a priori, an appealing alternative is the direct optimization of the controller parameters from data, without undertaking an expensive and time-consuming modeling study or performing any model-reduction step.30 Such an approach to control design is usually referred to as direct data-driven control or model-free control and has been widely investigated within the LTI framework; see, eg, the Virtual Reference Feedback Tuning (VRFT)31,32 and the Non-iterative Correlation-based Tuning (CbT)33,34 methods. Although such approaches have been extended to the control of more complex systems, such as linear parameter-varying (LPV) systems35,36 or nonlinear systems with bounded disturbances,37,38 they have never been extended to the control of hybrid/switching systems, in which both the local behavior and the operating condition of the plant are unknown. To the best of our knowledge, the only approach in the literature for data-driven control of switched systems has recently been presented in 39, in which the problem of finding a switching static state-feedback regulator is solved within a set-membership framework through the solution of a polynomial optimization problem. This approach relies on Lyapunov arguments to formulate the design problem and to prove the stability of the closed-loop system. This is possible since the authors assume the state of the plant to be measurable, which is often not true in practice. Although a state vector can be constructed from past input/output data, properties proven by using this nonminimal state might not extend to the actual system. The authors further assume the discrete state of the controlled system to be known, implying insights into the true system and limiting the use of the approach when the actual
plant is only approximated by a switching model.
Similar assumptions are made in most existing approaches for model reference adaptive control (MRAC) of switching and PWA systems, so that closed-loop stability can be guaranteed through Lyapunov arguments. Since the first contributions on the extension of adaptive control to this class of hybrid systems,40,41 MRAC methods for PWA systems have relied on the assumption that the partition driving the switches in the operating mode of the system is known. This limits the applicability of these methods, especially when dealing with unknown nonlinear systems. Although the approaches presented in 41 have been further extended in 42 to overcome possible instabilities caused by disturbances on the input, the effect of uncertain/noisy state measurements has not been explored. Analogously, the hypothesis of exact state knowledge is shared by the approach proposed in 43 for asynchronous adaptive tracking control of switched systems, where the authors further assume prior knowledge of lower and upper bounds on the controller parameters. More recently, two MRAC approaches for PWA systems have been proposed in 44, which rely on different strategies to adaptively tune the controller parameters. In the direct method presented therein, parameter tuning is based on the tracking error, whereas the indirect approach relies on a time-varying estimate of the model of the system. Although these approaches handle multivariable plants and are not constrained by the structural assumptions on the plant imposed in 40,41, both methods still assume the operating condition of the system to be known. Furthermore, the theoretical analysis of the methods does not consider disturbances affecting the state measurements, which are instead introduced in the numerical examples presented to assess the performance of the controller. An approach for adaptive control of uncertain switched systems has been proposed in 45, which accounts for disturbances acting on the measured state and does not
require prior knowledge of the bounds of the uncertainty set. However, as in most approaches for control of switched systems, the operating condition of the plant is supposed to be directly accessible, so that it can be imposed by the controller. More recently, an MRAC approach for switched systems explicitly accounting for the case of an unmeasurable state has been proposed in 46. The problem is handled by designing both a state observer and an adaptive controller, which requires a considerably larger design effort than designing the controller only. Furthermore, both the controller and the observer rely on the discrete state of the plant, which is assumed to be accessible.
Instead of adjusting the controller online to changes in the operating condition of the plant, as in adaptive approaches,47 in this paper we present a novel model-reference approach for the direct data-driven control of PWA systems, focusing on the single-input single-output (SISO) setting only. This allows us to lay the foundations for future extensions to the multiple-input multiple-output (MIMO) case, in which not only the nonlinear nature of the plant has to be accounted for, but also (unknown) input couplings. The proposed method allows for the offline design of fixed-order PWA controllers from data within a stochastic setting, without requiring the modeling/identification of the unknown plant. The controller is thus characterized by a finite collection of local affine controllers, defined over a polyhedral partition of a fixed domain, that are not adjusted or modified once the controller has been deployed. The proposed technique relies on input-output (IO) data only, so that the state of the system to be controlled does not need to be measured. Additionally, we assume that the operating condition of the controlled system is unknown. Thanks to this assumption and to the universal approximation properties of PWA maps,2 the proposed technique is suited also for the control of nonlinear time-varying systems, which can often be well approximated by PWA models. Inspired by recent results on model-reference direct control of LPV systems,48 the design problem is formulated as a convex optimization problem, whose solution characterizes both the discrete and the continuous dynamics of the controller. This optimization problem is solved through an extension of the coordinate descent method for learning jump models proposed in 49, which embeds an instrumental variable (IV) scheme to efficiently cope with the noise affecting the measurements of the system output. Because of the chosen formulation, the partition characterizing the discrete dynamics of the
controller is retrieved from data, using the computationally efficient multicategory discrimination approach proposed in 23. The main contributions of the paper are the following: (i) a novel data-driven stochastic method for the estimation of fixed-order PWA controllers is proposed, which requires neither the knowledge of a model for the underlying system nor that of its operating condition; (ii) by estimating both the local controllers and the switching mechanism from data, the method represents an alternative to existing data-based approaches that does not require the prior design of the switching policy, which is instead directly estimated from offline data; (iii) to the best of our knowledge, this is the first approach that allows for the direct design of both the local controllers and the switching policy; (iv) thanks to the estimation of the switching policy, we show that the method can effectively be used for the direct control of nonlinear systems, whereas other approaches for data-driven control of switching systems cannot be employed in this way, since they rely on knowledge of the system switching sequence. Despite these advantages, closed-loop stability currently cannot be formally guaranteed, due to the lack of off-the-shelf data-driven conditions for closed-loop stability within the input-output switching setting. Nonetheless, if the chosen reference model is stable and the problem is correctly solved, closed-loop stability is expected to be attained in practice, since a stable behavior is matched. The choice of an optimal attainable reference model is thus crucial to achieve good closed-loop performance. Although outside the scope of this work, a way of handling this interesting and challenging problem could be to extend the approach proposed in the work of Selvi et al50 to the switching case.
The paper is organized as follows. The problem is formally stated in Section 2. Section 3 is devoted to the reformulation of the objective in a direct data-driven fashion. The proposed design method is described in Section 4, whereas the details of the required preprocessing phase are given in Section 5. In Section 6, the effectiveness of the approach is assessed on two numerical examples and a benchmark simulation case study. The paper ends with some concluding remarks.

Notation
Let ℕ be the set of natural numbers and ℤ the set of integers. Denote with ℝ and ℝ + the sets of real and positive real numbers, respectively. Let ℝ n be the set of n-dimensional real vectors and denote with ℝ n×m the set of real matrices of dimension n × m. Denote with 1 n ∈ ℝ n the n-dimensional vector of ones. Given a vector a ∈ ℝ n , its i-th component is denoted by a i , ||a|| 2 is its Euclidean norm, and (a) + is the vector whose i-th element is max{0, a i }. Given two vectors a, b ∈ ℝ n , max{a, b} is the vector whose i-th component is max{a i , b i }, and a ⪯ b denotes the component-wise inequality a i ≤ b i , for i = 1, … , n. Given a matrix A ∈ ℝ n×m , A ′ ∈ ℝ m×n is its transpose. Given {u t } t≥0 with t ∈ ℕ, the shift operator q −1 is such that, for d ∈ ℤ, it holds that q −d u t = u t−d . Given a logic condition C, denote with 𝟙 [C] the corresponding indicator function, ie, 𝟙 [C] = 1 if C holds and 𝟙 [C] = 0 otherwise. For example, for a, b ∈ ℝ, 𝟙 [a=b] = 1 iff a = b.
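As a quick illustration of this notation, the elementwise operations above can be reproduced in NumPy; this is only a sketch, and the vectors a and b are arbitrary examples, not objects from the paper.

```python
import numpy as np

a = np.array([-1.0, 2.0, -3.0])
b = np.array([0.5, 1.5, -4.0])

# (a)_+ : vector whose i-th element is max{0, a_i}
a_plus = np.maximum(0.0, a)        # [0.0, 2.0, 0.0]

# max{a, b} : elementwise maximum of the two vectors
m = np.maximum(a, b)               # [0.5, 2.0, -3.0]

# a <= b componentwise (here False, since a_2 = 2 > b_2 = 1.5)
leq = bool(np.all(a <= b))

# indicator 1[a_i == b_i], evaluated per component
ind = (a == b).astype(int)         # [0, 0, 0]
```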

PROBLEM FORMULATION
Let  s be a single-input single-output (SISO) switching system that can operate according to K ∈ ℕ finite operating regimes (also denoted as modes). The input-output (IO) relationship characterizing  s is described by the difference equation

A(s o (t), q −1 )  o (t) = B(s o (t), q −1 ) u(t) +  (s o (t)),

where u(t) ∈ ℝ and  o (t) ∈ ℝ are the input to  s and the corresponding noiseless output, respectively, and the switching variable s o (t) ∈ {1, … , K} indicates the operating mode of the system at time t ∈ ℕ. The scalar  (k) ∈ ℝ and the polynomials A(k, q −1 ) and B(k, q −1 ) in the shift operator q −1 characterize the dynamics of  s in the k-th mode, for k = 1, … , K, with A(k, q −1 ) and B(k, q −1 ) respectively defined as

A(k, q −1 ) = 1 + a 1 (k) q −1 + … + a n a (k) q −n a ,
B(k, q −1 ) = b 1 (k) q −1 + … + b n b (k) q −n b ,

and n a , n b ∈ ℕ indicating the dynamical order of the system. Let X o (t) be the collection of past inputs/outputs of the system. In this work, we focus on piecewise affine (PWA) systems. Indeed, we assume that  s changes its operating regime according to a polyhedral partition of the domain . Therefore, we assume that there exists a complete partition { k } K k=1 of the space  such that the discrete dynamics of the switching system  s is described by the logical condition s o (t) = k iff X o (t) ∈  k . Note that the collection { k } K k=1 forms a complete partition of the space  if the regions are nonoverlapping and their union covers . Each convex polyhedral region  k is described by a finite set of linear inequalities. It is straightforward to see that the operating condition of the system in (6) is driven by the polyhedral partition of the output space reported in Figure 1. This system is indeed piecewise linear (PWL).
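To make the switching mechanism concrete, the sketch below simulates a hypothetical two-mode, first-order PWA system whose operating regime is driven by a partition of a one-dimensional output space, in the spirit of the PWL example above; the local dynamics a, b, f are illustrative choices, not the paper's benchmark.

```python
import numpy as np

def simulate_pwa(u, a, b, f, y0=0.0):
    """Simulate a two-mode PWA system
       y(t) = a[k] * y(t-1) + b[k] * u(t-1) + f[k],
    where the active mode k (the switching variable s_o(t)) is decided
    by a polyhedral partition of the output space:
    mode 0 if y(t-1) >= 0, mode 1 otherwise."""
    y = np.zeros(len(u) + 1)
    y[0] = y0
    modes = []
    for t in range(1, len(u) + 1):
        k = 0 if y[t - 1] >= 0 else 1
        modes.append(k)
        y[t] = a[k] * y[t - 1] + b[k] * u[t - 1] + f[k]
    return y[1:], modes

# hypothetical stable local dynamics (|a_k| < 1 keeps each mode stable)
a = [0.7, 0.3]
b = [1.0, 0.5]
f = [0.0, 0.2]
u = np.ones(5)
y, modes = simulate_pwa(u, a, b, f)
```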
In this paper, we aim at designing a controller  s for  s so as to match a desired stable closed-loop behavior  s . The considered reference models lie within the class of PWL models, described by difference equations with state x M (t) ∈  M ⊆ ℝ n x M and output  d (t) ∈ ℝ, the latter being the desired closed-loop output for the reference r(t) ∈ ℝ at time t. The discrete variable s M (t) ∈ {1, … , K M } changes its value according to a polyhedral partition of the state space  M . According to the switching logic described in (7c), the operating condition of the reference model varies independently of the mode of  s and of the reference signal. This choice allows us to design the controller  s without restricting its effectiveness to a specific class of references. With a slight abuse of notation, the operator M(s M (t)) is used from now on to denote the relationship between r(t) and y d (t). Throughout the rest of this paper, the system  s is assumed to be fully unknown. Nonetheless, we suppose that open-loop experiments can be carried out by feeding  s with a set of inputs {u(t)} T t=1 and that the corresponding output can be measured. It is worth remarking that the operating condition of  s is unknown and cannot be retrieved from data without identifying a model for the plant. For the collected signals to be suited for learning tasks, the PWA system  s is assumed to be stable, where stability is intended as follows. Definition 1. A PWA system (2)-(4) is bounded-input bounded-output (BIBO) stable if, for all trajectories {u(t),  o (t)} t∈ℕ verifying (2) and switching paths {s o (t)} t∈ℕ , with s o (t) ∈ {1, … , K} and K < ∞, satisfying (4), it holds that sup t∈ℕ | o (t)| < ∞ whenever sup t∈ℕ |u(t)| < ∞. Note that this condition can be relaxed by collecting experimental data in closed loop, through the interconnection of  s with a stabilizing controller.51
The output measurements are supposed to be corrupted by an additive zero-mean stationary noise v(t), ie, (t) =  o (t) + v(t). Instead of identifying  s , we propose to use the available data set  T = {u(t), (t)} T t=1 to directly design a controller  s for the system, so as to match the desired closed-loop behavior  s described by (7). For this purpose, the controller is sought within the class of PWA models where e(t) = r(t) − (t) and s c (t) ∈ {1, … , K c }, with K c ∈ ℕ finite, are the tracking error and the logic state of the controller at time t, respectively. The polynomials A c (k, q −1 ) and B c (k, q −1 ) are defined analogously to A(k, q −1 ) and B(k, q −1 ), with n c a and n c b denoting the dynamical order of the k-th local controller, so that the parameters to be estimated for each local controller are the coefficients of A c (k, q −1 ) and B c (k, q −1 ), together with the affine offset. We choose the discrete state s c (t) ∈ {1, … , K c } of the controller to be driven by a polyhedral partition of the domain of (t), with (t) being generally defined as a collection of past inputs/outputs. Remark 1. By imposing that the controller depends on the tracking error e, different weights on the reference and the output are not allowed. Nonetheless, this choice complies with the characteristics of many controllers used in practice for SISO systems, eg, PID controllers. At the same time, it allows us to limit the number of parameters to be learned, thus reducing the computational burden of the final algorithm. Finally, note that the structure of  s can easily be modified to account separately for y and r. Indeed, this would entail a slight change in the expression for the dynamics of the local controllers, while leaving the regressor vector unchanged.
Remark 2. Given the chosen switching logic for  s (see (10e)-(10f)), the partition characterizing the controller  s is directly related to the one of the plant.Nonetheless, as  s is unknown, such a relationship cannot be explicitly determined and will be therefore numerically computed as illustrated next.
The available data set  T is assumed to be collected by feeding  s with sufficiently rich inputs, ie, the input sequence should be persistently exciting for all operating conditions of interest and should ensure that all these modes are visited periodically. A preliminary check on the properties of the input can be performed as described in 52, where a more restrictive definition of persistence of excitation is considered. Nonetheless, since the design of experiments for switching systems is a difficult open problem and is outside the scope of this paper, further discussion of this topic is postponed to future work.
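A simple way to obtain an input of this kind in simulation is a piecewise-constant signal with randomly drawn amplitude levels, which tends to drive a switching system through its different modes; the levels, hold time, and horizon below are arbitrary illustrative choices, not prescriptions from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

def multilevel_input(T, levels=(-1.0, 0.0, 1.0), hold=5):
    """Piecewise-constant multilevel random input: the amplitude is drawn
    from `levels` and held for `hold` samples.  Several distinct levels
    and frequent amplitude changes help excite the different operating
    regimes of a switching system."""
    n_blocks = int(np.ceil(T / hold))
    amps = rng.choice(levels, size=n_blocks)
    return np.repeat(amps, hold)[:T]

u = multilevel_input(100)
```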
Under this assumption, the problem addressed in the paper can be stated as follows.
Problem 1 (Data-driven design of switching controllers). Given a data set  T = {u(t), (t)} T t=1 collected from open-loop experiments on  s , described as in (2), and a PWL reference closed-loop model  s as in (7), find the fixed-order PWA controller with the structure in (10) that realizes the desired closed-loop behavior  s . When the plant operates in a single mode, problem (12) reduces to the data-driven design of a controller for a linear time-invariant (LTI) system. When the reference model is also LTI (ie, K M = 1), the considered design problem can thus be effectively solved via existing techniques, such as the Virtual Reference Feedback Tuning (VRFT) method31 or Non-iterative Correlation-based Tuning (CbT).33 Moreover, if the operating condition s o (t) of  s is measured,  s can be seen as an LPV system with an integer scheduling variable, and thus problem (12) can be solved via the approach proposed in 48. Problem 1 requires learning from data: (i) the number K c of local controllers, (ii) the overall parameter matrix Θ of the local controllers, and (iii) the polyhedral partition { c i } K c i=1 . Since the regions { c i } K c i=1 are not fixed a priori, the PWA controller is expected to be flexible enough to attain the desired closed-loop behavior, even if the underlying system can only be approximately described by a PWA model. On the other hand, since we aim at designing fixed-order local controllers, their dynamical order (namely, n c a and n c b ) is fixed a priori. In this work, the number of local controllers K c is also imposed by the user. Although K c is generally unknown, it can be chosen by practitioners relying on prior information on the possible working conditions of the system. Alternatively, since the design problem is solved offline, the number of local controllers can be chosen through cross-validation, with an upper bound on K c dictated by the maximum tolerable complexity of the controller. As a possible cross-validation strategy, the design problem can be solved over a subset of data for increasing values of K c , and the number of local controllers can then be selected as the one
minimizing a properly chosen performance criterion. A possible choice for such a criterion is indicated in Section 6. Otherwise, any clustering algorithm (eg, K-means53) can be used to split the vectors (t) into K c clusters, for increasing values of K c . The number of local models can then be chosen as the one leading to the best clustering solution according to some evaluation criterion, eg, the Davies-Bouldin index.54 Let us now mathematically formalize Problem 1. Define (t) as the mismatch between the desired closed-loop behavior and the one attained by interconnecting  s and  s at time t, as illustrated in Figure 2, which depicts the proposed closed-loop behavior matching scheme.
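As an illustration of the clustering-based route for choosing K c mentioned above, the sketch below clusters synthetic regressor vectors with a plain k-means implementation and scores each candidate K with a hand-rolled Davies-Bouldin index (lower is better); the data, the deterministic seeding rule, and the candidate range are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, K, iters=50):
    # plain Lloyd iterations, deterministically seeded with evenly spaced samples
    centers = X[np.linspace(0, len(X) - 1, K, dtype=int)].copy()
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels, centers

def davies_bouldin(X, labels, centers):
    # DB index: average over clusters of the worst (s_i + s_j) / d_ij ratio
    K = len(centers)
    s = np.array([np.linalg.norm(X[labels == k] - centers[k], axis=1).mean()
                  for k in range(K)])
    db = 0.0
    for i in range(K):
        db += max((s[i] + s[j]) / np.linalg.norm(centers[i] - centers[j])
                  for j in range(K) if j != i)
    return db / K

# two well-separated synthetic clouds standing in for the vectors chi(t)
X = np.vstack([rng.normal(0.0, 0.2, (50, 2)),
               rng.normal(3.0, 0.2, (50, 2))])
scores = {}
for K in (2, 3, 4):
    labels, centers = kmeans(X, K)
    scores[K] = davies_bouldin(X, labels, centers)
K_best = min(scores, key=scores.get)   # lowest index = best clustering
```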
According to this matching scheme, we specifically search for a PWA controller  s , defined as in (10), that realizes the desired closed-loop behavior  s in (7) with the least mismatch error (t) in (11). Then, Problem 1 can be rewritten as the optimization problem (12), where  T 1 = {1, 2, … , T}. The constraint in (12b) is formulated by exploiting the desired closed-loop dynamics. Note that the cost function is optimized with respect to the unknown parameters Θ, the polyhedral partition { c i } K c i=1 , and the sequence of controller modes  c = {s c (t)} T t=1 . The introduction of this additional unknown is functional to the clustering phase needed to estimate both the parameters of the local controllers and the polyhedral partition of the space  c in a supervised fashion. In the rest of this paper, for fixed K c , n c a , and n c b , we assume that the following statements hold. Assumption 1 (Perfect matching). There exist a parameter matrix Θ o , a polyhedral partition { c i } K c i=1 , and an active mode sequence  o,c such that the corresponding PWA controller  s in (10) is a solution of problem (12).

Assumption 2 (Parameter identifiability). For a fixed partition { c i } K c i=1 and for every two instances Θ (1) and Θ (2) of the parameter matrix Θ, there exists an input trajectory such that the response of the interconnection between  s in (10) and  s in (2) is different if Θ (1) ≠ Θ (2) . Assumption 3 (Partition identifiability). For fixed parameters Θ o and for every pair of partitions { c,(1) i } K c i=1 and { c,(2) i } K c i=1 of the space  c , there exists an input trajectory such that the response of the interconnection between  s in (10) and  s in (2) is different if the partitions are different. Assumption 1 entails that the considered problem is well posed, namely, that there exists a controller within the considered class that realizes the desired closed-loop behavior. Instead, Assumptions 2 and 3 imply that the parameters and partition characterizing the optimal controller are globally identifiable, provided that the other unknowns are fixed to the optimum. These assumptions are required for consistency analysis, but they may be hard to check in practice. Indeed, the system  s to be controlled is unknown and, thus, problem (12) cannot be explicitly solved. Nonetheless, it must be pointed out that these hypotheses do not limit the applicability of the approach, since our main goal is not to exactly retrieve the optimal solution but to design a controller that behaves equivalently to the optimal solution of problem (12), namely, a controller  s leading to the desired closed-loop performance. Note that similar identifiability assumptions are shared by any control design method where the model is found from experimental data.
We further rely on the following technical assumption on M(s M (t)).
Assumption 4. The mapping M(s M (t)) between the reference signal r(t) and the desired output y d (t) is invertible, for all t ∈  T .
The left-inverse of M(s M (t)) in (7) is denoted as M † (s M (t)) and is defined as follows.
Definition 2. Given a causal PWA model M(s M (t)) with input r(t) and output y d (t), defined as in (7), its left-inverse is the causal PWA mapping M † (s M (t)) giving r(t) as an output when fed with y d (t), ie, M † (s M (t))M(s M (t)) = 1.

A DIRECT DATA-DRIVEN REFORMULATION
The optimization problem in (12) explicitly depends on the reference signal r (see the constraints in (12b) and (12e)) and on the (unknown) dynamics of  s via the constraints in (12c)-(12d). By keeping the dependence on r, problem (12) would need to be solved for different reference trajectories, thus making the approach impractical in many applications. At the same time, the constraints in (12c) and (12d) would require the identification of a model for  s prior to the controller design phase. Therefore, before solving the design problem, we remove these dependencies as described in this section, so as to reformulate Problem 1 in a purely direct data-driven fashion.

Removing the dependence on the reference signal r
Through the definition of the mismatch error in (11) and the left-inverse M † (s M (t)), a fictitious reference signal can be obtained as in (13). The new reference signal in (13) is the combination of the output M † (s M (t))y o (t), returned by the left inverse when fed with the system output y o (t), and a term compensating for the mismatch error between the desired closed-loop behavior and the attained one. This alternative definition of the reference allows us to express the tracking error in a form, given in (14), that is independent of the actual reference signal r(t). Nonetheless, the tracking error depends on both the noiseless output of  s and the mismatch error (t). By exploiting equations (13)-(14), the optimization problem in (12) can be recast into the reference-free problem (15), where the constraint on the mismatch error (see (12b)) is removed, since the definition of (t) is used to construct the fictitious reference and the tracking error.
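The construction can be illustrated in the simplest possible setting: a single-mode (LTI) first-order reference model y_d(t) = alpha y_d(t-1) + (1 - alpha) r(t-1). The sketch below recovers the fictitious reference by inverting this model on the measured output, omitting the mismatch-compensation term for brevity; the model, alpha, and all signal names are assumptions for illustration, not the paper's reference model.

```python
import numpy as np

def virtual_reference(y, alpha=0.6):
    """Invert the (hypothetical) first-order reference model
       y_d(t) = alpha * y_d(t-1) + (1 - alpha) * r(t-1)
    to recover the fictitious reference that would have produced the
    measured output y, ie r_v = M^{-1} y (mismatch term omitted)."""
    r_v = np.empty(len(y) - 1)
    for t in range(1, len(y)):
        r_v[t - 1] = (y[t] - alpha * y[t - 1]) / (1.0 - alpha)
    return r_v

# sanity check: feeding M with r reproduces y, so inverting y returns r
alpha, T = 0.6, 20
rng = np.random.default_rng(1)
r = rng.normal(size=T)
y = np.zeros(T + 1)
for t in range(1, T + 1):
    y[t] = alpha * y[t - 1] + (1 - alpha) * r[t - 1]
r_v = virtual_reference(y, alpha)
e_v = r_v - y[:-1]     # fictitious tracking error e(t) = r_v(t) - y(t)
```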

Removing the dependence on the dynamics of  s
By looking at the constraints in (15b)-(15c), it is clear that problem (15) still depends on the unknown dynamics of the system  s . The current problem formulation thus requires the identification of the system prior to the controller design phase. To drop the dependence on the model of  s , we leverage the available data set  T = {u(t), (t)} T t=1 . Specifically, instead of explicitly considering the dynamics of the system, the available data are used to replace y o (t) in (15b)-(15c), so that the resulting problem is given by (16). The problem to be solved now depends only on the data and on the parameters of the controller  s to be designed. Nonetheless, problem (16) is biconvex, due to the products between the optimization variables in the constraint (16b).
Let (t) ∈ ℝ be defined as in (17), which can be computed relying on the available data and on the model to be matched. By properly rearranging the terms in (16b), one obtains (18) and, by introducing the regressor X c ((t)) ∈ ℝ n  in (19), it is straightforward to prove that equation (18) is equivalent to (20). The design problem is thus recast as the minimization of the parameter-dependent residuals in (21). Since our aim is to minimize the mismatch error, it is worth remarking that this problem has to be solved avoiding the trivial solution B c (k, q −1 ) = 0, for all k = 1, … , K c . This goal is accomplished by resorting to an instrumental variable scheme when the parameters Θ are estimated, as detailed in Section 4.1.
The  2 penalty term ||Θ|| 2 2 is introduced to improve the statistical performance of the controller tuning strategy,55 with the regularization parameter  ∈ ℝ + chosen to trade off the variance and bias of the estimated parameters.56 Notice that the higher  is, the more the local parameters are shrunk towards each other and towards zero.53 Therefore, improper choices of this parameter might lead to a poor matching of the desired closed-loop behavior, which, in turn, might cause a deterioration of the tracking performance.
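In practice, for a fixed mode sequence, the parameters of each local controller can be obtained in closed form by ridge (ℓ2-regularized) least squares over the samples assigned to that mode. The sketch below illustrates this on synthetic data; the paper additionally wraps the estimate in an instrumental-variable scheme to handle output noise, which is omitted here, and all signal names are hypothetical.

```python
import numpy as np

def ridge_per_mode(Phi, target, modes, K, lam=1e-3):
    """For each mode k, solve the closed-form ridge problem
       theta_k = argmin ||target_k - Phi_k theta||_2^2 + lam ||theta||_2^2
    over the samples currently assigned to mode k."""
    n = Phi.shape[1]
    Theta = np.zeros((K, n))
    for k in range(K):
        idx = np.asarray(modes) == k
        if not np.any(idx):
            continue
        P, z = Phi[idx], target[idx]
        Theta[k] = np.linalg.solve(P.T @ P + lam * np.eye(n), P.T @ z)
    return Theta

# toy data: two modes with different true local parameter vectors
rng = np.random.default_rng(2)
Phi = rng.normal(size=(200, 3))
modes = rng.integers(0, 2, size=200)
true = np.array([[1.0, -0.5, 0.2], [0.3, 0.8, -1.0]])
target = np.einsum('ij,ij->i', Phi, true[modes]) + 0.01 * rng.normal(size=200)
Theta = ridge_per_mode(Phi, target, modes, K=2)
```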

Explicit estimation of the polyhedral partition of space X c
Constraint (21b) in problem (21) depends on the polyhedral partition of the space  c of the vectors (t) in (10f). However, the partition is unknown and, thus, has to be retrieved from data. Inspired by 23, a possible approach to the estimation of the polyhedral partition is to search for a PWA separator of the domain  c .
The problem of learning the polyhedral partition from data reduces to finding a function  ∶ ℝ n  → ℝ defined as in (22), where  i () ∶ ℝ n  → ℝ, for i = 1, … , K c , are affine functions of , ie, as in (23), and the parameter vectors  i ∈ ℝ n  and the scalars  i have to be estimated from data, for i = 1, … , K c . According to the definition of the separator in (22), the i-th polyhedral region  c i is given by the data points satisfying () =  i (), namely, the points  such that (24) holds. Example 2. Consider the clusters { c i } 3 i=1 shown in Figure 3, which are defined over the one-dimensional space  c . These clusters are pairwise linearly separable, and the PWA separator () dividing the clusters is also shown in Figure 3. By looking at the separator, intuitively each data point is associated with the corresponding cluster by searching for the maximum affine function among { i } 3 i=1 . We modify problem (21) to penalize the misclassification of points in the  c -domain, according to the condition in (24) and to the mode sequence  c . Specifically, the squared 2-norm of the violation of (24) is introduced into the objective function in (21), and the design problem is recast as the minimization problem (25).
The additional regularization term on the separator parameters is introduced in (25) to better condition the multicategory discrimination problem. The finite regularization parameter λ_p ∈ ℝ₊ has to be tuned by the user, since λ_p might influence the generalization capabilities of the PWA separator. Indeed, different choices of λ_p might cause translations and/or rotations of the PWA separator. In turn, this leads to changes in the partition induced on 𝒳_c, which ultimately results in different switching policies. Therefore, an improper tuning of this parameter might cause a poor matching of the desired closed-loop behavior.

THE DESIGN METHOD
The data-driven design of the PWA controller in (10a) is carried out by solving the unconstrained optimization problem in (25) with respect to the overall parameters Θ of the local controllers, the parameters {ω_i, γ_i}_{i=1}^{K_c} of the PWA separator, and the active mode sequence 𝒮_c. Inspired by 49, problem (25) is solved via a coordinate descent method, which alternates between the minimization of the cost with respect to the parameters Θ and {ω_i, γ_i}_{i=1}^{K_c}, for a fixed sequence 𝒮_c^{k−1}, and the optimization with respect to the mode sequence 𝒮_c, for the updated parameters Θ^k and {ω_i^k, γ_i^k}_{i=1}^{K_c}. The approach, summarized in Algorithm 1, is developed by exploiting the separability of the cost with respect to Θ and the parameters characterizing the PWA separator. Indeed, for a fixed mode sequence 𝒮_c^{k−1}, it is straightforward to see that the cost to be optimized, given by (26a), can be split into the sum of the terms (26b) and (26c), allowing us to separately estimate Θ and {ω_i, γ_i}_{i=1}^{K_c}. The update of the active mode sequence at Step 1.3 is performed accounting for both the continuous dynamics of the local controllers and the position of the samples with respect to the estimated partition. It is worth remarking that the estimate of 𝒮_c allows us to cluster the available data and, thus, to solve the problems at Steps 1.1 and 1.2 in a supervised fashion. Algorithm 1 is run offline until either the cost J does not decrease significantly within two consecutive iterations, according to a tunable threshold ε_J > 0 as in (27), or the maximum number of iterations k_max is attained. This last stopping criterion is introduced to guarantee the termination of Algorithm 1 in finite time, but it does not affect the closed-loop performance attained with the learned controller. Indeed, if the problem is well-posed, Algorithm 1 usually converges to a solution within a few steps, thus satisfying the first stopping criterion.* Remark 4.
Once Algorithm 1 is stopped, the stability of the resulting controller can be checked by resorting to its nonminimal state-space representation. 28 Accordingly, the model-based approach proposed in 57 or the data-driven methods presented in 58,59 can be exploited to study the stability of the learned controller. However, when only input/output data are available in closed loop, to the best of the authors' knowledge, no input/output criterion exists to check the stability of the closed-loop system once the controller has been learned. Despite this limitation, if the problem is well-posed and a suitable stable reference model is chosen, the closed-loop system is expected to match the desired behavior.
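The alternation at the core of Algorithm 1 can be conveyed on a toy piecewise linear regression problem. The sketch below replaces Steps 1.1 and 1.3 with their simplest counterparts (per-mode least squares and residual-based mode reassignment, omitting the separator update and the IV correction), so it only illustrates the structure of the iterations, not the actual updates of the paper:

```python
import numpy as np

def coordinate_descent(X, u, K, k_max=100, eps_J=1e-10):
    """Toy counterpart of Algorithm 1: alternate per-mode least squares
    (Step 1.1) with residual-based mode reassignment (Step 1.3)."""
    rng = np.random.default_rng(0)
    s = rng.integers(0, K, size=len(u))        # initial guess for the mode sequence
    theta = np.zeros((K, X.shape[1]))
    J_prev = np.inf
    for _ in range(k_max):
        for i in range(K):                     # Step 1.1: local parameters
            idx = s == i
            if idx.any():
                theta[i], *_ = np.linalg.lstsq(X[idx], u[idx], rcond=None)
        resid = (X @ theta.T - u[:, None]) ** 2
        s = resid.argmin(axis=1)               # Step 1.3: reassign the modes
        J = resid[np.arange(len(u)), s].sum()  # cost for the updated sequence
        if J_prev - J < eps_J:                 # stopping criterion as in (27)
            break
        J_prev = J
    return theta, s, J

# Noiseless piecewise linear data: slope 2 for x >= 0, slope -1 for x < 0.
x = np.linspace(-1, 1, 200)
u = np.where(x >= 0, 2 * x, -x)
theta, s, J = coordinate_descent(x[:, None], u, K=2)
print(np.sort(theta.ravel()).round(3))  # the two slopes, up to mode relabeling
```

Starting from a random mode sequence, the alternation recovers the two local slopes and a near-zero cost in a handful of iterations, mirroring the fast convergence observed for well-posed problems.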
Once Algorithm 1 is terminated, the resulting controller can be used to control the plant, with the control action ũ(t) fed to the system obtained as described in Algorithm 2. In particular, the active local controller is chosen according to the estimated polyhedral partition of the space 𝒳_c (Step 2) and, then, the input is generated by using the estimated parameters of the chosen local controller (Step 4). Note that the reference signal r feeding the closed-loop system is not the fictitious reference used for training, but the one selected by the user according to the control task at hand.

*Only if the problem is ill-posed or the chosen ε_J is excessively small might the first stopping criterion never be satisfied. To avoid infinite loops, it is thus reasonable to set k_max = 100, so as to always terminate the iterations, while restraining the computational time in case Algorithm 1 does not converge.
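A single evaluation step of Algorithm 2 then amounts to an argmax over the estimated separator followed by one affine local law; the parameter values below are purely illustrative:

```python
import numpy as np

def pwa_control(xi, omega, gamma, theta):
    """One step of Algorithm 2: pick the active local controller from the
    estimated partition (Step 2), then compute the control action with its
    parameters (Step 4). All names and values are illustrative."""
    i = int(np.argmax(omega @ xi - gamma))   # active region of the PWA separator
    return i, float(theta[i] @ xi)           # local (affine) control law

omega = np.array([[1.0], [-1.0]])
gamma = np.array([0.0, 0.0])
theta = np.array([[2.0], [-0.5]])            # one parameter row per local law
mode, u = pwa_control(np.array([0.7]), omega, gamma, theta)
print(mode, u)  # → 0 1.4
```

Only the selected local controller is evaluated at each step, so the online cost of the PWA control law is essentially that of one affine law plus K_c inner products for the separator.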

Step 1.1: estimation of the parameters 𝚯
For a fixed mode sequence 𝒮_c^{k−1}, the estimation of the parameters Θ reduces to the solution of the optimization problem (30), with the objective function defined as in (26b). Assume first that the sequence 𝒮_{o,c} and the partition {𝒳_{o,i}^c}_{i=1}^{K_c} satisfying Assumption 1 are known. If the output measurements are not corrupted by noise (ie, v(t) = 0 in (9)), problem (30) is equivalent to the one in Equation (15). In this case, X_c(ξ(t)) in (26b) is simply replaced with X_c(ξ_o(t)), with ξ_o(t) denoting the noiseless regressor. However, in realistic scenarios where only noisy data are available, the solution of problem (30) no longer coincides with the ideal one, even when 𝒮_{o,c} is known, as shown in the next section. This discrepancy between the solutions of (12) and (25) is handled via a proper stochastic treatment of the noise, by recasting the cost J_1(𝒟_T, Θ, 𝒮_c^{k−1}) via an instrumental variable (IV) scheme. We thus introduce an instrument ζ(t) ∈ ℝ^{n_ξ}, with the same dimension as the regressor X_c(ξ(t)), which is selected so that it is not correlated with the noise-dependent term (M†(s_M(t)) − 1)v(t).

Remark 5. The instrument can be built by constructing a new set of regressors through a new experiment carried out on the plant with the same input used to generate 𝒟_T. Since the output measurements collected during the new experiment are affected by a different realization of the noise, this should guarantee that ζ(t) is correlated with the system dynamics. At the same time, the instrument should be uncorrelated with (M†(s_M(t)) − 1)v(t).
By introducing ζ(t), problem (30) is reformulated as in (33). Note that, differently from J_1(𝒟_T, Θ, 𝒮_c^{k−1}) in (26b), the objective function in (33) is not separable with respect to time. Therefore, the parameters θ_i, with i = 1, … , K_c, cannot be updated by solving K_c separate optimization problems.
Although this impacts the computational time required to design the controller to a certain extent, it is not critical since the controller is designed offline.
Let Λ[s_c^{k−1}(t)] ∈ {0, 1}^{n_ξ × n_Θ} be a selector matrix defined so that its i-th block Λ[s_c^{k−1}(t) = i] is the identity if and only if s_c^{k−1}(t) = i. Problem (33) can then be equivalently reformulated as (35), which explicitly depends on the unknown parameter matrix Θ. We can then construct Z ∈ ℝ^{T×n_ξ}, U ∈ ℝ^T, and R(𝒮_c^{k−1}) ∈ ℝ^{T×n_Θ}, which are obtained by stacking over time the instruments, the inputs, and the selected regressors, respectively. Problem (35) can thus be recast in matrix form as (37), whose closed-form solution is given by (38a). Besides the impact of the ℓ2 penalty introduced in (21) on the statistical properties of the designed controller, it is worth remarking that this regularization term causes the addition of λ ≥ 0 to all the diagonal terms of (Z′R(𝒮_c^{k−1}))′Z′R(𝒮_c^{k−1}), so that the matrix to be inverted is nonsingular for any λ > 0. Therefore, problem (37) is computationally tractable even if Z′R(𝒮_c^{k−1}) is not full rank. Once Θ^k is computed, the estimated parameters for the i-th local controller can simply be retrieved as in (38b). Notice that the computation of Θ^k in (38a) requires the inversion of an n_Θ-dimensional matrix and, thus, the computational complexity of the method increases with the number of parameters and the number of possible local controllers. Strategies for complexity reduction will be the subject of future investigations.
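The closed form in (38a) can be sketched numerically as follows. Here Z, R, and U are randomly generated stand-ins for the stacked instruments, regressors, and inputs (not the paper's actual construction), and the point illustrated is that the λ-term keeps the inversion well posed even when Z′R is rank deficient:

```python
import numpy as np

def iv_ridge(Z, R, U, lam):
    """Regularized IV estimate: with A = Z'R, solve (A'A + lam*I) theta = A'(Z'U)."""
    A = Z.T @ R
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ (Z.T @ U))

rng = np.random.default_rng(1)
T, n = 50, 3
R = rng.standard_normal((T, n))
R[:, 2] = R[:, 1]                              # duplicated column: Z'R loses rank
Z = R + 0.01 * rng.standard_normal((T, n))     # instrument correlated with R
U = R @ np.array([1.0, 2.0, 0.0])              # noiseless inputs, for the sketch
theta = iv_ridge(Z, R, U, lam=1e-6)
print(np.abs(R @ theta - U).max())             # tiny residual despite rank deficiency
```

Without the λ-term the normal matrix would be singular here; with it, the solve goes through and the reconstructed input still matches the measured one, which is the practical content of the tractability remark above.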

On the consistency of the estimated parameters
Let us assume that the following assumption is verified. Assumption 6. The IV is properly selected, namely, it is such that conditions (39a)-(39d) hold. These conditions entail a proper choice of the instrument and are commonly exploited to prove consistency. 60 Under these conditions, we can prove the following consistency result on the parameters estimated as in (38a), whose proof is reported in the Appendix.

Step 1.2: construction of the polyhedral partition of 𝒳_c
Since the mode sequence is fixed to 𝒮_c^{k−1}, the data can be split into K_c distinct clusters {C_i}_{i=1}^{K_c}. By relying on these clusters, the PWA separator defined in (22) is estimated via the optimization of the cost J_2 defined in (26c). Specifically, the cost function in (26c) is equivalent to (41), where we explicitly introduce the dependence of the cost on the number of misclassified points for each cluster. Let c_i denote the cardinality of the i-th cluster, namely, the number of points associated to C_i, for i = 1, … , K_c. As the problem is solved offline, we construct the vector X_i stacking the elements of the i-th cluster, so that the cost to be optimized is equal to (42), where the normalization with respect to c_i (the number of points associated to each cluster) is introduced for the problem to be better conditioned. This cost function allows us to find the PWA separator via the regularized piecewise smooth Newton method with Armijo's line search proposed in 23, which significantly reduces the computational time required to solve the discrimination problem with respect to other multicategory classification methods, even when large data sets are used.
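A discrimination cost of the kind in (41)-(42) can be sketched as below. The squared hinge loss with unit margin is one common choice for penalizing violations of (24) (an assumption here, not necessarily the exact loss of the paper), while the per-cluster normalization by c_i is the one discussed above; all numbers are illustrative:

```python
import numpy as np

def separation_cost(clusters, omega, gamma, lam_p=0.1):
    """Average squared violation of phi_i(x) >= phi_j(x) + 1 over each cluster
    (normalized by its cardinality c_i), plus an l2 regularization term."""
    cost = 0.0
    for i, Xi in enumerate(clusters):
        phi = Xi @ omega.T - gamma            # phi_j(x) for every x in cluster i
        margin = phi - phi[:, [i]] + 1.0      # positive where phi_i < phi_j + 1
        margin[:, i] = 0.0                    # ignore the self-comparison
        viol = np.maximum(margin, 0.0).sum(axis=1)
        cost += (viol ** 2).sum() / len(Xi)   # normalization by c_i
    return cost + lam_p * (np.sum(omega ** 2) + np.sum(gamma ** 2))

# Two well-separated one-dimensional clusters and a separator classifying both
# correctly: only the regularization term contributes to the cost.
c0 = np.array([[-1.2], [-0.8], [-1.0]])
c1 = np.array([[0.9], [1.1]])
omega = np.array([[-1.0], [1.0]])
gamma = np.array([0.0, 0.0])
print(separation_cost([c0, c1], omega, gamma))  # → 0.2 (= lam_p * ||params||^2)
```

When every point satisfies the margin condition, the data-fit part of the cost vanishes and only the λ_p-weighted regularizer remains, which is why λ_p directly shapes the trained separator.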

Step 1.3: update of the active mode sequence 𝒮_c
Once the parameters Θ^k and {ω_i^k, γ_i^k}_{i=1}^{K_c} have been computed, the active mode sequence 𝒮_c can be updated by minimizing the cost in (43), where the regularization terms have been removed as they are independent of the mode sequence. Since the cost function in (43) is separable over time, the estimation of the mode sequence can be solved to global optimality by dynamic programming 61 with complexity O(TK_c²). This is attained as follows. Let ℓ_t(u(t), ξ(t), i) ∈ ℝ be the t-th term of the objective function in (43), as in (44). First, the terminal cost J(i, T) ∈ ℝ is computed for all i ∈ {1, … , K_c} as in (45). Then, the cost J(i, t) ∈ ℝ and the mode S(i, t + 1) are obtained as in (46), backward in time, for t = T − 1, … , 1. The mode sequence is then reconstructed forward in time as in (47). Note that the optimal active mode at time t is selected as the one that trades off input matching (u(t) ≈ X_c(ξ(t))′θ_{s_c^k(t)}) and the assignment of ξ(t) to the polyhedral region dictated by the condition in (24). Therefore, this clustering policy allows us to account for both the local dynamics of the controller and the hypothesis made on the discrete dynamics of the switching controller.
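The backward-forward recursion in (45)-(47) can be sketched generically. Below, L[t, i] plays the role of the per-sample cost ℓ_t(u(t), ξ(t), i), while the transition cost C[i, j] is an assumed stand-in for any coupling between consecutive modes; it is the per-step minimization over this K_c × K_c coupling that yields the O(TK_c²) complexity:

```python
import numpy as np

def dp_mode_sequence(L, C):
    """Minimize sum_t L[t, s(t)] + sum_t C[s(t), s(t+1)] over mode sequences
    by dynamic programming: backward cost-to-go, then forward reconstruction."""
    T, K = L.shape
    J = np.zeros((T, K))
    S = np.zeros((T, K), dtype=int)
    J[T - 1] = L[T - 1]                        # terminal cost, as in (45)
    for t in range(T - 2, -1, -1):             # backward in time, as in (46)
        tot = C + J[t + 1]                     # tot[i, j] = C[i, j] + J(j, t+1)
        S[t] = tot.argmin(axis=1)
        J[t] = L[t] + tot.min(axis=1)
    s = np.empty(T, dtype=int)                 # forward reconstruction, as in (47)
    s[0] = J[0].argmin()
    for t in range(T - 1):
        s[t + 1] = S[t, s[t]]
    return s, J[0].min()

# Two modes, five samples: mode 1 is cheap in the middle, a switch costs 0.4.
L = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
C = np.full((2, 2), 0.4) - 0.4 * np.eye(2)     # 0.4 off-diagonal, 0 on-diagonal
s, J0 = dp_mode_sequence(L, C)
print(s.tolist(), round(J0, 3))  # → [0, 1, 1, 1, 0] 0.8
```

The optimum pays two switches (2 × 0.4) to exploit the cheap middle mode, which is exactly the kind of trade-off between pointwise fit and sequence structure that the mode update resolves.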

ON THE COMPUTATION OF THE FICTITIOUS REFERENCE SIGNAL
As outlined in Section 3.1, the dependence of the model-matching design problem (12) on the reference signal r is removed by constructing a fictitious reference signal from data. Such a signal should not be confused with the virtual reference signal computed in the VRFT algorithm. 31 In our case, the fictitious signal embeds the model-matching mismatch to be minimized, while in VRFT the virtual reference represents the "ideal" signal that would feed the closed loop if the latter coincided with the desired model. Further comments about the differences between our optimization-based approach and state-of-the-art data-driven control can be found in 48. Note that the derivation of r requires the computation of the left-inverse of the PWA reference model in (7), with the left-inverse defined as in Definition 2. Since the discrete state s_M(t) ∈ {1, … , K_M} of M(s_M(t)) can be measured for all t ∈ ℕ, the problem of computing the left-inverse of the reference mapping M(s_M(t)) in (7) reduces to the determination of the left-inverse of an LPV map with integer scheduling variable. This holds as the polyhedral partition driving the discrete dynamics depends on the state of M(s_M(t)) only (see (7c)), which is invariant under inversion, so that the discrete state is the same for the reference model and its left-inverse M†(s_M(t)). Therefore, by relying on existing results on the computation of the left-inverse for LPV mappings, 48 the left-inverse of the PWA map M(s_M(t)) can be found according to the following proposition.
Note that, with this construction, the number of samples available for the controller design reduces to T − τ, but this is not critical since usually τ ≪ T.
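For a strictly causal first-order LTI reference model, the left-inverse simply amounts to solving the state update for r. The sketch below uses, for illustration, the coefficients of the model in (63a), together with the assumed output map y_M = x_M:

```python
def fictitious_reference(y, a=0.8, b=0.2):
    """Left-inverse of x(t+1) = a*x(t) + b*r(t), y = x: recover the reference
    signal that reproduces the measured output through the reference model."""
    return [(y[t + 1] - a * y[t]) / b for t in range(len(y) - 1)]

# Forward-simulate the model under a constant unit reference, then invert it.
x = [0.0]
for _ in range(5):
    x.append(0.8 * x[-1] + 0.2 * 1.0)
r = fictitious_reference(x)
print(all(abs(ri - 1.0) < 1e-12 for ri in r))  # → True
```

The inverse returns len(y) − 1 samples, consistent with the T − τ remark above for τ = 1: one sample is lost per unit of delay inverted.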

SIMULATION EXAMPLES
The effectiveness of the proposed data-driven design strategy is evaluated through three numerical examples and a simulation case study of practical interest, already considered in 48. The first and the second numerical examples concern the control of two PWA systems (with LTI and PWL reference models, respectively), so that the dynamics of the (unknown) plant is exactly described by Equations (2)-(4). On the other hand, in the other examples, the proposed method is used to design a PWA controller for two different nonlinear systems. In these cases, the behavior of the true system can only be approximated through Equations (2)-(4). In all examples, the output measurements are supposed to be corrupted by noise, whose effect is assessed through the signal-to-noise ratio (SNR), ie, SNR = 10 log₁₀(∑_{t=1}^T (y(t) − v(t))² ⁄ ∑_{t=1}^T v(t)²) dB. The instrument ζ is constructed by performing a second open-loop experiment on the system with the same input used to build 𝒟_T. Since, in most applications, the control action feeding switching/nonlinear systems is aimed at compensating their discontinuous/nonlinear nature, we consider LTI reference models for three out of four case studies. Nonetheless, by selecting a PWL reference model for the second numerical example, the performance of PWA controllers designed directly from data is also assessed in case the ultimate goal is for the closed-loop system to be characterized by different operating regimes. Independently of its characteristics, the reference model is always chosen to be strictly causal and, thus, it has no direct feedthrough. Therefore, its left-inverse is always computed as described in Section 5.
Due to the dependence of the outcome of Algorithm 1 on the initial guess 𝒮_c^0 for the mode sequence, Algorithm 1 is carried out for N different initial sequences generated at random, selecting the solution associated with the least value of the cost J. This should help prevent Algorithm 1 from being trapped in a local minimum. For all possible initial mode sequences, Algorithm 1 is run until k_max = 100 iterations are carried out or the condition in (27) is satisfied with tolerance ε_J = 10⁻¹⁰. The polyhedral partition at Step 1.2 of Algorithm 1 is computed by using the piecewise smooth Newton method proposed in 23, with the same parameters reported therein. For each case study, the number of local controllers and the regularization parameters are chosen through cross-validation, by running Algorithm 1 for different combinations of the tuning parameters, with their possible values defined over a preselected grid. Cross-validation is performed over a data set of length T̃ ≪ T not used for learning. Since the value of the affine term in the local control law might affect the tracking performance in closed loop, a different regularization parameter λ_aff ≠ λ is considered for the ℓ2 penalty associated with this term. As an improper tuning might lead to a poor matching of the desired closed-loop behavior, performing closed-loop experiments for each candidate controller might be unsafe. Therefore, the parameters are selected without performing closed-loop experiments: they are chosen as the ones minimizing the root-mean-square error (RMSE) in the reconstruction of the measured input, with RMSE_u defined as RMSE_u = sqrt((1/T̃) ∑_{t=1}^{T̃} (u(t) − ū(t; K_c, λ, λ_aff, λ_p))²), u(t) being the input fed in open loop to the plant at time t, and ū(t; K_c, λ, λ_aff, λ_p) being the input obtained by simulating, in open loop, the controller learned for fixed values of K_c, λ, λ_aff, and λ_p. When performing cross-validation, Algorithm 1 is run for N = 20 randomly generated initial guesses for the mode sequence.
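The cross-validation loop thus amounts to a grid search scored by RMSE_u. The sketch below uses a hypothetical `fit_and_simulate` stand-in; in the actual procedure this would be Algorithm 1 followed by an open-loop simulation of the learned controller:

```python
import itertools
import math

def rmse(u, u_hat):
    """Root-mean-square error between measured and reconstructed inputs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, u_hat)) / len(u))

def select_by_cv(grid, fit_and_simulate, u_val):
    """Pick the tuning combination minimizing RMSE_u on validation data."""
    return min(grid, key=lambda params: rmse(u_val, fit_and_simulate(*params)))

# Hypothetical stand-in: larger lam shrinks the reconstructed input to zero,
# so the smallest regularization wins on this noiseless validation set.
u_val = [1.0, -1.0, 0.5]
fit_and_simulate = lambda K_c, lam: [u / (1 + lam) for u in u_val]
grid = list(itertools.product([1, 2, 3], [0.01, 0.1, 1.0]))
print(select_by_cv(grid, fit_and_simulate, u_val))  # → (1, 0.01)
```

The key design choice mirrored here is that the score is computed in open loop, so no potentially unsafe closed-loop experiment is needed during tuning.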
Once the tuning parameters are selected and the controller is learned via Algorithm 1, its performance is evaluated with respect to a given tracking task. In this case, the difference between the desired closed-loop behavior and the one attained with the designed controller is quantitatively assessed through the RMSE between the desired and the actual closed-loop outputs. All the tests are performed on an Intel Core i7 2.8-GHz processor with 16 GB of RAM running MATLAB 2018b.

Numerical example: LTI reference model
Consider the PWL system with K = 2 modes already introduced in Example 1 and described by (57). The desired closed-loop behavior is chosen as the second-order LTI system described by the difference equation (58). The PWA controller designed to attain the desired closed-loop behavior is characterized by first-order local controllers with integral action (ie, n_a^c = n_b^c = 1 and a_i(s_c(t)) = 1 for i = 1, 2). The control law is thus given by (59). A data set 𝒟_T of length T = 2750 is generated by exciting the system in (57) with a sequence of inputs uniformly distributed in the interval [−1, 1]. The corresponding outputs are corrupted by a zero-mean white Gaussian noise v(t) with standard deviation σ_v = 0.35, which yields SNR = 7.1 dB. An additional experiment is performed in similar conditions to generate a data set of 1250 samples used for cross-validation. By considering a maximum of five local models, Figure 4 shows the smallest RMSE_u among the ones resulting from different choices of the tuning parameters, obtained for different values of K_c. It is clear that the minimum is attained for K_c⋆ = 2. Unsurprisingly, as the reference model is LTI, this value corresponds to the number of operating conditions of the plant. According to the chosen K_c⋆, the other tuning parameters are set to λ⋆ = 0.1, λ_aff⋆ = 10, and λ_p⋆ = 0.1. With the selected tuning parameters, the controller is learned by running Algorithm 1 for N = 20 different initial sequences. The partition of the regressor space 𝒳_c characterizing the switching policy of the controller is shown in Figure 5. It is interesting to note that the induced partition clearly differs from the one describing the plant in (57). Indeed, the discrete dynamics of the plant is governed by the first component of the plant state, while the controller partition depends on the current/past outputs and past inputs, with the output corresponding to the second component of the plant state.
The performance of the controller is initially assessed by simulating the closed-loop system in the presence of a piecewise constant reference. In the simulation, the feedback output y is still corrupted by noise, with the same characteristics as the one affecting the open-loop measurements. The attained closed-loop behavior is compared with the desired one in Figure 6, where it can be noticed that the reference signal is tracked despite the measurement noise. Figure 6 further shows that switches between different local controllers are overall not frequent. The same controller is then used to track a sinusoidal reference, attaining the performance shown in Figure 7. Also in this case, the desired and actual closed-loop responses are similar. As reported in Figure 7, the controller does not switch its state often, with its switching sequence being almost periodic, thus reflecting the characteristics of the chosen reference signal.
For the piecewise constant reference, the RMSE attained with the learned PWA controller is equal to 0.606. Instead, an RMSE of 0.614 is achieved with a single controller with the same structure described in (59), trained by running Step 1.1 with the same regularization parameters. When the sinusoidal reference is considered, the RMSEs attained with the PWA and LTI controllers are equal to 0.726 and 0.929, respectively. Therefore, despite the negligible difference in RMSE for the piecewise constant reference, the PWA controller clearly enhances the tracking performance for the sinusoidal reference.

Sensitivity analysis
As pointed out in Section 3, the performance of the learned controller might be influenced by the chosen regularization parameters λ, λ_aff, and λ_p. Indeed, the selection of an inappropriate value for λ_p might negatively influence the resulting switching logic of the controller, while improper choices of λ and λ_aff might deteriorate the properties of the estimated local controllers. In turn, wrong choices of these parameters might lead to a poor matching of the desired closed-loop behavior, resulting in a deterioration of the tracking performance and, in the worst case, in closed-loop instability. An a posteriori sensitivity analysis has thus been performed with respect to each regularization parameter. Different controllers are trained for different values of λ, λ_aff, and λ_p, by changing one tuning parameter at a time and fixing the others to the values selected in cross-validation. The performance of the resulting controllers is then evaluated in tracking the piecewise constant reference shown in Figure 6. The RMSEs reported in Figure 8 indicate that the performance of the learned controller is fairly insensitive to the choice of λ and λ_aff for λ, λ_aff ≤ 10, but it visibly deteriorates for λ > 100. A similar behavior is shown in Figure 9, entailing that a choice of λ_p ≤ 0.1 leads to negligible differences in the performance of the resulting controller. Instead, the trained switching policy results in deteriorated closed-loop tracking performance for λ_p > 0.1.
Furthermore, the outcome of Algorithm 1 depends on the chosen initial guess for the mode sequence 𝒮_c^0. As discussed above, to mitigate the effect of this choice, the algorithm can be run for N different choices of 𝒮_c^0, selecting the solution leading to the smallest cost. To assess the sensitivity of the method to N, Algorithm 1 is carried out for increasing values of this parameter, up to N = 700. As shown in Figure 10, the resulting cost and, thus, the controller do not change for N ≥ 11.
We further evaluate the computational time required to design the controller, namely, to run Algorithm 1 until one of the considered termination criteria is attained. This is done by training a controller with data sets of increasing length, obtaining the computational times reported in Figure 11. Even for a large data set (T = 10⁵), the CPU time required to train the controller is less than 10 seconds, demonstrating the computational efficiency of the proposed design strategy.
We finally assess the effect of an increasing number of training samples T on the closed-loop tracking performance.As shown in Figure 12, there is no significant improvement in performance for T ≥ 3000.

Numerical example: PWL reference model
Let the plant be a PWA system with K = 3 possible operating conditions, described by the difference equations in (60). A data set of T = 4750 samples is generated by simulating the behavior of the system in open loop, with {u(t)}_{t=1}^T being a sequence of randomly generated inputs with uniform distribution in the interval [−15, 15]. The output is corrupted by a zero-mean Gaussian noise with standard deviation σ_v = 0.8, which yields SNR = 16 dB. An additional data set comprising 750 samples is generated in similar conditions to perform cross-validation. The data used to learn the local controllers are normalized both in cross-validation and in training. Differently from the previous numerical example, in this case we aim at replicating the behavior of the bimodal PWL reference model in (61). Nonetheless, we still design the controller so that it is composed of K_c first-order linear local controllers with the structure in (59), for all i = 1, … , K_c. Note that, since the closed-loop system is forced to behave as a PWL system with partition induced by the conditions in (61c), the number of submodels cannot be easily fixed even if K is known. The number of local controllers is set to K_c⋆ = 4, according to the values of RMSE_u obtained in cross-validation and reported in Figure 13. As expected, the use of a single controller leads to a worse reconstruction of the input with respect to all cases where multiple controllers are used. At the same time, the use of a relatively high number of local controllers seems to lead to a deterioration in performance. According to the results of the cross-validation procedure, the other tuning parameters are chosen as λ⋆ = 10⁻⁴, λ_aff⋆ = 10⁸, and λ_p⋆ = 10³. Note that the selected value of λ_aff leads to θ_{i,3} ≈ 0 for all i = 1, … , K_c. Algorithm 1 is run for N = 500 different initial sequences, and the resulting controller is tested by simulating the interconnection between the plant and the designed controller, with the system output not corrupted by noise for better visualization. The closed-loop output obtained in tracking a piecewise constant reference is shown in Figure 14, where the reference signal is chosen to excite all possible operating conditions of the closed-loop system. By comparing the reference with the desired and actual closed-loop behaviors, it is clear that the attained closed-loop response follows the desired one. Indeed, according to the reference model in (61a), the actual closed-loop system converges faster to negative references, while the transient is slower when the reference signal is positive. It is worth pointing out that the RMSE achieved with a single controller is equal to 2.247, while the one attained with the PWA controller is 0.242, clearly indicating the need for a switching controller.

Numerical example: a nonlinear system
To assess how a PWA controller trained with the proposed approach performs when used to control nonlinear systems, let the plant be described by the equations in (62), with parameters p_1 = 1 and p_i = 0.5, for i = 2, … , 5. By assuming that our goal is to compensate the nonlinear behavior of the plant by closing the loop, the reference model is chosen as the following LTI first-order discrete-time model: x_M(t + 1) = 0.8x_M(t) + 0.2r(t). (63a) A set comprising T = 4000 input/output pairs is constructed by performing an open-loop simulation of the system, with the input chosen as the combination of sinusoids in (64). The measured output is corrupted by zero-mean white Gaussian noise with standard deviation σ_v = 0.3, so that SNR = 10.3 dB. An additional data set of 1000 samples is generated in similar conditions to perform cross-validation and select the number of local controllers and the remaining tuning parameters. The values of RMSE_u obtained in cross-validation are reported in Table 1. Based on these results, the number of affine controllers is set to K_c⋆ = 3 and, accordingly, the other tuning parameters are fixed to λ⋆ = 10⁻⁸, λ_aff⋆ = 10⁶, and λ_p⋆ = 10⁻⁷. The controller is then designed by running Algorithm 1 with the chosen tuning parameters, for N = 600 possible initial guesses of the mode sequence. By considering a noisy feedback output, with the noise having the same characteristics as the one altering the open-loop measurements, the closed-loop performance attained when tracking a sinusoidal reference is shown in Figure 15, along with the input fed to the system and the switching sequence of the controller. Due to the nonlinearity of the system and the noise, the controller switches very frequently. Nonetheless, the reference is tracked with satisfactory performance, attaining RMSE = 0.4321. Instead, an RMSE of 0.4800 is achieved when using a single affine controller, reflecting a delay in tracking the reference that can be noticed by looking at Figure 16. These results show that PWA controllers trained with the presented approach can be effectively used to control nonlinear systems, given the additional flexibility guaranteed by the estimation of the switching logic of the controller from data. Note that the difference between the reconstruction errors obtained for K_c = 2, 3, 4 and reported in Table 1 is negligible. Therefore, controllers with up to five local controllers are trained, by fixing the tuning parameters as the ones leading to the values of RMSE_u reported in Table 1. When tracking the same sinusoidal reference considered in Figure 15, the RMSEs attained with the different controllers in case of uncorrupted feedback outputs are reported in Table 2. It can clearly be noticed that worse performance is attained for K_c ≤ 2 and K_c ≥ 4, showing that the number of local controllers chosen through cross-validation is the one leading to good closed-loop performance.

A benchmark case study: the servo-positioning system
As a more realistic case study, we investigate the same problem considered in 48, namely, the control of a servo-positioning system with an unbalanced disc. 63 The considered system is characterized by viscous friction and by the presence of an additional mass that, mounted on the disk connected to the rotor, makes the mass distribution over the disk inhomogeneous. Its mathematical model is given in (65). It is worth pointing out that our approach can be used to tackle the considered problem, although the model in (65) is nonlinear, thanks to the universal approximation properties of PWA maps and the flexibility guaranteed by the fact that the switching sequence of the controller is not fixed a priori.

As in 48, an ideal zero-order-hold (ZOH) setting is considered, where the input and output sampling are synchronized with sampling period T_s = 0.01 seconds. The output measurements are further assumed to be filtered via an ideal anti-aliasing filter.
We build two data sets, comprising 1500 and 250 samples, respectively used for training and cross-validation. Both sets are constructed by simulating the mathematical model in (65) in the presence of a filtered white input voltage uniformly distributed in the interval [−8, 8], with the input filter being a low-pass digital filter with a cutoff frequency of 1.1 Hz. As in 48, the output measurements are corrupted by a white zero-mean Gaussian noise sequence v(t), with variance selected so that SNR = 43 dB.
The reference model is chosen as the same LTI first-order discrete-time model considered in 48, ie, x_M(t + 1) = 0.99x_M(t) + 0.01r(t), (66a) which is characterized by a pole at about 0.15 Hz. The structure of the controller is given in (67): we thus aim at designing an eighth-order controller with integral action, where the latter is introduced to ensure zero steady-state error. The discrete state s_c(t) of the controller is driven by the switching logic in (67), with Π(t) = [y(t − 1) y(t − 2) y(t − 3) y(t − 4)]′, (67d) which is inspired by the structure of the LPV controller trained in 48. This means that the PWA controller changes its operating condition based on the shaft angle only, which is reasonable considering the dynamics in (65). Similar considerations to those made in the sensitivity analysis hold for the number N of possible initial conditions. Furthermore, a PWA controller can be effectively learned with data collected for SNR down to ∼16 dB, while efficient LPV controllers cannot be designed according to 48 for SNR lower than 43 dB, due to the quasi-LPV nature of the system. Indeed, as shown in Figure 20, the new controller still allows the reference signal to be tracked, despite the noise corrupting the output.
All of the above results show that the proposed approach can be effectively used for the direct data-driven control of nonlinear systems, since the discrete state of the plant is assumed to be unknown.

CONCLUSIONS
In this paper, we have introduced a novel approach for the model-reference design of PWA controllers, which relies on input/output data only and is tailored to scenarios in which the operating condition of the controlled system is unknown. By leveraging recent results on switching system identification, the method relies on the formulation of an optimization problem in the design parameters. The problem is solved via a coordinate descent method alternating between the solution of the problem with respect to each optimization variable, so that both the discrete and continuous dynamics of the PWA controller are reconstructed from data. The controller is designed under the assumption that the system to be controlled is PWA. Nonetheless, this hypothesis does not restrict the effectiveness of the proposed approach to this class of systems.
The method proposed in this paper is aimed at laying the foundations for future research on the direct data-driven control of hybrid systems when only input/output data are available. Therefore, much work still has to be done to make the approach competitive with state-of-the-art model-based approaches. Future research in this direction will include the analysis of data-driven closed-loop stability conditions, the optimal selection of the reference model, and the optimal tuning of the number of local controllers and of the regularization parameters.

FIGURE 1
FIGURE 1 Polyhedral partition of the space driving the change in the operating condition of the system in (6) [Colour figure can be viewed at wileyonlinelibrary.com]

FIGURE 3
FIGURE 3 Example of three linearly separable clusters and the associated piecewise affine separator

Assumption 5.1
Let the mode sequence {s_c^{k−1}(t)} used to solve problem (37) be different from the sequence {s_{o,c}(t)} satisfying Assumption 1. Denote with M = {t : s_c^{k−1}(t) ≠ s_{o,c}(t)} the set of misclassified points and with #m ∈ ℕ the cardinality of M. It holds that #m ≪ T, with T being the length of the data set.

FIGURE 4
FIGURE 4 Linear time-invariant reference model: RMSE_u(K_c) vs the number of local controllers K_c. RMSE, root-mean-square error

FIGURE 5
FIGURE 5 Linear time-invariant reference model: estimated polyhedral partition of the domain

FIGURE 11
FIGURE 11 Linear time-invariant reference model: CPU time vs the length of the data set. The computational time reported is the mean CPU time required to design the controller over the N = 20 runs of Algorithm 1

FIGURE 14
FIGURE 14 Piecewise linear reference model: tracking a piecewise constant signal. [Top left panel] Reference (dashed black) vs desired closed-loop behavior (dotted-dashed blue) and actual closed-loop response (red); the black, red, and blue lines are almost overlapping. [Top right panel] Active mode sequence of the piecewise affine controller. [Lower left panel] Discrete state of the system. [Lower right panel] Active mode sequence of the reference model

FIGURE 15
FIGURE 15 Artificial nonlinear system. [Top left panel] Sinusoidal reference (black) vs actual (red) and desired (dashed blue) closed-loop responses. [Top right panel] Input fed to the system in closed loop. [Lower panel] Active mode sequence of the piecewise affine controller

K_c          1        2        3        4        5
RMSE(K_c)    0.3721   0.4477   0.2416   0.3164   1.2142

In (65), V(t) [V] and I(t) [mA] denote the voltage and the current over the armature, respectively, θ(t) [rad] is the angular position of the disc, ω(t) [rad/s] is its angular velocity, and the remaining parameters are reported in Table 3.

FIGURE 19
FIGURE 19 Servo-positioning system. Desired closed-loop response (black) vs the closed-loop output attained with the piecewise affine controller (red) and with a linear parameter-varying controller trained with the method proposed in 48 (dashed blue)

TABLE 1
Artificial nonlinear system. RMSE_u vs the number of local controllers K_c

TABLE 2
Artificial nonlinear system. Tracking a sinusoidal reference: root-mean-square error (RMSE) vs the number of local controllers K_c

TABLE 3
Physical parameters of the DC motor