Robust Fault Detection Based on l1 Regularization

Herein, an l1 regularization-based fault detection technique for stochastic discrete-time systems in state-space form is discussed. Whereas a fault is deterministic in nature and usually causes an abrupt change, the state of a system evolves smoothly in most cases, and the disturbances in the process and in the sensor measurements are stochastic. Modeling uncertainty in the state-space model can induce a deterministic bias, which can be combined with the fault. In the fault detection community, modeling uncertainty is called a multiplicative fault and an abruptly changing fault is called an additive fault. Inspired by the fact that the l2 norm is useful for estimating smoothly evolving states and reducing bias, whereas the l1 norm is useful for detecting abrupt changes with a sparse structure, this research develops a technique for detecting faults that change abruptly under stochastic disturbance and modeling uncertainty. The l2 norm for bias compensation (due to modeling uncertainty and/or additive fault) is combined with the l1 norm for fault detection (especially of abruptly changing faults), and the two norms are set up as a regularization problem. The regularization problem is convex; thus, a global solution can be found efficiently.


Introduction
Recently, intensive attention has been drawn to fault detection techniques due to ever-increasing requirements on the reliability of complex dynamic systems. Fast and reliable fault detection can prevent catastrophic disasters. Thus, a vast amount of research effort has been devoted to this area and many problems have been tackled. [1][2][3][4][5][6][7][8][9][10] However, technical developments outside the fault detection community, such as in machine learning and statistics, have provided new insights into the nature of faults, and these have triggered this research. For example, sparsity, which has been an important topic in estimating system states and in controlling a system with minimal actuator operation, [11][12][13][14] can represent the nature of an abruptly changing fault. The theory of sparsity has been developed under the name of $l_1$-regularized least squares [15] or LASSO, [11] and it has been applied to machine learning, signal/image processing, seismic engineering, and so on. The $l_1$ regularization technique is used to force most of the decision variables to be equal to zero. It is useful for fault detection because the fault signal is mostly zero under normal operation but abruptly becomes nonzero when a fault occurs. Whereas a fault is deterministic in nature and usually causes an abrupt change, the state of a system evolves smoothly in most cases, and the disturbances in the process and in the sensor measurements are stochastic. Modeling uncertainty in the state-space model can induce a deterministic bias, which can be combined with the fault. In the fault detection community, modeling uncertainty is called a multiplicative fault and an abruptly changing fault is called an additive fault.
Inspired by the fact that the $l_2$ norm is useful for estimating smoothly evolving states and reducing bias, whereas the $l_1$ norm is useful for detecting abrupt changes with a sparse structure, this research develops a technique for detecting faults that change abruptly under stochastic disturbance and modeling uncertainty. In the system identification literature, this is known as the bias-variance tradeoff problem, and in statistical learning it is known as the overfitting problem. The $l_2$ norm for bias compensation (due to modeling uncertainty and/or additive fault) is combined with the $l_1$ norm for fault detection (especially of abruptly changing faults), and the two norms are set up as a regularization problem through a regularization parameter.
In this research, the disturbance is Gaussian. Under Gaussian noise, state estimation using the Kalman filter (KF) is well known to be the best linear estimator. The KF solves an optimization problem that minimizes the process and measurement noise, expressed in the form of the $l_2$ norm. To detect faults under Gaussian noise and modeling uncertainty, the $l_2$ and $l_1$ norms are linearly combined through a regularization parameter. Thus, the robust fault detection technique is formulated as a convex optimization problem, so that a global solution is guaranteed.
In Section 2, a discrete-time system in state-space form under Gaussian process and measurement noise with additive fault (modeling uncertainty, abrupt change) is introduced. In Section 3, $l_1$ regularization is formulated to detect faults under noise. The formulation results in a convex optimization problem; hence, it can be solved effectively with open software. In Section 4, three simulations are used for demonstration: a DC motor, wind turbines, and robot arm position control. The article ends with a conclusion in Section 5.
The mathematical notation is standard: a variable $v$ having a normal distribution with zero mean and covariance matrix $M$ is denoted as $v \sim \mathcal{N}(0, M)$. If needed, clarification is provided as we proceed.

Problem Setup
The discrete-time system of Equation (1) is given in state-space form, which is useful for designing various kinds of controllers, such as linear quadratic Gaussian and model predictive control.
Each vector and matrix has appropriate dimensions. Here, $f$ is the fault signal, $w$ is the zero-mean white Gaussian process noise, and $e$ is the zero-mean white Gaussian measurement noise; $w$ and $e$ are independent of each other. The covariance matrix of $w$ is $Q_w(k)$ and that of $e$ is $Q_e(k)$. Thus, using standard notation, $w \sim \mathcal{N}(0, Q_w(k))$ and $e \sim \mathcal{N}(0, Q_e(k))$. If $B_k = G_k$, then $f$ can be considered an actuator fault. In this research, modeling uncertainty is considered, which means $A_k \leftarrow (A + \Delta A)_k$, where $\Delta A$ represents the modeling uncertainty; the same concept applies to the matrices $B_k$, $G_k$, and $C_k$. The modeling uncertainty term can easily be absorbed into the fault term $f(k)$ by slightly modifying (1). [2] Without loss of generality, $f$ represents a process fault that affects the state change (abruptly or gradually). Thus, in this research, the fault $f(k)$ includes modeling uncertainty and abruptly changing faults. The modeling uncertainty induces bias and acts as an additive fault; it has a deterministic characteristic. The abrupt change can represent a sudden change in the control signal and/or a stuck actuator. [5] It can be considered a sparse model, which can be detected using the $l_1$ regularization method.
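To make the setup concrete, the following minimal Python sketch simulates a scalar instance of such a system. It assumes that Equation (1) has the standard additive-fault form $x(k+1) = A_k x(k) + B_k u(k) + G_k f(k) + w(k)$, $y(k) = C_k x(k) + e(k)$, which is consistent with the symbols defined above. The function name `simulate` and all numerical values (system coefficients, noise variances, fault time and size) are illustrative assumptions and are not taken from the article.

```python
import random

def simulate(n_steps=200, fault_time=100, fault_size=1.0, seed=0):
    """Simulate x(k+1) = a*x(k) + b*u(k) + g*f(k) + w(k), y(k) = c*x(k) + e(k)."""
    rng = random.Random(seed)
    a, b, g, c = 0.9, 0.5, 1.0, 1.0   # hypothetical scalar A, B, G, C
    q_w, q_e = 0.01, 1.0              # hypothetical noise variances Q_w, Q_e
    x = rng.gauss(0.0, 1.0)           # initial state x(1) ~ N(0, 1)
    f, y = [], []
    for k in range(1, n_steps + 1):
        u_k = rng.gauss(0.0, 1.0)                       # random excitation
        f_k = fault_size if k == fault_time else 0.0    # sparse, abrupt fault
        y.append(c * x + rng.gauss(0.0, q_e ** 0.5))    # noisy measurement
        f.append(f_k)
        x = a * x + b * u_k + g * f_k + rng.gauss(0.0, q_w ** 0.5)
    return f, y

fault, output = simulate()
nonzero = [k + 1 for k, f_k in enumerate(fault) if f_k != 0.0]
print("fault is nonzero at k =", nonzero)
```

The fault sequence is zero everywhere except at a single sample, which is the sparse structure that the $l_1$ term is meant to capture, while the measured output mixes this single jump with the input and both noise sources.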

Robust Fault Detection Based on l1 Regularization
In this section, a robust fault detection technique is developed by applying the $l_1$ regularization method to the system of Equation (1). It is formulated as a convex optimization problem, which leads to an efficient global solution.
With the measurements (input $u$ and output $y$), the quality of the state estimate is evaluated using the well-known criterion of fit, which is minimized over $x(1)$ and $w(k)$ (Equation (2)). Here, for a vector $v = [v_1\ v_2\ \cdots\ v_{n_v}]^T$, $\|v\|_p \triangleq \left(\sum_{i=1}^{n_v} |v_i|^p\right)^{1/p}$, which is known as the $l_p$ norm. This minimization problem computes the state estimate $\hat{x}(k)$ for a given $x(1)$. To avoid overfitting caused not only by noise but also by faults, the minimization problem is reformulated using the regularization method, now minimizing over $x(1)$, $w(k)$, and $f(k)$ (Equation (3)), where $\lambda$ is a positive number that defines the amount of regularization; the regularization term penalizes state changes. In statistical learning, this regularization method is called the bias-variance tradeoff. [16] In this research, the $l_1$ norm is applied for regularization because of its power to capture sparsity. [15] It is also useful for detecting faults because the fault is often zero under normal operation, but when a fault occurs, its value suddenly changes to nonzero.
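The sparsity-promoting effect of the $l_1$ norm can be seen in a scalar toy problem. The sketch below is an illustration only and is not the article's Equation (3): it implements the $l_p$ norm defined above and compares the closed-form minimizers of $(r - f)^2 + \lambda |f|$ and $(r - f)^2 + \lambda f^2$ for a few residual values $r$. The function names and the sample numbers are assumptions made for the example.

```python
def lp_norm(v, p):
    """l_p norm: (sum_i |v_i|^p)^(1/p)."""
    return sum(abs(v_i) ** p for v_i in v) ** (1.0 / p)

def l1_scalar_min(r, lam):
    """Minimizer of (r - f)^2 + lam*|f|: soft-thresholding at lam/2."""
    if abs(r) <= lam / 2.0:
        return 0.0
    return r - lam / 2.0 if r > 0 else r + lam / 2.0

def l2_scalar_min(r, lam):
    """Minimizer of (r - f)^2 + lam*f^2: proportional shrinkage."""
    return r / (1.0 + lam)

residuals = [0.1, -0.2, 3.0, 0.05]   # one large entry, as after an abrupt fault
lam = 1.0
f_l1 = [l1_scalar_min(r, lam) for r in residuals]
f_l2 = [l2_scalar_min(r, lam) for r in residuals]
print("l1 estimate:", f_l1)   # small entries are set exactly to zero
print("l2 estimate:", f_l2)   # every entry is shrunk but stays nonzero
print("l1 and l2 norms of [3, -4]:", lp_norm([3.0, -4.0], 1), lp_norm([3.0, -4.0], 2))
```

With the $l_1$ penalty, only the large residual survives as a nonzero estimate, whereas the squared penalty spreads small nonzero values over every sample. This is why the $l_1$ term is suited to a fault signal that is zero most of the time.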
One key feature of Equation (3) is that it is a convex optimization problem, whose global solution can be found in $O(N)$.
The tuning parameter $\lambda$ should be properly determined to optimize the solution. In statistical learning, [11] the estimated parameter sequence (in this research, $f(k)$) as a function of the regularization parameter $\lambda$ is called the regularization path. If $\lambda \ge \lambda_{\max}$, then the estimated parameter sequence is identically zero, that is, $f(k) = 0$. $\lambda_{\max}$ is called the critical parameter value. It is useful for finding a good starting value when determining a suitable value of $\lambda$.

$\lambda_{\max}$ can be readily computed based on convex analysis as follows. First, $x(k)$, $k = 1, \ldots, N$, is obtained by disregarding $w(k)$ (Equation (4)). The no-fault residual signal is then defined using Equation (4) (Equation (5)). Using Equation (4) and (5), Equation (3) can be rewritten as a minimization over $x(1)$, $w(k)$, and $f(k)$ (Equation (6)), whose gradient is given by Equation (7). Evaluating Equation (7) at $f(k) = 0$ ($k = 1, \ldots, N-1$) to find $\lambda_{\max}$ yields Equation (8). The gradient of the second term of Equation (6) disappears, and the gradient of the third term of Equation (6) is given by Equation (9). With Equation (8) and (9) at $f(k) = 0$ ($k = 1, \ldots, N-1$), the minimization condition of Equation (6) can be organized as Equation (10). It is well known that the dual norm of the $l_1$ norm is the $l_\infty$ norm, and it should lie in the unit ball. Applying this concept to the left-hand side of Equation (10) gives Equation (11). In addition, the dual norm of the $l_2$ norm is the $l_2$ norm. Applying the dual norm to both sides of Equation (10), $\lambda_{\max}$ is obtained as Equation (12). Equation (12) can be used to calculate a reasonable starting value, $0.01\lambda_{\max}$, for tuning the optimization problem of Equation (6).
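The role of the critical value can be illustrated on a simplified problem. The Python sketch below is not a transcription of Equation (12): it assumes a scalar system $x(k+1) = a\,x(k) + f(k)$ with known $x(1) = 0$, no process noise, and a scalar fault, so that the no-fault residual $r$ satisfies $r \approx Hf$ for an impulse-response matrix $H$. For the resulting problem $\min_f \|r - Hf\|_2^2 + \lambda \|f\|_1$, the standard result is $\lambda_{\max} = \|2H^T r\|_\infty$. The names `impulse_matrix`, `lambda_max`, and `objective`, and the numbers used, are hypothetical.

```python
def impulse_matrix(a, n):
    """H[k][j] is the effect of f(j+1) on x(k+1) when x(1) = 0 and x(k+1) = a*x(k) + f(k)."""
    return [[a ** (k - 1 - j) if j < k else 0.0 for j in range(n - 1)]
            for k in range(n)]

def lambda_max(H, r):
    """Smallest lam for which f = 0 minimizes ||r - H f||_2^2 + lam * ||f||_1."""
    # Gradient of the fit term at f = 0 is -2 * H^T r.
    grad_at_zero = [-2.0 * sum(H[k][j] * r[k] for k in range(len(r)))
                    for j in range(len(H[0]))]
    return max(abs(g) for g in grad_at_zero)   # l_inf norm, dual to the l_1 norm

def objective(H, r, f, lam):
    """Value of ||r - H f||_2^2 + lam * ||f||_1."""
    fit = sum((r[k] - sum(h * v for h, v in zip(H[k], f))) ** 2
              for k in range(len(r)))
    return fit + lam * sum(abs(v) for v in f)

a, n = 0.5, 6
H = impulse_matrix(a, n)
r = [0.0, 0.0, 0.0, 1.0, 0.5, 0.25]     # no-fault residual left by a unit fault f(3)
lam_max = lambda_max(H, r)
print("lambda_max =", lam_max)
print("starting value 0.01 * lambda_max =", 0.01 * lam_max)
```

At $\lambda = \lambda_{\max}$ no small change of any single $f(k)$ away from zero lowers `objective`, whereas at $0.01\lambda_{\max}$ a small positive $f(3)$ already does, which is the behavior the critical value is meant to separate.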
There is a technique called reweighted $l_1$ minimization [13] which yields an estimate of $f(k)$ with more zeros at the cost of a slight increase in the criterion-of-fit term. With this idea, Equation (6) is modified by inserting a positive weighting term $\alpha(k)$, again minimizing over $x(1)$, $w(k)$, and $f(k)$ (Equation (13)). For the first optimization stage ($i = 1$), $\alpha^{(i)}(k)$ is initialized as 1 and Equation (13) is minimized to find the optimal $f^{(i)}(k)$. Then, to find a better $f^{(i)}(k)$ with more zeros than the previous one, $\alpha^{(i)}(k)$ is reweighted according to Equation (14), where $\epsilon$ is a positive number. When convergence is observed, the iteration stops and the optimal $f(k)$ is obtained. Equation (13) is a convex optimization problem, and several open software packages are available for it, such as CVX, [17] YALMIP, and l1_ls. [18,19] The main algorithm is summarized as follows.

Algorithm 1: Given $A_k$, $B_k$, $C_k$, $G_k$, $Q_w(k)$, $Q_e(k)$, $\epsilon$, $\delta$, and $\{y(k), u(k)\}_{k=1}^{N}$:
Step 1 (Initialization): Find $\lambda_{\max}$ with Equation (12) and set $\lambda = 0.01\lambda_{\max}$ and $\alpha^{(i)}(k) = 1$ ($i = 1$).
Step 4: Increase $i$ to $i + 1$ and jump to Step 2. If $\|f^{(i)}(k) - f^{(i-1)}(k)\|_2 \le \delta$, indicating convergence, then stop. If not, jump to Step 3.
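The reweighting loop can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the article's implementation: the system is scalar, $x(k+1) = a\,x(k) + f(k)$ with $x(1) = 0$ and no process noise, so the problem reduces to a weighted $l_1$-regularized least-squares problem $\min_f \|r - Hf\|_2^2 + \lambda \sum_k \alpha(k)|f(k)|$; the inner problem is solved by plain coordinate descent instead of CVX, YALMIP, or l1_ls; and the weight update is assumed to be the usual rule of reweighted $l_1$ minimization, $\alpha(k) = 1/(\epsilon + |f(k)|)$. All function names and numerical values are hypothetical.

```python
import random

def soft(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if abs(z) <= t:
        return 0.0
    return z - t if z > 0 else z + t

def weighted_lasso(H, r, lam, alpha, sweeps=100, tol=1e-10):
    """Coordinate descent for min_f ||r - H f||_2^2 + lam * sum_j alpha[j]*|f[j]|."""
    n_rows, n_cols = len(H), len(H[0])
    f = [0.0] * n_cols
    res = list(r)                                    # running residual r - H f
    col_sq = [sum(H[k][j] ** 2 for k in range(n_rows)) for j in range(n_cols)]
    for _ in range(sweeps):
        largest_step = 0.0
        for j in range(n_cols):
            rho = sum(H[k][j] * res[k] for k in range(n_rows)) + col_sq[j] * f[j]
            new = soft(rho, lam * alpha[j] / 2.0) / col_sq[j]
            step = new - f[j]
            if step != 0.0:
                for k in range(n_rows):
                    res[k] -= H[k][j] * step
                f[j] = new
                largest_step = max(largest_step, abs(step))
        if largest_step < tol:
            break
    return f

def reweighted_l1(H, r, lam, eps=0.01, delta=1e-6, max_iter=10):
    """Solve, update alpha = 1/(eps + |f|), and repeat until f stops changing."""
    alpha = [1.0] * len(H[0])                        # first pass: alpha = 1
    f_prev = None
    for i in range(1, max_iter + 1):
        f = weighted_lasso(H, r, lam, alpha)
        if f_prev is not None:
            change = sum((p - q) ** 2 for p, q in zip(f, f_prev)) ** 0.5
            if change <= delta:                      # convergence test on f
                break
        alpha = [1.0 / (eps + abs(v)) for v in f]    # assumed reweighting rule
        f_prev = f
    return f, i

a, n, fault_index = 0.5, 30, 14                      # unit fault at f(15)
rng = random.Random(1)
H = [[a ** (k - 1 - j) if j < k else 0.0 for j in range(n - 1)] for k in range(n)]
f_true = [1.0 if j == fault_index else 0.0 for j in range(n - 1)]
r = [sum(H[k][j] * f_true[j] for j in range(n - 1)) + rng.gauss(0.0, 0.05)
     for k in range(n)]
lam_max = max(abs(2.0 * sum(H[k][j] * r[k] for k in range(n))) for j in range(n - 1))
f_hat, n_iter = reweighted_l1(H, r, 0.01 * lam_max)
detected = max(range(n - 1), key=lambda j: abs(f_hat[j]))
print("largest estimated fault at k =", detected + 1, "after", n_iter, "passes")
print("number of nonzero entries:", sum(1 for v in f_hat if v != 0.0))
```

Entries that are small after one pass receive a large weight in the next pass and are pushed to exactly zero, while the entry at the fault time keeps a weight close to one and is barely shrunk.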

Simulation
In this section, the previous algorithm is applied to a DC motor system and a wind turbine system to illustrate its effectiveness in fault detection. The first system is the DC motor system in the study by Gustafson and Graebe. [20] With a sampling time of 0.1 s for the measurement of motor velocity and angle, the discrete-time DC motor system is given by Equation (15), where $u(k) \sim \mathcal{N}(0, 1)$, $e(k) \sim \mathcal{N}(0, 1)$, and $x(1) \sim \mathcal{N}(0, I)$. An arbitrary fault signal $f(k) = [0\ 1]$ is injected only at $k = 100$. This kind of fault can represent an unexpected sudden movement of the DC motor due to an interruption from the power line or surrounding electromagnetic noise. From Equation (12), $\lambda_{\max}$ is found to be 31.4888. For the optimization problem (13), solved with Algorithm 1, several parameters are set. Figure 1 shows the measured motor angle. The fault is injected at $k = 100$, but it is hard to notice any change in the output measurement.
www.advancedsciencenews.com www.advintellsyst.com
Figure 2 shows the fault signal detected by applying the proposed technique. As shown in the figure, the abrupt fault occurrence is correctly detected, and no fault is indicated elsewhere. This clearly shows the sparse nature of the fault and the sharp change detection enabled by $l_1$ regularization.
The second illustration concerns fault detection in wind turbines. [21] The benchmark model was used for a competition on fault detection and fault-tolerant controller design for the pitch system, drive train, generator, and converter system of wind turbines. Among the eight given fault scenarios, "Fault 6" is chosen to show the detection of an abrupt change in the pitch parameters. The transfer function of the pitch actuator system, which comes from Equation (6) in the study by Odgaard et al., [21] is

$$\frac{\beta_{m2}(s)}{\beta_r(s)} = \frac{\omega_n^2}{s^2 + 2\xi\omega_n s + \omega_n^2} \qquad (16)$$

where $\beta_{m2}$ is the pitch angle measured by sensor 2, $\beta_r$ is the reference pitch angle, and the parameters $\xi$ and $\omega_n$ represent the pressure from the air content. The pressure drop due to faulty air content is expressed by the change in parameter values

$$\begin{cases} \xi = 0.6,\ \omega_n = 11.11; & \text{no fault} \\ \xi = 0.45,\ \omega_n = 5.73; & \text{fault (air pressure drops)} \end{cases} \qquad (17)$$

With the given sampling time of 0.01 s, the continuous-time transfer function of Equation (16) can be transformed into a discrete-time one using the no-fault parameter values of Equation (17) (Equation (18)). The transformation of Equation (18) to state-space form is easily done with Matlab (Equation (19)). The autoregressive (AR) form of Equation (18) is given by Equation (20). Similarly, for the faulty case, the equations corresponding to (18)-(20) are Equations (21)-(23). From Equation (20) and (23), it is noted that the coefficients of the AR form change due to the fault. Thus, detecting the change in the coefficients amounts to detecting the fault. For unified labeling of the following figures, the coefficients of the AR form are defined as $b_1$, $b_2$, $a_1$, $a_2$:

$$\beta_{m2}(k) = b_1 \beta_{m2}(k-1) + b_2 \beta_{m2}(k-2) + a_1 \beta_r(k-1) + a_2 \beta_r(k-2) \qquad (24)$$

The fault due to the air pressure drop is injected at $k = 100$ by the transition from Equation (18) to (21). With $\beta_r \sim \mathcal{N}(0, 10)$ and Equation (19), the output data $\beta_{m2}$ are collected and Algorithm 1 is applied.
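How the AR coefficients depend on $\xi$ and $\omega_n$ can be sketched as follows. The Python code below assumes that the pitch actuator is the standard second-order model $\omega_n^2 / (s^2 + 2\xi\omega_n s + \omega_n^2)$ and that the discretization is a zero-order hold with sampling time 0.01 s; the article only states that the transformation is done with Matlab, so the hold method is an assumption. The function name `pitch_ar_coefficients` is hypothetical, and the printed values are not claimed to reproduce the article's discrete-time equations.

```python
import math

def pitch_ar_coefficients(xi, omega_n, T=0.01):
    """Zero-order-hold discretization of omega_n^2 / (s^2 + 2*xi*omega_n*s + omega_n^2).

    Returns (b1, b2, a1, a2) in the labeling of the AR form
    beta_m2(k) = b1*beta_m2(k-1) + b2*beta_m2(k-2) + a1*beta_r(k-1) + a2*beta_r(k-2).
    Valid for the underdamped case 0 < xi < 1.
    """
    sigma = xi * omega_n                           # decay rate of the poles
    omega_d = omega_n * math.sqrt(1.0 - xi ** 2)   # damped natural frequency
    decay = math.exp(-sigma * T)
    b1 = 2.0 * decay * math.cos(omega_d * T)       # sum of the two discrete poles
    b2 = -decay ** 2                               # minus the product of the poles
    # The unit-step response at t = T gives the first input coefficient.
    a1 = 1.0 - decay * (math.cos(omega_d * T)
                        + sigma / omega_d * math.sin(omega_d * T))
    # Unit DC gain, (a1 + a2) / (1 - b1 - b2) = 1, fixes the second one.
    a2 = 1.0 - b1 - b2 - a1
    return b1, b2, a1, a2

no_fault = pitch_ar_coefficients(xi=0.6, omega_n=11.11)
faulty = pitch_ar_coefficients(xi=0.45, omega_n=5.73)
for name, coeffs in (("no fault", no_fault), ("fault", faulty)):
    print(name, ["%.5f" % c for c in coeffs])
```

All four coefficients differ between the two parameter sets, so an abrupt change in the estimated coefficients at some sample indicates the time at which the air pressure drop occurred.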
The following parameter values are used for this simulation: $\epsilon = 0.01$, $\lambda_{\max} = 66.6745$, $N = 200$. Figure 3 shows the pitch angle measurement. The pitch system transition from Equation (18) to (21) due to the fault occurred at sample time $k = 100$. It is hard to recognize the fault from the direct measurement of the output shown in Figure 3, but using the proposed technique, the fault occurrence is easy to identify. Figures 4 to 7 show the parameter changes due to the fault. For example, Figure 4 shows that the value of $b_1$ changes at $k = 100$. Figure 8 shows the sample time at which the fault occurred.

The proposed algorithm is also applied to experimental data originating from the study by Torfs et al. [22] The data file was generated by an experiment on robot arm position control. Between sample times 100 and 103, with a sampling period of 0.02 s, an abruptly changing fault occurred, as shown in Figure 9.
As shown in Figure 10, the fault is correctly detected using the method proposed in this article.

Conclusion
In this proposal, the fault detection problem is formulated based on the $l_1$ regularization method for a system with modeling uncertainty and Gaussian disturbance. The modeling uncertainty is reformulated as an additive fault and is understood as inducing bias. Gaussian disturbance requires the designed algorithm to perform like the Kalman filter, which is optimal in the least-squares sense. In addition, the algorithm is required to detect faults that change abruptly in the system. The proposed algorithm shows that all the requirements described earlier are satisfied by setting the problem in mixed