Hammerstein system with a stochastic input of arbitrary/unknown autocorrelation: Identification of the dynamic linear subsystem

The National Taipei University, Grant/Award Number: 2016‐NTPU‐ORDA‐11 Abstract For a Hammerstein system subject to a stochastic input that is spectrally coloured, this study is first in the open literature (to the present authors' best knowledge) to estimate its linear dynamic subsystem. This estimation is achieved without any prior knowledge nor any prior/simultaneous estimation of the preceding non‐linear static subsystem. This proposed estimator can handle any temporally self‐correlated input despite its potentially unknown spectrum, unknown variance and unknown mean— unlike the common assumption that the input is white and zero‐mean. This proposed estimator needs observations only of the Hammerstein system's overall input and consequential output, but not any observation of any intrasubsystem signal. Furthermore, this proposed estimator can handle a linear subsystem whose input and/or output are each corrupted additively by stationary (and possibly coloured) noises of unknown probability distributions, of unknown non‐zero means and of unknown autocovariances. The proposed estimate is analytically proved herein as asymptotically unbiased and as pointwise consistent; and the estimate's finite‐sample convergence rate is also derived analytically.


| INTRODUCTION
A Hammerstein system consists of two sequential subsystems: (i) a non-linear, static (memoryless) subsystem, followed by (ii) a linear, dynamic, time-invariant, asymptotically stable subsystem. Please refer to Figure 1 for a schematic showing these two subsystems and their associated signals, which will be described in great details in Section 2.

| A spectrally coloured input in system identification
Some engineering systems' input signals are necessarily coloured in their spectra: Examples include wireless telecommunication systems [33][34][35][36][37][38]. Likewise, a biomedical/chemical/mechanical/ thermal system in the real world often has a physical input that would necessarily be spectrally coloured, without the luxury of a white spectrum. The excited input 'signal' often is not an electric voltage or current, whose temporal properties can be readily adjusted, but a physical process in the industrial scale that leaves little room for discretionary adjustment. Furthermore, an input's coloured passband spectrum could facilitate the system be identified for only the specific frequency-subband over which the system operates, rather than over a wider frequency band that is practically irrelevant for the system's subsequent operation. For example, consider a wireless transmitter required to transmit over only a limited frequency band; that transmitter's and the communication channel's behaviour at that applicable frequency band would be more exactly estimated by using a spectrally coloured input with energy concentrated only over that frequency band of interest, rather than by using a white input whose energy could be dispersed also over the out-of-band frequencies.
Among the many existing estimation algorithms [39][40][41][42][43][44][45][46][47][48][49][50][51] of an Hammerstein system's linear dynamic subsystem, all of them explicitly deal with only a spectrally white input. 1 This study alone affronts any coloured spectrum in the input time series, that is, the input time series' autocorrelation function may be arbitrary and/or unknown to the estimator, so long if it is a finite-power. 2

| Contribution of this work
The proposed estimator can handle the following accommodating models of the Hammerstein system, its coloured input and its unobservable internal perturbation: (i) The unknown non-linear static subsystem may be any function that is piecewise differentiable and magnitudebounded-thus beyond the confine of any parametric class. This non-linearity may be non-invertible. (ii) The unknown linear dynamic subsystem's impulse response may have any finite order, which may remain unknown. 3 Furthermore, this subsystem could be noncausal, with up to Q number of anticausal taps in its impulse response. 4 (iii) The Hammerstein system's input signal is modelled as random, wide-sense stationarity, Gaussian-distributed 5 with a (possibly) unknown variance, a (possibly) non-zero unknown mean 6 and as possibly coloured with an unknown frequency spectrum. 7 (iv) In between the Hammerstein system's two subsystems, there could be additive disturbance modelled as an unobservable wide-sense stationary random sequence of {Z n } 8 of possibly non-Gaussian probability distribution, a (possibly) non-zero mean 9 that may be a priori unknown, a (possibly) unknown variance and a (possibly) coloured self-correlation (which may also be unknown). 1011 (v) The linear subsystem's output may be disturbed additively, before the observations are made, by a wide-sense stationary noise sequence {P n } 12 of a (possibly) non-zero mean 13 (that may be unknown), a (possibly) unknown variance 14 and a (possibly) coloured 15 self-correlation (that may also be possibly unknown). 16 This work will also analytically prove the following: (a) The proposed estimate f b h i g is asymptotically unbiased and statistically consistent, regardless of whether the linear Though [47,50,51] explicitly presume the input to be spectrally white, their methods could actually handle spectrally coloured input, though perhaps unrecognized by their authors as so. However, all these four estimators of the Hammerstein linear dynamic subsystem require first the estimation of the non-linear static subsystem-a hassle avoided in the present study. 2 Incidentally, for a block-based system's identification using a spectrally coloured input as system excitation-that has been achieved also for a Wiener system in [52,53], but those methods are inapplicable to the present Hammerstein system. The Wiener system and the Hammerstein system, of course, differ by how they sequentially link the non-linear subsystem and the linear subsystem-a Wiener system has linear then non-linear, whereas a Hammerstein system has non-linear then linear. Such a reversed sequencing makes fundamental differences in the system's internal dynamics and thus in the systemidentification strategy. At a Wiener system, the non-linear subsystem's unobservable input signal is effectively first 'colourized' by the preceding linear subsystem, whether the Wiener system's overall observable input is itself spectrally coloured. In contrast, an Hammerstein system's non-linear subsystem is excited by an observed input (whose temporal selfcorrelation may be estimated from the observations), but the Hammerstein system's observed output is a superposition of time-delayed echoes from the non-linear subsystem. 3 Incidentally, this FIR assumption has also been made in [47,51] of the linear subsystem. 4 This is unlike [39][40][41][42][43][44][45][46][47][48][49][50][51][54][55][56][57][58][59][60][61][62][63][64][65][66][67], which require causality. 5 Incidentally, the Gaussian assumption has also been explicitly acknowledged in [41,55,61,68,69]. 6 This is unlike [39,41,42,46,54,61], which require the mean to be zero. 7 This is unlike [39][40][41][42][43][44][45][46][47][48][49][50][51], which are restricted to spectrally white inputs. 8 This is unlike [41, 46-48, 50, 51, 55, 56, 58-66], which disallow any perturbance between the two subsystems. 9 This is unlike [39, 42-45, 49, 54, 57],which require a zero mean. 10 This is unlike [39,40,42,44,49,54], which require the perturbance {Z n } to be spectrally white. 11 This intersubsystem noise {Z n } is allowed by the presently proposed estimator to be non-Gaussian, but the input signal {U n } is required to be Gaussian-even though both {Z n } and {U n } are fed into the linear subsystem. This is not a contradiction, because the Gaussian distribution is required only for the use of the Bussgang theorem in order to estimate the cross-correlation between the non-linear subsystem's input {U n } and output {W n }. However, {Z n } is neither the non-linear subsystem's input nor output; hence, {Z n } needs not be Gaussian for the proposed estimator. 12 This is unlike [39-44, 48-51, 54, 58], which require a white spectrum of the perturbance at the non-linear static subsystem's output. 13 This is unlike [39-44, 48-51, 54, 58-66], which require a zero mean of the output perturbation.
14 This is unlike [39,44,47,49,51,54,58], which require a prior known variance of {P n }. 15 Among all references that identify a Hammerstein system, all require {P n } of a white spectrum, except [46,47,62,63,65]. subsystem has an infinite impulse response (IIR) or a finite impulse response (FIR). (b) The proposed estimate's convergence rate will be analytically derived in Section 4 and then will be verified by the finite-sample Monte Carlo simulations in Section 5.

| Organization of this study
The rest of this study is organized as follows: Section 2 will define the mathematical model of the Hammerstein system's two subsystems and will define the statistics of their associated signals and noises. Section 3 will develop a new parametric estimator of the linear subsystem. Section 4 will theoretically prove this linear subsystem estimator as statistically consistent and asymptotically unbiased. This estimate's convergence rate will also be analytically derived therein. Section 5 will verify the estimator's performance by finite-sample Monte Carlo simulations. Section 6 will conclude the entire study and will point to possible directions for future research.

| MATHEMATICAL MODEL OF THE SUBSYSTEMS AND THEIR ASSOCIATED TIME SERIES
The Hammerstein system's constituent subsystems, along with their associated signals and noises, have been represented in the block diagram in Figure 1: (a) The overall system's input signal {U n , n = 1, …, N} has (a-i) a mean of μ u that can be non-zero and unknown. (a-ii) a variance of σ 2 u (that needs not be known prior), ∀n.
(a-iii) a spectrum that can be coloured and unknown. (a-iv) a Gaussian probability distribution. 17 (b) The static non-linear subsystem m(⋅) is required to satisfy only the general requirements of being piecewise differentiable and magnitude bounded. 18 The non-linear subsystem's unobserved output equals: (c) To the aforementioned non-linear subsystem's unobserved output {W n }, a temporally wide-sense stationary noise {Z n } may possibly be added. This {Z n } may be Gaussian or non-Gaussian, with a possibly unknown and possibly non-zero mean of μ z (unlike the μ z = 0 assumption in [45,72]), a possibly unknown autocovariance function and statistical independence from the system's input {U n }. This {Z n } produces the corrupted output V n ≔ W n + Z n , which becomes input to the linear subsystem. (d) The linear dynamic subsystem's impulse response g may have a finite or an infinite number (I ) of taps, with I known prior. 19 Moreover, the impulse response may be non-causal, that is Q > 0. This impulse response is assumed as asymptotically stable, that is, For the special case of the linear subsystem having IIR, the requirement needs to be stricter as exponentially bounded 17,18,19 : (e) To the above linear subsystem's output, a random disturbance {P n } may possibly be added to yield the overall observed output, This {P n } may have a (possibly) non-zero mean of μ p , and a (possibly) unknown and (possibly) coloured autocovariance function. This {P n } is statistically independent from {U n } and from {Z n }.

| To account for the additive noise entering between the two subsystems
The proposed estimator can accommodate a noise process of {Z n } that arises between the two subsystems, thereby additively degrading the first subsystem's output, that is corrupting the second subsystem's input (see Figure 1). This study permits (NOT requires) the simultaneous presence of this internal noise {Z n } and the output noise {P n }, unlike [45,57], which allow only the internal noise but disallow the output noise {P n }. Also, this is unlike [41, 46-48, 50, 51, 55, 56], which allow only output noise {P n } but disallow the internal noise {Z n }.
These two noise processes complicate the estimation task differently, but they produce qualitatively similar degradations on the Hammerstein system.
1. The internal noise{Z n } produces an ∑ 17 The input {U n } needs to be Gaussian to identify the linear subsystem, only because Bussgang's theorem in Equation (3.4) requires a Gaussian distribution of {U n }. 18 That is, sup −∞<u<∞ |m(u)| = m sup < ∞. 19 The proposed estimator would still work even if I is not precisely known but only an upper bound is known of I. 20 Incidentally, the assumption in Equation (2.3) has also been made in [52 ,53]. Figure 2a for the degenerate case of I = 1, wherein the deterministic m(U n +Q ) is plotted as a solid curve, the realizations of the random m(U n+Q ) + Z n+Q are shown as icons, and the deterministic m(U n ) + μ z is plotted as the dashed curve. Note how the dashed curve for m(U n+Q ) + μ z is vertically displaced above/below of the solid curve of m(U n ). 2. The output noise {P n } produces P n þ ∑ Figure 2b for the degenerated case of I = 1, wherein the deterministic h −Q m (U n+Q ) is plotted again as a solid curve, the random P n + h −Q m(U n+Q ) are shown as icons and the determin- More generally 21 , the following analysis shows that the output noise {Z n } generally can 'merge' into the Gaussian internal noise {P n }.
|ffl ffl ffl ffl ffl ffl ffl ffl ffl ffl ffl ffl ffl {zffl ffl ffl ffl ffl ffl ffl ffl ffl ffl ffl ffl ffl } P n : ð3:1Þ Incidentally, [39,49,54] also investigate a Hammerstein system with both an internal noise {Z n } and an output noise {P n }, though those disturbances are modelled in [39,49,54] to be zero-mean white additive noises.

| To estimate the linear dynamic subsystem's impulse response
Within the literature of Hammerstein system identification using a coloured random input: No study exists (to the best of the present authors' knowledge) to identify the linear dynamic subsystem. This present study is the first to do so.
To facilitate the subsequent development, define: γ m;u ðκÞ ≔ CovðmðU n Þ; U n−κ Þ; γ y;u ðκÞ ≔ CovðY n ; U n−κ Þ: Next, multiply the observed output in Equation (2.4) by U nþQ−s ; s ¼ 0; 1; 2; ⋯ � � and then take the expectation of both sides, thereby yielding: ð3:2Þ where I→∞ for an IIR linear subsystem, but I would be a natural number for the FIR case. 22 Equation (3.2) may be reexpressed in a matrix form as: if I→∞ for an IIR linear subsystem, or if k ≥ I for FIR. 23 A direct estimation of γ m,u (κ) would require a prior knowledge of the unknown non-linearity, m(u), as in Equation (2.1). To avoid that, the presently proposed estimator will instead use the Bussgang theorem [73] below to estimate γ m, u (κ). Because {U n } is a Gaussian random process and because V n = m(U n ) is a non-linear function of U n but unobservable, an application of Bussgang theorem 24 gives: Recall that U n ; Y n ; ∀n f g are observations, from which U ≔ 1 N P N n¼1 U n and Y ≔ 1 N P N n¼1 Y n may be computed. In contrast to the direct estimation of γ m,u (κ), this estimate in Equation (3.6) does not need any prior knowledge of the unknown non-linearity m(⋅). The above estimate does not need any prior knowledge of μ u , because both b γ y;u ðκÞ and b γ u ðκÞ are independent of μ u . 25 The estimate of {h n } in Equation (3.6) is admitted only within an indeterminable multiplicative constant, b. However, this unknown constant scaling factor is unavoidable, as may be seen from the following example in p. 539 of [74], or p. 1931 of [75]. Define a first overall system to have m(⋅) followed by {h n }; and define a second system to have m(⋅/b) followed by {bh n }. Then, both systems would produce identical outputs to any same input, if Z n = 0, ∀n.  6). Furthermore, the above estimate permits the additive noises' means, μ z and μ p , to be non-zero and a priori unknown.

| Summary of the proposed algorithm's key steps
The above-developed estimator's key algorithmic steps are summarized below: (1) Take the input/output observations U n ; Y n ; ∀n f g to compute U ≔ 1 N P N n¼1 U n and Y ≔ 1 The above steps (1)(2)(3) can handle stochastically the Gaussian input that is temporally self-correlated with any (possibly unknown) autocorrelation function, with any (possibly unknown) mean or any (possibly unknown) variance.
On contrasting the above against a simpler case of the input known to be temporally uncorrelated. The above algorithmic steps (3) and (4)  The key difference in computational load is the inversion of a k � k matrix needed presently in this more widely applicable case of a self-correlated input in step (3), but only a scalar inversion in the more restricted case of an uncorrelated input in step (3 0 ). and if furthermore k ¼ OðN

| ASYMPTOTIC ANALYSIS OF THE ESTIMATOR PROPOSED IN SECTION 3 FOR THE LINEAR SUBSYSTEM
The above steps (i) and (ii) imply that b h is an asymptotically unbiased and consistent estimate, regardless if the linear subsystem's impulse response is finite or infinite. Theorem 1(ii) also gives a convergent rate O p ðN − 1 2 Þ, which implies o p (1), for the linear subsystem's impulse response estimate, if the linear subsystem is IIR. 26 In the above: ð4:2Þ All norms above are Euclidean, with finite-size operands. 27 (ii) To evaluate the performance of the linear system estimator if the linear subsystem having an IIR: Choosing k as mentioned in the Theorem 1 and using the Lemma 3 of 26 This convergent rate of the proposed linear subsystem estimate subject to any arbitrarily coloured input is not inferior to but same as that in [44,45,47,49,50,56] with simply a white input. 27 With regard to the above proof for a coloured input, that proof non-trivially generalizes the spectrally white case, for which the inequality's right side would simply be ð4:4Þ As the impulse response has been assumed to be exponentially bounded, the approximation error in Equation

| MONTE CARLO SIMULATIONS OF THE PROPOSED ESTIMATOR OF THE LINEAR SUBSYSTEM
Monte Carlo simulations here will verify the efficacy of the earlier proposed estimator under finite-sample settings. All simulations in this section will use the following simulation scenario: (i) The Hammerstein system input {U n } is a second-order 'moving average' process (i.e. MA(2)) of U n = μ u + e n + 1.2e n−1 + 0.32e n−2 , with μ u = 1. Here, {e n } represents a spectrally white stochastic sequence, having a zero mean and a unit variance, at each n. (ii) The non-linearity is set as mðu n Þ ¼ signðu n Þ ð|u n | þ 1Þ 2 − 1 � � , where sign (⋅) symbolizes the sign of the scalar inside the parentheses. This dynamic nonlinearity has been used in [52,53] to model an electronic power amplifier. (iii) The noise {Z n } entering between the two subsystems is temporally coloured, set to this particular MA(2) form: Z n = μ z + ξ n + 0.5ξ n−1 + 0.06ξ n−2 , with μ z = 2. Here, {ξ n } represents a spectrally white but non-Gaussian stochastic sequence, uniformly distributed over [−1, 1] at each n. (iv) The linear subsystem is non-causal (at Q = 1) with an impulse response that is of the specific MA (2)    The proposed estimator's efficacy to handle coloured input, as documented above, can be contrasted against the ineffectiveness by the existing estimators of [39-46, 48, 49]. Those existing estimators were designed with white input in mind; their estimates are clearly very biased in Figure 4-thereon the bias is the difference between the estimation means (marked as �) and the true values (marked as •). 28 About the above proof for a coloured input-that proof non-trivially generalizes the spectrally white case, for which the inequality's right side would simply be The proposed estimator's categoric improvement over these exiting estimators (to handle coloured input) is further documented in Figure 5  Errð c bhÞ is one single scalar that regards the entire impulse response as a vector and that measures the error in estimating this vector. Figure 5 shows the proposed estimator's Errð c bhÞ → 0 as the sample size N increases from 32 to 2048, whereas the existing estimators suffer a high error floor near 0.9. knowledge of the present authors), without any prior knowledge/estimation/parameterization of the non-linear static subsystem. Here the proposed estimator has been analytically proved as consistent and asymptotically unbiased. Its efficacy and derived statistics are affirmed by Monte Carlo simulations. For the Hammerstein system's other constituent subsystem (i.e. the non-linear static subsystem), a non-parametric estimator has been developed in a companion study [77] for a similar signal/noise statistics model as indicated herein. These two subsystems' estimators are algorithmically independent from each other, in the sense that each may be computed without computing the other, or that the two may be computed simultaneously in parallel.

| CONCLUSION
This work's proposed estimator can handle a coloured input of any given power spectrum, but follow-up investigations could adapt the input's spectrum to the Hammerstein system to be identified.
This work's proposed estimator processes the input/ output observations in a batch. Future follow-up investigations could develop an iterative algorithm to update/ modify a recent estimate by incorporating new observations without recomputing for the entire observed dataset, thereby reducing the computational load to facilitate a real-time application.