Estimation of parameters of a harmonic chirp model

Science and Engineering Research Board, Government of India Abstract Here, we address the problem of estimation of the parameters of a harmonic chirp model, often encountered in speech and music applications. This model was introduced recently by Christensen and Jensen [1] as an extension of a standard harmonic model. We propose two methods of estimation: the least squares estimation method and the approximate least squares estimation method. We establish the asymptotic properties of the least squares estimators as well as the approximate least squares estimators of the parameters of this model under the assumption of stationary errors. These asymptotic properties are proved analytically as well as corroborated through simulation experiments. We present two speech signal data sets and their analysis using both the estimation methods. The results show that the proposed methods perform reasonably well for estimating the unknown model parameters.


| INTRODUCTION
Parameter estimation of a sinusoidal model is a longestablished problem and has been given considerable importance in the literature. Mathematically, a sum of sinusoidal model can be described as: A 0 k cosðα 0 k tÞ þ B 0 k sinðα 0 k tÞ þ XðtÞ; t ¼ 1; …; n: Here, the parameters A 0 k s, B 0 k s are the amplitudes, α 0 k s are the frequencies of the observed signal y(t) and X(t) is the additive random error component with mean zero. The explicit assumptions on X(t) will be mentioned later. Many real-life periodic and even nearly periodic phenomena can be explained using this model and because of its increasingly ubiquitous applicability, it has been used extensively in the signal processing literature. For references, one may look into the works of Kay and Marple [2], Stoica [3], Prasad, Chakraborty and Parthasarathy [4] and Kundu and Nandi [5].
A special case of the sinusoidal model is a harmonic model, where the frequencies instead of being arbitrary, are integral multiples of a constant fundamental frequency. Following is the analytical expression of a harmonic model: A 0 k cosðkα 0 tÞ þ B 0 k sinðkα 0 tÞ þ XðtÞ t ¼ 1; …; n;ð1Þ whereas before A 0 k s, B 0 k s are the amplitudes of the signal y(t) and X(t) is the additive error component. However, the frequencies of the sinusoids are multiples of a common, constant fundamental frequency, α 0 . This model has several real life applications, particularly in analysing signals where exact periodicity is observed, for instance in the fields of speech processing, music processing and communications, see for example, Stylianou [6], Christensen, Stoica, Jakobsson and Jensen [7], Fletcher and Rossing [8] and the references cited therein. For references on some of the classical methods of parameter estimation of this model, one may see the works of Quinn and Thompson [9], Li, Stoica and Li [10], Quinn and Hannan [11], Nandi and Kundu [12] and Chan and So [13]. This paper deals with the parameter estimation of a harmonic chirp model which is a natural generalisation of the harmonic model (1) to take into account the time-variations in frequencies of the observed signals. Mathematically, the model is represented as: harmonics of a fundamental chirp rate β 0 and the parameters A 0 k s and B 0 k s are the amplitudes of the observed signal y(t). The error term X(t) is a stationary linear process, and which is a more realistic assumption than the usual Gaussian noise assumption. Popular noise models such as AR, MA or ARMA models can be represented as a stationary linear process under certain conditions and hence this assumption covers a large class of random processes. This model has been suggested recently by Christensen and Jensen [1] and thereafter some authors, for instance, Doweck, Amar and Cohen [14,15], Nørholm, Jensen and Christensen [16], Sward, Brynolfsson, Jakobsson and Hansson-Sandsten [17], Jensen, Nielsen, Jensen, Christensen and Jensen [18] and Nielsen, Jensen, Jensen, Christensen and Jensen [19] have considered its parameter estimation. In real life applications, this model captures the deterministic component of the signals more accurately as it takes into account the modulations in the frequency of the observed signal. Apart from being an extension of a harmonic model, a harmonic chirp model can also be seen as a special case of a chirp model, where different harmonics have arbitrary frequencies and modulations. The latter is one of the fundamental models in signal processing literature and its parameter estimation has received considerable attention over the years. For references, see Abatzoglou [20], Djuric and Kay [21], Peleg and Porat [22], Saha and Kay [23], Myburg, Den Brinker and Van Eijndhoven [24], Weruaga and Kepesi [25] and Pantazis, Rosec, and Stylianou [26].
A brief review of the recent work on the parameter estimation of the harmonic chirp model: Doweck, Amar and Cohen [14] considered the joint estimation of the initial frequency and frequency rate of a complex harmonic chirp model under the assumption of complex Gaussian errors. They proposed the maximum likelihood estimation (MLE) method, the approximate maximum likelihood estimation method and also an alternative method to reduce the computational complexity involved in the first two methods. This alternative method is a two-step estimation method, the first step is to separate the signal from its harmonic components and the second is to estimate the parameters using least squares method given the phases of the harmonic components. Doweck, Amar and Cohen [15] proposed another two methods of estimation. The first method is based on highorder ambiguity function (HAF) which reduces the twodimensional optimisation problems into two one-dimensional optimisation problems. The second method is a lowcomplexity method based on separate-estimate approach that they call Harmonic-SEES method. Through numerical experiments, they demonstrate that the Harmonic-SEES method outperforms the one based on HAF approach. Nørholm, Jensen and Christensen [16] also consider the complex harmonic chirp model with additive complex Gaussian errors and proposed an iterative MLE method. They compared the performance of this model with the harmonic model based on the maximum a posteriori (MAP) model selection criterion for long segments. Sward, Brynolfsson, Jakobsson and Hansson-Sandsten [17] consider the same model and provide an iterative sparse reconstruction framework, which is an extension of the work done by Adalbjörnsson, Jakobsson and Christensen [27] on the harmonic model. Jensen, Nielsen, Jensen, Christensen and Jensen [18] consider a real-valued harmonic chirp model and provide an algorithm for computing an estimate to the ML estimator with much lower computational complexity using recursive matrix structures, the fast Fourier transform, and using a two-step approach that reduces the number of required function evaluations. Nielsen, Jensen, Jensen, Christensen and Jensen [19] proposed an accurate and robust fast method for harmonic chirp summation. In all the aforementioned works, it is shown that the proposed estimators achieve the Cramer Rao lower bound (CRLB), however there has been no study of the asymptotic properties of the proposed estimators.
The aim here is two-fold. First, we consider a more general error structure, that is, we assume that X(t)s are from a stationary linear process. Because of this assumption, the model becomes more realistic as it takes into account the dependence structure that may be present in the errors in many practical situations. The explicit details of this assumption are provided in the next section. The second aim is to establish large sample properties of the studied estimators, namely consistency and asymptotic normality under the stationary assumptions on the error component.
Here, we mainly address the problem of parameter estimation of the model (2) using the least squares estimation method. These estimators are obtained by minimising the error sum of squares defined in (3). Note that the model under consideration is a non-linear time series regression model. Due to the highly non-linear nature of the model, apart from computational difficulty involved in finding the LSEs, one cannot study the finite sample statistical properties of these estimators as well. Thus, to assess the performance of the LSEs theoretically, one has to resort to their asymptotic behaviour under certain suitable assumptions. Moreover, deriving the asymptotic properties of these estimators is not quite straightforward as they do not satisfy the well-established results in literature, for instance, the sufficient conditions of consistency provided by Jennrich [28] or Wu [29]. Thus, the primary focus here is to study the asymptotic properties of the LSEs under the assumption of stationary errors. These properties are proved analytically as well as corroborated through extensive simulation studies. It is also observed that under the special case of Gaussian errors, the asymptotic variances of the LSEs attain the asymptotic CRLBs.
We also discuss briefly the approximate least squares estimation method for the joint estimation of the fundamental frequency and the fundamental chirp rate. This method was first proposed by Walker [30] for estimating the parameters of a sinusoidal model. He obtained the asymptotic properties of approximate least squares estimators (ALSEs) under the assumption of independently and identically distributed (i.i.d.) errors with mean 0 and finite variance. Furthermore, Kundu and Nandi [31] and Grover, Kundu and Mitra [32,33] extended this method to estimate the parameters of a 2-D sinusoidal model and a 1-D chirp model and 2-D chirp model, respectively. Under the assumption of stationary errors, they observed that the ALSEs of the unknown parameters have the same theoretical large sample properties as the corresponding LSEs. Moreover, the ALSEs have two remarkable features− firstly, they are computationally faster as compared to the LSEs and secondly, their consistency is obtained under weaker assumptions on the linear parameters than those required for the LSEs. Thus, it is indeed intriguing to investigate the properties of the ALSEs of the unknown parameters of the harmonic chirp model as well and see how they perform computationally in comparison with the LSEs. We observe that although theoretically the ALSEs have all the desirable properties of the LSEs, computationally the LSEs perform better than the ALSEs. This performance is compared in terms of the average estimates and the order of the biases and the variances of the estimates.
In the next section, we shall state the necessary model assumptions and develop statistical properties of the LSEs of the unknown parameters under these assumptions. In Section 3, we discuss the methodology to obtain the ALSEs and establish their statistical properties. We provide simulation results in Section 4 and in Section 5, we present two speech signal data sets and their analyses using the least squares estimation method as well as the approximate least squares estimation method. We conclude the paper in Section 6. All the proofs are provided in the appendices.

| LEAST SQUARES ESTIMATION METHOD
The least squares estimators of the parameters can be obtained by minimising the following error sum of squares: with respect to θ, where θ = (A 1 , B 1 , …, A p , B p , α, β) is the parameter vector of the model. Henceforth, for brevity, we use matrix notation so that (3) can be rewritten as: is the vector of linear parameters and the matrix Z(α, β ) is defined as follows: Since ϕ is a vector of linear parameters, for a given α and β they can be estimated using simple linear regression by the separable regression technique of Richards [34] as follows: The above equation allows one to reduce the error sum of squares function Q(θ) as a function of only the non-linear parameters α and β. Thus, the reduced error sum of squares function can be written as: where P Z ðα; βÞ ¼ Zðα; βÞ½Zðα; βÞ ⊤ Zðα; βÞ� −1 Zðα; βÞ ⊤ is the projection matrix on the column space of the matrix Z(α, β). The LSEs of α 0 and β 0 are obtained by minimising R(α, β) with respect to α and β simultaneously and once we have the estimators, α and β , we can substitute them in (5) to get the linear parameter estimates.
In the subsequent subsections, we examine the asymptotic properties of these estimators under the following assumptions: Equation (7) is a standard representation of a stationary linear process and covers a large class of stationary random processes.

Assumption 1 The error random sequence {X(t)} is a stationary linear process of the form:
Here {e(t); t ∈{0, ±1, ±2, …}} is a sequence of i.i.d. random variables with E(e(t)) = 0, V(e(t)) = σ 2 and finite fourth moment and a( j)s are real constants such that (7) is a standard representation of a stationary linear process and covers a large class of stationary random processes.

Assumption 2 The true parameter vector
Theorem 1 Under the assumptions 1 and 2, θ , the LSE of θ 0 is a strongly consistent estimator of θ 0 , that is, In this section, we consider another method of parameters estimation and we call it as the approximate least squares estimation method. This method is based on the idea of the periodogram estimators, one of the most popular methods of estimation of the frequencies of a sinusoidal model.
First we define the periodogram-type function and explain how to obtain the ALSEs and then we study different asymptotic properties of these estimators under the assumption of stationary errors.
Using the following lemma proved by Lahiri [35]: Equation (6) can be written as: where, Since the difference between the reduced error sum of squares function, R(α, β), and the periodogram-type function I (α, β) is O(1) as n → ∞, minimising the former is equivalent to maximising the latter. This result is an extension of the one obtained by Walker [30] for the sinusoidal model. Here, the function f = o(1) means f → 0 a.s. and f = O(1) means f ⩽ c 0 a.s., where c 0 is some finite constant. The estimators obtained by maximising I(α, β) with respect to α and β simultaneously are called the ALSEs of α 0 and β 0 . The linear parameter estimates can be obtained by plugging-in the non-linear parameter estimates and using Lemma 1 in (5).
Under the same assumption on the error component X(t) (see Assumption 1) and the following assumption on the parameters: we examine the asymptotic properties of the ALSEs and the obtained results are provided below.
Here the diagonal matrix, D and c are same as defined in Theorem 3.

| SIMULATION STUDIES
We conduct extensive simulation studies for the joint estimation of the fundamental frequency and fundamental chirp rate using the suggested estimation methods. Here, we report the results obtained for the following simulated harmonic chirp model: with the following true values of the amplitudes: We consider two different error structures for data generation: where ϵðtÞ ∼ N ð0; σ 2 Þ. For the first set of simulation experiments, we choose α = 0.03, β = 0.003, σ = 0. 5  In Figures 6-10, we assess the performance of the estimates with varying signal-to-noise ratio (SNR) on the xaxis. As seen from the plots, the RMSEs of both the estimates decrease as the SNR increases, which is expected. Also, the RMSEs of the LSEs are found to be in good agreement with the asymptotic results. It is also observed that the RMSEs of the ALSEs do not change significantly with increasing SNR, particularly for the linear parameters.
Finally, in Figures 11, 12, 13, 14, 15, 16 and 17, 18, 19, 20, 21, we evaluate the performance of the estimators with respect to varying α and β on the x-axis, respectively. These plots show that the estimation accuracy of the LSEs for different values of α and β is quite balanced and at par with the corresponding asymptotic results. Also, the ALSEs perform well but the behaviour of the ALSEs seems erratic with varying α and β.
It is evident that the LSEs outperforms the ALSEs across all sample sizes, SNR values and the parameter values. Moreover, the RMSEs of LSEs almost coincide with the square root of theoretically derived asymptotic variances which is used here as the benchmark for the estimation accuracy.

| DATA ANALYSIS
In this section, we present the analyses of two real life data sets. These data sets are speech signal data sets, 'u:' and 'a:' (vowel sounds) produced by a sound instrument at the Speech Signal Processing laboratory of Indian Institute of Technology, Kanpur. We fit a harmonic chirp model to these data sets using both the least squares and the approximate least squares estimation    Here, SS res (k), is the residual sum of squares when there are k components present in the signal, which decreases as k increases and the second term, 2(2k + 2) ln(n), is the penalty term, and for a fixed n, it increases as k increases. Assuming that the maximum number of components can be J, we choose the estimate of p corresponding to the minimum value of BIC(k).
In Figure 22, we plot the BIC(k) values for k = 3, …, 12. The purple solid line in the figure represents the BIC curve for the data set 'u:' when the estimates are obtained using least squares estimation method and the pink dashed line represents the BIC curve for the same data set when approximate least squares estimation method is used. It is evident from the plot, that we choose the estimate of p as 9 when the LSEs are used and 8 when the ALSEs are used.
Similarly, for the second data set 'a:', we plot the BIC curve (see Figure 23) and we choose the estimate of p as 9 in case of the LSEs and 7 in case of the ALSEs.
Once we have the estimate of the number of components, we estimate the other parameters of the deterministic component of the model, the amplitudes, the fundamental frequency and the fundamental frequency rate using the LSEs and the ALSEs. Using these estimates, we obtain the fitted values. In Figure 24, the first subplot represents the original signal 'a:' along with the estimated signals using both the least squares and the approximate least squares estimation method in the time domain. It can be seen that both the fittings estimate the original signal quite accurately, except at certain peaks and depressions. The subplot to its right, shows the amplitude versus frequency plot for the same data set. Although certain low frequencies are not resolved, the estimated models capture the intermediate frequencies very well. The plots in the next row of the figure, represent the corresponding signals for data set 'u:'. It can be seen that the fittings match very well with the original signal in the time domain. Moreover, the estimated models capture the lower order frequencies quite precisely.

| CONCLUSION
The aim here has been two-fold. First, we considered a more general error structure than the usual Gaussian assumption. Because of this assumption, the model is more realistic as it takes into account the dependence structure that may be present in the errors in many practical situations. The second aim was to establish large-sample properties of the LSEs and the ALSEs under the stationary assumptions on the error component. We observed that both types of estimators are strongly consistent. It is also observed that the order of asymptotic variances of the amplitudes estimators is n −1 and that of the fundamental frequency and fundamental chirp rate estimators are n −3 and n −5 respectively. In fact, both the LSEs and the ALSEs have the same asymptotic distribution. Moreover, under the special case of Gaussian errors, the asymptotic variances of the proposed estimators attain the CRLB. By means of numerical simulations, we evaluate the performance of LSEs and ALSEs and observe that LSEs outperform ALSEs. We have also analysed two speech signal data sets to exemplify the implementation and effectiveness of the two methods to fit a harmonic chirp model in a practical situation. The outcomes of these data analyses are quite encouraging.
There are many other methods for dealing with this problem of parameter estimation. For example, Tsakonas et al. [36] proposed a filtering method for this purpose. However, they assume that the errors are white Gaussian which is usually unrealistic in practice and no theoretical properties such as consistency are established for the proposed method. It will be interesting to extend the method for coloured noise and establish theoretical properties of their method.  To prove Theorem 1, we need the following lemma: Lemma 2 Consider the following set: The notations lim inf and inf denote limit infimum and infimum of a function, respectively. proof Here in this proof we denote Q(θ) by Q n (θ) and θ by θ n to emphasize the point that they depend on n. Now, suppose that (9) is true and b θ is not a consistent estimator of θ 0 , therefore there exists a δ > 0 and a subsequence {n j } of {n} such that: b θ n j ∈ S δ ∀ j ¼ 1; 2; …: Note that when n = n j , b θ n j is the least squares estimators of θ 0 , that is, it minimises Q(θ), therefore: However, since (9) holds true, there exists a J 0 , such that ∀ j > J 0 Q n j ð b θ n j Þ − Q n j ðθ 0 Þ > 0 a:s: which is a contradiction to (10). □ Proof of Theorem 1: Consider the following difference:  to get the desired result. To show this, we proceed as follows.