Non‐crossing quantile double‐autoregression for the analysis of streaming time series data

Many financial time series not only have varying structures at different quantile levels and exhibit the phenomenon of conditional heteroscedasticity at the same time but also arrive in the stream. Quantile double‐autoregression is very useful for time series analysis but faces challenges with model fitting of streaming data sets when estimating other quantiles in subsequent batches. This article proposes a renewable estimation method for quantile double‐autoregression analysis of streaming time series data due to its ability to break with storage barrier and computational barrier. Moreover, the proposed flexible parametric structure of the quantile function enables us to predict any interested quantile value without quantile curve crossing problem or keeping the desirable monotone property of the conditional quantile function. The proposed methods are illustrated using current data and the summary statistics of historical data. Theoretically, the proposed statistic is shown to have the same asymptotic distribution as the standard version computed on an entire data stream with the data batches pooled into one data set, without additional condition. Simulation studies and an empirical example are presented to illustrate the finite sample performance of the proposed methods.


INTRODUCTION
In economics and finance, considerable attention has been devoted to regression models with autoregressive errors for time series data.Because volatility is fundamental to asset pricing, monetary policymaking, portfolio management and risk analysis, it is especially important to accurately forecast volatility.To capture the time-varying volatility, it is necessary to take the conditional heteroscedasticity into account when a linear model is fitted to financial time series.Among existing conditional heteroscedastic models, the double-autoregressive (DAR) models (Ling, 2004) have recently attracted growing attention, see Cai et al. (2013), Li et al. (2015), Xu and Zhao (2021), Zhu and Li (2022) and among others.Because it has two novel properties.First, it has a larger parameter space than that of the commonly used autoregressive (AR) model.Second, the quasi-maximum-likelihood estimator for the DAR model is still asymptotically normal without assuming the moment condition on returns (Ling, 2007), which does not hold for classic AR(p) model with independent and identically distributed errors.The DAR model of order p is defined as where  0 and  0 are vector of unknown parameters, the noise  t is independent of p-dimensional covariates Modeling conditional quantiles and volatility dynamics together is of extreme importance in econometrics and finance.Obviously, the maximum likelihood and least squares estimations are both lack of robustness and may cause larger bias.The quantile regression (QR) method proposed by Koenker and Bassett (1978) can not only maintain the robustness of estimation, but also capture the characteristics of the whole conditional distribution.This allows us to estimate the upper or lower tail of the conditional distribution of interest.In particular, it has been widely used for the prediction of quantile-based risk measures, for example, the value at risk.Hence, it is natural to consider QR for the DAR model as where Q Y t | t−1 () is the conditional th quantile of Y t given the up to t − 1 information set  t−1 and Q  t () is the th quantile of  t .For model (1.2), Cai et al. (2013) developed a Bayesian estimation method, Xu and Zhao (2021) proposed an efficient estimator for  0 by constrainedly weighting information across quantiles, and Zhu and Li (2022) studied a self-weighted conditional quantile estimation method to estimate ( ⊤ 0 ,  ⊤ 0 , Q  t ()) ⊤ by minimizing the following function: where   (r) = r − rI(r < 0) is check loss function, I(⋅) is the indicator function and { t } N t=p+1 are non-negative random weights.
Our era has witnessed the massive explosion of data and a dramatic improvement on technology in collecting and processing big data.Due to the explosive growth of data onto non-traditional sources such as mobile phones, social networks, and e-commerce, streaming data are becoming a core component in big data analysis.For example, when a passenger calls Lyft, real-time streams of data join together to create a seamless user experience.Through this data, the application pieces together real-time location tracking, traffic stats, pricing, and real-time traffic data to simultaneously match the rider with the best possible driver, calculate pricing, and estimate time to destination based on both real-time and historical data.As streaming data grows rapidly in volume and velocity, storing and combing data becomes increasingly challenging due to limited computer memory and storage.To reduce the demand on computing memory and achieve real-time processing, the nature of streaming data calls for the development of algorithms which require only 'one pass' over the data.However, the ordinary QR estimator does not have a display expression, because the loss function (⋅) in (1.3) is not a convex function.As a result, if a quantile  is not considered at the first batch, we cannot obtain its estimator in the subsequent batches, as in ordinary quantile regression, because the data is one pass.Moreover, it is impossible to estimate all quantiles  in (0, 1) at the beginning, which also increases the computational burden.In this article, we address the following natural question: Can we construct a QR estimator of Q Y t | t−1 () in the model (1.2) with an explicit expression, which is an increasing function of quantile level ?
For estimating the unknown parameters in the classical regression models under streaming data, Schifano et al. (2016) developed online-updating algorithms for linear models and estimating equations.Luo and Song (2020) proposed a renewable estimation for the generalized linear model.Luo et al. (2022) developed an incremental learning algorithm based on quadratic inference function to analyze streaming datasets with correlated outcomes such as longitudinal data and clustered data.Yang and Yao (2022) studied an online non-parametric method to dynamically update the estimates of mean and covariance functions for functional data.
Quan and Lin (2022) considered a one-pass non-parametric estimation method for non-parametric regression in the streaming setting.The above methods for streaming data are all based on ordinary least squares or estimating equations.Due to the second-order non-differentiable of the loss function (⋅) in (1.3), we find that the above methods for streaming data based on the least squares and estimating equations are not suitable for QR estimator.To overcome the non-differentiability of the QR loss function, Jiang and Yu (2022) used a convolution-type smoothing method to develop a renewable estimation.Chen et al. (2019) and Wang et al. (2022) also studied QR estimation for streaming data.However, their methods are all required additional strict conditions on the sample size of each bath.The above three QR methods for streaming data analysis, like (1.3), do not solve the problem presented, because they still can only obtain an estimator at a quantile level at a time.
Note that model (1.2),only Q  t () depends on , while  0 and  0 are independent of .We adopt a parameterized method to estimate Q  t (), then a QR estimator of Q Y t | t−1 () can be obtained, which is a function of quantile level .The easiest way to think of parameterizing Q  t () is b() (Frumento and Bottai, 2016), where b() is a set of known functions of .For instance, However, it is difficult to parameterize the quantile regression coefficients accurately, and their method cannot guarantee obtain the non-crossing of different quantile estimators (Frumento et al., 2021).Population conditional quantile functions cannot cross each other for different quantile orders; however, the estimated regression curves often violate this (Jiang and Yu, 2023).For example, in some cases, the estimate of the 5th percentile is larger than that of the 10th percentile, which can be very challenging for interpretation and further analysis.
In this article, we parameterized Q  t () as function (1.4) by the generalized lambda distribution (GLD), because the GLD (Freimer et al., 1988) is defined by its quantile function (the inverse of the distribution function).
where  0,1 is a location parameter,  0,2 > 0 an inverse scale parameter and  0,3 ,  0,4 are shape parameters.The procedure (1.4) allows to estimate the whole quantile function of  t directly using a wider class of distributions, including those which are defined only via their quantile functions and that may not have closed mathematical expressions of their density or distribution functions.Given the correct parameters, the GLD distribution equates to several well-known distributions (e.g. the uniform, exponential, logistic) and for many others, a close approximation is possible (e.g.Gaussian, Cauchy, Student's t, chi-square, Gamma, Weibull, lognormal, beta distribution).Figure 1 illustrates that GLD estimates the quantile function well, where the parameter of GLD is obtained by Dedduwakumara et al. (2021).We refer the reader to Karian and Dudewicz (2000) for a complete list of distributions that the GLD can represent and their corresponding parameters.Cai et al. (2013), Cai (2016) and Cai and Li (2019) considered Bayesian estimations for double AR time series models, nonlinear time series models and threshold GARCH models, respectively, with  t following a generalized lambda distribution.Zhang et al. (2021) developed a quantile index regression by Tukey lambda distribution, and their proposed estimation method is not suitable for heteroscedasticity time series data.Combing (1.2) and (1.4), we have where the unknown parameters  0 = ( ⊤ 0 ,  ⊤ 0 ,  0,1 ,  0,2 ,  0,3 ,  0,4 ) ⊤ are all independent of .The approach (1.5) allows us to estimate the quantile function of a response variable, rather than a sequence of quantiles.Moreover, it is worth noting that we can obtain non-crossing quantile estimators of where (X t ,  0 ) and  0,2 are all positive.Model (1.5) also enables us to conduct the estimation of quantile levels with rich observations and then to extrapolate the fitted structures to far tail, such as Value-at-Risk in economics and finance.
To summarize, we develop a QR method for the DAR model (1.1) with streaming datasets.We first parameterize Q  t () as function (1.4) by the generalized lambda distribution (GLD).The unknown parameters  0 in the model (1.5) can be estimated by a convolution-type smoothing method, which only requires the availability of the current data batch in the data stream and sufficient statistics of the historical data at each stage of the analysis.Theoretically, it has the same asymptotic distributions as the standard version computed on an entire data stream with the data batches pooled into one data set, without additional condition.Based on (1.5), we can estimate the conditional quantile at a very high or low level of quantile level, and the quantile estimators of Q Y t | t−1 () under different s are non-crossing.The most important thing is that it can be applied to streaming data analysis to obtain any wanted conditional quantile estimation.
The remainder of this article is organized as follows.In Section 2, the estimation methods based on all data are proposed.The renewable estimation method is developed in Section 3.Both simulation examples and the application of real data is given in Section 4 to illustrate the proposed procedures.We conclude this article with a brief discussion in Section 5.All technical proofs are provided in the Appendix.

Self-weighted Composite Quantile Regression Estimation
We first study the estimation method based on all data with sample size N.Note that the unknown parameters ) ⊤ are all independent of , we therefore consider the following self-weighted composite quantile regression (SWCQR) to achieve high efficiency: where K is a fixed integer with 0 <  1 < • • • <  K < 1 and one can use the equally spaced quantiles at  k = k∕(K+1) for k = 1, … , K (Zou and Yuan, 2008), t=1 are non-negative random weights, which are to maintain the asymptotic normality for heavy-tailed data with only a finite fractional moment, see Ling (2005) and Zhu and Li (2022).
Therefore, we can obtain the estimator 2) is a function of , thus we can use it to estimate all conditional quantile functions, especially for extreme conditional quantile functions.Moreover, QY t | t−1 () under different  ∈ (0, 1) are non-crossing by where (X t , α) and θ2 are all positive.

Smoothing Self-weighted Composite Quantile Estimation
Because the loss function   (r) = r − rI(r < 0) in (1.3) is non-differentiable, the SWCQR estimator in (2.1) has no display expression, it is impossible to construct a renewable estimator for streaming data sets.To circumvent the non-differentiability of the QR loss function, we estimate  0 by minimizing the following smoothing quantile regression objective function: where (2.4) respectively, where and ∇ 2 are the first and second derivatives of , respectively.For example, we can take an Epanechnikov kernel where e tk () = Y t − q t (,  k ).

Large Sample Properties
To establish the asymptotic properties of the proposed estimators, the following technical conditions are imposed.

C1. {Y
is a positive definite matrix.C4.The kernel function K(⋅) is even, integrable, and twice differentiable with bounded first and second deriva- Condition C1 is commonly used in the literature, see Zhu and Li (2022).For the random weights { t }, there are many choices satisfying Condition C2, such as Condition C4 is a mild condition on K(⋅) for smoothing approximation.For example, Epanechnikov kernel K(u) = (3∕4)(1 − u 2 )I(|u| ≤ 1) satisfies condition C4.
Theorem 2.1.Suppose that conditions C1-C3 are satisfied and N → ∞.Then, we have and where L − −− → represents the convergence in the distribution and and Through the result of Theorems 2.1 and 2.2, the smoothing SWCQR (SSWCQR) estimator γ * in (2.3) achieves optimal efficiency and its asymptotic covariance matrix, which is the same as that of SWCQR estimator γ in (2.1).

Methodology
Now let us discuss how to use (2.3) to develop a renewable estimator for streaming data sets.Assume we have the streaming data sets {D 1 , … , D b } up to the bth batch, where D j is the jth batch data set with a sample size of n j .Then, the total sample size is N b = ∑ b j=1 n j .The data can exceed even a super computer's memory when the number of the blocks b is large enough.For model (1.1), D j = {X j , Y j } is the jth batch data set, where Y j = (Y 1,j , … , Y n j ,j ) ⊤ and X j = (X 1,j , … , X n j ,j ) ⊤ .We begin with a simple scenario of two batches of data D 1 and D 2 , where D 2 arrives after D 1 .We want to update the initial SSWCQR γ1 (or γ * 1 ) by minimizing (2.3) to a renewable SSWCQR (RSSWCQR) γ without using any subject-level data but only some summary statistics from D 1 .By (2.3) and (2.4), the SSWCQR γ1 satisfies, where We propose a new estimator γ2 for streaming data {D 1 , D 2 } as a solution to the equation of the form where where the last equation is according to (3.1), O p (⋅) means bounded with probability and the error term ) can be asymptotically ignored.Through (3.2), the initial γ1 is renewed by γ2 only using the historical summary statistics, including sample variance matrix J(D 1 ; γ1 ; h 1 ) and estimate γ1 , instead of the subject-level raw data D 1 .Generalizing (3.2) to streaming data sets {D 1 , … , D b }, a renewable estimator γb of  0 is defined as a solution to the following incremental estimation equation: Therefore, we can obtain the estimator Qb . (3.4)

Large Sample Properties
The following theorem shows the asymptotic properties of the estimators γb in (3.3) and Qb Theorem 3.1.Suppose that conditions C1-C4 are satisfied.If and ) .
Through the result of Theorems 2.2 and 3.1, it is interesting to notice that the renewable estimator γb achieves optimal efficiency and its asymptotic covariance matrix is the same as that of the SSWCQR estimator γ * b in (2.3), which is computed directly on all the samples.This implies that the proposed renewable estimator achieves the same asymptotic distribution as the SSWCQR estimator.

Algorithm
Numerically, it is quite straightforward to find γb from (3.3) using the Newton-Raphson method at the (r + 1)th iteration: where We summarize the general algorithm for the proposed RSSWCQR method by (3.5) as follows (Algorithm 1). to obtain γb .Thus, our proposed renewable method is indeed an online estimation procedure.Moreover, by step 7 in Algorithm 1, we only need to save γb and Ĵb to obtain the next estimator, which are p × 1 and p × p, respectively.The scale of the data to be stored is (p + 1)p instead of N b p, which is the sample size of the streaming data sets up to b batches.Because p is assumed to be a fixed number in this article, our method greatly reduces the amount of data storage.

NUMERICAL STUDIES
We first use Monte Carlo simulation studies to assess the finite sample performance of the proposed procedures and then demonstrate the application of the proposed methods with a real data analysis.All programs are written in R code.In all of the numerical experiments, we take ) −1 for simplicity, which is also used in Zhu and Li (2022).

Simulation Example 1: SWCQR Estimation
We study the performance of the SWCQR estimation proposed in Section 2.1.We generate data from the following double-autoregressive model: Two error distributions of  t are considered: a standard normal distribution N(0, 1) and a t distribution with 3 degrees of freedom t(3).The sample size is N = 10,000.
To evaluate the performance of the estimation method, we calculate the absolute error (AE) of  0 ,  0 and The true values of Q  t () for  = 0.1, 0.5, 0.9 are −1.282,0, 1.282 and −1.638, 0, 1.638 for N(0, 1) and t(3), respectively.We first investigate the sensitivity of the proposed SWCQR method to the K in (2.1).The most commonly used K is 5, 9, and 19 (Zou and Yuan, 2008).The simulation results of AE are shown in Table I, which are based on 100 simulation replications.As can be seen from Table I that the AEs of  0 ,  0 and Q  t () are very close under different errors and Ks.Therefore, the SWCQR estimation is insensitive to K.Moreover, from Table I, we can see that all of the estimators are very close to the true values.In the following simulation, we will use K = 5 for convenience of calculation.
Next, we consider the performance of the SWCQR estimation for Q Y N+1 | N () at  ∈ {0.1, 0.5, 0.9}.Moreover, we compare our method with QR method in Zhu and Li (2022).To evaluate the performances of the estimation methods, we calculate the mean absolute error The simulation results of the MAE in Table II show that the performances of SWCQR are all better than those of QR under different quantile levels and errors.
Table I.The means and standard deviations (in parentheses) of AEs of  0 ,  0 and Q  t () under different Ks and errors for simulation example 1

Simulation Example 2: SSWCQR Estimation
We study the performance of the SSWCQR estimation proposed in Section 2.2.We generate data from the model (4.1).We first investigate the sensitivity of the proposed SSWCQR method to bandwidth selection h in (2.3).From Theorem 2.2, we choose h = CN −1∕4 ∕ ln N, which satisfy the condition for h.We vary the constant C from 0.01 to 100.To evaluate the performance of the estimation method, we calculate the root square error (RSE): in Section 4.1.The results are shown in Tables III and IV.As can be seen from Tables III and IV that the RSEs and MAEs of SWCQR and SSWCQR are very close under C = 0.01 and 0.1.Therefore, we take h = 0.1N −1∕4 ∕ ln N for SSWCQR method in the subsequent simulation study.

Simulation Example 3: RSSWCQR Estimation
We study the performance of the RSSWCQR estimation proposed in Section 3.
We generate data from the model (4.1),where only the t(3) error is considered.Moreover, we fix the sample size of each batch to n j = 500 for j = 1, … , b and vary the number of batches b = 10, 50,100, … , 1000.Simulation results are based on 100 simulation replications.By Theorem 3.1 and the analysis in simulation 4.2, we take h j = 0.1N −1∕4 j ∕ ln N j .We compare our proposed RSSWCQR estimator with the following two competitors: (1) the SSWCQR estimator with full data; and (2) the average SSWCQR (ASSWCQR) estimator for the streaming data set, that is, estimate each streaming data separately and then take its average.To evaluate the performance of the three methods, we calculate the RSE in Section 4.2.
From Tables V and VI, we note that all the estimators are close to the true value because the RSEs are very small, and for any given number of batches b, the RSEs of the proposed estimator (RSSWCQR) are very close to those of SSWCQR and better than those of ASSWCQR.We first consider the different order p for the following two double-autoregressive models (p = 1, 2): and The mean absolute fitting error (MAFE=N −1 ∑ N t=1 |Y t −X ⊤ t β|) is used to compare the fitting data of the model (under different p), where N = 10772 and β is obtained by SWCQR.The results of MAFE under different models (4.2) and (4.3) are the same as 0.775.Therefore, in the following study, for convenience, we only consider model (4.2).
Since Value-at-Risk (VaR) is an important risk measure for financial assets, we use the model (4.2) to forest the conditional quantile of {Y t }.The first 10,594 observations from 1980 to 2021 are used for model estimation and the remaining 178 observations are reserved for the out-of-sample evaluation.Figure 3 shows that the performances of QR and SWCQR are very close.
Next, we consider the proposed method RSSWCQR in Section 3. We consider b = 42 for the 42 years from 1980 to 2021, and the data of 1980 is regarded as the first streaming data.From Table VIII and Figure 4, we can see that the estimated coefficient of  and VaRs under  = 0.1 and 0.9 are all very close.Moreover, we calculate the relative absolute error (RAE) of SWCQR as     () is obtained by SWCQR and the data are from 2022.RAEs at  = 0.1 and 0.9 of SSWCQR are 0.070% and 0.358%, respectively.RAEs at  = 0.1 and 0.9 of RSSWCQR are 1.450% and 7.647%, respectively.The results of RAEs also show SSWCQR and RSSWCQR are close to SWCQR.

CONCLUSION
In this article, we considered a renewable non-crossing QR for the DAR model with streaming datasets by a parameterized method.In this way, the proposed quantile DAR model allows us to study the whole conditional distribution of financial returns and then obtain the corresponding multi-step ahead conditional predictive distributions.Moreover, the method requires only the availability of the current data batch in the data stream and sufficient statistics on the historical data (the latest estimator and the cumulative Hessian matrix) in each stage of the analysis.The scale of the data to be stored is (p + 1)p instead of N b p, which is the sample size of streaming data sets up to b batches.Because p is assumed to be a fixed number in this article, our method greatly reduces the amount of data storage.Theoretically, the proposed estimators achieve optimal efficiency, and their asymptotic covariance matrices are the same as those of the estimators with full data.The algorithm of the proposed renewable non-crossing QR estimation is fast and scalable due to a convolution-type smoothing approach of the objective function.
From the numerical studies in Section 4, we can see that the parameterization of quantile functions by GLD works very well, and our proposed renewable method is very close to the estimator directly using all data.

Figure 1 .
Figure 1.GLD estimation for quantile functions of three distributions

Figure 2 .
Figure 2. The time series plot of the daily return series: 1980-2022

Figure 3 .
Figure 3. VaR forecasts at the level of 10% and 90% from 1 January 2022 to 19 September 2022 by QR and SWCQR methods.Y is the true value

Figure 4 .
Figure 4. VaR forecasts at the level of 10% and 90% from 1 January 2022 to 19 September 2022 by SWCQR, SSWCQR, and RSSWCQR methods t } is strictly stationary and ergodic with E|Y t | r < ∞ for some 0 < r ≤ 1.The conditional density function of Y t on  t−1 ( f Y t | t−1 ) and its derivative function ( f ′ Y t | t−1 ) are uniformly bounded and f Y t | t−1 is positive on the support {x ∶ 0 < F Y t | t−1 (x) < 1}, where F Y t | t−1 is the distribution function of Y t on  t−1 .C2. { t } is strictly stationary, ergodic, non-negative and measurable with respect to X t with E( t ||X t || 3 2

Table II .
The means and standard deviations (in parentheses) of MAEs under different methods, quantile levels and errors for simulation example 1

Table III .
The means and standard deviations (in parentheses) of RSEs and MAEs (×100) with different hs under N(0, 1) error for simulation example 2

Table IV .
The means and standard deviations (in parentheses) of RSEs and MAEs (×100) with different hs under t(3) error for simulation example 2 Journal of Time Series Analysis published by John Wiley & Sons Ltd.DOI: 10.1111/jtsa.12725

Table V .
The means and standard deviations (in parentheses) of RSEs (×100) with different bs and methods under N(0, 1) error for simulation example 3

Table VI .
The means and standard deviations (in parentheses) of RSEs (×100) with different bs and methods under t(3) error for simulation example 3

Table VII .
Summary statistics for S&P500 returns January 1980 and 19 September 2022 with 10,772 observations in total.The data is downloaded from the website of Yahoo Finance (https://hk.finance.yahoo.com).The daily returns are computed as 100 times the difference of the log of the prices, that is, Y t = 100 ln(p t ∕p t−1 ), where p t is the daily price.Table VII collects the summary statistics of {Y t }, where the sample skewness -1.128 indicates possible asymmetries in the volatility, and the sample kurtosis 24.853 implies heavy tail of {Y t }.