Modified efficient importance sampling for partially non-Gaussian state space models

The construction of an importance density for partially non-Gaussian state space models is crucial when simulation methods are used for likelihood evaluation, signal extraction, and forecasting. The method of efficient importance sampling is successful in this respect, but we show that it can be implemented in a computationally more efficient manner using standard Kalman filter and smoothing methods. Efficient importance sampling is generally applicable for a wide range of models, but it is typically a custom-built procedure. For the class of partially non-Gaussian state space models, we present a general method for efficient importance sampling. Our novel method makes the efficient importance sampling methodology more accessible because it does not require the computation of a (possibly) complicated density kernel that needs to be tracked for each time period. The new method is illustrated for a stochastic volatility model with a Student's t distribution.


INTRODUCTION
For the modelling of an observed time series y_1, …, y_n, we consider a parametric model that we formulate conditionally on a dynamic latent factor or a time-varying parameter vector θ_t, for time index t = 1, …, n. The conditional model for the observations is given by

y_t | θ_t ~ind p(y_t | θ_t; ψ),   t = 1, …, n,   (1)

where ~ind is notation for serially independently distributed, ψ is an unknown and fixed parameter vector, and p(y_t | θ_t; ψ) is an observation density that is possibly non-Gaussian and may represent a nonlinear relation between y_t and θ_t. The density p(y_t | θ_t; ψ) is for a random variable y_t that is conditional on the latent random variable θ_t and is a function of the parameter vector ψ. Conditional on the sequence θ_1, …, θ_n, the observations y_1, …, y_n are serially independently distributed. The time-varying parameters in θ_t can represent different features of the model, including mean, variance, and regression effects that may not be constant over time. Different dynamic specifications for the parameters in θ_t can be adopted. In our analysis, the conditional observation density and the dynamic model for θ_t must be specified, and both may depend on the fixed parameter vector ψ.
When (a) the observation density for y_t conditional on θ_t is Gaussian, (b) the relation between y_t and θ_t is linear, and (c) the dynamic model for θ_t is linear and Gaussian, our time series modelling framework reduces to the linear Gaussian state space model as discussed and reviewed in, for example, Harvey (1989) and Durbin and Koopman (2012, Part I). In this framework, we can rely on the celebrated Kalman filter and its related smoothing method for the signal extraction of θ_t, the evaluation of the likelihood function for a specific value of ψ, and the forecasting of y_t. These methods provide minimum mean squared error estimates for θ_t and minimum mean squared error forecasts for y_t. Such optimal properties for estimates produced by the Kalman filter methods are not obtained when we depart from one of the three assumptions given above. For the class of partially non-Gaussian state space models, we replace the conditions (a) and (b) by the condition that the observation density for y_t conditional on θ_t can be non-Gaussian and that the relation between y_t and θ_t can be nonlinear. Under these more general conditions, we cannot rely on linear methods to obtain optimal estimates. In almost all cases of practical interest, we require numerical methods that often pose various computational challenges. For example, to evaluate the likelihood function for a partially non-Gaussian state space model, in most cases, we cannot use the Kalman filter or a related analytical filter. We need to evaluate the high-dimensional integral of the likelihood function directly using numerical methods. In this paper, we focus on Monte Carlo methods, in particular on importance sampling methods.
The general ideas of importance sampling are established in statistics and econometrics; see Kloek and van Dijk (1978), Ripley (1987), and Geweke (1989). Importance sampling techniques for state space models have been explored by Danielsson and Richard (1993), Shephard and Pitt (1997), Durbin and Koopman (1997), So (2003), and Jungbacker and Koopman (2007). A textbook treatment is given by Durbin and Koopman (2012, Part II). The performance of the Monte Carlo estimation method relies on the successful construction of an importance density.
In this paper, we consider the efficient importance sampling (EIS) method of Liesenfeld and Richard (2003), Richard and Zhang (2007), and Jung et al. (2011). The EIS method constructs the parameters of the importance density in such a way that the variance of the importance sampling weights (in logs) is minimized. Although this leads to a globally efficient importance sampling process, the method can be complicated due to the necessary computation of a density kernel for the importance density at each time period. Moreover, EIS is often custom-built, which makes it less accessible. Although Richard and Zhang (2007) have shown that EIS can be implemented for state space models, we show that EIS in the case of partially non-Gaussian state space models can be implemented using standard Kalman filter methods in a similar way as in Shephard and Pitt (1997) and Durbin and Koopman (1997). There is no need for new methods or major adjustments of existing methods. The implementation of our proposed modification of the EIS method for this class of models is therefore straightforward. Furthermore, we show that our modified EIS (MEIS) method can lead to considerable reductions in computing time. We emphasize, however, that both implementations of the EIS method should lead to the same estimation results, up to a level of numerical accuracy. It is shown by Koopman et al. (2014) that similar modifications can also be explored for the use of numerical integration methods in importance sampling.
The remainder of the paper is organized as follows. In Section 2, we briefly introduce partially non-Gaussian state space models. In Section 3, we introduce our new MEIS method by reviewing the EIS method for constructing the importance density and show how the EIS method can be implemented using state space methods. In Section 4, we discuss parameter estimation, signal extraction, and forecasting. In Section 5, we verify that EIS and MEIS produce similar estimation results in a Monte Carlo study and we indicate the reductions in computing time by MEIS. In Section 6, we apply our new methodology to four time series of financial returns that are analysed on the basis of the stochastic volatility model with a Student's t density. The empirical results are of general interest when analysing volatilities in stock markets. Section 7 reviews the developments and explores future research.

PARTIALLY NON-GAUSSIAN STATE SPACE MODELS
The specification of the observation density in a partially non-Gaussian state space model is given by p(y_t | θ_t; ψ), and its specification can be given as in (1) with the stochastically time-varying parameter vector θ_t. The linear Gaussian dynamic process for the state vector α_t is given by

α_{t+1} = d_t + T_t α_t + R_t η_t,   η_t ~ NID(0, Q_t),   α_1 ~ N(a_1, Q_0),   (2)

where the elements of the vector of intercepts d_t, transition matrix T_t, selection matrix R_t, and variance matrices Q_t and Q_0 are known, except that some elements may depend on the parameter vector ψ, for t = 1, …, n. The disturbances η_t are normally and independently distributed and do not depend on the normally distributed initial state vector α_1. All stochastic and nonstochastic variables have appropriate dimensions, and the dimensions will only be given when necessary. The observation y_t is typically a scalar, but the methods presented in Section 3 are also applicable for a vector of observations y_t. This class of models has originally been studied by Shephard (1994). Illustrations of special cases of our partially non-Gaussian dynamic modelling framework are given below.

Signal plus heavy-tailed noise model
When the time series observations y_t are randomly contaminated by noise with large shocks, we may wish to separate the noise from the signal and to model the noise explicitly by a heavy-tailed density. We then may consider the model

y_t = Z_t(α_t) + ε_t,   ε_t ~ t(0, σ², ν),   (3)

where the signal function Z_t(·) is fixed and known and may also depend on the parameter vector ψ, the stochastically time-varying state vector α_t is specified in (2), and t(μ, σ², ν) is the Student's t density with mean μ, variance σ², and degrees of freedom ν. The model clearly fits in our general framework, with observation Equation (1) given by y_t | θ_t ~ t(θ_t, σ², ν), where θ_t = Z_t(α_t) is the signal. Other heavy-tailed densities for ε_t can also be considered. A well-known example of a signal plus noise model is the basic structural time series model of Harvey (1989) in which the univariate time series y_t can be decomposed into trend, seasonal, and noise components,

y_t = μ_t + γ_t + ε_t,

where μ_t is the trend component, γ_t is the seasonal component, and both are elements of the state vector α_t. The dynamic specifications of the two components can be formulated jointly in the form of (2). The signal for this model is linear and is given by θ_t = Z_t(α_t) = μ_t + γ_t for t = 1, …, n. When the noise component ε_t is normally distributed, the time series analysis, including the maximum likelihood estimation of ψ, the signal extraction of θ_t (or, additionally, μ_t and γ_t), and the forecasting of y_t, can be based on the Kalman filter and related methods. However, when the noise ε_t is non-Gaussian as in (3), the resulting estimates of the Kalman filter are no longer optimal.
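To make the signal plus heavy-tailed noise idea concrete, the sketch below simulates a minimal special case: a random walk trend observed with Student's t noise. The function name and parameter defaults are illustrative choices of ours, not part of the paper; the t noise is scaled so that its variance equals σ², which requires ν > 2.

```python
import numpy as np

def simulate_signal_plus_t_noise(n, sigma=1.0, nu=5.0, seed=0):
    """Simulate a signal plus heavy-tailed noise model: a random walk
    trend (a minimal special case of the structural model) observed
    with Student's t noise, scaled so Var(noise) = sigma^2 (nu > 2)."""
    rng = np.random.default_rng(seed)
    mu = np.cumsum(rng.normal(0.0, 0.1, size=n))   # random walk trend
    scale = sigma * np.sqrt((nu - 2.0) / nu)       # t scale for variance sigma^2
    eps = scale * rng.standard_t(nu, size=n)       # heavy-tailed noise
    return mu + eps, mu
```
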

Stochastic volatility model
A time series of financial returns is often subject to clusters of volatility changes that can effectively be modelled by a dynamic process for the variance. A basic version of the stochastic volatility model for a time series of returns y_t is given by

y_t = μ + exp(θ_t / 2) ε_t,   ε_t ~ NID(0, 1),   (4)

where μ is a constant, the signal θ_t represents the time-varying log-variance of y_t, and ε_t is the normally distributed noise term. The specification for θ_t can be formulated as in the previous illustration. However, a more appropriate formulation for the signal is the stationary process

θ_t = α_{1t} + … + α_{pt},   α_{j,t+1} = φ_j α_{jt} + η_{jt},   (5)

where α_{jt} is the jth element of α_t with autoregressive coefficient 0 < φ_j < 1, and η_{jt} is the jth element of η_t, for j = 1, …, p, and where α_t and η_t are specified as in (2), for t = 1, …, n. It follows that the transition matrix T_t and the variance matrix Q_t in (2) are diagonal matrices, for t = 1, …, n, with their jth diagonal elements equal to φ_j and σ²_{η,j}, for j = 1, …, p. The conditional observation density (1), in logs, for this stochastic volatility model is given by

log p(y_t | θ_t; ψ) = −(1/2) log 2π − (1/2) θ_t − (1/2) (y_t − μ)² exp(−θ_t).   (6)

The stochastic volatility model can be extended in many ways. For example, leverage effects can be introduced by having dependence between ε_t and η_{jt}, for j = 1, …, p. Also, heavy-tailed density functions can be considered for the noise term ε_t. We refer to Shephard (2005) for extensive discussions on stochastic volatility models.
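A small simulation may help fix ideas; the sketch below generates data from the basic stochastic volatility model above with a single AR(1) volatility factor. Function name and default parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def simulate_sv(n, phi=0.95, sigma_eta=0.2, mu=0.0, seed=0):
    """Simulate a basic stochastic volatility model with one AR(1)
    log-variance factor: y_t = mu + exp(theta_t / 2) * eps_t."""
    rng = np.random.default_rng(seed)
    theta = np.empty(n)
    # initialize from the stationary distribution of the AR(1) signal
    theta[0] = rng.normal(0.0, sigma_eta / np.sqrt(1.0 - phi**2))
    for t in range(1, n):
        theta[t] = phi * theta[t - 1] + rng.normal(0.0, sigma_eta)
    y = mu + np.exp(theta / 2.0) * rng.standard_normal(n)
    return y, theta
```
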

Time-varying model for counts
A time series of counts can be analysed with the Poisson density with a time-varying log-intensity, that is, y_t | θ_t ~ Poisson(λ_t) with λ_t = exp(θ_t), where the signal θ_t is defined in the same way as in the earlier illustrations. Other densities from the exponential family can also be considered, such as the Binomial distribution, the negative Binomial distribution, and the Skellam distribution (difference of two Poisson variables).

MODIFIED EIS
We discuss the EIS and MEIS methods by considering likelihood evaluation. For a discussion of other applications in which MEIS plays an important role, including signal extraction, maximum likelihood estimation of ψ, and forecasting of future observations y_t, we refer to Section 4. The likelihood function of the Models (1) and (2) for the observed vector y = (y′_1, …, y′_n)′ and as a function of parameter vector ψ is given by

L(y; ψ) = ∫ p(y, α; ψ) dα = ∫ p(y|α; ψ) p(α; ψ) dα,   (7)

where α = (α′_1, …, α′_n)′. Analytical expressions for the typically high-dimensional integral are only available in specific cases. An example is the linear Gaussian state space model for which the Kalman filter can be used to evaluate the likelihood value for a given value of ψ. Numerical evaluation is usually dismissed because of the high-dimensional vector α. A Monte Carlo evaluation of the likelihood function is often explored as a feasible alternative. A basic version of a Monte Carlo estimate of (7) is

L̂(y; ψ) = M^{-1} Σ_{i=1}^{M} p(y | α^{(i)}; ψ),   (8)

where α^{(i)} refers to the ith simulated sample of α that is generated from the unconditional density p(α; ψ), with i = 1, …, M. The standard law of large numbers ensures that L̂(y; ψ) converges to L(y; ψ) as M → ∞. Because the simulation of α makes no reference to the data vector y, the efficiency of the estimate is very low and we need M to be extremely large. An efficient Monte Carlo method for the evaluation of integrals such as (7) is based on importance sampling techniques. Simulation-based methods are explored in statistics for different models and purposes. We will review importance sampling methods in the context of time series models. It is useful to express the likelihood function (7) in terms of individual time series observations. Given the serial independence properties for the observations y_t conditional on θ_t and for the disturbances η_t, we have

L(y; ψ) = ∫ Π_{t=1}^{n} p(y_t | θ_t; ψ) p(α; ψ) dα,   (9)

with p(α; ψ) = p(α_1; ψ) Π_{t=1}^{n−1} p(α_{t+1} | α_t; ψ).
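To make the inefficiency of the basic estimate (8) concrete, the following sketch implements it for the stochastic volatility example of Section 2.2: signal paths are drawn from the unconditional density, with no reference to the data, and the conditional densities are averaged. All names and default values are illustrative choices of ours.

```python
import numpy as np

def naive_mc_loglik(y, phi, sigma_eta, M=10_000, seed=0):
    """Naive Monte Carlo estimate of the log-likelihood of a basic
    Gaussian SV model: draw signal paths from p(theta; psi) with no
    reference to the data and average p(y | theta; psi) over draws.
    A hypothetical sketch; very inefficient unless M is extremely large."""
    rng = np.random.default_rng(seed)
    n = len(y)
    theta = np.empty((M, n))
    # stationary AR(1) signal, simulated forward for each draw
    theta[:, 0] = rng.normal(0.0, sigma_eta / np.sqrt(1.0 - phi**2), size=M)
    for t in range(1, n):
        theta[:, t] = phi * theta[:, t - 1] + rng.normal(0.0, sigma_eta, size=M)
    # Gaussian SV observation log-density, summed over time per path
    logp = -0.5 * np.sum(np.log(2.0 * np.pi) + theta + y**2 * np.exp(-theta), axis=1)
    m = logp.max()                      # stable log of the sample average
    return m + np.log(np.exp(logp - m).mean())
```
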

Importance density
For an introduction to Monte Carlo simulation methods and in particular the technique of importance sampling, we refer to Ripley (1987). To evaluate (9) via importance sampling, we introduce the importance density g(α|y; ψ) based on the linear Gaussian joint density g(y, α; ψ) with properly defined mean vector and variance matrix. The dependence on ψ originates from our Models (1) and (2). We adopt the decomposition g(y, α; ψ) = g(y|α; ψ) g(α; ψ). Because the dynamic specification for the state vector in (2) is linear and Gaussian, the equality g(α; ψ) = p(α; ψ) is valid. The Gaussian observation density can be expressed by the linear Gaussian observation model

x_t = θ_t + u_t,   u_t ~ NID(0, c_t^{-1}),   (11)

which has the conditional observation log-density function

log g(x_t | θ_t; ψ) = −(1/2) log 2π + (1/2) log c_t − (1/2) c_t (x_t − θ_t)².

The artificial variable x_t is defined as x_t = b_t / c_t, and we can substitute it into the observation log-density to obtain

log g(x_t | θ_t; ψ) = a_t + b_t θ_t − (1/2) c_t θ_t²,

where the constant a_t collects all terms that are not associated with θ_t and is given by a_t = −(1/2) log 2π + (1/2) log c_t − (1/2) b_t² / c_t.
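In code, the mapping from the kernel coefficients (b_t, c_t) to the artificial observations x_t = b_t / c_t and their noise variances c_t^{-1} is a one-liner; this hypothetical helper only makes the bookkeeping explicit and assumes every c_t is positive, so that the Gaussian kernel is a proper density.

```python
import numpy as np

def pseudo_observations(b, c):
    """Map the Gaussian kernel parameters (b_t, c_t) of
    log g = a_t + b_t*theta_t - 0.5*c_t*theta_t**2 into the equivalent
    linear Gaussian observation model x_t = theta_t + u_t with
    u_t ~ N(0, 1/c_t), where x_t = b_t / c_t.  Requires all c_t > 0."""
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    if np.any(c <= 0):
        raise ValueError("all c_t must be positive")
    x = b / c           # artificial observations
    var_u = 1.0 / c     # observation noise variances
    return x, var_u
```
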
It follows immediately that

g(y | α; ψ) ∝ Π_{t=1}^{n} g(x_t | θ_t; ψ),

where it is assumed that x_t is modelled by (11) for t = 1, …, n. This is a key result for the developments below. It implies that the analysis concerning g(y, α; ψ) = g(y|α; ψ) g(α; ψ) can be based on the Model (11), for which we can use the Kalman filter and related methods.

Likelihood evaluation via importance sampling
The actual importance density for the evaluation of the likelihood function (7) is chosen as g(α|y; ψ), from which the vector α^{(i)} is independently drawn for i = 1, …, M. Because we can represent g(y, α; ψ) by the linear Gaussian state space Models (11) and (2), we can simulate α from the conditional state density g(α|y; ψ) via simulation smoothing; see, for example, Frühwirth-Schnatter (1994), Carter and Kohn (1994), de Jong and Shephard (1995), and Durbin and Koopman (2002). Given the simulated realizations α^{(i)}, for i = 1, …, M, the likelihood estimate is

L̂(y; ψ) = L_g(y; ψ) M^{-1} Σ_{i=1}^{M} w_i,   w_i = Π_{t=1}^{n} w_{it},   w_{it} = p(y_t | θ_t^{(i)}; ψ) / g(x_t | θ_t^{(i)}; ψ),   (13)

where L_g(y; ψ) is the likelihood of the approximating linear Gaussian model, as computed by the Kalman filter, and θ_t^{(i)} = Z_t(α_t^{(i)}). Some practical issues on computing L̂(y; ψ) for the purpose of estimating parameters are discussed in Section 4.1. It can be expected that the Monte Carlo estimate (13) is more efficient than the estimate (8) because we simulate θ_t^{(i)} with reference to the data vector y.

Implementation of modified EIS
Here, we introduce our modified EIS method. It is based on the EIS of Richard and Zhang (2007), but we show that the method can be implemented using the simulation smoothing method. Hence, we obtain a computationally fast and more convenient implementation of EIS for time series models. The values for b_t and c_t, with t = 1, …, n, need to be determined before the calculation of (13) can start. Here, we follow Richard and Zhang (2007) and adopt their EIS method. They propose to choose b_t and c_t such that the variance of the log importance sampling weights is minimized. The resulting criterion,

I_t = ∫ w_t [log p(y_t | θ_t; ψ) − a*_t − b_t θ_t + (1/2) c_t θ_t²]² g(θ_t | y; ψ) dθ_t,   w_t = p(y_t | θ_t; ψ) / g(x_t | θ_t; ψ),   (14)

cannot be evaluated analytically, for the same reason that the likelihood function (7) cannot be evaluated analytically. Hence, we follow the same approach of introducing the importance density g(θ_t | y; ψ). The criterion to be minimized can then be expressed as

Î*_t = Σ_{i=1}^{M} w_{it} [log p(y_t | θ_t^{(i)}; ψ) − a*_t − b_t θ_t^{(i)} + (1/2) c_t θ_t^{(i)2}]²,   (15)

with respect to (b_t, c_t), where θ_t^{(i)} is obtained by sampling from g(θ|y; ψ). This minimization of Î*_t leads to a weighted least squares solution. In case θ_t = Z_t(α_t) is a scalar, we can define the regression coefficient vector β_t = (a*_t, b_t, c_t)′, where a*_t is the intercept and can be regarded as a deterministic function of b_t and c_t. The minimum is then obtained at

β̂_t = (X′_t W_t X_t)^{-1} X′_t W_t z_t,   (16)

where w_{it} is defined in (13), X_t is the M × 3 matrix with ith row (1, θ_t^{(i)}, −(1/2) θ_t^{(i)2}), W_t = diag(w_{1t}, …, w_{Mt}), and z_t is the M × 1 vector with ith element log p(y_t | θ_t^{(i)}; ψ), for t = 1, …, n. The sampling of θ_t^{(i)} from g(θ_t|y; ψ) constitutes a main difference with the EIS method. In our MEIS method, the computations for (16) are carried out by applying the simulation smoother to the model g(y, α; ψ) = g(y|α; ψ) p(α; ψ) that we typically represent by the Models (11) and (2).
However, observation Equation (11) requires values for b_t and c_t, for t = 1, …, n, which we want to establish via the least squares solution (16). Because the Gaussian kernel of the log-density log g(x_t|θ_t; ψ) acts effectively as a second-order Taylor approximation to log p(y_t|θ_t; ψ), around some value of θ_t, we can carry out the minimization iteratively as follows. We initially set values for b_1, …, b_n and c_1, …, c_n. A search for good starting values can be conducted, but in many cases of practical interest, any set of initial values works sufficiently well. Next, we simulate θ_t^{(i)} by means of simulation smoothing applied to the linear Gaussian Models (11) and (2), based on the current set of values for (b_t, c_t) with t = 1, …, n. A new set of values can then be obtained from (16). This iterative scheme continues until some level of convergence is obtained. It is assumed that at each iteration, when samples are generated from g(α|y; ψ) using a new set of values for (b_t, c_t) with t = 1, …, n, the same random numbers are used for computing θ^{(i)} so that a smooth convergence process takes place. The random numbers can be saved on the computer, or they can be generated by using the same random seed.
The MEIS implementation can be summarized in algorithmic form as follows:
1. Set k = 0 and initialize β^k_t = (a*_t, b_t, c_t)′, for t = 1, …, n.
2. Simulate θ^{(i)}_1, …, θ^{(i)}_n, for i = 1, …, M, from g(θ|y; ψ) based on the current values β^k_1, …, β^k_n, by means of a simulation smoothing algorithm where the seed of the random number generator is the same for every iteration k.
3. Set β^{k+1}_t = β̂_t, for t = 1, …, n, as obtained from the regression (16) using the simulated values θ^{(i)}_1, …, θ^{(i)}_n, for i = 1, …, M, from the previous step.
4. If the elements of (β^{k+1}_t − β^k_t) ⊘ β^k_t are sufficiently small, for t = 1, …, n, relative to some threshold value, the algorithm has converged and can be terminated; else set k = k + 1 and return to steps 2 and 3. We notice that ⊘ denotes Hadamard division (pointwise division).
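The regression step at the heart of the algorithm can be sketched, for a scalar signal at one time point, as a weighted least squares fit of log p(y_t | θ_t; ψ) on the Gaussian kernel regressors (1, θ, −θ²/2). The function below is an illustrative sketch of the solution in (16), not the paper's Ox/C implementation; all names are ours.

```python
import numpy as np

def eis_regression_step(theta_draws, logp, w):
    """One MEIS regression step for a scalar signal at time t:
    weighted least squares of logp[i] = log p(y_t | theta^{(i)}_t; psi)
    on (1, theta, -0.5 * theta**2) with weights w[i].  The slope
    coefficients are the updated (b_t, c_t)."""
    theta = np.asarray(theta_draws, dtype=float)
    X = np.column_stack([np.ones_like(theta), theta, -0.5 * theta**2])
    sw = np.sqrt(np.asarray(w, dtype=float))
    # weighted least squares via row scaling
    beta, *_ = np.linalg.lstsq(X * sw[:, None],
                               np.asarray(logp, dtype=float) * sw, rcond=None)
    a_star, b_t, c_t = beta
    return b_t, c_t
```

When the log-density is exactly quadratic in θ, the regression recovers its coefficients exactly, which is a convenient sanity check for an implementation.
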

A comparison with EIS
Our proposed modification of the EIS method, the MEIS method, is implemented quite differently from the original EIS method when applied to Models (1) and (2), although the objective function is the same. Hence, both methods will produce similar results, up to a level of numerical accuracy and simulation error. However, we argue that the implementation of MEIS is much simpler and more straightforward in comparison with EIS. Only standard state space methods are required for MEIS. In higher dimensions, MEIS will be computationally more efficient because it requires the simulation of the signal θ_t rather than the state vector α_t; in most models of practical interest, the dimension of the signal is much smaller.
The key insight is that we explore the representation of g(y, α; ψ) by the linear Gaussian state space Models (11) and (2) for the constructed variable x_t. This allows us to treat the EIS method on the basis of the computationally efficient Kalman filter and its related smoothing methods, including the simulation smoother. Richard and Zhang (2007) have proposed the minimization of (14) and have provided the solution (16). The key difference is the way the importance density is constructed. In their implementation of EIS, the researcher needs to do algebra with density kernels that can differ per application, and hence EIS is custom-built. They adopt a backwards scheme, starting from t = n towards t = 1, and need to track an integration constant so that each density at time t integrates to unity. We circumvent this time-consuming process because we interpret the density as a well-defined model for x_t and apply the simulation smoothing method of Durbin and Koopman (2002) for computing the draws θ_t^{(i)}, for t = 1, …, n, directly. Another key development of our MEIS method is that the simulations are based on the signal vector θ_t, in contrast to EIS where the simulations are with respect to the state vector α_t. In many empirical models of interest, the state vector is typically of a higher dimension than the signal vector, which has the same dimension as y_t. We therefore expect that in many studies, our implementation will gain computational efficiency. We emphasize here that in situations where the dimension of the state vector is lower than that of the signal vector, we simply base our simulations on the state vector so that we can always obtain draws with the least computational effort. Richard and Zhang (2007) show that an analytical solution to the variance minimization Equation (14) can be obtained by ordinary least squares if one restricts the choice of the importance density to the exponential family. Although this restriction is computationally convenient, it is not a necessary one, as in other cases, Equation (14) can be minimized by nonlinear methods. It should be noted that MEIS is restricted to Gaussian importance samplers; however, this is only a small limitation because Gaussian importance samplers are by far the most popular candidates due to their attractive properties.
Finally, both the EIS and MEIS methods can be generalized to state space models with a possibly nonlinear and non-Gaussian state vector updating equation. This requires an extension of the regression computations in step 3 of MEIS and simulations from another linear Gaussian state space model, as discussed in Durbin and Koopman (2012, Part II).

APPLICATIONS OF MEIS
In Section 3.2, we have shown how the likelihood function can be evaluated by the method of importance sampling. In this section, we briefly illustrate other applications of (modified) EIS and provide the details for an effective implementation.

Maximum likelihood estimation of 𝜓
The maximum likelihood estimate of the parameter vector ψ can simply be obtained via a numerical optimization method. Quasi-Newton methods are often used for this task. It may be clear that analytical expressions for the maximum likelihood estimate are not available except in some rare cases.
A number of numerical issues need to be addressed before the actual maximization of the likelihood function can take place. We evaluate the likelihood function as a Monte Carlo estimate. The use of different sets of random values for generating the importance draws α^{(i)}, with i = 1, …, M, clearly leads to different estimates of the likelihood function L(y; ψ). Because numerical optimization methods require smooth functions, we evaluate the likelihood functions using the same set of random values. In other words, the same "seed" of the random number generator is taken for each likelihood evaluation. The likelihood is then a smooth function of ψ only.
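The common random numbers device can be sketched with a toy Monte Carlo estimator: re-seeding the generator on every call means the same random numbers underlie every evaluation, so the estimate is a deterministic, and hence smooth, function of the parameter. The model inside is a hypothetical stand-in, not the importance sampling estimator of Section 3.

```python
import numpy as np

def mc_loglik(psi, y, M=1000, seed=12345):
    """Toy Monte Carlo log-likelihood: a latent effect psi * z_i is
    integrated out by simulation.  Re-seeding with the same seed on
    every call makes the estimate deterministic in psi, which is what
    a quasi-Newton optimizer needs."""
    rng = np.random.default_rng(seed)        # same seed on every call
    z = rng.standard_normal(M)               # common random numbers
    # log p(y | z_i): y_t ~ N(psi * z_i, 1) given the latent draw z_i
    logp = np.array([-0.5 * np.sum(np.log(2.0 * np.pi) + (y - psi * zi) ** 2)
                     for zi in z])
    m = logp.max()                           # stable log of the average
    return m + np.log(np.exp(logp - m).mean())
```

Two calls with the same ψ return exactly the same value, so the optimizer sees a smooth surface rather than Monte Carlo noise.
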
In practice, the log-likelihood function is maximized. However, the log of the estimate (13) is not equal to the estimate of the log-likelihood function. The bias in the log of the estimate can be approximately corrected on the basis of a second-order Taylor expansion. We therefore maximize the bias-corrected log-likelihood estimate

ℓ̂(y; ψ) = log L̂(y; ψ) + s²_w / (2 M w̄²),   s²_w = (M − 1)^{-1} Σ_{i=1}^{M} (w_i − w̄)²,   (17)

where ℓ(y; ψ) = log L(y; ψ), w_i = Π_{t=1}^{n} w_{it}, and w̄ = M^{-1} Σ_{i=1}^{M} w_i; see Durbin and Koopman (2012) for more details.
The bias-corrected log-likelihood estimate can be expressed as ℓ̂(y; ψ) = log L_g(y; ψ) + log w̄ + s²_w / (2 M w̄²). The computation of w_i, log w̄, and w̄^{-2} s²_w requires modifications for a numerically feasible and stable implementation. Define

a_i = log w_i,   ā = max_i a_i,   u_i = exp(a_i − ā).

The computation of a_i and ā is numerically stable. However, the computation of w_i = exp(a_i) can lead to numerical overflow problems, whereas the computation of u_i = exp(a_i − ā) is numerically stable. It follows that w_i = exp(ā) u_i. After some further minor manipulations, it can be shown that

log w̄ = ā + log ū,   w̄^{-2} s²_w = ū^{-2} s²_u,

where ū = M^{-1} Σ_{i=1}^{M} u_i and s²_u = (M − 1)^{-1} Σ_{i=1}^{M} (u_i − ū)². The bias-corrected log-likelihood estimate (17) is computed in a numerically feasible manner using these results.
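These manipulations amount to the familiar log-sum-exp pattern. A sketch, assuming the estimate takes the form log L̂ = log L_g + log w̄ and using ā = max_i a_i as the stabilizer (one convenient choice):

```python
import numpy as np

def bias_corrected_loglik(loglik_g, a):
    """Numerically stable bias-corrected log-likelihood estimate.
    a[i] = log w_i are the log importance weights; with
    a_bar = max(a) and u_i = exp(a_i - a_bar),
    log w_bar = a_bar + log u_bar and s2_w / w_bar^2 = s2_u / u_bar^2,
    so the raw (possibly overflowing) weights are never formed."""
    a = np.asarray(a, dtype=float)
    M = len(a)
    a_bar = a.max()
    u = np.exp(a - a_bar)                 # all in (0, 1]
    u_bar = u.mean()
    s2_u = u.var(ddof=1)
    return loglik_g + a_bar + np.log(u_bar) + s2_u / (2.0 * M * u_bar ** 2)
```

With equal weights the correction term vanishes and the estimate reduces to log L_g plus the common log weight, which gives a quick sanity check.
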

Signal extraction: Estimation of state and signal vectors
The estimation of α is based on the evaluation of the integral

α̂ = ∫ α p(α|y; ψ) dα = ∫ α w(y, α; ψ) g(α|y; ψ) dα / ∫ w(y, α; ψ) g(α|y; ψ) dα,

where w(y, α; ψ) = p(y|α; ψ) / g(y|α; ψ). We have argued that the evaluation of such an integral in a computationally efficient way can also be carried out by EIS. The two integrals can be evaluated by Monte Carlo simulation. The estimate of α̂ is then given by

α̃ = Σ_{i=1}^{M} w_i α^{(i)} / Σ_{i=1}^{M} w_i,

where w_i = Π_{t=1}^{n} w_{it} with w_{it} defined as in (13), and where both α^{(i)} and w_i are based on the draws from the importance density, that is, α^{(i)} ~ g(α|y; ψ). The draws are obtained by using the MEIS method described in Section 3.3. The numerator and denominator are typically computed by using the same random numbers, and therefore, we can base the estimate on normalized weights, that is,

α̃ = Σ_{i=1}^{M} w*_i α^{(i)},   w*_i = w_i / Σ_{j=1}^{M} w_j.

The signal is a function of the state vector: we have θ_t = Z_t(α_t) and θ = Z(α), where θ = (θ′_1, …, θ′_n)′. Using the same arguments as above, the estimate of θ is given by

θ̂ = ∫ Z(α) w(y, α; ψ) g(α|y; ψ) dα / ∫ w(y, α; ψ) g(α|y; ψ) dα,

and we evaluate it via the MEIS method to obtain

θ̃ = Σ_{i=1}^{M} w*_i Z(α^{(i)}),

where the normalized weight w*_i is defined as above. These arguments are also valid for any other known function of α; see also Durbin and Koopman (2000).
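The normalized-weight estimate can be sketched as follows; the helper works on log weights directly, so the normalization stays stable when the weights span many orders of magnitude. Names are illustrative.

```python
import numpy as np

def weighted_signal_estimate(theta_draws, logw):
    """Self-normalized importance sampling estimate of the signal:
    sum_i w*_i theta^{(i)} with w*_i = w_i / sum_j w_j, computed from
    log weights for numerical stability.  theta_draws has shape
    (M, n): one simulated signal path per row."""
    logw = np.asarray(logw, dtype=float)
    w = np.exp(logw - logw.max())      # stable: largest weight becomes 1
    w_star = w / w.sum()               # normalized weights
    return w_star @ np.asarray(theta_draws, dtype=float)
```
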

Forecasting
Forecasting requires the estimation of the state vector α_t at a time period t after n, that is, t > n. The same principles as for signal extraction can be applied. We denote the forecast of α_{n+j} by α̂_{n+j}, for j = 1, 2, …, and we compute it by importance sampling methods. It follows that

α̂_{n+j} = Σ_{i=1}^{M} w*_i α^{(i)}_{n+j},

where w*_i is the normalized weight defined in Section 4.2 and α^{(i)}_{n+j} ~ g(α_{n+j}|y; ψ). The computation of the draws α^{(i)}_{n+j} ~ g(α_{n+j}|y; ψ), for j = 1, 2, …, is facilitated by the simulation smoothing algorithm. It requires the extension of the data vector y with missing values for the time periods n + 1, …, n + j to obtain the draws from g(α_{n+j}|y; ψ) as required; see Durbin and Koopman (2012, Chapter 4). The forecasting of signal and observation vectors is carried out in a similar way. Observation forecasts can then be based on the signal forecasts.
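The missing-value device is mechanically simple; the sketch below only prepares the extended observation vector (the simulation smoother itself, which skips the updating step at missing observations, is not shown). Using NaN as the missing-value code is an implementation choice of ours.

```python
import numpy as np

def extend_with_missing(y, h):
    """Extend the observation vector with h missing values (NaN).
    Running the simulation smoother on the extended series then
    delivers draws of alpha_{n+1}, ..., alpha_{n+h} from the
    importance density, as described in the text."""
    y = np.asarray(y, dtype=float)
    return np.concatenate([y, np.full(h, np.nan)])
```
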

SIMULATION STUDY
We study the performance of the EIS and MEIS methods in a simulation study. We first verify whether both methods produce roughly the same estimation results with the same simulation efficiency. We expect the parameter estimates and the associated standard errors of both methods to be reasonably close to each other because both methods rely on the same objective function (14). We further provide an indication of the computational gains of MEIS. We expect these gains because MEIS requires simulation draws from the signal θ_t rather than draws from the state α_t in EIS.
We consider the stochastic volatility model with multiple volatility factors of Section 2.2, where we replace the Gaussian density p(ε_t) of Equation (4) by the Student's t distribution to obtain a stochastic volatility model with heavier tails, in short the SVt model. Because we have E(y_t|θ_t) = μ and Var(y_t|θ_t) = exp(θ_t), it follows that the log density is given by

log p(y_t | θ_t; ψ) = constant − (1/2) θ_t − ((ν + 1)/2) log[1 + (y_t − μ)² / ((ν − 2) exp(θ_t))],   (21)

where constant = log Γ((ν + 1)/2) − log Γ(ν/2) − (1/2) log((ν − 2)π). We let the signal consist of the sum of the individual state elements as specified in (5). In the case of p = 2, we can associate the first state element with long-run dependence and the second state element with short-run dependence; see, for example, Durham and Gallant (2002). The p × 1 state vector α_t = (α_{1,t}, …, α_{p,t})′ as defined in (2) has p × p system matrices that are constant over time and given by

T_t = diag(φ_1, …, φ_p),   Q_t = diag(σ²_{η,1}, …, σ²_{η,p}),   (22)

with |φ_j| < 1 and σ²_{η,j} > 0 for j = 1, …, p. The model is identified by imposing φ_1 > … > φ_p. For more information about the sum of autoregressive and moving average processes, we refer to Granger and Morris (1976). This dynamic specification in the context of the stochastic volatility Model (21) is also adopted by Koopman and Scharth (2013).
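As a check on the stated moments, the SVt log-density can be written out as follows; the Student's t scale is chosen so that Var(y_t | θ_t) is exactly exp(θ_t), which requires ν > 2. A sketch with illustrative names.

```python
import numpy as np
from math import lgamma, log, pi

def svt_logdensity(y, theta, nu, mu=0.0):
    """Log-density of the SVt observation model: y_t | theta_t is a
    scaled Student's t with mean mu, variance exp(theta_t), and
    nu > 2 degrees of freedom.  Squared scale exp(theta)*(nu-2)/nu
    makes the variance exactly exp(theta)."""
    s2 = np.exp(theta) * (nu - 2.0) / nu            # squared scale
    z2 = (y - mu) ** 2 / s2
    const = lgamma((nu + 1.0) / 2.0) - lgamma(nu / 2.0) - 0.5 * log(nu * pi)
    return const - 0.5 * np.log(s2) - 0.5 * (nu + 1.0) * np.log1p(z2 / nu)
```

For very large ν the density approaches the standard normal log-density, which provides a simple numerical sanity check.
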
We have simulated S = 500 return series of length n = 5,000 with true values of ψ as presented in Table 1, where the value of μ in (21) is fixed at 0. For all simulated time series, estimates of ψ are obtained from the EIS and MEIS methods as described in Section 3. The calculations are carried out in the Ox computing language of Doornik (2007) and the SsfPack library, which consists of a set of C routines for the Kalman filter and related methods; see Koopman, Shephard, and Doornik (2008). We notice that the implemented routines in Ox make callbacks to C for its matrix computations, but not when "for loops" are used. Therefore, we have programmed the time-consuming "for loops" of the EIS method in C to provide a fair comparison in speed between the two methods. For MEIS, we have used the simulation smoother of Durbin and Koopman (2002) to obtain draws from the signal θ_t, whereas for EIS, the draws from the state α_t are obtained by implementing the methods described in Jung, Liesenfeld, and Richard (2011). The maximum likelihood estimation of the parameter vector ψ is based on the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, where the starting values are set equal to the true parameter vector as given in Table 1. We present the results of this simulation study in Table 2. We can conclude from the reported estimation results that the EIS and MEIS methods deliver very similar and in some cases identical results, up to numerical precision, also when the standard errors of the estimates are considered in the comparisons. After comparing the sample standard errors of the parameter estimates and the variances of the demeaned importance weights, s²_u, we can also conclude that the simulation efficiencies of both methods are very similar overall. Hence, both methods produce results that are precise and numerically accurate. The computations are carried out on an i7-2600, 3.40 GHz desktop PC using one core. The convergence criterion of all three methods is set to 10^{-3}, and the algorithms converge most of the time in 4-6 iterations. One likelihood evaluation is based on M = 100 draws. Starting values are set to 90% of the true parameter values as given in Table 1. ν̂, φ̂_j, and σ̂_{η,j}, for j = 1, 2, 3, denote the means of the estimated parameters. s̄²_u denotes the mean of the variance of the importance sampling weights, that is, S^{-1} Σ_{i=1}^{500} s²_u. The numbers in parentheses are the sample standard errors of the estimates. Time in the last column is the average computing time (in seconds) for parameter estimation. EIS = efficient importance sampling; MEIS = modified EIS.
Further, we can learn from Table 2 that the MEIS method is somewhat faster than the EIS method, but not much faster. The EIS method becomes computationally less efficient for p > 1 because more computations are necessary to draw the p × 1 state vectors when p increases. Hence, we expect the difference in computation time between EIS and MEIS to become larger when the state dimension increases. Therefore, in our next exercise, we compare the computing times for calculating the likelihood for larger p dimensions and for longer time series. Table 3 clearly shows that the computational gains are small for small dimensions of the state vector, but the gains become much larger as the dimension of the state p or the length of the time series n increases. The irregular patterns in the reported gain fractions as n increases are due to the numerical iterative procedures that rely on convergence criteria and to our efficient computer implementations using multiple core platforms. We therefore take the reported fractions in Table 3 as indicative.

VOLATILITY MEASUREMENT FROM DAILY STOCK RETURNS
To investigate whether our modified EIS method has relevance in an empirical study with the purpose of measuring time-varying volatility in daily stock returns, we consider the stochastic volatility model with multiple volatility factors as adopted in Section 5. We first estimate the parameters φ_j and σ_{η,j}, for j = 1, …, p, from (5) or (22), and ν from (21), simultaneously by maximum likelihood. Then, we extract the log-volatility θ_t as specified in (5) on the basis of the model with estimated parameters.
In our empirical study, we consider four daily stock return series: the S&P100 composite index and three key U.S. firms (Ford, Bank of America (BAC), and IBM), all traded at the New York Stock Exchange. The estimated log-volatilities for the four return series analyzed by the SVt model with p = 2 are presented in Figure 2. The estimates are computed using the MEIS procedure, for which the details are presented in Section 4.2. The signal estimates are very similar, or in some cases virtually indistinguishable, for SVt models with p = 1 or p = 3. We display the three volatility signal estimates for the firms Ford, Bank of America, and IBM in three separate plots, each together with the S&P100 volatility signal. Hence, we can easily compare the individual volatility pattern of each firm with that of the market as represented by the S&P100 index. We learn from Figure 2 that the stock volatility patterns of both Ford and IBM generally move in common with the market. It is interesting that the same applies to BAC until the start of the financial crisis. At the height of the financial crisis in 2009, the BAC volatility was higher than the S&P100 volatility. During the aftermath of the financial crisis up to recently, the BAC volatility has remained relatively high.
The empirical illustration can be extended in various ways; we mention two possibilities. First, we can carry out a multivariate stochastic volatility analysis in which the stock index and the three individual stocks are stacked in the 4 × 1 observation vector y_t and jointly modelled with a common stochastic log-volatility signal. In this modelling approach, the individual stock volatility deviations from the common log-volatility signal can be measured directly. When the signal remains univariate, MEIS can still efficiently carry out the computations, whatever the dimension of y_t. Second, we can consider a heavy-tailed distribution for the state disturbances in order to capture breaks in volatility (such as for the stock Ford) more effectively. In addition, we can consider the incorporation of the so-called leverage effect. Such challenging extensions require further amendments of both the EIS and MEIS methods. We regard these considerations as interesting directions for future research.

CONCLUSIONS
We have presented a new modification of the EIS method for the analysis of partially non-Gaussian state space models, a class that includes a wide range of time series models of interest. In the original EIS method of Richard and Zhang (2007), the construction of the importance density relies on an iterative method in which, at each step, an importance density kernel needs to be computed. This process can be algebraically complicated, and the simulation of the state vectors in EIS can become computationally involved when models require a large state vector with time-varying parameters. In the MEIS method, we construct the same importance density based on a similar simulation method, but we show that the construction can fully rely on computationally efficient Kalman filter and smoothing methods. The modification therefore leads to a simpler importance sampling method, especially for large state dimensions. The details of this development have been presented here. To show the relevance of these methods in an empirical study, we have analyzed the volatility patterns in four U.S. stock return series and commented on our findings.

FIGURE 1 Daily financial returns from 7 October 2004 up to 6 October 2014 (10 years) for the S&P100 stock index and the stocks Ford, Bank of America (BAC), and IBM, all traded at the New York Stock Exchange and based on close prices

FIGURE 2 Estimated log-volatilities from 7 October 2004 up to 6 October 2014 (10 years) for the S&P100 stock index and the stocks Ford, Bank of America, and IBM, based on the SVt model given by (4) with Student's t density (21) and log-volatility signal (5) with p = 2

Time series of counts can be modelled by the Poisson density with the intensity parameter λ_t as a function of the time-varying signal θ_t that we can specify as in (1) and (2). The observation log-density function is then given by

log p(y_t | θ_t; ψ) = y_t log λ_t − λ_t − log(y_t!), θ_t = Z_t α_t, t = 1, …, n,

where θ_t represents the signal and Z_t α_t is the link function that connects the observation y_t with the state vector α_t. Examples of link functions for the signal θ_t are presented in the illustrations of Section 2.

It follows that the variables b_t and c_t are functions of the observations y_1, …, y_n and the parameter vector ψ, for t = 1, …, n. The constant a_t ensures that g(y_t | θ_t; ψ) integrates to unity, and hence it is a deterministic function of b_t and c_t. An effective importance sampler is obtained by selecting appropriate values for b_t and c_t, for t = 1, …, n. The design of the importance sampler is elegantly reduced to a choice of b_t and c_t, which determine the mean and variance implied by g(y_t | θ_t; ψ). The importance density g(y_t | θ_t; ψ) can alternatively be expressed in terms of the constructed variable x_t = b_t / c_t and the linear Gaussian model (11). The ratio w(y_t, θ_t; ψ) = p(y_t | θ_t; ψ) / g(y_t | θ_t; ψ) is referred to as the importance weight, for t = 1, …, n. The evaluation of the likelihood function by means of importance sampling takes place by simulating state vectors from the importance density g(θ | y; ψ) = g(y | θ; ψ) g(θ; ψ) / g(y; ψ), where g(θ; ψ) = p(θ; ψ). The likelihood function can be written as

L(ψ) = ∫ p(y | θ; ψ) p(θ; ψ) dθ = g(y; ψ) ∫ w(y, θ; ψ) g(θ | y; ψ) dθ.   (15)

The statements above are valid because b_t and c_t only have an impact on y_t and θ_t = Z_t α_t; they have no impact on y_j and θ_j = Z_j α_j with j ≠ t. Hence, the minimization of I_t with respect to (b_t, c_t) is equivalent to the minimization of I*_t. The evaluation and minimization of I*_t takes place via importance sampling.
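In practice, the likelihood in (15) is evaluated by averaging the importance weights over M simulated trajectories; working in logs with the usual max-shift avoids numerical underflow. A minimal sketch (the function and argument names are illustrative, not from the paper's code):

```python
import numpy as np

def is_loglik(log_g_y, log_w):
    """Log of the importance-sampling likelihood estimate implied by (15):
    log L-hat = log g(y; psi) + log of the average importance weight.
    The max of log_w is subtracted before exponentiating for stability."""
    m = np.max(log_w)
    return log_g_y + m + np.log(np.mean(np.exp(log_w - m)))

# sanity check: if every draw has weight one, the estimate is g(y; psi) itself
assert np.isclose(is_loglik(-3.2, np.zeros(100)), -3.2)
```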
1. Set k = 1 and set starting values (a*_t, b_t, c_t)′ for t = 1, …, n.
2. Construct the linear Gaussian state space model (11) for x_t = b_t / c_t based on these values; simulate
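The overall iteration, of which the first two steps are listed above, alternates between simulating from the current Gaussian importance model and refitting its coefficients until the change falls below a tolerance such as 10^-3. The loop can be sketched schematically; the `refit` callback below is a hypothetical stand-in for one full sweep of the model-building, simulation-smoothing, and regression steps:

```python
import numpy as np

def eis_iterate(b, c, refit, max_iter=20, tol=1e-3):
    """Schematic EIS fixed-point loop. `refit` is a hypothetical stand-in
    for one sweep: build the linear Gaussian model (11) from
    x_t = b_t / c_t, simulate state trajectories with the simulation
    smoother, and return updated (b_t, c_t) from weighted regressions."""
    for k in range(1, max_iter + 1):
        b_new, c_new = refit(b, c)
        delta = max(np.abs(b_new - b).max(), np.abs(c_new - c).max())
        b, c = b_new, c_new
        if delta < tol:
            return b, c, k       # converged, typically within a few sweeps
    return b, c, max_iter        # give up after max_iter sweeps

# toy check: a contraction with fixed point (b, c) = (1, 2) converges
b, c, k = eis_iterate(np.zeros(5), np.zeros(5),
                      lambda b, c: ((b + 1) / 2, (c + 2) / 2))
assert np.allclose(b, 1, atol=1e-2) and np.allclose(c, 2, atol=1e-2)
```

Fixing the random numbers inside `refit` across sweeps (common random numbers) is what makes such a fixed-point iteration well behaved in practice.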

TABLE 1
We report values of ψ that are used for simulating the time series of returns. Note. For the EIS and MEIS methods and for each state dimension p, we simulate S = 500 return series of length n = 5,000. The estimation results are presented in Table 2. The sum of the variances of the state components is kept constant for all p (Column 8), which means that the individual variances of the states do vary with p. EIS = efficient importance sampling; MEIS = modified EIS.

TABLE 2
We report results of the simulation study in which S = 500 time series of length n = 5,000 are simulated from the Student's t density as given by Equation (21). Note. After simulation, estimates of ψ are obtained for the EIS and MEIS methods. Details of the simulation study are as follows:

TABLE 3
We report the fraction t(EIS)/t(MEIS), where t(x) is the time in seconds required for method x to compute the log-likelihood function, for different state dimensions p and different time series lengths n.