Interpretation of point forecasts with unknown directive

Point forecasts can be interpreted as functionals (i.e., point summaries) of predictive distributions. We consider the situation where forecasters' directives are hidden and develop methodology for the identification of the unknown functional based on time series data of point forecasts and associated realizations. Focusing on the natural cases of state-dependent quantiles and expectiles, we provide a generalized method of moments estimator for the functional, along with tests of optimality relative to information sets that are specified by instrumental variables. Using simulation, we demonstrate that our optimality test is better calibrated and more powerful than existing solutions. In empirical examples, Greenbook gross domestic product (GDP) forecasts of the US Federal Reserve and model output for precipitation from the European Centre for Medium-Range Weather Forecasts (ECMWF) are indicative of overstatement in anticipation of extreme events.


Introduction
Forecasts are frequently the basis of crucial decisions. Yet, they are fraught with uncertainty due to imperfections in the observation, understanding, and modeling of the underlying mechanisms. To account for this uncertainty, it is increasingly recognized that forecasts should be probabilistic in nature. If forecasts are issued in the form of predictive distributions, it is straightforward to compute the action that maximizes the expected utility, test for optimality, or compare them to other forecasts (see Gneiting and Katzfuss, 2014, for a recent review of probabilistic forecasting).
However, point forecasts are still ubiquitous. Their interpretation requires assumptions on the decision process or directive that the forecasters used to generate the point forecasts (Elliott and Timmermann, 2008; Engelberg et al., 2009; Manski, 2016). A directive can be expressed through a functional (i.e., a real-valued summary) of the predictive distribution. It is a widely used assumption that the reported functional is the mean. However, there is often little justification for this choice. Knowing which functional was used by the forecaster is important, as it allows for proper interpretation, evaluation, testing, and comparison of point forecasts (Gneiting, 2011).
We consider point forecasts with an unknown directive, for which the forecaster only implicitly reported a certain functional of the predictive distribution. This situation can arise with expert forecasts or response items in surveys. Another important example of forecasts with unknown directives is output from complex computer models, which are often tuned by multiple individuals to achieve forecasts that the individuals perceive as optimal in a way that might neither be transparent nor explicitly defined. Such forecasts would be most informative if the user knew the directive under which the forecast was issued. Our goal here is to estimate the functional from a time series of point forecasts and associated realizations, and to construct tests regarding the properties of the functional. Once the functional has been estimated, the point forecasts can be coherently interpreted, improved, and compared to other point or probability forecasts.
Past work on estimating a directive based on point forecasts and realizations has focused on estimation of the loss function. Elliott et al. (2005) provide a generalized method of moments (GMM) estimator of the loss function for constant preferences and linear forecasting models. Patton and Timmermann (2007) apply this method to the U.S. Federal Reserve's gross domestic product (GDP) forecasts with a new class of loss functions, which consists of quadratic splines that are flexible with respect to a state variable. Recently, piecewise linear and piecewise quadratic loss functions have been used in various economic applications (Christodoulakis and Mamatzakis, 2008; Capistrán, 2008; Elliott et al., 2008; Krol, 2013; Pierdzioch et al., 2013; Wang and Lee, 2014; Fritsche et al., 2015). Komunjer and Owyang (2012) derive related estimators for multivariate forecasts and loss functions. Lieli and Stinchcombe (2013) discuss the recoverability of the loss function if conditional distributions are observable. In a neuroscience application, Körding and Wolpert (2004) estimate the loss function implicit in human sensorimotor control by varying targets in an experimental task. Sims (2015) uses a similar approach to infer the implicit loss function of the visual working memory. Guler et al. (2017) propose Mincer-Zarnowitz quantile and expectile regressions to account for asymmetric loss functions.
Here, we argue that the loss function is, in fact, not identifiable based solely on point forecasts and realizations. Hence, we propose to formalize point forecasts via functionals rather than loss functions. This allows for a more general definition of forecast optimality, and we show the existence of identifying moment conditions under weaker conditions. For estimation, we focus on state-dependent quantiles and expectiles, for which the level of asymmetry can depend on the current state. We propose a GMM estimator and show consistency and asymptotic normality under mild assumptions. We also discuss testing of forecast optimality and other forecast properties.
Our approach generalizes the findings of Elliott et al. (2005) to state-dependent forecasts that are not required to be linear functions of the instrumental variables. In comparison to Patton and Timmermann (2007), who consider state-dependent loss functions, our methods are more interpretable and fully theoretically justified. Further, our approach can be used for performance comparisons between point and probability forecasts and for creating density forecasts from point forecasts.
In a data example we illustrate that our approach yields accessible and scientifically relevant insights. Specifically, we show that the GDP Greenbook forecasts of the U.S. Federal Reserve can be interpreted as state-dependent quantiles. In a Monte Carlo study, our approach exhibits better calibrated and more powerful optimality tests than existing solutions, and we demonstrate that our more general definition of optimality can be used to distinguish different state-dependent forecasting behavior.
This manuscript is organized as follows. In Section 2, we introduce optimal forecasts and discuss non-parametric identification. In Section 3, we introduce a parametric GMM estimator in the time series setting, study its large sample behavior, and discuss tests of optimality and more specific hypotheses. In Section 4, we apply the method to the GDP Greenbook forecasts. In Section 5, we compare our methodology to the approach of Patton and Timmermann (2007) using simulated data. Section 6 serves as a discussion. Technical results and proofs are provided in the Appendix. Sections S1 through S4 contain additional details in an online Supplementary Material document.

Identification
In this section, we discuss non-parametric identification of the functional. Then, we describe the relationship between functionals and loss functions, and argue that loss functions cannot be identified.
Consider a real-valued random variable Y and a corresponding point forecast X, which is based on the information available to the forecaster, as encoded by some σ-algebra F. Commonly, a point forecast is interpreted as the mean of the conditional distribution L(Y | F), i.e., X = E(Y | F). Here and throughout the paper, equality of random variables is understood to hold almost surely. We proceed to a more general framework. Let α : P → R be a functional (Horowitz and Manski, 2006; Huber and Ronchetti, 2009, p. 9), i.e., a single-valued mapping from some class of probability distributions to the real line. We use the short notation α(Y | F) for α(L(Y | F)).
Definition 1 (optimal α-forecast). A random variable X is an optimal α-forecast of Y with respect to the information set F if X = α(Y | F). Throughout, we call a functional α symmetric if, for every symmetric distribution P with symmetry point c, it holds that c = α(P). Prominent alternatives to the mean functional are symmetric functionals, like the median, or asymmetric generalizations such as quantiles and expectiles. Now, crucially, we consider the situation in which the functional used by the forecaster and the conditional distributions L(Y | F) are unknown. In line with seminal extant work on professional economic forecasters (Elliott et al., 2005; Patton and Timmermann, 2007), we merely assume that the unknown conditional distribution constitutes a predictive distribution consistent with some information set F.
We consider single-valued scalar functionals throughout, although the results extend to the set-valued case under additional technical considerations. If R is a random variable (or vector), the relation R ∈ F indicates that R is F-measurable. The partial derivative of a function g(x, y) with respect to x is denoted as g^(x)(x, y).

Identifying moment conditions for functionals
It is well known that an optimal mean-forecast relative to the information set F implies an identifying moment condition (Diebold and Lopez, 1996): E[(X − Y) W] = 0, (1) where the components of the random vector W (henceforth called instruments) honor the information F available to the forecaster when the prediction is issued. The relation (1) allows testing whether a point forecast is an optimal mean-forecast. In the hypothetical limit of an infinite supply of data and instruments, a non-rejection of the test is a sufficient condition for mean-forecast optimality. This property of optimal mean-forecasts can be generalized to optimal α-forecasts: Let X be a forecast based on some information set F. For every sufficiently regular functional α : P → R, there exists a function V identifying the optimal α-forecast, i.e., X = α(Y | F) if and only if E[V(X, Y) W] = 0 for all instruments W ∈ F. (2) For a formal statement and regularity conditions see Lemma 1 in Appendix A. The proof applies recent results on identification functions (Steinwart et al., 2014) in the prediction space setting (Gneiting et al., 2007; Gneiting and Ranjan, 2013; Strähl and Ziegel, 2017). We can find the identification function V of a functional α via loss functions as described in Section 2.2 or by elementary considerations as exemplified in Section 3.1. The regularity conditions on the functional in Lemma 1 exclude some functionals (e.g., the mode) for their lack of continuity, and others (e.g., the variance) because they do not induce convex level sets on the class of absolutely continuous distributions. The moment conditions in (2) identify the functional only on the class of arising conditional distributions L(Y | F). For example, the mean and the median are distinct functionals in general, but if the considered distributions are symmetric, the two functionals coincide and cannot be distinguished.
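The moment condition (2) is straightforward to check by simulation. The following sketch generates an optimal τ-quantile forecast under a Gaussian design (the data-generating process and the instruments are our own illustrative choices, not the paper's) and verifies that the sample analogue of E[V(X, Y) W] vanishes for the quantile identification function V(x, y) = 1(y ≤ x) − τ.

```python
import numpy as np
from scipy.stats import norm

# Monte Carlo check of the identifying moment condition E[V(X, Y) W] = 0
# for an optimal quantile forecast, with V(x, y) = 1(y <= x) - tau.
# Illustrative sketch; the Gaussian design is our own assumption.
rng = np.random.default_rng(1)
T, tau = 200_000, 0.25

z = rng.normal(size=T)              # information available to the forecaster
y = z + rng.normal(size=T)          # Y | Z = z  ~  N(z, 1)
x = z + norm.ppf(tau)               # optimal tau-quantile forecast of Y given Z

v = (y <= x) - tau                  # identification function V(X, Y)
for w in (np.ones(T), z, x):        # instruments measurable w.r.t. sigma(Z)
    print(np.mean(v * w))           # each sample moment is close to 0
```

An instrument that uses information the forecaster did not have (e.g., Y itself) would violate the condition, which is what the optimality tests below exploit.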
In Section 3, we focus on optimal forecasts in the form of either state-dependent quantiles or state-dependent expectiles, which allows for unique identification without involving unduly strong assumptions on the data-generating process.

Relationship between functionals and loss functions
Previous work (e.g., Elliott et al., 2005) defined optimal point forecasts via loss functions L(x, y) as X = arg min_x E[L(x, Y) | F]. (3) Under regularity conditions, Equation (3) specifies a well-defined optimal α_L-forecast, where α_L(P) = arg min_x E_{Y∼P}[L(x, Y)] (4) for a suitable class P of probability distributions. For example, the mean-functional can be defined as minimizing expected quadratic loss, L(x, y) = (x − y)², for probability distributions with finite second moments. Some functionals, such as the expected shortfall and the mode, cannot be defined via loss functions for broad classes of probability distributions (Gneiting, 2011; Heinrich, 2014). Consequently, the optimal α-forecast from Definition 1 constitutes a more general notion of optimality. The conditions for the existence of the identifying moment conditions (2) are satisfied for any functional defined via a continuous loss function; see Appendix A for details. In fact, every functional α with identifying moment conditions (2) can be defined via a loss function under regularity conditions on L(Y | F). In this case, the identification function V derives from the partial derivative L^(x)(x, y) of any loss function L that defines α (Steinwart et al., 2014, Thm. 8; Fissler and Ziegel, 2016, Thm. 3.2). This fact can be used to find identification functions for functionals defined via loss functions.

Nonidentifiability of loss functions
For a specific functional, there might be many loss functions defining it (Gneiting, 2011; Ehm et al., 2016). It is therefore impossible to identify the shape of the loss, as all these functions lead to the same functional-forecast and identical moment conditions. For example, given any convex and differentiable function Φ, the Bregman loss function L_Φ(x, y) = Φ(y) − Φ(x) − Φ'(x)(y − x) induces an optimal mean-forecast (Savage, 1971). Hence, loss functions are not identified, but functionals are identified to the extent that they differ on the predictive distributions.
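The nonidentifiability argument can be made concrete numerically: two quite different Bregman losses produce exactly the same optimal point forecast. A minimal sketch, with the two choices of Φ and the skewed outcome distribution being our own illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Two different Bregman losses L_Phi(x, y) = Phi(y) - Phi(x) - Phi'(x)(y - x)
# yield the same optimal point forecast, the mean -- so the shape of the loss
# cannot be recovered from forecasts and realizations alone.
rng = np.random.default_rng(7)
y = rng.gamma(shape=2.0, scale=1.5, size=100_000)   # skewed outcomes, E[Y] = 3

bregman_risks = {
    "quadratic (Phi = t^2)": lambda x: np.mean((x - y) ** 2),
    "quartic   (Phi = t^4)": lambda x: np.mean(y**4 - x**4 - 4 * x**3 * (y - x)),
}
for name, risk in bregman_risks.items():
    x_opt = minimize_scalar(risk, bounds=(0.1, 10.0), method="bounded").x
    print(name, x_opt)              # both minimizers equal the sample mean of y
```

Both empirical risks are minimized at the sample mean, so data on (x, y) pairs alone cannot distinguish which loss the forecaster used.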

Parametric estimation and testing of state-dependent quantiles and expectiles
We turn to parametric estimation of possibly varying functionals in the time series setting. Consider a stochastic process {(X_t, Y_t, Z_t) : t = 1, 2, . . .} of forecasts, observations, and state variables, for which we have a sample path {(x_t, y_t, z_t) : t = 1, . . . , T}. Our goal is to infer the functional that the point forecasts represent. We assume that at each point in time an optimal α-forecast is issued, i.e., X_t = α(Y_t | F_t). In the situation of an h-step ahead forecast, the available information is typically generated by lagged variables of the outcome Y and the vector-valued state variable Z, in which case F_t = σ(Y_1, . . . , Y_{t−h}, Z_1, . . . , Z_{t−h}). For ease of notation, statements about all time points are often denoted without subscripts. For example, we write X = α(Y | F) instead of X_t = α(Y_t | F_t) for all t. Extending Definition 1, we allow the functional α to depend on the current situation, represented by the F-measurable state variable Z. We call this a state-dependent functional. Asymmetric and state-dependent point forecasts can arise for a variety of reasons, including varying preferences of the forecaster, asymmetric information, and non-linear transformation of data (see Section S1 for details).
In light of the results and discussion in Section 2, we assume that the true functional is a state-dependent quantile, or that it is a state-dependent expectile of L(Y |F). By restricting the class of feasible functionals to only quantiles (or expectiles), the functionals induce distinct forecasts under minimal assumptions, which guarantees unique identification even under state-dependence.

State-dependent quantiles and expectiles
The τ-quantile functional q_τ(P) of a distribution P with continuous and strictly increasing cumulative distribution function is the unique solution x to the equation P((−∞, x]) = τ. We can express this directly in terms of the identification function of the τ-quantile, V(x, y) = 1(y ≤ x) − τ. While quantiles are asymmetric generalizations of the median, expectiles are analogously defined as asymmetric generalizations of the mean. Specifically, the τ-expectile e_τ(P) of a distribution P with finite mean was introduced in Newey and Powell (1987) as the unique solution x to the equation τ E_{Y∼P}[(Y − x)+] = (1 − τ) E_{Y∼P}[(x − Y)+]. This is equivalent to E_{Y∼P}[V(x, Y)] = 0 with the identification function V(x, y) = |1(y ≤ x) − τ| (x − y). Hence, quantiles and expectiles can be identified under weak assumptions on the conditional distributions. We allow for additional flexibility and let the level τ of the quantile or expectile depend on the state variable z via a parametric function m(z, θ).
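Both functionals can be computed as roots of their empirical identification functions. A sketch on simulated data (the lognormal outcome distribution is our own choice), using the standard identification functions for quantiles and expectiles:

```python
import numpy as np
from scipy.optimize import brentq

# Quantiles and expectiles as roots of empirical identification functions:
# V(x, y) = 1(y <= x) - tau for the tau-quantile, and
# V(x, y) = |1(y <= x) - tau| (x - y) for the tau-expectile.
rng = np.random.default_rng(3)
y = rng.lognormal(size=200_000)     # asymmetric outcomes

def quantile_id(x, tau):            # empirical mean of 1(y <= x) - tau
    return np.mean(y <= x) - tau

def expectile_id(x, tau):           # empirical mean of |1(y <= x) - tau| (x - y)
    return np.mean(np.abs((y <= x) - tau) * (x - y))

q25 = np.quantile(y, 0.25)          # root of quantile_id(., 0.25)
e25 = brentq(expectile_id, 1e-6, 50.0, args=(0.25,))
e50 = brentq(expectile_id, 1e-6, 50.0, args=(0.5,))
print(q25, e25)                     # the 0.25-quantile and 0.25-expectile differ
print(e50, y.mean())                # the 1/2-expectile is the (sample) mean
```

For asymmetric distributions the τ-quantile and τ-expectile differ, which is what makes the two functional classes distinguishable in the tests below.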
Definition 2 (specification model). Let Θ be a subset of R p and suppose that the state variable z takes values in R k . A specification model is a function m(z, θ) that maps R k × Θ into the unit interval (0, 1).
We say that a specification model m(z, θ) is continuous (continuously differentiable) if it is continuous (continuously differentiable) in θ ∈ Θ for every z.
Let us consider examples of such specification models, shown in Table 1, where we assume that z is real-valued. The special case of a constant model assumes that the forecaster always states the θ-quantile or expectile and was implemented in previous work (Elliott et al., 2005; Christodoulakis and Mamatzakis, 2008; Krol, 2013; Pierdzioch et al., 2013; Fritsche et al., 2015). The break model generalizes the constant model to allow for a structural break in the risk assessment at the threshold value t. The linear model specifies the dependence of the quantile or expectile level on the state z, where the logistic function Ψ(x) = (1 + exp(−x))^{−1} ensures that it lies in the unit interval. Lastly, we suggest a method to detect periodic asymmetry in the forecasts. The periodic model provides information about the base level θ1 and the magnitude θ2 and period θ3 of the periodic asymmetry.
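The specification models described above can be written in a few lines. Since Table 1 is not reproduced here, the exact parameterizations below (in particular the placement of the break threshold t and the form of the periodic term) are our own illustrative guesses that follow the text:

```python
import numpy as np

# Specification models m(z, theta) mapped into (0, 1) via the logistic
# link Psi(x) = (1 + exp(-x))^{-1}.  Parameter layouts are illustrative.
def psi(x):
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def m_constant(z, theta):           # always report the theta1-level
    return np.full_like(np.asarray(z, dtype=float), theta[0])

def m_break(z, theta, t=0.0):       # structural break at the threshold t
    return np.where(np.asarray(z, dtype=float) <= t, theta[0], theta[1])

def m_linear(z, theta):             # level depends linearly on the state z
    return psi(theta[0] + theta[1] * np.asarray(z, dtype=float))

def m_periodic(z, theta):           # base level, magnitude, period of asymmetry
    zf = np.asarray(z, dtype=float)
    return psi(theta[0] + theta[1] * np.sin(2 * np.pi * zf / theta[2]))

z = np.linspace(-2.0, 2.0, 5)
print(m_linear(z, (-0.20, 0.18)))   # quantile levels below 1/2 for negative states
```

All four models return levels in the unit interval, as Definition 2 requires.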

GMM estimator
Let us now assume that the forecast X is an optimal quantile-forecast with state-dependent level described by the specification model m(z, θ0), i.e., X = q_{m(Z,θ0)}(Y | F). Crucially, we assume that Z_t is F_t-measurable, and it follows that E[(1(Y ≤ X) − m(Z, θ0)) W] = 0 for every instrument W ∈ F. (5) We refer to g(θ) = (1(y ≤ x) − m(z, θ)) w as the moment function. Given a sample path of F_t-measurable instrumental variables w_t = (w_{t,1}, . . . , w_{t,q})', the empirical mean of the moment function is ĝ_T(θ) = (1/T) Σ_{t=1}^T (1(y_t ≤ x_t) − m(z_t, θ)) w_t. Then, the standard GMM estimator is obtained by minimizing a quadratic form in the empirical moment, θ̂_T = arg min_{θ∈Θ} ĝ_T(θ)' Ŝ_T^{−1} ĝ_T(θ), (6) where the weighting matrix derives from a heteroskedasticity and autocorrelation consistent (HAC) estimator Ŝ_T of the covariance matrix of the moment function g, as proposed by Newey and West (1987). Throughout this study, we use the standard two-step GMM procedure proposed in Hansen (1982) to find Ŝ_T^{−1} in (6). If the specification model m is continuous, there exists an F_t-measurable instrumental variable W_t such that the GMM estimator is consistent for the true parameter value θ0 ∈ Θ. For a formal statement see Theorem 1 in Appendix B. The crucial point is to establish unique identification of the parameter without restrictive assumptions on the unobservable conditional distributions L(Y | F). Analogously, one can assume an expectile forecast X = e_{m(Z,θ0)}(Y | F) and use the moment function g(θ) = |1(y ≤ x) − m(z, θ)| (x − y) w, which also leads to a consistent estimator.
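The two-step procedure can be sketched compactly for the simplest case, a constant quantile level (p = 1) with three instruments. The design below is our own: the data are i.i.d., so the HAC estimator of the moment covariance reduces to the plain sample covariance of g_t.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Two-step GMM sketch for a constant quantile level, with moment function
# g_t(theta) = (1(y_t <= x_t) - theta) w_t.  I.i.d. Gaussian design assumed.
rng = np.random.default_rng(11)
T, tau0 = 5_000, 0.3
z = rng.normal(size=T)
y = z + rng.normal(size=T)
x = z + norm.ppf(tau0)                       # optimal tau0-quantile forecasts

w = np.column_stack([np.ones(T), x, z])      # instruments, q = 3 > p = 1

def moments(theta):                          # T x q matrix of g_t(theta)
    return ((y <= x) - theta)[:, None] * w

def objective(theta, weight):                # quadratic form in the sample moment
    gbar = moments(theta[0]).mean(axis=0)
    return gbar @ weight @ gbar

# Step 1: identity weighting.  Step 2: weight by the inverse covariance of g_t.
step1 = minimize(objective, x0=[0.5], args=(np.eye(3),), method="Nelder-Mead").x
S_hat = np.cov(moments(step1[0]).T)
step2 = minimize(objective, x0=step1, args=(np.linalg.inv(S_hat),), method="Nelder-Mead").x
print(step2[0])                              # close to the true level tau0 = 0.3
```

In the dependent-data setting of the paper, S_hat would instead be the Newey-West HAC estimator evaluated at the first-step estimate.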
We thus establish the existence of a suitable instrument to achieve consistency without assuming a linear forecasting model or knowledge of the forecasting model's parameters. Note that under the assumption of a linear forecasting model, Elliott et al. (2005) obtain the stronger result that any subset of the used information identifies their one-dimensional parameter. We discuss the choice of the instrumental variables w in Section 3.3.
Once the identification of the system is established, standard GMM theory (Hansen, 1982) provides a range of useful asymptotic results. If the specification model m(z, θ) is continuously differentiable, the GMM estimator is asymptotically normal, √T (θ̂_T − θ0) →d N_p(0, (G' S^{−1} G)^{−1}), (7) where p is the dimension of the parameter vector, S is the covariance matrix of the moment function, and G is the expectation of its partial derivative with respect to θ.

Testing optimality with unknown directive
The so-called test of overidentifying restrictions (Hansen, 1982) can be used to test forecast optimality. Specifically, if the dimension q of the instrument vector W is greater than the dimension p of the parameter vector θ, it holds that J_T = T ĝ_T(θ̂_T)' Ŝ_T^{−1} ĝ_T(θ̂_T) →d χ²_{q−p}, (8) where J_T is called the J-statistic. We now discuss an important aspect of our optimality definition: a point forecast can only be defined as optimal with respect to a specific functional and a specific information set. The choice of instruments W in (5) determines the information set for which we test. If a forecast is optimal with respect to F, it also satisfies the moment conditions for any information set G ⊆ F, because E[V(X, Y) W] = E[E[V(X, Y) | F] W] = 0 for every W ∈ G ⊆ F, where V is the identification function of α. If a test with instruments W rejects optimality, the point forecast is not optimal with respect to any information set F that contains the information set σ(W) generated by W. Thus, the null hypothesis in the test of overidentifying restrictions is H0: there exists θ0 ∈ Θ such that X = q_{m(Z,θ0)}(Y | F) with σ(W) ⊆ F.
We use this in Section 5.1 to explore the size and power of optimality tests.
An optimal yet uninformed point forecast can only be rejected if appropriate instruments are available. Furthermore, a misspecified or non-optimal forecast can still form an optimal forecast with respect to a smaller information set or a more flexible class of functionals. For this reason, the choice of instruments and specification models is a crucial part of inference based on our estimators and optimality tests. To obtain power against forecasts that are optimal with respect to different specification models, we propose to include the forecast as an instrument; see Section S4.

Specification tests for forecasting behavior
The estimation of specification models m(z, θ) provides a suitable framework for testing specific hypotheses about forecasting behavior. In general, any restriction R(θ0) = 0 on the model m(z, θ), where R : Θ → R^l is differentiable, can be tested using the Wald statistic (e.g., Greene, 2012) W_T = R(θ̂_T)' (R^(θ)(θ̂_T) Σ̂_T R^(θ)(θ̂_T)')^{−1} R(θ̂_T), where R^(θ)(θ) = ∂R(θ)/∂θ is the gradient of R evaluated at θ, Σ̂_T is the covariance estimate of θ̂_T implied by (7), and R^(θ)(θ0) Σ̂_T R^(θ)(θ0)' is non-degenerate. If R(θ0) = 0, then W_T converges in distribution to a χ² distribution with l degrees of freedom. For illustration, we discuss restrictions in the models introduced in Table 1. In the linear model, we can test if the forecasting behavior is constant with respect to the state z, which corresponds to the restriction θ2 = 0. The break model facilitates tests for the hypothesis that there is no structural break, in which case θ1 − θ2 = 0. This specification test is closely related to the state-dependence test in Caunedo et al. (2013). Finally, we can use the periodic model to test for the presence of periodic asymmetry, i.e., θ2 = 0. In all of these examples, joint tests of optimality and of the restriction can be carried out based on the asymptotic distribution in (8).
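For the one-dimensional restriction θ2 = 0 in the linear model, the Wald statistic reduces to a squared t-ratio. In the sketch below, the point estimate matches the paper's later result (−0.20, 0.18), but the covariance matrix of the estimator is a hypothetical stand-in of our own, so the resulting p-value is illustrative only:

```python
import numpy as np
from scipy.stats import chi2

# Wald test sketch for the restriction R(theta) = theta_2 = 0 in the
# linear specification model.  Sigma_hat is hypothetical.
theta_hat = np.array([-0.20, 0.18])
Sigma_hat = np.array([[0.010, 0.002],
                      [0.002, 0.004]])       # hypothetical Cov(theta_hat)

def R(theta):                                # restriction: theta_2 = 0
    return np.array([theta[1]])

R_grad = np.array([[0.0, 1.0]])              # gradient dR/dtheta, an l x p matrix

# Sigma_hat is a finite-sample covariance, so no extra factor of T is needed.
wald = R(theta_hat) @ np.linalg.inv(R_grad @ Sigma_hat @ R_grad.T) @ R(theta_hat)
p_value = chi2.sf(float(wald), df=1)         # chi-squared with l = 1 df
print(float(wald), p_value)
```

The same template covers the break-model restriction θ1 − θ2 = 0 by setting R(θ) = θ1 − θ2 and R_grad = [[1, −1]].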

GDP growth forecasts as state-dependent quantiles
We present an application to the GDP Greenbook forecasts of the Federal Reserve. As realized values, we take the quarterly real GDP growth rate over the period 1969 to 2011 (T = 172 observations) as reported in the first release. Standard tests based on the mean functional reject optimality. Patton and Timmermann (2007) model the loss function as a quadratic spline with three nodes whose shape is allowed to change with the growth rate.
Here, we interpret the forecasts as state-dependent quantiles of the Federal Reserve's (implicit) predictive distributions. To investigate whether the reported quantile changes with the predicted GDP growth rate, we apply the linear specification model m(x, θ) = Ψ(θ1 + xθ2) as described in Section 3.1. As instrumental variables w for the GMM estimator, we use a constant, the forecast, and the one-quarter-lagged value of the outcome. In a test of overidentifying restrictions (Section 3.3), we obtain a J-statistic of 1.63 with a p-value of 0.20. Consequently, there is no reason to reject optimality if we allow for state-dependent quantile forecasts.
Compared to the spline loss function of Patton and Timmermann (2007), we use two instead of six parameters, and our more powerful test (Section 5.1) does not reject the hypothesis of an optimal forecast. The need for additional instruments due to the large number of parameters in the spline approach is a concern for small sample sizes.
We obtain the estimate θ̂_T = (−0.20, 0.18)'. As illustrated in Figure 1, the forecasts can be interpreted as m(x, θ̂_T)-quantiles that depend on the predicted growth rate x. The Federal Reserve reports lower quantile levels during times of low growth, so forecasts are more conservative during recessions. The covariance estimate implied by (7) is given by Σ̂_T = T^{−1} (Ĝ_T' Ŝ_T^{−1} Ĝ_T)^{−1}, where Ĝ_T is the sample moment of G evaluated at θ = θ̂_T. As Ψ(·) is strictly monotone, we can compute pointwise confidence intervals for θ̂1 + xθ̂2 and transform them into confidence intervals for m(x, θ̂) = Ψ(θ̂1 + xθ̂2), as illustrated in Figure 1. The Wald test introduced in Section 3.4 for the hypothesis that θ2 = 0 (p-value = 0.01) suggests that the underlying preferences are not only asymmetric but also flexible with respect to a state variable. When applying the lagged outcome y_{t−1} instead of the predicted value x_t as state variable, the test of overidentifying restrictions rejects optimality. Consequently, the data are not consistent with a forecaster simply overweighting the predictive content of the current growth rate. (The results in this section are robust to using the second revision or the 2017Q1 vintage.)
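The confidence-interval transformation described above can be sketched in a few lines. The point estimate is the paper's θ̂_T = (−0.20, 0.18); the covariance matrix below is a hypothetical stand-in, since the estimated Σ̂_T is not reported here:

```python
import numpy as np

# Pointwise CIs for the quantile level: a normal interval for
# eta(x) = theta_1 + x * theta_2 is pushed through the strictly monotone
# link Psi to an interval for m(x, theta) = Psi(eta(x)).
psi = lambda t: 1.0 / (1.0 + np.exp(-t))
theta = np.array([-0.20, 0.18])
Sigma = np.array([[0.010, 0.002],
                  [0.002, 0.004]])           # hypothetical Cov(theta_hat)

def level_ci(x, zcrit=1.96):
    a = np.array([1.0, x])                   # gradient of eta w.r.t. theta
    eta = a @ theta
    se = np.sqrt(a @ Sigma @ a)
    return psi(eta - zcrit * se), psi(eta), psi(eta + zcrit * se)

lo, mid, hi = level_ci(2.0)                  # predicted growth rate x = 2
print(lo, mid, hi)                           # 95% interval for the quantile level
```

Because Ψ is strictly increasing, the transformed endpoints retain exact 95% pointwise coverage of the true level Ψ(θ1 + xθ2).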

Constant expectile-forecasts with varying information sets
We generate optimal expectile-forecasts at the constant level τ = 1/2.85. For a Gaussian conditional distribution L(Y | F) = N(µ, σ), we obtain the asymmetric forecast x = e_{1/2.85}(N(µ, σ)) = µ + σ e_{1/2.85}(N(0, 1)) ≈ µ − σ/4. Let I_t be the filtration generated by the time series, I_t = σ(Y_t, Y_{t−1}, . . .). Applying the standard GMM two-step estimator with the linear specification model using the lagged outcome y_{t−1} as state variable, we perform the overidentifying-restriction tests of forecast optimality from Section 3.3 at significance level 5%. We compare the tests from Section 4 based on linear state-dependent quantiles and expectiles to the flexible spline test introduced in Patton and Timmermann (2007). Using one node only and applying the F_t-measurable state variable y_{t−1} instead of y_t, the spline test reduces to the expectile linear state-dependent test and the asymptotic results of Section 3 apply. As instruments w_t, we use a constant, the forecast, the lagged forecast error, the squared lagged forecast error, and one additional lag of the final three variables.
In Figure 2, we see that the quantile- and expectile-based optimality tests are better calibrated than the spline test. This addresses a known problem of the state-dependent spline test, which "appears to require large samples (T ≥ 1000) before the test's size is close to its nominal value, and thus rejections obtained using this test must be interpreted with caution" (Patton and Timmermann, 2007, p. 1183). In contrast to the spline-based estimation, the state-dependent quantile and expectile models provide insightful point estimates and confidence intervals even for moderate sample sizes.
For the power analysis, we construct a 2-step ahead forecast which is optimal with respect to the lagged information set F_t = I_{t−2} = σ(Y_{t−2}, Y_{t−3}, . . .). For the conditional distribution we obtain µ_{Y_t|F_t} = µ_{Y_t|I_{t−2}} = (1/4) y_{t−2} and σ²_{Y_t|F_t} = σ²_{Y_t|I_{t−2}} = (23/20) σ²_{t−1} + 1/10, as shown in Supplementary Section S3. This produces an optimal expectile-forecast with respect to the information set I_{t−2}, which is not optimal with respect to the information set I_{t−1} of variables observable when issuing the forecast. (One modification of the spline test has been implemented: the nodes of the splines are located at zero and the average positive and negative forecast errors, instead of the median positive and negative forecast errors, as the medians are close to zero.) The setting allows us to evaluate the power of the optimality test against information rigidities (Coibion and Gorodnichenko, 2015). A well-performing test accepts optimality for lagged instruments based on F_t = I_{t−2} according to the 5% level of the test, and rejects optimality for non-lagged instruments based on I_{t−1}. The results of this experiment are presented in Figure 3. The quantile- and expectile-based optimality tests are better calibrated and more powerful than the spline-based test, which is strongly oversized for small sample sizes and unable to consistently detect the information rigidity even for large sample sizes. Hence, our tests are not only better calibrated, but also more powerful. The advantage of the functional-based tests is even greater for more flexible models, where the spline-based test would require a large number of instruments. Applying fewer instruments improves the finite-sample performance of the quantile- and expectile-based tests even more (based on additional experiments not shown here). However, for the sake of comparability we have kept the instruments identical across the tests.

State-dependent quantile-forecasts under different specification models
We generate optimal state-dependent forecasts for the data generating process (9) under the specification models proposed in Section 3.1: a quantile forecast that depends linearly on the current time series value (linear: m(z_t) = Ψ(y_{t−1} − 1)), one that is subject to periodic deviations (periodic: m(z_t) = Ψ(1 + sin(πt/8))), and one that is exposed to a break (break: m(z_t) = Ψ(1 − 2 · 1(t ≤ T/2))). To each forecast we apply the overidentifying-restriction tests of forecast optimality from Section 3.3 at level 0.05 based on the three specification models (linear, periodic, break). In the periodic model we consider two parameters and fix the third (period) parameter, as we use only three instruments throughout: a constant, the forecast, and the lagged outcome. In Table 2, we see the results for forecasts and tests based on quantiles. Additional details and analogous results for tests based on expectiles can be found in Supplementary Section S4. Even for small sample sizes (T = 100) the tests are reasonably calibrated and quite powerful, with rejection rates between 60% and 90% (with the exception of the periodic model, where the break forecast is rejected in only 30% of the cases). For larger sample sizes (T ≥ 1000) the optimality tests are almost perfectly calibrated and have full power against the other state-dependent optimal forecasts.

Discussion
For point forecasts with unknown directive, we posit that it is preferable to estimate and test the functional quoted by the forecaster, rather than the loss function, for reasons of identifiability, interpretability, and ease and efficiency of implementation.
We have introduced state-dependent quantiles and expectiles and have shown that, under optimal forecasts, the functional can be consistently estimated. The asymptotic distributions of the GMM estimator and the overidentifying test statistic can be used to construct flexible tests of forecast optimality and of specific model properties. It is particularly noteworthy that state-dependent functionals allow for the treatment of supposedly misspecified forecasts in a principled manner.
Since Mincer and Zarnowitz (1969), the standard test of optimality has been based on the regression y_t = β0 + β1 x_t + u_t and the joint test that the coefficients in this model are equal to zero and one. Interestingly, this regression model is equivalent to assuming the optimal forecast satisfies E(Y | F) = β0 + β1 X, which yields the identification function V(x, y) = β0 + β1 x − y, and applying the GMM estimator with the instruments w = (1, x)'.
In a simulation study, we have illustrated that an existing spline-based test is oversized and unlikely to detect information rigidities, while the new estimators yield well calibrated and powerful tests. We further have shown that the GDP forecasts of the Federal Reserve are optimal when viewed as state-dependent quantiles that change with the predicted growth rate.
An important potential application of the new approach is the comparison of a point forecast with unknown directive to other point or probabilistic forecasts. In this situation, the functional represented by the point forecast could be extracted from the probabilistic forecast, and the resulting sets of point forecasts can be compared via any consistent loss function (see Giacomini and White, 2006;Gneiting, 2011).

Acknowledgments
This manuscript has developed from a Diploma thesis written by Patrick Schmidt at Heidelberg University, under the supervision of Matthias Katzfuss and Tilmann Gneiting. We are grateful to Fabian Krüger, Barbara Rossi, the editor, associate editor, and two referees for constructive comments that have greatly improved the manuscript. The work of Patrick Schmidt and Tilmann Gneiting has been partially funded by the Klaus Tschira Foundation and by the European Union Seventh Framework Programme under grant agreement no. 290976. Matthias Katzfuss has been partially supported by US National Science Foundation (NSF) Grant DMS-1521676 and NSF CAREER Grant DMS-1654083.

A. Identification of functionals
Consider the probability space (Ω, A, P), where the elements of the sample space Ω are tuples that comprise the realization Y, the point forecast X, and the instruments W. We assume that the information set F is a sub-σ-algebra of A. If no measure is mentioned explicitly, statements like "almost surely" refer to P. For random variables R_1 and R_2, we simply write R_1 = R_2 instead of R_1 = R_2 almost surely. In particular, statements like X = α(Y|F) denote P-almost sure properties.
The terminology on functionals follows Steinwart et al. (2014). Specifically, topological statements on the space of probability distributions P are with respect to the metric induced by the L_1-norm.

Regularity Conditions A.
There exists a convex set P of probability measures with bounded Lebesgue densities such that (i) L(Y|F) ∈ P almost surely, and (ii) the functional α : P → R is continuous, locally non-constant, and has convex level sets.
The conditions in A(ii) are met by any functional defined via a continuous, non-trivial loss function: continuity follows from the Maximum Theorem (Ok, 2007, p. 229), and functionals defined by loss functions have convex level sets (Osband, 1985; Gneiting, 2011).
Lemma 1 (Identification). Let X be a forecast for Y based on some information σ-algebra F, and let α : P → R be a functional. Under Regularity Conditions A, there exists a function V identifying the optimal α-forecast, i.e.,

X = α(Y|F) ⟺ E[V(X, Y)|F] = 0.

Proof. By Theorem 8 in Steinwart et al. (2014), there exists an identification function V such that V(t, y) exists for λ × λ-almost all (t, y), where λ is the Lebesgue measure, and for all P ∈ P it holds that

t = α(P) ⟺ E_{Y∼P}[V(t, Y)] = 0,

in particular for P = L(Y|F), as L(Y|F) ∈ P almost surely. As L(Y|F) is absolutely continuous and X is constant given F, the identification function V(X, Y) exists almost surely.
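For intuition, the lemma can be checked numerically for the τ-quantile, whose identification function is V(t, y) = 1{y ≤ t} − τ (a standard example, not specific to the paper): the empirical moment vanishes at the true quantile and changes sign around it.

```python
import numpy as np

rng = np.random.default_rng(2)
tau = 0.8

# Sample from an absolutely continuous distribution, here Exp(1).
y = rng.exponential(1.0, 200_000)

def v_bar(t):
    # Empirical counterpart of E[V(t, Y)] for V(t, y) = 1{y <= t} - tau.
    return np.mean((y <= t).astype(float) - tau)

t_star = -np.log(1.0 - tau)  # true 0.8-quantile of Exp(1)
print(v_bar(t_star))                             # close to 0
print(v_bar(t_star - 0.5), v_bar(t_star + 0.5))  # negative, positive
```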
Any F-measurable variable W remains constant under conditional integration. Unconditional integration reveals the moment conditions

E[V(X, Y) W] = 0 for every F-measurable W with E[|V(X, Y) W|] < ∞.

We show by contradiction that this is also a sufficient condition for E[V(X, Y)|F] = 0. Assume without loss of generality that E[V(X, Y)|F] > 0 with positive probability. Then the F-measurable instrument W = 1{E[V(X, Y)|F] > 0} yields E[V(X, Y) W] > 0, contradicting the moment conditions.

Regularity Conditions B.
(i) L(Y|F) ∈ P almost surely, where P is the class of absolutely continuous distributions with strictly positive densities and finite first moments.
(ii) The state variable Z is F-measurable.
(iii) The parameter space Θ ⊆ R^p is compact.
(iv) The specification model m(z, θ) is continuous in θ ∈ Θ for all z.
(v) The model is identified: for any θ ∈ Θ with θ ≠ θ_0, it holds that m(Z, θ) ≠ m(Z, θ_0) with positive probability.
(vi) The stochastic process {U_t | t ∈ N} is ergodic (in means) and strictly stationary.†
(vii) The absolute forecast error has a finite first moment.
(viii) Ŝ_T → S in probability, where S is positive definite.
Condition B(i) ensures that, for fixed z, the state-dependent quantiles and expectiles fulfill the Regularity Conditions A of Lemma 1. Expectiles do not require strictly positive densities, and quantiles do not require finite first moments.
Theorem 1 (Consistency). Let X_t be an optimal state-dependent quantile forecast, i.e., X_t = q_{m(Z_t, θ_0)}(Y_t|F_t) for t = 1, 2, . . . . Under Regularity Conditions B, there exists an F_t-measurable instrumental variable W_t such that the GMM estimator defined in (6) is consistent for the true parameter θ_0 ∈ Θ: θ̂_T → θ_0, as T → ∞. Analogously, the expectile-based estimator is consistent if the point forecast is a state-dependent expectile, X_t = e_{m(Z_t, θ_0)}(Y_t|F_t).

† Strict stationarity means that the distribution of (U_t, U_{t+1}, . . . , U_{t+s}) does not depend on t for any s, and (mean) ergodicity implies that (1/T) Σ_{t=1}^T a(U_t) converges in probability to E[a(U_t)] for measurable functions a(·) with E[|a(U_t)|] < ∞ (Newey and McFadden, 1986).
Proof. We first consider quantiles. Under B(i) the state-dependent quantile is well defined and fulfills Regularity Conditions A for fixed z. By Lemma 1, it follows that E[V(U, θ_0)|F] = 0. For any θ ∈ Θ with θ ≠ θ_0, condition B(v) yields m(Z, θ) ≠ m(Z, θ_0) with positive probability; as L(Y|F) has a strictly positive density by B(i), the corresponding quantiles differ, so X ≠ q_{m(Z,θ)}(Y|F) and the quantile moment condition fails for θ.

We now consider expectiles. Under B(ii) and writing P = L(Y|F), we separate the integral in the expectile identification function at the point Y = X, to obtain

m(Z, θ) ∫_{y > X} (y − X) dP(y) = (1 − m(Z, θ)) ∫_{y ≤ X} (X − y) dP(y). (11)

The m(z, θ)-expectile of the distribution P is the unique solution of (11) (Newey and Powell, 1987). If θ ≠ θ_0, it follows by B(v) that m(Z, θ) ≠ m(Z, θ_0), which implies X ≠ e_{m(Z,θ)}(Y|F). Consequently, moment condition (11) does not hold.
As the parameter space Θ is compact by B(iii) and the applied functional lies in our parameterized class, it follows that θ_0 ∈ Θ and that condition (ii) of Theorem 2.6 in Newey and McFadden (1986) is satisfied.
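The consistency result can be illustrated in a minimal simulation (our own hypothetical setup, with a binary state and the linear specification model m(z, θ) = θ_1 + θ_2 z): an optimal state-dependent quantile forecast is generated, and the exactly identified GMM estimator with instruments W_t = (1, Z_t) recovers θ_0.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)
T = 50_000

# Hypothetical true directive: quantile level m(z, theta0) = 0.5 + 0.2 * z.
theta0 = np.array([0.5, 0.2])
z_int = rng.integers(0, 2, T)          # F_t-measurable binary state variable
z = z_int.astype(float)
mu = rng.normal(0.0, 1.0, T)           # conditional mean of Y_t given F_t

# Optimal state-dependent quantile forecast for Y_t | F_t ~ N(mu_t, 1).
levels = theta0[0] + theta0[1] * np.array([0.0, 1.0])
q_levels = np.array([NormalDist().inv_cdf(l) for l in levels])
x = mu + q_levels[z_int]
y = mu + rng.normal(0.0, 1.0, T)

# Exactly identified GMM with instruments W_t = (1, Z_t): the sample moments
# (1/T) sum_t W_t (1{y_t <= x_t} - theta1 - theta2 * z_t) = 0 are linear in
# theta, so they can be solved directly.
h = (y <= x).astype(float)
W = np.column_stack([np.ones(T), z])
theta_hat = np.linalg.solve(W.T @ W, W.T @ h)
print(theta_hat)  # close to theta0 = [0.5, 0.2]
```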