For the current "now" $T$, we assume in this section that the delay distribution is time-homogeneous within the window $t \in \{T-m, \ldots, T\}$; that is, we only take into account cases that occurred within this window. In other words, we use the observations $n_{T-m,0}$ and $n_{T-m,D}$ as the lower nodes of the right-angled trapezoid in Figure 2. For the further derivation of the delay estimation, we follow the notation in Lawless (1994) by introducing

$$N(t', d') = \sum_{t=T-m}^{t'} \sum_{d=0}^{d'} n_{t,d} \quad \text{and} \quad n(t', d') = \sum_{t=T-m}^{t'} n_{t,d'},$$

that is, $N(t', d')$ denotes the rectangle spanned by the observations $n_{T-m,0}$, $n_{T-m,d'}$, $n_{t',0}$, and $n_{t',d'}$. It thus contains all regarded cases with delay less than or equal to $d'$. Furthermore, $n(t', d')$ denotes the right edge of this rectangle, which corresponds to all regarded cases with delay equal to $d'$. Section B.1 in the Web Appendix contains an example calculation of these quantities for a specific reporting triangle.
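As a minimal sketch of these two sums, assume the reporting triangle is stored as a nested list `n[t][d]` (cases occurring on day `t` and reported with delay `d`). The function name `rectangle_counts` and the counts below are hypothetical and purely illustrative; they are not the example of Web Appendix B.1.

```python
def rectangle_counts(n, T, m, t_prime, d_prime):
    """Return (N, n_edge) for the window starting at day T - m.

    N      : all cases with T - m <= t <= t_prime and delay <= d_prime
    n_edge : all cases with T - m <= t <= t_prime and delay == d_prime
    """
    rows = range(T - m, t_prime + 1)
    N = sum(n[t][d] for t in rows for d in range(d_prime + 1))
    n_edge = sum(n[t][d_prime] for t in rows)
    return N, n_edge


# Toy reporting triangle (made-up counts): days 0..4, delays 0..2.
# Cells with t + d > T are not yet observed and are stored as 0.
T, m, D = 4, 4, 2
n = [
    [3, 2, 1],
    [4, 1, 2],
    [2, 3, 1],
    [5, 2, 0],  # delay 2 not yet observed at T = 4
    [1, 0, 0],  # only delay 0 observed at T = 4
]

# Rectangle and right edge for delay d' = 2, which is observable up to t' = T - 2.
N_val, n_val = rectangle_counts(n, T, m, t_prime=T - 2, d_prime=2)
print(N_val, n_val)  # 19 4
```

Only cells with $t' + d' \leq T$ are observed at time $T$, which is why the rectangle for delay $d'$ is evaluated at $t' = T - d'$ in this sketch.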

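Since at time $T$ we observe a right-truncated version of the delay distribution, we define the reverse-time discrete hazard function (Lagakos et al., 1988; Kalbfleisch and Lawless, 1991; Lawless, 1994),

$$g(d) = P(\text{delay} = d \mid \text{delay} \leq d) = \frac{F(d) - F(d-1)}{F(d)}, \quad d = 0, \ldots, D,$$

where $F$ denotes the cumulative distribution function (CDF) of the delay distribution, with $F(-1) = 0$ and $F(D) = 1$. We note that $g(0) = 1$. Because $F(d-1) = F(d)\{1 - g(d)\}$, the CDF and the probability mass function (PMF) $f$ of the delay distribution are now given as

$$F(d) = \prod_{k=d+1}^{D} \{1 - g(k)\} \quad \text{and} \quad f(d) = g(d) \prod_{k=d+1}^{D} \{1 - g(k)\}, \quad d = 0, \ldots, D, \tag{2}$$

where an empty product equals one.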
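The mapping in (2) from reverse-time hazards to the delay distribution can be sketched as a backward recursion starting at $F(D) = 1$. The function name `delay_cdf_pmf` and the hazard values in the example are hypothetical.

```python
def delay_cdf_pmf(g):
    """Convert reverse-time hazards g[0..D] (with g[0] = 1) into (F, f).

    F[d] = prod_{k=d+1}^{D} (1 - g[k])   and   f[d] = g[d] * F[d].
    """
    D = len(g) - 1
    F = [1.0] * (D + 1)                 # F[D] = 1 (empty product)
    for d in range(D - 1, -1, -1):      # backward recursion F(d) = F(d+1) * (1 - g(d+1))
        F[d] = F[d + 1] * (1.0 - g[d + 1])
    f = [g[d] * F[d] for d in range(D + 1)]
    return F, f


# Illustrative hazards for D = 2; g[0] = 1 by definition.
F, f = delay_cdf_pmf([1.0, 0.5, 0.2])
print(F)  # approximately [0.4, 0.8, 1.0]
print(f)  # approximately [0.4, 0.4, 0.2]
```

The PMF sums to one by construction, which gives a quick sanity check for any hazard vector with $g(0) = 1$.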

#### 3.1 Frequentist Nowcasting

For the reader's convenience, in this section we recapitulate and discuss existing results on nowcasting for right-truncated delays. Lawless (1994) showed that the maximum likelihood estimate (MLE) of $g(d)$, $d = 1, \ldots, D$, is $\hat{g}(d) = n(T-d, d) / N(T-d, d)$ when $N(T-d, d) > 0$. The MLE of the delay distribution is then obtained by plugging these estimates into (2). Note that if the window width $m$ is very close to the maximum delay $D$, the MLE for the long delays may be based on few observations. Based on the MLE, Lawless (1994) presented the following nowcast procedure,

$$\hat{N}(t, \infty) = \frac{N(t, T)}{\hat{F}(T - t)},$$

where $N(t, T)$ is the number of cases that occurred at $t$ and were reported by time $T$, and $N(t, \infty)$ is the unknown total number of cases that occurred at $t$. This estimator can be linked to inverse probability weighting. This link also provides solutions for the case when […] is zero (Elliot and Little, 1999). If $N(t, T) = 0$, that is, no events are reported to have occurred at $t$ by time $T$, then $\hat{N}(t, \infty)$ equals zero. Otherwise, the predictive distribution can be computed based on an asymptotic normal approximation for the prediction error. This is done by computing the variance of $\hat{N}(t, \infty) - N(t, \infty)$. From this, a $(1-\alpha) \cdot 100\%$ prediction interval for $N(t, \infty)$ is $\hat{N}(t, \infty) \pm z_{1-\alpha/2} \sqrt{\widehat{\operatorname{Var}}\{\hat{N}(t, \infty) - N(t, \infty)\}}$, where $z_{1-\alpha/2}$ is the $(1-\alpha/2)$ quantile of the standard normal distribution. We obtain a discrete predictive distribution for $N(t, \infty)$ by discretizing this Gaussian predictive distribution: we take differences of the CDF evaluated at the integers and attribute all mass below $N(t, T)$ to $N(t, T)$.
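The point nowcast and the discretization step can be sketched as follows, under several labeled assumptions. The reporting triangle is a nested list `n[t][d]` with made-up counts, and all function names are hypothetical. The variance of the prediction error is not computed here; `sd` is passed in as a placeholder value. The discretization assigns $\Phi(k) - \Phi(k-1)$ to integer $k$, which is one possible reading of "differences of the CDF evaluated at the integers", and `upper` is an assumed truncation point chosen large enough that the remaining tail mass is negligible.

```python
from statistics import NormalDist


def hazard_mle(n, T, m, D):
    """g_hat(d) = n(T-d, d) / N(T-d, d); assumes every denominator is positive."""
    g = [1.0]
    for d in range(1, D + 1):
        rows = range(T - m, T - d + 1)
        num = sum(n[t][d] for t in rows)
        den = sum(n[t][k] for t in rows for k in range(d + 1))
        g.append(num / den)
    return g


def delay_cdf(g):
    """F(d) = prod_{k=d+1}^{D} (1 - g(k)), with F(D) = 1."""
    D = len(g) - 1
    F = [1.0] * (D + 1)
    for d in range(D - 1, -1, -1):
        F[d] = F[d + 1] * (1.0 - g[d + 1])
    return F


def point_nowcast(n, T, m, D):
    """N_hat(t, inf) = N(t, T) / F_hat(T - t); zero if nothing is reported yet."""
    F_hat = delay_cdf(hazard_mle(n, T, m, D))
    out = {}
    for t in range(max(T - D, T - m), T + 1):
        N_tT = sum(n[t][d] for d in range(min(T - t, D) + 1))
        out[t] = 0.0 if N_tT == 0 else N_tT / F_hat[T - t]
    return out


def discretize_gaussian(mean, sd, lower, upper):
    """PMF on lower..upper; all Gaussian mass below `lower` is assigned to `lower`."""
    cdf = NormalDist(mean, sd).cdf
    pmf = {lower: cdf(lower)}
    for k in range(lower + 1, upper + 1):
        pmf[k] = cdf(k) - cdf(k - 1)
    return pmf


# Toy reporting triangle (made-up counts); unobserved cells are stored as 0.
T, m, D = 4, 4, 2
n = [[3, 2, 1], [4, 1, 2], [2, 3, 1], [5, 2, 0], [1, 0, 0]]

nowcast = point_nowcast(n, T, m, D)
# Placeholder standard deviation; the variance formula is not part of this sketch.
pmf = discretize_gaussian(mean=nowcast[3], sd=1.5, lower=7, upper=40)
```

For this toy triangle the hazard estimates are $\hat{g}(1) = 8/22$ and $\hat{g}(2) = 4/19$, so the nowcast for $t = T - 1$ is $7 / (15/19) \approx 8.87$.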

#### 3.2 Bayesian Nowcasting

As a competitor to the above method, we propose a novel Bayesian hierarchical model to directly address the combination of delay estimation and count data nature of the forecast. In particular, we will show that the generalized Dirichlet (GD) distribution is a conjugate prior-posterior distribution for the reversed delay under right-truncated multinomial sampling.

With the above GD posterior, the marginal posterior expectation of the reverse-time hazard $g(d)$ under the improper GD prior with all parameters equal to zero can be shown to equal the ML estimate defined in the previous section (see Web Appendix).

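Assume now that, by conjugate prior-posterior updating, we have obtained an estimate of the delay distribution, that is, a GD posterior for the reverse-time hazards $g(1), \ldots, g(D)$. In order to predict the unknown $N(t, \infty)$ from the incomplete $N(t, T)$, we assume the following model hierarchy for each time point $t$:

$$\lambda_t \sim \mathrm{Ga}(\alpha_\lambda, \beta_\lambda), \qquad N(t, \infty) \mid \lambda_t \sim \mathrm{Po}(\lambda_t), \qquad N(t, T) \mid N(t, \infty), p_{T-t} \sim \mathrm{Bin}\big(N(t, \infty), p_{T-t}\big),$$

where $p_{T-t} = F(T-t)$ is the proportion reported within a delay of $T-t$ days and $\alpha_\lambda, \beta_\lambda$ are known constants, namely the shape and rate of the gamma distribution. In this hierarchy, the marginal (prior) distribution of $N(t, \infty)$ is negative binomial with mean $\alpha_\lambda / \beta_\lambda$ and variance $\alpha_\lambda (\beta_\lambda + 1) / \beta_\lambda^2$. Furthermore, given $p_{T-t}$, the distribution of $N(t, T)$ is compound binomial-negative binomial. Writing $\boldsymbol{p}$ for the vector of reporting proportions and "data" for the observed reporting triangle within the window, the marginal posterior for $N(t, \infty)$ is

$$P\big(N(t, \infty) = n \mid \text{data}\big) = \int P\big(N(t, \infty) = n \mid N(t, T), \boldsymbol{p}\big) \, f(\boldsymbol{p} \mid \text{data}) \, d\boldsymbol{p}, \tag{3}$$

where, by application of Bayes' theorem and using the binomial and negative binomial distributions of the hierarchy,

$$P\big(N(t, \infty) = n \mid N(t, T), \boldsymbol{p}\big) \propto \mathrm{Bin}\big(N(t, T);\, n, p_{T-t}\big) \cdot \mathrm{NB}\big(n;\, \alpha_\lambda, \beta_\lambda\big) \tag{4}$$

for $n \geq N(t, T)$ and zero otherwise. Here $\mathrm{Bin}(\cdot\,; n, p)$ and $\mathrm{NB}(\cdot\,; \alpha_\lambda, \beta_\lambda)$ denote the PMFs of the binomial distribution and of the marginal negative binomial prior, respectively. Note that $f(\boldsymbol{p} \mid \text{data})$ in (3) is not available in analytic form, but we know from the previous developments that the reverse-time hazards follow a GD posterior. We hence solve the integration in (3) by Monte Carlo sampling jointly for all $t$ using the following algorithm.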

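- For $k = 1, \ldots, K$:
  - Draw the reverse-time hazards $g^{(k)}(1), \ldots, g^{(k)}(D)$ from the GD posterior by the algorithm of Wong (1998) and calculate $p^{(k)}_{T-t} = F^{(k)}(T-t)$ via (2) for each $t$.
  - Calculate the nonnormalized density $\tilde{\pi}^{(k)}_t(n)$, given by the right-hand side of (4) with $p_{T-t} = p^{(k)}_{T-t}$, for each $t$ and $n = N(t, T), \ldots, N_{\max}$, where $N_{\max}$ is sufficiently large.
  - Set $\pi^{(k)}_t(n) = \tilde{\pi}^{(k)}_t(n) / C^{(k)}_t$, where $C^{(k)}_t = \sum_{n=N(t,T)}^{N_{\max}} \tilde{\pi}^{(k)}_t(n)$ is the normalization constant.
- Approximate $P\big(N(t, \infty) = n \mid \text{data}\big) \approx K^{-1} \sum_{k=1}^{K} \pi^{(k)}_t(n)$ for each $t$ and $n = N(t, T), \ldots, N_{\max}$.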

Since the Monte Carlo approximation averages entire probability vectors rather than single sampled values, only a small number of samples $K$, say 100 or 1000, is needed to obtain accurate results for the multivariate integration. Altogether, our Monte Carlo based procedure is fast, accurate, and easy to use in practice.
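The Monte Carlo procedure above can be sketched in a few lines, under stated assumptions. We assume a Poisson-gamma hierarchy with gamma shape `alpha` and rate `beta`, and we assume the GD posterior can be sampled through independent Beta draws for each $g(d)$ (the stick-breaking construction of the generalized Dirichlet), which stands in for the algorithm of Wong (1998). The Beta parameters, hyperparameters, counts, and all function names below are hypothetical and illustrative only.

```python
import math
import random


def delay_cdf(g):
    """F(d) = prod_{k=d+1}^{D} (1 - g(k)), with g[0] = 1 and F(D) = 1."""
    D = len(g) - 1
    F = [1.0] * (D + 1)
    for d in range(D - 1, -1, -1):
        F[d] = F[d + 1] * (1.0 - g[d + 1])
    return F


def _xlogy(x, y):
    """x * log(y), with 0 * log(0) = 0 and x * log(0) = -inf for x > 0."""
    if x == 0:
        return 0.0
    return x * math.log(y) if y > 0 else float("-inf")


def conditional_pmf(N_tT, p, alpha, beta, n_max):
    """Normalized binomial-times-negative-binomial PMF on n = N_tT..n_max, 0 < p <= 1."""
    logw = []
    for n in range(N_tT, n_max + 1):
        log_bin = (math.lgamma(n + 1) - math.lgamma(N_tT + 1)
                   - math.lgamma(n - N_tT + 1)
                   + _xlogy(N_tT, p) + _xlogy(n - N_tT, 1.0 - p))
        # Negative binomial prior from Po(lambda) with lambda ~ Ga(alpha, rate beta).
        log_nb = (math.lgamma(n + alpha) - math.lgamma(alpha) - math.lgamma(n + 1)
                  + alpha * math.log(beta / (1.0 + beta)) - n * math.log1p(beta))
        logw.append(log_bin + log_nb)
    top = max(logw)                       # subtract the maximum for numerical stability
    w = [math.exp(v - top) for v in logw]
    total = sum(w)                        # normalization constant
    return [v / total for v in w]


def mc_nowcast(counts, beta_params, alpha, beta, n_max, K=200, seed=1):
    """Average the conditional PMFs over K hazard draws, jointly for all delays.

    counts      : dict mapping the delay T - t to the observed N(t, T)
    beta_params : list of (a, b) Beta parameters for g(1), ..., g(D) (assumed)
    """
    rng = random.Random(seed)
    avg = {delay: [0.0] * (n_max - N_tT + 1) for delay, N_tT in counts.items()}
    for _ in range(K):
        g = [1.0] + [rng.betavariate(a, b) for a, b in beta_params]
        F = delay_cdf(g)                  # the same draw is used for every t
        for delay, N_tT in counts.items():
            pmf = conditional_pmf(N_tT, F[delay], alpha, beta, n_max)
            for i, v in enumerate(pmf):
                avg[delay][i] += v / K
    return avg


# Illustrative inputs: D = 2, observed counts for delays T - t = 0, 1, 2.
counts = {0: 1, 1: 7, 2: 6}
result = mc_nowcast(counts, beta_params=[(8, 14), (4, 15)],
                    alpha=2.0, beta=0.1, n_max=150)
```

Each entry `result[delay][i]` approximates the posterior probability that $N(t, \infty) = N(t, T) + i$. For the delay equal to $D$ the reporting proportion is exactly one, so the approximation collapses to a point mass at the observed count.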