Are epidemic growth rates more informative than reproduction numbers?

Abstract Summary statistics, often derived from simplified models of epidemic spread, inform public health policy in real time. The instantaneous reproduction number, Rt, is predominant among these statistics, measuring the average ability of an infection to multiply. However, Rt encodes no temporal information and is sensitive to modelling assumptions. Consequently, some have proposed the epidemic growth rate, rt, that is, the rate of change of the log‐transformed case incidence, as a more temporally meaningful and model‐agnostic policy guide. We examine this assertion, identifying if and when estimates of rt are more informative than those of Rt. We assess their relative strengths both for learning about pathogen transmission mechanisms and for guiding public health interventions in real time.


INTRODUCTION
Inferring changes in pathogen transmissibility during epidemics is an important challenge. Increases in transmissibility may forewarn of elevating caseloads and hospitalisations, while decreasing rates of spread may evidence the effectiveness of earlier interventions or the influence of infection-acquired immunity (Anderson & May, 1991). Practical limitations on the scope and speed of outbreak surveillance mean that population-wide summary statistics, derived from simplified models of epidemics, often inform public health policy in real time. The instantaneous reproduction number at time t, denoted R t , is predominant among these statistics. It measures the average number of secondary infections generated per effective primary case at that time. An R t above (or below) 1 indicates a growing (or waning) epidemic and often forms part of the evidence base for policy decisions on the imposition or release of interventions (Anderson et al., 2020). However, R t encodes no temporal information. For example, R t = 2 indicates approximate epidemic doubling (per generation of infections) but not the speed of that doubling. Moreover, because inference of R t depends on the model used (and hence its assumptions), differing estimates may be obtained from the same data, complicating the interpretation of R t as a signal for epidemic response (Lloyd, 2009;. Consequently, the instantaneous epidemic growth rate, r t , defined as the rate of change of the log-transformed case incidence, has been proposed as a more informative and understandable measure of transmission dynamics (Pellis, Birrell, et al., 2021). Growth rates may be estimated directly from the gradient of the log-transformed observed incidence curve, have a natural temporal interpretation as the speed of case accumulation and still encode key dynamics, for example, the signs of r t and R t − 1 signify similar transmission trends. Estimates of r t can therefore, seemingly, be derived independently of an epidemic model. However, if a model is assumed, there is a one-to-one correspondence between r t and R t . Thus, R t may provide no more information about transmission patterns than that available already from r t (Wallinga & Lipsitch, 2007). While these observations may at first recommend r t as the more useful measure for policymaking, there are implicit complications.
First, when comparing transmission across different spatial scales, epidemic phases or even data types (e.g. hospitalisations or cases), a non-dimensional parameter may be more useful. A value of R t = 2 has the same interpretation of a primary case generating two secondary ones on average, regardless of the region studied or the phase of the epidemic considered, with important implications for interventions (e.g. if R t = 2, then more than half of transmissions must be prevented for the epidemic to start declining). Second, the process of estimating the logarithmic derivative of a noisy incidence curve is not trivial and noise-smoothing choices may actually be equivalent to modelling assumptions. Third, information encoded in R t may be more easily leveraged to develop other useful outbreak analytics, such as probabilities of epidemic elimination  or herd immunity thresholds (see Discussion and Hethcote (2000)). Last, biases and delays in reporting and surveillance may have differing impacts on estimates of both quantities, making it unclear which offers the higher fidelity view of transmission (Lloyd, 2009).
In this paper, we investigate and discuss the various complexities and subtleties mediating the practical informativeness of estimates of both R t and r t , which we denoteR t andr t respectively. We outline how these quantities can be computed from incidence curves using renewal models and smoothing filters. This leads us to our main result: that the smoothing assumptions inherent in obtainingr t from noisy incidence curves can be equivalent to the epidemiological ones necessary for obtainingR t . Consequently, we conclude that the question of whether R t or r t is more informative for real-time public health policymaking depends on the relative accuracy of the epidemiological assumptions and on how well the subtleties and uncertainties underlying each summary statistic are communicated. Estimates of R t and r t in combination, alongside contextual information about the ongoing epidemic, will provide the most complete picture of pathogen transmission and control.

Computing reproduction numbers and growth rates
Inferring the time-varying transmissibility of a pathogen from routinely available surveillance data is vital to assess ongoing and upcoming trends in epidemic dynamics. Among the most common data types is the incidence curve, which is the time series of new cases. We assume here that cases and infections are equivalent. We use I t to denote the incidence at time t so that over the period 1 ≤ t ≤ T the incidence curve is the time series {I 1 , I 2 , … , I T }. We let w j be the probability that a primary case takes j time units (usually in days) to generate a secondary case. The set of w j for all j constitutes the generation time distribution of the disease, where we make the common assumption that the generation time distribution is approximated by the serial interval distribution (Cori et al., 2013;Wallinga & Teunis, 2004). The serial interval distribution describes the times between symptom onsets for primary and secondary cases and is often computed from independent line-list data (Cori et al., 2013;Cowling et al., 2009). We assume that the set of w j has been well characterised for the infectious disease of interest. The renewal model (Fraser, 2007) relates the instantaneous reproduction number at time t, R t , to the incidence curve and generation time distribution as in Equation (1) with E[I t ] indicating the mean of I t . Typically, an assumed distribution (e.g. Poisson or negative binomial) is used to statistically relate this mean to I t , and estimates of R t (i.e.R t ) are obtained using various Bayesian or maximum likelihood computational approaches.
The total infectiousness, Λ t , summarises how past incidence propagates forwards in time by incorporating knowledge of the generation time distribution via a convolution. Many approaches exist for inferringR t from the incidence curve {I 1 , I 2 , … , I T } with T as the last observed time (see Anderson et al. (2020) for more details). These estimates,R t , are increasingly employed for tracking transmissibility during epidemics and guiding public health responses. The instantaneous growth rate, r t , has been used less frequently to assess transmissibility over time but has recently gained attention as an alternative to R t (Dushoff & Park, 2021; and is among the metrics that COVID-19 advisory bodies track. The quantityr t can be derived from {I 1 , I 2 , … , I T } without additional epidemiological knowledge or assumptions, for example, no estimated generation time distribution is required. Instead, the logarithmic derivative of some smoothed version of the incidence, S[I t ], is used, as in the left side of Equation (2).
Here S[I t ] refers to any smoothing, which at time t may depend on any subset of the incidence curve (and not just I t ). There are various ways of deriving S[I t ] curves (e.g. using splines or moving average filters) . We can unify many of these approaches to smoothing within the framework of Savitzky-Golay (SG) filters (Savitzky & Golay, 1964). SG filters, with dimension m and coefficients a j , perform local least-squares polynomial smoothing via the discrete convolution or kernel in the right side of Equation (2). Equation (2) assumes that m is odd but can be easily adjusted for even filter dimensions. We denote the resulting smoothed incidence as S[I t ] = S t and we may optimise the a j coefficients via least squares or select their values to confer some desired properties (e.g. to maintain certain waveform or frequency characteristics of the original data). We can realise a standard moving average filter within the SG framework by setting each a j = 1∕m, for example. The SG framework has broad applications and further information on its uses and properties can be found in Schafer (2011).
The reproduction numbers and growth rates we consider should not be confused with the basic reproduction number, R 0 , and the intrinsic growth rate, r, which can be estimated using numerous methods (e.g. via compartmental or Richards' growth models, Yan & Chowell, 2019) from various data sources (e.g. prevalence or cumulative case data). While these are related to our R t and r t during the earliest phases of an epidemic, R 0 and r cannot track time-varying changes in transmissibility. Instead they provide insight into initial epidemic growth upon invasion (Anderson et al., 2020). The methods above do not consider spatial, contact or other heterogeneities. However, including these may not always improve transmissibility estimates (Liu et al., 2018). Next, we clarify how R t and r t are related.

Connecting reproduction numbers and growth rates
Equations (1) and (2) describe simple and general approaches to estimating R t and r t from an incidence curve. While a known generation time distribution or serial interval is assumed in Equation (1) when determiningR t , Equation (2) neither makes mechanistic assumptions nor requires additional data for calculatingr t . However, if the assumptions in Equation (1) are made then it is possible to derive a model-dependentr t fromR t . The generalised method for connecting these two summary statistics is given in the left side of Equation (3) (Wallinga & Lipsitch, 2007), with M w denoting the moment generating function about the generation time distribution (defined by the set of w j from Equation 1).
The left side of Equation (3) states that the relationship betweenr t andR t depends strongly on the parametric form of the generation time (or in practice, the serial interval) distribution. The set of w j is most commonly parametrised from the gamma family of distributions with shape and rate parameters a and b. This leads to the analytic expression on the right side of Equation (3). Although the moment generating approach suggests a general way of connectingr t andR t , there is an implicit exponential growth or decay assumption within this formula (Wallinga & Lipsitch, 2007). While Equation (3) is developed for a general renewal model framework, we can also specialise this method to popular compartmental models. For example, under a linearised compartmental Susceptible-Infectious-Recovered (SIR) model, we obtainr t = (R t − 1)∕E[w], with E[w] as the mean generation time (Bettencourt & Ribeiro, 2008).

RESULTS
We examine how the model-basedr t relates toR t under a given generation time distribution (see Methods). The gamma and SIR simplifications of Equation (3) provide key insights into the relative informativeness of these statistics. First, we see that the sign ofR t − 1 andr t is equivalent, making either equally good for inferring the transitions between growing and declining epidemics. We illustrate this for a simulated epidemic in Figure 1, which has been constructed to model seasonal transmission dynamics. The example we provide is representative of the range of possible epidemic trajectories (and estimates) that would result from our chosen true sinusoidal R t profile.
F I G U R E 1 Instantaneous reproduction numbers and growth rates. We simulate a seasonally varying epidemic with incidence I t , according to the renewal model with true transmissibility R t and serial interval distribution estimated for Ebola virus from Van Kerkhove et al. (2015). In panels A and B, we estimate the instantaneous reproduction numberR t (with 95% credible intervals) using EpiFilter (see Parag, 2021) and provide one-step-ahead predictionsÎ t usingR t . In panels C and D we derive three growth rate estimates,r t using: R t (via the Wallinga and Lipsitch, 2007 method), a smoothed version of the incidence curve S t (via SG filters) and the total infectiousness of the epidemic Λ t by treating it as a type of SG filter. The latter two estimates have to be right and left shifted respectively by 1 2 due to the effects of filtering, with ≈ 8 days as the mean generation time or serial interval. We show that a left-shifted version of the total infectiousness Λ t+ effectively approximates a smoothed incidence curve.
We computeR t using the EpiFilter method  (red, A), which provides minimum mean squared error Bayesian estimates. We assume knowledge of the true generation time distribution and validate our estimates with one-step-ahead incidence predictions (red, B) as in . The intersections ofR t (red, A) with 1 and those of the model-basedr t (red, C) with 0 coincide, as expected from Equation (3). Both provide consistent assessments of time-varying transmission, correctly signalling rising and falling seasons.
We next compute the model-agnostic, log-derivative-basedr t using an SG filter as in Equation (2), which effectively fits local splines to the incidence curve. This estimate (grey, C) correlates well with our model-based one (red, C), with some overshoot in periods where incidence is small (and estimation is known to be more difficult ). The only assumptions made in obtaining this estimate relate to how we smooth the data to obtain stable log-derivatives (e.g. we have to make choices about the order of our splines or the dimension of our moving filters). Current approaches to deriving model-agnosticr t values must all ultimately make similar assumptions and choices . Having made these key observations, our main result emerges.
Comparing Equations (1) and (2), we see that the total infectiousness, Λ t , is an implicit SG filter, with the value of the filter dimension m determined by the support of the generation time distribution. Hence, we construct another growth rate estimate,r t ≈ d log Λ t dt as shown in Figure 1 (blue, C). This estimate matches the other two growth rate estimates well but with a decreased overshoot. SG filtering often induces lags and we find that r t estimates from S t and Λ t must be right and left shifted by some value of 1 2 respectively. We do not provide credible intervals for the two SG-basedr t here as we simply intend to demonstrate proof-of-concept. All Bayesian credible intervals that we do present are equal tailed (based on relevant quantiles) and we find that ≈ E[w], the mean generation time.
This correspondence among r t estimates is novel and, importantly, clarifies how time-varying model-agnostic growth rates and instantaneous reproduction numbers relate by exposing that the generation time distribution is effectively an epidemiologically informed smoothing filter. We confirm this by comparing Λ t+ (a left-shifted version of Λ t by ) and S t (grey and blue, D), which are effectively two possible realisations of S[I t ] from Equation (2). In general the value of (which essentially acts as a location parameter) may depend on the type of filter or moving average applied. However, across all the scenarios we investigated (these include diverse epidemics simulated with Ebola virus and SARS-CoV-2 dynamics) we found that setting ≈ E[w] worked well.
Last, in Figure 2 we examine robustness to misspecification of the generation time distribution. We use the same inference procedures as above but now the mean generation time assumed in estimation has a mean that is 33% smaller than that of the true Ebola virus distribution (under which the data are generated). Misspecification could occur due to interventions or other epidemiological changes (Ali et al., 2020). We find thatR t is sensitive to this change (compare the red and blue estimates in A). However, ther t computed from the misspecifiedR t is mostly stable, although there is increased uncertainty (compare the red and blue estimates in B). Correspondence between model based and model agnosticr t is also maintained but not shown. Code for reproducing and expanding on the simulations underlying both figures is publicly available at: https://github.com/kpzoo/growth-rates.

F I G U R E 2
Misspecified estimates of reproduction numbers and growth rates. We repeat the simulation of Figure 1 but our estimates now assume a misspecified Ebola virus generation time distribution. This distribution has a mean that is 33% smaller than the one used to generate the epidemic data (which is from Van Kerkhove et al. 2015). Panel A provides estimates of instantaneous reproduction numbers,R t , under the true and misspecified distributions (with 95% credible intervals) using EpiFilter (Parag, 2021). Panel B presents corresponding growth rate estimates (and 95% credible intervals),r t , which are derived from the variousR t in A (Wallinga & Lipsitch, 2007).

DISCUSSION
Evaluating time-varying changes in pathogen transmissibility is an important challenge, allowing the impact of public health interventions to be assessed and providing indicators that can inform policymaking during epidemics. We have focussed on two key metrics for tracking transmissibility: the instantaneous reproduction number R t (with estimateR t ) and the instantaneous growth rate r t (with estimater t ). Both metrics provide key insights into the dynamics of epidemics as demonstrated by their use during the COVID-19 pandemic (Abbott et al., 2020;Anderson et al., 2020). However, their relative merits and demerits have been increasingly debated (see Pellis, Birrell, et al. (2021) and Jewell and Lewnard (2021) of this issue for example). Recent work has suggested that the benefits of inferringr t might have been underappreciated, and that this quantity may be particularly useful because of its apparent independence from modelling assumptions and its explicit consideration of the epidemic speed (i.e. it more naturally includes temporal information) (Dushoff & Park, 2021;. Here, we have explored and exposed the relationship betweenR t andr t by dissecting the assumptions fundamentally underlying their construction. We found that the relative informativeness of these two estimated quantities during epidemics rests on the reliability of their smoothing and epidemiological assumptions. We demonstrated that bothR t andr t extract signals of changing pathogen transmission by effectively smoothing noise from the incidence curve. As we showed in Figure 1, their key difference lies in the kernel (i.e. the set of weights in the SG filter of Equation 2) used for this smoothing. Specifically, computingr t in a model-agnostic way corresponds to selecting an arbitrary kernel, whereas calculatingR t (and, correspondingly, model-basedr t values) involves implicitly treating the generation time distribution as an epidemiological kernel (see Results and the right sides of Equations 1 and 2).
As a result, if the generation time distribution is estimated accurately and underlying assumptions about pathogen transmission hold, then not only are both measures closely related, with the commonly cited R t = 1 threshold corresponding to an r t = 0 threshold, but R t is also theoretically more informative. This follows becauseR t can be used to derive correct model-based r t values, while also providing additional insights into the mechanism of transmission underlying the observed incidence (Yan & Chowell, 2019). In contrast, starting from the model-agnosticr t , it does not seem possible to deriveR t without epidemiological assumptions. Should the generation time distribution be misspecified as in Figure 2, thenR t could be biased, and the model-agnostiĉ r t would be more informative.
When constructingR t , the generation time distribution is often approximated by the serial interval distribution. Misspecification of the generation time as described above might arise due to the often limited number of observed serial intervals used to estimate the serial interval distribution. Observed serial intervals are commonly obtained from household or contact tracing studies, where it is possible to identify source-recipient transmission pairs (Cowling et al., 2009;Li et al., 2020). However, as case numbers increase, identifying known source-recipient pairs becomes more challenging since there is less certainty about the source of a given transmitted infection and as the risk of infection from an unknown source in the community cannot be ignored.
Moreover, even if sufficient source-recipient pairs are reliably known, the generation time may still be misspecified (Hart et al., 2022). Non-pharmaceutical interventions and public health measures, such as case isolation after symptom onset, may curtail observed serial intervals (Ali et al., 2020) or increase the proportion of cases caused by pre-symptomatic transmission (Sun et al., 2021). In both scenarios it becomes difficult to reliably approximate the generation time distribution with the serial interval distribution, which may also now be time-varying or even have negative values. While recent methods compensate for some of these issues (Ganyani et al., 2020) or incorporate up-to-date distributions (Thompson et al., 2019), accurately relatingr t toR t may not always be simple in practice.
Despite potential issues when obtainingR t , we have made clear that inferring the model-agnosticr t also requires assumptions related to smoothing of the incidence curve (or log incidence curve) and specification of the time interval over which to estimate a particularr t . Furthermore, when case numbers are increasing,r t does not give an indication of the proportion of current transmissions that must be blocked to prevent an epidemic from continuing to grow. This proportion relative to R 0 is known as the herd immunity threshold. This threshold is used to determine the vaccine coverage required in order to control transmission, accounting for vaccine effectiveness and any infection-acquired immunity (Hethcote, 2000;Thompson et al., 2020). On the other hand, it isr t that naturally gives estimated doubling times (or halving times), which may be important for intervention planning purposes.
There are also a number of factors that limit the informativeness of bothR t andr t . First, reporting errors and delays can lead to imprecise case counts, affecting summary statistics derived from incidence curves (Azmon et al., 2014). Second, both of the statistics discussed here relate to averages, but heterogeneous systems with superspreading individuals or events (Lloyd-Smith et al., 2005) require more than a measure of central tendency to be well understood. Inferring pathogen transmissibility and the potential impacts of interventions therefore often requires more complex modelling approaches. Last, it is not onlyr t that requires time windows to be chosen for estimation. Values ofR t are often calculated over shifting time windows. Short windows may lead to fluctuatingR t values that potentially reflect randomness in contacts between hosts rather than variations in transmissibility, while long windows may blur detection of key variations (Cori et al., 2013;. The above problems relate to fundamental bias-variance tradeoffs in the inference of r t and R t , and emphasise that neither measure should be used naively. As highlighted in Lloyd (2009) and illustrated in Figure 2, sensitivity analyses of the structure of the epidemiological model or statistical procedure used are crucial for drawing reliable inferences from noisy data. Moreover, even if these problems do not exist, other contextual information is still often required to obtain a full picture of an ongoing epidemic (Jewell & Lewnard, 2021). For example, while R t = 1 or, equivalently, r t = 0 may indicate a stable epidemic, the policy response may be very different depending on whether incidence is high or low. The first of these scenarios may not be acceptable to policymakers, as it involves large numbers of infections in the near future. Both R t and r t only provide information about the changes in state of an epidemic.
Nonetheless, despite some of the challenges in estimation and the need for contextual information, we contend that bothR t andr t are valuable. Estimates of R t (widely referred to as the 'R number') are particularly useful as an intuitive measure for public communication, allowing the effects of current interventions to be assessed and communicated straightforwardly. However, estimates of r t , expressed as doubling times, are great for expressing the speed at which cases are increasing. Given the risks of depending on either R t or r t that we have explored in this paper and the complementary roles they can play in raising public awareness, we support current efforts to generate estimates of both summary statistics. These quantities in combination and together with contextual measures such as current incidence or prevalence, allow epidemic dynamics to be understood more clearly and completely.