Understanding infection progression under strong control measures through universal COVID-19 growth signatures

Widespread growth signatures in COVID-19 confirmed case counts are reported, with sharp transitions between three distinct dynamical regimes (exponential, superlinear and sublinear). Through analytical and numerical analysis, a novel framework is developed that exploits information in these signatures. An approach well known to physics is applied, where one looks for common dynamical features, independently from differences in other factors. These features and associated scaling laws are used as a powerful tool to pinpoint regions where analytical derivations are effective, get an insight into qualitative changes of the disease progression, and infer the key infection parameters. The developed framework for joint analytical and numerical analysis of empirically observed COVID-19 growth patterns can lead to a fundamental understanding of infection progression under strong control measures, applicable to outbursts of both COVID-19 and other infectious diseases.

While interventions such as quarantine or vaccination have been extensively studied in quantitative epidemiology, effects of social distancing are not well understood [2]- [4] , and when addressed, they have been studied only numerically. Unique opportunity to understand these effects has been provided by COVID-19 tracing through confirmed case counts, active cases and fatalities, in a variety of countries with different demographic and environmental conditions [5,6] . We here show that focusing analytical and numerical derivations on distinct epidemics growth regimes, is a novel and effective approach in revealing infection progression mechanisms that may be a valuable alternative to detailed numerical simulations.
The outline of the manuscript is as follows: Our COVID-19 dynamics model is introduced in the Model subsection. In Numerical Framework/Results we will extract COVID-19 count data and select those countries that systematically trace not only confirmed cases and fatalities, but also active cases, which allows tight constraint of numerical analysis. We will observe three characteristic growth regimes in confirmed case counts, show that our model is well constrained by these regimes for a wide range of countries, and provide an intuitive explanation behind the emergence of such regimes. In Analytical Framework/Results sections, analysis of the characteristic (inflection and maximum) points of the infective curve will allow to i) explain the nearly constant value of the scaling exponent in the superlinear regime of confirmed counts; ii) understand the relation between the duration of this regime and strength of social distancing; iii) pinpoint changes in the reproduction number from outburst to extinguishing the infection, and iv) constrain the main parameter quantifying the effect of social distancing by analysing scaling of the infection growth with time in the sublinear regime. The obtained constraints provide a basis for successful analysis of countries that did not continuously track the active cases. We will finally present the key infection parameters inferred through combined analytical and numerical analysis.

II. MODEL
We develop a mechanistic model (nonlinear and nonhomogeneous), which takes into account gradual introduction of social distancing (as relevant for most countries' response), in addition to other important infection progression mechanisms. We start from standard compartments for epidemiological models, i.e., susceptible (S), exposed (E), infective (I) and recovered (R) [2]- [4] . To account for social distancing and observable quantities, we introduce additional compartments: protected (P ) -where individuals effectively move from susceptible category due to social distancing; total number of diagnosed (confirmed and consequently quarantined) cases (D), active cases (A), and fatalities (F ). D, A and F correspond to directly observable (measured) quantities, but are indirect observables of I, as only part of infective gets diagnosed, due to a large number of mild/asymptomatic cases [7] . We implement the model deterministically, as COVID-19 count numbers are very high wherever reasonable testing capacities are employed. This makes model analytically tractable, and allows robust parameter inference through combination of analytically derived expressions and tightly constrained numerical analysis, as we show below. Our analysis is applied separately to each country, as the effect of social distancing, initial number of infected and exposed, diagnosis/detection efficiency and transmission rates may be different.
However, within a given country, we do not take into account different heterogeneities − demographic, spatial, population activity, or seasonality effects [2,8,9] . These can readily be included in our model, but would lead to model structure which is not analytically tractable, so these extensions are left for future work.
Given this, the model equations are: (1) where N is the total population number; β -the transmission rate; σ -inverse of the latency period; γ -inverse of the infectious period; δ -inverse of the detection/diagnosis period; -the fraction of the infected that get diagnosed; h -the recovery rate; m -the mortality rate. Social distancing is included through Eq. (1) (second equation), which represents the rate at which the population moves (on average) from susceptible to protected category. The term α 1+(t 0 /t) n corresponds to a sigmoidal dependence (similar to Fermi-Dirac function, in quantitative biology known as the Hill function [10] ). Time t 0 determines the half-saturation, so that well before t 0 the social distancing is negligible, while well after t 0 the rate of transition to the protected category approaches α. Parameter n (the Hill constant) determines how rapidly the social distancing is introduced, i.e., large n leads to rapid transition from OFF to ON state, and vice versa [10] . Eq. (3) considers that only a fraction of the infected is diagnosed, so that δI takes into account the diagnosis and the subsequent quarantine process.

III. ANALYTICAL FRAMEWORK
To make the problem analytically tractable, we approximate the 2 nd term in Eq. (1) by unit step function, so that after t 0 the term becomes −αS and dominates over the 1 st term, i.e., S(t) ∼ e −αt . We checked that this approximation agrees well with full-fledged numerical simulations ( Figure 1D and Supplement). In all comparisons with analytical results, numerical analysis is done with the full model, allowing an independent check of both analytical derivations and employed approximations. Under this assumption, Eqs.
(1-2) reduce to: We next introduce two time regions: I) t ≤ t 0 and II) t > t 0 and solve Eq. (4) separately within these regions, where corresponding solutions are denoted as I I (t) and I II (t). As in the above expressions γ + δ always appear together, we further denote γ + δ → γ.
For I I (t), we take I(t = 0) ≡ I 0 , and restrict to dominant (positive) Jacobian eigenvalue, leading to the exponential regime: By shifting t − t 0 → t, I II (t) is determined by Eq. (6) is highly nontrivial, due to variable coefficient (σβe −αt ). By substituting variable it can be shown that Eq. (6) reduces to transformed form of Bessel differential equation [11] : whose general solution for nonintiger ν is given by: where J ν, x represents Bessel function of the first kind. In our case α 1 = γ+σ α , γ 1 = β 1 = 1, while ν = γ−σ α is indeed nonintiger. If we return to t variable, taking into account the following relation between standard and modified (I ν, x ) Bessel functions of the first kind [12,13] : I ν, x = i −ν J ν, ix , the general solution of Eq. (6) reads: To determine C 1 , C 2 , we use the following boundary conditions: I II (0) = I I (t 0 ) and I II (0) = I I (t 0 ), where the first derivative in region II has the following expression: In obtaining the expression above, the following identities were frequently used [12,13] : After derivations, where the following relation [13] I ν + 1, together with sin((ν ± 1)π) = − sin(νπ) and the identity relating modified Bessel function of the first and second kind K ν, are used [12,13] , we finally obtain a surprisingly simple result: where K ν, x is the modified Bessel function of the 2 nd kind.
At maximum and inflection points, I II = 0 and I II = 0, respectively. After extensive simplification of the results, this leads to (y = R 0,free e −αt , where R 0,free = β/γ is the basic reproduction number in the absence of social distancing [6,14] ): Eqs. (14,15) have to be solved numerically, but, as γ and σ are constants, we interestingly obtain that solutions will depend only on α. Since, for the analysis of superlinear and  Figure 2C, respectively), so that the effective reproduction numbers at inflection and maximum points (R e,i and R e,m ) are: From this follows the length of superlinear regime (between inflection and maximum points): We further Taylor expand I II (t) around the inflection point: In the superlinear regime D(t) ∼ (t − t s ) υ , where υ is the scaling exponent and t s marks the beginning of this regime. By Taylor expanding D(t) around t i , using Eq. (18) and Eq. (3): which is always larger than 1, as expected for superlinear regime. As t i is localized towards the beginning of the regime, we estimate t i − t s ∼ ∆t k , where k ≈ 3, 4. Finally, to provide analytical constrain on α, we Taylor expand I II (t) around maximum: As f m (α) < 0, we see that the quadratic term in Eq. (20) which allows to directly constrain α.

IV. NUMERICAL FRAMEWORK
We first numerically analyze outburst dynamics in the countries that continuously updated [15] three observable categories (D, A and F ). For a large majority of countries active cases were either not tracked or continuously updated, so the analysis is done for 10 countries (Andorra, Austria, Czechia, Croatia, Cuba, Germany, Israel, New Zealand, Switzerland and Turkey).
In the exponential regime, the analytical closed-form solution is given by Eq. (5). From this, and the initial slope of ln (D) curve (once the number of counts are out of the stochastic regime), β can be directly determined, while the corresponding eigenvector sets the ratio of I 0 to E 0 . The intercept of the initial exponential growth of D at t = 0 sets the product of I 0 and δ. h and m can also be readily constrained, as from Eqs. (3), they depend only on integrals of the corresponding counts; here note that d(D − A − F )/dt = hD. Also [14], [16], [17] , σ = 1/3 day −1 and γ = 1/4 day −1 , characterize fundamental infectious process, which we assume not to change between different countries.
Only parameters related with the intervention measures (α, t 0 , n, δ) are left to be inferred numerically, leading to tightly constrained numerical results. For this, we individually performed joint fit to all three observable quantities (A, D, F ) for each country. The errors are estimated through Monte-Carlo [18,19] simulations, assuming that count numbers follow Poisson distribution.

V. RESULTS AND DISCUSSION
Representative numerical results are shown in Figure 1 for Germany, while other countries are shown in the Supplement. In Figure 1A-C (and Supplement) we see a good agreement of our numerical analysis with all three classes of the case counts. In Figure 1D, we see sharp transitions between three growth patterns indicated in the figure: i) exponential growth, observed as a straight line in log-linear plot in Figure 1E; ii) superlinear growth, a straight line in log-log plot in Figure 1F; iii) sublinear growth, a straight line in linear-log plot in Figure 1G.
Transition between the growth patterns can be qualitatively understood from Eq. (3), and I(t) curve in Figure 1D. The exponential growth has to break after the inflection point of I(t), i.e., once the maximum of its first derivative (I (t) in Figure 1D) is reached. In the superlinear regime, confirmed counts case (D(t)) curve is convex (D (t) > 0), so this regime breaks once I (t) (dashed blue curve) becomes negative. Equivalently, D(t) curve becomes concave (enters sublinear regime) once the maximum of the I(t) is reached. In addition to this numerical/intuitive understanding, we also showed that we can analytically reproduce the emergence of these growth regimes (Eqs. (6), (14), (15)). Can we also analytically reproduce the parameters that characterize these regimes?
The exponential regime is straightforward to explain, as described in the previous section. The superlinear regime is in between the left inflection point and the maximum of I(t), so that infective numbers grow, but with a decreasing rate. While the derivations are straightforward in the exponential regime, they are highly non-trivial in the subsequent subexponential (superlinear and sublinear) growth. As the superlinear regime spans the region between the left inflection point (t i , I (t i ) = 0) and the maximum (t m , I (t m ) = 0), its duration is ∆t = t m − t i given by Eq. (17), with ∼ 1/α dependence, so that weak measures lead to protracted superlinear growth (see Figure 2A). This tendency is also confirmed by independent numerical analysis in Figure 2A, where for each individual country we numerically infer α and extract the length of the superlinear regime. Therefore, the duration of the superlinear regime indicates the effectiveness of introduced social distancing.
The scaling exponent υ of the superlinear regime is given by Eq. (19), and shown in Figure 2B, where we predict that all countries are roughly in the same range of 1.2 < υ < 1.5 (surprisingly, weakly dependent on α), despite significant differences in the applied measures, demographic and environmental factors. This result is (independently from our model) confirmed from case count numbers (the slope in Figure 1F, and equivalently for other countries, see Figure 2B).
How the effective reproduction number R e changes during this regime, i.e., between the left inflection point and the maximum of I(t)? R e quantifies the average number of secondary cases per infectious case, so that R e > 1 signifies disease outburst, while for R e < 1 the disease starts to be eliminated from the population [14] . The Eq. (16) provides expressions for R e,i (at the inflection point) and R e,m (at the maximum). Interestingly, from Figure 2C, we observe that R e,i and R e,m do not depend on R 0,free and are, respectively, significantly larger and smaller than 1, which shows that transition from infection outburst to extinguishing happens during superlinear growth. Consequently, the steepness of R e change over superlinear regime significantly increases (larger change over smaller time interval, see Figure 2C) with the measure strength.
Finally, in the sublinear regime, in a wide vicinity of I(t) maximum (which marks the beginning of the sublinear growth) leading non-linear term of D(t) is cubic (∼ t 3 , with negative prefactor). This is consistent with the expansion of I(t) around t m , which has leading negative quadratic (t 2 ) dependence (see Eqs. (3) and (20)). The ratio between the prefactors in D(t) expansion is given by Eq. (21), from which we see that α can be directly constrained, as shown in Figure 2D. To further illustrate this, the synergy of analytical derivations and numerical analysis presented above enables us to, directly from the publicly available data, infer key infection parameters necessary to assess epidemics risks (provided in the Supplement (Table S1)). We estimate these parameters by the same model/analysis, for a number of diverse countries, allowing their direct comparison. In Figure 3, we show together Case Fatality Rate (CFR), Infected Fatality Rate (IFR) and infection Attack Rate (AR) [14,21] . CFR is the number of  [20] .
value may present an inherent COVID-19 property.

VI. CONCLUSION
We here developed a novel quantitative framework through which we showed that: i) The emergence of three distinct growth regimes in COVID-19 case counts can be reproduced both analytically and numerically. ii) Typically, a brief superlinear regime is characterized by a sharp transition from outburst to extinguishing the infection, where effective reproduction number changes from much larger to much smaller than one; more effective measures lead to shorter superlinear growth, and to a steeper change of the effective reproduction number.
iii) Scaling exponent of the superlinear regime is surprisingly uniform for countries with diverse environmental and demographic factors and epidemics containment policies; this highly non-trivial empirical result is well reproduced by our model. iv) Scaling prefactors in the sublinear regime contain crucial information for analytically constraining infection progression parameters, so that they can be straightforwardly extracted through numerical analysis. Interestingly, we found that the number of COVID-19 fatalities per total number of infected is highly uniform across diverse analyzed countries, in distinction to other (highly variable) infection parameters, and about twice higher than commonly quoted for influenza (3-4% compared to 1-2%), which may be valuable for direct assessment of the epidemics risks.
While state-of-the-art approach in epidemiological modeling uses computationally highly demanding numerical simulations, the results above demonstrate a shift of paradigm towards simpler, but analytically tractable models, that can both explain common dynamical features of the system and be used for straightforward and highly constrained parameter inference. This shift is based on a novel framework that relates universal growth patterns with characteristic points of the infective curve, followed by analytical derivations in the vicinity of these points, in an approach akin to those in a number of physics problems. As our approach does not depend on any COVID-19 specifics, the developed framework can also be readily applied to potential outbursts of future infections.