Severity of the COVID‐19 pandemic in India

Abstract The main objective of this study is to identify the socioeconomic, meteorological, and geographical factors associated with the severity of COVID‐19 pandemic in India. The severity is measured by the cumulative severity ratio (CSR)—the ratio of the cumulative COVID‐related deaths to the deaths in a pre‐pandemic year—its first difference and COVID infection cases. We have found significant interstate heterogeneity in the pandemic development and have contrasted the trends of the COVID‐19 severities between Maharashtra, which had the largest number of COVID deaths and cases, and the other states. Drawing upon random‐effects models and Tobit models for the weekly and monthly panel data sets of 32 states/union territories, we have found that the factors associated with the COVID severity include income, gender, multi‐morbidity, urbanization, lockdown and unlock phases, weather including temperature and rainfall, and the retail price of wheat. Brief observations from a policy perspective are made toward the end.


| 519
IMAI et Al.
total confirmed cases. The authors concluded that (1) males have higher overall burden, but females have a higher relative risk of COVID-19 mortality in India, and (2) elderly males and females both display high mortality risk and require special care when infected. As the period covered by this study ends on May 20, 2020, well before the huge surge in COVID-19 cases (a limit inevitably imposed by the data available at the time), there is a need to cover a more recent period.
As reviewed by Das et al. (2021), recent studies on the correlates of COVID-19 predominantly focus on the meteorological variables (e.g., Ma et al., 2020) and few studies focus on socioeconomic correlates. After controlling for temperature and moisture indices, Das et al. have found that the living environment deprivation (in terms of housing conditions, asset possession and water access/population and household density) was an important correlate of spatial clustering of COVID-19 hotspots in Kolkata megacity, the capital of West Bengal. While we cannot include such detailed data for our study at the national level, we control for weekly temperature and rainfall as well as the ratio of urbanization at state levels.
It is evident that socioeconomic factors influence the COVID-19 pandemic and infections, but virtually no studies have considered them in India, particularly at the national level. An important exception is Olsen et al. (2020), who have estimated a hierarchical multilevel model of the correlates of the risk of death because of COVID-19 in 11 states of India, considering the factors at both individual and district levels. The authors combined the National Family Health Survey for 2015/16, Census data for 2011, and estimates of COVID-19 deaths cumulatively up to June 2020 from How India Lives. Olsen et al. found that people who live in urban areas, belong to the scheduled caste, smoke, are male with more exposure to activities outside the home, or are above 65 years have a higher risk of COVID-19-related death. While our study cannot incorporate all the factors, it will cover a few important variables, such as urbanization, the morbidity above 60 years, and income per capita. Acharya and Porwal (2020) have constructed the aggregate vulnerability index at state and district levels based on National Family Health Survey Data in 2015/16 with a focus on five dimensions, namely socioeconomic condition, demographic composition, housing and hygiene condition, availability of health-care facilities, and COVID-19-related epidemiological factors. Among the 10 most vulnerable states, socioeconomic condition, housing and hygiene condition, and availability of health-care facilities contributed to the overall vulnerability index. The authors found that among the eight states that have contributed to over 80% of the confirmed COVID-19 cases in India, as of June 17, 2020, five states had a high vulnerability index value and the remaining three had medium vulnerability (e.g., Maharashtra with 33% of the total COVID cases and the vulnerability index 0.829, the seventh from the bottom).
Although Acharya and Porwal have not estimated the vulnerability index using the actual COVID-19 data, their analysis implied the importance of socioeconomic factors, which is consistent with Olsen et al. (2020).
Our study builds on the existing literature on the correlates of COVID-19 in India in some important ways. First, our study extends the analysis to October 31, 2020, and thus captures the surge in the COVID pandemic. Second, we use a measure of COVID-19 severity, namely the cumulative severity ratio (CSR). CSR takes COVID-related deaths over a period since the occurrence of the first death relative to deaths in a pre-pandemic year over the same duration. The first difference of CSR is taken to capture a flow measure of the pandemic based on the new COVID-related deaths in comparison with the deaths in a pre-pandemic year. It helps monitor the progression of the pandemic, that is, whether it is intensifying, weakening, or unchanged. Third, we use panel models that allow the use of time-invariant fixed effects.

| Definitions of severity ratios
A new indicator "relative severity" proposed by the World Bank illustrates the unequal distribution and progression of COVID-19 deaths across states (Schellenkens & Sourrowuille, 2020). The relative severity ratio is defined as the ratio of the total deaths attributable to COVID-19 over a given period to the expected total deaths from all causes under the counterfactual assumption that the pandemic had not taken place over a base period of the same length. A comparison with pre-pandemic mortality patterns provides a state-specific measure of the severity of the pandemic. In addition to this ratio (which will be denoted as CSR), Schellenkens and Sourrowuille have defined a daily severity ratio (DSR) that tracks the progression of the severity of the pandemic in each region. To calculate the DSR, the number of COVID-19 deaths on a particular day is divided by the expected daily deaths under the assumption of no pandemic, that is, annual deaths divided by 365 (in a pre-COVID year). We have modified CSR and DSR to capture excess mortalities. CSR has been redefined as the ratio of the sum of "accumulated COVID-19 oriented death numbers and the expected death numbers" to "the expected deaths from all causes" in a certain period. Likewise, DSR is modified as the ratio of "the sum of daily COVID-19 death numbers and the expected daily death numbers" to "the expected daily death numbers." Algebraically, in notation consistent with these definitions and with both ratios expressed in percent,

\[
\mathrm{CSR}_{it} = \frac{C_{it} + E_{it}}{E_{it}} \times 100, \qquad \mathrm{DSR}_{it} = \frac{c_{it} + e_{i}}{e_{i}} \times 100,
\]

where $C_{it}$ is the cumulative number of COVID-19 deaths in state $i$ from its first death up to date $t$, $c_{it}$ is the number of COVID-19 deaths on day $t$, $e_{i}$ is the expected number of daily deaths (registered deaths in 2017 divided by 365), and $E_{it} = n_{it} e_{i}$ is the expected number of deaths over the $n_{it}$ days since the first death.

The COVID-19 data are collated from the Ministry of Health and Family Welfare, India. The data on past mortality patterns are based on the state-wise number of registered deaths in 2017 from the Ministry of Health and Family Welfare, Government of India. For the purpose of deriving CSR, the number of reported deaths in 2017 is scaled down from annual estimates to the length of the pandemic in each state, calculated as the number of days since the first death in the state till the data point (t), with the cutoff date October 31, 2020.
For the DSR, the denominator used in the ratio is the total number of deaths in each region in 2017 divided by 365. As we discuss later, we will use as a dependent variable the first difference of CSR for the weekly panel, as its level is nonstationary, and both the level and the first difference of CSR for the monthly panel. Descriptive statistics of the variables are presented in Table A1.
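The modified ratios described above can be illustrated with a minimal Python sketch. It assumes a list of daily COVID-19 deaths for one state starting on the day of its first death, and the function and argument names are hypothetical. The first difference is taken here on the daily series purely for illustration, whereas the analysis itself differences weekly or monthly aggregates.

```python
def severity_ratios(daily_covid_deaths, annual_deaths_base_year):
    """Return (CSR, DSR, DCSR) in percent for one state.

    daily_covid_deaths: new COVID-19 deaths per day, starting on the day
        of the state's first death (illustrative input format).
    annual_deaths_base_year: registered deaths in the pre-pandemic year.
    """
    expected_daily = annual_deaths_base_year / 365.0
    csr, dsr = [], []
    cumulative = 0
    for day_index, new_deaths in enumerate(daily_covid_deaths, start=1):
        cumulative += new_deaths
        # Expected deaths scaled to the number of days since the first death.
        expected_so_far = expected_daily * day_index
        csr.append(100.0 * (cumulative + expected_so_far) / expected_so_far)
        dsr.append(100.0 * (new_deaths + expected_daily) / expected_daily)
    # First difference of CSR; undefined for the first observation.
    dcsr = [None] + [csr[t] - csr[t - 1] for t in range(1, len(csr))]
    return csr, dsr, dcsr


# Toy numbers only: 36,500 annual deaths imply 100 expected deaths per day.
csr, dsr, dcsr = severity_ratios([10, 20, 30], 36500)
```

A value of 100% means no COVID-19 deaths on top of the expected deaths, which is how the figures of 100% for some states later in the text should be read.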

| Trends of severity ratios
Figures 1 and 2 show the trends of CSR and DSR, respectively, for relatively large states only, to avoid cluttering the graphs. CSR and DSR are aggregated for each week, from Week 1 (starting on March 13, 2020) to Week 34 (on October 29, 2020). During this study period the Indian government sought to prevent the spread of COVID-19 with measures that ranged from a rigorous lockdown policy to relatively loose restrictions. The entire period is roughly divided into "Lockdown" phases from March to May and "Unlock" phases from June to October, as indicated by a dashed line in the figures. The former is divided into four phases, Phases 1-4, and the latter into five phases, Unlock 1.0 to Unlock 5.0, as shown by dotted lines in the figures. The first lockdown (Phase 1) spanned a period of 21 days from March 25 to April 14 in which nearly all factories and services were suspended, barring "essential services." The second lockdown (Phase 2) started on April 15 and continued until May 3, with conditional relaxations for regions where the COVID-19 spread had been contained. With additional relaxations, the third phase of the lockdown (Phase 3) was from May 4 to May 17, and the fourth phase (Phase 4) was from May 18 to May 31.
FIGURE 1 Trend of cumulative severity ratio, selected states (13-03-2020 to 31-10-2020) (%)

Unlock 1.0 (June 1-30) was the first phase of the reopening in stages, with an economic focus where shopping malls, hotels, and restaurants reopened. In Unlock 2.0 (July 1-31), the lockdown measures were restricted only to the contaminated zones, and some inter- and intrastate travel was permitted. Further relaxation of restrictions (e.g., night curfews) occurred in Unlock 3.0 (August 1-31), while Maharashtra and Tamil Nadu imposed a lockdown. Unlock 4.0 (September 1-30) was characterized by permission to gather at marriages and funerals, while wearing face masks became compulsory in public places, and Unlock 5.0 (October 1-31) by the opening of cinemas and a gradual restarting of onsite teaching at schools at the discretion of state governments.

How effectively these government policies influenced the COVID-19 infection cases or fatalities is debatable and essentially an empirical question. Some authors have constructed panel data across different countries and have estimated the effects of government policies on the COVID-19 infections. For instance, Chen et al. (2020) estimated the effects of various non-pharmaceutical interventions by governments to prevent the spread of COVID-19 on the country-level effective reproductive rate (Rt) for a panel of 75 economies and have found that while lockdown measures lead to reductions in Rt, gathering bans are more effective than workplace and school closures. How effective these policies are in India remains uncertain, which would justify our focus on different phases.
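The phase calendar described above can be written down as a small lookup, which is also a convenient way to build phase dummy variables. This is a minimal sketch in Python with hypothetical names. It assumes that Phase 4 ends on May 31, the day before Unlock 1.0 begins, and that the days before Phase 1 together with Phase 1 itself serve as the reference category, which would leave eight dummies.

```python
from datetime import date

# (label, first day, last day) for each policy phase in 2020.
PHASES = [
    ("Lockdown Phase 1", date(2020, 3, 25), date(2020, 4, 14)),
    ("Lockdown Phase 2", date(2020, 4, 15), date(2020, 5, 3)),
    ("Lockdown Phase 3", date(2020, 5, 4), date(2020, 5, 17)),
    # Assumption: Phase 4 ends the day before Unlock 1.0 starts.
    ("Lockdown Phase 4", date(2020, 5, 18), date(2020, 5, 31)),
    ("Unlock 1.0", date(2020, 6, 1), date(2020, 6, 30)),
    ("Unlock 2.0", date(2020, 7, 1), date(2020, 7, 31)),
    ("Unlock 3.0", date(2020, 8, 1), date(2020, 8, 31)),
    ("Unlock 4.0", date(2020, 9, 1), date(2020, 9, 30)),
    ("Unlock 5.0", date(2020, 10, 1), date(2020, 10, 31)),
]


def phase_of(day):
    """Return the phase label for a date, or None outside all phases."""
    for label, start, end in PHASES:
        if start <= day <= end:
            return label
    return None


def phase_dummies(day):
    """Return 0/1 indicators for every phase except the reference one.

    Assumption: the pre-lockdown days and Lockdown Phase 1 form the
    reference category, so their dummies are all zero.
    """
    current = phase_of(day)
    return {label: int(label == current) for label, _, _ in PHASES[1:]}
```

For example, `phase_of(date(2020, 8, 15))` returns `"Unlock 3.0"`, and `phase_dummies` for the same date has a single indicator set to 1.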
We observe in Figures 1 and 2 a gradual increase in both CSR and DSR from the latter half of Phase 1 in Maharashtra and Gujarat. However, Maharashtra has seen a continuous rise in both CSR and DSR until Unlock 4.0-5.0 where CSR exceeded 110%. DSR reached 125% in Unlock 4.0. Evidently, Maharashtra has experienced the severest pandemic. However, the state has seen a gradual decline in DSR from mid-September to October 2020. On the contrary, CSR remained stable at around 102% in Gujarat from June to October. DSR has also remained stable in Gujarat after late July. Tamil Nadu experienced a sharp rise in CSR in July and August (Unlock 2.0-3.0). Its DSR became the second worst next to Maharashtra from mid-June to the end of July with a gradual decline after mid-August.
Andhra Pradesh saw a rise in CSR from early July. Its CSR became the second highest, at around 103%, next to Maharashtra on September 18. DSR in Andhra Pradesh was the second highest from late July to early October, with its peak at nearly 110% in late August. DSR has declined since then. Uttar Pradesh has seen a rise in CSR from July to October. Other states in the graphs, namely Madhya Pradesh, Rajasthan, Assam, and Kerala, have experienced a gradual increase in CSR, but the pandemics measured by CSR or DSR were not as severe as those of the states mentioned earlier. We observe a large variation in levels of the severity across different states.

FIGURE 2 Trend of daily severity ratio, selected states (13-03-2020 to 31-10-2020) (%)
We see large variations in other states/union territories not highlighted in Figures 1-3. For instance, in Goa CSR increased from 102% in July to 113% in October 2020. Sikkim's CSR remained at 100% (i.e., no extra mortality due to COVID-19) until the end of July, but its CSR suddenly rose later and reached 115% at the end of October. Puducherry showed a similar trend, with a rise in CSR in August and September, and its CSR reached 110%. In Uttarakhand CSR gradually increased from 101% in July to 106% in October. Jammu and Kashmir saw a similar rise in CSR from 101% to 106% in July-October due to a surge in DSR in the same period. On the contrary, CSR and DSR remained very low elsewhere, such as in Odisha and Mizoram.

Figures 3 and 4 show the trends of cumulative and daily infection cases of COVID-19 on the basis of weekly averages in selected states. In terms of infection cases, Maharashtra has experienced by far the severest pandemic, although the number of daily infection cases started to decline after September 18. On the contrary, Kerala's daily infections suddenly rose from mid-September to the end of October, leading to a steep increase in cumulative cases. In other states shown in Figures 3 and 4, daily cases were highest in September and started to decline marginally in October. Most of the other states have shown similar trends of cumulative and daily cases where the latter declined gradually (Figures A1-A3). One notable exception is West Bengal, where daily cases continued to rise in September and October. The daily cases exceeded 4,000 and DSR rose to 104% in October in West Bengal.

FIGURE 3 Trend of cumulative COVID-19 infection cases (13-03-2020 to 31-10-2020)

| Panel unit-root tests
Given the trends of severity ratios we observe in Figures 1 and 2, we have carried out unit-root tests for CSR based on the weekly panel data. To normalize the infection cases, we have taken the logarithm of the number of cases and applied the unit-root tests to it. Table 1 presents the results of the panel unit-root tests for CSR, its first difference, the log of cases, as well as the log of the retail price series of wheat, one of the explanatory variables. We apply Levin-Lin-Chu (LLC) (Levin et al., 2002) and Im-Pesaran-Shin (IPS) tests (Im et al., 2003). LLC tests the null hypothesis that each time series contains a unit root against the alternative hypothesis that each time series is stationary, in which the lag order is permitted to vary across individuals. The IPS test is not as restrictive as the LLC test, as it allows for heterogeneous coefficients. The null hypothesis is that all individuals follow a unit-root process against the alternative hypothesis allowing some (but not all) of the individuals to have unit roots. We apply the specifications with and without a time trend. We determine the number of lags by Akaike Information Criteria (AIC). Three states with missing observations have been dropped to make the panel balanced. Table 1 shows that CSR is I(1) (nonstationary) but its first difference is stationary. The log of cases and the log of wheat prices are stationary. Given that CSR is not stationary, the ordinary least squares (OLS) or the static panel data model, such as fixed-effects or random-effects models, cannot be applied. As all the explanatory variables, including the wheat prices and weather variables, are stationary, they are not co-integrated. Therefore, we will use the first difference of CSR or the log of infection cases as a dependent variable for the weekly panel.
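The logic of testing levels against first differences can be illustrated with a rough Python sketch. It is not the LLC or IPS procedure used here. It only computes, for each unit, the t-statistic from a Dickey-Fuller regression with a constant and no lagged differences, and then averages those statistics across units, which is the raw ingredient of the IPS t-bar. The standardization to Wt-bar, the AIC lag selection, and the critical values are all omitted, and the data are simulated random walks rather than the CSR series.

```python
import math
import random


def df_tstat(series):
    """t-statistic on rho in: diff(y)_t = a + rho * y_{t-1} + error."""
    dy = [series[t] - series[t - 1] for t in range(1, len(series))]
    lagged = series[:-1]
    n = len(dy)
    x_mean = sum(lagged) / n
    d_mean = sum(dy) / n
    sxx = sum((x - x_mean) ** 2 for x in lagged)
    sxy = sum((x - x_mean) * (d - d_mean) for x, d in zip(lagged, dy))
    rho = sxy / sxx
    intercept = d_mean - rho * x_mean
    rss = sum((d - intercept - rho * x) ** 2 for x, d in zip(lagged, dy))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return rho / std_err


def t_bar(panel):
    """Average of the unit-by-unit Dickey-Fuller t-statistics."""
    stats = [df_tstat(series) for series in panel]
    return sum(stats) / len(stats)


def first_difference(series):
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def simulate_random_walks(n_units=29, n_periods=34, seed=0):
    """Simulated I(1) panel: 29 units and 34 weeks, as an illustration."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        level, path = 0.0, []
        for _ in range(n_periods):
            level += rng.gauss(0.0, 1.0)
            path.append(level)
        panel.append(path)
    return panel


panel = simulate_random_walks()
tbar_levels = t_bar(panel)
tbar_diffs = t_bar([first_difference(s) for s in panel])
```

For an I(1) series the average statistic in levels is expected to be far less negative than in first differences, which is the pattern that motivates using the first difference of CSR in the weekly panel.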
We have also taken the monthly averages of the data and constructed the monthly panel, where stationarity is not an issue due to a smaller T (or the panel unit-root tests cannot be conducted). For the monthly panel we use the level of CSR, its first difference, or the log of cases as a dependent variable.

| Model specification
TABLE 1 Results of unit-root tests for weekly panel
Notes: Lags are determined by Akaike Information Criteria (AIC). Adjusted t is reported for LLC and Wt-bar is reported for IPS. *Indicates that the estimate is statistically significant at the 10% level; ***denotes statistical significance at the 1% level.

TABLE 2 Correlates of cumulative severity ratio of COVID-19

To find the explanation for the regional variation in the severity of the COVID-19 pandemic, we use a panel of 32 states/union territories for which the data on various variables are available, covering the period from March 13, 2020, to October 31, 2020. As we have noted, we have organized the data as weekly or monthly panels where all the variables on a daily basis are averaged for each week or month. Because of the missing observations of a few variables, our estimation is based on 1,041 observations for the weekly panel (32 states times 32.5 weeks on average) and 223 observations for the monthly panel (32 states times 7.0 months). Our econometric models are motivated by the emerging empirical literature on the correlates of the COVID-19 pandemic we reviewed in Section 2, where it has been found that not only meteorological factors but also socioeconomic and demographic factors are closely associated with the degree of the COVID-19 pandemic or infections. We thus consider socioeconomic variables (e.g., income, multi-morbidity, sex ratio, and urbanization) as well as meteorological and geographical factors (e.g., temperature, rainfall, and state dummy variables). Methodologically, we use pooled OLS (with both state fixed effects and phase or month dummies), a random-effects model, and a random-effects Tobit model so that we can model time-invariant unobservable state or union territory characteristics (e.g., institutional or cultural factors specific to each state or union territory). In the random-effects model, state and phase/month fixed effects are included by applying the mixed-effects model (Bell & Jones, 2015). In the meantime, as some states/union territories had zero cases or deaths in early periods, the random-effects Tobit model is also estimated as a robustness check to account for left-censoring of the dependent variable when it is either CSR or the log of cases.
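The construction of the weekly panel from daily observations can be sketched in Python as follows. This is a minimal illustration under stated assumptions: weeks are taken as consecutive 7-day blocks starting on March 13, 2020, missing daily values are skipped rather than imputed, and the record layout and function names are hypothetical.

```python
from collections import defaultdict
from datetime import date

PANEL_START = date(2020, 3, 13)


def week_index(day, start=PANEL_START):
    """Week number counted from 1 in consecutive 7-day blocks."""
    return (day - start).days // 7 + 1


def weekly_means(daily_records):
    """Average daily values within each (state, week) cell.

    daily_records: iterable of (state, date, value) tuples, where value
        may be None when the observation is missing.
    """
    cells = defaultdict(lambda: [0.0, 0])  # (state, week) -> [sum, count]
    for state, day, value in daily_records:
        if value is None:
            continue  # assumption: missing days are dropped, not imputed
        cell = cells[(state, week_index(day))]
        cell[0] += value
        cell[1] += 1
    return {key: total / count for key, (total, count) in cells.items()}
```

Under this convention October 31, 2020 falls in Week 34, which matches the number of weeks reported for the study period. A monthly panel can be built the same way by keying on `(state, day.month)`.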
More specifically, we regress a dependent variable, either the CSR, the first difference of CSR (DCSR), or confirmed COVID-19 infection cases, on a number of explanatory variables. CSR captures the overall development of the COVID-19 pandemic, while its first difference denotes how the severity progresses over time. We also use the number of infection cases given that a surge in the confirmed cases is closely associated with the COVID-19 pandemic with some lags.
We have selected the explanatory variables, while constrained by the data availability, to reflect the growing empirical literature on the correlates of the COVID-19 pandemic or infections. The time-variant explanatory variables are weather variables, namely temperature and rainfall, as well as the lagged commodity price (retail wheat price). We have also used a number of time-invariant variables such as the log of per capita income, urbanization, presence of more than one morbidity condition among those above 60 years, and the sex ratio (the number of females per 1,000 males). The model also includes a few phase dummies as defined in the previous section.
We estimate the following equations. We have taken the logarithm of most of the explanatory variables to capture the relative effect, or the elasticity, of each factor. In Equation 1, i stands for states (from 1 to 32) and t for weeks from March 13 to October 31, 2020 (from 1 to 34) for the weekly panel data and for months from March to October (from 1 to 8) for the monthly panel data. The dependent variable in Equation 1 is DCSR_it (the first difference of CSR) for the weekly panel and CSR_it or DCSR_it for the monthly panel. In Equation 2 it is logCovidCases_it, the log of the daily infection cases averaged over a week or over a month. In the weekly panel both DCSR_it and logCovidCases_it are I(0).

\[
\begin{aligned}
\mathrm{DCSR}_{it}\ (\text{or } \mathrm{CSR}_{it}) ={}& \beta_0 + \beta_1 \log \mathrm{PerCapitaIncome}_i + \beta_2\, \mathrm{Multimorbidityabove60}_i + \beta_3\, \mathrm{Urbanization}_i + \beta_4\, \mathrm{SexRatio}_i \\
&+ \beta_5\, \mathrm{WheatPrice}_{i,t-1} + \beta_6\, \mathrm{Temperature}_{it} + \beta_7\, \mathrm{Rainfall}_{it} + \mathrm{Phase\,(or\,Month)\,Dummies}_t\, \beta_8 + \mu_i + e_{it} \qquad (1) \\
\log \mathrm{CovidCases}_{it} ={}& \beta_0 + \beta_1 \log \mathrm{PerCapitaIncome}_i + \beta_2\, \mathrm{Multimorbidityabove60}_i + \beta_3\, \mathrm{Urbanization}_i + \beta_4\, \mathrm{SexRatio}_i \\
&+ \beta_5\, \mathrm{WheatPrice}_{i,t-1} + \beta_6\, \mathrm{Temperature}_{it} + \beta_7\, \mathrm{Rainfall}_{it} + \mathrm{Phase\,(or\,Month)\,Dummies}_t\, \beta_8 + \mu_i + e_{it} \qquad (2)
\end{aligned}
\]

Here \mu_i is the state-specific effect and e_it is the error term, and the coefficients are estimated separately for each dependent variable.

PerCapitaIncome_i (PCI) denotes income at state levels that is measured by per capita net state domestic product (in Rs., divided by 1,000). PCI captures not only overall economic development at state levels but may also capture health infrastructure or funding at state levels in response to the COVID-19 pandemic, for which the data are unavailable. We include the proportion of elderly people who suffer from more than one non-communicable disease (NCD) at state levels (Multimorbidityabove60_i). This is the proportion of the population in the age group 60+ reporting more than one NCD (e.g., cardiovascular diseases, diabetes, hypertension, among others). To capture the degree of urbanization, we also insert Urbanization_i, the share of the population living in urban areas. The idea is that a higher population density and urbanization would increase interactions among people and thereby raise both CSR and its first difference. Furthermore, we have inserted SexRatio_i (the number of females per thousand males), as it is well documented that, while COVID-19 infection rates are broadly similar between men and women, men are more likely to suffer from severe illness or die as a result of COVID infections in China (Jin et al., 2020) and in Europe (Gebhard et al., 2020). However, given the preference for boys over girls in many states of India, more developed states with lower poverty (e.g., Kerala) tend to have a higher sex ratio, and these states may have a better health system. Therefore, the effect of the sex ratio on COVID-19 may be ambiguous in India. We also control for the effect of the retail price of wheat to examine whether the food price has any association with the COVID-19 pandemic. An increase in the wheat price may make it more difficult to access food or a macronutrient, but in the meantime, it may induce substitution into inferior cereals, such as ragi or maize, which may result in better nourishment (Gaiha et al., 2014).
Our results are consistent with the latter hypothesis.
It is widely debated whether weather influences COVID-19 infection cases and/or the deaths linked to them. Ma et al. (2020) used data on daily death numbers from Wuhan, China, in January-February 2020 and found that death counts are positively associated with temperature and negatively with relative humidity. We have collected the daily data on temperature, rainfall, and relative humidity from the MERRA-2 web service (Modern-Era Retrospective analysis for Research and Applications, Version 2) and have taken either weekly or monthly averages. It delivers time series of temperature (at 2 m), relative humidity (at 2 m), and rainfall. The data source is a NASA atmospheric reanalysis of the satellite era using the Goddard Earth Observing System Model (GEOS-5) and focuses on historical climate analyses for a broad range of weather and climate time scales (GMAO, 2015). Due to the high correlation between rainfall and relative humidity, we drop relative humidity and use only the variables Temperature_it and Rainfall_it.
To capture the time and policy effects, we have included eight dummy variables for Lockdown Phases 2-4 and Unlock 1.0-5.0 for the weekly panel and seven monthly dummies for the monthly panel. This is aimed to capture the associations with the lockdown and unlock policies announced by the Government of India. Equation 1 has been estimated by pooled OLS with state and phase/month fixed effects, random-effects model or mixed-effects model (Bell & Jones, 2015), and random-effects Tobit model with phase/month fixed effects.
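To make the role of left-censoring concrete, the following Python sketch evaluates a Tobit log-likelihood for a dependent variable censored from below at zero. It is a simplified pooled version under stated assumptions: it omits the state random effect that a random-effects Tobit model integrates out, it takes the fitted linear index as given instead of estimating the coefficients, and the function names are hypothetical.

```python
import math


def normal_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, linear_index, sigma, lower=0.0):
    """Pooled Tobit log-likelihood with left-censoring at `lower`.

    y: observed outcomes (values at or below `lower` count as censored).
    linear_index: fitted values x'beta for each observation.
    sigma: standard deviation of the latent error term.
    """
    total = 0.0
    for y_i, index_i in zip(y, linear_index):
        if y_i <= lower:
            # Censored: probability that the latent outcome is below the limit.
            total += math.log(normal_cdf((lower - index_i) / sigma))
        else:
            # Uncensored: ordinary normal density of the residual.
            total += normal_logpdf((y_i - index_i) / sigma) - math.log(sigma)
    return total
```

Observations with zero cases or deaths enter through the probability of being at or below the limit rather than through a residual, which is what distinguishes this likelihood from that of a linear model. In practice the parameters would be chosen to maximize this function with a numerical optimizer.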

| RESULTS
We show the results of our regression analyses in Tables 2 and 3 corresponding to Equations 1 and 2. The main findings are summarized below.
In Table 2, the results based on the weekly panel are shown in Columns 1 and 2, and those based on the monthly panels are in Columns 3-8. We find that log per capita income at the state level is positively associated with DCSR, the weekly and monthly changes in CSR of COVID-19 (a proxy for the development of the pandemic) (Columns 1, 3, 5, and 7), as well as with monthly CSR, after controlling for state and phase/month fixed effects. For instance, an increase of 1% in per capita income is on average associated with an increase of 0.33% in the change of CSR (Column 1). As CSR is measured in percentages, not in the logarithm, this increase is substantial and implies that a state with a higher income tends to witness a faster change in CSR or a faster development of the pandemic. A likely reason is that a higher income level tends to be associated with more production, transportation, and movement of people and goods even in the lockdown phases. Consistent results are found for the monthly data. An increase of 1% in per capita income is associated with an increase of 4.4%-5.6% in CSR and a nearly 2% increase in the change in CSR on a monthly basis. If we replace CSR with DSR, we find that a 1% income increase is significantly associated with a 5.4% increase in DSR. We have also found some positive association between CSR or DCSR and urbanization for monthly data, but not weekly data. The estimated coefficient varies across different models, but, for instance, based on Column 4 on CSR, we observe that a 1% increase in urbanization is associated with a 0.36% increase in CSR, which is consistent with Das et al. (2021) and Olsen et al. (2020). As expected, we find that the share of those among the elderly with multi-morbidity conditions is positively associated with DCSR or CSR (e.g., a 1% increase in the share is correlated with a 0.54%-0.67% increase in CSR, Columns 3 and 5). Similar results are found for DSR.
If the number of women per 1,000 men decreases by 1, this is on average associated with an increase of 0.05%-0.06% in CSR (or a 0.02 increase in DCSR). A consistent result has been found for DSR as well. However, the sign is reversed in Table 3. That is, a higher share of women is associated with a higher level of infection but lower level of fatalities. Whether this reflects any gender difference in the risks of infection and fatalities is not clear, but the results imply that demography is one of the important correlates of the COVID-19 pandemic.
A lagged retail price of wheat is negatively correlated with DCSR or CSR, and the estimates are statistically significant in all the cases except Columns 3 and 4 (RE model applied to the monthly panel). Columns 1 and 2 based on the weekly data and Columns 7 and 8 based on the monthly data show a similar level of parameter estimates. A 1% increase in wheat price is associated with a 0.17%-0.19% decrease in changes in CSR, while the estimated coefficient of the Tobit model suggests that a 1% price increase is correlated with a 4.5%-4.6% decrease in CSR (in levels). The results overall suggest a negative correlation between wheat prices and CSR, which could be due to a shift to cheaper and more nutritious cereals. However, once we take the second or third lags, the parameter estimates are negative but not statistically significant.
We have controlled for temperature and rainfall to reflect the empirical literature on the correlates of the COVID-19 infection and pandemic. While the estimated coefficient of temperature is positive and that of rainfall is negative in all the cases, we only find a positive and statistically significant estimate for temperature in Columns 5 and 6 (Tobit for CSR) and a negative and significant estimate for rainfall in Columns 7 and 8 (random-effects model for DCSR). We refrain from inferring any associations between weather conditions and the pandemic once time and state effects are accounted for.

Table 2 also shows coefficient estimates of state dummies for selected states. They do not necessarily match the rankings of CSR in Figure 1 or Figures A1 and A2, as the estimated coefficients of state dummies have been derived after conditioning on other covariates, such as per capita income. However, Maharashtra tends to have a higher parameter estimate when the estimate for DCSR or CSR is statistically significant (e.g., Columns 1, 3, and 5). Phase or month dummies show that not only the level of CSR but also its change tends to increase in later periods, which implies that the pandemic has worsened over time. A decrease in DCSR from Unlock 4.0 to Unlock 5.0 (Columns 1 and 2) indicates that the worsening of the pandemic slowed down in October 2020.

Table 3 shows the results on infection cases. Here the log of infection cases, which has been found to be stationary based on the unit-root tests, is used as a dependent variable in all the cases. Columns 9-12 are based on the weekly panel and Columns 13-16 on the monthly panel. Both random-effects models and Tobit models are applied, given that there are some states with no cases at the onset of the pandemic. It is notable that the signs of many of the parameter estimates on CSR or DCSR in Table 2 are reversed in Table 3. For instance, log per capita income is negative and significant in all the cases. That is, if income increases by 1%, the number of cases tends to decrease by 2.8%-3.0% (with no causality implied by these results), after controlling for state fixed effects and phase/month dummies. Interpreting the results in Tables 2 and 3 together, a state with a higher income tends to experience a worse pandemic at relatively low case numbers on average. This is counter-intuitive at first sight if we assume that income leads to more interactions among people and hence to more cases, but we conjecture that a relatively rich state may be able to carry out more tests without necessarily having the capacity to cope with fatalities for a certain size of population.
On the contrary, a state with a higher share of the elderly with morbidity conditions tends to have fewer COVID-19 cases (where a 1% increase of the former is associated with a 0.28% decrease in the cases), while the unconditional correlation between the two variables is positive. It is conjectured that while morbidity conditions among the elderly can lead to fatalities once they are infected, they may not influence the probability of being infected at the population level. Urbanization is not significantly associated with the number of cases (except for a negative and significant coefficient based on Tobit, Column 12). As noted earlier, the sex ratio is positive and significant, implying that the states with more females per 1,000 males tend to have more infection cases (an increase of one woman per 100 men is associated with a 0.03% increase in the cases). As in Table 2, retail prices of wheat are negatively correlated with the log of infection cases, where a rise of 1% in wheat prices is associated with a decrease of 1.3%-1.7% in infection cases. It is conjectured that higher wheat price indices reflect a shift toward inferior but more nutritious cereals (Gaiha et al., 2014). On the effect of weather, both temperature and rainfall have a positive and significant coefficient estimate, that is, hot and rainy weather conditions are correlated with higher COVID-19 infection rates. A 1% increase in temperature is associated with a 0.13%-0.17% increase in the number of cases on average, other factors held constant. Similarly, a 1 mm increase in rainfall is associated with a 0.03%-0.06% increase in the cases. Phase or month dummy variables show that the number of cases tends to grow in subsequent months or phases. State dummy variables show, after controlling for covariates (e.g., income), that Maharashtra, Andhra Pradesh, Gujarat, Kerala, and Tamil Nadu are the states that exhibit a higher number of infection cases than other states.

CONCLUSIONS AND POLICY DISCUSSIONS
This study has provided econometric results on the socioeconomic, meteorological, and geographical correlates of the severity of the COVID-19 pandemic in India. We have used a measure of excess mortality, the CSR, and its first difference (DCSR) up to October 31, 2020. The log of COVID-19 cases has also been estimated, as a rapid increase in cases is itself an indicator of the pandemic. The study has adopted a random-effects model and a random-effects Tobit model, the latter of which accounts for the fact that some states did not record COVID-19-related deaths in the early phases. The factors associated with a more severe pandemic, as reflected in a large CSR or DCSR, include higher income at the state level, a higher share of the elderly with multi-morbidity conditions, urbanization, a lower share of females in the population, lower local retail prices of wheat, and the lockdown and unlock phases. By contrast, the correlates of a higher number of infection cases, which differ from the above factors, include lower income, a lower share of the elderly with multi-morbidity conditions, a higher share of females in the population, lower wheat prices, and hotter and/or rainier climatic conditions.
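As a reminder of how the two severity measures relate, the following minimal sketch computes the CSR as cumulative COVID-related deaths divided by deaths in a pre-pandemic year, and the DCSR as its period-to-period first difference. The weekly death counts and the baseline figure are made-up illustrative numbers, not data from this study, and the function names are hypothetical.

```python
def cumulative_severity_ratio(covid_deaths_per_period, baseline_annual_deaths):
    """CSR per period: cumulative COVID-related deaths up to that period
    divided by total deaths in a pre-pandemic reference year."""
    csr = []
    running_total = 0
    for deaths in covid_deaths_per_period:
        running_total += deaths
        csr.append(running_total / baseline_annual_deaths)
    return csr


def first_difference(series):
    """DCSR: change in the CSR between consecutive periods."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Illustrative (invented) weekly COVID-related deaths for one state.
weekly_deaths = [0, 0, 5, 20, 50]
# Illustrative (invented) number of deaths in a pre-pandemic year.
baseline_deaths = 100_000

csr = cumulative_severity_ratio(weekly_deaths, baseline_deaths)
dcsr = first_difference(csr)

print(csr)
print(dcsr)
```

The leading zeros in the invented series mimic states that recorded no COVID-related deaths in the early phases, which is the situation the random-effects Tobit specification is intended to handle.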
The positive association between the severity measures and per capita income implies that higher incomes are associated with higher mortality. The underlying mechanisms may include greater economic activity, more travel and intermixing, and, consequently, higher exposure to the infection and a higher risk of dying if denied medical assistance. The negative association between infection cases and per capita income is puzzling, but it could be that states with greater economic activity did not necessarily have a matching capacity for testing.
A more or less expected result is the positive association between CSR (or DCSR) and urbanization. Although there has been a large-scale reverse migration from urban areas to villages, indications are that large segments are forced to return to small towns and cities with flickering signs of economic revival.
The negative association of both CSR and DCSR with the sex ratio, the number of women per 1,000 men, means that a state with a higher share of women (i.e., a higher sex ratio) tends to have a lower severity ratio, or a smaller increase in it (that is, a milder pandemic), after controlling for state fixed effects. This is consistent with Joe et al. (2020), who argued that the evidence from various countries suggests that men are at greater risk of both infections and deaths, and that males are at a greater disadvantage than females with the CFR of 3.3% and 2.9%, respectively. In a statistical analysis, the authors show that the CFR among males is usually higher than that among females for most of the age groups.
A few limitations are briefly noted. First, as we have used the state, rather than the household, as the unit of analysis, we could not capture variation in COVID-19 fatalities within states. Second, our analysis is limited by the lack of data on health capacity and infrastructure for measuring the response to the COVID-19 pandemic. Third, we are unable to assess the impact of return migration on the villages and small towns to which the migrants returned.
Finally, we make a few points about COVID-related policies 20 and future research. First, we have shown a statistical association between a higher share of the elderly with multi-morbidity conditions and the pandemic severity, while the development of the COVID-19 pandemic differs significantly across states. It can be conjectured on the basis of our results and earlier studies on COVID-19 (e.g., Joe et al., 2020) that the elderly suffering from NCDs are susceptible to the risk of dying from COVID-19. Although our analysis has not been able to include variables on health capacity and infrastructure, given the under-funding of the health sector in a number of states, a case could be made for developing a fully integrated population-based healthcare system that brings together the public and private sectors and the allopathic and indigenous systems, and that is well coordinated across the different levels of service delivery (primary, secondary, and tertiary). 21 Such a system should address acute and chronic healthcare needs, offer accessible, good-quality healthcare choices, and be cashless at the point of service delivery; it would thereby help mitigate the severity of the pandemic. To understand the role of healthcare systems and funding in mitigating the severity of the pandemic, future research should examine the causal relationship between health capacity and infrastructure and the COVID-19 pandemic in India and elsewhere.
Another major concern is that the response to COVID-19 infection depends on the immune system of the individual. Although we did not include variables on nutrition or health status at the individual level due to data constraints, we have shown that the development of the COVID-19 pandemic is correlated with changes in wheat prices. The underlying mechanism connecting these two variables is unclear, but it is possible that changes in wheat prices influence the nutritional conditions of vulnerable groups in the population through changes in the availability of staple foods or substitution from wheat to other staple foods. Specifically, individuals with poor nutritional status are likely to have a weak immune system. A significant proportion of women in the age group 15-49 years, for example, are undernourished, and this makes them more vulnerable to COVID-19 morbidity and mortality. As risks of chronic diseases accumulate over the life span, the old (60 years and above) tend to be more vulnerable to diabetes and cardiovascular diseases, which exposes them to higher risks of COVID-19 morbidity and mortality. Amid elevated risks to lives and livelihoods, there is also a surge in hunger and food deprivation in both rural and urban areas. Besides, disruption of healthcare services is inimical to nutritional health (Joe et al., 2020). Therefore, food security is a major policy challenge (Reardon et al., 2020). Future research should focus on the causal link between food security or nutritional conditions and COVID-19 morbidity or mortality.
As our results show that some states were slower to recover from the pandemic than others, a perceptive comment by Horton (2020) merits serious consideration, particularly when we look beyond the current pandemic. If we can diagnose new infections more rapidly, there is hope of exiting lockdown faster and more safely. For example, with self-isolation at the early signs of muscle pain, fatigue, headache, diarrhoea, and rashes, the likelihood of avoiding a second or third wave is greater. Another important observation is that prolonged lockdowns are not the answer to future waves of COVID-19. School closures are not sustainable, nor can the economy be refrigerated again. What matters most is a combination of preventive measures that includes handwashing, respiratory hygiene, mask wearing, physical distancing, and avoiding mass gatherings, some of which were adopted during the unlock phases. Future research should carry out rigorous evaluations of the effects of different social distancing policies on COVID-19 morbidity or mortality using, for instance, phased implementations of lockdown or unlock policies.
In brief, the rapid surge of the corona pandemic calls for extraordinary measures. While some are identified here, their implementation is daunting.