Human mobility impacts on the surging incidence of COVID‐19 in India

Abstract Human mobility triggers how fast and where infectious diseases spread and modelling community flows helps assess the impact of social distancing policies and advance our understanding of community behaviour in such circumstances. This study investigated the relationship between human mobility and the surging incidence of COVID‐19 in India. We performed a generalised estimating equation with a Poisson log‐linear model to analyse the daily mobility rate and new cases of COVID‐19 between 14 March and 11 September 2020. We found that mobility to grocery and retail locations was significantly associated (p < 0.01) with the incidence of COVID‐19, these being crowded and unorganised in most parts of India. In contrast, visits to parks, workplaces, and transit stations did not considerably affect the changing COVID‐19 cases over time. In particular, workplaces equipped with social distancing protocols or low‐density open spaces are much less susceptible to the spread of the virus. These findings suggest that human mobility data, geographic information, and health geography modelling have significant potential to inform strategic decision‐making during pandemics because they provide actionable knowledge of when and where communities might be exposed to the disease.


| INTRODUCTION
In a globalised world, in which movement of people has increased at an unprecedented scale, populations are susceptible to the spread of infectious disease across scales (Tatem et al., 2006). With rapid advancements in transportation systems, societies have increased the reach and speed of travel, and people are moving in greater volumes than ever before. The global spread of the new coronavirus SARS-CoV-2, the pathogen that causes COVID-19 (Chen et al., 2020; Lai et al., 2020; Singhal, 2020; Sohrabi et al., 2020; Wang, Hu, et al., 2020), had led to over 80 million infections worldwide by the end of 2020 (WHO, 2021). That figure is indicative of how mobility can turn into a global health problem in an increasingly globalised world. One approach to curbing the rise of infections involves enforcing social distancing measures, also known as physical distancing measures (Cato et al., 2020; Lasry et al., 2020; Lunn et al., 2020; WHO, 2020b; Zhu et al., 2020), including bans on nonessential travel, stay-at-home orders, school closures, and a temporary shutdown of offices and businesses. These country-based control and mitigation measures have partly determined the course of the COVID-19 pandemic (Anderson et al., 2020).
At time of writing, India was the country second most affected by the COVID-19 health crisis, with over 10 million cases and more than 1.5 million people succumbing to the virulent disease (WHO, 2021). India's first COVID-19 case was reported on 30 January 2020, in the state of Kerala, where the patient had travelled from Wuhan in China, the original epicentre of the epidemic (Rafiq et al., 2020).
At the early stages of the pandemic, the Government of India announced an unprecedented nationwide lockdown on 24 March 2020, banning nonessential travel within the country and cancelling international flights to prevent the spread of COVID-19. The lockdown on 1.38 billion Indians was meant to enforce strict social distancing and was implemented in four phases over nine weeks (25 March to 31 May). The first three weeks of the lockdown were the most severe with nearly all services and factories suspended across the country (Praharaj & Vaidya, 2020). During the second phase of the intervention (15 April to 3 May), lockdown areas were classified into three zones: red (indicating infection hotspots), orange (indicating some infection), and green (no cases of COVID-19). Some relaxations were announced for orange and green zones in that second phase, allowing agricultural businesses, banks, and government centres to reopen. The third (4-17 May) and fourth (18-31 May) phases of the lockdown were used to open up more activities in the orange zone and to restore normalcy in the green zone.
While the number of cases continued to increase in several parts of India after the lockdown was lifted on 31 May 2020, we are yet to understand what impact the mobility restrictions, lockdowns and subsequent reopenings may have had on the evolving scenario of virus spread and the COVID-19 health crisis.
We hypothesise that by studying human mobility data, it is possible to explain the nature of change in individuals' movements from national to state level due to social distancing interventions and to show how that affected the progression of cases over time. Numerous initiatives are tracking mobility changes during the pandemic, such as Google Community Mobility Reports (GCMR; Google, 2020) and Apple Mobility Trends Reports. These initiatives use anonymised, aggregated data to chart movement trends over time by geography and place categories, including grocery, retail, workplaces, parks, transit centres, and places of residence. These open datasets offer novel opportunities to study the relationship between human mobility patterns and the transmission of COVID-19. The usability and value of mobility data in epidemiology studies are well-established in the literature (Wesolowski et al., 2012, 2015; Wu et al., 2020). Researchers have also used human mobility data as an indicator of social distancing, which is known to break the chain of human-to-human transmission of the virus (Badr et al., 2020; Lasry et al., 2020; WHO, 2020a).
In an attempt to explain the COVID-19 outbreak in the United States, Paez (2020) has used the mobility data from GCMR to quantify the regression effects of categories of places to find that workplace-related trips had the most effect on increased caseload. The same GCMR data have been used by Lasry et al. (2020) to examine the temporal correlations between public-policy induced change in community mobility and growth of infections to suggest there was an association at least in the four US metropolitan areas that they studied. Badr et al. (2020) have developed a mobility ratio using similar cell phone data to find that falling mobility trends were strongly correlated with decreased COVID-19 case growth rates over time for the most affected counties in the United States between 1 January and 20 April. Using Apple mobility data, Hadjidemetriou et al. (2020) have investigated the impact of the United Kingdom Government's interventions on human mobility and its subsequent impact on severe COVID-19 outcomes. Mobile location data used by researchers in China (Fang et al., 2020; Kraemer et al., 2020) have shown that the reduction in nonessential travel imposed on the epicentre of the epidemic in Wuhan dramatically changed the trajectory of the pandemic.
This study investigates the potential of GCMR data to help researchers and others assess the impact of human mobility on the incidence of COVID-19 in India. The scale of the analysis ranges from global (India-wide) to local (state level). We used Google data because Google Maps is used extensively in India, and the dataset is potentially more representative of actual mobility patterns. Our study examines one central research question: Is there a relationship between daily mobility patterns and the new incidence of COVID-19? We subsequently explore whether there is a difference in the ways in which coronavirus transmission is influenced by travel to various categories of places such as parks, grocery and food markets, and workplaces. In particular, if there is a statistically significant difference in the effects of the different mobility categories, which locations are more prone to spreading the infection?

Key insights
This case study drawing on data from India provides new insights on health geography modelling and shows the efficacy of analysing spatio-temporal data on human mobility and COVID-19 to identify the geography of exposure risk. An advanced GEE Poisson log-linear model was employed to investigate the relationship between daily mobility and registered COVID-19 cases by places. The results show that visiting certain locations had a significant positive association with increased COVID-19 cases. That outcome suggests that spatio-temporal identification of risk of COVID-19 infection vis-à-vis community mobility can support decision-making and planning to help mitigation efforts directed against current and future pandemics.

By examining interactions between people and their movements, we emphasise the roles of place, location, and geography in health and disease modelling, positioning the study in the broader field of health geography (Kearns & Moon, 2002). Traditionally, health geography research spans two distinct avenues: the patterns, causes, and spread of disease; and the planning and provision of health services (Dummer, 2008). In this article, we focus on the intersection of the two areas, beginning by modelling the spread of COVID-19, which leads to recommended planning interventions to support policy development. Our research builds on recent efforts, such as a geographically explicit model of the COVID-19 spread in Italy (Bertuzzo et al., 2020), health geography modelling of COVID-19 dispersion in São Paulo, Brazil (Fortaleza et al., 2021), geographical factor analysis of the COVID-19 outbreak in India (Gupta et al., 2020), and the spatial variation and place disparities in COVID-19 across the United States (Desmet & Wacziarg, 2021). Our findings highlight the extent to which a new geography of health has emerged from the surging COVID-19 pandemic.

| Variables and data sources
We have used two datasets for the longitudinal analysis in this study to answer the research questions. We obtained the first set of data, the daily state-level counts of COVID-19 cases (the dependent variable), from COVID-19 India Tracker (COVID-19 India Org Data Operations Group, 2020), a crowd-sourced database for real-time COVID-19 statistics and patient tracing in India. The dataset is made available through an open (CC-BY-4.0) licence. This source is the most authoritative one available for COVID-19 related information in India. It combines the daily data from the Indian Ministry of Health and Family Welfare with data published in state health department bulletins from all 36 states and union territories in India.
Daily mobility indicator data are the second set used in the analysis and have been obtained from GCMR (Google, 2020), which provide aggregated and anonymised information about Google Maps users' visits to different categories of places based on their location histories. Google constructs the data on the basis of the frequency and length of visits to places and reports the percentage change from a baseline level, which corresponds to the median value of mobility for the same day of the week over the period from 3 January to 6 February 2020 (Aktay et al., 2020). Of the six categories of places, retail and recreation data provide information on visits to places such as restaurants, cafes, shopping centres, libraries, and movie theatres. Grocery and pharmacy data show mobility patterns on visits to grocery markets, food warehouses, farmers markets, and pharmacies. Parks data include travel to and from places such as national parks, public beaches, plazas, public gardens, and marinas. Transit station data provide information on mobility patterns for visits to bus and train stations. Workplace data contain information on mobility patterns for the location of work, and residential data show patterns for movements within neighbourhoods of residence.
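The baseline comparison described above can be sketched in a few lines of Python. This is a minimal illustration and not Google's implementation: the function name `percent_change_from_baseline` and the raw visit counts are hypothetical (GCMR publish only the resulting percentage changes), and the baseline is assumed to be the median for the same weekday within the 3 January to 6 February 2020 window stated in the text.

```python
from datetime import date, timedelta
from statistics import median

BASELINE_START = date(2020, 1, 3)  # baseline window as stated in the text
BASELINE_END = date(2020, 2, 6)

def percent_change_from_baseline(visits, day):
    """Percentage change of visits[day] relative to the median of the
    same weekday within the baseline window.

    visits: dict mapping datetime.date -> visit count (hypothetical raw data).
    """
    same_weekday = [
        count for d, count in visits.items()
        if BASELINE_START <= d <= BASELINE_END and d.weekday() == day.weekday()
    ]
    baseline = median(same_weekday)
    return 100.0 * (visits[day] - baseline) / baseline

# Hypothetical example: 100 visits on every baseline day, 40 visits on 1 April 2020.
visits = {BASELINE_START + timedelta(days=i): 100 for i in range(35)}
visits[date(2020, 4, 1)] = 40
change = percent_change_from_baseline(visits, date(2020, 4, 1))
print(change)
```

A negative value indicates fewer visits than on a typical pre-pandemic day of the same weekday, which is how the mobility indicators used in this study should be read.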
Combining the two datasets enabled us to determine the daily sample for 36 states and union territories in India, spanning six months between 14 March and 11 September 2020. We have made the research data available through Mendeley Data (Praharaj, 2020). COVID-19's incubation period, which is the time between exposure to the virus and symptom onset, is typically five to six days but can be as long as 14 days, as observed by WHO (2020c) and the Government of India (2020) based on several scientific publications (Backer et al., 2020; Lauer et al., 2020; Yu et al., 2020). Thus, any changes in mobility will likely have a lagged effect on the discovery of new cases. To account for this lag, we calculated the lagged moving averages of the mobility indicators. We created both a 7-day lagged indicator and a 14-day lagged indicator to help understand whether a longer incubation time leads to a different prediction of COVID-19 cases. For the 7-day lagged indicator, the value is calculated as the mean of the mobility indicator from 'date-minus-7-days' to 'date-minus-2-days'. Furthermore, it is possible that mobility and reports of new cases of COVID-19 are endogenous if communities adjust their mobility patterns according to increasing incidence rates (Paez, 2020). Hence, in addition to being consistent with a scientifically established incubation period, the use of lagged indicators also helps to break this potential endogeneity.
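The lagged indicator can be illustrated with a short Python sketch. The function name `lagged_mean`, its parameters, and the sample readings are hypothetical; the default window (date-minus-7 to date-minus-2, inclusive) follows the description above for the 7-day indicator.

```python
def lagged_mean(values, t, far_lag=7, near_lag=2):
    """Mean of values[t - far_lag] ... values[t - near_lag], inclusive.

    values: list of daily mobility readings (percentage change from baseline),
            ordered by date with no gaps.
    Returns None when the window would start before the first observation.
    """
    if t - far_lag < 0:
        return None
    window = values[t - far_lag : t - near_lag + 1]
    return sum(window) / len(window)

# Hypothetical grocery-and-pharmacy mobility readings for ten consecutive days.
mobility = [-10, -20, -30, -40, -50, -60, -70, -80, -90, -100]
lag7 = [lagged_mean(mobility, t) for t in range(len(mobility))]
print(lag7[7])
```

Passing `far_lag=14` gives one possible 14-day variant. The text does not state the exact window used for the 14-day indicator, so that choice is an assumption.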

| Statistical model
We used a Poisson log-linear model with a generalised estimating equation (GEE) approach, which is a reliable statistical method for longitudinal data analysis where the broad scientific objective is to describe an outcome (Ballinger, 2004), in this case, the incidence of COVID-19. In longitudinal data, repeated observations for a subject tend to be correlated. GEEs use the generalised linear model to estimate more efficient and unbiased regression parameters relative to ordinary least squares regression because the model allows the development of a working correlation matrix (Zeger et al., 1988). This process accounts for the within-subject correlation of responses on dependent variables of a variety of distribution types, including normal, binomial, and Poisson. Initially developed by Zeger and Liang (1986), the GEE method has been widely used in medical and life sciences.
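For reference, the general form of the estimating equations in the GEE literature can be written as follows. The notation is introduced here for illustration and is not taken from this study, which does not state the working correlation structure that was chosen.

$$
\sum_{i=1}^{K} D_i^{\top} V_i^{-1} \left( y_i - \mu_i \right) = 0, \qquad D_i = \frac{\partial \mu_i}{\partial \beta}, \qquad V_i = \phi \, A_i^{1/2} R(\alpha) A_i^{1/2}
$$

Here $y_i$ is the vector of repeated outcomes for subject $i$ (of $K$ subjects), $\mu_i$ is its mean vector, $A_i$ is the diagonal matrix of variance functions (for a Poisson model, $A_i = \mathrm{diag}(\mu_{ij})$), $R(\alpha)$ is the working correlation matrix, and $\phi$ is a dispersion parameter. Setting $R(\alpha)$ to the identity matrix gives the independence working correlation, under which the point estimates coincide with those of an ordinary Poisson regression.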
We defined the daily COVID-19 case counts for different regions as the dependent variable. This model assumes that the responses follow a Poisson distribution rather than a normal distribution because they consist of counts of the daily infections by place. We worked with the longitudinal marginal model to quantify the relationship between the incidence of daily COVID-19 cases and the six mobility types (retail and recreation, grocery and pharmacy, parks, transit stations, workplaces, and residential), using the correlation among repeated case counts for 36 Indian regions. The model presents quasi-likelihood estimates of the β coefficients arising from the maximisation of normality-based log-likelihood of COVID-19 cases by geography.
The Poisson log-linear regression model for the expected rate of occurrence of COVID-19 cases in a state or territory can be denoted by the following:
$$\log(\mu_{ij}) = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \dots + \beta_6 x_{6ij}$$
The longitudinal data of COVID-19 cases are daily cases for a particular region. We denote the expected cases by $\log(\mu_{ij})$ and the six mobility rates by $x_{1ij}, \dots, x_{6ij}$, where $i$ indexes the region and $j$ the daily observation. Here, GEE treats the response vector for the daily COVID-19 cases with the mean vector noted by $\mu_{ij}$, which corresponds to the $j$th mean. The cases are assumed to be independent across regions but correlated within each region.
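To make the estimation idea concrete, the following is a minimal pure-Python sketch of a Poisson log-linear GEE under stated assumptions. It is not the SPSS procedure used in this study: it assumes an independence working correlation, a single mobility covariate instead of six, and toy data. The function name `fit_poisson_gee_independence` is hypothetical. Within-region correlation enters only through the cluster-robust (sandwich) standard errors.

```python
import math

def fit_poisson_gee_independence(regions, max_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by GEE with an independence working correlation.

    regions: list of regions, each a list of (x, y) pairs
             (x = lagged mobility, y = daily case count).
    Returns ((b0, b1), (se0, se1)) with cluster-robust (sandwich) standard errors.
    """
    obs = [pair for region in regions for pair in region]
    b0 = math.log(sum(y for _, y in obs) / len(obs))  # start at log of mean count
    b1 = 0.0

    def information(b0, b1):
        # Model-based information matrix: sum of mu * [1, x][1, x]^T.
        h00 = h01 = h11 = 0.0
        for x, _ in obs:
            mu = math.exp(b0 + b1 * x)
            h00 += mu
            h01 += mu * x
            h11 += mu * x * x
        return h00, h01, h11

    for _ in range(max_iter):
        # Estimating equation: sum of (y - mu) * [1, x] = 0, solved by Newton steps.
        u0 = sum(y - math.exp(b0 + b1 * x) for x, y in obs)
        u1 = sum((y - math.exp(b0 + b1 * x)) * x for x, y in obs)
        h00, h01, h11 = information(b0, b1)
        det = h00 * h11 - h01 * h01
        step0 = (h11 * u0 - h01 * u1) / det
        step1 = (h00 * u1 - h01 * u0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break

    # Sandwich variance H^-1 * M * H^-1, where M sums the outer products of
    # each region's score vector, so observations are grouped by region.
    m00 = m01 = m11 = 0.0
    for region in regions:
        s0 = sum(y - math.exp(b0 + b1 * x) for x, y in region)
        s1 = sum((y - math.exp(b0 + b1 * x)) * x for x, y in region)
        m00 += s0 * s0
        m01 += s0 * s1
        m11 += s1 * s1
    h00, h01, h11 = information(b0, b1)
    det = h00 * h11 - h01 * h01
    i00, i01, i11 = h11 / det, -h01 / det, h00 / det  # entries of H^-1
    var0 = i00 * (i00 * m00 + i01 * m01) + i01 * (i00 * m01 + i01 * m11)
    var1 = i01 * (i01 * m00 + i11 * m01) + i11 * (i01 * m01 + i11 * m11)
    return (b0, b1), (math.sqrt(max(var0, 0.0)), math.sqrt(max(var1, 0.0)))

# Toy data: two regions, two observations each (x is a 0/1 mobility indicator).
toy_regions = [[(0, 2), (1, 4)], [(0, 4), (1, 6)]]
(b0, b1), (se0, se1) = fit_poisson_gee_independence(toy_regions)
print(b0, b1, se0, se1)
```

An exchangeable or autoregressive working correlation would change the weighting inside the estimating equation and generally requires a statistics package; the independence case is shown only because it keeps the arithmetic transparent.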

| COVID-19 outbreak and the containment response in India
Although the COVID-19 lockdown in India was among the harshest in the world (Daniyal, 2020), levels of implementation and observance by communities varied across regions. We mapped the progression of cases in different states alongside select policy events (Figure 1). The analysis shows that India had just 571 cases on 24 March, when the nationwide lockdown started. Before the lockdown was enforced, India's COVID-19 cases were doubling every three days (Rafiq et al., 2020). At the end of the initial lockdown period on 18 April, the growth rate of infections had significantly slowed to doubling every eight days. Our data reveal that by the end of the first two phases of the lockdown on 3 May, fewer than 43,000 total cases were reported across India. As noted above, the first two phases of the lockdown were rigorously implemented and, as a result, authorities were able to manage the virus spread. But India then experienced a steady rise in cases during the third and fourth phases of the lockdowns, reaching a total of 190,000 cases by the end of May. During these two lockdown phases, mobility restrictions were gradually eased in areas outside the containment zones. The Indian Council for Medical Research [ICMR] (2020) has noted that, at the end of the lockdown, community transmission of the virus had still not been reported in India, reportedly because of the pro-active social distancing and mobility restrictions observed during the period. The COVID-19 trajectory in India changed rapidly from mid-June 2020 following executive orders to reopen shopping centres, religious places, hotels, and restaurants from 8 June. Restrictions on interstate travel were also withdrawn, which significantly changed mobility dynamics. With the opening up of economic activities and increased movements, there was an alarming increase of COVID-19 infections in new territories such as Andhra Pradesh, Tamil Nadu, and Karnataka, as highlighted in Figure 1.
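The doubling times quoted above can be translated into daily growth rates with a short calculation. This sketch assumes constant exponential growth over the period in question, which is a simplification; the function names are hypothetical.

```python
import math

def daily_growth_rate(doubling_days):
    """Constant daily growth rate implied by a given doubling time."""
    return 2 ** (1 / doubling_days) - 1

def doubling_time(daily_rate):
    """Doubling time in days implied by a constant daily growth rate."""
    return math.log(2) / math.log(1 + daily_rate)

before = daily_growth_rate(3)  # doubling every three days: roughly 26% per day
after = daily_growth_rate(8)   # doubling every eight days: roughly 9% per day
print(round(before, 3), round(after, 3))
```

Under this assumption, moving from a three-day to an eight-day doubling time corresponds to the daily growth rate falling to about a third of its earlier value.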
India became the country second worst-affected by the COVID-19 pandemic on 6 September, when it had 4.2 million total cases. At time of writing, a large number of daily infections were being reported in densely populated states including West Bengal, Bihar, and Uttar Pradesh, which had been at the forefront of a massive migrant crisis during the lockdown. The crisis was driven by a mass exodus of migrant workers and informal labourers who were travelling back to their native states from large cities such as Mumbai and Delhi. Thousands of migrant workers, having lost their jobs and facing severe food insecurity (Bhowmick, 2020) because of lockdowns, flocked around train and bus stations, and many even walked hundreds of kilometres to return to their native places. Social distancing was not possible for those migrants because they travelled together in large groups, and initially, there was no coordinated response by governments to deal with the migrant crisis. Special trains and buses have operated to help transport stranded workers when instances of intense overcrowding have been reported. With India experiencing more than 90,000 new cases a day in early September 2020, the Indian Medical Association revealed that community transmission of COVID-19 was occurring in many parts of the country and that the pandemic was so widespread that authorities could no longer identify the chain of virus transmission; the virus was moving freely in the community (WHO, 2020c; Yu et al., 2020).

| Mobility effects by categories of places on the incidence of COVID-19
We have developed a GEE model with Poisson log-linear analysis with daily COVID-19 cases as the dependent variable. The analysis has been performed using SPSS, and the results of the model are shown in Tables 1 and 2. Table 1 presents the chi-square hypothesis test outcomes with 7-day lagged mobility indicators. Table 2 summarises the results for 14-day lagged mobility indicators.
We have found that mobility patterns related to retail and recreation and to grocery and pharmacy are significantly associated (at p < 0.01) with the incidence of COVID-19 (Table 1). The two mobility categories have positive coefficients (B) of 0.35 and 0.38, respectively, suggesting that an increase in the covariate is associated with an increased number of COVID-19 cases over time. The model, therefore, indicates that an increase in mobility to places like shopping centres, grocery markets, restaurants, farmers markets, and pharmacies is associated with an increased number of COVID-19 cases in India. We also find that the coefficients (B) of parks, transit stations, and residential-related mobility are not statistically significant (p > 0.05), which suggests that they have no detectable effect on changing COVID-19 cases over time. The test results show that although workplace-related mobility is significantly associated with the dependent variable (p < 0.05), the coefficient is negative. This result suggests that higher workplace-related mobility was associated with fewer reported COVID-19 cases over time.
The GEE Poisson log-linear model results using 14-day lagged mobility indicators are presented in Table 2, and show an outcome very similar to that reported in Table 1, which used the 7-day lagged mobility indicators. Thus, we reconfirm the mobility effects discussed above. In fact, Table 2 shows a significantly stronger coefficient estimate (B = 0.044) than Table 1 for mobility related to grocery and pharmacy, affirming that visits to grocery markets, food warehouses, farmers markets, speciality food shops, pharmacies, and similar places appear to increase the chances of coronavirus infection. The use of 14-day lagged mobility indicators also has revealed a more substantial negative coefficient value (B = −0.065) for workplace-related travel, highlighting that it poses the lowest risk of transmission of the virus among the six place categories. Detailed SPSS output results are presented in Tables S1 and S2, which are supporting information files.
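Because the model uses a log link, a coefficient B can be read as a multiplicative effect: a one-unit increase in the covariate multiplies the expected case count by exp(B). The sketch below applies this to the two 14-day lagged coefficients quoted above. It assumes that one unit of the covariate is one percentage point of mobility change from baseline, which is an assumption about the scaling and is not stated in the text; the function name is hypothetical.

```python
import math

def percent_change_in_expected_cases(coefficient, mobility_change=1.0):
    """Percentage change in expected cases for a given change in the covariate
    under a log link: 100 * (exp(coefficient * change) - 1)."""
    return 100.0 * (math.exp(coefficient * mobility_change) - 1.0)

grocery = percent_change_in_expected_cases(0.044)     # 14-day lag, grocery and pharmacy
workplace = percent_change_in_expected_cases(-0.065)  # 14-day lag, workplaces
print(round(grocery, 1), round(workplace, 1))
```

Under that scaling assumption, the grocery and pharmacy coefficient corresponds to roughly 4.5% more expected cases per unit increase in mobility, and the workplace coefficient to roughly 6.3% fewer.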

| DISCUSSION
This study has outlined a novel approach to study the relationship between daily mobility and the incidence of COVID-19 using the GEE method. Our statistical model accounts for the within-subject correlation of responses on dependent variables to produce unbiased parameter estimates for case count data that were not normally distributed. However, there may be some limitations or biases in the data used for analysis, and the results should be interpreted in light of those limitations. The dependent variable consists of daily COVID-19 cases by regions. The availability of testing centres and the number of tests performed across states in India both vary significantly (Rafiq et al., 2020; Sarkar et al., 2020). India's testing rate was slightly above 70,000 per million population as of 11 September 2020 (the last date for which data are included in this analysis), which is much lower than in other COVID-19 worst-hit countries such as the United States or Brazil, where over 400,000 tests per million people have been conducted (WHO, 2020a). Hence, overall and across regions, the actual cases in India might be much higher than reported COVID-19 positive cases.
In addition, because of the late development or lack of testing equipment at the early stages of the pandemic, the accuracy of daily new cases might have been affected. Mobility indicators had a major limitation in that they included data only for people who use Google products on a smartphone (Aktay et al., 2020). Therefore, the data might not representatively capture all groups, such as older adults and those from low-income populations who might not be using smartphones while navigating through different places. The overall internet penetration across Indian states is comparatively lower than across developed regions (Praharaj et al., 2017). In 2018, 29% of India's total population accessed the internet from their mobile phones, whereas in the same year, 79.1% of people used the internet through smartphones in the United States (Statista, 2021).
The other limitation arises from the use of aggregated travel numbers, which do not reflect the frequency of travel per person, especially for those who travel to multiple places and multiple times per day. However, because of the limited accessibility of a large volume of daily travel data in countries such as India, the GPS-based open-source mobility statistics provide a robust understanding of COVID-19 cases and mobility patterns at the regional level.
Either way, the study has established that there was a strong relationship between human mobility patterns and the incidence of COVID-19 in India. Importantly, the results have also highlighted the point that the spread of the coronavirus is closely related to the frequency of short-distance travel for daily needs. The crowdedness of local public places such as local grocery and retail shops emerged as a likely key factor that drives human-to-human transmission of coronavirus. The local food and vegetable markets in Indian cities are highly unorganised places, and they lack space for proper social distancing. During the lockdown period, most states restricted the operating hours that applied to these markets, and vendors were only allowed to open for a few hours in the morning. These rules may have had an adverse impact, leading to overcrowding of these market areas because of the influx of a large number of people seeking essential food items in a limited period. Furthermore, these crowded markets are often located within pockets of poverty, suggesting that the socioeconomic dynamics of localities might have influenced how the virus spread through communities. The findings support existing research (Hamidi et al., 2020; Hawkins et al., 2020), which has established that socioeconomic factors play an important role in COVID-19 prevalence in the United States. To corroborate the model results, we have also computed and mapped the mean monthly mobility levels from GCMR data for grocery and pharmacy-related trips. A comparison in Figure 2 of the mobility levels in April (during lockdown Phase 1) and June (after the lockdown ended on 31 May) highlights that human movements significantly increased in several parts of India after the lockdown was lifted.
FIGURE 2 Comparison of mean monthly mobility levels for grocery and pharmacy-related trips
People in states such as Bihar, Chhattisgarh, Kerala, Uttar Pradesh, West Bengal, and Assam were much more mobile, even during the lockdown in April. Most of these regions were at the forefront of the migrant crisis (Praharaj & Han, 2020), where labourers in large numbers returned from other states because of joblessness and hunger. Interestingly, these are the regions that show a manifold increase in new COVID-19 cases since the relaxing of lockdown restrictions, as shown in Figure 3, where we capture the progression of case counts across states over a 6-month period (14 March to 11 September). For example, Bihar added over 150,000 cases between 11 July and 11 September, whereas it had fewer than 1,000 total cases until 11 May. Similarly, in West Bengal, nearly 200,000 cases occurred in the 2-month period between 11 July and 11 September, while there were only 5,500 cases on 31 May. These findings suggest that cross-regional travel that is not directly measured by the model potentially has had a substantial effect on the COVID-19 incidence and distribution of cases in India.

| CONCLUSION
The findings strongly suggest that better crowd management and social distancing strategies in public places, local markets, and shopping districts are critical to check the spread of COVID-19, which can be driven by mass gatherings such as cultural and social events. Urgently needed is a more cohesive approach to managing the interstate and city-to-village movements by migrant workers to address the threat of community transmission during future health crises.
In addition, we have found that blanket lockdown restrictions might not be beneficial. We have also provided some evidence to suggest that economic activities such as commuting to workplaces, which have preventive measures such as social distancing and wearing of masks, are unlikely to increase the number of new cases, but retail and recreation-related trips require appropriate preventive management and stricter social distancing. We have also found that visiting low-density open spaces (public parks) does not significantly affect virus spread. Therefore, as a critical tool in the battle against COVID-19, social distancing and restriction on mass gathering (for example, informal street markets and social and cultural places) must continue to be rigorously implemented to control subsequent waves of infections (Badr et al., 2020; Cato et al., 2020; Lasry et al., 2020; Singhal, 2020; WHO, 2020a).
FIGURE 3 New COVID-19 cases (in thousands) reported in the most affected states over 6 months
Overall, the results from this study show that initial control measures and lockdown efforts aimed at human mobility reduction have had a noticeable impact on the occurrence of COVID-19 in India. However, given the limited resources available to enforce extensive lockdowns in a large country and given the adverse effects of lockdowns on the economy, authorities should adopt a place-based approach, focusing on promoting and implementing strict social distancing in the hotspot locations that are most vulnerable.
The results are original and, if confirmed in other case studies, will lay the groundwork for more effective containment of COVID-19 in India and other countries that are still experiencing a health emergency. The findings and lessons from this health geography modelling study should help researchers, policy-makers and others to design and evolve better COVID-19 responses during the ongoing crisis and to plan ahead in order to minimise the impacts of likely subsequent waves of the virus or for other pandemics.
Outcomes from our analysis thus build on the growing scholarship on health geography, deconstructing health from a holistic perspective of society and space and conceptualising the roles of place, location, and geography in health, well-being, and disease. Further research should use more disaggregated data to examine the mobility effects on the incidence of COVID-19 at different levels of the hierarchy of places within regions; for example, urban areas, inner cities, and peripheral rural areas. Such work should also measure the effects of cross-border population movements on COVID-19 outcomes using mobility data modelling for a shorter duration of the pandemic to verify if place-based relationships persist in ways that parallel those found in this study. Additionally, new work might consider socio-economic and demographic factors such as age, poverty rates, income, race, population density, and housing conditions to understand which individual measures, or combinations of them, alongside human mobility increase the vulnerability of places to occurrences of COVID-19 or similar pandemics.