Understanding the geographical burden of stunting in India: A regression‐decomposition analysis of district‐level data from 2015–16

Abstract India accounts for approximately one third of the world's total population of stunted preschoolers. Addressing global undernutrition, therefore, requires an understanding of the determinants of stunting across India's diverse states and districts. We created a district‐level aggregate data set from the recently released 2015–2016 National Family Health Survey, which covered 601,509 households in 640 districts. We used mapping and descriptive analyses to understand spatial differences in the distribution of stunting. We then used population‐weighted regressions to identify stunting determinants and regression‐based decompositions to explain differences between high‐ and low‐stunting districts across India. Stunting prevalence is high (38.4%) and varies considerably across districts (range: 12.4% to 65.1%), with 239 of the 640 districts having stunting levels above 40% and 202 having a prevalence of 30–40%. High‐stunting districts are heavily clustered in the north and centre of the country. Differences in stunting prevalence between low and high burden districts were explained by differences in women's low body mass index (19% of the difference), education (12%), children's adequate diet (9%), assets (7%), open defecation (7%), age at marriage (7%), antenatal care (6%), and household size (5%). The decomposition models explained 71% of the observed difference in stunting prevalence. Our findings emphasize the variability in stunting across India, reinforce the multifactorial determinants of stunting, and highlight that interdistrict differences in stunting are strongly explained by a multitude of economic, health, hygiene, and demographic factors. A nationwide focus for stunting prevention is required, while addressing critical determinants district‐by‐district to reduce inequalities and prevalence of childhood stunting.

economic costs of stunting, the Sustainable Development Goals explicitly include reductions in global stunting, and many countries have adopted the World Health Assembly target of achieving a 40% reduction in stunting by 2025.
Achieving this reduction on a global scale, however, requires rapid progress against stunting in India, which accounts for approximately one third of the world's total population of stunted preschoolers (De Onis, Blössner, & Borghi, 2011). Understanding the underlying determinants of stunting in India, which has long been characterized as having unusually high stunting rates relative to its economic development (Ramalingaswami, Jonson, & Rohde, 1997), has therefore been the subject of considerable investigation. An array of studies from many disciplines has drawn attention to the multifactorial nature of the problem of stunting in India. Explanations have addressed issues such as economic growth and agricultural production (Fenske, Burns, Hothorn, & Rehfuess, 2013; Headey, Chiu, & Kadiyala, 2012; Subramanyam, Kawachi, Berkman, & Subramanian, 2011), poor sanitation and open defecation (Fenske et al., 2013; Spears, Ghosh, & Cumming, 2013), discrimination against women and girls (Jayachandran & Pande, 2015), maternal undernutrition before and during pregnancy (Coffey, 2015), exceptionally poor infant and young child feeding practices (Menon, Bamezai, Subandoro, Ayoya, & Aguayo, 2015), and broader dietary deficiencies (Deaton & Dreze, 2008).
Some previous studies have shown that child undernutrition clusters in specific regions in developing countries (Fenn, Morris, & Frost, 2004; Gebreyesus, Mariam, Woldehanna, & Lindtjorn, 2016), and different types of spatial analysis studies have been conducted to identify geographical inequalities in child stunting (Fenn et al., 2004; Gebreyesus et al., 2016; Adekanmbi, Uthman, & Mudasiru, 2013; Alemu, Ahmed, Yalew, & Birhanu, 2016). However, much less has been done on explaining the factors that contribute to spatial variability in stunting (Di Cesare et al., 2015; Haile, Azage, Mola, & Rainey, 2016; Sharaf & Rashad, 2016; Srinivasan, Zanello, & Shankar, 2013), particularly in India. Although India is a highly populated country with a high burden of stunting, limited evidence exists from spatial analyses of the patterns of stunting across the country. To our knowledge, two previous assessments have been done: one at the state level (Cavatorta, Shankar, & Flores-Martinez, 2015) and another that utilized data from a subset of Indian districts (112 of 640) from a privately conducted survey to examine the role of sanitation (Spears et al., 2013). The paucity of analysis on the geography of stunting in India is problematic for two reasons. First, there are significant economic, social, and cultural differences both across and within states that might well explain the stark geographical disparities in nutrition previously observed in India (Cavatorta et al., 2015). Second, although Indian governance has traditionally been dominated by federal and state governments, the past 20 years have seen a major push to decentralize decision making to the district and subdistrict levels. Hence, a more granular assessment of the differences in stunting across India's 640 districts is essential for targeting and planning purposes.
In this study, we address this knowledge gap with an analysis of a new district-level data set created to address three research ques-

| Measures
Our outcome indicator of interest is the district-level stunting prevalence, which is the proportion of children 0-59 months of age whose height-for-age is more than two standard deviations below the median of the World Health Organization (WHO, 2006) growth reference (HAZ < −2).
The key determinants of stunting in India were selected based on conceptual frameworks from the previous literature, particularly UNICEF (1990) and the Lancet Nutrition Series.
Key messages
• India carries a high burden of child stunting, but lack of disaggregated stunting data at the district level has been a challenge for policy and program strategies in a decentralized governance system.
• This is the first study to use district-level data from a recently released national survey to highlight spatial differences in stunting across 640 districts in India.
• Our findings highlight the range of factors that explain differences between high and lower stunting burden districts.
• These results emphasize the importance of focused strategic planning and action to address multiple, and different, district-specific determinants of stunting across India.

The UNICEF framework distinguishes between immediate determinants (diets and disease burdens) and underlying determinants. The Lancet framework links these determinants to interventions, noting that nutrition-specific interventions address immediate determinants, whereas interventions and policies in nutrition-sensitive sectors address underlying determinants. In this paper, we distinguish between immediate determinants, nutrition-specific interventions, and underlying determinants.
The immediate determinants included indicators related to maternal undernutrition and child feeding practices. We used women's low body mass index (BMI < 18.5 kg/m²) as a proxy for maternal undernutrition. Indicators for infant and young child feeding included early initiation of breastfeeding (proportion of infants 0-23 months who were breastfed within 1 hr of birth), exclusive breastfeeding (the proportion of infants 0-5.9 months of age who were fed only breast milk), timely introduction of complementary foods (proportion of children 6-8.9 months of age who were introduced to solid and semi-solid foods), and adequate diet (proportion of children 6-23 months old who received four or more food groups and a minimum meal frequency). Some of these variables are only available for subsets of districts.
The nutrition-specific interventions included antenatal care (ANC) during the first trimester, adequate ANC (at least four ANC visits), and iron and folic acid (IFA) consumption (at least 100 IFA during the last pregnancy). Indicators related to infants' postnatal care included full immunization, vitamin A supplementation, and oral rehydration solution during diarrhoea. Although some of these are health care interventions, they are considered nutrition-specific interventions because they act as important platforms for delivery of nutrition-specific interventions such as micronutrient supplements and nutrition counselling and reach households in the first 1,000 days of life.
The underlying determinants examined included mother's education (≥10 years of schooling), age at marriage (at 18 years or older), sanitation, an asset index, and household size. For sanitation, we used water within premises (with the assumption that more access to water may facilitate more hygienic practices) and open defecation density (the number of people estimated to engage in open defecation per square kilometre). An asset index was constructed from district-level data, using the first principal component extracted from 19 different variables, including housing structure, house ownership, presence of a kitchen, access to electricity, clean cooking fuel, assets, and access to a bank account. We also included the proportion of scheduled caste/tribes (designated groups of historically disadvantaged people in India) in the district because it is an important dimension of inequality in India.
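To illustrate how an index can be extracted as the first principal component of a set of district-level variables, a minimal sketch follows. It assumes the inputs are standardized before extraction (the text does not say whether this was done), and the function name and toy inputs are hypothetical rather than the authors' code.

```python
import numpy as np

def first_principal_component_index(X):
    """Score each row (district) on the first principal component of X.

    X is an (n_districts, n_variables) array. Columns are standardized
    first, which is an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)      # z-score each variable
    corr = np.corrcoef(Z, rowvar=False)           # variable correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigenvalues in ascending order
    loadings = eigvecs[:, -1]                     # vector for the largest eigenvalue
    if loadings.sum() < 0:                        # sign is arbitrary; orient it
        loadings = -loadings                      # so that loadings sum positive
    scores = Z @ loadings                         # one index value per district
    explained_share = eigvals[-1] / eigvals.sum() # variance captured by the index
    return scores, loadings, explained_share
```

The scores are centred on zero, so they rank districts relative to one another rather than measuring wealth on an absolute scale.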

| Statistical analyses
Several complementary methods of analysis were applied to these data. We first estimated the absolute numbers of stunted children by multiplying the stunting prevalence by the estimated number of children 0-5 years of age from the Census of India. We mapped stunting prevalence by district to graphically analyse patterns of stunting across India. We tabulated stunting prevalence and absolute numbers of stunted children by states and by three major state groupings (northern states, southern states, and north-eastern and island states).
District stunting prevalence was then categorized into four bins based on current WHO cut-off values for public health significance (WHO, 2010): low prevalence (<20%), moderate prevalence (20-29.9%), high prevalence (30-39.9%), and very high prevalence (≥40%). The differences in determinants were tested for statistical significance across these different stunting burden categories, using analysis of variance and Bonferroni post hoc comparisons.
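The four-way binning above can be sketched as a small helper. The function name is hypothetical; the thresholds are the ones stated in the text.

```python
def who_stunting_category(prevalence_pct):
    """Map a district stunting prevalence (in percent) to the burden
    category described in the text."""
    if not 0 <= prevalence_pct <= 100:
        raise ValueError("prevalence must be a percentage between 0 and 100")
    if prevalence_pct < 20:
        return "low"
    if prevalence_pct < 30:
        return "moderate"
    if prevalence_pct < 40:
        return "high"
    return "very high"
```

For example, the national prevalence of 38.4% falls in the "high" category, and the highest district value of 65.1% falls in "very high".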
Second, to identify the determinants of stunting prevalence at the district level, we examined the bivariate associations between stunting and various determinants using scatter plots and tested for normality of the distributions using the Kolmogorov-Smirnov test. Three variables (4+ antenatal visits, open defecation density, and asset scores) were not normally distributed and showed non-linear bivariate relationships with stunting; hence, they were log-transformed. Multivariate linear regression was then used to examine the different factors associated with stunting. For this regression analysis, we dropped a few variables that were either highly correlated with another variable (e.g., ANC in the first trimester was highly correlated with 4+ ANC visits) or were only available for a subset of the districts (exclusive breastfeeding, timely introduction of foods, and oral rehydration solution during diarrhoea were only available for 425, 186, and 328 districts, respectively). Because we are primarily interested in explaining differences across districts rather than differences across states, all models included state-fixed effects, meaning that we are analysing within-state variation in stunting prevalence. We therefore report not only the total R², but also the within- and between-state coefficients of determination. All regression models were weighted by the population of children under 5 years because the district population sizes vary substantially. In terms of specifications, we first estimated bivariate models for each variable. We then estimated a multivariable model including only immediate determinants and nutrition-specific interventions and then estimated a full multivariable model that included underlying determinants.
In addition to gauging whether the coefficients on immediate determinants are robust to potential confounding factors, this approach allows us to investigate potential causal pathways by examining how coefficients on immediate determinants change as underlying determinants are added to the model (MacKinnon, Krull, & Lockwood, 2000).
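A population-weighted regression with state-fixed effects can be sketched with plain NumPy as weighted least squares on a design matrix that includes one dummy column per state. This is an illustrative sketch under our own assumptions (function name, argument layout, and the use of a full dummy set in place of an intercept), not the authors' estimation code, and it returns only slope coefficients, not standard errors or R².

```python
import numpy as np

def weighted_fe_regression(y, X, states, weights):
    """Weighted least squares of y on X with one dummy per state.

    y: district stunting prevalence; X: determinants, shape (n,) or (n, k);
    states: state label per district; weights: under-5 population per district.
    Returns the k slope coefficients on the determinants.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    w = np.asarray(weights, dtype=float)
    _, codes = np.unique(np.asarray(states), return_inverse=True)
    dummies = np.eye(codes.max() + 1)[codes]   # state-fixed effects (no intercept)
    design = np.hstack([X, dummies])
    sw = np.sqrt(w)                            # WLS = OLS on sqrt-weighted rows
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef[: X.shape[1]]
```

Because each state has its own dummy, the slopes are identified only from variation among districts within the same state, which matches the within-state interpretation given above.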
In the last step of our analysis, we applied a regression-decomposition to assess the ability of the various determinants described above to predict spatial patterns in stunting and differences between very high-burden and low-burden districts. This approach has been used widely in the literature to study mean outcome differences between groups (Jann, 2008), including differences in child malnutrition between geographical areas (Sharaf & Rashad, 2016; Spears et al., 2013; Srinivasan et al., 2013) and between populations measured at different points of time (Headey, Hoddinott, Ali, Tesfaye, & Dereje, 2015). This analysis effectively combines the analysis of differences in means of the explanatory variables (X) and regression estimates of the coefficients associated with these variables (β_X). Specifically, the "explained" difference between one spatial unit (District A) and another unit (District B) is the product of the difference in the mean of X across the two samples (X_A − X_B) and the coefficient of X from a pooled regression model (β_X). Intuitively, if a particular X variable has a large regression coefficient ("marginal effect") and a large difference in means over two districts, then this variable will play a large role in explaining the interdistrict difference in stunting. An attractive feature of the decomposition approach is that it gauges the ability of all the variables in the model to predict interdistrict differences, as well as the ability of the model as a whole to account for these differences.
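In symbols, the decomposition at means described here can be written roughly as follows. The notation is ours (S for stunting prevalence, bars for group means, k indexing the determinants) and is meant as a sketch of the idea, not the exact specification estimated in the paper.

```latex
\bar{S}_A - \bar{S}_B
  \;=\; \underbrace{\sum_{k} \beta_{X_k}\,\bigl(\bar{X}_{k,A} - \bar{X}_{k,B}\bigr)}_{\text{explained}}
  \;+\; \text{unexplained},
\qquad
\text{share}_k \;=\; \frac{\beta_{X_k}\,\bigl(\bar{X}_{k,A} - \bar{X}_{k,B}\bigr)}{\bar{S}_A - \bar{S}_B}.
```

The share attributed to each determinant is its explained contribution divided by the observed gap, and the shares of all determinants sum to the share explained by the model as a whole.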
In this analysis, we implemented a decomposition at means of the stunting differences between very high-burden (stunting > 40%) and low-burden districts (stunting < 20%) with the objective of understanding how high-burden districts can move towards much lower rates of stunting. We report the share of actual stunting accounted for by this decomposition, as well as the share unexplained by the model as a whole.
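The decomposition at means can be sketched in a few lines. The function and variable names below are hypothetical, and any numbers passed to it in an example would be illustrative inputs, not estimates from this study.

```python
def decompose_gap(coefs, means_high, means_low, stunting_high, stunting_low):
    """Decompose the stunting gap between two groups of districts at means.

    coefs: {variable: pooled regression coefficient}
    means_high / means_low: {variable: group mean of that variable}
    Returns (share per variable, total explained share, unexplained share).
    """
    gap = stunting_high - stunting_low
    if gap == 0:
        raise ValueError("groups have the same mean stunting; nothing to decompose")
    shares = {
        name: coefs[name] * (means_high[name] - means_low[name]) / gap
        for name in coefs
    }
    explained = sum(shares.values())
    return shares, explained, 1.0 - explained
```

A variable contributes a large share only when both its coefficient and its difference in group means are large, which is the intuition given in the preceding paragraph.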

| RESULTS
India achieved a sizeable improvement in stunting between 2006 and 2016, with a decline from 48.0% to 38.4% among children below 5 years (International Institute for Population Sciences, 2017). Despite this, stunting in India remains high and variable across districts, ranging between 12.4% and 65.1% (Figure 1). In total, there are more than 63 million children stunted in the country, which is more than one third of the global estimate for 2013 (De Onis & Branca, 2016). Stunting varies substantially across major regions and states, both in terms of prevalence and absolute numbers of stunted children (Table 1). The populous northern states of India contain approximately 52.6 million stunted children, accounting for more than 80% of stunted children in the country. Average district stunting prevalence for these states varies from 25.2% in Himachal Pradesh to 48.2% in Bihar and 46.3% in Uttar Pradesh. These latter two states are very large, containing 9.2 million and 14.3 million stunted children, respectively. In comparison, all of the Southern states collectively contain 8.1 million stunted children and the north-eastern and island states some 2.4 million. Even so, stunting prevalence in these other regions is relatively high in many instances, with one third of children in Andhra Pradesh and Karnataka estimated to be stunted, for example. Among reasonably populous states, only Kerala had an average district stunting prevalence below the 20% threshold.
Across all 640 districts in India, 239 districts have stunting prevalence in excess of 40% (very high), and 202 districts have stunting prevalence between 30% and 40% (high; Table 2). Only 29 districts have stunting levels between 10% and 20%, and most of these are in South India. Although there is considerable clustering of stunting within states, intrastate variance in district stunting prevalence is still reasonably high. Specifically, interstate variation explains 56% of the variation in district stunting prevalence (see Table 4 below); hence, 44% of variation in interdistrict stunting prevalence is accounted for by intrastate variation.
National averages and district variability for various determinants across stunting burden categories are presented in Table 3.
On average, nearly a quarter of women have low BMI. More than 40% of children were breastfed within an hour of birth, and only 55% were exclusively breastfed. Moreover, complementary feeding is of great concern, with less than 10% of children receiving an adequately diverse diet. Among underlying determinants, more than a third of women had at least 10 years of education, and two thirds of girls married after the age of 18. Open defecation is still prevalent in more than half of the population. Coverage is above 50% for several nutrition-specific interventions. More than half of the women received ANC in the first trimester or had at least four ANC visits, but only 30% of the women consumed at least 100 IFA during pregnancy. Coverage of full immunization and vitamin A supplementation was nearly 60%.
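The split of district-level variation into between-state and within-state parts can be illustrated with a one-way variance decomposition. This is a sketch under our own assumptions: the paper derives its 56% figure from state-fixed effects in a regression, and whether and how districts are weighted here is a choice of this example, not something stated in the text.

```python
import numpy as np

def between_group_share(values, groups, weights=None):
    """Share of (weighted) variance in `values` explained by group means.

    values: district stunting prevalence; groups: state label per district.
    Returns between-group sum of squares divided by total sum of squares.
    """
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    w = np.ones_like(values) if weights is None else np.asarray(weights, dtype=float)
    grand_mean = np.average(values, weights=w)
    total_ss = np.sum(w * (values - grand_mean) ** 2)
    between_ss = 0.0
    for g in np.unique(groups):
        mask = groups == g
        group_mean = np.average(values[mask], weights=w[mask])
        between_ss += w[mask].sum() * (group_mean - grand_mean) ** 2
    return between_ss / total_ss
```

One minus the returned value is the within-state share, which corresponds to the intrastate variation discussed above.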
There was high interdistrict variability for most determinants across stunting burden category districts (Table 3). The most inequity among districts is observed for women's low BMI, women's education (≥10 years), asset score, ANC, and IFA consumption, where the high-burden stunting districts have levels that are 2-3 times lower than the low-burden districts, and gaps range from 16% to 40%.
Bivariate analysis indicates that stunting is associated with a wide range of immediate and underlying determinants (Table 4). The strongest associations were observed for asset scores (β = −10.6 and −16.6 for Quintile 4 and 5, respectively) and low BMI in women (β = 0.73, 95% CI [0.66, 0.79]). The districts with higher coverage of nutrition-specific interventions had lower prevalence of stunting (β ranged from −0.27 to −0.17).
In the partial multivariable regression analyses (
The variables selected in the full regression model were used in the decomposition analysis to estimate the extent to which differences in these factors explained differences in stunting prevalence across very high- and low-burden districts. Overall, the decomposition models performed well, explaining 71% of the observed differences in stunting prevalence between high- and low-burden districts (Figure 2). This explained share is accounted for by the differences in women's low BMI (19%), women's education (12%), adequate diet among children (9%), asset scores (7%), open defecation (7%), age at marriage (7%), ANC (6%), and household size (5%). Decomposition analyses comparing low- and medium-burden districts found similar results (results not shown).

FIGURE 2 Factors contributing to the difference in stunting prevalence between very high-burden (stunting > 40%) and low-burden districts (stunting < 20%). ANC = antenatal care; BMI = body mass index; HH = household

Note. All models included state-fixed effects and are weighted by the number of children 0-5 years in each district. ANC = antenatal care; BMI = body mass index; IFA = iron and folic acid.
a Partial model included immediate determinants and nutrition-specific interventions.
b Full model included all factors such as immediate and underlying determinants as well as nutrition-specific interventions.
In parallel with global attention and political commitment to reducing undernutrition, India has made considerable progress in reducing child malnutrition in the last decade. However, stunting prevalence remains high and extremely variable across districts and particularly high in populous northern states. High-stunting districts are characterized by lower levels of immediate and underlying determinants and low levels of nutrition-specific intervention coverage. The key factors associated with stunting were women's BMI, women's education, women's age at marriage, coverage of ANC, adequacy of child diets, household assets, and open defecation. These results suggest that if very high-stunting districts could raise these social, economic, and dietary factors to the levels seen in low-stunting districts, the model predicts that about 71% of the gap with low-stunting districts would close.
Our analysis has several unique strengths. Previous studies have applied decomposition techniques to understand stunting differences between poor-performing states and a single high-performing state (Tamil Nadu) using child-level data from NFHS-III (Cavatorta et al., 2015) and to understand changes in India's national stunting prevalence between NFHS-I (1992/1993) and NFHS-III (2005-2006; Headey, Hoddinott, & Park, 2016). Our study uses the most recent data, is comprehensive in examining spatial variation across the entire country, and is geographically granular in that it focuses on interdistrict variation in stunting in a country with tremendous spatial variation in nutrition and its proximate and underlying determinants. Geographical clustering of stunting in India is pronounced, as is the clustering of various immediate and underlying determinants and intervention coverage. These determinants account for around three quarters of the differences in stunting prevalence between the very high and low prevalence districts. A geographical lens, therefore, highlights spatial dimensions of undernutrition that might be overlooked in child-level analyses. These findings also offer insights on the kinds of gaps that must be closed with equity-enhancing, geographically targeted policy instruments and high-quality implementation of these instruments.
Our analysis, therefore, provides timely evidence for policymakers to tackle stunting in India, in a context of India's commitments to the global nutrition targets and the Sustainable Development Goals.
We acknowledge some of the limitations of this analysis. The cross-sectional and geographically aggregated nature of our data means that our analysis is ecological in nature and could still be hampered by confounding factors. The richer unit-level data from the NFHS-4, which at the time of writing had still not been released, will permit a more extensive analysis. We were unable to examine changes in linear growth outcomes by different age categories, which can provide important clues about the aetiology of stunting. From a policy perspective, however, there is significant merit in understanding district-level variation, because the district is an increasingly important unit in India's ongoing decentralization process and a district-level focus is central to India's newly launched National Nutrition Strategy (NITI Aayog, 2017). Moreover, despite the more ecological nature of our analysis, our findings are well-aligned with those from many other studies that have examined the determinants of stunting in India, using both unit-level and district-level data sets. For instance, almost all previous analyses of stunting determinants find strong associations with mother's education (Alderman & Headey, 2016). Several studies also link stunting in India to monotonous diets (Menon et al., 2015) and poor sanitation (Spears et al., 2013), even after controlling for wealth and parental education. Other studies have also found ANC visits for the last birth to be strongly associated with stunting in South Asia. A final limitation of note is that we were not able to examine relationships between all aspects of infant and young child feeding practices and child stunting because they are age-specific indicators (exclusive breastfeeding (EBF) for children 0-5 months and timely introduction of foods for children 6-8 months), and data were only available for a subset of districts.
A focus on addressing women's nutrition emerges as a key priority area in our analyses, similar to other studies of malnutrition in South Asia (Coffey, 2015). We find, for instance, that low women's BMI explained almost a fifth of the difference between high-and low-burden stunting districts, corroborating results from previous studies that maternal undernutrition before and during pregnancy is a major determinant of poor fetal growth and child stunting (Black et al., 2013).
Accounting for one fifth of the global population with 42% of low BMI prepregnant women (Coffey, 2015), India faces a critical challenge because preconception undernutrition among women can influence birth outcomes and child growth through influencing early placental and embryonic development, epigenetic effects, and competition for nutrients between mother and baby (King, 2016).
Variables reflecting women's well-being (BMI, education, age at marriage, and access to ANC) together explain close to half the difference between high- and low-stunting districts. Discrimination against women is a widely suspected cause of India's unusually high rate of stunting, including small size at birth and low birth weight (Coffey, 2015). Although the variables in our analysis do not capture gender discrimination in terms of man-woman or boy-girl differences, the indicators used reflect several investments in girls and women: education levels, age at marriage, maternal nutrition, and use of or access to ANC services. These indicators of investments in girls and women are likely to have both biological and social pathways to better nutrition for children. For example, early marriage, and consequently early child bearing, is more likely to lead to preterm births or small for gestational age births and perhaps also higher fertility over the life course (Branca, Piwoz, Schultink, & Sullivan, 2015; Temmerman, Khosla, Bhutta, & Bustreo, 2015).
Our study has significant policy implications. The high burden of stunting across most districts in India implies that strategies to address stunting must be rolled out across most of India, and a narrow spatial targeting is unlikely to deliver radical reductions in stunting. Moreover, the fact that 44% of interdistrict variation in stunting prevalence is explained by intrastate variation suggests that decentralization to the district level is critical. In addition to the intrastate variation, interstate differences were also prominent (56% of the variation in district stunting was explained by state-fixed effects). This is likely due to vast differences across states in administrative and governance approaches, implementation capabilities, and economic and sociocultural differences.
The regression model used in this study has significant predictive power, suggesting that the variables used in this analysis could be used for monitoring multisectoral initiatives to reduce stunting. These initiatives should prioritize improving the socioeconomic, nutritional, and health status of girls and women (their nutrition, education, age at marriage, and access to care during and after pregnancy) and improvements in sanitation and overall socioeconomic status of the household. We note, however, that many of these factors are rooted in social and cultural contexts that will require more holistic societal changes than policy instruments alone can deliver.
In conclusion, our findings reiterate the complex and multifaceted nature of the burden of stunting in India. The granular district-focused analysis in this study, a first for India, highlights the concentration of this burden in the northern and eastern regions and the close associations between stunting and a wide range of nutrition-specific and nutrition-sensitive factors. The most important policy implication of our analysis is the need for a stunting prevention focus that is nationwide but focused on addressing critical determinants district-by-district to reduce inequalities and the prevalence of childhood stunting.

ACKNOWLEDGMENTS
We thank Lan Mai Tran for data preparation, analytic support, and preparation of figures and tables. We thank Sneha Mani for preparation of maps.