Eligibility or use? Disentangling the sources of horizontal inequity in home care receipt in the Netherlands

Abstract We study horizontal inequity in home care use in the Netherlands, where a social insurance scheme aims to allocate long‐term care according to care needs. Whether the system reaches its goal depends not only on whether eligible individuals have equal access to care but also on whether entitlements for care reflect needs, irrespective of socioeconomic status and other characteristics. We assess and decompose total inequity into inequity in (i) entitlements for home care and (ii) the conversion of these entitlements into actual use. This distinction is original and important, because inequity calls for different policy responses depending on the stage at which it arises. Linking survey and administrative data on the 65 and older, we find higher income elderly to receive less home care than poorer elderly with similar needs. Although lower income elderly tend to make greater use of their entitlements, need‐standardized entitlements are similar across income, education, and wealth levels. However, both use and entitlements vary by origin and place of residence. The Dutch need assessment seems effective at restricting socioeconomic inequity in home care use but may not fully prevent inequity along other dimensions. Low financial barriers and universal eligibility rules may help achieve equity in access but are not sufficient conditions.

Appendix A: Implementation of the Lasso procedure

A.1. Selection of need variables prior to the baseline estimation of HI for Analyses A and B
The computation of the horizontal inequity indexes in Analyses A and B relies on linear predictions from regression analyses (Equations (1.A) and (1.B)). The statistical precision of the estimates of HI will in turn depend on the statistical precision of the regression estimates. Our dataset contains a very large number of (potentially highly correlated) need variables that may be included as regressors in Equations (1.A) and (1.B). In order to maximize statistical efficiency, we implement a data-driven method to select a subset of need variables to be included in the regression analyses.
We implement a Lasso procedure similar to the one performed by Bakx et al. (2018) [4], following the methodology proposed by Tibshirani (1996). The Lasso selects covariates in a regression model so that the model balances low bias against high statistical efficiency. It is a shrinkage procedure: the coefficients of covariates that are not, or only weakly, associated with the outcome variable are shrunk towards zero, and covariates whose coefficient reaches zero are eliminated from the regression analysis. It relies on the minimization of an objective function that penalizes the absolute size of the regression coefficients. As the penalizing factor increases, the coefficients of all covariates shrink further towards zero; as it decreases, the procedure is more lenient towards covariates with a small partial correlation with the outcome [5]. We apply the Lasso procedure separately to the following variants of Equations (1.A) and (1.B):

$$y_i = \tilde{\alpha}_0^A + \tilde{x}_i'\tilde{\beta}^A + \tilde{\varepsilon}_i^A \qquad \text{(A.A.1)}$$

$$e_i = \tilde{\alpha}_0^B + \tilde{x}_i'\tilde{\beta}^B + \tilde{\varepsilon}_i^B \qquad \text{(A.B.1)}$$

where $y_i$ is home care use and $e_i$ is home care entitlements for individual $i$, and $\tilde{x}_i$ is a vector containing the full list of the need variables, prior to performing the Lasso selection, for individual $i$.

[4] A more detailed description of the procedure can be found in Appendix C of Bakx et al. (2018).
[5] See Bakx et al. (2018) for additional details. We follow their choice of using the objective function and penalization proposed in Belloni et al. (2012). For all three analyses, we have performed the Lasso selection without taking into account the survey structure. To the best of our knowledge, no specific adjustment of the standard Lasso procedure to survey data with a complex sampling design has been well established so far.
The Lasso procedures applied to Equations (A.A.1) and (A.B.1) yield two sub-lists of need variables (two sub-vectors of $\tilde{x}_i$). As there is no theoretical ground for the legitimate needs for home care entitlements and the needs for home care use to differ, we define our final list of need variables as the union of the variables selected by the Lasso procedures in Analysis A and Analysis B. Table A.I provides the list of the variables that were included as need factors in the Lasso procedures. It also indicates which of these variables were selected and included in the regression analyses implemented to derive the horizontal inequity indexes.
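The selection-then-union logic described above can be illustrated with a minimal sketch. It is not the procedure used in the paper: it runs a plain Lasso with a fixed, arbitrarily chosen penalty (not the data-driven penalization of Belloni et al. (2012)), solved by coordinate descent, on simulated data. The function names, the five simulated need variables and the penalty value are all assumptions made for illustration.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value towards zero by penalty; return 0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_select(columns, outcome, penalty, n_sweeps=50):
    """Return the indices of the columns with a non-zero Lasso coefficient.

    Minimizes (1/2n) * sum of squared residuals + penalty * sum |beta_j|
    by cyclic coordinate descent on standardized columns.
    """
    n = len(outcome)
    y_mean = sum(outcome) / n
    residual = [v - y_mean for v in outcome]  # centring absorbs the intercept
    standardized = []
    for column in columns:
        mean = sum(column) / n
        sd = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
        standardized.append([(v - mean) / sd for v in column])
    beta = [0.0] * len(columns)
    for _ in range(n_sweeps):
        for j, x in enumerate(standardized):
            # Covariance of x_j with the partial residual that excludes x_j.
            rho = sum(a * b for a, b in zip(x, residual)) / n + beta[j]
            updated = soft_threshold(rho, penalty)
            change = updated - beta[j]
            if change != 0.0:
                residual = [r - change * a for r, a in zip(residual, x)]
                beta[j] = updated
    return {j for j, b in enumerate(beta) if b != 0.0}


# Simulated data (assumption): five hypothetical need variables.
rng = random.Random(0)
n_obs = 500
needs = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(5)]
# Use depends on variables 0 and 1; entitlements depend on variables 1 and 2.
use = [needs[0][i] + needs[1][i] + rng.gauss(0, 0.5) for i in range(n_obs)]
entitlements = [needs[1][i] + needs[2][i] + rng.gauss(0, 0.5) for i in range(n_obs)]

selected_a = lasso_select(needs, use, penalty=0.2)           # variant (A.A.1)
selected_b = lasso_select(needs, entitlements, penalty=0.2)  # variant (A.B.1)
final_needs = selected_a | selected_b                        # union of both lists
print(sorted(final_needs))
```

Taking the union means that a need variable selected for only one of the two outcomes still enters both regressions, so that use and entitlements are standardized on the same set of needs.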

A.2. Selection of need variables and of non-need factors prior to the baseline estimation of HI for Analyses A and B
In a robustness check, we use a Lasso to select both a subset of need variables and a subset of non-need factors. We apply the Lasso procedure separately to the corresponding variants of Equations (1.A) and (1.B).

[Table notes; the table bodies were not recovered. Categories: (1) no child, (2) co-resides, (3) lives in the same municipality, (4) lives in a different municipality. Notes: a See Online Appendix C, Section C.2 for explanations regarding the grouping of the regions. The estimations correspond to Equation (1.A) for Columns (1) and (2) and to Equation (1.B) for Columns (3) and (4) (defined in Section 3). The estimations take into account the clustered design of the sample (335 primary sampling units). When performing the estimations, a few observations for which the primary sampling unit is missing have to be dropped.]

C.1. Standard error of the concentration index of LTC use and LTC entitlements
Inference for a concentration index estimated from microdata was developed by Kakwani et al. (1997). The standard error of the concentration index of a variable can be obtained using a "convenient regression" and can accommodate survey weights.
For outcome $y_i$, denoting by $R_i$ the weighted fractional rank of individual $i$ in the distribution of SES, we can derive an estimate of the concentration index by estimating the following regression by OLS:

$$y_i = \alpha + \beta R_i + u_i$$

The estimate of $C(y)$ is given by:

$$\hat{C}(y) = \frac{2\sigma_R^2\,\hat{\beta}}{\hat{\mu}}$$

where $\sigma_R^2$ is the variance of the (weighted) fractional rank and $\hat{\mu}$ is an estimate of the population average of $y$. Given that the weighted OLS prediction of $y$ evaluated at the mean rank is equal to the population average of $y$, and that the mean of the weighted fractional rank is 0.5, we can rewrite:

$$\hat{C}(y) = \frac{2\sigma_R^2\,\hat{\beta}}{\hat{\alpha} + 0.5\,\hat{\beta}}$$

Given that the estimate of $C(y)$ has been rewritten as a function of the regression coefficients, a standard error can be derived by applying the delta method. In Stata, we use the nlcom command, which supports probability weights. In addition, this command allows us to take into account the clustered sampling design of the Dutch Health Monitor survey, by correcting the standard error for within-cluster correlation. Using this approach, we can derive standard errors associated with the concentration of LTC use in Analysis A, the concentration of LTC needs in Analysis B and the concentration of LTC use among the elderly eligible for home care in Analysis C. However, in our tables of results, we have provided confidence intervals derived using an alternative method, namely a cluster-bootstrap approach (see below). Inference based on the delta method (available on request) and inference based on the bootstrap approach lead to the same conclusions regarding the statistical significance of the concentration indices of use and entitlements at the conventional 1%, 5% and 10% levels.
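To make the convenient-regression formula concrete, the sketch below computes a weighted concentration index in plain Python and checks that the two expressions (dividing by the weighted mean, or by the intercept plus half the slope) coincide. The data, the weights and the function names are hypothetical, and the sketch covers the point estimate only, not the delta-method standard error.

```python
def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_fractional_rank(ses, weights):
    """Rank in the SES distribution: cumulated weight share up to the midpoint
    of each individual's own weight (assumes no ties in ses)."""
    total = sum(weights)
    order = sorted(range(len(ses)), key=lambda i: ses[i])
    rank = [0.0] * len(ses)
    cumulated = 0.0
    for i in order:
        share = weights[i] / total
        rank[i] = cumulated + share / 2.0
        cumulated += share
    return rank


def concentration_index(outcome, ses, weights):
    """Return the index computed two ways: with the weighted mean of the outcome
    in the denominator, and with alpha + 0.5 * beta in the denominator."""
    rank = weighted_fractional_rank(ses, weights)
    y_bar = weighted_mean(outcome, weights)
    r_bar = weighted_mean(rank, weights)
    var_r = weighted_mean([(r - r_bar) ** 2 for r in rank], weights)
    cov_yr = weighted_mean(
        [(y - y_bar) * (r - r_bar) for y, r in zip(outcome, rank)], weights
    )
    beta = cov_yr / var_r            # weighted OLS slope of y on the rank
    alpha = y_bar - beta * r_bar     # weighted OLS intercept
    from_mean = 2.0 * var_r * beta / y_bar
    from_coefficients = 2.0 * var_r * beta / (alpha + 0.5 * beta)
    return from_mean, from_coefficients


# Hypothetical data: weekly hours of home care fall as income rises.
income = [12.0, 18.0, 25.0, 40.0]
hours = [4.0, 3.0, 2.0, 1.0]
survey_weights = [2.0, 1.0, 1.0, 2.0]
ci_mean, ci_coef = concentration_index(hours, income, survey_weights)
print(ci_mean, ci_coef)
```

A negative value indicates that the outcome is concentrated among the lower-ranked (poorer) individuals.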

C.2. Standard error of the concentration of needs, the horizontal inequity index and the contributions of non-need factors
One way of obtaining standard errors for the concentration index of the need-predicted outcome and for the horizontal inequity index is to apply the convenient regression approach described above to the need-predicted outcome and the need-standardized outcome respectively, instead of the raw outcome. However, this approach would not take into account the fact that there is sampling variability in the need-predicted and need-standardized outcomes for Analyses A and B, as they are derived from the OLS regression predictions. We therefore conduct inference based on a bootstrap procedure. The standard assumption for a bootstrap procedure is that the observed sample is a random sample of the underlying population and that observations within the sample are independent (van Doorslaer et al., 2004). As the Health Monitor survey is based on a two-stage sampling design with unequal sampling probability, the standard assumption does not hold. We therefore implement a cluster bootstrap while taking into account the unequal sampling probability design.
We proceed in three steps:
a. We draw a random subsample b (with replacement) of the 335 primary sampling units. [...]

We replicate the procedure B = 1,000 times, so that we obtain B different samples, with B values of our statistics of interest (e.g. the concentration index). Finally, for each statistic of interest S, we derive the distribution of its value over the b = 1, …, B samples. The upper bounds of the 0.5% and of the 99.5% smallest values (respectively of the 2.5% and of the 97.5%, and of the 5% and of the 95%) provide a bootstrapped 99% (respectively 95% and 90%) confidence interval for statistic S.
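A generic cluster bootstrap with percentile confidence intervals can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of the paper: whole clusters are resampled with replacement and the original survey weights are kept as they are, whereas the reweighting step used by the authors is not reproduced here. The simulated clusters, the statistic (a weighted mean) and all names are hypothetical.

```python
import random


def weighted_mean(rows):
    """Weighted mean of the outcome; each row is (outcome, survey_weight)."""
    return sum(y * w for y, w in rows) / sum(w for _, w in rows)


def cluster_bootstrap_ci(clusters, statistic, n_reps=1000, level=0.95, seed=0):
    """Percentile confidence interval from resampling whole clusters."""
    rng = random.Random(seed)
    cluster_ids = sorted(clusters)
    draws = []
    for _ in range(n_reps):
        # Draw as many clusters as in the sample, with replacement.
        sampled_ids = [rng.choice(cluster_ids) for _ in cluster_ids]
        rows = [row for cid in sampled_ids for row in clusters[cid]]
        draws.append(statistic(rows))
    draws.sort()
    tail = (1.0 - level) / 2.0
    # Upper bound of the (tail * 100)% and ((1 - tail) * 100)% smallest values.
    lower_index = max(round(tail * n_reps) - 1, 0)
    upper_index = round((1.0 - tail) * n_reps) - 1
    return draws[lower_index], draws[upper_index]


# Simulated data (assumption): 40 clusters with a common cluster-level shock.
data_rng = random.Random(1)
clusters = {}
for cluster_id in range(40):
    shock = data_rng.gauss(0, 1)
    size = data_rng.randint(5, 15)
    clusters[cluster_id] = [
        (10 + shock + data_rng.gauss(0, 2), data_rng.uniform(0.5, 2.0))
        for _ in range(size)
    ]

lower, upper = cluster_bootstrap_ci(clusters, weighted_mean)
print(lower, upper)
```

Because entire clusters are drawn, the number of observations varies from one bootstrap sample to the next, and the within-cluster correlation is preserved in each replicate.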
Note that the Lasso procedure itself is not bootstrapped, as this would result in excessive computational time. In addition, because of the bootstrap analysis, we have to regroup the LTC purchasing regions for a technical reason: several of the 32 regions are relatively small (fewer than 2,500 elderly from our entire study sample, and far fewer when we focus on the subsample eligible for home care in Analysis C). This implies that some of the bootstrap subsamples do not include any individual from these small regions, and estimates of the coefficients of the LTC purchasing regions cannot be systematically derived.

[10] For this (bootstrap) sample b, the sample size depends on the number of observations in the selected clusters.
[11] This approach is not ideal, however, as it is not possible to take into account the individuals who are not selected in sample b in the reweighting process. Another possible approach would be to expand the sample by multiplying each observation by its survey weight, such that we obtain a final sample of size N' in which all observations have a probability sampling weight equal to 1. However, given the large sample size and the fairly large dispersion of weights, expansion would result in a dataset too large to be handled with Stata®.
To circumvent this issue, we have grouped the 32 regions into 8 groups in a data-driven way.
We have run a regression of home care use on need variables and the non-need factors (Equation (1.A), Section 4). We have retrieved the estimates of the coefficients of the 32 regions (the coefficient for the reference region being set to 0). We have then ordered the 32 regions from the one with the lowest coefficient to the one with the highest. Finally, we have divided the regions into 8 groups of 4, from the lowest-use group to the highest-use group.

Appendix D

The aim of this Appendix is twofold. First, we aim at motivating the 'conservative' approach to the horizontal inequity index for Analyses A and B that we adopt in our fourth robustness check (provided in Online Appendix E). This approach relies on a decomposition of income-related inequalities proposed by Wagstaff et al. (2003). Second, we show how the standardization for entitlements that we propose in Analysis C can be made consistent with a decomposition of HI into the contributions of non-need factors, which in turn motivates the estimation of a constrained linear regression analysis when testing for the sources of horizontal inequity at the stage of the conversion of entitlements into care use (Section 5.3 and Table B.II in Online Appendix B).

D.1. Decomposition of $HI_A$ and $HI_B$
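Following Wagstaff et al. (2003), we can decompose $HI_A$ into:

$$HI_A = C(y) - \sum_j \frac{\hat{\beta}_j^A \bar{x}_j}{\bar{y}}\, C(x_j) = \sum_k \frac{\hat{\gamma}_k^A \bar{z}_k}{\bar{y}}\, C(z_k) + \frac{GC(\hat{\varepsilon}^A)}{\bar{y}}$$

where $\bar{x}_j$ (resp. $\bar{z}_k$) is the population average of the need factor $x_j$ (resp. non-need factor $z_k$), $C(x_j)$ (resp. $C(z_k)$) is the concentration index of variable $x_j$ (resp. $z_k$), and $GC(\hat{\varepsilon}^A)$ is the generalized concentration index of the residuals. Similarly, for Analysis B, we can write:

$$HI_B = C(e) - \sum_j \frac{\hat{\beta}_j^B \bar{x}_j}{\bar{e}}\, C(x_j) = \sum_k \frac{\hat{\gamma}_k^B \bar{z}_k}{\bar{e}}\, C(z_k) + \frac{GC(\hat{\varepsilon}^B)}{\bar{e}}$$

The estimates of the corresponding 'conservative' indexes for Analyses A and B, which retain only the contributions of the non-need factors, are provided as a robustness check ('Check 4') in Online Appendix E, Table E.I.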
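The decomposition of Wagstaff et al. (2003) is an exact accounting identity once the regression is linear, which can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are simulated, there is a single need variable and a single non-need factor, no survey weights are used, and all variable and function names are hypothetical. It verifies that the concentration index of use equals the need contribution plus the non-need contribution plus the rescaled generalized concentration index of the residuals, so that HI (use minus the need contribution) equals the last two terms.

```python
import random


def mean(values):
    return sum(values) / len(values)


def fractional_rank(ses):
    """Unweighted fractional rank (assumes no ties); its mean is exactly 0.5."""
    n = len(ses)
    order = sorted(range(n), key=lambda i: ses[i])
    rank = [0.0] * n
    for position, i in enumerate(order):
        rank[i] = (position + 0.5) / n
    return rank


def generalized_ci(values, rank):
    """Generalized concentration index: twice the covariance with the rank."""
    m = mean(values)
    return 2.0 * mean([(v - m) * (r - 0.5) for v, r in zip(values, rank)])


def concentration_index(values, rank):
    return generalized_ci(values, rank) / mean(values)


def ols_two_regressors(y, x, z):
    """OLS of y on a constant, x and z; returns (alpha, beta, gamma)."""
    my, mx, mz = mean(y), mean(x), mean(z)
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    szy = sum((b - mz) * (c - my) for b, c in zip(z, y))
    det = sxx * szz - sxz ** 2
    beta = (sxy * szz - szy * sxz) / det
    gamma = (szy * sxx - sxy * sxz) / det
    return my - beta * mx - gamma * mz, beta, gamma


# Simulated data (assumption): needs fall and the non-need factor rises with income.
rng = random.Random(0)
income = [rng.uniform(10, 60) for _ in range(300)]
need = [5.0 - 0.05 * inc + rng.uniform(0, 2) for inc in income]
non_need = [1.0 + 0.02 * inc + rng.uniform(0, 1) for inc in income]
use = [1.0 + 0.8 * a + 0.5 * b + rng.gauss(0, 0.5) for a, b in zip(need, non_need)]

rank = fractional_rank(income)
alpha, beta, gamma = ols_two_regressors(use, need, non_need)
residual = [u - alpha - beta * a - gamma * b for u, a, b in zip(use, need, non_need)]

need_part = beta * mean(need) / mean(use) * concentration_index(need, rank)
non_need_part = gamma * mean(non_need) / mean(use) * concentration_index(non_need, rank)
residual_part = generalized_ci(residual, rank) / mean(use)

ci_use = concentration_index(use, rank)
hi_standard = ci_use - need_part   # non-need contributions plus residual term
hi_conservative = non_need_part    # drops the residual term
print(ci_use, hi_standard, hi_conservative)
```

The gap between the two indexes is exactly the residual term, which is why the choice between them amounts to an assumption about whether unobserved determinants of use are legitimate.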

D.2. Decomposition of $HI_C$
The derivation of $HI_C$ differs from what we do in Analyses A and B, to the extent that in Analysis C use is standardized on a readily available measure of entitlements (we do not need to rely on a regression-based derivation of needs). Still, we can use a similar framework to decompose $HI_C$ into the contributions of non-need factors; in addition, a regression can be used to test for any partial correlation between a non-need factor and home care use when entitlements are controlled for. Regression estimates can then be used to better understand what drives the differences in care use standardized for entitlements across population groups (i.e. horizontal inequity) that we report in Section 5.3.

[12] According to Wagstaff et al. (2003), $\frac{\hat{\gamma}_k \bar{z}_k}{\bar{y}}\hat{C}(z_k)$ can be thought of as the (non-causal) contribution of non-need factor $z_k$ to income-related inequality in care use; as such, HI can be expressed as the sum of the contributions of non-need factors plus the generalized residual. Our empirical analysis does not rely on this interpretation, which has been discussed in several recent developments of the equity literature (see Erreygers & Kessels (2013) and Heckley et al. (2016)). Whatever the interpretation of $\sum_{k=1}^{K} \frac{\hat{\gamma}_k \bar{z}_k}{\bar{y}}\hat{C}(z_k)$, it remains true that the horizontal inequity index, in its conventional definition, equals this term plus the (rescaled) concentration index of the residuals.
For Analysis C, we estimate the following constrained linear regression on the subsample of the elderly eligible for some home care:

$$y_i = \alpha + \frac{\bar{y}}{\bar{e}}\, e_i + \sum_k \gamma_k z_{k,i} + \varepsilon_i$$

The coefficient of entitlements is constrained to be equal to the ratio of population averages ($\bar{y}/\bar{e}$). This ensures that we can use the vector of estimates of parameters and residuals to decompose income-related inequality in care use as:

$$C(y) = C(e) + \sum_k \frac{\hat{\gamma}_k \bar{z}_k}{\bar{y}}\, C(z_k) + \frac{GC(\hat{\varepsilon})}{\bar{y}}$$

where $\bar{z}_k$ is the average of non-need factor $z_k$ in the population eligible for home care.
Consistently, $HI_C$ is simply equal to the difference between the concentration index of use and the concentration index of entitlements among the individuals eligible for home care ($C(y) - C(e)$).
In the case of Analysis C, there is no conservative approach to inequity analogous to the one considered for Analyses A and B: when assessing inequity in the conversion of entitlements into actual care use, we standardize use for entitlements, which are perfectly observed. In this setting, the error term necessarily captures unobserved non-need factors, and $C(\hat{\varepsilon})$ reflects the concentration of these unobserved illegitimate determinants of home care use.
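A short derivation may help to see why constraining the coefficient of entitlements delivers this result. The notation below is a shorthand introduced for this sketch and is an assumption, not necessarily the notation of the main text: $y$ is home care use, $e$ entitlements, $z_k$ the non-need factors, $\delta$ the coefficient of entitlements, and $GC$ the generalized concentration index. All averages and indices are taken over the population eligible for home care.

```latex
\begin{align*}
y_i &= \alpha + \delta e_i + \sum_k \gamma_k z_{k,i} + \varepsilon_i,
      \qquad \delta = \frac{\bar{y}}{\bar{e}} \\[4pt]
C(y) &= \frac{\delta\,\bar{e}}{\bar{y}}\, C(e)
      + \sum_k \frac{\hat{\gamma}_k \bar{z}_k}{\bar{y}}\, C(z_k)
      + \frac{GC(\hat{\varepsilon})}{\bar{y}}
      && \text{(Wagstaff et al., 2003)} \\[4pt]
     &= C(e)
      + \sum_k \frac{\hat{\gamma}_k \bar{z}_k}{\bar{y}}\, C(z_k)
      + \frac{GC(\hat{\varepsilon})}{\bar{y}}
      && \text{since } \frac{\delta\,\bar{e}}{\bar{y}} = 1 \\[4pt]
C(y) - C(e) &= \sum_k \frac{\hat{\gamma}_k \bar{z}_k}{\bar{y}}\, C(z_k)
      + \frac{GC(\hat{\varepsilon})}{\bar{y}}
\end{align*}
```

With any other value of $\delta$, the first term would be a rescaled version of the concentration index of entitlements, and the remaining terms would no longer sum to the difference between the two concentration indices.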

Appendix E: Additional robustness checks
In the main text, we have included two robustness checks for the baseline estimates of income-related inequity. This Appendix provides four additional robustness checks, whose results are displayed in Table E.I ('Check 3' to 'Check 6'). For better readability, we have reproduced the estimates from the baseline analysis and the first two robustness checks.
[Table E.I on the following page; recoverable headers: Columns (1) and (2); Panel A. LTC use (equity overall); row: Baseline]

As a third robustness check, we use the Lasso to select both the need variables and the non-need factors to be included in the regressions. On the one hand, including as many non-need factors as are available as control variables reduces the risk that the correlation between need variables and income rank (which defines fair income-related inequalities in access to care) actually captures a correlation between the income rank and the non-need factors, in case the latter are correlated with need variables. On the other hand, not submitting the non-need factors to the Lasso selection procedure may lead to the exclusion of some need variables that are highly correlated with some non-need factors. This might distort our estimates of legitimate needs and of their distribution across income levels. We have replicated Analyses A and B using the Lasso to statistically select simultaneously a subset of need variables and a subset of non-need factors. The point estimates for HI (Table E.I, 'Check 3') are very close to our baseline estimates.

Turning to our fourth robustness check, our baseline analysis relies on the standard assumption that the unobserved determinants correlating with the income rank are illegitimate determinants of care use. But what if, in spite of our extensive set of health-related information, we fail to observe some (pro-poor distributed) dimensions of needs that are known to CIZ assessors? In particular, an adapted house can improve the capacity of the elderly with functional limitations to perform the activities of daily living without human assistance (Hoenig et al., 2003); a more accessible dwelling can also delay nursing home entry (Diepstraten et al., 2020). If lower-income elderly are less likely to have adapted houses (which we cannot observe in our data but is suggested by Diepstraten et al. (2020)), their legitimate care needs would be higher than our estimates indicate.
To check that the finding of no pro-rich horizontal inequity in home care access is not driven by our implicit assumption that the unobserved factors co-determining use are non-need-related, we relax this assumption in this fourth robustness check. With this more 'conservative' approach (Bago d'Uva et al., 2009), estimates point to an absence of income-related horizontal inequity at both the eligibility stage and overall: HI is now extremely close to zero for both Analyses A and B.

Our analysis of income-related inequity in care use relies on the assumptions that: (i) the OECD square root equivalence scale correctly reflects economies of scale for (household-level) expenditures, and (ii) economic resources are shared equally in the household (between spouses).
If (i) is incorrect, the ranking in the income distribution of singles relative to that of the elderly with a spouse would be incorrect. If (ii) is untrue, then the ranking of men with a spouse relative to that of women with a spouse is incorrect. Given that the share of singles (mostly widows) is higher among women, a sharing rule of household resources unfavorable to women would also imply that relatively more women are over-ranked in the distribution of income, relative to men.
As both spousal status and gender exhibit practically and statistically significant partial correlation with home care use and/or entitlements (holding other needs and non-need factors fixed), violation of (i) and/or (ii) could bias our assessment of income-related inequity in care use.
Estimates of economies of scale and sharing rules in couples are available for the Dutch 65+ population. Using consumption survey data, Cherchye et al. (2012) estimate an economies of scale parameter of 0.32. They also estimate a sharing rule that is more beneficial to women (average of 0.63 and standard deviation of 0.03 among couples).
The OECD square root equivalence scale we use in our baseline analysis reflects an economies of scale parameter of 0.41 for a two-person household. The OECD modified scale, which was in standard use until the 2000s, reflects an economies of scale parameter of 0.33. We therefore run two additional robustness checks: first, we re-compute equivalized income using the OECD modified scale; second, we assume that women in a couple receive 63% of the household income, after economies of scale have been taken into account.
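The two parameters quoted above can be recovered with simple arithmetic. The sketch below rests on assumptions that are not stated in the text: the economies of scale parameter is read as 2 divided by the equivalence scale of a couple, minus 1 (a reading that reproduces both 0.41 and 0.33), and the sharing rule is applied to the pool of individual-equivalent resources of the couple. The function names and the income figure are hypothetical.

```python
import math

# Equivalence scales for a two-adult household.
SQUARE_ROOT_SCALE = math.sqrt(2)  # OECD square root scale: sqrt(household size)
MODIFIED_SCALE = 1.0 + 0.5        # OECD modified scale: 1 + 0.5 for a second adult


def implied_economies_of_scale(couple_scale):
    """Assumed reading: extra resources a couple enjoys relative to splitting
    household income in two, i.e. 2 / scale - 1."""
    return 2.0 / couple_scale - 1.0


def individual_resources(household_income, couple_scale, share):
    """Assumed sharing rule: `share` of the couple's individual-equivalent pool.

    With share = 0.5 this returns the usual equivalized income (income / scale).
    """
    pool = 2.0 * household_income / couple_scale
    return share * pool


print(round(implied_economies_of_scale(SQUARE_ROOT_SCALE), 2))
print(round(implied_economies_of_scale(MODIFIED_SCALE), 2))

# Hypothetical couple with a household income of 30,000 under the modified scale.
equal_split = individual_resources(30000.0, MODIFIED_SCALE, 0.5)
woman = individual_resources(30000.0, MODIFIED_SCALE, 0.63)
man = individual_resources(30000.0, MODIFIED_SCALE, 0.37)
print(equal_split, woman, man)
```

Under this reading, moving from equal sharing to a 63% share raises the resources attributed to a partnered woman by about a quarter and lowers those of her partner by the same amount, which is enough to reorder partnered individuals relative to singles in the income ranking.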
Changing the equivalence scale is innocuous; however, assigning more resources to women somewhat affects the results. As in the baseline analysis, there is no evidence of socio-economic inequity in entitlements for home care; but the pro-poor gradient in the conversion of entitlements into use fades out and even becomes slightly pro-rich. This is consistent with the fact that women were found to convert more of their entitlements on average, and 'Check 6' makes women relatively richer. This scenario may over-estimate the pro-richness of the distribution of home care use, for given entitlements. The intra-household sharing rule estimated by Cherchye et al. (2012) uses data from 1978 to 2004, and their results suggest that resource sharing became more equal over the period. Re-allocating 63% of the household resources to women among the 65+ population in 2012 may thus make women with a partner artificially richer, relative to men with a partner and single individuals. Furthermore, Cherchye et al. (2012) find substantial heterogeneity in the sharing rule across households. We thus take the results from 'Check 6' with caution.