Modeling Quality and Price Perception in the Choice of Drinking Water in France: A Hybrid Choice Model Approach

The water resources literature usually discards the important price difference between bottled water and tap water as a predictor of drinking water choice. In France, bottled water is about 100 times more expensive than tap water. Using 4,003 survey responses, we model water resources quality (mis)‐perception and water price (mis)‐perceptions by means of a hybrid choice model. We show that respondents who are more likely to consider the quality of water resources as “very poor” or “poor” are less likely to drink tap water. Furthermore, we find that respondents who do not report the correct price difference between tap water and bottled water are more likely to drink bottled water, which is a novel finding, as significant price effects of this type have never been reported in the literature on drinking water choice.


Introduction
In this paper, we explore the relationship between individual perception of water resources quality, water price, and drinking water choices (tap water, filtered water, bottled water) using a hybrid choice model. To this end, we use data from a unique survey conducted in France in 2013 involving 4,003 respondents. Respondents were asked to state their perception of the quality of water resources (surface water and groundwater) and to report their main choice of drinking water. The survey also featured questions on how people perceive the price difference between tap water and bottled water. Such data are usually not available to analysts and allow us to propose a more complete model than those usually found in the empirical literature on drinking water choices, integrating both the perception of water resource quality and the perception of the price difference between bottled and tap water.
In line with other findings, we find that people who are more likely to consider the quality of water resources in France to be "very poor" or "poor" are less likely to drink tap water and more likely to drink bottled water. Additionally, we find that respondents who are less likely to report the correct price difference between tap water and bottled water are also more likely to drink bottled water. To our knowledge, this finding has never been reported in the literature, and it indicates that price perception affects drinking water choices. We hypothesize that previous studies did not find such an effect because they did not account for "water price mis-perception" (Brent & Ward, 2019). In other words, consumers are typically unaware of how much they actually pay for tap water. However, in practice, as perceptions (or mis-perceptions) are latent constructs, they can only be approximated by a set of attitudinal (or perceptual) indicators derived from answers to attitudinal survey questions. These indicators, whether discrete or continuous, are not a direct measure of attitudes and perceptions but instead a function of these perceptions. Therefore, these indicators are at best an imperfect measure of the latent constructs they seek to capture. Including them in choice models as error-free variables raises two main issues. First, this type of approach does not take into account the fact that latent variables are never perfectly observed and are subject to measurement error. Second, endogeneity bias can occur when unobservable factors influence both respondents' stated choices and their answers to attitudinal questions. Using a hybrid choice model mitigates these issues, as widely acknowledged (Amaris et al., 2021; Buckell et al., 2021; Kim et al., 2014).
The article is organized as follows: in Section 1.1, we provide an overview of the literature on drinking water choices. Section 1.2 presents the data. The hybrid choice model is introduced in Section 1.3. The results are reported and discussed in Section 2. Section 3 concludes.

Literature Review
Notice that in what follows, we focus on drinking water choices in high-income countries, our data being taken from a survey conducted in France. However, there is also a rich and abundant literature for developing countries, which admittedly face specific issues regarding drinking water choice. For example, Pattanayak et al. (2005) explore the WTP for improved water supply in Kathmandu (Nepal), where people are confronted with a chronic shortage of water. By contrast, water shortages, when they occur in high-income countries, are more related to droughts than to a lack of investment in water infrastructure. Nevertheless, even if developing countries face specific water supply issues, consumers' perceptions of water quality (Cambodia; Orgill et al., 2013) and/or of network water when compared to non-network water (Jordan; Orgill-Meyer et al., 2018) have been shown to play an important role in terms of drinking water choices, especially in terms of averting behaviors such as water storage, in-home treatment, etc. Such results and methodological contributions are partly related to the focus of our paper. However, we chose to restrict our literature review to high-income countries, selecting articles that explicitly address drinking water (tap water and filtered tap water vs. bottled water) using discrete choice econometrics, in line with what we intend to do in our paper. We start our literature review by recalling the role of non-price factors in drinking water choice models. We then show that there is a gap in the literature about the effect of water price (perceptions) in drinking water choice models. Finally, we end our literature review by pointing out that including perception variables in drinking water choice models raises measurement error issues.

Non-Price Factors and Drinking Water Choice
The vast majority of the literature on drinking water choice in high-income countries focuses on the United States and Canada, presumably due to better data availability. The perception of health risks associated with tap water is at the heart of the first studies devoted to drinking water choices. As a result, risk perception variables are included as predictors in drinking water choice models, alongside sociodemographic variables or variables characterizing the good consumed, such as taste, appearance, or odor.
The seminal work of Abrahams et al. (2000) looks at drinking water choice as a means of assessing the averting costs associated with perceived risks of water contamination. Using data from a survey conducted in Georgia in 1995, the authors examine how drinking water choices (tap water, filtered tap water, bottled water) are associated with three sets of variables: perceived risk, measured as a binary variable that takes the value 1 when the respondents stated that their tap water was unsafe (0 otherwise); the perceived quality of the tap water, measured as a composite index reflecting the odor, taste, and appearance of the tap water; and sociodemographic variables. It should be noted that the authors do not address the issue of measurement error in perception variables. The estimation of a simple multinomial logit model leads the authors to identify both the effects of sociodemographic variables (e.g., nonwhites tend to consume more bottled water) and the effects of perception variables. Respondents who consider tap water unsafe tend to consume more bottled water. This is also true for respondents who are dissatisfied with the quality of tap water. Overall, the authors conclude that bottled water expenditures reflect not only risk averting costs, but also the effect of the perceived quality of tap water on drinking water choices.
From an Internet-based survey in Canada, Dupont et al. (2010) also show, by estimating a simple multinomial logit model, that the probability of being a bottled water drinker or filtered tap water drinker increases with income or for respondents who have had unpleasant past experiences with tap water taste. They find that two key perception variables influence the choice of bottled water and filtered tap water over tap water: the level of perceived health risk associated with tap water and the belief that bottled water is safer than tap water.
Other relevant studies include Viscusi et al. (2015), who show that risk perception and past experiences are major determinants of drinking water consumption choices, together with gender, income, and race. Similar results have been found by Doria (2006) and Javidi and Pierce (2018), who focus on whether respondents perceive tap water to be safe or not. Furthermore, Levêque and Burns (2017) report that the perception that consumers have regarding the risks related to drinking tap water, as well as the perceived quality of surface water, are predictors of bottled and filtered water consumption, respectively. Lastly, based on Canadian data, Lloyd-Smith et al. (2018) find, using a latent class model, that perceived risk negatively affects the probability of drinking tap water. However, their focus is on the issue of averting expenditures against health risks associated with tap water consumption.
In addition to the United States and Canada, there are similar studies for other high-income countries, such as France, which is the focus of our analysis.
To our knowledge, Bontemps and Nauges (2009) are the first to explore drinking water choices in France. More precisely, the authors seek to assess how the quality of raw water influences the choice of drinking tap water or not. One of the original features of the article comes from the way in which the quality of raw water is measured, through an index of poor quality of raw water, which in fact corresponds to the drinking water production cost where the respondents are located. Estimation of simple probit models shows that the poor quality of tap water influences whether respondents drink it or not.
Johnstone and Serret (2012) use survey data from 10 OECD countries, including France, to analyze drinking water choices. The authors estimate a simple multinomial logit model to identify the main predictors of the choice to consume bottled water and/or purified tap water, with tap water taken as the reference case. Beyond the effect of some sociodemographic variables, the results show that concerns about health risks and (bad) tap water taste have a positive impact on the consumption of bottled and/or filtered tap water. Again, variables that capture the level of concern about tap water quality are not considered to be prone to measurement error.
Bontemps and Nauges (2016) use a subset of the same data set (focusing on Australia, Canada, and France) to compare several methods for controlling the endogeneity of a binary variable that encodes satisfaction or dissatisfaction with the health impacts of tap water, used as a predictor of whether to drink tap water or not. Bontemps and Nauges (2016) argue that perception variables, such as the perception of the health risks related to drinking tap water, should be treated as endogenous to averting decisions (here, the decision to drink tap water or not). They compare two methods for controlling endogeneity: the recursive bivariate probit model and the special regressor approach. The results show that the marginal effect of being satisfied with tap water on the probability of drinking tap water is twice as large with the recursive bivariate probit model as with the special regressor approach. As the latter is less demanding in terms of the underlying distributional assumptions, the authors suggest that it should be used whenever a special regressor is available.
Finally, Beaumais and Veyronnet (2017) use the same data we use for this paper to examine how drinking water choices (tap water, filtered water, bottled water) are associated with the degree of satisfaction with tap water, measured by a three-level categorical variable (not satisfied, satisfied, very satisfied). Taste heterogeneity across decision makers is modeled using a scaled multinomial logit model, which captures the influence of both observable and unobservable variables on the weights of the variables involved in the indirect utility function. The potential endogeneity of the satisfaction variable is controlled for by the two-stage residual inclusion method (2SRI; Terza et al., 2008). The results suggest that the categorical variable measuring satisfaction with tap water is endogenous. Specifically, comparing the results of models with and without endogeneity control shows significant differences between the two specifications: in both, satisfaction with tap water is a significant predictor of drinking water choices. However, when endogeneity is controlled for, it appears that individuals react in a binary way (not satisfied vs. satisfied) rather than in a nuanced way (not satisfied, satisfied, very satisfied).
One of the salient points of all these articles, with the notable exception of Bontemps and Nauges (2016) and Lloyd-Smith et al. (2018), is that price is not considered as an attribute of drinking water choice. Some authors (Abrahams et al., 2000; Dupont et al., 2010; Viscusi et al., 2015) recognize the potential role of price in drinking water choices, but face data limitations that prevent the use of water price in their modeling work. Others, such as Bontemps and Nauges (2009), argue that "consumer choice is not based on price comparison, as the price of bottled water is about 100 times higher than the price of tap water." The idea is that this price difference for two close substitutes should lead consumers to turn away from bottled water. Beaumais and Veyronnet (2017) build similar arguments, and point out that in their data, most respondents do not know the price of tap water and are unaware of the price difference between bottled water and tap water. In other studies, price is simply not considered a predictor of drinking water choices (Javidi & Pierce, 2018; Johnstone & Serret, 2012; Levêque & Burns, 2017).

Water Price and Drinking Water Choice: A Substantial Gap in the Literature
As mentioned above, the works of Bontemps and Nauges (2016) and Lloyd-Smith et al. (2018) are the only ones we found in which the authors include water price in a drinking water choice model. Although Bontemps and Nauges (2016) use data from a 2011 OECD survey to study the influence of perceived tap water quality on the choice of drinking bottled water instead of tap water, the authors also use the regional average price of water from a previous 2008 OECD survey in their estimation. This modeling strategy is explained by the fact that their special regressor, introduced to control for the endogeneity of the perception variable measuring tap water quality, is precisely the average water price in 2008 (at the regional level). Furthermore, the authors "expect tap water to be a normal good." In this vein, the data used by Lloyd-Smith et al. (2018) are sufficiently rich for them to include both the cost of bottled water and filtered tap water in their models, while the price of tap water is assumed to be zero. However, it should be noted that the latter authors do not address the measurement error that might be associated with these stated data on bottled water expenditure and filtered water expenditure. Their work focuses on ways of controlling for the potential endogeneity of perceived health risks associated with tap water, without questioning how the price of water is perceived.
Water Resources Research 10.1029/2023WR034803
As previously discussed in the Introduction, the term "water price mis-perception," as used by Brent and Ward (2019), seems more appropriate to refer to how people perceive the price of water.Notice that cost misperceptions, for example, energy costs, are known to alter the decision-making process and can result in welfare losses, which can be mitigated by targeted policies (Allcott, 2013).Following Brent and Ward (2019), we therefore assume that using an objective tap water price measure to inform tap water consumption choices is inadequate.In fact, in France, only 20% of the consumers surveyed have been found to know the correct price of tap water, while 58% stated that they "do not know" (CGDD, 2014).It is then obvious that consumers do not use any objective water price variable in the utility maximization process underlying their drinking water choices.
In the literature on individual discrete choice behavior, be it in transport economics, energy economics or water economics, it is now commonplace to distinguish between decision utility and experienced utility (Allcott, 2013; Allcott & Sunstein, 2015; Brent & Wichman, 2022; Train, 2015). In a typical discrete choice model, each individual is confronted with a set of discrete alternatives (in this paper, tap water, filtered water and bottled water). Decision utility, also called anticipated utility (Train, 2015) or choice utility (Brent & Ward, 2019; Brent & Wichman, 2022), refers to the utility that an individual expects from choosing one alternative over other available options, at the decision step. Experienced utility, also called actual utility (Train, 2015), refers to the utility that an individual receives from experiencing (consuming) the chosen alternative (good). When the individual misperceives one or several attributes of the alternatives at the decision step, experienced utility can be different from decision utility, thus inducing welfare losses in cases where experienced utility is lower than decision utility. Misperceptions can be due to imperfect information or to internalities, that is, "costs we impose on ourselves by taking actions that are not in our own best interest" (Allcott & Sunstein, 2015), due to behavioral biases. Notice that, as pointed out by Train (2015, Footnote 2), misperceptions induce a difference between choice utility and experienced utility only if the individual involved discovers ex post (at the experience step) that she misperceived one or several attributes of the chosen alternative (good). Nevertheless, beyond the assessment of welfare losses and from a policy recommendation perspective, identifying whether misperceived attributes actually influence discrete choice is of importance, as it can help design relevant interventions (see our conclusions).
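The distinction between decision utility and experienced utility can be illustrated with a minimal Python sketch. It assumes a multinomial logit over the three alternatives and a utility of bottled water that decreases linearly in the bottled-to-tap price ratio. The coefficient, the alternative-specific constants, and the perceived ratio of 10 are hypothetical values chosen for illustration only; they are not estimates from this paper.

```python
import math

# Hypothetical parameters, for illustration only (not estimates from the paper).
BETA_RATIO = -0.02  # assumed marginal utility of the bottled-to-tap price ratio
CONSTANTS = {"tap": 0.0, "filtered": -0.6, "bottled": 0.5}


def utilities(price_ratio):
    """Utility of each alternative when bottled water is believed to cost
    `price_ratio` times as much as tap water."""
    u = dict(CONSTANTS)
    u["bottled"] += BETA_RATIO * price_ratio
    return u


def logit_shares(u):
    """Multinomial logit choice probabilities for a dict of utilities."""
    top = max(u.values())  # subtract the maximum for numerical stability
    expu = {alt: math.exp(v - top) for alt, v in u.items()}
    total = sum(expu.values())
    return {alt: e / total for alt, e in expu.items()}


decision_u = utilities(price_ratio=10)      # utility under an assumed perceived ratio
experienced_u = utilities(price_ratio=100)  # utility under a ratio of about 100

shares_perceived = logit_shares(decision_u)    # choices driven by decision utility
shares_informed = logit_shares(experienced_u)  # choices if the ratio were known

# Gap between the anticipated and the experienced utility of bottled water.
utility_gap = decision_u["bottled"] - experienced_u["bottled"]

print({alt: round(p, 3) for alt, p in shares_perceived.items()})
print({alt: round(p, 3) for alt, p in shares_informed.items()})
print(round(utility_gap, 2))
```

Under these assumed values, underestimating the price ratio raises the predicted share of bottled water relative to the informed case, which is the kind of mechanism that a model with a latent price perception variable is meant to detect.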
To sum up, the papers on drinking water choice models that we have discussed in this literature review do not consider the role of the perceived price of water, or the perceived difference in water prices (bottled water, tap water) on drinking water choice. Our paper aims to fill this gap. However, in terms of econometric strategy, perception variables pose specific issues, which we now address.

Perception Variables as Predictors of Drinking Water Choice and Endogeneity
In the empirical literature on tap water consumption, many authors have used survey responses to attitudinal statements directly as covariates in a drinking water choice model (Bontemps & Nauges, 2016). However, it is now widely acknowledged that attitudes and perceptions are unobserved and that only manifestations or imperfect measurements of these attitudes can be observed (Hess & Stathopoulos, 2013). More precisely, the perceived quality of water resources is arguably not an exact measure of the way respondents perceive it, but a function thereof, and using it as an explanatory variable in a model of drinking water choice is hence likely to put the analyst at risk of measurement error, leading to endogeneity bias.
As is well known, endogeneity can arise from three sources: measurement error, omitted variables and simultaneous determination. Regarding the perception of health risks related to tap water, Bontemps and Nauges (2016) seem to target endogeneity arising from simultaneity. They argue that "In typical theoretical averting-behavior models, both actual risk and perceived risk are endogenous since they depend on attitudes toward safety, and hence on the averting actions being taken." Lloyd-Smith et al. (2018) explicitly add omitted variables as a source, given that "unobserved factors affecting water choices may be correlated with risk perceptions leading to omitted variable bias." Notice that in their empirical work, Bontemps and Nauges (2016) actually control for the endogeneity of a binary variable that is equal to one when respondents state that they are satisfied with the health impact of tap water and zero otherwise. However, they also include the perceived local water quality in their models, without controlling for the potential endogeneity of this variable. The same holds for Lloyd-Smith et al. (2018), who focus exclusively on controlling the endogeneity of their risk perception variable, without controlling for the endogeneity of the water quality perception variables that they include in their model. By contrast, Whitehead (2006), cited by both Bontemps and Nauges (2016) and Lloyd-Smith et al. (2018), acknowledges the potential endogeneity of water quality perception variables in a series of models designed to assess willingness to pay (WTP) for improved water quality. Interestingly, Whitehead (2006) considers the inclusion of perceived water quality in WTP models as a way to mitigate the omitted variable bias. From his point of view, endogeneity arises because unobservable variables could be correlated with both perceived quality and WTP. Likewise, Vásquez et al. (2015) include a variable measuring (tap water) quality perception in a series of probit models aimed at exploring various averting behaviors related to drinking water in Nicaragua, such as boiling water, treating water, or purchasing bottled water. They control for the potential endogeneity of the quality perception variable, which, they consider, stems from the fact that "in the context of developing countries where information of water quality is hardly available, household perceptions of water quality may be endogenous to averting behaviors as those perceptions can be based on several factors other than actual water quality."
Although ignoring water resources quality and water price (mis-)perception might lead to endogeneity due to omitted variables, including these factors in drinking water choice models is also a potential source of endogeneity due to measurement error. This issue results from the fact that indicators of attitudes and perceptions are usually not direct measures of latent constructs, but rather functions of them (Budziński & Czajkowski, 2022). More precisely, while it is essential to account for attitudes and perceptions, especially when individuals have a very poor or even erroneous knowledge of the objective reality, measurement error arises when directly incorporating indicators of attitudes and perceptions as error-free variables in standard choice models, as "attitudes can never be observed by an analyst" (Amaris et al., 2021). We argue, in line with Kim et al. (2014), Buckell et al. (2021), and Amaris et al. (2021), that an appropriate econometric strategy, given the modeling problem at hand, is to estimate a hybrid choice model (HCM). Such an approach allows us to introduce quality and price perceptions in a drinking water choice model. HCMs are made of three parts (Ben-Akiva et al., 2002; Kim et al., 2014): a choice model (commonly based on the random utility framework), a measurement component and a structural component. It is important to clarify that indicators do not directly enter the utility function. Instead, the indicators and the choice model are linked through latent variables. The measurement and structural components form what is called the latent variable model. The measurement component links the indicators to the latent variables, explicitly accounting for heterogeneous error terms across individuals, while the structural component links the latent variables to exogenous variables, such as sociodemographic factors. We introduce the data in the next section.
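The consequence of treating a noisy indicator as an error-free variable can be illustrated with a small simulation. The sketch below is a deliberately simplified linear analogue, not the hybrid choice model itself: it assumes a standard normal latent perception, an outcome that depends linearly on it, and an indicator equal to the latent variable plus independent noise. All names and values are hypothetical. In non-linear choice models the bias does not reduce to the simple attenuation factor shown here, but the mechanism is the same.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


random.seed(0)
N = 20000
BETA = 1.0  # assumed true effect of the latent perception on the outcome

latent = [random.gauss(0, 1) for _ in range(N)]            # unobserved perception
outcome = [BETA * a + random.gauss(0, 1) for a in latent]  # outcome driven by it
indicator = [a + random.gauss(0, 1) for a in latent]       # noisy survey indicator

slope_latent = ols_slope(latent, outcome)        # regression on the true latent variable
slope_indicator = ols_slope(indicator, outcome)  # regression on the noisy indicator

print(round(slope_latent, 3), round(slope_indicator, 3))
```

With equal variances for the latent variable and the measurement noise, the slope on the indicator is expected to shrink toward BETA × 1 / (1 + 1) = 0.5, whereas the slope on the latent variable stays close to BETA.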

Data
Our research is based on data collected in 2013 from 4,003 respondents by IFOP (Institut Français d'Opinion Publique) as part of a survey commissioned by the Commissariat général au développement durable. Given the secondary nature of our data, we performed a series of checks to ensure the integrity of the data. A detailed data integrity statement is included in Appendix A, where limitations are also discussed.
In addition to the usual sociodemographic variables, respondents were asked about their perception of the quality of water resources, their choice of drinking water, their knowledge of the price of tap water, their knowledge of the difference between the price of tap water and the price of bottled water, etc. The full sample is representative of the French population (CGDD, 2014). The estimation sample is restricted to respondents for whom income is available, that is, 3,506 individuals, without altering the representativeness of the sample (see Table 1). Additional variables on nitrate concentration in groundwater and rainfall level were added to the original database from external sources (from the Eider database of the French Ministry of the Environment and from the public water data web portal, data.eaufrance.fr). We used the most detailed nitrate data available, which are not reported at the level of the respondents' catchment areas but rather at the "arrondissement" level, an administrative subdivision of France (Regions > Departments > Arrondissements). At the time that we merged the survey data with external data on nitrate levels, France was divided into 332 arrondissements. It is worth noting that, in France, the vast majority (two-thirds) of drinking/tap water is produced from groundwater, which also explains why we chose to use nitrate concentration in groundwater as an objective measure of the quality of water resources.
In the survey, the question about drinking water choices was expressed as follows: "At home, when you drink water, most often you drink…? (Only one possible answer) -Mainly bottled water, -Mainly filtered tap water, -Mainly tap water." As is usual in the literature on drinking water choice, the wording of this question makes the choices mutually exclusive, so that discrete choice modeling techniques can be applied to the data. The main choice of drinking water in France is tap water (40%). Bottled water is the second most popular (39%). Finally, filtered water is third in the list and is consumed by 21% of respondents (see Table 1).

Choice of Drinking Water and Perceived Quality of Water Resources
Does the choice of drinking water depend on the perceived quality of water resources stated by the respondents? In response to the following question: "Apart from drinking water, would you say that the current quality of water resources in France (whether in rivers, lakes, groundwater) is…?," the stated perceived quality of water resources can take one value among four: very poor (1), poor (2), good (3) and very good (4). Note that a separate question was asked about the perception of the expected quality of water resources in the future.
As a first step in our analysis, we report the choice of drinking water depending on the perceived quality of water resources (Table 2). We can observe that the share of tap water drinkers is 34.34% when the respondents consider that water resources are of very poor quality, while it increases to 53.90% when they consider them to be very good (an increase of 19.56 percentage points). The dependence between the two variables, one nominal, the other ordered, is confirmed by a χ² test (χ²(6) = 45.1451, p-value < 0.001).
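The reported test statistic can be checked with a few lines of Python. The sketch below recomputes the degrees of freedom from the dimensions of the cross-tabulation (three drinking water choices by four perceived-quality levels) and evaluates the upper-tail probability of the χ² distribution using the closed form that holds for an even number of degrees of freedom. The function name is ours; the underlying cell counts of Table 2 are not reproduced here, so only the p-value and the percentage-point gap are verified.

```python
import math


def chi2_sf_even_df(stat, df):
    """Upper-tail probability P(X > stat) for a chi-square variable with an
    even number of degrees of freedom (closed-form Poisson-type sum)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (stat/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# 3 drinking water choices x 4 perceived-quality levels
df = (3 - 1) * (4 - 1)
p_value = chi2_sf_even_df(45.1451, df)

# Gap in the share of tap water drinkers, in percentage points
gap = round(53.90 - 34.34, 2)

print(df, p_value, gap)
```

The resulting p-value is far below 0.001, which is why a value of 0.000 appears when it is reported to three decimal places.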

Perception of Water Resources Quality
Furthermore, Table 3 reports the concentration of nitrates in milligrams per liter (mg/l) in groundwater used to supply drinking water. Although it is not relevant for our analysis to know where the highest concentrations of nitrate are found in mainland France, it is important to point out that there are large differences across the territory. The regions of the South of France are those where people drink tap water the most (Auvergne-Rhône-Alpes: 56.91%; Provence-Alpes-Côte d'Azur and Corsica: 51.27%; Occitanie: 50.88%), and they are also regions where average nitrate concentrations in raw water are low. On the other hand, Hauts-de-France (17.83%), Centre-Val de Loire (28.85%) and Normandy (31.43%) are the regions where individuals drink tap water the least and where average nitrate concentrations are relatively high. The last column of Table 3 shows the average perceived quality by region. As perceived quality is an ordered categorical variable that ranges from 1 (very poor) to 4 (very good), a higher average value roughly corresponds to a better perceived quality. The region where the average nitrate concentration is lowest (Provence-Alpes-Côte d'Azur and Corsica) is also the region where the perceived quality is the highest on average. The regions where the average nitrate concentration is the highest (Normandy, Hauts-de-France) exhibit a low perceived quality on average, Brittany (Bretagne) being an exception, which illustrates that subjective perceptions can be far from objective measures of water quality (Artell et al., 2013).

Perceived Water Price
Finally, Table 4 shows that 58% of the respondents do not know the price of tap water; only 20% can state a value in line with the price actually charged (about €3.4 per m³; the mean value given by the respondents for how much a cubic meter of tap water costs is €5.97 and the standard deviation €8.89). Furthermore, 56% of the respondents consider tap water to be somewhat expensive, about 30% consider it to be somewhat affordable, while 14% state that they do not know whether tap water is expensive or affordable. These findings are in line with other sources, as further discussed in Appendix A. Additionally, the price difference between tap and bottled water is underestimated by approximately half the population, with about 25% unable to comment on the difference: on average, at the time of the survey, bottled water was about 100 times more expensive than tap water according to CGDD (2014). Notice that the same figure is used by Bontemps and Nauges (2009) based on bottled water purchase data from a sample of French households. The actual price difference at the time of the CGDD survey was that bottled water was 92 times more expensive than tap water. We hypothesize that the correct answer has been rounded up to 100 to reduce cognitive burden and avoid making it too salient compared to other response options. Therefore, it is likely that whether an individual correctly perceives the price difference between tap water and bottled water influences whether she chooses to consume bottled water. Hence, our econometric strategy consists of modeling the perception of the price of tap water with respect to bottled water as a latent variable, where the stated perception of the price difference between bottled water and tap water is used as an indicator jointly estimated together with the choice model.

Econometric Strategy: Latent Variables and Hybrid Choice Model
Typical HCMs are composed of three parts: a measurement component, a structural component, and a discrete choice model component (Ben-Akiva et al., 2002; Budziński & Czajkowski, 2022; Groothuis et al., 2021; Kim et al., 2014). In the following, we present each part in detail to end with the log-likelihood of the whole system of equations. A detailed graph summarizing how the different parts are connected is available in Appendix A (Figure A1). The measurement component of the HCM is composed of a series of indicators measuring two underlying latent variables related to the perception of water quality and the perception of tap water price, labeled $\alpha_{quality}$ and $\alpha_{price}$, respectively.

Measurement Component: Perception of Water Quality
The latent variable $\alpha_{quality}$ is informed by two indicators related to the perception of water resources quality, today and in the future. The measurement models for these indicators are ordered logit model structures. The probability $P_{I_{quality}}$ of observing a given outcome $n$ (where $n = 1, \dots, 4$) for the indicator $I_{quality}$ related to the perception of water quality today is given by an ordered logit (OL) model, which is specified as follows:

$$P_{I_{quality}} = P(I_{quality} = n) = \frac{e^{\tau^{quality}_{n+1} - \zeta_{quality}\,\alpha_{quality}}}{1 + e^{\tau^{quality}_{n+1} - \zeta_{quality}\,\alpha_{quality}}} - \frac{e^{\tau^{quality}_{n} - \zeta_{quality}\,\alpha_{quality}}}{1 + e^{\tau^{quality}_{n} - \zeta_{quality}\,\alpha_{quality}}} \qquad (1)$$

where the parameter $\zeta_{quality}$ thus corresponds to the effect of a variation in the latent variable $\alpha_{quality}$ on the perception of water quality. The parameters labeled $\tau^{quality}_{n}$ are latent thresholds estimated by the model and normalized so that $\tau^{quality}_{1} = -\infty$ and $\tau^{quality}_{5} = +\infty$. The indicator related to the expected quality in the future is labeled $I_{qualityf}$ and is also modeled via an ordered logit structure (where $\zeta_{quality}$ is replaced by $\zeta_{qualityf}$ and where another set of thresholds, labeled $\tau^{qualityf}$, is estimated).
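The ordered logit measurement structure described above can be sketched in a few lines of Python. The sketch assumes the usual convention in which outcome n lies between two consecutive thresholds, with the lowest threshold taken as −∞ and the highest as +∞, and a linear index equal to ζ times the latent variable. The parameter values and function names are hypothetical and serve only to show how the four outcome probabilities respond to the latent variable.

```python
import math


def logistic_cdf(x):
    """Numerically stable logistic CDF; returns 0.0 at -inf and 1.0 at +inf."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def ordered_logit_probs(alpha, zeta, inner_thresholds):
    """Probabilities of outcomes 1..K for an ordered logit indicator with
    K - 1 finite, increasing thresholds and index zeta * alpha."""
    cuts = [-math.inf] + list(inner_thresholds) + [math.inf]
    v = zeta * alpha
    return [
        logistic_cdf(cuts[n + 1] - v) - logistic_cdf(cuts[n] - v)
        for n in range(len(cuts) - 1)
    ]


# Hypothetical values: zeta and the three finite thresholds are not estimates.
ZETA_QUALITY = 1.2
THRESHOLDS = [-2.0, 0.0, 1.5]

probs_low = ordered_logit_probs(alpha=-1.0, zeta=ZETA_QUALITY, inner_thresholds=THRESHOLDS)
probs_high = ordered_logit_probs(alpha=1.0, zeta=ZETA_QUALITY, inner_thresholds=THRESHOLDS)

print([round(p, 3) for p in probs_low])   # very poor, poor, good, very good
print([round(p, 3) for p in probs_high])
```

With a positive ζ, a higher value of the latent variable moves probability mass from "very poor" toward "very good"; the sign of the estimated ζ therefore determines how the latent variable should be interpreted.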

Measurement Component: Water Price Perception
The latent variable α price is informed by three indicators related to whether respondents feel that bottled water is expensive compared to tap water and whether they know the actual price of tap water as well as the actual price difference between tap water and bottled water. For each indicator, we first model by means of a binary logit whether the respondent has answered the question or simply stated "I don't know" and then, if the respondent has provided an answer, what her answer is (using different model structures, as further described in Table 5 below).
As an illustration, the probability of observing the set of answers for question 2, "How much more expensive do you think bottled water is than tap water?", is labeled PI price2 and is given by Equation 2. The equation for PI price2 combines a binary logit (BL) model for whether the respondent has given a response to the question and an ordered logit model which is only estimated if the respondent has answered the question (this is the purpose of the indicator variable I price2NA, which takes the value 1 if the respondent has answered and 0 otherwise). The ordered logit part of the equation (from which "do not know" is excluded) is coded in such a way that a higher value for the dependent variable corresponds to an answer that is closer or equal to the right statement, which is that "bottled water is 100 times more expensive [than tap water]".
The other two measurement equations enter the model in a similar fashion, with PI price1 combining a logit model and a linear regression (LR) for modeling "how much does one cubic meter of water cost" and PI price3 combining two logit models for modeling whether the respondent feels that tap water is "somewhat expensive" or "somewhat affordable" conditionally on whether the respondent has provided a response for this question. The entire set of indicators as well as the latent variables they are interacted with are reported in Table 5 below.
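The two-part structure of PI price2 can be sketched as follows. This is a minimal illustration, not the authors' exact Equation 2: it assumes that the binary logit for answering the question has a constant plus ζ price2na α price as its index, and that the ordered logit for the stated answer uses ζ price2 α price. All parameter values, the function name, and the number of answer categories are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def price2_likelihood(answered, category, alpha_price, const_na, zeta_na,
                      zeta, inner_thresholds):
    """Likelihood of one respondent's answer to the price-difference question.

    answered: True if the respondent gave an answer other than "I don't know".
    category: 1-based answer category (ignored when answered is False).
    """
    # Binary logit part: probability of giving an answer at all.
    p_answer = logistic(const_na + zeta_na * alpha_price)
    if not answered:
        return 1.0 - p_answer

    # Ordered logit part, only used when an answer was given.
    taus = [-math.inf] + list(inner_thresholds) + [math.inf]
    v = zeta * alpha_price

    def cdf(tau):
        if tau == math.inf:
            return 1.0
        if tau == -math.inf:
            return 0.0
        return logistic(tau - v)

    p_category = cdf(taus[category]) - cdf(taus[category - 1])
    return p_answer * p_category


# Hypothetical parameters: four thresholds, hence five ordered answer categories.
print(price2_likelihood(True, 5, alpha_price=0.5, const_na=1.0, zeta_na=0.8,
                        zeta=1.5, inner_thresholds=[-1.0, 0.0, 1.0, 2.0]))
```

Summing this likelihood over "I don't know" and every answer category gives 1, which is a quick check that the two parts have been combined consistently.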

Structural Component
The structural component links the latent variables to exogenous factors, including sociodemographic variables. More formally, for respondent i, α quality and α price are each specified as a function of the explanatory variables, with coefficients labeled by latent variable and covariate (for example, α quality,nitrates for the effect of nitrates on α quality), plus a random disturbance (σ 1 and σ 2 respectively). The continuous variables are recoded so that they are centered on 0. The binary variable "rural" takes the value 1 for respondents living in a rural area and 0 otherwise. The variables "edu1," "edu2," "edu3," and "edu4" are binary variables capturing different levels of education, where "Master and higher degree" is the base. The variable x nitrates is the observed nitrate concentration and x rainfall is the local level of rainfall. Other variables have been defined in Table 1. Finally, σ 1 and σ 2 are random disturbances specified as σ 1 ∼ N(0,1) and σ 2 ∼ N(0,1).

Choice Model
The choice model corresponds to the well-known multinomial logit model. The utility that an individual i derives from choosing alternative j from a set of J alternatives is:

$$U_{ij} = \beta x_{ij} + \varepsilon_{ij}$$

where x ij is a vector of observed characteristics of alternative j and observed characteristics of individual i, and β is a vector of coefficients, which are interpreted as weights in the utility function. V ij = βx ij is often referred to as the deterministic component of the utility U ij. The error term, ε ij, captures the effect of variables that influence the utility U ij but are not included in βx ij. Individual i chooses alternative k if and only if U ik > U ij, ∀j ≠ k. Therefore, the probability that individual i chooses alternative k can be written as follows:

$$P_{ik} = \Pr\left(U_{ik} > U_{ij},\ \forall j \neq k\right) = \Pr\left(\varepsilon_{ij} - \varepsilon_{ik} < V_{ik} - V_{ij},\ \forall j \neq k\right)$$

Assuming that ε ij is distributed iid extreme value, the probability of choosing a given alternative (e.g., alternative 1 among three) is given by:

$$P_{i1} = \frac{\exp(V_{i1})}{\exp(V_{i1}) + \exp(V_{i2}) + \exp(V_{i3})}$$

The two latent variables α quality and α price directly enter the utility functions of the multinomial logit model, which are specified for each individual i (the index i is omitted for variables x) with V bottled,i = 0, and where the parameters θ tap, quality, θ tap, price, θ filtered, quality, and θ filtered, price correspond to the effect of the latent variables α quality and α price on the probability of choosing tap water and filtered water with respect to bottled water, which is the reference alternative of the model. It is worth noting that the same variables enter V tap, V filtered as well as α quality and α price, in line with the recommendations of Vij and Walker (2016) on how to adequately specify HCMs. Notice that the HCM results were compared with "naïve" multinomial logit (MNL) and nested logit models in which the indicators enter the models in the same way as the other covariates labeled β. The HCM was found to be more efficient (as defined and measured by Abou-Zeid and Ben-Akiva (2014)) than its MNL counterpart, while nested logit structures were not found to substantially improve fit and behavioral insights. Efficiency measures comparing a naïve MNL and the HCM are reported in Appendix A (Table A2).
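To make the role of the latent variables in the choice probabilities concrete, the sketch below computes multinomial logit probabilities for the three alternatives with bottled water as the reference (utility 0). The four θ values are the point estimates reported in the Results section; everything else is an assumption made for illustration. In particular, the contribution of all other covariates and constants is set to zero by default, so the resulting probabilities are not predictions from the estimated model.

```python
import math

# Point estimates of the latent-variable effects reported in the Results section.
THETA = {
    "tap": {"quality": -0.4015, "price": 0.3718},
    "filtered": {"quality": -0.2327, "price": 0.3565},
}


def choice_probabilities(alpha_quality, alpha_price, other_utility=None):
    """Multinomial logit probabilities with bottled water as reference.

    other_utility is a hypothetical placeholder for the part of V_tap and
    V_filtered that comes from constants and observed covariates.
    """
    other_utility = other_utility or {"tap": 0.0, "filtered": 0.0}
    v = {"bottled": 0.0}
    for alt in ("tap", "filtered"):
        v[alt] = (other_utility[alt]
                  + THETA[alt]["quality"] * alpha_quality
                  + THETA[alt]["price"] * alpha_price)
    # Subtract the maximum utility before exponentiating for numerical stability.
    v_max = max(v.values())
    exp_v = {alt: math.exp(val - v_max) for alt, val in v.items()}
    denom = sum(exp_v.values())
    return {alt: e / denom for alt, e in exp_v.items()}


for aq, ap in [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]:
    probs = choice_probabilities(aq, ap)
    print(aq, ap, {alt: round(p, 3) for alt, p in probs.items()})
```

In this sketch, raising α quality lowers the tap water share and raising α price lowers the bottled water share, which matches the signs discussed in the text.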
The model hence jointly explains the values of six dependent variables: the main drinking water choice question, the answers to the two water quality perception questions, and the answers to the three water price perception questions. Therefore, the log-likelihood corresponds to the joint probability of observing the tap water consumption choice for each individual, as well as the likelihood of observing the responses related to each one of the indicators described above. The presence of random parameters in the model makes it possible to relax the Independence of Irrelevant Alternatives (IIA) property of the basic multinomial logit model (without mixing).
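A sketch of the joint log-likelihood implied by this description is given below. It is an assumed reconstruction in standard HCM notation rather than the authors' printed equation: P choice,i denotes the multinomial logit probability of respondent i's observed drinking water choice, the PI terms are the five indicator probabilities defined in the measurement components, and φ is the standard normal density of the disturbances σ 1 and σ 2, which are assumed here to be independent.

```latex
LL = \sum_{i=1}^{N} \ln \int_{\sigma_1} \int_{\sigma_2}
     P_{\text{choice},i} \;
     PI_{\text{quality},i} \; PI_{\text{qualityf},i} \;
     PI_{\text{price1},i} \; PI_{\text{price2},i} \; PI_{\text{price3},i} \;
     \phi(\sigma_1)\, \phi(\sigma_2)\, d\sigma_1\, d\sigma_2
```

Because the integral has no closed form, it is approximated in estimation by averaging the integrand over simulated draws of σ 1 and σ 2.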

Results
The final model was estimated using 25,000 Halton draws. We used the Apollo package (Hess & Palma, 2019) running on the Myria HPC system (CRIANN, Normandy, France). Results are described in Table 6 (robust standard errors and robust t-ratios are also reported).
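For readers unfamiliar with Halton draws, the following sketch shows how such quasi-random draws can be generated and mapped to standard normal values for the two disturbances σ 1 and σ 2. It is a generic textbook construction (radical inverse in prime bases 2 and 3, followed by the inverse normal CDF) and does not reproduce how Apollo generates or scrambles its draws; the function names are hypothetical.

```python
from statistics import NormalDist


def halton(index, base):
    """Radical inverse of a positive integer index in the given prime base."""
    result = 0.0
    fraction = 1.0 / base
    i = index
    while i > 0:
        result += fraction * (i % base)
        i //= base
        fraction /= base
    return result


def halton_normal_draws(n_draws, bases=(2, 3)):
    """Return n_draws tuples of standard normal draws, one coordinate per base."""
    normal = NormalDist()
    # Start at index 1 so that every uniform value lies strictly inside (0, 1).
    return [
        tuple(normal.inv_cdf(halton(i, b)) for b in bases)
        for i in range(1, n_draws + 1)
    ]


draws = halton_normal_draws(5)
for sigma_1, sigma_2 in draws:
    print(round(sigma_1, 4), round(sigma_2, 4))
```

Each pair of draws would then be plugged into the structural equations, and the likelihood contribution of a respondent averaged over all draws.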

Quality Perception
The results should be interpreted as follows: an increase in the (normally distributed and unitless) latent variable α quality decreases the probability of drinking tap water compared to bottled water, as θ tap, quality is negative (−0.4015) and significant at the 1% level. An increase in this same latent variable also decreases, to a lesser extent, the probability of consuming filtered water, given that θ filtered, quality is also negative (−0.2327) and significant at the 1% level. An increase in the latent variable α quality decreases the probability of stating a better perception of water quality. Indeed, the parameter ζ quality is found to be significant and equal to 0.9402. This indicates that respondents who are less likely to state that the current quality of water resources is good are also less likely to drink tap water, in line with expectations. The sign of ζ qualityf shows that respondents who are more likely to consider that the current quality of water resources in France is good are more likely to consider that this quality is going to decrease in the future. Overall, our results suggest that respondents are more likely to drink tap water or filtered tap water than bottled water when they are more likely to consider that the quality of water resources is high.
Surprisingly, the latent variable α quality is not influenced by nitrate and rainfall levels. This result is driven by multiple factors. First, when estimating the ordered logit models related to quality and quality_f as separate models rather than as indicators in a hybrid choice model structure, we find that the variables nitrates and rainfall have a significant effect on quality but not on quality_f, which might affect the significance of the parameter when used as part of a latent construct informed by both indicators. Second, the means of both nitrates and rainfall are significantly higher for respondents living in a rural area according to a series of t-tests significant at the 1% level (t = 5.5827 and 4.5784 respectively). Hence, it might be that the effect of nitrates and rainfall on α quality is also captured by rural, which is found to be negative and significant. At the same time, there are reasons for keeping all three variables in the model, as the distributions of nitrates for rural and nonrural respondents (unreported) largely overlap (and the same argument can apply to rainfall). This does not mean that respondents do not adjust their behavior according to objective measures of water quality, given that β tap, nitrates is negative and significant at the 1% level in the choice model component of the hybrid choice model. Other results indicate that α quality is lower for older individuals (−0.0171) and higher for women (0.4591). The latent variable also varies by level of education and income level, where respondents who reported a higher income exhibit a lower value for α quality, all else being equal.

Price Perception
We now investigate the relationship between the indicators related to the perception of the price of tap water (with respect to bottled water) and the choice of water consumption. We find that an increase in the (normally distributed and unitless) latent variable related to the perception of the price of bottled water, α price, increases the probability of consuming tap water and filtered tap water with respect to bottled water, given that θ tap, price and θ filtered, price are both positive and significant at the 1% level (the parameter values are 0.3718 and 0.3565 respectively). At the same time, an increase in α price decreases the probability of reporting a high value for a cubic meter of tap water, given that ζ price1 is negative (−0.5209) and significant at the 1% level. Moreover, an increase in α price also increases the probability of reporting the correct price of bottled water with respect to tap water, given that ζ price2 is positive (1.7140) and also significant at the 1% level. In other words, respondents who are less likely to know the correct price of bottled water with respect to tap water are also more likely to consume it. This important result has not been observed elsewhere in the literature, to the best of our knowledge. Also, an increase in α price increases the probability of stating that tap water is somewhat affordable, given the sign and value of ζ price3, which is significant at the 5% level. Finally, the signs of ζ price1na, ζ price2na, and ζ price3na indicate that respondents who are more likely to give an answer other than "I don't know" to the questions related to water price perception (Price1, Price2, Price3) are also more likely to consume tap and filtered water compared to bottled water. Overall, these results suggest that there is a significant price effect when consumers choose between bottled water, tap water, and filtered tap water.
The latent variable α price increases with age (0.0116) and is found to be lower for women (−0.6281), which means that women are less likely to know the actual price of bottled water with respect to tap water. This is a surprising result, but it is in line with what is found in the data. Indeed, only 27.71% of the male respondents and 14.37% of the female respondents stated that the price of bottled water is about 100 times higher than the price of tap water. This difference between genders is significant at the 1% level according to a χ² test (χ²(1) = 94.1904, p-value = 0.000). Moreover, α price increases with income. We also find that different levels of education have different effects on the value of the latent variable.

Water Resources Research
10.1029/2023WR034803

Other Findings From the Discrete Choice Model
The discrete choice model results also show that an increase in nitrates negatively affects the probability of drinking unfiltered tap water, while rainfall increases it. Older people are less likely to drink tap water and filtered water compared to bottled water.
In what follows, we adapt the sample enumeration procedure suggested by Hess et al. (2018) to test different scenarios related to the reduction of nitrates in groundwater and changes in the level of education of the population.

Analysis of Scenarios
The process consists of predicting the likelihood that each of the 3,506 respondents included in the model chooses a given drinking water alternative in different scenarios. We test how nitrate levels found in groundwater affect choice probabilities by making predictions for when the level of nitrates found in groundwater is, for all respondents, the highest of the sample (62.81), the third quartile (23.74), the median level (15.07), the first quartile (6.91), and the minimum level found in the sample (0). The other variables are kept at their respective values in the sample to test these different scenarios, all else being equal. We follow the same approach for testing how variations in education level affect choice probabilities. More precisely, we test five scenarios and make predictions for when all respondents have no degree, a middle school education, a high school education, a bachelor's degree, and finally a master's degree or above, again all else being equal.
The scenarios considered were chosen because they are related to variables that can be acted upon using policy instruments. For example, decision makers can decide to increase nitrate depollution efforts or better inform respondents about water pricing. For each scenario and each respondent, 200 predictions are made by taking draws from a multivariate normal distribution where the means correspond to the model parameter values at convergence and the variance-covariance corresponds to the robust variance-covariance matrix of the model at convergence. This protocol is analogous to the Krinsky and Robb approach (Krinsky & Robb, 1986, 1990), which is commonly used to calculate standard errors for willingness-to-pay estimates derived from discrete choice models; in this instance, we calculate standard errors for the choice probabilities derived from the different scenarios considered. The results are reported in Figure 1 below (detailed results are provided in Appendix A, Table A1).
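The simulation protocol described above can be sketched in a few lines. The example below is a deliberately simplified illustration under stated assumptions: it uses a hypothetical two-parameter binary logit (a constant and a nitrate coefficient) for a single representative respondent instead of the full hybrid choice model and the 3,506 respondents, and the parameter estimates and covariance matrix are invented for the sketch. Only the nitrate scenario levels and the number of draws (200) come from the text.

```python
import math
import random
from statistics import mean, stdev


def cholesky(cov):
    """Lower-triangular factor L with L L^T = cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def tap_probability(params, nitrates):
    """Hypothetical binary logit: tap water versus a reference with utility 0."""
    const, beta_nitrates = params
    return 1.0 / (1.0 + math.exp(-(const + beta_nitrates * nitrates)))


def krinsky_robb(estimates, cov, nitrates, n_draws=200, seed=1):
    """Mean and standard error of the predicted probability over parameter draws."""
    rng = random.Random(seed)
    low = cholesky(cov)
    probs = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in estimates]
        # Draw from N(estimates, cov) as estimates + L z.
        draw = [est + sum(low[i][k] * z[k] for k in range(i + 1))
                for i, est in enumerate(estimates)]
        probs.append(tap_probability(draw, nitrates))
    return mean(probs), stdev(probs)


ESTIMATES = [0.2, -0.02]                              # hypothetical estimates
ROBUST_COV = [[0.01, -0.0002], [-0.0002, 0.00002]]    # hypothetical robust covariance

# Nitrate scenario levels taken from the text (maximum, quartiles, minimum).
for level in (62.81, 23.74, 15.07, 6.91, 0.0):
    m, s = krinsky_robb(ESTIMATES, ROBUST_COV, level)
    print(f"nitrates={level:5.2f}  mean P(tap)={m:.3f}  s.e.={s:.3f}")
```

In the full procedure, the same draws would be applied to every respondent with the scenario variable overwritten and all other variables kept at their sample values, and the probabilities averaged across respondents for each draw.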
Figure 1 shows that large changes in trends can be achieved with depollution efforts, although whether these efforts should be implemented in practice depends on cost-benefit analysis data, which are neither available nor the focus of this study. Moreover, we observe that the main drinking water choice for respondents with a BSc or an MSc degree or higher is tap water, while it is bottled water for the rest of the population. Of course, a higher education level does not necessarily imply better knowledge of water pricing. Future research could investigate where and how different categories of the population gather information about water pricing, to further improve knowledge dissemination.

Conclusion
In this paper, we use a hybrid choice model to assess the role of perception variables in drinking water choices, which has not previously been done in the existing literature. In doing so, we address the issues posed by the use of perception variables in empirical works on pro-environmental and/or averting behaviors. We argue that such an approach will continue to prove useful not only in the field of environmental economics but also in all empirical work that seeks to associate individual economic choices with perception variables.
From an empirical point of view, we show that the perception of a poor quality of water resources encourages the consumption of bottled water. Although this result is not new, as a known predictor of drinking water choice, quality perception must be incorporated in our modeling work in order to control for its specific effect. Although the perception of water resources quality is mostly informed by sociodemographic variables and nitrate concentration in groundwater, other pollutants could have been considered, such as pesticides. However, available data on pesticides showed poorer granularity, and pesticide concentration was highly correlated with nitrate concentration. We further acknowledge some of the limitations of the data used in this study in a "Data integrity statement and limitations" section (see Appendix A).

Regardless, this modeling strategy allows us to isolate the main effect of interest in this paper, which is the link between water price perception and drinking water choices. In fact, our econometric strategy makes it possible to show that people who are less likely to report the correct price difference between tap water and bottled water are less likely to consume tap water, which has not been shown previously. Taken together, our results suggest that people form perceptions, perhaps even beliefs, about tap water based on the information they have at their disposal, which influence their choice of drinking water at the decision step. However, these perceptions may be

Other tests were performed to confirm the coherence of our findings. For example, respondents who reported knowing how much one cubic meter of tap water costs are also 17.01% less likely to answer "I don't know" to question Price2 and 20.54% more likely to choose "100 times more expensive." These results are significant at the 1% level. They are also 2.23% less likely (at the 10% level) to select the midpoint option (3-Tap water is 5 times more expensive), which could capture a weak "I don't know what to choose" effect for the respondents who reported not knowing the price of one cubic meter of tap water. These results come from a multinomial logit model estimated on the 3,506 observations also used for estimating the hybrid choice model presented in Table 6. Other modeling strategies, such as generalized ordered models, were also suitable, but we considered the multinomial logit to be a more straightforward approach to ensure the clarity of our findings.

A3. Data Collection and Ethical Considerations
As an international polling and market research firm, IFOP is a member of ESOMAR (European Society for Opinion and Market Research) and abides by the ICC (International Chamber of Commerce)/ESOMAR international code. According to the ESOMAR website, "The Code ensures that researchers and analysts working with both traditional and new sources of data continue to meet their ethical, professional and legal responsibilities to the individuals whose data they use in research and to the clients and organizations they serve. It also is intended to protect the right of researchers to seek, receive, and impart information as stated in Article 19 of the United Nations International Covenant on Civil and Political Rights."
(Figure A1) (Table A1) (Table A2).
Efficiency has been measured by following the method outlined by Abou-Zeid and Ben-Akiva (2014) for comparing a hybrid choice model to its standard counterpart. More precisely:
- For each model, the distributions of the choice probabilities have been simulated by taking 200 draws from the multivariate distribution of the parameter estimators (using parameter estimates as means and the robust variance-covariance matrix).
- This gives the standard errors of the choice probabilities.
- The model for which the standard errors are the smallest is the most efficient (in practice, t-ratios are used for comparing results, as predicted probabilities vary slightly across models).
- We find that the HCM presented in the article is marginally more efficient than a naïve approach.

Table 1
Descriptive Statistics
Variables (names of the variables used in the econometric model in parentheses) | Mean

Table 2
Drinking Water Choice and Perceived Quality of Water Resources

Table 3
Nitrate Concentration in Raw Water, Tap Water Consumption and Perceived Water Resources Quality by Geographical Location

Table 4
Price and Quality Perception-Descriptive Statistics

Table 5
Latent Variables and Related Measurement Equations

BEAUMAIS AND CRASTES DIT SOURD

Furthermore, women are less likely to drink bottled water, all other things being equal, and the probability of drinking tap water decreases as income increases. The parameters β tap, drur and β filtered, drur do not affect water consumption choice, although this does not mean that geographical location has no effect on water consumption choice, given that geographical location has an impact on both α quality and α price, as previously stated. This raises the issue of how to further interpret model output when the same variables are introduced in multiple model components.