Investigating (sequential) unit asking: An unsuccessful quest for scope sensitivity in willingness to donate judgments

People exhibit scope insensitivity: Their expressed valuation of a problem is not proportionate with its scope or size. To address scope insensitivity in charitable giving, Hsee et al. (2013) developed the (Classical) Unit Asking technique, where people are first asked how much they are willing to donate to support a single individual, followed by how much they are willing to donate to support a group of individuals. In this paper, we explored the mechanisms, extensions, and limitations of the technique. In particular, we investigated an extension of the technique, which we call Sequential Unit Asking (SUA). SUA asks people a series of willingness-to-donate questions, in which the number of individuals to be helped increases in a stepwise manner until it reaches the total group size. Across four studies investigating donation judgments (total N = 2045), we did not find evidence that willingness to donate (WTD) judgments to the total group increased with larger groups. Instead, our results suggest that Unit Asking (sequential or classical) increases donation amounts only through a single one-off boost. Further, we find evidence in three out of four studies that the SUA extension increases WTD judgments over Classical Unit Asking. In a fifth study (N = 537) using a contingent valuation design (instead of donation judgments), we find scope sensitivity using all asking techniques. We conclude that, while it is difficult to create scope sensitivity in WTD judgments, SUA should be considered a promising approach to increase charitable donations.

tion of human judgment, which describes the phenomenon that people do not scale their valuation of a quantity in proportion to its size or scope (Kahneman et al., 1999). It is commonly discussed in the context of contingent valuation studies, which investigate how people value different goods (Baron & Greene, 1996; Desvousges et al., 1993; Kahneman & Knetsch, 1992; Lopes & Kipperberg, 2020; Veisten et al., 2004). It is additionally relevant to charitable giving (Västfjäll & Slovic, 2020), which is the primary focus of the current research. Prior work, for example, has found that when participants were asked how much money they would donate to buy Christmas presents for 20 children, they donated no more than those asked to help a single child (Hsee et al., 2013). Scope insensitivity has also been well-documented as a cause of neglecting human lives, explaining, for example, the relative indifference in response to some genocides (Cameron & Payne, 2011; Dickert et al., 2012, 2015; Slovic & Västfjäll, 2010). The consequences of this phenomenon can therefore hardly be overestimated. While many papers discuss the problem of scope insensitivity, a relatively minor proportion of the research output has been devoted to potential interventions to overcome it. A notable exception to this is Hsee et al.'s (2013) proposal of the Unit Asking technique.

| UNIT ASKING
Unit Asking involves first asking participants how much they would be willing to donate to help one affected individual before extending to ask how much they would be willing to pay to help many people.
According to Hsee et al. (2013), the underlying mechanism through which Unit Asking reduces scope insensitivity is people's desire for consistency (Ariely et al., 2003; Luisetti et al., 2011; Thomas et al., 2018). When people are first asked about one individual and in the next step asked about a larger number, their desire for consistency drives them to donate an amount that is proportional to the amount that they donated for one individual. Unit Asking has repeatedly been shown to be highly effective in increasing donations over Direct Asking (DA), where participants are solely asked about the total number of people and not initially asked about one individual (Karlsson et al., 2020; Marcinkiewicz, 2016; Simmons, 2013). For example, participants donated more than twice as much to help 20 children when they were first asked to think about how much they would donate to help one child than when they were only asked to think about all 20 children (M = $49.42 vs. M = $18.03; Hsee et al., 2013).

| EXTENDING UNIT ASKING TO SEQUENTIAL UNIT ASKING
Given the applied relevance of increasing charitable donations, we aimed to improve upon the effectiveness of Unit Asking. In their original paper, Hsee et al. (2013) asked about one individual and then about a larger number of affected individuals. A natural step is, therefore, to extend this "Classical" Unit Asking (henceforth CUA) by scaling up the scope sequentially (in a stepwise manner) to larger numbers of affected individuals. For example, when eliciting donations to fund Christmas gifts for 100 children, instead of asking about 1 child and then all 100, one could first ask about 1 child, then 2, and scale up by roughly doubling (e.g., 5, 10, 20, 50) until reaching 100.
We expect that this Sequential Unit Asking (SUA) will increase donations in comparison with CUA, as the repeated questions should provide additional "bite" for a mechanism related to individuals' desire for consistency to exert an influence. The first goal of this manuscript is to test whether the SUA extension increases donations over CUA.

| DOES (SEQUENTIAL) UNIT ASKING MAKE PEOPLE SCOPE SENSITIVE?
We consider scope sensitivity a continuum, with complete scope insensitivity corresponding to a complete neglect of scope and maximal scope sensitivity reflecting linear proportionality (e.g., giving 100 times more to help 100 individuals than 1 individual). In between these two extremes are different magnitudes of scope sensitivity, such as logarithmic sensitivity (e.g., giving log(100) times as much to 100 individuals as to 1 individual; Fechner, 1860), or an even weaker ordinal form of scope sensitivity, which only implies giving more to larger numbers of affected individuals (e.g., giving more to 100 individuals than to 1 individual). In this manuscript, we focus on testing for this weak, ordinal form of scope sensitivity. Consequently, when we use the term "scope sensitivity", it refers to ordinal sensitivity. If an asking technique does not even show this weakest form, it cannot show any of the stronger forms.
Whereas the increase in average donations through CUA has been well-established (Hsee et al., 2013; Karlsson et al., 2020; Marcinkiewicz, 2016; Simmons, 2013), it is not yet clear to what degree this demonstrates scope sensitivity on the part of participants, even the weak ordinal form. For scope sensitivity, participants should give more as the number of affected people increases, even if the number of repeated questions remains the same (e.g., after being asked about one individual, they should donate more to help 10,000 affected individuals than to help 100). Hsee et al. (2013) recognize this in the discussion section of their paper. They cite unpublished data (Hsee, Zhang, & Lu, 2013) demonstrating that people gave more to help a total scope of 100 children than 10 children under CUA (and not DA).1 In contrast to Hsee et al. (2013), an unpublished Master's thesis (Marcinkiewicz, 2016) found CUA to only give a one-off boost to WTD judgments, independent of scope. Marcinkiewicz (2016) asked participants to donate to the charity Global Alliance for Improved Nutrition to help children affected by a food shortage in Mali. He used a CUA technique as described in the previous section, first asking participants how much they would be willing to donate to help one child and then how much they would be willing to donate to help 4, 20, 200, or 2000 children (this scope varied between subjects). While participants donated more to help the group of children than a single child, the group donation did not differ between groups of 4, 20, 200, and 2000 affected children. This finding calls into question whether Unit Asking really increases scope sensitivity in willingness to donate (WTD) judgments or rather gives a single one-off boost. Consequently, more research is required to test whether CUA actually makes participants scope sensitive. Marcinkiewicz (2016) also showed that the higher the scope, the lower people's desire to be consistent. Because SUA presents a smaller increase in scope for each step, we predict that SUA will elicit scope sensitivity where CUA does not. The second goal of this paper is to test whether either CUA or SUA can result in (ordinal) scope sensitivity, such that people give more to larger numbers of affected individuals.

1 We infer these simple effects from the fact that Hsee et al. (2013) report an interaction between asking condition and scope but no main effect of scope. Calculating the exact p-value of the interaction, F(1, 310) = 4.15, p = .042, shows that the claim about scope sensitivity was only weakly supported in Hsee et al. (2013). Hsee, Zhang, & Lu (2013) was, to our knowledge, also never published at a later point.
In the following five studies, we first compare SUA, CUA, and DA (asking only a single question about the full scope) and (a) replicate the beneficial effect of CUA over DA in WTD judgments and (b) show that SUA increases WTD judgments over CUA (Studies 1a and 1b). In Study 2, we test whether CUA and SUA make participants scope sensitive by varying Asking Type and Scope. Study 2 failed to find evidence for scope sensitivity. Study 3 also observed no evidence for scope sensitivity, despite seeking to provide optimal conditions for it. Study 4 switches to a contingent valuation task instead of a WTD judgment as the dependent variable by asking participants how much money would be needed to help x individuals, rather than how much they would donate to help x individuals. In the contingent valuation task, all asking techniques (including DA) showed scope sensitivity. This scope sensitivity was enhanced using SUA, but not CUA.

| STUDY 1A
Study 1a was a nonpreregistered pilot. In this study, we aimed to replicate the benefit of CUA in comparison with DA as in Hsee et al. (2013). In addition, we investigated whether the new SUA method could increase WTD judgments over those observed via CUA.

| Participants
This study received ethical approval from the University of Oxford Central University Research Ethics Committee (reference number R56657/RE001). Participants were paid $0.24 for this 2-min study. We uploaded the study on January 14, 2021 on Positly (https://www.positly.com/), selecting US participants as the target group. Positly is a front-end platform that recruits MTurk participants but adds additional quality metrics (https://www.positly.com/participants/). Positly blocks suspicious IPs, requires high approval rates, and requires participants to consistently pass attention checks. Initially, 406 participants signed up for the study. We excluded three participants who indicated that they were 1036, 1986, and 948 years old. After these exclusions, 403 participants remained, of which 194 were female, 207 were male, and 2 indicated another gender. The mean age was 38.27 (SD = 11.61).2

| Design
Participants were randomly assigned to one of three experimental conditions: DA, CUA, and SUA. The key dependent variable was the amount (in USD) participants were willing to donate to help 100 children, entered as a free response.

| Materials and procedure
Following Hsee et al. (2013), participants were first asked to imagine that the principal of a neighborhood kindergarten had sent them an email to ask for money that would be used to buy Christmas gifts for children:

Imagine the following: Christmas is around the corner. The principal of a neighborhood kindergarten has sent you an email asking for donations. You know her personally and trust her words. The email directs you to a website with the following questions. Please answer these questions as if you were making actual donation decisions.

They were then directed to a site that described a kindergarten with 100 children from low-income families and asked participants how much they would be willing to donate:

Thanks for visiting our website. Please read the following carefully and answer the ensuing questions. Even if you are not willing to make a donation, please still answer the questions; you may simply enter $0. You can revise your answers, and your answers will not be recorded until you move on to the next page.

Our kindergarten currently has 100 children (like the one pictured below), they are all from low-income families. And their parents have little money to buy Christmas gifts for them. We hope you can make a donation, so we can buy Christmas gifts for them.
We used 100 children instead of the 20 used in the original study, since a setup with a larger scope seemed more appropriate to test the sequential asking technique. The experiment had three conditions.
In the control condition (as in Hsee et al., 2013), participants were asked directly how much they would be willing to donate to buy Christmas gifts for 100 children:

Please think about all 100 of these children. How much are you willing to donate to help these 100 children? Please enter the amount of money you decide and agree to donate: __ $

In the CUA condition (as in Hsee et al., 2013), participants first indicated how much they would be willing to donate to help one child and only afterwards (on a separate page) were asked how much they would be willing to donate to help 100 children. In other words, we first asked the following:

Before you decide how much to donate to help these 100 children, please first think about one such child and answer a hypothetical question: How much would you donate to help this one child? Please indicate the amount here: __ $

After filling in this amount, they were asked the following question on the next page:

Now please think about all 100 of these children. How much are you willing to donate to help these 100 children? Please enter the amount of money you decide and agree to donate: __ $

Finally, in the SUA condition (novel to this study), participants were also asked to indicate how much they would be willing to donate to help one child first (as in the CUA condition). However, instead of extending directly to 100 children, we scaled up the scope sequentially by asking (on separate pages) their WTD judgments for 2, 5, 10, 20, 50, and only then 100 children. In general, the increase per step can be determined by the nth root of the full scope, where n is the number of steps (i.e., here, $\sqrt[6]{100}$). This increase is founded in Fechner's law, postulating that the subjective intensity of a stimulus corresponds logarithmically to the stimulus intensity (Fechner, 1860). This law suggests that increasing the number of children by a constant multiplier will correspond to increasing by a constant sum in terms of participants' subjective stimulus intensity. We additionally round to the next multiple of five for small numbers and the next multiple of 10 for larger numbers to make the numbers more intuitive to participants (e.g., in this study, we use 50 rather than 46 as implied by the formula).
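As a rough illustration of the step rule, the following sketch computes the unrounded geometric sequence for a full scope of 100 reached in six steps. The function name `raw_steps` is hypothetical, and the rounding to intuitive numbers is deliberately left out because the text describes it only informally.

```python
def raw_steps(full_scope: float, n_steps: int) -> list[float]:
    """Unrounded scopes asked about after the single-unit question.

    Each step multiplies the previous scope by the n-th root of the full
    scope, so the final step lands on the full scope itself.
    """
    multiplier = full_scope ** (1.0 / n_steps)
    return [multiplier ** k for k in range(1, n_steps + 1)]


# Full scope of 100 children reached in six steps after asking about 1 child.
steps = raw_steps(100, 6)
print([round(s, 1) for s in steps])
```

The raw values (roughly 2.2, 4.6, 10, 21.5, 46.4, and 100) sit close to the 2, 5, 10, 20, 50, and 100 used in the study.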
Participants provided their responses by typing any amount they saw fit (in US dollars) into a free text response box.

Difficulties in analyzing WTD judgments
Open response WTD judgments often result in large outliers due to the lack of an upper bound. Previous papers on Unit Asking address this by winsorizing outliers and then employing a t-test or analysis of variance (ANOVA) (Hsee et al., 2013; Karlsson et al., 2020). Winsorizing replaces values above the 95th percentile of values provided with the value exactly at the 95th percentile, to reduce the skew in the data. However, winsorizing is often not effective. Even after winsorizing, the data are extremely skewed. This can make inferences overly dependent on the small subset of participants that indicate extremely large WTDs. We illustrate this using the data of Study 1a as an example. Figure 1 shows the distribution of WTD judgments in the three different groups after winsorizing. This should give some intuition about how unstable the classical inference using winsorized data is in WTD judgments.
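To make the concern concrete, here is a minimal sketch of upper-tail winsorizing on made-up responses. The data, the function name `winsorize_upper`, and the nearest-rank percentile definition are assumptions for illustration only; they are not the study data or the exact procedure of the cited papers.

```python
import statistics


def winsorize_upper(values, pct=95):
    """Cap every value above the pct-th percentile at that percentile.

    Assumption: the percentile uses the nearest-rank method; other
    percentile definitions give slightly different cut-offs.
    """
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, kept within 1..n.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


# Hypothetical WTD responses in USD (not taken from the paper).
wtd = [5, 10, 10, 20, 20, 20, 25, 50, 50, 50,
       50, 100, 100, 100, 100, 200, 200, 500, 1000, 5000]
winsorized = winsorize_upper(wtd)
print(statistics.mean(winsorized), statistics.median(winsorized))
```

In this toy example only the single largest response is capped, and the winsorized mean (180.5) remains far above the median (50), which is the kind of residual skew described above.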

Our analytic approach
Due to the highlighted limitations associated with winsorizing, we take a different approach in the current manuscript, which provides Bayesian models with lognormal likelihood and nonparametric frequentist solutions to accommodate the skew in observed responses.
Because we cannot meaningfully analyze scope insensitivity for participants who donated $0 (as they would donate $0 for any scope under scope consistency as well as scope inconsistency), and because the lognormal model cannot accommodate zeros, we excluded these participants in the following analyses. Our approach evolved during the project, and this was the preregistered approach for Study 3. This is the analysis we focus on for all studies in the manuscript. An alternative write-up with analyses exactly as preregistered is available in the OSF project. The conclusions are the same in terms of the Bayes factor categories unless stated otherwise in footnotes.
To test the differences between conditions, we specified a Bayesian analysis using the package BRMS (Bürkner, 2017, 2018). We always compared the likelihood of the data under one model assuming a difference in WTD judgments to a model assuming no difference in WTD judgments using the bridgesampling package (Gronau, Sarafoglou, et al., 2017; Gronau, Singmann, & Wagenmakers, 2017). We refer to comparisons of two groups as t-tests and to comparisons of more than two groups as ANOVAs. Because our data were expected to be positive and quite skewed, we used a lognormal likelihood rather than the more common Gaussian likelihood (i.e., the log of each response is modeled as normally distributed with location parameter μ_i, where i denotes the condition). As well as visually confirming the shape of the data, we also validated the likelihood function by comparing lognormal and normal likelihood models with leave-one-out cross-validation (Vehtari et al., 2017), which indicated superior performance of the lognormal models. In addition, the exponent of μ should usually approximate the empirical median of the data on a linear scale, although the two can differ where the data cluster around prominent numbers (e.g., 20, 50). This is observed in our data. We present posterior medians (i.e., exp(μ)) in addition to the empirical medians in our visualization of the results.
The analysis was conditional on participants donating at all (i.e., by removing zeros), and we also removed values larger than $10,000 (and note any instances where inferences are affected by this exclusion). We checked that our intervention did not affect the number of people donating in the first place, using a Bayesian contingency table analysis in the BayesFactor R package (Morey et al., 2015), with assumed sampling type joint multinomial and prior concentration one.
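The exclusion rule can be sketched as a small filter. The function name `apply_exclusions` and the example responses are hypothetical; the only parts taken from the text are the two criteria (drop zeros, drop values above $10,000).

```python
def apply_exclusions(wtd, upper=10_000):
    """Keep strictly positive responses that do not exceed the upper bound.

    Returns the retained responses plus the counts removed by each rule.
    """
    kept = [v for v in wtd if 0 < v <= upper]
    n_zero = sum(1 for v in wtd if v == 0)
    n_large = sum(1 for v in wtd if v > upper)
    return kept, n_zero, n_large


# Hypothetical responses in USD (not taken from the paper).
responses = [0, 20, 50, 0, 100, 25_000, 10_000]
kept, n_zero, n_large = apply_exclusions(responses)
print(kept, n_zero, n_large)
```

Reporting both counts separately mirrors how the exclusions are described in the results sections.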
For the main analysis, we used a prior of normal(μ = 3, σ = 1) on the intercept. We used a prior of normal(0.4, 0.4) on the main effect of Asking Type. Finally, we used a prior of normal(1, 0.5) on σ. As the median of the lognormal distribution corresponds to exp(μ), this implies that we expect a median donation of exp(3) ≈ 20.09 in the control group and exp(3 + 0.4) ≈ 29.96 in the intervention group. In addition, these priors result in reasonable prior predictives for this kind of donation task (see Appendix C; that is, the mean and CI predicted from the priors are similar to values we would expect).
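The implied medians follow directly from exponentiating the prior means, as this short check shows. The variable names are illustrative; only the prior means of 3 and 0.4 come from the text.

```python
import math

intercept_prior_mean = 3.0  # prior mean of the intercept (log scale)
effect_prior_mean = 0.4     # prior mean of the Asking Type effect (log scale)

# The median of a lognormal distribution is exp(mu).
median_control = math.exp(intercept_prior_mean)
median_intervention = math.exp(intercept_prior_mean + effect_prior_mean)

# On the linear scale an additive effect on mu acts as a multiplier.
ratio = math.exp(effect_prior_mean)

print(round(median_control, 2), round(median_intervention, 2), round(ratio, 2))
```

This gives roughly 20.09 and 29.96, so the prior mean of 0.4 corresponds to expecting the intervention to raise the median donation by a factor of about 1.49.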
We use Bayes factors as our primary inference criteria (Etz & Wagenmakers, 2017; Jeffreys, 1961; Kass & Raftery, 1995; Rouder & Morey, 2019; Wrinch & Jeffreys, 1921). A Bayes factor compares the probability of the data under the null (no effect) to the alternative as specified by the prior distributions outlined above. As a general rule of thumb, Bayes factors between 1 and 3 are regarded as anecdotal evidence, Bayes factors between 3 and 10 are regarded as moderate evidence, and Bayes factors larger than 10 are regarded as strong evidence (Jeffreys, 1939; Lee & Wagenmakers, 2013). The inverse of the Bayes factor can be used to describe evidence for the null hypothesis. For example, a Bayes factor between 1/3 and 1/10 would be considered moderate evidence for the null. For robustness, we also analyzed the data with frequentist nonparametric tests.
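The rule of thumb can be written as a small helper. The function name `describe_bf10` is hypothetical, and the treatment of values falling exactly on a boundary (3 or 10) is an assumption, since the text does not specify it.

```python
def describe_bf10(bf10: float) -> str:
    """Label a Bayes factor BF10 using the rule-of-thumb categories.

    Values below 1 are inverted and read as evidence for the null.
    Assumption: exact boundary values fall into the weaker category.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf10 >= 1:
        favoured, strength = "alternative", bf10
    else:
        favoured, strength = "null", 1.0 / bf10
    if strength > 10:
        label = "strong"
    elif strength > 3:
        label = "moderate"
    else:
        label = "anecdotal"
    return f"{label} evidence for the {favoured}"


for bf in (0.50, 4.085, 10.03):
    print(bf, describe_bf10(bf))
```

Applied to values reported later in the paper, 0.50 reads as anecdotal evidence for the null, 4.085 as moderate evidence for the alternative, and 10.03 as strong evidence for the alternative.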

| Effect on proportion of participants donating and summary statistics
A Bayesian contingency table analysis indicated that the intervention did not affect the share of participants donating in the first place (DA: 10.37%, CUA: 13.43%, SUA: 14.18%; BF10 = 0.032; see Appendix D for the share of participants donating at all across conditions in all experiments). Our main analysis excludes WTD judgments of 0 (51 participants) and larger than 10,000 (zero participants).

| Effect of SUA and CUA on donations
The three-level ANOVA on WTD judgments to 100 children indicated overwhelming evidence for an effect of condition (BF10 = 4.65 × 10^9).

| Participants
This study received ethical approval from Harvard University's ethics review board. We paid participants $0.31 for this 2-min study. The study was uploaded on February 3, 2021 on Positly for US participants.
Initially, 507 participants signed up for the study. We excluded one participant who indicated their age as 3963. The mean age was 40.69 (SD = 12.64). A total of 248 participants were female, 256 were male, 1 indicated gender "other", and 1 did not indicate their gender.

| Effect on proportion of participants donating and summary statistics
A Bayesian contingency table analysis indicated that the intervention did not affect the share of participants donating in the first place (DA: 11.76%, CUA: 13.43%, SUA: 14.18%; BF10 = 0.015). We excluded WTD judgments of 0 (60 participants) and larger than 10,000 (3 participants; conclusions are unaffected by this exclusion unless indicated in a footnote). We proceeded to the main analysis with the remaining 443 participants. Figure 4 displays WTD judgments per step after these exclusions. Figure 5 shows the distribution of WTDs for the full Scope of 100 children. The median donation for 100 children in the DA condition was $20, the median donation for 100 children in the CUA condition was $50, and the median donation for 100 children in the SUA condition was $50. Note that if many participants indicated exactly the median, the distribution of WTD judgments between groups may still differ even though the medians are the same. Figure 5 shows that this is indeed the case, and more WTD judgments are above the median for SUA, compared with CUA, which is also reflected in the posterior medians (red line). The median donation to help one child was $18.50 in the CUA condition and $15 in the SUA condition. In line with the random assignment, a Wilcoxon test shows no evidence that SUA and CUA conditions differ in terms of donations for one child (W = 11,267, p = .453).

| Effect of SUA and CUA on donations
The three-level ANOVA indicated overwhelming evidence for an effect (BF10 = 2.63 × 10^11).

| STUDY 2
Studies 1a and 1b showed that SUA elicited higher average WTD judgments for groups of 100 than DA and CUA. However, to demonstrate scope sensitivity, we need to show that the donation is larger when more units are under consideration (e.g., people donate more to help 10,000 people than to help 100 people). Therefore, Study 2 not only varied Asking Type but also the maximum scope (100 vs. 10,000).
When scaling up SUA to a higher scope, there are fundamentally two possible approaches. First, one can keep the increase per step constant. In this case, a larger number of steps will be required to still reach the same full scope for a larger scope size (e.g., 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10,000). Second, one can keep the number of steps constant. This will result in a higher increase per step, which is required to reach a larger scope with the same number of steps (e.g., 1, 5, 20, 100, 500, 2000, 10,000). In Study 2, we investigated both of these options.
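The trade-off between the two approaches can be seen by comparing the implied per-step multipliers. This is a sketch under the assumption that the multiplier is the nth root of the full scope, with n the number of questions after the single-unit question; the function name `per_step_multiplier` is hypothetical.

```python
def per_step_multiplier(full_scope: float, n_steps: int) -> float:
    """Constant factor by which the scope grows at each step."""
    return full_scope ** (1.0 / n_steps)


# Scopes asked about after the single-unit question, as listed in the text.
constant_increase = [2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10_000]
constant_steps = [5, 20, 100, 500, 2000, 10_000]

m_increase = per_step_multiplier(10_000, len(constant_increase))  # 12 steps
m_steps = per_step_multiplier(10_000, len(constant_steps))        # 6 steps

print(round(m_increase, 2), round(m_steps, 2))
```

With 12 steps the multiplier stays at about 2.15, the same as for a scope of 100 in six steps, whereas holding the number of steps at six raises it to about 4.64.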

| Hypotheses
We preregistered three hypotheses:
1. In a 2 × 2 analysis with factors Asking Type (Direct vs. CUA) and Scope (100 vs. 10,000), there will be no interaction of Asking Type and Scope (replicating Marcinkiewicz, 2016).
2. In a 2 × 2 analysis with factors Asking Type (CUA vs. SUA) and Scope (100 vs. 10,000), there will be an interaction of Asking Type and Scope such that SUA minus CUA is positive and larger for Scope 10,000 than Scope 100.
3. If the effect of SUA is partially driven by an increase in the number of steps, we will observe that participants in the increase per step constant condition (SUAI) would donate more for 10,000 children than those in the number of steps constant condition (SUA).4

| Experimental conditions
We employed a similar method to Study 1 with a number of changes.
To allow more realistic scaling up to larger scopes, we replaced the kindergarten example with a vignette describing a food shortage in
6. Scope 10,000 × SUA, number of steps constant (1, 5, 20, 100, 500, 2000, 10,000)
7. Scope 10,000 × SUAI, increase per step constant (1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10,000)
Each Unit Asking condition also included the following, exploratory, desire for consistency measure: "You said you would give X dollars for a single child and Y dollars for the group of N children. In doing so, were you trying to be consistent? That is, were you trying to allocate for each child in the group of N children as much as money as for the single child? (1 = not at all, 4 = somewhat, 7 = yes, absolutely)".
The variables X, Y, and N were adapted based on the participant's responses. The consistency measure was asked at the end of the study to avoid any influences on the key variable of interest (WTD).

| Analyses
We used the same analysis as Study 1b with the addition of a prior of normal(0.25, 0.25) on the interaction. The reason why the priors are increasingly narrow is that, given the lognormal likelihood, the prediction on the linear scale corresponds to the exponent of the marginal mean. Therefore, using similar priors on main effects and interactions would result in the prediction of an extremely large interaction effect on the linear scale. We tested for the presence of the hypothesized interactions by comparing models that include the interaction to models that do not include the interaction using Bayes factors. In addition, we conducted a frequentist analysis using rank-based (nonparametric) tests for main effects and interactions with the package rfit (Kloke et al., 2012) and independent sample Wilcoxon tests when comparing pairwise differences.

| Results
As we did not achieve sufficient evidence on all three critical tests in the preregistered intermittent analyses, we collected the full sample of 562, consistent with our preregistration.

| Effect on proportion of participants donating and summary statistics
A Bayesian contingency table analysis again indicated that the intervention did not affect the share of participants donating at all (BF10 = 0.19; see Table D1 for proportions). Therefore, we proceeded to the main analysis excluding WTD judgments of 0 (133 participants) and larger than 10,000 (exclusions by condition: DA = 2; CUA = 2; SUA (number of steps constant) = 11; SUA (increase per step constant) = 4; conclusions are unaffected by this exclusion unless indicated in a footnote), leaving us with a total of 410 participants.
Figure 6 shows the median WTD judgments per step, and Figure 7 shows the distribution of WTD judgments for the total number of children (see also Table D1). In line with the random assignment, a Wilcoxon test shows no evidence that SUA and CUA conditions differ in terms of donations for one child (W = 10,992, p = .767).5

| H1: Comparing CUA and DA
The first hypothesis, that CUA would not make participants scope sensitive, received only weak support. A Bayesian interaction test in a 2 × 2 ANOVA with factors Scope (100 vs. 10,000) and Asking Type (DA vs. CUA) implied weak evidence against an interaction of Asking Type and Scope (BF10 = 0.50).6 In addition, there was strong evidence for a main effect of CUA versus DA (BF10 = 10.03).7 A nonparametric ANOVA also found no evidence for an interaction of Asking Type and Scope, F(1, 228) = 0.49, p = .488, and a significant main effect of CUA versus DA, F(1, 228) = 5.11, p = .025.

| H2: Comparing SUA and CUA
We found weak evidence against the second prediction that SUA would result in more scope consistency than CUA, as revealed by an interaction between SUA (number of steps constant) versus CUA and Scope (BF10 = 0.64).8 As in the previous study, we found a main effect of SUA versus CUA (BF10 = 11.70),9 although the evidence here is much weaker. These findings were corroborated with a nonparametric, frequentist analysis finding no interaction, F(1, 247) = 0.35, p = .552, and a (just) significant main effect of SUA versus CUA, F(1, 247) = 3.96, p = .047. Further, when only comparing the increase in WTD judgments under SUA versus CUA for Scope 100, which is similar to Study 1, the evidence for the effect is only moderate (BF10 = 4.085).10 This is surprising given the strong evidence found in Study 1.

| H3: Comparing different types of SUA
We found no evidence that donations were higher for 10,000 children in the SUAI (increase per step constant) condition than the SUA (constant number of steps) condition, as revealed by a t-test (BF10 = 0.67).11 This is also supported by a nonparametric Wilcoxon test, W = 1473, p = .949.

| Exploratory analyses: Investigating judgments in intermediate steps and desire for consistency
The donation patterns so far give rise to some interesting avenues for more detailed examination. Two conditions used identical sequences of steps up to a scope of 100: SUA with constant increase per step for Scope 10,000 (dotted gray line in Figure 6) and SUA for Scope 100 (solid black line in Figure 6). This raises the question of why the WTD judgments are not higher for the full scope in SUAI with constant increase per step for Scope 10,000, given the larger number of affected individuals as well as larger number of steps. Two explanations come to mind: (1) participants reach some kind of donation ceiling after which they do not donate more and (2) participants (who were informed in the beginning what the maximum scope would be) think ahead and donate less for one child when they know the full scope will be higher. In other words, they might first form a judgment about what to donate according to the full scope and then donate a proportional fraction of this to one child. Table 1 shows the share of participants that indicate a strong monotonic increase at each step (i.e., donate more for more children). Table 1 confirms what is suggested by Figure 6. In the Scope 100 condition, more participants increase on each step in comparison with the Scope 10,000 condition. In addition, the proportion of participants increasing is lower for the larger scopes in the Scope 10,000 condition. In other words, both looking ahead and reaching a donation ceiling may play a role in this donation behavior.

5 The reason why the medians in the figure for one child are different even though there is no significant difference is that participants cluster their responses around prominent numbers (i.e., 10, 20, 50). Therefore, only a few participants changing their judgment can result in a jump between two prominent numbers.

6 We found strong evidence under the preregistered model.
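A per-step share of strictly increasing responses, of the kind summarized in Table 1, could be computed along these lines. The data and both function names are hypothetical, and the sketch assumes every participant answered every step.

```python
def share_increasing_per_step(responses):
    """Share of participants whose WTD strictly rises at each step.

    `responses` holds one WTD sequence per participant, ordered by scope.
    """
    n_steps = len(responses[0]) - 1
    shares = []
    for step in range(n_steps):
        n_up = sum(1 for r in responses if r[step + 1] > r[step])
        shares.append(n_up / len(responses))
    return shares


def count_increases(sequence):
    """Number of strict increases within one participant's sequence."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if b > a)


# Hypothetical WTD judgments for scopes 1, 2, 5, and 10 (not study data).
responses = [
    [5, 10, 20, 40],
    [10, 10, 20, 20],
    [20, 20, 20, 20],
    [1, 2, 5, 10],
]
shares = share_increasing_per_step(responses)
counts = [count_increases(r) for r in responses]
print(shares, counts)
```

The per-participant count returned by `count_increases` is one way to operationalize the "number of strong monotonic increases" used as an outcome in the exploratory regression.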
Which kinds of participants are more likely to keep increasing their donations, and which are more likely to drop out? One hypothesis is that participants who donate more for one child are more likely to drop out later, as they are more likely to run out of money. To test this, we conducted a linear regression predicting the number of strong monotonic increases from the WTD judgments for a single child. However, we did not find any evidence for the notion that WTD judgments for one child were related to monotonic increases for any of the three SUA conditions (maximum Scope 100: F(1, 79) = 2.18, p = .501; maximum Scope 10,000 and increase per step constant: F(1, 77) = 0.764, p = .385; and maximum Scope 10,000 and number of steps constant: F(1, 77) = 0.666, p = .417). Finally, we found evidence against an effect of Asking Type on our exploratory measure, "desire for consistency" (BF10 = 0.055), which we included in line with Marcinkiewicz (2016). However, this might not rule out "desire for consistency" as an explanation for the effect given the likely insensitivity of this one-item measure.

| Discussion
We again replicated the benefit of SUA over CUA; however, the evidence was much weaker than in Study 1 (though still moderate in strength). We found no evidence that unit asking increased scope sensitivity in this study. This finding holds for SUA and CUA: both techniques only give a constant boost independent of scope. In other words, repeated asking leads people to indicate higher WTD judgments, but this effect appears independent of Scope (the number of children affected). Perhaps the most surprising result is that we do not observe scope sensitivity in the SUAI (increase per step constant) condition.
To explain this result, we suggest that some participants might reach a donation ceiling (as suggested by the smaller number of participants displaying monotonic donation patterns for the larger number of affected individuals; see Table 1). Alternatively, some participants might seek to ensure consistency by donating less to a single child when they had an idea of what the maximum scope would be. Finally, some participants might simply find the large number of repetitive questions that were asked in this condition unpleasant or irritating and, therefore, disengage from the task. Study 3 included alternative optimal conditions for observing scope sensitivity on the basis of these conjectures.

| STUDY 3
Study 3 aimed to provide the most favorable conditions for scope sensitivity. To reduce the likelihood of participants reaching a donation ceiling, we reduced the maximum scope in this study from 10,000 to 50. Achieving scope consistency for such a smaller scope is likely more realistic, and Hsee et al. (2013, p. 1806) would appear to agree: "We should also note, however, that the ability of unit asking to increase scope sensitivity is likely limited; if the target numbers are large, for example, 1000 versus 10,000, respondents may encode either number as 'a lot' and not differentiate the two". To avoid participants planning forward when knowing the maximum scope, we included conditions where we do not tell participants the maximum scope in advance of the iterative procedure. Finally, we also reverted to the kindergarten vignette that we had used in Study 1, which showed the strongest benefit of SUA over CUA. Overall, we aimed to test the following four preregistered research questions in this study:
1. Can we identify scope sensitivity for any of the asking techniques?
3. Does SUA make participants more scope sensitive than CUA?
4. Does telling people the maximum scope beforehand affect scope sensitivity?
7.1 | Method
2, 5, 10, 20, 50)
We did not include a DA condition this time as, at this point, it is well-established that people are scope insensitive under DA and that both CUA and SUA increase WTD judgments relative to this baseline.

| Effect on proportion of participants donating and summary statistics
We used the same statistical model as in Study 2. Asking Type did not affect the number of participants donating in the first instance (BF10 = 0.03; see Table D2 for proportions). We proceeded to our main analysis, excluding participants that donated 0 (70 participants) and participants that donated more than 10,000 (0 participants).
Figure 8 visualizes the median donation trajectories for the different steps, and Figure 9 shows the distributions of WTD judgments for the full scope (see also Table D2). In line with the random assignment, a Wilcoxon test shows no evidence that SUA and CUA conditions differed in terms of donations for one child (W = 26,393, p = .916).

| RQ1: Can we identify scope sensitivity in any format?
We tested this with four comparisons between the WTD judgments for 50 versus 10 individuals for both CUA and SUA (i.e., in terms of the Experimental Conditions 1 vs. 2, 3 vs. 4, 3 vs. 7, and 5 vs. 6). We found moderate evidence that participants donate the same for 50 vs. 10 individuals for CUA (BF10 = 0.29).
For SUA, we compared the Scope 10 condition to the Scope 50 condition (number of steps constant) and the Scope 50 condition (increase per step constant). We found no evidence for either the null or alternative hypothesis for the number of steps constant comparison (BF10 = 0.86). For the increase per step constant comparison, we found weak evidence that participants donated more with larger scopes (BF10 = 2.55). In addition, for SUA (number of steps constant), we also have one condition where participants knew the maximum scope when starting the experiment. Here we find weak evidence for participants donating less on the larger scopes (BF10 = 0.36). In sum, Study 3 did not reveal convincing evidence for scope sensitivity in any format. This lack of evidence is also corroborated by nonparametric Wilcoxon tests. These tests indicate no evidence for scope consistency under CUA (p = .357), no evidence for scope consistency under SUA (number of steps constant, p = .241), no evidence for scope consistency under SUAI (increase per step constant, p = .077), and no evidence for scope consistency when telling people the maximum scope beforehand (p = .883).
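To make the verbal evidence labels easier to follow, the sketch below maps a Bayes factor BF10 onto the "+"/"−" notation used in the note to the summary table. The cut-offs of 3 and 10 are an assumption based on a common convention; the manuscript does not state its thresholds in this section, and values close to 1 are described in the text as providing no evidence either way. The function name is hypothetical.

```python
def evidence_label(bf10: float, moderate: float = 3.0, strong: float = 10.0) -> str:
    """Map a Bayes factor BF10 to a symbolic evidence label.

    "+", "++", "+++" : weak, moderate, strong evidence for H1
    "-", "--", "---" : weak, moderate, strong evidence for H0

    The thresholds (3 and 10) are assumed conventions, not values
    taken from the manuscript. BF10 == 1 is labeled "+" by this sketch.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf10 >= 1:
        strength, sign = bf10, "+"
    else:
        strength, sign = 1.0 / bf10, "-"
    if strength >= strong:
        count = 3
    elif strength >= moderate:
        count = 2
    else:
        count = 1
    return sign * count


# Bayes factors reported in the text for the Study 3 scope comparisons.
labels = {bf: evidence_label(bf) for bf in (0.29, 0.86, 2.55, 0.36)}
```

Under these assumed cut-offs, the CUA comparison (BF10 = 0.29) falls in the moderate range for the null, whereas the remaining three comparisons fall in the weak range.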
7.2.4 | RQ3: Does SUA make participants more scope sensitive than CUA?
For the interaction between Asking Type (SUA [number of steps constant] vs. CUA) and Scope (10 vs. 50; i.e., 1 and 2 vs. 3 and 4), we find no evidence that SUA increases scope consistency in comparison with CUA (BF10 = 1.17). When instead testing this interaction using SUAI

FIGURE 8 Median willingness to donate (WTDs) for each step in Study 3. Error bars represent the interquartile range.

FIGURE 9 Willingness to donate (WTD) distribution for the full scope in Study 3. *SUA conditions where the maximum Scope was unknown to participants. In all other conditions, the maximum Scope is known. Gray area indicates empirical density. Black lines indicate empirical medians. Red lines indicate posterior medians (i.e., the median of the posterior distribution for this parameter). SUAI is for Sequential Unit Asking (SUA) with increase per step constant.
When testing an interaction between telling people the maximum scope before and Scope (i.e., 3 and 4 vs. 5 and 6), we find no evidence that telling people the maximum scope beforehand increases scope sensitivity (BF10 = 0.42) for the SUA conditions. This is also confirmed by a robust frequentist ANOVA, F(1, 281) = 0.

In this study, we did not find evidence for the main effect of SUA. Importantly, we also did not find evidence against an effect of SUA. To investigate whether this result is driven by the smaller scope in this study, we compared the SUA versus CUA effect in this study (for the condition with Scope 50) to that in Study 1a (Scope = 100), where we found the strongest effect of SUA. We tested the interaction between Condition (SUA vs. CUA) and Study to investigate whether the difference in conditions was affected by the study. As we did not have a Unit Asking condition where the scope was known in advance in Study 3, we used the conditions where the scope is not known for this comparison. We only found weak (and nonsignificant) evidence for this interaction (Bayesian analysis: BF10 = 2.32; nonparametric ANOVA: F(1, 373) = 1.400, p = 0.237). We conclude, therefore, that the effect of SUA versus CUA does not reliably differ across the studies.

Effect of SUA vs. CUA across all donation studies
To further test the overall effect of SUA, we pooled the data from all four studies and one study presented in Appendix A, which employed a different design that likely diminished the effectiveness of SUA. We tested for an overall effect of SUA versus CUA across these five studies using a Bayesian mixed-effects model with random effects for Study and Scope. We used the same priors as in Study 3 and brms' default priors on the random effects.13 When pooling across studies, we only find moderate evidence for an effect of SUA in comparison with CUA (BF10 = 3.65). If we only look at the studies in the main text of this manuscript, the evidence is somewhat stronger (BF10 = 4.50).
In conclusion, when pooling across all relevant studies, there is moderate evidence that SUA results in higher WTD judgments for helping multiple individuals than does CUA.

| No evidence for scope sensitivity under CUA and SUA
We again did not observe any evidence for scope sensitivity, even after creating these most favorable conditions to observe scope sensitivity in donation judgments. We believe that the two most likely explanations for this result are as follows:
1. Even with reduced scope, participants' budget constraints limit their donation judgments to an extent that scope sensitivity may not be observed with WTD judgments.
2. None of the asking techniques promotes scope sensitivity in general.
If the first explanation is correct, we would expect to see evidence for scope sensitivity in a contingent valuation version of the study. In other words, instead of asking participants how much money they would donate to buy Christmas gifts for the children, one would ask how much money they think is required to buy Christmas presents for these children. As no willingness for a personal donation is asked for, budget constraints would no longer explain the lack of scope sensitivity. We tested such a contingent valuation setup for Study 4.

For contingent valuation judgments, we only find evidence for a benefit of SUA over CUA in average judgments when using the increase per step constant condition, SUAI, rather than the number of steps constant condition. We further find evidence for an interaction between Scope and CUA versus SUA, for both SUA types. In other words, SUA may only increase contingent valuation judgments if (1) the scope is larger than 10 and (2) a sufficient number of steps is asked (in this case five steps). Further, we find that people are scope sensitive in a contingent valuation setting even under DA. Notably, this scope sensitivity can further be enhanced using SUA but not using CUA.

| GENERAL DISCUSSION
We tested whether Unit Asking makes people scope sensitive as claimed in previous research. We found that Unit Asking only gives a one-off boost to WTD judgments, independent of scope. In other words, participants donated more under Unit Asking as opposed to DA; however, this increase was independent of the number of individuals affected and, therefore, does not seem to reflect genuine scope sensitivity.
We also introduced a new variant of Unit Asking, which we call SUA. SUA extends CUA by asking a sequence of questions scaling up with scope. We found evidence in three out of four studies that SUA increased WTD judgments over CUA. In addition, when pooling across all studies, we found overall evidence that SUA increased WTD judgments over CUA. However, this increase also seems to come only as a limited series of one-off boosts rather than covarying with scope.
We further investigated contingent valuation judgments, where we found the inverse pattern in comparison with WTD judgments. People were scope sensitive under all of the asking techniques, but CUA and SUA did not strongly increase judgments in comparison with DA. Table 2 gives an overview of the results across all five studies. Finally, we advanced the methodology for analyzing WTD judgments. We showed that the oft-used method of winsorizing and using conventional t-tests is problematic, as even after winsorizing the inference is not robust to outliers. Instead, we used a Bayesian model with lognormal likelihoods that can directly accommodate skew. We share the analysis code in our OSF project so that other researchers might use the methodology when analyzing WTD judgments or other judgments with similar distributional properties (i.e., extreme positive skew and zeros are reasonably excluded).

| Why is it Difficult to Create Scope Sensitivity in WTD Judgments and Easier in Contingent Valuation?
A possible explanation for the difficulty of making people scope sensitive in WTD judgments using the different Unit Asking manipulations is based on mental accounting. In line with research on this topic (Sussman et al., 2015; Thaler, 1985, 1999), participants might have a fixed budget of how much they are willing to give to charity. This allocated money might already be exhausted on relatively small numbers of children; therefore, participants do not usually increase their WTD judgments anymore for larger scopes in the Unit Asking condition. This is also reflected in comments that we got from participants in Study 2, where we included an open feedback box. For example, one participant indicated "I have a certain amount total I'm willing to give ($15), so after that is reached I'm not willing to give anymore," and another participant said "I have limited funds so I can only donate so much regardless of the number in need." Versions of these comments were echoed by a considerable proportion of participants. Even though we tried to mitigate the problem by reducing the maximum scope in Study 3, it is still possible that most participants had reached their donation ceiling on both scopes, thus diminishing scope sensitivity.
In contrast, the contingent valuation setting removes this possibility: participants are asked how much money is needed to buy Christmas presents. This explains why participants are more readily scope sensitive even without using CUA or SUA. This baseline scope sensitivity can be further enhanced by SUA. However, we only observe a weak form of scope sensitivity, where participants donate more for more children, but this number does not increase proportionally with the scope, a finding that is qualitatively in line with previous literature on scope insensitivity (Dickert et al., 2012, 2015).
Overall, we conclude that when (hypothetical) personal funds are at stake, participants are reluctant to even show weak scope sensitivity, and only strong manipulations such as repeated asking will induce increases in donations. On the other hand, people readily show scope sensitivity for contingent valuation judgments.
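The distinction between no, weak, and proportional scope sensitivity can be made concrete with a simple index. The sketch below is an illustrative assumption and not a measure used in the manuscript: it divides the log ratio of two typical judgments by the log ratio of the corresponding scopes, so that 0 indicates scope insensitivity, 1 indicates judgments proportional to scope, and values in between indicate the weak form described above. The function name and the example values are hypothetical.

```python
import math


def scope_elasticity(value_small: float, value_large: float,
                     scope_small: float, scope_large: float) -> float:
    """Log ratio of judgments divided by log ratio of scopes.

    0 means the judgment does not change with scope, 1 means it scales
    proportionally, and values between 0 and 1 mean it increases less
    than proportionally.
    """
    if value_small <= 0 or value_large <= 0:
        raise ValueError("Judgments must be positive.")
    if scope_small <= 0 or scope_large <= scope_small:
        raise ValueError("Scopes must be positive and increasing.")
    return math.log(value_large / value_small) / math.log(scope_large / scope_small)


# Hypothetical typical judgments for scopes of 10 and 50 individuals.
insensitive = scope_elasticity(20, 20, 10, 50)     # same judgment for both scopes
weak = scope_elasticity(20, 40, 10, 50)            # higher, but less than fivefold
proportional = scope_elasticity(20, 100, 10, 50)   # fivefold judgment for fivefold scope
```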

| Future directions
In this manuscript, we have focused solely on hypothetical WTD judgments. A standard suggestion for future research would therefore be to include incentive compatible studies. We do not, however, perceive these as a fruitful avenue for future research in this area for three reasons. First, in the current studies, our observations of different results between WTD judgments and contingent valuation judgments, coupled with participants' references to budgetary constraints in open-text feedback, suggest that our participants did take the hypothetical judgments seriously and were restricted by real-world budgets. Second, the long ongoing debate about how much financial incentives change participant behavior in the social and behavioral sciences (Camerer & Hogarth, 1999; Hertwig & Ortmann, 2001) mostly concludes that "the effects of incentives are mixed and complicated" (Camerer & Hogarth, 1999, p. 1). While research has found that reliance on hypothetical donations increases mean donations (Bekkers, 2017), to our knowledge, there is no evidence that hypotheticality influences differences between conditions in WTD judgments. Finally, incentive compatible studies typically provide an endowment and ask participants to donate a proportion of this (e.g., Schoenegger & Costa-Gomes, 2022; Small et al., 2007), which is a considerably different task from the majority of real-life donation decisions. Given that our participants appeared to take the WTD task seriously (as evidenced above), we would see such an incentive compatible task as more different from real donation decisions than our hypothetical tasks, especially as it induces an upper bound on how much participants can donate (the endowment), which would make it more difficult to study scope sensitivity.

Note: "+++" denotes strong evidence for H1, "++" denotes moderate evidence for H1, "+" denotes weak evidence for H1, "−" denotes weak evidence for H0, "−−" denotes moderate evidence for H0, and "−−−" denotes strong evidence for H0. Medians and means are reported here including donations of 0. SUA uses the number of steps constant condition for scaling up the scope. For Study 2, SUA refers to the condition where the maximum scope was not known, for comparability. As the judgments differed strongly based on scope only for Study 4, we distinguish between Scope 10 and Scope 50 for this study. *Unlike CUA, SUA further increased scope sensitivity above DA. Abbreviations: DA, Direct Asking; CUA, Classical Unit Asking; SUA, Sequential Unit Asking.
While we do not see the hypotheticality of the current judgments as limiting the generalizability of the current results, we suggest that future research should seek to generalize them beyond an experimental setting. For real donation decisions, there may be a trade-off between the "boost" provided by SUA and the need to maintain potential donors' attention and reduce the number of questions asked. This question could not be answered within any lab/experimental context. Regardless of whether incentives are offered or not, experimental participants have a reasonable expectation of answering a series of questions. To determine how to maintain potential donors' interest and leverage the benefits of SUA in the real world, a naturalistic field study would be required. Such a study would enable stronger conclusions as to the effectiveness of SUA in increasing charitable donations "in the wild." Further, in line with previous studies on Unit Asking (Hsee et al., 2013; Karlsson et al., 2020), we only include a picture of one needy

| Conclusion
This paper showed the difficulty in inducing scope sensitivity in WTD

Figure 2 visualizes the median WTD judgments for all WTD questions asked (i.e., per step), whereas Figure 3 visualizes the WTD judgments only for the total scope of 100 children. The median donation for 100 children in the DA condition was $20, the median donation for 100 children in the CUA condition was $25, and the median donation for 100 children in the SUA condition was $75. For one child, the median donation was $10 in both intervention conditions. In line with the random assignment, a Wilcoxon test shows no evidence that SUA and CUA conditions differed in terms of donations for one child (W = 6526, p = .774).
Ethiopia. Participants were told that in one area of the country, 100 or 10,000 children were affected by the shortage and asked how much they wanted to donate to the charity "Global Alliance for Improved Nutrition" to help the children in need (based on Marcinkiewicz, 2016). The combinations of Scope and Asking Type required to answer our Research Questions resulted in seven conditions. Participants were randomly assigned into one of the following conditions (with the exact sequence of the number of children men-

7.3 | Discussion Study 3
7.3.1 | Effect of SUA versus CUA
No evidence for a difference in Study 3

8 | STUDY 4
To switch to a contingent valuation setup in Study 4, we replaced the question about donations in Study 3 with the following question:
Please think about all 50 [10 in the lower scope condition] of these children. How much money do you think is needed to buy Christmas gifts for these 50 children? Please indicate the amount here: __$
We modified the background scenario from Study 3 by telling participants that they are active in the local community and occasionally give advice to people in their community when they have admin-related questions. In a next step, we informed them that the principal of a neighborhood kindergarten has contacted them to ask how much money they think is needed to buy Christmas gifts for the children. We elicited the contingent valuation judgments with the different asking techniques specified in Study 3. The aim of this study was to test the following preregistered research questions:
1. Does CUA increase judgments over DA?
2. Does SUA increase judgments over CUA?
3. Which asking techniques induce scope sensitivity?
8.2.6 | Conclusion Study 4

FIGURE 10 Median contingent valuation judgments for each step in Study 4. Error bars represent the interquartile range.
FIGURE 11 Contingent valuation distribution for the full scope in Study 4. Gray area indicates empirical density. Black lines indicate empirical medians. Red lines indicate posterior medians (i.e., the median of the posterior distribution for this parameter). SUAI is for SUA with increase per step constant.
child and do not increase the number of children in the picture as scope increases. The fact that the Unit Asking method increases average donations provides evidence that participants are able to scale their concern with the number of children, at least in an ordinal way, even when visualizations of the scope are not provided. However, adding visualizations to represent additional children may help induce scope sensitivity in combination with (Sequential) Unit Asking. Identifying appropriate visualizations might therefore be one avenue for future research to further increase scope sensitivity (see, e.g., this educational video for visualization/animation techniques: https://www.youtube.com/watch?v=LEENEFaVUzU). Finally, future work may also explore different sequences of steps. Study 2 showed that the effect of SUA levels off when including too many steps. It would be interesting to further investigate at which point the effect of SUA begins to level off (although leveling off would likely depend on a number of contextual factors, such as the maximum scope). In addition to changing the number of steps, one could also change the ordering of steps or include steps in random order to further investigate how the effects are shaped by the step order.
Densities of willingness to donate (WTD) judgments in Study 1a after winsorizing.Black is the Direct Asking (DA) group, red the Classical Unit Asking (CUA) group, and blue the Sequential Unit Asking (SUA) group.
Pairwise comparisons indicated strong evidence for an effect of CUA in comparison with the DA condition (BF10 = 33.08), overwhelming evidence for SUA in comparison with the DA condition (BF10 = 22.43 × 10^10) and strong evidence for SUA in comparison with CUA (BF10 = 22101.89). The results were also corroborated with a frequentist analysis using nonparametric Wilcoxon tests (all p-values

Median willingness to donate (WTDs) for each step in Study 1a. Error bars indicate the interquartile range.

FIGURE 3 Willingness to donate (WTD) distribution for the full scope of 100 in Study 1a. Gray area indicates empirical density. Black lines indicate empirical medians. Red lines indicate posterior medians (i.e., the median of the posterior distribution for this parameter).
condition (BF10 = 18.46 × 10^8), and strong evidence for SUA in comparison with CUA (BF10 = 5643.68). The results are corroborated with a frequentist analysis using nonparametric Wilcoxon tests (all p-values < .01).

FIGURE 4 Median willingness to donate (WTDs) for each step in Study 1b. Error bars indicate the interquartile range.

FIGURE 5 Willingness to donate (WTD) distribution for the full scope in Study 1b. Gray area indicates empirical density. Black lines indicate empirical medians. Red lines indicate posterior medians (i.e., the median of the posterior distribution for this parameter).
This study received ethical approval from the Ethics Chair for the Department of Experimental Psychology, UCL (Project ID No: EP/2021/001). We paid participants between $0.40 and $0.65 for participating in a 3- to 5-min study (depending on condition). The study was uploaded on August 24 and 25, 2021, targeting US participants via Positly. Based on our preregistered stopping rule, we used

This study received ethical approval from the Ethics Chair for the Department of Experimental Psychology, UCL (Project ID No: EP/2021/001) and was preregistered at https://osf.io/ezrs9. We paid participants $0.37 for participating in a 2-min study. The study was uploaded between the fourth and sixth of November 2021, targeting US participants via Positly. Initially, 574 participants signed up; no participants were excluded by age, and none failed the attention check. The mean age was 40.31 (SD = 12.62). A total of 289 participants were female, 279 were male, and 6 reported "other".

6. Scope 50 × SUA, number of steps constant (1, 4, 15, 50) [participants know maximum scope when answering first question]
7. Scope 50 × SUAI, increase per step constant (1,

FIGURE 6 Median willingness to donate (WTDs) for each step in Study 2. Error bars indicate interquartile range.

TABLE 2 Summary of results across all four studies.

Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/bdm.2335 by University College London UCL Library Services, Wiley Online Library on [27/10/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
judgments. Contrary to previous claims, Unit Asking does not increase scope sensitivity in separate evaluation but instead gives a one-off boost to WTD judgments. However, our findings also suggest that this one-off boost is increased by asking additional stepwise questions, a technique we term SUA.

FIGURE C.1 Prior predictives for the lognormal model.

APPENDIX D: SHARE OF PARTICIPANTS DONATING FOR DA, CUA, AND SUA CONDITIONS IN THE DIFFERENT STUDIES

TABLE D1 Median WTD judgments for the seven conditions of Study 2.
Abbreviations: DA, Direct Asking; CUA, Classical Unit Asking; SUA, Sequential Unit Asking; WTD, willingness to donate.
a SUA scaling up with increase per step constant.

TABLE D2 Median WTD judgments for the seven conditions of Study 3.
Abbreviations: CUA, Classical Unit Asking; SUA, Sequential Unit Asking.
a SUA scaling up with increase per step constant.