Zooming in on Ambiguity Attitudes

Empirical studies of ambiguity attitudes to date have focused on events of moderate likelihood. Extrapolation to rare events requires caution. In an Ellsberg-like experiment with very unlikely events, we measured ambiguity attitudes with neither assumptions on subjects' beliefs nor restrictions to specific ambiguity models. Very unlikely events were overweighted, being weighted more strongly in isolation than when part of larger events. Using latent profile analysis, we classified the subjects in terms of deviations from ambiguity neutrality. One third behaved close to ambiguity neutrality. The others exhibited overweighting of rare events. Such behavior can lead to money-pump situations.


1. INTRODUCTION
Ambiguous rare events are pervasive in various fields of economics. Ambiguous rare events related to losses are events against which people may wish to insure. Policies for preventing or coping with environmental catastrophes also concern ambiguous rare events. Neglect of rare events can explain recent financial crises, as argued by Taleb (2007) in his book The Black Swan. An example of a rare event in the gain domain is finding a so-called "unicorn," a start-up whose value exceeds one billion dollars. The occurrence of bubbles in the evaluation of high-tech start-ups, such as the "dot-com bubble" at the end of the 1990s or the more recent Silicon Valley tech bubble, can be a sign that these rare events are overweighted by investors. Kahneman and Tversky (1979) observe that rare events are typically either completely neglected or overweighted. For decision making under risk, the common view in the literature is that low-probability events are overweighted (e.g., Tversky and Kahneman, 1992; Gonzalez and Wu, 1999). However, the picture is not so clear when we consider other decision paradigms. Recent research in psychology has shown that if unlikely events are not described but rather experienced by agents, such events tend to be partially neglected or underweighted (see, for instance, Barron and Erev, 2003; Hertwig et al., 2004; Hertwig and Erev, 2009). Regarding ambiguity, Ellsberg notes in his thesis (republished in 2011) that the common finding of ambiguity aversion might be due to the focus on moderate-likelihood events and gains and that the results might be reversed if unlikely events were to be considered. Several papers have confirmed this conjecture for events with likelihoods in the range of 10%-30% (e.g., Ert and Trautmann, 2014; Baillon and Bleichrodt, 2015; Dimmock et al., 2016), but Ert and Trautmann (2014) also show that experiencing unlikely ambiguous events made them less attractive.
In this article, we zoom in on very unlikely events in an Ellsberg-like experiment. Despite the numerous studies on ambiguity attitudes conducted in recent decades (see Trautmann and Van De Kuilen, 2016, for a survey), rare events have been virtually ignored to date because of three major challenges. The first challenge is to provide sufficiently high incentives to consider such rare events. In our experiment, subjects could either lose an initial endowment of € 300 (loss frame) or win an equal amount (gain frame). The second challenge lies in identifying ambiguity attitudes generated on top of risk attitudes when moving from decision making under risk to decision making under uncertainty. We use matching probabilities to address this issue. The matching probability for an event E is the objective probability p that makes a decision maker indifferent between receiving a nonzero outcome € 300 with probability p and receiving € 300 if event E occurs. Dimmock et al. (2016) formally show that matching probabilities can directly capture ambiguity attitudes without requiring correction for utility or probability weighting. We generalize their result and develop an approach that is valid for all ambiguity models and all decision models under risk. The third challenge is related to controlling for people's unknown beliefs. A decision maker may truly believe that an event is impossible, and we should not misinterpret such behavior as neglecting a rare event. Therefore, we do not use arbitrary benchmarks to assess overweighting or ignorance of very unlikely events. Instead, we compare matching probabilities only with themselves and study their internal consistency, as proposed by Baillon et al. (forthcoming). We do this through the use of additivity measures. Intuitively, if an unlikely event is neither ignored nor overweighted, it should be assigned the same subjective value (matching probability) either in isolation or as part of a larger event.
Hence, matching probabilities should be additive under ambiguity neutrality. If unlikely events are weighted more strongly in isolation (overweighting), then the matching probabilities will be said to be subadditive. Neglecting or underweighting would result in the opposite violation of additivity (superadditivity).
Below, we begin by defining the theoretical framework and highlighting its advantages (Section 2). We show that our approach is as general as possible and does not rely on any strong assumptions. After describing the experiment (Section 3), we pursue our minimal-assumption approach for empirical analysis as well (Section 4). We first use nonparametric tests to examine whether additivity is violated and, if so, whether it is violated differently in the gain frame and in the loss frame. Our analysis reveals that very unlikely events were not ignored but rather were overweighted overall, and more so in the loss frame. Second, to study heterogeneity of behavior, we use latent profile analysis (LPA) with completely free parameters. This approach allows us to extract several behavioral profiles from the data without ex ante assumptions regarding what these profiles should be. Simultaneously, subjects are classified in accordance with these profiles. One of these profiles is close to ambiguity neutrality and represents approximately one-third of the subjects. The other profiles consist of mild and extreme deviations from ambiguity neutrality, all in the sense of the overweighting of rare events. Finally, we discuss the interpretation and possible consequences of our results (Section 5). Agents who assign greater weight to events in isolation than combined might be exploited in the form of money pumping, for instance, by splitting an insurance contract into subcontracts. The conclusion is presented in Section 6.
2. THEORETICAL FRAMEWORK
2.1. Notation and Matching Probabilities. The state space is finite and is denoted by $S$. It contains all possible states of nature. Only one state is realized, but it is unknown which one. Subsets of $S$ are called events, and each event is denoted by $E$. The complementary event to $E$ is denoted by $E^C$. The possible outcomes are monetary amounts from the set $\{-300, 0, 300\}$. Bets assign a nonzero outcome to an event and a zero outcome to the complement of that event. An uncertain bet is denoted by $x_E 0$, with $x \in \{-300, 300\}$, and concerns an event $E$ whose probability is uncertain. It yields € $x$ if the true state belongs to $E$ and yields nothing otherwise (if event $E^C$ is realized). A risky bet is denoted by $x_p 0$, with $x \in \{-300, 300\}$ and $p \in [0, 1]$. It pays an amount € $x$ with an objective probability of $p$ and pays nothing with an objective probability of $1 - p$.
Preferences over bets are denoted by $\succsim$. We assume that the decision makers have preferences over all possible combinations of uncertain and risky bets, for all $E \subset S$, $x \in \{-300, 300\}$, and $p \in [0, 1]$. We also assume that the decision makers' preferences exhibit monotonicity, such that they prefer more to less. Strict preference ($\succ$) and indifference ($\sim$) are defined as usual.
When a decision maker is indifferent between two bets such that
$$x_E 0 \sim x_p 0, \tag{1}$$
we call $p$ a matching probability of event $E$ and denote it by $m^s(E) = p$, where the superscript $s \in \{+, -\}$ denotes the sign of $x$. Hence, the matching probability for event $E$ and sign $s$ is the objective probability $p$ that makes a decision maker indifferent between an uncertain bet on $E$ and a risky bet that pays out the same amount $x$ with probability $p$. We require minimal rationality of the decision makers and assume that they satisfy $m^s(\emptyset) = 0$ and $m^s(S) = 1$ for all $s \in \{+, -\}$.
2.2. Using Ambiguity Neutrality as a Benchmark. Following Ellsberg (1961), we say that an agent is ambiguity averse (ambiguity seeking) if, for all $E$, there exists a $p$ such that $x_E 0 \prec (\succ)\ x_p 0$ and $x_{E^C} 0 \prec (\succ)\ x_{1-p} 0$. In a typical Ellsberg experiment, $E$ corresponds to drawing a red ball from an opaque urn containing red and black balls in unknown proportion. The matching probability for event $E$ can then be compared against a probability of 50% to characterize the decision maker's ambiguity attitude, under the assumption that the decision maker does not have any reason to believe that red is more likely than black. A more advanced approach consists of also measuring the matching probability of $E^C$ (drawing a black ball) and then testing whether the sum of the matching probabilities is 1. Such an approach does not rely on any assumption regarding the decision maker's beliefs. Below, we generalize this approach.
A decision maker satisfying the subjective expected utility model (Savage, 1954) evaluates the uncertain bet $x_E 0$ as $P(E)U(x)$, where $U(x)$ is the utility function and $P(E)$ is the additive subjective probability of $E$. Probabilistic sophistication relaxes the functional form assumption of subjective expected utility and holds when a probability measure $P(\cdot)$ exists such that the decision maker evaluates the uncertain bet $x_E 0$ as $V(x_{P(E)} 0)$, where $V(\cdot)$ represents the decision maker's preference over risky bets. Hence, under probabilistic sophistication, an individual is indifferent between $x_E 0$ and $x_{P(E)} 0$, and therefore, $m^s(E) = P(E)$. In other words, if probabilistic sophistication holds, the matching probabilities are sign independent and additive. Following the ambiguity literature, we define ambiguity neutrality as probabilistic sophistication. If $V$ is the expected utility functional, then ambiguity neutrality is equivalent to subjective expected utility. A wide range of ambiguity models assumes expected utility under risk, most notably Schmeidler's (1989) Choquet expected utility, Gilboa and Schmeidler's (1989) maxmin expected utility, Klibanoff et al.'s (2005) smooth ambiguity model, Maccheroni et al.'s (2006) variational preferences, Siniscalchi's (2009) vector expected utility, and Cerreia-Vioglio et al.'s (2011) uncertainty-averse preferences.
Observation 1 directly follows from the definition of ambiguity neutrality, even if $V$ is not the expected utility functional. Our approach will therefore consist of testing whether $m^+$ and $m^-$ are additive and/or differ. This approach has four main advantages:
Isolation of ambiguity attitudes. Matching probabilities isolate ambiguity attitudes from risk attitudes, as shown by Dimmock et al. (2016). They directly capture what uncertainty adds to risk (to $V$), which is a commonly accepted definition of ambiguity.
No restrictions regarding ambiguity models. Our approach is not restricted to any specific ambiguity model. Deviations from probabilistic sophistication constitute evidence of ambiguity attitudes in all ambiguity models, and our approach is as general as possible.
No restrictions regarding risk. $V$ can be any decision model under risk. It need not be expected utility or prospect theory, as in Dimmock et al. (2016).
No assumptions regarding beliefs. Additivity properties can be tested regardless of people's beliefs. Our approach does not rely on any assumptions regarding what people (should) believe.
Note that decision makers may be neither ambiguity averse nor ambiguity seeking (i.e., may have matching probabilities of complementary events that sum to 1) while still deviating from ambiguity neutrality. This can occur, for instance, if they overweight unlikely events to the same extent that they underweight very likely events. This is why it is important to study the various ways in which decision makers may deviate from neutrality. We do so by considering three indices of additivity, similar to those of Kilka and Weber (2001) and Baillon and Bleichrodt (2015). These indices represent three different preference conditions, namely, binary complementarity, lower additivity, and upper additivity, and describe different patterns of ambiguity attitudes. Ambiguity neutrality predicts that all indices should be zero and, therefore, should also be the same for gains as for losses. The binary complementarity index captures ambiguity aversion in the sense of Ellsberg, that is, a general dislike for ambiguity regarding both events and their complements. The lower and upper additivity indices focus on deviations from ambiguity neutrality for very unlikely events and for very likely events, respectively.
2.3. Binary Complementarity. The first index that we use directly follows from the idea of Ellsberg's (1961) paradox as presented above and simply checks whether the matching probabilities of two complementary events sum to 1. Accordingly, our binary complementarity index (BC index) measures the distance from unity of the sum of the matching probabilities of an event and its complementary event as follows:
$$BC^s(E) = 1 - m^s(E) - m^s(E^C). \tag{2}$$
We say that binary complementarity holds if $BC^s(E) = 0$. $BC^s(E) > 0$ indicates ambiguity aversion for gains and ambiguity seeking for losses. Symmetrically, $BC^s(E) < 0$ indicates ambiguity seeking for gains and ambiguity aversion for losses.
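As a minimal sketch of the BC index, assuming it is computed as one minus the sum of the two matching probabilities (the form implied by the sign conventions in this subsection), the calculation could look as follows. The function and argument names are hypothetical and the numbers are made up for illustration.

```python
def bc_index(m_event: float, m_complement: float) -> float:
    """Binary complementarity index: 1 - m(E) - m(E^C).

    Both matching probabilities are expected on the [0, 1] scale.
    """
    return 1.0 - m_event - m_complement


# Hypothetical gain-frame subject: the two matching probabilities sum to
# less than 1, so the index is positive (ambiguity aversion for gains).
example_bc = bc_index(0.02, 0.97)
```

A value of zero corresponds to binary complementarity; the sign is read differently for gains and for losses, as described in the text.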
The remaining two indices are based on preference conditions introduced by Tversky and Wakker (1995).
2.4. Lower Additivity. Consider two disjoint events $E, F \subset S$. Lower additivity holds if an event has the same influence when added to the empty event and when added to a nonempty event. In terms of preference conditions, under the assumption that we observe the following preferences (where Equation (3) is trivial):
$$x_{\emptyset} 0 \sim x_0 0 \tag{3}$$
and
$$x_E 0 \sim x_p 0, \tag{4}$$
lower additivity means that if
$$x_F 0 \sim x_q 0, \tag{5}$$
then
$$x_{E \cup F} 0 \sim x_{p+q} 0 \tag{6}$$
whenever $E \cup F$ is bounded away from $S$. Hence, lower additivity holds if a change from $\emptyset$ to $E$ has the same impact on the matching probability as a change from $F$ to $E \cup F$. In other words, the difference in matching probability between (3) and (4) should be equal to the difference between (5) and (6), that is, $m^s(E) - m^s(\emptyset) = p = m^s(E \cup F) - m^s(F)$ with $m^s(\emptyset) = 0$. This equality gives rise to the lower additivity index (LA index):
$$LA^s(E, F) = m^s(E) + m^s(F) - m^s(E \cup F). \tag{7}$$
Note that the LA index is commutative in its arguments $E$ and $F$, that is, $LA^s(E, F) = LA^s(F, E)$. Lower additivity holds if $LA^s(E, F) = 0$, lower subadditivity holds if $LA^s(E, F) > 0$, and lower superadditivity holds if $LA^s(E, F) < 0$.
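The LA index can be illustrated with a short sketch, assuming it takes the form m(E) + m(F) − m(E ∪ F) implied by the equality above. The function name and the example values are hypothetical.

```python
def la_index(m_e: float, m_f: float, m_union: float) -> float:
    """Lower additivity index: m(E) + m(F) - m(E ∪ F) for disjoint E and F."""
    return m_e + m_f - m_union


# Hypothetical subject who weights two rare events more strongly in
# isolation (0.02 each) than when they are combined (0.03):
# the index is positive, i.e. lower subadditivity.
example_la = la_index(0.02, 0.02, 0.03)
```

Swapping the first two arguments leaves the result unchanged, which mirrors the commutativity noted in the text.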
2.5. Upper Additivity. The third and last index is the symmetric counterpart of the LA index, measuring the effect of removing an event from the universal event instead of adding an event to the empty event. Upper additivity holds if the impact of removing an event from the universal event is the same as that of removing it from a nonuniversal event. If a decision maker exhibits the following preferences (where Equation (8) is trivial):
$$x_S 0 \sim x_1 0 \tag{8}$$
and $x_{S-E} 0 \sim x_{1-p} 0$, then upper additivity means that if $x_{E \cup F} 0 \sim x_{p+q} 0$, then $x_F 0 \sim x_q 0$, whenever $F$ is bounded away from $\emptyset$. Hence, upper additivity holds if a change from $S$ to $S - E$ has the same impact on the matching probability as a change from $E \cup F$ to $F$, that is, $m^s(S) - m^s(S - E) = m^s(E \cup F) - m^s(F)$ with $m^s(S) = 1$. This equality gives rise to the upper additivity index (UA index):
$$UA^s(E, F) = 1 - m^s(S - E) - m^s(E \cup F) + m^s(F),$$
where upper additivity is satisfied when $UA^s(E, F) = 0$. Upper subadditivity holds if $UA^s(E, F) > 0$, and upper superadditivity holds if $UA^s(E, F) < 0$. Note that in contrast to the LA index, the UA index is not commutative.
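A minimal sketch of the UA index follows, assuming it compares the drop in matching probability from S to S − E with the drop from E ∪ F to F, as described above. The function name and the example values are hypothetical.

```python
def ua_index(m_s_minus_e: float, m_union: float, m_f: float) -> float:
    """Upper additivity index: (1 - m(S - E)) - (m(E ∪ F) - m(F)).

    Uses m(S) = 1, the minimal rationality requirement of Section 2.1.
    """
    return (1.0 - m_s_minus_e) - (m_union - m_f)


# Hypothetical subject: removing E from the universal event costs 0.03,
# whereas removing it from E ∪ F costs only 0.01, so the index is positive
# (upper subadditivity).
example_ua = ua_index(m_s_minus_e=0.97, m_union=0.97, m_f=0.96)
```

Unlike the LA sketch, the arguments play different roles here, so the index changes when E and F are exchanged.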

3. METHOD
In this section, we outline the experiment. Further details about the experiment are given in Appendix A.1.
3.1. Subjects and Procedure. Students from a Dutch university participated in the experiment. The subjects were recruited from among students who had expressed a desire to be invited to participate in experiments. In total, N = 99 subjects consisting of 37 female and 62 male subjects with a median age of 23 participated in the experiment. The majority of the subjects were studying economics. A maximum of two subjects participated in each session of the experiment. The sessions were held in a room with two cubicles (formed by panels placed on tables) to prevent communication between the subjects. The subjects took approximately 35 minutes to read the instructions, complete the experiment, and receive payment. Two between-subject treatments were applied: the "gain frame" and the "loss frame." The treatments were assigned randomly to the sessions.
3.2. Stimuli. The experiment was a variant of the original Ellsberg experiment. Ambiguity was created by means of a bag that contained multiple tickets marked with numbers ranging from 1 to 200. Neither the number of tickets in the bag nor the selection of numbers used to mark the tickets was known to the subjects. A ticket would be drawn from the bag, and the number on the ticket would determine the subjects' payment. Hence, the state space was S = {1, . . . , 200}. We elicited the matching probabilities of various events from S.
To determine these events, the subjects were asked to assign different numbers to six symbols: , ȡ, , , ♦, and ‡. Using these symbols, we created the events described in Table 1. In Set A, the specified events were very unlikely to occur, whereas they were very likely to occur in Set B. We elicited the matching probabilities for all events described in Table 1. In the gain frame (gain treatment), each of the specified events would lead to a € 300 payoff. In the loss frame (loss treatment), these same events would each lead to the loss of a € 300 endowment that the subjects initially received. The order of elicitation of the matching probabilities was randomized and different for each subject. We used choice lists to elicit the matching probabilities. Figure 1 shows an example of the choice list related to the event {ȡ, } in Set A for the gain treatment. The ranges of probabilities in the choice lists for the risky bets (Option 2) were kept constant within each set of questions, but they varied between the sets. The probabilities ranged from 0% to 5.5% in Set A and from 94.5% to 100% in Set B. Therefore, matching probabilities that lay outside these ranges constituted censored observations. The matching probability of each uncertain event was calculated by taking the midpoint between the highest (lowest) $p$ such that $x_E 0 \succsim x_p 0$ and the lowest (highest) $p$ such that $x_E 0 \prec x_p 0$ for gains (losses).
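The midpoint rule for choice lists can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the probability grid is made up, the function name is hypothetical, and treating a list with more than one switch as an error is a choice of this sketch, since the text does not say how such cases were handled.

```python
def matching_probability(probs, prefers_uncertain, gain=True):
    """Midpoint matching probability from one choice list.

    probs: risky-bet probabilities of the rows, in ascending order.
    prefers_uncertain: one bool per row, True if the uncertain bet was chosen.
    Returns None when the subject never switches (censored observation).
    """
    chosen_uncertain = [p for p, u in zip(probs, prefers_uncertain) if u]
    chosen_risky = [p for p, u in zip(probs, prefers_uncertain) if not u]
    if not chosen_uncertain or not chosen_risky:
        return None  # no switch inside the list: censored
    if gain:
        # Gains: uncertain bet preferred at low p, risky bet at high p.
        lower, upper = max(chosen_uncertain), min(chosen_risky)
    else:
        # Losses: risky bet preferred at low p, uncertain bet at high p.
        lower, upper = max(chosen_risky), min(chosen_uncertain)
    if lower > upper:
        raise ValueError("choices switch more than once")
    return (lower + upper) / 2


# Hypothetical four-row list with probabilities in percent.
grid = [0.0, 1.0, 2.0, 3.0]
example_gain = matching_probability(grid, [True, True, False, False])
```

Returning `None` for a subject who never switches keeps censored observations distinguishable from valid midpoints in later analysis.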
(In Appendix A.1, we describe an additional set of questions, Set C, concerning events of moderate likelihood. Note that Set C was never a crucial component of the experiment because it did not concern rare events. As we show in the Appendix, one of the four events in Set C was often misunderstood by the subjects; consequently, we decided to disregard Set C as a whole.)
3.3. Incentives. Subjects received a € 5 show-up fee. In addition, they had the chance to play one of their choices for real. The subjects in the gain treatment were each physically shown the € 300 prize and were told that they could both win it, but the money (€ 600 in total) remained on the experimenter's desk. Given that we could not cause the subjects to lose € 300 of their own money, we gave € 300 to each subject in the loss treatment before they started reading the instructions. We placed the money on their desks and told them that it was theirs. All subjects signed a consent form, but the consent form for the loss treatment also asked the subjects to acknowledge that they were given € 300. Nevertheless, it is possible that the subjects in the loss treatment considered only the final outcome (€ 0 or € 300). Strictly speaking, the loss treatment was a loss-frame treatment. The fact that we displayed cash was also the reason why we conducted each session with a maximum of two subjects. This way, we did not need to have more than € 600 on hand at a time.
Because the amount to be gained or lost was high, it was important that the subjects did not believe that the experimenters could influence the outcome. Such mistrust could develop if the experimenters were to decide which specific number(s) had to be drawn from the bag to determine the outcomes of the uncertain bets. For example, a subject in the gain treatment might have believed that the specific numbers chosen by the experimenters would never be in the bag and that it would therefore be impossible to win. We prevented such mistrust in two ways. First, the bag was prepared before the subjects entered the room and thus before the subjects had selected their own numbers to be drawn from the bag to determine the outcomes of the uncertain bets. Second, we asked the subjects to make a decision both for a given event (Set A) and for the complement of the same event (Set B), and we explicitly stated this in the instructions. Risky bets were implemented with dice.
At the beginning of the experiment, the subjects drew envelopes containing codes that would be opened at the end of the experiment and would determine which choice would be played for real. In the gain treatment, some envelopes would result in no choice being played at all; in the loss treatment, some envelopes would result in a guaranteed loss of the initial endowment. The exact list of envelopes is given in Appendix A.1.3. The purpose of allowing the subjects to draw an envelope first and making them aware that only one choice would ultimately be played for real was to convince them to consider each choice as if it were the choice in the envelope (Johnson et al., 2014). The average payment was € 14, but with a highly skewed distribution (three subjects left with € 305).
3.4. Analysis. The analysis was conducted on 1,114 observations from the 99 subjects (49 and 50 subjects in the gain and loss treatments, respectively). Because we had censored observations, we chose to use nonparametric tests in our analysis. The Wilcoxon signed-rank test was used to test whether the indices were significantly different from zero. The differences in indices between the treatments were tested using the Mann-Whitney U test. These analyses were conducted separately for each event.
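To make the two rank-based tests concrete, the sketch below computes only their test statistics in plain Python; p-values would normally be obtained from a statistics package. All names and numbers are hypothetical, and dropping zero differences in the signed-rank statistic is one common convention, assumed here.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(indices):
    """Return (W_plus, W_minus) for H0: the index is centered on zero."""
    nonzero = [x for x in indices if x != 0]  # zeros are dropped
    ranks = average_ranks([abs(x) for x in nonzero])
    w_plus = sum(r for r, x in zip(ranks, nonzero) if x > 0)
    w_minus = sum(r for r, x in zip(ranks, nonzero) if x < 0)
    return w_plus, w_minus


def mann_whitney_u(sample_a, sample_b):
    """U statistic of sample_a: pairs with a > b, ties counted as one half."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n_a = len(sample_a)
    return sum(ranks[:n_a]) - n_a * (n_a + 1) / 2


# Hypothetical index values for a handful of subjects.
example_w = wilcoxon_signed_rank([0.5, 1.0, -0.25, 2.0, 0.0])
example_u = mann_whitney_u([1.0, 2.0, 3.0], [0.0, 0.5])
```

A large W_plus relative to W_minus points to indices that are mostly positive, and a U close to the product of the two sample sizes indicates that the first sample tends to be larger.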
To study the heterogeneity of behavior, we performed an LPA. By identifying shared characteristics in the subjects' indices, the LPA produced different endogenous groups (called profiles) and assigned a probability of being in each group to each subject. Hence, via the LPA, we could answer both the questions of what proportion of the subjects deviated from ambiguity neutrality and by how much. We implemented the LPA by means of the expectation-maximization (EM) algorithm. The EM algorithm has the advantages of simplicity, guaranteed convergence, and numerical stability. We conducted the LPA separately for the LA and UA indices. In our estimation procedure, we assumed only that the indices followed a multivariate normal distribution. For each profile, we estimated the mean of each index and the covariance matrix. We did not impose any restrictions on the parameter values. Therefore, we did not force the profiles to represent certain characteristics but rather let the data speak for themselves.
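The EM iteration behind such a profile analysis can be illustrated with a deliberately simplified sketch. It fits a univariate Gaussian mixture to a single index, whereas the analysis described above is multivariate with full covariance matrices. The starting values, the variance floor, the function names, and the toy data are all assumptions of this sketch.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_profiles(data, k=3, n_iter=100, var_floor=1e-4):
    """Fit a k-component univariate Gaussian mixture by EM.

    Returns (weights, means, variances, responsibilities).
    """
    n = len(data)
    ordered = sorted(data)
    # Heuristic start (an assumption of this sketch): means spread over the
    # sorted data, a common variance, and equal mixing weights.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [max(grand_var / k ** 2, var_floor)] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each profile for each subject.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2
                         for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, var_floor)
    return weights, means, variances, resp


# Toy index values (in percent) with accumulations near 0, 2, and 5.
toy_indices = [0.0, 0.1, -0.1, 0.05, 2.0, 2.2, 1.8, 5.0, 5.1, 4.9]
weights, means, variances, resp = em_profiles(toy_indices)
```

The responsibilities returned in `resp` play the role of the membership probabilities that the LPA assigns to each subject.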

4. RESULTS
In what follows, we multiply the matching probabilities by 100 and therefore report the results as percentages.
4.1. Aggregate Results. Figure 2 displays the matching probabilities for all events in Table 1. An ambiguity-neutral subject who believes all numbers between 1 and 200 have an equal chance of being drawn from the bag would have matching probabilities of 0.5%, 1%, ..., 3% in Panel A and 99.5%, 99%, ..., 97% in Panel B. Figure 2 shows that a vast majority of the subjects had matching probabilities higher than these thresholds in Panel A and lower in Panel B. These findings are consistent with the overweighting of rare events. However, this interpretation relies on the assumption that the subjects believed that all numbers were equally likely to be drawn from the bag. This assumption might be violated for subjects who chose numbers for the symbols , ȡ, , , ♦, and ‡ that they expected to be more likely (or less likely) to be drawn. In order to avoid any assumptions regarding the subjects' beliefs, we studied the additivity indices defined in Section 2.
4.1.1. Binary complementarity. The median of the sum of the matching probabilities of an event and its complementary event was always very close to 100%, resulting in BC indices equal to 0 in most cases (see the upper part of Table 2). The BC indices differed from 0 in only four of the 12 cases (once in the gain treatment and three times in the loss treatment). In all these cases, the difference was in the direction consistent with ambiguity aversion. The significance tests for the differences between the two treatments show that the BC indices in the loss treatment were significantly or marginally lower than those in the gain treatment in four of the six indices. This pattern is also consistent with ambiguity aversion, which predicts higher BC indices in the gain frame than in the loss frame.
4.1.2. Lower additivity. Lower additivity was defined as the difference in impact between adding a rare event to the empty event and adding the same rare event to a nonempty event.
Table 2 shows that the median lower additivity indices were positive in all cases for both the gain frame and the loss frame. Hence, lower subadditivity held. Most subjects assigned higher weights to rare events when they were described in isolation than when they were part of larger events. There was also evidence that the overweighting of rare events was stronger in the loss frame than in the gain frame.
Note to Table 2: Stars indicate the significance levels of the Wilcoxon signed-rank test ("Gain" and "Loss" columns) and the Mann-Whitney U test for the BC (LA or UA) distribution for gains being higher (lower) than that for losses ("Comparison" column). + p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001.

4.1.3. Upper additivity.
After analyzing lower additivity, we examined upper additivity by looking at the difference in impact between removing a rare event from the universal event and removing it from a nonuniversal event. The upper additivity indices were positive in all cases for both treatments (see the bottom of Table 2), indicating upper subadditivity and, thus, the overweighting of rare events that were removed from the universal event. Equivalently, events that were almost certain were underweighted. The between-treatment comparison revealed no significant difference.

4.2. Latent Profile Analysis.
The aggregate analysis highlighted the occurrence of strong and consistent overweighting of rare events. To further understand this behavior, we established behavioral profiles via LPA. As a preparatory step, we generated kernel density plots for the LA and UA indices (Figure 3). Such plots can visualize the heterogeneity of behavior and whether this heterogeneity arises from different groups of subjects.
Examination of the kernel density plots revealed that the LA and UA indices were not homogeneous. Although there was a clear accumulation around a value of 0 for all indices, there were also at least two other accumulations, a small one around 2 for several indices and a larger one around 5 for most indices. To account for these three accumulations, we performed LPA using the EM algorithm for three profiles with no restrictions on the class parameters (the means and covariances of the indices for each profile). It is possible that there were more than three profiles; however, we preferred this parsimonious specification because the consideration of each additional class would necessitate the estimation of 10 more parameters. The LPA was performed independently for the LA and UA indices, over the set of subjects for whom we could calculate all indices in each case. (The numbers of subjects with complete observations for the LA and UA indices were 86 and 81, respectively.)
Table 3 presents the LPA results. They are consistent with our expectations from the kernel density plots. For Profile 1, all indices have means near 0 with very small variances (the covariance matrices can be found in Tables A.2 and A.3 in the Appendix). Although the means are (marginally) significantly different from 0, they are still very small, and Profile 1 is very close to ambiguity neutrality. For Profile 2, the indices have means near 2 with high variances, and for Profile 3, the means are near 5. The profiles for the LA and UA indices are very similar despite having been estimated independently. All profiles confirm the absence of upper or lower superadditivity. The subjects either overweighted rare events or were ambiguity neutral; there was no group of subjects who neglected rare events. The LPA assigned to each subject a probability of belonging to each group. Except in a few cases, each subject was assigned to one of the profiles with near certainty.
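The figure of 10 additional parameters per class can be reproduced with a small count. The sketch assumes three indices per analysis and counts the means, the free entries of a symmetric covariance matrix, and one mixing proportion for each added class; this breakdown is an inference, since the text states only the total.

```python
def free_parameters_per_profile(n_indices: int) -> int:
    """Parameters added by one more profile in an unrestricted Gaussian mixture."""
    means = n_indices
    covariances = n_indices * (n_indices + 1) // 2  # symmetric matrix
    mixing_weight = 1
    return means + covariances + mixing_weight


# With three indices: 3 means + 6 covariance terms + 1 weight.
example_count = free_parameters_per_profile(3)
```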
Half of our sample was assigned to Profile 1 for LA and 43% for UA, indicating that many subjects were very close to ambiguity neutrality. Only approximately 10% of the subjects were assigned to Profile 3 (extreme overweighting). Although the above analysis was conducted only for the subjects for whom complete observations were available regarding the indices, it was also possible to calculate the profile membership probabilities for subjects for whom at least one index value was available. Therefore, using the estimated profile means and covariances shown in Table 3, we calculated the probabilities for those subjects as well and included them in our analysis presented below. In this way, we could make use of all of the information available from the data.
From the classification of the subjects into profiles, we could study whether subjects given one treatment were more likely to be assigned to a specific profile than subjects given the other treatment. For the LA index, the subjects given the loss treatment had probabilities of 48%, 37%, and 15% of being assigned to Profiles 1, 2, and 3, respectively, whereas the corresponding probabilities were 52%, 44%, and 4% for the subjects given the gain treatment. Hence, although the proportion of Profile 1 subjects was similar for both treatments, the subjects given the loss treatment more often belonged to Profile 3 (extreme overweighting) than did the subjects given the gain treatment. For the UA index, the probabilities of belonging to each profile were 47%, 46%, and 7% for the loss treatment and 35%, 51%, and 14% for the gain treatment. Hence, the subjects tended to show upper subadditivity more often in the gain treatment than in the loss treatment.
Finally, each subject was categorized into one of the three profiles for the LA and UA indices independently. We were also interested in whether a subject who was categorized as showing lower additivity (subadditivity) would also be categorized as showing upper additivity (subadditivity). Figure 4 presents the proportions of the subjects assigned to the different combinations of profiles for LA and UA. The majority of the subjects were assigned to the same profile for LA as for UA. Almost no one was categorized as Profile 1 (ambiguity neutrality) for LA and Profile 3 (extreme overweighting) for UA or vice versa. Overall, approximately one-third of the subjects were classified as Profile 1 for both LA and UA, meaning that they were consistently close to ambiguity neutrality. The fact that a substantial proportion of the subjects were ambiguity neutral is not surprising in light of the findings of Ahn et al. (2014), who did not reject ambiguity neutrality for the majority of subjects.

5. DISCUSSION
Our analysis revealed only weak ambiguity aversion as measured by the BC index. However, the LA and UA indices were significantly positive, which leads to the rejection of the hypothesis of ambiguity neutrality and is consistent with the overweighting of rare events. Three interpretations of such overweighting can be found in the literature. In the first, ambiguity attitudes are regarded as dependent on likelihood and outcomes (Hogarth and Einhorn, 1990). This leads to a fourfold pattern of ambiguity attitudes: ambiguity-seeking attitudes for very unlikely gains and very likely losses and ambiguity-averse attitudes for very likely gains and very unlikely losses (Trautmann and Van De Kuilen, 2016). The second interpretation explains the overweighting of unlikely events (and the equivalent underweighting of very likely events) as a consequence of likelihood insensitivity (Wakker, 2010; Abdellaoui et al., 2011). According to this interpretation, ambiguity decreases people's ability to discriminate between likelihood levels. In the extreme case, some people might assign the same weight to all events, hence overweighting rare events and underweighting very likely ones.
The third interpretation differentiates between ambiguity perception and ambiguity aversion, as in one of the best-known ambiguity models, the alpha-maxmin model (Ghirardato et al., 2004). In that model, ambiguity perception is represented by a set of priors. The decision maker maximizes a linear combination of the best and worst expected utilities that can be obtained over this set of priors. The weight assigned to the worst case is denoted by alpha. Ambiguity aversion is then defined as an alpha value larger than 0.5. Baillon and Bleichrodt (2015) show how the perception of ambiguity (a set of priors that is not a singleton) can lead to the overweighting of rare events for any alpha value other than 0 or 1. Hence, our results are compatible with the alpha-maxmin model. 18 Each profile identified in the LPA can then be interpreted as corresponding to a different degree of perceived ambiguity (almost none, mild, and extreme).
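To make the alpha-maxmin evaluation described above concrete, the following minimal sketch computes the value of a bet on an event over a set of priors. The function name `alpha_maxmin_value`, the restriction to a finite list of priors, and all numbers are illustrative assumptions and are not taken from the experiment. Utility is normalized so that winning is worth 1 and not winning is worth 0, so the value of the bet can be read as the weight assigned to the event.

```python
def alpha_maxmin_value(event_probs, alpha, u_win=1.0, u_lose=0.0):
    """Alpha-maxmin value of a bet paying u_win on an event and u_lose otherwise.

    event_probs: probability of the event under each prior in the set
                 (a finite set is an assumption made for illustration).
    alpha:       weight placed on the worst-case expected utility.
    """
    expected_utils = [p * u_win + (1 - p) * u_lose for p in event_probs]
    worst, best = min(expected_utils), max(expected_utils)
    return alpha * worst + (1 - alpha) * best


# Hypothetical set of priors for a rare event: probabilities between 0% and 5%.
rare_event_priors = [0.0, 0.005, 0.05]

# A singleton set of priors (no perceived ambiguity) gives back the probability.
no_ambiguity = alpha_maxmin_value([0.005], alpha=0.5)
# With perceived ambiguity, alpha = 0.5 gives the midpoint of the extreme priors.
with_ambiguity = alpha_maxmin_value(rare_event_priors, alpha=0.5)
print(no_ambiguity, with_ambiguity)
```

In this hypothetical set, the priors cannot fall below zero, so the midpoint of the extreme priors (0.025) lies above the reference probability of 0.005. This is one simple way in which perceived ambiguity alone can produce a higher weight for a rare event.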
The experiment was conducted using choice lists because this approach has the advantage of making the incentive system easier to explain to the participants. As seen in Figure 1, the probability that would be chosen by an ambiguity-neutral person with uniform beliefs was more salient because of the smaller steps around it. This ensured both higher precision in this probability region and more conservative results. However, systematically switching from the ambiguous prospects to the risky prospects in the middle of the choice list (i.e., exhibiting a middle bias) tends to result in positive LA and UA indices and a zero BC index. Although there was no obvious evidence for a middle bias in the raw data, we ran an online experiment to check the robustness of our elicitation method. In this robustness check, we focused on our weakest results from the gain treatment (LA({ }, { , ♦, ‡}) and UA({ }, { , ♦, ‡} C ), which had the lowest medians) and tested whether they could be replicated with a method immune to middle bias. We used a bisection method, in which subjects are presented with one choice after another and therefore cannot be influenced by any middle bias. The methodological details of the robustness check are described in Appendix A.3. We obtained slightly lower medians for LA ({ }, { , ♦, ‡}) and UA({ }, { , ♦, ‡} C ) (0.05 and 0.55 instead of 0.15 and 0.75), with the former deviation being marginally significant 19 and the latter being significant at p < 0.001. We again obtained stronger deviations from 0 for the UA index than for the LA index.
Using random incentives in an ambiguity experiment (i.e., paying out for at most one choice among many choices with a given probability) is a common practice in the literature. Concretely, it means offering a compound lottery, with the first stage being the objective probability that a choice is played for real at all and the second stage being the chosen option, possibly an ambiguous bet. As noted by several authors (e.g., Oechssler and Roomets, 2014;Bade, 2015), if subjects mentally reverse the order of the compound lottery, perceiving first the ambiguous events and then the lottery, they may act as if they are ambiguity neutral even if they are not, leading to an overestimation of ambiguity neutrality. By contrast, Baillon et al. (2015) showed that perceiving the original order (objective probability followed by ambiguity) does not imply ambiguity neutrality. In our experiment, we made clear that the random incentives were implemented before the uncertainty was resolved, and even before the subjects made their choices, by using envelopes that the subjects drew at the beginning of the experiment to determine which choice would be played for real, as described in Subsection 3.3.
The results regarding the BC index show only limited support for ambiguity aversion, although the LA and UA results demonstrate that the subjects were not ambiguity neutral. This may seem at odds with the ambiguity literature, in which there is ample evidence for ambiguity aversion (see, for instance, the references in table 3.4 of Trautmann and Van De Kuilen, 2016). Our implementation of the random incentive system may not have fully prevented some subjects from perceiving it as a way to hedge against the ambiguity. We may therefore have underestimated the level of ambiguity aversion. Furthermore, despite our efforts to make the outcomes salient (placing the cash on the desk), the subjects may still have considered the incentives to be low. 20 An alternative explanation, arising from the work done by Tversky in the 1990s (and formalized by Tversky and Wakker, 1995), is that, for more extreme likelihood levels, upper and lower subadditivity matter more than binary complementarity. In other words, for rare events, overweighting is stronger than ambiguity aversion.
In the experimental literature, it is common to provide an initial endowment to cover the losses with which subjects are faced during an experiment (see, among many others, Cohen et al., 1987;Eisenberger and Weber, 1995;Mason et al., 2005;Kermer et al., 2006;Harbaugh et al., 2010). However, we cannot know for sure whether the subjects perceived actual losses or whether they mentally integrated the loss with the initial endowment and thus considered only gains. Several papers have provided some evidence that subjects who are endowed with a monetary amount consider it theirs (e.g., Mason et al., 2005;Kermer et al., 2006). However, we acknowledge that there are better ways to investigate "real" losses (not simply a loss frame). Abdellaoui and Kemel (2014) provide subjects with a monetary endowment but cause them to lose time, such that they cannot easily integrate the loss with the endowment. Kocher et al. (2013) implement "losses from posterior endowment," with subjects later winning back what they initially lost. Bosch-Domènech and Silvestre (2010) provide the initial endowment months before the experiment to encourage subjects to feel that it belongs to them. Unfortunately, we could not credibly implement any of these solutions for the large monetary amount used in this experiment. 21 As a positive consideration, Etchart-Vincent and l'Haridon (2011) find no differences in behavior in an experiment comparing hypothetical losses, real losses, and prior endowments. For this reason, we felt it suitable to choose the prior endowment approach, but we are still careful to refer to the implemented treatment as a loss frame to emphasize that it is a matter of framing/wording.
Ideally, we would expect that the LPA would extract from the data a profile of ambiguity-neutral subjects with mean indices not significantly different from zero.
Indeed, the LPAs for both LA and UA yielded a profile very close to ambiguity neutrality, with very low indices and very low variance within this group. However, the mean indices were still significantly positive for this profile. Alternatively, we could have predefined the profiles and forced one profile to exhibit parameters equal to zero. This approach would result in an obvious but forced ambiguity-neutral profile. We believe that by allowing the data to speak for themselves, we were able to obtain results that are less perfect but more powerful.
20 Surprisingly, we did observe mild ambiguity aversion in the robustness check conducted online and reported in Appendix A.3, although the incentives in that experiment were less salient.
21 It is difficult to force subjects to lose € 300 worth of time or to gain consent to participate if there is no guarantee that they will have to pay such an amount or to be sure that they will show up after receiving such a large amount in advance.
Previous research regarding unlikely events has not extended much further than events with likelihoods of approximately 5% or 10%. Such studies either have found that events were overweighted (Chipman, 1960;Kahn and Sarin, 1988;Curley and Yates, 1989;Casey and Scholz, 1991) or have not rejected ambiguity neutrality (Curley and Yates, 1985;Sarin and Weber, 1993). Two studies used stimuli more similar to ours but did not address the three challenges identified in the introduction to this article (incentives, isolation of ambiguity attitudes, and control for beliefs). In a hypothetical experiment, Einhorn and Hogarth (1986) do not reject ambiguity neutrality for extremely unlikely events, but their study also does not control for beliefs. Schade et al. (2012) report that subjects were more willing to pay for insurance in an ambiguous scenario than in a risky one, but this difference cannot be interpreted as a manifestation of ambiguity attitudes because they could not properly control for beliefs.
Crucially, our results rely on the events of interest being explicitly described. We cannot infer people's behavior when the relevant events are implicit, such that people may be unaware or at least not fully aware of them. Such unawareness is certainly pervasive in real life, but there are also many situations in which events are explicitly described, for instance, in insurance contracts.
Rare events exert greater influence in isolation. The subjects in our experiment assigned higher weights to events when they were isolated than when they were combined. Such behavior can be exploited, following arguments presented by Rabin and Thaler (2001) for loss aversion. These authors explain how myopic behavior (making decisions one after another without considering the overall consequences) and loss aversion can lead to money-pumping situations if people make small-scale insurance decisions one after another instead of considering the overall impact in terms of risk reduction. Overweighting of rare events will reinforce this pattern. Myopic agents will be prone to overinsure if they are presented with each possible (negative) event one after another, in isolation, instead of as a package.

CONCLUSION
Very unlikely events loom larger than they are. In an experiment, while controlling for risk attitudes and beliefs, we zoomed in on rare events. Using nonparametric tests, we found that the effect of a change from "no gain" to "some possibility of gain" was larger than the effect of a change from "some possibility of gain" to "greater possibility of gain." Similarly, the effect of a change from "certain gain" to "some possibility of gain" was larger than the effect of a change from "some possibility of gain" to "less possibility of gain." Both patterns were similar for losses. By means of LPA, we examined the heterogeneity in ambiguity attitudes. The results revealed that one-third of our sample was consistently close to ambiguity neutrality, whereas the remaining two-thirds showed mild or extreme overweighting. Such behavior is consistent with the mere perception of ambiguity in models such as the alpha-maxmin model. It can also lead to suboptimal situations such as overinsurance or overinvestment in long shots.

A.1.1. Additional questions. The full experiment consisted of three sets of questions. Sets A and B have been described in Table 1. Set C is described in Table A.1. To implement Set C, we asked the subjects to choose values for between 1 and 100 (whereas the values for all other symbols could be between 1 and 200). Figure A.1 shows the distributions of the matching probabilities for Set C. The event F − { } seems not to have been understood by many participants; this led to a high variation in the answers, much higher than for any of the other matching probabilities. For this reason, we decided to exclude the entire set from the analysis.
In each question in Sets A and B, the subjects were given a list of choices placing one fixed uncertain bet (Option 1) against 20 different risky bets (Option 2). The subjects were then asked to make a choice between the uncertain and risky bets for each of these 20 cases.
The questions in Set C were prepared similarly; however, the number of risky bets in the choice list was 30 instead of 20. The reason for this difference was to ensure that the subjects would make the same number of choices in each set, namely, 120. An equal number of choices in each set was necessary for our incentive system, in which we used envelopes to determine the choice to be played for real (see Figure A.2 for a sample question from Set C).
An ambiguity-neutral subject believing that all numbers were equally likely to be drawn from the bag would assign a probability of 0.5% to each number. For such a subject, the uncertain events in Sets A, C, and B would have probabilities ranging from 0.5% to 3%, from 49.5% to 50.5%, and from 97.5% to 99.5%, respectively. The lists of probabilities in the risky bets presented in our questions were constructed such that the probability differences between two adjacent rows decreased as the rows approached the subjective probability of an ambiguity-neutral subject believing all numbers were equally likely (ambiguity-neutral probability). This arrangement might have assisted the subjects in interpreting the likelihoods of given events and might have led them toward ambiguity neutrality. If this was the case, our results are conservative.
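The ambiguity-neutral probabilities above follow from simple counting, as the short sketch below illustrates. It assumes a bag of 200 equally likely numbers, which is implied by the 0.5% probability per number. The function name `neutral_probability` and the event sizes used in the example (1, 6, and 199 numbers) are illustrative assumptions inferred from the stated ranges.

```python
from fractions import Fraction

N_NUMBERS = 200  # implied by a probability of 0.5% for each number


def neutral_probability(k):
    """Probability, in percent, of an event made of k of the 200 numbers,
    for an ambiguity-neutral subject with uniform beliefs."""
    return float(Fraction(k, N_NUMBERS) * 100)


# Event sizes below are inferred from the ranges stated in the text.
for k in (1, 6, 199):
    print(k, neutral_probability(k))
```

Under these assumptions, an event with one number has a probability of 0.5%, an event with six numbers has a probability of 3%, and the complement of a single number has a probability of 99.5%.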
A.1.3. Incentives. The chances of playing for real were determined by drawing from a box containing 50 envelopes. The distribution of the envelopes in this box was as follows:
- 47 envelopes containing a blank ticket
- one envelope containing a ticket for Set A
- one envelope containing a ticket for Set B
- one envelope containing a ticket for Set C
Here, a blank ticket meant that the subject would win nothing in the gain treatment and would lose € 300 in the loss treatment. Any one of the other three tickets meant that a choice from the set specified on the ticket would be played for real.
A second box containing 120 envelopes was also prepared to determine which specific choice would be played if the envelope drawn from the first box contained a nonblank ticket. The tickets in the second box were marked with the codes that appeared in each row of the choice list in each question. Hence, the combination of the tickets drawn from the two boxes uniquely determined the choice to be played for real.
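The probabilities implied by this two-box procedure can be worked out as in the sketch below. It assumes that every envelope in a box is equally likely to be drawn and that the two draws are independent; the variable names are illustrative. The figure of 99 subjects is the sample size reported in this appendix.

```python
from fractions import Fraction

FIRST_BOX = 50    # envelopes in the first box
NONBLANK = 3      # tickets that name a set (the "other three tickets")
SECOND_BOX = 120  # envelopes in the second box, one per choice within a set

# Chance that a subject plays any choice for real.
p_play = Fraction(NONBLANK, FIRST_BOX)

# Chance that one specific choice of one specific set is played for real.
p_specific_choice = Fraction(1, FIRST_BOX) * Fraction(1, SECOND_BOX)

# Expected number of subjects playing for real among 99 subjects.
expected_players = 99 * p_play

print(p_play, p_specific_choice, float(expected_players))
```

Under these assumptions, each subject had a 6% chance of playing for real, and each individual choice had a 1 in 6000 chance of being the one played.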
The ambiguous bags were created separately for each session and each subject. Therefore, communication between the subjects outside of the laboratory was irrelevant, and the outcomes for different subjects did not depend on each other.
The probabilities associated with the risky bets (Option 2) were implemented using three 10-sided dice that together could generate all possible numbers from 00.0 to 99.9 up to one decimal place. Hence, for example, a subject in the gain treatment who chose Option 2 for an X% probability of winning € 300 would win if the dice showed a number strictly below X. Both the dice and the bags were shown to the subjects before they began answering the questions.
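The following sketch checks that the "strictly below X" rule yields a winning probability of exactly X%. It assumes three fair 10-sided dice, read as the tens, units, and first-decimal digits of the number shown; the function names are illustrative.

```python
from itertools import product


def wins(tens, units, tenths, x_percent):
    """True if the number shown by the three dice is strictly below x_percent."""
    value_in_tenths = 100 * tens + 10 * units + tenths  # e.g. 12.3 -> 123
    return value_in_tenths < round(x_percent * 10)


def win_probability(x_percent):
    """Share of the 1000 equally likely dice outcomes that win."""
    outcomes = list(product(range(10), repeat=3))
    return sum(wins(t, u, d, x_percent) for t, u, d in outcomes) / len(outcomes)


print(win_probability(5.5), win_probability(94.5))
```

For X = 5.5, the winning outcomes are 00.0 to 05.4, that is, 55 of the 1000 possible outcomes. The strict inequality is what makes the winning probability match X exactly, given that the dice start at 00.0.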
If a subject was to play a choice for real, it was done privately, and the payments were kept anonymous.
Out of the 99 subjects, no one decided to withdraw from the experiment once they had read the instructions. In total, five subjects drew a nonempty envelope from the first box, and three of them were paid € 300 in accordance with the outcome of the play.
A.1.4. Matching probabilities. There were 22 subjects who gave at least one counterintuitive answer. We defined a counterintuitive answer for a subject in the gain treatment as a preference for a 0% probability of gain over some uncertain chance of winning or a refusal of a 100% probability of gain. Similarly, for a subject in the loss treatment, a counterintuitive answer was defined as the rejection of a 0% probability of loss or a preference for a 100% probability of loss over some uncertain chance of no loss. We treated such answers as erroneous and concluded that subjects who made such mistakes did not answer the questions carefully. However, it should be noted that because of the nature of the experiment, it was easy to select a column incorrectly by mistake, and the majority of these subjects made only one such mistake. Therefore, for the optimal usage of the available data, instead of completely eliminating those subjects from the analysis, we eliminated only the observations corresponding to such answers.
We did not impose a maximum one-time switching rule. Therefore, the subjects could switch between the two bets in a given choice list as many times as they wished. In total, five subjects switched multiple times. Calculating the matching probabilities for such observations would require strong assumptions, from which we refrained for the entire analysis. Hence, we treated these subjects in the same manner as those discussed above and did not include the observations for which there were multiple switches in our analysis.
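The treatment of choice lists described above can be sketched as follows for a gain frame. The midpoint rule used to turn a switching point into a matching probability is an assumption made for illustration, as are the function name and the example probabilities. Lists with several switches are discarded, and lists without any switch are left to a separate rule.

```python
def matching_probability(probabilities, chose_risky):
    """Return a matching probability from one choice list, or None if unusable.

    probabilities: winning probabilities of the risky bets, in increasing order.
    chose_risky:   booleans, True where the subject chose the risky bet (Option 2).
    """
    switches = [i for i in range(1, len(chose_risky))
                if chose_risky[i] != chose_risky[i - 1]]
    if len(switches) != 1:
        return None  # no switch (censored) or multiple switches
    i = switches[0]
    if not chose_risky[i]:
        return None  # reversed switch, not meaningful in a gain frame
    # Assumption: take the midpoint of the two rows around the switch.
    return (probabilities[i - 1] + probabilities[i]) / 2


# Hypothetical five-row list with probabilities in percent.
probs = [0.0, 0.5, 1.0, 2.0, 5.5]
print(matching_probability(probs, [False, False, True, True, True]))
print(matching_probability(probs, [False, True, False, True, True]))
```

The first call returns 0.75, the midpoint between 0.5 and 1.0, and the second returns None because the subject switched more than once.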
For censored observations (when the subjects never switched between the bets and always chose the same option in a given choice list), the matching probabilities were set equal to the highest probability available in the choice list (5.5%) for the events in Set A (very unlikely events) or to the lowest probability available in the choice list (94.5%) for the events in Set B (very likely events). For the events in Set C, the applied correction depended on the treatment. If Option 1 was always chosen, then the matching probabilities for the gain (loss) treatment were set equal to the highest (lowest) available probability (75% (25%)), and the opposite probability assignments were made if Option 2 was always chosen.
Subjects and procedure. The subjects were recruited from the same platform as for the experiment presented in the main text. The experiment was conducted online, with N = 61 subjects (27 female, median age 22). We used an online experiment because it was run during a period when most students had no obligation to be on campus. As in the main experiment, a large majority of the subjects (84%) were studying economics. Each subject took approximately 10-15 minutes to complete the online questionnaire.
Stimuli. We attempted to replicate our weakest results from the gain treatment, that is, LA({ }, { , ♦, ‡}) and UA({ }, { , ♦, ‡} C ). We elicited the matching probabilities of the events { }, { , ♦, ‡}, and { , , ♦, ‡} and those of their complements. Instead of choice lists, we used a bisection method. We asked the subjects to indicate their preferences between an uncertain bet and a risky bet, starting with the probability that an ambiguity-neutral subject with uniform  beliefs would choose (see Figure A.3). If a subject preferred the uncertain bet (the risky bet), we increased (decreased) the probability associated with the risky bet in the next question. We continued to increase/decrease the probability associated with the risky bet (Option 2) in three subsequent questions after the first one. Once again, the probability increments were smaller around the ambiguity-neutral probability and increased in size as we moved away from it. If we were to represent the probabilities in a list, the probabilities that could potentially be queried against the event { , , ♦, ‡} would be as shown in Figure A.4. Hence, the procedure was very similar to the approach used in the main experiment, but it avoided any middle bias. A disadvantage of the bisection method is that it takes more of the subjects' time. Because this second experiment was conducted online, it could not be too long; this is why we focused on replicating only the weakest findings of the main experiment.
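The logic of a bisection elicitation can be sketched as below. This is a generic interval-halving version under stated assumptions, not the procedure used in the experiment, which relied on a fixed set of probabilities with smaller steps around the ambiguity-neutral probability. The function name, the simulated subject, and the starting value are hypothetical; the four questions correspond to the first question plus the three subsequent ones.

```python
def bisection_bounds(prefers_uncertain, start, n_questions=4, lo=0.0, hi=1.0):
    """Narrow down a matching probability with sequential binary choices.

    prefers_uncertain: function taking the risky bet's probability p and
                       returning True if the subject prefers the uncertain bet.
    start:             probability offered in the first question.
    Returns the interval (lo, hi) that contains the matching probability.
    """
    p = start
    for _ in range(n_questions):
        if prefers_uncertain(p):
            lo = p  # the matching probability is at least p
        else:
            hi = p  # the matching probability is at most p
        p = (lo + hi) / 2
    return lo, hi


# Hypothetical subject whose matching probability is 0.02.
def simulated_subject(p):
    return p < 0.02


print(bisection_bounds(simulated_subject, start=0.005))
```

Because each question is shown on its own, the subject never sees a list with a middle row, which is the feature that makes this kind of method immune to middle bias.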
Incentives. The subjects were given the chance to play one of their choices for real. In order to prevent strategic behavior such as always choosing Option 1 to increase the probability in Option 2, 22 we did not randomly select one of their actual choices; instead, we randomly selected one question (e.g., as displayed in Figure A.4), independently of whether this question was answered. Hence, the subjects' answers had no impact on which question was used to determine the payment. If the subject had answered the chosen question in the online experiment, the corresponding choice was directly implemented. If not, the choice was inferred from the answers given by the subject to the other questions using stochastic dominance 23 (see Johnson et al., 2014, for more details on this method).
Ultimately, 5% of the subjects (three subjects) were randomly selected and invited to play for real. Which choice to play was again determined by envelopes that they drew upon arrival. In this case, there was only one box of envelopes containing 82 envelopes, inside which were written all possible event-probability pairs. The uncertain and risky bets were implemented with a bag and dice, as in the first experiment. Since the subjects filled out the questionnaire online, we could only tell them that they could win € 300. We physically showed the € 300 only to the subjects who were invited to play for real.
Matching probabilities. The matching probabilities were calculated in the same way as in the main experiment. Unlike in the first experiment, we did not observe any counterintuitive answers: no subject preferred a 0% probability of gain to some uncertain chance of winning or refused a 100% probability of gain. This suggests that we were able to eliminate the mistakes that occurred as a consequence of using choice lists. However, the bisection method also introduces its own complications.
By the nature of the bisection design, the first question answered for each event determines the direction of the probabilities presented in Option 2 in the subsequent questions. For a subject who answers the first question mistakenly, it is not possible to change the direction back. Hence, to check whether any subjects were forced to answer questions in the direction opposite to their matching probabilities, we asked a fifth question for every event. This fifth question asked what they would be asked if they were to answer the first question for that event differently. We checked whether the answer to the fifth question was in the opposite direction from the answer to the first question. In this way, we could detect subjects who had potentially made a mistake on the very first question for a given event and for whom we had potentially miscalculated the matching probabilities. We excluded such observations from our analysis. 24
A.3.2. Results. Figure A.5 shows the distributions of the matching probabilities. Once again, we see that the majority of the subjects had matching probabilities equal to or greater than the probabilities an ambiguity-neutral subject with uniform beliefs would have in Panel (a) and lower matching probabilities in Panel (b). These findings suggest the overweighting of rare events. Table A.4 reports the results of our tests for Experiment 2. The BC values differ significantly from 0 in all cases and always in the direction consistent with ambiguity aversion. Both the LA and UA indices are slightly decreased in magnitude compared with our previous analysis (see Table 2, where the median values are 0.15 for LA and 0.75 for UA); however, we still see evidence of both lower and upper subadditivity. Hence, we were able to replicate the weakest results of our main experiment even with a design that avoids middle bias.
Given the small number of index values extracted here, it is not possible to conduct an LPA using the data from our second experiment. However, we can use the results from the first experiment to gain a rough idea of what the results of such an analysis would be. Using the means and standard errors of the related indices LA({ }, { , ♦, ‡}) and UA({ }, { , ♦, ‡} C ) for Profile 1 (see Table 3), we categorized the subjects into "Profile 1" and "others." We found that the proportions of subjects assigned to Profile 1 and others were, respectively, 79% and 21% for the LA index and 53% and 47% for the UA index. 25 Compared with our previous results, the proportion of subjects assigned to near ambiguity neutrality is higher for the LA index, whereas the result for the UA index is similar. This finding is not surprising since the only LA index considered here is the one that was the closest to lower additivity in the previous analysis. As before, we observe more upper subadditivity than lower subadditivity.
NOTES: Numbers are reported as percentages. Every boxplot has lines at Q1, the median, and Q3. The adjacent lines show the most extreme values within 1.5 times the IQR of the nearer quartile.