The common origin of both oversimplified and overly complex decision rules

Many deviations from rational choice imply the neglect of important evidence and suggest the use of simple heuristics. In contrast, other deviations imply sensitivity to irrelevant evidence and suggest the use of overly complex rules. The current analysis takes two steps toward identifying the conditions that trigger these contradictory deviations from efficient reasoning. The first step involves a theoretical analysis. It shows that the contradictory deviations can be captured without assuming the use of rules of different complexity in different settings. Both deviations can be the product of a reliance on small samples of similar past experiences. This reliance on small samples triggers apparent overcomplexity when the optimal rule is simple, but more complex rules yield better outcomes in most cases; the opposite tendency, oversimplification, emerges when the optimal rule is complex, and simple rules yield better outcomes in most cases. The second step involves a preregistered experiment with 325 participants (Mechanical Turk workers). The experiment shows that human decision makers exhibit the pattern predicted by the reliance-on-small-samples assumption. In the experiment, participants chose between the status quo and a risky alternative in a multi-attribute decision with three binary cues. They used uninformative cues when this strategy was best in most cases yet ignored two informative cues when this strategy was best in most cases. In addition, describing the cues as recommendations given by three experts increased the tendency to follow the modal recommendation (even when reliance on only one of the experts was optimal), but people still behaved as though they relied on a small sample of past experiences.

cues. Meehl (1954) found that the decisions of expert psychiatrists were less accurate than decisions derived from simple linear rules, and Dawes et al. (1989) showed that an important part of the experts' errors reflects a counterproductive effort to capture complex interactions. Another example involves the popularity of conspiracy theories, which are often built on complex explanations (Douglas et al., 2017; Marsh et al., 2022). For instance, over a third of Americans believe that global warming is a hoax (Swift-Hook, 2013), maintaining that mainstream scientific research reflects plots by powerful and malevolent groups.
These contradictory deviations from optimal reasoning can also be found in daily health-related decisions. For example, the belief that health problems reflect the impact of demons (a belief that was more popular in the Middle Ages) suggests the use of overly complex rules: rules that assume high sensitivity to factors that (at least based on extant research) do not have a large effect on health. In contrast, the tendency to skip careful hand washing (also more common in the Middle Ages) suggests overly simplified rules that ignore the impact of germs that can be addressed by washing hands.
The leading explanations for using oversimplified rules focus on the role of cognitive costs and limitations (Payne et al., 1993). For example, the tendency to ignore license agreements can be explained as an effort to minimize cognitive effort. In contrast, the leading explanations for relying on overly complex rules focus on the role of motivated inference (Kunda, 1987). To illustrate, the belief that global warming is a hoax can be explained by assuming that it helps polluters maintain a positive self-image. Similarly, the belief in the existence of nonlinear patterns helps expert psychiatrists explain why their diagnostic decisions should not be overruled by a simple linear algorithm.
Together, these explanations imply that the direction of the deviations from optimal reasoning is a product of the relative importance of two classes of incentives: cognitive costs and the benefits from self-serving inferences.
One of the main challenges, under this cognitive-costs and motivated-inference explanation, is the clarification of the way these incentives impact reasoning. The current research addresses this challenge by focusing on simple situations in which behavior is likely to be affected by objective incentives. Our analysis builds on two observations: First, recent research with simple choice tasks has demonstrated that variations in the objective incentive structure can change the direction of deviations from optimal choice.
Second, the leading explanations for the impact of incentive structure on decisions from experience assume a reliance on small samples of past experiences.1 In the context of multicue decision making, this assumption predicts behavior that appears to reflect the use of both oversimplified and overly complex decision rules. These observations are used below to predict sufficient conditions for the co-existence of these contradictory deviations from optimal reasoning. Sections 2 and 3 describe the two motivating observations and the way they can impact the direction of the deviations from optimal decisions in multicue tasks, Section 4 presents an experiment that tests these implied predictions, and Sections 5 and 6 highlight the implications of our results.

| CONTRADICTORY DEVIATIONS FROM MAXIMIZATION IN STATIC CHOICE TASKS

Erev et al. reviewed research that examines decisions from experience in the basic clicking paradigm presented in Figure 1. In each trial of the reviewed experiments, the participants were asked to click on one of two or three keys. Each key was associated with a static payoff distribution (that did not change during the experiment), and each click was followed by the presentation of a single draw from each of the distributions. Each trial's payoff was the draw from the payoff distribution of the selected key, and the participants' goal was to maximize their overall earnings.
The review showed that certain changes in the incentive structure can reverse the direction of the observed deviations from maximization. Specifically, Erev et al. highlighted six pairs of contradictory deviations from maximization. Table 1 summarizes four problems (B1-B4) that demonstrate one of these six pairs. In each problem, participants chose between two options with different experienced payoffs. The choice rates in Problems B1 and B2 reveal a deviation from maximization that suggests underweighting of rare events (Barron & Erev, 2003). The participants behaved as if they maximized expected earnings under the belief that the probabilities of the rare outcomes (−10 and +10) are below 10%. In contrast, the choice rates in Problems B3 and B4 reveal a deviation from maximization that suggests overweighting of rare events. In this case, replacing "11 for sure" with a long-shot lottery with the same expected value increased the choice rate of Option Y. Participants now behaved as if they believed that the probability of the rare +20 outcome is above 10%.

F I G U R E 1 Typical instructions and screenshot from one trial in the basic clicking paradigm for studying decisions from experience. On each trial, participants select one option and receive feedback about both options.

1 The value of the reliance-on-small-samples explanation was demonstrated in four choice-prediction competitions (see a review in ). The high predictive value of this hypothesis can be the product of an ecologically reasonable effort to respond to patterns (Plonsky et al., 2015).
In addition, Erev et al. demonstrated that the direction of the deviations from maximization they documented can be captured with a simple, sample-of-5 model (Erev & Roth, 2014). This model predicts random choice in the first trial and then a reliance on a sample of five past trials on all subsequent trials (randomly drawn with replacement; see Appendix A). For example, consider the decision in Trial 11 of a study that focuses on Problem B2, assuming that in the first 10 trials Option Y yielded nine "−1" outcomes and one "+10" outcome. Under the sample-of-5 model, the agent chooses Option Y only if the mental sample includes the +10 outcome. The probability that this mental sample of five past outcomes includes the +10 outcome is only 1 − 0.9^5 ≈ 0.41 (even though the observed average payoff from Y is positive). The sample-of-5 column in Table 1 presents the predictions of this model.
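The worked example above can be checked with a short simulation. The sketch below (hypothetical code, not from the paper) draws mental samples of five outcomes with replacement from the observed history and estimates the probability that the sample contains the rare +10 outcome, alongside the analytic value 1 − 0.9^5.

```python
import random

def p_sample_includes(history, target, k=5, n_sims=100_000, seed=0):
    """Estimate the probability that a mental sample of k outcomes,
    drawn with replacement from the observed history, includes `target`."""
    rng = random.Random(seed)
    hits = sum(target in rng.choices(history, k=k) for _ in range(n_sims))
    return hits / n_sims

# Trial 11 of Problem B2: the first 10 trials of Option Y yielded
# nine "-1" outcomes and one "+10" outcome.
history = [-1] * 9 + [10]
analytic = 1 - 0.9 ** 5              # probability the sample includes +10
estimate = p_sample_includes(history, target=10)
print(round(analytic, 2), round(estimate, 2))
```

Under the sample-of-5 model, the agent chooses Option Y only when the sample includes the +10, so this value is also the predicted Y-rate at that point.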

| RELIANCE ON SMALL SAMPLES IN MULTICUE DECISION TASKS
The observation that a reliance on small samples can reverse the deviations from maximization in basic choice tasks suggests that this tendency can also reverse the direction of the deviation from optimal reasoning in multicue tasks. To illustrate this point, we focus on the multicue choice paradigm presented in Figure 2. In each trial in this paradigm, the decision maker first sees a "signal," consisting of three cues, and then chooses between the status quo and a risky bet that can lead to a gain or a loss. Importantly, the signal provides partial information concerning the payoff distribution associated with the risky key. The exact payoff distribution given each signal is presented in Table 2. In this setting, a bias toward oversimplification entails ignoring some of the cues even when this behavior impairs performance, and the opposite bias implies sensitivity to cues (and to their interactions) that cannot help predict the sign of the expected outcomes.
Notice that Figure 2's multicue paradigm is a generalization of previous multicue classification paradigms (e.g., Pachur & Olsson, 2012). The main difference involves the current focus on the incentive structure. In those previous classification studies, the participants' goal was to maximize the proportion of correct responses; instead, we focus on situations in which the decision makers are motivated to maximize the number of points they earn. This difference is likely to be important when the strategy that maximizes the probability of correct responses differs from the strategy that maximizes the expected number of points. For example, in a binary choice between a status quo and a risky choice leading to a gain of 2 points 90% of the time, and −20 otherwise, the strategy that maximizes the percentage of correct answers is "always choose the risky option", and the strategy that maximizes the expected number of points is "never choose the risky option".

Note (Table 1): The left-hand columns present the payoff distributions for Option X and Option Y in four different problems (B1-B4). The notation "v1, p1; v2, 1-p1" means "win v1 with probability p1, win v2 otherwise." The right-hand columns present the results over 100 trials of experimental studies conducted by Nevo and Erev (2012) and Teoderescu et al. (2013) and the predictions of two models. The model PAS (partially attentive sampler) is described in Section 4.

The predictions below build on a simple generalization of the sample-of-5 model. The model, referred to as "signal-based-sample-of-5" (hereafter SB-sample-of-5), is similar to the sample-of-5 model with the additional constraint that only past experiences with the current signal are considered. Specifically, the SB-sample-of-5 model assumes random choice in the first experience with a specific signal and then, in subsequent reactions to that signal, a reliance on a sample of five previous experiences with that exact signal (drawn with replacement).
The SB-sample-of-5 column in Table 2 presents the predictions of this model for an experiment that examines repeated decisions in two sets of eight problems (the right-hand columns in Table 2 present the results of an experiment, described in Section 3, that tests these predictions). Each decision maker in this experiment faces a sequence of 160 choices: 20 rounds with each of the eight problems, in one of the sets, in random order. Each round involves a choice between the status quo (payoff of 0 with certainty) and a risky action that can lead to a gain or a loss. Before each choice, the decision makers receive a signal that includes three cues. As explained in Figure 2, each cue summarizes the recommendation of one of three experts. The cue T stands for the recommendation "take the risk," and the cue A implies "avoid the risk." Each choice is followed by the presentation of full feedback showing the payoff from the two possible choices in the last trial.
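One possible reading of the SB-sample-of-5 rule is sketched below (hypothetical code; the payoff rule in the demo is illustrative, not the exact Table 2 distribution). Memory is keyed by the exact signal, the first encounter with a signal is random, and later choices follow the sign of the mean of five payoffs sampled with replacement from that signal's history.

```python
import random

class SBSampleOf5:
    """Signal-based sample-of-5 agent: random choice on the first
    encounter with a signal; afterwards, rely on 5 past risky payoffs
    observed under that exact signal (sampled with replacement)."""

    def __init__(self, k=5, seed=0):
        self.k = k
        self.rng = random.Random(seed)
        self.memory = {}  # signal -> list of past risky payoffs

    def choose(self, signal):
        past = self.memory.get(signal)
        if not past:                      # first encounter: choose at random
            return self.rng.choice(["risk", "status quo"])
        sample = self.rng.choices(past, k=self.k)
        return "risk" if sum(sample) / self.k > 0 else "status quo"

    def update(self, signal, risky_payoff):
        # Full feedback: the risky payoff is observed even when the
        # status quo (payoff 0) was chosen.
        self.memory.setdefault(signal, []).append(risky_payoff)

agent = SBSampleOf5()
# Hypothetical payoff rule for the signal "TTT" in a rare-treasure-like
# environment: -1 in 90% of trials, +10 otherwise (illustrative only).
env = random.Random(1)
risk_choices = 0
for t in range(200):
    if agent.choose("TTT") == "risk":
        risk_choices += 1
    agent.update("TTT", 10 if env.random() < 0.1 else -1)
print(risk_choices / 200)
```

Because the sample of five usually misses the rare +10, the simulated risk-rate stays below 50% even though the risky option has a positive expected value.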
Set Rare-Treasure in Table 2 creates an environment in which the recommendation of one of the experts (the leftmost cue) always maximizes expected value (EV), and the recommendation of the other experts does not add payoff maximizing information. Thus, the EV-maximizing strategy is to follow the best expert. This strategy implies full reliance on the most effective cue and is thus an instance of the "take-the-best" heuristic rule (Bröder, 2000; Gigerenzer et al., 1991).
Yet, given the signal TTT (taking the risk is recommended by all three experts), this optimal strategy leads to a loss in 90% of the cases and to a large gain in the other 10% of the cases. The SB-sample-of-5 model, however, tends to go against the experts, by avoiding this risky option in this one case, predicting behavior that appears to reflect an unduly complex theory. The predicted behavior of most SB-sample-of-5 agents can be (incorrectly) explained with the assertion that they believe that if all experts recommend taking the risk, this agreement reflects a plot of "corrupted experts."

Set Rare-Disaster is similar to Set Rare-Treasure with the exception of the outcome distribution given the signal TTT (Problem 8, when all three experts recommend taking the risk). All the experts err in this set (the expected value is negative), even though in 90% of the trials, the payoff is positive. Thus, the EV-maximizing strategy is relatively complex. It requires sensitivity to all three experts and prescribes the following rule: "avoid the risk if all three experts recommend taking it; in all other cases follow the best expert." Notice that while the recommendation of the best expert impairs EV given the signal TTT, it always minimizes the probability of regret (it maximizes the probability of obtaining the best possible outcome in each choice). As in Set Rare-Treasure, most SB-sample-of-5 agents take the action that leads to the best outcomes most of the time. In the current set, this behavior can be (incorrectly) explained with the assertion that these agents learn to follow an oversimplified rule and take the risk if and only if the best expert predicts a positive expected return.

F I G U R E 2 The basic instructions (used in Condition Experts in the experiment described below) and sample screenshot from two rounds in the multicue clicking paradigm. Participants selected one option on each round and received feedback about both options. In Condition Abstract, the italics were replaced with: Each cue will be the letter "T" or the letter "A". For example, the signal "TTA" implies that the first and second cues are "T", and the third cue is "A".
In summary, under the SB-sample-of-5 model, modifications to the incentive structure can change the direction of the deviation from optimal reasoning. In Set Rare-Treasure, agents that behave in accordance with this model choose as if they prefer a complex explanation over the take-the-best rule, even when the take-the-best rule maximizes the expected return. In contrast, in Set Rare-Disaster, these agents behave as if they follow the simple take-the-best rule, even when this rule is suboptimal.
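The reversal can also be illustrated in closed form. In the long run, the observed outcome frequencies under a given signal approach the true probabilities, so a sample-of-5 agent takes the risk with the probability that a binomial sample of five outcomes has a positive mean. The sketch below uses illustrative payoffs only: the rare-disaster numbers are borrowed from the paper's own binary-choice example (+2 with probability .9, −20 otherwise), and the rare-treasure numbers are assumed; neither is the exact Table 2 specification.

```python
from math import comb

def p_take_risk(gain, loss, p_gain, k=5):
    """Probability that the mean of k outcomes, drawn with replacement
    from the long-run distribution, is positive (the agent takes the risk)."""
    return sum(
        comb(k, g) * p_gain**g * (1 - p_gain) ** (k - g)
        for g in range(k + 1)
        if (g * gain + (k - g) * loss) / k > 0
    )

# Rare-treasure-like signal TTT: EV = +0.1, so taking the risk is optimal,
# yet the modal agent avoids it (assumed payoffs).
treasure = p_take_risk(gain=10, loss=-1, p_gain=0.1)   # = 1 - 0.9**5 ~ 0.41

# Rare-disaster-like signal TTT: EV = -0.2, so avoiding is optimal,
# yet the modal agent takes the risk (payoffs from the paper's example).
disaster = p_take_risk(gain=2, loss=-20, p_gain=0.9)   # = 0.9**5 ~ 0.59

print(round(treasure, 2), round(disaster, 2))
```

Under these assumed payoffs, the modal choice deviates from EV maximization in both environments, but in opposite directions: the co-existence pattern described above.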

| The impact of prior information
One shortcoming of the analysis presented above is that it focuses on virtual agents that base their decisions on past experiences in the experimental session and do not consider older past experiences.
Human agents likely consider a wider set of past experiences, including old experiences that resemble the current situation and occurred before the beginning of the experimental session. Which past experiences human decision makers consider also likely depends on the description of the choice task. Specifically, the information that decision makers receive concerning the meaning of the cues can be important. For example, the cues could be described as recommendations of three experts, as above, or as abstract letters without particular semantics. Describing the cues as recommendations from experts, as opposed to abstract letters, would be expected to increase the initial tendency to take the risk as the number of Ts in the signal increases. The experiment below thus also assessed the impact of prior information on choice in a multicue environment.

| EXPERIMENT
The current experiment examined human behavior in the two sets of problems described in Table 2, under two prior information (hereafter "description") conditions. Condition Experts used the instructions presented in the main part of Figure 2 and described the signal as the recommendation of three experts. Condition Abstract dropped this cover story (see the note in Figure 2). The full instructions also included an answer-until-correct quiz and are presented in Appendix B.
T A B L E 2 Two incentive structures (sets of problems), predictions and experimental results. Note: The notation follows Table 1. The right-hand columns present the results of the experiment described below. The bold emphasizes the difference between the two sets. *Significantly different (p < .001) from the rate in the abstract condition.

| Participants

Only those who completed the experiment were counted as valid participants. On average, the participants completed the task in 12 min (the maximum time allowed was 1 h). The participants received a show-up fee of $1 for completing the task and, as indicated in the instructions, had a chance to win a bonus of $0.5 depending on the points gained (63% of the participants earned the bonus). The exact probability of winning the bonus increased linearly with the accumulated number of points; this bonus computation rule induces risk neutrality by rational agents (Roth & Malouf, 1979). All participants provided informed consent before participating, and the study was approved by the Technion-Israel Institute of Technology ethics committee.
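The risk-neutrality claim follows from linearity: when the bonus probability is linear in points, the expected bonus of any gamble is proportional to its expected points. A minimal sketch, assuming a simple points/max scaling (the paper does not report the exact constant):

```python
def bonus_probability(points, max_points=200.0):
    """Chance of winning the fixed $0.5 bonus; assumed linear scaling."""
    return max(0.0, min(1.0, points / max_points))

# A sure 100 points versus a 50/50 gamble over 50 or 150 points:
# equal expected points, hence equal expected bonus, so a rational
# agent is indifferent (risk neutral with respect to points).
safe = bonus_probability(100)
risky = 0.5 * bonus_probability(50) + 0.5 * bonus_probability(150)
print(safe, risky)
```

This binary-lottery construction is the standard way to induce risk neutrality regardless of the participant's utility for money, because only the probability of a fixed prize varies.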

| Procedure
After providing consent, each participant received general instructions about the task (see Appendix B) and made a total of 160 choices. On each round, they picked between the status-quo option and the risky option (see Figure 2). These were indicated by two blue boxes on the screen with the corresponding text written on them. The location (left or right) of the status-quo option and the risky option were counterbalanced (but fixed for each participant). Text on top of the screen indicated the round number (out of 160), the total points earned thus far, and the cues for that round (e.g., TTT). Participants indicated their selection by a mouse click.
After each selection, the selected key was highlighted and the points earned were displayed, until participants clicked on the "next" button and moved on to the next trial. Payoff from the unselected option was also displayed (i.e., full feedback) on the other key. The next choice was then immediately available.
Each participant faced each of the eight problems 20 times. The problems each had a different three-letter cue and a different underlying payout rate (see Table 2). The problem presented on each trial was randomly determined (drawn without replacement). That is, the probability of facing the same problem on consecutive trials was around 1/8. On the risky trials with the TTT cue, the rare outcome appeared 10% of the time (truly random). The experiment was programmed using oTree (Chen et al., 2016).
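The "around 1/8" figure is consistent with shuffling a single pool of 160 trials (20 copies of each of the eight problems): the chance that the next trial repeats the current problem is then 19/159 ≈ 0.119. A quick check (hypothetical code, assuming this pooled-shuffle design):

```python
import random

def consecutive_repeat_rate(n_problems=8, copies=20, n_sims=2000, seed=0):
    """Shuffle a pool with `copies` instances of each problem (drawing
    without replacement) and measure how often the same problem
    appears on two consecutive trials."""
    rng = random.Random(seed)
    repeats = pairs = 0
    for _ in range(n_sims):
        pool = list(range(n_problems)) * copies
        rng.shuffle(pool)
        repeats += sum(a == b for a, b in zip(pool, pool[1:]))
        pairs += len(pool) - 1
    return repeats / pairs

rate = consecutive_repeat_rate()
print(round(rate, 3))  # close to 19/159, i.e., "around 1/8"
```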
To clarify the impact of the set and the description condition on the reaction to the signal TTT, we conducted three analyses of variance. The first focused on the mean choice rates over all 20 reactions to this signal. The two additional analyses focused on the last 10 responses to the signal TTT and distinguished between the decisions made when the observed payoff from the Risk option (over all previous experiences with the signal TTT) was positive or negative. The mean risk-rates in these cases are presented in Table 3. Notice that the probabilistic payoff rule means that "lucky" participants made all these choices after observing positive average payoffs, and some "unlucky" participants made all these choices after observing negative average payoffs.2 The lower row in Table 3 shows that both analyses reveal two main effects and no significant interactions. In all cases, whether participants were lucky or unlucky, and whether they encountered the experts or abstract description, participants took the risky option more often in the rare-disaster set with the TTT cue. This behavior implies deviations from efficient use of the take-the-best rule even after observing that these deviations impair earnings.

| Individual differences
To assess the impact of individual differences, Figure 3 presents the risk-rate in the last five reactions to the signal TTT as a function of the average payoff in the first 15 experiences with TTT.3

3 We chose to focus on the last 5 here, rather than the last 10 (as in Table 3), as the current analysis examines individual differences and the focus on the last 5 increases the proportion of the participants that saw at least two rare outcomes (the most interesting case) before the choice we analyze.
T A B L E 3 The risk-rate during the last 10 presentations of the signal TTT, as a function of the sign of the average payoff from the risk option in the previous presentations of this signal. Note: n is the number of participants that contributed to the computation of each statistic. The sum is larger than the number of participants as some participants experienced both positive and negative average payoffs (in the last 10 TTT trials) and contributed to both measures.

F I G U R E 3 Risk-rate in the last five reactions to the signal TTT as a function of the average payoff in the first 15 experiences with TTT.
of Condition Abstract avoided the risky option in their last 5 choices).
Under one explanation of the large individual-difference pattern, this behavior reflects "sampling inertia": The four participants that behave as if they overweight the rare events rely on the same small sample in all of the five final trials, and this sample includes the rare event.

Table 2 shows that in 6 of the 16 problems, the difference between the risk-rates in the two description conditions was significant. In all six cases, the direction of the significant difference can be explained with the assertion that the description of the cues as experts' recommendations increased the tendency to follow the rule "take the risk if the cue includes at least two T's." In Condition Experts, this rule implies following the modal recommendation and taking the risk when the signal includes at least two T's. Importantly, in five of the six significant cases, following the modal recommendation impaired expected return. Thus, while the additional information could clarify the fact that one of the cues was "best," it did not increase the tendency to behave in accordance with the take-the-best rule.

T A B L E 4 The risk-rate in the first two trials (out of 160) of the experiment.

| Additional effects of the description condition
Another indication of the difference between the two description conditions is suggested by the risk-rates in the first two trials of the experiment, as presented in Table 4. In Condition Experts, participants start by following the modal recommendation, as one might expect in the absence of other information. In Condition Abstract, participants start with a high risk-rate for all cues, but the first feedback (after the first choice and before the second choice) strongly reduced the risk-rate. This high initial risk-rate in Condition Abstract can be explained by assuming generalization from old experiences (that occurred before the beginning of the current experiment) in situations in which the feedback was limited to the obtained payoff and initial exploration was effective. The quick decrease of the risk-rate after obtaining full feedback supports this explanation.

| The effect of experience
The results in Condition Experts reveal that experience reduced the deviation from maximization that comes from reliance on the experts (in reaction to the signals ATT and TAA) but did not eliminate this tendency. In Condition Experts, the effect of experience on the reaction to the signal TTT was negative (i.e., the maximization rate declined with experience) in Set Rare-Disaster and positive in Set Rare-Treasure.
F I G U R E 5 Risk-rates in blocks of five trials in each problem of set rare-disaster: Observed rates by condition (left), and the predictions of the models described below (right). EV, expected value maximization; SB-PAS, signal-based partially attentive sampler model; SB-PAS-eg, signal-based partially attentive sampler with estimated generalization.

4 An additional study of the current sets of problems is summarized in Bonder (2022).

Table 1 presents the prediction of PAS for four of the problems considered above. In the second model-based analysis, we considered alternative refinements of SB-PAS that can capture the difference between the two description conditions. Because Figures 4 and 5 show that the difference between the two description conditions diminishes with time, we chose to focus on the abstraction of the impact of generalization from old past experiences that occurred before the beginning of our experiments. As noted above, SB-PAS approximates the impact of reliance on old experiences by assuming random choice in some of the trials. To examine the impact of relaxing this assumption, we considered a variant of SB-PAS that replaces the random choice assumption with the assumption that when agents rely on old experiences, their risk-rates are similar to the risk-rates in the second trial in Condition Experts (presented in Table 4).5 The predictions of this model are presented in Figure 5.

Instead, when the choice between options is based on small samples of similar past experiences, changes in the incentive structure can affect the apparent complexity of the rules that best fit the observed behavior. The modal behavior of agents that rely on small samples agrees with the prescription of overly simple rules when these rules lead to the best outcomes in most cases and suggests the use of overly complex rules when the optimal rule is simple but leads to disappointing outcomes in most cases.
We also showed that human decision makers exhibit the pattern predicted by a reliance on small samples of similar past experiences.
This hypothesis correctly predicted behavior consistent with the take-the-best heuristic (Gigerenzer et al., 1991), even when take-the-best impaired maximization (was too simple), and also correctly predicted deviations from take-the-best that reflected the use of counterproductive, complex rules, when take-the-best was the optimal strategy.
The current analysis clarifies the variation in the complexity of rules that best capture behavior in two ways. First, our results demonstrate that in certain settings, behavior that appears to reflect rules of different complexity can be predicted with models that assume rules of fixed complexity. Thus, in these settings, the existence of contradictory deviations from optimal decisions does not imply the use of rules of different complexity. For example, the apparent contradictory reactions to demons and germs in the Middle Ages may not reflect rules of different complexity. Rather, the apparent difference can be the product of the incentive structure. In the Middle Ages, accepting the common belief in demons was rewarding (in most cases, but had extreme negative outcomes in some cases), and considering the impact of small organisms that cannot be observed (but can be addressed by careful hand washing) was not.

5 We focus on the second trial to reduce the impact of the apparent exploration in the first trial. Using the first trial does not change the results.
The second clarification focuses on tasks that involve sequential decisions like the decisions made while installing a new product and while making clinical recommendations. Our analysis can shed light on the initial decision, and this decision is likely to affect the complexity of the subsequent processes. For instance, the initial decision to skip reading instructions and contracts implies that the subsequent processes will not use this information, and is therefore likely to reflect oversimplification. The current reliance-on-small-samples hypothesis suggests that oversimplification of this type (as demonstrated by Bakos et al., 2014) is likely when reading the relevant information is important in expectation, but in most cases, skipping reading is effective (see Roth et al., 2016).
Another example involves the tendency of experts to trust their intuition rather than using a simple and effective linear regression algorithm (as demonstrated by Meehl, 1954). Our results suggest that preferring intuition that implies counterproductive complexity is more likely when the common outcome of this behavior is rewarding. The reward can be a subjective feeling of competence, but it can also be extrinsic. Similarly, the tendency to read and share fake news is likely in situations in which an incorrect belief in complex interactions is shared by the decision maker's peers and other people that evaluate his or her decisions.6

The main shortcoming of the current investigation involves its reliance on models that ignore the possibility of generalization between similar signals. For example, the initial reaction to the signal TAT will most likely be affected by past experiences with the signal TTA. We have considered models that try to capture this likely tendency and found that they do not improve prediction accuracy in a clear way. In natural settings, however, such between-signal generalizations are likely to be important.
Another limitation of our analysis is suggested by gambling phenomena that cannot be easily captured by the current reliance-onsmall-samples hypothesis. For example, many gamblers appear to use complex "problem-solving" rules to devise "optimal" strategies to win (Ejova & Ohtsuka, 2020) and invest effort to "change" their luck (e.g., Ohtsuka & Chan, 2010). We hope to address these and similar shortcomings in future research.
The wider theoretical implications of the current analysis include the generalization of the basic study of decisions from experience to address decisions in multicue tasks. To clarify the interesting implications of this generalization, it is constructive to focus on the relationship between the reliance-on-small-samples hypothesis (supported here and in basic studies of decisions from experience, see review in Erev & Haruvy, 2016) and the hypothesis that people tend to use the take-the-best rule (supported in studies of multidimensional decisions; Gigerenzer & Goldstein, 1996).

DATA AVAILABILITY STATEMENT
The raw data and pre-registration are available online (https://osf.io/yna58/).

The probability that a sample of size 5 (drawn with replacement) includes the +10 outcome, by trial:

Trial  Observed outcome  Probability the sample includes +10
9      −1                1 − (7/8)^5 = .487
10     −1                1 − (8/9)^5 = .445
11     −1                1 − (9/10)^5 = .410

APPENDIX B

Mandatory: Please read the instructions carefully and enter your details below. In each trial of this study you will be asked to choose between two options. One option maintains the status quo (provides a payoff of 0 points for sure), and the other can lead to a gain or a loss of points. The payoff from the risky option will vary from trial to trial. You will receive three cues that can help predict the exact payoff in each trial. Each cue will be the letter "T" or the letter "A". For example, the signal "TTA" implies that the first and second cues are "T", and the third cue is "A". Please type the word "proceed" in the comments field below. This is to make sure that you read and understand the current instructions. Your goal is to maximize your number of points. Your total number of points at the end of the experiment will determine the probability of earning the $0.5 bonus in addition to the $1 show-up fee. Please enter the following information: Gender, Age.

Note: The text in italics appears only in Condition Abstract. In Condition Experts this text was replaced with: The cues are recommendations of three experts. "T" (take) implies that the expert recommends "Taking the risk", and "A" (avoid) implies that the expert recommends avoiding the risk. For example, the sequence "TTA" implies that the first two experts recommend taking the risk, and the third expert recommends avoiding it.
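The probability column in the Appendix A table follows the closed form 1 − ((t − 2)/(t − 1))^5: before trial t the agent has seen t − 1 outcomes, exactly one of which is the +10. A short check (hypothetical code) reproduces the tabled values:

```python
def p_includes_plus10(t, k=5):
    """Probability that a sample of k outcomes (drawn with replacement)
    includes the single +10 among the t-1 outcomes seen before trial t."""
    n_seen = t - 1
    return 1 - ((n_seen - 1) / n_seen) ** k

for t in (9, 10, 11):
    print(t, round(p_includes_plus10(t), 3))
```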
T A B L E A 1 The estimated distribution of the PAS parameters. Note: Cells indicate the percentages of cases. When the value of κ_i,T is "VL", all past experiences were equally weighted, as expected when the sample size is very large (→ ∞).

PAS generalizes the sample-of-5 model in two ways. First, it allows for the possibility that the sample size depends on the task (problem) and on the agent. The sample size used by agent i in task T, denoted as κ_i,T, is a free parameter. Second, it allows for the possibility of reliance on old past experiences (experiences that occurred before the beginning of the current study) that diminishes with experience. In the current binary choice task, the probability of relying on old experiences is equal to δ_i,T^⌈(t−1)/2⌉, where δ_i,T is a second free parameter, and t is the trial number (in the current problem). Erev et al. estimated (using a constrained maximum likelihood estimation) the distribution of PAS parameters in the population based on the results of 87 distinct binary choice tasks. Table A1 presents the results.

APPENDIX
The model signal-based PAS (SB-PAS) generalizes PAS to multicue tasks. This model is similar to PAS with the additional constraint that only past experiences with the current signal are considered.
Both the original version of PAS and SB-PAS approximate the impact of reliance on old past experiences by assuming random choice. The model SB-PAS-eg relaxes this assumption. It assumes that the impact of reliance on old past experiences can be approximated by the choice rate in the second trial of the experiment (after the clarification of the nature of the feedback).