The goal of most hydrometeorological forecast communication is to provide information that people can interpret and use beneficially. Communicating forecasts effectively therefore requires understanding how intended audiences interpret and use forecast information presented in different ways. As capacity to estimate uncertainty in hydrometeorological forecasts has increased, the weather, climate and hydrology communities have become particularly interested in effective communication of uncertainty (e.g. Manning, 2003; Ryan, 2003; NRC, 2003, 2006; Demeritt et al., 2007; AMS, 2008; Budescu et al., 2009). Although hydrometeorological forecast interpretation and use have been studied in some real-world cases, there has been limited empirical study across broader populations. To help fill this knowledge gap, this paper examines how members of the broad US public interpret and use different types of weather forecasts, including those conveying uncertainty, based on people's responses to decision scenario questions.
The paper focuses on data from two sets of decision questions included in a nationwide survey of the US public's perceptions, interpretations, uses, and values for weather forecast information (Morss et al., 2008; Lazo et al., 2009). The first set of questions asked respondents their threshold, in terms of a percentage chance of rain or temperature below freezing, for taking protective action in a picnic or garden scenario. The second set asked respondents to use various quantitative precipitation or temperature forecasts to make binary (yes/no) protective decisions in a potential reservoir flooding or fruit frost scenario. The latter (binary) decision scenarios include a protective action component involving monetary costs and an outcome component involving monetary losses, similar to the cost-loss decision model employed in some studies of hydrometeorological forecast use and value. Although the decision scenarios tested here are simpler than most real world decision settings, examining use of information in a controlled context allows for replicability and more focused study.
The binary decision questions test forecasts that are either single-value (deterministic) or convey uncertainty in one of two ways: as a range of possible values or a percentage chance of exceeding a damage threshold. The forecast information is presented using fairly simple, text-based formats, designed to allow examination of fundamental aspects of forecast interpretation and use. These findings can then be built on in research with more complex communication formats, including graphs and icons.
The analysis examines respondents' stated probabilistic thresholds for taking protective action in the garden and picnic scenarios and their use of different forecast information in the reservoir and fruit scenarios. The results provide insight into respondents' interpretations of deterministic and uncertainty forecasts and their ability to use uncertainty information, including probabilistic forecasts, in decision making. They also indicate the extent to which respondents made decisions according to the cost-loss model. Each set of decision questions is tested in two scenarios to allow initial exploration of how results are similar and different across contexts. Further, by examining the decision question responses in conjunction with responses to other questions on the survey, the analysis explores relationships between respondents' perceptions and interpretations of forecasts and their forecast use.
The decision questions are related to experimental economics and psychology approaches that empirically assess human behaviour in controlled contexts. While such approaches have been widely used to study other topics, few studies have implemented them to examine use of weather forecast information. The present study complements this previous work (e.g. Roulston et al., 2006; Joslyn et al., 2007, 2009a, 2009b; Roulston and Kaplan, 2008; Nadav-Greenberg et al., 2008; Joslyn and Nichols, 2009) by asking related, but different, research questions about weather forecast interpretation and use, with the decision questions implemented in a survey rather than a laboratory setting. The survey implementation presents some potential disadvantages (see Section 2.3) but also advantages (Fehr et al., 2003; Naef and Schupp, 2009). One advantage is that the survey provides a respondent population that is larger and more representative of the general public than most related experimental work. This enhances the power of the analysis and the applicability of the results to the broader public. The survey also provides a large body of attitudinal, sociodemographic, and other data that are often not available in experimental decision studies.
The findings add to existing knowledge about communication, interpretation, and use of hydrometeorological forecasts in several ways. First, despite decades of daily forecast provision to the public, little is known empirically about how laypeople interpret different types of weather forecasts, particularly those conveying uncertainty information. Studies such as Murphy et al. (1980), Gigerenzer et al. (2005), Morss et al. (2008), and Joslyn et al. (2009a) have begun to address this by asking people their stated interpretations, but many aspects of interpretation are still poorly understood, and different aspects of interpretation may be revealed by studying information use. Second, recent studies employing hypothetical models of how people make decisions (such as cost-loss models) suggest that uncertainty information has potential value to hydrometeorological forecast users (e.g. Richardson, 2000; Zhu et al., 2001; Mylne, 2002; Palmer, 2002). However, for real decision makers, a variety of constraints limit effective communication and use of uncertainty information (e.g. Morss et al., 2005; NRC, 2006). Thus, empirical studies are needed to understand when communicating forecast uncertainty is desirable and how best to do so. Studies that empirically compare use of different types of weather forecasts, as this one does, are especially lacking. Further, weather forecasting is a domain in which risk information is communicated and used in decisions every day. Thus, studies such as the current one provide opportunities to apply concepts from risk communication and decision making under uncertainty to weather prediction and to inform these fields.
This section describes the data examined in this paper, which were collected in 2006 as part of a nationwide US survey. In addition to the questions described in this section, the survey also included questions on respondents' sources, uses, and values of weather forecasts; perceptions and interpretations of weather forecasts, including uncertainty information; sociodemographic characteristics; and other topics (see Morss et al., 2008; Lazo et al., 2009). Several analyses were performed in conjunction with data from these other questions, in order to examine possible influences on use of forecast information.
Analysis of the data was performed using Matlab and SAS. Tests of statistical significance employed in the analysis are described in Appendix A.
2.1. Survey development, implementation and respondent population
The survey instrument was developed and pretested using standard principles for developing survey questions and conducting survey research (Schuman and Presser, 1996; Dillman, 2000; Tourangeau et al., 2000). The survey was implemented in November 2006 on the Internet with a sample provided by a survey sampling company, designed to be representative of the US population reachable online. The respondent population includes people from every US state and the District of Columbia. Its sociodemographic characteristics are generally similar to those of the US population (U.S. Census Bureau, 2006), except that it is somewhat older and more educated and under-represents people with very low and high incomes. While the respondent population is not a random sample of the general US population, it is more diverse and representative than previous related work with convenience samples or students. For further detail on the survey development, implementation and respondent population, see Morss et al. (2008).
Although 1520 completed surveys were received, 55 respondents said that they did not ever use weather forecasts and were not asked most of the remaining questions, including those examined here. Thus, the analysis presented begins with data from 1465 respondents.
2.2. Threshold decision questions
The threshold decision questions asked respondents their probabilistic forecast threshold for taking protective action in two scenarios, referred to as the picnic and garden scenarios. The picnic scenario involves protection against precipitation, and the garden scenario involves protection against low temperature. These questions were motivated by a related question asked in Gigerenzer et al. (2005). Each respondent was randomly assigned to receive one of the two scenarios and the associated threshold question.
In the picnic scenario, respondents were told to suppose they have an outdoor picnic planned for tomorrow. They were then asked: ‘At what forecast chance of rain for tomorrow would you decide today to move your picnic indoors?’ There were 11 response options: forecast chance of rain from 10 to 100%, in intervals of 10%, or not moving the picnic indoors (i.e. take no action).
In the garden scenario, respondents were told to suppose they have a garden with plants that will die if the temperature drops below freezing (32 °F). They were then asked: ‘At what forecast chance that the temperature will be below freezing (32 °F) tonight would you decide today to cover your plants?’ There were 11 response options: forecast chance of temperature below freezing (32 °F) from 10 to 100%, in intervals of 10%, or not covering the plants (i.e. take no action).
The authors hypothesized that respondents would select a range of thresholds but had no other specific hypotheses about the responses.
2.3. Binary decision questions
The binary decision questions asked respondents whether or not they would take protective action given different forecast information in two scenarios, referred to as the reservoir and fruit scenarios. The two scenarios are summarized in Table I and presented in further detail below. The scenarios were patterned after the cost-loss decision situation (Thompson, 1952; Thompson and Brier, 1955), in the sense that they include two decision alternatives (protective action at a cost, or no protective action) and two possible outcomes involving monetary losses ($ 100 000 damage, or no damage). For each scenario, two cost conditions ($ 10 000 or $ 20 000) for protective action were tested. The questions were developed to compare respondents' use of different types of forecast information and to examine their decisions from a cost-loss perspective.
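The cost-loss logic behind the scenarios can be made concrete: in the classic model, a risk-neutral decision maker protects whenever the probability p of the damaging event exceeds the cost-loss ratio C/L. A minimal sketch (the function below is illustrative, not part of the survey instrument; only the $ 10 000/$ 20 000 costs and $ 100 000 loss come from the scenarios):

```python
def costloss_protect(p, cost, loss):
    """Classic cost-loss rule: protect when the expected loss p * loss
    exceeds the cost of protection, i.e. when p > cost / loss."""
    return p * loss > cost

# The two cost conditions imply probability thresholds C/L of 0.10 and 0.20:
for cost in (10_000, 20_000):
    print(f"cost $ {cost}: protect when P(damage) > {cost / 100_000:.2f}")
```

Under this rule, a respondent in the $ 10 000 condition should protect at any percentage-chance forecast above 10%, and a respondent in the $ 20 000 condition only above 20%.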
Table I. Overview of the reservoir and fruit scenarios used in the binary decision questions described in Section 2.3 (each with two cost conditions)
                                    Reservoir scenario        Fruit scenario
Potential damage (monetary loss)    $ 100 000                 $ 100 000
Damage threshold                    4 in. or more of rain     below 32 °F
Cost of taking protective action    $ 10 000 or $ 20 000      $ 10 000 or $ 20 000
Number of respondents after/before removing all-yes and all-no responses (total 1233/1465)
The reservoir scenario involves protection against precipitation and the fruit scenario involves protection against low temperature, similar to the picnic and garden scenarios, respectively, discussed for the threshold question in Section 2.2. Each respondent was (randomly) assigned to receive either precipitation or temperature scenarios for both sets of questions. In other words, respondents who received the picnic (garden) scenario for the threshold decision question received the reservoir (fruit) scenario for the binary decision questions. Each respondent was also randomly assigned to one of the two protective-action cost conditions. Thus, for the binary decision questions, the respondent population was divided into four groups (Table I).
In the reservoir scenario, respondents were told:
‘Suppose you are a manager of a local water reservoir. If there are 4 inches or more of rain tomorrow, your reservoir will overflow and flood the town, causing $ 100 000 in damages (but no injuries or deaths) that your company must pay for. You can prevent a potential flood by releasing water from your reservoir today, but releasing water will cost your company ($ 10 000 or $ 20 000).’
The four possible combinations of decisions (action or no action) and outcomes (damage or no damage) were explained. Respondents were then presented nine forecast conditions, one at a time, in random order. For each forecast condition, they were asked ‘Would you spend the ($ 10 000 or $ 20 000) to release water from your reservoir?’
In the fruit scenario, respondents were told:
‘Suppose you are a fruit grower and your crop is nearly ripe. If the temperature drops below freezing (32 °F) tonight and your crop is unprotected, it will be damaged and you will lose $ 100 000. You can prevent potential freeze damage by protecting your crop today, but protecting your crop will cost you ($ 10 000 or $ 20 000).’
As in the reservoir scenario, the four combinations of decisions and outcomes were explained. Respondents were then asked ‘Would you spend the ($ 10 000 or $ 20 000) to protect your crop?’ for each of nine forecast conditions.
The response options were ‘yes’ or ‘no’ for each question. Respondents were not allowed to return to previous questions, nor were they given information about which outcome occurred.
The nine forecast conditions tested in the reservoir and fruit scenarios are shown in Table II. The forecast conditions were designed to manipulate a few relatively simple dimensions of information presentation, using text-based formats. The ‘single-value’ forecast conditions provide deterministic forecast information, similar to that currently available in most forecasts (NRC, 2006); the forecast values are different distances from the damage threshold. Two forms of uncertainty communication were tested: ‘range’ forecasts and ‘percentage-chance’ forecasts. The range forecast conditions were designed to represent a fairly simple form of uncertainty communication, one that does not involve probabilities. The range forecasts are symmetrical about the third single-value forecast, with the first range meeting the damage threshold and the second range exceeding the threshold. The percentage-chance forecast conditions were designed to represent a somewhat more complex form of uncertainty communication; they present different probabilities of reaching or exceeding the damage threshold.
Table II. Forecast conditions presented in the binary decision questions
                                       Reservoir scenario                      Fruit scenario
Single-value forecast conditions       1 in. of rain                           Low temperature of 37 °F
                                       2 in. of rain                           Low temperature of 35 °F
                                       3 in. of rain                           Low temperature of 33 °F
Range forecast conditions              2–4 in. of rain                         Low temperature of 32–34 °F
                                       1–5 in. of rain                         Low temperature of 31–35 °F
Percentage-chance forecast conditions  5% chance of 4 in. or more of rain      5% chance of 32 °F or lower
                                       10% chance of 4 in. or more of rain     10% chance of 32 °F or lower
                                       20% chance of 4 in. or more of rain     20% chance of 32 °F or lower
                                       40% chance of 4 in. or more of rain     40% chance of 32 °F or lower
Note that each respondent received only one of the two scenarios and one of the two cost conditions. For the scenario and cost condition they were given, however, each respondent received all nine forecast conditions (in random order). Thus, the study employs a combination of between- and within-subject design.
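This mixed design can be sketched as a simple assignment routine; the function and labels below are illustrative (the survey software handled the actual randomization):

```python
import random

def assign_respondent(rng):
    """Between-subject factors: one scenario and one cost condition per
    respondent. Within-subject factor: all nine forecast conditions,
    presented in random order."""
    scenario = rng.choice(["reservoir", "fruit"])
    cost = rng.choice([10_000, 20_000])
    conditions = [f"forecast_{i}" for i in range(1, 10)]
    rng.shuffle(conditions)
    return scenario, cost, conditions

rng = random.Random(2006)
scenario, cost, order = assign_respondent(rng)
```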
While the two scenarios and associated forecast conditions are similar in many ways, they are not directly parallel. Differences include the scenario content, the phrasing of the protective decision, the positive versus negative perspective of the damage thresholds and uncertainty forecast conditions, and the different numerical values in the thresholds and forecasts. A large body of psychology and related research suggests that such differences can affect people's responses (e.g. Tversky and Kahneman, 1981; Kühberger, 1998; Levin et al., 1998; Windschitl and Weber, 1999; Joslyn et al., 2009b). Given this, the primary goal of testing two scenarios is to examine how results are consistent or different across the two, as a first exploration of the extent to which the findings might apply across contexts.
The binary decision questions were motivated by related experimental work in behavioural economics and psychology that examines individuals' decisions (e.g. Kagel and Roth, 1995; Loewenstein, 1999; Hertwig and Ortmann, 2001; Croson, 2005). The questions have elements of both approaches: for example, the presentation of monetary costs and benefits in the scenarios is more similar to economics experiments, while the context provided in the scenarios is more similar to psychology experiments (Croson, 2005; Ariely and Norton, 2007). The survey implementation employed here also differs from the typical implementation of such experiments in a laboratory setting. From an experimental economics perspective, a major limitation of this study associated with the survey implementation is that subjects did not receive real monetary payoffs related to their decisions. Such monetary incentives are often not used in psychology studies, and previous work suggests that in many situations they do not substantially alter subjects' average behaviour. Nevertheless, monetary incentives can influence some types of findings, and they do tend to reduce variability in subjects' responses (Smith and Walker, 1993; Camerer and Hogarth, 1999; Hertwig and Ortmann, 2001). On the other hand, the survey implementation has the advantage of providing a larger, more diverse respondent population than typical experimental implementations, which usually involve smaller samples of students (Fehr et al., 2003; Naef and Schupp, 2009). With these considerations in mind, these findings are informative as a first study of some of the issues examined here, and they can inform future related survey and laboratory-based work.
Regarding the single-value forecast conditions, the authors hypothesized that some respondents would choose to protect when the forecast had not reached the damage threshold, and that more respondents would protect as the forecast became closer to the threshold. Regarding the range forecast conditions, the authors hypothesized that as an extreme value in the range reached and then exceeded the damage threshold, more respondents would protect. Regarding the percentage-chance forecasts, the authors hypothesized that as the likelihood of exceeding the threshold increased, more respondents would protect, and that the responses would provide evidence of decision making consistent with the cost-loss model. The authors also hypothesized that respondents would be less likely to choose protective action in the higher cost condition.
3. Results: probabilistic thresholds for protection
Figure 1 presents results from the decision threshold questions, in which respondents were asked to indicate their probabilistic threshold for protection in either the picnic or garden scenario (described in Section 2.2). Comparing responses within each graph in Figure 1 indicates that given a similar situation, different individuals can have very different probability thresholds for decision making, supporting the hypothesis. These differences among individuals likely result from a combination of people's different tolerances for weather-related risk, their different perceptions and interpretations of weather forecasts, and other factors (discussed further below).
Comparing responses between the two graphs in Figure 1 indicates that the respondent population responded differently to the two scenarios, in two ways. First, the distribution of respondents' thresholds for taking protective action is significantly different in the two scenarios (Wilcoxon rank-sum test, respondents who chose not to take protective action removed; N = 1389, U = 26.389, p < 0.0001). (Throughout the manuscript, the term ‘significant’ is used to denote statistical significance. Tests of statistical significance are described in the Appendix.) On average, respondents selected a higher threshold in the picnic scenario (58.3%) than in the garden scenario (53.3%). Given the multiple differences between the scenarios, understanding why respondents' thresholds differed would require further research. Overall, however, this finding suggests that context can influence people's use of weather forecasts.
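For readers unfamiliar with the Wilcoxon rank-sum (Mann-Whitney) test, the U statistic underlying this comparison can be sketched in a few lines. The threshold values below are hypothetical, not the survey data, and a p-value would additionally require a normal approximation or exact tables (e.g. scipy.stats.mannwhitneyu):

```python
def rank_sum_U(x, y):
    """Mann-Whitney U statistic for sample x versus sample y,
    using midranks to handle tied values."""
    combined = sorted(x + y)

    def midrank(v):
        # average of the 1-based positions the value occupies
        first = combined.index(v) + 1
        last = first + combined.count(v) - 1
        return (first + last) / 2

    rank_sum_x = sum(midrank(v) for v in x)
    n_x = len(x)
    return rank_sum_x - n_x * (n_x + 1) / 2

# hypothetical protection thresholds (%), not the survey responses
picnic = [60, 70, 50, 80]
garden = [50, 40, 60, 100]
U = rank_sum_U(picnic, garden)
```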
Second, the distribution of responses in the garden scenario exhibits secondary peaks at the lowest (10%) and highest (100%) thresholds, suggesting that people tended to respond less consistently than in the picnic scenario. The garden scenario asked about probability of temperature exceedance forecasts, which are generally not currently publicly available, while the picnic scenario asked about probability of precipitation (PoP) forecasts, which have been regularly provided to the US public for decades. One possibility is that people responded less consistently in the garden scenario because of their relative lack of familiarity with the forecast format. Further research would be required to test this explanation and, more generally, to examine how experience with a type of forecast information affects its use.
As noted above, it is expected that people's use of forecast information is related to their perceptions and interpretations of forecasts, including forecast uncertainty. To begin exploring these relationships, people's responses to the threshold decision questions were analysed in conjunction with their responses to two other sets of questions from the survey. The first analysis examined whether respondents who reported different levels of confidence in 1-day PoP or temperature forecasts (see Morss et al., 2008) had different distributions of protective-action thresholds in the picnic or garden scenarios, respectively. The analysis was performed separately for the two scenarios, and no significant differences were found (Kruskal–Wallis test, respondents who chose not to take protective action removed; picnic: N = 698, χ2 = 6.370, df = 4, p = 0.1732; garden: N = 691, χ2 = 2.314, df = 4, p = 0.6783). This could be because people's confidence has a complex relationship with their thresholds for protective action. For example, as people's confidence decreases, they may have lower confidence that a damaging event will be detected, leading to a lower threshold for protection; alternatively, their lower confidence may cause them to be less likely to take action based on a forecast, leading to a higher threshold or a greater likelihood to not take protective action.
The second analysis examined relationships between people's interpretations of PoP and their use of PoP forecasts in decision making, following Gigerenzer et al. (2005) and Joslyn et al. (2009a). These two studies, as well as Morss et al. (2008), found that many laypeople still do not know the technically correct definition of PoP forecasts, corroborating earlier work by Murphy et al. (1980). Gigerenzer et al. (2005) found that people who interpreted PoP as the percentage of the region or of the time it would rain had a slightly lower PoP threshold for taking an umbrella than those with a ‘correct’ interpretation. Joslyn et al. (2009a, p 188) found that ‘misinterpreting the forecast as more than half the area or time affects decision making by increasing the tendency to take precautionary action’. Both discuss this result in terms of people's tendency to misinterpret PoP as a deterministic forecast of rain, with the only uncertainty being about where or when; this leads people to take protective action at lower PoPs or, equivalently, to be more likely to take protective action for a given PoP.
Morss et al. (2008) examined people's PoP interpretations using data from the survey discussed here. While they found that many respondents selected a ‘region’ or ‘time’ interpretation of PoP when asked in a closed-ended format, other aspects of their results suggest that many people do not have preconceived interpretations of PoP of the types emphasized in previous studies. Further, despite difficulty clearly defining PoP forecasts, most people find PoP forecasts important (Morss et al., 2008; Lazo et al., 2009). Based on their results, Morss et al. (2008) suggest that a correct technical understanding of PoP may not be important to many people in using PoP. Given this different view of people's interpretations and use of PoP and the present study's larger, more representative respondent population, the authors decided to examine the relationship between PoP interpretation and use in this data set. The decision question studied here also differs from that asked by Joslyn et al. (2009a), so that this study examines this relationship from a complementary decision-making angle.
This relationship was analysed using data from the 629 respondents who received both the picnic scenario and a closed-ended question asking their interpretation of the PoP forecast ‘60% chance of rain tomorrow’ (see Morss et al., 2008). Respondents' PoP thresholds for taking protective action were examined, grouped according to their selected interpretation of PoP. For comparison with Joslyn et al. (2009a), results were also examined with the ‘region’ and ‘time’ interpretation groups combined. Results are shown in Table III, including the mean PoP threshold for each group and a statistical test of whether each group's distribution of PoP thresholds is significantly different from that of the group who selected the ‘correct’ (‘days’) interpretation. The mean thresholds do differ across the groups with different interpretations. However, only one group, the ‘time’ interpretation group, has a threshold distribution that is significantly different (at the 5% level) from the ‘correct’ group. The ‘time’ group does have a lower mean PoP threshold, which would support Joslyn et al.'s (2009a) conclusion about ‘deterministic’ misinterpretations of PoP influencing decision making. However, the ‘region’ group does not have a significantly different threshold distribution from the ‘correct’ group, nor do the ‘region’ and ‘time’ groups combined. This, along with other results in Table III, suggests that the relationship between PoP interpretation and decision making is more complex.
Table III. Results for the probabilistic threshold question in the picnic scenario, with subjects grouped by their responses to the survey's closed-ended question on interpretation of 60% PoP (discussed in Morss et al., 2008, their Section 3c and Table II)
Columns: mean PoP threshold for protection; standard deviation of PoP threshold for protection; probability that the group's PoP-threshold distribution differs from that of the ‘days’ group. The ‘days’ interpretation is the technically correct one, according to how PoP forecasts are verified, as interpreted by Gigerenzer et al. (2005).
Overall, the result that people's decision thresholds depend on the individual and the context suggests the importance of giving people forecast information that they can use to make decisions based on their own criteria given the situation, rather than recommending decisions based on assumed thresholds for protection. The study's findings also indicate the importance of understanding and considering contextual and individual influences on forecast interpretation and use for developing effective forecast communication formats. Further work is needed to understand the potentially complex relationships between people's perceptions and interpretations of forecasts (including PoP) and their use.
4. Results: binary protective decisions with different forecast information
This section examines results from the binary decision questions, in which respondents were asked to make yes-no protective decisions in either the reservoir or fruit scenario (described in Section 2.3 and Table I). Respondents were given one of the two scenarios and one of the two cost conditions and were asked whether they would take protective action under each of the nine forecast conditions in Table II.
Of the 1465 respondents, 183 answered ‘no’ (take no protective action) to all nine forecast conditions, and 49 answered ‘yes’ (take protective action) to all nine. These respondents' repeated yes or no decisions suggest that they may have had difficulty understanding or answering the questions or lacked motivation to expend the cognitive energy required to do so (e.g. Jackson, 1967; Krosnick, 1991). These responses also provide little information for the analyses presented here. Consequently, these 232 respondents (15.8%) were removed from the analyses in this section. (Removing these respondents does not change the overall findings.) The number of respondents for each of the scenarios and cost conditions, before and after removal of these respondents, is provided in Table I.
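This screening step amounts to a simple filter on each respondent's nine yes/no answers; a sketch with hypothetical response data (the survey data themselves are not reproduced here):

```python
def drop_uniform_responders(responses):
    """Remove respondents who answered all 'yes' or all 'no' across the
    nine forecast conditions (responses: id -> list of 'yes'/'no')."""
    return {rid: answers for rid, answers in responses.items()
            if len(set(answers)) > 1}

# hypothetical responses, not the survey data
data = {
    1: ["yes"] * 9,               # all-yes: removed
    2: ["no"] * 9,                # all-no: removed
    3: ["no"] * 5 + ["yes"] * 4,  # mixed: retained
}
kept = drop_uniform_responders(data)
```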
First, results are compared across the scenarios and cost conditions, in Section 4.1. Sections 4.2, 4.3, and 4.4 then compare respondents' decisions among the nine forecast conditions, categorized as shown in Table II. Recall that respondents received the forecast conditions in random order and were not permitted to return to previous questions, which reduces the potential for order effects on the major findings. Most of the results presented are for comparisons across the subject panel; a few within-subject comparisons are also discussed.
4.1. Response to decision context and cost of protection
Two general patterns appear across the binary decision question results. These patterns are analysed here by pooling results across all nine forecast conditions, but they are also evident in responses to individual forecast conditions (see Figure 2 and subsequent figures).
First, protective action was chosen significantly more often in the fruit scenario than in the reservoir scenario (Pearson chi-squared test; N = 11 097, χ2 = 137.315, df = 1, p < 0.0001). As discussed in Section 2.3, the scenarios and associated decisions and forecasts differ in multiple aspects of content and presentation that could have contributed to this difference in tendency to protect. Consequently, further research would be needed to understand this result. More generally, this result suggests that, as discussed in Section 3, context can affect the interpretation and use of weather forecasts.
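For a 2×2 contingency table such as this one (scenario by protect/not protect), the Pearson chi-squared statistic reduces to a closed form; a sketch with hypothetical counts, not the survey tallies:

```python
def chi2_2x2(table):
    """Pearson chi-squared statistic (df = 1) for a 2x2 contingency
    table [[a, b], [c, d]], e.g. scenarios (rows) by decision (cols)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# hypothetical protect / not-protect counts for two scenarios
stat = chi2_2x2([[30, 70], [50, 50]])
```

In practice a library routine (e.g. scipy.stats.chi2_contingency) would also supply the p-value and handle larger tables.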
Second, there were no significant differences in how often protective action was chosen between the two cost conditions (Pearson chi-squared test; N = 11 097, χ2 = 0.073, df = 1, p = 0.7865; see also the logistic regression analysis in Section 4.4). This result rejects the hypothesis that respondents would be more likely to protect when the cost of protection was lower. The authors suspect that a response to cost was not detected because, given the hypothetical scenarios and the lack of real monetary payoffs in the study design, the monetary values in the questions were not substantially meaningful to many respondents. This finding may also have been influenced by the between-subject design for the cost condition. Given this result, the $ 10 000 and $ 20 000 cost conditions are combined in most of the remaining analysis and discussion (although the cost conditions are still depicted separately in the figures).
4.2. Interpretation and use of single-value forecasts
This section compares respondents' decisions in the single-value (deterministic) forecast conditions tested (Figure 2). The results indicate how people responded when the deterministic forecast was different distances from the damage threshold, providing information about people's interpretations of deterministic forecasts as revealed by their stated decisions.
Figure 2 shows that, as hypothesized, some respondents chose to protect when the forecast value had not reached or exceeded the damage threshold. This indicates that many people read uncertainty into deterministic low-temperature and quantitative precipitation forecasts (corroborating and extending Morss et al.'s (2008) similar finding based on people's stated interpretations of deterministic high-temperature forecasts).
In addition, as hypothesized, significantly more respondents chose to protect as the single-value forecast became closer to the damage threshold, in both scenarios (Pearson chi-squared test, cost conditions combined; reservoir: N = 1821, χ2 = 427.042, df = 2, p < 0.0001; fruit: N = 1878, χ2 = 335.443, df = 2, p < 0.0001; see also the logistic regression model results discussed later in this section). Within-subject comparisons indicate that only 8% of individuals did not follow this pattern, in other words, changed their response from ‘yes’ to ‘no’ as the forecast grew closer to the threshold. Together, these results suggest that many if not most respondents were able to interpret and use the deterministic forecasts in the general manner expected by the authors when posing the questions.
These data also provide some information about respondents' interpretations of uncertainty in single-value low-temperature and quantitative precipitation forecasts. For example, since the damage threshold in the fruit scenario is 32 °F, the 72% of respondents who chose to take protective action given a forecast of 33 °F indicated a belief that there was at least 1 °F of uncertainty in the forecast. More specifically, if one imagines that respondents infer some uncertainty distribution about a deterministic forecast, this 72% believed there was sufficient probability of temperatures below 32 °F to merit protective action. Some of the remaining 28% may also have believed that the temperature might fall below 32 °F, but judged it not sufficiently likely to merit protection. The other data in Figure 2 can be used to make similar inferences about respondents' interpretations of uncertainty in the other deterministic forecasts tested; for example, 59% of respondents indicated an interpretation of at least 1 in. of uncertainty in a deterministic forecast of rain amount. The probability of damaging weather required for a person to take protective action depends on factors not explored in the survey. Consequently, further research would be needed to use data on people's decisions to analyse their interpretations of uncertainty in greater detail.
Following on the discussion in Section 3, the authors began exploring relationships between people's decisions using deterministic forecasts and their perceptions and interpretations of forecast uncertainty. To do so, the data from Figure 2 were analysed in conjunction with the survey's data on respondents' stated confidence in 1 day forecasts of different parameters and their stated expectations of high temperature given a 1 day forecast (Morss et al., 2008). Relationships were examined in several different ways, and no significant relationships were found. The most comprehensive analysis was construction of logistic regression models (see Section 4.4). Data for the reservoir and fruit scenarios were modelled separately, with cost conditions combined. The dependent variable was yes/no response; the independent variables were: (1) distance of the deterministic forecast from the damage threshold; (2) confidence in 1 day forecasts of precipitation amount (reservoir scenario) or temperature (fruit scenario); and (3) for the fruit scenario, interpretations of 1 day high temperature forecasts. Interactions between variables were included in the models, to test, e.g. whether people with lower confidence were more likely than others to protect when the deterministic forecast was closer to the threshold. As expected given Figure 2, the distance of the forecast from the threshold was a highly significant predictor in both scenarios. The other variables tested were not significant predictors. Thus, as discussed in Section 3, further work is needed to explore relationships between people's perceptions and interpretations of weather forecasts and their forecast use.
4.3. Interpretation and use of range forecasts
Next, respondents' decisions are examined in the range forecast conditions tested, compared with the single-value forecast conditions with equivalent midpoints (Figure 3). The comparison provides information about how people interpreted forecasts that conveyed uncertainty using ranges, as revealed by their stated decisions.
The proportion of respondents who chose to protect differed significantly across the three forecast conditions shown in Figure 3, in both scenarios (Pearson chi-squared test, cost conditions combined; reservoir: N = 1821, χ2 = 19.022, df = 2, p < 0.0001; fruit: N = 1878, χ2 = 11.318, df = 2, p = 0.0035). However, the differences in response to the three forecast conditions in Figure 3 are small compared, for example, to the differences in Figure 2. This is likely because the ranges tested are fairly narrow (and have the same midpoint). Recall from Section 4.2 that many respondents appear to read at least 1 °F of uncertainty into temperature forecasts and at least 1 in. of uncertainty into quantitative precipitation forecasts. Given these interpretations, adding a range of ±1–2 °F or 1–2 in. of rain to a forecast appears not to have altered many respondents' decisions. This is corroborated by a within-subject comparison, which indicates that 59% of respondents (70% in the fruit scenario and 47% in the reservoir scenario) selected the same decision in all three forecast conditions shown in Figure 3.
Despite the small differences in Figure 3, the fruit scenario results support the hypothesis that as one of the extreme values in the range reached and then exceeded the damage threshold of 32 °F, more respondents would choose to protect. However, in the reservoir scenario, fewer respondents chose to protect given a forecast of 1–5 in. than a forecast of 2–4 in. This may be because the placement of ‘4 in.’ at the end of the 2–4 in. forecast emphasizes the 4 in. damage threshold, whereas in the 1–5 in. forecast condition, respondents have to conduct a simple subtraction to compare the forecast with the threshold. As discussed earlier, previous psychology research has shown that such details in presentation can be important. Overall, therefore, these results suggest that further research is needed to understand how people interpret and use forecasts expressing uncertainty as ranges similar to those tested. This is particularly important given that, besides PoP, such ranges (or implied ranges, such as ‘low temperature in the low 30s’) are one of the more common ways that weather forecast uncertainty is currently communicated to the public.
4.4. Interpretation and use of percentage-chance forecasts
Finally, respondents' decisions are examined in the percentage-chance forecast conditions tested: a 5, 10, 20, or 40% chance that the damage threshold will be reached or exceeded (Figure 4). Studies indicating limited numeracy among laypeople, along with anecdotal experience from weather forecast providers, have led to discussion in the weather prediction community about whether probabilities are a suitable form for presenting uncertainty to the general public (e.g. Murphy et al., 1980; Sink, 1995; NRC, 2006; Roulston and Kaplan, 2008). The present analysis contributes to this discussion by empirically examining respondents' ability to understand and use one form of probabilistic weather forecasts. The data also allow comparison of people's decisions in the scenarios with decisions hypothesized in the cost-loss model, which has been used as a prototype decision model for studying use and value of probabilistic hydrometeorological forecasts (e.g. Katz and Murphy, 1997).
Figure 4 indicates that in both scenarios, more respondents chose to take protective action as the forecasted chance of exceeding the damage threshold increased. To test this relationship, data from the two scenarios and two cost conditions were combined and a logistic regression model (O'Connell, 2005) was constructed, with yes/no response as the dependent variable (Table IV). (Logistic regression is used because the response variable is binary.) Because this analysis combines individuals' responses to four questions (one for each percentage-chance forecast), potential intra-subject correlation was accounted for using the method of generalized estimating equations (Allison, 1999; Ballinger, 2004; Lazo et al., 2010). Percentage-chance forecast was a highly significant predictor of respondents' decisions: the higher the percentage chance of damage, the more likely respondents were to take protective action. Respondents were also significantly more likely to protect in the fruit scenario than in the reservoir scenario, and cost of protection was not a significant predictor, corroborating the results in Section 4.1. Various sociodemographic characteristics and measures of forecast perception and interpretation were tested as additional model predictors, using data from other questions on the survey, but no stable, significant relationships were found.
Table IV. Results from logistic regression model examining respondents' decision whether to take protective action (yes = 1, no = 2) as a function of percentage-chance forecast (5, 10, 20, or 40%), decision scenario (reservoir = 1 or fruit = 2), and cost of protection ($ 10 000 or $ 20 000)
The regression models the probability of the lower value response (‘yes’); see Section 4.4 for description of model (N = 4932).
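To make the direction of this relationship concrete, the sketch below fits a single-predictor logistic curve to hypothetical yes/no responses whose protect rate rises with the forecast percentage chance. It is not the authors' model: the survey analysis used generalized estimating equations to handle intra-subject correlation and included scenario and cost as predictors, whereas this illustration uses invented data and plain gradient ascent on the log-likelihood.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(x, y, lr=0.005, epochs=10000):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by gradient ascent on the
    log-likelihood (note: ignores intra-subject correlation, unlike GEE)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(epochs):
        p = sigmoid(b0 + b1 * x)
        b0 += lr * np.sum(y - p) / n
        b1 += lr * np.sum((y - p) * x) / n
    return b0, b1

# Invented responses mimicking the qualitative pattern in Figure 4:
# the fraction choosing to protect rises with the percentage chance.
rng = np.random.default_rng(0)
chances = np.repeat([5.0, 10.0, 20.0, 40.0], 200)  # forecast conditions
rates = np.repeat([0.2, 0.35, 0.55, 0.8], 200)     # hypothetical protect rates
y = (rng.random(800) < rates).astype(float)

b0, b1 = fit_logistic(chances, y)  # b1 > 0: protection more likely at higher chance
```

With this synthetic data the fitted slope b1 is positive, mirroring the direction of the effect reported in Table IV; its magnitude carries no meaning because the data are invented.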
The study design used here does not allow a strong test of whether people can use uncertainty-explicit forecasts to make better decisions than they would using deterministic forecasts. However, Figure 4 and Table IV do provide evidence that much of the respondent population was able to interpret probabilistic forecast information of the type provided well enough to use it in the binary decision questions. This is further supported by a within-subject analysis, summarized in Table V. Looking at the right hand column (all respondents combined), 55% of individuals ordered their responses to the four percentage-chance conditions in a manner that suggests understanding of the forecasts; in other words, their decision changed from no (don't protect) to yes (protect) as the chance of exceeding the damage threshold increased (rows b–d). An additional 30% made the same decision under all four percentage-chance conditions (rows a and e). Only 15% chose to protect at a lower chance and not to protect at a higher chance (row f); these are referred to as ‘unordered’ responses. (Approximately one-quarter of these respondents (3.5% of all respondents) gave a pattern of responses that changed from take protective action to do not take protective action as the percentage chance increased (YNNN, YYNN, or YYYN using the format in Table V) and thus are not strictly ‘unordered’.)
Table V. Within-subject analysis of responses to the four percentage-chance conditions, with data from the reservoir and fruit scenarios combined
Pattern of responses for different percent chances of exceeding the damage threshold (Y = take protective action; N = do not take protective action)
Percent of respondents ($ 10 000 cost condition)
Percent of respondents ($ 20 000 cost condition)
Percent of respondents (cost conditions combined)
Row a (e) represents subjects who said yes (no) to all four percentage-chance forecast conditions. Rows b–d represent subjects who provided ‘ordered’ responses, and row f represents subjects who provided ‘unordered’ responses, as discussed in Section 4.4. The percent of respondents who provided each pattern of responses is shown for the $ 10 000 protective-action cost condition (N = 622), the $ 20 000 cost condition (N = 611), and the two cost conditions combined (N = 1233). For the $ 10 000 and $ 20 000 cost-condition results, (RA) denotes respondents who chose to protect when the cost of protection was greater than the expected loss (indicating risk-averse behaviour), and (RS) denotes respondents who chose to protect when the cost of protection was less than the expected loss (indicating risk-seeking behaviour).
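The ordered/unordered classification in Table V can be expressed as a small helper function. This is an illustrative reconstruction, not the authors' code; the function name and labels are hypothetical. A subject's responses are written as a string of Y/N answers ordered by increasing percentage chance, using the format of Table V.

```python
def classify_pattern(responses):
    """Classify one subject's four responses ('Y' = protect,
    'N' = do not protect), listed from the 5% to the 40% condition.

    'constant'  -> same decision in all four conditions (rows a, e)
    'ordered'   -> switches once from N to Y as chance rises (rows b-d)
    'unordered' -> protects at a lower chance but not a higher one (row f)
    """
    switches = sum(1 for prev, cur in zip(responses, responses[1:]) if prev != cur)
    if switches == 0:
        return "constant"
    if switches == 1 and responses[0] == "N":
        return "ordered"
    return "unordered"

# Patterns discussed in Section 4.4:
assert classify_pattern("NNNY") == "ordered"    # threshold between 20 and 40%
assert classify_pattern("NNNN") == "constant"   # threshold above 40%
assert classify_pattern("YNNN") == "unordered"  # reverse-ordered; grouped into row f
```

Tallying these labels over all subjects would reproduce the row groupings of Table V (55% ordered, 30% constant, 15% unordered, cost conditions combined).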
The 15% of respondents who provided unordered responses may not have understood the scenario or the percentage-chance forecasts, or their decisions may have been influenced by other factors. Such variability or inconsistency in responses is typical in laboratory experiments of this type (e.g. Camerer, 1989; Starmer and Sugden, 1989; Hey, 2001; Blavatskyy, 2007). Recall that these four forecast conditions were randomly ordered among the nine presented, and that the survey implementation provided no monetary incentives for or feedback on decisions. Further, previous research suggests limitations in numeracy among much of the public (e.g. Lipkus et al., 2001; Peters et al., 2006), and the types of percentage-chance weather forecasts tested here are unfamiliar to most people. Given these considerations, a notable percentage of individuals provided responses suggesting that they understood probabilistic weather forecasts of the type tested well enough to use them to make simple protective decisions. One possible implication is that many members of the public may not be accurately characterized as ‘less sophisticated’ users of uncertainty-explicit weather forecasts, compared to ‘more sophisticated’ users such as emergency managers or members of the private sector (as discussed in, e.g. NRC, 2003; Novak et al., 2008; Santos et al., 2008).
The within-subject analysis in Table V also provides information about respondents' thresholds for taking protective action. Looking again at the right hand column with all respondents combined, 6% of respondents changed their response from no at 5% chance of damaging weather to yes at 10% chance, indicating a decision threshold between 5 and 10% (row b). An additional 13 and 36% indicate thresholds between 10–20 and 20–40%, respectively (rows c and d). Assuming they understood the forecasts and questions, respondents who said yes or no to all four forecast conditions indicate thresholds below 5% or above 40%, respectively (rows a and e). As expected from Figure 4 and respondents' tendency to protect more often in the fruit scenario (Section 4.1), a within-subject analysis of the two scenarios separately (not shown) suggests that respondents had higher thresholds for protection in the reservoir scenario than the fruit scenario. These results further support those discussed in Section 3, that different individuals have different percentage-chance thresholds for decision making based on their risk tolerance, the context, and other factors.
The decision criterion in the cost-loss model is: protect when p > C/L and do not protect when p < C/L, where p is the probability of damaging weather, C is the cost of protection, and L is the loss suffered when damaging weather occurs. It is derived by comparing the cost of protection with the expected loss due to weather damage. (When p = C/L, the two choices have equal expected cost, so the decision maker is indifferent between protecting and not protecting.) C/L is equal to 0.1 and 0.2 in the $ 10 000 and $ 20 000 cost conditions, respectively. Thus, according to the simplest formulation of the cost-loss decision model, respondents are expected to protect (not protect) when the forecasted chance of damage is above (below) this level.
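As a worked illustration, the criterion reduces to a one-line rule. The sketch below is hypothetical code, not part of the study; the $ 100 000 loss is implied by the stated C/L ratios of 0.1 and 0.2 for the two cost conditions.

```python
def cost_loss_protect(p, C, L):
    """Simplest cost-loss rule: protect exactly when the expected
    loss p*L exceeds the protection cost C, i.e. when p > C/L.
    (At p == C/L the two choices have equal expected cost.)"""
    return p * L > C

LOSS = 100_000  # implied by C/L = 0.1 and 0.2 for C = $10 000 and $20 000

# Prescribed decisions for the four percentage-chance forecasts tested:
for chance in (0.05, 0.10, 0.20, 0.40):
    protect_10k = cost_loss_protect(chance, 10_000, LOSS)
    protect_20k = cost_loss_protect(chance, 20_000, LOSS)
```

Under this rule a respondent in the $ 10 000 condition should protect only at the 20 and 40% forecasts, and a respondent in the $ 20 000 condition only at the 40% forecast; this is the benchmark against which the decisions in Section 4.4 are compared.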
The findings presented here indicate that many respondents were not making cost-loss decisions. One indication of this is the finding that cost of protection did not significantly affect decisions. Further, as Figure 4 and the middle two columns of Table V show, many respondents chose to protect (not protect) when the cost was greater (less) than the expected loss. A portion of respondents indicated risk aversion, but many indicated risk-seeking behaviour. Some of this behaviour could be due to the variability in subject responses typical in experimental studies, exacerbated by the lack of real monetary incentives (Smith and Walker, 1993; Camerer and Hogarth, 1999). However, core theories in economics and psychology (such as expected utility and prospect theory) suggest that people in general do not make strict expected-value decisions, and that people tend to be risk-seeking in the domain of moderate-probability losses (e.g. von Neumann and Morgenstern, 1944; Kahneman and Tversky, 1979; Tversky and Kahneman, 1992). (Note that prospect theory also suggests that people tend to shift from risk-averse to risk-seeking as the probability of loss increases from low to moderate. Thus, our characterization of ‘unordered’ respondents and of risk-averse/risk-seeking respondents in Table V may be oversimplified.) Consequently, these data provide evidence, supported by numerous studies in other settings, that many people do not make decisions as hypothesized in the simplest form of the cost-loss decision making model, even in idealized scenarios modelled after static cost-loss decisions.
5. Summary and discussion
This study builds knowledge about communication, interpretation and use of hydrometeorological forecasts, including forecast uncertainty information, by analysing data from decision scenario questions incorporated into a nationwide survey of the US public. The decision questions are related to experimental economics and psychology approaches to empirically studying people's use of information in decisions. Testing forecast use in a controlled context permits simplification of the decision setting and the information available, facilitating focused study. Use of forecasts of low temperature, probability of precipitation, and precipitation amount is explored.
The results indicate that, given the same decision scenario, members of the public have a variety of probabilistic thresholds for taking protective action. Given the same scenario and the same forecast information, different people also make different protective decisions. Both findings are consistent with individuals' different orientations towards weather-related risk, varying interpretations of forecasts and situations, and other differences. A potential implication is that, at least for forecasts with indefinite or inhomogeneous audiences, it may be important to provide forecast information that people can use to make decisions based on their own criteria, rather than recommending decisions based on pre-defined thresholds.
Respondents' thresholds for and likelihood of taking protective action also varied with the decision scenario. This is consistent with a large body of research in non-weather settings indicating that context, presentation and framing can affect information interpretation and use. This suggests that designing effective communication formats will require testing forecast interpretation and use in different contexts, as well as considering presentation and framing effects.
When given single-value (deterministic) forecasts, the majority of respondents chose to take protective action even when the forecasted value did not reach the damage threshold. This suggests that most respondents interpreted the deterministic forecasts as uncertain, corroborating and extending results reported in Morss et al. (2008). As in Morss et al. (2008), different people interpreted uncertainty in deterministic forecasts differently. The results presented here provide some information about people's interpretations of uncertainty in deterministic forecasts of rain amount and low temperature, but more in-depth study is needed to understand these interpretations in greater detail.
When given forecasts that conveyed uncertainty as ranges, more respondents chose to take protective action when the forecasted range reached or exceeded the damage threshold than when given a single-value forecast equal to the midpoint of the range. However, given the uncertainty that many respondents already read into single-value forecasts, ranges with the spreads tested here did not provide many respondents with information that altered their decisions. Moreover, these findings suggest possible presentation issues in the interpretation of range forecasts. Consequently, further investigation of how people interpret and use range forecasts is needed, especially since these are currently a common form used to communicate weather forecast uncertainty.
When given forecasts of the percentage chance of exceeding the damage threshold, more respondents took protective action as the percentage chance increased. This result and a within-subject analysis suggest that many respondents understood probabilistic forecasts of the type tested well enough to use them in the decisions presented. This was true for probabilistic forecasts of precipitation amount and low temperature, neither of which is commonly available to members of the US public (the population sampled here). This provides some evidence that, unlike in some other areas of risk communication, weather and weather forecasts are sufficiently common experiences that probabilities may be suitable for conveying uncertainty in everyday weather forecasts to many members of the public (e.g. NRC, 2006; Handmer and Proudley, 2007). This is supported by previous findings that the majority of lay people understand the probabilistic component of PoP and like PoP forecasts conveyed using probabilities (Murphy et al., 1980; Morss et al., 2008). Yet related research suggests potential issues with lay people's specific interpretations of probabilistic weather forecasts, as well as the importance of information presentation (e.g. Gigerenzer et al., 2005; NRC, 2006; Morss et al., 2008; Joslyn et al., 2009a). Thus, further work is needed to understand, across weather prediction contexts, how to communicate weather forecast uncertainty in ways that promote effective interpretation and use.
The analysis also indicates that many respondents did not make decisions according to the simplest form of the cost-loss decision model used in some meteorological studies of forecast use and value. This occurred even though respondents were presented with a one-time, binary decision given only one piece of forecast information and perfect knowledge of costs and losses; most real decisions are considerably more complex. While the lack of real monetary payoffs in the study design may have increased variability in respondents' decisions, this finding is consistent with a large body of work from economics, psychology, and other fields on the importance of risk perception and other factors in people's decision making. Thus, while the cost-loss model can be useful for hypothetical study, understanding real-world interpretation, use, and value of forecast information requires empirical studies of people's decisions. Such studies should build on knowledge of how people make decisions under risk and uncertainty developed in other disciplines and other settings.
Finally, the analysis explored relationships between subjects' responses to the decision questions and their perceptions and interpretations of forecasts, using data from other questions posed in the survey (Morss et al., 2008; Lazo et al., 2009). One analysis examined whether respondents who selected different interpretations of PoP had different PoP thresholds for taking protective action. This analysis uses data from a larger, more representative sample than previous related work by Gigerenzer et al. (2005) and Joslyn et al. (2009a). The results, along with those in Morss et al. (2008), suggest that the relationship between PoP interpretation and use is more complex than discussed in these two previous studies. Other cross-analyses found no notable relationships between individuals' forecast perceptions and interpretations (as measured in the survey) and their forecast use. The authors suspect that such relationships exist but are more complex than can be examined using the current data set. Thus, further study of such relationships is needed.
Future work, implemented in surveys or laboratory settings, can build on the questions asked here and extend empirical comparison of forecast interpretation and use to more complex presentation formats. By design, work in controlled settings simplifies the decision context and information available. Thus, such work is complementary to real-world empirical studies employing case analyses, interviews, other types of survey questions, naturalistic decision making, and other methodologies to examine forecast communication, interpretation and use. Together, the knowledge gained can help the meteorology community learn whether, when and how to provide different types of forecast information (particularly uncertainty information), to various audiences, in order to enhance forecast interpretation and beneficial use.
Appendix A: Description of Tests of Statistical Significance
The Wilcoxon rank-sum test (also called the Mann–Whitney–Wilcoxon test), employed in Section 3 and Table III, is a non-parametric test of whether two independent samples of observations come from identical populations (have the same underlying distribution).
The Kruskal–Wallis test, employed in Section 3, is a non-parametric test of whether three or more independent samples come from identical populations.
The Pearson chi-squared test, employed in Sections 4.1 and 4.2, tests whether the frequency distributions of two or more samples come from identical populations.
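As an illustration, all three tests are available in SciPy's `scipy.stats` module. The data below are invented purely for demonstration and do not come from the survey.

```python
from scipy import stats

# Invented samples standing in for, e.g., protection thresholds
# reported by independent groups of respondents.
a = [30, 35, 40, 50, 60, 70]
b = [20, 25, 30, 30, 40, 45]
c = [55, 60, 65, 70, 80, 90]

# Wilcoxon rank-sum / Mann-Whitney U test: two independent samples
u_stat, p_u = stats.mannwhitneyu(a, b, alternative="two-sided")

# Kruskal-Wallis test: three or more independent samples
h_stat, p_kw = stats.kruskal(a, b, c)

# Pearson chi-squared test on a 2 x 2 contingency table
# (e.g. invented protect yes/no counts by scenario)
table = [[705, 1116], [1028, 850]]
chi2, p_chi, dof, expected = stats.chi2_contingency(table)
```

Each call returns the test statistic and the p value; `chi2_contingency` additionally returns the degrees of freedom and the table of expected frequencies.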
Non-parametric significance tests were selected because the samples likely did not meet all of the assumptions necessary for the use of parametric tests (for example, that the data are approximately normally distributed).
For all of the tests, N denotes the number of observations, df denotes the degrees of freedom, and U and χ2 denote the relevant statistic for the test being discussed. p is the smallest significance level at which the null hypothesis (no relationship) can be rejected. In this article, the null hypothesis is rejected when p < 0.05, in other words, at the 5% level, although most relationships discussed have a much higher level of statistical significance (e.g. p < 0.0001).
Acknowledgements
The authors thank Jamie Kruse for valuable discussion on aspects of the analysis, Susan Joslyn for helpful comments on an earlier version of the manuscript, and Kelsey Mulder for assistance with literature review. This research was partially sponsored by NCAR's Collaborative Program on the Societal Impacts and Economic Benefits of Weather Information, which is funded by the National Science Foundation and the National Oceanic and Atmospheric Administration through the US Weather Research Program. The National Center for Atmospheric Research is sponsored by the National Science Foundation.