Variation in exploration and exploitation in group decision-making: Evidence from immersive simulations of major incident emergencies

Multi-agency groups are brought together to make strategic decisions in response to major incident emergencies. Here, we investigated decision-making processes in 18 multi-agency groups who were video recorded while engaged in simulated major incident emergencies involving a potential need to evacuate individuals from the location of the incident. Three general categories of decision-making activity were used to code the videos: situation assessment (SA), plan formulation (PF) and plan execution (PE). Analysis of the transitions between these decision-making activities showed that there were marked between-group departures from normative models of decision-making, which predict an orderly transition from SA to PF and then from PF to PE. These departures appeared to reflect between-group differences in the tendency to explore information (evident in reciprocal transitions between SA and PF) or exploit information (apparent in transitions to and from SA and PF to PE). Moreover, the tendency to explore but not exploit information was associated with the number of transitions to critical decisions (i.e. to evacuate individuals from the location of the incident).

to work together effectively at a strategic level is an ongoing issue in the UK and elsewhere (e.g. Alison & Crego, 2008; Comfort, 2007; Flin, 1996; House et al., 2014; Pollock, 2013). Decision-making  uk/app-content/national-decision-model/the-national-decision-model/#the-model), includes a prescribed sequence of five categories of decision-making activity: gather information and intelligence; assess risks and develop a working strategy; consider powers, policies and procedures; identify options and contingencies; and take action and review what happened next, at which point the guidance returns to gather information and intelligence. A recent analysis of group decision-making at immersive simulated major incidents revealed marked departures from the use of the JDM, but also significant between-group variation in the transitions between the five categories of decision-making activities (Wilkinson et al., 2019; see also, Waring et al., 2020). Thus, there was limited consideration of options and contingencies across all groups, and there was marked between-group variation in the transitions between the activities that they engaged in: some groups tended to move back and forth between, on the one hand, gathering information and intelligence, and on the other, assessing risks and developing a working strategy.

| NORMATIVE MODELS FOR GROUP DECISION-MAKING AT MAJOR INCIDENTS
In other groups, the reciprocal sequences involved taking action, and both gathering information and intelligence, and assessing risks and developing a working strategy. These results indicate that SCGs do not follow the guidance enshrined in the JDM. However, characterizing the decision-making processes in SCGs through the lens and bespoke categories of this model makes it difficult to understand the process of decision-making in the SCGs observed by Wilkinson et al. (2019). For example, the JDM, like the NDM, separates components of decision-making that one might consider, on an a priori basis, to be intimately related (e.g. assess risks and develop a working strategy, and identify options and contingencies); similarly, the model combines components of decision-making that one might consider, on an a priori basis, to be separate (e.g. take action and review what happened next).
A more general analysis of decision-making processes, which has been used in other operational contexts, has used three categories: situation assessment (SA; e.g. "We need more information about how injuries were sustained."), plan formulation (PF; e.g. "How will we transport and shelter affected people?") and plan execution (PE; e.g. "Initiate mutual aid plan."). This coding system was developed by van den Heuvel, Alison and Power (2012; see also, van den Heuvel et al., 2014). Here too, one could assume that decision-making follows a sequence (from SA to PF to PE and back to SA), which echoes normative models of individual decision-making across a variety of domains (e.g. Dewey, 1933; see also, Groenendaal & Helsloot, 2016; Lipshitz & Bar-Ilan, 1996; van den Heuvel et al., 2012, 2014). This coding system has been used to reveal decision-making processes in firefighters, for whom information gathering (i.e. situation assessment, SA) is often followed by courses of action (i.e. plan execution, PE) without any apparent mediation by a stage of evaluation or plan formulation (i.e. PF; see also, Klein, 1998; Klein, Calderwood & MacGregor, 1989). Here, we applied this more general form of analysis to the video footage of 18 multi-agency groups originally analysed by Wilkinson et al. (2019; see also, Waring et al., 2019) in order to assess a recent theoretical analysis of group decision-making, which is consistent with aspects of the naturalistic decision-making approach. Bang and Frith (2017) presented a synthesis of the evidence concerning the benefits, as well as the pitfalls, of making decisions in groups rather than individually. Their overarching (Bayesian) theoretical analysis, which involved how the past experience of group members is integrated with new information to affect group decisions, is broadly consistent with the naturalistic decision-making approach.
A central component of this approach, developed by Klein (1993, 2003, 2008; see also, Klein et al., 2003; Hutchins, 1995a, 1995b), was based on just this kind of interaction: how previous experience primes decisions in the face of uncertain information (i.e. recognition-primed or intuitive decision-making; see also Doya, 2008; Gigerenzer, 2007; Gureckis & Goldstone, 2006; Salas et al., 2010; Tversky & Kahneman, 1974). In the case of multi-agency groups responding to major incidents, a natural assumption is that bringing together representatives from the relevant agencies might increase their ability to work together effectively (for a discussion of distributed vs. co-localized multi-agency command structures, see Power & Alison, 2017). However, Bang and Frith (2017) argued that group decision-making might also benefit from the combination of different types of individual decision maker, specifically the combination of explorers and exploiters.

| EXPLORATION AND EXPLOITATION IN GROUP DECISION-MAKING
Explorers can be characterized as sampling the available information and decision space in order to select the optimal decision, whereas exploiters commit to a course of action without such a thoroughgoing analysis and based on the prior success of that action (see Frank et al., 2009; see also Badre, Doll, Long & Frank, 2012; Cohen, McClure & Yu, 2007; Daw, O'Doherty, Dayan, Seymour & Dolan, 2006). For example, in the "observe or bet" task the decision maker has two options that yield rewards or losses according to a predetermined bias (Tversky & Edwards, 1966). They can choose to "observe" the outcome of a trial and gain information but accrue no rewards or losses, or to "bet" and accumulate rewards or losses that are only revealed at the end of the task. The observe trials represent an assay of exploration and the bet trials an assay of exploitation, and their use varies across individuals. Exploration can be linked with one way in which the term decision inertia is used: where individuals and groups defer taking action and instead continue to seek additional (redundant) information (Alison et al., 2015). Exploitation can be linked to another way in which the same term is used: the tendency for individuals to repeat past choices irrespective of the current evidence (Akaishi et al., 2014).
There are clearly pitfalls associated with being either an explorer (who might not reach a decision in a timely fashion) or an exploiter (who might quickly reach the wrong decision). Bang and Frith (2017; p. 7) state: "A mixture of such diverse individuals can create advantages for the group." This claim has clear practical implications for assembling effective groups in a variety of contexts, including major incidents (cf. Polzer et al., 2002; Roberge & van Dick, 2010). However, to the best of our knowledge, the tendency for groups to engage in exploration and exploitation during decision-making has not been investigated. Certainly, the selection of individuals who come together to respond to major incidents is not based on any formal assessment of their individual approaches to decision-making. Consequently, if the decision-making style of group members contributes to the tendency of groups to engage in exploration or exploitation, then one would predict variation in these processes across different groups.
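The structure of the "observe or bet" task can be made concrete with a minimal Python sketch. It assumes a very simple strategy (observe for a fixed number of initial trials, then always bet on the option seen most often); the function name, parameters and scoring of +1/-1 per bet are hypothetical choices for illustration and are not taken from Tversky and Edwards (1966).

```python
import random


def observe_or_bet(n_trials, n_observe, bias, seed=0):
    """Simulate one decision maker who observes first and then bets.

    n_observe: number of initial trials spent observing (exploration).
    bias: probability that option "A" is the rewarded option on a trial.
    Returns the score (wins minus losses) accumulated over the bet trials.
    """
    rng = random.Random(seed)
    seen = {"A": 0, "B": 0}
    score = 0
    for trial in range(n_trials):
        outcome = "A" if rng.random() < bias else "B"
        if trial < n_observe:
            # Observe: information is gained, nothing is won or lost.
            seen[outcome] += 1
        else:
            # Bet: the outcome counts, but feedback is withheld, so the
            # counts in `seen` are not updated.
            guess = "A" if seen["A"] >= seen["B"] else "B"
            score += 1 if guess == outcome else -1
    return score


# A pure exploiter (no observation) versus a more exploratory strategy.
print(observe_or_bet(n_trials=50, n_observe=0, bias=0.8))
print(observe_or_bet(n_trials=50, n_observe=10, bias=0.8))
```

In this sketch, a larger `n_observe` buys better information at the cost of fewer scoring trials, which is the trade-off between exploration and exploitation that the task is meant to assay.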

| ASSESSING GROUP DECISION-MAKING PROCESSES IN IMMERSIVE SIMULATIONS
Immersive simulation learning environments enable the components of major incidents to be simulated and provide a basis for training exercises and research on group decision-making (see Alison et al., 2013). These Hydra environments (Crego, 1996) consist of a "syndicate room" for group members that is equipped with a large screen projector, PC and CCTV. The PC runs a communication interface that is permanently displayed on the projector screen and delivers information updates ("injects") and tasks to the groups. In this way, table-top exercises are lifted to a new level of realism, and the environments provide a platform for investigating how the provision of different forms of information affects group behaviour. They also enable the processes of group decision-making to be investigated in situ, through a real-time analysis of the recordings of real multi-agency groups faced with reproducible simulated major incidents.
In fact, Hydra is used by the emergency services across the UK (https://www.hydrafoundation.org). This approach has the potential to separate variation in group decision-making that might reflect the idiosyncrasies of different real incidents from those based on the groups that deal with them. It permits important theoretical issues concerning group decision-making to be systematically evaluated.
Here, the sequences of decision-making activities (situation assessment, SA; plan formulation, PF; and plan execution, PE) derived from the recordings allowed us to assess the extent to which the normative model is followed in these important groups (i.e. SA-PF-PE; cf. Burke et al., 2006). We also used these sequences to assess whether any departure from a normative standard reflects between-group differences in exploration and exploitation (cf. Bang & Frith, 2017).
This was achieved through further analysis of the differences in the sequences of decision-making activities across groups: Exploration should be evident as repeatedly moving between situation assessment and plan formulation, whereas exploitation should be evident in transitions between situation assessment and plan execution, and between plan formulation and plan execution. Any such differences in exploration and exploitation across groups should be reflected, other things being equal, in the number of transitions before a critical decision is made: Groups who explore should take more steps to reach a critical decision than those who exploit.
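The logic of these predictions can be illustrated with a short Python sketch that tallies exploration and exploitation transitions in a coded sequence up to the critical decision. The transition types follow the definitions in the text; the function name, the example sequences and the position of the critical decision are hypothetical, and simple tallies of this kind are only a stand-in for the analysis of transition frequencies across groups.

```python
# Transition types follow the definitions in the text; all names and
# example data below are hypothetical illustrations.
EXPLORATION = {("SA", "PF"), ("PF", "SA")}
EXPLOITATION = {("SA", "PE"), ("PE", "SA"), ("PF", "PE"), ("PE", "PF")}


def summarise_until_decision(sequence, decision_index):
    """Tally transitions up to and including the critical decision.

    sequence: activity codes with consecutive repetitions already removed.
    decision_index: position in `sequence` of the critical decision.
    """
    steps = sequence[: decision_index + 1]
    pairs = list(zip(steps, steps[1:]))
    return {
        "exploration": sum(pair in EXPLORATION for pair in pairs),
        "exploitation": sum(pair in EXPLOITATION for pair in pairs),
        "transitions_to_decision": len(pairs),
    }


exploring_group = ["SA", "PF", "SA", "PF", "SA", "PF", "PE"]
exploiting_group = ["SA", "PE", "SA", "PE"]
print(summarise_until_decision(exploring_group, decision_index=6))
print(summarise_until_decision(exploiting_group, decision_index=1))
```

In these made-up sequences, the group that cycles between SA and PF needs six transitions to reach the critical decision, whereas the group that moves directly from SA to PE needs one.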

| THE SCENARIOS AND APPROACH
We recorded decision-making activities of real multi-agency groups faced with two simulated major incidents: a large-scale chemical fire at an industrial site (Scenario 1) and a crash between a passenger train and a truck carrying a hazardous substance (Scenario 2). The scenarios were dynamic and unfolding in time, and the groups were located in Hydra immersive simulation suites that allowed information to and from the groups to be controlled and recorded (Alison et al., 2013; Crego, 1996). These suites are modelled on special operations rooms in which such groups meet in the context of real incidents. The groups consisted of senior representatives from the relevant agencies (e.g. local emergency services, civil resource organizations, health boards and government). In both scenarios, the critical decision was whether or not to evacuate local residents. However, in Scenario 1 the groups were tasked with developing a communications strategy and the decision to evacuate was an implicit component of the task, whereas in Scenario 2 the groups were explicitly tasked with deciding whether to evacuate local residents. The recordings of the meetings were coded as a continuous sequence of three decision-making activities: SA, PF and PE (examples from Scenarios 1 and 2 are presented in Table 1). As in Wilkinson et al. (2019), the coding was conducted independently of the group members who contributed to the decision-making activity. We then assessed the patterns of transitions between the successive categories. The first question was whether groups followed the normative model and consistently moved from situation assessment to plan formulation to plan execution and then further situation assessment in making decisions across the course of the meetings. We then assessed whether deviations from the normative model between groups reflected differences in the tendency to explore or exploit information. As already noted, it was anticipated that while exploration would be evident in transitions between situation assessment and plan formulation (i.e. SA-PF and PF-SA), exploitation would be reflected in transitions to and from plan execution (most obviously between SA-PE and PF-PE). Finally, we examined whether such differences were related to the number of transitions that each group took to arrive at the critical decision (i.e. whether or not to evacuate local residents).

TABLE 1 Examples of situation assessment (SA), plan formulation (PF) and plan execution (PE)

Scenario 1
  SA  "What is the information on the state of the fire and the risk of explosion?"
  PF  "We need to develop a media strategy."
  PE  "We will not invoke the evacuation plan at this time."
Scenario 2
  SA  "Do we know how many people might need to be evacuated?"
  PF  "A transport plan is needed."
  PE  "Invoke the Mass Fatalities plan."

| Participants
Eighteen multi-agency groups attended 2-day national training and ex-

| Procedure
At the start of Scenarios 1 and 2, all participants were given an inter- Across the two days of the training event, the groups took part in timed meetings that were approximately 45-60 min, during which they made decisions in response to the evolving incident. The virtual timeline of the scenarios extended from the afternoon of Day 1, when the incident was declared, to 3 months later through three intermediate time points (evening of Day 1, Day 2 and 1 week later).
The scenarios were delivered using Hydra immersive simulation systems (Crego, 1996). As already noted, Hydra provides a "syndicate room" for each group that is equipped with a large screen projector, PC, wireless keyboard and mouse, printer and CCTV. The PC ran a communication interface that was permanently displayed on the projector screen and delivered information updates ("injects") and However, in this case the scenario involved a crash between a passenger train and a truck, which was carrying a hazardous substance.
The crash caused many fatalities and injuries to passengers on the train. Within the first hour of the incident a fire ignited, burning the hazardous substance and sending a toxic plume of smoke over a residential area. The analysis for Scenarios 1 and 2 was conducted on the critical second group meeting. In this meeting, there was time pressure and the groups were required to make critical decisions: In Scenario 1, the groups were tasked with providing direction to those involved in tactical operations (including whether it would be necessary to evacuate local residents) and what their media strategy would be; and in Scenario 2, they were explicitly tasked with deciding whether or not to evacuate the nearby caravan site, under conditions where the resources were not available to evacuate everyone and the toxic effects of the plume were unclear.

| Coding of activity
The audio-video recordings, from either the Hydra CCTV system or a GoPro camera placed on each group table, were coded using the categories: situation assessment, plan formulation and plan execution. These categories were coded at the level of the group (i.e. independently of the individual contributor) and noted on a spreadsheet for later analysis. Isolated irrelevant comments and those that were not parts of the group discussions (e.g. informal asides, which were infrequent) were excluded from the analysis. The coding was conducted on two separate occasions (by BW), which resulted in a small number of the activities (<5%) being re-coded. An independent assessor (RCH) then confirmed the reliability of the coding on a sample of 30 observations from each study (≈95% agreement with BW).
A lag sequential analysis (Sackett, 1979; see also, O'Connor, 1999) was used to derive the primary data of interest: the sequences of transitions between the decision-making activities in the group meetings. In such an analysis, repetitions of the same category are removed. The lag sequential analysis stopped at the end of the group meetings.
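The tallying step of such an analysis can be sketched in Python as follows. This is a minimal illustration only: it removes consecutive repetitions and counts lag-1 transitions, and it does not reproduce the statistical tests associated with lag sequential analysis. The function names and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import groupby


def collapse_repeats(codes):
    """Drop consecutive repetitions of the same category."""
    return [code for code, _ in groupby(codes)]


def lag1_transition_counts(codes):
    """Count each ordered pair of adjacent categories after collapsing."""
    collapsed = collapse_repeats(codes)
    return Counter(zip(collapsed, collapsed[1:]))


def conditional_probabilities(counts):
    """Convert counts to P(next category | current category)."""
    totals = Counter()
    for (current, _), n in counts.items():
        totals[current] += n
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}


# Hypothetical coded meeting, one code per coded utterance.
coded_meeting = ["SA", "SA", "PF", "PF", "SA", "PE", "PE", "SA"]
counts = lag1_transition_counts(coded_meeting)
print(counts)
print(conditional_probabilities(counts))
```

Because repetitions are removed first, only the six transitions between different categories (SA-PF, SA-PE, PF-SA, PF-PE, PE-SA, PE-PF) can occur in the resulting counts.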

| Analytic approach
To assess whether there were differences between the frequencies of the different categories of decision-making activities, we used within-subjects ANOVAs with post-hoc t tests using a Bonferroni correction for multiple comparisons. As already noted, a lag sequential analysis was used to characterize the six two-step transitions between these categories ( was more likely to be followed by plan formulation (PF) than plan execution (PE), plan formulation (PF) was equally likely to be followed by plan execution (PE) and situation assessment (SA), and plan execution (PE) was less likely to be followed by situation assessment (SA) than (PF).
ANOVA with scenario (1 or 2) as a between-subjects factor and initiating category (SA, PF and PE) and succeeding category (normative or other) as within-subjects factors, confirmed these impressions.

| Group differences in exploration and exploitation
The overall analysis of the results presented in Figure 1  There were no cross loadings >±.28. These two classes accounted for 76% of the variance in the frequencies of the six transitions between the three categories.    did not differ. While there was some tendency for increases in group size to be related to reductions in both exploration and exploitation loadings, these relationships were not statistically significant. In summary, the results presented in Table 2 illustrate similarities between the full set of 18 groups and the subset of 13 that reached the critical decision, in terms of the relationships to the overall number of transitions within a session and group size. The results in Table 2 also show that the exploration and exploitation loadings are similarly related to the overall number of transitions and group size. Thus, the influence of exploration and exploitation on the number of transitions to the critical decision is unlikely to be a consequence of differences in these more general features of the groups.
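The kind of relationship summarized in Table 2 can be illustrated with a small Python sketch of a correlation coefficient. It assumes a Pearson correlation, which is an assumption rather than something stated in the text, and the loadings and transition counts below are invented values for illustration, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: one exploration loading and one count of
# transitions to the critical decision per group.
exploration_loading = [-1.2, -0.4, 0.1, 0.6, 1.3]
transitions_to_decision = [4, 7, 9, 12, 18]
print(round(pearson_r(exploration_loading, transitions_to_decision), 2))
```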

| GENERAL DISCUSSION
Groups of individuals from different agencies come together to make life-determining decisions at major incidents. In the UK, they are called SCGs. We investigated decision-making in such multi-agency groups during immersive simulations presented in Hydra suites (Alison et al., 2013; Crego, 1996). This approach, based on archival recordings of the same incidents being faced by different groups, enabled a systematic analysis of the process of group decision-making in this important context (see Wilkinson et al., 2019; see also, Waring et al., 2019; Waring et al., 2020). The recordings of the meetings were coded as sequences of decision-making activities that have been employed to characterize individual decision-making in the emergency services (i.e. situation assessment, SA; plan formulation, PF and plan execution, PE; see Burke et al., 2006; Lipshitz & Bar-Ilan, 1996; van den Heuvel et al., 2012, 2014). This analysis allowed us to assess whether or not the groups followed a normative cyclical model of decision-making, which assumes that situation assessment is followed by plan formulation and then plan execution, and back to situation assessment. This could not have been achieved without the use of simulated environments, which enable replication, together with analysis of group decision processes in real time. Our approach also enabled an assessment of between-group differences in styles of decision-making, which would not have been possible through studying a small number of groups or, in the limiting case, a single group (e.g. Waring et al., 2019; Waring et al., 2020; see also, Curnin et al., 2020).
Our overall analysis of the sequences of transitions revealed marked departures from the normative model. Over the course of the simulated incidents, situation assessment was more often followed by plan formulation than plan execution, but plan formulation was just as likely to be followed by situation assessment as it was to be followed by plan execution. Finally, plan execution was more likely to be followed by plan formulation than situation assessment.
Further analysis of the sequences revealed that these departures from the normative standard involved systematic between-group differences in exploration and exploitation.
A principal-components analysis was conducted on the six pairwise transitions between the three decision-making activities (i.e. SA-PF, SA-PE, PF-PE, PF-SA, PE-SA, PE-PF). This analysis revealed that there were marked between-group differences in the tendency to move between situation assessment and plan formulation (i.e. SA-PF and PF-SA). At a descriptive level, these group differences could be aligned to differences in a process of decision inertia (Alison et al., 2015) or they could be characterized as reflecting differences in the tendency to engage in exploration (cf. Bang & Frith, 2017).
While decision inertia and exploration are conceptually distinct, they could prove difficult to tease apart. For example, repeated requests for (similar) information could be considered either decision inertia or exploration. However, our analysis also revealed group differences in the tendencies to move between plan execution and both situation assessment and plan formulation (i.e. PF-PE, PE-PF, PE-SA and SA-PE), a pattern of results that is indicative of exploiting information.

TABLE 2 Correlations between exploration, exploitation, number of transitions and group size

Taken together, these between-group differences in the sequences of decision-making activities are clearly analogous to individual differences in information exploration and exploitation. We have already noted that there is evidence for individual differences in these processes (e.g. Badre et al., 2012), and our results provide the first evidence that these differences can be seen at the level of group decision-making. Moreover, it mattered whether groups tended to explore or exploit: the tendency to explore (but not to exploit) was associated with a greater number of transitions (between decision-making activities) to reach a critical decision, namely whether or not to evacuate individuals from the location of the (simulated) major incident. To the extent that the increased number of transitions can be equated with the time at which critical decisions were made, the consequences of groups tending to engage in exploration could be life determining.
Why do some groups explore and others exploit information?
Our results are consistent with the general claim that the composition of groups might be an important determiner of effective group decision-making (Bang & Frith, 2017). The groups were opportunity samples of individuals from the various agencies, and this sampling approach mirrors how the composition of multi-agency groups who assemble to respond to major incidents in the UK might vary. If some individuals in the groups are explorers and others exploiters, then it is plausible to suppose that the relative numbers of these different individuals affect the behaviour of the groups. This analysis could be assessed by examining how such differences in group composition affect group decision-making. Alternatively, the decision-making style of specific individuals (e.g. the chair; see Wilkinson et al., 2019; see also Waring et al., 2020) might have a disproportionate influence on group decision-making processes: when the chair is an explorer (or exploiter) the group is more likely to exhibit the same bias. We have no evidence regarding the decision-making style of the chair that is independent of the decision-making processes in the group as a whole. However, individual variation in these decision-making biases could be assessed using a version of the "observe or bet" task prior to engaging in group decision-making (Tversky & Edwards, 1966). In any case, the results that we have presented take our understanding of group decision-making at emergency incidents from appeals to "groupthink" (see also Janis, 1972, 1982; Janis & Mann, 1977) to a form of analysis that is analytically tractable and testable (cf. Bang & Frith, 2017; see also, Alison et al., 2015).
To conclude: the results from simulations provide one basis upon which to develop future policy, guidance and training for groups who have to make life-determining decisions under conditions of uncertainty and time pressure. The analysis developed in the previous paragraphs highlights important issues around the selection of individuals to be charged with responding to major incidents. The suggestion that individuals could be selected on the basis of their individual decision-making styles (explorers or exploiters) is, at least in principle, a relatively simple one to implement (e.g. by using a version of the "observe or bet" task). However, in situ monitoring and feedback of the utility of repeated requests for further information and additional situational assessment is another area that could be targeted in a variety of ways: by the presence of a critical friend or by training the chair to monitor the balance between exploration and exploitation (cf. Janis, 1972, 1982; Lovallo & Kahneman, 2003; Ministry of Defence, 2013; see also, Newell et al., 2015). Finally, it should be acknowledged that our analysis is based on groups responding to simulated major incidents, albeit that the groups themselves are representative of groups who would be called to respond to real incidents. Future research will need to provide an experimental analysis of the origin of group differences in exploration and exploitation, for example, by manipulating the proportion of explorers and exploiters in different groups and examining the consequences for decision-making of this manipulation. This research too will need to be based on real groups responding to simulated major incidents.