Keywords:

  • moral dilemmas;
  • moral judgment;
  • normative ratings;
  • decision making;
  • emotion

ABSTRACT

The growing interest in the topic of moral judgment highlights the widespread need for a standardized set of experimental stimuli. We provide normative data for a sample of 120 undergraduate students using a new set of 60 moral dilemmas that might be employed in future studies according to specific research needs. Thirty dilemmas were structured to be similar to the Footbridge dilemma (“instrumental” dilemmas, in which the death of one person is a means to save more people), and thirty were designed to be similar to the Trolley dilemma (“incidental” dilemmas, in which the death of one person is a foreseen but unintended consequence of the action aimed at saving more people). Besides type of dilemma, risk-involvement was also manipulated: the main character's life was at risk in half of the instrumental dilemmas and in half of the incidental dilemmas. We provide normative values for the following variables: (i) rates of participants' responses (yes/no) to the proposed resolution; (ii) decision times; (iii) ratings of moral acceptability; and (iv) ratings of emotional valence (pleasantness/unpleasantness) and arousal (activation/calm) experienced during decision making. For most of the dependent variables investigated, we observed significant main effects of type of dilemma and risk-involvement in both subject and item analyses. Copyright © 2013 John Wiley & Sons, Ltd.


INTRODUCTION

The goal of the present study was to present a new set of moral dilemmas and to provide normative data for moral acceptability, decision times, and emotional salience. Foot (1967) was the first to depict hypothetical scenarios through which to comprehend why in some circumstances it is permissible to harm one or more individuals in the course of saving more people, whereas in other cases, it is not (for an extensive analysis, see also Thomson, 1986; Unger, 1996). In the famous Trolley dilemma, a runaway trolley is about to run over and kill five people, and the only way to save them is to hit a switch that will turn the trolley onto a sidetrack, where it will kill only one person. By contrast, in the Footbridge dilemma, one can save the five by pushing a fat man off an overpass and onto the tracks below, killing him, but stopping the trolley. Behavioral data consistently demonstrated that most people approve of the five-for-one tradeoff in the Trolley dilemma, but not in the Footbridge dilemma (e.g. Petrinovich, O'Neill, & Jorgensen, 1993). In the present study, the Trolley and Footbridge dilemmas have been considered as prototypical examples of “incidental” and “instrumental” dilemmas, respectively. This distinction is based on the Doctrine of the Double Effect (DDE; Aquinas, 1952/1274), according to which it is not permissible to intentionally cause harm for a greater good, although it is permissible as a foreseen but unintended side effect. Specifically, pushing the fat man off the overpass in the Footbridge dilemma violates the DDE because the agent intends to use the fat man as a means to stop the trolley and save the other five people. In contrast, by throwing the switch in the Trolley dilemma, the sacrifice of one workman is only a foreseen consequence of the action aimed at saving five people. 
This framework has been used to provide a broad set of stimuli structured to be similar to the Trolley or Footbridge dilemmas, which we believe would be more suitable for experimental purposes than the set proposed by Greene, Sommerville, Nystrom, Darley, and Cohen (2001).

Recent research on the psychological and neurobiological processes underlying moral judgment emanates from the pioneering work of Greene et al. (2001). In two functional magnetic resonance imaging studies, Greene and colleagues (2001; Greene, Nystrom, Engell, Darley, & Cohen, 2004) showed that Footbridge-like dilemmas engage brain areas associated with emotion and social cognition, whereas Trolley-like dilemmas activate brain areas associated with abstract reasoning and problem solving. Furthermore, they found that when faced with Footbridge-like dilemmas, people took longer to judge moral violations as appropriate than inappropriate, whereas when they had to judge Trolley-like dilemmas, no difference in response times (RTs) was found between appropriate and inappropriate judgments. The neuroimaging findings along with the RT data allowed Greene and colleagues (2001; Greene et al. 2004) to propose a dual-process model of moral judgment according to which a slow, conscious, and effortful “cognitive” system competes with a fast, unconscious, and effortless “affective” system.

Recent studies focusing on the characteristics that make Footbridge-like dilemmas capable of eliciting a stronger emotional response than Trolley-like dilemmas converged on two general factors, namely, personal force and intention. Some studies (Cushman, Young, & Hauser, 2006; Greene et al. 2009) placed particular emphasis on personal force, that is, a condition in which the muscular force applied by an agent directly impacts another person. Specifically, when manipulating personal force and intention using four variants of the Trolley problem, Greene et al. (2009) found a significant interaction between the two factors, showing that intention affected moral acceptability ratings only under the personal force condition. However, when other dilemmas were considered (by reanalyzing the data from Cushman et al. 2006), the effect of intention was significant not only in the presence but also in the absence of personal force. Specifically, the effect of personal force was significant when the harm was a means, but not when the harm was a side effect, indicating that the effect of personal force depended on intention, and not vice versa. The effect of intention in determining people's moral judgments has been explored by a number of studies (Borg, Hynes, Van Horn, Grafton, & Sinnott-Armstrong, 2006; Cushman et al. 2006; Hauser, Cushman, Young, Jin, & Mikhail, 2007; Mikhail, 2002) with reference to the DDE (Aquinas, 1952/1274). Indeed, when testing the influence of the DDE on moral judgments, several studies demonstrated a pattern of results consistent with this moral principle, in that participants were much less likely to cause harm intentionally than unintentionally (Borg et al. 2006; Cushman et al. 2006; Moore, Clark, & Kane, 2008).

At the present time, the most commonly used set of dilemmas is that proposed by Greene et al. (2001). This set is well known in the literature and comprises the distinction between “personal” (Footbridge-like) and “impersonal” (Trolley-like) dilemmas. Recently, several criticisms have been raised against this set of dilemmas. Borg et al. (2006), for example, pointed out that the moral personal scenarios used by Greene and colleagues employed a more emotive language with respect to impersonal scenarios and often referenced immediate family members or close friends. Another severe criticism of the set of dilemmas proposed by Greene et al. (2001) came from a recent reanalysis of their behavioral data carried out by McGuire, Langdon, Coltheart, and Mackenzie (2009). Although the subject analysis they performed showed the same pattern obtained by Greene et al. (2001), the item analysis (in which dilemmas rather than subjects were treated as cases) did not show any significant effect. Furthermore, a follow-up subject analysis excluding nine “poorly endorsed items” showed that the effects found by Greene et al. were entirely due to a small number of stimuli and were therefore not generalizable to other populations of moral dilemmas. On these bases, McGuire et al. (2009) recommended using more rigorously controlled stimuli and running both subject and item analyses.

To address the issues raised against the set of dilemmas used by Greene and colleagues, the present study was aimed at providing a new set of dilemmas, including 30 Footbridge-like dilemmas and 30 Trolley-like dilemmas to be used for different research purposes in experimental studies on moral judgment. A large set of dilemmas should be of particular interest for those researchers who are interested in investigating the processes underlying moral judgment using functional magnetic resonance imaging or event-related potentials. Indeed, these experimental methods typically require the repetition of a large number of stimuli of the same type. At the same time, in studies on moral dilemmas, the same specific stimulus cannot be presented more than once. On the basis of the DDE (Aquinas, 1952/1274), in 30 out of 60 dilemmas, the death of one (or a few) person(s) is a means to save more people (instrumental dilemmas), and in the other 30 dilemmas, the death of one (or a few) person(s) is a foreseen but unintended consequence of the action aimed at saving more people (incidental dilemmas). In line with the findings of Moore et al. (2008), we also manipulated self- versus other-benefit: subjects' lives were at risk in 15 of the 30 instrumental dilemmas and in 15 of the 30 incidental dilemmas.

When designing the text material, (i) we limited our dilemmas to the issues of killing and letting die rather than other moral issues; (ii) we never included children, friends, or relatives; and (iii) we paid attention to using plain language in all dilemmas. Similarly to the set developed by Greene et al. (2001), instrumental and incidental dilemmas were not matched in terms of content. Employing matched scenarios would allow greater control over confounds and the use of paired-samples item analyses, which would be a strength for this type of research. However, we believe that developing a large number of such stimuli is a challenge that cannot easily be overcome. The main difficulty concerns the plausibility of the represented scenarios, which may markedly differ within the matched pairs of dilemmas. Indeed, it is likely that the same scenario sounds plausible in the incidental version and artificial in the instrumental version, or vice versa. This concern is supported by the lack of a large set of matched stimuli in the literature. To the best of our knowledge, there is only one published study in which several pairs of moral scenarios were used, and no more than five pairs were matched according to the DDE principle (Cushman et al. 2006). Besides this limitation, there is another fundamental issue to consider, which concerns the similarity of the scenarios within each pair. It is likely that providing participants with pairs of virtually identical scenarios would per se foster different judgments, thus inflating the amount of divergence between instrumental and incidental dilemmas. Indeed, on a different but related matter, it has been demonstrated that when two (or more) options are presented and evaluated simultaneously, differences among attribute values might be overweighted as compared with a single evaluation mode, in which each option is presented and evaluated separately (e.g. Hsee, 1996; Hsee, Loewenstein, Blount, & Bazerman, 1999).
In other words, the use of matched dilemmas (i.e. describing virtually identical scenarios and differing only in their resolutions) would highlight differences between dilemmas and make participants spontaneously engage in comparisons between alternatives, thus increasing the chance of response bias. On these bases, when weighing the costs and benefits of matched versus nonmatched scenarios, we chose nonmatched scenarios to avoid the chance that a demand characteristic bias prevailed over spontaneous moral evaluations.

In the present study, for each dilemma, we provided Italian normative values for the following variables: (i) rates of participants' responses (yes/no) to the proposed resolution; (ii) decision times; (iii) ratings of moral acceptability; and (iv) ratings of emotional valence and arousal experienced during decision making. Although collected from an Italian sample, these norms might be extended to other cultural contexts. Indeed, moral judgments elicited by dilemmas designed to target the DDE were found to be largely similar across age, ethnicity, religion, educational level, or national affiliation (Hauser et al. 2007).

The measurement of self-reported emotional experience at the time of judgment is an important novel feature of the present study. In fact, past studies have inferred emotion from the activation of brain areas commonly associated with emotional processing (Borg et al. 2006; Greene et al. 2001, 2004), or by using an a priori criterion for considering some dilemmas as “putatively more emotional” than others (Greene et al. 2001, 2004). In the current study, we asked participants to evaluate their emotional experience by reporting valence (pleasantness/unpleasantness) and arousal (activation/calm) ratings. The valence and arousal dimensions represent primitive affective parameters, and they typically account for most of the variance in emotional judgments (Bradley & Lang, 1994; Lang, Greenwald, Bradley, & Hamm, 1993). Lastly, as recommended by McGuire et al. (2009), we presented both subject and item analyses.

It is our hope that similarly to the standardization of pictorial stimuli, such as that carried out in seminal work by Snodgrass and Vanderwart (1980), our study may have positive and practical effects on studies of moral judgment.

MATERIAL AND METHODS

Participants

A total of 120 undergraduate students (65 women) were recruited at the University of Padova and volunteered to participate in this study. Mean age was 19.96 years (SD = 2.70; men: M = 20.67, SD = 2.64; women: M = 19.41, SD = 2.62). All the participants had normal or corrected-to-normal vision. The study was approved by the local Ethics Committee, and all participants gave written informed consent prior to participation.

Stimuli

We arranged 60 experimental and 15 filler moral dilemmas, some of which were redesigned from Cushman et al. (2006), Greene et al. (2001), Greene, Morelli, Lowenberg, Nystrom, and Cohen (2008), and Moore et al. (2008), whereas others were newly developed by us. On the basis of the DDE (Aquinas, 1952/1274), the 60 experimental dilemmas were classified into 30 “instrumental dilemmas,” which described killing one individual as an intended means to save others, and 30 “incidental dilemmas,” which described killing one individual as a foreseen but unintended consequence of saving others. Each of these two classes of dilemmas was varied for risk-involvement. Thus, in 30 dilemmas, killing one individual resulted in saving one's own and other people's lives (“self-involvement dilemmas”), whereas in the other 30 dilemmas, killing one individual resulted in saving only other people (“other-involvement dilemmas”). The 15 filler dilemmas described moral issues such as stealing, lying, and being dishonest and never involved killing.
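The resulting 2 × 2 structure of the experimental set (type of dilemma × risk-involvement, 15 dilemmas per cell) can be sketched as a small data structure; the labels and field names below are ours and purely illustrative, not taken from the authors' materials:

```python
from itertools import product
from collections import Counter

# Hypothetical encoding of the 2 x 2 stimulus design (labels are ours).
DILEMMA_TYPES = ("instrumental", "incidental")   # DDE-based distinction
RISK_LEVELS = ("self", "other")                  # risk-involvement

# 15 dilemmas per cell -> 60 experimental dilemmas in total.
dilemmas = [
    {"id": f"{dtype[:4]}_{risk}_{i:02d}", "type": dtype, "risk": risk}
    for dtype, risk in product(DILEMMA_TYPES, RISK_LEVELS)
    for i in range(1, 16)
]

# Sanity check: count dilemmas in each design cell.
cell_counts = Counter((d["type"], d["risk"]) for d in dilemmas)
```

Representing each dilemma as a record with explicit design-factor fields makes the later subject and item aggregations straightforward.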

The English translated version of all the experimental and filler dilemmas can be downloaded as Supplementary Material. Some examples of experimental and filler dilemmas are listed in Table 1.

Table 1. Sample instrumental, incidental, and filler dilemmas (text translated from Italian)
Dilemma: Instrumental, Self-involvement
Scenario: You are the fourth in a team of five mountaineers involved in a climb. The head of the team has just secured himself to the rock face when the second in the team starts to slide, pulling you, and the others, with him. You all fall for tens of meters and stop suspended above a crevasse. Your weight is too much and the rope is not going to resist for long.
Resolution: To lighten the load, you cut the rope which links you to the last climber. You know that he will fall into the crevasse and die, but you and the other two climbers will survive.

Dilemma: Instrumental, Other-involvement
Scenario: You are carrying out research into the behavior of lions in the Savannah of Central Africa. From an observation tower, you can see four people hide behind a bush. They are being followed by a lion and are trying to get away from it. The lion has seen them and is heading for the bush. Another person has been able to climb the observation tower.
Resolution: You push off the person who has climbed the tower so that the lion is drawn towards him. You know that this person will be mauled to death, but the other four will have time to escape.

Dilemma: Incidental, Self-involvement
Scenario: You are in the head office of your bank together with four other people. Suddenly, the director calls you because he has discovered a bomb in an office on the ground floor. He knows you are a bomb disposal expert and asks you to defuse it. You realize immediately that there is not enough time to evacuate the people in the bank before the bomb explodes.
Resolution: You throw the bomb into the basement where there is the security vault. You know that the explosion will kill the security guard in the vault, but you and the other four people will be saved.

Dilemma: Incidental, Other-involvement
Scenario: You are a building worker who is maneuvering a crane on a building site. You have just started your day on the site, when you realize that the cable of the crane is about to break. Attached to the cable is an enormous steel beam which is directly above a crew of six who are working on the outside of a building in construction.
Resolution: You move the arm of the crane a short distance to another area of the site. You know that there is a worker there who will be crushed by the steel beam and will die, but the other six workers will be unhurt.

Dilemma: Filler
Scenario: Because of the economic crisis of the last year, the company you work for has closed and you have lost your job. Recently you have been looking for a new job, but without success. You realize that you need some experience in computer technology and are convinced that you will be employed much more easily if this experience is on your Curriculum Vitae.
Resolution: You insert false information in your CV about your ability in information technology. You know that in this way you will be considered above the other more qualified candidates and get the job.

Each dilemma was presented as text, in white type (font: Arial; size: 20) against a gray background, across two successive screens. The first screen described the scenario, in which some kind of threat was about to cause the death of a group of people. The text gave participants all the information needed to understand what would happen if they decided not to intervene. The second screen described a hypothetical resolution in which the main character killed one individual to save the group of people, who would otherwise have died.

Mean number of words and number of text characters of both the scenarios and resolutions were fully balanced across the four groups of dilemmas (Tables S1–S4 of the Supplementary Material), as confirmed by a series of 2 (Instrumental vs. Incidental) × 2 (Self- vs. Other-involvement) analyses of variance (ANOVAs) (ps ranged from 0.10 to 0.98). Instrumental and incidental dilemmas were also carefully matched for numerical consequences (i.e. the number of people to save or let die).

All dilemmas were presented on a 19″ computer screen at a viewing distance of about 100 cm. Stimulus presentation was accomplished with E-prime software (Psychology Software Tools, Pittsburgh, PA, USA).

A comparison with the personal/impersonal distinction proposed by Greene and colleagues

The instrumental/incidental distinction may give the impression of overlapping with the personal/impersonal one, particularly when considering some specific dilemmas such as the Trolley and the Footbridge. However, the two distinctions are different for at least two reasons. First, the instrumental/incidental distinction is based on one single philosophical principle, namely, the DDE; incidental dilemmas are those in which the actions describe a foreseen but unintended harm, whereas instrumental dilemmas are those in which actions describe an intended harm, thus violating the DDE. By contrast, the personal/impersonal distinction is not sharply defined, as it is based on rather broad criteria that apply to a number of different moral scenarios. Second, the relationship between personal and impersonal dilemmas is quite puzzling, because the three criteria identified by Greene et al. (2001) mark out the boundary of personal dilemmas only. They classify moral dilemmas as personal if the proposed action causes (i) serious bodily harm; (ii) to a specific person or group of people; and (iii) not by deflecting an existing threat. Therefore, the class of impersonal dilemmas is left indefinite and vague, and those dilemmas that do not meet at least one of the three criteria are considered impersonal. An example of an impersonal dilemma is the so-called “Sculpture” dilemma (Greene et al. 2001), in which people are asked if it is appropriate to destroy one of the art collector's prized sculptures to save a railway workman who is working on the tracks. Another example of an impersonal dilemma that depicts a completely different set of circumstances is the so-called “Donation” dilemma (Greene et al. 2001), in which people are asked if it is appropriate to save money by not donating two hundred dollars to a reputable international aid organization that would provide needed medical attention to some poor people.
It should be noted that neither of the two dilemmas described earlier could be classified as incidental, because the harm caused by the proposed action cannot be considered as a foreseen but unintended consequence of saving more people. Furthermore, in the “Sculpture” dilemma, there is no involvement of human beings, whereas in the instrumental/incidental distinction, both categories of dilemmas involve the death of one or more people, making instrumental and incidental dilemmas more directly comparable.

Although it is apparent that incidental and impersonal dilemmas are markedly different, some instrumental and personal dilemmas share some features. In both classes of dilemmas, in fact, a serious bodily harm is caused. However, in all the instrumental dilemmas, harm is used as a means to achieve a greater good, whereas this is not the case for all personal dilemmas. In some of them, in fact, the described actions are self-interested rather than performed in the interest of a greater good. In the “Architect” dilemma (Greene et al. 2001), for example, people are asked if it is appropriate to push their boss, who makes everyone around him miserable, off the building to get him out of their lives. Similar concerns apply to “Country Road,” “Smother for dollars,” “Safari,” “Hired Rapist,” “Grandson,” and “Infanticide” (Greene et al. 2001). Furthermore, it is worth noting that the aforementioned personal scenarios do not even reflect the concept of moral dilemma, in which, by definition, two incompatible actions are morally required (e.g. MacIntyre, 1990).

Procedure

Participants completed the experiment in small groups of 10 to 15 students each and were tested on individual computers in the same room, with research staff present.

Upon arrival, participants were given information about the experiment, and their written informed consent was obtained. Then, instructions for the task were given. Subjects were provided with three practice dilemmas before beginning the experimental trials. Each trial began with the presentation of the scenario, which participants could read at their own pace. When participants pressed the spacebar, the resolution was presented and remained on screen until a response was made. Participants were asked to read the resolution and indicate whether they would do the proposed action by pressing the right button to answer “yes” and the left button to answer “no.” The Yes/No buttons were counterbalanced across participants. Decision times were recorded from the onset of the resolution slide until the button press. After this response, participants were asked to judge how morally acceptable the resolution was on an 8-point scale (0 = not at all, 7 = completely). Finally, participants were required to rate their emotional state during decision making using a computerized version of the Self-Assessment Manikin (SAM; Lang, Bradley, & Cuthbert, 2008), displaying the 9-point scales of valence (pleasantness/unpleasantness) and arousal (activation/calm), with higher scores indicating higher pleasantness and higher emotional arousal. Participants were explicitly instructed to report how they actually felt when they were deciding. Pressing the spacebar started the next scenario (see Figure 1 for a schematization of the procedure).


Figure 1. Sequence of events in the procedure. Participants read the scenario, which remained on the screen until they pressed the spacebar. Then, the resolution was presented, and participants were asked to read it and indicate whether they would do the proposed action by pressing the “yes/no” buttons. Decision times were recorded from the onset of the resolution slide until the button press. After their response, participants were asked to judge how morally acceptable the resolution was on an eight-point scale (0 = not at all, 7 = completely). Finally, participants were required to rate their emotional state during decision making using a computerized version of the Self-Assessment Manikin (SAM; Lang et al. 2008), displaying the one to nine-point scales of valence and arousal. By pressing the spacebar, the next scenario was presented. ITI, intertrial interval. Text is not drawn to scale


Dilemmas were presented in three blocks of 25 trials each, in random order within each block. Participants were allowed to take a short break at the end of each block. The experiment lasted about 50 to 70 min.
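The blocked randomization described above (75 trials split into three blocks of 25, with random order within each block) can be sketched as follows; the function name, trial identifiers, and seed are ours, purely illustrative:

```python
import random

def make_blocks(trial_ids, n_blocks=3, block_size=25, seed=0):
    """Split trial identifiers into blocks and shuffle the presentation
    order within each block (a sketch of the randomization described in
    the Procedure; the seed and names are our assumptions)."""
    assert len(trial_ids) == n_blocks * block_size
    rng = random.Random(seed)
    ids = list(trial_ids)
    rng.shuffle(ids)  # assign trials to blocks at random
    blocks = [ids[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]
    for block in blocks:
        rng.shuffle(block)  # randomize order within each block
    return blocks

# 60 experimental + 15 filler dilemmas = 75 trials in total.
trials = [f"exp_{i:02d}" for i in range(60)] + [f"filler_{i:02d}" for i in range(15)]
blocks = make_blocks(trials)
```

Shuffling once before splitting and again within each block keeps both the block composition and the within-block order random across participants.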

RESULTS

Experimental dilemmas

Five separate ANOVAs were performed on the experimental dilemmas, with percentages of affirmative responses (to the question: “Would you do it?”), decision times, ratings of acceptability (“How morally acceptable is this resolution?”), and ratings of valence (pleasantness/unpleasantness) and arousal (activation/calm) as dependent variables (DVs). For each of these DVs, we conducted both subject (F1) and item (F2) analyses. In the subject analysis, the dependent variable was each subject's mean response to each of the four groups of dilemmas, with Type of Dilemma (Instrumental vs. Incidental) and Risk-involvement (Self- vs. Other-involvement) as within-subjects factors. In the item analysis, the dependent variable was the mean response for each dilemma, with Type of Dilemma and Risk-involvement as between-subjects factors. For the sake of comparison, participants' gender was also entered as a between-subjects factor in the subject analyses and as a within factor in the item analyses. Post-hoc p-values were Bonferroni-corrected to account for multiple comparisons.
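The two units of analysis can be illustrated with a small aggregation sketch: the subject (F1) analysis averages over items within each subject and condition, whereas the item (F2) analysis averages over subjects within each item. The field names and toy data below are ours, purely illustrative:

```python
from collections import defaultdict

def aggregate(records):
    """Build the units of analysis for subject (F1) and item (F2) ANOVAs
    from long-format trial records: dicts with 'subject', 'item',
    'condition', and a numeric 'score' (a sketch; field names are ours)."""
    by_subject = defaultdict(list)  # (subject, condition) -> scores
    by_item = defaultdict(list)     # item -> scores
    for r in records:
        by_subject[(r["subject"], r["condition"])].append(r["score"])
        by_item[r["item"]].append(r["score"])
    f1 = {k: sum(v) / len(v) for k, v in by_subject.items()}
    f2 = {k: sum(v) / len(v) for k, v in by_item.items()}
    return f1, f2

# Toy long-format data: 1.0 = affirmative response, 0.0 = negative.
records = [
    {"subject": "s1", "item": "d01", "condition": "instrumental", "score": 1.0},
    {"subject": "s1", "item": "d02", "condition": "instrumental", "score": 0.0},
    {"subject": "s2", "item": "d01", "condition": "instrumental", "score": 1.0},
]
f1, f2 = aggregate(records)
```

The F1 means then enter a within-subjects ANOVA, and the F2 means a between-items ANOVA, as described above.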

No data on the selected dependent variables with a similar set of dilemmas were available in the literature to estimate an appropriate sample size for the present study. However, a power analysis (G*Power 3.1.5; Faul, Erdfelder, Lang, & Buchner, 2007) was conducted for the statistical tests of the theoretically relevant effects (i.e. the Type of Dilemma and Risk-involvement main effects and the Type of Dilemma × Risk-involvement interaction). Assuming a medium effect size (Cohen's f = 0.25) and a correlation of 0.50 among repeated measures, using a two-tailed α = 0.05 at 80% power, a sample size of 34 participants was needed for the mean comparisons of interest (a sample size of 48 was needed with a Bonferroni-corrected significance threshold of p < 0.0125). Therefore, we believe that the selected sample size of 120 offered a reasonable chance of obtaining statistically significant results for the key theoretical comparisons.
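A computation of this kind can be approximated from the noncentral F distribution. The sketch below assumes the G*Power-style noncentrality parameter λ = n · m · f² / (1 − ρ) for a within-subjects factor with m levels; this convention and all parameter choices are our assumptions, so this is a sketch rather than a re-run of the authors' analysis:

```python
from scipy.stats import f as f_dist, ncf

def rm_power(n, f_effect=0.25, rho=0.50, m=2, alpha=0.05):
    """Approximate power for a within-subjects main effect with m levels,
    assuming noncentrality lambda = n * m * f^2 / (1 - rho).
    A sketch under our assumptions about the repeated-measures convention."""
    lam = n * m * f_effect ** 2 / (1 - rho)
    df1, df2 = m - 1, (n - 1) * (m - 1)
    f_crit = f_dist.ppf(1 - alpha, df1, df2)       # critical F under H0
    return 1 - ncf.cdf(f_crit, df1, df2, lam)       # power under H1

# Smallest n reaching 80% power under these assumptions.
n_required = 2
while rm_power(n_required) < 0.80:
    n_required += 1
```

Under these assumptions the required n lands in the low thirties, in the same range as the reported G*Power result of 34.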

For each dilemma, the mean values relative to the aforementioned DVs are reported in the Tables S5–S8 of the Supplementary Material.

Percentages of affirmative responses

Type of Dilemma and Risk-involvement were both significant, with affirmative responses to incidental killing higher than those to instrumental killing [F1(1,118) = 627.50, p < 0.0001, ηp2 = 0.84; F2(1,56) = 343.90, p < 0.0001, ηp2 = 0.86] and affirmative responses to killing to save oneself and others higher than those to killing to save only others [F1(1,118) = 38.37, p < 0.0001, ηp2 = 0.24; F2(1,56) = 14.26, p < 0.0001, ηp2 = 0.20]. Gender was marginally significant in the subject analysis, but significant in the item analysis, with males more prone to killing than females [F1(1,118) = 3.07, p = 0.083, ηp2 = 0.02; F2(1,56) = 45.89, p < 0.0001, ηp2 = 0.45].

Decision times

In the analyses, log-transformations of RT values were used to control for the effect of RT skew. Type of Dilemma was significant, with deciding to engage in incidental killing slower than deciding to engage in instrumental killing [F1(1,118) = 117.36, p < 0.0001, ηp2 = 0.50; F2(1,56) = 28.18, p < 0.0001, ηp2 = 0.33]. Gender was significant only in the item analysis, with men overall faster than women [F2(1,56) = 11.22, p = 0.001, ηp2 = 0.17].
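As an illustration of this preprocessing step, the sketch below log-transforms simulated right-skewed decision times before a paired comparison; the distribution parameters and sample size are ours, not estimates from the article:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated decision times (ms): RT distributions are right-skewed,
# so values are log-transformed before the ANOVA/t-test.
# (Illustrative data; the parameters below are our assumptions.)
rt_incidental = rng.lognormal(mean=8.6, sigma=0.4, size=40)    # slower
rt_instrumental = rng.lognormal(mean=8.3, sigma=0.4, size=40)  # faster

log_inc = np.log(rt_incidental)
log_ins = np.log(rt_instrumental)

# Paired comparison on the log scale
# (each pair = one subject's two condition means).
t_stat, p_value = stats.ttest_rel(log_inc, log_ins)
```

The log transform pulls in the long right tail of the RT distribution, so the normality assumption of the ANOVA is less strained.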

Moral acceptability ratings

Type of Dilemma and Risk-involvement were both significant, with incidental killing more acceptable than instrumental killing [F1(1,118) = 25.24, p < 0.0001, ηp2 = 0.18; F2(1,56) = 71.06, p < 0.0001, ηp2 = 0.56] and killing to save oneself and others less acceptable than killing to save only others [F1(1,118) = 11.53, p = 0.001, ηp2 = 0.09; F2(1,56) = 6.67, p = 0.012, ηp2 = 0.11]. The interaction between the two factors also reached significance [F1(1,118) = 15.57, p < 0.0001, ηp2 = 0.12; F2(1,56) = 4.92, p = 0.031, ηp2 = 0.08], showing that the effect of risk involvement was significant only for incidental dilemmas, t(119) = 5.39, p < 0.0001. Gender was significant only in the item analysis, with men giving higher mean acceptability ratings than women [F2(1,56) = 115.70, p < 0.0001, ηp2 = 0.67].

Valence ratings

Risk-involvement was significant in the subject analysis and marginally significant in the item analysis, with killing to save oneself and others rated as more unpleasant than killing to save only others [F1(1,118) = 6.80, p = 0.01, ηp2 = 0.05; F2(1,56) = 3.12, p = 0.083, ηp2 = 0.05]. Gender was significant, with females rating the decisions as more unpleasant than males [F1(1,118) = 6.48, p = 0.012, ηp2 = 0.05; F2(1,56) = 212.14, p < 0.0001, ηp2 = 0.79]. Gender interacted with both Type of Dilemma and Risk-involvement, producing two-way and three-way interactions. We explored the three-way interaction [F1(1,118) = 3.88, p = 0.051, ηp2 = 0.03; F2(1,56) = 4.11, p = 0.047, ηp2 = 0.07] with two separate 2 (Instrumental vs. Incidental) × 2 (Self- vs. Other-involvement) ANOVAs for men and women. For men, the ANOVA showed a significant interaction between the two factors in the subject analysis [F1(1,118) = 6.02, p = 0.017, ηp2 = 0.10], but no post-hoc effect survived Bonferroni correction. For women, the ANOVA showed only a main effect of Risk-involvement [F1(1,118) = 12.43, p = 0.001, ηp2 = 0.16; F2(1,56) = 10.01, p = 0.003, ηp2 = 0.15], with killing to save oneself and others rated as more unpleasant than killing to save only others.

Arousal ratings

Type of Dilemma was significant in the subject analysis and marginally significant in the item analysis, with incidental killing judged as more arousing than instrumental killing [F1(1,118) = 4.14, p = 0.044, ηp² = 0.03; F2(1,56) = 2.84, p = 0.098, ηp² = 0.05]. Risk-involvement was significant in both analyses, with killing to save oneself and others judged as more arousing than killing to save only others [F1(1,118) = 56.83, p < 0.0001, ηp² = 0.32; F2(1,56) = 21.86, p < 0.0001, ηp² = 0.28]. Gender was also significant, with women rating the decisions as more arousing than men [F1(1,118) = 7.30, p = 0.008, ηp² = 0.06; F2(1,56) = 733.89, p < 0.0001, ηp² = 0.93]. Gender interacted with both Type of Dilemma and Risk-involvement only in the item analysis [F1(1,118) = 1.65, p = 0.201, ηp² = 0.01; F2(1,56) = 4.04, p = 0.049, ηp² = 0.07; F1(1,118) = 3.14, p = 0.079, ηp² = 0.03; F2(1,56) = 4.53, p = 0.038, ηp² = 0.08, respectively]. Two separate ANOVAs were then performed for men and women. For men, significant main effects of both Type of Dilemma and Risk-involvement were found, with incidental killing judged as more arousing than instrumental killing and killing to save oneself and others as more arousing than killing to save only others [F2(1,56) = 6.04, p = 0.017, ηp² = 0.10; F2(1,56) = 10.32, p = 0.002, ηp² = 0.16]. For women, only Risk-involvement reached significance, with killing to save oneself and others rated as more arousing than killing to save only others [F2(1,56) = 25.58, p < 0.0001, ηp² = 0.31].

A summary of Type of Dilemma and Risk-involvement main effects and mean values obtained for each measure in both subject and item analyses is shown in Table 2.

Table 2. Summary of main effect means (M), F values, and significance levels obtained for the Type of Dilemma (Instrumental vs. Incidental) and Risk-involvement (Self vs. Other) factors in both subject and item analyses of variance (ANOVAs) for each dependent variable

                               Type of Dilemma main effect                      Risk-involvement main effect
Dependent variable             Incidental M  Instrumental M  F1[1,118]  F2[1,56]    Self M    Other M   F1[1,118]  F2[1,56]
Affirmative responses (%)      69.68         21.39           627.50***  343.90***   50.45     40.62     38.37***   14.26***
Decision times (ms)            9521.00       8032.00         117.36***  28.18***    8716.00   8837.00   2.63 ns    0.27 ns
Moral acceptability (0–7)      2.69          2.19            25.24***   71.06***    2.36      2.52      11.53**    6.67*
Valence (1–9)                  2.75          2.80            0.61 ns    0.84 ns     2.73      2.82      6.80*      3.12†
Arousal (1–9)                  5.48          5.37            4.14*      2.84†       5.58      5.27      56.83***   21.86***

Note. *** p < 0.001; ** p < 0.01; * p < 0.05; † = trend (0.05 ≤ p ≤ 0.10); ns = nonsignificant (p > 0.10).

Filler dilemmas

For filler dilemmas, separate ANOVAs were performed on the percentages of affirmative responses, response times, judgments of moral acceptability, and ratings of valence and arousal, with gender as a between-subject factor in the analyses by subjects (F1) and as a within-subject factor in the analyses by items (F2). For each dilemma, the mean values for the aforementioned dependent variables are reported in Tables S9–S10 of the Supplementary Material.

A significant Gender main effect was observed for judgments of moral acceptability, with men evaluating the filler resolutions as more morally acceptable relative to women (3.90 vs. 3.43) [F1(1,118) = 5.85, p = 0.017, ηp² = 0.05; F2(1,14) = 15.67, p = 0.001, ηp² = 0.53]. Moreover, decisions on filler dilemmas were rated as more unpleasant and arousing by women than by men (4.94 vs. 5.46 and 3.30 vs. 2.79, respectively) [valence: F1(1,118) = 9.50, p = 0.003, ηp² = 0.07; F2(1,14) = 36.85, p < 0.0001, ηp² = 0.73; arousal: F1(1,118) = 4.47, p = 0.037, ηp² = 0.04; F2(1,14) = 24.25, p < 0.0001, ηp² = 0.63]. No significant differences between men and women were found for the other dependent variables.

Discussion of results

As expected, participants judged it to be more permissible to kill one individual as a foreseen but unintended consequence of saving others than as an intended means to save others (see also Borg et al. 2006; Cushman et al. 2006; Greene et al. 2009; Hauser et al. 2007; Mikhail, 2002).

Our results also showed that incidental killing, besides being judged morally more acceptable, received higher percentages of affirmative responses; that is, participants responded that they would perform the proposed action themselves in a significantly higher number of dilemmas. At the same time, deciding to engage in incidental killing was slower and more arousing than deciding to engage in instrumental killing. Taken together, these results provide new evidence on the differential engagement of emotional and cognitive processes during the resolution of dilemmas involving intentional versus unintentional harm. Even when potential confounding factors that might per se engage emotions, such as emotive language and the presence of death, are taken into account, our findings indicate that a strong aversive emotional response developed during decision making for both instrumental and incidental dilemmas. Such emotional engagement was presumably associated with the conflict in choosing whether to approve or disapprove the proposed resolution of each dilemma, because each choice was expected to bring aversive consequences in terms of loss of life. In incidental dilemmas, however, in which the harmful act complies with the DDE, participants might have focused on the rational cost-benefit computation, which eventually favored utilitarian judgments, in line with the dual-process theory (Greene et al. 2001, 2004, 2008). Such a process possibly required greater cognitive effort and greater involvement of attentional resources, thus slowing response times and affecting the degree of self-reported emotional arousal. Indeed, critical effects of arousal relate to attentional processing, especially in tasks requiring the resolution of cognitive and behavioral conflicts (e.g. Posner et al. 2009).

We also found that when the main characters' own lives were at risk, participants were more likely to choose to kill others, but decision making was experienced as more unpleasant and more arousing relative to when their own lives were not at risk. Although the goal of staying alive is to be expected, this pattern suggests that acting on a simple egoistic self-preservation instinct at the expense of other people's lives elicits a stronger aversive emotional response, at least at the level of conscious emotional evaluation. Interestingly, killing to save oneself and others was judged less morally acceptable than killing to save only others. This finding, which holds for incidental but not for instrumental dilemmas, shows that when killing is not intentional, there is a gap between actions and moral principles. In fact, people may perceive unintentionally killing one person to save others as a more virtuous principle when their own lives are not at risk. However, when moving from the moral perspective to the dimension of behavior, people seem less prone to adhere to the moral code and more likely to save themselves. Such a dissociation suggests that future research should pay attention to the questions used to investigate moral judgment, choosing them according to the purpose of the study.

Although participants in previous studies on moral dilemmas were sex-balanced, results were generally reported as overall effects. A few studies, however, showed gender-related differences in moral judgment. Fumagalli et al. (2010), for example, found that men gave significantly more utilitarian responses to personal moral dilemmas, whereas no differences between men and women were found in responses to nonmoral and impersonal moral dilemmas. Therefore, for each dilemma in our study we reported, along with the means of the overall sample, the corresponding norms separately for men and women. Significant effects in the item analyses showed that, overall, men were more prone to killing, gave higher ratings of moral acceptability, and were faster in deciding than women. Furthermore, both subject and item analyses showed that women rated their emotional state during decision making as more unpleasant and arousing than did men. The emotional measures of valence and arousal also showed interesting interactions between gender and the two manipulated variables (i.e. type of dilemma and risk-involvement). In the item analysis, men judged incidental killing as more arousing than instrumental killing and killing to save oneself and others as more arousing than killing to save only others. Women, on the other hand, were affected by risk-involvement, as they judged killing to save oneself and others as more unpleasant (in both subject and item analyses) and more arousing (only in the item analysis) than killing to save only others.

CONCLUSIONS

In the last decade, Greene and colleagues (2001) have renewed the interest of the scientific community in the topic of moral judgment, as attested by the widespread use of their set of dilemmas in the relevant literature (e.g. Choe & Min, 2011; Cima, Tonnaer, & Hauser, 2010; Crockett, Clark, Hauser, & Robbins, 2010; Fumagalli et al. 2010; Koenigs et al. 2007; Marsh et al. 2011; Moore et al. 2008; Moretto, Làdavas, Mattioli, & Di Pellegrino, 2010). In light of the considerable amount of research indicating a growing interest in moral judgment, the criticisms raised against the set of stimuli proposed by Greene and colleagues highlight the need for a new set of experimental dilemmas. This need is particularly relevant to studies investigating the neurobiological processes underlying moral judgment (see Sarlo et al. 2012), because of the large number of stimuli required by brain-imaging techniques.

The present work provides a new set of moral dilemmas rigorously controlled for a number of confounding factors observed in the set of moral dilemmas proposed by Greene and colleagues (2001). Specifically, we provide 75 moral dilemmas (60 experimental and 15 fillers) meticulously balanced for length with respect to both the scenarios and resolutions. Furthermore, as the cost/benefit ratio may affect people's judgments, experimental dilemmas were also balanced for numerical consequences, that is, the number of people to save or let die.

The experimental set comprises 30 dilemmas structured similarly to the Footbridge dilemma and 30 dilemmas structured similarly to the Trolley dilemma. On the basis of the "Doctrine of the Double Effect" (Aquinas, 1952/1274), they were respectively labeled as "instrumental dilemmas," which described killing one individual as an intended means to save others, and "incidental dilemmas," which described killing one individual as a foreseen but unintended consequence of saving others. In addition to the intention variable, self- versus other-benefit was also manipulated, as Moore et al. (2008) pointed out that whether or not characters' lives were put at risk varied unevenly across the personal and impersonal dilemmas used by Greene et al. (2001). Another essential difference from the dilemmas proposed by Greene and colleagues is that in all the experimental dilemmas of the present set, the moral violation causes the death of one or more people.

An important novel feature of the present study is that it provides normative data for the most relevant variables affecting moral judgment. For each dilemma, we present the mean values that we collected: (i) by asking a sample of Italian subjects whether they would perform the action proposed as a resolution of the dilemma; (ii) by recording their decision times; (iii) by asking them how morally acceptable the proposed action was; and (iv) by obtaining emotional ratings of valence and arousal.

We observed significant main effects of the independent variables for most of the dependent variables examined. It is worth noting that when significant effects were found, they consistently emerged in both subject and item analyses, suggesting that the obtained results are generalizable to other populations of participants and other samples of dilemmas. The significant effects found in the item analysis reveal that the proposed instrumental and incidental dilemmas are highly homogeneous within their respective classes. This is particularly important for the goals we set in the present study, as McGuire et al. (2009) demonstrated that some behavioral effects found by Greene et al. (2001) with their set of dilemmas were driven by just a few particular items.
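One conservative way to combine the two analyses, not used here but standard in psycholinguistics, is Clark's (1973) min F', which is significant only when an effect holds jointly across subjects and items. A sketch using the reported F values for affirmative responses (the function name is ours):

```python
def min_f_prime(f1, f2):
    """Clark's (1973) min F': a lower bound on the quasi-F statistic,
    combining the by-subjects (F1) and by-items (F2) F values."""
    return (f1 * f2) / (f1 + f2)

# Affirmative responses, Type of Dilemma: F1 = 627.50, F2 = 343.90
print(round(min_f_prime(627.50, 343.90)))  # 222
```

Because min F' is always smaller than either F1 or F2 alone, an effect that remains large under this statistic, as here, is unlikely to be driven by a few subjects or a few items.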

To the best of our knowledge, no studies have so far provided normative data for stimuli such as moral dilemmas. We believe that these normative ratings might be effectively employed in future studies to select moral dilemmas according to specific research needs, by controlling for or manipulating the judgments of moral acceptability or the emotional salience of the experimental conditions.

References

  • Aquinas, T. (1952). The summa theologica (Fathers of the English Dominican Province, Trans.). In W. Benton (Series Ed.), Great Books Series: Vol. 19. Chicago: Encyclopedia Britannica, Inc. (Original work published 1274).
  • Borg, J. S., Hynes, C., Van Horn, J., Grafton, S., & Sinnott-Armstrong, W. (2006). Consequences, action, and intention as factors in moral judgments: an fMRI investigation. Journal of Cognitive Neuroscience, 18, 803–817.
  • Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: the self-assessment manikin and the semantic differential. Journal of Behavior Therapy and Experimental Psychiatry, 25, 49–59.
  • Choe, S. Y., & Min, K.-H. (2011). Who makes utilitarian judgments? The influences of emotions on utilitarian judgments. Judgment and Decision Making, 6, 580–592.
  • Cima, M., Tonnaer, F., & Hauser, M. D. (2010). Psychopaths know right from wrong but don't care. Social Cognitive and Affective Neuroscience, 5, 59–67.
  • Crockett, M. J., Clark, L., Hauser, M. D., & Robbins, T. W. (2010). Serotonin selectively influences moral judgment and behavior through effects on harm aversion. Proceedings of the National Academy of Sciences, 107, 17433–17438.
  • Cushman, F., Young, L., & Hauser, M. (2006). The role of conscious reasoning and intuition in moral judgment: testing three principles of harm. Psychological Science, 17, 1082–1089.
  • Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191.
  • Foot, P. (1967). The problem of abortion and the doctrine of double effect. Oxford Review, 5, 5–15.
  • Fumagalli, M., Ferrucci, R., Mameli, F., Marceglia, S., Mrakic-Sposta, S., Zago, S., Lucchiari, C., Consonni, D., Nordio, F., Pravettoni, G., Cappa, S., & Priori, A. (2010). Gender-related differences in moral judgments. Cognitive Processing, 11, 210–226.
  • Greene, J. D., Cushman, F. A., Stewart, L. E., Lowenberg, K., Nystrom, L. E., & Cohen, J. D. (2009). Pushing moral buttons: the interaction between personal force and intention in moral judgment. Cognition, 111, 364–371.
  • Greene, J. D., Morelli, S. A., Lowenberg, K., Nystrom, L. E., & Cohen, J. D. (2008). Cognitive load selectively interferes with utilitarian moral judgment. Cognition, 107, 1144–1154.
  • Greene, J. D., Nystrom, L. E., Engell, A. D., Darley, J. M., & Cohen, J. D. (2004). The neural bases of cognitive conflict and control in moral judgment. Neuron, 44, 389–400.
  • Greene, J. D., Sommerville, R. B., Nystrom, L. E., Darley, J. M., & Cohen, J. D. (2001). An fMRI investigation of emotional engagement in moral judgment. Science, 293, 2105–2108.
  • Hauser, M., Cushman, F., Young, L., Jin, R., & Mikhail, J. (2007). A dissociation between moral judgments and justifications. Mind & Language, 22, 1–21.
  • Hsee, C. K. (1996). The evaluability hypothesis: an explanation for preference reversals between joint and separate evaluations of alternatives. Organizational Behavior and Human Decision Processes, 67, 247–257.
  • Hsee, C., Loewenstein, G., Blount, S., & Bazerman, M. (1999). Preference reversals between joint and separate evaluations of options: a review and theoretical analysis. Psychological Bulletin, 125, 576–590.
  • Koenigs, M., Young, L., Adolphs, R., Tranel, D., Cushman, F., Hauser, M., & Damasio, A. (2007). Damage to the prefrontal cortex increases utilitarian moral judgments. Nature, 446, 908–911.
  • Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (2008). International affective picture system (IAPS): affective ratings of pictures and instruction manual. Technical Report A-8, University of Florida, Gainesville, FL.
  • Lang, P. J., Greenwald, M. K., Bradley, M. M., & Hamm, A. O. (1993). Looking at pictures: affective, facial, visceral, and behavioral reactions. Psychophysiology, 30, 261–273.
  • MacIntyre, A. (1990). Moral dilemmas. Philosophy and Phenomenological Research, 50, 367–382.
  • Marsh, A. A., Crowe, S. L., Yu, H. H., Gorodetsky, E. K., Goldman, D., & Blair, R. J. R. (2011). Serotonin transporter genotype (5-HTTLPR) predicts utilitarian moral judgments. PLoS One, 6, e25148.
  • McGuire, J., Langdon, R., Coltheart, M., & Mackenzie, C. (2009). A reanalysis of the personal/impersonal distinction in moral psychology research. Journal of Experimental Social Psychology, 45, 577–580.
  • Mikhail, J. (2002). Law, science, and morality: a review of Richard Posner's 'The Problematics of Moral and Legal Theory'. Stanford Law Review, 54, 1057–1127.
  • Moore, A. B., Clark, B. A., & Kane, M. J. (2008). Individual differences in working memory capacity, executive control, and moral judgment. Psychological Science, 19, 549–557.
  • Moretto, G., Làdavas, E., Mattioli, F., & Di Pellegrino, G. (2010). A psychophysiological investigation of moral judgment after ventromedial prefrontal damage. Journal of Cognitive Neuroscience, 22, 1888–1899.
  • Petrinovich, L., O'Neill, P., & Jorgensen, M. (1993). An empirical study of moral intuitions: towards an evolutionary ethics. Journal of Personality and Social Psychology, 64, 467–478.
  • Posner, J., Russell, J. A., Gerber, A., Gorman, D., Colibazzi, T., Yu, S., Wang, Z., Kangarlu, A., Zhu, H., & Peterson, B. S. (2009). The neurophysiological bases of emotion: an fMRI study of the affective circumplex using emotion-denoting words. Human Brain Mapping, 30, 883–895.
  • Sarlo, M., Lotto, L., Manfrinati, A., Rumiati, R., Gallicchio, G., & Palomba, D. (2012). Temporal dynamics of cognitive-emotional interplay in moral decision-making. Journal of Cognitive Neuroscience, 24, 1018–1029.
  • Snodgrass, J. G., & Vanderwart, M. (1980). A standardized set of 260 pictures: norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6, 174–215.
  • Thomson, J. J. (1986). Rights, restitution, and risk: Essays in moral theory. Cambridge, MA: Harvard University Press.
  • Unger, P. (1996). Living high and letting die. New York: Oxford University Press.

Biographies

  • Lorella Lotto is an Associate Professor at the Department of Developmental Psychology and Socialization of the University of Padova. She received her PhD in Psychology from the University of Padova. Her research interests include decision-making issues, especially relating to moral judgment, medical decision making, and risk communication.

  • Andrea Manfrinati is a Postdoctoral Research Fellow at the Department of Developmental Psychology and Socialization, University of Padova. He received his PhD in Cognitive Sciences from the University of Padova. His main research interests focus on psychology of thinking, reasoning, and moral judgment.

  • Michela Sarlo is an Assistant Professor at the Department of General Psychology of the University of Padova. She received her PhD in Cognitive Sciences from the University of Padova. Her research interests include the autonomic and neural correlates of cognitive–emotional interaction in decision-making processes.

Supporting Information

Filename                            Format              Size   Description
bdm1782-sup-0001-documentS1.docx    Word 2007 document  113K   Supporting info item
bdm1782-sup-0002-documentS2.docx    Word 2007 document  133K   Supporting info item
bdm1782-sup-0003-TableS1.docx       Word 2007 document  88K    Supporting info item
bdm1782-sup-0004-TableS10.docx      Word 2007 document  121K   Supporting info item
bdm1782-sup-0005-TableS2.docx       Word 2007 document  98K    Supporting info item
bdm1782-sup-0006-TableS3.docx       Word 2007 document  93K    Supporting info item
bdm1782-sup-0007-TableS4.docx       Word 2007 document  95K    Supporting info item
bdm1782-sup-0008-TableS5.docx       Word 2007 document  119K   Supporting info item
bdm1782-sup-0009-TableS6.docx       Word 2007 document  126K   Supporting info item
bdm1782-sup-0010-TableS7.docx       Word 2007 document  122K   Supporting info item
bdm1782-sup-0011-TableS8.docx       Word 2007 document  126K   Supporting info item
bdm1782-sup-0012-TableS9.docx       Word 2007 document  84K    Supporting info item

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.