An investigation of social factors related to online mentalizing in a human-robot competitive game

Author note


  • This study was supported by a Grant-in-Aid for Scientific Research on Innovative Areas “Founding a creative society via collaboration between humans and robots (No. 4101)” (24118708), Grant-in-Aid for Young Scientists (B) No. 23700321, and Tamagawa University Global Center of Excellence grant from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

Correspondence concerning this article should be sent to: Hideyuki Takahashi, Brain Science Institute, Tamagawa University, Tamagawagakuen, Machida 194-8610, Japan. (E-mail: hideman@lab.tamagawa.ac.jp)

Abstract

“Mentalizing” is the ability to attribute mental states to other agents. A lack of online mentalizing, which is required in actual social contexts, may cause serious social disorders such as autism. However, the mechanism of online mentalizing is still unclear. In this study, we found that behavioral entropy (which indicates the randomness of decision making) was an efficient behavioral index of online mentalizing in a human-human competitive game. Further, participants played the game with a humanoid robot; the results indicated that entropy was significantly higher in participants whose gaze followed the robot's head turn than in those whose gaze did not, although the explicit human-likeness of the robot did not correlate with behavioral entropy. These results imply that mentalizing can be divided into two separate processes: an explicit, logical reasoning process and an implicit, intuitive process driven by perception of the other agent's gaze. We hypothesize that the latter is a core process for online mentalizing, and we argue that the social problems of autistic people are caused by dysfunction of this process.

The ability to attribute a mental state to another agent is called “mentalizing” (Frith & Frith, 2006; Van Overwalle & Baetens, 2009). Mentalizing is essential for interpersonal interaction, and its underlying process has attracted considerable research attention. This topic has been investigated in various research fields, such as comparative psychology, developmental psychology, and neuroscience. In particular, clinical disorders (e.g., autism spectrum disorder) have provided great insight into mentalizing (Castelli, Frith, Happé, & Frith, 2002; White, Hill, Happé, & Frith, 2009). Many previous findings have suggested that dysfunction of mentalizing causes inappropriate social behaviors. However, the concrete mechanism behind mentalizing is still unclear. Specifically, there is a paucity of knowledge about how mentalizing works in actual social situations, such as interpersonal interaction.

Mentalizing has been explained in terms of two different processes, called “offline” and “online” mentalizing; the latter, rather than the former, is thought to play a critical role in actual social situations (Frith & Frith, 2010). Offline mentalizing is mental inference made through deliberation and recursive reasoning. In a typical experimental task testing mentalizing abilities, participants are asked to infer the static mental state (e.g., emotion/intention) of a character who is presented passively (e.g., in a video clip, cartoon, or picture), and they are given ample time and information to deliberate on the task (Happé, 1994; Kana, Keller, Cherkassky, Minshew, & Just, 2009). Thus, typical laboratory experiments may investigate only offline mentalizing. In contrast, online mentalizing relies on emotional and intuitive processes. Actual social contexts are not static but dynamic; the mental states of others change frequently, which makes it impossible to gather enough information for deliberate inference and requires people to remain sensitive to changes in mental state. Hence, in actual social contexts, people need to infer mental states with limited time and information for deliberation. However, the mechanism of online mentalizing is still unclear because of the difficulty of investigating it with a typical laboratory experiment.

Experiments that require interaction with another player, such as the prisoner's dilemma game, are expected to be useful for the investigation of online mentalizing (Lee, 2008). These games contain fundamental attributes of actual social contexts, such as interests, reciprocity, and tactics. Because of these aspects, interactive games are often used as microcosms of actual social contexts in economics, psychology, and neuroscience experiments. In contrast to typical laboratory experiments, which pose a single answer that does not change during a trial and can be reached through careful deliberation, interactive games require the participant not only to guess another's mental state, but also to pay attention to changes in that mental state. When participants believe that their opponent in a game has a changeable mental state, they are expected to infer the opponent's mental state dynamically, just as in actual social contexts. Conversely, participants tend to think that if their opponent is not human, the opponent's strategy will not change. Previous studies using interactive games have investigated online mentalizing by comparing a condition in which participants were instructed that their opponent was a human player (online mentalizing is required) with a condition in which the opponent was a mindless computer program (online mentalizing is not required) (Coricelli & Nagel, 2009; Fukui, Murai, Shinozaki, Aso, Fukuyama, Hayashi, & Hanakawa, 2006; Gallagher, Jack, Roepstorff, & Frith, 2002; McCabe, Houser, Ryan, Smith, & Trouard, 2001; Rilling, Sanfey, Aronson, Nystrom, & Cohen, 2004; Waytz, Gray, Epley, & Wegner, 2010). These studies succeeded in extracting specific behaviors and brain activities related to online mentalizing. However, they could not specify the precise factors related to online mentalizing in human-human interaction, because the difference between the two conditions (a human player versus a mindless computer program) is very large, and the appearance and motions of a human opponent cannot be controlled uniformly across participants. To specify the factors related to online mentalizing more precisely, the attributes of game opponents need to be controlled more strictly through the manipulation of multiple well-defined experimental variables.

To elucidate the precise factors related to online mentalizing, we used a humanoid robot as the opponent in an interactive game. The appearance and movements of a robot can be controlled reproducibly, and multiple experimental variables can be defined in the human-robot interaction. In this study, we conducted two experiments with healthy adults using an iterated matching pennies game (MPG). In the first experiment, we used a mathematical measure, “entropy,” to quantify the behavioral changes caused by instructing participants that the opponent was a human. This experiment suggested that entropy is an efficient behavioral index related to online mentalizing in the MPG. In the second experiment, the participants' MPG opponent was a humanoid robot. As experimental variables describing the human-robot interaction, we defined an explicit, subjective rating of the robot's human-likeness and the participant's gaze-following behavior, which expresses the participant's implicit perception of the robot's human-likeness (Meltzoff, Brooks, Shon, & Rao, 2010), and we explored the relationships of these variables with entropy. On the basis of the results of these two experiments, we discuss the process of online mentalizing and its dysfunction in autism.

Experiment I: Specification of a behavioral index related to online mentalizing in the MPG

In this experiment, we specified a behavioral index related to online mentalizing in the MPG. Specifically, using entropy, a behavioral index calculated from the frequencies of the participants' decisions, we investigated whether the randomness of decision making in the MPG increased when participants believed that their opponent was a human player.

Materials and methods

The MPG is a simple, zero-sum, competitive game played by two players. In each trial, each player selects one of two options, “L” or “R,” and the winning/losing outcome for each player is determined by the combination of the two players' decisions. If both players select the same option, one player (always the same one) is the winner and the other is the loser; if they select different options, the identities of the winner and the loser are reversed. In each trial, the winner receives a fixed reward and the loser loses the same amount. The participants played this game across multiple trials and were asked to increase their accumulated reward as much as possible. Because the MPG is a symmetrical, zero-sum game, the required strategy is exactly the same for the two players: each must predict the opponent's next choice while avoiding the risk of having his/her own choice predicted by the opponent. In our experiment, the opponent was always a computer program, regardless of the instructions given to the participants, and the program selected each option with equal probability. Hence, the expected winning ratio was always 0.5, regardless of the participant's decisions.
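
For concreteness, the payoff structure can be summarized by the standard matching pennies matrix shown below with unit stakes. This is the textbook form of the game rather than a table taken from the paper, and the assignment of the matching role to the row player is arbitrary.

```latex
% Standard matching pennies payoff matrix (unit stakes).
% Row player wins on a match; column player wins on a mismatch.
% Each cell gives (row player's payoff, column player's payoff).
\[
\begin{array}{c|cc}
          & \text{L}    & \text{R}    \\ \hline
\text{L}  & (+1,\,-1)   & (-1,\,+1)   \\
\text{R}  & (-1,\,+1)   & (+1,\,-1)
\end{array}
\]
```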

In Experiment I, the MPG was programmed in Microsoft Visual Basic 6.0 (Microsoft Corp.) under the Windows XP operating system, and the game scenes were presented on a laptop computer in the following sequence during each trial: in the first scene, the opponent made its decision; in the second scene, the participant made his/her decision by pressing a key; in the third scene, the selections of the participant and the opponent were presented; and in the fourth scene, the participant's accumulated reward was presented. In this experiment, if the two players selected different options, the participant was the winner and received ¥50 (US$1 was approximately ¥80); if the players selected the same option, the participant was the loser and lost ¥50.
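
The per-trial logic is simple enough to summarize in a short simulation. The following Python sketch is our reconstruction, not the original Visual Basic 6.0 program; it assumes a uniformly random stand-in for the participant's choices and reproduces the reward rule of Experiment I.

```python
import random

def simulate_session(n_trials=50, stake=50, seed=None):
    """Simulate one Experiment I session of the matching pennies game.

    The opponent picks "L" or "R" with equal probability; the human
    participant is stubbed here as a uniform random chooser.  The
    participant wins `stake` yen when the two choices differ and loses
    the same amount when they match, as in Experiment I.
    """
    rng = random.Random(seed)
    reward = 0
    history = []
    for _ in range(n_trials):
        opponent = rng.choice(["L", "R"])      # opponent commits first
        participant = rng.choice(["L", "R"])   # stand-in for the human decision
        win = participant != opponent          # mismatch -> participant wins
        reward += stake if win else -stake
        history.append((participant, opponent, win))
    return reward, history

if __name__ == "__main__":
    total, _ = simulate_session(seed=0)
    print(f"Accumulated reward after 50 trials: {total} yen")
```

Because the opponent chooses each option with probability 0.5, the expected accumulated reward is zero whatever policy replaces the random stand-in, which is why reward alone cannot index online mentalizing.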

In this experiment, there were two conditions: the human opponent (HO) condition, in which the participants were instructed that the opponent was a human player, and the computer opponent (CO) condition, in which the participants were instructed that the opponent was a computer program. Each session of the HO or CO condition consisted of 50 trials, and the order of the conditions was fixed: HO-CO-HO-CO-HO-CO (six sessions in total). In the HO condition, the participants were told only that their opponent was a human player, without any extra information (e.g., how the opponent played the game). A man in his twenties, who was a stranger to the participants, sat beside them as a pseudo-human opponent and pretended to be playing the game. The pseudo-human opponent was asked to press one key without hesitation, and this key press had no influence on the opponent's decision shown on the display. The participants and the pseudo-human opponent were prohibited from talking to or seeing each other. In the CO condition, the participants were instructed that the opponent was a computer program, and they played the game in the same way as in the HO condition, except that no pseudo-human opponent was present. The entire experiment took approximately 20 min to complete.

Participants

Nineteen healthy adults (seven female and 12 male, aged 19–36 years) were enrolled in the study. All participants were recruited from among college students and faculty members, and interviews confirmed that they had no developmental problems.

Behavioral analysis

In the MPG, generating random decisions is an efficient way for a player to prevent the opponent from predicting his/her next decision; the randomness of decision making is therefore an important behavioral index in the MPG. We quantified the randomness of decision making during each session of the MPG as the entropy H, calculated from the conditional frequency p(d|c) of decision d (L or R) selected in the current game context c (the recent choices of the participant and the opponent). The entropy H indicates the extent to which decision d is generated independently of the current game context, and its value increases with the degree of randomness of decision making in the MPG.

p(d|c) was calculated using the following equation:

\[
p(d \mid c) = \frac{n(d \mid c) + k}{\sum_{d' \in \{L,\,R\}} \bigl( n(d' \mid c) + k \bigr)}
\]

The variable n(d|c) indicates the observed number of decisions d in context c, and k is a correction coefficient that prevents small samples from deforming p(d|c). Because working memory is limited, the participants could not access the entire context; rather, their decisions were likely based on a portion of the context. We therefore assumed six partial contexts (pc) for entropy estimation (S1: the last decision by the participant; S2: the last two decisions by the participant; O1: the last decision by the opponent; O2: the last two decisions by the opponent; S1 & O1: the combination of the last decisions by the participant and the opponent; none: no game context), where c_pc denotes the game context corresponding to each pc. The entropy H(d|c_pc) in each session was calculated using the following equation:

\[
H(d \mid c_{pc}) = -\frac{1}{N_{pc}} \sum_{c \,\in\, c_{pc}} \sum_{d \,\in\, \{L,\,R\}} p(d \mid c) \log_2 p(d \mid c)
\]

Here, N_pc is the number of possible alternatives of the game context c_pc, and dividing by it normalizes H_pc to the range from 0 to 1. For each session, the lowest of the six entropy values was chosen as the decision-entropy value for that session. This value increases towards one as the decisions become less predictable.
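
To make this index concrete, the following Python sketch reconstructs the computation under the additive-smoothing reading of the equations above. The function names, the choice k = 1, and the treatment of unvisited contexts are our assumptions, since the paper does not report these implementation details.

```python
import math
from collections import defaultdict
from itertools import product

DECISIONS = ("L", "R")

def partial_context(pc, t, own, opp):
    """Return the partial-context key at trial t, or None if history is too short."""
    if pc == "none":
        return ()
    if pc == "S1":
        return (own[t - 1],) if t >= 1 else None
    if pc == "S2":
        return (own[t - 2], own[t - 1]) if t >= 2 else None
    if pc == "O1":
        return (opp[t - 1],) if t >= 1 else None
    if pc == "O2":
        return (opp[t - 2], opp[t - 1]) if t >= 2 else None
    if pc == "S1&O1":
        return (own[t - 1], opp[t - 1]) if t >= 1 else None
    raise ValueError(pc)

CONTEXT_SPACE = {          # all possible values of c_pc (size N_pc)
    "none": [()],
    "S1": [(d,) for d in DECISIONS],
    "O1": [(d,) for d in DECISIONS],
    "S2": list(product(DECISIONS, repeat=2)),
    "O2": list(product(DECISIONS, repeat=2)),
    "S1&O1": list(product(DECISIONS, repeat=2)),
}

def session_entropy(own, opp, k=1.0):
    """Minimum over partial contexts of the normalized conditional entropy H(d | c_pc)."""
    entropies = {}
    for pc, contexts in CONTEXT_SPACE.items():
        counts = defaultdict(lambda: defaultdict(float))
        for t, d in enumerate(own):
            c = partial_context(pc, t, own, opp)
            if c is not None:
                counts[c][d] += 1
        h = 0.0
        for c in contexts:
            total = sum(counts[c][d] + k for d in DECISIONS)
            for d in DECISIONS:
                p = (counts[c][d] + k) / total      # smoothed p(d | c)
                h -= p * math.log2(p)
        entropies[pc] = h / len(contexts)           # divide by N_pc -> range [0, 1]
    return min(entropies.values()), entropies

# Example: a participant who mechanically alternates L, R, L, R, ... is highly
# predictable from his/her own last choice, so the minimum entropy is low.
own = ["L", "R"] * 25
opp = ["L", "R", "R", "L", "R"] * 10
print(session_entropy(own, opp)[0])
```

With this definition, the alternating participant in the example yields a minimum entropy far below one, whereas a genuinely random sequence yields a value near one.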

Results

Interviews after the experiment confirmed that all participants believed that their opponent was a human player in the HO condition. We compared both the acquired reward and the behavioral entropy between the HO and CO conditions using one-way repeated-measures ANOVAs. There was no significant difference in the acquired reward between the HO and CO conditions. However, we found a significant main effect of experimental condition on entropy (Figure 1), F(1, 18) = 11.122, p < .01. The mean entropy in the HO condition was significantly higher than that in the CO condition. The instruction that the game opponent was a human player thus increased the randomness of the participants' decision making in the MPG regardless of actual game performance, and our results suggest that entropy is an efficient behavioral index related to online mentalizing in the MPG.

Figure 1.

Mean entropy values of participants' decision sequences in each condition (error bars represent standard errors). CO = computer opponent; HO = human opponent.
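
The within-subject comparison reported above can be reproduced with standard tools. The sketch below runs a one-way repeated-measures ANOVA on fabricated per-participant entropy means using the statsmodels package; the data are invented for illustration, and the paper does not state which statistical software was actually used.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Fabricated per-participant mean entropies: 19 participants, one value
# for each within-subject condition (HO and CO).
rng = np.random.default_rng(0)
n_participants = 19
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_participants), 2),
    "condition": np.tile(["HO", "CO"], n_participants),
    "entropy": np.concatenate(
        [rng.normal([0.85, 0.75], 0.05) for _ in range(n_participants)]
    ),
})

# One-way repeated-measures ANOVA; with two levels this is equivalent
# to a paired t-test (F(1, n-1) = t^2).
result = AnovaRM(df, depvar="entropy", subject="subject",
                 within=["condition"]).fit()
print(result)
```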

Experiment II: Investigation of social factors related to online mentalizing in the MPG with a humanoid robot opponent

In Experiment I, we specified a behavioral index, entropy, which is related to online mentalizing in the MPG. However, we could not investigate the concrete factors included in human-human interaction that influenced entropy in the MPG, because human-human interaction was too complicated for precise investigation. To overcome the difficulties in investigating human-human interaction, we developed a human-robot competitive game in which participants played the MPG with a humanoid robot.

The appearance of humanoid robots was quite familiar to the participants, and the robot's movements were controlled by an experimenter in another room. Several previous studies have suggested that participants often attribute mental states to mindless robots, and exploring the factors that induce the attribution of mental faculties to a robot sheds light on the precise mechanism of the mentalizing process (Krach, Hegel, Wrede, Sagerer, Binkofski, & Kircher, 2008).

We conducted this experiment with two purposes. The first was to explore the social factors related to entropy in the MPG. For this purpose, we defined two social factors as experimental variables: a subjective rating of the robot's human-likeness and the participant's gaze-following behavior toward the robot. We then investigated which of these variables was strongly related to entropy in the MPG. The second purpose was to investigate entropy in the MPG before and after human-robot imitation. Imitation between two agents is known to strongly influence the social relationship between them (Van Baaren, Holland, Kawakami, & van Knippenberg, 2004). Hence, we investigated whether human-robot imitation influenced the process of online mentalizing during the human-robot competitive game. For this purpose, we prepared three variations of human-robot imitation: the “observation” group (control condition), the “participant imitation” group, and the “robot imitation” group, and we compared entropy among these three groups.

Methods

In this experiment, we used a humanoid robot named “PoCoBot” as the opponent in the MPG (Figure 2). The robot could move its arms, change the direction of its face, and speak predetermined sentences under the remote control of the experimenter.

Figure 2.

Experimental scene with a robot.

The participants sat at a table facing the robot (the distance between the participant and the robot was approximately 1 m). The participants were given two cards, one with “Left” and the other with “Right” printed on one side. In each trial, the participant selected one of the two cards and placed it face-down on the table. Then, the robot indicated one direction, either left or right, with a gesture. The robot selected each direction with equal probability (0.5); however, the participants were not informed of the robot's strategy and were instructed to predict the direction that the robot would select. If the direction selected by the participant and the direction indicated by the robot were the same, the participant was the winner; if not, the participant was the loser. The participants were motivated to win by the instruction that they would receive a large reward if they beat the robot in many trials. The participants played 20 trials of the MPG with the robot in each session. Before each session, the participants had a short conversation with the robot (i.e., a greeting), during which the robot suddenly turned its head. We checked whether the participants followed the robot's gaze direction with their own gaze just after the robot turned its head. The participants were also asked to rate the human-likeness of the robot on a seven-point scale in each session.

In this experiment, the participants played the MPG with the robot for two sessions and were divided into three groups. In the observation group, participants were simply instructed to observe the robot's arm movements for 2 min between the two sessions. In the participant imitation group, the robot moved its arms randomly, and the participants were instructed to imitate the robot's arm movements during the same period. In the robot imitation group, the participants were asked to move their arms freely, and the robot imitated their arm movements.

Participants

Twenty-seven healthy adult participants (12 female and 15 male, aged 18–29 years) were enrolled in the study. Nine participants were assigned to the observation group, ten participants were assigned to the participant imitation group, and eight participants were assigned to the robot imitation group.

Results

We compared the mean human-likeness ratings in each session across the participant groups using a two-way ANOVA ([session number] × [participant group]; Figure 3). We did not find any significant effect of participant group, but we did find a main effect of session number, F(1, 24) = 18.191, p < .001. These results suggest that the participants tended to rate the human-likeness of the robot higher in the second session than in the first, regardless of group assignment.

Figure 3.

Mean human-likeness ratings of the robot opponent in each session and participant group (error bars represent standard errors).

We also compared the mean entropy in each session across the participant groups using a two-way ANOVA ([session number] × [participant group]; Figure 4). We did not find any significant effect of either session number or participant group, even though the subjective human-likeness of the robot increased from the first to the second session. These results suggest that the randomness of decision making in the MPG with the robot was independent of the subjective human-likeness of the robot.

Figure 4.

Mean entropy of participant response sequences in each session and participant group (error bars represent standard errors).
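
The session × group analyses above combine a within-subject factor (session) with a between-subject factor (group), so a mixed ANOVA is the natural layout. The sketch below illustrates one way to run it, using the pingouin package on fabricated human-likeness ratings; the same layout applies to the entropy analysis, and neither the package nor the data reflect the authors' actual pipeline.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Fabricated data: 27 participants in three groups, two sessions each,
# with a 1-7 human-likeness rating as the dependent variable.
rng = np.random.default_rng(1)
groups = (["observation"] * 9
          + ["participant imitation"] * 10
          + ["robot imitation"] * 8)
rows = []
for subject, group in enumerate(groups):
    baseline = rng.normal(4.0, 1.0)
    for session in (1, 2):
        rating = float(np.clip(baseline + 0.5 * (session - 1)
                               + rng.normal(0.0, 0.5), 1, 7))
        rows.append({"subject": subject, "group": group,
                     "session": session, "rating": rating})
df = pd.DataFrame(rows)

# Mixed ANOVA: session (within-subject) x group (between-subject).
# Replacing "rating" with session entropy gives the entropy analysis.
aov = pg.mixed_anova(data=df, dv="rating", within="session",
                     subject="subject", between="group")
print(aov.round(3))
```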

Next, we investigated whether the participants' gaze-following behavior was related to the subjective human-likeness of the robot or to the entropy of the participants' responses. Approximately 74% of the participants followed the robot's gaze in the first session, and approximately 78% did so in the second session; there was no significant difference in this ratio between the two sessions. We compared the mean subjective human-likeness of the robot between sessions in which the participant exhibited gaze-following behavior and sessions in which the participant did not follow the robot's gaze, and found no significant difference (Figure 5). We then compared the mean entropy of the participants' decision sequences between sessions with and without gaze-following behavior, and found that entropy was significantly higher in sessions with gaze following than in sessions without it (Figure 6; t-test: p < .05). These results suggest that the entropy of the participants' decision sequences tended to be high when they followed the robot's gaze, regardless of the subjective human-likeness of the robot.

Figure 5.

Mean human-likeness rating of the robot opponent in sessions with versus without gaze-following behavior by the participant (error bars represent standard errors).

Figure 6.

Mean entropy of participant responses in sessions with versus without gaze-following behavior by the participant (error bars represent standard errors).
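
The gaze-following comparison above is a simple two-group test on session-level entropies. The sketch below illustrates it with fabricated values and a Welch t-test from SciPy; the paper does not report the exact session counts or which t-test variant was used, so those details are assumptions.

```python
import numpy as np
from scipy import stats

# Fabricated session-level entropies, grouped by whether the participant
# followed the robot's gaze in that session (group sizes are illustrative).
rng = np.random.default_rng(2)
entropy_follow = np.clip(rng.normal(0.85, 0.10, size=41), 0.0, 1.0)
entropy_no_follow = np.clip(rng.normal(0.70, 0.10, size=13), 0.0, 1.0)

# Welch's t-test, which does not assume equal variances; the paper does
# not state which t-test variant (or pairing) was actually applied.
t_stat, p_value = stats.ttest_ind(entropy_follow, entropy_no_follow,
                                  equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```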

Discussion

In this study, we quantified online mentalizing in the MPG with a behavioral index, entropy, and we explored the factors in human-robot interaction that influenced entropy. We found that entropy was elevated when the participant followed the robot's gaze, although it was independent of the participant's subjective ratings of the robot's human-likeness.

Human-robot imitation did not significantly influence online mentalizing (i.e., entropy), although some previous studies have suggested that imitation between two agents strongly influences the relationship between them. The influence of imitation is known to be strong when the imitation is executed implicitly, whereas in our experiment the participants were explicitly instructed about the human-robot imitation. Previous studies have suggested that implicit interpersonal imitation influences participants' cognition more strongly than explicit imitation (Lakin & Chartrand, 2003). If the robot imitated the participants' body movements implicitly, the imitation might improve online mentalizing. If online mentalizing does increase as a result of human-robot imitation, this finding could have implications for robotic therapy for social disorders such as autism.

The explicit human-likeness of the robot increased from the first session to the second, although entropy did not change between the two sessions. This increase might be explained by the mere-exposure effect (Zajonc, 1968): the participants' explicit impressions of the robot improved over the course of the experiment. Online mentalizing, however, was not affected by this effect.

The participants' gaze-following behavior indicates that they were aware of the robot's eyes: if the participants had not recognized the robot's eyes, there would have been no reason to follow its gaze and look where it was looking. Social psychology studies have suggested that the perception of others' eyes unconsciously influences prosocial behavior (Mifune, Hashimoto, & Yamagishi, 2010). Our results imply that the behavioral change related to online mentalizing in the MPG is driven not by the explicit perception of human-likeness, but by the implicit perception of “eyes.” In terms of game theory, increasing the randomness of decision making is optimal in the MPG, because a predictable strategy carries a significant risk of being exploited when the opponent “observes” the player's choices with its eyes (Nash, 1950). Hence, if a participant is aware of a robot's eyes, it is reasonable for him/her to increase the randomness of decision making to avoid the risk of a poor outcome in the MPG. Interestingly, our results imply that this behavioral influence is driven not by explicit person perception but by reflexive following of the robot's eyes.
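
The game-theoretic point can be made explicit with the standard indifference argument for matching pennies (a textbook derivation, not one reproduced from the paper):

```latex
% Indifference argument for matching pennies with unit stakes.
% Suppose the participant (who wins on a mismatch, as in Experiment I)
% plays L with probability p.  The opponent's expected payoffs for its
% two pure responses are
\[
E[\text{opponent plays L}] = p \cdot (+1) + (1 - p) \cdot (-1) = 2p - 1, \qquad
E[\text{opponent plays R}] = p \cdot (-1) + (1 - p) \cdot (+1) = 1 - 2p .
\]
% Any p \neq 1/2 gives the opponent a profitable pure response, so the
% unique equilibrium is p = 1/2 for both players, with expected payoff zero.
```

Thus any predictable deviation from the 50/50 mixture can be exploited by an observing opponent, which is why increasing the randomness of decision making is the rational response to feeling “watched.”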

The abovementioned interpretation of online mentalizing closely parallels the dual-process model of reasoning (Evans, 2003). In that model, human reasoning is thought to consist of two distinct processes: a deliberative reasoning system and an autonomous system driven by innate input modules. The former system permits highly abstract thinking but imposes a heavy cognitive load. In contrast, the latter system is intuitive and permits rapid responses without mental effort. We consider that offline mentalizing corresponds to the former system applied to reasoning about another's mental state, and online mentalizing corresponds to the latter, autonomous system. Actual social situations change dynamically and contain significant ambiguity. Taken together with the above discussion, we propose a dual-process model of mentalizing that consists of an explicit, logical reasoning process for offline mentalizing and an implicit, intuitive process for online mentalizing (Figure 7). Inferring others' mental states through logical reasoning in actual social situations would require great effort and cause considerable stress; hence, the autonomous system is more useful in such situations.

Figure 7.

Explicit and implicit processes of mentalizing.

High-functioning autistic people have autistic traits but no difficulties with intellectual or verbal abilities. They often perform on par with typically developing people in laboratory experiments that require mentalizing (Shamay-Tsoory, 2008; Slaughter & Paynter, 2007). However, they have many problems communicating with others in their daily lives. Recently, this inconsistency between laboratory experiments and actual social contexts has been explained in terms of the distinction between “offline” and “online” mentalizing (Roeyers & Demurie, 2010). Because they have a high level of ability to deliberate about mental states, high-functioning autistic people might not have serious problems with offline mentalizing. Their inappropriate social behaviors might instead be caused by abnormalities in online mentalizing, that is, by shortcomings in the autonomous system driven by the perception of others' eyes. Consistent with this view, Izuma and colleagues showed that the influence of others' eyes on prosocial behavior is weak in autistic people (Izuma, Matsumoto, Camerer, & Adolphs, 2011).

Recent studies suggest that the enhancement of prosocial behavior driven by others' eyes arises from an increase in risk awareness. If one does not behave prosocially in a public place where other people are watching, one's reputation might suffer; to avoid this risk, people readily behave prosocially when they are aware of others' eyes. Hence, both the increase in entropy in the MPG and prosocial behavior in public spaces might be driven by an increase in risk awareness. Taken together, our results indicate that being aware of others' eyes automatically increases risk awareness, which regulates cognition and behavior so as to avoid various social risks. We believe that autistic people have a serious problem in this autonomous system.

Finally, we discuss a neural correlate of this implicit system that controls risk awareness depending on the presence of others' eyes. We consider the anterior insula to be a key brain region. The anterior insula is known to integrate various information about one's own body and the external environment, and it also modulates risk awareness (Singer, Critchley, & Preuschoff, 2009). Because of this function, the anterior insula is sometimes called the brain's “alarm system” (Clark, Bechara, Damasio, Aitken, Sahakian, & Robbins, 2008). Uddin and Menon (2009) hypothesized that the anterior insula is a key node of a salience network that drives executive function depending on external sensory inputs; they also suggested that this neural circuit is dysfunctional in autistic people. Dysfunction of the anterior insula might therefore lead to serious problems in online mentalizing among autistic people.
