Predictive social perception: Towards a unifying framework from action observation to person knowledge

Action observation is central to human social interaction. It allows people to derive what mental states drive others' behaviour and to coordinate (and compete) effectively with them. Although previous accounts have conceptualised this ability in terms of bottom-up (motoric or conceptual) matching processes, more recent evidence suggests that such mechanisms cannot account for the complexity and uncertainty of the sensory input, even in cases where computations should be much simpler (i.e., low-level vision). It has therefore been argued that perception in general, and social perception in particular, is better described as a process of top-down hypothesis testing. In such models, any assumption about others (their goals, attitudes, and beliefs) is translated into predictions of expected sensory input and compared with incoming stimulation. This allows perception and action to be based on these expectations or, in case of a mismatch, for one's prior assumptions to be revised until they are better aligned with the individual's behaviour. This article will give a (selective) review of recent research from experimental psychology and (social) neuroscience that supports such views, and discuss the relevant underlying models and current gaps in research. In particular, it will argue that much headway can be made when current research on predictive social perception is integrated with classic findings from social psychology, which have already shown striking effects of prior knowledge on the processing of other people's behaviour.

others' behaviour. As with basic visual features, there is no one-to-one mapping between actions and internal states.
The same action can serve very different goals in different contexts, and different motor behaviours can achieve the same goals (e.g., Bach, Knoblich, Gunter, Friederici, & Prinz, 2005; Jacob & Jeannerod, 2005). Moreover, others' behaviour is heavily context dependent: It is influenced not only by their internal states but by the objects around them, the people they interact with, their own social role, and observers readily integrate all these pieces of information (Gergely & Csibra, 2003; Ham & Vonk, 2003; Heider & Simmel, 1944; Lupfer, Clark, & Hutcherson, 1990; Todd, Molden, Ham, & Vonk, 2011). And while these challenges are present even in the fairly restricted types of behaviours typically used in action observation research (reaching, grasping, and simple object use), they become ever more prominent the more the scope of analysis is widened to include larger-scale behaviours, such as planning a dinner or cooperating and competing in a game, where multiple actions and goals will interact.
It has therefore been argued that perception in general (e.g., Clark, 2013; Friston & Kiebel, 2009; Summerfield & Egner, 2009), and social perception in particular, may be better described as top-down processing (e.g., Bach, Nicholson, & Hudson, 2014; Brown & Brüne, 2012; Csibra, 2008; Jacob, 2008; Kilner, Friston, & Frith, 2007a, 2007b; Koster-Hale & Saxe, 2013; Palmer, Seth, & Hohwy, 2015; Teufel, Fletcher, & Davis, 2010). In such predictive processing views, perception (social and otherwise) is always hypothesis driven: any assumption one has about the external world (this is a glass, the sky is blue, my friend is happy) is immediately transformed into concrete predictions about which perceptual input would go along with this state of affairs and compared with the actual stimulation (Bubic, von Cramon, & Schubotz, 2010; den Ouden, Kok, & de Lange, 2012; Friston & Kiebel, 2009; Summerfield & Egner, 2009). If there is a (good enough) match, one's prior hypotheses are confirmed and can form the basis of further processing, resolving ambiguous stimulation, "filling in" missing information, or guiding action towards anticipated future states (e.g., Roelfsema & de Lange, 2016). Bottom-up signals, in such models, communicate prediction errors: mismatch signals that propagate back up the hierarchy when one's prior assumptions cannot (fully) account for the input and have to be revised, until they better explain what is observed.
In vision, such models can explain, for example, why observers automatically perceive the "true" colour of a surface by explaining away the surrounding illumination (Bloj, Kersten, & Hurlbert, 1999; see Chetverikov & Ivanchei, 2016, for an application to the blue/gold dress illusion), how changing the expectation that light comes from above makes the same objects appear either convex or concave (Adams, Graf, & Ernst, 2004), or why bistable figures sometimes switch when the brain "tries out" an alternative hypothesis (Hohwy, Roepstorff, & Friston, 2008; see Roelfsema & de Lange, 2016, for similar phenomena).
Applied to social perception, these frameworks argue that our knowledge of other people is similarly not derived from a simple bottom-up "decoding" of each action's meaning. Instead, social perception is seen as a process of hypothesis testing, where prior assumptions about the other person are constantly tested against, and updated by, observed behaviour, across all levels of the hierarchy, from the lower level goals of single behaviours to higher level, more stable personality dispositions (e.g., Bach, Nicholson, et al., 2014; Bach et al., 2015; Csibra, 2008; Kilner et al., 2007a, 2007b). In such models, any assumption about another person is immediately translated into concrete expectations about their forthcoming actions in the given situation: what the person would do if our assumptions about them are correct (Figure 1, solid arrows). On the one hand, these predictions help perception by filling in details missing from the input or biasing it slightly into the future, thereby affording anticipatory control in social interactions.
On the other hand, and more importantly, they ensure that one's beliefs about others remain aligned with reality.
If the person behaves differently than expected, prediction errors (Figure 1, dotted arrows) are communicated back upwards, triggering revisions of one's prior assumptions until they can better account for the observed behaviour.
Rather than describing how each action is understood de novo, these approaches therefore see social perception as a constant series of hypothesis confirmations or revisions: She reaches straight towards the glass? Oh dear, she has not seen the candle in the way! He picks the cake? He probably is not dieting anymore. She is not helping her friend? Perhaps she is not as friendly as she first seemed.
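The cycle of prediction, comparison, and revision can be illustrated with a deliberately simplified sketch. It treats hypothesis revision as plain Bayesian updating over two candidate person hypotheses, which is an assumption made here for illustration and not the hierarchical implementation proposed in the cited models. The hypothesis labels, the observation labels, and all probabilities are hypothetical.

```python
# A minimal sketch of hypothesis testing in social perception,
# assuming simple Bayesian updating (all numbers are hypothetical).

def predict(prior, likelihood):
    """Top-down prediction: P(obs) = sum over hypotheses of P(h) * P(obs | h)."""
    observations = next(iter(likelihood.values())).keys()
    return {o: sum(prior[h] * likelihood[h][o] for h in prior) for o in observations}

def update_beliefs(prior, likelihood, observation):
    """Revision after the observation: Bayes' rule over the hypotheses."""
    unnormalised = {h: prior[h] * likelihood[h][observation] for h in prior}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}

# Prior assumption about the person: he is probably dieting.
prior = {"dieting": 0.8, "not_dieting": 0.2}

# Assumed link between each hypothesis and the behaviour it predicts.
likelihood = {
    "dieting": {"picks_cake": 0.1, "picks_fruit": 0.9},
    "not_dieting": {"picks_cake": 0.6, "picks_fruit": 0.4},
}

predicted = predict(prior, likelihood)            # expected behaviour
mismatch = 1.0 - predicted["picks_cake"]          # crude stand-in for a prediction error
posterior = update_beliefs(prior, likelihood, "picks_cake")

print(predicted)
print(posterior)
```

With these assumed numbers, picking the cake was predicted with low probability, and observing it shifts belief away from "dieting". A full predictive coding account would repeat this step at every level of the hierarchy rather than once.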
Here, we provide an overview of how such predictive processing frameworks can be applied to social perception.
We will focus on two research topics that form the opposite ends of a scale from lower level inferences (goals of single behaviours) to higher level inferences (others' overarching traits and dispositions) that people make about others, but are typically investigated by different fields: experimental psychology and (social) neuroscience on the one hand and social psychology on the other. We start at the lower level of the hierarchy and review recent evidence that the observation of simple actions (e.g., reaching for an object, using a tool, etc.) is predictive, and that these predictions inform even the actions' lower level perceptual representations (Figure 1, red). We will then discuss key models of how predictions are derived from both knowledge about the other person and the current situational constraints (Figure 1, green and blue). Finally, we will argue that such models interface well with the ample research base from social psychology on person knowledge and attribution. Although the behaviours studied in these fields are typically abstracted away from actual movement patterns (e.g., described as vignettes) and linked to more overarching behaviour tendencies (e.g., being helpful) rather than single actions, the underlying models are similarly couched in notions of prediction and expectation. They describe the person-models (Figure 1, blue), from which action predictions are ultimately assumed to be derived, but which are rarely addressed by research on action observation.
We will sketch how models of predictive social perception can integrate both research fields and describe not only how predictions shape everyday social perception but also how our knowledge of other people is constantly updated by new evidence and remains aligned with social reality, providing a framework that spans different levels of description and methodological approaches.

| EVIDENCE FOR PREDICTIONS WHEN OBSERVING OTHERS' ACTIONS
Some of the earliest evidence that action observation is predictive comes from recordings of eye movements. People make very similar eye movements when they perform actions themselves (e.g., stacking blocks) and when they watch others perform them (Flanagan & Johansson, 2003). Importantly, in both cases, gaze is predictive: It often arrives at goal objects moments before they are acted upon, suggesting that knowledge of what the actor wants to achieve drives predictions about what they will do next. Since then, several studies have confirmed the anticipatory nature of social gaze, driven by all kinds of cues that signal others' intentions, such as their gaze (Teufel et al., 2010), action kinematics (Ambrosini, Costantini, & Sinigaglia, 2011; Ambrosini et al., 2013), or the actor's explicit verbal statements (Eshuis, Coventry, & Vulchanova, 2009).
According to predictive processing models, these predictions happen because higher level expectations about others (e.g., their goals) trickle down into lower level sensory (or motor) structures to form a mental image of the associated behaviours (e.g., Panichello, Cheung, & Bar, 2013), which can then drive action (e.g., the eye movements described above) or, especially when the input is ambiguous, stand in for real perception (see Roelfsema & de Lange, 2016, for examples in nonsocial perception). Indeed, such perceptual effects have long been known from the study of apparent motion (Wertheimer, 1912). When two images of the start and end state of a movement rapidly alternate, observers often report "seeing" the intermediate state, an effect often exploited in animation and computer games. Importantly, and in line with the idea of top-down guidance, this filling in is not a mere interpolation of missing stimulus stages but fundamentally knowledge driven, for example, following realistic trajectories of human motion and avoiding intervening objects (Shiffrar & Freyd, 1990, 1993).
Since then, several similar perceptual biases have been reported, showing that people mentally extrapolate even complex actions that become occluded (Graf et al., 2007; Prinz & Rapinett, 2008; Springer, Brandstädter, & Prinz, 2013). Especially when such an occlusion happens unpredictably, people often report seeing the action continue further than it really did, again suggesting spontaneous filling in by hypothesis-driven perception (i.e., representational momentum, Freyd & Finke, 1984; Hubbard, 2005; Thornton & Hayes, 2004; Wilson, Lancaster, & Emmorey, 2010; flash-lag effect, Kessler, Gordon, Cessford, & Lages, 2010). Indeed, in line with the idea that such prediction effects "trickle down" from higher level information, we have recently shown for the first time that these perceptual biases dynamically incorporate inferences about others' goals. Participants watched hands reach for or withdraw from objects and judged the location of their sudden disappearance points. Even when the actions they saw were identical, participants perceived the hand's disappearance points closer to objects if they assumed a goal to pick them up, and further away when assuming a withdrawal (Hudson, Nicholson, Ellis, & Bach, 2016a; Hudson, Nicholson, Simpson, Ellis, & Bach, 2016b; for a similar effect in eye gaze perception, see Hudson & Jellema, 2011; Hudson, Liu, & Jellema, 2009).

FIGURE 1 Hierarchy for social prediction. Person knowledge interacts with situational constraints to predict forthcoming behaviour, which directly affects perceptual processing of the action. On each level, prediction errors are fed back upwards and cause potential revisions at the stage above. Is he really reaching for that glass? → Is it really filled with wine? → Has he started drinking again?

Such predictive biases are not only found for an action's outward appearance. It has long been known that observing others being touched sometimes induces sensations of touch on one's own body (Blakemore, Bristow, Bird, Frith, & Ward, 2005) and activates brain regions similar to those engaged when being touched (e.g., Keysers et al., 2004; Bufalari, Aprile, Avenanti, Di Russo, & Aglioti, 2007; but see Chan & Baker, 2015; for similar evidence for pain, Lloyd, Di Pellegrino, & Roberts, 2004; Singer et al., 2004; for recent discussion, see Iannetti, Salomons, Moayedi, Mouraux, & Davis, 2013; De Vignemont & Jacob, 2012; Michael & Fardo, 2014). Again, this feeling of others' experiences does not only reflect what is available in the stimulus but is filled in by our prior knowledge about what would happen if we carried out these actions (see also Pfister, Pfeuffer, & Kunde, 2014). For example, in a recent study, we showed that viewing hands grasp painful objects activates somatosensory regions of the brain and leads to reports of illusory sensations on participants' own fingers. Importantly, this happened even when no skin damage was ever shown, suggesting a predictive filling in through participants' own experience with these objects (Bach, Fenton-Adams, & Tipper, 2014; Morrison, Adams, Tipper, & Bach, 2013).

| MECHANISMS FOR SOCIAL PREDICTION
Together, the above findings suggest that even when observing simple, single actions, the brain is very much proactive, predicting what others will do (or experience) next, and that these predictions interact dynamically with incoming perceptual information. But how do observers work out which action an internal state will manifest as in a given situation? Not all goals can be realised in all situations, and each situation differs in how (with what actions) it affords goal achievement. People can only satisfy their hunger, for example, when food is available, and they can only be altruistic when there is the option of, perhaps, helping a homeless person or contributing to a charity (e.g., Bach, Nicholson, et al., 2014; Ham & Vonk, 2003; Lupfer et al., 1990; Todd et al., 2011).
People are very much aware of this situation-dependency when observing others' behaviour. As early as 1944, Heider and Simmel showed that people attribute motives to even abstract shapes if these shapes interact meaningfully with the environment (see also Heider, 1958). Similarly, it has long been recognised that what triggers the attribution of traits to others (e.g., stinginess) is, on the most basic level, observing behaviour that does not correspond to what is expected from the situation (e.g., not giving a tip in a restaurant; Gilbert & Malone, 1995).
And even though people differ in the extent to which a behaviour is attributed to external or internal causes, there is no question that both are always considered (e.g., Ham & Vonk, 2003; Todd et al., 2011). This knowledge about the interplay between situational and person-internal factors can already be tracked in children (Csibra & Gergely, 2007; Gergely & Csibra, 2003). For example, like adults, young children attribute goals to even abstract objects when they see them repeatedly circumvent an obstacle, and they show surprise if the objects take the same path once the obstacle has been removed (Csibra, 2008). Similarly, they only imitate an action faithfully if it seems crucially required for action success; if it is not, they try to achieve the same goal with an action that better fits the situational constraints (Csibra, 2008; Gergely, Bekkering, & Király, 2002).
There are currently two (not mutually exclusive) proposals that explain these observations. The first is that predictions emerge from conceptual, rule-like knowledge, derived by domain-general statistical learning mechanisms. From a lifetime of watching others, people may be able to form conceptual models that describe how internal states relate to overt behaviour, and then use this knowledge to infer which internal state a behaviour communicates, or, conversely, to predict to which action an internal state might lead (Gopnik & Wellman, 2012; Perner, 1991; Perner & Ruffman, 2005; Ruffman, Taumoepeau, & Perkins, 2012; Saxe, 2005; Wellman, 1990). Indeed, children's social behaviour seems to be well described by such domain-general learning mechanisms. For example, when watching others operate objects, children work out, using complex Bayesian reasoning, which action steps are statistically associated with success and only imitate these causal steps (e.g., Buchsbaum, Gopnik, Griffiths, & Shafto, 2011; but see Over & Carpenter, 2013, for evidence of over-imitation as well, especially when social relations rather than learning are important). Even their attribution of stable attitudes to others can be accounted for by such statistical reasoning processes. For example, children assume that someone "likes" a toy if they consistently choose it over other objects (and later offer this object to the person). Importantly, this happens only as long as the object itself does not appear to be simply better (i.e., other children choose it too; Ma & Xu, 2011), and as long as choice frequency exceeds background likelihoods (i.e., picking one of the few frogs out of a box of mostly ducks; Kushnir, Xu, & Wellman, 2010), suggesting a sophisticated integration of multiple sources of information.
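The background-likelihood logic in the toy example can be made concrete with a small worked sketch. It compares two hypotheses about an actor who repeatedly picks the same toy type: random sampling from the box versus a preference for that toy. The box sizes, the number of picks, the prior, and the assumed choice probability under a preference are all hypothetical values chosen for illustration, not the parameters of the cited studies, and the two-hypothesis comparison is a simplification.

```python
from math import comb

# A minimal sketch, assuming a two-hypothesis comparison
# (random sampling vs. preference); all numbers are hypothetical.

def p_all_target_by_chance(n_target, n_total, n_draws):
    """Probability that n_draws random picks (without replacement) are all target toys."""
    return comb(n_target, n_draws) / comb(n_total, n_draws)

def posterior_preference(n_target, n_total, n_draws,
                         prior_pref=0.5, p_choose_if_pref=0.95):
    """Posterior probability of a preference after n_draws consecutive target picks."""
    like_pref = p_choose_if_pref ** n_draws
    like_chance = p_all_target_by_chance(n_target, n_total, n_draws)
    numerator = prior_pref * like_pref
    return numerator / (numerator + (1 - prior_pref) * like_chance)

# Five frogs in a row from a box in which frogs are the minority (10 of 50) ...
minority = posterior_preference(n_target=10, n_total=50, n_draws=5)
# ... versus from a box in which frogs are the majority (40 of 50).
majority = posterior_preference(n_target=40, n_total=50, n_draws=5)

print(f"frogs rare:   P(preference) = {minority:.4f}")
print(f"frogs common: P(preference) = {majority:.4f}")
```

Under these assumptions, the same five choices are far stronger evidence for a preference when frogs are rare, because random sampling would almost never produce them.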
The alternative view is that action predictions are primarily motoric and derived from mechanisms for planning and control of one's own actions (Bach, Nicholson, et al., 2014; Baker, Saxe, & Tenenbaum, 2009; Csibra, 2008; Kilner et al., 2007a, 2007b). Humans acquire sophisticated knowledge about how to use objects, both about which goals can in principle be achieved with them (a gun is for shooting), and which motor behaviours this requires (pulling the trigger) (for reviews, see Binkofski & Buxbaum, 2013; van Elk, van Schie, & Bekkering, 2013). We and others have argued that this knowledge can play a crucial role in scaffolding prediction processes in social perception. As soon as an observer has an idea about someone's goals, and knows which objects are available, they could apply their own action knowledge to predict what the other person would do next: which objects they should interact with and how they should operate them (for extended argument, see Bach, Nicholson, et al., 2014; Kilner, 2011). And while such ideas were originally formulated for predicting and understanding simple object-directed actions, they can readily be adapted to interactions with other people (e.g., Wolpert, Doya, & Kawato, 2003) or larger scale situations (e.g., that university affords a degree but requires studying, essay writing, etc.).
As noted above, the two approaches (conceptual and motoric) are not mutually exclusive. In fact, different mechanisms might be engaged in different circumstances, depending on the sensory input (e.g., similarity to the other person and familiarity with the action) and the processing goal of the observer (e.g., empathy and joint action). Indeed, various conflicting findings in research on false belief processing (whether children can predict others' actions based on others' beliefs about the situation rather than actual reality) can be explained by such dual processing models, such that people fluently choose between a quick, heuristics-based process that makes "good-enough" assumptions about how other people would act, and a more involved, perhaps simulation-based, process that is deployed whenever a deep understanding of the other person is important (e.g., Apperly & Butterfill, 2009).
What remains to be explained is how such predictions of others' actions, irrespective of how they are derived, could then feed downward and generate perceptual anticipations of how actions will look or continue (Figure 1, red and grey). Neuronal populations in primate superior temporal sulcus are an ideal target of these influences. These neurons code the visual form of goal-directed actions (e.g., Jellema, Baker, Wicker, & Perrett, 2000) and respond, for example, to observed reaches, but only if a target object is actually present. Others code for seeing people walking towards a goal, irrespective of the actor's orientation (walking backwards or forwards), or for others' shifts of attention, irrespective of how they are achieved (i.e., moving the eyes and moving the head). Although the required modulation by top-down conceptual knowledge has yet to be demonstrated, activation in these regions occurs particularly when predictions are relevant, such as when a person disappears behind an occluder (Saxe, Xiao, Kovacs, Perrett, & Kanwisher, 2004), when the fit of action to the environment has to be evaluated (Morrison et al., 2013), or when actions mismatch with goal expectations (Gao, Scholl, & McCarthy, 2012; Pelphrey, Morris, & McCarthy, 2004), suggesting that they do indeed receive this information.

| PERSON MODELS AS A SOURCE AND TARGET OF PREDICTIONS
As reviewed above, there is ample evidence from experimental psychology and cognitive neuroscience that action observation is predictive. Yet research on action observation has surprisingly little to say about the "person models" (Park, 1986) from which these predictions are ultimately assumed to be derived. Humans store a vast amount of behaviour-relevant information about other people, ranging from the fact that our child likes to pick his nose, to the political and musical preferences of our friends, to more abstract traits that predict people's behaviour across situations (Barresi & Moore, 1996; Park, 1986; Park, DeKay, & Kraus, 1994; for neuroimaging evidence, see Greven, Downing, & Ramsey, 2016; Hassabis et al., 2013). This knowledge can be supplemented by knowledge about the individual's group (e.g., stereotyping; boys like football; Quadflieg et al., 2011), their social role (mother or professor; Chen, Banerji, Moons, & Sherman, 2014), or what humans, generally, are like (cf. Quinn & Rosenthal, 2012).
Yet even though person knowledge may play a central role in predicting how others will behave, prior research on action observation has rarely addressed these contributions, focussing instead on the role of overt cues in action prediction (e.g., emotional expressions, Adams, Ambady, Macrae, & Kleck, 2006; Johnston, Miles, & Macrae, 2010; action kinematics, Bach et al., 2011; gaze, Pierno et al., 2006; explicit statements, Hudson, Nicholson, Ellis, et al., 2016a; Hudson, Nicholson, Simpson, et al., 2016b). Although these cues may indeed exert their effects by providing person information, such as implying others' goals or intentions, they could just as well be explained on the level of action alone, where certain cues (e.g., a smile) directly predict associated behaviours (approach), without drawing upon person information at all (e.g., Gergely & Csibra, 2003; Ruffman et al., 2012). The few studies that suggest a recruitment of person models typically tested very general aspects that fall short of explaining flexible prediction of others' behaviour across situations, for example, that observers automatically activate a famous athlete's most used body part when seeing only their face (e.g., Wayne Rooney's foot), or that they implicitly recall (and mirror) others' prior emotional state or gaze direction (Frischen & Tipper, 2006; Halberstadt, Winkielman, Niedenthal, & Dalle, 2009; Todorov, Gobbini, Evans, & Haxby, 2007).
A challenge for research is therefore to test the assumption of predictive coding models that action predictions indeed reflect higher level person knowledge and not just situational constraints and overt cues that signal others' behaviour (Figure 1, blue). Interestingly, there is long-standing evidence from social psychology for just such effects.
Although the actions investigated were more abstract and linked to more distal overarching behaviour tendencies instead of single actions, these studies show that people are highly sensitive to information that is diagnostic about others' personality traits and extract it fluently, and often involuntarily, from even short behaviour descriptions (e.g., Ambady & Rosenthal, 1992; Borkenau, Mauer, Riemann, Spinath, & Angleitner, 2004; Chen et al., 2014; Vonk, 1994; Willis & Todorov, 2006; Winter & Uleman, 1984). Once established, this knowledge biases how subsequent behaviours are processed. Recall that, in predictive coding models, higher level knowledge speeds up processing of expected input. Unexpected input, in contrast, becomes more salient and is processed in more detail, so that it can be integrated with prior knowledge. Very much in line with these assumptions, it is typically found that trait- or stereotype-congruent behaviours are processed (read) more quickly, while incongruent ones recruit additional processing resources and are remembered better (e.g., Cohen, 1981; Hamilton & Sherman, 1996; Heider et al., 2007; Macrae & Bodenhausen, 2000; Quadflieg et al., 2011; Sherman, Stroessner, Conrey, & Azam, 2005; Srull & Wyer, 1989; Stangor & McMillan, 1992).
Although these classic findings follow the assumptions of predictive coding models, the studies mostly used verbal trait and behaviour descriptions, leaving open whether such person predictions only affect the processing of others' behaviour on a conceptual level or whether they indeed propagate further downwards and affect the observation of the associated actions. Two of our recent studies attempted to fill this gap. We showed, first, that people form internal models of how others behave in different circumstances (whether they tend to kick a ball but turn away from a computer, or vice versa) and that these models are reactivated whenever these individuals are seen, speeding up the identification of person-consistent actions (Schenke, Wyer, & Bach, 2016). Second, we showed that such predictions can even affect the involuntary sharing of attention with others (i.e., joint attention).
We showed that people implicitly learn what objects other individuals typically look at and direct their own attention towards these objects, in an anticipatory manner, whenever these individuals are seen again (Joyce, Schenke, Bayliss, & Bach, 2015). Importantly, these gaze predictions were only found when the gazing faces smiled when looking at the objects, suggesting a learning of their attitudes, not just their overt behaviour tendencies.
Other evidence comes from studies that, while not addressing the prediction of behaviour, nevertheless show that high-level stereotype information can affect lower level perception (for a review, see Otten, Seth, & Pinto, 2016). It has been found, for example, that faces with African American features appear darker than luminance-controlled European faces (Levin & Banaji, 2006), and that people more readily detect angry expressions in them (e.g., Otten & Banaji, 2012). Similarly, viewing male and female faces in unexpected contexts (e.g., male as a cleaner) modulates early perceptual components in the EEG (Dickter & Gyurovski, 2012) and engages regions involved in perceptual encoding of faces (e.g., fusiform face area; Quadflieg et al., 2011).
Elucidating how person knowledge is linked to predictive processes is also important for understanding how one's knowledge about other people stays aligned with reality. Predictive coding models assume that person knowledge interacts with current situational constraints to predict forthcoming behaviour (and ultimately perceptual input; solid arrows in Figure 1). At the same time, the mismatch between these predictions and the person's actual behaviour is assumed to be propagated back upwards in the form of prediction errors, allowing the knowledge we have about the other person to be updated and revised, so that it can better account for current or future input (dotted arrows in Figure 1). These revisions have been widely investigated in nonsocial perception. In visual object perception, for example, this process manifests when an ambiguous image flips as the brain tries out an alternative hypothesis (Wang, Arteaga, & He, 2013), and in sentence understanding, it happens when a new word causes sudden revisions of one's prior understanding, as in classical garden path sentences, such as "The horse raced through the barn fell" where only the final word reveals that the horse did not race itself but was raced by someone (e.g., Gunter, Friederici, & Schriefers, 2000).
Yet despite ample research in the nonsocial domain, there is very little research on how action observation changes the knowledge we have of other people (their goals, beliefs, or personal traits), especially when they act in a way we did not expect. Various studies show that observing unexpected actions indeed recruits additional processing resources, as measured by both response time costs (e.g., Bach et al., 2005; van Elk et al., 2009a, 2009b) and increased brain activation (e.g., Vander Wyk, Hudac, Carter, Sobel, & Pelphrey, 2009; Bach, Gunter, Knoblich, Prinz, & Friederici, 2009), which, in some studies, reaches into regions implicated in encoding person information and mentalizing (e.g., Brass, Schmitt, Spengler, & Gergely, 2007; De Lange, Spronk, Willems, Toni, & Bekkering, 2008; Nicholson, Roser, & Bach, 2017). And although these findings could indeed index attempts to revise prior beliefs about others, this hypothesis has not been explicitly tested. The closest are two recent studies tracking perceptual and motor changes when viewing unexpected actions. Yet while both demonstrated changes to lower level action representations, namely how different hand grips (Jacquet et al., 2016) and verbal statements (Hudson, Bach, & Nicholson, under review) predict an action's end state, they did not address whether person knowledge, such as higher level intention attribution, was affected as well.
Importantly, classical research in social psychology again provides valuable insights into how such revisions happen (e.g., Quinn & Rosenthal, 2012; Srull & Wyer, 1989; Stern, Marrs, Millar, & Cole, 1984). These findings show that personality attributions, once made, are surprisingly robust against incongruent information and that hypothesis revisions are rare. For example, after it has been previously established that John is a kind man, participants tend to hold on to this impression even if they now read about him refusing to lend money to his friend. Instead, they explain the conflicting information away: Perhaps John is short of cash himself, or had already lent this friend some money and not received it back. Stereotypes about groups (e.g., based on race, gender, or sexual orientation) are particularly robust against such revisions, with in-group members maintaining negative images of out-groups, even in the light of conflicting evidence (e.g., Pettigrew & Tropp, 2006; Sherman et al., 2005).
Despite appearing counterintuitive initially, such findings are not at odds with predictive coding models. First, these models do not assume that predictions are either/or but capture the strength of the link between superordinate concept (e.g., a personality trait) and subordinate consequence (i.e., a certain behaviour; e.g., Clark, 2013). In the social domain in particular, they would therefore capture the uncertainty in how a trait is expressed, such that a few outliers (e.g., even a very generous person does not give to all beggars) would not shift the prior impression much (see Pettigrew, 1979, for an early account). Second, recall that behaviour predictions do not emerge directly from person information but are always filtered through the current situational constraints. Thus, any prediction error can be resolved not only by revising one's person model but by revising the situation model as well. Indeed, research in social psychology appears to show exactly that. Although unknown people are typically readily attributed new personality traits (i.e., the fundamental attribution error; Jones & Harris, 1967), this is not the case if one already has preexisting knowledge about them. People extract both situation models and person models when reading about others' behaviour (e.g., Ham & Vonk, 2003; Lupfer et al., 1990), and when faced with a behaviour (jumping over a fence) incongruent with a person model (being old), people adjust their situation model first (it could not have been that high a fence). Again, such strategies are often seen in group stereotypes, where highly prejudiced individuals are especially likely to "explain away" an out-group member's unexpected behaviours through external factors (Sherman et al., 2005).
Thus, predictive coding models would suggest that when an impression has been well established through a highly diagnostic behaviour or ample prior evidence, it will only be revised when (a) the incongruent behaviour is equally diagnostic, (b) the associated trait has little variability in whether (and when) it is expressed, and (c) there is little scope for variability in the situation model, through which any unexpected behaviour could be explained away.
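The logic of these three conditions can be illustrated with a minimal Bayesian sketch. It is not a model proposed in the literature reviewed here; it simply assumes one binary person hypothesis (John is kind or not) and one binary situation hypothesis (helping is constrained or not, e.g., John is short of cash), and asks how observing a refusal to lend money shifts belief in each. The function name `revise_models`, the table `P_REFUSE`, and all probability values are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical likelihoods P(refusal | kind, constrained); illustrative values only.
P_REFUSE = {
    (True, True): 0.90,    # kind, but the situation prevents helping
    (True, False): 0.10,   # kind and unconstrained: refusal is an outlier
    (False, True): 0.95,   # unkind and constrained
    (False, False): 0.70,  # unkind and unconstrained
}

def revise_models(prior_kind, prior_constrained, p_refuse=P_REFUSE):
    """Return (P(kind | refusal), P(constrained | refusal)).

    Person and situation hypotheses are assumed independent a priori,
    and the posterior is obtained by enumerating all four combinations.
    """
    joint = {}
    for kind, constrained in product([True, False], repeat=2):
        p_person = prior_kind if kind else 1 - prior_kind
        p_situation = prior_constrained if constrained else 1 - prior_constrained
        joint[(kind, constrained)] = p_person * p_situation * p_refuse[(kind, constrained)]
    evidence = sum(joint.values())
    post_kind = (joint[(True, True)] + joint[(True, False)]) / evidence
    post_constrained = (joint[(True, True)] + joint[(False, True)]) / evidence
    return post_kind, post_constrained

# Strong person prior, uncertain situation: the refusal is mostly "explained away".
print(revise_models(prior_kind=0.9, prior_constrained=0.5))

# Same person prior, but the situation is known to be unconstrained:
# with little scope to revise the situation model, the person model gives way.
print(revise_models(prior_kind=0.9, prior_constrained=0.05))
```

Under these assumed values, the first call leaves the belief that John is kind close to its prior and instead raises the belief that the situation was constraining, whereas the second call lowers the belief in his kindness noticeably. Lowering the entry for `(True, False)`, which corresponds to a trait that is expressed with little variability, strengthens this revision further.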
Beliefs about groups, such as stereotypes about race or gender, have an additional safety net: Even if one is convinced to revise one's model about the single individual one has met, one does not need to revise one's group model. After all, not all members of a particular group might be lazy, just most, and this particular individual might be one of the outliers (Weber & Crocker, 1983; Ashmore & Del Boca, 1981; for a review, see Richards & Hewstone, 2001). When all these factors are considered, it is no surprise that first impressions remain relatively stable, even after repeated exposure to apparently disconfirming information, and that, in addition to the above variables, the prior identification of an individual with a group and the number of exposures to different group members predict whether a stereotype is changed (e.g., Pettigrew & Tropp, 2006). Again, of course, these prior findings come from studies that presented behaviour abstractly, in terms of written behaviour descriptions, and where the actions were linked to more distal goals (e.g., being friendly or ambitious) than in studies on action observation. A challenge (but also the promise) of future research is to integrate recent methodological developments from experimental psychology and social neuroscience with those from social psychology to elucidate not only how prior person and situation models guide the perception of ongoing behaviour but also how observed behaviour feeds back to these models, making sure that one's knowledge about other people remains aligned with reality.

CONCLUSIONS & FUTURE DIRECTIONS
Prior research has approached the question of how people draw inferences from others' behaviour from two opposite directions, and the behaviours investigated, methods, and theoretical models rarely made contact. Thus, whereas action observation research investigated how people attribute goals to actions but not to people, social psychology investigated inferences about other people while ignoring the actions from which they are derived. We have argued here that predictive processing models of social perception are well placed to bridge this gap. Based on ideas originally developed for lower level action observation (Kilner et al., 2007a, b; Csibra, 2008; Bach, Nicholson, et al., 2014), we have sketched a simple model in which higher level person knowledge is constantly integrated with the current environmental constraints to predict others' most likely behaviour and the perceptual input (Figure 1, solid arrows). At the same time, mismatches between prediction and observed action are communicated back upwards, providing an opportunity to revise one's person (or situation) model until it better explains the observed behaviour (Figure 1, dotted arrows).
A goal for future research is to demonstrate that high-level information about other people, as studied by attribution theorists, is indeed translated into lower level predictions of their actions, and that the return flow from action observation indeed revises and updates these assumptions. Such models make several testable predictions, for example, that predictive biases during action observation are directly related to the confidence of one's person and situation model (Hudson et al., under review), and that the revision of these models depends on which one is more certain (e.g., Gill & Andreychik, 2014). In addition, the simple model sketched here needs to be extended to account for our knowledge of how people reason about others. For example, it currently assumes just one step from person models to action predictions. In reality, actions are organised hierarchically, with several goals and subgoals and corresponding situation representations on each level (e.g., Csibra, 2008; Hamilton & Grafton, 2007). This hierarchy ranges from the lower level motor acts studied in action observation research (e.g., reaching for a pen) to the more general behaviour patterns they are part of (being helpful to the person who dropped the pen), as studied by social psychology. An effective model of social perception has to implement these hierarchies and describe how predictions at each level are resolved. Thus, if we find a job seeker idly scrolling through Facebook, when do we resolve this discrepancy locally (he is just procrastinating) and when do we escalate it upwards (he has given up his search)?
Similarly, it is clear that people sometimes predict others' behaviour based not on the actual state of the world but on what the other person believes it to be (e.g., Apperly & Butterfill, 2009; Wimmer & Perner, 1983). When faced with an unexpected behaviour (e.g., a dieter reaching for a sweetened drink), under what circumstances do we change our person model (have they stopped dieting?), the situation model (is it a low-calorie sweetener?), or merely the belief model of the other person (do they believe it is a low-calorie sweetener?)?
The ideas reviewed here are intended as a first sketch of how research on action observation and social psychology can be integrated in a predictive model of social perception. Although still in their infancy, such models may provide a unified framework that can account not only for how one's knowledge guides everyday social perception, but also how, across all levels of the hierarchy, one's social knowledge is constantly updated by one's interaction with the world, from actions and situations, to the individuals that produce these actions, and the groups to which they belong.

ACKNOWLEDGEMENT
This work was supported by the Economic and Social Research Council grant ES/J019178/1 to Patric Bach. We thank Dr Sylvia Terbeck for discussion on some ideas presented in this manuscript.