Neurophysiology of embodied mental rotation: Event‐related potentials in a mental rotation task with human bodies as compared to alphanumeric stimuli

The present study examines if the neural signature of information processing in mental rotation tasks is moderated by stimulus characteristics (e.g., body‐related vs. non‐body‐related stimuli). In the present experiment, stimulus sets of human figures (back view; left vs. right arm abduction) and alphanumeric characters (‘R’; normal vs. mirrored view) were scrutinized with event‐related potentials (ERPs) in the electroencephalography (EEG). Participants had to judge parity between an upright (0° orientation) and a comparison stimulus (stimulus disparity; 0°, 45°, 90°, 135° or 180°). There was a main effect of stimulus disparity for the behavioural (response time and error rates), as well as for the neural data (rotation‐related negativity, RRN). The interaction of stimulus disparity and stimulus type was significant for the RRN, but not for the response time. Lower RRN amplitudes for letters indicate a more pronounced use of alternative processes (e.g., memory retrieval), which seems to be reflected in higher N350 amplitudes. Moreover, the increase of the RRN amplitude and the increase in response time as a function of disparity were positively correlated. Task differences were evident for several ERP components (i.e., N150, P150 and N250), being independent of disparity, which might reflect differences in early and late object cognition prior to the mental rotation process itself. This might be associated with the task‐dependent activation of embodied cognition processes in mental rotation tasks.


| INTRODUCTION
Since Shepard and Metzler (1971) found that response times (RTs) increase linearly with disparity between two stimuli in a parity judgement task using cube figures, mental rotation has been studied extensively with different paradigms and populations (e.g., Tomasino & Gremese, 2016; Voyer et al., 1995). Besides this chronometric approach (Shepard & Metzler, 1971), there are also psychometric paper-and-pencil tests (Vandenberg & Kuse, 1978). According to a traditional model of mirror versus non-mirror judgement tasks (for letters) requiring mental rotation, there are six successive discrete cognitive processes for the chronometric testing of parity judgements, namely, (1) response preparation (for a same condition), (2) determination of the character identity and its stimulus disparity, (3) mental rotation, (4) parity judgement, (5) response selection and (6) response execution (Shepard & Cooper, 1982). However, the discrete and sequential manner of the proposed mental rotation processes has been criticized (Eckert et al., 2008; Heil et al., 1998; Schendan & Lucia, 2009). Moreover, different mechanisms that facilitate the processing of mental rotation tasks are discussed (e.g., memory retrieval: Provost et al., 2013; embodied cognition: Amorim et al., 2006). To address these open research questions, behavioural and neurophysiological research methods are often combined.
With regard to behavioural paradigms, studies on mental rotation have consistently demonstrated higher cognitive costs (i.e., increases in reaction times and response errors) whenever stimuli have to be mentally transformed from one orientation into another, both for the parity judgement of objects (e.g., cube figures [Metzler & Shepard, 1974; Shepard & Metzler, 1971] and letters [Jordan et al., 2002; Weiss et al., 2009]) and for the laterality judgement of human body parts (Bläsing et al., 2013; Cooper & Shepard, 1975; Parsons, 1994) and whole human bodies (Amorim et al., 2006; Budde et al., 2020; Heppe et al., 2016; Steggemann et al., 2011).

| Behavioural effects in mental rotation of body-related versus non-body-related stimuli
The comparison of mental rotation of whole human bodies and objects revealed two notable phenomena: (A) rotation-dependent and (B) rotation-independent effects. Amorim et al. (2006) and Jansen et al. (2012) compared pictures of human bodies with cube figures of the same complexity regarding the spatial configuration (three segments) in a parity-judgement task (i.e., object-based rotation) and found lower RT slopes for body-related stimuli. Steggemann et al. (2011, Experiment 1) compared pictures of human bodies (female body in front/back view with left/right arm abduction) with letter stimuli (capital letter 'R' and vertically mirrored version) in a parity judgement task (i.e., object-based rotation) and found a stimulus type effect for RTs but, contrary to Amorim et al. (2006), no interaction of stimulus type and disparity. The diverging findings of these studies, which show (Amorim et al., 2006) or do not show (Steggemann et al., 2011) rotation-dependent differences between the stimulus types, might be explained by differences in the stimulus sets (e.g., cube figures vs. letters). Voyer et al. (2017) also compared mental rotation effects with non-body-related stimuli (i.e., letters; Experiment 1) and line drawings of hands (Experiment 2). They found that behavioural results (i.e., RT functions for the different disparities) are moderated by the task (object-based: same-different matching vs. egocentric: laterality judgement), with a more linear function for object-centred mental rotation (i.e., objects are mentally rotated in the picture plane) and a more curvilinear function (increase of RTs for disparity angles above 45°) for egocentric mental rotation (i.e., participants change their own body-related perspective). For this reason, comparisons of different stimulus types should be done within the same mental rotation paradigm (i.e., either egocentric or object-based rotation).
Studies where the stimulus type and transformation type were confounded (Jansen & Kaltner, 2014;Jola & Mast, 2005) are not discussed in detail here.

| An embodiment perspective on stimulus-dependent effects in mental rotation
Theories of embodied cognition postulate a deeply rooted and omnipresent sensorimotor activation in cognitive processing, which is reasonable under the assumption that our brain is first and foremost a control system for a biological body (e.g., Clark, 1998; Wilson, 2002). Accordingly, Amorim et al. (2006) assumed that mental rotation seems to involve embodied cognition or at least can be facilitated if embodied cognition becomes operative, which should be more pronounced when body-related stimuli are processed. This would explain disparity-dependent effects of the stimulus type when comparing body-related versus non-body-related stimuli. Amorim et al. (2006) referred to two categories of embodiment: (1) a spatial embodiment, where body-related spatial reference frames are mapped onto the embodied object, and (2) a motoric embodiment, which refers to the assumption from the ideo-motor approach (e.g., Decety et al., 2002), that observing, imaging and executing actions involve common motor processes. Both spatial and motoric embodiment may facilitate performance in mental rotation tasks. Spatial embodiment should facilitate both egocentric and object-based mental rotations, while motoric embodiment should most of all facilitate egocentric mental rotation. The assumption of the involvement of motor processes as body-related processes in mental rotation tasks is also supported by research revealing substantial interference of mental rotation performance during the planning (Olivier & De Mendoza, 2002; Wohlschläger, 2001) and execution (Wexler et al., 1998; Wohlschläger & Wohlschläger, 1998) of motor acts in behavioural studies.
Embodied cognition is also discussed to facilitate rotation-independent processing in mental rotation tasks. In this perspective, the main effect for stimulus type as found by Steggemann et al. (2011) can be interpreted from a grounded cognition perspective, while embodiment can be seen as an ascertainment of certain detailed aspects of grounded cognition. Grounded approaches in cognitive science postulate that cognition and cognitive concepts, as organized knowledge about the world in the forms of real (e.g., a human being) and abstract concepts (e.g., the letter 'R'), are grounded in (a) modality-specific systems, (b) body and action, (c) the physical environment and (d) the social environment (Kiefer & Barsalou, 2013). This contrasts with classic cognitive scientific approaches, which hypothesize a transduction of modal experiences into amodal representations (Collins & Loftus, 1975;Fodor, 1975;Pylyshyn, 1973;Tulving, 1972). One central aspect of grounded cognition approaches is that the retrieval of a concept induces specific cortical sensory-motor activation during (conscious or unconscious) mental simulations that can be traced back to real concept-related experiences (Barsalou et al., 2003;Kiefer & Barsalou, 2013), which can be seen as a description of an embodiment effect. These cerebral activations arise from different brain areas, which represent context relevant properties of a concept (e.g., if one sees or thinks of a ball, motor programmes for catching and throwing a ball arise from sensorimotor-related brain areas) and contribute to goal-directed behaviour.
Related to the mental rotation process, it is argued that different strategies can become operative that include analytic or piecemeal and, which is more relevant in the context of grounded cognition, body-related strategies (i.e., mutuality [Wohlschläger & Wohlschläger, 1998; Parsons, 1994] and interference [Wohlschläger & Wohlschläger, 1998] of manual and mental rotations, indicating a common higher order cognitive process or the simulation to rotate one's own body parts in space [Cooper & Shepard, 1975; Kosslyn et al., 1998]). Although grounded views on early object cognition (e.g., object perception and recognition) do not rule out the possibility of later effects (e.g., embodied cognition effects on the mental rotation process itself), the assumption that cognition is 'grounded' in these four ways (see Kiefer & Barsalou, 2013, in the previous paragraph) leads to the hypothesis that stimulus-dependent differences should already occur in early cognitive processing stages. This is compatible with the multiple-state interactive (MUSI) account of object cognition (Schendan, 2019), which postulates a first state of object cognition with initial feedforward bottom-up processes (before 200 ms), which contribute to object and feature categorization. Differences in these early processes might not affect the mental rotation process itself, but prior processes of object cognition (e.g., identification of identity and orientation of single objects), which could explain more general stimulus-dependent differences in reaction times, regardless of stimulus orientation (Steggemann et al., 2011).

1.3 | Event-related potentials in the EEG evoked by the mental rotation of body-related versus non-body-related stimuli
Next, data on event-related potentials (ERPs) in the electroencephalography (EEG) in the context of mental rotation tasks are reported, differentiated according to early and late phases of object cognition before the subsequent mental rotation process itself.
Due to the high temporal resolution of the EEG signal, it is suitable for the detection of temporally structured cognitive processes as they are defined in the context of mental rotation tasks (Shepard & Cooper, 1982). In this regard, we will refer to the three phases that can be related to the core processes of mental rotation that demand perception and visual-spatial processing (Shepard & Cooper, 1982): early and late object cognition (Phase 1 and Phase 2), which relate to determination of stimulus identity and stimulus disparity, as well as the mental rotation itself (Phase 3).

| ERPs of Phase 1: Early object cognition
The occipito-parietal P100 and the centro-frontal P150 (and its polarity-inverted occipitotemporal counterpart N170) are early object-sensitive potentials, which reflect fast bottom-up cognitive processes in the first state of object cognition (according to the Two-State Interactive Theory of Object Cognition [Schendan & Lucia, 2010; Schendan & Maher, 2009] and the MUSI [Schendan, 2019]) that can support feature detection, structural encoding and perceptual categorization of known image classes, like faces or letter strings. These components might represent concept-specific grounded cognition effects. The P100 and the P150 seem to be object sensitive (e.g., showing larger amplitudes for intact than for scrambled objects) and category specific (e.g., showing larger amplitudes for faces than for buildings). Also, the P100 and P150 are ascribed to figure-ground segregation processes (higher amplitudes for segregation vs. no segregation) (Schendan & Lucia, 2010).
More specifically, from an embodied cognition perspective, Goslin et al. (2012) suggested that visual properties of objects automatically potentiate motor actions linked to them and found that early components (posterior visual P1 [P100] and N1 [N150]) were modulated by motor affordances of visual stimuli. The authors interpreted this as an embodiment effect and underlined the fact that these modulations occur prior to object categorization. These early components have not been scrutinized substantially in the context of mental rotation comparing body-related and non-body-related stimuli.

| ERPs of Phase 2: Later object cognition
Neurophysiological mental rotation research mainly focused on the rotation-related negativity (RRN; see next paragraph), but there is also an earlier negative disparity-dependent component (N350) at fronto-central electrodes around 400 to 500 ms after stimulus onset that is associated with later visual object cognition processes (Schendan & Lucia, 2009). The N350 seems to be associated with cognitive processes, like detecting the spatial relations and stimulus orientation by selecting the best match in memory for the visual structure of the perceived object in advance of the mental rotation process itself. This ERP component is less negative for more successfully categorized pictures of known objects (Gratton et al., 2009; Schendan & Kutas, 2002). Hence, this component seems to be linked to memory retrieval processes and could be utilized as an indicator for respective processes. It also might be disparity dependent (i.e., higher negativity with increasing disparity), as spatial transformation computing might also be reflected here (Schendan & Lucia, 2009).

| ERPs of Phase 3: The mental rotation process itself
The RRN is an ERP component that is regarded as a neural correlate of the mental rotation process itself (Heil, 2002). The RRN is a negative-going, slow-wave potential at parietal electrode sites in the time window of 300-600 ms after stimulus onset (Heil, 2002; Peronnet & Farah, 1989), thus overlapping with the P300 (P3b), a component that is fully visible in baseline conditions without any need for mental rotation (e.g., parity judgement of two stimuli without orientation disparity). With increasing orientation disparity, the positive P300 amplitude becomes more negative due to the overlaid RRN. The RRN (as the net effect) should be operationalized as the negativity relative to a baseline condition (Heil, 2002).
In a series of experiments, Heil (2002) provided evidence for the functional significance of the RRN as a neural correlate of mental rotation of alphanumeric characters, in addition to its dependence on stimulus disparity. According to Heil (2002), the RRN reflects the process of mental rotation, because (a) the classification of misoriented characters (not relying on mental rotation) does not provoke an amplitude modulation; (b) the RRN is evident irrespective of the need for a manual response; (c) the RRN is still present when orientation is known before the task; and (d) the factors that should lead to a postponement of the mental rotation process also postpone the RRN. Moreover, Provost et al. (2013) revealed that extensive practice of a parity judgement task, using a small stimulus set (of Shepard-Metzler-like cube figures), led to a vanishing of the RRN, while it was still evident after extensive practice with a larger stimulus set. The authors postulated that well-practiced stimulus pairs do not provoke mental rotation processes, as the correct response can be directly selected via memory retrieval.
These results underline the functional significance of the RRN and its property to determine the presence or absence of the cognitive processes underlying mental rotation. In this perspective, scalp lateralization of the RRN can also reveal qualitative differences in the applied mental rotation processes, which have not been compared directly for body-related versus non-body-related stimuli. A left lateralization, as it has been found in children compared to adults (Jansen-Osmann & Heil, 2007), is associated with analytic, piecemeal mental rotation (Corballis, 1997), whereas a right lateralization is associated with holistic processes (Beste et al., 2010), as it has been shown for mental rotations out of picture plane (Núñez-Peña & Aznar-Casanova, 2009). According to the assumption of Amorim et al. (2006), body-related stimuli are more likely to induce holistic processing 'because of the increased cohesiveness of human posture by one's body schema (in terms of body structural description)' (p. 329).
Several studies were able to show that the RRN is predictive for behavioural mental rotation performance (e.g., Beste et al., 2010; Riečanský & Jagla, 2008). According to the neural efficiency hypothesis (Haier et al., 1988), lower RRNs are associated with lower RT slopes (indicating superior mental rotation performance).
The RRN has been substantially scrutinized for non-human stimuli, such as cube figures (Provost et al., 2013) and alphanumeric characters (e.g., Beste et al., 2010; Heil, 2002; Peronnet & Farah, 1989; Riečanský & Jagla, 2008). Only a few studies used human body parts as stimuli in laterality judgement tasks (i.e., hands; Jongsma et al., 2013; ter Horst et al., 2012). For example, ter Horst et al. (2012) presented pictures of the left and right hands and found RRN effects in the ERPs. But for the mental rotation of whole-body figures rotated around the vertical axes (rotation out of the picture plane), Ishikura (2016) did not find an RRN in a laterality judgement task. Jansen et al. (2020) compared ERPs in the mental rotation of cube figures and humanoid whole-body figures with and without comparable spatial configurations as conceptualized by Amorim et al. (2006). They found stimulus-dependent activations in their object-based mental rotation task. In the early interval (200-400 ms), results showed less negative amplitudes for abstract cube figures at central and parietal electrodes. In the late interval (400-600 ms), there was no effect for stimulus type at parietal, but at frontal and central electrodes, where again, human body stimuli showed higher negativity. Unfortunately, when evaluating the data of Jansen et al. (2020), it cannot be deciphered if there was an RRN effect on the ERPs for the two types of stimuli (i.e., cube figures vs. humanoid whole-body figures), because the factor disparity was not integrated in the statistical analysis. Moreover, their two intervals for the component analysis each spread over pairs of positive and negative peaks. It is therefore difficult to interpret distinctive ERPs and associated cognitive processes from the data set. Similarly, Feng et al. (2021) analysed the RRN in parity- and laterality-judgement tasks with whole-body stimuli in a recent study. Unfortunately, they conducted all analyses with a P300-confounded measure, as they calculated the mean RRN over all disparities including the 0° condition. Therefore, from an embodied neurophysiology perspective, examining the RRN component for human stimuli compared to non-human stimuli is warranted, in order to further understand the underlying processes of mental rotation with different types of stimuli.

| Aim of the current study
The current study aims to test for differences in the neural processing (i.e., ERPs in the EEG) of human whole-body stimuli (i.e., a person displayed in different orientations with the left or right arm stretched out) and a set of non-human stimuli (i.e., the letter 'R' being shown as a non-mirror or mirror image in different orientations) in a parity-judgement task. From the behavioural perspective, the current study is a systematic replication of the study by Steggemann et al. (2011; Experiment 1). From the neurophysiological perspective, there are two assumptions that give rise to the hypothesis that the RRN should be moderated by the stimulus type. First, embodied cognition should be more involved in the parity judgement of whole bodies compared to letters. It is an open question whether embodied processes only speed up early cognition (prior to mental rotation), which would be reflected in early components of the ERPs, or also late cognition (mental rotation itself), which should result in a reduced RRN as a later component of the ERP. Second, there is also a rationale to predict a relative reduction of the RRN when alphanumeric stimuli are used. We assume that participants have comprehensive perceptual experience with slightly misoriented letters in a standard (non-mirrored) depiction, leading to multiple canonical views around the upright orientation. This experience should decrease the RRN in the parity judgement of letters, as memory retrieval processes to judge a letter being mirrored or non-mirrored should be applied to a larger extent. In the parity-judgement task with human whole-body stimuli, for which there is no such standard depiction (e.g., raised left or right arm), the RRN should not be comparably affected by memory retrieval processes.
Concerning the multiple-views-plus-transformation accounts of visual object cognition (Tarr & Pinker, 1989), spatial transformation processes (e.g., mental rotation) should be utilized for non-canonical views, in order to spatially align them with stored view representations in memory. According to this, canonical views do not require additional spatial transformations, as the identity (e.g., non-mirrored alphanumeric character) fits the mental representation. We also perceive numerous human bodies with outstretched arms, but there is no canonical version (left vs. right arm). Altogether, there are rationales to assume that the stimulus type moderates the RRN, but they lead to contradicting assumptions about the direction of the effect.
Moreover, it is expected that the RRN amplitude is negatively correlated with mental rotation performance (e.g., Riečanský & Jagla, 2008), which is consistent with the neural efficiency hypothesis (Haier et al., 1988). With regard to the lateralization of the RRN, there are studies that find lateralization effects in the RRN (e.g., Johnson et al., 2002), studies that do not find such effects (e.g., Beste et al., 2010) and studies that suggest time-related and functional differences regarding the involvement of left and right parietal lobe in a normal versus mirror judgement of alphanumeric characters but exclusive involvement of the right parietal lobe in the mental rotation process per se (Harris & Miniussi, 2003;Milivojevic et al., 2009). The assumption of more holistic processing in mental rotation with body-related stimuli (Amorim et al., 2006) predicts a right lateralization (Beste et al., 2010) of the RRN.
Referring to the N350, less negativity for alphanumeric stimuli is expected, as the categorization of canonical non-mirrored letters should successfully work based on memory retrieval, which lowers the negativity of this component. Stimulus-dependent differences should only occur in higher disparity conditions, as there seems to be some orientation independence for the categorization of body-related laterality in lower disparities.
Regarding the early potentials (prior to the N350 and RRN), we test for stimulus-dependent differences that might partly explain disparity independent effects of stimulus type on behavioural variables. Differences in this early state of cognitive processing will be scrutinized with an exploratory procedure and discussed within the framework of grounded cognition or to be more specific, with an embodied cognition perspective. We assume that there are facilitative effects in early object cognition based on embodied cognitive processes.

| Participants
We calculated an a priori sample size for moderate effect sizes for a 2 × 5 interaction (alpha = .05; power [1 − beta] = .90). Accordingly, 32 university students (16 females, mean age = 22.0; SD = 2.18) voluntarily participated in this study. All of them were right handed and had normal or corrected-to-normal vision. None of them had participated in mental rotation experiments during the last 12 months. All participants gave their written informed consent before being tested, and no money was paid. The study was approved by the local ethics committee.

| Apparatus
Participants were seated 80 cm in front of a screen (LG Flatron W2442PE, 60 Hz, Seoul, South Korea), in a room with the lights dimmed. The experiment was executed with the software Presentation (Neurobehavioral Systems Inc., Albany, California, USA). The left (right index finger) and the right (right middle finger) mouse buttons served as response devices. To start a block or an acquisition, participants had to press the ENTER button on the keyboard. Handedness was determined by the Edinburgh Handedness Inventory (Oldfield, 1971).
For electroencephalogram (EEG) and electrooculogram (EOG) recordings, a 16-channel EEG with Ag/AgCl active scalp electrodes and an alternating current/direct current (AC/DC) amplifier was used (V-Amp, Brain Products, Munich, Germany). Electrode caps (actiCap, Brain Products, Munich, Germany) and an impedance below 20 kΩ ensured a standardized electrode positioning, according to the 10-20 system (Jasper, 1985) and signal quality.

| Stimulus material
The stimulus sets used in this study were identical to those used by Steggemann et al. (2011), with only minor modifications: only the pictures of a female body from back view, with either the left or the right arm stretched out, and the alphanumeric capital character 'R' (Arial font) in normal view or mirrored (Figure 1), with the bulge and the arm of the 'R' pointed either to the right or to the left, were used. The size of both stimuli was adapted to the bigger screen size (24″) and greater distance (80 cm), so that they could be presented with a visual angle of 8°. Furthermore, the proportions of the capital character 'R' were changed. That is, the maximum extension of the 'R' to the maximum extension of the corresponding female body, measured from the body midline, was adapted, for the two types of stimuli to be better comparable.
A stimulus pair consisted of two stimuli of the same type (i.e., human body or letter). The two stimuli of a pair were always presented simultaneously, one above the other, in the screen's centre and were either identical or mirror images of each other. The upper stimulus was always shown in an upright orientation (0°), whereas the lower stimulus had the same orientation or was randomly rotated (clockwise) in the picture plane, in a 45°, 90°, 135°, 180°, 225°, 270° or 315° orientation, resulting in one of five angular disparities: 0°, 45°, 90°, 135° or 180°.
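The mapping from the eight clockwise orientations to the five angular disparities can be sketched as follows. This is a minimal illustration, not the authors' stimulus code; the function name `angular_disparity` is hypothetical.

```python
def angular_disparity(orientation_deg: int) -> int:
    """Smallest picture-plane angle (degrees) between an upright stimulus
    and a stimulus rotated clockwise by `orientation_deg`."""
    orientation = orientation_deg % 360
    return min(orientation, 360 - orientation)


# The eight orientations of the lower stimulus described in the text.
orientations = [0, 45, 90, 135, 180, 225, 270, 315]
disparities = {o: angular_disparity(o) for o in orientations}
print(disparities)
```

Clockwise orientations beyond 180° fold back onto the same disparity as their counterclockwise counterparts, which is why 225°, 270° and 315° share disparities with 135°, 90° and 45°.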

| Procedure and task
Participants were instructed to judge the parity of two stimuli by button press (left mouse button for same vs. right mouse button for different). With reference to Steggemann et al. (2011), the two tasks were differentiated between a mental body rotation task (MBRT, i.e., to mentally rotate the human body) and a mental object rotation task (MORT, i.e., to mentally rotate the capital character 'R'). The experiment consisted of one session split into two blocks with a total of 320 testing trials. During a short familiarization phase, each possible stimulus pair of the respective task (MBRT or MORT) was performed once, resulting in 32 familiarization trials before each of the two test blocks. During each of the test blocks, each possible stimulus pair was shown five times, resulting in 160 trials per block. This results in 20 trials each for the 0° and 180° disparities, as well as 40 trials each for the 45° (45° and 315° orientation), 90° (90° and 270° orientation) and 135° (135° and 225° orientation) disparities. There are only half as many trials for the 0° and 180° disparities, as we wanted to keep the number of presentations constant for single stimulus pairs. Note that the 0° and 180° orientations have no counterclockwise counterparts. Presenting them twice as often would have facilitated memory retrieval processes in later trials (cf. Provost et al., 2013). To avoid sequence effects, the order of blocks (MBRT vs. MORT) was counterbalanced across participants and gender. Each trial began with a black screen (250 ms), whereupon a fixation cross was displayed for 500 ms, and another black screen was shown for 500 ms, before the stimulus pair was presented (until a response was given). Participants received feedback 17 ms after the response (i.e., one frame later; refresh rate = 60 Hz). The German words 'Fehler' (for a wrong response) or 'Richtig' (for a correct response) appeared as feedback and stayed on the screen for 2000 ms. Following the feedback, the next trial began.
To minimize EEG artefacts, participants were instructed to blink in the time window between the feedback and the appearance of the new stimulus pair. The beginning of the next block was self-paced.
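The trial counts given in the procedure can be checked with a short enumeration. The sketch below assumes that the 32 possible stimulus pairs per task arise from 8 orientations × 2 versions of the upper stimulus × 2 match conditions; this factorization is an assumption made for illustration, as the text only states the total of 32.

```python
from collections import Counter
from itertools import product

ORIENTATIONS = [0, 45, 90, 135, 180, 225, 270, 315]  # lower stimulus, clockwise
VERSIONS = ["version_a", "version_b"]  # assumed: e.g., left/right arm, normal/mirrored 'R'
MATCH = ["same", "different"]
REPETITIONS = 5  # each pair is shown five times per test block

pairs = list(product(ORIENTATIONS, VERSIONS, MATCH))
trials_per_block = len(pairs) * REPETITIONS

# Count test trials per angular disparity within one block.
trials_per_disparity = Counter()
for orientation, _version, _match in pairs:
    disparity = min(orientation, 360 - orientation)
    trials_per_disparity[disparity] += REPETITIONS

print(len(pairs), trials_per_block, dict(trials_per_disparity))
```

Under this assumption, the enumeration yields 32 pairs and 160 trials per block, with half as many trials for the 0° and 180° disparities as for the intermediate ones.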
2.5 | Data processing and analysis

2.5.1 | Behavioural data
Two 5 (Disparity: 0°, 45°, 90°, 135° and 180°) × 2 (Task: MBRT vs. MORT) × 2 (Match: same vs. different) analyses of variance (ANOVAs) for RT and response error were calculated with SPSS Statistics (IBM). Dependent-samples t tests were calculated for follow-up analyses of single comparisons, using adjusted p values (p_adj) according to the Bonferroni-Holm procedure to correct for multiple hypothesis testing. Partial eta squared (ηp²; for ANOVAs) and Cohen's d (for t tests) were calculated as effect sizes. Due to false (n = 475; 4.63%) or late correct responses (>2500 ms; n = 56; 0.57%), the responses of 526 trials (5.19%) were not included in the RT analyses and the EEG data processing.
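For readers unfamiliar with the Bonferroni-Holm correction, the following is a minimal sketch of how adjusted p values can be obtained. It is not the SPSS routine used in the study; the function name `holm_adjust` and the example p values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Bonferroni-Holm adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up p values for three follow-up comparisons (illustration only).
example = holm_adjust([0.01, 0.04, 0.03])
print(example)
```

The step-down multiplication makes the procedure less conservative than a plain Bonferroni correction while still controlling the family-wise error rate.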

| EEG data
Brain Vision software was used for EEG data recording (Brain Vision Recorder, Brain Products) at 500 Hz, data processing and data analyses (Brain Vision Analyzer 2.0, Brain Products). EEG raw data (from the following electrodes: F3, Fz, F4, C3, Cz, C4, P3, Pz and P4) were re-referenced offline to the signal of the two mastoid electrodes, without including the implicit reference (FCz), and filtered with a zero-phase shift Butterworth filter (low cutoff = 0.3 Hz; high cutoff = 20 Hz). An independent component analysis (ICA), using a slope algorithm, detected blinks and eliminated blink artefacts in the EEG. The segmentation (total length: 3500 ms) included a pre-stimulus interval of 1000 ms and a post-stimulus onset interval of 2500 ms. A baseline correction, corresponding to the interval from −600 to −100 ms relative to stimulus presentation, was applied to the EEG signal. Afterwards, two raters performed a manual artefact detection independently for all electrodes, and only those trials that were classified as artefacts (i.e., drifts and muscular artefacts) by both raters were rejected (n = 107; 1.04%).

FIGURE 1 On the left side, a stimulus pair for the mental body rotation task (MBRT) is presented as an example for the different conditions, with a stimulus disparity of 135° (rotated 225° clockwise). On the right side, a stimulus pair for the mental object rotation task (MORT) is depicted as an example for the same condition, with a stimulus disparity of 90° (rotated 90° clockwise).

Then, stimulus-locked event-related potential (ERP) data for the MBRT and MORT were calculated for each match and stimulus disparity. Trials with correct and incorrect responses were integrated in the analysis. The stimulus-locked N350 was calculated for the time window from 375 to 475 ms after stimulus onset. The stimulus-locked RRN (RRN_SL) was calculated for an early (375 to 475 ms) and a late time window (475 to 600 ms). Additionally, a response-locked ERP data calculation (segment length: 900 ms; start: 900 ms before response) for each stimulus disparity was performed. The response-locked RRN (RRN_RL) was calculated for the time window from −400 to −100 ms before the response. The electrode sites and time windows are roughly oriented by other authors and findings (e.g., Provost et al., 2013; ter Horst et al., 2012) and specified more precisely by visual inspection of our data (see Figures S1 and S2).
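The baseline correction described for the segmented EEG data can be illustrated with a minimal pure-Python sketch. It assumes one epoch is a list of voltages sampled at 500 Hz that starts 1000 ms before stimulus onset, and that the end of the baseline interval is inclusive; the function names are hypothetical, and this is not the Brain Vision Analyzer implementation.

```python
SAMPLING_RATE_HZ = 500
EPOCH_START_MS = -1000  # the segment starts 1000 ms before stimulus onset


def time_to_index(time_ms):
    """Convert a time relative to stimulus onset (ms) to a sample index."""
    return round((time_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)


def baseline_correct(epoch, start_ms=-600, end_ms=-100):
    """Subtract the mean voltage of the baseline interval from every sample."""
    first = time_to_index(start_ms)
    last = time_to_index(end_ms)  # assumed to be inclusive
    window = epoch[first:last + 1]
    baseline = sum(window) / len(window)
    return [value - baseline for value in epoch]


# Toy epoch of 3500 ms (1750 samples): 2 µV before onset, 5 µV afterwards.
toy_epoch = [2.0] * 500 + [5.0] * 1250
corrected = baseline_correct(toy_epoch)
print(corrected[0], corrected[-1])
```

After correction, the pre-stimulus portion of the toy epoch sits at zero, so post-stimulus amplitudes are expressed relative to the baseline interval.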
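The following sketch illustrates how a mean amplitude in a fixed time window can be computed and how a rotation-related negativity can be expressed relative to the 0° disparity condition, as recommended by Heil (2002). The function names and the toy amplitudes are hypothetical and serve only as an illustration; they are not data from this study.

```python
def mean_amplitude(erp, times_ms, start_ms, end_ms):
    """Mean voltage of all samples whose time lies within [start_ms, end_ms]."""
    values = [v for t, v in zip(times_ms, erp) if start_ms <= t <= end_ms]
    return sum(values) / len(values)


def rrn_relative_to_baseline(mean_by_disparity, baseline_disparity=0):
    """Difference of each disparity condition from the baseline condition.

    More negative values indicate a stronger rotation-related negativity.
    """
    base = mean_by_disparity[baseline_disparity]
    return {
        disparity: amplitude - base
        for disparity, amplitude in mean_by_disparity.items()
        if disparity != baseline_disparity
    }


# Toy ERP sampled every 2 ms (500 Hz) from 0 to 600 ms after stimulus onset.
times = list(range(0, 602, 2))
toy_erp = [1.0 if 375 <= t <= 475 else 0.0 for t in times]
early_window_mean = mean_amplitude(toy_erp, times, 375, 475)

# Made-up window means (µV) per disparity, for illustration only.
toy_means = {0: 6.0, 45: 5.0, 180: 2.5}
toy_rrn = rrn_relative_to_baseline(toy_means)
print(early_window_mean, toy_rrn)
```

Subtracting the 0° condition removes the P300 that is common to all disparities, so that the remaining difference reflects the net negativity.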
An automatic peak detection for different ERP components was performed with Brain Vision Analyzer (Brain Products) to investigate grounded cognition effects in early processing stages. Four components were identified by visual inspection of the descriptive data, and an associated time window after stimulus onset was defined for peak detection. For central electrodes, the P150 (latency: M = 198.5 ms, SE = 2.3) and N250 (latency: M = 254.1 ms, SE = 2.5) were analysed. For parietal electrodes, the P100 (latency: M = 100.9 ms, SE = 1.5) and N150 (latency: M = 149.5 ms, SE = 2.6) were analysed. For statistical analyses, mean voltages in the time window from 5 ms before to 5 ms after the peak were computed.
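The peak-based measure (mean voltage from 5 ms before to 5 ms after the detected peak) can be sketched as follows. This is an illustrative reimplementation, not the Brain Vision Analyzer routine. The function name and the search-window arguments `lo` and `hi` are hypothetical, because the search windows themselves are not reported in the text.

```python
import numpy as np


def peak_mean_amplitude(erp, t, lo, hi, polarity, half_width_ms=5.0):
    """Find the peak inside the search window [lo, hi] ms and average around it.

    erp      : 1-D array of voltages for one electrode and condition
    t        : 1-D array of time points in ms (same length as erp)
    polarity : +1 for a positive component (e.g. P100), -1 for a negative one
    Returns (peak latency in ms, mean voltage within +/- half_width_ms).
    """
    window = np.flatnonzero((t >= lo) & (t <= hi))
    if window.size == 0:
        raise ValueError("search window contains no samples")
    if polarity > 0:
        idx = window[np.argmax(erp[window])]
    else:
        idx = window[np.argmin(erp[window])]
    # All samples whose time lies within half_width_ms of the peak latency
    around = np.abs(t - t[idx]) <= half_width_ms
    return float(t[idx]), float(erp[around].mean())
```

At a sampling rate of 500 Hz (one sample every 2 ms), the ±5 ms interval covers the peak sample and two samples on either side.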
Dependent-sample t tests were calculated for follow-up analyses of single comparisons, using adjusted p values (p adj) according to the Bonferroni-Holm procedure to correct for multiple hypothesis testing. Partial eta squared (ηp²; for ANOVAs) and Cohen's d (for t tests) were calculated as effect sizes. In case of a violation of the sphericity assumption, a Greenhouse-Geisser correction was applied. All ANOVAs and follow-up tests were performed with SPSS Statistics (IBM).

Moreover, the interaction of Disparity × Match, F(2.81, 86.99) = 19.77, p = .001, ηp² = .17, can be explained by lower RTs in same versus different stimulus pairs in the 0°, t(31) = 11.28, p adj < .001, d = 1.99, 45°, t(31) = 12.41, p adj < .001, d = 2.19, 90°, t(31) = 3.60, p adj = .004, d = 0.64, and 135° disparity conditions, t(31) = 6.50, p adj < .001, d = 1.159, which are not evident in the 180° disparity condition.
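As an illustration of the two follow-up statistics, the sketch below implements the Bonferroni-Holm step-down adjustment and an effect size for dependent samples. The analyses themselves were run in SPSS, so this is not the original procedure. The text does not state which variant of Cohen's d was used. The code assumes d = t / √n, with n = 32 inferred from the 31 degrees of freedom, and this assumption reproduces the reported values of 1.99, 2.19 and 0.64.

```python
import math


def holm_adjust(p_values):
    """Bonferroni-Holm adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted values must not decrease along the sorted order
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


def cohens_d_dependent(t_value, n):
    """Effect size for a dependent-sample t test, assuming d = t / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return t_value / math.sqrt(n)
```

For example, `cohens_d_dependent(11.28, 32)` gives about 1.99, matching the value reported for the 0° condition. The p values passed to `holm_adjust` would be the uncorrected p values of one family of comparisons.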

| Neurophysiological data
Tables 1 and 2 report all ANOVA results regarding the neurophysiological data. Table 1 depicts the results for the RRN SL and the N350, whereas Table 2 presents all results for the early potentials (i.e., P100, N150, P150 and N250). All follow-up analyses and ANOVAs are reported in the respective sections in detail. Figure 3 shows descriptive ERPs for the MBRT and MORT for each stimulus disparity. Data in Figure 4 are calculated relative to the 0° disparity as a baseline condition.

FIGURE 2 Means and standard errors for response times (a) and error rates (b) in the mental body rotation task (MBRT) and mental object rotation task (MORT) as a function of angular disparity

| Early potentials
Figure 5 shows the mean amplitudes for the stimulus-locked early potentials for each task, whereas Table 1 presents the results from the respective ANOVAs. For the P100 (Pz), the 5 (Disparity: 0°, 45°, 90°, 135° and 180°) × 2 (Task: MBRT vs. MORT) × 2 (Match: same vs. different) ANOVA showed a main effect of Match (Table 2), as the P100 was more positive in different trials than in same trials (see Figure 6). For the N150 (Pz), the respective 5 (Disparity: 0°, 45°, 90°, 135°, 180°) × 2 (Task: MBRT vs. MORT) × 2 (Match: same vs. different) ANOVA (Table 2) showed a main effect of Task, revealing more negative amplitudes in the MBRT, and a main effect of Disparity. Both main effects were qualified by a three-way interaction of Disparity × Task × Match. Follow-up analyses revealed that same and different trials differed in the MORT, as the same trials led to more negative amplitudes in the 0° disparity condition, t(31) = 2.04, p adj < .050, d = −0.36, while in the 180° disparity condition, different trials tended to be associated with more negative N150 amplitudes, t(31) = 2.04, p adj = .050, d = 0.36. Moreover, the disparity effect was evident only for the MBRT, F(1,31) = 3.40, p = .014, ηp² = .10. The 0° disparity condition led to more negative N150 amplitudes than the 90°, t(31) = 3.64, p adj = .001, d = 0.64, and 135° conditions, t(31) = 2.53, p adj = .033, d = 0.44.

For the P150 (Cz), the ANOVA showed a main effect of Task, as the MORT was associated with more positive P150 amplitudes.

| Coherences of RT and RRN
The correlation between the mean RT slope (mean of MBRT slope and MORT slope for RT) and the RRN SL slope is negative (more negativity means lower values) and significant, r = −.413, p = .014 (after elimination of four outliers deviating more than 2 standard deviations from the mean for slope [n = 3] or RRN SL [n = 1]; Figure 7). A steeper disparity-dependent increase in negativity is thus associated with a steeper increase in RTs. There is no correlation between the mean slope for RT and the RRN RL, r = −.203, p = .150 (after elimination of the outliers). Correlations within the single tasks (MBRT and MORT) are all non-significant.
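The slope-based analysis can be sketched as follows. The text does not specify how the slopes were computed. The code assumes an ordinary least-squares line across all five disparities per participant, and it applies the 2-standard-deviation criterion to both variables before computing Pearson's r. All function names and the array layout are hypothetical.

```python
import numpy as np

DISPARITIES = np.array([0.0, 45.0, 90.0, 135.0, 180.0])


def disparity_slopes(values):
    """Least-squares slope per participant.

    values : array of shape (n_participants, 5), one column per disparity
             (e.g. mean RT in ms or mean RRN amplitude in microvolts)
    """
    # np.polyfit fits every column of a 2-D y in a single call;
    # row 0 of the result holds the linear coefficients (slopes)
    return np.polyfit(DISPARITIES, np.asarray(values, dtype=float).T, deg=1)[0]


def within_sd(x, y, n_sd=2.0):
    """Boolean mask of participants within n_sd SDs of the mean on x and on y."""
    def ok(v):
        return np.abs(v - v.mean()) <= n_sd * v.std(ddof=1)
    return ok(np.asarray(x, dtype=float)) & ok(np.asarray(y, dtype=float))


def pearson_r(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])
```

A typical call would compute `rt_slope = disparity_slopes(rt)` and `rrn_slope = disparity_slopes(rrn)`, build `keep = within_sd(rt_slope, rrn_slope)`, and then report `pearson_r(rt_slope[keep], rrn_slope[keep])`. A negative r then means that participants with a steeper RT increase show a steeper increase in negativity.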

| Behavioural data
The behavioural data show the typical mental rotation effects with an increase of the RT and response error, as a function of stimulus disparity (e.g., Tomasino & Gremese, 2016; Voyer et al., 1995). In fact, they closely replicate the previous findings of Steggemann et al. (2011) in a similar setting, with comparable effect sizes for the significant effect of disparity on RTs (current study: ηp² = .82; Steggemann et al., 2011: ηp² = .83) and response error (current study: ηp² = .47; Steggemann et al., 2011: ηp² = .40). Steggemann et al. (2011) also found a non-significant interaction of disparity and task for the RT. Here, we found an interaction of disparity and task only for the different stimulus pairs, as tasks differed only for the lower three disparities (i.e., 0°, 45° and 90°). Steggemann et al. (2011) revealed an interaction of task and disparity for response errors (ηp² = .13) that we could replicate for the same trials in the current data set (ηp² = .17). These (rather small) inconsistencies between the data sets might be explained by differences in speed-accuracy prioritization. Whereas Steggemann et al. (2011) found a higher increase of errors for the MORT, the present study found shorter RTs in different trials in the MBRT in low disparity conditions when compared to the MORT. We argue that this reflects a facilitation effect for the lower disparities that does not seem to be as effective in the higher disparities. From an embodied cognition perspective, it is reasonable to assume that embodiment is facilitated by a low disparity between the observer's spatial reference frame and the spatial orientation of a human stimulus (i.e., the comparison stimulus; Jackson et al., 2006; Krause & Kobow, 2013). It remains to be clarified why this is only effective for different-stimulus pairs in this setting.

FIGURE 7 Correlation of the disparity-dependent slopes of the stimulus-locked rotation-related negativity (RRN SL) and the response times (RTs)

| Neural data
4.2.1 | Phase 1: Early object cognition reflected in the P100, P150, N150 and N250
With respect to the P100, which is associated with early processes of object cognition (i.e., figure-ground segregation; Schendan & Lucia, 2010), a moderate effect between same and different stimulus pairs was revealed, which is hard to interpret. In general, a higher amplitude is evoked when a figure-ground segregation process takes place. The question, however, is why this should be more prominent for different stimulus pairs than for same stimulus pairs. Is this a matter of task difficulty? Goslin et al. (2012) found a modulation of early components (posterior visual P1 [P100] and N1 [N150]) by motor affordances of visual stimuli and interpreted this as an embodiment effect. This cannot explain the same-different effect for the P100, but it may account for the higher N150 in the MBRT. Within the MBRT, the N150 was higher for the 0° disparity than for the 90° and 135° disparities. This may result from the orientation of the comparison stimulus, as it is known that motor-related processes are facilitated when the spatial disparity between human stimuli and the spatial reference frame of the observer is low (e.g., Jackson et al., 2006; Krause & Kobow, 2013). Moreover, there is a disparity-independent task effect for the P150 in the current study (i.e., lower P150 in the MBRT), which indicates more efficient processes of early object cognition (i.e., figure-ground segregation) for human stimuli compared to letters (Schendan & Lucia, 2010).
At present, it seems plausible to assume that the modulations of the P150 and N350 (cf. section 4.2.2) indicate a facilitated processing of human figures before the mental rotation process starts (as indicated by the RRN SL). This also corresponds to the differences in the intercepts of the reaction time data in the present study and in other studies (e.g., Steggemann et al., 2011). The underlying mechanisms remain unclear, although theoretical accounts (grounded cognition; Kiefer & Barsalou, 2013) lead to the assumption that human figures are processed faster due to a greater embodiment and may cause some kind of motor pre-activation, which later leads to a faster response execution. Such speculations may be the subject of future studies. In addition to the stimulus-dependent effect for the P150, a stimulus-dependent central negativity was observed around 250 ms (N250). This N250 component seems to be modulated by the familiarity of body-related stimuli (e.g., own vs. other faces, dogs and cars: Pierce et al., 2011; Tanaka et al., 2006). From this perspective, more negative N250 amplitudes in the MBRT might reflect a more distinct effect of perceived ownership with the body stimuli. An embodied (or grounded) cognition approach offers an additional explanation, according to which a motor strategy (e.g., simulating the body position) might help to encode the human figure stimuli.
All or at least some of the neurophysiological findings on the earlier components (i.e., N150, P150, N250 and N350), prior to the mental rotation process itself (which is reflected in the RRN SL), may contribute to the disparity-independent, but stimulus-dependent behavioural performance (i.e., faster RTs in the MBRT). Further research is needed to disentangle their distinct contributions and their functional relevance with respect to the assumed embodied processes.

| Phase 2: Memory-based detection of spatial orientation in late object cognition reflected in the N350
The less negative N350 at frontal sites in the rotation of alphanumeric stimuli might reflect more efficient visual object cognition, such as detection of spatial relations and orientation, by selecting the best match in memory for the visual structure of the stimuli (Schendan & Lucia, 2009). From this perspective, memory retrieval, as a facilitating process to solve mental rotation tasks (Provost et al., 2013), might be reflected in the N350. Additionally, N350 amplitudes indicate more efficient object cognition in same trials compared to different trials, as amplitudes for same trials are smaller. Furthermore, same trials show a smaller N350 in the lower disparities (0° and 45°), but not in the higher ones. This seems to reflect less intense (more efficient) memory retrieval for same trials with lower disparities, while higher disparities in same trials might induce memory-related processes that are as intense (and thus less efficient) as those in different trials.

| Phase 3: Mental rotation itself reflected in the RRN SL
The behavioural disparity effect is also reflected in the neural data for the RRN SL and the N350. The overall RRN SL effect (ηp² = .60) is higher than in other studies (Provost et al., 2013 [ηp² = .18]; Beste et al., 2010 [ηp² = .14; calculated from F value]). The increasing RRN SL with higher disparities indicates an increasing demand for mental rotation up to a disparity of 135°. The decrease of the RRN SL in the 180° disparity condition is in line with the assumption that alternative processes, like flipping or vector inversion, can be utilized in the 180° condition (Bock et al., 2003; Neely & Heath, 2010). This is also consistent with the lower response errors in the 180° condition, but it remains to be clarified why this is not reflected in the RT, where a linear increase is interpreted as an indicator of mental rotation processes.
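For effect sizes that are derived from a reported F value, the standard conversion is ηp² = (F · df effect) / (F · df effect + df error). The short sketch below implements this identity. Whether exactly this conversion was applied to the data of Beste et al. (2010) is an assumption, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)
```

As a check against a value reported in the present results, `partial_eta_squared(3.40, 1, 31)` is about 0.099, which rounds to the ηp² = .10 given for the disparity effect on the N150 in the MBRT.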
As expected, the RRN SL effect was lower for the parity judgement of alphanumeric characters than for the parity judgement of human bodies. With reference to Liesefeld and Zimmer (2013), we assume that spatial-relational information, as a central factor determining task complexity and mental rotation requirements, is the same for the 'R' and the human body, as they both have one rotation-relevant feature. Differences in task complexity therefore do not seem to explain differences in the RRN SL. The bulge and the arm as features of the 'R' can be seen as an inflexible complex of features, because they always appear together on the same side of the vertical line without further degrees of freedom, and can therefore be seen as one rotation-relevant feature. The rotation-relevant feature of the human body is the outstretched arm.
It is postulated that this RRN SL difference reflects a lower involvement of mental rotation processes in the parity judgement of alphanumeric characters, while memory retrieval processes might play a stronger role. Being university students, the present participants had comprehensive expertise in the visual processing of spatially misaligned (normal) alphanumeric characters (reading this text adds more than 150 capital 'R's to this perceptual expertise). One could argue that human bodies are viewed equally often. However, there is one decisive difference between these tasks: we do not perceive human bodies as normal or unusual depending on the orientation of the upper extremities, whereas we do perceive mirrored letters as unusual. A misaligned non-mirrored letter might therefore be classified as such without the need for mental rotation, based on memory retrieval of the respective representation of the misaligned non-mirrored letter. The fact that the RRN SL difference between MBRT and MORT is smallest for the 135° stimulus disparity indicates that alternative processes to the mental rotation in the picture plane (i.e., memory retrieval for 45° and 90°; flipping for 180°) might not be used to the same extent for the 135° disparity in the MORT. It seems plausible to assume that perceptual expertise, as a prerequisite for memory retrieval-based solutions of the task, is higher for lower disparities (Núñez-Peña & Aznar-Casanova, 2009). The generalizability of this stimulus dependence of the RRN SL should be tested in specific populations with comprehensive perceptual exposure to alphanumeric stimuli (e.g., editors or students of literary studies) or human bodies (e.g., sport coaches and athletes). Future studies might also clarify why differences in the neural data are less markedly reflected in the behavioural data, as we would expect memory retrieval to lead to lower slopes in RT than a mental rotation-based solution of the task.
The lower RTs in the MBRT in the lower disparity conditions may be related to the task-specific differences in the RRN SL.

| RRN SL -Lateralization of neural activation
Contrary to some studies (e.g., Johnson et al., 2002) and in line with others (e.g., Beste et al., 2010) using a mirror versus non-mirror judgement task with letters, we did not find clear indications of lateralization effects in our data, because left and right parietal electrodes did not differ significantly. There is only a marginal tendency, as left electrodes indicated more negative values than central electrodes in the MORT. Interpreting this as a lateralization effect would indicate a tendency towards more holistic processing in the MORT, which cannot be documented for the MBRT. However, we interpret the data neither as indicating any clear dominance of holistic or analytic processes nor as excluding individual differences in task-solving strategies in either of the tasks.

| RRN-Coherence with behavioural data
Consistent with the neural efficiency hypothesis (Haier et al., 1988), which is assumed to be valid for easy tasks (with RTs under 2000 ms; Doppelmayr et al., 2005; Dunst et al., 2014), mental rotation performance (i.e., the slope for the RT as a function of disparity) was negatively correlated with the RRN SL. Accordingly, smaller slope values for the RRN SL amplitudes were associated with smaller slope values of the RT. A more efficient neural processing of mental rotation might be one possible cause of the reduced RRN SL, as might the use of alternative strategies (memory retrieval; Provost et al., 2013). Memory retrieval should lower both mental rotation-related neural activity (i.e., RRN SL) and the slope for RT. In contrast to other studies (e.g., Riečanský & Jagla, 2008), we found a correlation between mental rotation performance and the RRN SL, but not the RRN RL. Riečanský and Jagla (2008) found such a correlation only for the RRN RL and suggested that these data refer to the later stage of mental rotation, which finally leads to the correct task solution and is thus more closely related to performance. In contrast to our study, however, Riečanský and Jagla (2008) used only a one-stimulus normal-mirror judgement task.

| Late components in mental rotation tasks in comparable studies
Altogether, it is hard to compare our data on the late components in mental rotation tasks against the only known data set with a similar experimental approach (Jansen et al., 2020), because of decisive differences in the data analysis with respect to time intervals for the ERP components. We decided not to confound obvious components (i.e., pairs of peaks) in the early phases. However, our early plus late interval is comparable to their late interval, for which neither study found a main effect of task (i.e., stimulus type) at parietal electrodes. As in Jansen et al. (2020), a main effect of task for frontal electrodes was found in a comparable interval (375-600 ms vs. 400-600 ms). However, whereas Jansen et al. (2020) did not analyse the data relative to disparity, we found a distinct pattern of neural activation for different mental rotation angles.

| LIMITATIONS
As a replication of Steggemann et al. (2011), we used stimuli that differed in the stimulus category (a letter for the non-human stimuli vs. pictures of human bodies), but we cannot exclude that other stimulus features might confound the effect of stimulus type here. One might argue that stimulus complexity (e.g., number of spatial and textural features) also confounds our findings. To control for different stimulus features, Amorim et al. (2006) and Voyer et al. (2017) designed humanoid cube figures and humanoid figures that were spatially configured according to the cube figures. These stimuli have the drawback of a limited external validity, as the spatial configurations might be natural, in that humans are able to take the respective postures, but they do not incorporate a functional meaning for behaviour (e.g., pointing in a direction or manipulating an object).
Moreover, the offset of the fixation cross was 500 ms prior to the task-relevant stimulus pair, which might impede the anchoring of gaze and induce offset ERPs (e.g., File et al., 2018). These offset ERPs then lie in the interval for baseline correction, and indeed the baseline at frontal and central sites is not flat in the current data, but increasing. Likewise, the feedback onset directly after the response might evoke feedback-related potentials, which may confound the rotation-related components when RT is short. Stimuli were visible until response, and no gaze instruction (e.g., fixation point) was given. This may allow eye movements that induce artefacts in the EEG signal. This drawback was tolerated in favour of unrestricted processing strategies that are interdependent with oculomotor behaviour (Voyer et al., 2020). Instructing a certain gaze behaviour might reduce artefacts but also confound effects of stimulus type on mental rotation strategies. Artefacts related to eye movements were corrected by an independent components-based method, using the recordings of the electrooculography.
Although we did not reveal laterality effects (e.g., Beste et al., 2010; Jansen-Osmann & Heil, 2007; Núñez-Peña & Aznar-Casanova, 2009), it should be considered that all responses were given with the right upper extremity, which might also have influenced the rotation-related processes prior to the response. Especially fast responses may evoke ERPs within the time window of ERPs related to the mental rotation itself. Future studies might therefore counterbalance the responding extremity (e.g., left vs. right hand) across practice blocks.

| CONCLUSION
This study replicates the basic finding that the RRN SL, as a negativity overlying the P300 component at parietal sites in the ERP, reflects mental rotation processes (cf. Heil, 2002). Moreover, it was possible to show stimulus-dependent differences in the RRN SL amplitude. These differences indicate a more pronounced use of mental rotation processes in the MBRT compared to the MORT. Accordingly, a higher use of alternative task solutions (e.g., memory retrieval; Provost et al., 2013) is assumed for alphanumeric stimuli in the lower stimulus disparities (0°, 45° and 90°). However, the data do not indicate any clear preference for holistic or piecemeal mental rotation in either of the task versions on the group level. As predicted, there was a coherence between the neural data (RRN SL) and the behavioural data (RT). In the early phase of visual object cognition, the N150 and the P150 seem to indicate more efficient processing of human figures (i.e., figure-ground segregation). Together with a later N250, indicating differences in familiarity, and an N350 that indicates facilitated object cognition at a later stage (spatial relations and orientation detection) in the MBRT, these findings might help to explain which cognitive processing stages are enhanced in the mental rotation of body-related stimuli compared to the mental rotation of non-body-related stimuli, even though the exact reasons for the enhanced cognitive processing remain unclear (e.g., more prominent embodied cognition or more abstract differences in the stimuli, like saliency and complexity).