Object Orientation Affects Spatial Language Comprehension



Typical spatial descriptions, such as “The car is in front of the house,” describe the position of a located object (LO; e.g., the car) in space relative to a reference object (RO) whose location is known (e.g., the house). The orientation of the RO affects spatial language comprehension via the reference frame selection process. However, the effects of the LO's orientation on spatial language have not received great attention. This study explores whether the pure geometric information of the LO (e.g., its orientation) affects spatial language comprehension using placing and production tasks. Our results suggest that the orientation of the LO influences spatial language comprehension even in the absence of functional relationships.

1. Introduction

People use spatial language to map space onto language, and spatial prepositions indicate the location of objects in space. For instance, “The box is over the vase” indicates that the located object (LO; i.e., the box) is somewhere over the reference object (RO, i.e., the vase). To understand such spatial descriptions, people must assign a direction to the space by selecting a reference frame (Carlson-Radvansky & Irwin, 1993, 1994). Levinson (1996) distinguished among three reference frames people use to describe space: absolute, relative, and intrinsic. The absolute (or environment-centered) reference frame is aligned with salient aspects of the environment, such as gravity or geophysical features (e.g., north/south cardinal directions). The relative (viewer-centered) reference frame is selected from the viewer's perspective. In an intrinsic (object-centered) reference frame, the internal coordinates of the RO define the orientation of its axes.

The effects of RO orientation have been widely investigated in spatial language research to show the consequences of selecting multiple reference frames on spatial language comprehension (Carlson, 1999; Carlson & Van Deman, 2008; Carlson-Radvansky & Irwin, 1994; Carlson-Radvansky & Logan, 1997). However, the potential effects of LO orientation on spatial language have frequently been underestimated. For example, Carlson-Radvansky and Irwin (1993, 1994) and Carlson (1999) manipulated the LO's location while keeping its orientation irrelevant by using deliberately small objects, such as a fly or a dot, or objects without intrinsic orientations, such as spheres (Carlson & Van Deman, 2008) or balls (Carlson, West, Taylor, & Herndon, 2002). In Carlson-Radvansky and Logan (1997), the LO was a simple square devoid of the features people use to determine orientation (cf. Carlson-Radvansky & Jiang, 1998). Levelt (1996) controlled the LO's orientation using a circular object without an intrinsic orientation, such as a football. Similarly, other studies (Levinson, 1996; Pederson, 1995) have used objects without front/back orientations (e.g., cones, cylinders, cubes, balls, and bottles). The authors of these studies were not interested in the LO's orientation; nevertheless, they controlled for it using objects without an intrinsic orientation. Therefore, these authors cannot exclude the possibility that this method interfered with the process under examination.

The lack of interest in the LO's geometric properties is in line with cognitive linguistics, which considers the LO and its properties to be secondary to language comprehension (Jackendoff, 1983). For example, Talmy (1983) claimed that the LO is “simply” the goal of the spatial description. Therefore, properties such as orientation are irrelevant for spatial language comprehension. This view is consistent with the Attentional Vector Sum (Regier & Carlson, 2001), a computational model of the spatial language comprehension process that simulates an attentional shift from the RO to the LO. This model accounts for geometric aspects, such as alignment and distance (and, in a more recent version, functionality; Carlson, Regier, Lopez, & Corrigan, 2006). However, this model treats the LO as a target for the attentional vector without considering that its geometric aspects might interfere with language comprehension.

Cases in which the LO's orientation is manipulated include situations in which functional relationships bind the RO to the LO (e.g., a hammer and nail; Carlson-Radvansky & Radvansky, 1996). A scene illustrating a mail carrier (LO) standing near a mailbox (RO) was preferentially described using intrinsic descriptions (e.g., “The mail carrier is in front of the mailbox”) when the mail carrier was facing the mailbox (i.e., when the LO and RO were shown to be in a functional relationship). However, relative descriptions were preferred when the LO's orientation was changed (e.g., “The mail carrier is to the left of the mailbox”). Therefore, intrinsic reference frames are preferred when the LO and the RO are in a functional relationship. Coventry, Prat-Sala, and Richards (2001) manipulated the LO's orientation in a similar way for a series of sentence-acceptability ratings. People viewed pictures in which the LO had the function of protecting the RO (such as a man holding an umbrella to protect himself from the rain). The participants rated the appropriateness of various descriptive sentences (e.g., “The umbrella is above the man”). The results showed that manipulating the LO's orientation had a strong effect; participants considered sentences in which the umbrella protected the man from the rain to be the most appropriate (Coventry et al., 2001). These studies suggest that the LO's orientation is relevant in spatial language, at least when functional relationships are involved. However, no study has examined the specific effects of the LO's geometric aspects alone on spatial language comprehension.

Levelt's (1984) Principle of Canonical Orientation suggests that the LO has an important role regardless of the functional aspects between the objects (for a revision of his theory, see Levelt, 1996). This principle claims that intrinsic reference frames are permitted only under certain circumstances: “For the intrinsic system to refer to the intrinsic dimension of the reference object that dimension must be in canonical position with respect to the perceptual frame of orientation of the located object” (p. 354). Garnham (1989) stated that the LO's perceptual frame is one in which the object itself is located, assuming that the reference perceptual frame is vertically oriented. Therefore, this principle suggests that the LO's orientation may be important in spatial language processing even when there are no functional relationships.

It has been observed that Tzeltal speakers have a more complex specification of the LO than a simple point-like representation (Brown, 1993; Levinson, 1994). In referring to a locative situation (describing the location of an object or a place, for example), speakers use a verb that carries information about the LO shape and its orientation, providing a “concrete” geometric schema. For example, when asked to describe the action of putting a bottle on a table, Tzeltal speakers use “wajxan” if the bottle presents a vertical orientation (canonical), but they use “balan” if the bottle is lying on one side (non-canonical orientation) (Brown, 2008). This evidence suggests that the geometric properties of the LO may have a more important role in spatial language comprehension than expected. To date, however, no research has provided empirical support for this hypothesis.

This study investigates whether the LO's orientation influences spatial language processing when there is no functional relationship between the LO and RO. This hypothesis (henceforth, the geometric hypothesis) is grounded in the observation that the LO's orientation is relevant for the selection of reference frames (cf. Levelt, 1984) and that people select a reference frame to comprehend spatial language (Carlson-Radvansky & Irwin, 1993, 1994). To investigate this hypothesis, we conducted two experiments based on different methodologies: a placing task and a production task. Both experiments corroborated the geometric hypothesis and showed that people process the LO's orientation when a scene's objects do not interact functionally, which affects spatial language comprehension and production.

2. Experiment 1

This experiment investigated the LO's effect using a placing task. Participants placed objects according to a given spatial description while we manipulated the LO's orientation. For example, we presented participants with a spatial description, such as “The apple is above the strawberry.” Then, the participants placed the LO (i.e., the apple) based on the description (i.e., somewhere over the strawberry). Similar placing task paradigms have been used to successfully test spatial language comprehension. For example, Carlson and Kenny (2006) presented participants with the instruction, “Put the wig above/below/near the curling iron.” The participants then placed the wig in the indicated location. When the participants were given a second set of instructions, such as “Put the plug above/below/near the curling iron,” they placed the LO in a different location. Logan and Sadler (1996) (Experiment 1) examined the regions of space that corresponded to the best examples of a number of spatial prepositions. These authors presented participants with a picture of a box drawn on the center of a frame. Next, they asked the participants, for example, to “draw an X above the box.” Kenny and Carlson (2003; as cited in Carlson, 2003) asked participants to place a beanbag in a specified location (e.g., near or above) while a video manipulated the context by priming which functional parts of an object were important.

These types of placing tasks are interesting because they test language comprehension indirectly; in other words, they allow experimenters to control whether participants have understood the spatial description properly by examining the locations where the objects have been placed. In addition, these tasks permit us to test the effect of the LO orientation within an ecological spatial language use situation, which is more difficult with an acceptability rating task or a picture–sentence verification task.

According to the geometric hypothesis, we should find that the LO's orientation affects spatial language comprehension regardless of the (lack of) functional relationship between the objects (Levelt, 1984). Conversely, if we do not find an effect, the LO's orientation might be relevant for spatial language only in the presence of a functional relationship.

2.1. Method

2.1.1. Participants

Thirty people (6 males and 24 females; age range = 22–52 years, mean age = 28.70 years) volunteered for this experiment. All participants were native Italian speakers with normal or corrected-to-normal vision.

2.1.2. Materials and Design

This experiment employed a placing task in which participants placed an object in a location specified by a spatial description on a computer. The description had the following format: “The (LO) is (spatial preposition) the (RO).” The spatial prepositions could be one of the following: above/below, on the left/on the right, or in front of/behind. We used four polyoriented objects (a pumpkin, a strawberry, a carrot, and a pepper) as the LOs and ROs for spatial descriptions that contained the terms above/below to avoid increasing the participant reaction time (RT) needed to mentally rotate objects into canonical orientations (Leek, 1998; Leek, Reppa, & Tipper, 2003). Unlike monooriented objects, polyoriented objects do not show RT differences as a function of increasing their rotations away from a canonical orientation. The four stimuli for spatial descriptions containing on the left/on the right were pictures of people in a plan view. The names of these people coincided with their hair color; Mr. Red, Mr. Black, Mr. Green, or Mr. White. The four stimuli for spatial descriptions containing in front of/behind were lateral-view images of a horse, a bear, a penguin, and a frog. Fig. 1 illustrates examples of the depicted scenes for each spatial preposition set.

Figure 1.

An example of the depicted scene for each of the three spatial preposition sets used in the experiment. The example above illustrates the cases where the LO is in a canonical orientation. The competitor object (the object not mentioned in the description) is not represented.

Given the spatial description, “The (LO) is (spatial preposition) the (RO),” two tasks were possible: placing the LO in relation to the RO or placing the RO in relation to the LO. This manipulation was introduced with the intention of testing whether people used the LO as an RO in the latter condition. For example, given “A is above B” when participants are asked to place B, the task may bias participants to take A as the RO regardless of the spatial description provided, which indicated that B was the RO. Previous research has shown that non-canonical ROs generate a reference frame conflict, which may lead to longer latencies (Carlson-Radvansky & Logan, 1997). Therefore, we should observe a larger orientation effect when placing the RO than when placing the LO.

The objects to be placed always appeared at the top-left corner of the monitor; thus, half of the trials showed the LO in the top-left corner, and the other half showed the RO. Only the LO's orientation was manipulated. The RO was always upright for above/below, and it always faced up for on the left/on the right prepositions and faced left for in front of/behind prepositions. In addition to the LO and RO, we showed a third competitor object to force participants to explore all objects within the scene and to reduce the possibility that the participants would place an object without looking at the others. We selected filler objects from the same set of objects used in the experimental trials. A “target” object becomes a “competitor” object when we do not mention it specifically. For example, if a scene illustrates a strawberry, a pumpkin, and a carrot given the spatial description, “The pumpkin is above the carrot,” the strawberry is the competitor object. We depicted scenes that illustrate objects without functional relationships (unlike coin/piggybank, toothbrush/toothpaste, or lock/key).

The LO's orientation depended on the spatial preposition described. For example, we presented the LO upright or upside down for above/below, and we presented the LO facing up or down for on the left/on the right prepositions. We presented the LO facing to the right or to the left for in front of/behind prepositions. To group different LO orientations across the three spatial preposition sets under the same coding, we called orientations that matched the RO's direction “canonical” and those that did not match “non-canonical.” This terminology uses a specific definition of canonical position (Levelt, 1996). In this case, the RO's dimensions are in a canonical orientation when they are aligned perpendicularly (for horizontal prepositions) or parallel (for vertical prepositions) with the vertical dimension of the LO perceptual frame. The principle does not, however, define what counts as a canonical position for the LO itself. Nevertheless, according to the principle of converseness (if [A is above B] then [B is below A]; Levelt, 1996), the same principle can be applied to the LO. Accordingly, we assume that the LO is canonically oriented when its direction corresponds to the RO's direction. The examples in Fig. 1 illustrate scenes in which the LO is in both canonical (above) and non-canonical (below) orientations.

In summary, the variables included in the experiments were 6 (spatial prepositions: above/below, in front of/behind, on the left/on the right) × 2 (LO orientation: canonical vs. non-canonical) × 2 (object order: placing the LO vs. placing the RO) × 4 sets of objects, for a total of 96 trials. The experiment used a within-participants design. The dependent variable was the time necessary to place the object in the specified location.
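The trial count follows directly from fully crossing the factors listed above. As a minimal sketch (the variable names and the object-set indices are illustrative assumptions, not taken from the experimental software), the design can be enumerated as follows:

```python
from itertools import product

# Factor levels as described in the text; the labels themselves are illustrative.
PREPOSITIONS = ["above", "below", "in front of", "behind",
                "on the left", "on the right"]
LO_ORIENTATIONS = ["canonical", "non-canonical"]
OBJECT_ORDERS = ["place LO", "place RO"]
OBJECT_SETS = [1, 2, 3, 4]  # hypothetical indices for the four object sets

# Fully cross all factors to obtain one entry per trial.
trials = list(product(PREPOSITIONS, LO_ORIENTATIONS, OBJECT_ORDERS, OBJECT_SETS))

print(len(trials))  # 6 * 2 * 2 * 4 = 96
```

Because the design is within participants, each participant would see every entry of such a list, presumably in a randomized order.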

2.1.3. Procedure

Participants sat approximately 60 cm in front of a computer monitor. The experiment first showed a series of instructions, followed by a short practice session to familiarize participants with the procedure. Then the experiment began with a sentence in the following format: “The [LO] is [spatial preposition] the [RO].” This sentence remained on screen until participants pressed the spacebar. Next, three objects appeared on screen: the LO, the RO, and the competitor object. The objects that participants had to place always appeared at the top-left corner of the monitor. The positions where the objects-in-place could appear were randomized across different screen positions to prevent participants from seeing objects in predictable locations. We vertically aligned the other two objects (the LO and the competitor if the RO had to be placed, or the RO and the competitor if the LO had to be placed) for the in front of/behind and on the left/on the right spatial prepositions, whereas we horizontally aligned the other objects for the above/below spatial prepositions. Then participants placed the object according to the spatial description using a mouse. Once the object was in the presumably correct location, participants clicked the mouse button to proceed to the next trial. Reaction times were calculated from scene onset until the mouse button press. During placement, the object was shown for the duration of the placing action to allow participants to process its orientation. Finally, an intertrial interval of 700 ms closed each trial. Participants used as much time as they needed to complete the task.

2.2. Results

We calculated placing responses using the object's spatial coordinates. We excluded from the statistical analyses those cases in which the object was placed outside a “good” area. According to the definition by Carlson-Radvansky and Logan (1997), good areas are those aligned to the reference objects (vertical for above/below and horizontal for in front of/behind and on the left/on the right). For this experiment, the good area criterion corresponds to placing the target object more than 20 pixels from the outside edge of the other object, which corresponds to about 0.75° of visual angle. We excluded one participant from the analysis because he failed to complete the task. RTs and errors were analyzed separately. No item analysis was carried out, following the criticism that a reduced set of items (as is the case in our experiment) leads to very low statistical power (Raaijmakers, 2003; Raaijmakers, Schrijnemakers, & Gremmen, 1999) and may generate misleading interpretations.
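The relation between the 20-pixel margin and its visual angle can be illustrated with the standard visual-angle formula. The sketch below assumes the approximate 60 cm viewing distance mentioned in the Procedure; the monitor's pixel pitch is not reported, so the value printed here is only what the stated figures would imply, and the function names are hypothetical.

```python
import math

def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an extent of size_cm at distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

def extent_cm(angle_deg: float, distance_cm: float) -> float:
    """On-screen extent (cm) that subtends angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

VIEWING_DISTANCE_CM = 60.0  # approximate viewing distance (from the Procedure)
MARGIN_PX = 20              # good-area margin in pixels
MARGIN_DEG = 0.75           # reported visual angle of that margin

margin_cm = extent_cm(MARGIN_DEG, VIEWING_DISTANCE_CM)
implied_pitch_cm = margin_cm / MARGIN_PX  # derived assumption, not reported

print(f"margin: {margin_cm:.3f} cm, implied pixel pitch: {implied_pitch_cm:.4f} cm")
```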

2.2.1. Latencies analysis

The first analysis included the following factors: object order (placing the RO vs. placing the LO), spatial preposition sets (in front of/behind vs. above/below vs. on the left/on the right), and LO orientations (canonical vs. non-canonical). We eliminated RT outliers that deviated by more than 2 SD, calculated separately for each condition.1 This procedure eliminated 150 responses (5.2%).
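A per-condition trimming rule of this kind can be sketched as follows. This is only an illustration under stated assumptions: it treats the cutoff as two-sided around the condition mean and uses the sample SD, neither of which is specified in the text, and the function name and the RT values are made up.

```python
from collections import defaultdict
from statistics import mean, stdev

def trim_outliers(observations, n_sd=2.0):
    """Keep (condition, rt) pairs whose RT lies within n_sd SDs of its condition mean."""
    by_condition = defaultdict(list)
    for condition, rt in observations:
        by_condition[condition].append(rt)

    kept = []
    for condition, rt in observations:
        rts = by_condition[condition]
        if len(rts) < 2:
            kept.append((condition, rt))  # SD is undefined for a single value
            continue
        m, sd = mean(rts), stdev(rts)
        if abs(rt - m) <= n_sd * sd:
            kept.append((condition, rt))
    return kept

# Made-up RTs (ms) for illustration only; not data from the experiment.
sample = [("canonical", rt) for rt in (2300, 2350, 2400, 2320, 2380, 2360, 6000)] + \
         [("non-canonical", rt) for rt in (2700, 2750, 2650, 2720, 2690, 2710)]
trimmed = trim_outliers(sample)
print(len(sample) - len(trimmed), "responses removed")
```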

A factorial ANOVA revealed a main effect of LO orientation, showing that participants were faster to place the objects in the described locations when the LO was in a canonical orientation (M = 2,352 ms, SD = 538 ms) than when it was not (M = 2,739 ms, SD = 1,132 ms), F(1, 28) = 18.57, p < .001, η² = .40. The same analysis also revealed a significant main effect of object order in which participants were significantly faster to place the LO (M = 2,358 ms, SD = 811 ms) compared with the RO (M = 2,733 ms, SD = 841 ms), F(1, 28) = 31.19, p < .001, η² = .53. There was also a significant effect of spatial prepositions, F(2, 56) = 2.96, p < .05, η² = .14, such that placing the object using in front of/behind prepositions (M = 2,404 ms, SD = 1,046 ms) or using above/below prepositions (M = 2,549 ms, SD = 889 ms) was faster than on the left/on the right (M = 2,684 ms, SD = 749 ms; Scheffé: p < .01). Finally, the data analysis revealed a significant interaction between object order and LO orientation, F(1, 28) = 10.44, p < .01, η² = .27 (Fig. 2). A Scheffé post hoc test revealed that the RTs for trials presenting the LO in a canonical orientation were faster in both the LO-placing (M = 2,371 ms, SD = 566 ms) and the RO-placing conditions (M = 2,579 ms, SD = 591 ms) than when the LO was in a non-canonical orientation (LO: M = 2,684 ms, SD = 1,160 ms; RO: M = 3,217 ms, SD = 1,172 ms). All comparisons were significant (ps < .001). We did not find other significant effects.

Figure 2.

This graph illustrates the placing times for the two-way interaction between LO orientation and object order. Bars represent standard errors.

Because we used a different set of objects depending on the spatial prepositions examined, we ran three separate ANOVAs to investigate whether the effect of the orientation of the LO was significant across all the spatial relations. The first analysis (above/below) revealed a main effect of object order, F(1, 29) = 23.85, p < .001, η² = .45, but the main effect of LO orientation was only marginally significant, F(1, 29) = 3.29, p = .08, η² = .10; their interaction was not significant. The second ANOVA (on the left/on the right) revealed a main effect of object order, F(1, 29) = 11.50, p < .001, η² = .28, and a main effect of LO orientation, F(1, 29) = 8.25, p < .001, η² = .22; again, however, their interaction was not significant. The last ANOVA (in front of/behind) revealed a main effect of object order, F(1, 29) = 11.91, p < .001, η² = .29, and a significant main effect of LO orientation, F(1, 29) = 23.68, p < .001, η² = .45. Their interaction was also significant, F(1, 29) = 5.05, p < .001, η² = .15, showing that the advantage of placing a canonical-oriented object compared with a non-canonical-oriented object was more pronounced when participants placed the RO (M = 1,018 ms) compared with the LO (M = 411 ms).

Given that people may have processed the LO as an RO for trials where they were asked to place the RO, we ran a separate analysis focusing on trials where people had to place the LO only. The ANOVA revealed a significant main effect of the orientation of the LO, F(1, 28) = 4.42, p < .05, η² = .14, with trials showing the LO in a canonical orientation being placed faster (M = 2,371 ms, SD = 566 ms) than trials with the LO in a non-canonical orientation (M = 2,686 ms, SD = 1,160 ms). This analysis also revealed a main effect of spatial preposition, F(2, 56) = 6.54, p < .001, η² = .33, showing that placing the object using in front of/behind prepositions (M = 2,349 ms, SD = 756 ms) or using above/below prepositions (M = 2,520 ms, SD = 1,018 ms) was faster than on the left/on the right (M = 2,717 ms, SD = 901 ms). However, Scheffé's post hoc test indicated that the only significant comparison was between in front of/behind and on the left/on the right (p < .01). No further effects were found.

2.2.2. Errors analysis

Overall, participants made 120 placing errors (4.3%). However, the distribution did not follow the RT outcomes. In fact, trials in which the LO was in a non-canonical orientation did not cause significantly more errors (56.7%) than those in which the LO was in a canonical orientation (43.3%), Wilcoxon T = 1.8, n = 12, n.s. This analysis also revealed that object presentation order affects the difficulty of the task; trials in which participants had to place the LO had fewer errors (33.3%) than those in which the RO had to be placed (66.7%), Wilcoxon T = 10.5, n = 14, p < .05. We did not find other significant effects.

2.3. Discussion

Experiment 1 investigated the effect of LO orientation on spatial language comprehension using a placing task. The results showed that the LO's orientation affects the way people process spatial descriptions, even in the absence of functional relationships. Participants placed objects in the correct location more slowly in trials in which the LO had a non-canonical orientation compared with those in which the LO had a canonical orientation. This effect was found for all spatial prepositions, but the effect was strongest for the horizontal spatial prepositions on the left/on the right and in front of/behind. This variation may be related to the fact that for the above/below preposition, there is no reference frame conflict for the LO to be placed in the designated location given that the absolute, relative, and intrinsic reference frames overlapped. In contrast, for the prepositions on the left/on the right and in front of/behind, it may be necessary to solve the conflict between the intrinsic and the relative reference frame (the absolute reference frame is not necessary for apprehending the scene in these cases). Therefore, it is possible that in this latter case, the orientation effect of the LO is emphasized by the higher complexity of the scenes. However, the lack of an interaction between the orientation of the LO and the preposition type suggests that this interpretation requires further attention.

We also found a main effect of LO orientation when placing both the LO and the RO (although to varying degrees). According to Talmy (1983), people focus first on the RO and then move to the LO. Thus, searching for the RO before the LO may be automatic. However, trials in which participants placed the RO may have disrupted the normal sequence used to establish a correspondence between the description and the objects. In fact, the participants in Experiment 1 may have had to restructure the problem, perhaps via contrary relationships (e.g., B is below A → A is above B), which would explain the additional time needed for trials in which participants placed the RO. In line with Talmy's (1983) principle, the participants may have taken the LO to be the RO during the early stages of RO-placing trials. Consistent with the Multiple Frame activation theory (Carlson-Radvansky & Irwin, 1994), under which manipulating the orientation of the RO affects spatial language comprehension, we found a larger difference in latencies between canonical versus non-canonical LO orientations for RO-placing trials compared with LO-placing trials. However, the analysis that focused only on trials in which participants had to place the LO revealed a significant LO effect. This finding indicates that the effect of the LO orientation does not depend on the request to place the LO or the RO.

The RT differences among the different spatial terms replicate the effect of symmetry (Franklin & Tversky, 1990). The participants responded to asymmetrical terms (in front of/behind and above/below) more quickly than they responded to symmetrical terms (on the left/on the right). This result confirms previous findings that these prepositions refer to an object axis that cannot be confused with the viewer axis (Bryant, Tversky, & Franklin, 1992).

The RT results support the geometric hypothesis, but the error analysis does not. This lack of support can be explained in terms of the study constraints: because we did not impose a time restriction, the participants may have prioritized accuracy over speed. Finally, although our paradigm mirrored more ecological settings of spatial language comprehension than an acceptability rating task, the most common situation in which people use spatial relationships is for their production. Therefore, Experiment 2 investigated the geometric hypothesis within a production task.

3. Experiment 2

Experiment 1 showed that the LO's orientation affects spatial language comprehension. In particular, when the LO is presented with a non-canonical orientation, people take longer to place it in a described spatial location. However, Experiment 1 primarily addressed the comprehension aspects of spatial language, whereas people often use spatial relationships to describe the location of objects or places (cf. Bohnemeyer & Pederson, 2010; Coventry & Garrod, 2004). For this reason, Experiment 2 investigated whether LO orientation affects spatial language production. One way to study language production is to use the simply describing procedure to recount a scene (Bates & Devescovi, 1989; Osgood, 1971). Similar methodologies have effectively elicited descriptions of images, drawings, or pictures. For example, researchers elicited descriptions of pictures by asking questions related to a story or a scene that participants had just seen (Bates & Devescovi, 1989; Carroll, 1958). Tannenbaum and Williams (1968) presented participants with a story about a train or a car that was described with active or passive voice. Then, the participants described the actor and its actions and goals depicted in a series of pictures to determine whether the verb form used in the preamble primed the verb form used in the participants' descriptions. This methodology suffers from what Bock (1996) called the exuberant responding problem because participants may describe several irrelevant aspects of the story, such as the color of the object or other insignificant details. Lindsley (1975) modified this paradigm to control for exuberant responses by showing participants the four sentence structures that they were allowed to use. Similar paradigms have been used in other studies (Carlson-Radvansky & Irwin, 1993; Clark & Chase, 1974). Bock (1996) controlled for exuberant responses by employing material that tends to elicit a specific description. For example, participants were shown a scene illustrating two simple objects and were asked to describe the location of the LO (Hayward & Tarr, 1995; Experiment 1). We used a modified version of the simply describing procedure (Bates & Devescovi, 1989) to control for exuberant responses by suggesting possible sentence structures that participants might use.

According to the geometric hypothesis, we again expect the LO's orientation to be relevant for spatial language production even in the absence of functional relationships. In addition, based on the results of Experiment 1, we expect that scenes in which the LO has a non-canonical orientation should take longer to be described than those in which the LO has a canonical orientation.

3.1. Method

3.1.1. Participants

Thirty-two people (17 males and 15 females; age range = 18–26 years, mean age = 21 years) volunteered for this experiment. All participants were native Italian speakers with normal or corrected-to-normal vision. None of the participants had volunteered for Experiment 1.

3.1.2. Design and materials

This experiment used the same stimuli as Experiment 1. Experiment 2 did not use a competitor object because people explore and identify two objects per trial in a production task; thus, they cannot ignore either the LO or the RO.

Based on the simply describing procedure (Bates & Devescovi, 1989; Osgood, 1971), participants described the location of the designated object (the LO, underscored by a dot) with respect to the RO. To control for exuberant responding in speech production (Bock, 1996), we did not force participants to use a specific sentence structure; rather, we suggested, during the training trials, a preferred formulation they might use which implicitly indicated the position of the LO in the syntactic structure. Experiment 2 had a 3 (spatial preposition sets: above/below, in front of/behind, and on the left/on the right) × 2 (LO orientation: canonical vs. non-canonical) design. A balanced combination of four objects was used for each preposition, for a total of 48 trials. The experiment lasted approximately 20 min, and its design was within participants.

3.1.3. Procedure

The experiment began by showing the instructions (see Appendix A for details) followed by some example trials explaining to participants that their task was to describe the location of a target object (indicated by a dot) using another object as a landmark (suggesting which syntactic structure they should use). We did not set a time limit, but we did stress that descriptions should be as clear and informative as possible. We showed participants all of the objects before the experiment began to control for name variability. We presented the objects simultaneously. The placement of the object pairs was randomized across different screen positions to prevent participants from seeing objects in predictable locations. After training, a blank screen appeared for 700 ms followed by a scene containing two objects. A sound indicated when the objects appeared on screen. This was important because participants' responses were recorded on a digital voice recorder and reaction times were calculated from that point (scene onset). Once participants had described the scene, they pressed the spacebar to see the next trial.

3.2. Results

This experiment examined the processes that underlie the production of spatial descriptions and whether LO orientation affects these processes. Given that the first phonemes of the response might bias voice-response measurements (Kessler, Treiman, & Mullennix, 2002), we used a more sophisticated and precise measure of planning/execution times based on a waveform analysis. We calculated the latencies from the scene onset to the end of the vocalization using the Praat speech analysis package (Boersma & Weenink, 2010).

The following analysis first examined the utterance planning time (UPT), which measures the interval from the scene onset until the start of vocalization. In a separate analysis, we analyzed the utterance execution time (UET), which represents the time required to produce a description. UET was measured from the start to the end of vocalization. We included this measure to examine whether sentences are entirely planned before speech begins (Lindsley, 1975) or whether language production is incremental, in that people plan a portion of their speech while they are speaking (Ferreira & Swets, 2002). Analyzing both the UPT and UET intervals makes it possible to determine whether any effect of the LO's orientation on spatial language production arises during the planning stage or the articulatory stage. In addition to the planning/execution time analysis, we conducted a separate error analysis to explore which spatial arrays were more difficult to describe.
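The two intervals defined above can be illustrated with a minimal sketch. It assumes that each trial has three timestamps on a common clock (scene onset, start of vocalization, end of vocalization); the class name, field names, and example values are hypothetical and are not taken from the study's data or from Praat's output format.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical field names; all timestamps are in milliseconds
    # and are assumed to share the same clock.
    scene_onset_ms: float
    speech_onset_ms: float
    speech_offset_ms: float


def planning_time(trial: Trial) -> float:
    """UPT: interval from scene onset to the start of vocalization."""
    return trial.speech_onset_ms - trial.scene_onset_ms


def execution_time(trial: Trial) -> float:
    """UET: interval from the start to the end of vocalization."""
    return trial.speech_offset_ms - trial.speech_onset_ms


# Illustrative values only (not data from the experiment).
example = Trial(scene_onset_ms=0.0, speech_onset_ms=1200.0, speech_offset_ms=3400.0)
print(planning_time(example), execution_time(example))
```

Under this definition, UPT and UET together span the whole interval from scene onset to the end of the vocalization, so an effect can be attributed to one stage or the other.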

3.2.1. Planning time and executing time analysis

We removed UPTs and UETs greater than 2 SD from the mean to exclude outliers (see Footnote 2). This procedure eliminated 142 responses (9.2%; 78 UPTs = 5.6%; 64 UETs = 4.16%). We considered only correct responses in this analysis. First, we investigated the effect of the preposition sets (in front of/behind, above/below, and on the left/on the right) and LO orientation (canonical vs. non-canonical) via two 3 × 2 ANOVAs, one for each interval (UPT and UET).
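The trimming rule can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the mean and SD were computed per participant, per condition, or over all responses, nor whether the rule was applied in both directions. The hypothetical helper `trim_outliers` works on a flat list of latencies, uses the sample standard deviation, and removes values on either side of the mean.

```python
from statistics import mean, stdev


def trim_outliers(latencies, n_sd=2.0):
    """Keep latencies that lie within n_sd sample SDs of the mean.

    Assumption: the criterion is applied to one flat list of values
    and in both directions; the study may have grouped the data differently.
    """
    m = mean(latencies)
    sd = stdev(latencies)
    return [x for x in latencies if abs(x - m) <= n_sd * sd]


# Illustrative latencies in ms (not data from the experiment).
sample = [1000, 1100, 1200, 1300, 1150, 1250, 1050, 5000]
print(trim_outliers(sample))
```

Passing `n_sd=2.5` or `n_sd=3.0` gives the alternative trimming criteria mentioned in the footnote.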

The UPT analysis showed a main effect of preposition set, F(2, 44) = 8.01, p < .001, η² = .35, such that responses for on the left/on the right were faster (M = 1,196 ms, SD = 498 ms) compared with the other sets (above/below: M = 1,484 ms, SD = 799 ms; in front of/behind: M = 1,489 ms, SD = 876 ms; Scheffé: ps < .01). This analysis did not find any other significant effect. The interaction between prepositions and LO orientation within the UET interval was not significant, F(2, 44) = .01, n.s., whereas the main effects for both spatial prepositions, F(2, 44) = 47.09, p < .001, η² = .81, and LO orientation, F(1, 22) = 18.30, p < .001, η² = .45, were significant (Fig. 3). Specifically, the spatial prepositions on the left/on the right were processed significantly more slowly (M = 2,884 ms, SD = 496 ms) than above/below (M = 2,143 ms, SD = 583 ms) and in front of/behind (M = 2,151 ms, SD = 632 ms), ps < .001. The scenes in which the LO had a non-canonical orientation required more time to be described (M = 2,600 ms, SD = 662 ms) compared with those with a canonical orientation (M = 2,186 ms, SD = 461 ms). This analysis did not find other significant effects.

Figure 3.

The mean utterance planning time (UPT) and utterance execution time (UET) for describing a scene in which the LO was in a non-canonical or canonical orientation. Bars represent standard errors.

As in Experiment 1, we conducted separate analyses for each set of prepositions. Within the UPT interval, there was no effect of LO orientation for above/below, t(29) = −0.28, n.s., or in front of/behind, t(24) = 0.16, n.s.; however, a significant difference emerged for on the left/on the right, t(31) = −3.72, p < .001, effect size = .27. Specifically, descriptions for the scene in which the LO had a canonical orientation were planned significantly faster (M = 1,181 ms) than those in which the LO had a non-canonical orientation (M = 1,290 ms). The analysis on the UET revealed a significant effect of LO orientation for above/below, t(29) < −3.21, p < .01, effect size = .22, on the left/on the right, t(31) < −4.7, p < .001, effect size = .29, and in front of/behind, t(24) < −2.79, p < .01, effect size = .12. The average advantage for trials in which the LO had a canonical orientation was 377 ms.

3.2.2. Errors analysis

The participants made 269 errors (19%) based on the following criteria: [C] = online correction of the description during the vocalization, or “self-repair” (Schegloff, Jefferson, & Sacks, 1977); [H] = horizontal axis errors (i.e., inverting left/right location); [P] = perspective choice leading to a spatial relationship confound (i.e., figures above/below are described as in front of/behind); [G] = guideline description errors (i.e., X is opposite of Y, X is too far from Y, and X is going in the opposite direction of Y); and [I] = inversion error (above rather than below and so on). We did not analyze dysfluencies (i.e., hesitations, silent pauses, filled pauses, such as “hum,” “er,” “uh,” false starts, and repetitions) in detail because they have causes that are not related to the aim of this study (Garrett, 1982) and because their presence/absence is automatically reflected in planning/execution times.

We observed errors in the non-canonical condition more often (57%) than in the canonical condition (42%), Wilcoxon T = 34, N = 22, p < .05. We found a significant difference between C (4.09%) and H error types (53.16%), Wilcoxon T = 0, N = 7, p = .02. The error analysis conducted on each preposition set did not find a statistical difference.

3.3. Discussion

Experiment 2 investigated the effect of LO orientation on a production task in which participants described the location of an object in relationship to another object. The analyses of latencies and errors replicated the effects found in Experiment 1. A consistent effect of LO orientation (longer execution time in describing a scene with the LO in a non-canonical orientation) within the UET interval was found both overall and for each spatial preposition set. This finding corroborates the geometric hypothesis, showing that the LO's orientation is relevant during spatial language production.

However, a separate analysis for each preposition set revealed an effect of LO orientation within the UPT interval for on the left/on the right only, whereas the ANOVA on the same interval found no effect of LO orientation. This result suggests that participants describing spatial relations such as above/below and in front of/behind accounted for the LO's orientation after the description was programmed. This finding is in line with Ferreira and Swets (2002), who suggested that language production does not require people to plan their utterances completely before speaking. People begin to speak after they know the first word of a description and then plan later elements online. Whether these later elements include spatial relationships remains to be investigated.

This analysis also revealed that the spatial terms on the left/on the right were processed most quickly for the UPT interval but were processed most slowly for the UET interval. The result observed for the UPT interval contradicts Bryant et al. (1992) and Experiment 1, which found that asymmetrical terms are processed faster and more accurately than are symmetrical terms. However, the glut of [H] errors is in line with the observation that the left/right axis is the most difficult to process because of its symmetrical properties (Franklin & Tversky, 1990). Given that the planning time was faster for on the left/on the right utterances compared with the other prepositional conditions, the linguistic aspects that contribute to complicating these spatial terms (e.g., symmetry) must occur after the initiation of utterance production. This explanation is in accordance with the findings by Ferreira and Swets (2002), but it extends their view and suggests that the processing of the geometrical features of objects being described occurs after the utterance of the first word. The lack of interaction between spatial terms and LO orientation indicates that this geometric property does not depend on which spatial terms are described and suggests that our results may generalize to numerous projective spatial prepositions.

4. General discussion

The great interest in spatial language recently shown by cognitive science is justified by the fact that this subject provides a window on how people map spatial information into language. The connection between spatial language and cognition has inspired many studies, particularly those interested in investigating whether extra-linguistic properties affect language. For example, exploration of the geometric properties associated with the RO's orientation has revealed how conflict among reference frames influences spatial language comprehension and production (Carlson-Radvansky & Irwin, 1994; Levinson, 1996; Taylor & Rapp, 2004). Another non-linguistic property that has been shown to affect language is the functional relationship between the objects being described (Carlson-Radvansky & Radvansky, 1996; Coventry et al., 2001). The suggestion that the appropriateness of a spatial relation between two objects depends on the functional relation between them indicates that spatial language is grounded in action and that language carries information about how we perceive the world and how we can act in the world (Coventry & Garrod, 2004).

The present study expands the list of extra-linguistic features that affect language by showing that the LO's geometric properties (e.g., its orientation) are relevant for the processes of spatial language comprehension and production. We observed that participants took longer to place objects in the correct location and their descriptions required more time to formulate when the LO presented a non-canonical orientation. These effects were robust across two methodologies and corroborated the hypothesis that the LO's orientation is relevant for spatial language comprehension regardless of the functional relationship between objects.

Our findings are particularly important for spatial language and spatial cognition domains for a number of reasons. First, these findings suggest that the list of spatial language comprehension processes necessary to understand a simple spatial relation is incomplete and in need of revision. In fact, Carlson-Radvansky and Logan (1997) suggested that additional processes beyond those described in their study might play a role. Second, the outcomes discussed in this study are not relevant only within the spatial language domain but are also relevant for cognitive linguistic theories, which claim that the LO has a marginal role in language. Third, we illustrate why the evidence presented here has implications for the processes involved in mapping spatial representation onto language. Finally, we consider whether pragmatic and inferential accounts can explain our findings.

4.1. Spatial language frameworks

The outcomes illustrated in the current study have implications for some of the most important spatial language frameworks found in the literature. We know that the spatial comprehension process can be summarized by identifying the RO, selecting a reference frame to assign a spatial direction, building a relevant spatial template, processing the LO's goodness of fit, and determining whether this goodness-of-fit measure is acceptable or poor (Carlson-Radvansky & Logan, 1997). In light of our results, the spatial comprehension process should be revised to include a stage in which people process the LO's orientation and determine whether it matches the orientation of the RO. If the two orientations do not match, the spatial apprehension process should take longer to be completed and, according to previous studies on the orientation of the RO, a reduction in the acceptability of the spatial terms should be observed (Carlson-Radvansky & Logan, 1997).

According to our results, other frameworks should be revised to take into account the LO's effects. For instance, the Attentional Vector Sum model (Regier & Carlson, 2001) simulates the apprehension of the spatial relation above by computing an attentional vector that goes from the RO to the LO. This model represents the LO as a simple dot with no mass or orientation; therefore, it cannot account for the LO orientation effect we observed in this study. One possibility consists of adjusting the direction of the attentional vector based on the discrepancy between the orientations of the two objects (which measures the difference between a canonical and a non-canonical orientation).

According to Levelt's (1984) Principle of Canonical Orientation, the orientation of the LO may be particularly critical for imposing an intrinsic reference frame on the RO. Although we did not test this principle directly, by showing that the orientation of the LO is significant to spatial language understanding, we provided empirical evidence in support of this view. The effect of the orientation of the LO may be the result of a conflict in imposing a reference frame: If the two objects share the same orientation, the process of selecting the most appropriate reference frame may be effortless. When the orientations are mismatched, deciding which direction better represents the scene requires that this reference frame conflict be resolved first.

An alternative but related explanation concerns Logan and Sadler's (1996) spatial template theory. Similar to the scenes in which the RO was presented with a non-canonical orientation, the spatial template built on the scene in which the LO had a non-canonical orientation may be weighted according to the orientation of both objects. However, we can only infer this from latency distribution because we did not test this hypothesis directly with an acceptability rating task.

4.2. Cognitive linguistics theories

The lack of interest in cognitive linguistics in the LO and its geometric properties originates from the assumption that the LO has only marginal effects (Jackendoff, 2002; Langacker, 1986). Accordingly, several studies have used an LO that did not have an orientation (e.g., Carlson & Van Deman, 2008; Carlson-Radvansky & Irwin, 1993, 1994; Levelt, 1996; Levinson, 1996; Pederson, 1995). According to Talmy's view (1983, 2000), the primary object (i.e., the LO) exhibits unknown spatial properties; it is smaller and geometrically simpler (often point like) than the secondary object (the RO). In contrast, the RO is larger and geometrically more complex than the LO and has known properties that can characterize the primary (LO) object's unknowns. Talmy claimed that the orientation of the LO may become more relevant in special cases. For example, in “the gate was set across to the pier” or “the board lay across the railway bed” (Talmy, 2000, pp. 187–189), the spatial term denotes geometric information about the LO orientation that is necessary for a correct understanding of the spatial description. However, within projective spatial terms (such as the ones we examined in this study), the role of the LO has been considered to be of secondary importance. This is the first study to find that people process the geometrical features of a sought-after object (i.e., the LO) during spatial language use, suggesting that the perceptual characteristics of the LO are more important than predicted by this theory. This finding is critical not only for spatial language experiments that show an LO with a clear orientation but also because it shows that language is sensitive to extra-linguistic information, such as the geometry of the objects being described. In turn, this finding suggests that considering the LO a point-like object is overly simplistic and does not capture the complexity of the spatial language comprehension process.

Additional evidence suggesting that the LO may be important in language comes from the observation that Tzeltal speakers use verbs that carry geometrical information about the LO (Brown, 1993; Levinson, 1994). For example, these speakers would use “wajxan” to describe the position of a bottle when it is vertically oriented, but they would prefer “balan” when the bottle lies on the table (Brown, 2008). This fine-tuned verb coding indicates that for their communication system, it is critical to draw attention to the orientation of the LO. However, this information seems not to be essential for the majority of languages. This difference may be related to reference frame systems. We know that Tzeltal speakers communicate spatial information by relying on an absolute geocentric system (e.g., north and south; Levinson, 1994), whereas other languages use a combination of reference frames (with a preference for the absolute/relative reference frame, Carlson, 1999). The principal difference between the two systems is that the geocentric system operates without a landmark (e.g., “Where is your house? North”), whereas the absolute/relative system must usually specify a referent object (e.g., “Where is your house? It's near the post office, in front of the bank”). The necessity for Tzeltal speakers to provide information about the geometry of the LO may reflect an attempt to compensate for the less defined, overly simplistic geocentric reference frame system. However, this appealing idea is speculative and requires further study.

4.3. Spatial language, perception, and action

The finding that spatial language comprehension and production slow when the LO is presented in a non-canonical orientation suggests questions about the process of mapping visuoperceptual information into language. Hayward and Tarr (1995) noted that the visual system and the linguistic system are in communication, but language often ignores details that come from perception. This observation resembles Ungerleider and Mishkin's (1982) distinction between two separate cortical pathways, the so-called what and where systems: One specializes in detecting where objects are, and the other specializes in capturing what objects are (but cf. Milner & Goodale, 1995). Similarly, Landau and Jackendoff (1993) argued that spatial prepositions code the nature of the related objects in a poor schematic representation because their focus is locations rather than the properties of objects. Our study shows that the meaning of spatial terms is modulated by information that goes beyond language, such as the orientation of the LO. This finding is in line with Regier's studies (1996) that demonstrated that spatial language is grounded in perception by showing that the definitions of spatial terms reflect properties of the objects being described.

There is also evidence that language is grounded in action and carries information about objects' functionality (Coventry & Garrod, 2004). Our findings have implications for studies in which the orientation of the LO is manipulated to investigate this aspect (e.g., Carlson-Radvansky & Radvansky, 1996; Coventry et al., 2001). According to our results, the orientation of the LO affects spatial language apprehension even when no functional relationship links the LO and the RO. It remains to be established whether a functional relationship overcomes the non-canonical orientation of the LO or whether the two effects can be additive. For example, in Coventry et al.'s (2001) study, for scenes showing a postman facing a postbox, the spatial term in front of received higher ratings compared with scenes in which the postman faced away from the postbox. However, given that in the latter example the LO (the postbox) faces the same direction as the postman (and therefore is in a canonical orientation, according to our definition), we should expect higher ratings. Therefore, functional relation may prevail over the orientation mismatch. Further studies should explicitly investigate this hypothesis.

4.4. Pragmatics and inference in spatial language

The longer time needed to produce a spatial description of a scene with the LO in a non-canonical orientation may reflect participants' attempts to build an ideal spatial description. This idea is not new (Bar-Hillel, 1964) and captures the rationale behind pragmatic principles of communication (Grice, 1989; Wilson & Sperber, 2004). The description “A is above B” does not provide information beyond spatial localization (e.g., about the LO's orientation). Thus, longer latencies may represent the speaker's recognition that his or her description provides a better fit to a scene in which the LO is canonically oriented because that information is implicitly derived from an A-is-above-B description (Herskovits, 1988). However, this hypothesis was not directly addressed in this study and requires further study. Another type of inference that people make during spatial language comprehension is also worth noting: converseness. The structure of spatial language allows inferences to be made about other spatial relations that hold between the same objects. For instance, above–below, front–back, and north–south are directional opposite pairs and exhibit the property of converseness (Levelt, 1984, 1996) such that if the two-place relation expressed by one pole is called $R$ and the one expressed by the other pole is called $R^{-1}$, then $R(X, Y) \rightarrow R^{-1}(Y, X)$. Hence, if X is above Y, Y will be below X. According to Grice (1989) and more recent reformulations of Gricean principles (e.g., Levinson, 2000; Sperber & Wilson, 1986), speakers have a duty to avoid statements that are informationally weaker than their knowledge of the world allows (the Q-Principle; Levinson, 2000; see also Asher & Lascarides, 2003). Thus, it follows that the use of spatial expressions in which converseness should apply but does not (such as scenes where the LO is in a non-canonical orientation) may require extra computation in selecting (or apprehending) the correct spatial relation that expresses the spatial array.
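The converseness inference described above can be made concrete with a small sketch. The lookup table and the function name are illustrative assumptions, and the relation labels are simply one way of writing the directional opposite pairs mentioned in the text; the sketch is not a model of how speakers compute the inference.

```python
# Directional opposite pairs named in the text, written as relation labels.
# The table itself is an illustrative assumption.
CONVERSE = {
    "above": "below",
    "below": "above",
    "in front of": "behind",
    "behind": "in front of",
    "north of": "south of",
    "south of": "north of",
}


def converse(relation, located, reference):
    """Map R(X, Y) to its converse relation with the arguments swapped."""
    return (CONVERSE[relation], reference, located)


# If X is above Y, then Y is below X.
print(converse("above", "X", "Y"))
```

Applying the function twice returns the original description, which reflects the symmetry of the pairing between the two poles.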

4.5. Further accounts

Longer RTs could be explained in terms of unusualness. Given that spatial concepts arise from experience (Evans, 2010), changes in latencies can be explained in terms of plausibility, which mirrors situations in which the LO is upside down (e.g., non-canonical orientations for above/below trials). However, effects of the LO's orientation were also found for the on the left/on the right and in front of/behind conditions, which present objects in their usual orientations. Therefore, the explanation based on unusualness can be discounted.

The different effects of the LO's orientation across the two stages of language production (UET vs. UPT) are interesting with regard to which elements of spatial language are planned before speaking. According to Ferreira and Swets (2002), people plan elements of a sentence after they begin to speak. The absence of an LO orientation effect in UPT extends Ferreira and Swets' view by showing that the processing of the geometrical features of the objects being described occurs after the spatial description is programmed.

In summary, the finding that the orientation of the LO has an effect on spatial language suggests that spatial language carries information that goes beyond the simple location of objects. We have shown that this has important consequences that are not limited to the language apprehension and production domain but that also involve spatial cognition and spatial representation.


We are grateful to Pia Knoeferle for helpful comments on an earlier draft of this manuscript. We also thank Michela Savoca and Marco Bonesini for their help in collecting data. We are also indebted to three anonymous reviewers and Jim Magnuson for their excellent criticisms and advice, which significantly improved this manuscript.


  1. We replicated the analysis using different RT trimming criteria (2.5 and 3 SD), and the results concerning the effect of the LO did not change.

  2. We replicated the analysis using different RT trimming criteria (2.5 and 3 SD), and the results concerning the effect of the LO did not change.

Appendix A

The instruction for Experiment 2 started with the following:

“Tieni presente che nel comunicare la posizione di un oggetto ci possono essere vari modi. Vanno tutti bene; non c'è una forma ‘sbagliata'. Cerca solo di fornire la descrizione migliore, quella che secondo te può essere compresa facilmente da un ipotetico ascoltatore e che identifichi chiaramente la posizione dell'oggetto target. Le tue risposte verranno registrate ma non avere fretta di rispondere. Prima di iniziare ti verranno mostrati gli stimoli e i nomi che dovrai usare nelle descrizioni.”

(tr. “Keep in mind that there can be several ways to communicate the position of an object. All of them are fine; there is no ‘wrong' form. Just try to provide the best description, the one that in your opinion can be easily understood by a hypothetical listener and that clearly identifies the position of the target object. Your responses will be recorded, but do not rush to answer. Before starting, you will be shown the stimuli and the names that you must use in your descriptions.”)

After this instruction we presented an example scene explaining which object location should be described.