How do individuals with Williams syndrome learn a route in a real-world environment?

Individuals with Williams syndrome (WS) show a specific deficit in visuo-spatial abilities. This finding, however, derives mainly from performance on small-scale laboratory-based tasks. This study investigated large-scale route learning in individuals with WS and two matched control groups (moderate learning difficulty group [MLD], typically developing group [TD]). In a non-labelling and a labelling (verbal information provided along the route) condition, participants were guided along one of two unfamiliar 1-km routes with 20 junctions, and then retraced the route themselves (two trials). The WS participants performed less well than the other groups, but given verbal information and repeated experience they learnt nearly all of the turns along the route. The extent of improvement in route knowledge (correct turns) in WS was comparable to that of the control groups. Relational knowledge (correctly identifying spatial relationships between landmarks), compared with the TD group, remained poor for both the WS and the MLD group. Assessment of the relationship between performance on the large-scale route-learning task and that on three small-scale tasks (maze learning, perspective taking, map use) showed no relationship for the TD controls, and only a few non-specific associations in the MLD and WS groups.

We investigated the visuo-spatial abilities of people with WS by asking them to learn a 1-km route through a university campus. Learning a route involves visuo-spatial abilities, including perspective taking, encoding relationships between landmarks, and the sequence of turns along a route, and is part of developing an overall cognitive representation of an area (Siegel & White, 1975). We will refer to the route-learning task as a 'large-scale' spatial task, because it took place in a real environment. There has not been any previous research into the development of large-scale knowledge in WS, and prior to the present research nothing was known about how people with WS learn an unfamiliar landscape. Before we began the research we collected anecdotal reports from parents/guardians of people with WS. These revealed a common belief that people with WS often get lost or disorientated in unfamiliar places. If this is the case, people with WS should be less efficient than typically developing (TD) people in learning new environments. In support of this, fMRI and MRI investigation in WS has shown atypical metabolism in the hippocampus (Meyer-Lindenberg et al., 2005), an area implicated in large-scale route learning. This research has implications for spatial theories (discussed below). It also has practical implications, because identifying the type of deficits experienced by people with WS in large-scale tasks could lead to remedial interventions designed to improve wayfinding abilities.
In contrast to the case for large-scale tasks, there are many 'small-scale' tests of spatial ability (Freundschuh, 2000). Such tasks are carried out in small spaces (e.g. in a laboratory, in a model layout, on a table top). Previous research with participants with WS has focused on their ability to perform small-scale spatial tasks (e.g. Landau & Hoffman, 2007). The research with small-scale tasks has shown that, despite an overall impaired level of ability, WS participants have relative strengths and weaknesses in the spatial domain (Farran & Jarrold, 2003). Neuro-anatomically, it was hypothesized that this reflects impaired dorsal stream functions (object localization, perception for action) relative to ventral stream functions (object and face recognition) (Atkinson et al., 1997). More recently, empirical evidence has pointed towards a further fractionation within the dorsal stream, as some dorsal functions are more impaired than others (Atkinson et al., 2006; Farran & Jarrold, 2004; Jordan, Reiss, Hoffman & Landau, 2002). In support of this, Meyer-Lindenberg et al. (2004) showed reduced dorsal stream activation for location coding in WS, relative to control participants. They relate this to a reduction in grey matter at the dorsal occipitoparietal sulcus/vertical part of the intraparietal sulcus, but emphasize that this would not impact all dorsal functions.
The deficit within the visuo-spatial domain, albeit measured by small-scale tasks, coupled with the cortical evidence, suggests that large-scale route-learning performance, as part of the visuo-spatial domain, should also be impaired. We were also interested in whether performance on small-scale tasks holds any predictive value for performance on large-scale tasks. Below, we give examples of how impaired spatial abilities in WS might be detrimental to the learning of large spaces. These examples emphasize the importance of testing empirically whether one can assume a commonality between the spatial abilities required for a small-scale versus a large-scale task, and whether small-scale task performance extrapolates to large-scale abilities. Until now, this had not been assessed in WS, and the validity of such an assumption was unknown.
Performance on construction and drawing tasks represents a relative weakness in WS, compared to performance on purely perceptual tasks (Farran & Jarrold, 2001). The pattern of performance on construction and drawing tasks shows a lack of global organization, with more attention given to the details than to the cohesive image (i.e. a local bias). If one were to assume that this related to large-scale ability, such a bias might be detrimental when people with WS need to develop knowledge of a large area, because such knowledge depends on organizing partial perceptual views of an environment into a coherent cognitive representation of the whole area (Kitchin & Blades, 2002).
Nardini, Atkinson, Braddick and Burgess (2008) reported a poor ability to use landmarks to find a target in a small-scale spatial array. This was more pronounced when participants had to rely on the spatial configuration of the array (intrinsic frame of reference) than it was when they could determine location relative to the position of their body (body frame of reference). If poor use of landmarks in a small-scale task relates to large-scale environments, this suggests that individuals with WS might find it difficult both to learn a sequence of turns and to encode the spatial relationship between landmarks, with greater impairment on the latter. Such deficits would seriously impair their cognitive representation of the environment.
However, the deficits in small-scale spatial abilities in WS do not necessarily mean that this group has deficits in large-scale spatial tasks. There is much evidence to suggest that spatial ability is not a single construct, but is composed of numerous mechanisms (Allen, Kirasic, Dobson, Long & Beck, 1996). Despite this, there is little agreement as to what these independent constructs are (Quaiser-Pohl, Lehmann & Eid, 2004). Hegarty & Waller (2005) reviewed most investigations of the relationship between spatial abilities at different scales and found that there was little or no relationship between performance on small-scale and large-scale tasks for TD adults. This has also been shown for TD children (Quaiser-Pohl et al., 2002). This suggests that these two types of task rely on different spatial mechanisms, and thus this distinction might also apply to WS.
To our knowledge, there are no comparisons of brain activation in the performance of small-scale versus large-scale spatial tasks. However, dorsal stream activation features strongly in reports of small-scale task performance (e.g. Han, Song, Ding, Yund & Woods, 2001), whereas the hippocampus shows key activation in large-scale tasks (e.g. Hartley, Maguire, Spiers & Burgess, 2003). The hippocampus receives input from the dorsal stream, and thus, although they are not fully independent, it appears that large-scale spatial abilities implicate distinct cortical areas, relative to small-scale spatial abilities. In relation to WS, both dorsal and hippocampal activation are atypical (Meyer-Lindenberg et al., 2004, 2005). Like other researchers, Allen et al. (1996) did not find any direct relationship between a set of small-scale psychometric tests and large-scale spatial performance (learning a route through a city) with TD adults, but they did find an indirect relationship. They found that two types of small-scale spatial tasks (maze learning, perspective taking) were related both to the psychometric measures and to large-scale environmental learning. Maze learning was related to route-learning measures, and perspective taking was related to measures of relational knowledge.
Given Allen et al.'s (1996) findings, we included measures of maze learning and perspective taking in our study. The maze task was designed for young children (Gathercole & Pickering, 2000) and so was simpler than, but similar to, the one used by Allen et al. (1996). Pilot work with a group of people with WS (who did not take part in the present experiment) indicated that a perspective-taking task like the one used by Allen et al. (1996) for TD adults was too difficult for WS participants. We therefore used a perspective-taking task designed for young TD children (Massangkay et al., 1974). Given the relationship found by Allen et al. (1996) between small-scale (maze learning, perspective taking) and large-scale (route walking) learning, we expected the same relationship to apply to the performance of participants in our study.
We also considered how the ability to learn a route might be improved in WS, using two facilitation techniques. The first was a verbal labelling strategy. In one condition, participants were given verbal information about the route while they experienced it for the first time. Verbal cognition is a relative strength in WS, so we reasoned that it might be used to scaffold performance in relatively weaker areas of cognition. Farran, Jarrold and Gathercole (1999) investigated WS performance on the Performance subtests of the Wechsler Intelligence Scale for Children III (WISC III; Wechsler, 1992). They demonstrated that the WISC III subtests sharing variance that was uniquely associated with performance on the British Picture Vocabulary Scale (Dunn, Dunn, Whetton & Pintilie, 1982), a verbal measure, represented higher levels of visuo-spatial performance. In contrast, the subtests sharing variance that was uniquely associated with performance on Raven's Coloured Progressive Matrices (Raven, 1993), a measure of non-verbal ability, represented lower levels of visuo-spatial ability. This demonstrated that, in WS, allowing increased input from verbal cognition on a non-verbal task is a beneficial strategy for elevating performance.
The above interaction between verbal and non-verbal ability appears to be bi-directional: some aspects of spatial language are relatively impaired within the verbal domain in WS (Laing & Jarrold, 2007; Landau & Hoffman, 2005; Lukács, Pléh & Racsmány, 2007; Phillips et al., 2004). Laing & Jarrold (2007) demonstrated that when spatial comparisons relied on spatial models (whether the blue or red animal is physically bigger on the page) participants with WS showed a significant deficit, relative to when comparisons relied on semantic knowledge (whether a bear or snail is bigger). Lukács et al. (2007) showed a similar differentiation between performance on spatial language tasks that activate spatial models and that on those that do not require on-line spatial analysis. In our labelling condition, the information about the route involved pointing out objects along the route within their spatial context using spatial terms such as 'next to' and 'past'. As individuals with WS can understand the semantics of such terms, we anticipated that any facilitation in the WS group would not be inhibited by the use of spatial language. Farran et al. (1999) did not explicitly encourage verbal strategies for task completion, and therefore people with WS may use such strategies spontaneously. In the present study, it is therefore possible that introducing a verbal strategy for route learning might not have a facilitatory effect on performance in the WS group, as they may already be using a verbal strategy. Any improvement in performance was therefore considered in comparison to a group of individuals with moderate learning difficulties (MLD) and a group of TD individuals. We predicted that the relative magnitude of any facilitatory effect of explicitly encouraging a verbal strategy when learning the route would determine the extent to which each group spontaneously uses verbal strategies. That is, if verbal strategies are already in place, little or no facilitation should be evident, but if verbal strategies are not spontaneously used, facilitation effects should be observed.
The second facilitation technique was the use of repetition. We asked participants to re-trace the route twice; that is, after experiencing the route and retracing it once, they were asked to retrace it a second time. Repeated experience of routes by TD individuals usually results in rapid learning (Kitchin & Blades, 2001) and therefore we expected the TD controls to be at or near ceiling on retrace 2. If the participants with WS also benefited from repeated experience we expected them to perform better on retrace 2 than on retrace 1.
However, route recall depends on long-term memory, which is poor in WS. This has been shown for both verbal and visuo-spatial long-term memory tasks, although the relative performance on verbal and visuo-spatial tasks shows mixed patterns of ability (Vicari, Brizzolara, Carlesimo, Pezzini & Volterra, 1996; Jarrold et al., 2007; Brock, Brown & Boucher, 2006). Vicari et al. (1996) showed a relative deficit in a visuo-spatial memory task (Rey Figures) compared to a verbal memory task (word-list learning). In contrast, Jarrold et al. (2007) found similarly poor visual and verbal memory, using the doors and people task (Baddeley, Emslie & Nimmo-Smith, 1994). As route learning is a long-term memory task, we predicted that the performance of the WS participants would be poorer than the performance of the TD controls, but that immediate repetition might be a useful strategy for improving their large-scale route learning.
We also included a spatial representation task based on Blades & Cooke (1994): participants used a map of a room to find a target location in the room. We wanted to find out if participants with WS could apply information gained from a small-scale representation (the map) to the large space (the room) that it represented. We reasoned that, if participants with WS demonstrated an ability to use representations, map using might be another way to facilitate environmental learning in WS.
In summary, the present study investigated large-scale spatial knowledge in WS, an issue that has not been investigated before. Both stages of large-scale spatial knowledge development (route learning, relational knowledge) were measured. Control groups of TD and MLD participants were included. We note that, although there have been many studies of real-world spatial development in the typical population, MLD participants have not been studied in large real environments before. We considered the effects of providing participants with verbal information about the route while they were learning it, and also the effects of practice on those participants who retraced the route more than once. We also assessed participants' performance on two small-scale tasks, which, based on previous findings (Allen et al., 1996), we expected to correlate with performance on the large-scale task. Finally, we included a map task to determine whether participants could transfer information learnt from small-scale representation to a real room. We have made some predictions, based on our assumptions about how the abilities and limitations of people with WS might apply in a large-scale learning task.

Participants
Twenty participants with WS were recruited through the Williams Syndrome Foundation, UK. All participants had received a positive diagnosis of WS based on phenotypic and genetic information. Genetic diagnosis was based on a fluorescent in situ hybridization (FISH) test. The FISH test identifies the deletion of elastin on the long arm of chromosome 7, which occurs in approximately 95% of individuals with WS (Lenhoff, Wang, Greenberg & Bellugi, 1997).
Two control groups were included: a group of participants with non-specific moderate learning difficulties (MLD) and a group of typically developing (TD) participants. Full Scale IQ (FSIQ) was determined using the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler, 1999). The WASI consists of four subtests: block design and matrix reasoning as Performance measures; and vocabulary and similarities as Verbal measures. The FSIQs of the WS and MLD participants were calculated from their performance on all four subtests. The FSIQs of the TD participants were calculated from their performance on the vocabulary and matrix reasoning subtests.
Each participant with WS was individually matched to a participant with MLD and to a TD participant. WS and MLD participants were matched on the basis of chronological age (CA) and Performance IQ. WS and TD participants were matched by CA. Participant details are shown in Table 1.
In Table 1, the WS group do not appear to show the characteristic discrepancy of higher Verbal IQ than Performance IQ. This is due to two related factors. First, this discrepancy typically emerges with development (Jarrold, Baddeley & Hewes, 1998a) and the participants included here had not reached the end stage of development (14 of the WS group were under 18 years). Second, many of the WS group scored at floor on one or more of the four WASI subtests, which could mask any VIQ-PIQ discrepancies (an effect also observed by Arnold et al., 1985; Udwin et al., 1986; Pagon et al., 1987). To explore this, we calculated the VIQ-PIQ discrepancy in the six individuals with WS who did not show floor effects on either of the two VIQ subtests (three of these WS participants showed floor effects on one of the PIQ subtests, however, thus potentially still constraining any discrepancy). The VIQ-PIQ discrepancies for these individuals were: 14, -11, 15, 1, 16 and 11, respectively. Thus, those who had reached a later stage of development, and were not affected by floor effects, overall showed the VIQ-PIQ discrepancy characteristic of WS.

Route walking
Participants were guided by an experimenter along two different routes (A and B) on the campus of Reading University. None of the participants had had any previous experience of either of the routes. Each route was 1 km long and each included 20 'choice points'. A choice point was a junction along the route where participants had to decide whether to turn left, turn right, or walk straight ahead. Each of the routes also included four landmarks that were unique places along the route (e.g. red door, large green bin, bench). One landmark was at the start of each route, and the other three landmarks were distributed along the route. Each route was along paths between large university buildings so that participants had only limited views ahead. There were no unique, salient landmarks such as high buildings or other features that could have served as distant landmarks (although we cannot rule out the sun being used as a directional landmark), and we tested participants during the summer when trees and bushes were all in full leaf and thus further limited views across the campus.
A script was used by the experimenter while leading the participant along each route. To make the task as clear as possible, participants were told to think of the experimenter as a bus driver who was taking them along the route and that later it would be their turn to drive the bus (i.e. lead the way) from the start to the end of the route. As participants walked each route the four landmarks were pointed out by name (e.g. red door). The landmarks were described as 'bus stops', and participants were told to remember where the bus stops were so that they could stop the bus when it was their turn to be the driver. There were two conditions. In the non-labelling condition, the remainder of the script was not descriptive of the route, and as the participants were guided along the route they were given only non-specific instructions like 'this way now'. In the labelling condition, participants were given instructions that included directional information and information about features along the route (e.g. '… down this little path and past the fish pond').
After a participant had been guided around the route once by the experimenter, s/he was asked to take a turn as the 'bus driver', and retrace the route leading the experimenter. Participants' turns at each choice point were noted, and the number of correct turns (maximum: 20) was used as a measure of route knowledge. If a participant made an incorrect turn at a choice point this was recorded as incorrect, and the experimenter led them back to the junction and asked them to make another choice. Only participants' first choice of turn at each junction was scored.
When retracing the route, participants were asked to stop at each of the four landmarks ('bus stops'), from where they were asked to point to the other three landmarks (a total of 12 pointing choices while retracing a route). The four landmarks were never directly visible from each other. A participant indicated the position of a landmark by using a 'camera-gun', which recorded the pointing direction as a still photograph. The participant pointed the camera-gun in what they thought was the correct direction of the target landmark, and the experimenter took a photograph using a remote switch. The photographs were used to calculate the difference (in degrees) between a participant's pointing estimate and the correct direction. This difference was used as a measure of relational knowledge.
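The pointing-error measure described above can be illustrated with a minimal sketch. It assumes that both the pointing estimate and the correct direction are expressed as compass-style bearings in degrees; the function names are hypothetical and are not taken from the study's own analysis.

```python
def pointing_error(estimate_deg: float, correct_deg: float) -> float:
    """Absolute angular difference in degrees, folded into the range 0-180."""
    diff = abs(estimate_deg - correct_deg) % 360.0
    # The shorter way round the circle is the error, so 340 degrees becomes 20.
    return 360.0 - diff if diff > 180.0 else diff


def mean_pointing_error(estimates, targets):
    """Mean error over a set of pointing estimates (12 per retrace in the task)."""
    errors = [pointing_error(e, t) for e, t in zip(estimates, targets)]
    return sum(errors) / len(errors)


# Hypothetical bearings, for illustration only.
print(pointing_error(350.0, 10.0))                       # 20.0
print(mean_pointing_error([350.0, 90.0], [10.0, 45.0]))  # 32.5
```

Folding the difference into 0-180° means that pointing in exactly the opposite direction gives the maximum possible error.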
The non-labelling condition was always carried out first (if the labelling condition were carried out first, this might influence the way that participants approached the task in a later non-labelling condition). The two routes, A and B, were counterbalanced so that half of the participants walked route A in the non-labelling condition and then route B in the labelling condition. The other half walked route B (non-labelling condition) and then route A (labelling condition). The two conditions were administered on different days.
After retracing the route once (in each condition) participants were asked to retrace the route a second time. We did this to assess the effect of repeated experience on route learning. Several participants declined to retrace the route a further time. As participants had already walked 2 km (guided experience and first retracing) we did not expect all participants to be willing to walk a further 1 km to complete retrace 2.

Map task
Participants viewed a 2 m × 2 m room, marked out by a cotton sheet on the floor of a laboratory. Four items were placed in the marked-out room. These were two unique items (blue desk, green chair) and two non-unique items (identical red boxes). A coloured map (15 cm × 15 cm) of the area was drawn to match the layout and included symbols of the objects in the room (see Figure 1).
Participants were tested using a procedure based on Blades and Cooke (1994). For each trial, a small toy was hidden out of sight under one of the four objects by the experimenter. While the experimenter was doing this, the participant could not see the room or the hiding places. After the toy had been hidden, the participant was shown the map. A yellow sticker was placed on the appropriate symbol on the map to indicate the location of the toy, and the experimenter also pointed to the correct symbol on the map. After seeing the map, the participant was asked to go and find the toy in the room.
There were two conditions: a non-rotated condition, when the map was correctly orientated to the room; and a rotated condition, when the map was rotated 180° in relation to the room, with 12 trials per condition. In each condition the toy was hidden three times under each of the four items in the room (i.e. six trials at unique hiding places, six trials at identical hiding places). The trials were presented randomly, except that in the first trial of the experiment the toy was hidden under one of the unique items. This was because young participants are usually more accurate at using maps to find hidden objects at unique places (Blades & Cooke, 1994) and we wanted to maximize the chance of success in the first trial.
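The trial schedule described above (three hides at each of four items, random order, with a unique item on the very first trial) could be generated along the following lines. This is a sketch under stated assumptions: the labels for the two identical boxes are hypothetical, and redrawing the shuffle until the constraint is met is one possible method, not necessarily the one the experimenters used.

```python
import random

UNIQUE = ["blue desk", "green chair"]
IDENTICAL = ["red box 1", "red box 2"]  # hypothetical labels for the identical boxes


def make_trials(rng, first_must_be_unique=False):
    """Return 12 hiding places for one condition, three per item, in random order."""
    trials = [item for item in UNIQUE + IDENTICAL for _ in range(3)]
    rng.shuffle(trials)
    # Redraw until the first trial falls on a unique item, if required.
    while first_must_be_unique and trials[0] not in UNIQUE:
        rng.shuffle(trials)
    return trials


rng = random.Random(1)  # fixed seed so that the schedule is reproducible
first_block = make_trials(rng, first_must_be_unique=True)
second_block = make_trials(rng)
```

Only the block administered first needs the constraint, because the restriction applied to the first trial of the experiment rather than to each condition.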

Perspective-taking task
Two perspective-taking tasks were carried out. For both tasks, participants sat at a table, opposite the experimenter, and were presented with stimulus cards (21 cm × 21 cm).
In the Level 1 perspective-taking task, based on Flavell et al. (1981), participants were shown three stimulus cards, which had one item drawn on each side (apple/orange, car/boat, cat/dog). Participants were first asked to name the items on the card (e.g. 'apple' and 'orange'). The card was then held vertically between the experimenter and the participant, and the participant was asked, for example, 'do you see the apple or the orange', followed by 'do I see the apple or the orange'. The card was then reversed and the questions were asked again, but the order in which the objects were named was reversed. This was repeated for each stimulus card. The order of the cards and the sides shown were counterbalanced.
A Level 2 perspective-taking task was based on Massangkay et al. (1974). Participants viewed three single-sided stimulus cards, each depicting a picture that had a distinct right-way-up (turtle, horse, table). After checking that a participant could understand and use the phrases 'right-way-up' and 'upside-down' appropriately, the participant was first shown a card so that the picture on it was 'right-way-up' from the participant's point of view and they were asked if they saw the object the right-way-up or upside-down (e.g. 'do you see the turtle the right-way-up or upside-down?'). The participant was then asked whether the experimenter saw the picture the right-way-up or upside-down (e.g. 'do I see the turtle the right-way-up or upside-down?'). The experimenter then rotated the picture so that the object was now upside-down from the participant's perspective and then repeated the questions. The order of presentation of pictures and the positions of the pictures (right-way-up or upside-down) were counterbalanced.
For both the Level 1 and the Level 2 perspective-taking tasks, 12 questions were asked (6 participant viewpoint, 6 experimenter viewpoint). Scores were awarded for the experimenter viewpoint trials only, because none of the participants had any difficulty describing their own views of the stimuli.

Maze task
The maze task, based on Gathercole and Pickering (2000), had two conditions, static and dynamic (see Figure 2). Each maze consisted of a stick-man in the centre, surrounded by two (level one) to six (level five) 'walls' around the figure, and each wall had two entrance points. Entrance points in the walls were arranged so that they were positioned either on opposite sides, or on adjacent sides, alternately. Participants were asked to remember a route from the outside of the maze to the man in the centre. They were shown the correct route by the experimenter and then asked to draw it from memory on a blank maze. In the static condition the correct route was shown as a red line drawn on a target maze, which was removed after a 3-second exposure. For the dynamic condition the researcher traced the correct route onto a blank maze. There were four trials at each of the five levels, and a participant progressed to the next level when at least two out of four trials at a level were completed correctly. Participants were scored for the total number of trials correct (maximum score per condition: 20).
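The progression and scoring rule described above can be sketched as follows. The sketch assumes that testing stops at the first level on which fewer than two of the four trials are correct (the text states the criterion for progressing but not an explicit stopping rule), and the function name is hypothetical.

```python
def maze_score(results_by_level):
    """Total number of correct trials for one condition (maximum: 20).

    results_by_level holds one list of four booleans per level, in order of
    increasing difficulty. Levels after a failed level are never administered.
    """
    total = 0
    for level_results in results_by_level:
        correct = sum(level_results)
        total += correct
        if correct < 2:  # criterion for progressing not met
            break
    return total


# Hypothetical participant: passes levels 1 and 2, fails level 3.
example = [
    [True, True, True, True],
    [True, True, False, False],
    [True, False, False, False],
    [True, True, True, True],  # not reached under the assumed stopping rule
]
print(maze_score(example))  # 7
```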

Route knowledge
Route knowledge was measured as the number of correct choices (maximum: 20) that a participant made during the first retracing of the route (see Figure 3). A two-way ANOVA was conducted on participants' route-knowledge score for the first retrace of each route, with a between-participant factor of group (WS, MLD, TD) and a within-participant factor of condition (non-labelling, labelling). This showed a main effect of group (F(2, 57) = 35.72, p < .001, ηp² = .56) as a result of superior performance in the TD controls compared to the MLD and WS groups (p < .05 for both), and higher scores for the MLD group than for the WS group (p = .02). There was also a main effect of condition (F(1, 57) = 23.39, p < .001, ηp² = .29) as a result of higher scores on the labelling condition than on the non-labelling condition. The interaction between condition and group was not significant (F(1, 57) = 2.40, p = .10, ηp² = .08).
To further investigate the facilitation effect of labelling the route, correlations between route knowledge and verbal and non-verbal ability were explored. All three participant groups completed the WASI vocabulary subtest and the WASI matrices subtest, and so the raw scores for these subtests were used as estimates of verbal and non-verbal ability, respectively. Significant correlations (two-tailed) were found only for the WS group: route knowledge was significantly or marginally positively associated with both verbal and non-verbal ability (verbal ability, non-labelling: r = .49, p = .03; labelling: r = .48, p = .03; non-verbal ability, non-labelling: r = .43, p = .06; labelling: r = .64, p = .003). For the TD and MLD groups there were no significant correlations (p > .05 for all).
The extent to which route-knowledge performance in each condition was specifically related to non-verbal skills and to verbal skills, respectively, was examined using partial correlations. The variance that the WASI matrices (raw scores) shared with the WASI vocabulary (raw scores) was partialled out from the total variance associated with the WASI vocabulary so that only the variance in level of ability that was uniquely associated with the WASI vocabulary remained. Similarly, the variance that the WASI vocabulary shared with the WASI matrices was partialled out from the total variance of the WASI matrices, leaving only the variance uniquely associated with the WASI matrices (i.e. non-verbal ability). The patterns of results were not as expected, because variance uniquely associated with non-verbal ability was associated with performance for the WS group in the labelling condition only, in which stronger non-verbal ability was associated with a higher route-knowledge score (r = .56, p = .01, two-tailed). There were no other significant relationships (p > .05 for all, two-tailed).
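One way to compute the kind of 'unique variance' association described above is to residualize one ability measure on the other and then correlate the residuals with the route-knowledge score. The sketch below follows that reading, in which shared variance is removed from the predictor only; this is an assumption about the procedure, not a reproduction of the authors' analysis, and all function names and data values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residualize(target, covariate):
    """Residuals of target after simple linear regression on covariate."""
    n = len(target)
    mt, mc = sum(target) / n, sum(covariate) / n
    slope = (sum((c - mc) * (t - mt) for c, t in zip(covariate, target))
             / sum((c - mc) ** 2 for c in covariate))
    return [(t - mt) - slope * (c - mc) for t, c in zip(target, covariate)]


def unique_association(outcome, predictor, covariate):
    """Correlate the outcome with the part of predictor not shared with covariate."""
    return pearson_r(residualize(predictor, covariate), outcome)


# Hypothetical raw scores, for illustration only.
matrices = [10, 12, 15, 18, 20, 22]
vocabulary = [20, 25, 24, 30, 33, 31]
route = [8, 9, 12, 13, 15, 17]
r_unique_nonverbal = unique_association(route, matrices, vocabulary)
```

Swapping the `predictor` and `covariate` arguments gives the corresponding value for variance uniquely associated with vocabulary.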
Learning was assessed by comparing performance on the first and second retrace of each route. A number of participants did not complete the second retrace of the route owing to fatigue. Where this occurred, rather than removing the participants who were matched to the missing participant, matching was considered to be at a group level (supported by independent t-tests). In the non-labelling condition, the numbers of participants who retraced the route twice were: WS, n = 13; MLD, n = 9; TD, n = 17. Independent t-tests showed that matching was adequate: WS and TD, CA, t(28) = 1.64, p = .11; WS and MLD, CA, t(20) = 0.70, p = .49, non-verbal ability, t(20) = -0.51, p = .61. In the labelling condition, the numbers of participants who retraced the route twice were: WS, n = 11; MLD, n = 11; TD, n = 9, and they were adequately matched at the group level (WS and TD, CA: t(18) = 1.06, p = .30; WS and MLD, CA: t(20) = 0.81, p = .43, non-verbal ability: t(20) = -0.81, p = .43).
Learning was assessed for the labelling and the non-labelling condition separately to make the most of the remaining statistical power (see Figure 4). Two ANOVAs were carried out on the number of correct turns (maximum: 20) with group as a between-participant factor (WS, MLD, TD) and learning as a within-participant factor (retrace 1, retrace 2). For both conditions, there was a main effect of group (non-labelling: F(2, 36) = 26.66, p < .001, ηp² = .60; labelling: F(2, 28) = 3.74, p = .04, ηp² = .21), which was the result of higher scores in the TD group than in the WS and MLD groups in both conditions (p < .05 for all).
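The effect sizes reported alongside each F ratio (partial eta squared) can be recovered from the F value and its degrees of freedom using the standard identity: effect size = (F × df effect) / (F × df effect + df error). The short sketch below applies it to the two group effects reported above; the function name is illustrative only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of group on correct turns, as reported in the text.
non_labelling = partial_eta_squared(26.66, 2, 36)
labelling = partial_eta_squared(3.74, 2, 28)

print(round(non_labelling, 2), round(labelling, 2))
```

Both values round to the reported effect sizes (.60 and .21).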

Relational knowledge
Relational knowledge was assessed by the pointing task, and accuracy was measured as the difference (in degrees) between a participant's pointing to a landmark and the actual direction of the landmark during the first retrace of the route. Measurements were made to within one degree. A second experimenter coded a random 25% of the data for each group. Analysis of inter-rater reliability by Cronbach's alpha showed high reliability, r 2 = 1.00, p < .001. The pointing measure was an error score, and therefore a lower score indicated better performance. The maximum possible error was 180°. One participant with WS did not produce any pointing data, and so the WS group had a maximum n of 19. For each walk of the route, participants made 12 pointing estimates, and the mean of these 12 estimates was calculated for each participant.
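As an illustration of the error score, the sketch below computes the absolute angular difference between a pointed direction and the true direction, wrapped so that it never exceeds 180°, and then averages over the estimates from one walk. It assumes directions are coded as compass-style bearings in degrees, which the text does not specify, and the bearings shown are hypothetical.

```python
def pointing_error(pointed_deg, actual_deg):
    """Absolute angular error in degrees, wrapped into the range 0 to 180."""
    diff = abs(pointed_deg - actual_deg) % 360
    return 360 - diff if diff > 180 else diff


def mean_pointing_error(pointed, actual):
    """Mean error across all pointing estimates from one walk of the route."""
    errors = [pointing_error(p, a) for p, a in zip(pointed, actual)]
    return sum(errors) / len(errors)


# Hypothetical bearings for 12 pointing estimates (not data from the study).
pointed = [10, 95, 200, 350, 40, 270, 180, 5, 120, 300, 60, 225]
actual = [30, 80, 170, 20, 45, 250, 90, 355, 150, 310, 100, 200]

participant_score = mean_pointing_error(pointed, actual)
```

The wrap-around matters: pointing to 350° when the landmark lies at 10° is an error of 20°, not 340°.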
The mean scores for the first retrace of the route were compared to chance performance of 90° (guessing would produce errors from 0° to 180°, with a mean of 90°). In both the non-labelling and the labelling condition, all three groups performed above chance (one-sample t-tests against chance, p < .001 for all). The mean scores were analysed by ANOVA, with a between-participant factor of group (WS, MLD, TD) and a within-participant factor of condition (non-labelling, labelling). There was a main effect of group (F(2, 56) = 60.21, p < .001, ηp² = .68). This was a result of the substantially higher accuracy for the TD controls, where the mean error was 36.41° (non-labelling) and 29.25° (labelling), than for the other groups, where the mean error was consistently above 65° (TD vs. MLD, WS: p < .05 for both; WS vs. MLD: F < 1). The main effect of condition was not significant (F < 1). The interaction between condition and group was not significant (F(2, 56) = 2.57, p = .09, ηp² = .08) (see Figure 5).
The effect of learning on relational knowledge was assessed by comparing pointing errors for the first and second retracings of the route (see Figure 6). As noted with reference to the route-knowledge analysis, not all participants retraced the route a second time. There was an additional loss of data from participants who walked the route a second time but failed to provide pointing data. Thus, these participants provided route-knowledge, but not relational-knowledge, data for the second retrace of the route. Participant numbers are therefore slightly lower than for the route-knowledge analysis (non-labelling condition: WS, n = 13; MLD, n = 7; TD, n = 17; labelling condition: WS, n = 11; MLD, n = 9; TD, n = 8), but still appropriately matched at a group level (p > .05 for all group comparisons). The analysis above indicated that performance was above chance for the first retrace of the route.
Mean scores for the second retrace of the route showed that the WS and TD groups performed above chance for both the non-labelling and the labelling condition (one-sample t-tests against chance, p < .05 for all), but the MLD group performed only marginally above chance in the non-labelling condition (p = .09) and at chance in the labelling condition (p = .26). Given that performance on the first retrace of the route indicated that all groups understood the task, and that mean performance did not reduce in this group from retrace 1 to retrace 2 (see Figure 6), this effect appears to be accounted for by a loss of power on account of reduced participant numbers. As such, the MLD group were not excluded from subsequent analyses. Two ANOVAs were conducted, with a between-participant factor of group (WS, MLD, TD) and a within-participant factor of learning (retrace 1, retrace 2). For both conditions, analysis showed a main effect of group (non-labelling, F(2, 34) = 27.08, p < .001, ηp² = .61; labelling, F(2, 25) = 20.57, p < .001, ηp² = .62) as a result of higher accuracy from the TD group than from the WS and MLD groups (p < .05 for all), but similar accuracy for the WS and MLD groups (p > .05 for both conditions). There was no main effect of learning in either condition (F < 1 for both). The interaction between group and learning was significant for the non-labelling condition only (non-labelling: F(2, 34) = 6.71, p = .003, ηp² = .28), which was because the TD group showed significant learning (F(1, 16) = 10.07, p = .01, ηp² = .39), but the WS group showed a marginal effect in the opposite direction (WS: F(1, 12) = 4.32, p = .06, ηp² = .27: retrace 1 < retrace 2), and the MLD group showed no evidence of learning (F < 1).

Map task
In both the aligned and the rotated conditions of the map task all three groups were at ceiling, as predicted, when the toy was hidden in one of the unique hiding places, and therefore we considered performance only when the toy was hidden in one of the identical hiding places.
There were six trials in the aligned condition and six trials in the rotated condition when the toy was hidden in an identical place. The mean scores are shown in Figure 7. The TD group was at ceiling for the identical hiding places in both conditions, and so analysis was conducted between the WS and MLD groups.
Although there were always four hiding places in each trial (two unique and two identical places), as participants were always correct when the toy was hidden at a unique place we adopted a conservative measure of chance performance (50%) for trials at the two identical hiding places. We therefore compared the performance of the WS and MLD groups against chance performance of three correct trials out of six. In the aligned condition both groups were better than chance (one-sample t-test against chance, WS, MLD: p < .001 for both). In the rotated condition the MLD group was better than chance (p = .001), but the WS group performed at chance (p = .27).
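The comparison against chance can be illustrated with a one-sample t statistic computed by hand: the group mean minus the chance value (three correct out of six), divided by the standard error of the mean. The scores below are hypothetical, and the sketch stops at the t statistic and its degrees of freedom rather than deriving a p value.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, chance):
    """Return the t statistic and degrees of freedom for a test against chance."""
    n = len(scores)
    standard_error = stdev(scores) / sqrt(n)
    return (mean(scores) - chance) / standard_error, n - 1


# Hypothetical numbers of correct trials (out of six) for eight participants.
aligned_scores = [6, 5, 6, 4, 5, 6, 3, 5]

t_value, df = one_sample_t(aligned_scores, chance=3)
```

A positive t indicates a group mean above the chance level of three correct trials; the corresponding p value would be read from a t distribution with the returned degrees of freedom.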
ANOVA was carried out with group (WS, MLD) as a between-participant factor and condition (aligned, rotated) as a within-participant factor. The main effect of group was marginal (F(1, 38) = 3.47, p = .07, ηp² = .08) as a result of stronger performance in the MLD group than in the WS group, although the chance performance of the WS group might have attenuated the effect. The main effect of condition was significant, because there was better performance in the aligned condition than in the rotated condition (F(1, 38) = 24.77, p < .001, ηp² = .40). The interaction between group and condition was not significant (F(1, 38) = 1.07, p = .31, ηp² = .03).

Perspective-taking task
For the Level 1 perspective-taking task performance was at ceiling for all three groups, and was therefore not analysed. For the Level 2 perspective-taking task, participants had to say whether a picture placed on the table between them and the experimenter was the right-way-up or upside-down from the point of view of the experimenter. Participants who guessed an answer would have been correct on 50% of the six trials. The Level 2 task produced ceiling effects for the TD controls, better than chance performance in the MLD group, and chance performance in the WS group (one-sample t-test compared to chance, MLD: p = .03; WS: p = .68) (see Figure 8). An independent t-test between the performance of the WS and the MLD groups showed that the MLD group had a marginally higher score (t(38) = 2.00, p = .053).

Maze task
The mean scores for the maze task are shown in Figure 9. All participants scored significantly above floor performance (WS static and dynamic conditions: p = .001 for both; TD and MLD static and dynamic conditions: p < .001 for all). Importantly, this indicated that, although the WS group had low scores, they did understand the task.
An ANOVA was conducted to compare the performance of the three groups in both conditions (static, dynamic). This revealed a main effect of group (F(2, 57) = 74.13, p < .001, ηp² = .72) as a result of higher TD than WS and MLD performance (p < .05 for both), and higher MLD than WS performance (p = .01). The effect of condition showed marginally higher performance in the dynamic condition than in the static condition (F(1, 57) = 3.59, p = .06, ηp² = .06). The interaction between condition and group was not significant, F < 1.

Correlational analyses
To determine whether performance on small-scale and large-scale tasks was related, correlations were carried out between route knowledge and relational knowledge on the route-walking task, and performance on the three small-scale tasks: the maze task, the perspective-taking task, and the map task. For the route-walking task, the non-labelling condition, retrace 1, was used because this condition provided the best reflection of how participants would perform in an everyday route-learning context. For the maze task, the sum of correct responses for the static and dynamic mazes was used. For perspective taking, only the level 2 task was included owing to ceiling performance on the level 1 task. For the map task, the sum of correct responses for the identical hiding places in the aligned and the rotated conditions was included. The only significant correlations (two-tailed) between the small-scale and large-scale tasks related to the maze task. For the MLD group, performance on the maze task correlated with large-scale route knowledge only (r = .50, p = .03). For the WS group, maze-task performance correlated with large-scale route knowledge (r = .49, p = .03) and with relational knowledge (r = -.49, p = .03) in the predicted directions. None of the other comparisons were significant (p > .05 for all).

Discussion
Individuals with WS were able to learn a route. Even at the first retracing of the route (non-labelling condition) the participants with WS recalled over half of the turns, and with practice and verbal support (the second retrace in the labelling condition) the participants with WS recalled, on average, all but about 2 of the 20 turns. This finding indicates that participants with WS have the potential to learn a new route through an unfamiliar environment, and can do so after quite limited exposure to the route, even though they may take longer than TD people to achieve optimal performance. We discuss the differences between the abilities of the three groups, but it should be borne in mind that these differences are differences between relatively good performances by all of the groups.
Despite good performance on the route-learning task, the participants with WS and MLD did much less well than the TD group on the measure of relational knowledge. This measure indicated how well an individual formed an understanding of the relative position of different places in the environment. With experience, in areas such as those used here, a TD adult will usually have an accurate awareness of all the spatial relationships between all of the places (Kitchin & Blades, 2001). Indeed, we found that the TD adults pointed accurately towards unseen places, with errors, overall, of only about 30°, and we also found that the TD group improved on the pointing measure with greater experience of the route. In contrast, both the other groups performed less well than the TD group, and had errors of between 70° and 80°. Although this was less than chance performance of 90° for retrace 1, it is high enough to demonstrate that the WS and MLD groups had quite an inaccurate understanding of the relationship between places in the environment. Indeed, for retrace 2, with reduced participant numbers, for the MLD group this level of error no longer differed from chance. This high level of error is consistent with evidence for poor spatial relation encoding on small-scale tasks in WS (Farran & Jarrold, 2005; Nardini et al., 2008). In contrast to the TD group, greater experience did not lead to improved relational knowledge for either the MLD or the WS group. Although the ability to learn a route always precedes (developmentally and temporally) the ability to encode spatial relationships in the environment (Blades, 1991; Siegel & White, 1975), the poor performance of the WS and MLD groups in the pointing task and their failure to improve with more experience suggests that the disjunction between learning a route and learning environmental relationships may be greater for WS and MLD participants than for TD participants.
This suggests that, although individuals with WS are able to learn a novel route, they do this by relying on learning a specific set of turns and landmarks. The reliance on route knowledge in WS is an important finding, as it suggests that, despite being able to learn a route, individuals with WS would not be able to deviate from that route to find a short cut or to make a detour. This observation holds also for individuals with MLD.
The dissociation in WS between route knowledge and relational knowledge could reflect a neural dissociation. In TD adults there are two systems: place learning is a flexible system, dependent on the hippocampus, which relies on relational knowledge and the building of a cognitive map, whereas the second, less flexible, system is an action-based system that relies on route knowledge and activates the caudate nucleus (Doeller, King & Burgess, 2008; Hartley et al., 2003). Indeed, Hartley et al. (2003) report that when TD adults were presented with an unfamiliar route, good navigators activated their hippocampus and hence their place-learning system, but poor navigators did not, and relied on their action-based system. It is therefore possible that the reliance on route knowledge in WS stems from atypical hippocampal functioning, as observed by Meyer-Lindenberg et al. (2005).
The above dissociation might also relate to the use of different frames of reference. Nardini et al. (2008) described a discrepancy in WS between the use of a body frame of reference and that of an intrinsic frame of reference. If individuals with WS use their relatively good body frame of reference to encode the turns along the route, then, coupled with correct landmark encoding, this would suggest an ability to acquire route knowledge in WS. In contrast, relational knowledge involves encoding the spatial relationships between landmarks from a number of perspectives. This involves an intrinsic frame of reference, a relative weakness in WS, and thus is consistent with the poor relational knowledge observed in this study. However, we must be cautious in suggesting that the variables measured in our study (large-scale variables) relate directly to those considered by Nardini et al. (2008) (small-scale variables), especially because we were unable to show a direct relationship between small-scale and large-scale performance.
An effect of labelling was found for route knowledge, but not for relational knowledge. It therefore appears that relational knowledge did not benefit from verbal coding in typical or atypical populations. For both measures, the pattern of performance of the WS group was comparable to that of the control groups. If individuals with WS spontaneously scaffold their poor non-verbal abilities with verbal coding, then either no effect or a reduced effect of labelling would have been observed in this group. The results suggest that: first, individuals with WS do not spontaneously use a verbal strategy when learning the turns along a route; and second, explicit instructions to encode visuo-spatial information using verbal cues are beneficial in WS. The level of benefit is akin to that of the typical population and to that of MLD individuals. This has a third implication, namely that the use of spatial terms did not inhibit verbal facilitation in WS.
Route knowledge was not specifically related to verbal or to non-verbal ability in any group; the performance of the WS group was related to both verbal and non-verbal ability, but no correlations were observed in the control groups. This lack of correlation might be explained by a strong memory component to the task. However, maze-task performance, a measure of visuo-spatial working memory (Gathercole & Pickering, 2000), only correlated with route-learning ability in the learning-difficulty groups. The contribution of memory could explain the lack of correlation with verbal or non-verbal ability in the MLD group, but not in the TD group. Some of the TD participants scored close to ceiling for route knowledge, which might explain the lack of correlations for this group. Farran et al. (1999) observed that, in WS, non-verbal tasks in which performance was relatively poor or relatively strong were associated with the variance uniquely related to non-verbal ability or to verbal ability, respectively. This was investigated in the present study with the prediction that route knowledge on the non-labelling and labelling conditions would be associated with variance uniquely associated with non-verbal and verbal ability, respectively. This prediction was not supported: the two control groups showed no unique associations, and the associations in the WS group were not as expected: route knowledge was uniquely associated with non-verbal ability in the labelling condition, but not in the non-labelling condition as predicted. This finding suggests that labelling enables individuals with WS to make better use of their non-verbal skills; perhaps labelling encourages these individuals to systematically visualize each part of the route, leading to better recall.
Participants led the experimenter around the route twice. Caution must be taken in interpreting analyses of learning, because participant numbers were reduced. The participants who did not complete the second retrace may have been those who found the task more challenging, tiring, or boring. If so, the remaining members of each group are a less representative sample of the population, because they were a self-selected subset of the original sample. Nevertheless, the results from the second retrace give an indication of the effects of repetition in WS. Repetition of the route improved route knowledge across all three groups. Thus, in addition to verbal coding, individuals with WS benefitted from repeated experience for the measure of route knowledge. In the labelling condition, the extent of learning on route knowledge in the WS group was similar to that of the control groups. In the non-labelling condition, the improvement in route knowledge in the WS group was comparable to that of the TD controls, whereas the MLD controls showed a relatively stronger effect of learning. As the TD controls were at ceiling for the second retrace of the route, it is possible that the effect of learning was constrained. Comparison between the MLD and WS group suggests that, without the aid of verbal coding, individuals with MLD looked for alternative strategies for learning the route, and that this comes to fruition by their second retrace of the route. In the labelling condition, this is less necessary, as the strategy of verbal coding has already been suggested, and so this strategy was used effectively across groups, with comparable learning outcomes. The WS group, however, did not appear to be seeking strategies in either condition, and so show similar learning across conditions, despite improved performance overall in the labelling condition.
In contrast to route knowledge, relational knowledge did not improve with repetition of the route, with the exception of the TD group in the non-labelling condition only. The absence of a learning effect might relate to the lack of feedback; errors in route knowledge were corrected to continue the route, but errors in relational knowledge were not corrected. It is possible that, without feedback, participants were less aware of their errors or were less able to improve their performance on the second retrace. It is likely that awareness of level of performance was stronger in the TD group than in the two learning-difficulty groups, owing to their higher IQ. If this is the case, the differentiation between the labelling and the non-labelling condition in the TD group might relate to differences in the ability to recall the four landmarks across conditions. Verbalizing the route might have made the four to-be-remembered landmarks less salient, owing to the numerous other potential landmarks that were pointed out. If so, it might have been easier for the TD group to show improved performance in the non-labelling condition, as the only landmarks that had been pointed out were the four landmarks used to assess relational knowledge. In sum, the pattern observed in the TD group could have been related to two factors: increased critical awareness of their own performance, and interference from additional landmarks in the labelling condition.
The patterns of performance on the small-scale tasks showed some similarities to large-scale abilities across the participant groups. In both the level 2 perspective-taking task and the identical-hiding-places conditions of the map task, WS performance was marginally weaker than that for the MLD controls. Both tasks include relational knowledge: in the perspective task the participant has to consider the spatial relationship between the experimenter's position and the picture on the table; and in the map task the hiding place could be identified only by relating the hiding place to some other feature in the room. As the difference between WS and MLD was only marginal for these tasks, one could argue that this pattern is similar to the pattern of relational knowledge in the large-scale task, in which WS performance was at the level of the MLD group. However, the WS group showed chance performance with identical hiding places in the rotated version of the map task and also had chance performance in the level 2 perspective-taking task, which could suggest a particular weakness in WS for these small-scale tasks. If this is the case, then the pattern of relational knowledge is different between small-scale and large-scale environments. Note that the measure of relational knowledge taken in the large-scale task was relatively more sensitive as it was a continuous variable, in contrast to the small-scale tasks, in which participants gave one of two possible answers. One could argue that, if a more sensitive measure of relational knowledge were taken on a small-scale task, then poor, but above-chance, performance might also be observed. This would not, however, predict that group comparisons would show patterns different from those observed here.

This last conclusion is supported by the lack of correlations between the measures of small-scale and large-scale relational knowledge in any of the participant groups. Our results, therefore, do not support Allen et al.'s (1996) claim that performance on small-scale perspective-taking tasks is related to performance in tasks involving large-scale relational knowledge.
Performance in maze tasks has been linked to performance in large-scale route-knowledge tasks (Allen et al., 1996). Performance on the maze task used in our study did not correlate with route knowledge for the TD group, but it did for the MLD and WS groups. Thus, the only support for Allen et al.'s (1996) suggestion that maze-task performance is related to route knowledge was from the groups with learning difficulties. For the WS group, performance on the maze task also correlated with relational knowledge, which was not predicted by Allen et al. (1996). Furthermore, the WS group showed particularly poor performance on the maze task, which contrasted to their route knowledge. As such, these findings did not provide evidence for a relationship between the small-scale maze-learning task and large-scale route knowledge.
Overall, in our study, performance on the small-scale tasks was not related to performance in the real-world environment. This finding is in line with studies of TD children (Quaiser-Pohl et al., 2002) and adults (Hegarty & Waller, 2005) that have also failed to find a relationship between small- and large-scale spatial performance. We have shown that the previous findings for TD children and adults also apply to WS and MLD groups. This supports the notion that small- and large-scale performances are two distinct areas of spatial ability, supported by independent mechanisms. Such mechanisms have not yet been agreed upon. However, Quaiser-Pohl et al. (2002) distinguish between 'large-scale tasks in which the observer is part of the environment and cannot see the whole space of interest from one point of view, and small-scale tasks… where the spatial relations of objects can be seen at once' (page 95) and where '…the movement of one's body position is not important' (page 104).
The lack of a relationship between small- and large-scale tasks for WS has an important implication for research into visuo-spatial cognition in WS. To date, this research is predominantly based on laboratory tasks, and, although such tasks are very valuable for extending our knowledge about WS abilities, any extrapolations from them to real-world everyday contexts should be treated with caution. Impairments in WS on small- and large-scale tasks do not appear to indicate a common deficit, and should be considered as independent deficits to two distinct mechanisms.
We found that the participants with WS performed at chance in the level 2 perspective-taking task, and in the rotated version of the map task when the hiding place was one of two identical targets. This demonstrates that these tasks were too difficult for the WS participants. Given that the WS group was matched to the MLD group by Performance IQ, this indicates specific weaknesses in the component factors measured by these tasks, within the visuo-spatial domain. Individuals with WS could complete the non-rotated condition of the map task with identical hiding places, which involved relational knowledge. Successful completion in the rotated condition is additionally dependent on the ability to rotate the map mentally. Poor performance in this condition seems to relate to poor mental rotation abilities. This is consistent with previous findings that have reported poor mental rotation in WS (Farran & Jarrold, 2004). We also found that level 1 perspective taking was successfully completed by the WS group, but that level 2 perspective taking was not. This task is dependent on a participant's ability to determine whether they or the experimenter see an object as the right-way-up or upside-down, and so also has a strong mental-rotation component. Poor performance was also observed on the maze task. This is best explained by impaired visuo-spatial working memory in WS (Jarrold, Baddeley & Hewes, 1998b; Vicari, Bellucci & Carlesimo, 2003). Jarrold et al. (1998b) demonstrated a Corsi span of between two and three spatial locations in WS, and therefore our maze task, which was visually more complex and involved remembering at least two 'entrances', may have been at the limit of working memory capacity for the WS participants.
The WS participants were successful in the map task when the toy was hidden in one of the two unique hiding places. This involved only matching (e.g. symbol of green chair on the map to green chair in the room), rather than spatial abilities. When the toy was hidden in one of the two identical places, participants had to disambiguate the correct red box. The WS participants could do this if the map was aligned with the room. This indicates that they appreciated that if the correct red box was, for example, on the left of the map then it was on the left in the room. However, when the map was rotated, participants needed to consider the spatial relationships within the layout (e.g. that the correct red box was opposite the chair) and WS participants were unable to do this. Therefore, the WS participants understood some aspects of using a map (symbol-to-place correspondence, and map-to-room directional correspondence, when aligned). However, they were poor at identifying the map-to-room spatial correspondences within the layout. The latter skill is a key aspect of everyday map using because maps include multiple identical symbols and are not usually aligned with an environment. Users must identify the correspondences between spatial patterns on the map and those same spatial patterns in the environment. Given our findings, we suggest that people with WS will have difficulty recognizing such spatial correspondences and that they would not benefit from using maps when learning new environments.
In summary, the results demonstrated that individuals with WS can learn a route through a natural environment, and that route knowledge could be improved by verbal coding of the route, and by walking it more than once. The extent of this improvement was comparable to the improvement in both CA-matched TD controls, and MLD controls matched by CA and Performance IQ. This ability does have some limitations, as the WS group did not show an understanding of the relationship between landmarks on the route. As this is a function of the hippocampus in typical development (e.g. Hartley et al., 2003), this is consistent with evidence for atypical hippocampal function in WS (Meyer-Lindenberg et al., 2005) and implies limited ability to deviate from a learnt route. It is not possible to determine from our study whether relational knowledge develops in WS as an environment becomes more familiar. Despite these difficulties with relational knowledge, the finding of relatively good route knowledge in WS has important practical implications because we demonstrated that, given some verbal support and practice, people with WS are capable of learning a complex real-world route. They can learn such routes successfully, despite major deficits in their performance in many small-scale spatial tasks (e.g. map and maze tasks) that, traditionally, have been linked to expertise in wayfinding.