The well‐worn route revisited: Striatal and hippocampal system contributions to familiar route navigation

Classic research has shown a division in the neuroanatomical structures that support flexible (e.g., short‐cutting) and habitual (e.g., familiar route following) navigational behavior, with hippocampal–caudate systems associated with the former and putamen systems with the latter. There is, however, disagreement about whether the neural structures involved in navigation process particular forms of spatial information, such as associations between constellations of cues forming a cognitive map, versus single landmark‐action associations, or alternatively, perform particular reinforcement learning algorithms that allow the use of different spatial strategies, so‐called model‐based (flexible) or model‐free (habitual) forms of learning. We sought to test these theories by asking participants (N = 24) to navigate within a virtual environment through a previously learned, 9‐junction route with distinctive landmarks at each junction while undergoing functional magnetic resonance imaging (fMRI). In a series of probe trials, we distinguished knowledge of individual landmark‐action associations along the route versus knowledge of the correct sequence of landmark‐action associations, either by having absent landmarks, or “out‐of‐sequence” landmarks. Under a map‐based perspective, sequence knowledge would not require hippocampal systems, because there are no constellations of cues available for cognitive map formation. Within a learning‐based model, however, responding based on knowledge of sequence would require hippocampal systems because prior context has to be utilized. We found that hippocampal–caudate systems were more active in probes requiring sequence knowledge, supporting the learning‐based model. However, we also found greater putamen activation in probes where navigation based purely on sequence memory could be planned, supporting models of putamen function that emphasize its role in action sequencing.


| INTRODUCTION
All mobile animals have evolved systems for maintaining their orientation with respect to known locations as they navigate their environment in search of food, shelter, and conspecifics. In a seminal publication, O'Keefe and Nadel (1978) proposed the existence of two fundamental types of spatial learning and memory in mammalian navigation systems, one for learning prescribed, familiar routes through the environment, and the other for flexible navigation, exemplified by reaching a hidden goal utilizing a novel route, with hippocampal systems underpinning the latter "cognitive map" (see review in Poulter et al., 2018).
Foundational work by White and McDonald (2002) has further developed parallel spatial memory systems theory, suggesting that following a prescribed route would fall under a family of spatial tasks that are underpinned by stimulus-response associations, subserved by cortico-striatal loops from the motor and sensorimotor cortex to the dorsolateral striatum (putamen in primates), where dopaminergic systems act as a reinforcement signal to modulate the strength of stimulus-response associations depending on reward history. The spatially relevant stimulus here is conceived of as a single landmark, or possibly a particular spatial view or snapshot of a scene (White & McDonald, 2002), and the response is a particular body orientation at a choice point along the route (e.g., turning left or right based on an egocentric representation of space). In contrast, hippocampal memory systems are thought to underpin incidental learning of relations among stimuli, including allocentric map-like representations that are independent of the navigator's viewpoint, and knowledge of these relations can be used as part of a cognitive control loop involving the neocortex, hippocampus, and dorsomedial striatum (caudate in primates), modulated by dopaminergic systems. The cognitive control loop is thought to subserve fast and flexible learning and decision-making, whereas the sensorimotor loop is thought to subserve slower, habitual, stimulus-driven behavioral sequences, more resistant to changes in reward contingencies (White & McDonald, 2002; Yin & Knowlton, 2006).
Extensive evidence for this parallel spatial memory systems theory has come from neurophysiological, pharmacological, and lesion studies in rodents (reviews in Devan et al., 2011; Goodman, 2021), as well as imaging work on humans linking hippocampal function with cognitive mapping (Anggraini et al., 2018; Cona & Scarpazza, 2019; Iaria et al., 2003; Igloi et al., 2010; Marchette et al., 2011; Wegman et al., 2014; Woolley et al., 2013). Early fMRI studies also appeared to support a division between familiar route versus flexible short-cutting behavior in virtual environments (VEs) in terms of striatal versus hippocampal activation, respectively, although striatal activation was more centered on caudate as opposed to putamen in these studies (Hartley et al., 2003; Iaria et al., 2003).
One limitation of the current evidence base is that it relies on paradigms such as the classic cross-maze (reviews in Goodman, 2021; Packard & Goodman, 2013), where a single decision point occurs. The organism either selects a response based on a previously learned allocentric place, as defined by constellations of room cues (requiring an intact hippocampal system), or can repeat a habitual egocentrically defined body turn (requiring an intact dorsolateral striatal system). However, Rondi-Reig et al. (2006) suggested that when a familiar route involves many junctions (a form of navigation they termed sequential-egocentric), there may be involvement from hippocampal systems to help the organism maintain a sense of where it is in the sequence of junctions, particularly if junctions resemble one another. Consistent with the proposal, Rondi-Reig et al. observed that knock-out mice with NMDA receptor damage to hippocampal layer CA1 displayed a specific pattern of behavior when learning a 3-junction route with similar-looking junctions. These knock-out mice performed above chance at the first junction, turning left, but then performed at chance levels at the subsequent two junctions. On the basis of these findings, the authors argued that striatal systems are able to subserve the learning of the stimulus-response pairing at the first junction, but that hippocampal impairment meant the mice were unable to recognize where they were in the sequence of junctions following the start of the maze, preventing consistent learning of further stimulus-response pairings.
In keeping with evidence from rodent studies, research utilizing fMRI with human participants has confirmed the role of the hippocampus, in collaboration with the caudate nucleus, when disambiguating familiar routes that have overlapping elements relative to unique routes that do not overlap with other learned routes (Brown et al., 2010, 2012; Goodroe et al., 2018; He et al., 2022). When navigating overlapping routes, to turn left when following one route but right at the same junction when following a second route, it is necessary to respond based on memory for previous elements in the trajectory, a function posited to depend on hippocampal systems (Brown et al., 2010, 2012, 2016; Davachi & DuBrow, 2015; Foster & Wilson, 2006). Igloi et al. (2010) also found greater hippocampal activation, relative to a control condition, when human participants were navigating a 3-junction VE maze modeled on that used by Rondi-Reig et al. (2006). It should be noted, however, that distal landmarks were present in the experimental condition, whereas they were absent in the control condition, so it is possible that hippocampal systems were activated due to spontaneous mapping processes as opposed to the need to maintain ongoing memory for position along the maze.
The results reviewed above concerning hippocampal involvement in the navigation of familiar routes involving sequences of junctions are hard to reconcile with theories that focus on the allocentric mapping functions of the hippocampus (Hartley et al., 2003; Iaria et al., 2003; O'Keefe & Nadel, 1978; White & McDonald, 2002). One manner in which hippocampal involvement in the navigation of familiar routes can be understood is within the theoretical framework offered by reinforcement learning (e.g., Bornstein & Daw, 2011).
Here, in relation to navigational behaviors, Khamassi and Humphries (2012) have suggested that what distinguishes hippocampal and striatal parallel spatial memory systems is the type of learning that each system engages in, specifically, the division between model-free and model-based learning. Model-free learning occurs when cue-action associations are strengthened by average reward history for that specific association, and model-based learning occurs when the organism is able to utilize information about the outcomes of chains of past actions to guide its current choices (Bornstein & Daw, 2011). Khamassi and Humphries (2012) argued that while formulations of parallel spatial memory systems theory (White & McDonald, 2002), based on stimulus-response associations to single cues versus flexible use of cognitive maps, appear to map onto model-free and model-based learning systems, respectively, the key difference between the theoretical proposals lies in the emphasis, which is more fully on the type of learning as opposed to mapping in their proposal.
Thus, under the proposals of reinforcement learning (Bornstein & Daw, 2011; Khamassi & Humphries, 2012), the results of Rondi-Reig et al. (2006) can be explained by the need to respond at the second or third junction of the maze contingent on memory for previous actions, a model-based decision requiring hippocampal involvement (see also Brown et al., 2010, 2012; He et al., 2022; Igloi et al., 2010). However, while reinforcement learning can offer an explanation of observations of hippocampal involvement in familiar route following, previous studies have not been designed to directly test these explanations.
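The contrast between responding on the basis of a single cue-action association and responding on the basis of prior context can be illustrated with a toy sketch. This is not the model of Khamassi and Humphries (2012) or of Bornstein and Daw (2011); it is a deliberately simplified illustration in which one value table is keyed on the current landmark only and a second is keyed on position in the sequence, which stands in for memory of preceding actions. The three-junction route, the learning rate, and all function names are hypothetical.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (arbitrary illustrative value)
ACTIONS = ("L", "R")


def update(q, key, action, reward, alpha=ALPHA):
    """Delta-rule update: move the stored value toward the obtained reward."""
    q[(key, action)] += alpha * (reward - q[(key, action)])


def choose(q, key):
    """Return the higher-valued action, or None when the values are tied."""
    left, right = q[(key, "L")], q[(key, "R")]
    if left == right:
        return None
    return "L" if left > right else "R"


# Hypothetical 3-junction route: (landmark, rewarded turn).
route = [("windmill", "L"), ("bench", "R"), ("sundial", "L")]

q_landmark = defaultdict(float)  # keyed on the current landmark only
q_sequence = defaultdict(float)  # keyed on position in the sequence

for _ in range(10):  # repeated traversals of the route
    for position, (landmark, correct) in enumerate(route):
        for action in ACTIONS:
            reward = 1.0 if action == correct else 0.0
            update(q_landmark, landmark, action, reward)
            update(q_sequence, position, action, reward)

# Landmark absent at the second junction: only sequence knowledge helps.
print(choose(q_landmark, None))        # None (no cue, values tied)
print(choose(q_sequence, 1))           # R

# Out-of-sequence landmark at the second junction: the two tables disagree.
print(choose(q_landmark, "windmill"))  # L
print(choose(q_sequence, 1))           # R
```

In this sketch the landmark-keyed table has nothing to respond to when the cue is removed, whereas the position-keyed table still yields the learned turn; when a landmark appears out of sequence, the two tables favor opposite turns.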
To test the Khamassi and Humphries (2012) proposal, we reconceptualized a prescribed route-following task as a dual-solution task having two separable components that normally co-occur in a natural situation. Assuming a landmarked environment, learning a prescribed route can be achieved by encoding individual landmark-action associations at key junctions, such that successfully following a well-known route can occur without knowledge (i.e., a model) of the order of the separate landmark-action associations. However, learning of the sequencing of landmarks, as well as a sequence of egocentric turns at junctions, can also occur when learning a fixed route. In the fMRI study we present here, utilizing a VE, participants learned a series of left-right turns through nine Y-shaped junctions, each of which contained a unique landmark. Importantly, though, unlike previous studies (e.g., Igloi et al., 2010), the environment had no distal landmarks that could trigger spontaneous mapping processes. Once participants had learned the fixed route, we administered a set of probe trials that were designed to distinguish the systems involved in performing a response based on individual landmark-action associations (i.e., model-free learning) versus a response based on sequence knowledge (i.e., model-based learning).
Two types of short probe trials were utilized in the first half of the experiment, presented here as experiment 1a for ease of exposition. In short sequence probes, a landmark was unexpectedly absent at a junction within the route. In a learning-based account of parallel spatial memory systems (Khamassi & Humphries, 2012), it may be predicted that hippocampal formation and caudate activity will be observed, as knowledge of landmark sequence is required to generate a correct response at the landmark-less junction. The second type of short probe trial we refer to as short conflict probes, in which participants encountered, unexpectedly, an out-of-sequence landmark along the route. Importantly, the action associated with the out-of-sequence landmark response was the opposite of the appropriate response based on the sequence of turns participants had learned.
Consequently, participants could choose to make a response based on the correct sequence or the correct individual landmark-action association, a feature of other dual-solution spatial maze paradigms adapted for fMRI measures (e.g., Marchette et al., 2011; also Furman et al., 2014). Under a learning-based account of parallel spatial memory systems, it would be predicted that a sequence-based response would draw on hippocampal-caudate systems, whereas a landmark-based response would not. While the lesion literature would predict putamen involvement in a response based on action-landmark associations, few prescribed route navigation studies using fMRI report task-related activations in the putamen, so we treat examination of activations associated with individual action-landmark responses in the present study as exploratory (but see Horga et al., 2015, and Patterson & Knowlton, 2018).

| Participants
We recruited a sample size equivalent to previous fMRI studies that have obtained hippocampal activation in humans during familiar route-following tasks (e.g., Brown et al., 2010; Igloi et al., 2010) while also accounting for potential participant drop-out. A total of 27 participants (18 females; mean age 23.6 years, range 19-34 years) gave informed consent and were paid £30 for participation in the study, but the data from three participants were excluded due to excessive head movement, exceeding 3 mm, leaving a final sample size of 24 (16 females).
The study gained ethical approval from the Department of Psychology ethics committee, Durham University. After scanning, participants were debriefed and provided with an opportunity to ask questions if they wished.

| Virtual environment design
A 9-junction route in a VE was constructed using Unity 2017.4.2f2 (https://unity3d.com/). The overall task for participants was to learn to navigate the route without errors. Participants viewed the environment with a field of view of 55° and a viewing height of 1.7 virtual meters (vm). Each junction along the route was Y-shaped and contained a unique landmark (windmill, bench, sundial, chimenea, fountain, composter, well, birdhouse, and birdbath). As participants arrived at a junction, they were unable to rotate their field of view or observe any landmarks beyond their current junction, and two black arrows aligned with left and right paths signaled that a response could be made. Participants pressed left or right buttons to move along the path to the left or right at each junction, respectively. Once participants had made a response, they were moved passively along the selected path for 2.5 s at a speed of 2.9 vm/s. Figure 1a shows a plan view of the route, and Figure 1b shows screenshots of the first-person perspective at the beginning and end of route junctions.
When navigating on a given trial, if a participant made a correct response at a junction, they experienced a passive rotation of 60 degrees before movement along the path toward the next junction. If an incorrect response was made, the rotation occurred, and a potted plant landmark was visible as the participant was moved along the incorrect path. As the participant neared the end of the incorrect path, a red mist obscured the view (total duration of feedback procedure 4.5 s), and the participant was returned to the original junction, where they were then able to make the correct choice. On reaching the goal location at the end of the route, which was a garden summerhouse, fireworks were displayed for 2.5 s.
As well as experiencing trials traversing the full route, various other types of trials were presented, at different phases of the experiment, as described in the following section. On each trial, route choices and latencies to make decisions at every junction were recorded, together with timestamps for all events within a trial.

Prescan training
On the day prior to scanning, participants learned the route through the VE in a training task lasting approximately 15 min. Initially, the task consisted of trials traversing the whole route, with incorrect choices being subject to feedback, until the participant completed two consecutive trials with no errors. The inter-trial interval used throughout the training was 6 s, during which a blank screen was displayed. This prescan training was conducted in a mock scanner in order to acclimatize participants to the scanning environment.
Once the criterion performance of two consecutive error-free trials had been reached, a pseudo-randomized set of five different trial types was presented, four times each (i.e., a total of 20 trials), such that the same trial type was not presented consecutively. Three of the five different trial types consisted of shorter route segments, where only three junctions of the route were presented; once the participant had made their third choice, they traveled down the fourth path for the usual 2.5 s, but the screen then faded to black, signaling the end of the trial, unless the segment ended at the summerhouse. The color blocks of Figure 1a show these route segments, starting at the windmill (the yellow block), chimenea (gray), and well (blue). The short route training starting at the well led to the summerhouse and ended with the fireworks reward, as in the full route trials, rather than fading to black.
The purpose of these shorter route segment training trials was two-fold. In terms of learning the individual landmark-action associations within the route, they were important in preventing participants from navigating using only a verbally encoded list of 9 right/left turns along the full route without any learning of landmark-action associations. As two of these shorter routes did not start at the beginning of the route, a verbal strategy that comprised only a left and right word list was ineffective, as participants would not know where in the chain of turn directions they were in trials that did not start at the beginning of the route. The other function of these shorter training segments was to prepare participants for probe trials in the part of the experiment that was conducted within the fMRI scanner, detailed below.
A longer, 5-junction route trial, in which participants navigated through junctions 5-9, ending at the goal location, was also presented, indicated by the thin purple segment in Figure 1a. The purpose of this trial type was to have a trial that started in the first half of the route, but still led to reward, thus avoiding the possibility that participants would associate trials starting relatively early on in the route with termination without reward. Finally, there were also repeat presentations of the full route. Thus, the 3-junction routes, the single 5-junction route, and the full route formed the 5 different trial types presented for 20 trials during training. Following the training in the VE, recognition memory for the route was assessed by asking participants to arrange screenshots of the landmarks in the correct order.
On the day following pre-scanning training, participants had the opportunity to refresh their knowledge of the full route prior to the scanning session, by conducting trials in the VE traversing the whole learned route, to a criterion of two errorless trials. Only three participants made an error, taking three trials to reach the criterion, with the remainder taking the minimum of two trials, suggesting the route was well learned prior to scanning.
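The constraint on trial ordering during training (five trial types, four presentations each, no type presented twice in a row) can be sketched as follows. The source does not state how the pseudo-randomization was generated, so the shuffle-and-reject approach, the function name, and the trial-type labels below are assumptions made for illustration only.

```python
import random

# Labels are hypothetical; they name the five trial types described in the text.
TRIAL_TYPES = [
    "segment_windmill",   # 3-junction segment (yellow block)
    "segment_chimenea",   # 3-junction segment (gray block)
    "segment_well",       # 3-junction segment (blue block), ends at summerhouse
    "five_junction",      # junctions 5-9, ends at summerhouse
    "full_route",         # all nine junctions
]


def pseudo_random_order(trial_types, repeats, rng):
    """Shuffle until no trial type appears on two consecutive trials."""
    trials = [t for t in trial_types for _ in range(repeats)]
    while True:
        rng.shuffle(trials)
        if all(a != b for a, b in zip(trials, trials[1:])):
            return list(trials)


order = pseudo_random_order(TRIAL_TYPES, repeats=4, rng=random.Random(0))
print(len(order))  # 20
```

Rejection sampling is simple and adequate for a list this short; a constructive algorithm would be preferable for much longer sequences or tighter constraints.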

Neuroimaging task
Trials were presented to participants in nine runs while fMRI data were collected. For all runs, a jittered inter-trial interval of 4 s ± 2 s was utilized, followed by a 2 s white central fixation cross on a black background to alert the participant to the start of the next trial. The first run consisted of a training phase with the purpose of stabilizing performance in the scanner environment. Participants again had to reach a criterion of two errorless trials in the full route before proceeding. All but one participant achieved this in the minimum of two trials, with one participant requiring three trials. There then ensued a pseudo-random set of the same trial types as described in the previous section, with two trials of each type of route. In addition, six control trials were interspersed with these training trials, modeled on control trials used by Igloi et al. (2010). These consisted of the same 3-junction routes as were used for pre-scan training (see Figure 1a; yellow, gray, and blue route segments) and were each presented twice. In control trials, participants navigated with no landmarks present, and wooden fence barriers were used to block the "incorrect" choice at each junction. At the beginning of the run, participants were alerted to the possibility of routes where paths were blocked, and they were instructed to select the available path (see Supplementary material, Appendix A, Table A.1 for an example of trial ordering during the training run).
Following training, participants completed the short probes phase of the experiment, consisting of four scanning runs, each run containing 24 trials. Participants were instructed that they would be presented with trials they were familiar with and also some in which something might have changed. In such cases, there would be no feedback as to whether their responses were correct or not, but they should respond guided by the knowledge that the learned route to the garden house remained unchanged. Data from this short probes phase are presented as experiment 1a, with the final 4 runs, comprising the long probes phase, reported as experiment 1b.
Eighteen of the 24 trials consisted of probe trials, with three different types of probes presented (Figure 1c). There were six short sequence probe trials, where after an initial junction, the following junction had no landmark. Therefore, the participant had a choice of making a response based on the correct sequence (e.g., a left turn in the top panel of Figure 1c), or they could make an incorrect response.
The six sequence probe trials were made up of two repetitions of the three short route segments depicted in Figure 1c. There were six short conflict probes whereby a second junction contained an out-of-place landmark that was associated with a different turning direction along the route. Consequently, there was a conflict between a sequence response based on the usual order of landmarks encountered on the learned route and a landmark response based on the individual learned landmark-action association. For example, in the top panel of Figure 1c, in the learned route, participants turned left at the junction after the well, but a right turn is usually associated with the fountain in the learned route, with the fountain being presented at the critical junction of this short conflict probe. As with sequence probes, each of the route segments displayed in Figure 1c was presented twice in each run. Finally, six 2-path control probes were presented, constructed of the same path segments as depicted in Figure 1c but with no landmarks and a barrier fence blocking access to one of the arms of the junctions. Three of these control segments followed the path of a sequence-based response, and three followed the path of a landmark-based response.
Out of the 24 trials presented in each run of the short probes phase of the experiment, six consisted of one of the trial types presented during the training run, acting as a refresher for the learned route. The order of presentation of the training and probe trial types was pseudo-randomized, with a sample trial order displayed in Supplementary material, Appendix A, Table A.1. In total, this design yielded twenty-four probe trials of each type for each participant across four runs (i.e., 6 per run × 4 runs), for entry into the analysis.

| Image acquisition
Imaging data were acquired at the James Cook University Hospital, Middlesbrough, using a 3 T Siemens Magnetom Trio scanner with a 32-channel Tim matrix head coil. Functional T2*-weighted BOLD images were acquired using an axial echo planar imaging sequence of the whole brain (repetition time, TR, 2000 ms; echo time, TE, 62 ms; gap, 0.3 mm; flip angle, 90°; acquisition matrix, 96 × 96; field of view, 210 × 210 mm; slices, 32; resolution, 3 × 3 × 3 mm). Slices were acquired in the sagittal plane in ascending interleaved order. The 4th run out of a total of 10 in the experiment consisted of a high-resolution T1-weighted anatomical scan using a multiplanar rapidly acquired gradient echo sequence (TR, 2250 ms; TE, 2.52 ms; no gap; flip angle, 9°; acquisition matrix, 1024 × 1024; field of view, 512 × 512 mm; slices, 192; resolution, 1 × 0.5 × 0.5 mm). The first 3-5 slices were discarded for all runs to allow for stabilization of images.

| fMRI Preprocessing
Imaging analysis was conducted using SPM12 (https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). Functional images were spatially realigned to the first image in the series, using a least squares rigid body transformation approach, and slice-time corrected. After coregistration of functional and structural images, structural images were tissue-segmented and gray matter estimates were used to normalize images to standard Montreal Neurological Institute (MNI) space using optimized voxel-based morphometry. Normalized images were resampled to a 2 × 2 × 2 mm resolution, with spatial smoothing conducted with an isotropic three-dimensional Gaussian filter with a 6 mm kernel (full width at half maximum).

Behavioral analysis
Accuracy data were collated to ensure participants were making predominantly correct responses. On short conflict probes, participants were classified into sequence responders or landmark responders based on their majority response across their 24 trials. It should be noted that this classification to yield two groups of participants departs from the analysis strategy utilized in other fMRI dual-solution navigation tasks in which participants produced a mixture of responses (Furman et al., 2014; Igloi et al., 2010; Marchette et al., 2011), but was a necessity based on the consistency of behavioral responses produced by participants (see also Section 2.1.6.4).
Reaction time (RT) data were collated for the probe trials forming the relevant contrasts in the fMRI analyses, detailed below, and were analyzed using ANOVAs. The RT data are reported in Supplementary material, Appendix B, together with an examination of potential confounds in the fMRI analysis due to any systematic RT differences between conditions (Domagalik et al., 2014; Yarkoni et al., 2009).
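The majority-response classification of participants on short conflict probes can be sketched as below. The function name and response labels are hypothetical, and the handling of an exact tie is an assumption, since the source does not describe that case.

```python
from collections import Counter


def classify_responder(responses):
    """Label a participant by their majority response on conflict probes.

    `responses` is a list of "sequence" / "landmark" labels, one per trial.
    An exact tie is returned as "unclassified" (an assumption, not from the source).
    """
    counts = Counter(responses)
    n_sequence, n_landmark = counts["sequence"], counts["landmark"]
    if n_sequence > n_landmark:
        return "sequence"
    if n_landmark > n_sequence:
        return "landmark"
    return "unclassified"


# Hypothetical participant: 21 of 24 conflict probes answered by landmark.
example = ["landmark"] * 21 + ["sequence"] * 3
print(classify_responder(example))  # landmark
```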

fMRI analysis
A general linear model (GLM) of the functional time series was used to model the time course of the short probes phase of the experiment, using SPM12 software. For first-level analyses, regressors were convolved with the canonical hemodynamic response function, and high-pass filtered (128 s), with the time series for each participant modeled to generate contrast maps. These contrast maps were entered into second-level group random effects GLMs to test contrasts of interest, in whole brain analyses as well as region of interest (ROI) analyses within the hippocampus, caudate, and putamen (see following section). For whole brain analyses, significant clusters of activation were identified following a cluster-level false discovery rate (FDR) correction of p < .05, using an initial cluster-forming threshold of p < .001. Anatomical labeling of above-threshold activation clusters was conducted using the Automated Anatomical Labelling Atlas 3 toolbox (AAL3; Rolls et al., 2020).
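As an illustration of what convolving a regressor with a canonical hemodynamic response function involves, the sketch below builds a double-gamma response and convolves it with a boxcar for one event. It is a simplified stand-in, not SPM12's implementation: the response is sampled directly at the TR rather than at a finer microtime resolution, and the shape parameters (peak at 6 s, undershoot at 16 s, ratio 1/6) are assumed here to mirror commonly used defaults. The onset value is hypothetical; the 2.5 s duration and 2 s TR are taken from the text.

```python
import math

TR = 2.0  # s, repetition time reported for the functional runs


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def double_gamma_hrf(tr, duration=32.0):
    """Double-gamma response sampled every `tr` seconds, normalized to sum to 1."""
    n = int(duration / tr) + 1
    samples = [gamma_pdf(i * tr, 6.0) - gamma_pdf(i * tr, 16.0) / 6.0
               for i in range(n)]
    total = sum(samples)
    return [s / total for s in samples]


def make_regressor(onsets_s, durations_s, n_scans, tr=TR):
    """Boxcar for the given events, convolved with the HRF and cut to n_scans."""
    boxcar = [0.0] * n_scans
    for onset, dur in zip(onsets_s, durations_s):
        start = int(round(onset / tr))
        stop = max(start + 1, int(round((onset + dur) / tr)))
        for i in range(start, min(stop, n_scans)):
            boxcar[i] = 1.0
    hrf = double_gamma_hrf(tr)
    out = [0.0] * n_scans
    for i, b in enumerate(boxcar):
        if b == 0.0:
            continue
        for j, h in enumerate(hrf):
            if i + j < n_scans:
                out[i + j] += b * h
    return out


# One hypothetical event: a 2.5 s path starting 10 s into a 20-scan run.
regressor = make_regressor([10.0], [2.5], n_scans=20)
peak_scan = max(range(len(regressor)), key=regressor.__getitem__)
print(peak_scan * TR)  # 16.0 (s): predicted response peaks 6 s after onset
```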

ROI definition
ROI seed regions for both univariate and beta series connectivity analyses (reported in experiment 1b) were defined a priori following the work of Brown et al. (2010, 2012) and Brown and Stern (2014), who found that the posterior hippocampus was particularly engaged when disambiguating overlapping familiar route sequences and that posterior hippocampal to caudate connectivity was greater for overlapping versus nonoverlapping familiar routes within a VE (Brown et al., 2012). We thus used the right and left hippocampal tail coordinates (MNI ±18, −36, 2), right and left hippocampal body coordinates (MNI ±30, −24, −15), and right and left caudate coordinates (MNI ±10, 4, 12) used by Brown et al. (2012) to form the centers of spheres with a 5 mm radius, as our ROIs.
While Brown et al. (2012) did not define a putamen ROI a priori, they found that there was significantly more connectivity between the right hippocampal body ROI and the putamen bilaterally at the start of overlapping versus nonoverlapping familiar routes, using a beta-series analysis (Rissman et al., 2004). We thus created our putamen ROIs using the cluster-center voxel of increased putamen connectivity to the right hippocampal body ROI, obtained by Brown et al. (2012). The right putamen ROI was centered at MNI 22, 20, −8, and the left at MNI −22, 20, −8. The MarsBaR 0.44 toolbox for SPM12 was utilized for ROI creation and analyses (Brett et al., 2002).
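To make the geometry of a 5 mm spherical ROI concrete, the sketch below lists the voxel centers that fall within 5 mm of a seed coordinate on a 2 mm grid. It is not the MarsBaR procedure used in the study; the function name is hypothetical, and the grid is assumed to have voxel centers at integer multiples of the voxel size in MNI space, which need not match the actual image grid.

```python
import math


def sphere_voxels(center_mm, radius_mm=5.0, voxel_mm=2.0):
    """Return voxel-center coordinates (mm) lying within radius_mm of center_mm.

    Assumes voxel centers sit at integer multiples of voxel_mm (an assumption).
    """
    cx, cy, cz = center_mm
    r2 = radius_mm ** 2

    def axis_points(c):
        lo = math.ceil((c - radius_mm) / voxel_mm)
        hi = math.floor((c + radius_mm) / voxel_mm)
        return [i * voxel_mm for i in range(lo, hi + 1)]

    return [
        (x, y, z)
        for x in axis_points(cx)
        for y in axis_points(cy)
        for z in axis_points(cz)
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r2
    ]


# Right and left caudate seeds from the text (MNI ±10, 4, 12).
right_caudate = sphere_voxels((10, 4, 12))
left_caudate = sphere_voxels((-10, 4, 12))
print(len(right_caudate))  # 81 voxels under the grid assumption above
```

The ROI signal for a contrast would then be the average of the parameter estimates over these voxels, which is the quantity entered into the ROI-level tests.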

Short probes phase: univariate contrasts
For the concatenated 4 runs comprising the short probes phase of the experiment, 15 regressors were created.The regressors comprised reward periods, the first paths of control, sequence, or conflict probe trials, and the second, critical, paths of control, sequence, and conflict probe trials (7 regressors).In addition, the first and subsequent paths of 3-junction control trials, 3-junction routes, 5-junction routes, and full routes were included in the model (8 regressors).Importantly, the division between the first pathways of a route and subsequent paths was motivated by the analysis reported by Igloi et al. (2010), in which it was observed that the first pathways of a trial appear to capture route planning processes.Any feedback periods linked to incorrect responses, together with the six movement parameters, were entered as regressors of no interest.
To assess brain activation differences between sequence probes and control probes, a second-level group analysis was conducted based on parameter estimates of regressors derived from first-level maps, in which the critical pathway of control probes was contrasted with the critical pathway of sequence probes, where no landmark was present at the critical junction.In ROI analyses, the average signal change across the spherical ROIs was used in one-tailed t-tests contrasting the critical paths of sequence probes versus control probes.A statistical threshold of p < .05 was applied, as these were planned contrasts driven by theoretical predictions.
Prior to collecting data, we anticipated conducting within-subjects analyses of sequence- and landmark-based responses on conflict probes. However, we observed that participants were consistent in favoring either a sequence- or landmark-based response across all conflict probes that were administered. Consequently, we grouped participants into sequence (n = 9) or landmark (n = 15) responders based on their most frequent response. Because these unequal group sizes limited the power to detect differences between groups, we report two separate second-level contrasts between the critical paths of conflict and control probes, one for sequence responders and one for landmark responders.

These analyses (reported in Supplementary Materials Appendix D) were clearly exploratory in nature, and their outcomes should be treated as such. We also conducted two ROI analyses comparing conflict and control probes in sequence and landmark responders.

| Behavioral results
The mean number of trials required to learn the full route in pre-scanning training was 6.5 (SD = 2.89, range 3-13), including the two errorless trials signaling learning to criterion. The mean number of trials containing at least one error made in subsequent full and shorter route trials in pre-scanning training was 1.71 (SD = 2.97, range 0-14).
In a recognition memory test following behavioral testing, where participants had to correctly sequence screenshots of the individual landmarks of the route, the mean correct positioning was 81.49% (SD = 25.73%, range 11.1-100%). Thus, while performance was generally good, explicit recall of the sequence was relatively poor in a few participants.
On the short sequence probes (Figure 1c), where after one junction, the critical probe pathway contained no landmarks, participants showed high levels of correct performance (Figure 2a). All participants were included in analyses of the relevant contrasts involving short-sequence probes. For short conflict probes, two groups of participants emerged based on their predominant response. Fifteen participants made a majority of landmark-based responses (Figure 2b). Nine participants made sequence-based responses (Figure 2c). Imaging results are reported for the short conflict probes in Supplementary material, Appendix D.
RT data are reported in Supplementary Material, Appendix B.

| Imaging results
An analysis across the 24 participants was conducted, contrasting the critical path of sequence probes with the equivalent control path.
Under the Khamassi and Humphries (2012) model, it would be predicted that there would be greater hippocampal formation and caudate activation in the critical path of sequence probes, as prior route trajectory context is necessary to correctly respond in the absence of a landmark. The results of the contrast across the whole brain are displayed in Table 1. In the region of interest analyses, only the right caudate ROI was significantly more active in sequence probes relative to control probes (p = .04; Figure 3a,b), with a similar but weaker effect in the left caudate ROI (Supplementary material, Appendix C, Table C.1). For the reverse contrast, both the right and left hippocampal body ROIs were more active in control than sequence probes (p = .04 and .02, respectively; Table C.1). These results are consistent with the whole brain analyses (Table 1) showing higher activity in areas classically associated with default mode network (DMN) activity (Benedek et al., 2016; Buckner & DiNicola, 2019), although the hippocampus did not emerge as more active in control probes in the whole-brain contrasts.
In conflict probes, a response based on the correct sequence in the learned route was pitted directly against a response based on a learned landmark-action association. Under the Khamassi and Humphries (2012) model, in conflict probes, it may be predicted that there would be greater hippocampal formation activity, as well as caudate activity, in sequence responders, relative to control probes, as a model-based algorithm is required to make a sequence response, whereas such activity increases would not be predicted in landmark responders.
For whole brain analyses, there were several regions that were more active in sequence responders in conflict probes relative to control probes (Supplementary material, Appendix D, Table D.1). In ROI analyses, the left caudate (p = .02) and left putamen (p = .02) were significantly more active in conflict versus control probes in sequence responders, with similar but weaker trends occurring in right caudate and putamen (Figure 3c; full results in Supplementary material, Appendix C, Table C.2). In contrast, in landmark responders there were no regions that were more active in conflict versus control probes (Table D.1), and no significant differences in the ROIs (Table C.2).

| ROI analyses
In experiment 1a, we predicted that there would be higher caudate and posterior hippocampus ROI activity in short sequence probes, relative to control probes, due to the need to make a response based on knowledge of landmark sequence. However, only the right caudate ROI was more active in sequence probes (Figure 3b; Table C.1). One possibility is that we did not observe a difference in our posterior hippocampal ROIs because of DMN activity occurring in the control condition, likely involving hippocampal activity (Benedek et al., 2016; Buckner & DiNicola, 2019). This was evidenced by the results of the whole brain contrasts, where several areas classically active in the DMN were evident in the control condition (Table 1). Although we modeled our control probes on those utilized by Igloi et al. (2010), where differences in hippocampal activation were found between probe and control conditions, subtle differences may have made our control task insufficiently demanding, thus allowing DMN activity to occur. For example, in Igloi et al. (2010), participants navigated the VE using a joystick, requiring continuous motor activity, whereas, in our study, participants were passively moved along paths until a junction requiring a button-press response was reached. The results for the sequence-based responders in conflict probes (Figure 3c) were similar to those for the short sequence probes, with significant left caudate activation but no difference in the posterior hippocampal ROIs.
Interestingly, there was also greater left putamen ROI activation in sequence responders in conflict probes (Figure 3c). Several researchers have proposed that the role of the putamen in habitual behavior is not to sub-serve model-free learning (Bornstein & Daw, 2011; Khamassi & Humphries, 2012), but to chunk previously separate behaviors into smooth sequences following prolonged learning (Dezfouli & Balleine, 2012; Garr, 2019; Pennartz et al., 2011; Smith & Graybiel, 2016), with an intact dorsolateral striatum being necessary for both initiation and termination of movement sequences (review in Garr, 2019). The few fMRI virtual navigation studies that report putamen activation do so in situations where route planning is possible (Iaria et al., 2003; Wegman et al., 2014; Woolley et al., 2013).
The question of the role of putamen activation in route planning is investigated further in experiment 1b, where participants are informed that they will encounter two types of trials. In long sequence probes, they will navigate down a path with a landmark, but after this point, no further landmarks will occur, so they are required to respond based on their learned knowledge of the familiar route. In long landmark probes, they will encounter landmarks in random order, so only their knowledge of individual landmark-action pairings will be useful in responding. If putamen activation occurs during the initiation of a learned action sequence, then it would be predicted that putamen activation would be higher in the first path of long sequence probes than in the first path of long landmark probes.

| Whole-brain analyses
There was a high degree of overlap between the areas activated in short sequence probes and the areas activated in sequence responders in conflict probes, both relative to the control condition (Supplementary materials, Table D.1). This would suggest that the task was perceived as similar in both instances (i.e., making a response based on sequence knowledge). One such area was the SMA (see also Igloi et al., 2010), a brain region that appears to be involved in all types of tasks that involve sequencing, be these spatial, motor, linguistic, or musical sequences (review in Cona & Semenza, 2017). The more precise functional role of the SMA in sequencing is still a matter of debate (Cona & Semenza, 2017; Garr, 2019).
TABLE 1 Areas more active in the critical path of the short sequence probe versus the equivalent path of the short control probe in whole brain analyses and the reverse contrast. A cluster-level correction at an FDR rate of p < .05 was applied.
The middle frontal gyrus, insula, precuneus, and superior parietal lobule, bilaterally, were also active in sequence responders and short sequence probes in the present study (Table 1, Table D.1), with the inferior temporal gyrus active in short sequence probes. These areas are all commonly activated in spatial tasks (Cona & Scarpazza, 2019; Igloi et al., 2010) and may be involved in working memory and attentional aspects of spatial task performance. The insula may play a role in prioritizing stimuli depending on task demands, particularly in tasks where a "retrocue" signals which stimuli held in working memory are required for task response (Myers et al., 2017). The sequence probe, as well as a sequence response to a short conflict probe, can be thought of as a retrocue task in that the absence or misplacement of the landmark at the junction on the critical path of the probe trial serves to cue the participant that the memory for the previous path will be required.
Finally, in short sequence probes relative to control probes, there was more activity in the posterior cingulate (Table 1), commonly active in more long-term spatial memory and navigation tasks (Cona & Scarpazza, 2019), and in the lingual gyrus, which prior research has associated with visual imagery (Nemmi et al., 2013; Spagna et al., 2021).
In contrast to probes accessing sequence knowledge, there were no brain regions that displayed higher activation in landmark responders relative to control conditions (Table D.1), although DMN activity during control probes may have masked task-relevant activity increases. While this difference could be taken as evidence of different brain mechanisms underlying sequence-based responses relative to landmark-based responses, consistent with the proposals of Khamassi and Humphries (2012), it could be argued that differences may reflect task difficulty (Duncan et al., 2020). Moreover, it might also be argued that the greater number of participants who navigated on the basis of landmark-action knowledge rather than sequence knowledge in the conflict probe trials reflects greater task difficulty in accessing sequence knowledge. However, if similar brain mechanisms underpinned landmark-based and sequence-based responding in conflict probes, it would be expected that these areas would still show differential activation relative to the control condition in landmark responders, even if task difficulty was less than for sequence-based responding. The question of the potential role of task difficulty as an explanation for differences in brain activations between sequence and landmark responding is considered further in Section 4.4, in light of results from Experiment 1b.

| Introduction
The exploratory ROI analyses for the sequence responders in the conflict probes (Figure 1c) indicated activation of the left putamen, which could be interpreted as evidence for route planning (see Supplementary materials Appendix D). In the second half of the experiment, participants' route planning was tested more explicitly by administering a different set of probe trials, accessing knowledge of the same 9-junction route as experiment 1a. In long landmark probes, participants were instructed that they would have to traverse a set of junctions where landmarks were presented in random order, such that only landmark-action knowledge would be useful. In long sequence probes, participants were told that after an initial junction with a landmark, no more landmarks would be present in the trial; thus, they would be required to navigate the following junctions based solely on memory for the route sequence. The rationale for these long probes was to force participants into either a sequence or landmark strategy, unlike short conflict probes in which participants could spontaneously choose equally valid landmark- or sequence-based strategies.
In terms of predictions for these long probes, greater hippocampal and caudate activity in the first path of long sequence versus long landmark probes would be expected (Igloi et al., 2010; Khamassi & Humphries, 2012; Rondi-Reig et al., 2006) due to the use of a model-based strategy to correctly anticipate navigation of junctions without landmarks, utilizing the context-setting initial landmark. A further prediction would be that a cooperative relation should exist between hippocampus and caudate ROIs, as measured by beta series connectivity analyses, if hippocampus is providing context information to allow action selection in dorsal striatum in long sequence probes (Brown et al., 2012).
Because of our use of putamen ROIs, we could examine the prediction that there should be greater putamen activation in the first path of long sequence probes, relative to the first path of long landmark probes, due to the initiation of a planned movement sequence (Dezfouli & Balleine, 2012; Garr, 2019; Pennartz et al., 2011; Smith & Graybiel, 2016). We could also examine whether significant connectivity exists between hippocampus and putamen ROIs, and if so, whether these relations are cooperative or competitive. While classic spatial parallel memory systems theory posits competitive relations between hippocampal and dorsolateral striatal systems (Devan et al., 2011; Kosaki et al., 2015; Packard & Goodman, 2013; White & McDonald, 2002), both competitive and cooperative relations have been reported in the imaging literature (Brown et al., 2012; review in Freedberg et al., 2020).

| Participants
Participants were as in Section 2.1.1. Out of the 24 participants, 2 were excluded because they made an error on a majority of long sequence probes, displaying relatively poor sequence knowledge. No participant showed poor performance on long landmark probes.
Figure 2d displays levels of correct performance across the 4 runs for long sequence and long landmark probes in the 22 included participants.

| Virtual environment design
See Section 2.1.2.

| Experimental protocol
The long probes phase of the experiment was presented in four scanning runs after the four short probe runs of experiment 1a. Each run contained six long probe trials, which were formed of three long sequence probes and three long landmark probes, presented in alternating order, counterbalanced across participants in each run. Thus, there were 12 long sequence probes and 12 long landmark probes in total for each participant. Long sequence probes were formed from the three 3-path segments used on short training trials (see color blocks in Figure 1a). On each long sequence probe, only the first landmark along the route was present, with the subsequent junctions having no landmarks. The long landmark probes also consisted of 3-path segments, but landmarks were presented in random order without replacement.
Prior to beginning scanning for each of the long probe runs, participants were provided with an explanation of the behavior required in each type of long probe. During scanning, text ("landmark trial" or "route trial") was displayed on the screen for 4 s to signal to participants whether the upcoming trial required landmark or sequence responses, followed by a fixation cross before the trial commenced.
For the long sequence probes, only memory for the sequence of landmarks/turns following the initial landmark could be utilized for successful performance, whereas for the long landmark probes, only the memory for individual landmark-action associations could be utilized, with no predictive planning possible.

| Image acquisition
See Section 2.1.4.

Behavioral analysis
Accuracy data were collated to ensure participants were making predominantly correct responses. For a long sequence probe or long landmark probe to be classified as correct, the trial had to be fully errorless. RT data were analyzed using ANOVAs and are reported in Supplementary material Appendix B.

fMRI analysis
A GLM of the functional time series was used to model the time course of the long probes phase of the experiment, using SPM12 software. For first-level analyses, regressors were convolved with the canonical hemodynamic response function, and high-pass filtered (128 s), with the time series for each participant modeled to generate contrast maps. These contrast maps were entered into second-level group random effects GLMs to test contrasts of interest in whole-brain analyses as well as region of interest (ROI) analyses within the hippocampus, caudate, and putamen (see Section 2.1.6.3). For whole brain analyses, significant clusters of activation were identified following a cluster-level FDR correction of p < .05, using an initial cluster-forming threshold of p < .001. Anatomical labeling of above-threshold activation clusters was conducted using the Automated Anatomical Labelling Atlas 3 toolbox (AAL3; Rolls et al., 2020).
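To illustrate what convolving a regressor with a canonical hemodynamic response function involves, the sketch below builds a boxcar for one event and convolves it with a double-gamma kernel (a response peaking near 5 s minus one sixth of a later undershoot). It is a simplified stand-in for SPM12's routine: it samples the kernel at the TR rather than at finer time resolution and omits the 128 s high-pass filter. The function names, the 2 s TR, and the onset and duration values are assumptions for the example only.

```python
import math

def canonical_hrf(t):
    """Double-gamma HRF evaluated at time t (seconds)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)          # gamma density, shape 6
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)  # gamma density, shape 16
    return peak - undershoot / 6

def build_regressor(onsets, durations, n_scans, tr):
    """Boxcar for the given events convolved with the HRF, one value per scan."""
    boxcar = [0.0] * n_scans
    for onset, duration in zip(onsets, durations):
        for i in range(n_scans):
            if onset <= i * tr < onset + duration:
                boxcar[i] = 1.0
    # Kernel covering 32 s, sampled once per TR.
    kernel = [canonical_hrf(k * tr) for k in range(int(32 / tr) + 1)]
    regressor = []
    for i in range(n_scans):
        value = sum(boxcar[i - k] * kernel[k]
                    for k in range(len(kernel)) if i - k >= 0)
        regressor.append(value)
    return regressor

# Hypothetical single event: onset at 10 s, lasting 4 s, 30 scans, TR = 2 s.
example_regressor = build_regressor([10.0], [4.0], n_scans=30, tr=2.0)
```

The predicted response stays at zero until the event begins and then rises and falls over the following seconds, which is why path onsets rather than raw stimulus times enter the design matrix.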

Long probes phase: univariate contrasts
For the concatenated 4 runs comprising the long probes phase of the experiment, 5 regressors were created. These were the reward periods, the first paths of long sequence and landmark probes, and subsequent paths of long sequence and landmark probes. Any feedback periods linked to incorrect responses, together with the 6 movement parameters, were entered as regressors of no interest. A second-level group analysis was conducted based on parameter estimates of regressors derived from these first-level maps, in which the first pathway of long sequence probes was contrasted with the first pathway of long landmark probes. In ROI analyses, the average signal change across the spherical ROIs was used in 1-tailed t-tests contrasting the first path of the long sequence probes versus the first path of the long landmark probes.

Long probes phase: Beta series connectivity analyses
To test the prediction that there would be more connectivity between the hippocampus and caudate in the critical paths of long sequence versus long landmark probes, the BASCO toolbox (Göttlich et al., 2015) for beta series correlation (Rissman et al., 2004) was utilized. Functional connectivity between putamen and hippocampus was also examined. As a first step, individual GLMs were constructed in which the first path of long sequence probes and the first path of long landmark probes for each of the 12 trials were modeled as regressors of interest. All other predictors, i.e., reward periods, subsequent paths of long sequence probes, and subsequent paths of long landmark probe trials, were included as regressors of no interest, together with movement parameters and incorrect trials. Individual trial-by-trial averaged beta values for the regressors of interest across the eight ROIs were extracted using the BASCO toolbox (Göttlich et al., 2015), and analyzed using 2-tailed, 1-sample t-tests against a null hypothesis of no trial-by-trial correlation of beta values.
A sequential Holm-Bonferroni correction was applied to control the family-wise type I error rate (at p < .05) across multiple comparisons.
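The sequential Holm-Bonferroni procedure can be sketched as follows: p-values are sorted in ascending order, the smallest is compared against alpha divided by the number of tests, the next against alpha divided by one fewer, and testing stops at the first non-significant result. The function name and the example p-values are illustrative and are not taken from the study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold relaxes as each smaller p-value passes.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all remaining (larger) p-values are retained
    return reject

# Illustrative p-values for four ROI-pair correlations.
example_decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005])
```

Unlike a plain Bonferroni correction, this step-down rule is uniformly more powerful while still controlling the family-wise error rate.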
To examine correlations between each ROI beta series and all other brain voxels in the first paths of long sequence probes and long landmark probes, the BASCO toolbox was used to generate Fisher z-transformed correlation maps (Göttlich et al., 2015) for each condition and each ROI. These maps could then be entered into second-level group analyses using SPM12 where conditions could be contrasted using paired-samples t-tests, to examine differences in whole-brain functional connectivity in the first path of long sequence probes versus long landmark probes, for different ROIs.
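The logic of a beta series correlation for a single ROI pair can be sketched in a few lines: the trial-by-trial beta estimates from two ROIs are correlated within each participant, the correlation is Fisher z-transformed, and the z values are tested against zero across participants. This is a minimal sketch under stated assumptions, not the BASCO implementation; all function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between two equal-length beta series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher z-transform, which makes correlations closer to normal."""
    return math.atanh(r)

def beta_series_connectivity(betas_roi_a, betas_roi_b):
    """Fisher z-transformed trial-by-trial correlation for one participant."""
    return fisher_z(pearson_r(betas_roi_a, betas_roi_b))

def one_sample_t(z_values):
    """Group-level t statistic testing mean z against zero."""
    n = len(z_values)
    return mean(z_values) / (stdev(z_values) / math.sqrt(n))

# Made-up beta series for one participant (four trials, two ROIs).
z_example = beta_series_connectivity([1, 2, 3, 4], [1, 3, 2, 4])
# Made-up z values for three participants.
t_example = one_sample_t([0.2, 0.4, 0.6])
```

A positive group mean of z indicates that trial-wise responses in the two ROIs rise and fall together (a cooperative relation), whereas a negative mean would indicate a competitive relation.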

| Univariate contrasts
To ascertain which brain regions were differentially active when only route sequence knowledge could be utilized, relative to when only learned landmark-action associations could be utilized, these two types of trials were contrasted across the final four runs of the experiment. We focus on the first path of each type of probe, where there was a landmark present in both. However, in long sequence trials, this landmark acted as a starting-point indicator from which route planning processes could be triggered, whereas, in long landmark trials, the first path had no predictive value in terms of which landmarks would follow.
Table 2 shows the areas more active in the first path of long sequence probes relative to the first path of long landmark probes, as well as the reverse contrast in whole-brain analyses. In the ROI analyses, the right hippocampal tail (p = .002) and body (p = .04), right (p = .02) and left (p = .05) caudate, and right putamen (p = .005) were significantly more active in long sequence probes relative to long landmark probes, with the left hippocampal tail ROI showing a similar but weaker effect (Figure 4; Supplementary Material Appendix E, Table E.1). There were no ROIs more active in long landmark versus long sequence trials.
As reported in Supplementary Material, Appendix B, RTs were faster in the long landmark probes versus long sequence probes in the first of the four long probe runs. An analysis of the imaging data (reported in Appendix B and Table E.1) using only the last 3 runs obtained similar results to those using all 4 runs, suggesting RT differences cannot account for differences between conditions.

| Beta series connectivity analyses
Table 3 displays the ROI to ROI correlations between the caudate, putamen, and posterior hippocampus as a function of condition. In the first path of long sequence probes, there was significant functional connectivity between the left caudate ROI and the hippocampal body ROIs bilaterally. There was also significant connectivity between the right caudate ROI and the left hippocampal body ROI. In the landmark condition, there was significant connectivity between the left caudate ROI and the right hippocampal body ROI. Although effect sizes were greater in the sequence condition, leading to above-threshold effects, the direction of effects was similar in both conditions, showing cooperative relations between ROIs.
There was strikingly large connectivity in both conditions between the putamen ROIs and hippocampal tail ROIs to the contralateral hemisphere, with a particularly strong effect from left putamen to right hippocampal tail. In addition, in the first path of the long sequence probes there were significant effects from putamen to the ipsilateral hippocampal tail.
For ROI to whole brain analyses, the only ROI where differences occurred between conditions was the right hippocampal tail, where two clusters, one within the left middle occipital gyrus and the other in the left lingual gyrus, showed greater functional connectivity with this ROI seed in the first path of long sequence probes versus long landmark probes (Supplementary Material Appendix F, Figure F.1).

| Discussion
The data from experiment 1b were consistent with our prediction that there would be greater caudate and posterior hippocampal activation in the first path of long sequence versus long landmark probes (Figure 4). As predicted, there was also significant cooperative functional connectivity between the left caudate and the hippocampus body ROI bilaterally in the sequence condition (Table 3), although further research would be required to assess whether the stronger correlation in the sequence condition relative to the landmark condition is statistically reliable. Both conditions evinced a strong significant correlation between the left caudate and the right hippocampal body.
In general, the significant cooperative functional connectivity in both conditions obtained in our study is consistent with the results of Brown et al. (2012) in their study of disambiguation of overlapping familiar routes.
TABLE 2 Areas more active in the first path of long sequence probe trials relative to long landmark probe trials in whole brain analyses (df = 21) and the reverse contrast. A cluster extent threshold to yield an FDR correction of p < .05 was applied.
FIGURE 4 Mean percent signal change (and SEs) in contrasts between long sequence probes and long landmark probes in ROIs (n = 22).
As with sequence responders in experiment 1a, there was greater putamen activation in long sequence probes, supporting a view of putamen function in action sequencing (Dezfouli & Balleine, 2012) rather than model-free learning (Bornstein & Daw, 2011; Khamassi & Humphries, 2012). These findings, as well as the strong cooperative connectivity found between putamen ROIs and hippocampal tail ROIs, will be considered further in the general discussion.
In terms of whole-brain analyses, greater activation was found in the right middle frontal gyrus in long-sequence probes, an area associated with spatial working memory functions, and the lingual gyrus bilaterally, a region associated with visual imagery (Cona & Scarpazza, 2019), with greater connectivity between the right hippocampal tail and contralateral lingual gyrus also occurring in long sequence probes. In addition, the greater connectivity obtained in long sequence probes between the right hippocampal tail and the left middle occipital gyrus is likely to reflect increased visual imagery in this condition, as this secondary visual area has been linked to imagery in prior studies (Cona & Scarpazza, 2019). The only area more active in long landmark probes relative to long sequence probes was the fusiform gyrus, an area associated with DMN activity, suggesting this condition may have been insufficiently demanding, as occurred with the control probes in experiment 1a.
It might be argued that differences in whole-brain activations between conditions reflected the observation that long sequence probes were more difficult than long landmark probes (see Figure 2d).
However, this greater difficulty was not reflected in many differences in whole-brain activations between conditions (Table 2). This is in contrast to the larger number of areas that were differentially activated in whole-brain contrasts in short sequence probes and in sequence-responders on conflict probes, in experiment 1a. In our discussion of experiment 1a (Section 3.3.2), we noted that both the absence of a landmark (short sequence probes) or a misplaced landmark (short conflict probes) was unexpected and argued that this could have acted as a retrocue to indicate that the memory for the previous response would be critical for current responding, thus leading to the activation of areas typically active in retrocue tasks (Myers et al., 2017). These task elements were not present in the long probe conditions of experiment 1b, in which participants were instructed on what behavior was required in the upcoming trial (i.e., sequence- or landmark-based responses). In fact, the whole-brain differences that were observed between conditions were in areas associated with visual imagery and spatial working memory (Cona & Scarpazza, 2019).
There is little evidence, then, to support the argument that similar brain mechanisms underpin landmark-action and sequence learning but that because the latter is more difficult a larger network of brain areas is activated (Duncan et al., 2020). Rather, in experiment 1a, only the short sequence probes, and the conflict probes of the sequence responders, were treated by participants as retrocue tasks, where memory for the previous landmark-response pairing was required for current responding, and therefore the pattern of brain activations obtained was consistent with this difference.

| GENERAL DISCUSSION
Our results across experiments 1a and 1b support the proposition that in the navigation of familiar routes, knowledge of the sequence of landmarks along the route is supported by different brain mechanisms relative to knowledge of individual landmark-action associations, with cooperative caudate and posterior hippocampal systems associated with sequence knowledge (Goodroe et al., 2018; Igloi et al., 2010; Khamassi & Humphries, 2012). Importantly, our observation of greater caudate and posterior hippocampal activation occurred when there were no distal landmarks available to trigger cognitive map formation (Igloi et al., 2010; White & McDonald, 2002), and when a single familiar route was utilized, not requiring disambiguation processes (Brown et al., 2010, 2012; Brown & Stern, 2014). These results, therefore, are difficult to interpret under the proposals of cognitive mapping, in which following a well-known route is conceived as a habitual behavior that depends on the caudate nucleus (e.g., Hartley et al., 2003).
Instead, our findings are consistent with the proposals of reinforcement learning theory (Khamassi & Humphries, 2012), and observations of hippocampal involvement during sequence learning in nonspatial tasks (Yin & Knowlton, 2006), in that knowledge of the sequence of actions along a familiar route requires a model-based representation that engages hippocampal systems.
One objection to our conclusion above is that it could be argued that participants may have engaged in spontaneous tracking of their distance and direction from the origin of the route (termed path integration) where evidence indicates a key role for the hippocampus (Chrastil et al., 2015, 2016; see also McNaughton et al., 1996; Poucet, 1993). However, while this may have been the case, at least for some participants, there is no reason to expect that such spontaneous path integration processes would have varied between conditions, particularly for the first path of long sequence and long landmark probes, so such processes cannot account for the differences found between probes accessing sequence knowledge, versus individual landmark-action pairings.
In terms of the mechanisms underlying single landmark-action associations independent of sequence knowledge, although associated with dorsolateral striatum (putamen) in lesion studies (Kosaki et al., 2015; White & McDonald, 2002), there was no evidence of increased putamen activation in the present study associated with single landmark-action responses either in short conflict probes or in long landmark probes, consistent with other fMRI studies (see Patterson & Knowlton, 2018). Indeed, even in whole-brain contrasts, it was hard to detect task-related activity associated with individual landmark-action responses. In order for future research to investigate this question, it may be necessary to examine areas that alter activity as a function of learning individual landmark-action responses rather than examining contrasts with control conditions not requiring navigation decisions, as utilized in the present research.
Of interest, we found that the right putamen ROI was activated in the first path of long sequence probes relative to the first path of long landmark probes and that there was a high level of functional connectivity between putamen and hippocampal tail ROIs contralaterally, irrespective of condition. Although there were larger effects in the sequence condition for ipsilateral connectivity between putamen and hippocampal tail ROIs, a more highly powered study is required to examine whether these effects are statistically reliable, as weaker ipsilateral cooperative correlations were also found in the long landmark probes.
Higher putamen activity also occurred in sequence responders during conflict probes (experiment 1a).These results are consistent with models of putamen function that emphasize action sequencing (Dezfouli & Balleine, 2012;Garr, 2019;Pennartz et al., 2011;Smith & Graybiel, 2016), and are also consistent with the results of Pistell et al. (2009), where dorsolateral and dorsomedial lesions severely affected sequential egocentric maze performance in rodents.Our results raise the possibility that there may be collaboration between the hippocampus and both striatal regions in particular contexts, such as demanding route-planning situations (see also Spiers & Maguire, 2006).Another possibility is that both caudate-hippocampal and putamen systems are involved in the sequential aspects of familiar route navigation, but that as a route becomes highly habitual, there is a transfer of control to putamen systems (Bornstein & Daw, 2011;Khamassi & Humphries, 2012).
Studies tracking the learning, and over-learning, of sequential egocentric spatial navigation have yet to be conducted, either as rodent lesion studies or as human imaging studies. Although the study by Pistell et al. (2009) demonstrated that the acquisition of sequential egocentric maze navigation was severely affected by striatal lesions, it is not known whether such lesions would also impair performance subsequent to acquisition in intact animals. Following Dezfouli and Balleine (2012; see also Smith & Graybiel, 2016), it may be predicted that hippocampal lesions would have less effect once egocentric sequential route navigation is well learned, whereas striatal lesions, particularly in dorsolateral striatum, should impair performance even after over-learning. In terms of human neuroimaging, studies that track the learning process in sequential egocentric route navigation could test predictions about the brain systems underlying any transfer of control to putamen as a function of route familiarity.
Finally, an unpredicted finding in our study was the high level of connectivity between putamen and contralateral hippocampal tail ROIs across both long probe conditions. In a recent review, Freedberg et al. (2020; see also Chase et al., 2015) bring together diverse studies in which both competitive and cooperative relations between the hippocampus and differing striatal regions have been reported. They propose that a candidate region for mediating connectivity, given the paucity of direct connections between the hippocampus and dorsal striatum, is the dorsolateral prefrontal cortex (DLPC), which has strong connectivity to both regions, and also suggest that competitive interactions may dominate early in learning, as both systems compete for DLPC resources. As learning becomes embedded, this competitive relation gives way to cooperative relations, as both systems can be activated in parallel. They propose ways in which their model can be tested, including the use of TMS focused on DLPC at different stages of learning. As discussed above, in terms of route learning paradigms, tracking functional connectivity between the posterior hippocampus and different striatal territories at different stages of learning would help clarify the causes of these strong cooperative interactions.

Further considerations and conclusions
Some limitations in the present study qualify our conclusions, particularly with regard to the power to detect reliable effects in beta series connectivity analyses, and also with regard to the control condition utilized in experiment 1a, which evinced consistent DMN activity. In addition, because fewer participants chose to make sequence-based as opposed to landmark-based responses in the short conflict probes of experiment 1a, we were unable to directly compare these two groups in analyses including between-subjects contrasts.
Nevertheless, our findings support a learning-based model of hippocampal-to-caudate function (Khamassi & Humphries, 2012), in that knowledge of the sequential aspects of familiar route-following was found to activate hippocampus and caudate, and cooperative functional connectivity was found between these regions. In terms of putamen function, higher putamen activation was obtained when routes had to be planned, as would be predicted by models emphasizing the role of the putamen in action sequencing (Dezfouli & Balleine, 2012). Unexpectedly high levels of functional connectivity between posterior hippocampus ROIs and putamen ROIs were obtained, which require further investigation. Finally, we highlighted the need for studies, in both the nonhuman animal and fMRI literatures, that track whether the engagement of hippocampal and striatal systems changes over the course of route learning.
FIGURE 1 (a) Plane view of the VE route. The yellow, pink, and blue color blocks indicate the 3-junction routes that participants were trained on after reaching the initial training criterion (see text for an explanation of these). The purple rectangle indicates the 5-junction route. (b) First-person view of pathways and junctions in the VE. (c) Plane view of different types of short probes utilized in the 4 short probe runs. S = a sequence-based response, L = a landmark-based response, and I = an incorrect response. The double-headed arrows indicate the path segment from the whole route from which the short probe trial is derived (see text for further details).
Experimental stimuli were presented on an MRI-compatible monitor viewed through a mirror mounted on the MRI head coil. Participants used an MRI-compatible response box to indicate choices at each junction.
FIGURE 2 (a) Individual percentages of correct responding during the sequence probes of the short probe runs (N = 24). Markers are scaled to indicate the number of participants accounted for by each data point (range: 1-21 participants). (b) Individual percentages of landmark responses in short conflict probes for each of the 4 short probe runs in the landmark responder group (n = 15). Markers are scaled to indicate the number of participants accounted for by each data point (range: 1-12 participants). (c) Individual percentages of sequence responses in short conflict probes for each of the 4 short probe runs in the sequence responder group (n = 9). Markers are scaled to indicate the number of participants accounted for by each data point (range: 1-6 participants). (d) Individual percentages of errorless long landmark and long sequence probes for the 4 long probe runs (n = 22), see experiment 1b. Markers are scaled to indicate the number of participants accounted for by each data point (range: 1-22 participants for Landmark and 1-20 for Sequence).
FIGURE 3 (a) Left and right ROIs in hippocampus, caudate, and putamen. (b) Mean percent signal change (and SEs) in short sequence probes vs. short control probes contrasts in ROIs (N = 24). (c) Mean percent signal change (and SEs) in short conflict probes vs. short control probes contrasts in ROIs in sequence responders (top row, n = 9) and landmark responders (bottom row, n = 15).
The data from the final run of one participant were missing due to a technical error.
TABLE 3 Correlations between caudate, putamen, and posterior hippocampus ROIs in the first path of long landmark probes and long sequence probes, with correlations passing the Holm-Bonferroni threshold (N = 22, p ≤ .00217) in bold.
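For readers unfamiliar with the correction named in this caption, the following is a minimal sketch of the Holm-Bonferroni step-down procedure. The p-values, the alpha level of .05, and the function name are hypothetical and chosen only for illustration; they are not taken from Table 3. In this procedure, the smallest p-value is tested against alpha/m (the strictest threshold, where m is the number of tests), and the threshold relaxes at each subsequent step.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which tests survive Holm-Bonferroni.

    Tests are visited from smallest to largest p-value. The p-value at
    rank k (0-based) is compared with alpha / (m - k); testing stops at
    the first failure, and all remaining tests are not rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            break  # step-down: once one test fails, all larger p-values fail
    return reject

# Hypothetical p-values for four ROI-pair correlations.
example_p = [0.0004, 0.03, 0.011, 0.2]
print(holm_bonferroni(example_p))  # [True, False, True, False]
```

In this hypothetical example, the third-smallest p-value (.03) exceeds its step threshold of .025, so it and the larger p-value are not rejected, even though .03 would pass an uncorrected .05 criterion.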