Hippocampal BDNF regulates a shift from flexible, goal‐directed to habit memory system function following cocaine abstinence

ABSTRACT The transition from recreational drug use to addiction involves pathological learning processes that support a persistent shift from flexible, goal‐directed to habit behavioral control. Here, we examined the molecular mechanisms supporting altered function in hippocampal (HPC) and dorsolateral striatum (DLS) memory systems following abstinence from repeated cocaine. After 3 weeks of cocaine abstinence (experimenter‐ or self‐administered), we tested new behavioral learning in male rats using a dual‐solution maze task, which provides an unbiased approach to assess HPC‐ versus DLS‐dependent learning strategies. Dorsal hippocampus (dHPC) and DLS brain tissues were collected after memory testing to identify transcriptional adaptations associated with cocaine‐induced shifts in behavioral learning. Our results demonstrate that following prolonged cocaine abstinence rats show a bias toward the use of an inflexible, habit memory system (DLS) in lieu of a more flexible, easily updated memory system involving the HPC. This memory system bias was associated with upregulation and downregulation of brain‐derived neurotrophic factor (BDNF) gene expression and transcriptionally permissive histone acetylation (acetylated histone H3, AcH3) in the DLS and dHPC, respectively. Using viral‐mediated gene transfer, we overexpressed BDNF in the dHPC during cocaine abstinence and new maze learning. This manipulation restored HPC‐dependent behavioral control. These findings provide a system‐level understanding of altered plasticity and behavioral learning following cocaine abstinence and inform mechanisms mediating the organization of learning and memory more broadly.

balance between these systems is a lasting and potentially pervasive symptom of the addicted phenotype that may contribute to both the development and maintenance of drug addiction, as well as therapeutic challenges (Balleine & O'Doherty, 2010; Ersche et al., 2016; Everitt & Robbins, 2005; Everitt & Wolf, 2002; Goodman & Packard, 2016).
Whether maladaptive behaviors characteristic of drug abuse are supported by enhancements in habit memory systems, impairments in goal-directed memory systems, or a combination of both remains poorly understood (de Wit et al., 2018; Ersche et al., 2016; Everitt & Robbins, 2005; Robbins et al., 2008).

| Subjects
Male Long-Evans rats, approximately PND 40, 220-250 g at start of experiment (Charles River Laboratories, Wilmington, MA) were housed individually in a colony room maintained at a constant temperature (23°C) on a reverse 12 hr light/dark cycle (lights off from 7:00 a.m. to 7:00 p.m.) with ad libitum access to food and water (except during maze training, see below). Training and testing were conducted during the dark cycle. Following acclimation, rats were handled for 5-7 days prior to the onset of the experiments. All procedures were performed in accordance with guidelines stipulated by the NIH Guide for the Care and Use of Laboratory Animals and were approved by the UCLA Institutional Animal Care and Use Committee.

| Operant chambers
Rats were trained to self-administer cocaine in operant chambers (Med Associates Inc., St. Albans, VT) housed inside sound-attenuating cubicles. Each chamber had a stainless steel grid floor and contained two retractable levers located 6 cm above the floor. Two stimulus lights were mounted above the levers and a white house light was located 20 cm above the floor on the wall adjacent to the levers. During training, catheters were attached to polyethylene-50 tubing protected by a metal tether, which fed to a plastic swivel outside of the chamber that was in turn connected to a 10 mL syringe loaded with either saline or cocaine. All chambers were controlled by a Med Associates interface system.

| Dual-solution maze
Rats were trained on a plus-shaped maze with a wooden laminate-covered floor and plexiglass walls. Each of the four maze arms (103 × 10 cm²) extended from a central platform (12.5 cm²) at 90° angles. The maze stood on central legs placing the maze arms 97 cm off the ground. Plexiglass walls extended 20 cm from the floor of the maze. One lamp was positioned in the northwest corner of the room, and large posters were positioned on each of the room's four walls.
During habituation, training and probe testing a clear plexiglass barrier (20 cm tall) was used to block the arm opposite to the start arm, creating a T-shaped maze. The north maze arm was blocked during habituation and training, while the south maze arm was blocked during the probe test (Figure 1b). A plexiglass tub with woodchip bedding and wire lid was used as a holding cage between trials.

| Jugular catheterization
Rats were anesthetized with 2-3% isoflurane. The ventral neck and shoulder blade areas were shaved and alternating treatments of Betadine scrub and alcohol were applied. Two incisions were made: (a) a 1 cm lateral incision in an anterior/posterior direction on the neck over the right jugular vein and (b) a 1 cm medial incision between the scapulae on the back. The jugular vein was isolated through blunt dissection of the surrounding tissue. A catheter tube (0.04 in. O.D. × 0.20 I.D., Plastics One, Roanoke, VA) approximately 15 cm long was then threaded through the incision between the shoulder blades, passing through the subcutaneous space, to the incision in the neck and inserted into the vein toward the heart (~2 cm). The catheter tubing was attached to a guide cannula positioned between the animal's shoulder blades. The cannula, bent at a right angle and protruding from the skin, was embedded in dental acrylic and fixed with mesh (1 mm thick, 2 cm²) circumscribing its base. The catheter was anchored to the vein using two 6-0 silk (nonabsorbable) sutures, one below and one above the insertion point of the catheter into the vein. The incisions in the neck and back regions were closed using absorbable 4-0 sutures. Rats received a 7-day recovery period following surgery. Rats were injected with Carprofen (5 mg/kg, analgesic) prior to surgery and every 12 hr for 2 days postoperation. Starting 24 hr after surgery and continuing throughout self-administration procedures, catheters were flushed daily with 0.1 mL ampicillin (antibiotic, 4 mg/mL in 0.9% sterile saline) followed by 0.1-0.2 mL heparinized saline (50 IU/mL in 0.9% sterile saline).

| Experimenter-administered cocaine
Rats were treated daily with intraperitoneal (i.p.) injections of 20 mg/kg cocaine or saline vehicle for 14 days. This regimen of cocaine treatment was selected based on previous studies reporting effects that have been replicated in rodent models of cocaine self-administration and in tissues obtained from humans with a history of drug abuse (Kumar et al., 2005; Maze et al., 2010; Robison et al., 2013). Following the final injection, rats were left undisturbed in their home cages, except to monitor weight and health, for 17-21 days.

| Cocaine self-administration
Following 7 days of recovery from surgery, rats were trained to self-administer cocaine on a fixed ratio one schedule (0.75 mg/kg/infusion) in operant conditioning chambers (Med Associates). Training consisted of 3-hr daily sessions run until animals met a performance criterion of 12-14 consecutive days with less than 25% change in the number of drug infusions earned between days. A session began with the illumination of a house light and the extension of two levers. Presses on the designated active lever activated a syringe pump for drug infusion. After each response/infusion, levers were retracted and a cue light located above the active lever was illuminated for a 15 s timeout period. Presses on the inactive lever were recorded but resulted in no programmed consequences. Animals designated to receive saline were yoked to an animal designated to receive cocaine such that when a cocaine animal pressed the active lever, both animals received their respective infusions and programmed consequences.
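The stability criterion described above can be sketched in a few lines of Python. This is one possible reading, not the authors' procedure: it assumes "less than 25% change between days" means the relative change from each day to the next, and it counts the run of stable days ending at the most recent session. The function names and the example infusion counts are hypothetical.

```python
def trailing_stable_days(infusions, max_change=0.25):
    """Count consecutive days, ending at the last session, in which the
    day-to-day relative change in infusions stayed below max_change."""
    if not infusions:
        return 0
    run = 1  # the most recent day always counts as the start of the run
    for i in range(len(infusions) - 1, 0, -1):
        prev, cur = infusions[i - 1], infusions[i]
        # A zero-infusion day or a change of 25% or more ends the stable run.
        if prev <= 0 or abs(cur - prev) / prev >= max_change:
            break
        run += 1
    return run


def met_stability_criterion(infusions, min_days=12):
    """True once at least min_days consecutive stable days have accumulated."""
    return trailing_stable_days(infusions) >= min_days


# Hypothetical acquisition curve: two unstable days, then twelve stable days.
example = [10, 30] + [40] * 12
print(trailing_stable_days(example), met_stability_criterion(example))
```

Other readings of the criterion (for example, deviation from a running mean) would need a different comparison inside the loop.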

| Dual-solution maze task
Starting 1 week prior to maze training and continuing until tissue collection, rats were food restricted to maintain 93-96% of their free-feeding body weight. This mild food restriction schedule was administered because rats were required to only run four trials per test session. Water continued to be provided ad libitum.

| Habituation and training
Animals began food restriction 10-12 days after the last drug exposure. On Days 16 and 17 of drug abstinence, rats were habituated to the plus maze apparatus. Habituation involved one trial per day. Rats were placed on the south start arm of the maze and allowed to freely explore the maze for 5 min. During habituation sessions, a clear plexiglass barrier blocked access to the north arm and no reward pellets were available on the maze. After each habituation trial, animals were returned to their home cages and provided with 10 banana flavored grain-based pellets (Bio-Serv, Flemington, NJ; F0158, 190 mg), subsequently used as training reward. During habituation, experimenters observed entries to each available maze arm, and the number of arm traversals to identify any preference for specific arms. Prior to training, each animal was designated to receive reward in either the east or west maze arm. Rats that exhibited a bias during habituation were intentionally split evenly between receiving reward in their preferred or nonpreferred maze arm. Rewarded arms (east vs. west) were assigned in a counterbalanced manner within and between testing groups. Following habituation animals received a maximum of 12 days of training. Training days consisted of four trials per day. On the first trial of training Day 1, the reward arm was baited with five total pellets along the arm from the choice point to the reward cup at the end of the arm. For all subsequent training trials, one reward pellet was placed at the end of the designated reward arm. For each trial, animals were placed in the south start arm facing away from the maze and given 2 min to navigate to the reward site. If 2 min elapsed before the rat reached the reward site, the rat was placed at the reward site by the experimenter. On every training trial, rats were scored for their choice between east and west goal arms. A choice was defined when the rat had all four paws in the selected goal arm. 
Training continued until animals exhibited four consecutive days with a total of five or fewer errors across the 16 trials (at least 68.75% correct). If a rat failed to reach this criterion performance after 12 days of training, it was removed from the experiment (cocaine, N = 0; saline, N = 2; this was often associated with lack of maze exploration or irregular reward consumption). During the last 4 days of training, the time required for a rat to complete each trial was recorded and analyzed for differences between treatment groups.
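As an illustration of the performance criterion, the following Python sketch slides a 4-day window over daily error counts and reports the first training day on which the window total is five errors or fewer. The function name and the example error counts are hypothetical, and the sketch assumes four trials per day as stated in the text.

```python
def days_to_criterion(errors_per_day, window=4, max_errors=5, max_days=12):
    """Return the 1-based training day on which criterion was met,
    or None if it was not met within max_days."""
    days = errors_per_day[:max_days]
    for end in range(window, len(days) + 1):
        # Total errors over the most recent `window` consecutive days.
        if sum(days[end - window:end]) <= max_errors:
            return end
    return None


# With 4 days x 4 trials = 16 trials, 5 errors leaves 11 correct choices.
trials_in_window = 4 * 4
minimum_fraction_correct = (trials_in_window - 5) / trials_in_window

# Hypothetical rat: days 3-6 contain 2 + 1 + 1 + 1 = 5 errors.
print(days_to_criterion([3, 2, 2, 1, 1, 1, 0]), minimum_fraction_correct)
```

A return value of None corresponds to the exclusion rule described above.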

| Probe testing
Learning strategy was assessed with a single probe trial. For probe trials, the barrier was moved to block the south start arm, and animals were placed on the north arm at the beginning of the trial. Rats were scored based on the goal arm selected: those that turned into the previously rewarded arm were scored as "place learners" and those that turned into the previously unrewarded arm were scored as "response learners." Probe testing was conducted at one of two time points: 1 or 24 hr after criterion levels of performance were met for gene expression and Western blot analyses, respectively (Figure 1a,b).
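The scoring logic can be made explicit with a short Python sketch. It assumes the geometry described in the text (training starts from the south arm, the probe starts from the north arm, goal arms are east and west); the lookup table and function name are illustrative only. The table shows why a rat repeating its trained body turn ends up in the previously unrewarded arm on the probe.

```python
# Body turn needed to enter each goal arm from each start arm.
# A rat leaving the south arm heads north, so east is on its right;
# a rat leaving the north arm heads south, so east is on its left.
TURN = {
    ("south", "east"): "right",
    ("south", "west"): "left",
    ("north", "east"): "left",
    ("north", "west"): "right",
}


def classify_probe(rewarded_arm, probe_choice):
    """Score a probe trial run from the north arm after training from the south arm."""
    for arm in (rewarded_arm, probe_choice):
        if arm not in ("east", "west"):
            raise ValueError(f"unknown goal arm: {arm!r}")
    # Same spatial location as in training: place strategy.
    if probe_choice == rewarded_arm:
        return "place"
    # Opposite arm: the rat repeated the trained turn (response strategy).
    return "response"


print(classify_probe("east", "east"), classify_probe("east", "west"))
```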
2.6 | Tissue analyses

2.6.1 | RNA isolation and quantitative PCR
Bilateral DLS and dHPC were obtained from 1-mm coronal brain sections immediately following the 1-hr probe test (1 hr after performance criterion). RNA isolation and quantitative PCR (qPCR) were performed as described previously (Kennedy et al., 2013).
Briefly, tissue was homogenized in Trizol and processed according to the manufacturer's instructions. RNA was purified with RNeasy Micro columns (Qiagen, Germantown, MD, Cat. #7004) and spectroscopy confirmed that the RNA had 260/280 and 260/230 ratios >1.8. RNA was reverse transcribed into cDNA using iScript cDNA synthesis (Bio-Rad, Hercules, CA). qPCR was performed using 2.5 ng of cDNA for each reaction plus primers and SYBR Green.
Each reaction was run in duplicate and analyzed following the ΔΔCt method as previously described using glyceraldehyde-3-phosphate dehydrogenase (GAPDH) as a normalization control.
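For readers unfamiliar with the ΔΔCt calculation, a minimal Python sketch follows. It shows the standard form of the method (fold change = 2^(−ΔΔCt), which assumes roughly 100% amplification efficiency) rather than the authors' exact analysis pipeline. The function names and all Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Average technical replicates, then normalize the target gene
    to the reference gene (here, GAPDH): dCt = Ct_target - Ct_reference."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(sample_dct, control_dcts):
    """ddCt = dCt of the sample minus the mean dCt of the control group;
    relative expression is 2 ** (-ddCt)."""
    ddct = sample_dct - mean(control_dcts)
    return 2 ** (-ddct)


# Hypothetical duplicate Ct values for one cocaine-treated sample.
sample = delta_ct(target_cts=[24.0, 24.2], reference_cts=[18.0, 18.2])
# Hypothetical dCt values for the saline control group.
controls = [7.0, 7.1, 6.9]
print(round(fold_change(sample, controls), 3))
```

A fold change above 1 indicates higher expression than the control group, and a value below 1 indicates lower expression.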
Primers used for qPCR were as follows: GAPDH:

| Western blot
Bilateral DLS and dHPC were obtained from 1-mm coronal brain sections 24 hr after probe testing. Western blotting was performed as described previously (Kennedy et al., 2013). Briefly, tissue was homogenized in 50-90 μL of 1 M HEPES lysis buffer (1% SDS) containing protease and phosphatase inhibitors using an ultrasonic processor. Protein concentrations were determined using a DC protein assay and 10-20 μg samples of total protein were electrophoresed on 18% Tris-HCl polyacrylamide gels. Proteins were transferred to a PVDF membrane, blocked for 1 hr in 5% BSA, and incubated overnight at 4°C with either anti-acetyl H3 (Millipore, Burlington, MA, Cat. # 06-599, RRID:AB_2115283, 1:5,000) or anti-GAPDH (Cell Signaling Technology, Danvers, MA, Cat. #2118, RRID:AB_561053, 1:60,000) antibodies. Membranes were then incubated with secondary antibody conjugated to horseradish peroxidase for 1 hr at room temperature (anti-Rabbit, Vector Laboratories, Burlingame, CA, Cat. # PI-1000) and bands were visualized using SuperSignal West Dura substrate. Bands were quantified with the NIH ImageJ software and normalized to GAPDH to control for equal loading.

AlexaFluor-594-conjugated (red) goat secondary antibody (1:500; Invitrogen, Carlsbad, CA) for 2 hr. Sections were then washed three times in PBS for 30 min, mounted on slides, and coverslipped with ProLong Gold mounting medium with DAPI. All images were acquired using a Keyence (BZ-X710) microscope with a ×4 or ×20 objective (CFI Plan Apo), CCD camera, and the BZ-X analyzer software.

| Quantitative PCR
Animals were euthanized 5-7 days following the probe test. Bilateral DLS and dHPC were obtained from 1-mm coronal brain sections with 16-gauge blunt tip needles. Care was taken to collect tissue surrounding the base of the injection needle track visible on the brain section. This tissue was processed and analyzed for Bdnf exon IX expression as described above.

| Statistical analysis
Statistical analyses were performed using GraphPad Prism and SPSS Statistics. Datasets were analyzed using Student's t tests, one- and two-way analyses of variance (ANOVAs), Fisher's exact test (FET), and repeated measures ANOVA. The presented gene expression data (Figure 3c-e) were analyzed without correction for multiple comparisons, as small effect sizes were expected and false negatives would limit follow-up research. Identified changes in the expression of a key gene of interest, Bdnf IX, were replicated in a separate cohort of animals (Figure S1).
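Because the probe data are categorical (place vs. response learners in each treatment group), Fisher's exact test on a 2 × 2 table is the natural comparison. The sketch below is a standard-library Python implementation of one common two-sided definition (summing the probabilities of all tables no more likely than the observed one); it is not the GraphPad or SPSS routine used in the study. The counts in the example are invented for illustration and are not data from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows might be treatment groups (saline, cocaine) and columns
    strategies (place, response). Margins are held fixed and table
    probabilities follow the hypergeometric distribution."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability of observing x in the top-left cell given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 3 of 3 saline rats use place, 3 of 3 cocaine rats use response.
print(fisher_exact_two_sided(3, 0, 0, 3))
```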

| RESULTS
3.1 | Abstinence from repeated cocaine biases new behavioral learning toward the use of a DLS-dependent strategy
Aberrant learning processes contribute to the chronic relapsing nature of psychostimulant addiction. To investigate whether this represents a global and persistent change that impacts new learning outside of the context of drug-seeking/taking, we trained and tested rats on a dual-solution maze task following abstinence from repeated cocaine.
Extensive research employing this behavioral paradigm has shown that with limited training, rats will use an HPC-dependent (place) strategy to solve the task but with overtraining there is a shift to a DLS-dependent (response-based) solution (Kathirvelu & Colombo, 2013; Packard & McGaugh, 1996; Pych, Chang, Colon-Rivera, Haag, & Gold, 2005; Tolman, Ritchie, & Kalish, 1946, 1947). Because we hypothesized that cocaine exposure would promote DLS-dependent learning strategies, we employed a limited training procedure. Rats received experimenter-administered cocaine (20 mg/kg daily, i.p.) for 14 days. Following 21 days of drug abstinence, animals were trained in the dual-solution task. To investigate the learning strategy adopted by rats to solve the task, animals were given a probe test either 1 or 24 hr after criterion levels of performance were reached (Figure 1a).
Behavioral data obtained from these two distinct probe-testing procedures were initially examined separately, but as no difference was detected between the two experiments, the data were combined.
Following limited training, saline-exposed rats displayed place learning while rats abstinent from cocaine showed the predominant use of response learning (Figure 1d; p < .0001, FET). Importantly, all rats required a similar number of training days to reach criterion levels of performance (Figure 1c; t(59) = 1.10, p = .28). Measures of the latency to complete trials during the last 4 days of training did not differ across treatment groups (data not shown).
Next, to validate our behavioral findings in a model of contingent cocaine administration, we trained rats to self-administer cocaine (0.75 mg/kg/infusion; 12-14 days of asymptotic lever pressing) during 3 hr daily sessions (Figure 2a,b). Following 21 days of abstinence, rats were trained and tested (24 hr after performance criterion) on the dual-solution maze. Yoked saline control rats exhibited HPC-dependent place learning while rats abstinent from cocaine primarily displayed DLS-dependent response learning (Figure 2c; p = .024, FET).
Together, these data show that prior repeated cocaine exposure does not cause a general impairment in new learning but rather is associated with a shift from HPC- to DLS-dependent memory processing and behavioral control.

| Cocaine-induced memory system bias is associated with bidirectional changes in AcH3 and BDNF in the DLS and HPC
Learning and memory are critically dependent on brain plasticity that in turn requires gene transcription. We thus hypothesized that the observed shift in behavioral learning strategy following cocaine would be associated with altered experience-dependent transcriptional activation in the HPC and DLS. Acetylation of core histone proteins modifies chromatin structure to facilitate gene transcription and support new learning and behavioral plasticity (Kennedy & Harvey, 2015; Peleg et al., 2010; Rudenko & Tsai, 2014). Following the 24-hr probe test, we compared total levels of histone H3 acetylation (AcH3) in the dHPC and DLS of experimenter-administered cocaine and saline-treated animals. Levels of AcH3 were significantly increased in the DLS (Figure 3b; t(8) = 2.585, p = .032) and moderately decreased in the dHPC (Figure 3a; t(9) = 2.019, p = .074) of cocaine-treated animals.
A two-way ANOVA was used to compare fold change AcH3 levels in the DLS and dHPC, and revealed a main effect of brain region [F(1,17) = 9.813, p = .006] and an interaction between brain region and drug treatment [F(1,17) = 9.813, p = .006]. AcH3 levels were significantly higher in the DLS compared to the dHPC in cocaine-treated animals (p = .0003).

FIGURE 1 Memory system bias following abstinence (21 days) from experimenter-administered cocaine (20 mg/kg/day, 14 days). (a) Timeline for experimenter-administered cocaine experiments. Probe testing for future gene expression and western blot analyses was conducted at 1 and 24 hr after performance criterion was met, respectively. (b) Schematic representation of the dual-solution task used for all experiments. (c) Number of days to reach criterion levels of performance. Data presented as mean ± SEM. (d) Saline-treated rats used an HPC-dependent place strategy to solve the task while cocaine-treated rats showed a bias toward the use of a DLS-dependent response strategy. DLS, dorsolateral striatum; HPC, hippocampal
We next profiled changes in the expression of candidate immediate early genes (IEGs) previously shown to play a role in activity-dependent plasticity in both the striatum and HPC. In order to capture the transient nature of IEG induction in response to new learning, we compared mRNA levels of Fos, Jun, early growth response (Egr1-3) and Bdnf in the dHPC and DLS of experimenter-administered cocaine and saline-treated animals immediately following the 1-hr probe test on the dual-solution maze. No group differences were observed in the mRNA levels of Fos, Jun, Egr1, and Egr3 in dHPC and DLS (Figure 3c,d).

FIGURE 2 Memory system bias following abstinence (21 days) from cocaine self-administration (fixed ratio one, 0.75 mg/kg/infusion). (a) Experimental timeline. (b) Cocaine self-administration. (c) Yoked-saline rats used an HPC-dependent place strategy to solve the task while cocaine self-administering rats showed a bias toward the use of a DLS-dependent response strategy. DLS, dorsolateral striatum; HPC, hippocampal

FIGURE 3 Memory system bias following abstinence (21 days) from experimenter-administered cocaine (20 mg/kg/day, 14 days) is associated with bidirectional changes in permissive histone acetylation and Bdnf expression in dHPC and DLS. (a and b) Quantification and representative Western blots of total AcH3 in the dHPC and DLS 24 hr after probe testing (N = 5-6/group). (c-f) mRNA expression in the dHPC (c and e) and DLS (d and f) 1 hr after criterion performance on the plus maze and immediately following probe testing (N = 6-9/group). Data presented as mean ± SEM. #p = .074. *p < .05. dHPC, dorsal hippocampus; DLS, dorsolateral striatum; HPC, hippocampal
3.3 | Overexpression of BDNF in the dHPC restores HPC-dependent behavioral learning and control following cocaine abstinence
HPC lesions or inactivation can facilitate DLS-dependent learning and neurochemical manipulations that enhance HPC memory function can delay transitions from HPC- to DLS-dependent behavioral control (Chang & Gold, 2003; McDonald & White, 1993; Packard, 1999; Packard & McGaugh, 1996; Schroeder, Wingard, & Packard, 2002).
Based on this evidence, we hypothesized that increasing BDNF in the dHPC would restore HPC-dependent behavioral learning in cocaine-treated animals. We tested this hypothesis by overexpressing BDNF in the dHPC. Following 14 days of experimenter-administered cocaine, rats were injected with AAV-BDNF or AAV-mCherry (control) into area CA1 of the dHPC (Figure 4a,b) (Goldberg et al., 2015; White et al., 2016). qPCR of tissue punches from injected areas revealed an increase in BDNF message in tissue transfected with AAV-mCherry-BDNF over tissue transfected with AAV-mCherry (Figure 4b, t (18

| DISCUSSION
The current experiments investigated the effects of cocaine on HPC and DLS behavioral learning and control. Prolonged abstinence from repeated experimenter- or self-administered cocaine biased new behavioral learning toward the use of DLS-dependent strategies. This memory system bias was associated with upregulation and downregulation of transcriptionally permissive AcH3 and BDNF in the DLS and dHPC, respectively. We further observed that viral overexpression of BDNF in the dHPC was sufficient to restore HPC-dependent behavioral control following cocaine. Our results provide novel circuit and mechanistic insight into the persistent and pervasive effects of cocaine on the organization of learning and memory.
Functional and molecular adaptations within frontostriatal circuits have been widely implicated in the development and maintenance of drug addiction as well as drug-induced changes in behavioral learning more broadly. Psychostimulant exposure accelerates DLS-dependent habit behavioral responding in instrumental tasks (Corbit & Janak, 2007; Corbit, Nie, & Janak, 2012; Fuchs et al., 2006; Nelson & Killcross, 2006; Yin et al., 2004; Zapata, Minney, & Shippenberg, 2010) and this has been attributed to hypofunction and decreased plasticity in frontocortical brain regions as well as hyperfunction and increased plasticity in the dorsal striatum (DSTR) (Chen et al., 2013; Corbit, Chieng, & Balleine, 2014; Lucantonio et al., 2012; Moratalla et al., 1996; Schoenbaum & Setlow, 2005; Vanderschuren et al., 2005; Volkow et al., 2006). Instrumental conditioning paradigms have some limitations in understanding the broader impact of drugs of abuse on behavioral learning, as they do not necessarily engage or require the HPC, a memory system known to participate in the rapid encoding of both external sensory and internal motivational information to support goal-directed behaviors (Eichenbaum et al., 1992; Kennedy & Shapiro, 2004; Redish, 2016; White & McDonald, 2002; Wikenheiser & Redish, 2015; Wikenheiser & Schoenbaum, 2016). In maze navigation tasks like the present one, where new behavioral learning can be supported by either the HPC or DLS, HPC-dependent place learning is acquired more rapidly, and with overtraining there is a transition to DLS-dependent (response-based) solutions (Kathirvelu & Colombo, 2013; Packard & McGaugh, 1996; Pych et al., 2005; Tolman et al., 1946, 1947). Rats receiving HPC inactivation following limited training in the dual-solution task perform at chance levels during probe testing, suggesting that under these conditions the DLS has not yet acquired task-relevant information (Packard & McGaugh, 1996).
In the present experiments, cocaine-exposed rats show the predominant use of DLS-dependent response strategies with only limited training. These findings strongly suggest that repeated cocaine exposure is associated with enhanced DLS-dependent behavioral learning and replicate previous reports employing both instrumental and maze learning tasks (Corbit et al., 2014; Fuchs et al., 2006; LeBlanc et al., 2013; Schmitzer-Torbert et al., 2015; Udo, Ugalde, DiPietro, Eichenbaum, & Kantak, 2004; Zapata et al., 2010). Although it is possible that cocaine-induced memory system bias may be mediated through enhanced HPC-dependent behavioral learning and an accelerated shift from HPC to DLS behavioral control, this is unlikely given the modest criterion levels of performance employed in the present experiments and the similarity between cocaine and saline-treated animals in task acquisition. Thus, our new results further implicate impaired HPC memory function in this process and align with recent clinical evidence showing that individuals with a history of substance use are biased toward the early use of caudate nucleus-dependent learning strategies when required to solve a virtual dual-solution navigation task (Bohbot, Del Balso, Conrad, Konishi, & Leyton, 2013).
Posttranslational modifications of histone proteins have emerged as critical regulators of experience-dependent transcriptional activation and plasticity. Increases in transcriptionally permissive histone acetylation throughout the brain's reward circuitry have been shown to regulate cocaine-induced molecular and behavioral adaptations (Kennedy et al., 2013; Kennedy & Harvey, 2015; Malvaez et al., 2013; Malvaez, Sanchis-Segura, Vo, Lattal, & Wood, 2010; Renthal et al., 2007; Sadakierska-Chudy et al., 2017; Sadri-Vakili et al., 2010; Schmidt et al., 2012). Multiple reports have further demonstrated a similar role for histone acetylation in mediating learning and memory processes across a variety of behavioral tasks (Castellano et al., 2012; Kilgore et al., 2010; Lattal, Barrett, & Wood, 2007; Malvaez et al., 2018; Morris, Mahgoub, Na, Pranav, & Monteggia, 2013; Peleg et al., 2010; Rudenko & Tsai, 2014; Stefanko, Barrett, Ly, Reolon, & Wood, 2009). We found that following maze learning, levels of total AcH3 were increased and decreased in the DLS and dHPC of cocaine abstinent animals, respectively. These results suggest that cocaine exposure may persistently alter chromatin-mediated transcriptional regulatory events in both the HPC and DLS, which may in turn promote the capture of new learning by the DLS memory system. Our findings are of particular interest in light of recent evidence showing that manipulations that increase or decrease levels of histone acetylation in the DSTR either accelerated or prevented habitual responding in an instrumental learning task (Malvaez et al., 2018). Together, the data support a common histone acetylation-dependent mechanism for both adaptive and maladaptive shifts between circuits mediating flexible, goal-directed and habit behavioral learning and control.
Activity-dependent increases in BDNF play a critical role in the consolidation of new memories across multiple brain systems (Bambah-Mukku, Travaglia, Chen, Pollonini, & Alberini, 2014; Bekinschtein et al., 2007; Dragunow et al., 1993; Rattiner, Davis, & Ressler, 2004; Tokuyama, Okuno, Hashimoto, Xin Li, & Miyashita, 2000). BDNF regulates cocaine-mediated behavioral and molecular adaptations in a complex, brain region- and cell-type-specific manner (Li & Wolf, 2015). The collective data suggest that elevated BDNF in reward circuits including the DSTR, ventral tegmental area and nucleus accumbens facilitates behavioral responses to cocaine while manipulations that increase BDNF in the medial prefrontal cortex (mPFC), a brain region implicated in flexible/goal-directed behavioral responding, can oppose such behaviors. Here, we found that following new learning Bdnf expression was increased in the DLS and decreased in the dHPC of cocaine-treated animals. These data both corroborate and extend previous findings identifying BDNF as a molecular target that may mediate both enhanced and impaired plasticity across habit and goal-directed memory circuits following repeated cocaine. BDNF activates multiple intracellular signaling pathways to increase de novo gene transcription and synaptic plasticity. Here we sought to identify regulation of candidate genes downstream of BDNF in the DLS and HPC following cocaine abstinence and maze learning. In vitro studies have shown that BDNF signaling through the TrkB receptor supports the induction of Fos and members of the Egr family of transcription factors (Egr1-3) (Calella et al., 2007). Both Fos and Egr IEGs are upregulated in response to psychostimulants and new behavioral learning, albeit in a time-dependent and brain region-specific manner (Bozon et al., 2003; Chandra et al., 2015; Dragunow, 1996; Jouvert, Dietrich, Aunis, & Zwiller, 2002; Pollak, Herkner, Hoeger, & Lubec, 2005; Torres et al., 2015).
Although we found no change in the expression of Fos, Egr1, and Egr3 in the DLS and dHPC, the reported decrease in Egr2 in the dHPC of cocaine-treated rats is of particular interest. HPC LTP is decreased following prolonged abstinence from cocaine (Keralapurath, Briggs, & Wagner, 2017; Thompson et al., 2004) and the induction of Egr2 has been implicated in its maintenance (Williams et al., 1995; Worley et al., 1993). Additional evidence suggests that experience-dependent increases in Egr2 may be mediated through histone acetylation-dependent changes in chromatin structure (Torres et al., 2015). Our findings showing a decrease in Bdnf, Egr2, and AcH3 in the dHPC following cocaine abstinence reveal a novel molecular pathway that may mediate cocaine-induced impairments in HPC memory function. Future investigations will be necessary to examine the time course and mechanisms supporting these changes. One possibility is that repeated cocaine exposure causes a persistent up- and downregulation of AcH3 in the DLS and dHPC, respectively. This in turn may serve to both prime (DLS) and repress (HPC) the induction of BDNF and other IEGs in response to novel experience to support a lasting and global bias toward inflexible, DLS-dependent behavioral learning and control.
Cognitive behavior therapy (CBT) as a monotherapy or in combination with pharmacotherapy has shown efficacy in the treatment of substance use disorders. CBT encompasses a broad range of behavioral treatments focused on promoting adaptive cognitive and behavioral learning strategies. Clinical investigations have reported a correlation between performance on HPC-dependent memory tasks and CBT treatment retention in cocaine abusers (Aharonovich et al., 2006) and learning and memory deficits in tasks that require fronto-HPC circuits are associated with increased cocaine relapse outcomes following inpatient treatment (Fox, Jackson, & Sinha, 2009). Competitive interactions between HPC and DLS memory systems have been well established. Experimental manipulations that impair HPC function enhance DLS-dependent behavioral learning and manipulations that impair DLS function after overtraining restore HPC behavioral control (Kathirvelu & Colombo, 2013; Matthews & Best, 1995; Packard & McGaugh, 1996; Tomas Pereira et al., 2015). We found that overexpression of BDNF in the dHPC was sufficient to restore HPC-dependent behavioral learning and control. These data suggest that the pervasive and predominant use of DLS-dependent learning strategies following repeated cocaine may be corrected through interventions targeted at restoring plasticity and function in goal-directed memory circuits, including the HPC. Our findings provide a broader understanding of how the addicted brain processes and stores new information and support recent clinical and preclinical evidence suggesting that interventions focused on enhancing function within goal-directed memory circuits may improve therapeutic outcomes in patients seeking treatment for cocaine abuse (Corbit et al., 2014; Ersche et al., 2016).
In the present experiments, analyses were focused on HPC and DLS memory systems, components of broader networks known to mediate goal-directed and habit behavioral learning (Redish, 2016; Wikenheiser & Schoenbaum, 2016). The mPFC is reciprocally connected with both the HPC and striatum. Prelimbic mPFC lesions promote habit in instrumental tasks while infralimbic mPFC lesions reinstate goal-directed behavioral responding. Evidence from plus maze navigation tasks further demonstrates a critical role for the mPFC in mediating shifts between HPC- and DLS-dependent behavioral strategies (Ragozzino, Detrick, & Kesner, 1999; Ragozzino, Wilcox, Raso, & Kesner, 1999; Rich & Shapiro, 2007), and HPC-PFC interactions are increased during flexible decision-making (Spellman et al., 2015; Young & Shapiro, 2011). Future studies are needed to understand the effects of psychostimulants on dynamic interactions within these circuits, associated neuroplastic changes across circuit components, and the broader impact of such changes on learning processes.
In conclusion, our findings suggest that cocaine-induced impairments in HPC memory function and plasticity may facilitate a pervasive enhancement in DLS-dependent behavioral learning and control and identify altered experience-dependent regulation of BDNF and AcH3 in this process. Bidirectional changes in BDNF and AcH3 across multiple brain regions following cocaine exposure challenge BDNF and AcH3 as therapeutic targets. Follow-up studies investigating regulatory events both upstream and downstream of the reported changes may facilitate the identification of a common molecular pathway mediating cognitive and behavioral abnormalities associated with excessive exposure to psychostimulants. The present results provide new insight into the persistent effects of cocaine on behavioral learning and may ultimately contribute to the development and refinement of both cognitive and pharmacological therapies for treating cocaine abuse and addiction.