Play, but not observing play, engages rat medial prefrontal cortex

Rats have elaborate cognitive capacities for playing Hide & Seek. Playing Hide & Seek strongly engages medial prefrontal cortex and the activity of prefrontal cortex neurons reflects the structure of the game. We wondered if prefrontal neurons would also show a mirroring of play-related neural activity. Specifically, we asked how activity in the rat medial prefrontal cortex differs when the animal plays itself versus when it observes others playing. Consistent with our previous work, when the animal plays itself we observed medial prefrontal cortex activity that was sharply locked to game events. Observing play, however, did not lead to a comparable activation of rat medial prefrontal cortex. Firing rates during observing play were lower than during real play. The modulation of responses in medial prefrontal cortex by game events was strong during playing Hide & Seek, but weak during observing Hide & Seek. We conclude that the rat prefrontal cortex does not mirror play events under our experimental conditions.


| INTRODUCTION
Play is an understudied phenomenon shared across taxa (Bekoff & Byers, 1998). By definition, play is purposeless and free (Huizinga, 1949). Play manifests itself most intensely during development and is important for the development of social skills (Himmler, 2016). This implies a role for play in appropriate learning and development of behaviors. Recent work has increased our understanding of rat play behavior (Pellis, Field, Smith, & Pellis, 1997), the ultrasound vocal manifestations of positive emotions by playing rats (Panksepp & Burgdorf, 2003), neural correlates of ticklishness (Ishiyama & Brecht, 2016) and even the neural underpinnings of the social role-play game of Hide & Seek (Reinhold, Sanguinetti-Scheck, Hartmann, & Brecht, 2019). Our recent work has shown that rats are avid role players, who can play Hide & Seek as both hiders and seekers. While playing, they emit highly game-specific vocalization patterns consistent with previous work on ultrasonic vocal manifestations of positive behavior (Panksepp & Burgdorf, 2003). Such play behavior intensely engages the medial prefrontal cortex (mPFC), resulting in activity sharply related to important events of the game (Reinhold et al., 2019). Extensive work in primates has shown the existence of mirror neurons, cells with congruent activity related to action execution or observation. Observation-related activity has been described in a variety of cortical areas (Breveglieri et al., 2019; Dushanova & Donoghue, 2010; Pani, Theys, Romero, & Janssen, 2014; Tkach, Reimer, & Hatsopoulos, 2007). It is unknown, however, whether observing others play also engages the mPFC and prefrontal neurons in similar mirroring activity.
Here, we ask: (i) How do rats react to observing others play Hide & Seek? (ii) Is the medial prefrontal cortex engaged in the same way while playing as while observing play?

| METHODS
Four male Long-Evans rats were subjects in this study (Janvier-labs). All experimental procedures were performed according to German guidelines on animal welfare and animal experimentation permits G 0297/18 and followed detailed methods described in our previous paper (Reinhold et al., 2019). The only differences consisted of the inclusion of a second observer rat in a 60 × 40 × 30 cm glass box (as shown in the figure), and the alternation between observer and playing rats. All data shall be shared upon request.

| Animals
We used juvenile male rats, which are known to be very playful (Panksepp & Burgdorf, 2003; Pellis et al., 1997). Animals were maintained in a 12:12 hr inverted light/dark cycle with ad libitum access to food and water. Rats were housed individually to increase their social bond to the experimenter as their sole interaction partner (Panksepp & Burgdorf, 2003). Experimental manipulations started at postnatal day 21 for all animals. All experimental procedures were performed according to German guidelines on animal welfare and animal experimentation permits G 0297/18.

| Experimental setup
For habituation, training and experiments, animals were brought into the 5 m × 4 m illuminated room; animals adapted quickly to the room and the illumination (100-140 lux). Three hiding places for the experimenter were built using large cardboard boxes and positioned at equal distance from the center of the room. As additional hiding locations for the animals, we provided four boxes with an opening made from transparent plastic; two boxes were sprayed opaque. In order to make the boxes more attractive, we also included a small piece of cloth. Positions of boxes and large hiding places changed throughout training and experiments. We positioned a lockable "start box" with a remote cable-controlled opening mechanism sized 32 × 21 × 15 cm at the center of the room. In seek trials, animals were locked in the start box, while the experimenter was hiding. To prevent animals from using auditory cues or simply following the remote cable, we included white noise masking while the experimenter went into cover and ran decoy cables to all the different hiding locations.

| Observer box
The experimental setup also included a 40 × 60 × 30 cm glass observer box equipped with an ultrasound microphone, to which subjects were also habituated. Rats were placed in this box in order to observe another rat playing with the experimenter. The observer box was situated as in Figure 1. We also tested other locations for the observer box, without observing any difference.

| Behavioral experiments and habituation
Habituation and training were performed between 10 a.m. and 8 p.m. during the dark phase of the rats' circadian cycle. Although the overall training strategy was similar for all animals, details of the training protocol were adjusted individually. We either started to train animals on the "seek" paradigm or the "hide" paradigm. Handling of the animals was done using cotton gloves.
Prior to the training, rats were habituated to the experimenter and the experimental setup starting at postnatal day 21. Rats were habituated to the experimenter by 5-10 min intensive handling per session. Gradually, the rats' comfort zone increased. Once rats explored the whole experimental environment, the experimenter started with playful interactions, habituating the rat to tickling and hand chasing as described in Ishiyama & Brecht, 2016. The whole process took 5-10 days with approximately 1 hr habituation per day.

| Seek
In the seek paradigm, the experimenter places the rat in the start box, locks the box and hides. Then, the experimenter remotely opens the start box and waits for the rat to come and find him. Seeking time for the rat was limited to 150 s. If the rat succeeded in finding the experimenter within the given time frame, it was rewarded by a playful interaction. This interaction involved a play phase of 20-50 s including tickling, hand chasing and rough-and-tumble-like play, as well as petting of the rat. If the rat missed the experimenter, meaning it did not find the experimenter within 150 s, it was picked up and brought back to the start box without playful interaction. A finding event was defined as the rat being in line of sight and less than 40 cm distant from the experimenter.

| 4129 CONCHA-MIRANDA et Al.
Rats were introduced to seek stepwise. At the beginning, rats were given visual cues. The rat was placed in the open start box, while the experimenter walked away to one of the hiding places. Initially, the experimenter did not hide behind a cardboard box but instead sat down next to it clearly visible to the rat. Once the animals robustly approached the experimenter, the experimenter started to hide. In training, whenever the rat did not find the experimenter within 150 s, the same hiding place was used again in the next round. As a next training step, the experimenter started closing the lid of the start box. Then again, the experimenter would first be visible after box opening before starting to hide fully. Finally, we introduced white noise to mask acoustical cues. On the last training step, the experimenter was randomly hiding behind different hiding places, which is closest to the way humans play Hide & Seek. Rats were not conditioned with food or water rewards. Instead, in all seek protocols, playful interactions after finding the experimenter were used to teach rats to "seek".

| Hide
In the hiding paradigm, rats chose a hiding location and the experimenter sought them. Time to search for a hiding location was limited to 90 s. Much like in the seek paradigm, we played with the rats for 20-50 s after successful hiding. A hiding trial was scored as successful if the rat went to one of the seven provided hiding locations (2 transparent boxes, 2 opaque boxes, 3 cardboard hiding places) or another hidden area not visible to the experimenter. If the animal failed to hide, it was carried back to the start box without a playful interaction. Two types of failures were distinguished. If the animal went to and stayed longer than 10 s at a place that was clearly visible for the experimenter, this was defined as a visual failure. Other failures included behaviors like the rat going to a proper hiding location but leaving it within 5 s (or before the experimenter found it), the rat not going to any hiding location (hidden or visible) within 90 s, or the rat approaching the experimenter and staying within 40 cm reach for >30 s.
After habituation, rats were introduced to "hide" as follows: the animal was put into the start box, lid open, with the experimenter sitting next to it. As soon as the animal jumped out of the box, the experimenter rewarded it with a playful interaction and returned it to the box. This procedure was repeated several times until the animal reliably jumped out. From then on, the animal was not rewarded anymore when it jumped out of the box, but only when it went to a hiding location. Initially, the experimenter came to the hiding location very fast (within 5 s). Once the rat did 5 to 10 successful trials, the experimenter took 10-15 s to search and find it.

FIGURE 1 Playing and observing Hide & Seek. (a) Cartoon representing the structure of the game. (b) Image of a typical seek trial in a ~20 sqm room, depicting the experimenter hiding behind one of the cardboards, the playing rat (inside dashed circle) about to find the experimenter, and the observer rat inside the observing cage (marked in red, left). (c) Timing of a typical session, where the recorded rat was observing another rat play (Observing, red: transparent green and pink) or when the recorded rat itself was playing (Playing: green and pink). The start of each trial is indicated by a black tick mark on the top. During observing trials, the recorded rat was located inside the observing box. The white gaps indicate the transition between trials and the switch of rats between observing and playing roles. During the section in which the recorded rat was observing, we classified its behavior into 3 categories: Rest (purple), Grooming (light blue) and Engaged (orange).

| Hide & seek-task switching
We trained animals sequentially to play "seek" as well as "hide". Once the animals mastered both forms of play, we introduced a combined game involving both paradigms. This Hide & Seek paradigm required the rat to switch from "hide" to "seek" or from "seek" to "hide". In these switching sessions, we assigned who seeks and hides and signalled this assignment to the rats by two cues: (a) In seeking trials, the experimenter closed the lid of the start box. (b) In hiding trials, the start box was left open and the experimenter sat immobile next to it, loudly counting. Each game started with 5 rounds of either "Hide" or "Random Seek" and continued with 5 rounds of the other form of play.

| Observing hide & seek
During training sessions, a second rat was placed inside the observer cage. The rat stayed inside the box for 20-30 min and chocolate chips were delivered on the first days to encourage habituation to the box. During the whole 20-30 min, another rat (the demonstrator rat) was being trained in one of the forms of play (either hide or seek). After this time, rats were exchanged and the demonstrator-observer roles were inverted. The new demonstrator rat went through the other training program: that is, if the former demonstrator rat was being trained to play "seek" (hide), then the later demonstrator (the rat that was the observer at the beginning) was trained to play "hide" (seek). Once both rats learned their corresponding form of play, each rat was trained in the second form. When both rats mastered both forms of play, they were trained to switch between tasks as described in the previous paragraph. In this stage of training, the observer rat was able to observe both forms of play while staying inside the box. In this training phase, the observer rat watched 5 "hide" and 5 "seek" trials, and then, it was taken out of the box to play 5 "hide" and 5 "seek" trials while the other rat now remained inside the observer cage. The order of the forms of play was exchanged between sessions, so that some sessions started with 5 "hide" and others with 5 "seek" trials.

| Video recording and analysis
We acquired wide angle video of the rats' behavior via one overhead Flir Chameleon 3 camera (FLIR® Systems, Inc., USA) running at 30 frames per second. A second Flir Chameleon 3 camera recorded observer behavior inside the observer cage. The images obtained and the corresponding camera metadata related to frame identity and digital input pin states were recorded using Bonsai (Lopes et al., 2015) performing online tracking of the rat.
For the playing rat, videos were manually analyzed for search times, latencies to jump out of the start box, latencies to hide, failures to seek or hide, preferences for different types of hiding locations and probabilities to go to the last hiding location. Regarding the rat inside the observer cage, animal behavior was manually classified as rest, grooming and engaged.
Onset and offset times of different behaviors and events were marked using ELAN version 4.9.3 and newer. Specific behaviors were defined as follows: Darting: Includes all periods where the animal is moving with high speed on a relatively straight trajectory.
Freezing: In contrast to resting, includes times when the animal suddenly stops any movements. In resting, the end of locomotion occurs gradually and the animal may lay down. Both behaviors end with movement.
Exploring: Was defined as periods where the animal moves with low speeds and sniffs the environment.
Jump out: Was defined as the act of leaving the starting box. It starts with moving over the wall of the start box and ends when all paws touch the ground outside.
Jump in: Was defined when the animal starts moving from the experimenter's hand toward the start box and ends when the animal is inside of the start box.

Grooming: Starts with the first grooming motions on body and ends if no further grooming appears within the next five seconds.
Sighting: Was defined as the time when the animal has a clear line of sight to the experimenter. It ends with either interaction or the line of sight being interrupted.
Interaction: Includes all behaviors, where the experimenter plays with the animal and ends only when the animal leaves the experimenter for at least 20 s or return starts.
Return: Describes the time where the animal is carried back to start box by the experimenter and ends with the animal jumping out of the hand.
Hiding: while hiding, the animal is inside an area with presumed intent of being out of line of sight to the experimenter. It ends when the animal comes out, line of sight is established or interaction starts.

| Observer rat video analysis
Using the same method, we classified the observer rat behavior into 3 categories.
Grooming: Starts with the first grooming motions on body and ends if no further grooming appears within the next five seconds.
Rest: Starts when the rat lies on the bedding and is not using its legs to support its weight. Rats remained awake during this behavior. The behavior ends when the rat resumes locomotion.
Engaged: Starts when the rat retakes body support on its legs and is locomotive. This includes many behaviors, like sniffing, rearing, exploring, rearing against the glass. The behavior terminates at Grooming or Rest onset.

| Ultrasonic vocalization recording and analysis
Ultrasonic vocalizations (USVs) produced by the rats were recorded using four microphones, one above each hiding place and the fourth above the starting box. We used the following microphones: condenser ultrasound CM16/CMPA (frequency range 10-150 kHz) and omnidirectional FG-series electret capsule ultrasound microphones by Knowles (frequency range 10-120 kHz; Knowles Electronics, LLC., USA). Data were acquired at a sampling rate of 250 kHz and 16-bit resolution using the Avisoft UltraSoundGate 416H and Avisoft-RECORDER software. USVs were identified using DeepSqueak (Coffey, Marx, & Neumaier, 2019). USVs were detected using a pre-trained rat neural network, analyzing audio frequencies between 40 and 100 kHz. All automatically detected calls were manually reviewed to precisely define their beginning, end and proper frequency range. Any event misclassified as a USV was also rejected on inspection. For the USV rate analysis, the time of the USV was defined as its starting time. USVs were classified as Combined, Flat, Ramp-Up, Ramp-Down, Bow, Short or Modulated following Ishiyama and Brecht (2016).

| Electrophysiological recording and analysis
We implanted 4 Long-Evans rats with Harlan-8 tetrode drives (Neuralynx Inc., USA) in the medial prefrontal cortex (Cg1, PL, IL) at coordinates (Bregma + 3, lateral + 0.5). Tetrodes were arranged in a 2 by 4 matrix resulting in recordings between 2.3 mm and 3.8 mm anterior from bregma and 0.25 mm and 0.75 mm lateral from bregma. Tetrodes were turned from 12.5 μm diameter nichrome wire (California Fine Wire Company) and gold plated to 250-300 kΩ impedance. In order to identify tetrodes in the anatomy of mPFC, tetrodes were stained with fluorescent tracers DiI and DiD (ThermoFisher Scientific Inc., USA) before implantation. Before surgery, animals were initially anesthetized with isoflurane (cp-pharma, 31303 Burgdorf) followed by an intraperitoneal injection of ketamine (100 mg/kg, Medistar Arzeneimittelvertrieb, 59387 Aschberg) and xylazine (7.5 mg/kg, WDT, 30827 Garbse). During surgery, body temperature was kept at 36°C with a thermal blanket, and a non-traumatic head holder was used. Surgeries took 2-3 hr, after which the animals woke up and were treated with carprofen (5 mg/kg, Zoetis Deutschland GmbH 10785 Berlin) for 2 days. Animals were checked on regularly to ensure proper recovery from the surgery. No complications occurred. Two to three days after surgery, rats were gradually re-habituated to play Hide & Seek. For the first 2 days, rats freely explored the room or remained as observers while a demonstrator rat played. Once animals showed interest in playing again, the Hide & Seek protocol was restarted.
We recorded neural signals using a 32 Channel wirefree neural logger developed by Deuteron Technologies, recording extracellular signals at 32 kHz. The system consists of a headstage performing amplification and digitalization, where the multiplexed signal is then processed in a processor board and stored on a micro SD card on the head of the animal. The whole system is mechanically attached to the cap of the Harlan-8 drive and covered by a protective case that also served as a red target in our online movement tracking.
The processor board of the Neural Logger receives and transmits radio signals allowing for communication with a base station. Radio communication is fast enough to enable synchronization via TTLs between the base station and the logger. Hardware copies of these TTLs are also sent to the cameras via I/O pins and to the Avisoft hardware to ensure synchronization of all devices. Extracellular recordings were spike detected and sorted using Kilosort (Pachitariu, Steinmetz, Kadir, Carandini, & Harris, 2016). After the initial sorting, clusters were manually curated using Phy in Python. Cluster quality was assessed by spike shape, cluster separation of its principal components, SNR, and an ISI histogram with lack of contamination in a 1 ms refractory period.
The present report includes data of 3 sessions of Hide & Seek per rat. Animals were recorded while playing and observing play during each session, which consisted of 5 hide & 5 seek "playing" trials, and 5 hide & 5 seek "observing" trials. We found between 8 and 17 good quality units per session.
After conclusion of the experiment, rats were anesthetized with urethane and depth positions on selected tetrodes were marked with electrolytic lesions. The lesions were made using a NanoZ (Neuralynx Inc., USA) with a DC current of −8 µA for 8 s, tip negative. The rats then received an overdose of the anesthetic and were transcardially perfused using a prefix solution followed by PFA at 4%. The brains were extracted and post-fixed in 4% PFA for 18-24 hr before being sectioned coronally into 100 µm thick sections. Before proceeding with tissue staining, slices were photographed at an epi-fluorescence microscope to reveal differential patterns of fluorescence dyes (DiI or DiO) on different tetrodes. This allowed for further identification of each individual tetrode. To finalize the histology, tissues went through cytochrome oxidase staining and were imaged under a bright field microscope to visualize the lesions and identify the anatomical location of each tetrode in the brain, according to the Paxinos & Watson rat brain atlas (Sixth edition, 2007).

| Response index analysis
All analyses were performed using Matlab (R2019b, The Mathworks, Inc) subroutines and custom scripts. To calculate each neuron's responsiveness to events while playing and observing, we used 5 s pre-post response indexes. For each neuron, we calculated a PSTH from 5 s before to 5 s after the event onset with a 500 ms bin. We defined the response index as the mean firing rate after the event minus the mean firing rate before the event, divided by the sum of the mean firing rates before and after the event. The resulting response indexes are distributed between −1 and 1. Response indexes close to 1 correspond to a strong positive response to the event, that is, firing rate increasing after the event. Response indexes close to −1 correspond to a strong negative response to the event, that is, firing rate decreasing after the event. Response indexes close to 0 correspond to cells not changing their firing rate in response to the event.
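As an illustration only, the response index described above can be sketched in Python (the analyses in this study were run in Matlab; this is not the original code). The function names, the toy spike and event times, and the choice to return 0 for a silent neuron are assumptions made for this sketch. Because the PSTH bins are equally wide, the mean over the 500 ms bins equals the spike count divided by the window length, so the bins are not built explicitly here.

```python
def mean_rate(spike_times, events, t0, t1):
    """Mean firing rate (Hz) in the window [event + t0, event + t1), pooled over events."""
    n_spikes = sum(
        1 for ev in events for s in spike_times if ev + t0 <= s < ev + t1
    )
    return n_spikes / (len(events) * (t1 - t0))


def response_index(spike_times, events, window=5.0):
    """(post - pre) / (post + pre) for a window of `window` seconds on each side."""
    pre = mean_rate(spike_times, events, -window, 0.0)
    post = mean_rate(spike_times, events, 0.0, window)
    if pre + post == 0:
        return 0.0  # assumption: a silent neuron is treated as non-responsive
    return (post - pre) / (post + pre)


# Hypothetical toy data: spike times and two event onsets, in seconds.
spikes = [9.0, 10.5, 11.0, 12.0, 29.5, 30.2, 31.0, 33.9]
events = [10.0, 30.0]
ri = response_index(spikes, events)  # positive: the neuron fires more after the event
```

In this toy case the neuron fires three times as often after the events as before them, which gives a response index of one half.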
We used Response Index distributions to compare the responsiveness of cells to events while playing versus the same event while observing, using the Brown-Forsythe test for equal variance. In the case of Response Indexes, greater variance signifies higher responsiveness to the event.
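A minimal sketch of the Brown-Forsythe statistic, written in plain Python for illustration (not the Matlab routine used in the study). It computes only the F-like statistic from absolute deviations around each group's median; obtaining a p-value additionally requires the F distribution with (k − 1, N − k) degrees of freedom, which is left out here. The function name and the example response indexes are hypothetical.

```python
from statistics import mean, median


def brown_forsythe_stat(*groups):
    """Brown-Forsythe statistic: one-way ANOVA F on |x - group median|."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each value from its own group's median.
    z = [[abs(x - median(g)) for x in g] for g in groups]
    z_means = [mean(zj) for zj in z]
    grand = sum(sum(zj) for zj in z) / n_total
    between = sum(len(zj) * (m - grand) ** 2 for zj, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zj, m in zip(z, z_means) for v in zj)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical response indexes: widely spread during play, clustered near 0 when observing.
ri_play = [-0.8, -0.4, 0.0, 0.4, 0.8]
ri_observe = [-0.1, -0.05, 0.0, 0.05, 0.1]
f_stat = brown_forsythe_stat(ri_play, ri_observe)
```

A larger statistic indicates a larger difference in spread between the two distributions. For real data, SciPy's `scipy.stats.levene` with `center='median'` performs the same test and also returns a p-value.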
Response index by observer state was estimated using the same procedure, but instead of using all observing play events to perform the Response Index estimation, we classified these events as occurring during "engage", "rest" or "grooming". Those events that did not match any of the observer states were not included in the analysis. We then repeated the Response Index analysis separately by each of these categories.
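The state-wise grouping of events described above could be sketched as follows. The interval representation `(state, start, end)`, the function names and the example times are assumptions made for illustration; the original scoring was done manually from video.

```python
def state_at(t, intervals):
    """Return the observer state whose interval [start, end) contains time t, else None."""
    for state, start, end in intervals:
        if start <= t < end:
            return state
    return None


def group_events_by_state(events, intervals):
    """Split event times by observer state; events outside every interval are dropped."""
    groups = {}
    for ev in events:
        state = state_at(ev, intervals)
        if state is not None:
            groups.setdefault(state, []).append(ev)
    return groups


# Hypothetical scored observer states (seconds) and observed game events.
intervals = [("rest", 0.0, 50.0), ("engaged", 50.0, 120.0), ("grooming", 130.0, 150.0)]
events = [10.0, 60.0, 125.0, 140.0]
by_state = group_events_by_state(events, intervals)
```

The event at 125 s falls in an unscored gap and is excluded, matching the rule that events not matching any observer state were left out. Each resulting list of events can then be passed to the response index computation separately.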

| Neuron classification
Neurons were classified either by their firing rate modulation between different states (such as observing versus play), or by their response indexes. The firing rate modulation between two states "a" and "b" was estimated by the formula:

$$\mathrm{MI}_{a,b} = \frac{FR_a - FR_b}{FR_a + FR_b}$$

where $FR_X$ represents the mean firing rate across all "X" states.
To determine which neurons had a significant modulation index we performed bootstrapping, by shifting each spike train at random time intervals to generate a modulation index distribution for each neuron. A neuron was classified as state-modulated if its modulation index was located at one of the 0.05 tails of the estimated distribution. Positive modulation index values indicated neurons with increased firing rate during state a. Negative values indicated higher firing rates during state b. To determine responsive neurons after obtaining their response indexes, we performed an equivalent procedure but adapted to the response index formula.
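One way to sketch this shift-based significance test in Python is shown below. Several details are assumptions made for illustration and are not stated in the text: the modulation index is taken as a difference over sum of the two mean rates (consistent with the sign convention above), shifted spikes wrap around the end of the session (circular shift), the threshold `tail=0.05` is applied to each tail separately, and all names and toy data are hypothetical.

```python
import random


def rate_in(spikes, intervals):
    """Mean firing rate (Hz) over a list of (start, end) state intervals."""
    duration = sum(end - start for start, end in intervals)
    count = sum(1 for t in spikes for start, end in intervals if start <= t < end)
    return count / duration


def modulation_index(spikes, a, b):
    """Assumed form: (FR_a - FR_b) / (FR_a + FR_b); 0 if the neuron is silent in both."""
    fa, fb = rate_in(spikes, a), rate_in(spikes, b)
    return 0.0 if fa + fb == 0 else (fa - fb) / (fa + fb)


def shift_test(spikes, a, b, session_len, n_shifts=1000, tail=0.05, seed=0):
    """Compare the observed index with indexes from randomly shifted spike trains."""
    rng = random.Random(seed)
    observed = modulation_index(spikes, a, b)
    null = []
    for _ in range(n_shifts):
        offset = rng.uniform(0, session_len)
        # Assumption: circular shift, so spike count and spacing are preserved.
        shifted = [(t + offset) % session_len for t in spikes]
        null.append(modulation_index(shifted, a, b))
    null.sort()
    low = null[int(tail * n_shifts)]
    high = null[int((1 - tail) * n_shifts) - 1]
    return observed, (observed < low or observed > high)


# Hypothetical 200 s session: state a is the first half, state b the second half.
state_a, state_b = [(0.0, 100.0)], [(100.0, 200.0)]
only_in_a = [i * 0.5 for i in range(200)]   # fires only during state a
everywhere = [i * 0.5 for i in range(400)]  # fires evenly across the session
mi_a, sig_a = shift_test(only_in_a, state_a, state_b, 200.0)
mi_u, sig_u = shift_test(everywhere, state_a, state_b, 200.0)
```

The neuron that fires only during state a lands in the upper tail of its shifted distribution, whereas the evenly firing neuron does not. Shifting keeps each neuron's spike count and inter-spike structure intact and only breaks its alignment to the behavioral states.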
We also estimated the number of false positives when classifying responsive neurons. To this end, we repeated the same procedure described in the last paragraph, but using random times picked uniformly within the duration of the session as the events for estimating the response index. This procedure gave us a distribution of the number of play-responsive and observer-responsive neurons (or both) that were obtained using these random events. We reported the number of responsive neurons on the 0.05 tail of the distribution as the number of false positives obtained by chance.
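The chance-level estimate can be sketched as a small resampling loop. This is an illustrative outline only: `is_responsive` stands in for whatever per-neuron classification is used (here a deliberately simple, hypothetical rule), and the number of repeats, the seed and the quantile indexing are assumptions.

```python
import random


def chance_responsive_counts(neurons, is_responsive, session_len,
                             n_events, n_repeats=200, seed=0):
    """Count neurons classified as responsive when events are drawn at random times."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_repeats):
        fake_events = sorted(rng.uniform(0, session_len) for _ in range(n_events))
        counts.append(sum(1 for spikes in neurons if is_responsive(spikes, fake_events)))
    return counts


def upper_tail(counts, tail=0.05):
    """Value at the upper `tail` quantile of the chance distribution."""
    ordered = sorted(counts)
    return ordered[min(len(ordered) - 1, int((1 - tail) * len(ordered)))]


def toy_is_responsive(spikes, events):
    """Hypothetical stand-in classifier: any spike within 1 s after any event."""
    return any(0.0 <= s - ev < 1.0 for ev in events for s in spikes)


# Hypothetical population of three sparse neurons in a 600 s session.
neurons = [[10.0, 200.0], [305.5], []]
counts = chance_responsive_counts(neurons, toy_is_responsive, 600.0, n_events=10)
false_positive_level = upper_tail(counts)
```

The value returned by `upper_tail` plays the role of the number of responsive neurons expected by chance, against which the counts obtained with real game events can be compared.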

| RESULTS
We played Hide & Seek with rats (n = 4) following the same game logic as previously described (Reinhold et al., 2019). In brief, we have developed a simplified 2-player (rat and human) version of Hide & Seek, in which a rat and a human experimenter engage in alternating blocks of trials of either hide or seek (Figure 1a). The game starts when the rat jumps into the start box (Jump in). In "seek" trials (pink; Figure 1a), closing the lid of the start box signalled the animal was playing the role of the seeker. After the experimenter hid, the rat jumped out and searched for the experimenter. After "finding", the experimenter initiated a playful interaction before returning the rat (Return). We reversed roles and assigned the rat the role of hider. In "hide" trials (green; Figure 1a) the experimenter left the start box open and crouched immobile next to it, cueing the animal to "hide" (Reinhold et al., 2019). Rats quickly acquired this game (within one to two weeks of training). In the current set of experiments, rats learned to find the experimenter in a 20 square meter room, and also learned to switch roles and hide effectively from the experimenter. We paired rats in order for them to take turns between playing and observing the partner rat play (Figure 1b). We recorded from the mPFC of rats (n = 4). Only one rat was recorded at a time. Neuronal activity was continuously recorded in 25-35 min sessions, first with the recorded rat inside a clear glass box observing another rat play for 10-15 min (Figure 1c, transparent pink and green), and then playing itself (Figure 1c, pink and green). For each session, the behavior inside the observing cage and during play was classified manually from video recordings. The behavior of the playing rat (both recorded and non-recorded rats) was classified in relation to important events in the game (as previously reported in Reinhold et al., 2019).
We also classified behaviors inside the observer box while the recorded rat was observing (Figure 1c). Behavior inside the observer cage was subdivided into resting (passive yet awake), grooming or engaged (active, exploring, sniffing, etc.). To further assess the animal's engagement, we recorded ultrasonic vocalizations (USVs) from the rats while playing and observing play (Figure 1d,e). As has been shown before (Reinhold et al., 2019), rats emitted numerous 50 kHz vocalizations during play (black trace; Figure 1d); however, during observing play (red), vocalization rate was low (Figure 1d). The overall differences in vocalization rates between playing and observing were marked (Figure 1e). Additionally, we found that most vocalization types were increased during play in comparison to observing. Rats produced higher rates of Modulated, Flat, Ramp-Up, Ramp-Down and Combined calls during play (Fig. S1); only Short and Bow call types were not significantly different (Fig. S1).

| Medial prefrontal cortex population activity is higher during play than during observing play
We investigated the mPFC activity during play and observing play by recording the activity of a total of 142 neurons from 4 rats (Figure 2a). The activity of a play-responsive neuron recorded in mPFC/Cg1 (Figure 2a) from a rat that was observing or playing Hide & Seek is shown in Figure 2b. Neurons from the mPFC (IL, PrL and Cg1) showed higher firing rates while playing than while observing play. The firing rate of neurons was higher when rats were either playing seek (Figure 2c) or hide (Figure 2d) than when observing play (seek: Wilcoxon signed rank p < 3.6 × 10⁻⁶; hide: Wilcoxon signed rank p < 1.75 × 10⁻⁴). Comparing firing rates of "hide" and "seek", however, showed no significant difference (Figure 2e; Wilcoxon signed rank p = .76). To analyze the significant increases in firing rate at the individual cell level, we did a bootstrap analysis by shifting the spike times, allowing us to assess a significance level for each neuron. In Figure 2c,d,e, significant neurons are identified according to the labels. Approximately 20% of neurons were modulated significantly by either form of play, while observing-modulated cells were approximately 4% of the population. We further subdivided the firing rates while observing play according to the behavioral classifications of resting, engaged (locomotive state) or grooming, and then compared the firing rate differences among all these three states and hide and seek. When rats were playing seek (or hide, not shown), mPFC neurons had higher firing rates than during resting (Wilcoxon signed rank p < 3.93 × 10⁻⁷), grooming (Wilcoxon signed rank p < 9.03 × 10⁻¹⁵) or engaged behavior (Wilcoxon signed rank p < 3.06 × 10⁻⁹; Fig. S2). At the same time, firing rates during engaged behavior were higher than during rest (Wilcoxon signed rank p < .001) and grooming (Wilcoxon signed rank p = .0013); no significant difference was found between grooming and rest (Wilcoxon signed rank p = .0996; Fig. S3).
This resulted in significant differences between overall firing rates in observer states and the individual cell level (Fig. S3). These results show that during play, mPFC neurons have an overall higher firing rate than during observing play, and that these differences hold when comparing different behavioral states of the observing animal. However, during observing, the neurons are not completely silent and change their firing rate according to the observer animal's own behavior.

| Medial prefrontal cortex neurons respond during playing but do not respond during observing play
We then studied event-related responses during play and observing play of mPFC neurons. Here, we concentrate on two critical events, the onset of return (the end of the game) and jumping in (the beginning of the game). Figure 3a,b depict a mPFC neuron responsive to jumping in and return onset, respectively. This particular neuron showed a strong response (Figure 3, A black) to jumping in during play, this activity, which can be observed at the single trial level of the spike times raster, is clearly absent during observing play (Figure 3, A red). Similarly, responses to the return event were present during play but absent during observing (Figure 3a right). To determine if this effect was present at the population level, we calculated a 5 s pre-post response index (see methods) for both events during play and observing play. This is a measure of the firing rate change around these events. The response index takes values around 0 for non-responsive neurons (or symmetric responsive neurons) and tends to 1 and −1 as neurons are more strongly modulated for the event (1 for increase in firing rate, and −1 for decrease). During play, neurons showed much stronger response indexes presenting a wider distribution along the playing axis, with several neurons taking values near to −1 and +1 (Figure 3b), and several neurons being assessed as significantly modulated (Figure 3b darker dots in the scatter plot, and darker histogram on the playing axis). Unlike play, during observing play mPFC neurons response index clustered around 0 (Figure 3b, distribution observing axis) and presented fewer significantly modulated cells (red dots and red histogram Figure 3b). Pie charts in Figure 3b,c show the percentage of cells classified as significant, while significant playing responsive cells are expectedly around 20%, we found that ~10% are responsive to observing play events. 
However, as can be observed in the histograms, while significantly play-responsive cells are distributed toward higher absolute values of the response index, observing-responsive cells are not. This points toward the possibility that the significant observing cells are overrepresented owing to the addition of false positives. We quantified the frequency of false positives for both observing and playing events and found that the false-positive rate is much larger for observing (~10%) than for playing (~4%) (Fig. S4, methods). Given all these observations, we did not find conclusive evidence for responsive observing cells.
FIGURE 2 Firing rates of mPFC neurons are lower during observing play. (a) Drawing of half a brain hemisphere depicting the location of a recorded neuron in cingulate area 1; the red line represents the tetrode track; X marks the cell location. (b) Firing rate trace of a single neuron in mPFC/Cg1 (located as in a) during observing Hide & Seek and playing Hide & Seek. (c) Firing rate of mPFC neurons during observing play (x-axis) against the firing rate during seek trials (y-axis). The p-value associated with the Wilcoxon ranked test is indicated above (after Bonferroni-Holm correction for multiple comparisons). Light gray dots indicate non-significantly modulated neurons, according to a bootstrap analysis. Dark gray dots indicate neurons that had greater firing rates during playing seek. Red dots indicate neurons with higher firing rate during observing play. The pie chart indicates the number of neurons with significantly increased firing rate during playing seek (dark gray), observing play seek (red) or non-modulated (light gray), according to their modulation index. (d) As in (c), comparison between firing rates during observing play and hide trials. (e) Comparison between seek and hide firing rates. No significant differences were found between hide and seek [Colour figure can be viewed at wileyonlinelibrary.com]
Finding mirror-like responses in mPFC would have meant finding neurons that have a high and significant response index for an event during play and, congruently, a high response index for the same event while observing. In the response index scatter plot, this would translate into neurons clustering around the diagonal, implying correlated, congruent responses. Our results show that mPFC neurons are responsive to play events, but not to the same events when observing play. Most strikingly, the response patterns during observing and playing were not strongly correlated (Figure 3b, inset; corr = 0.03 for Jump in and corr = 0.19 for Return), and the number of cells with significant responses to events during both play and observing was small and well within false-positive rates. This analysis was extended to the interaction event during the game, reaching similar conclusions (not shown).
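The mirror-like criterion above combines two checks: a correlation between play and observing response indexes across cells, and a count of cells that are significant in both conditions with responses of the same sign. The sketch below illustrates both. The text does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the function names and example values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def congruent_cells(play_ri, observe_ri, play_sig, observe_sig):
    """Indices of cells that are significant in both conditions and whose
    response indexes have the same sign (the mirror-like candidates)."""
    return [
        i
        for i, (p, o, ps, os_) in enumerate(
            zip(play_ri, observe_ri, play_sig, observe_sig)
        )
        if ps and os_ and p * o > 0
    ]


# Hypothetical values for three cells: only cell 0 is significant in both
# conditions with the same response sign.
play = [0.8, -0.6, 0.5]
observe = [0.7, 0.4, 0.1]
print(congruent_cells(play, observe, [True, True, True], [True, True, False]))
```

A population of mirror-like cells would show a correlation close to 1 and a congruent count clearly above the number expected from false positives alone.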
To assess whether the lack of mirroring responses relates to the level of engagement of the observing animal, we first took a brute-force approach and separated sessions according to the fraction of engagement of the observing animal into lower-engagement sessions (below 30%, mean engagement: 23%) and higher-engagement sessions (above 30%, mean engagement: 40%). We found no difference in variance between the distributions of observing response indexes for cells recorded in lower-engagement (gray) and higher-engagement (yellow) sessions (Figure 3c).
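The comparison of variances between the two engagement groups relies on the Brown-Forsythe test, which is a one-way ANOVA on the absolute deviations of each value from its group median. The sketch below computes only the test statistic in plain Python; the function name is illustrative, and the p-value (from an F distribution with k − 1 and N − k degrees of freedom) is left out.

```python
from statistics import mean, median


def brown_forsythe_statistic(*groups):
    """F statistic of the Brown-Forsythe test for equal variances.

    Each value is replaced by its absolute deviation from its own group
    median, then a one-way ANOVA F ratio is computed on those deviations.
    Raises ZeroDivisionError if the deviations have no within-group spread.
    """
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    group_means = [mean(d) for d in deviations]
    between = sum(
        len(d) * (m - grand_mean) ** 2 for d, m in zip(deviations, group_means)
    ) / (k - 1)
    within = sum(
        (z - m) ** 2 for d, m in zip(deviations, group_means) for z in d
    ) / (n_total - k)
    return between / within


# Two groups with the same spread give a statistic of 0; a wider second
# group gives a larger statistic.
print(brown_forsythe_statistic([-1, 0, 1], [9, 10, 11]))
print(brown_forsythe_statistic([-1, 0, 1], [-10, 0, 10]))
```

In practice the same test, including the p-value, is available as `scipy.stats.levene` with `center="median"`.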
However, it remains possible that even within single sessions the moment-to-moment behavioral state of the observer influenced whether neurons responded to the other rat's play behavior. To dissect this further, we analyzed response indexes while playing and observing by filtering observing events according to the state of the observer rat. This allows us to compare, for the same neurons, the response index while playing to the response index while observing events during different behavioral states of the animal (Figure 4a). We found, however, that even after filtering by state there is only weak evidence for mirror-like activity (Figure 4b). Response indexes of neurons distribute broadly along the playing axis, while they show narrower distributions along the observing/state axis (Figure 4b, histograms and scatter plot). In addition, we evaluated the cell-by-cell significance of the response index and identified a high fraction (~29%) of play-related responses and a very low fraction (~5%) of observing/state responses. In summary, all our lines of inquiry provide no evidence of play mirroring activity in mPFC under the conditions of the experiment.
FIGURE 3 Medial prefrontal cortex cells respond to Hide & Seek game events while playing but not while observing. (a) Top: spike raster plot for an example neuron aligned to jumping in or return onset during seek trials (white background for playing, red background for observing play). Bottom: PSTH for the corresponding events when playing (gray) and when observing play (red). (b) Response index (5 s pre-post) of all cells to the onset of an event while playing versus observing for Jump in (left) and Return (right) events. Dark gray circles indicate play-responsive neurons, red circles observing play-responsive neurons, and pink circles neurons that were responsive for both types of events. Non-responsive neurons are indicated in light gray. Histograms at the left side show the number of non-responsive (light gray) and responsive (dark gray) neurons during play. Upper histograms show the number of response indexes associated with observing play. Pie charts indicate the number of non-responsive and responsive neurons for both event types. (c) Comparison of response indexes for cells recorded during sessions with higher (yellow) or lower (black) observer engagement. Distributions of response indexes while observing did not differ in variance according to the Brown-Forsythe test. However, in three out of four cases significant differences existed between observing and playing, even after downsampling the populations.
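The filtering of observing events by observer state described above can be sketched as a simple interval lookup. The rule used here, that an event takes the label of the state interval containing its onset time and is left unclassified otherwise, is an assumption made for illustration; the state names and the function names are likewise hypothetical and not taken from the authors' analysis code.

```python
def classify_events(event_times, state_intervals):
    """Label each event with the observer state whose interval contains it.

    `state_intervals` is a list of (start, stop, label) tuples, assumed
    not to overlap. Events that fall outside every interval are labeled
    None (unclassified).
    """
    labels = []
    for t in event_times:
        label = None
        for start, stop, name in state_intervals:
            if start <= t < stop:
                label = name
                break
        labels.append(label)
    return labels


def events_in_state(event_times, state_intervals, state):
    """Return only the event times that fall within the given state."""
    labels = classify_events(event_times, state_intervals)
    return [t for t, lab in zip(event_times, labels) if lab == state]


# Hypothetical observer states (seconds) and event times.
states = [(0, 10, "engaged"), (10, 20, "rest"), (25, 30, "grooming")]
events = [5, 12, 22, 27]
print(classify_events(events, states))
print(events_in_state(events, states, "engaged"))
```

The filtered event times can then be passed to the same response index computation used for play events, so that each neuron receives one index per observer state.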

| DISCUSSION
We studied rats playing and observing Hide & Seek. Playing rats performed as described in our previous work; they found the experimenter and hid from the experimenter, vocalizing throughout the game. Observing play, however, did not result in high vocalization rates, even though rats stayed awake throughout the experiment. The lack of appetitive 50 kHz vocalizations coincides with the strong difference in overall mPFC activity. We found that neurons in the mPFC are less active during observing play than during play, even when we compare play to the different states of the rat while observing. Concomitant with this change in overall activity state, we found that neurons responsive to play events do not respond similarly to observed play events. We found few significantly responding cells during observing, no more than expected from false-positive rates, and we did not find cells with significant and congruent responses in both conditions. Our data therefore suggest that mPFC does not mirror play activity under our experimental conditions. Recent work has shown synchronization of neural activity in medial prefrontal cortex across brains of socially interacting mice as well as bats (Kingsbury et al., 2019; Zhang & Yartsev, 2019). These findings were at the macroscopic level, measuring either correlations of LFP power (Zhang & Yartsev, 2019) or mean population calcium activity (Kingsbury et al., 2019). Both studies find that close interaction and contact are required for brain-to-brain synchrony. The lack of physical participation of the observer in the other rat's play might explain the lack of mirror-like responses. Thus, increasing the interaction between rats before and during play, by housing rats together or enabling more diverse sensory contact between observer and demonstrator, may be necessary to observe mirror-like activity.
Positioning the observer box right next to the start box (the most important place in the game) did not result in mirror-like responses either. However, under these conditions, we cannot truly say whether the observer rat is paying attention to the other rat's play. On this same note, the physics of ultrasonic vocalization most likely did not allow observer rats to hear the playing rat. This may have contributed to the low engagement of the cortex. However, the absolute lack of clear mirror responses in hundreds of neurons while observing and the overall change in mean firing rate point toward a switch in mPFC network state during play.
FIGURE 4 Behavioral state of the observer does not influence observing responses. (a) Event classification procedure. Upper panel shows the observer states in orange (engaged), purple (rest) and light blue (grooming), along with return and jump in events (crosses). The lower panel indicates how each event is classified as engaged, rest, grooming or play (black). Non-classified events are indicated with an asterisk. (b) Response index plot as in Figure 3, but separating events by observing state. Response index of jump in events occurring during the engaged observer state (x-axis) or during playing (y-axis). Responsive neurons are indicated in dark gray (play), red (engaged) or pink (play and engaged). Non-responsive neurons are indicated in light gray. Histograms of response index during play and engaged state are located at the left and upper sides, respectively. Pie charts indicate the number of responsive neurons for each state.
What do these results tell us about mirror responses in prefrontal cortex? There is consistent evidence for mirror neurons in primate premotor and motor cortices (Tkach et al., 2007). Why, then, did we not observe mirroring of neuronal play responses in rat prefrontal cortex? This difference might simply reflect a species difference, that is, rodents might simply not show such mirror responses. This hypothesis is in line with the interpretation of Tombaz and colleagues (Tombaz et al., 2020), who observed little evidence for mirroring in rat premotor and motor cortices. Data from our lab, however, do not support this conclusion. We (Kaufmann, Brecht, & Ishiyama, 2019) obtained at least preliminary indications of robust mirroring responses in somatosensory cortex in a tickling paradigm. Other possible explanations for the absence of mirror responses relate to both modality and anatomical area. Recent work has described the existence of mirror neurons for painful stimuli in the anterior cingulate cortex, especially in deeper layers, an area beyond our electrode arrangement (Carrillo et al., 2019).
Alternatively, the absence of mirroring might be related to the specific neural mechanisms of play. One possible interpretation of our findings along this line is that our results reflect a different behavioral and brain state during play. There is scattered evidence for a distinct behavioral "play" state. Thus, animals and humans emit distinct vocalizations during play (50 kHz calls in rats; Panksepp & Burgdorf, 2003; laughter in humans), primates show distinct facial expressions during play, that is, play faces (Palagi, 2008), and during play-fighting, attack behaviors are systematically modified from real fighting behaviors such that they do not injure the playmate (Pellis & Pellis, 1987). There is also evidence for distinct neural signatures of play behaviors; for example, neural plasticity and perceptual learning appear to be dramatically enhanced during action-play games (Green & Bavelier, 2003). Given these findings, we wonder whether the distinct behavioral and neural correlates of observing play and actual play reflect distinct brain states in and out of play.