Male‐specific alterations in structure of isolation call sequences of mouse pups with 16p11.2 deletion

Abstract 16p11.2 deletion is one of the most common gene copy number variations that increase susceptibility to autism and other neurodevelopmental disorders. The syndrome leads to developmental delays, including speech impairment and delays in expressive language and communication skills. To study the developmental impairment of vocal communication associated with 16p11.2 deletion syndrome, we used the 16p11.2del mouse model and analysed pup isolation calls (PICs). The earliest PICs, at postnatal day 5, were altered in 16p11.2del pups in a male‐specific fashion relative to wild‐type (WT) pups. We analysed sequences of ultrasonic vocalizations (USVs) emitted by pups using the mutual information between syllables at different positions in the USV spectrograms; this showed that dependencies exist between syllables in WT mice of both sexes. The order of syllables was not random; syllables were emitted in an ordered fashion. The structure observed in the WT pups was identified, and this pattern of syllable sequences was considered typical for the mouse line. These typical patterns were absent in the 16p11.2del male pups, which on average showed random syllable sequences, whereas the 16p11.2del female pups had dependencies similar to those of the WT pups. Thus, PICs were reduced in number in male 16p11.2del pups, and their vocalizations lacked the syllable sequence order emitted by WT males and females and by 16p11.2del females. Our study is therefore the first to reveal sex‐specific perinatal communication impairment in a mouse model of 16p11.2 deletion, and it applies a novel, more granular method of analysing the structure of USVs.

Pup isolation calls (PICs) are particularly important, as they elicit searching and retrieval behaviour in the mother and allow for individual recognition. 2,3,7,15 Pups emit PICs when they are isolated from the nest or are cold, until postnatal day 13 (P13), 15 although pups are thought to start hearing only at P10-11. 16,17 Because of their communicative significance, PICs provide a useful model for studying developmental dysfunction in the production of vocalizations, especially in the context of autism spectrum disorders (ASDs) and related speech and language disability. 4,18-21 Qualitative and quantitative analyses of USVs can inform several areas of neurodevelopmental research, including ethology, behavioural pharmacology, neurotoxicology and behavioural neurogenetics. Various studies have reported age-related changes in acoustic features such as call duration, peak frequency, bandwidth, peak amplitude and call rate in wild-type (WT) mice 22,23 as well as in ASD models. 24-26 Furthermore, alterations in call sequences have been observed to have a negative impact on pup-mother social communication, 27 which is analogous to human studies in which participants reported more negative states on listening to crying episodes of babies with ASD. 23,28 Grimsley et al. 23 studied the development of vocalizations in mice and, using Zipf's statistic 29 and entropy analysis, 30 showed that the sequences of syllables produced by pups were non-random. However, higher order structure in vocalizations, or the informative sequences that give rise to such structure, has not been explored.
The main objective of this work is to identify changes in the properties of neonatal USVs and of USV sequences in 16p11.2del pups compared with WT pups, and to determine whether these changes are sex specific. We investigated the acoustic features as well as the structure of the syllable sequences at successively higher orders, from the proportion of each syllable type up to bouts of PICs. Using information theoretic analyses, we determined whether informative components are present in the sequence of syllables, starting with low order structure such as transitions and then examining high order structure, namely informative sequences. We find changes in syllable acoustic properties and syllable sequences that are specific to the male 16p11.2del pups. We specifically explore the possibility that structure is present in PICs at the onset of vocalizations (P5), a stage that lies between the mouse ages corresponding to preterm (P3) and term (P7-P10) human infants. 31 As there is great heterogeneity among ASD mouse models, we focussed on early communication calls and sought informative syllable sequences, which provide structure to PICs, using mutual information 30 as a measure of dependence together with transition probabilities between syllables. 32-34 After identifying specific structure in PIC sequences in the WT P5 pups, we characterized PICs in the 16p11.2del mouse 32,34,35 and identified specific deficits in the male pups. 19-21,36-38 Our findings of altered sequencing in mouse PICs strengthen the 16p11.2del mouse model and provide scope for further investigations into the circuit and molecular level manifestations of the disorder that lead to the associated vocalization impairment.

| METHODS AND MATERIALS
All experiments were approved by the University of Pennsylvania Institutional Animal Care and Use Committee and conducted in accordance with National Institutes of Health guidelines. To generate experimental pups, B6129SF1/J (Jackson Lab # 101043) females were bred to B6129S-Del(7Slx1b-Sept1)4Aam/J (# 013128) males. Unless otherwise noted, all results are based on data averaged across pups of the same sex and genotype within a litter, rather than on individual pups. The rationale for this grouping is provided by the clustering analysis presented at the end of the Results section, which examines within-group variability and the effect of litter.
Preliminary studies were performed on the 16p11.2del line to empirically determine that peak USV emissions occur around postnatal day 5, consistent with other reports. 15

| Pre-processing
Each wav file was divided into 5 s epochs and read into MATLAB (MathWorks). To reduce background noise, the signal was filtered with a Butterworth band pass filter of order 7, removing frequencies below 30 kHz and above 160 kHz. 24 In addition, to retain the important patterns in the signal while leaving out low frequency noise, the signal was first high pass filtered by subtracting a 10 point moving average of the signal from the raw signal.
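As an illustration of the moving average step only, the following Python sketch subtracts a 10 point moving average from a raw signal. It is not the authors' MATLAB code: the function names are hypothetical, the window is centred and truncated at the edges (the text does not describe edge handling), and the Butterworth band pass step is omitted because the recording sample rate is not stated in this section.

```python
def moving_average(signal, window=10):
    """Moving average with a window of `window` samples around each point.

    Near the edges the window is truncated to the samples that exist
    (an assumption; the source does not describe edge handling).
    """
    n = len(signal)
    half = window // 2
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i - half + window)
        segment = signal[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def subtract_moving_average(signal, window=10):
    """High pass filter by removing the smoothed (slowly varying) component."""
    smoothed = moving_average(signal, window)
    return [x - s for x, s in zip(signal, smoothed)]
```

A constant offset is removed entirely by this operation, whereas a component that alternates from sample to sample averages to zero within the window and is therefore left unchanged.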

| Segmentation of syllables
The short-time Fourier transform (STFT) of each epoch was calculated using a Hamming window of length 1024 with 75% overlap. Syllables were identified from the power in each frame divided by the total power in all frames. This normalized power was median filtered over 30 ms. Peaks were then detected by comparing each element of the filtered data with its neighbouring values: an element larger than both of its neighbours was considered a local peak. Each local peak was tracked until the power fell below a threshold (mean + 0.01 * STD). The process was continued iteratively until all peaks had been tracked, and the tracked segments were stored as syllables.
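The peak tracking step can be sketched as follows, assuming the per-frame normalized power has already been computed and median filtered. The function name `segment_syllables` is hypothetical, the population standard deviation is used, and only peaks that lie above the threshold are tracked; none of these details is specified in the text, so this is one possible reading rather than the original implementation.

```python
from statistics import mean, pstdev


def segment_syllables(power, k=0.01):
    """Return (start, end) frame indices (end inclusive) of candidate syllables.

    `power` is the per-frame normalized power; the threshold is
    mean + k * STD, with k = 0.01 as in the text.
    """
    threshold = mean(power) + k * pstdev(power)
    n = len(power)
    # Local peaks: larger than both neighbours (and, by assumption, above threshold).
    peaks = [i for i in range(1, n - 1)
             if power[i] > power[i - 1]
             and power[i] > power[i + 1]
             and power[i] > threshold]
    segments = []
    for p in peaks:
        if segments and p <= segments[-1][1]:
            continue  # this peak already lies inside the previous segment
        start = p
        while start > 0 and power[start - 1] > threshold:
            start -= 1  # track backwards until the power drops below threshold
        end = p
        while end < n - 1 and power[end + 1] > threshold:
            end += 1  # track forwards until the power drops below threshold
        segments.append((start, end))
    return segments
```

For example, `segment_syllables([0, 0, 1, 3, 1, 0, 0, 0, 2, 5, 4, 0, 0])` yields two segments, one around each of the two peaks.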

| Classification of syllables
Pitch jumps are a distinctive feature for the classification of mouse vocalizations, as shown by Holy and Guo, 14 and our classification is likewise based on their presence or absence. First, the variance in frequency content of the STFT was used to determine whether the energy content was broadband; such syllables were classified as N-type. Next, harmonic content was checked based on spectral peaks.
Syllables were classified as H-type if significant peaks that were harmonics of each other were present in at least three frames. For further classification, pitch jumps were detected by calculating the gradient of the STFT along both the frequency and time axes using a Sobel operator and computing the pitch gradient magnitude. Peaks in the gradient occurred wherever there were pitch jumps, so the number of significant peaks indicates the number of jumps in pitch. The threshold used to detect peaks in the pitch gradient contours was decided blindly, without knowledge of genotype or sex. Syllables without pitch jumps that were tonal with a single spectral peak were classified as S-type; syllables with a single pitch jump were classified as J-type; and syllables with multiple jumps, together with all other types, were classified as Others or O-type (Figure 1). Frequency components above 160 kHz could cause syllables to be misclassified, especially H-types as S-types in the delM and delF groups. We therefore repeated the classification with an upper cut-off frequency of 180 kHz instead of 160 kHz (see Section 2.2). The classification of syllables did not change between the two upper cut-off frequencies, and the distributions of syllable types for each group were exactly the same with 180 and 160 kHz (Figure S1). All subtypes of syllables reported in other studies 5,8,9,23,40 were also present (Figure S2); however, we used only pitch jumps as the primary classification criterion, following Holy and Guo, 14 to restrict the number of broad classes, because our analyses require large sample sizes in each class. Furthermore, the pitch jump based classification of syllables, shown by Holy and Guo through isomaps, 14 is inherently tied to the vocalization production machinery.

| Calculation of joint distributions and MI based dependence
Syllable to syllable k-step transition probabilities $P_k(S_i, S_j)$, which are equivalent to the elements of the joint probability distribution, were estimated from the data. Each element denotes the probability of observing a j-type syllable k steps after observing an i-type syllable, where i and j vary from 1 to 5 (the 5 types of syllables observed). Combinations that were never observed, which would lead to '0' values in the joint distribution, were corrected using the Krichevsky and Trofimov correction. 41 Mutual information (MI) between two random variables X and Y quantifies the total dependence between them 30 and can be computed from the joint distribution P(X, Y) and its marginal distributions P(X) and P(Y) as:

$$I(X;Y) = \sum_{x}\sum_{y} P(x, y)\,\log \frac{P(x, y)}{P(x)\,P(y)}$$

By considering the syllables at a particular position as the random variable X and syllables after k-steps as the random variable Y, which take
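A minimal Python sketch of the k-step joint distribution and the MI computed from it is given here for illustration. It assumes that the Krichevsky and Trofimov correction amounts to adding a pseudo-count of 1/2 to every cell of the 5 × 5 count table, that MI is measured in bits (base 2 logarithm), and that every syllable label belongs to the five-letter alphabet; the function names and the single-letter labels are hypothetical, and the bootstrap debiasing mentioned for MI is not included.

```python
import math
from collections import Counter

SYLLABLES = ["N", "H", "S", "J", "O"]  # the five syllable types


def joint_distribution(sequences, k=1, alphabet=SYLLABLES, pseudo=0.5):
    """Estimate P_k(S_i, S_j) from syllable sequences.

    `pseudo` = 0.5 per cell is the assumed form of the correction
    for transition types that were never observed.
    """
    counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[k:]):  # pairs of syllables k steps apart
            counts[(a, b)] += 1
    total = sum(counts.values()) + pseudo * len(alphabet) ** 2
    return {(a, b): (counts[(a, b)] + pseudo) / total
            for a in alphabet for b in alphabet}


def mutual_information(joint, alphabet=SYLLABLES):
    """MI in bits between the first and second syllable of each pair."""
    px = {a: sum(joint[(a, b)] for b in alphabet) for a in alphabet}
    py = {b: sum(joint[(a, b)] for a in alphabet) for b in alphabet}
    return sum(p * math.log2(p / (px[a] * py[b]))
               for (a, b), p in joint.items() if p > 0)
```

A strictly alternating sequence such as H, S, H, S, ... gives an MI close to 1 bit at k = 1, whereas a uniform joint distribution gives an MI of zero.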

| Calculation of Kullback Leibler divergence between distributions
We quantified differences between the distributions of syllable types produced by the different groups of pups using an information theoretic distance measure, the Kullback Leibler divergence (KLD), 30,47 which makes no assumptions about the data. The same method was also applied to quantify differences between joint distributions of syllable to syllable transitions. Consider two distributions p and q taking values over the same set: in our case, the syllable types x produced by two different groups of pups, for example WTM (wild-type male) and delM (16p11.2del male), with probabilities p(x) and q(x), or the syllable to syllable transitions produced by the two groups. The KLD between p and q is computed as follows:

$$D_{\mathrm{KL}}(p \,\|\, q) = \sum_{x} p(x)\,\log \frac{p(x)}{q(x)}$$

As for MI, we debiased the KLD using bootstrap resampling and assessed significance in the same way, with 95% confidence intervals.
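For illustration, the KLD between two syllable type distributions can be sketched in Python as below. The base 2 logarithm (bits) and the function name `kl_divergence` are assumptions, the example probabilities are invented for demonstration and are not data from the study, and the bootstrap debiasing step is not reproduced.

```python
import math


def kl_divergence(p, q):
    """Kullback Leibler divergence D(p || q) in bits.

    `p` and `q` are probabilities over the same ordered set of outcomes
    (for example the five syllable types). Terms with p(x) = 0 contribute 0.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same set")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi <= 0:
                raise ValueError("q(x) must be positive wherever p(x) > 0")
            total += pi * math.log2(pi / qi)
    return total


# Illustrative (made-up) distributions over the types N, H, S, J, O.
group_a = [0.10, 0.40, 0.30, 0.10, 0.10]
group_b = [0.10, 0.20, 0.50, 0.10, 0.10]
divergence_ab = kl_divergence(group_a, group_b)
divergence_ba = kl_divergence(group_b, group_a)
```

The KLD is not symmetric, so `divergence_ab` and `divergence_ba` generally differ; the divergence of a distribution from itself is zero.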

| RESULTS
The data presented in the study were collected from postnatal day

| Types of syllables
Syllables observed in mouse vocalizations and PICs have been classified in a variety of ways, depending on the spectrotemporal features emphasized and characterized. 14,23,25 Scattoni and colleagues 25 used 10 categories, namely complex, harmonic, upward, downward, chevron, 2 syllable, shorts, composite, frequency steps and flat (Figure S2).
This classification has been used by others with modifications. 23,48 In the current work, the entire data set of detected syllables was classified into five types (Figure 1) based on pitch jumps, a distinctive feature 14 of USVs (Section 2). Using five classes of syllables also allows information theoretic calculations to be done reliably, as a sufficient number of utterances of each kind needs to be present. All types of syllables were present in each of the groups of animals considered. Since all types were present in the delM and delF groups, 16p11.2del pups retain the ability to produce each syllable type, indicating that the syllable production machinery is intact and not fundamentally different between groups. Any alterations are therefore more likely to lie in the relative occurrence probabilities of syllables and in the structure of the syllable sequences produced.

| Basic call features
In order to elucidate possible alterations in PICs of the different groups we first quantified any possible differences in the basic call durations. We also analysed differences between groups of pups in mean call duration of each syllable type and mean peak frequency of each syllable type. The results are summarized in Figure S8. There were no systematic differences based on each syllable type, which could be attributed to the delM or delF genotype. Thus the overall features independent of syllable type reflect correlational differences with genotype.

| Probability of occurrence of different syllable types
To further understand the differences in PICs across sexes and genotypes, we computed the distributions or probability of occurrence of the different syllable types. The four histograms in Figure 2  KLDs. The delF distribution was also different from the rest but the KLDs are much smaller than those of the delM group. Thus a primarily male specific difference was observed in the 16p11.2del pups in the distribution of syllable types.

| Bouts in PICs
Mouse USVs occur in bouts of calls separated by silent gaps. 14,23,40 The sequences of syllables emitted by each mouse pup were divided into bouts by considering the distribution of silences between syllables ( Figure S3 which is defined by the choice of threshold in Figure 3 we vary the threshold ( Figure S4(A)) and do the same analysis considering larger silence durations to precede the start of bouts. The results, shown for all four groups of pups in Figure S4(B), indicate that our conclusion above is independent of the criterion used to mark the beginning of a bout of PICs. Making the silence duration that marks the transition between bouts systematically longer does not change the observed degree or length of dependence between the bout starting syllable and subsequent syllables in any of the four groups.
To understand the higher order structure in the sequences, the above analysis was extended by computing the MI between the second syllable in a bout and every successive syllable, then between the third and the following syllables, and so on. The results for each group are summarized in Figure 4 to the lack of high probability of a particular transition type (namely harmonic to harmonic) that is present in the WT groups and also somewhat in the delF group. Because the first two syllables of a bout depend on the criterion, or threshold silence duration, that marks the transition from one bout to another, we tested whether the joint distributions changed when the threshold duration of silence for bout end was changed (Figure S3(A)). The joint distributions for different threshold values (as in Figure S4) were all highly correlated with each other (Figure S5), indicating that our analysis did not depend on the choice of bout end silence duration as long as it was 1 * STD above the mean ISS.
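The division of a syllable sequence into bouts can be sketched as follows, reading ISS as the silence between successive syllables and placing the bout boundary at silences longer than the mean plus one standard deviation. Estimating the threshold from a single pup's silences, the use of the population standard deviation and the name `split_into_bouts` are assumptions made for this illustration only.

```python
from statistics import mean, pstdev


def split_into_bouts(syllables, silences, n_std=1.0):
    """Split a syllable sequence into bouts at long silences.

    `silences[i]` is the silent gap between syllable i and syllable i + 1.
    A gap longer than mean + n_std * STD starts a new bout.
    """
    if not syllables:
        return []
    if len(silences) != len(syllables) - 1:
        raise ValueError("need exactly one silence between each pair of syllables")
    if not silences:
        return [list(syllables)]
    threshold = mean(silences) + n_std * pstdev(silences)
    bouts = [[syllables[0]]]
    for label, gap in zip(syllables[1:], silences):
        if gap > threshold:
            bouts.append([label])   # long silence: a new bout begins here
        else:
            bouts[-1].append(label)  # short silence: same bout continues
    return bouts
```

Raising `n_std` lengthens the silence required to start a new bout, which corresponds to the threshold variation described in the text.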

| Transition probabilities of three successive syllables
We extended the analysis of two successive syllables in the above section to three successive syllables, both for the first three syllables of bouts and for any three successive syllables in a bout. Figure 6 summarizes the results and is arranged the same way as Figure  The probability of each syllable type occurring in the first position of a bout was compared with the overall probability of occurrence of that syllable (Figure 2), that is, the probability expected had the syllables occurred randomly according to their respective occurrence probabilities. Syllables with a higher probability of occurrence than overall (95% confidence) were considered significant. The process was continued for each subsequent position, keeping the previous syllable types fixed, until no significant syllables remained. In this manner we found sequences that occur above chance and obtained the sequences that give structure to the PICs in the different groups.
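One way to sketch this position-by-position search in Python is shown below. How the 95% confidence criterion was evaluated is not described in this section, so the sketch assumes a bootstrap over bouts in which a syllable counts as significant when the lower 2.5th percentile of its resampled proportion exceeds its overall probability. The function name, the number of resamples, the fixed random seed and the requirement that a reported sequence contain at least two syllables are all hypothetical choices.

```python
import random


def significant_sequences(bouts, overall, n_boot=500, alpha=0.05, seed=0):
    """Find bout-initial syllable sequences that occur above chance.

    `bouts` is a list of syllable sequences; `overall` maps each syllable
    type to its overall probability of occurrence.
    """
    rng = random.Random(seed)
    found = []

    def extend(prefix):
        # Syllables observed immediately after `prefix` at the start of a bout.
        nxt = [b[len(prefix)] for b in bouts
               if len(b) > len(prefix) and tuple(b[:len(prefix)]) == prefix]
        extended = False
        if nxt:
            for syl, p_overall in overall.items():
                # Bootstrap the proportion of `syl` at this position.
                boot = sorted(
                    sum(1 for s in rng.choices(nxt, k=len(nxt)) if s == syl) / len(nxt)
                    for _ in range(n_boot))
                lower = boot[int(alpha / 2 * n_boot)]
                if lower > p_overall:
                    extended = True
                    extend(prefix + (syl,))  # keep earlier syllables fixed
        if not extended and len(prefix) > 1:
            found.append(prefix)  # no significant continuation: sequence ends

    extend(())
    return found
```

The search branches whenever more than one syllable type is significant at a position, so a group can yield several significant sequences.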
Tracking sequences in the above manner from the start of bouts, we found the significant sequences of syllables in each group, summarized in Figure 7. The WTM and WTF groups produced two and one significant sequences, respectively; the one in the WTF group was also present in the WTM group (S-type followed by six consecutive harmonic types). The other significant sequence in the WTM group started with an 'O' or Other-type syllable followed by six harmonic types. The delF group produced three significant sequences: shorter versions of those observed in the WT groups and another that was a sequence of seven consecutive S-type syllables. The delM group also produced one significant sequence that had only three consecutive S-types, a shortened version of the third significant sequence in
3.9 | Alterations in structure of syllables were not litter specific
Although we found clear male specific differences in the del groups of mice in all our analyses, there was some degree of similarity in the joint distributions. Furthermore, the delF group showed more similarities with the WT groups and some similarities with the delM group.
To explain the similarities, we asked whether the differences could be attributed to specific litters of mice in each group or whether any pup in a litter of each of the del groups could exhibit the differences. We therefore revisited our analyses of the probability of occurrence of each syllable type and of the joint distributions of 1-step and 2-step syllable to syllable transitions, and clustered the mouse pups based on correlations between their syllable (or transition) distributions.
We defined the distance between two distributions as 1 − ρ (ρ being the correlation between the distributions) and clustered together the distributions with the smallest distances. 49 This process allowed the creation of dendrograms 49 of groupings of distributions from each mouse.
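The correlation based clustering can be illustrated with the following Python sketch, which takes ρ to be the Pearson correlation and merges clusters by average linkage. The linkage rule is not stated in the text and is an assumption here, as are the function names and the pup labels; the returned merge list is the information from which a dendrogram would be drawn.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long distributions."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) for a flat distribution


def correlation_distance(x, y):
    """Distance 1 - rho, as defined in the text."""
    return 1.0 - pearson(x, y)


def agglomerate(distributions):
    """Average-linkage clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in distributions]

    def linkage(a, b):
        pairs = [(i, j) for i in a for j in b]
        return sum(correlation_distance(distributions[i], distributions[j])
                   for i, j in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        d, a, b = min(((linkage(a, b), a, b)
                       for idx, a in enumerate(clusters)
                       for b in clusters[idx + 1:]),
                      key=lambda t: t[0])
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges
```

Pups with similar syllable (or transition) distributions merge early at small distances, so pups from the same group would be expected to share a branch if group membership, rather than litter, drives the distributions.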
The first such clustering was based on the distribution of the probability of occurrence of each syllable type for each mouse. Figure 8 ( Further, the analysis tools developed to study structure in mouse pup vocalizations can readily be used to study deficits in PICs of other ASD mouse models as well as in adult context specific vocalizations.
It should be noted that at P5 the mouse auditory system is not yet developed and the pups are essentially deaf. Our results therefore also imply that not just the PIC vocalizations themselves 50 but also the PIC structure or sequencing observed at P5 is innate and not learned. However, it remains to be seen whether the vocalization sequences studied here change with age and whether they require auditory experience. Since there is a male specific deficit in reward learning 51 in the 16p11.2del mice, the production of PIC sequences seeking the mother at later ages could also be deficient if the later vocalizations are learned.
Moreover, it is known that there are deficits in the adult social interaction related vocalizations of the 16p11.2del mice. 34 However, future studies are required to investigate exactly which aspects of vocalization sequences show deficits with age and whether these can be mechanistically tied to the deficits in reward learning.
Although other motor deficits were not evident in the delM or delF population, it is not known whether the vocalization production machinery is altered, which could cause some of the observed changes but The production of sequences of successive syllables might be specific to the early neural circuitry involved in vocalization production (one example site could be Layer V neurons in the motor cortex 52), which could be disrupted in the 16p11.2delM. To the best of our knowledge, no study has addressed the production of sequences, or the order of syllables produced. A recent study 53 shows an increased cortical excitation inhibition ratio as a common theme in ASDs, across four mouse models including the 16p11.2del. However, that study did not consider any sex-specific effects. It also remains to be investigated whether the ASD related change in excitation inhibition ratio is present in structures involved in vocalization production. An imbalance of excitation and inhibition is a possible mechanism by which sequences of vocalization syllables may be altered; however, a circuit model for the production of sequences needs to be tested.
A previous study 54 shows specific changes in 16p11.2del mice with increased Layer VI cortico-thalamic projection neurons and overall decrease in calretinin positive interneurons compared with WT.
However, although changes were not observed in Layer V neurons, the nature of changes specific to motor cortex is not known. Furthermore, changes in interneurons of other types (e.g., parvalbumin, somatostatin) have not been studied in this particular mouse model. Such studies would allow hypotheses to be framed about the circuit level alterations that may mechanistically explain the observations in our study.