Maturational changes in song sparrow song

in this species. We discuss potential implications of these results for mate choice and male–male competition in song sparrows.


Introduction
Song is a sexually selected signal that plays an important role in mate choice and male-male competition in many songbird species (Andersson 1994, Catchpole and Slater 2008). In the context of mate choice, song may convey useful information about the quality of the male to potential mates (Searcy and Nowicki 2005, Catchpole and Slater 2008). Females benefit from recognizing features of song that indicate male quality, as this confers an advantage to their current reproductive effort and/or to the viability and reproductive success of their future offspring (Andersson 1994, Searcy and Nowicki 2005). Song may also provide information about male resource holding potential (RHP) and/or aggressive intent in the context of territory defense behavior (Vehrencamp 2000).
An individual's age has been proposed as a reliable correlate of its quality; thus, a receiver's ability to assess the age of a singing male may provide useful information to either a potential mate or a potential rival (Manning 1985, Kipper and Kiefer 2010, Zipple et al. 2020). For a potential mate, there are two ways in which age may be related to quality. First, older individuals may provide indirect benefits (sensu Kirkpatrick and Ryan 1991) by being generally of higher genetic quality, as evidenced simply by the fact that they have been able to survive to an older age (Trivers 1972, Manning 1985, Kokko 1997, 1998). Second, older males may provide direct benefits by having more experience, superior access to resources (e.g. high quality territories), or a greater capacity to provide parental care to offspring (Yasukawa et al. 1990). For a potential rival, aging may be associated with changes in RHP, which is relevant in male-male interactions. In particular, as has been shown with delayed plumage maturation (Greene et al. 2000), first-year birds may display a less threatening signal to avoid costly aggressive interactions with older males who have established territories (Vehrencamp 2000), and very old birds may be less threatening if they are less effective in territorial challenges (Zipple et al. 2020). Here, we analyze song sparrow Melospiza melodia song across the first three years of life to evaluate whether adequate age-related variation exists such that receivers could use song as an indicator of male age.
Delayed maturation of song has been observed in several species of songbird. In swamp sparrows, for example, song rate, song length, song consistency and vocal performance change between a male's first and second year of life (Ballentine 2009, Zipple et al. 2019). Other species exhibiting delayed maturation include willow warblers (Phylloscopus trochilus; repertoire size, element rate; Gil et al. 2001, Forstmeier et al. 2006, Węgrzyn et al. 2010), barn swallows (Hirundo rustica; song length, uniqueness; Galeotti et al. 2001), great tits (Parus major; song consistency; Rivera-Gutierrez et al. 2012), ortolan buntings (Emberiza hortulana; song type distribution and composition; Osiejuk et al. 2019) and Java sparrows (Lonchura oryzivora; song length and tempo; Ota and Soma 2014). The strongest effects have most often been detected between the first and second year of life, with mixed trends observed in subsequent years (Kipper and Kiefer 2010). Some species experience stabilization of song characteristics, others undergo continual increases in quality, and in at least three species, senescence of song in later years has been reported (Kipper and Kiefer 2010, Rivera-Gutierrez et al. 2012, Zipple et al. 2019, Berg et al. 2020).
The degree to which song sparrows exhibit delayed maturation of song (if at all) is currently unknown. Song repertoire is the only characteristic that has been examined in song sparrows for age-related change, with no evidence of delayed maturation in composition or size (Searcy et al. 1985, Nordby et al. 2002). Song sparrow songs and singing behavior vary, however, across a number of dimensions that may change with age, suggesting the need for a more thorough evaluation of delayed maturation of song for this species. Here, we use a longitudinal dataset obtained from 17 hand-reared male song sparrows recorded across their first three years of life to investigate whether age-related change occurs in song features that may affect male or female response, including measures of song complexity as well as measures of song output and song consistency.

Subjects and song tutoring
We collected 17 male song sparrows as nestlings in May 2013 from nests located in Crawford County, Pennsylvania, USA. At three to six days after hatching, we transported the birds to Duke University to be hand-reared in a large sound isolation room (Acoustic Systems RE-142, 1.9 × 1.8 × 2.0 m). We hand-fed nestlings a standard nestling diet (Marler and Peters 1988), starting at 30-min intervals from dawn to dusk and later transitioning to 60-min intervals. Birds were given ad libitum access to seed and other foods beginning at 15 days of age and were fully self-sufficient by 24 days of age on average. After fledging, birds were housed in individual cages together in the same room. Birds were maintained on a normal seasonally varying photoperiod for the duration of the study.
Each subject was tutored with recordings of wild male songs starting at approximately 10 days post-hatching and lasting for 12 weeks, ensuring that the entirety of the song learning sensitive period in this species was captured (Marler and Peters 1987). Recordings used for tutoring consisted of 32 different song types that had been recorded in 2009 and 2010 from free-living song sparrows at the same sites where we collected our experimental subjects. Tutor song types were presented in two 2.5-h sessions daily, during which each song type was played at a rate of six songs per minute for a four-min period with one min of silence between bouts (see Anderson et al. 2017 for more details). Each week the tutor songs were played in a different random order. All subjects experienced the same tutoring regime together in the sound-isolating room in which they were hand-reared.

Song recording
On days when birds were recorded as adults, they were moved to individual sound isolation chambers (Industrial Acoustics AC-1) to facilitate high quality recording. We analyzed songs that were recorded beginning late April (specifically, on the calendar date when a photoperiod of 13.5 h of light was first achieved) through early June of 2014, 2015 and 2016. Recording data were collected for all 17 individuals across these three years. The subjects' songs were fully crystallized by their first adult recording session in 2014.

Song analysis
We analyzed songs sung between one hour before dawn (i.e. lights on) until two hours after dawn, and we included songs recorded over three to ten separate days to acquire a minimum of 200 songs per bird per year (Searcy et al. 1985).
No new song types were observed in any bird's repertoire in second- or third-year recordings that had not been observed in its first-year recordings, so we are confident that we captured the entire song repertoire for each individual in the first year. We visualized spectrograms using Syrinx (J.M. Burt, Seattle, WA; 512 transform, 10 ms/line time axis, 0-10 000 Hz frequency axis) and Signal for Windows ver. 4 (172.3 Hz frequency resolution and 5.8 ms time resolution). Overall, our analysis included 50 933 songs recorded from 17 birds (per bird per year: M = 999, SD = 698).
Song sparrows in this population typically develop a song repertoire of 5-11 distinct song types (Boogert et al. 2011). Each song type is sung with variation, with variants of each type differing slightly in composition due to the omission, addition or rearrangement of one or more notes (Podos et al. 1992). By visually inspecting the spectrograms, we grouped songs into song types and classified renditions of each song type into different variants (Fig. 1). Variants were distinguished based on differences in the types and sequences of notes and trills in a song, but songs were still classified as the same variant if only the number of syllable repetitions in a trill differed (Podos et al. 1992). Only songs with significant differences in the structure and ordering of trills and note complexes, especially in the beginning and middle of the song, were classified as distinct song types (Fig. 1). Therefore, songs of the same type and variant can still exhibit variation in the number of notes but will always have the same number of unique note types. Almost 70% of song type variants are sung infrequently (<1% of a bird's total recording sample), and there is no evidence of a maximum number of song variants produced for a given song type (Podos et al. 1992). The number of song types and number of variants per song type were calculated as two initial measures of complexity. As another measure of within-type variation, we determined the most common variant of each song type for each bird in each year and calculated the proportion of total songs recorded that were represented by these variants.
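As a minimal sketch of the within-type variation measure described above, the following Python snippet tallies variants per song type for one bird in one year and computes the proportion of all songs that belong to the most common variant (MCV) of their song type. The record layout (pairs of song type and variant labels), the function name and the toy data are hypothetical illustrations, not the format or code used in the study.

```python
from collections import Counter, defaultdict


def mcv_proportion(songs):
    """Proportion of songs that are the most common variant of their song type.

    songs: iterable of (song_type, variant) labels for one bird in one year
    (a hypothetical layout chosen for illustration).
    """
    counts = defaultdict(Counter)
    for song_type, variant in songs:
        counts[song_type][variant] += 1
    total = sum(sum(c.values()) for c in counts.values())
    if total == 0:
        raise ValueError("no songs supplied")
    # For each song type, take the count of its most frequent variant.
    mcv_total = sum(c.most_common(1)[0][1] for c in counts.values())
    return mcv_total / total


# Hypothetical toy sample: two song types, two variants each.
toy_songs = (
    [("A", "a1")] * 6 + [("A", "a2")] * 2
    + [("B", "b1")] * 3 + [("B", "b2")] * 1
)
toy_proportion = mcv_proportion(toy_songs)  # (6 + 3) / 12
```

In this toy sample, 9 of 12 songs belong to a most common variant, so a higher value corresponds to lower within-type variation.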
We quantified a fourth measure of song complexity by counting the number of notes in each song exemplar of the most common variant for each song type, calculating the average for each most common variant and averaging across song types for each bird. In addition, we calculated the number of unique notes of each most common variant song type. We chose to calculate these measures exclusively with the most common variant of each song type because the most common variant is a reasonable proxy for what receivers are hearing most of the time. The most common variant is consistently sung over three times more often than the next most common variant (M = 3.08, SD = 0.57), and 74 ± 17% (mean ± SD) of songs sung within a day are classified as the most common variant established for that year.
We calculated the interval between songs by measuring the amount of time (to the nearest millisecond) between the start of each song, with the interval considered within-bout if the song type remained the same between songs and between-bout if the song type changed. For hourly singing rate, we simply calculated the average number of songs sung per hour.
We determined between-song consistency using pairwise spectrographic cross-correlation comparisons (512 pt FFT, frequency range 0-10 kHz; Signal for Windows ver. 4) between a middle syllable from ten exemplar trills of the same song type (n = 45 possible pairs). Exemplar trills were almost always internal (as opposed to introductory or terminal trills, e.g. unique notes f and g in panels B, C and D of Fig. 1) and we consistently chose the first identifiable set of ten consecutive songs recorded within the same day. We used any song type that contained an internal exemplar trill that always consisted of at least three syllables in the analysis, and each individual bird was represented by at least one song type over all three years. We measured within-song stereotypy using the same set of selected ten exemplar trills but instead performed cross-correlation comparisons of sequential syllables within each trill. For both cross-correlation calculations, we calculated the average and standard error (SE) for each unique combination of song type, year and individual bird.
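The interval classification above can be sketched as follows. This is an illustrative Python example under stated assumptions: songs are represented as hypothetical (start time in seconds, song type) pairs sorted by start time within a single continuous recording session, and the hourly rate is taken as total songs divided by total hours recorded.

```python
def classify_intervals(songs):
    """Split song-onset intervals into within-bout and between-bout lists.

    songs: list of (start_time_s, song_type), sorted by start time
    (a hypothetical layout chosen for illustration).
    """
    within, between = [], []
    for (t0, type0), (t1, type1) in zip(songs, songs[1:]):
        interval = round(t1 - t0, 3)  # onset to onset, nearest millisecond
        if type0 == type1:
            within.append(interval)   # same song type: within-bout
        else:
            between.append(interval)  # song type switch: between-bout
    return within, between


def hourly_singing_rate(n_songs, hours_recorded):
    """Average number of songs per hour of recording."""
    return n_songs / hours_recorded


# Hypothetical toy session: three songs of type A, then two of type B.
toy_session = [(0.0, "A"), (10.5, "A"), (22.0, "A"), (40.25, "B"), (51.0, "B")]
toy_within, toy_between = classify_intervals(toy_session)
```

Here the single between-bout interval is the one spanning the switch from type A to type B.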
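The pairing logic behind the two cross-correlation measures can be sketched as below. This Python example does not compute spectrographic cross-correlation; the `similarity` argument is a hypothetical stand-in for whatever score the analysis software returns for two syllables. Treating "sequential syllables" as adjacent pairs, and computing SE as the sample standard deviation divided by the square root of the number of comparisons, are both assumptions.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def between_song_scores(exemplars, similarity):
    """All pairwise comparisons among exemplar syllables (10 exemplars give 45 pairs)."""
    return [similarity(a, b) for a, b in combinations(exemplars, 2)]


def within_song_scores(trill_syllables, similarity):
    """Comparisons of adjacent syllables within one trill (assumed pairing)."""
    return [similarity(a, b) for a, b in zip(trill_syllables, trill_syllables[1:])]


def mean_and_se(scores):
    """Average and standard error of a list of similarity scores."""
    return mean(scores), stdev(scores) / sqrt(len(scores))


def toy_similarity(a, b):
    """Hypothetical placeholder score; NOT a spectrographic cross-correlation."""
    return 1.0 - abs(a - b)


toy_exemplars = [i / 100 for i in range(10)]  # ten stand-in "syllables"
toy_pairs = between_song_scores(toy_exemplars, toy_similarity)
```

With ten exemplars, `combinations` yields the 45 pairs noted in the text; the per-combination average and SE would then be taken over these scores.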

Statistical analysis
To explore the relationship between age and each song characteristic, we used linear mixed models with individual bird ID as a random effect to test for within-bird differences in song characteristics across the first three years of life (age treated as a factor; R ver. 3.6.3, packages 'lme4' and 'ggplot2'; <www.r-project.org>, Bates et al. 2014, Wickham 2016). To meet the assumptions of residual normality, we used a square root transformation when analyzing interval between songs, within-song stereotypy and between-song consistency, which produced approximately normal distributions of residual terms. With one exception, analyses of transformed and untransformed values were qualitatively identical. The only discrepancy occurred in the within-bout interval comparison between ages 1 and 3, where the result for the transformed values was significant (p = 0.04) and the result for the untransformed values was non-significant (p = 0.14).
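One plausible way to write the model described above is as a random-intercept model with age as a three-level factor. The notation is introduced here for illustration only; in particular, treating age 1 as the reference level is an assumption (it is the default coding in R but is not stated in the text).

```latex
y_{ij} = \beta_0
       + \beta_2\,\mathbb{1}[\mathrm{age}_{ij} = 2]
       + \beta_3\,\mathbb{1}[\mathrm{age}_{ij} = 3]
       + b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ is the (square-root transformed, where applicable) value of a song characteristic for observation $j$ of bird $i$, and $b_i$ is the random intercept for bird ID. Under this coding, $\beta_2$ and $\beta_3$ capture the age 1 versus 2 and age 1 versus 3 contrasts, and the age 2 versus 3 contrast corresponds to $\beta_3 - \beta_2$.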
Figure 1. Sample song sparrow sonograms with identified characteristics. All songs shown were sung by the same male. Panel A is a rendition of one song type and panels B, C, and D are different renditions of a second song type. (A) This example highlights the main components of a song: notes, syllables, trills and note complexes. Trills consist of repetitions of a syllable, which may include one or more notes. Note complexes consist of a group of notes that are not repeated in a pattern. Song sparrow songs typically alternate these phrase types. (B) This song consists of 10 unique notes and 18 total notes. It was the most common variant (MCV) in this bird's repertoire and accounted for 26% of songs classified as this song type in 2014. (C) Also sung in 2014, this song is the same type and variant as B, as it has the same sequence of trills and note complexes and therefore the same number and type of unique notes. However, note that due to variation in repetition of syllables, this song has 16 total notes instead of 18 in B. Discrepancies in the total number of notes do not warrant classification of a new variant, so long as the trill and note complex sequence and the unique number of notes remain the same. (D) This song is the same song type as shown in panels B and C but is classified as a different variant. This is due to the addition of notes at the end of the song, resulting in an increase to 12 unique notes and giving the song a total of 20 notes. This was the most common variant (MCV) and accounted for 54% of songs classified as this song type in 2015. In panels B-D, notes are identified with alphabetical letters below the sonogram, with different letters indicating unique note types.
We performed linear discriminant analysis using the MASS package in R (Venables and Ripley 2002, Ripley et al. 2013) to test whether age could be correctly classified based on our measured song characteristics. To minimize redundancy while maximizing explanatory power, we included each characteristic listed in Table 1 in the models, except that we included a single measure of time interval between all songs rather than distinguishing between within- and between-bout intervals. We created two models: one with three groups representing each age class (hereafter the 'discrete-age model'), and another with only two age classes (young = age 1, older = age 2 and 3; hereafter the 'maturity model'). To evaluate each model's predictive accuracy, we split the data (17 birds × 3 years = 51 unique profiles) into training (80% = 42/51) and test (20% = 9/51) sets and calculated percentage of correctly classified samples in the test set.
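The 42/9 split is what an 80/20 split stratified by age class would produce (17 profiles per age, of which 14 go to training and 3 to testing). That the split was stratified this way is an inference from the reported counts, not something stated in the text. The Python sketch below illustrates such a split; the profile layout, function name and random seed are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(profiles, train_fraction=0.8, seed=0):
    """Split (bird_id, age) profiles into train/test sets within each age class."""
    rng = random.Random(seed)
    by_age = defaultdict(list)
    for profile in profiles:
        by_age[profile[1]].append(profile)
    train, test = [], []
    for age in sorted(by_age):
        group = by_age[age][:]
        rng.shuffle(group)
        n_train = round(len(group) * train_fraction)  # 17 * 0.8 = 13.6 -> 14
        train.extend(group[:n_train])
        test.extend(group[n_train:])
    return train, test


# 17 hypothetical bird IDs x 3 ages = 51 profiles.
toy_profiles = [(bird, age) for bird in range(1, 18) for age in (1, 2, 3)]
toy_train, toy_test = stratified_split(toy_profiles)
```

Note that this sketch splits by profile, so the same bird can contribute profiles to both sets; whether the original split grouped profiles by bird is not specified.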

Results
Song sparrows in our sample had an average of 6.6 different song types (range: 5-11, SD = 1.4) in their repertoire and sang an average of nine variants (range: 1-32) per song type per year. We observed a consistent pattern of age-related change across the song characteristics we examined (Table 1, Fig. 2). In seven of the eight characteristics (within-type variation, number of notes, unique number of notes, within-bout time interval, between-bout time interval, singing rate and between-song consistency), a significant change occurred from age 1 to age 2, but no further significant change occurred from age 2 to age 3. Specifically, we found that measures of complexity such as the average number of notes per song and the number of unique notes per song significantly increased between ages 1 and 2 and between ages 1 and 3 but exhibited no change between ages 2 and 3. The proportion of songs identified as the most common variant followed a similar progression, with significant positive change observed between ages 1 and 2 and between ages 1 and 3, indicating a decrease in within-type variation. The time interval between song type switches significantly increased, and hourly singing rate significantly decreased, between ages 1 and 2 and between ages 1 and 3 (excluding within-bout time interval) but not between ages 2 and 3, once again highlighting the importance of the difference between the first two years of life. Between-song consistency also showed a slight yet significant decline from age 1 to age 2 and from age 1 to age 3. Within-song stereotypy was the only characteristic that did not significantly change with age.

Linear discriminant analysis
The results of our multivariate linear discriminant analysis were consistent with our analyses of individual song characteristics. Visual inspection of the linear discriminant values revealed substantial separation between one-year-old birds and all older birds but not between two-year-old and three-year-old birds. Applying the model predictively to the test data confirmed this visual assessment: the models effectively discriminated between songs recorded from one-year-old birds as compared to more mature birds (i.e. birds older than one year), but did not discriminate between two- and three-year-old birds. For this reason, the maturity model was much more successful than the discrete-age model in classifying age based on the measured song characteristics, as it correctly predicted the age class of 88.9% of the test observations (eight out of nine profiles) compared to only 55.6% for the discrete-age model (five out of nine profiles). Interestingly, the discrete-age model performed slightly better at identifying one-year-old birds, as it correctly classified all one-year-old birds (three out of three profiles), while the maturity model only correctly classified 66.7% of one-year-olds (two out of three profiles). However, all two- and three-year-old birds were correctly classified in the maturity model (six out of six profiles), whereas only 50% of two- and three-year-olds could be properly identified in the discrete-age model (three out of six profiles).
The single linear discriminant identified by the maturity model consisted primarily of number of unique notes (coeff = 1.15), within-type variation (coeff = 1.10) and singing rate (coeff = −0.88). These same variables contributed most heavily to the first linear discriminant from the discrete-age model, which accounted for 96.1% of the predicted among-group variation (coeff = 1.12; 1.29; −0.89). Visualization of the two models confirms a difference in discriminability. Density plots reveal that the maturity model had substantially less overlap between age categories and was thus better able to discriminate than the discrete-age model (Fig. 3). The discrete-age model had a large amount of overlap between ages 2 and 3, which most likely accounts for its inability to effectively discriminate (Fig. 3).
Table 1. The first three columns show averages and standard errors of each song characteristic for ages 1-3 and the last three columns indicate significance levels for each age comparison. Characteristics that were square-root transformed to meet the assumption of normality include time interval, within-song stereotypy and between-song consistency. MCV refers to most common variant. p values for each age comparison are reported from linear mixed effects models using the following significance levels: p < 0.05*, p < 0.0005***.
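The reported test-set percentages follow directly from the stated counts, as the short Python sketch below shows. The function and variable names are hypothetical; the counts are taken from the text, with the per-class breakdown shown for the maturity model.

```python
def percent_correct(correct, total):
    """Percentage of correctly classified profiles, rounded to one decimal."""
    return round(100 * correct / total, 1)


# Overall test-set accuracy (counts as reported in the text).
maturity_overall = percent_correct(8, 9)      # maturity model
discrete_age_overall = percent_correct(5, 9)  # discrete-age model

# Per-class counts for the maturity model: (correct, total).
maturity_by_class = {"young (age 1)": (2, 3), "older (ages 2-3)": (6, 6)}
maturity_class_pct = {
    label: percent_correct(c, n) for label, (c, n) in maturity_by_class.items()
}
```

With only nine test profiles, each misclassification shifts accuracy by roughly 11 percentage points, so these percentages are coarse estimates.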

Discussion
Male song sparrow song is a complex, variable trait and an important assessment signal affecting female mate choice (Searcy et al. 1985, Hiebert et al. 1989) and male territory defense (Nowicki et al. 1998). The main goal of the present study was to ask whether song characteristics could provide information about male age across the first three years of life. Two conditions must be met for this to be possible: First, there must be measurable age-related changes in song features that differ significantly across years. Second, there must be a reasonable likelihood that potential mates or potential rivals could distinguish age based on these differences. We found that several characteristics of song sparrow song exhibit age-related change, which can be attributed largely to significant changes occurring between ages 1 and 2 (Table 1). In aggregate, age-related differences in these song features may allow receivers to distinguish between young (age 1) and older (age 2-3) males, but are unlikely to allow receivers to distinguish between two-year-old and three-year-old males (Fig. 3). These findings are consistent with the idea that song could serve as a signal of maturational status in the context of mate choice and/or male-male competition in this species. The song characteristics most strongly associated with age are the degree of within-type variation (the proportion of songs that belong to the most common variant), the number of unique notes sung in the most common variant and singing rate. While all measured features contribute at least to some degree and have the potential to be salient in age discrimination, these three components explain the bulk of the total variation in male song across years. Thus, if signal receivers discriminate males' songs based on age, they would likely use some combination of differences in these three variables.
Two measures of song complexity clearly increase with age: the average number of notes and the unique number of note types contained in a song of the most common variant. This finding is consistent with the possibility that older males are more attractive to females, as the complexity of a signal is often an honest positive indicator of the quality of an individual (Manning 1985, Kipper and Kiefer 2010). In some species, an increase in song complexity has also been shown to be associated with the outcome of aggressive male-male interactions. For instance, male New Zealand tuis Prosthemadera novaeseelandiae tend to respond more aggressively to complex song and perceive complex song as a greater territorial threat (Hill et al. 2018), and male European robins Erithacus rubecula respond faster (i.e. more aggressively) to more complex songs (Kareklas et al. 2019). Increases in both average number of notes and unique number of note types represent increases in song complexity, albeit in slightly different ways; an increase in the unique number of notes necessarily indicates a change in the most common variant of a song type, whereas an increase in the average number of notes may instead indicate lengthening of trill phrases within the same most common variant.
Figure 2. Multiple song sparrow song characteristics display maturational change in the first three years of adult life. Outliers were removed from the figure for within-bout time interval to better illustrate differences (outliers retained in the analysis). Characteristics that were square-root transformed to meet the assumption of normality include within-bout time interval, between-bout time interval, within-song stereotypy and between-song consistency. MCV refers to most common variant. Significance determined by linear mixed models. *p < 0.05, ***p < 0.0005.
While other parameters such as singing rate, time interval and within-type variation clearly exhibited significant age-related change, the corresponding link between male quality and variation in these features is unclear. Specifically, time interval increases with age, while singing rate and within-type variation decrease with age. Interestingly, within-type variation has been shown to increase in aggressive contests in song sparrows, although it is not clear if this increase functions as a signal (Searcy et al. 2000). Such measures of song output and complexity are expected to correlate positively with individual quality, yielding an expectation that higher singing rate, shorter time interval between songs and more within-type variation would be associated with high-quality individuals (e.g. song complexity and learning ability in zebra finches: Clayton and Pröve 1989, Boogert et al. 2008; singing rate and territory quality in pied flycatchers: Gottlander 1987, Alatalo et al. 1990). Yet, individuals in their first year sang with characteristics that tend to indicate high quality, even though younger individuals are expected to be of lesser quality.
Our findings for between-song consistency and within-song stereotypy also do not match expectations of increased quality with age. Across many songbirds, these performance-related measures have been established as positively related to measures of individual quality (age, dominance and reproductive success in tropical mockingbirds, Botero et al. 2009; male philopatry in great reed warblers, Węgrzyn et al. 2010) and are attended to by receivers (preference in banded wrens, de ; more aggressive male response in great tits, Rivera-Gutierrez et al. 2011). In our analysis, however, between-song consistency declined with increasing age and within-song stereotypy remained constant with age, neither of which is consistent with the idea that older males are of higher quality. Nevertheless, any consistent age-related change may be used in tandem with other characteristics by receivers to assess age, regardless of its relation to individual quality. Thus, between-song consistency may still be relevant in signaling age in this species, while within-song stereotypy is unlikely to be used by receivers in assessing age. Importantly, receivers may still attend to between-song consistency and within-song stereotypy independent of their relation to age if they convey other relevant information about the signaler. For example, between-song consistency and within-song stereotypy may provide more information about the quality of the individual within a given age class rather than information about the age class of an individual.
Figure 3. Linear discriminant model successfully discriminates between one-year-old birds and older birds. Density plots of linear discriminant 1 (LD1) for (A) discrete-age model and (B) maturity model. The models were capable of discriminating one-year-old birds from older birds, but not two-year-old and three-year-old birds. Tick marks correspond to the calculated LD1 of all training data profiles, yielding 42 unique marks for each model.

The discriminant function analysis only effectively categorized song based on age when birds aged 2 and 3 were combined into one older age class (Fig. 3). This finding is not surprising considering that the largest and most significant changes in all features that displayed delayed maturation occurred between years 1 and 2, whereas little to no change occurred between years 2 and 3 (Table 1). These results are consistent with the notion that a peak in male quality is likely achieved around ages 2-3, which is expected from theory and experimental studies linking older age with higher reproductive fitness (Clutton-Brock 1988, Forslund and Pärt 1995). Although the discriminant function analysis results demonstrate that it should be possible for receivers to discriminate song from differently aged males, we cannot draw any conclusions as to whether individuals actually perceive and respond to these differences. Nonetheless, our results are consistent with the possibility that female receivers can selectively mate with fully mature males and male receivers can preferentially avoid aggressive interactions with older, more established males.
We were only able to examine data across the first three years of life and so were unable to evaluate potential age-related decreases in song quality, or senescence, as has been demonstrated in a longitudinal study of swamp sparrows (Zipple et al. 2019, 2020), a longitudinal study of great tits (Rivera-Gutierrez et al. 2012) and a cross-sectional study of Seychelles warblers (Berg et al. 2020). If senescence also occurs in song sparrows, then song could be an indicator of intermediate age rather than simply older age. Signaling intermediate age may be particularly important in songbird mate choice, since we would expect intermediate-aged males to be preferred by females. These males provide more direct benefits than younger males but do not yet exert a considerable negative genetic effect on offspring quality like older males (Segami et al. 2021). In a population of song sparrows in the eastern United States living in similar habitat and at a similar latitude as the population from which our subjects were taken, Nice (1937) reported the average life span to be just under three years, although many males live to five or six years of age and a few live to be eight or nine years old. Therefore, changes to song in the first three years of life are most likely to be relevant in natural populations.
Although our subjects were hand-reared, it is unlikely that the captive environment substantially influenced our results. Multiple studies have shown that song sparrows and other songbirds can learn normal song using wild recordings of adult song alone and without the aid of a live tutor (Marler and Peters 1987, 1988). There is a chance, however, that some aspects of song were influenced by the absence of adult males during the subjects' adulthood and absence of conspecific interaction during recording sessions. Thus, our results only reflect an example of how song may develop and change with age. Only by directly comparing our findings with song development in wild populations would we be able to assess whether the hand-rearing process significantly impacts the presence of delayed maturation in this species. Because of this uncertainty, our findings should be interpreted with caution. Nonetheless, the recorded repertoires presented in this study are generally consistent with characteristics of normal song development in wild populations. It may, however, be important to note that our subjects learned significantly fewer song types on average than a wild population (6.6 ± 1.4 vs 7.9 ± 1.6 in Boogert et al. 2011; unpaired t-test: p = 0.0043). We have no reason to believe that hand-rearing or acclimation to captivity had significant directional effects on our results, given that all birds were housed in the same space and the social environment was held constant throughout.
The most important question that remains is whether signal receivers attend to the age-related differences we have documented. Age-related change in song can only be considered relevant with the support of playback experiments and evaluation of corresponding fitness outcomes in a natural setting. If males or females respond differentially to song with characteristics of two- and three-year-old males over one-year-old males, then song can be considered a reliable predictor of age. If females that mate with older males are conferred a reproductive fitness advantage or if older males are more effective at establishing and maintaining a territory, then song can also be considered a reliable indicator of male quality. Future work should examine behavioral responses of male and female conspecifics to songs of differently aged males to test these predictions. Determining whether age-related changes in song quality have implications for male fitness will deepen our understanding of the role of signaler age in sexual selection and the evolution of age-specific signaling strategies.