Sex-specific additive genetic variances and correlations for fitness in a song sparrow (Melospiza melodia) population subject to natural immigration and inbreeding

Quantifying sex-specific additive genetic variance (VA) in fitness, and the cross-sex genetic correlation (rA), is pre-requisite to predicting evolutionary dynamics and the magnitude of sexual conflict. Quantifying VA and rA in underlying fitness components, and multiple genetic consequences of immigration and resulting gene flow, is required to identify mechanisms that maintain VA in fitness. However, these key parameters have rarely been estimated in wild populations experiencing natural environmental variation and immigration. We used comprehensive pedigree and life-history data from song sparrows (Melospiza melodia) to estimate VA and rA in sex-specific fitness and underlying fitness components, and to estimate additive genetic effects of immigrants as well as inbreeding depression. We found substantial VA in female and male fitness, with a moderate positive cross-sex rA. There was also substantial VA in adult reproductive success in males but not females, and moderate VA in juvenile survival but not adult survival. Immigrants introduced alleles for which additive genetic effects on local fitness were negative, potentially reducing population mean fitness through migration load, yet alleviating expression of inbreeding depression. Substantial VA for fitness can consequently be maintained in the wild, and be concordant between the sexes despite marked sex-specific VA in reproductive success.


Introduction
The magnitude of additive genetic variance (V A ) in fitness governs the rate of adaptive trait evolution and the expected increase in population mean fitness (Fisher 1930;Robertson 1966;Price 1970), and thereby links adaptation and population persistence (Bell 2013; Gomulkiewicz and Shaw 2013;Carlson et al. 2014;Shaw and Shaw 2014). Quantifying the magnitude of V A in fitness, and identifying key mechanisms that maintain or constrain such V A , are consequently central objectives in evolutionary biology (Burt 1995;Barton and Keightley 2002;Ellegren and Sheldon 2008;Walsh and Blows 2009;Shaw and Shaw 2014).
The magnitude and maintenance of V A will partly depend on the genetic architecture underlying fitness in and among individuals. Specifically, in organisms with separate sexes, many genes that affect fitness will be expressed in both sexes. Such genes can have congruent or divergent pleiotropic effects on multiple sex-specific fitness components encompassing survival to sexual maturity and subsequent reproductive success (Arnold and Wade 1984;Falconer 1989, p.338;Chippindale et al. 2001). Such positive or negative pleiotropy can create additive genetic correlations (r A ) between the sexes, and among fitness components within each sex, potentially generating evolutionary sexual conflict and multiple life-history trade-offs and multi-dimensional constraints (Lande 1980(Lande , 1982Rose 1982;Charlesworth 1987;Chippindale et al. 2001;Kruuk et al. 2008;Bonduriansky and Chenoweth 2009;Walsh and Blows 2009;Shaw and Shaw 2014).
Consequently, V A in sex-specific fitness and fitness components, and corresponding cross-sex and within-sex r A s, are key parameters that shape the total V A for fitness that emerges and is maintained following selection (Lewontin 1974;Rose 1982;Chippindale et al. 2001;Brommer et al. 2007;Kruuk et al. 2008;Walsh and Blows 2009;Walling et al. 2014).
Further, the magnitude of V A in fitness and underlying fitness components that is maintained in any focal population or sub-population will also depend on natural spatio-temporal variation in fitness, and on resulting variation in the form of local selection and adaptation and associated patterns of immigration and inter-deme gene flow (Merilä and Sheldon 1999;Zhang 2012;Carlson et al. 2014;Shaw and Shaw 2014). Immigration could increase V A by introducing alleles with negative or positive additive effects on local fitness, potentially causing migration load, and impeding or facilitating adaptation and population growth (Lenormand 2002;Garant et al. 2007;Edelaar and Bolnick 2012;Carlson et al. 2014). Such immigration could further change mean fitness by altering the degree of local inbreeding versus outbreeding and associated expression of inbreeding depression, heterosis, and outbreeding depression (Ingvarsson and Whitlock 2000;Tallmon et al. 2004;Frankham 2016). Such effects, and resulting effective rates of gene flow, depend fundamentally on the genetic properties of immigrants relative to focal natives (Ingvarsson and Whitlock 2000;Tallmon et al. 2004;Edelaar and Bolnick 2012). Therefore, understanding and predicting overall evolutionary dynamics requires estimation of V A in fitness and underlying fitness components in both sexes, and associated cross-sex and within-sex r A s, in wild populations experiencing natural abiotic and biotic environmental variation (Ellegren and Sheldon 2008;Kirkpatrick 2009;Kruuk et al. 2008Kruuk et al. , 2014Shaw and Shaw 2014;Walling et al. 2014), and also requires explicit estimation of multiple genetic effects resulting from immigration (Ingvarsson and Whitlock 2000;Lenormand 2002;Tallmon et al. 2004;Garant et al. 2007;Edelaar and Bolnick 2012;Carlson et al. 2014).
Fitness can be defined and measured in numerous ways (Brommer 2000;Metcalf and Pavard 2007;Orr 2009;Saether and Engen 2015). In the context of Fisher's (1930) Fundamental Theorem, absolute fitness is most straightforwardly defined as the total number of zygotes produced by a zygote (Crow and Kimura 1970;Arnold and Wade 1984;Falconer 1989 p. 336;Shaw and Shaw 2014). Such fitness emerges from sequential life-history events, encompassing survival from conception to sexual maturity and subsequent adult lifetime reproductive success (LRS). In iteroparous species, adult LRS itself results from a repeating sequence of reproduction followed by survival towards the next reproductive opportunity, terminated by death. Fitness and its components therefore reflect expression of numerous developmental, physiological, morphological and behavioral traits, and are consequently best conceptualized as highly polygenic, complex traits (e.g. Houle 1992; Barton and Keightley 2002;Flint and Mackay 2009;Hill 2012;Travisano and Shaw 2013) even though loci of large effect can exist (e.g. Johnston et al. 2013;Trask et al. 2016). Key V A s and r A s can be estimated using quantitative genetic methods derived from the infinitesimal model (Lynch and Walsh 1998). Although the phenotypic distribution of fitness is intrinsically non-Gaussian (Arnold and Wade 1984; Wagenius et al. 2010;Shaw and Etterson 2012;Bell 2013;Shaw and Shaw 2014), V A s and r A s can be estimated on appropriate latent scales in order to fulfill the fundamental quantitative genetic assumption of multivariate normality of the average effect of an individual's polygenic genotype (i.e. breeding value, Lynch and Walsh 1998 pp.72-79;de Villemereuil et al. 2016).
Such estimation of V A s and r A s in wild populations is empowered by a class of quantitative genetic generalized linear mixed models (QGGLMMs) commonly known as 'animal models' (Kruuk 2004;Charmantier et al. 2014). Such models partition variance in observed phenotypes across individuals and, given an appropriately specified relatedness matrix and model, minimize biases in estimates of V A and r A stemming from selection (i.e. non-random variation in fitness) and resulting unobservable phenotypes, as well as estimate variances arising from shared environmental effects (Henderson 1973;Kruuk 2004;Kruuk and Hadfield 2007;Hadfield 2008). Such models can also directly estimate mean additive genetic values of immigrants relative to natives, and estimate the magnitude of inbreeding depression, and thereby elucidate key roles of immigration and resulting gene flow in shaping phenotypic means and variances Wolak and Keller 2014;Wolak and Reid 2017).
However, despite the widely recognized need and available statistical methods, few studies have rigorously estimated sex-specific V A s and the cross-sex r A in fitness in wild populations (Burt 1995;Gardner et al. 2005;Kruuk et al. 2008;Kirkpatrick 2009;Shaw and Shaw 2014). Of 17 known studies (including on humans, Homo sapiens) that estimated V A for sex-specific absolute fitness measured approximately zygote to zygote, only 8 considered male fitness alongside female fitness (Appendix S1). Since most such studies estimated at least one sex-specific V A to be close to zero, possibly due to low power, only two attempted to estimate the cross-sex r A (McFarlane et al. 2014;Zietsch et al. 2014, Appendix S1). Two further studies estimated the cross-sex r A for fitness measured as an adult's number of adult (i.e. recruited) offspring (Brommer et al. 2007;Foerster et al. 2007), but such cross-generation measures are harder to reconcile with primary evolutionary theory (Arnold and Wade 1984;Wolf and Wade 2001), and all available estimates are very imprecise (Appendix S1). Further, few studies explicitly estimated V A on appropriate latent scales (but see Milot et al. 2011;McFarlane et al. 2014) or then transformed estimates back onto observed phenotypic scales, as is ideally required to facilitate cross-study comparison (de Villemereuil et al. 2016). Finally, no QGGLMM analysis of V A in fitness has explicitly estimated additive genetic effects of immigrants, or thereby directly assessed the role of introgressive gene flow in changing local mean breeding value and maintaining V A and associated evolutionary potential (Wolak andReid 2016, 2017).
This paucity of estimates likely reflects the substantial challenges of collecting comprehensive sex-specific fitness and relatedness data from free-living individuals. Since wild population studies can rarely count all conceived zygotes, fitness can be pragmatically quantified as the total number of offspring produced over an individual's lifetime, where focal individuals and their offspring are censused as close to conception as feasible (typically soon after birth, hatch, or seed formation).
However, most field datasets have some degree of missing or incorrect parentage assignment, and resulting pedigree error could bias quantitative genetic analyses (Brommer et al. 2007;Firth et al. 2015;Wolak and Reid 2017). Further, challenges of tracking juveniles among natal and subsequent breeding locations, and of paternity assignment, mean that records of survival to maturity and male reproductive success are often missing or incorrect (Kruuk et al. 2000;Brommer et al. 2007; Stinchcombe 2014). Observed distributions of fitness may also exclude unobserved non-breeders, and hence inaccurately reflect the frequency of individuals with zero fitness. Such error will likely bias, with respect to fitness, estimates of V A , phenotypic means and variances, and hence key standardized metrics that depend on V A and moments of the phenotypic distribution and that underpin comparative analyses (heritability, h 2 ; evolvability, I A ; coefficient of additive genetic variance, CV A ;e.g. Freeman-Gallant et al. 2005). Even given comprehensive data spanning multiple generations, V A s and cross-sex r A s in non-Gaussian traits are notoriously difficult to estimate precisely (Shaw 1987;Poissant et al. 2010;Kruuk et al. 2014). Statistical methods that adequately quantify uncertainty should then be used to facilitate inference and subsequent meta-analyses (Garcia-Gonzalez et al. 2012).
Accordingly, we fitted Bayesian QGGLMMs to comprehensive multi-generation fitness and pedigree data from song sparrows (Melospiza melodia) to estimate sex-specific V A s and r A s in fitness, and in two hierarchical levels of fitness components. Specifically, we estimated (i) V A in sex-specific fitness and the cross-sex r A , thereby evaluating scope for inter-sexual conflict; (ii) V A and r A in and among juvenile survival and sex-specific adult LRS, comprising the primary fitness components that generate overall fitness; and (iii) V A and r A in sex-specific adult annual reproductive success (ARS) and (iv) V A in adult annual survival, comprising the key life-history traits that generate adult LRS. In all cases, we explicitly estimated additive genetic effects of immigrants relative to defined local population founders, and estimated the magnitude of inbreeding depression, and thereby evaluate concurrent impacts of natural immigration and resulting gene flow on local additive genetic and phenotypic variation in fitness. Estimating V A and r A in fitness and fitness components in the wild is perhaps most tractable in populations with limited emigration but sufficient immigration to generate substantial variance in relatedness, and where all local residents and immigrants can be observed. A population of song sparrows inhabiting Mandarte Island, British Columbia, Canada, fulfills these criteria and has proved valuable for quantifying fitness of residents and immigrants and for pedigree-based quantitative genetic analyses (Keller 1998;Marr et al. 2002;Reid et al. 2011Reid et al. , 2014aReid and Sardell 2012;Wolak and Reid 2016).

STUDY SYSTEM
Mandarte's song sparrows typically form socially monogamous breeding pairs, starting from age one year, with a mean of 28±11SD (range 11-52) breeding females per year during 1993-2015.
Pairs can rear up to three broods of chicks per year (mean brood size 2.8±1.0SD chicks, range 1-4).
However, 28% of offspring are sired by extra-pair males (Sardell et al. 2010), creating opportunities for individual males to gain or lose substantial reproductive success compared to their sociallypaired female (Reid et al. , 2014bReid and Sardell 2012). Further, since the adult sex-ratio is often male-biased (mean proportion males during 1993-2015: 0.60±0.09SD, range 0.39-0.75), some males remain socially unpaired in some years (Lebigre et al. 2012), and these males typically gain little extra-pair paternity (Sardell et al. 2010). Consequently, the population's mating system and ecology fosters different means and variances in female versus male reproductive success (Lebigre et al. 2012), creating potential for sexual conflict and trade-offs over fitness components despite social monogamy.
Since 1975, virtually all song sparrow breeding attempts on Mandarte were closely monitored and all chicks surviving to ca. 6 days post-hatch were marked with unique combinations of metal and colored plastic bands (Smith et al. 2006). Mandarte lies within a large song sparrow metapopulation and receives occasional immigrants (totaling 28 females and 16 males during 1976-2014) that were mist-netted and color-banded soon after arriving (Marr et al. 2002;Reid et al. 2006;Smith et al. 2006). Consequently, every song sparrow in the population is individually identifiable individuals, including unpaired males, with resighting probability >0.99 (Wilson et al. 2007). Local chick survival from banding to adulthood the following April, and adult survival to subsequent years, were consequently accurately recorded (Keller 1998;Smith et al. 2006).
Each year, the socially-paired parents that reared all banded offspring were identified. To determine genetic parentage, since 1993 all banded chicks and adults were blood sampled and genotyped at 160 polymorphic microsatellite loci. All chicks were assigned to genetic parents with >99% individual-level confidence (Sardell et al. 2010;Nietlisbach et al. 2015). These analyses demonstrated zero extra-pair maternity, and effectively eliminated paternity error. Each banded individual's sex was determined from adult reproductive behavior and/or by genotyping the chromobox-helicase-DNA-binding (CHD) gene (Postma et al. 2011;Nietlisbach et al. 2015).
The local fitness of each chick banded on Mandarte since 1993 was measured as its total lifetime number of chicks banded on Mandarte, including zeros for chicks that died before adulthood (Appendix S2). The two major fitness components, juvenile survival and adult LRS, were respectively measured as survival from banding to adulthood the following April, and the total number of banded chicks assigned to individuals that survived to adulthood. For each adult, LRS was then further subdivided into ARS and annual survival, respectively measured as the number of banded chicks assigned to each individual in any one year, and survival to the following April.
Since adult (breeding) dispersal away from Mandarte is probably very rare, observed local adult survival likely equates to true survival (Marr et al. 2002;Smith et al. 2006). The relatively high local recruitment rate implies that juvenile (natal) dispersal is also relatively infrequent, although non-zero. However, surveys of immediately surrounding islands have detected few local dispersers, implying that unobserved dispersal from Mandarte is likely to be longer distance. Observed juvenile survival on Mandarte is therefore an appropriate measure of effective local survival and hence local fitness.

QUANTITATIVE GENETIC MODELS
We fitted a series of four non-Gaussian QGGLMMs designed to estimate sex-specific additive genetic variances (V A ) and covariances (COV A ), and hence estimate associated standardized statistics (r A , h 2 , I A , CV A ), for fitness and fitness components.
First, we fitted a bivariate QGGLMM (Appendix S4) to estimate V A in female and male fitness and the cross-sex COV A , assuming Poisson distributions with log link functions. Random hatch-year effects were fitted to estimate sex-specific cohort variances in fitness and the cross-sex cohort covariance. Sex-specific residual variances were estimated assuming additive overdispersion, with residual covariance fixed to zero.
Second, we fitted a trivariate QGGLMM (Appendix S4) to estimate V A in juvenile survival and adult female and male LRS, and the three pairwise COV A s. We modeled juvenile survival as a single joint trait of both sexes with sex-specific intercepts, rather than as two sex-specific traits.
This simplification facilitated multivariate analysis of juvenile survival alongside sex-specific adult LRS, and is justified because previously published and exploratory analyses demonstrated a strong positive cross-sex r A for juvenile survival and similar magnitudes of V A in both sexes, implying considerable shared V A (Reid and Sardell 2012, Appendix S7). Under these conditions, modeling a single trait for both sexes does not bias estimates of V A (Wolak et al. 2015). Juvenile survival was modeled as a binary trait with logit link function and residual variance fixed to one. We assumed Poisson distributions for female and male LRS, with log link functions and independent residual variances (as for fitness). Random hatch-year effects were again fitted, thereby estimating cohort variances and covariances in and among the three traits. Random effects of the identities of each chick's mother, social father, social parent pair, and brood were also fitted for juvenile survival, thereby accounting for common environmental effects stemming from parental care and natal conditions. While these four effects may be somewhat confounded, our aim was not to precisely estimate associated variances, but simply to minimize possible bias in V A (e.g. Kruuk and Hadfield  Reid et al. 2014a). Analogous common environmental effects were not fitted to female and male adult LRS, because relatively few parents and broods produced multiple same-sex chicks that survived to adulthood, and previous analyses did not reveal substantial parental effects on adult lifehistory traits.
Third, we fitted a bivariate QGGLMM (Appendix S4) to estimate V A in adult female and male ARS and the cross-sex COV A assuming Poisson distributions for both traits, log link functions, and independent residual variances. Random individual effects were fitted to estimate sex-specific permanent individual variances (i.e. repeatable among-individual variation stemming from permanent environmental and/or non-additive genetic effects). Random year of observation effects were also fitted to estimate among-year environmental variances and the cross-sex year covariance.
Fourth, we fitted a univariate QGGLMM (Appendix S4) to estimate V A in annual adult survival modeled as a single trait for both sexes with sex-specific intercepts (as for juvenile survival). We modeled survival as a binary trait expressed by each individual adult in each year, with logit link function and residual variance fixed to one (e.g. Hadfield et al. 2013). Random year of observation and individual effects were fitted to estimate among-year environmental variance and account for overdispersion compared to the assumed geometric distribution of age-specific survival events.

IMMIGRANTS, INBREEDING DEPRESSION, AND FIXED EFFECTS
Standard QGGLMMs estimate V A and COV A for a default base population that comprises 'phantom parents' of all pedigreed individuals with unknown parents (Kruuk 2004;Wolak and Reid 2017). In populations with complete local pedigree data for a focal study period but that are open to immigration, the default base population comprises phantom parents of all adults alive at the study start (hereafter 'founders') and of subsequent immigrants. To directly estimate the difference in mean additive genetic value for fitness and fitness components between the defined founders and individual's autosomal genome that originated from the defined immigrant group, calculated from pedigree data (Appendix S3). The regression slope (β IGG ), modeled as a fixed effect, estimates the difference in mean additive genetic value of the immigrant group relative to the founder group (Wolak and Reid 2017). Since immigration was infrequent, phantom parents of female and male immigrants that arrived in all years were pooled into a single genetic group (Appendix S3). This assumes that the phantom mothers of female and male immigrants have similar mean genetic values as the phantom fathers for any focal trait, and hence that alleles originating in immigrants of both sexes similarly affect the genetic values of descendants of both sexes. This mirrors the standard QGGLMM assumption that female and male phantom parents of founders have the same mean breeding values for any focal trait (Wolak et al. 2015).
To quantify inbreeding depression, and minimize bias in V A estimates that can result from correlated inbreeding across relatives, all four QGGLMMs also included trait-specific linear regressions on individual coefficient of inbreeding (f), calculated from pedigree data Wolak and Keller 2014). Regression slopes (β f ) equate to haploid 'lethal equivalents' for traits modeled with log link functions (fitness, adult LRS, and ARS), but not for traits modeled with logit link functions (juvenile and adult survival).
Further fixed effects were restricted to those required to standardize trait observations across individuals. Since juvenile survival probability decreases with increasing seasonal hatch date (Smith et al. 2006), and hatch date reflects the parents' breeding phenotype, models for juvenile survival included a linear regression on the first egg lay date in the nest in which each focal individual hatched. Since adult ARS and survival vary with age (Smith et al. 2006;Keller et al. 2008), associated models included categorical effects of age at observation (ages 1, 2, 3-5, or ≥6 years).  (Sardell et al. 2010;Reid et al. 2011;Nietlisbach et al. 2015Nietlisbach et al. , 2017. For each QGGLMM, the pedigree was pruned to individuals with observed phenotypes and their known ancestors. The inverse numerator relationship matrix, and individuals' IGG and f coefficients, were computed using standard algorithms (Wolak and Reid 2017, Appendix S3). Immigrants were defined as unrelated to all Mandarte residents at arrival, and to subsequent immigrants (Marr et al. 2002;Reid et al. 2006).
For each model, phenotypic data were restricted to cohorts for which all or virtually all individuals had complete fitness or fitness component data, known sex, and genetically verified parents (Appendix S2). Observations of immigrants' own phenotypes were excluded because they might reflect ecological effects associated with dispersal or subsequent settlement (Marr et al. 2002), and because immigrants' f values are undefined relative to the Mandarte pedigree base population (Reid et al. 2006). However, immigrants that produced ≥1 banded offspring were explicitly included in the pedigree to enable estimation of relatedness among descendants and genetic group effects.
All models were implemented in a Bayesian framework, using a Markov chain Monte Carlo (MCMC) algorithm to sample posterior distributions. We used diffuse normal prior distributions for all fixed effects (mean=0, variance=10 10 ), and multivariate parameter expanded priors for covariance matrices that gave uniform marginal prior distributions on the correlation. Parameter

FITNESS
Across 1406 female and 1415 male chicks banded on Mandarte during 1993Mandarte during -2012Mandarte during , 1177 and 1185 (83.7%) respectively had zero fitness. Consequently, fitness distributions were strongly right-skewed, with maxima of 50 and 69 banded offspring for females and males respectively (Fig.   1A). Raw mean sex-specific fitness was 1.78 and 1.70 respectively, with substantial phenotypic variances (females 29.8, males 31.7).
In the bivariate QGGLMM, the posterior distributions for latent-scale V A in female and male fitness showed clear peaks that were substantially shifted away from zero and from the prior distributions, indicating substantial V A for sex-specific fitness (Figs. 2A,B). The posterior modes were similar in both sexes, and the lower 95%CI limits did not converge towards zero (Table 1).
There was non-zero cohort variance and substantial residual variance in both sexes, reflecting the overdispersed phenotypic distributions (Table 1, Fig. 1). Consequently, there was relatively small but non-zero heritability of fitness in both sexes; posterior modes and means for h 2 latent were 0.08-0.09, with lower 95%CI limits that did not converge to zero (Table 1, Fig. S2).
The posterior mode for the cross-sex COV A in fitness was positive, generating a posterior mode for the cross-sex r A of intermediate magnitude between zero and one (Table 1, Fig. 2C). The 95%CI for r A was wide and included zero. However, 88% of the posterior density exceeded zero, representing substantial divergence from the uniform prior, yet the upper 95%CI limit did not converge towards one (Table 1, Fig. 2C). This implies that fitness variation has some, but not all, of the same additive genetic basis in females and males.
In total, 26 immigrants that arrived on Mandarte during 1976-2012 made a non-zero expected genetic contribution to the 2821 Mandarte-hatched individuals whose fitness was observed (Appendix S3). Across these 2821 individuals, mean IGG coefficient was 0.520.13SD (range 0.14-0.86). Approximately half the focal individuals' genomes are therefore expected to have originated from immigrants on average, implying that immigration could contribute substantially to standing V A within the Mandarte breeding population. The posterior modes for the regressions of sex-specific fitness on IGG, which quantify mean immigrant genetic group effects, were negative in both sexes with 95%CIs that did not overlap zero (Table 1). Additive effects of alleles carried by immigrants therefore decreased fitness, relative to additive effects of alleles in the defined founder population, in both sexes.
Substantial variation in f was directly attributable to immigration: 91% of individuals with f=0 had one immigrant parent. However, since immigrants' descendants commonly inbred in future generations, f and IGG were only moderately correlated across individuals (females: r=-0.25, males: r=-0.30). The posterior modes for the regressions of sex-specific fitness on f were negative with 95%CI that did not overlap zero, demonstrating very strong inbreeding depression in fitness in both sexes (Table 1). variance 85.1, 5.8% zeroes) and 7.7 (median 4, variance 97.6, 26.3% zeroes) banded offspring respectively (Fig. 1B).
In the trivariate QGGLMM, the posterior distribution for V A in juvenile survival showed a clear peak, and hence posterior mean, that departed from zero and from the prior distribution, although the lower 95%CI limit converged towards zero (Table 2, Fig. 3A). There was substantial cohort variance (Table 2), but small variances attributable to mothers, social fathers, broods and parent pairs (Appendix S5). Consequently, the posterior means for h 2 latent and h 2 observed were small, but again showed clear peaks away from zero (Fig. S3). Although the lower 95%CI limits converged towards zero, approximately 93% and 82% of posterior samples respectively exceeded a minimal value of 0.01 (Table 2, Fig. S3).
The posterior mode for V A in adult female LRS was very small ( Table 2). The posterior mean was slightly greater due to the right-skewed posterior distribution (Table 2, Fig. 3B). However there was substantial posterior density close to zero compared to the prior distribution, and the lower 95%CI limit converged towards zero (Fig. 3B, Table 2). Consequently, the posterior modes (and means) of h 2 latent , h 2 observed and I A-observed for female LRS were small, with lower 95%CI limits that converged towards zero ( In marked contrast, the posterior mode and mean for V A in adult male LRS were substantial and the lower 95%CI limit considerably exceeded zero (Table 2, Fig. 3C). Consequently, although there were also moderate cohort and residual variances, the posterior mode and mean for h 2 latent for male LRS were substantial (Table 2, Fig. S4). These values were smaller for h 2 observed , reflecting the non-linear transformation induced by the mean-variance relationship of the Poisson distribution, but the lower 95%CI limit still did not converged towards zero (Table 2, Fig. S4). The posterior mode for I A-observed for male LRS was also moderate ( Table 2, Fig. S5).
Since V A in female LRS was so small and the lower 95%CI limit for V A in juvenile survival also converged towards zero, the pairwise COV A s and r A s among juvenile survival and female and male LRS were unsurprisingly estimated with considerable uncertainty (Table 2, Fig. 3). The posterior modes and means for r A between juvenile survival and male LRS, and between female and male LRS, were slightly negative, but spanned zero for juvenile survival and female LRS, all with 95%CI limits that did not converge towards either -1 or 1 (Table 2, Fig. 3).

Distributions of IGG and f for individuals included in analyses of juvenile survival and adult
LRS (and ARS and survival) were quantitatively similar to those for individuals included in analyses of fitness (summarized above). The posterior mode for the regression of juvenile survival on IGG was negative, with a 95%CI that did not overlap zero (Table 2). Further analyses showed similar negative slopes for female and male juvenile survival modeled as separate traits (Appendix S7). However, the posterior modes for the regressions of adult female and male LRS on IGG were small, with 95%CIs that spanned zero (Table 2). This implies that additive effects of immigrants' alleles decreased local juvenile survival, but not adult LRS, relative to additive effects of founders' alleles.

ADULT ANNUAL REPRODUCTIVE SUCCESS
In the bivariate QGGLMM, the posterior mode for V A in female ARS was very small and the lower 95%CI limit converged towards zero (Table 3, Fig. 4A). However, the posterior mean was slightly larger (Table 3), and 75% of the posterior density exceeded a minimal value of 0.01. This implies the existence of very small, but non-zero, V A for female ARS (Fig. 4A inset).
In contrast, the posterior mode and mean for V A in male ARS were substantially larger and the lower 95%CI limit did not converge towards zero (Table 3, Fig. 4B). The permanent individual variances were very small in both sexes, but the year and residual variances were substantial, especially for males (Table 3). Consequently, despite the marked difference in V A , the posterior means for h 2 latent and h 2 observed for ARS were similar in both sexes (~0.06-0.18), but I A-observed was substantially greater in males than females (Table 3, Figs. S6, S7).
The posterior mode for the cross-sex additive genetic correlation (r A ) in ARS was positive but small. Due to the small V A in female ARS, the 95%CI was again wide and spanned zero, but did not converge towards either -1 or 1 (Table 3, Fig. 4C).
The posterior modes for the regressions of ARS on IGG were small in both sexes, with 95%CIs that overlapped zero (Table 3). The posterior modes for the regressions of ARS on f were negative in both sexes, although the 95%CI for females again overlapped zero (Table 3).

ADULT SURVIVAL
For the focal 254 adult females and 331 adult males, the mean number of observations of annual survival (or mortality) was 2.1 (median 1, range 1-9, Fig. 5A) for females and 2.3 (median 2, range 1-9, Fig. 5B) for males, representing overall survival rates of 53.0% and 58.0% respectively.
In the univariate QGGLMM, the posterior mode for V A was effectively zero (Table 3, Fig.   5C). The posterior mean was slightly larger, but there was substantial posterior density close to zero compared to the prior distribution, and the lower 95%CI limit converged to zero (Table 3). Since there was also substantial year variance, the posterior modes for h 2 latent and h 2 observed were very small (Table 3; Fig. S8.3). The posterior modes for the regressions of adult survival on IGG and f were also small, with 95%CIs that overlapped zero (Table 3). Analyses of adult longevity rather than annual survival yielded similar conclusions (Appendix S8).

FITNESS
The sex-specific additive genetic variances (V A ) in fitness, and the cross-sex genetic correlation (r A ), are key parameters that determine the rate of fitness evolution and shape evolutionary responses to natural and sexual selection (Burt 1995;Brommer et al. 2007;Kirkpatrick 2009;Shaw and Shaw 2014). They also underlie the potential for evolutionary sexual conflict, which might constrain evolution yet help maintain overall V A in fitness (Lande 1980;Chippindale et al. 2001;Kruuk et al. 2008;Bonduriansky and Chenoweth 2009;Long et al. 2012). However, these key parameters have rarely been estimated in wild populations, particularly using theoretically appropriate measures of fitness while accommodating non-Gaussian phenotypic distributions and accounting for genetic effects of immigration and inbreeding (Kruuk et al. 2008;Kirkpatrick 2009;Shaw and Etterson 2012;Gomulkiewicz and Shaw 2013;Shaw and Shaw 2014).
Our analyses of comprehensive fitness data from free-living song sparrows estimated nonzero latent-scale V A s and heritabilities for fitness, of similar magnitudes, in both sexes. Such estimates do not concur with basic theoretical predictions that V A in fitness will be negligible at equilibrium (Charlesworth 1987), which has been interpreted to commonly apply (Shaw and Shaw 2014;Walling et al. 2014). Instead, they support the view that substantial V A in fitness can be readily generated and/or maintained and imply that this population is not at an evolutionary equilibrium (Houle 1992;Kirkpatrick 2009;Zhang 2012;Shaw and Shaw 2014). Further, our estimate of a moderate positive cross-sex r A for fitness implies that some V A is shared between the sexes, potentially facilitating an increase in population mean fitness (Lande 1980). However, the cross-sex r A in fitness was detectably less than one, implying that some sexually antagonistic genetic variation does exist, potentially facilitating the maintenance of overall V A .
The few available estimates of sex-specific V A in fitness in wild populations cannot readily be quantitatively compared because different studies used different fitness metrics, analytical methods and estimation scales, with different degrees of paternity error and missing data. However, qualitatively concordant with our results, V A for fitness was estimated to be non-zero and similar in both sexes in collared flycatchers (Ficedula albicollis, Merilä & Sheldon 2000;Brommer et al. 2007) and Swedish humans (Zietsch et al. 2014). Conversely, V A was estimated to be zero or very pre-industrial Finnish humans (Pettay et al. 2005, Appendix S1).
Meanwhile, our estimate of a moderate positive cross-sex r A for fitness differs from the substantial negative values previously estimated in wild populations (Foerster et al. 2007;Brommer et al. 2007;McFarlane et al. 2014; Appendix S1), and from the small or slightly negative values estimated in laboratory populations (Chippindale et al. 2001;Delcourt et al. 2009;Innocenti and Morrow 2010;Collet et al. 2016). Yet, cross-sex r A s can change substantially when (laboratory) populations experience novel environments (Delcourt et al. 2009;Punzalan et al. 2014;Collet et al. (Long et al. 2012), or inbreeding (Duffy et al. 2014). Positive estimates, such as ours, might indicate populations where both sexes are displaced from their fitness peak, and consequently experience congruent directional selection (Long et al. 2012;Duffy et al. 2014;Punzalan et al. 2014). Overall, further rigorous and standardized estimates of V A and r A in sexspecific fitness from wild populations experiencing different ecological circumstances are clearly required to discern general patterns and evolutionary implications.

COMPONENTS
Values of V A in sex-specific fitness, and the cross-sex r A , must ultimately result from V A s and crosssex and within-sex r A s in underlying sex-specific fitness components. Quantifying such parameters can consequently help identify mechanisms that maintain V A in fitness, and identify sources of sexual conflict (Walling et al. 2014). Juvenile survival to maturity constitutes one primary fitness component; indeed, 96% of observed song sparrow fitness values of zero represent individuals that did not (locally) survive to adulthood, and such patterns are likely commonplace (Blomquist 2010; Wagenius et al. 2010;Gomulkiewicz and Shaw 2013). We estimated moderate V A in juvenile survival, concurring with previous evidence that V A is moderate and similar in female and male song sparrows with a substantial positive cross-sex r A (Reid and Sardell 2012, Appendix S7).
However, for adult LRS, which constitutes the remaining primary fitness component, there was a striking difference between the sexes: V A for male LRS was substantial and clearly exceeded zero, while V A for female LRS was very small. This implies that there is opportunity for relatively rapid evolutionary change in male LRS and genetically correlated traits, but little such opportunity regarding female LRS.
The small V A estimate for female LRS impedes precise estimation of the cross-sex r A in LRS, and indeed renders such estimation somewhat redundant (since r A is undefined given zero V A in one or both sexes). Nevertheless, the posterior mode was small, and if anything slightly negative, further suggesting that additive genetic effects on adult LRS are largely independent in females and males.
Together, our results imply that the moderate positive cross-sex r A in fitness is primarily driven by the positive cross-sex r A in juvenile survival. Consequently, the cross-sex expression of additive genetic effects on juvenile survival ameliorates potentially sexually antagonistic genetic variation in overall fitness resulting from sex-specific expression of adult LRS. These patterns are reminiscent of those observed in Drosophila melanogaster, where a positive cross-sex r A in juvenile survival initially combined with a negative cross-sex r A in adult reproductive success to generate a weak overall cross-sex r A for fitness (Chippindale et al. 2001), but where the cross-sex r A in adult reproductive success was no longer detectably different from zero after further generations of laboratory adaptation (Collet et al. 2016).
Further decomposition of adult LRS in song sparrows revealed little or no V A in adult annual survival, and identified ARS as the primary source of V A in male LRS. The substantial difference in V A in ARS, and hence LRS, between males and females likely reflects the population's ecology and mating system. Due to the typically male-biased adult sex-ratio and frequent extra-pair paternity, males accumulate ARS by securing a territory and a social mate, defending within-pair paternity and accruing extra-pair paternity (Sardell et al. 2010;Lebigre et al. 2012;Reid et al. 2011Reid et al. , 2014a. In contrast, females accumulate ARS through their own fecundity. Consequently, while components of ARS such as within-pair paternity can be conceptualized as 'emergent' traits of pairs rather than individuals (Reid et al. 2014a), males and females are likely to differ substantially in the suite of physiological and behavioral traits that generate high ARS, and hence in underlying genetic effects.
Previous analyses revealed non-zero V A in annual male extra-pair reproductive success and a positive r A with within-pair paternity success per brood (Reid et al. 2014b), but a negative r A between net paternity success and juvenile survival . Together, these positive and negative correlations, alongside among-year variation in adult sex-ratio and hence the

GENETIC EFFECTS OF IMMIGRATION
Immigration, and resulting gene flow, is one primary mechanism that can maintain V A in fitness and associated evolutionary potential in any focal population, and can also rapidly increase mean fitness by alleviating inbreeding depression. However the overall genetic effects of immigration, and the evolutionary consequences, depend on genetic properties of naturally-occurring immigrants compared to existing natives (Ingvarsson and Whitlock 2000;Tallmon et al. 2004;Edelaar and Bolnick 2012;Carlson et al. 2014). We utilized the multi-generation song sparrow pedigree, that links all Mandarte-hatched individuals to their immigrant and 'founder' ancestors and hence describes the expected introgression of immigrants' alleles, to directly estimate the relative mean additive genetic values for local fitness of the defined immigrant and founder genetic groups.
Unlike analyses that aim to discern demographic and evolutionary consequences of dispersal by directly comparing observed phenotypes of immigrants (or dispersers) and residents (e.g. Marr et al. 2002;Pasinelli et al. 2004;Nosil et al. 2005;Pärn et al. 2009;Bonte et al. 2012), our analyses do not utilize immigrants' own phenotypes and consequently cannot be confounded by environmental effects of dispersal on those phenotypes. Our analyses showed that immigrant song sparrows carry alleles that, when expressed in subsequent Mandarte-hatched generations, have negative additive effects on local fitness in both sexes.
Such negative effects of immigrant's alleles could stem from three main processes. First, there could be divergent selection among song sparrow demes and resulting local adaptation. Immigrants to Mandarte might consequently not be locally adapted and hence have low mean additive genetic value for local fitness, as assumed by classical migration load models. Second, dispersal could be non-random, such that individuals that immigrate into Mandarte have low additive genetic value for local fitness even in the absence of any local adaptation. Third, low additive genetic value for fitness measured on Mandarte could reflect V A in dispersal, such that offspring of immigrants are more likely to emigrate and hence have zero local fitness (e.g. Doligez and Pärt 2008). These three processes, which are not mutually exclusive, are not distinguished by our current analyses.
However, immigrants' low additive genetic value for fitness resulted primarily from low additive genetic value for local juvenile survival rather than subsequent adult LRS, and therefore reflects some combination of effects on early-life mortality and/or emigration. To indicate biological effect sizes, the estimated latent-scale effect of β IGG = -2.6 (Table 2) implies a decrease in local juvenile survival probability of approximately 0.04 given an increase in individual IGG coefficient of 0.1 spanning the current mean of ~0.5, which is not trivial. In general, such reduced local survival of immigrants' descendants would reduce the effective rate of gene flow below that expected given the observed immigration rate (Garant et al. 2007).
However, our analyses also demonstrate strong inbreeding depression in fitness in both sexes, resulting from inbreeding depression in juvenile survival and adult LRS and ARS (as previously documented, Keller 1998;Reid et al. 2014c;Nietlisbach et al. 2017). Such inbreeding depression reflects covariance between individual fitness and f, where the underlying variance in f stems substantially from immigrant-native outcrossing; resulting F1 offspring are defined as outbred and have relatively high fitness (Keller 1998;Marr et al. 2002;Reid et al. 2006Reid et al. , 2014cWolak and Reid 2016). The estimated latent-scale effect size of β f = -9.3 (Table 2) implies an increase in juvenile survival probability of approximately 0.25 for outbred offspring (f=0) compared to inbred offspring with f=0.1 (see also Keller 1998;Reid et al. 2014c). This effect could cause rapid initial introgression of immigrants' alleles, and hence increase the short-term effective rate of gene flow (e.g. Ingvarsson and Whitlock 2000;Garant et al. 2007;Hedrick et al. 2014). Indeed, the mean IGG However, once immigrants' descendants start to inbreed, as is inevitable for initially highfitness lineages in small populations (Reid et al. 2006;Bijlsma et al. 2010;Hedrick et al. 2014), increased expression of recessive alleles with detrimental effects on local fitness would occur. This process would exacerbate the decrease in fitness that is expected following recombination in F2 and subsequent generations and resulting outbreeding depression (Frankham 2016), as documented in song sparrows (Marr et al. 2002). The combination of heterosis that exacerbates initial introgression and low overall additive genetic value for fitness could potentially generate substantial migration load; almost all population members might be pulled below the fitness peak, substantially decreasing population mean fitness but potentially generating a positive overall cross-sex r A for fitness (as observed in song sparrows) and alleviating sexual conflict (Long et al. 2012;Duffy et al. 2014;Punzalan et al. 2014). Such multi-generational dynamics of immigrants' alleles can, in future, be explicitly quantified using pedigree and genomic data (from song sparrows and other systems), and through theory that simultaneously considers heterosis and migration load (e.g. Lopez et al. 2009). Meanwhile, our analyses demonstrate that structured quantitative genetic analyses can explicitly estimate V A in fitness alongside multiple genetic consequences of immigration in wild populations, and thereby inform understanding of the contributions of gene flow to the magnitude and maintenance of overall V A in fitness and resulting evolutionary dynamics.  Tables   Table 1. Marginal posterior means, modes (in square brackets), and 95% credible intervals (in parentheses) for latent-scale estimates from the bivariate model for female and male fitness. Within the additive genetic and cohort matrices, sex-specific variances are shown along the diagonal (bold) with cross-sex covariances (COV) and correlations (r, italics) above and below the diagonal respectively. Sex-specific residual variances (V R ), heritabilities (h 2 latent ), and slopes of regressions on individual coefficient of inbreeding (β f ) and immigrant genetic group coefficient (β IGG ) are also shown.