PEDIGREE ERROR DUE TO EXTRA‐PAIR REPRODUCTION SUBSTANTIALLY BIASES ESTIMATES OF INBREEDING DEPRESSION

Understanding the evolutionary dynamics of inbreeding and inbreeding depression requires unbiased estimation of inbreeding depression across diverse mating systems. However, studies estimating inbreeding depression often measure inbreeding with error, for example, based on pedigree data derived from observed parental behavior that ignore paternity error stemming from multiple mating. Such paternity error causes error in estimated coefficients of inbreeding (f) and reproductive success and could bias estimates of inbreeding depression. We used complete “apparent” pedigree data compiled from observed parental behavior and analogous “actual” pedigree data comprising genetic parentage to quantify effects of paternity error stemming from extra‐pair reproduction on estimates of f, reproductive success, and inbreeding depression in free‐living song sparrows (Melospiza melodia). Paternity error caused widespread error in estimates of f and male reproductive success, causing inbreeding depression in male and female annual and lifetime reproductive success and juvenile male survival to be substantially underestimated. Conversely, inbreeding depression in adult male survival tended to be overestimated when paternity error was ignored. Pedigree error stemming from extra‐pair reproduction therefore caused substantial and divergent bias in estimates of inbreeding depression that could bias tests of evolutionary theories regarding inbreeding and inbreeding depression and their links to variation in mating system.

However, even when inbreeding reduces offspring fitness it could still increase a parent's inclusive fitness and hence be adaptive. This is because parents are more closely related to inbred offspring than to outbred offspring or, phrased alternatively, because inbreeding can increase the mating success of relatives (Lande and Schemske 1985;Waser et al. 1986;Kokko and Ots 2006;Parker 2006). Whether there is net selection for or against inbreeding therefore depends on properties of the mating system and the magnitude of inbreeding depression (Waser et al. 1986;Ralls et al. 1988;Kokko and Ots 2006;Szulkin et al. 2013). Furthermore, the magnitude of inbreeding depression could itself depend on inbreeding rate and consequent purging, and hence on ecological circumstances that influence mating system and effective population size (Lande and Schemske 1985;Keller and Waller 2002;Goodwillie et al. 2005;Laws and Jamieson 2010;Cheptou and Donohue 2011). Unbiased estimation of the magnitude of inbreeding depression occurring in a range of different mating systems is therefore prerequisite to understanding the magnitude and direction of selection on inbreeding and associated mating system evolution.
The magnitude of inbreeding depression, or inbreeding load, is frequently estimated as the slope of a regression of (log) fitness on individual coefficient of inbreeding (f, the probability that two homologous alleles are identical by descent, Morton et al. 1956;Lynch and Walsh 1998 p. 276;Keller and Waller 2002;Charlesworth and Willis 2009). In general, estimated regression slopes can be biased when independent variables are measured with error (Fuller 1987). One important assumption underlying the regression approach to measuring inbreeding depression is therefore that f (the independent variable) is set experimentally or otherwise measured without error (Draper and Smith 1998, p. 89).
In fact, f will rarely be measured without error, whether calculated from pedigree data or inferred straight from genotypic data. Pedigree data are commonly incomplete because some individuals have unknown parents, or inaccurate because multiple matings or extra-pair reproduction mean that parents are incorrectly assigned based on observed parental behavior (e.g., Keller 1998;Kruuk et al. 2002;Visscher et al. 2002;Cassell et al. 2003;Brommer et al. 2007;Jensen et al. 2007;Szulkin et al. 2007). Pedigrees can still contain substantial error and uncertainty even when parents are assigned based on genotypic data (Hadfield et al. 2006;Walling et al. 2010). Such pedigree errors, which will cause error in estimates of f and potentially bias estimates of inbreeding depression, may therefore be normal rather than exceptional, particularly in wild population studies.
Furthermore, pedigree error might occur nonrandomly with respect to the key traits that determine the magnitude of inbreeding depression (i.e., fitness and f). For example, extra-pair paternity might be biased with respect to fitness or relatedness if, as widely hypothesized, females use extra-pair reproduction to increase offspring fitness and/or avoid inbreeding (Tregenza and Wedell 2000;Griffith et al. 2002;Brouwer et al. 2011;Sardell et al. 2012). The mating system itself could then affect the degree to which estimates of inbreeding depression are biased. Quantitative assessments of such bias are therefore required before evolutionary hypotheses relating inbreeding depression to mating systems, and vice versa, can be meaningfully tested.
Few empirical studies have quantified the bias in estimates of inbreeding depression caused by pedigree error, or more specifically by pedigree error stemming from observation of the mating system (e.g., due to extra-pair reproduction, Keller et al. 2001a).  and Kruuk et al. (2002) postulated that their analyses of behavioral pedigree data from passerine birds most probably underestimated inbreeding depression. This assertion stemmed from the general expectation that random measurement error in independent variables will downwardly bias regression slopes (termed regression "attenuation" or "dilution," Draper and Smith 1998, pp. 89-91;Carroll et al. 2006, p. 41). This expectation derives from a classical additive measurement error model, which assumes that errors in the independent variable are normally distributed with mean of zero and homogeneous variance, are uncorrelated, and are independent of the true values of the independent variable and of any measurement error in the dependent variable (Draper and Smith 1998, p. 90;Carroll et al. 2006, p. 3).
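The attenuation expected under the classical additive measurement error model can be illustrated with a small simulation. The sketch below is not part of the analyses reported here: the normally distributed predictor, the true slope of −2, and the error variances are all hypothetical values chosen for illustration. Under these assumptions the slope estimated from the error-prone predictor should shrink by roughly var(x)/(var(x) + var(error)).

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_attenuation(true_slope=-2.0, sd_x=1.0, sd_error=1.0,
                         n=20000, seed=1):
    """Compare slopes fitted to an error-free and an error-prone predictor.

    All parameter values are hypothetical; errors are normal, mean zero,
    homoscedastic, and independent of x and of the residuals in y.
    """
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, sd_x) for _ in range(n)]
    y = [true_slope * x + rng.gauss(0.0, 0.5) for x in x_true]
    x_observed = [x + rng.gauss(0.0, sd_error) for x in x_true]
    return ols_slope(x_true, y), ols_slope(x_observed, y)


slope_without_error, slope_with_error = simulate_attenuation()
# Classical expectation: slope shrinks by var(x) / (var(x) + var(error)).
expected_factor = 1.0 ** 2 / (1.0 ** 2 + 1.0 ** 2)
print(slope_without_error, slope_with_error, expected_factor)
```

With equal variances in the predictor and its error, the fitted slope is expected to be about half the true slope, which is the "dilution" referred to above.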
However, there are multiple reasons why studies of inbreeding depression might violate these assumptions. For example, the distribution of the independent variable f is bounded at zero and often highly right-skewed, meaning that even random pedigree error may cause heterogeneous and nonnormal error in f. Furthermore, pedigree errors affect estimates of f for individuals whose parents were incorrectly assigned and their descendants, meaning that error in f is correlated across relatives. Finally, pedigree error stemming from extra-pair paternity also introduces error into estimates of male reproductive success and hence fitness (the dependent variable) derived from observed parental behavior. Errors in the dependent and independent variables could consequently be correlated to some degree. Conversely, if a female's socially paired and extra-pair mates were similar in relatedness or fitness, for example, due to repeated expression of a female preference, then extra-pair reproduction could potentially cause less error and bias in estimates of f, fitness, and inbreeding depression than otherwise expected. When the assumptions of the classical additive measurement error model are violated in such ways, "reverse attenuation" can occur (Carroll et al. 2006, p. 46), meaning that regression analyses could overestimate inbreeding depression. This might explain why including individuals with limited pedigree data (and hence downwardly biased estimates of f) inflated estimates of inbreeding depression in dairy cattle traits (Cassell et al. 2003).
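One simple route to reverse attenuation can be sketched for an ordinary linear regression. The derivation below is a heuristic under stated assumptions, not a description of the models fitted in this study: it assumes a linear relationship, no error in the dependent variable, and measurement error u that is uncorrelated with the residual ε but may covary with the true predictor x (covariance σ_xu).

```latex
\[
  y = \alpha + \beta x + \varepsilon, \qquad w = x + u ,
\]
\[
  b_{w} \;\longrightarrow\;
  \beta \, \frac{\operatorname{cov}(x, w)}{\operatorname{var}(w)}
  = \beta \, \frac{\sigma_x^{2} + \sigma_{xu}}
                  {\sigma_x^{2} + 2\sigma_{xu} + \sigma_u^{2}} .
\]
% Classical case: \sigma_{xu} = 0 gives the factor
% \sigma_x^2 / (\sigma_x^2 + \sigma_u^2) < 1 (attenuation).
% The factor exceeds 1 (reverse attenuation) when \sigma_{xu} < -\sigma_u^{2},
% i.e., when the error covaries sufficiently negatively with the true value.
```

Under these assumptions, errors that are negatively correlated with the true value of the independent variable reduce the variance of the observed predictor and can inflate, rather than dilute, the fitted slope.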
The net impact of all such violations of key assumptions of classical additive measurement error theory cannot be easily predicted a priori. Empirical studies are therefore needed to quantify the degree to which pedigree error can bias the magnitude of inbreeding depression estimated using standard field datasets and regression approaches, by causing error in estimated f, in estimated fitness, or in both.
Here, we quantify the effects of paternity error stemming from extra-pair reproduction on estimates of f, reproductive success, and inbreeding depression in socially monogamous song sparrows (Melospiza melodia) inhabiting Mandarte Island, BC, Canada. Previous analyses of pedigree data compiled from observed parental behavior estimated substantial inbreeding depression in fitness components in this population (e.g., Keller 1998;Reid et al. 2003;Marr et al. 2006;Keller et al. 2008). Molecular genetic analyses then revealed substantial extra-pair reproduction; about 28% of hatched chicks were sired by a male other than a female's paired social mate (O'Connor et al. 2006;Sardell et al. 2010;Reid et al. 2011a,b). The pedigree compiled from observed parental behavior, hereafter termed the "apparent pedigree," therefore contains about 28% paternity error. Genotypic data were consequently used to compile a highly resolved "actual" pedigree in which genetic sires were assigned to >99% of song sparrows fledged during 1993-2011 with high confidence (Sardell et al. 2010, see Methods). Although unlikely to be completely error-free, the "actual" pedigree contains substantially less paternity error than the "apparent" pedigree. Comparative analysis of the two pedigrees therefore allows explicit quantification of the biases that pedigree error caused by extra-pair paternity can introduce into estimates of f, male reproductive success, and inbreeding depression.
In this study, we first quantify the magnitude and form of the errors in f and in male reproductive success caused by extra-pair paternity, and consider whether these errors violate the assumptions of classical additive measurement error models (e.g., Carroll et al. 2006, p. 46). Second, we quantify resulting bias in the estimated magnitude of inbreeding depression in major fitness components: juvenile survival to recruitment, adult annual survival, annual reproductive success (ARS), and lifetime reproductive success (LRS). We show that inbreeding depression in key fitness components was substantially underestimated (attenuated) due to pedigree error stemming from extra-pair paternity, but also report a case of reverse attenuation where paternity error caused inbreeding depression to be overestimated.

STUDY SYSTEM
Song sparrows can breed from age one year and typically form socially monogamous breeding pairs where both sexes contribute to territory defense and parental care (Arcese et al. 2002;Smith et al. 2006). However, they are genetically polygynandrous, with frequent extra-pair paternity (O'Connor et al. 2006;Sardell et al. 2010;Hill et al. 2011).
The resident population of song sparrows inhabiting Mandarte Island has been studied intensively since 1975 and recently averaged 30 ± 12 (standard deviation [SD]) breeding pairs (Keller 1998;Smith et al. 2006;Lebigre et al. 2012). Each year, all nests were located and closely monitored and all chicks surviving to six days posthatch were marked with unique combinations of metal and colored plastic bands. Immigrants to Mandarte (1.1 per year on average) were also banded soon after arrival. All chicks that survived to independence from parental care (24 days posthatch) and their apparent mothers and fathers (the socially paired adults that defended territories, incubated clutches, and provisioned chicks) were identified by their bands. All adult (≥1 year old) males that remained socially unpaired (due to the male-biased adult sex ratio) were also identified (Lebigre et al. 2012). Due to the intensive fieldwork and Mandarte's small size (6 hectares), the probability of resighting a surviving adult song sparrow on Mandarte during any breeding season is effectively one (P > 0.998 across all years, Wilson et al. 2007). Each individual's local survival was therefore accurately documented. Although there may be some unobserved juvenile dispersal, the relatively high local recruitment rate (approximately 30% of independent offspring) and scarcity of Mandarte-banded song sparrows on surrounding islands suggest that dispersal is relatively rare (Smith et al. 2006;Wilson and Arcese 2008;Sardell et al. 2011).

PEDIGREE AND PATERNITY DATA
The detailed field observations of parental behavior were used to compile the "apparent" pedigree, linking all banded chicks to their apparent mother and father (Keller 1998;Keller et al. 2008;Reid et al. 2008). This pedigree included all song sparrows fledged during 1975-2011 except that the parents of some chicks fledged in 1980 were unknown due to reduced fieldwork (Keller 1998). Each individual's "apparent" coefficient of inbreeding (apparent f) relative to the apparent pedigree baseline was calculated using standard algorithms (Wright 1922;Keller 1998;Reid et al. 2008).
To correct the apparent pedigree and hence estimates of f for error caused by extra-pair paternity, all chicks banded during 1993-2011 and their parents were blood-sampled and genotyped at 13 polymorphic microsatellite loci (Sardell et al. 2010).
Bayesian models that incorporated genotypic and spatial information describing the locations of chicks and candidate parents were used to infer genetic parents (implemented in package MasterBayes, Hadfield et al. 2006;Sardell et al. 2010). These analyses suggested that all mothers were correctly identified based on observed parental behavior, and assigned a genetic father to >99% of banded chicks with >95% individual-level statistical confidence. Overall, 753 of 2667 (28.2%) banded offspring and 492 of 1808 (27.2%) independent offspring were assigned to an extra-pair sire. These genetic parentage data were used to compile an "actual" pedigree that assigned all chicks banded during 1993-2011 to their most likely genetic parents. This pedigree was then used to calculate each individual's "actual" coefficient of inbreeding (actual f). Because no chicks whose genetic fathers were assigned with <95% confidence survived to breed, the remaining paternity uncertainty in these cases introduced no downstream error in f.
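How a single misassigned sire changes f for an individual and for its descendants can be sketched with a small recursive kinship calculation. This is an illustrative sketch only, not the software used in this study: the function name, the toy pedigree, and the individual labels are hypothetical, founders are assumed unrelated, and parents must be listed before their offspring.

```python
from functools import lru_cache


def inbreeding_coefficients(pedigree):
    """Return {individual: f} for a pedigree {id: (sire, dam)}.

    Founders have (None, None). An individual's f equals the kinship
    between its parents. Parents must appear before their offspring.
    """
    order = {ind: i for i, ind in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0  # unknown parent: assumed unrelated to everyone
        if a == b:
            sire, dam = pedigree[a]
            return 0.5 * (1.0 + kinship(sire, dam))
        if order[a] < order[b]:
            a, b = b, a  # always recurse through the later-listed individual
        sire, dam = pedigree[a]
        return 0.5 * (kinship(sire, b) + kinship(dam, b))

    return {ind: kinship(*parents) for ind, parents in pedigree.items()}


# Hypothetical toy pedigree: C and D are full sibs; X's social father is C.
apparent = {
    "A": (None, None), "B": (None, None), "E": (None, None), "H": (None, None),
    "C": ("A", "B"), "D": ("A", "B"),
    "X": ("C", "D"),   # apparent sire C (socially paired male)
    "G": ("C", "H"),
    "Y": ("X", "G"),   # descendant of X
}
# Same pedigree, except X was actually sired by unrelated extra-pair male E.
actual = dict(apparent, X=("E", "D"))

apparent_f = inbreeding_coefficients(apparent)
actual_f = inbreeding_coefficients(actual)
print(apparent_f["X"], actual_f["X"])  # 0.25 0.0
print(apparent_f["Y"], actual_f["Y"])  # 0.1875 0.0625
```

In this toy case one paternity error alters f both for the misassigned individual (X) and for its offspring (Y), even though Y's own parents are correctly assigned.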
Inbreeding coefficients are defined relative to a basal population in which all individuals are assumed unrelated; values therefore depend on the choice of baseline (Falconer and Mackay 1996, p. 84;Keller and Waller 2002). For the actual pedigree, one option would be to define the 1993 breeders (the first year in which genetic paternity was comprehensively assigned) as basal. However, substantial data on relatedness among these breeders exist in the apparent pedigree covering individuals banded during 1975-1992. Assuming a similar extra-pair paternity rate to that observed during 1993-2011, about 86% of links in the 1975-1992 pedigree will be correct (i.e., all mothers and about 72% of fathers). Estimates of relatedness among the 1993 breeders calculated from the 1975-1992 apparent pedigree are therefore more informative than the alternative assumption of zero relatedness (see Discussion and Reid et al. 2011b). We therefore grafted the actual pedigree for 1993-2011 onto the apparent pedigree for 1975-1992, and used the entire grafted pedigree to calculate "actual f" for individuals fledged during 1993-2011 (Reid et al. 2011b). To further minimize error in actual f, the paternity of some individuals hatched before 1993 was genetically verified where blood samples were available. Specifically, genetic sires were confidently assigned to 37 song sparrows that hatched during 1991-1992 and bred subsequently. Extra-pair sires were assigned to eight (22%) of these individuals.
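The figure of about 86% correct links follows from simple arithmetic, assuming that each individual contributes one maternal and one paternal link, that all maternal links are correct, and that the paternal error rate equals the roughly 28% extra-pair paternity rate observed during 1993-2011:

```latex
\[
  \frac{1.00 + (1 - 0.28)}{2} \;=\; \frac{1.00 + 0.72}{2} \;=\; 0.86 .
\]
```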
Inbreeding coefficients of post-1975 immigrants to Mandarte are undefined relative to the basal native population (Keller 1998;Reid et al. 2008). Immigrants were consequently excluded from analyses of error in f and inbreeding depression. However, microsatellite genotypes suggest that immigrants are not closely related to existing natives (Keller et al. 2001b). Offspring of immigrant-native pairings were therefore defined as outbred (f = 0) and included in analyses (see also Keller et al. 2008;Reid et al. 2008, 2011b).

ERROR IN COEFFICIENTS OF INBREEDING
Standard statistics (median, range, mean, variance, and skewness) were computed to describe the distributions of apparent f and actual f and thereby summarize the overall effect of extra-pair paternity on estimates of f in males and females. Similar statistics were computed to describe the distributions of the error in f (Δf, where Δf = apparent f − actual f calculated for each individual song sparrow) and the absolute magnitude of this error (|Δf|). The percentages of individuals where |Δf| exceeded zero and where Δf was positive or negative, and the correlations between Δf and actual f and between apparent f and actual f, were also calculated. Finally, maximum pedigree depth for each individual, defined as the maximum number of generations of ancestors contained in the apparent and actual pedigrees, was computed.
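The error summaries described above can be sketched as follows. This is a minimal illustration, not the code used in this study: the function names and the five pairs of f values are hypothetical, and the error in f is defined as apparent f minus actual f for each individual.

```python
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def summarize_f_error(apparent_f, actual_f):
    """Summarize per-individual error in f (apparent f minus actual f)."""
    delta = [a - b for a, b in zip(apparent_f, actual_f)]
    n = len(delta)
    return {
        "mean_error": mean(delta),
        "median_abs_error": median(abs(d) for d in delta),
        # Exact comparison with zero; real data may need a tolerance.
        "pct_nonzero": 100 * sum(d != 0 for d in delta) / n,
        "pct_positive": 100 * sum(d > 0 for d in delta) / n,
        "pct_negative": 100 * sum(d < 0 for d in delta) / n,
        "r_error_vs_actual": pearson_r(delta, actual_f),
        "r_apparent_vs_actual": pearson_r(apparent_f, actual_f),
    }


# Hypothetical values for five individuals (not data from this study).
apparent_f_values = [0.0, 0.0625, 0.25, 0.125, 0.03125]
actual_f_values = [0.0, 0.125, 0.0, 0.125, 0.0625]
summary = summarize_f_error(apparent_f_values, actual_f_values)
print(summary)
```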

REPRODUCTIVE SUCCESS
ARS and LRS were calculated as the total number of offspring that survived to independence from parental care during a single breeding season or over an individual adult's lifetime, respectively. For males, "apparent" and "actual" ARS and LRS were calculated from the apparent and actual pedigree data, respectively, and quantified offspring reared and sired, respectively. Female ARS and LRS were identical whether calculated from the apparent or actual pedigree data because extra-pair maternity was never observed (Sardell et al. 2010).
ARS was calculated for all males and females fledged during 1993-2010 (and hence whose own parentage was genetically verified) that survived to adulthood during 1994-2011 (and hence whose offspring's parentage was genetically verified). LRS was calculated for all males and females fledged during 1993-2006 that survived to adulthood. LRS was not calculated for individuals fledged after 2006 because multiple individuals from these cohorts were still alive in 2012. Their LRS was therefore incompletely measured, and excluding long-lived individuals with potentially high LRS could bias analyses (Keller et al. 2008).

ERROR AND VARIANCE IN REPRODUCTIVE SUCCESS
To quantify the error that extra-pair paternity introduced into estimates of male ARS and LRS, we calculated the proportions of cases where estimates of actual and apparent ARS and LRS differed and the range of the discrepancy. We additionally calculated the mean ($\mu_{RS}$) and variance ($\mathrm{var}_{RS}$) in apparent and actual ARS and LRS and hence the respective "opportunities for inbreeding depression" ($I = \mathrm{var}_{RS}/\mu_{RS}^{2}$), where I is a mean-scaled variance that facilitates comparison across traits (Waller et al. 2008).
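The mean-scaled variance I can be computed directly from a vector of reproductive success values. The sketch below uses hypothetical offspring counts for five males (not data from this study) and assumes the sample variance; it shows how reassigning paternity can leave the mean unchanged while increasing the variance and hence I.

```python
from statistics import mean, variance


def opportunity_for_inbreeding_depression(reproductive_success):
    """Mean-scaled variance I = var / mean**2 (sample variance assumed)."""
    mu = mean(reproductive_success)
    return variance(reproductive_success) / mu ** 2


# Hypothetical offspring counts: the same 10 offspring, redistributed
# among five males once genetic sires replace social fathers.
apparent_rs = [2, 2, 3, 1, 2]  # offspring reared
actual_rs = [0, 4, 3, 1, 2]    # offspring sired

i_apparent = opportunity_for_inbreeding_depression(apparent_rs)
i_actual = opportunity_for_inbreeding_depression(actual_rs)
print(i_apparent, i_actual)  # 0.125 0.625
```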

BIAS IN ESTIMATES OF INBREEDING DEPRESSION
Separate negative binomial linear (mixed) models were used to estimate inbreeding depression in ARS and LRS for females and males, using log-link functions. For both sexes, separate analyses were run in which ARS, LRS, and f were all estimated from the apparent pedigree or from the actual pedigree. For males, additional analyses were run with apparent ARS or LRS and actual f, and with actual ARS or LRS and apparent f. These latter models allowed us to distinguish whether bias in inbreeding depression estimated from the apparent pedigree was primarily due to error in estimated f, or to error in estimated reproductive success, or due to additive or nonadditive combinations of both. The models with apparent reproductive success and actual f also have a useful biological interpretation, measuring inbreeding depression in the number of offspring a male reared. Analyses of ARS included fixed effects of an individual's breeding year (1994-2011) and age class (one to six, with individuals aged ≥6 pooled) to account for known variation with year and age (Smith et al. 2006;Keller et al. 2008). Random individual effects were modeled to account for non-independence among observations of individuals that bred in multiple years. Analyses of LRS included fixed effects of an individual's natal year (1993-2006) to account for known among-cohort variation (Smith et al. 2006;Keller et al. 2008).
Separate interval-censored proportional hazards models were fitted to estimate inbreeding depression in juvenile survival from independence from parental care to age one year (recruitment), and adult survival between subsequent years in males and females (Keller 1998;Keller et al. 2008). These proportional hazards models estimated the effect of f on a baseline hazard function, where the probability that individual k will survive year i is $S_{ik} = \exp(-\exp(\alpha_i + \beta f))$, where $\alpha_i$ is the baseline hazard and $\beta$ is a regression coefficient quantifying the increment due to f (Heisey 1992;Kalbfleisch and Prentice 2002). Positive β indicates an increased hazard and hence reduced survival probability, given increased f. Sexes of all independent juveniles fledged during 1993-2011 were determined by genotyping at the CHD-1 locus (Postma et al. 2011;Sardell et al. 2011). Because survival varies among years and cohorts (Keller 1998;Smith et al. 2006), models were stratified by hatch year, and therefore compared survival among individuals of the same sex but differing f that hatched in the same year. Analysis included individuals fledged during 1993-2011 that survived or died up to 2012. Data for individuals that were still alive in 2012 were right-censored.
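The survival function above can be evaluated directly to see how a positive β lowers survival as f increases. The sketch below is illustrative only: the baseline (set so that an outbred individual has survival probability 0.5) and β = 4 are hypothetical values, not estimates from this study.

```python
import math


def annual_survival(alpha_i, beta, f):
    """Survival probability S = exp(-exp(alpha_i + beta * f))."""
    return math.exp(-math.exp(alpha_i + beta * f))


# Hypothetical baseline: choose alpha so that survival at f = 0 is 0.5.
alpha = math.log(-math.log(0.5))
beta = 4.0  # hypothetical coefficient; positive means higher hazard with f

s_outbred = annual_survival(alpha, beta, 0.0)
s_inbred = annual_survival(alpha, beta, 0.125)

# Under proportional hazards, f multiplies the hazard by exp(beta * f),
# so survival becomes the baseline survival raised to that power.
hazard_ratio = math.exp(beta * 0.125)
print(s_outbred, s_inbred, hazard_ratio)
```

Because the hazard is multiplied by exp(βf), survival for an inbred individual equals baseline survival raised to the power exp(βf), which is why β can be interpreted as a log hazard increment per unit f.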
The magnitude of bias that pedigree error stemming from extra-pair paternity introduced into estimates of inbreeding depression was formally quantified by comparing inbreeding load (the number of "lethal equivalents") calculated from the apparent and actual pedigrees. The lower limit of the inbreeding load can be estimated for diploid organisms as twice the slope of a regression of log(fitness) on f (Morton et al. 1956;Lynch and Walsh 1998, p. 276;Keller and Waller 2002). This slope was directly estimated for LRS as the negative binomial regression coefficient for f given a log-linear model.
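The conversion from a regression slope to lethal equivalents, and the comparison between pedigrees, can be sketched as below. The two slopes are hypothetical values chosen for illustration, not estimates from this study, and the sign convention (a negative slope of log fitness on f reported as a positive load) is an assumption of this sketch.

```python
def lethal_equivalents(slope_log_fitness_on_f):
    """Lower-limit inbreeding load for a diploid: twice the slope magnitude.

    A negative slope (fitness declines with f) gives a positive load.
    """
    return -2.0 * slope_log_fitness_on_f


def percent_change(apparent_estimate, actual_estimate):
    """Change in the estimate when the actual pedigree replaces the apparent."""
    return 100.0 * (actual_estimate - apparent_estimate) / abs(apparent_estimate)


# Hypothetical regression coefficients for f on the log scale.
beta_apparent = -1.0
beta_actual = -1.5

load_apparent = lethal_equivalents(beta_apparent)
load_actual = lethal_equivalents(beta_actual)
change = percent_change(load_apparent, load_actual)
print(load_apparent, load_actual, change)  # 2.0 3.0 50.0
```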
Analyses were run in R (version 2.15.2, R Development Core Team 2009) using libraries kinship2 and glmmADMB, and SAS/STAT software (version 9.3, SAS Institute, Inc., Cary, NC). Negative binomial models assumed variance function $u + u^{2}v$, where u and v are compound parameters defining the underlying gamma distribution. Previous analyses of apparent pedigree data showed that juvenile and adult survival vary with an individual's own f, whereas pre-independence traits vary with parent f (Keller 1998;Reid et al. 2003). Exploratory analyses of actual pedigree data supported this pattern. Analyses of inbreeding depression in survival and reproductive success therefore focused on f of focal individuals and breeding adults, respectively. Data collection was approved by the University of British Columbia Animal Care Committee.

INBREEDING
For 1808 known-sex juvenile song sparrows that survived to independence from parental care during 1993-2011, the apparent and actual pedigrees contained median depths of 18 and 17 generations, respectively (Table 1), providing substantial power to quantify subtle variation in f.
Extra-pair paternity introduced error (Δf ≠ 0) in 74% and 76% of individual f values for juvenile females and males, respectively, as estimated from the apparent pedigree versus the actual pedigree (Table 1). The distributions of apparent f and actual f were broadly similar within both sexes, although the range, mean, variance, and skewness were all slightly greater in apparent f (Table 1). Actual f and apparent f were moderately correlated across all 1808 juveniles (overall correlation coefficient: r = 0.64; linear regression coefficient: b = 0.60 ± 0.02 (standard error [SE]), R² = 0.41, Fig. 1A; r = 0.67 and 0.62 for females and males separately). However, this relationship was weak (r = 0.18, b = 0.14 ± 0.04 [SE], R² = 0.03, Fig. 1A) across 492 individuals that had been sired by an extra-pair male (i.e., when a first-order ancestor was incorrect in the apparent pedigree), and much stronger (r = 0.83, b = 0.82 ± 0.02 [SE], R² = 0.68, Fig. 1A) across 1316 individuals that had been sired by their apparent father (i.e., when first-order ancestors were correct in the apparent pedigree and any error in f stemmed from more distant misassigned ancestors).
Due to the multigenerational impact of pedigree errors on estimates of f, the percentage of individuals where Δf ≠ 0 increased from 31% in the 1993 cohort to 83% in the 2011 cohort (Fig. 1B). Although the magnitude of error in f (|Δf|) increased across cohorts (r = 0.21), the mean per-cohort increment was small (b = 0.0013 ± 0.0001 [SE], Fig. 1B). Overall, median |Δf| was about 0.01, equating to about 15% of median actual f (Table 1). However, mean Δf was only slightly greater than zero in both sexes, showing that apparent f was only slightly larger than actual f on average (Table 1). The distribution of Δf was skewed in juvenile females but less so in juvenile males (Table 1). Furthermore, across all juveniles, Δf was negatively correlated with actual f, showing that error in f was not independent of actual f (Table 1, Fig. 1C). These patterns were similar across the 204 females and 268 males that survived to adulthood and hence contributed to estimates of inbreeding depression in ARS and adult survival (Table 1), and were also similar across the 155 females and 210 males that contributed to estimates of inbreeding depression in postrecruitment LRS (data not shown).

Figure 1. Relationships between (A) apparent and actual coefficients of inbreeding (f), (B) hatch cohort and the absolute magnitude of error in f (|Δf|), and (C) actual f and the error in f (Δf) across independent juvenile song sparrows. (A) and (C) show relationships for individuals sired by a female's observed socially paired male (within-pair offspring, filled symbols, dotted line) or extra-pair male (extra-pair offspring, open symbols, dashed line), and all offspring combined (solid line). (B) shows the cohort-specific median |Δf| (central symbols), the first and third quartiles (thick bar), and the maximum and minimum (dashed lines, minima were zero for all cohorts).

ERROR IN ESTIMATED REPRODUCTIVE SUCCESS
Extra-pair paternity introduced net error into 276 of 627 (44%) estimates of male ARS and 112 of 210 (53%) estimates of male LRS. The difference between an individual's apparent and actual ARS and LRS ranged from −8 to +7 and −9 to +11 offspring, respectively (median = 0 in both cases).
Extra-pair paternity increased the variance in male ARS (7.7 vs. 6.4 estimated from actual and apparent ARS, respectively) and LRS (54.8 vs. 48.1 respectively), but did not change the respective means (ARS: 2.2; LRS: 5.1). Extra-pair paternity therefore slightly increased the opportunity for inbreeding depression in male ARS (I = 1.66 and 1.35 estimated from actual and apparent ARS, respectively) and LRS (I = 2.08 and 1.84, respectively).
The error in both ARS and LRS was weakly negatively correlated with the error in f (r = −0.06 in both cases), meaning that males whose ARS or LRS was overestimated based on the apparent pedigree tended to have slightly underestimated f values. The error in ARS and LRS was also weakly positively correlated with a male's actual f (r = 0.10 in both cases). Analyses of the apparent pedigree therefore tended to overestimate ARS and LRS to a greater degree for males that were actually relatively inbred.
Because all mothers were identical in the apparent and actual pedigrees (Sardell et al. 2010), estimates of female ARS and LRS, and hence the mean, variance, and I in these traits, were unaffected by extra-pair reproduction, and were 3.3, 4.4, and 0.39 for ARS and 7.2, 42.9, and 0.82 for LRS, respectively. Thus, I was considerably greater in males than in females.

BIAS IN ESTIMATES OF INBREEDING DEPRESSION
Pedigree error stemming from extra-pair paternity caused estimates of inbreeding depression in ARS to be biased towards zero (attenuated). Specifically, estimated inbreeding depression in ARS was substantially greater based on the actual pedigree than the apparent pedigree in adult females and males (representing increases of 550% and 105%, Table 2, Fig. 2A and B). Inbreeding depression was significantly greater than zero based on both pedigrees for males, and based on the actual pedigree but not the apparent pedigree for females (Table 2, Fig. 2A and B). The bias in the overall estimate of inbreeding depression in male ARS stemmed from error in both f and ARS; inbreeding depression was 18% greater when estimated from actual ARS and apparent f, and 57% greater when estimated from apparent ARS and actual f, than when estimated from apparent ARS and apparent f (Table 2, Fig. 2B).
Pedigree error stemming from extra-pair paternity also caused inbreeding depression in LRS to be underestimated. Inbreeding depression in female LRS was estimated to be slight based on the apparent pedigree, but substantially greater based on the actual pedigree (although still marginally nonsignificantly different from zero, Table 2, Fig. 2C), representing an increase of 890%. In contrast, inbreeding depression in male LRS was estimated to be substantial and statistically significant based on both the apparent and actual pedigrees (Table 2, Fig. 2D). However, the estimated magnitude was greater based on the actual pedigree, representing an increase of 40% (Table 2, Fig. 2D). As with ARS, error in estimates of f and in male LRS both caused inbreeding depression to be underestimated. However, contrary to the situation with ARS, error in f caused less bias than error in LRS (Table 2, Fig. 2B and D).

Magnitudes of inbreeding depression in (A) annual reproductive success (ARS) and (B) lifetime reproductive success (LRS) in adult female and male song sparrows estimated from combinations of the apparent pedigree (Apparent f and Apparent RS) and the actual pedigree (Actual f and Actual RS). Point estimates (β) from negative binomial linear (mixed) models are presented with standard errors (SEs) and 95% confidence intervals (95% CI) and the probability that
Overall survival probabilities were 0.25 and 0.34 in juvenile females and males, and 0.53 and 0.59 in adult females and males, respectively. Inbreeding depression in juvenile female survival was estimated to be slight based on both the apparent and actual pedigrees, and the estimated hazards were quantitatively similar (Table 3, Fig. 3A). In contrast, inbreeding depression in juvenile male survival was estimated to be 170% greater based on the actual pedigree than on the apparent pedigree, and significantly greater than zero based only on the former (Table 3, Fig. 3B).
Meanwhile, inbreeding depression in adult survival did not differ significantly from zero when estimated from either the apparent or actual pedigrees in either sex (Table 3, Fig. 3C and D), and confidence intervals around estimated coefficients were wide. However, the point estimate of the effect of inbreeding on adult female survival changed sign from negative to positive, whereas the magnitude of estimated inbreeding depression in adult male survival decreased by about 85% when estimated from the actual rather than apparent pedigree (Table 3, Fig. 3C and D). Extra-pair paternity therefore caused inbreeding depression in adult male survival to be overestimated (reverse attenuation) based on the point estimate. Pedigree error therefore affected estimates of inbreeding depression in survival in ways that were not consistent across sexes or age classes.

Discussion
Understanding the impact of inbreeding depression on mating system evolution, and the ultimate impact of mating system on the magnitude of inbreeding depression, requires estimates of inbreeding depression that are not systematically biased by properties of underlying mating systems or our consequent ability to measure f and fitness. Unbiased estimates of inbreeding depression are also required to assess the likely viability and persistence of populations whose sizes and mating systems mean that inbreeding occurs (Ralls et al. 1988;Hedrick and Kalinowski 2000;O'Grady et al. 2006). However, there are multiple reasons why estimates of inbreeding depression derived from observational data collected in wild populations might be biased (Reid et al. 2008), including pedigree error. We used "apparent" pedigree data compiled from observed parental behavior in socially monogamous song sparrows and corresponding "actual" pedigree data that were substantially corrected for extra-pair paternity to quantify the impact of paternity error on estimates of individual coefficients of inbreeding (f), male reproductive success, and the magnitude of inbreeding depression in major fitness components.

PEDIGREE ERROR
Most pedigrees contain error due to misassigned parentage (Visscher et al. 2002;Pemberton 2008). Error rates can be considerable even under controlled mating schemes (e.g., Visscher et al. 2002), but are expected to be substantial in populations where extra-pair reproduction means that true genetic parents are frequently misassigned based on observed social behavior (e.g., Keller et al. 2001a;Kruuk et al. 2002;Hadfield et al. 2006;Brommer et al. 2007;Walling et al. 2010). Such error will bias estimates of f, reproductive success, and inbreeding depression in ways that depend on the relationships between multiple mating, inbreeding, and fitness. Despite considerable research, there is as yet no overarching consensus regarding the general form of such relationships (Griffith et al. 2002;Sardell et al. 2012;Slatyer et al. 2012). The consequent lack of a comprehensive observation model that could accurately predict "actual" parentage from observed "apparent" parentage means that resulting bias in estimates of inbreeding depression cannot be directly inferred. Empirical data are therefore required to quantify the degree to which pedigree error violates key assumptions of standard regression approaches to estimating inbreeding loads, and also violates assumptions of classical additive measurement error models that would otherwise allow the magnitude and direction of resulting bias to be predicted.
Our "actual" pedigree data comprise highly resolved molecular genetic parentage assignments for all song sparrows hatched on Mandarte during 1993-2011 (Sardell et al. 2010;Reid et al. 2011a,b). Because paternity assignments are probabilistic, some paternity error likely remains, but this is expected to be small (<1-2%, Sardell et al. 2010). The full actual pedigree also contains paternity error stemming from unobserved extra-pair reproduction during 1975-1992. However, this error will introduce increasingly slight error into estimates of f for individuals hatched subsequently, because the impact of any assigned ancestor on a descendant's estimated f decreases rapidly with increasing intervening generations (Cassell et al. 2003;Balloux et al. 2004;Pemberton 2008). Indeed, the mean magnitude of error in f due to extra-pair paternity did not increase rapidly across cohorts (Fig. 1B) even though the percentage of individuals with non-zero error did increase substantially. The actual pedigree is therefore unusually deep, complete, and accurate for a wild population (Sardell et al. 2010;Walling et al. 2010;Reid et al. 2011a), allowing useful comparison of estimates of f, fitness, and inbreeding depression with analogous estimates calculated from the apparent pedigree, which is itself deep and complete, but which was not corrected for extra-pair paternity.

REPRODUCTIVE SUCCESS
The approximately 28% extra-pair paternity detected in Mandarte's song sparrows during 1993-2011 caused widespread cumulative error in estimates of f; 75% of 1808 juveniles and 70% of 472 adults differed in apparent f versus actual f (Table 1). One key assumption underpinning unbiased estimation of inbreeding depression using standard regression analyses, that the independent variable f is measured without error, was therefore clearly violated.
Moreover, the nature of the errors violated assumptions of the classical additive measurement error models that have been implicitly invoked to infer that such regression analyses probably underestimate inbreeding depression (Kruuk et al. 2002). The median error in f (Δf) was zero (Table 1), showing that Δf was positive and negative equally often. However, the distribution of Δf was positively skewed, especially in females, meaning that mean Δf was slightly positive and that actual f averaged about 2-8% smaller than apparent f (Table 1). Extra-pair paternity therefore caused estimates of mean f to be slightly positively biased on average. However, this discrepancy was small, suggesting that, contrary to empirical studies on other species (Tregenza and Wedell 2000;Griffith et al. 2002;Brouwer et al. 2011;Varian-Ramos and Webster 2012), female song sparrows did not use extra-pair reproduction to avoid inbreeding to a substantial degree. Indeed, despite the widespread error in f, the overall distributions of actual f and apparent f were broadly similar (Table 1). However, Δf was negatively correlated with actual f, meaning that error in f was not independent of the actual value (Table 1, Fig. 1C). This pattern reflects the zero-bounded and skewed distribution of f. For low or zero values of actual f, Δf cannot be negative even if extra-pair paternity and resulting pedigree errors are random. In contrast, for high actual f, Δf is unlikely to be substantially positive.
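The argument above, that a zero-bounded f makes the error in f depend on actual f even when pedigree errors are random, can be illustrated with a toy simulation. The sketch below is not the study's pedigree analysis: the function draw_f, the replacement probability p_error, and the assumed distribution of f are all hypothetical, and the error is defined as apparent f minus actual f (an assumption, chosen to agree with actual f averaging smaller than apparent f).

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def draw_f(rng):
    """Hypothetical zero-bounded, right-skewed f: many zeros, long upper tail."""
    return 0.0 if rng.random() < 0.3 else min(rng.expovariate(20.0), 0.5)

def simulate(n=20000, p_error=0.75, seed=1):
    """Return (actual f, error in f) under a purely random error process."""
    rng = random.Random(seed)
    actual, delta = [], []
    for _ in range(n):
        f_act = draw_f(rng)
        # With probability p_error the apparent value is an unrelated random
        # draw from the same distribution; otherwise it equals the actual value.
        f_app = draw_f(rng) if rng.random() < p_error else f_act
        actual.append(f_act)
        delta.append(f_app - f_act)
    return actual, delta

actual, delta = simulate()
r = pearson(actual, delta)
# Errors for individuals whose actual f is exactly zero
low = [d for a, d in zip(actual, delta) if a == 0.0]

print(f"corr(error in f, actual f) = {r:.2f}")
print(f"smallest error where actual f is 0: {min(low):.3f}")
```

In this toy model the correlation is negative although the errors carry no biological signal: individuals with actual f of zero can only have errors of zero or above, and individuals with high actual f are mostly replaced by lower values.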
In addition to error in estimates of f, extra-pair paternity also caused widespread error in estimates of male ARS and LRS. Extra-pair paternity increased the variance in male ARS and LRS by about 20% and 14%, respectively, thereby slightly increasing the opportunity for inbreeding depression (Lebigre et al. 2012). The maximum magnitude of inbreeding depression in male reproductive success that could have been estimated was therefore higher once paternity error was corrected.

BIAS IN ESTIMATES OF INBREEDING DEPRESSION
Pedigree error stemming from extra-pair paternity caused estimated magnitudes of inbreeding depression to be quantitatively quite different when calculated from the actual versus apparent pedigrees, with substantial absolute and proportional discrepancies (Figs. 2 and 3). Even though the errors in estimates of f and male reproductive success violated assumptions of classical additive measurement error models, the basic expectation from such models, that pedigree error would cause inbreeding depression to be underestimated ("attenuated", e.g., Kruuk et al. 2002;Pemberton 2008), was still fulfilled for ARS, LRS, and male juvenile survival. Most dramatically, inbreeding depression in ARS was estimated to be 550% and 110% greater in females and males, respectively, once paternity error was corrected. Paternity error also caused the total diploid inbreeding loads in LRS to be substantially underestimated in both sexes: estimated loads were approximately 7.3 and 17.5 lethal equivalents for females and males, respectively, based on the actual pedigree, 6.6 and 5.0 units greater than equivalent loads estimated from the apparent pedigree. These differences are far from trivial given that diploid inbreeding loads estimated for other bird populations range through 0-14 lethal equivalents (e.g., Laws and Jamieson 2010), albeit typically based on pedigree data that probably contain error.
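For reference, the "basic expectation" of attenuation can be written out for the simplest case. The block below states the classical additive measurement error result for an ordinary linear regression in generic notation (x for the true predictor, x* for its observed value, u for the error, λ for the reliability ratio); none of these symbols comes from the present study, and the result is not claimed to carry over exactly to the negative binomial or proportional hazards models used here.

```latex
\begin{align}
  y &= \alpha + \beta x + \varepsilon, \qquad x^{*} = x + u, \\
  &\text{with } u \text{ independent of } x \text{ and } \varepsilon,
    \quad \mathrm{E}[u] = 0, \\
  \operatorname{plim}\,\hat{\beta}^{*}
    &= \frac{\operatorname{Cov}(x^{*}, y)}{\operatorname{Var}(x^{*})}
     = \beta\,\frac{\sigma_{x}^{2}}{\sigma_{x}^{2} + \sigma_{u}^{2}}
     = \lambda\beta, \qquad 0 < \lambda \le 1 .
\end{align}
```

Because λ cannot exceed 1, the estimated slope is pulled towards zero. The derivation relies on u being independent of x, which is the assumption that a correlation between error in f and actual f violates.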
Despite the large discrepancies in point estimates of inbreeding depression, confidence intervals around analogous estimates derived from the apparent and actual pedigrees generally overlapped the alternative estimate (Figs. 2 and 3), meaning that the estimates did not differ significantly from each other. However, the impact of pedigree error was sufficient to render some hypothesis tests incorrect. Most notably, for female ARS and male juvenile survival, analyses of apparent pedigree data would have incorrectly failed to reject the null hypothesis of no inbreeding depression, and analyses of female LRS showed a similar tendency (Tables 2 and 3), creating serious type II errors. Furthermore, confidence intervals surrounding estimates of inbreeding depression were also slightly but consistently larger based on the actual pedigree (Figs. 2 and 3). Analyses of apparent pedigree data therefore not only underestimated inbreeding depression, but also overestimated the precision of resulting estimates.
However, not all fitness components conformed to the basic expectation of regression attenuation and underestimation of inbreeding depression given measurement error in f. Adult male survival showed a higher point estimate of inbreeding depression given the apparent pedigree, indicating "reverse attenuation," although the confidence intervals were wide. The point estimate for adult female survival changed sign, while that for juvenile female survival was quantitatively similar based on both pedigrees. Because reverse attenuation occurred for adult male survival but not adult female or juvenile survival, it is not a general property of proportional hazards models as applied to our dataset. Indeed, attenuation occurs in proportional hazards models under many but not all measurement error models (Hughes 1993;Li and Ryan 2004), and in capture-recapture models commonly employed in studies of free-living populations (Hwang and Huang 2007). The cause of reverse attenuation in estimates of inbreeding depression in adult male survival is unclear, but may imply that error in f is not independent of survival. Nevertheless, these results show that estimates of inbreeding depression in key fitness components derived from pedigree data that contain paternity error cannot necessarily be assumed to be conservative.
Finally, it is notable that the error that extra-pair paternity introduced into estimates of male ARS and LRS also directly biased estimates of inbreeding depression independently of error in f (Table 2, Fig. 2). This is somewhat unexpected because error in dependent variables does not necessarily bias regression estimates, at least given the assumptions of classical additive measurement error models (Fuller 1987). Furthermore, our analyses that used actual ARS or LRS and apparent f, or actual f and apparent ARS or LRS, showed that error in f and reproductive success had non-additive effects on error in estimated inbreeding depression. Specifically, the changes in estimated inbreeding depression when either f or reproductive success was corrected for paternity error did not sum to the total change in inbreeding depression estimated when both were corrected (Fig. 2). Violation of a further key assumption is probably responsible: error in ARS and LRS was not independent of actual f (or apparent f). This is because inbreeding depression in male extra-pair reproductive success occurs in song sparrows; inbred males sire relatively few offspring through extra-pair reproduction (Reid et al. 2011b).
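The contrast drawn above, between error in a dependent variable that is independent of the predictor and error that covaries with it, can be sketched with a simple linear regression. This is an illustration under stated assumptions rather than a reanalysis: the slope beta, the dependence gamma of the error on the predictor, the noise levels, and the use of ordinary least squares in place of the negative binomial models are all hypothetical choices.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simulate(n=50000, beta=-4.0, gamma=2.0, seed=7):
    """Return slopes from the true response and two error-prone responses."""
    rng = random.Random(seed)
    # x is a stand-in for f, measured here without error
    x = [rng.uniform(0.0, 0.25) for _ in range(n)]
    # True response declines with x at rate beta
    y = [1.0 + beta * xi + rng.gauss(0.0, 0.5) for xi in x]
    # Case 1: error in the response is independent of x
    y_indep = [yi + rng.gauss(0.0, 0.5) for yi in y]
    # Case 2: error in the response increases with x (hypothetical gamma > 0),
    # as if high-x individuals were credited with too many offspring
    y_dep = [yi + gamma * xi + rng.gauss(0.0, 0.5) for yi, xi in zip(y, x)]
    return ols_slope(x, y), ols_slope(x, y_indep), ols_slope(x, y_dep)

b_true, b_indep, b_dep = simulate()
print(f"slope, true response:              {b_true:.2f}")
print(f"slope, independent response error: {b_indep:.2f}")
print(f"slope, x-dependent response error: {b_dep:.2f}")
```

Independent error in the response only adds noise, so the slope stays close to beta. When the error itself changes with the predictor, the estimated slope shifts by roughly gamma, here towards zero.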
In summary, extra-pair paternity created errors in estimates of f and male ARS and LRS in song sparrows, and caused substantial bias in estimates of inbreeding depression. Such bias is expected even when error in f is unbiased (e.g., Carroll et al. 2006), and can therefore afflict all studies that estimate f with error. Magnitudes of inbreeding depression estimated from unverified "apparent" pedigree data, or from partially genetically corrected data, should consequently be treated with caution. Comparative analyses of inbreeding depression based on such studies may consequently be impeded or biased by variation in the underlying mating system that causes observation error in the pedigree.

MAGNITUDE OF INBREEDING DEPRESSION
The actual pedigree compiled from highly resolved genetic parentage assignments yielded some very large estimates of inbreeding depression, particularly in male song sparrows. The estimated diploid inbreeding load in male LRS of 17.5 (95% CI 6.0-29.0) is at the high end of estimates reported for other wild populations (Crnokrak and Roff 1999;Keller and Waller 2002;Kruuk et al. 2002;O'Grady et al. 2006;Szulkin et al. 2007;Laws and Jamieson 2010). Inbreeding depression of this magnitude would likely overwhelm any postulated inclusive fitness benefit of mating with a relative, cause selection for inbreeding avoidance, and potentially reduce population viability (Lande and Schemske 1985;Kokko and Ots 2006;O'Grady et al. 2006). However, even given our comprehensive and deep genetic pedigree data, reasonable sample sizes and moderately high mean and variance in f, confidence intervals around estimates of inbreeding depression were still wide. Consequently, the magnitude of inbreeding depression in key fitness components was only estimable with relatively low precision, which may lead to uncertain prediction of ecological or evolutionary consequences.
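To give a sense of scale for these loads, the sketch below converts a diploid inbreeding load into expected fitness relative to an outbred individual. It assumes the standard lethal-equivalents formulation, in which the logarithm of fitness declines linearly with f and the diploid load is twice the slope; the function name relative_fitness and the example values of f are illustrative and not taken from the analyses reported here.

```python
import math

def relative_fitness(diploid_load, f):
    """Expected fitness at inbreeding coefficient f relative to f = 0,
    assuming ln(fitness) declines linearly in f with slope -(diploid_load / 2)."""
    return math.exp(-(diploid_load / 2.0) * f)

# Point estimate and 95% CI limits for the male LRS load quoted in the text
loads = (("point estimate", 17.5), ("lower 95% limit", 6.0), ("upper 95% limit", 29.0))

for label, load in loads:
    for f in (0.0625, 0.125, 0.25):
        w = relative_fitness(load, f)
        print(f"{label:16s} load={load:5.1f}  f={f:.4f}  relative fitness={w:.3f}")
```

Under these assumptions the width of the confidence interval translates into very different expected fitness at a given f, which is the sense in which low precision makes downstream predictions uncertain.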