Multiple aspects of plasticity in clutch size vary among populations of a globally distributed songbird



  1. Plasticity in life-history characteristics can influence many ecological and evolutionary phenomena, including how invading organisms cope with novel conditions in new locations or how environmental change affects organisms in native locations. Variation in reaction norm attributes is a critical element to understanding plasticity in life history, yet we know relatively little about the ways in which reaction norms vary within and among populations.
  2. We amassed data on clutch size from marked females in eight populations of house sparrows (Passer domesticus) from North America and Europe. We exploited repeated measures of clutch size to assess both the extent of within-individual phenotypic plasticity and among-individual variation and to test alternative hypotheses about the underlying causes of reaction norm shape, particularly the decline in clutch size with date.
  3. Across all populations, females of this multibrooded species altered their clutch size with respect to date, attempt order, and the interaction of date and order, producing a reaction norm in multidimensional environmental space. The reaction norm fits that predicted by a model in which optimal clutch size is driven by a decline with date hatched in the ability of offspring to recruit. Our results do not fit those predicted for other proposed causes of a seasonal decline in clutch size.
  4. We also found significant differences between populations in response to date and the date by attempt order interaction. We tested the prediction that the relationship with date should be increasingly negative as breeding season becomes shorter but found steeper declines in clutch size with date in populations with longer seasons, contrary to the prediction. Populations also differed in the level of among-individual variation in reaction norm intercept, but we found no evidence of among-individual variation in reaction norm slope.
  5. We show that complex reaction norms in life-history characters exhibit within- and among-population variance. The nature of this variance is only partially consistent with current life-history theory and stimulates expansions of such theory to accommodate complexities in adaptive life history.


Many organisms exhibit labile traits – behavioural, physiological or morphological characters that differ in expression during an individual's lifetime. Labile traits are examples of phenotypic plasticity, broadly defined as the effect of an environmental factor on the phenotypic expression of a genotype (Stearns 1989; Scheiner 1993). Because individual organisms can express labile traits multiple times, the trait values can be described by a function relating phenotype to environment (a reaction norm; Woltereck 1909; Bradshaw 1965; Gomulkiewicz & Kirkpatrick 1992; Nussey, Wilson & Brommer 2007). For linear reaction norms, both the elevation (or intercept; the individual's mean phenotype in the average environment) and the slope (the change in the individual's phenotype in different environments) could vary among individuals. Because the parameters of a reaction norm have often been hypothesized to have evolved through natural selection (e.g. Gotthard & Nylin 1995; Pigliucci 2001; Ghalambor, Angeloni & Carroll 2010), the nature of variation in reaction norm parameters is essential for both selection and, to the extent that this variation has a genetic basis, the evolutionary response to selection (e.g. Lande 2009). Despite a growing number of studies measuring reaction norm parameters in natural populations (e.g. Dingemanse & Wolf 2013), many important aspects of reaction norms and the way they vary within and among populations remain poorly understood (e.g. Husby et al. 2010).

Consider the number of eggs laid by birds per breeding attempt, or clutch size, which is an important component of life history that varies substantially both between and within species (e.g. Lack 1947; Martin 1987; Crick, Gibbons & Magrath 1993; Bennett & Owens 2002). There is growing evidence that this trait is phenotypically plastic – individuals within a population produce different clutch sizes in different breeding attempts (van Noordwijk 1989; Postma & van Noordwijk 2005; Westneat, Stewart & Hatch 2009; Husby et al. 2010) or in response to manipulated environmental conditions (e.g. Boutin 1990; Nager, Rueger & van Noordwijk 1997; Clifford & Anderson 2001). This plasticity appears adaptive because females usually alter the size of their clutch in the direction expected to maximize lifetime reproductive success (Pettifor, Perrins & McCleery 2001).

One possible example of adaptive plasticity is that clutch size in many bird species declines with clutch initiation date (Klomp 1970; Drent & Daan 1980; Murphy 1986; Brommer, Pietiäinen & Kokko 2002). At least 10 hypotheses exist to explain such declines (Decker, Conway & Fontaine 2012), but tests of predictions are made difficult by confusion over whether the decline is produced by differences between individuals or plasticity within individuals. For example, the decline could be driven by high-quality individuals breeding early and producing more eggs, or it could reflect phenotypic plasticity if individuals adjust clutch size in response to time of season (Drent & Daan 1980; Verhulst & Nilsson 2008). Plasticity could be adaptive because season affects either the benefits of producing a given clutch size or the costs of that clutch size to the parents. Many individual factors could be sensitive to season and affect one or the other of these two fitness components (Decker, Conway & Fontaine 2012), including seasonal declines in food supply for eggs or nestlings (Lack 1947) or in juvenile survival (Drent & Daan 1980; Rowe, Ludwig & Schluter 1994), or seasonal increases in the costs of care to the parents' survival or condition (Siikamäki, Hovi & Rätti 1994; Decker, Conway & Fontaine 2012).

The hypotheses that declines in clutch size arise from differences in quality or from plasticity can be distinguished in species that produce multiple breeding attempts within a season. The replicate measures of clutch size allow estimates of both within- and among-individual variation and their covariates. In addition, comparing reaction norm variation within and among populations of such species offers several opportunities for better understanding the nature of phenotypic plasticity in general, and the underlying causes of clutch size variation in particular.

In general, one might predict that differences between populations in reaction norm attributes (either intercept or slope) would arise if there were both differences between populations in ecological conditions and variation within populations in individual responses to those conditions (e.g. Scheiner 1993; Nussey, Wilson & Brommer 2007). Specifically, the hypothesized causes of seasonal declines in clutch size make different predictions about reaction norm shape in species that produce multiple clutches in a season, and comparisons among populations might further distinguish among these hypotheses. For example, if the seasonal decline arises because parents are matching clutch size to food supply, then the reaction norm should decline with date, possibly as a quadratic (food supply likely first increases and then decreases with date), and populations with different season lengths should show different quadratic relationships. If date affects the costs to parents of reproducing, then clutch size should decline with date within individuals. An additional, independent decline with number of previous attempts might be expected under the assumption that prior breeding also increases the costs of reproduction for the current attempt (e.g. Williams 2005). If parental costs are associated with the end of the breeding season, then populations with shorter seasons should also show a stronger decline with date. A final hypothesis is that the decline with date results solely from a decline in offspring fitness with the date they are hatched. Rowe, Ludwig and Schluter (1994) made several predictions about how this time horizon hypothesis would influence plasticity in clutch size with respect to date. One prediction was that where juvenile survival is more negatively affected by date, then date should have a stronger effect on clutch size. 
However, Brommer, Pietiäinen and Kokko (2002) showed that in Ural owls (Strix uralensis), the clutch size–date relationship was inversely related to the date–juvenile recruitment rate relationship, opposite to the prediction.

Rowe et al. also predicted that independently of the decline with date, clutch size in species that breed multiple times a season was expected to increase with attempt order within the season and decline with a date by attempt order interaction (their Fig. A1). As far as we are aware, no other hypothesized factor except the time horizon hypothesis should predict this type of reaction norm. Westneat, Stewart and Hatch (2009) tested these predictions using a long-term data set from a single population of house sparrows (Passer domesticus) and found that clutch size indeed declined with date, increased with attempt order within a season after controlling for date and declined more strongly with date as attempt order increased. These results supported the time horizon hypothesis, but we do not know if this pattern is present in all populations of house sparrows nor how differences between populations might influence it.

Here, we present an among-population analysis of within- and among-individual variance in clutch size reaction norms, using data collected from eight populations of house sparrows. The house sparrow has a nearly global distribution, with long-established populations in western Asia and throughout Europe and variably established introduced populations in North America, South America, Africa and Australia. The species also breeds from the equator to just south of the Arctic Circle and just north of the Antarctic Circle in both mainland and island locales (Anderson 2006). This wide distribution and the habit of producing multiple clutches per season offer an unusual opportunity to compare multiple populations to assess several predictions about the evolution of reaction norms in general and the forces affecting plasticity of clutch size in particular. First, to be under selection, reaction norms must exhibit within-population variation in slope (e.g. Postma & van Noordwijk 2005; Nussey, Wilson & Brommer 2007), and the extent of plasticity should affect fitness. Secondly, if selection on plasticity is driven by general features of the life history of the organism, then given the results of Westneat, Stewart and Hatch (2009), house sparrows would seem to be influenced by the time horizon hypothesis. If the time horizon is a general feature of their life history, then the multidimensional reaction norm exhibited in the Kentucky population should be present in all populations. Thirdly, if plasticity is driven by ecological conditions, populations differing in ecology should exhibit predictably different reaction norms, possibly in both elevation and slope.
In particular, because Rowe, Ludwig and Schluter (1994) focused on the idea that juvenile fitness, and thus the reproductive value of an egg, declines as lay date approaches the end of the season, clutch sizes in populations with shorter seasons should display a steeper decline with date and a stronger interaction with attempt order. The comparative analysis also allows tests of the alternative hypotheses for the decline in clutch size with date. Finally, the distribution of house sparrows world-wide is a mix of populations that are native or introduced as well as insular or continental. We also asked whether the magnitude of among-individual variation differed among populations, which might suggest differences in genetic structure or the presence of additional environmental variables affecting the plasticity of clutch size.


Data Set

We analysed data on clutch size from individually marked female house sparrows from eight multiyear studies distributed in North America and Europe (Table 1): Chizé (France), Helgeland (Norway), Hoedic (France), Kentucky (USA), Lundy Island (UK), Nottingham (UK), Oklahoma (USA) and Veszprém (Hungary). At all sites, females were individually marked with either a numbered metal band, a unique combination of colour bands on their legs, or both. Females were captured using seed-baited traps (Kentucky: Westneat, Stewart & Hatch 2009), mist-nets (Helgeland: Jensen et al. 2008; Veszprém: Bókony et al. 2008) or some combination of methods (Chizé: Chastel & Kersten 2002; Lendvai & Chastel 2010; Hoedic: C. Bichet, D.J. Penn, Y. Moodley, L. Dunoyer, E. Cellier-Holzem, M. Belvalette, A. Grégoire, S. Garnier & G. Sorci, unpublished; Lundy Island: Cleasby et al. 2010; Nottingham: Burke 1984; Oklahoma: Schwagmeyer, Mock & Parker 2008). In seven of the eight populations, females bred in artificial nest boxes, but in the Helgeland population, females nested in cavities and crevices in farm buildings and other man-made structures. In all populations, nests were checked at least once per week beginning in early spring continuing all summer. Clutch completion and final clutch size were indicated by the same number of eggs on two successive checks or information (such as the timing of hatching) that indicated a check occurred in the middle of incubation (sparrows incubate for 10–11 days following clutch completion, Anderson 2006). The Hoedic population was followed through the first two attempts but not through the end of breeding, so it was removed from some analyses. For seven of the eight populations, females were assigned to nest attempts based on them being observed repeatedly entering or standing on the nest box during incubation or nestling provisioning, or being captured in the box. 
In some cases in each population, the female was not observed at a particular clutch, but was assigned that clutch based on the fact that she nested there during the previous or subsequent attempt (or both), and there were no unusual disruptions or long intervals between attempts to suggest there had been a change in ownership. In the Helgeland population, females were assigned to nests by use of microsatellite analysis of DNA collected from females and nestlings (Jensen et al. 2008). Clutches at Helgeland that failed to hatch therefore could not be assigned except if they were at the same site and timed in between two other attempts by the same female.

Table 1. Summary statistics on location, study duration, timing of breeding and key variables relevant to analysis of clutch size for eight populations of house sparrows. Means ± 1 SD and ranges given as appropriate

| Variable | Chizé | Helgeland | Hoedic (a) | Kentucky | Lundy Island (c) | Nottingham (b, d) | Oklahoma | Veszprém |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Latitude | 46° 8′ 50″ N | 66° 32′ 16″ N | 47° 20′ 24″ N | 38° 6′ 42″ N | 51° 9′ 53″ N | 52° 56′ 51″ N | 35° 23′ 00″ N | 47° 05′ 32″ N |
| Longitude | 0° 25′ 29″ W | 12° 50′ 58″ E | 2° 52′ 42″ W | 84° 29′ 51″ W | 4° 39′ 44″ W | 1° 05′ 45″ W | 97° 45′ 00″ W | 17° 53′ 43″ E |
| Clutch size | 4·9 ± 0·8 | 5·0 ± 1·0 | 4·5 ± 0·8 | 4·9 ± 0·9 | 4·2 ± 0·8 | 4·2 ± 0·7 | 4·4 ± 0·8 | 4·7 ± 0·8 |
| Date of first egg (Julian) | 138 ± 24 (96–199) | 154 ± 25 (107–212) | 139 ± 19 (112–176) | 141 ± 37 (60–220) | 155 ± 31 (88–224) | 150 ± 29 (102–222) | 139 ± 31 (80–207) | 142 ± 30 (90–210) |
| Attempts per female per season | 1·6 ± 0·6 (1–3) | 1·8 ± 0·7 (1–4) | 1·5 ± 0·5 (1–2) | 2·7 ± 1·2 (1–6) | 2·3 ± 0·9 (1–6) | 1·2 ± 0·5 (1–3) | 1·9 ± 0·9 (1–4) | 2·2 ± 0·9 (1–4) |
| Total attempts known per female | 3·0 ± 1·5 (1–6) | 3·7 ± 2·7 (1–11) | 2·4 ± 1·4 (1–6) | 7·3 ± 5·2 (1–26) | 6·4 ± 3·9 (1–17) | 1·2 ± 0·8 (1–4) | 4·7 ± 3·5 (1–14) | 2·8 ± 1·9 (1–9) |
| Years female bred | 1·7 ± 0·7 (1–3) | 1·9 ± 1·3 (1–5) | 1·4 ± 0·6 (1–3) | 2·3 ± 1·3 (1–6) | 2·5 ± 1·3 (1–6) | 1·1 ± 0·4 (1–2) | 1·9 ± 1·1 (1–5) | 1·3 ± 0·7 (1–4) |

(a) Hoedic population was not followed through the end of breeding in most years.
(b) Nottingham study occurred on two sites located 40 km apart in Nottinghamshire, England. Coordinates are the midpoint between them.
(c) The Lundy Island population nearly went extinct in 2000, and new birds were introduced. However, for the purposes of this analysis, it was still considered a native population, as the introduced birds were from the nearby mainland, and the island previously supported house sparrows.
(d) While Britain is an island, the Nottingham study was embedded in a much larger population and so is considered more mainland in key features such as dispersal.

All clutches were assigned an attempt order, referring to their position in the series of clutches produced by a known female in that season. For nearly all clutches in all populations, we also determined the date that the first egg of the clutch was laid. For clutches checked during laying, this was deduced from the fact that house sparrows lay 1 egg per day (Anderson 2006). For successful clutches that were not checked until laying had been completed, we inferred the date the first egg was laid based upon the date they hatched, assuming an 11-day incubation period between the laying of the penultimate egg and hatching (Anderson 2006). For clutches that never hatched but had been checked at least twice during incubation, we estimated the date of first egg as the midpoint of the period between the earliest and latest possible date of first egg. Breeding season was estimated as the range in days between the first egg date of the first clutch and that of the last clutch in that population in that season.
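The date rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually coded for the study: the function names are hypothetical, a check made during laying is assumed to fall on the day the most recent egg was laid, and the handling of one-egg clutches is a guess, since the text does not cover that case.

```python
INCUBATION_DAYS = 11  # penultimate egg to hatching, as assumed in the text


def first_egg_from_laying(check_day, eggs_seen):
    """Back-date the first egg from a check made during laying.

    Assumes one egg per day and that the newest egg was laid on the check day.
    """
    return check_day - (eggs_seen - 1)


def first_egg_from_hatching(hatch_day, clutch_size):
    """Back-date the first egg of a completed clutch from its hatch date."""
    # The penultimate egg is egg number (clutch_size - 1). For a one-egg
    # clutch we fall back to the only egg (an assumption of this sketch).
    penultimate_number = max(clutch_size - 1, 1)
    penultimate_day = hatch_day - INCUBATION_DAYS
    return penultimate_day - (penultimate_number - 1)


def first_egg_midpoint(earliest_possible, latest_possible):
    """Estimate for unhatched clutches: midpoint of the possible dates."""
    return (earliest_possible + latest_possible) / 2


def season_length(first_egg_days):
    """Days between the first-egg dates of the earliest and latest clutch."""
    return max(first_egg_days) - min(first_egg_days)
```

For example, a five-egg clutch hatching on day 140 implies a penultimate (fourth) egg on day 129 and a first egg on day 126.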

Statistical analysis

Each clutch laid by each female was considered an observation, and in total, we amassed data on 4871 breeding attempts, with 42 attempts omitted because of unknown clutch sizes due to gaps in egg laying or nest failure before a final clutch size could be determined. We analysed the variation in clutch size using linear mixed-effects models with restricted maximum-likelihood estimation and a Gaussian error structure. We checked the distribution of residual error in the global model and found it to fit well with expectation of normality. Over 80% of the residuals behaved as expected, with only a slight excess of small residuals (Fig. S1).

The initial model of the ith clutch (Yijkg) included three random intercept effects representing the hierarchical nature of the data, and so is useful for understanding at what levels key parameters affecting variation in clutch size may act. The three random intercepts were the effects of the gth population (pop0g), the kth year nested within population (year0kg) and the jth individual nested within population (ind0jg):

$$Y_{ijkg} = \beta_0 + pop_{0g} + year_{0kg} + ind_{0jg} + e_{0ijkg} \qquad \text{(eqn 1)}$$

where β0 is the global intercept, e0ijkg is the residual clutch size (within-individual) for the ith clutch, and e0ijkg, pop0g, year0kg and ind0jg are all normally distributed with mean 0 and variance $\sigma^2_{e_0}$, $\sigma^2_{pop_0}$, $\sigma^2_{year_0}$ and $\sigma^2_{ind_0}$, respectively. The statistical significance of a random effect was tested using the likelihood ratio test (LRT) in which twice the difference in log likelihood (−2dLL) between a model (fitted by restricted maximum likelihood) with the random effect and a model lacking that term is distributed as a chi-square with 1 d.f. (Pinheiro & Bates 2000). We also tested whether the within-population individual variance differed between populations by estimating the among-individual variance separately for each population (e.g. $\sigma^2_{ind_0,g}$ for the gth population). This required estimating among-individual variance for all eight populations (or seven when we omitted Hoedic from the analysis) and so was tested using an LRT with 7 (or 6) d.f.
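The likelihood ratio test described above reduces to an upper-tail chi-square probability for the observed −2dLL. The sketch below shows one way to compute it with the Python standard library only. It is an illustration, not the SAS procedure used in the analysis, and the function names are hypothetical. The tail probability is built from the closed forms for 1 and 2 d.f. and the standard recurrence that adds 2 d.f. at a time.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 (odd) and df = 2 (even) start the recurrence.
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    # Q(x | k + 2) = Q(x | k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q


def lrt_p_value(minus2_dll, df=1):
    """P-value for a likelihood ratio test statistic (-2dLL) on df d.f."""
    return chi2_sf(minus2_dll, df)
```

With 1 d.f., a −2dLL of 3·84 corresponds to P ≈ 0·05, and with 6 d.f. a −2dLL of 6·8 corresponds to P ≈ 0·34.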

We tested for the effect of clutch initiation date and attempt order on the variation in clutch size using the 4829 clutches with full information. Although clutch initiation date could be considered a second phenotype (e.g. Husby et al. 2010), we treated it as an environmental variable because after the first clutch of the season, subsequent clutch initiation dates are affected by an array of new variables that complicate a multivariate analysis. We created two variables for each of these factors, allowing us to assess between- and within-individual differences in initiation date and attempt order (e.g. van de Pol & Wright 2009). To assess among-individual variance in both variables, we calculated the mean clutch initiation date and mean attempt order over all clutches produced by each female. We centred these values (so that intercepts would be estimated at the mean) by subtracting the yearly population mean date and attempt order from each value (mean-centred-between, or B). To assess within-individual variance, we subtracted the female's mean date and attempt order from that of each of her clutches (mean-centred-within, or W). These variables allowed us to assess clutch size plasticity in response to clutch initiation date, attempt order, and their interaction, and control for between-individual biases in both variables, as depicted in the following model:

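$$\begin{aligned} Y_{ijkg} = {} & \beta_0 + \beta_1\,attemptB_{jg} + \beta_2\,dateB_{jg} + \beta_3\,(dateB_{jg} \times attemptB_{jg}) \\ & + \beta_4\,attemptW_{ijkg} + \beta_5\,dateW_{ijkg} + \beta_6\,(dateW_{ijkg} \times attemptW_{ijkg}) + \beta_7\,dateW_{ijkg}^{2} \\ & + \beta_8\,(dateB_{jg} \times dateW_{ijkg}) + \beta_9\,(attemptB_{jg} \times attemptW_{ijkg}) \\ & + pop_{0g} + year_{0kg} + ind_{0jg} + e_{0ijkg} \end{aligned} \qquad \text{(eqn 2)}$$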
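The between- and within-individual centring can be illustrated with a short Python sketch. It is written under assumptions that go beyond the text: the record layout and function name are hypothetical, and the between-individual value for a clutch is taken as the female's overall mean minus the mean of all clutches in that clutch's population and year.

```python
from collections import defaultdict
from statistics import mean


def centre_within_between(clutches):
    """Return copies of the clutch records with B and W variables added.

    Each record is a dict with keys: female, population, year, date, attempt.
    """
    by_female = defaultdict(list)
    by_pop_year = defaultdict(list)
    for c in clutches:
        by_female[c["female"]].append(c)
        by_pop_year[(c["population"], c["year"])].append(c)

    centred = []
    for c in clutches:
        row = dict(c)
        same_female = by_female[c["female"]]
        same_pop_year = by_pop_year[(c["population"], c["year"])]
        for var in ("date", "attempt"):
            female_mean = mean(r[var] for r in same_female)
            pop_year_mean = mean(r[var] for r in same_pop_year)
            # Between-individual: female mean relative to the yearly population mean.
            row[var + "B"] = female_mean - pop_year_mean
            # Within-individual: this clutch relative to the female's own mean.
            row[var + "W"] = c[var] - female_mean
        centred.append(row)
    return centred
```

By construction, the W values of each female sum to zero, so the within-individual slopes are estimated free of differences among females in their average timing.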

The variables of within-female date (dateW) and within-female attempt order (attemptW) were strongly positively correlated (r = 0·87). This raised concerns that collinearity would produce spurious results in the analysis. We assessed this by running maximum-likelihood models omitting one or the other of those variables and comparing AIC values. Removing either within-female date or within-female attempt produced substantially larger AIC values (ΔAIC > 2), indicating that including both improved model fit despite the strong correlation between them. We proceeded to analyse both terms together with their interactions, but we also checked AIC values of models lacking the within-individual date or attempt order to ensure their contribution was not influenced by this correlation.

To test the idea that the reaction norm might differ between populations and to assess how, model 2 was modified to include population as a fixed effect, where both the main effect of population as well as interaction terms with all three within-individual terms (within-individual mean-centred date, attempt order and their interaction) were added. To test specific differences between populations in characteristics (e.g. season length), we altered model 2 by adding season length (determined as the difference in days between the lay date of the earliest nest and that of the last nest for each population-season) and its interactions with within-individual date and attempt order as fixed effects.

All analyses were conducted in SAS 9.2 (SAS Institute 2008). Fixed effects were tested using F tests with denominator degrees of freedom estimated with the Kenward–Roger method, which adjusts degrees of freedom to control for repeated measures within the appropriate level of replication (individual within population and year, year within population and population). Code used for the analysis in Table 3 is provided in the Supplementary Material.


Patterns of Variance

The 4829 clutches averaged 4·6 eggs, and the variance in clutch size over the entire data set was 0·85. Clutch size, dates of first egg and number of attempts per season and per female varied among the eight populations (Table 1). Model 1, which partitioned the variance in clutch size into that among populations, among years within populations, and among individuals within populations, revealed significant variance of all three random effects (Table 2).

Table 2. Partitioning of variance in clutch size into components for 4829 clutches in eight populations of house sparrows from a REML mixed model with three random intercepts and no fixed effects (Model 1 in main text)

| Variance component | Estimate ± SE | Likelihood ratio test (a) | P-value |
| --- | --- | --- | --- |
| Population | 0·09 ± 0·05 | 54·7 | <0·0001 |
| Year (within population) | 0·01 ± 0·005 | 26·1 | <0·0001 |
| Individual (within population) | 0·11 ± 0·01 | 155·8 | <0·0001 |
| Residual (within individuals) | 0·64 ± 0·02 | | |

(a) Calculated from the fit of the complete model (log likelihood = 12207·5) and a model lacking the focal term (d.f. = 1).
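As a small worked illustration, the variance components in Table 2 can be expressed as shares of the total variance. The sketch uses the rounded estimates from the table, so the shares are approximate. The dictionary keys and the function name are hypothetical.

```python
# Rounded variance component estimates taken from Table 2.
TABLE2_COMPONENTS = {
    "population": 0.09,
    "year": 0.01,
    "individual": 0.11,
    "residual": 0.64,
}


def variance_shares(components):
    """Return each component as a proportion of the summed variance."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


shares = variance_shares(TABLE2_COMPONENTS)
```

The components sum to 0·85, the total variance reported for the data set, and the individual component accounts for roughly 13% of it.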

Reaction Norm for Clutch Size

Linear mixed model analysis confirmed the basic multidimensional reaction norm reported for Kentucky sparrows by Westneat, Stewart and Hatch (2009). In the full data set of all eight populations, clutch size declined linearly with date within females, showed an additional negative quadratic effect of date, increased independently with attempt order, and declined more steeply with date as attempt order increased (Table 3). We assessed whether these patterns were driven by the Kentucky data by refitting the model without them. The results were qualitatively the same as those from the full analysis (i.e. all effects were in the same direction and remained statistically significant; results not shown).

Table 3. Results of REML linear mixed model of clutch size from 4829 clutches produced by 1512 female house sparrows from eight populations (Model 2 in text)

| Factor | Effect ± SE | F-statistic | d.f. | P-value |
| --- | --- | --- | --- | --- |
| Global intercept (β0) | 4·7 ± 0·1 | | | |
| AttemptB | 0·25 ± 0·03 | 51·8 | 1, 1086 | <0·0001 |
| DateB | −0·007 ± 0·001 | 48·0 | 1, 2074 | <0·0001 |
| DateB × AttemptB | −0·001 ± 0·001 | 0·8 | 1, 2622 | 0·38 |
| AttemptW | 0·12 ± 0·03 | 13·6 | 1, 3817 | 0·0002 |
| DateW | −0·009 ± 0·0008 | 111·0 | 1, 3803 | <0·0001 |
| DateW × AttemptW | −0·003 ± 0·0008 | 13·1 | 1, 4333 | 0·0003 |
| DateW × DateW | −0·00009 ± 0·00002 | 19·8 | 1, 4502 | <0·0001 |
| DateB × DateW | −0·0003 ± 0·00004 | 53·8 | 1, 3568 | <0·0001 |
| AttemptB × AttemptW | 0·008 ± 0·03 | 0·08 | 1, 3561 | 0·78 |

Population, year within population and individual within population were included as random effects. Fixed effects were mean-centred between-individual date (DateB) and attempt number (AttemptB) and their interaction, mean-centred within-individual date and attempt (DateW and AttemptW, respectively) and their interaction, the quadratic of DateW, and interactions of within- and between-individual date and attempt order.

We also controlled for individual differences within populations in mean number of attempts and dates of clutch initiation. Females with a larger average attempt order (more attempts per season) produced larger clutches (attemptB, Table 3), and those with a later average date of clutch initiation produced smaller clutches (dateB, Table 3). Individual females with a later average date of clutch initiation also showed a steeper decline in clutch size as within-individual date progressed (dateB by dateW; Table 3).
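To make the shape of this reaction norm concrete, the within-individual estimates in Table 3 can be plugged into a small Python function. This is an illustrative sketch only. It uses the rounded point estimates, sets the between-individual terms and all random effects to zero (an average female in an average population and year), and the names are hypothetical.

```python
# Rounded within-individual point estimates from Table 3.
COEF = {
    "intercept": 4.7,
    "attemptW": 0.12,
    "dateW": -0.009,
    "dateW_x_attemptW": -0.003,
    "dateW_squared": -0.00009,
}


def predicted_clutch_size(date_w, attempt_w):
    """Predicted clutch size for an average female (B terms set to zero).

    date_w: days from the female's mean clutch initiation date.
    attempt_w: attempts from the female's mean attempt order.
    """
    return (
        COEF["intercept"]
        + COEF["attemptW"] * attempt_w
        + COEF["dateW"] * date_w
        + COEF["dateW_x_attemptW"] * date_w * attempt_w
        + COEF["dateW_squared"] * date_w ** 2
    )


def date_slope(date_w, attempt_w):
    """Local slope of predicted clutch size on within-individual date."""
    return (
        COEF["dateW"]
        + COEF["dateW_x_attemptW"] * attempt_w
        + 2 * COEF["dateW_squared"] * date_w
    )
```

At the female's mean date, the slope on date is −0·009 eggs per day at her mean attempt order and −0·012 eggs per day one attempt later, which is the warping of the plane produced by the date by attempt order interaction.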

Population Differences in the Clutch Size Reaction Norm

Populations differed in several elements of the reaction norm. Because the Hoedic population was not fully studied through the entire breeding season, we omitted it from most of these analyses. Populations had significantly different mean clutch sizes (F6,1801 = 49·0, P < 0·0001, Table S1) and differed significantly in the relationship between date and clutch size within females (Table S1, Fig. 1; dateW by population interaction: F6,3020 = 2·6, P < 0·02). In five of the seven populations, clutch size decreased significantly with date within females (Table S1). Both Helgeland and Nottingham exhibited a non-significant positive association between clutch size and date within females. Population identity did not affect the magnitude of the effect of attempt order (F6,3283 = 0·9, P = 0·47, Table S1), which was positive in six of seven populations. The interaction between date and attempt order was negative in all but one population (Chizé), but did not significantly differ among populations (F6,4568 = 1·6, P = 0·15). The nonlinear effect of within-individual variation in date was negative in five of seven populations (positive in Helgeland and Veszprém) but also did not differ significantly among populations (DateW²; F6,4517 = 1·3, P = 0·26).

Figure 1.

Population average reaction norms of clutch size with respect to nest initiation date (mean-centred within-individual) for the eight populations of house sparrows studied. Lines are plotted over the range of dates experienced by individuals in each population. Slopes are unadjusted for the influence of other variables and estimated slopes from LMM analysis differ slightly (see Table S1).

Some of this variability among populations appeared linked to population attributes. In these analyses, we used all eight populations, included population identity as a random effect and the fixed effects as shown in Table 3 excepting the non-significant interaction terms, and tested separately population location (on an island or mainland) and population status (introduced or native). We found no difference in population mean clutch size between island and mainland populations (mainland–island = 0·1 ± 0·3 eggs, F1,6·2 = 0·2, P = 0·70), but mainland populations had a significantly more negative slope with respect to date than did island populations (mainland–island = −0·005 ± 0·002 eggs per day, F1,3807 = 7·7, P = 0·005). The two introduced populations had similar mean clutch sizes to the six native populations (introduced–native = 0·1 ± 0·3 eggs, F1,5·8 = 0·2, P = 0·66), but exhibited a significantly more negative slope with respect to date (introduced–native = −0·005 ± 0·002 eggs per day, F1,3805 = 9·3, P = 0·002). No other interaction terms including either location or status were significant.

We tested whether variation in the length of the breeding season influenced elements of the clutch size reaction norm. We altered model 2 in three ways and ran the modified model to test this idea. First, we dropped all fixed effects terms that were not significant. Secondly, we added season length for each year within each population as a continuous covariate. Finally, we added a season length by within-individual date interaction term. This analysis thus tested the impact of variation in season length, combining both within- and between-population effects on the term whose impact varied significantly among populations.

We found no main effect of season length on clutch size (−0·001 ± 0·003 eggs per day, F1,65·1 = 0·3, P = 0·62). However, we found that the negative impact of within-individual date was significantly more negative as season length increased (Season length by date: −0·00009 ± 0·00002 eggs per day, F1,3590 = 23·8, P < 0·0001), a result that is hinted at in Fig. 1 (which shows only between-population differences in season length). While adding season length had little qualitative effect on most other terms (fixed and random) in the model shown in Table 3, two fixed terms were substantially different. First, the significantly negative effect of within-individual date entirely disappeared (0·003 ± 0·003 eggs per day, F1,3612 = 1·2, P = 0·27), suggesting that the previous main effect was driven by results arising from populations or years with longer seasons (see Table S1). Secondly, the significant negative effect of the interaction between within-individual date and attempt order became non-significant (−0·002 ± 0·003 eggs per day per attempt, F1,4237 = 0·5, P = 0·47), possibly because the season length–date interaction accounted for some of the variation explained by attempt order and date, as season length affects the number of attempts that are possible.
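As a rough reading of these estimates, the fitted slope of clutch size on within-individual date can be written as a function of season length. The expression below is a sketch under assumptions not stated in the text: season length (S, in days) is taken to have entered the model uncentred, the attempt-order and quadratic terms are ignored, and only point estimates are used (the main effect of date is not distinguishable from zero). The numbers are therefore illustrative rather than results of the analysis.

$$\frac{\partial\,\text{clutch size}}{\partial\,\text{dateW}} \approx 0.003 - 0.00009\,S, \qquad S = 100:\ -0.006, \qquad S = 150:\ -0.0105 \quad \text{(eggs per day)}$$

Under these assumptions, a season 50 days longer steepens the decline by about 0·0045 eggs per day, which is the direction of the interaction reported above.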

Among-Individual Variance in Reaction Norm Parameters

As was found in Westneat, Stewart and Hatch (2009), female identity explained a significant portion of the variance in clutch size (Table 2) and this persisted regardless of the fixed effects included in the models. In contrast to Westneat, Stewart and Hatch (2009), here, we controlled for potential biases due to individual females producing clutches at different times in the season or having different numbers of attempts. Both individual mean date of clutch initiation and individual mean attempt order had significant effects on clutch size (Table 3), but both had roughly similar effect sizes as within-female date and attempt order, suggesting that they may arise from the same underlying reaction norm. Despite this, female identity (random effect of individual in model 2) continued to have a significant effect on variance in clutch size, accounting for 14% (0·12) of the total variance (0·85) in clutch size.

We tested whether among-individual variance in clutch size differed between populations by comparing a modification of the model in Table 3 (population omitted as a random effect, included as a fixed effect along with a population by dateW interaction) with one in which the random effect of female identity was split into separate estimates from each of the eight populations (Fig. 2). An LRT revealed that the model with the separate population estimates was a significantly better fit than one with a single estimate of among-female variation (−2dLL = 23·5, d.f. = 7, P = 0·0006), indicating that populations differed in among-female variance in clutch size after controlling for within-individual plasticity and between-individual differences in timing. The model could not estimate variance in the Nottingham population (Fig. 2), possibly due to small sample sizes for both the number of individuals and the number of clutches per individual.

Figure 2.

Estimates of population-specific among-individual variance in clutch size from the full REML mixed model including both within- and among-individual fixed effects (date and attempt order and their interaction; Table 3). Error bars are estimated standard errors.

We also assessed potential among-individual variance in slopes with respect to both date and attempt order by adding a random slope term to the model. We found no evidence for differences in individual slope with respect to dateW (estimate = 0·00006 ± 0·00005, −2dLL = 2·6, d.f. = 2, P = 0·30). We also found no evidence that the multidimensionality of the reaction norm due to the interaction between date and attempt order varied among individuals (−2dLL = 0·3, d.f. = 2, P = 0·86). By contrast, we found significant between-individual differences in slope with respect to within-individual attempt order (0·014 ± 0·006, −2dLL = 9·7, d.f. = 2, P = 0·01), with the estimated covariance between slope and intercept slightly negative (−0·004 ± 0·007). Estimating the individual random slope term for each population did not significantly improve the fit of the model (−2dLL = 6·8, d.f. = 6, P = 0·34).
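The likelihood ratio tests above convert a difference in −2 log-likelihood into a P-value. The sketch below shows that conversion under a plain chi-square reference distribution, using a closed form that is exact for even degrees of freedom. It is an illustration only: the function name is hypothetical, and it does not establish which reference distribution the original analysis used. Tests of variance components at a boundary are sometimes referred to a mixture of chi-square distributions instead.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with even df.

    Uses exp(-x/2) * sum_{i < df/2} (x/2)**i / i!, which is exact when df is
    a positive even integer.
    """
    if df <= 0 or df % 2:
        raise ValueError("this sketch only handles positive even df")
    if x < 0:
        raise ValueError("the test statistic must be non-negative")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i      # builds (x/2)**i / i! incrementally
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # (-2dLL, d.f.) pairs quoted in the paragraph above
    for stat, df in ((9.7, 2), (0.3, 2), (6.8, 6)):
        print(f"-2dLL = {stat}, d.f. = {df}: P = {chi2_sf_even_df(stat, df):.3f}")
```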

Discussion

Our results from multiple populations confirm that clutch size in house sparrows exhibits a complex form of phenotypic plasticity in which the reaction norm is a warped plane in multidimensional environmental space. This reaction norm varied among populations but not in the way predicted by any of the hypothesized forces thought to drive selection on the shape of the optimal reaction norm. The reaction norm is not as expected if seasonal declines in clutch size are driven by changing food resources, risk of predation or effects on parental residual reproductive value. The attributes of this reaction norm, with a decline in clutch size with date, an increase in clutch size with attempt order, and a more negative clutch size–date relationship with increasing attempt order, fitted those expected under the time horizon hypothesis (Rowe, Ludwig & Schluter 1994). However, season length did not have the expected positive effect on the date–clutch size relationship; its effect was instead significantly negative. Our results also show that populations differed in the nature of individual variation in reaction norm parameters. These findings bear on our understanding of the evolution of reaction norms in general and of the specific forces potentially acting on plasticity in clutch size in birds. We explore these implications in more detail below.

Comparative Analyses of Reaction Norms

The clutch size reaction norm, as measured across eight populations of house sparrows, exhibits several intriguing properties. Populations differed in intercept (elevation), and there was significant repeatability of elevation within individuals across all populations after controlling for some potential biases due to plasticity. However, the level of repeatability in clutch size differed between populations. Because within-population, among-individual variation could reflect the level of underlying genetic variation (Nussey, Wilson & Brommer 2007), differences in repeatability could reflect differences in evolutionary potential. Between-individual differences in any phenotypic trait could arise from two sources: genetic variation or unaccounted environmental variation acting throughout an individual's reproductive life span. The characteristics of populations showing high among-individual variance support the latter hypothesis over the former for two main reasons. First, among-individual variance did not track expected differences in genetic diversity. North American populations of house sparrows have lower levels of genetic diversity (as measured at presumably neutral microsatellite loci) than do European populations (Schrey et al. 2011). We tested whether introduced populations (the two from North America) had a different among-individual variance than did native populations and found no evidence for a difference (LRT, χ2 = 0·1, d.f. = 1, P = 0·75). Island populations also generally have lower levels of genetic variation than mainland populations in many species (e.g. Frankham 1997). However, we found no difference in the among-individual variation in clutch size between island and mainland populations (LRT, χ2 = 0·8, d.f. = 1, P = 0·37).

A second reason for suspecting that population-level differences in between-individual variation may be due to unaccounted environmental variation is that populations differed in several variables that contributed to phenotypic plasticity in clutch size. For example, based on the number of years bred, populations might vary in their age distributions (Table 1). Clutch size increased with number of years a female was present in our data set (best estimate of female age that we have; model as in Table 3 with addition of female years and the Nottingham population omitted; effect = 0·10 ± 0·01, F1,3465 = 42·2, P < 0·0001), but this did not eliminate the differences between populations in among-female variance (−2dLL = 20·9, d.f. = 5, P < 0·001). Nevertheless, females in some locations could be experiencing a wider range of other environmental conditions. Because we have not accounted for all environmental factors in our analysis, it seems likely that differences in environmental variance may be the cause of population differences in between-individual variance.

The evolution of plasticity requires variation in reaction norm slope, and our results suggest complexity in patterns of variation in slopes. An especially interesting element of the clutch size reaction norm in sparrows is the interaction between date and attempt order. In most populations, this is negative (although it is significantly so in only two, Table S1), meaning that the effect of date becomes more negative as females produce more prior clutches. This interaction term produces the non-additive feature of the multidimensional reaction norm and is a unique prediction of the Rowe, Ludwig and Schluter (1994) model of optimal clutch size. For this interaction to evolve, there must be within-population variation in it, but we found no support for individuals differing in this parameter of the reaction norm. We strongly suspect this reflects limited statistical power rather than biology; while we have data on a total of 1512 females, for only 107 did we have three or more clutches (sufficient to measure slope with some residual variance) for each of three attempt orders. To reduce the impact of sampling variance on the residual variance, we would need even more clutches per attempt order, and consequently, sample size drops considerably. Power to detect among-individual variation in this parameter of the reaction norm is thus likely to be quite low (e.g. van de Pol 2012).
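The non-additive term can be written out explicitly. In the sketch below, the symbols β0 to β3, d (date) and a (attempt order) are notation introduced here for illustration and are not the parameterization of Table 3.

```latex
\[
C = \beta_0 + \beta_1 d + \beta_2 a + \beta_3\, d\,a,
\qquad
\frac{\partial C}{\partial d} = \beta_1 + \beta_3 a .
\]
```

With β3 < 0, the slope of clutch size on date becomes more negative at each successive attempt, which is what warps the plane. Among-individual variation in this parameter would correspond to females differing in β3.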

Our power to detect variance among individuals in univariate slopes is considerably greater, and our results from these analyses raise some interesting questions regarding the evolution of plasticity. First, we found that populations differed significantly in the average slope of clutch size with date. Divergence in slope is expected when plasticity is under different selection pressures in different populations. Divergent selection and an evolutionary response are only possible if there are individual, heritable, differences in slope, yet we failed to detect significant among-individual variation in this slope. We did uncover significant among-individual variation in slope with respect to attempt order, but found no evidence of among-population variance in slope. Thus, our results seem to indicate a paradoxical situation; for the parameter of the clutch size reaction norm (slope with respect to date) that appears to have diverged between populations, there is no evidence of the individual variation within populations that is necessary for selection to lead to such divergence. Conversely, for the parameter of the reaction norm (slope with respect to attempt order) that exhibits the necessary variance among individuals, no divergence between populations has apparently occurred.

There are many possible explanations for this situation, but we focus on two that seem especially interesting. One is that slope with respect to attempt order is under little or no selection in all populations, thereby retaining individual variation and limiting divergence, whereas slope with respect to date is under strong stabilizing selection with some directional selection, and the combination has eliminated present sources of individual variation but also has led to divergence. Testing this would require data on selection and heritability (additive genetic variance) of clutch size from each population, which we do not have at present. Another hypothesis that deserves more attention in cases of plasticity is the action of additional, unaccounted environmental variables. Most studies of reaction norms assess slopes in only a single environmental axis (Pigliucci 2001; Brommer, Pietiäinen & Kokko 2002; Postma & van Noordwijk 2005; Nussey, Wilson & Brommer 2007). We explicitly analysed the clutch size reaction norm as a plane in bivariate environmental space. Westneat, Stewart and Hatch (2009) considered additional environmental variables (e.g. precipitation) but found no phenotypic association. It is conceivable, however, that clutch size responds to variables that are not captured by either date or attempt order. If these differ among populations in ways that generate both differences between individuals within populations and average effects that differ between populations, then multidimensional plasticity could explain the patterns of variation we observed. Testing this idea would require a more detailed understanding of the ecology of clutch size and within-individual plasticity to identify this unknown variable (or variables) and measurements of them within and among populations.

The Life History of Clutch Size

Our results raise new questions about life history and clutch size. We tested whether the differences between populations in reaction norm slopes with respect to date (Table 3, Fig. 1) might distinguish among various hypotheses for why clutch size declines with date. As found in Westneat, Stewart and Hatch (2009), house sparrows exhibit a complex reaction norm that is best explained by the time horizon hypothesis. We reasoned this support would be stronger if season length had a positive effect on the relationship between clutch size and date (less negative in longer seasons). We found the opposite; the steeper declines in clutch size occurred in populations with longer breeding seasons. This result raises questions about the assumption for the time horizon hypothesis that offspring quality declines with date. Data on recruitment rate have been published for two of the populations we studied; in Helgeland, Ringsby et al. (2002) found that recruitment probability increased initially and then declined with hatch date (Husby et al. 2006). In Oklahoma, recruitment rate appears to be influenced by the quality of food items provided by parents, which increases with date; when food quality is held constant, recruitment declines with date (Schwagmeyer & Mock 2003). This is an intriguing coincidence given the relationships between clutch size and date in the two populations (Fig. 1), with Helgeland having a flat relationship and Oklahoma steeply negative, but we clearly need more information from all populations, including these two, to understand the relationship between juvenile survival and clutch size.

It is possible that the time horizon hypothesis applies to some populations more than others, but if so, why? One possibility is that the key relationship driving the time horizon effect is more complex than assumed in the Rowe, Ludwig and Schluter (1994) model. Brommer, Pietiäinen and Kokko (2002) noted that relatively subtle differences in the relationship of offspring fitness with date could have large effects on the outcome of the model. The model can be simplified to the following equation, which is the condition that must be met for the optimal time–clutch combination:

display math(eqn 3)

where C(t) is the clutch size at time t and V(t) is the recruitment probability at time t, and C′ and V′ are the rates of change in clutch size or recruitment probability with t. In our analysis of variation in breeding season length and clutch size, we assumed that V(t) = 0 at the end of the breeding season and that peak recruitment [max V(t)] was the same in all populations; thus, season length would be collinear with V′(t). Brommer, Pietiäinen and Kokko (2002) noted in their study that differences in max V(t) alone could create differences in C′(t). This could affect our results as well. However, Rowe, Ludwig and Schluter (1994) assumed that parents stop breeding when V(t) = 0. We suggest that this assumption may be invalid because other factors may influence when parents stop breeding, ensuring that V(t) does not equal zero and altering the relationship between V′(t) and season length. For example, if independent offspring must have time to acquire skills at finding food (e.g. seeds) before peak times of food stress (e.g. Loman 1982; Hochachka 1990), then it is possible that the decline in offspring fitness over the breeding season depends on when this food stress occurs relative to the end of the breeding season. If the insect food fed to nestlings declines early such that parents cease breeding long before the period of food stress that juveniles might experience, then breeding date may have little impact on clutch size. The relative timing and rate of declines in insect food and periods of food stress could produce complex differences in reaction norms for clutch size between populations. A similar alternative involves moult: house sparrows moult once per year in autumn (Anderson 2006), and late-hatched juveniles might experience increased mortality if their moult is delayed into increasingly colder weather or if they have to moult quickly and produce poorer feathers as a result. As with the first hypothesis, differences between populations in when breeding ceases and when moult is optimal could create selection favouring a different decline in clutch size with date.
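One way to see how V(t) and C(t) trade off is a simple marginal-value argument. The sketch below assumes, purely for illustration, that the fitness return from a clutch laid at time t is the product C(t)V(t). It is written from the symbol definitions given here and is not necessarily the exact expression used by Rowe, Ludwig and Schluter (1994) or Brommer, Pietiäinen and Kokko (2002).

```latex
\[
\frac{d}{dt}\bigl[C(t)\,V(t)\bigr] = C'(t)\,V(t) + C(t)\,V'(t) = 0
\quad\Longrightarrow\quad
\frac{C'(t)}{C(t)} = -\frac{V'(t)}{V(t)} .
\]
```

Under this assumption, the optimum is where the proportional rate of change in clutch size balances the proportional rate of decline in recruitment probability. Differences among populations in V(t), in V′(t), or in both would then shift the optimum, consistent with the point that max V(t) and the timing of V(t) = 0 both matter.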

Other possibilities exist, although all of them would require more detailed information on the underlying basis of presumed seasonal declines in recruitment. Such data are not presently available for the populations in this study. Moreover, the details are likely to matter. Consider the Oklahoma population, the southernmost locale in our analysis. Females there started breeding early in the season, but also ceased breeding in early July (Table 1), possibly because the hot and dry conditions of late July and August in Oklahoma increase the costs of breeding for adults. Because of its southern location, however, there is a long delay between breeding and cold weather. Because of this delay, we would expect that late-hatched juveniles should be in less of a time crunch and so should have similar survival to early-hatched juveniles. The Rowe, Ludwig and Schluter (1994) model would thus predict that date would have little effect on clutch size in Oklahoma, but in fact, the Oklahoma birds have the steepest decline with date. This implies that either our presumption about the timing of stressful conditions for juveniles (e.g. when cold weather arrives) is incorrect, or that other processes linked to date are affecting clutch size.

Finally, clutch size is likely to be part of an integrated phenotype (Pigliucci 2003) that includes when a female begins breeding within each season and how many attempts she has. In our analyses, we treated attempt and date as environmental factors, but some of their variation is likely due to variation in female phenotype. For example, the timing of a female's first attempt of the season also exhibits phenotypic plasticity and shows among-individual variance in plasticity (e.g. Brommer et al. 2005; Brommer, Rattiste & Wilson 2007; Husby et al. 2010), which sometimes has a genetic basis (e.g. Charmantier et al. 2008). Earlier breeding in a multibrooded species such as the house sparrow is typically associated with more attempts and earlier initiation dates for each attempt (Anderson 2006). Both date and attempt order in house sparrows are themselves traits that could also be sensitive to environmental conditions. Thus, phenotypic integration of date of first breeding with clutch size is likely to exist. If so, then among-individual variation in plasticity associated with date of first breeding (e.g. spring temperature) could exist. Selection acting on the decision to start breeding may influence reaction norm shape for clutch size during later attempts. A looming challenge then is to assess the level of phenotypic integration within and among populations simultaneously with independent effects of environment on life-history traits such as clutch size.

Data Accessibility

Data available from the Dryad Digital Repository: doi:10.5061/dryad.112cs.

Acknowledgements

We thank the large number of field personnel across eight studies who helped contribute to this data set. We also thank the multiple agencies that supported this work, including the U.S. National Science Foundation (ÁZL, DFW, DM, IRKS and PLS), the Norwegian Research Council (HJ and TK), NERC (TB and JS), the Hungarian Scientific Research Fund (Grants no. T047256, K72827, K84132, PD76862; AL, ÁZL and VB), the Hungarian Scholarship Board (ÁZL), the CRBPA (OC) and the French ANR (GS). The lead author also thanks the University of Kentucky for support during preparation of this manuscript and the Norwegian University of Science and Technology for hosting him during a sabbatical when plans for this paper took shape. We also appreciate the useful suggestions on the manuscript provided by three anonymous reviewers.