#### Field collection and controlled breeding

We collected water moulds from two ponds in Lynn Woods Reservation (Lynn, Massachusetts, USA) in late March 2005, after the ponds thawed and weeks before *B. americanus* began to breed. We sank 20 tea bags filled with sterilized hemp seeds in each pond as water mould baits (Robinson *et al.*, 2003). We retrieved the seeds a week later, plated them on a sterilized cornmeal agar medium in Petri dishes together with fresh sterilized seeds, and incubated them at 11 °C. This procedure resulted in a mixed culture of *Achlya* sp. and *Saprolegnia* sp. water moulds (Gomez-Mestre *et al.*, 2006; Touchon *et al.*, 2006).

On 20 April we collected 30 female *B. americanus* from a single pond, all of which had been amplected already but had not begun to lay eggs, and 15 randomly selected calling males. Males and females were separately distributed in large plastic containers filled with shallow water and leaf litter, and transported to an environmental chamber at Boston University set at 11 °C and a 10 : 14 h light : dark cycle, as observed in the field. We randomly assigned two females to each male and placed them in plastic containers holding 5 L of carbon-filtered tap water plus 1 L pond water. Each container was divided in two with a fine mesh, and females were placed on opposite sides. Overnight, each male was sequentially allowed to amplect each female, while the mesh divider prevented mixing of the clutches. Fourteen males successfully mated with two females each, for a nested series of crosses consisting of 28 full-sib families and 14 half-sib families. This breeding design (North Carolina I) does not require killing adults, which were returned to their pond within 24 h, but neither does it allow partitioning of dominance from maternal effects (Kearsey & Pooni, 1996; Lynch & Walsh, 1998). In this design, the difference between variance components due to sire, and to dam nested within sire, is a function of dominance and epistasis, plus variance caused by maternal environment (Lynch & Walsh, 1998). As there is no maternal care in this species, the latter is probably restricted to differences in egg provisioning.

#### Experimental mould infection

*Bufo americanus* clutches consist of a thin gelatinous string, usually containing over 4000 eggs. We cut 16 segments of 10 eggs each from each clutch, placed them in plastic cups with 115 mL of carbon-filtered tap water, and randomly assigned them to either mould infection or control treatments, for a total of eight replicates per cross in each treatment. For the mould infection treatment, we added one heavily infected hemp seed (i.e. showing hyphal growth) per cup. We added one uninfected, sterilized seed to the control cups. Replicates were arrayed in randomized blocks across four shelves in the environmental chamber, so that two replicates per cross and treatment were allocated to each block. An extra 15-egg segment was taken from each clutch to measure egg diameter. We measured eggs to the nearest 0.01 mm using an ocular micrometer mounted on a dissecting microscope. All replicates were checked daily until all eggs had either hatched or died. We recorded the date of each event (i.e. death or hatching) and preserved the first hatchling from each replicate in 10% formalin. These first hatchlings were later staged, digitally photographed through a dissecting microscope, and measured for total length using ImageJ (version 1.33; National Institutes of Health, Bethesda, MD, USA). Seven experimental units contained eggs that did not develop (probably unfertilized) and were discarded, resulting in 218 control and 223 water mould cups. An additional nine hatchlings (five control, four mould treatment) were poorly preserved, precluding accurate length measurements; for these animals we have only hatching age and stage.
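
The layout described above can be sketched in a few lines of Python. This is only an illustration of the design's bookkeeping: the function name `build_layout`, the seed, and the within-shelf shuffling are assumptions, not the randomization procedure actually used.

```python
import random

def build_layout(n_crosses=28, treatments=("control", "mould"),
                 n_blocks=4, reps_per_block=2, seed=0):
    """Return a list of (block, position, cross, treatment) cups.

    Each block (shelf) receives `reps_per_block` cups per cross x treatment
    combination; cup positions are shuffled within each block (an assumed
    randomization scheme).
    """
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        cups = [(cross, trt)
                for cross in range(1, n_crosses + 1)
                for trt in treatments
                for _ in range(reps_per_block)]
        rng.shuffle(cups)  # randomize positions within the shelf
        layout.extend((block, pos, cross, trt)
                      for pos, (cross, trt) in enumerate(cups, start=1))
    return layout

layout = build_layout()
print(len(layout))  # 28 crosses x 2 treatments x 8 replicates = 448 cups
```

Starting from 448 cups, discarding the seven undeveloped units leaves 441, which matches the 218 control plus 223 water mould cups reported.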

#### Statistical analysis

All statistical analyses were conducted in SAS version 9.12 (SAS Institute, 2003). Sire, dam nested within sire and experimental block were all considered random factors, whereas environment (mould infection or control) was considered a fixed effect. Experimental block never had a significant effect but was included in the models when it increased the goodness of fit [i.e. reduced the Akaike information criterion (AIC)]. We tested for differences in survival between water mould and control environments to assess the pathogenic effect of the water mould. We also tested for ‘sire × environment’ and ‘dam-within-sire × environment’ interactions to test for variation in susceptibility to water mould infection. Survival was analysed by fitting generalized linear mixed models with an underlying binomial distribution and a logit link function using the glimmix macro.

To avoid possible bias in analyses of hatchling traits (age, size, stage) caused by differential mortality between the treatments, in all cases we analysed only the first hatchling from each replicate, rather than averaging across surviving hatchlings. For hatching age and hatchling size (total length), we tested for effects of environment (mould or control), sire, dam-within-sire and, to assess genetic variation in embryo response to infection, sire × environment and dam-within-sire × environment, via general linear mixed models using proc glm. Data on hatchling developmental stages violated parametric assumptions; we therefore used a Kruskal–Wallis analysis to test for environmental effects on hatching stage. To compare the overall variation in hatching age and hatchling size across sibships in each environment, we computed coefficients of variation and estimated their standard errors using 500 nonparametric bootstrapping replicates. Finally, we tested whether egg size was correlated with hatching age and hatchling size within each environment. As dams were nested within sires, we could not include all full-sibships in a single analysis because of potential sire effects on the hatching variables and hence non-independence of the data. We thus calculated Pearson’s product–moment correlations from 14 full-sibships, randomly excluding one of every two dams nested within each sire. The results were robust to which dams were removed, and a similar pattern of correlations resulted if we used sire mean values or the full set of dams (ignoring non-independence of sires) instead.
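
As a rough illustration of the bootstrap step, the sketch below estimates the standard error of a coefficient of variation from 500 nonparametric resamples. The sibship means are hypothetical placeholders, and the helper names are assumptions; the original analysis was run in SAS.

```python
import random
import statistics

def coefficient_of_variation(values):
    """Sample CV in percent: 100 * sd / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def bootstrap_se(values, stat, n_boot=500, seed=1):
    """Standard error of `stat` from nonparametric bootstrap resamples."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_boot):
        # Resample n observations with replacement
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(stat(resample))
    return statistics.stdev(estimates)

# Hypothetical sibship means for hatching age (days); not data from the study.
sibship_means = [5.1, 5.4, 4.9, 5.8, 5.2, 5.0, 5.6, 5.3, 4.8, 5.5]
cv = coefficient_of_variation(sibship_means)
se = bootstrap_se(sibship_means, coefficient_of_variation)
print(f"CV = {cv:.2f}%, bootstrap SE = {se:.2f}")
```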

Genotype × environment interactions may have a variety of biological causes. Thus, we calculated within-environment variance components using restricted maximum likelihood (REML) in proc varcomp; these were in good agreement with estimates obtained through least squares. The variance component associated with differences among sires, assuming epistasis is negligible, estimates one-fourth of the additive genetic variance (*V*_{A}) (Lynch & Walsh, 1998). All families were reared under standardized laboratory conditions, so the variance due to differences among dams nested within sires estimates one-fourth of *V*_{A} plus a non-additive component, *V*_{NA+M} (Via, 1984a). Assuming that epistatic sources of variation were negligible, we estimated this non-additive component of variance as *V*_{NA+M} = *V*_{DAM(SIRE)} − *V*_{SIRE} (Via, 1984a; Lynch & Walsh, 1998). Given our breeding design, this variance component encompasses both non-additive genetic (¼ dominance variance, *V*_{D}) and maternal environmental effects. The residual variance includes the environmental variance (*V*_{E}) plus ½*V*_{A} and ¾*V*_{D} (Kearsey & Pooni, 1996; Laurila *et al.*, 2002).
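
The mapping from observational to causal variance components stated above can be written as a small helper. This is a minimal sketch assuming negligible epistasis; the function name and the input values are hypothetical, and taking *V*_{P} as the sum of the three observational components is a standard convention rather than something stated in the text.

```python
def partition_variance(v_sire, v_dam, v_resid):
    """Causal components from a nested sire/dam model (epistasis assumed negligible).

    v_sire  : variance among sires, estimating 1/4 V_A
    v_dam   : variance among dams within sires, estimating 1/4 V_A + V_NA+M
    v_resid : residual (within-family) variance
    """
    v_a = 4.0 * v_sire             # additive genetic variance
    v_na_m = v_dam - v_sire        # non-additive plus maternal component
    v_p = v_sire + v_dam + v_resid # total phenotypic variance (assumed convention)
    return {"V_A": v_a, "V_NA+M": v_na_m, "V_P": v_p}

# Hypothetical variance components, for illustration only.
components = partition_variance(v_sire=0.02, v_dam=0.05, v_resid=0.13)
for name, value in components.items():
    print(f"{name} = {value:.3f}")
```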

Heritability of hatching age and hatchling size within environment was calculated as:
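
The expression itself is not reproduced here. A standard narrow-sense form consistent with the variance components defined above, offered as a reconstruction rather than the authors' exact notation, would be

$$
h^2 = \frac{V_A}{V_P} = \frac{4\,V_{\mathrm{SIRE}}}{V_{\mathrm{SIRE}} + V_{\mathrm{DAM(SIRE)}} + V_{\mathrm{residual}}}
$$

where taking $V_P$ as the sum of the sire, dam-within-sire and residual components is an assumption.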

To facilitate comparisons of the estimates of genetic variability between infection and control treatments, and to those reported in other studies, we also calculated the coefficient of additive genetic variation (Houle, 1992):
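
The expression is not reproduced here. The coefficient of additive genetic variation of Houle (1992) is usually written as

$$
CV_A = 100 \times \frac{\sqrt{V_A}}{\bar{X}}
$$

where $\bar{X}$ is the trait mean in the environment considered; the symbol $\bar{X}$ is our notation.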

We tested for differences in CV_{A} between environments using the two-tailed test for differences between coefficients of variation described by Zar (1999).
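
For orientation, the sketch below implements the large-sample Z test for two coefficients of variation as it is commonly presented from Zar (1999). It assumes CVs are expressed as proportions rather than percentages, and the sample sizes passed in are hypothetical; which sample size was used for CV_{A} in the original analysis is not stated.

```python
import math

def cv_difference_z(cv1, n1, cv2, n2):
    """Two-tailed Z test for the difference between two CVs (as proportions)."""
    df1, df2 = n1 - 1, n2 - 1
    # Pooled CV, weighted by degrees of freedom
    cv_pooled = (df1 * cv1 + df2 * cv2) / (df1 + df2)
    se = math.sqrt((cv_pooled**2 / df1 + cv_pooled**2 / df2)
                   * (0.5 + cv_pooled**2))
    z = (cv1 - cv2) / se
    # Two-tailed p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical CVs and sample sizes, for illustration only.
z, p = cv_difference_z(cv1=0.12, n1=14, cv2=0.07, n2=14)
print(f"Z = {z:.3f}, two-tailed P = {p:.3f}")
```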

We estimated the heritability of trait plasticities for hatching age and hatchling size across environments as (Groeters, 1988; Relyea, 2005):
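
The expression is not reproduced here. In a sire model, heritability of plasticity is commonly written as the additive share of the genotype × environment variance, which under the definitions above would take the form (a reconstruction, not necessarily the authors' exact expression)

$$
h^2_{pl} = \frac{4\,V_{\mathrm{SIRE} \times \mathrm{E}}}{V_P}
$$

where $V_{\mathrm{SIRE} \times \mathrm{E}}$ is the sire × environment variance component and $V_P$ is the total phenotypic variance across both environments; both symbols are our notation.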

Standard errors for variance components and heritability were calculated using bootstrap sampling with 500 replicates. We calculated cross-environment additive genetic correlation for hatchling size from the REML-estimated variance components as (Via, 1984b; Fox *et al.*, 1999):
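
The expression is not reproduced here. A form consistent with the description that follows, in which the numerator comes from the across-environment model and the denominator from the two within-environment models, would be (a reconstruction in our notation)

$$
r_A = \frac{V_{\mathrm{SIRE}}}{\sqrt{V_{\mathrm{SIRE,C}}\; V_{\mathrm{SIRE,M}}}}
$$

where $V_{\mathrm{SIRE}}$ is the sire variance component estimated across environments, and $V_{\mathrm{SIRE,C}}$ and $V_{\mathrm{SIRE,M}}$ are the sire components estimated within the control and mould environments, respectively.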

We estimated its standard error using a jackknife procedure across sires (Laurila *et al.*, 2002) as the estimate required variance components from different models (within and across environments). This method for calculating cross-environment genetic correlation was not applicable to hatching age because the additive variance component in the infection treatment equalled zero. Instead, we estimated its cross-environment correlation by calculating Pearson’s product–moment correlation among sire means (Roff, 1997; Laurila *et al.*, 2002; Relyea, 2005). We calculated standard errors for Pearson’s correlations using a bootstrap procedure across sires with 500 replicates.
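
A delete-one-sire jackknife can be sketched generically as below. Because REML variance components are beyond a short standard-library example, the estimator used here is a Pearson correlation among sire means, as a stand-in; the sire means are hypothetical, and the function names are assumptions.

```python
import math
import statistics

def pearson(pairs):
    """Pearson product-moment correlation for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def jackknife_se(groups, estimator):
    """Jackknife estimate and standard error, deleting one group (sire) at a time."""
    n = len(groups)
    full = estimator(groups)
    pseudovalues = []
    for i in range(n):
        reduced = groups[:i] + groups[i + 1:]  # drop sire i
        pseudovalues.append(n * full - (n - 1) * estimator(reduced))
    estimate = statistics.mean(pseudovalues)
    se = statistics.stdev(pseudovalues) / math.sqrt(n)
    return estimate, se

# Hypothetical (control, mould) sire means for hatchling length in mm.
sire_means = [(7.9, 7.1), (8.2, 7.6), (8.0, 7.0), (8.5, 7.8),
              (7.7, 7.2), (8.3, 7.4), (8.1, 7.5)]
r_jack, r_se = jackknife_se(sire_means, pearson)
print(f"jackknife r = {r_jack:.3f}, SE = {r_se:.3f}")
```

Applying the same `jackknife_se` to an estimator that refits the variance components on each reduced set of sires would give the standard error for a ratio built from components of different models.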