DEVELOPMENTAL INSTABILITY IS GENETICALLY CORRELATED WITH PHENOTYPIC PLASTICITY, CONSTRAINING HERITABILITY, AND FITNESS

Although adaptive plasticity would seem always to be favored by selection, it occurs less often than expected. This lack of ubiquity suggests that there must be trade-offs, costs, or limitations associated with plasticity. Yet, few costs have been found. We explore one type of limitation, a correlation between plasticity and developmental instability, and use quantitative genetic theory to show why one should expect a genetic correlation. We test that hypothesis using the Landsberg erecta × Cape Verde Islands recombinant inbred lines (RILs) of Arabidopsis thaliana. RILs were grown at four different nitrogen (N) supply levels that span the range of N availabilities previously documented in North American field populations. We found a significant multivariate relationship between the cross-environment trait plasticity and the within-environment, within-RIL developmental instability across 13 traits. This genetic covariation between plasticity and developmental instability has two costs. First, theory predicts diminished fitness for highly plastic lines under stabilizing selection, because their developmental instability and variance around the optimum phenotype will be greater compared to nonplastic genotypes. Second, empirically the most plastic traits exhibited heritabilities reduced by 57% on average compared to nonplastic traits. This demonstration of potential costs in inclusive fitness and heritability provokes a rethinking of the evolutionary role of plasticity.

Variation during development among genetically identical individuals can have two causes: developmental sensitivity to external factors, most often termed phenotypic plasticity, and variation due to internal factors, termed developmental instability or developmental noise. Both processes affect the mapping of genotype to phenotype, in turn affecting individual fitness. In a heterogeneous environment, plastic responses that allow individuals to track and manifest the optimal phenotype can increase individual fitness.
In contrast, developmental instability creates deviations from the optimum phenotype that reduce both fitness under stabilizing selection (Gavrilets and Hastings 1994) and trait heritability. Conversely, developmental instability can be a form of bet-hedging in unpredictable environments (Seger and Brockman 1987) and can increase adaptive evolution in the face of changing environments (Rutherford and Lindquist 1998; Masel 2006).
Of these two sources of variation, phenotypic plasticity is the more widely studied (Pigliucci 2001; DeWitt and Scheiner 2004). Phenotypic plasticity is a repeatable phenotypic association between genotype and environment (Bradshaw 1965) that contributes to evolvability in manifold ways (Lynch and Walsh 1998, chapter 6). However, a major question remains regarding the role of plasticity in adaptive evolution. Any genotype whose plastic response is always toward the optimum within each environment should replace all less plastic genotypes. Yet, adaptive plasticity is less common than many expect, leading to a persistent problem: what constrains the evolution of adaptive plasticity (DeWitt et al. 1998)?
Constraints on the evolution of adaptive plasticity have been placed in two categories: costs and limitations (DeWitt et al. 1998). We suggest that plasticity necessarily carries with it an increase in the range of phenotypes produced by a given genotype because of an increase in developmental noise. This increase may be a limitation when it reduces the capacity of a genotype to match the optimal phenotype, leading to a short-term decrease in fitness. Increased developmental noise also lowers heritability, leading to a decrease in the ability to adapt.
We follow Gavrilets and Hastings (1994) in distinguishing between macro- and microenvironmental variation and phenotypic responses to each. Macroenvironmental effects (changes in the environment at scales of habitat patches or generational time periods) have been the primary focus of research on phenotypic plasticity. Much less attention has been paid to microenvironmental effects: very small differences at the spatial scale of individuals or the temporal scale of day-to-day fluctuations. We suggest, as have others, that microenvironmental sensitivity is a cause of increased developmental instability and that high macroenvironmental plasticity is accompanied by increased microenvironmental sensitivity.
To avoid confusion, we will use the word "sensitivity" when referring to microenvironmental effects on development and "plasticity" when referring to macroenvironmental effects, although both are a manifestation of the same responsiveness to the environment.
Developmental instability is apparently random variation during the process of development, a characteristic that all organisms exhibit to some degree (McAdams and Arkin 1997). It is typically measured as either asymmetry of homologous parts of a single individual or as variation among replicates of a genotype raised in a single environment. Despite its importance, because of technical challenges in its measurement, developmental instability remains among the most causally opaque sources of phenotypic variation. In this article, we examine one postulated cause for developmental instability by testing the hypothesis that it is associated with environmental sensitivity and plasticity.
The idea that sensitivity and plasticity could be related to developmental instability is not new. Theories of the evolution of plasticity have postulated a relationship (Gavrilets and Hastings 1994; Wagner et al. 1997; Lynch and Walsh 1998; Badyaev 2005), and it is one possible limitation on plasticity (DeWitt et al. 1998). Given the central importance of heritability, plasticity, and developmental instability in the evolutionary process, the potential connection between macroenvironmental plasticity and microenvironmental sensitivity is an important question. Tests of the relationship between sensitivity and instability are nevertheless difficult and few. Early studies for the most part failed to find any relationship, as reviewed by Scheiner et al. (1991), who themselves found mixed evidence in a study of Drosophila melanogaster. Likewise, Perkins and Jinks (1973) found mixed evidence in tobacco, Nicotiana rustica. The hypothesized relationship between plasticity and instability thus remains incompletely resolved.
In this article, we examine the relationships between plasticity, sensitivity, and instability of the 13-trait multivariate phenotype of a recombinant inbred line (RIL) population of Arabidopsis thaliana, examined across a gradient in nitrogen supply rate.

PARTITIONING PHENOTYPIC VARIATION
In the classic quantitative genetic framework of the effects of plasticity on the evolvability of traits, one first identifies an environmental effector of phenotype and a scale of effector variation that is relevant either to fitness in natural populations or to economic considerations in domesticated organisms. Here, we assume that an environmental effector of phenotype, θ, can be described as a continuous function. An organism's response to θ is likewise a continuous function, so that both macro- and microscale responses to θ (our plasticity and sensitivity) are obtainable from the same functional description of the plastic response, ζ. Although we assume that plasticity and sensitivity represent a single function, that is, are genetically correlated (Gavrilets and Hastings 1994), we acknowledge that they may differ in some systems, such as threshold responses in development or gene expression.
If a set of genotypes is grown at distinct levels of the environmental effector, the phenotypic value of an individual, $z_{ijk}$, can be decomposed into five causal components, assuming no genotype-environment correlation, with their associated variance components (Lynch and Walsh 1998):

$$z_{ijk} = \mu + G_i + E_j + I_{ij} + \varepsilon_{ijk} \tag{1a}$$

$$\sigma^2_P = \sigma^2_G + \sigma^2_E + \sigma^2_I + \sigma^2_\varepsilon \tag{1b}$$

where μ is the grand phenotypic mean across environments and genotypes; $G_i$ is the deviation associated with genotype (or lineage) i, averaged across environments; $E_j$ is the mean deviation within environment j, averaged across genotypes (a measure of plasticity); and $I_{ij}$ is the mean deviation of the ith genotype from the mean phenotype in environment j as predicted by ($\mu + G_i + E_j$) (a measure of genetic variation in plasticity). Finally, $\varepsilon_{ijk}$ is the residual deviation from μ that cannot be explained by the explicit causal components.
If the intent of the partitioning in equation (1), as in its introduction by Comstock and Moll (1963), is to understand the causes of differences in phenotype observed between phylogenetically, spatially, or ecologically differentiated groups of organisms in the context of discrete differences among environments, then this approach is appropriate. If instead one is interested in understanding the interaction between genes, development, metabolism, and the environment as causes of and limits on phenotypic variation in general, the characterization of any unmeasured association between environmental variation and phenotypic variation as "error" is an error. The error variance is often the largest component in the denominator of the heritability ratio. In a nonsystematic survey of heritability in wild plant populations, comprising 10 studies and 71 heritability estimates, the error variance on average accounted for 74% (±3% SE) of the total phenotypic variance (Table S1). Thus, a failure to understand the components and causes of this variation can severely limit our ability to understand evolutionary dynamics. Debat and David (2001) point out that many of the terms we use in studies of both development and adaptation to environmental variation (e.g., plasticity, genotype-by-environment interaction, developmental instability) do not take into account recent advances in our understanding of underlying physiological and developmental processes. In addition, the often separate threads of discourse connected to each of the terms can obscure their interwoven nature. A prime example is the use of "error" for the residual phenotypic variation, originating in statistical genetics (Fisher 1918).

PARTITIONING THE "ERROR" TERM
The residual term ($\varepsilon_{ijk}$ of eq. 1a) itself contains multiple types of variation. First, it contains measurement error, generally estimable through repeated measures of the same individual (Van Dongen 1999). Second, it contains the truly stochastic component of developmental noise (Palmer and Strobeck 1986; Klingenberg and Nijhout 1999; Kilfoil et al. 2009). (Kilfoil et al. (2009) proposed that this can be partitioned as a separate variance component $V_S$.) This putative developmental stochasticity can act as a bet-hedging adaptation in unpredictable environments (e.g., Kaplan and Cooper 1984; Kussell and Leibler 2005) and may be important in determining alternative states of higher order biological phenomena (Kilfoil et al. 2009; Oates 2011). Lastly, $\varepsilon_{ijk}$ also contains all of the environmental influences on phenotype that are not accounted for by the measured environment, either because the measure is a proxy for the true environmental cause, or because the variation in phenotype is in response to environmental variation at a scale smaller than that being manipulated or measured. This third component is "error" only in the sense that the researcher has ignored (i.e., not measured) the scale of environmental variation needed to understand the deterministic causes of variation at this scale. It is more accurate to refer to error variance as "ignorance variance." In this article, we aim to reduce that ignorance.
The residual, ignorance, or error variance ($\sigma^2_\varepsilon$ of eq. 1b) can be further partitioned into four components. The microenvironmental variation (at the scale of the individual) included in $\varepsilon_{ijk}$ has two components analogous to $E_j$ and $I_{ij}$ (eq. 1a). To avoid confusion between micro- and macroscales, we make two notational distinctions. First, the lowercase letter "l" (el) will be used to designate levels of microenvironmental variation. Second, the among-individual microenvironmental phenotypic effects will be referred to by using lowercase letters, whereas capital letters will be used for the same effects at a macroenvironmental scale, that is, $e_l$ versus $E_j$ and $i_{il}$ versus $I_{ij}$, respectively. The residual deviation can then be decomposed into four components, each with an associated variance component:

$$\varepsilon_{ijk} = m_k + S_k + e_l + i_{il}$$

$$\sigma^2_\varepsilon = \sigma^2_m + \sigma^2_S + \sigma^2_e + \sigma^2_i$$

where $m_k$ is measurement error associated with individual k, $S_k$ is the deviation resulting from stochastic developmental events that are peculiar to individual k (Graham et al. 1993; Klingenberg and Nijhout 1999; Kaern et al. 2005; Kilfoil et al. 2009), $e_l$ is the average phenotypic deviation due to the microenvironment l experienced by individuals in microenvironment l, and $i_{il}$ is the deviation of the ith genotype's response to the lth microenvironment from the average response to microenvironment l.
The population-level microenvironmental sensitivity
The deviation due to the microenvironment, $e_l$, can be expressed as the product of the microenvironmental deviation experienced by individual k ($\theta_l$) and the local slope of the reaction norm within macroenvironment j ($\zeta_j$). The variance in $e_l$ among replicates of a given genotype ($\sigma^2_e$) depends in turn on the slope of the reaction norm and the variation of microenvironments around $\theta_l$ ($\sigma^2_l$) that individuals of genotype i experience. If the reaction norm is flat in macroenvironment j (as is the case with regard to plasticity to resource supply when the supply is saturating), or the environment is constant through time and space, we expect $\sigma^2_e$ to be zero. If the slope of the reaction norm deviates substantially from zero in the vicinity of j, we expect $\sigma^2_e$ to be an important contributor to $\sigma^2_\varepsilon$.
The genotype-specific microenvironmental sensitivity
Developmental instability can be measured as the variance among replicates of a genotype (Kaneko 2012). When genotypes differ in the slopes of their reaction norms, microenvironmental variation affects replicates of genotypes differently. As a result, different plasticities among genotypes are associated with different among-replicate, within-genotype variances. The deviation of genotype i's response from $e_l$, $i_{il}$, results from the difference between the slope of genotype i's reaction norm and the pooled reaction norm slope ($\zeta - \zeta_i$). The variance $\sigma^2_i$ is a function of two factors: the deviation of the slope of the reaction norm of the ith genotype from the average reaction norm slope, and the variation in microenvironments among individuals of genotype i. When $\sigma^2_e$ contributes to $\sigma^2_\varepsilon$, $\sigma^2_i$ will also contribute to $\sigma^2_\varepsilon$, provided that the slopes of the reaction norms vary among genotypes.
We therefore hypothesize that there is a covariance between the absolute value of the local slope of the reaction norm and the variance among genotypic replicates within an environment, a measure of developmental instability.
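The hypothesized covariance can be illustrated with a small simulation. This is only a sketch under stated assumptions: each replicate's phenotypic deviation is taken to be the genotype's local reaction norm slope times a mean-zero microenvironmental deviation, plus slope-independent stochastic noise. The function names, parameter values, and the range of slopes are hypothetical and do not come from the study.

```python
import random
import statistics


def within_genotype_variance(slope, rng, sigma_l=0.5, sigma_s=0.2, n_reps=200):
    """Variance among replicates of one genotype in one macroenvironment.

    Assumed model (illustrative only): deviation = slope * theta + noise,
    where theta ~ N(0, sigma_l) is the microenvironmental deviation and
    noise ~ N(0, sigma_s) is stochastic developmental noise.
    """
    phenotypes = [
        slope * rng.gauss(0.0, sigma_l) + rng.gauss(0.0, sigma_s)
        for _ in range(n_reps)
    ]
    return statistics.variance(phenotypes)


def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)
slopes = [-2.0 + 0.1 * i for i in range(41)]  # hypothetical genotype slopes
variances = [within_genotype_variance(s, rng) for s in slopes]

# Correlation between |local slope| and among-replicate variance
r = pearson_r([abs(s) for s in slopes], variances)
print(f"r(|slope|, within-genotype variance) = {r:.2f}")
```

Under this model the expected within-genotype variance is slope² × sigma_l² + sigma_s², so genotypes with steeper reaction norms, in either direction, show larger among-replicate variance.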

EVOLUTIONARY CONSEQUENCES
A consequence of two populations differing in plasticity is a difference in the relative magnitudes of genetic and nongenetic components of the total phenotypic variance. The variance components appear in two quantities central to evolutionary dynamics: heritability and fitness. Heritability within a macroenvironment is the ratio of genetic variance $\sigma^2_G$ to total phenotypic variance. Equation (1b) collapses to two terms for a single macroenvironment, $\sigma^2_P = \sigma^2_G + \sigma^2_\varepsilon$, and heritability is

$$h^2 = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_\varepsilon}$$

Because the denominator of $h^2$, the total phenotypic variance, includes the residual variance $\sigma^2_\varepsilon$, it will increase with increasing microenvironmental sensitivity, $\sigma^2_\theta = \sigma^2_e + \sigma^2_i$, and heritability will decrease. Therefore, one limitation to the evolution of adaptive plasticity may be an indirect cost of reduced heritability.
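A short worked example may make the dependence of heritability on microenvironmental sensitivity concrete. The numerical values below are purely hypothetical and chosen for easy arithmetic; the only assumption is that the residual variance is the sum of the measurement, stochastic, and microenvironmental components described in the partitioning of the residual term.

```latex
% Heritability with the residual variance written out by component
h^2 = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_m + \sigma^2_S + \sigma^2_\theta}

% Hypothetical values: \sigma^2_G = 1 and \sigma^2_m + \sigma^2_S = 1
% Flat reaction norm (\sigma^2_\theta = 0):
h^2 = \frac{1}{1 + 1 + 0} = 0.50

% Steep reaction norm (\sigma^2_\theta = 2):
h^2 = \frac{1}{1 + 1 + 2} = 0.25
```

With the genetic variance held constant, the added microenvironmental variance alone halves the heritability in this example.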
The mean fitness of a set of genotypic replicates under stabilizing selection depends on s, the strength of stabilizing selection; on $\bar{z} - z_0$, the difference between the mean phenotype across genotypic replicates and the optimal phenotype; and on the variance components defined above. If the mean genotypic value is equal to the macroenvironment's phenotypic optimum, then fitness decreases as a function of the local slope of the reaction norm. Thus, when a population is genetically optimized under stabilizing selection, the microenvironmental "noise" associated with microenvironmental sensitivity decreases fitness by pushing the phenotypes of individuals away from the optimum. Plasticity, although favored in a fluctuating environment, would therefore be selected against in a constant environment to the extent that it results in developmental instability.
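One way to see why replicate variance lowers mean fitness is to assume a quadratic fitness function, a common textbook approximation to stabilizing selection. The form below is an assumption adopted for illustration and is not necessarily the expression used in the original analysis.

```latex
% Assumed individual fitness under quadratic stabilizing selection
w(z) = 1 - s\,(z - z_0)^2

% Averaging over replicates of a genotype with mean \bar{z} and
% among-replicate variance \sigma^2_\varepsilon, and using
% E[(z - z_0)^2] = (\bar{z} - z_0)^2 + \sigma^2_\varepsilon :
\bar{W} = 1 - s\left[(\bar{z} - z_0)^2 + \sigma^2_\varepsilon\right]

% When the genotypic mean sits on the optimum (\bar{z} = z_0):
\bar{W} = 1 - s\,\sigma^2_\varepsilon
```

Under this assumption, any increase in the residual variance, including the part generated by microenvironmental sensitivity, reduces mean fitness in direct proportion to the strength of selection.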
We therefore need to understand the magnitude of these effects of developmental noise and environmental sensitivity.
In this study, we test the hypothesis that plasticity is related to developmental instability, using a RIL population of A. thaliana as a model system. We further ask if through this mechanism plasticity may carry two costs: at the individual level a lower developmental stability and at the population level a lower heritability.

ENVIRONMENTAL GRADIENT
We used nitrogen (N) supply rate as an environmental effector of phenotype. Plastic responses to N have been found for total growth (Glass 1989; Epstein and Bloom 2005), photosynthetic and leaf economic properties (Reich et al. 2003; Wright 2004), multiple developmental attributes (Redinbaugh and Campbell 1991; Stitt 1999), allocation patterns within plants, including changes in root structure (Walch-Liu et al. 2006), and allocation to roots (Reynolds and D'Antonio 1996; Poorter and Nagel 2000). Thus, variations in N supply rates are both ecologically important and likely to elicit plastic responses across a broad variety of traits. Loudet et al. (2003, 2005)

GROWTH CONDITIONS AND EXPERIMENTAL DESIGN
Each RIL was grown in each of four N supply rate treatments. Within each N treatment, RILs were randomly assigned to tray locations. The racks of plants for each N treatment were randomly assorted within each chamber, and each chamber contained all N treatments. Plants were grown in Ray Leach SC10 Supercell Conetainer® 164 mL plastic pots (http://www.stuewe.com/products/rayleach.html) filled with washed Turface® MVP (www.turface.com) fritted clay and placed in Conetainer racks, 24 per rack, in fiberglass bins that were as deep as the racks were high. Seeds were placed directly on the Turface and immediately moistened. After five days of dark moist stratification at 4 °C to break dormancy, racks and bins were placed in one of four Conviron growth chambers, two PGW 36 chambers and two custom chambers with Conviron controls. Light was supplied at 270 μmol photons m⁻² s⁻¹ for 16 h of each 24 h period with a combination of VHO fluorescent and incandescent bulbs. Illumination began at 100 μmol m⁻² s⁻¹, increased to 270 μmol m⁻² s⁻¹ after the first hour, and was reduced to 100 μmol m⁻² s⁻¹ for the final hour. Twenty percent of chamber light bulbs were changed every three weeks to maintain constant light intensity and spectral quality throughout the study. Temperature changed gradually from a low of 15 °C predawn to 22 °C 3 h prior to the end of illumination each day.
Water was supplied daily with an automatic ebb and flood system. Conviron controllers initiated flooding of the bins at simulated dawn each day. A standpipe in each bin assured uniform water column height. Water column height was maintained for 45 min; the controller then opened a drain solenoid and released the water. The supply of P, K, and micronutrients was ad libitum for all plants. N was added as NO₃⁻ at one of four concentrations: 1, 6, 51, or 56 ppm. Background NO₃⁻ concentration in the water supply contributed on average an additional 1 ppm. Nutrients were supplied using four Dosatron® D25RE2 (www.dosatronusa.com) adjustable nutrient apportioners. Supply rate was calibrated weekly using an ion-specific probe to measure NO₃⁻ concentration as the water entered the bins. Actual NO₃⁻ concentrations were maintained within ±1-2 ppm of target for the two lower concentrations and within ±2-3 ppm for the two higher concentrations. The water supply was temperature controlled with a thermostatic mixing valve. Plants were automatically misted three times per day for the first 10 days of growth to prevent salt buildup, which can be fatal for seedlings of A. thaliana. Misting was not completely effective in eliminating this problem, leading to some imbalance among RILs in the number of replicates within N treatments.
Because the chambers were located in a high-use building, ambient CO₂ concentration was as high as 500 ppm and varied with season. Arabidopsis thaliana exhibits a strong, and in this study undesirable, plastic response to CO₂ concentration in the 355-530 ppm range in these growth conditions and chambers (Tonsor and Scheiner 2007). We therefore maintained chamber CO₂ concentration at 500 ppm throughout the experiment. CO₂ concentration was monitored and controlled with LiCor Gashounds® (www.licor.com), integrated with and controlled by the Conviron chamber controllers. Ethylene-free CO₂ was injected by the controller as necessary to maintain CO₂ concentration.
After 21 days, one full set of replicate RILs within each N treatment was sacrificed to obtain measures of dry mass, above- and belowground allocation, and N content. The remaining plants were grown to maturity and harvested at day 70 of growth.

TRAIT MEASUREMENT
Because our goal was to provide a general test of the relationship between trait plasticity and developmental instability, we chose traits that span a range of levels of biological integration from simple elemental composition, through instantaneous physiological measures, mass and mass partitioning, to integrative traits associated with reproductive performance, 13 measures in total. We measured the number of days from germination to bolting. CO₂ and water exchange rates were measured 29 days after stratification as whole plant instantaneous carbon gain and transpiration rate at 270 μmol photons m⁻² s⁻¹ and 500 ppm CO₂ using whole plant gas exchange cuvettes connected to a LiCor 6400 IRGA (for details see Tonsor and Scheiner 2007; Earley et al. 2009). On the same day, we measured the fluorescence-based photosynthetic dark-adapted maximum quantum efficiency ($F_v/F_m$), steady-state quantum yield [$(F_m - F_t)/F_m$], and photosynthetic electron transport rate (see Maxwell and Johnson (2000) for an explanation of the parameters) using a Walz PAM-2000 fluorometer (www.walz.com). At day 70, plants were separated into root, rosette (short-shoot and associated leaves), and inflorescences. All parts were dried at 65 °C and weighed to the nearest milligram. We recorded total dry mass, total number of fruits, total length of all scape branches, average fruit length per plant, and average number of fruits per 5 cm of scape length. Our proxies for fitness were total fruit number (total scape length multiplied by the number of fruits per unit scape length) and the summed fruit length (total fruit number multiplied by average fruit length; Mauricio and Rausher 1997). All plants were ground to a powder with Wiley Mills (www.thomassci.com/wileymill), homogenized, and subsampled for measurement of whole-plant %N, %C, and %H using a Perkin-Elmer 2400 elemental analyzer (www.perkinelmer.com).
The variance among RILs within an N supply environment is a measure of broad-sense genetic variance, $V_G$. Because A. thaliana is nearly entirely selfing, a broad-sense genetic variance best quantifies inheritance. The difference in mean phenotypes between N supply environments for each RIL is a measure of that RIL's plasticity across the N supply environments. Finally, the variance among replicate plants within RILs in a given N supply environment is a measure of that RIL's developmental instability in that environment.

ANALYSES
To examine the association between phenotypic plasticity and developmental instability, we conducted univariate and multivariate analyses of variance and correlations using SAS® v9.1.3 (SAS Institute 2005). We conducted a multivariate analysis of variance (MANOVA) using a general linear model (SAS Proc GLM) to test for variation among chambers. The combination of specific chamber identities and the specific runs of the chambers had a significant effect on phenotype. We therefore removed the chamber residual prior to all subsequent analyses.
Plasticity to N supply rate was measured as the difference in mean trait value across N treatments. Our four N supply rates establish three potential scales for measuring plasticity: from 1 to 6 ppm, from 6 to 51 ppm, and from 51 to 56 ppm. Plasticities of a given RIL were calculated as the difference between the RIL mean in two environments divided by the RIL mean in the environment with the lower N supply rate (hereafter mean-standardized plasticity or simply plasticity). For example, plasticity across the 1-6 ppm environments was calculated as $(z_6 - z_1)/z_1$, where $z_1$ and $z_6$ are the RIL means in the two environments. Plasticities are thus scaleless and proportional to the phenotype in the lower of the two N treatments. To determine the extent to which cross-treatment N supply rate gradients produced significant overall plasticity, we conducted separate MANOVAs (SAS Proc GLM) comparing the two lowest, the two intermediate, and the two highest N treatments. In these and subsequent analyses, RIL genotype was considered a random effect and N supply was treated as a fixed effect.
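The mean-standardized plasticity calculation can be sketched in a few lines. This is an illustration only, not the SAS code used in the study; the record layout, the field names `ril`, `n_ppm`, and `mass`, and the data values are hypothetical.

```python
from statistics import fmean


def ril_means(records, trait):
    """Mean trait value for each (RIL, N treatment) combination.

    `records` is a list of dicts, one per plant, with hypothetical keys
    'ril', 'n_ppm', and the trait name.
    """
    groups = {}
    for rec in records:
        groups.setdefault((rec["ril"], rec["n_ppm"]), []).append(rec[trait])
    return {key: fmean(values) for key, values in groups.items()}


def plasticity(means, ril, low, high):
    """Mean-standardized plasticity: (z_high - z_low) / z_low."""
    z_low = means[(ril, low)]
    z_high = means[(ril, high)]
    return (z_high - z_low) / z_low


# Hypothetical replicate plants of one RIL in the 1 and 6 ppm treatments
records = [
    {"ril": "A", "n_ppm": 1, "mass": 2.0},
    {"ril": "A", "n_ppm": 1, "mass": 4.0},
    {"ril": "A", "n_ppm": 6, "mass": 6.0},
    {"ril": "A", "n_ppm": 6, "mass": 9.0},
]
means = ril_means(records, "mass")
print(plasticity(means, "A", 1, 6))  # (7.5 - 3.0) / 3.0 = 1.5
```

Dividing by the mean in the lower N treatment is what makes the measure scaleless, so plasticities of traits measured in different units can be compared.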
To further characterize response to N supply for each trait, we performed univariate analyses (SAS Proc Mixed). Each mixed model was run twice, once including RIL in the model statement and once excluding it. A log odds difference (LOD) test with one degree of freedom was used to determine the explanatory significance of the variance among RILs. Fixed effects tests were performed using F-ratios. Transformations were performed where necessary to meet assumptions of parametric significance tests.
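The comparison of models with and without RIL can be sketched as a one-degree-of-freedom test on the difference in log-likelihoods. This sketch assumes that the test statistic is twice the log-likelihood difference and that it is referred to a chi-square distribution with one degree of freedom; the source does not spell out the exact computation, and the log-likelihood values below are hypothetical.

```python
import math


def lrt_1df(loglik_full, loglik_reduced):
    """Return (statistic, p-value) for a 1-df comparison of nested models.

    Assumes statistic = 2 * (logL_full - logL_reduced) ~ chi-square(1).
    For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods from fits with and without the RIL term
stat, p = lrt_1df(loglik_full=-120.0, loglik_reduced=-123.5)
print(f"statistic = {stat:.2f}, p = {p:.4f}")
```

Using `math.erfc` keeps the sketch free of external dependencies; it relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.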
We estimated developmental instability for each RIL in each of the four N treatments as the variance among replicate plants within RILs. Because plasticity was estimated for adjacent N treatments only, there were six combinations of plasticities and within-RIL variances (Table 1). Because each test used only the developmental instability estimates within an N supply treatment, there was no need to standardize within-RIL variances by trait means for the canonical correlation analyses described below.
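The instability estimate and its pairing with plasticity can be sketched as follows. The pairing rule used here, each adjacent-treatment plasticity matched with the within-RIL variance in each of its two flanking treatments, is an assumption about how the six combinations arise; the function names and replicate values are hypothetical.

```python
from statistics import variance

N_LEVELS = [1, 6, 51, 56]  # N supply treatments, ppm


def within_ril_variance(replicate_values):
    """Developmental instability: sample variance among replicate plants
    of one RIL in one N treatment (n - 1 denominator)."""
    return variance(replicate_values)


def plasticity_instability_pairs(levels):
    """Pair each adjacent-treatment plasticity interval with the
    instability measured in each of its two flanking treatments
    (assumed pairing rule)."""
    pairs = []
    for low, high in zip(levels, levels[1:]):
        for env in (low, high):
            pairs.append(((low, high), env))
    return pairs


pairs = plasticity_instability_pairs(N_LEVELS)
for interval, env in pairs:
    print(f"plasticity {interval[0]}-{interval[1]} ppm vs instability at {env} ppm")

# Hypothetical replicate values for one RIL in one treatment
print(within_ril_variance([2.0, 4.0, 6.0]))  # 4.0
```

Three adjacent intervals times two flanking treatments gives the six combinations.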
The central hypothesis of this study is that greater plasticity is associated with greater developmental instability. To examine this hypothesis, we used canonical correlation to compare the vector of trait plasticities with the vector of trait developmental instabilities. The canonical correlation between the two vectors indicates the extent to which the two vectors predict each other. A separate canonical correlation was conducted for each combination of an across-N-supply plasticity vector and a within-N-supply developmental instability vector. We also examined these relationships for each trait separately using a simple linear regression of the within-RIL variance on mean-standardized RIL plasticity.
We expected that as a consequence of the hypothesized association between greater plasticity and greater developmental instability, we would also observe an effect of plasticity on heritability. Heritabilities could be affected by changes in either the genetic or the microenvironmental variance components. We compared heritability to within-treatment standardized genetic and nongenetic variance components as the coefficients of variation, $CV_G$ and $CV_e$, respectively, calculated as the standard deviations divided by the within-treatment grand trait means. We conducted three regressions, one each of $h^2$, $CV_G$, and $CV_e$ on plasticity, allowing us to determine the effect of plasticity on heritability and the cause of that effect.
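For readers unfamiliar with canonical correlation, the computation can be sketched with NumPy. This is not the SAS procedure used in the study; it is a minimal sketch using the standard QR-plus-SVD formulation, and the simulated plasticity and instability matrices (rows are RILs, columns are traits) are hypothetical.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and Y.

    Centers both matrices, takes an orthonormal basis of each column
    space by QR decomposition, and returns the singular values of
    Qx^T Qy. Assumes more rows than columns and full column rank.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    qx, _ = np.linalg.qr(Xc)
    qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against rounding slightly above 1


# Hypothetical data: 100 RILs, 3 traits; instability loosely tracks plasticity
rng = np.random.default_rng(0)
plasticity_matrix = rng.normal(size=(100, 3))
instability_matrix = 0.6 * np.abs(plasticity_matrix) + rng.normal(scale=0.5, size=(100, 3))

cc = canonical_correlations(plasticity_matrix, instability_matrix)
print("squared canonical correlations:", np.round(cc ** 2, 2))
```

The first squared canonical correlation is the largest share of variance that any linear combination of one trait set can share with a linear combination of the other, which is why it serves as a single multivariate summary across all 13 traits.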
Heritabilities were estimated as the ratio of the among-RIL variance component to the total variance, a broad-sense heritability measure. SAS Proc Varcomp was used to produce the variance component estimates. A SAS macro program bootstrap resampled RILs 5000 times to obtain confidence limits on the genetic variance components and heritability. Heritability was regressed on plasticity for all trait-environment combinations in which the heritability's 95% bootstrap confidence interval was bounded away from zero. Many of the plasticity and heritability measures used in this analysis were correlated with each other; for example, the heritabilities of many traits were highly correlated across at least some of the four N supply environments. Although the regression slope represents a legitimate estimate of the relationship between heritability and plasticity, this lack of independence among pairs of measures prevents the use of parametric testing because the appropriate number of degrees of freedom cannot be determined. Instead, we used a SAS macro to bootstrap resample and estimate the likelihood of obtaining a zero slope given the data. Our estimate of this probability is the proportion of bootstrapped simple linear regression slope estimates greater than or equal to zero out of 5000 bootstraps. (All SAS macros were written by ST and are available at www.tonsorlab.pitt.edu.) For all traits in all environments for which the genetic variance component was bounded above zero by the bootstrap confidence intervals, we regressed $CV_G$ and $CV_e$ on the trait's mean-standardized plasticity (Table 1). As with the regression of heritability on plasticity, the parametric test of the slope's significance is not valid for either of these regressions, and we used a similar bootstrap procedure.
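The bootstrap test of the regression slope can be sketched as below. This is not the authors' SAS macro; it is a minimal sketch that assumes the resampling unit is the (plasticity, heritability) pair, which the text does not state explicitly. The data are simulated and hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def bootstrap_slope_p(x, y, n_boot=5000, seed=0):
    """Proportion of bootstrap slopes >= 0 (one-sided test of a negative slope).

    Resamples (x, y) pairs with replacement; a resample in which all x
    values are identical is redrawn because its slope is undefined.
    """
    rng = random.Random(seed)
    n = len(x)
    non_negative = 0
    for _ in range(n_boot):
        while True:
            idx = [rng.randrange(n) for _ in range(n)]
            xs = [x[i] for i in idx]
            if max(xs) > min(xs):
                break
        ys = [y[i] for i in idx]
        if ols_slope(xs, ys) >= 0:
            non_negative += 1
    return non_negative / n_boot


# Hypothetical trait-environment estimates with a negative trend
rng = random.Random(42)
plasticities = [i / 29 for i in range(30)]
heritabilities = [0.5 - 0.3 * p + rng.gauss(0.0, 0.05) for p in plasticities]

p_value = bootstrap_slope_p(plasticities, heritabilities)
print(f"proportion of bootstrap slopes >= 0: {p_value:.4f}")
```

Because the test uses the empirical distribution of resampled slopes, it needs no degrees-of-freedom calculation, although resampling pairs does not by itself remove the correlation among estimates.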

Results
Nitrogen supply rate strongly affected plant growth (Fig. 1). The RIL population exhibited highly significant plastic responses to N supply rate for both individual traits (Table 2) and for multivariate comparisons (Table 3). When MANOVAs included only adjacent pairs of N supply rates, the effect of N supply rate on the multivariate phenotype was significant for all paired comparisons of environments (Table 3, rows 2-4). Individual trait phenotypes in the two high N treatments, 51 and 56 ppm, did not differ signif-icantly except for whole plant carbon assimilation rate and whole plant transpiration rate, for which rates were significantly higher at 51 ppm. Consistent variation was observed among RILs averaged across N supply rates (Table 3). RILs exhibited significant heterogeneity in their plastic responses to N supply rate when all four rates were included in the MANOVA (Table 3, row 1), and in all pairwise rate comparisons.
Our measure of trait developmental instability-withinenvironment, within-RIL variance-was significantly correlated with trait plasticity. The multivariate test was always significant and showed a high squared canonical correlation (between 0.48 and 0.95, Table 4). The standardized cross-environment plasticity explained 11-44% of the standardized within-environment, within-RIL variance. Similarly, for the univariate regressions (Table 5) between 46% and 92% of the traits, depending on treatment, showed a significant effect of cross-environment plasticity on developmental instability before correcting for multiple tests. Under a sequential Bonferroni correction, the proportions that were significant varied between 23% and 69%. The truth likely lies somewhere in between these two estimates. Proportional allocation to root mass and photosynthetic quantum efficiency showed the most consistent relationship between plasticity and instability, with significance across all six environmental combinations. For those two traits, plasticity explained 32% and 25%, respectively, of the variation in instability. Overall, there was no clear pattern of significant relationship by type of trait.
Of the 52 estimates of heritability, 33 (63%) were statistically significant (Table 6). Significant heritabilities ranged from 0.12 to 0.81. Heritability and plasticity were negatively related (r² = 0.12, P < 0.003, Fig. 2). The heritability of the trait with the greatest plasticity was half that of the trait with the lowest plasticity. No trait with heritability greater than about 0.6 had measurable plasticity.
For traits with nonzero genetic variance components, plasticity explained 19% of the variation in coefficients of genetic variation (P < 0.003; Fig. 3). Traits with the greatest plasticity had nearly twice the coefficient of genetic variation as those with zero plasticity. The effect of plasticity on the environmental variance component was even larger (46% of variation explained), with the coefficient of microenvironmental variation more than tripling in magnitude when comparing those traits with the lowest plasticity to those with the greatest (Fig. 4).
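A short sketch shows why heritability can fall even though genetic variation rises with plasticity. It assumes that phenotypic variance is simply the sum of a genetic and an environmental component, and that both coefficients of variation are taken relative to the same trait mean, so that the mean cancels. The function name and the numerical values are hypothetical, chosen only to mimic a doubling of the genetic coefficient and a tripling of the environmental one; they are not estimates from the study.

```python
def heritability_from_cvs(cv_genetic, cv_environmental):
    """Heritability as V_G / (V_G + V_E), using squared coefficients of variation.

    Assumes both coefficients are scaled by the same trait mean, so the
    ratio of squared CVs equals the ratio of the variance components.
    """
    v_g = cv_genetic ** 2
    v_e = cv_environmental ** 2
    return v_g / (v_g + v_e)


# Hypothetical nonplastic trait: equal genetic and environmental CVs.
h2_nonplastic = heritability_from_cvs(0.10, 0.10)

# Hypothetical highly plastic trait: genetic CV doubled, environmental CV tripled.
h2_plastic = heritability_from_cvs(0.20, 0.30)

print(round(h2_nonplastic, 3), round(h2_plastic, 3))
```

Because the variances enter as squares, a tripling of the environmental coefficient outweighs a doubling of the genetic one, and heritability falls despite the larger genetic variance.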

Discussion
In a RIL population of A. thaliana, increased macroenvironmental trait plasticity was associated with greater within-environment developmental instability and decreased trait heritability. Grown across a gradient of nitrogen supply, A. thaliana shows a vexing relationship between noise (developmental instability) and gain (phenotypic plasticity). Ours is the first test of this relationship across multiple traits, despite a long history of speculation and tests based on limited numbers of traits (e.g., Perkin and Jinks 1968; 1973; Scheiner et al. 1991; Hall et al. 2007). In this experiment, we measured 13 traits that ranged from physicochemical properties of the photosynthetic system to allocation, life history, and fitness components. The positive relationship between plasticity and instability was found across all categories of traits and across all environments, although not universally. On average, plasticity explained 16% of the variation in developmental instability. Overall, 46% of the relationships were statistically significant when correcting for multiple tests, 72% without that correction. The greater microenvironmental sensitivity associated with greater macroenvironmental plasticity has a population-level cost of decreased ability to adapt in response to selection. Across traits and environments, our regression of trait heritability on RIL trait plasticities explained 12% of the variation in trait heritability. Despite the relatively low r², the average effect of plasticity was quite strong. Traits with zero plasticity had, on average, a mean heritability of 0.47, whereas those with the greatest plasticity had a mean heritability of 0.20, a 57% reduction.
Table 3.
N supply comparison    N supply     RIL          Treatment × RIL
All                    9.6 ***      13.8 ***     4.2 ***
1-6                    16.7 ***     19.7 ***     11.2 ***
6-51                   28.0 ***     151.5 ***    37.8 ***
51-56                  2.60 *       6.6 ***      2.0 *
Asterisks indicate probabilities of tests of significance: *P < 0.01; **P < 0.001; ***P < 0.0001.
NS = not significant; † = 0.05 < P < 0.10; *P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001.
All cells with three or more asterisks (P ≤ 0.001) were significant given a sequential Bonferroni correction of the critical significance threshold, not accounting for correlations among variables. Numbers in parentheses are 95% bootstrapped confidence intervals. Only those heritabilities whose confidence intervals did not overlap zero are shown. A significant heritability of photosynthetic quantum efficiency has never been observed before or since under any experimental conditions in the Tonsor laboratory. Thus, this measure in 51 ppm N is likely to be a type I error.
The most plastic traits may bring the phenotype into the vicinity of a novel fitness optimum, but will substantially slow the evolution of constitutive expression to that new optimum in succeeding generations. The evolution of constitutive expression from conditional plastic expression is often called genetic assimilation (Waddington 1953) and will be favored if the optimum remains constant. Compared to a nonplastic genotype manifesting the optimum, a highly plastic genotype exhibiting the same (mean) optimum phenotype will have a lowered fitness because its offspring will be less developmentally stable and thus on average further from the optimum (eq. 4; Gavrilets and Hastings 1994). Plasticity therefore bears a cost in a novel but constant environment: reduced fitness compared to less plastic genotypes with the same optimum, unless the microenvironmental fitness optimum scales with the microenvironmental variation (Zhang 2005).
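The fitness cost of instability under stabilizing selection can be made explicit with a standard result for a Gaussian fitness function. The notation here is an assumption introduced for illustration (optimum θ, width of the fitness function ω, genotype mean phenotype z̄, and developmental-instability variance σ²) and need not match the form of eq. 4.

```latex
% Gaussian stabilizing selection on phenotype z with optimum \theta:
W(z) = \exp\!\left[-\frac{(z-\theta)^{2}}{2\omega^{2}}\right],
\qquad z \sim \mathcal{N}\!\left(\bar{z},\, \sigma^{2}\right).

% Mean fitness of a genotype, averaging over its developmental noise:
\bar{W} = \int W(z)\, p(z)\, dz
        = \sqrt{\frac{\omega^{2}}{\omega^{2}+\sigma^{2}}}\;
          \exp\!\left[-\frac{(\bar{z}-\theta)^{2}}{2\left(\omega^{2}+\sigma^{2}\right)}\right].

% When the genotype mean sits exactly on the optimum (\bar{z}=\theta):
\bar{W} = \sqrt{\frac{\omega^{2}}{\omega^{2}+\sigma^{2}}} < 1
\quad \text{for } \sigma^{2} > 0 .
```

Under these assumptions, two genotypes with the same mean at the optimum differ in fitness only through σ², and the one with the larger instability variance has the lower mean fitness.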
A genetic correlation between plasticity and developmental instability may not always be a cost of plasticity. In a habitat in which trait optima shift stochastically from one generation to the next, developmental instability can be a bet-hedging strategy (Kaplan and Cooper 1984). When the environment includes both spatial and temporal variation, plasticity and developmental instability can trade off, with one or the other favored depending on patterns of variation (Scheiner, unpubl. modeling results).
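The bet-hedging argument can be sketched numerically. The example assumes Gaussian stabilizing selection of width ω, a genotype whose mean phenotype equals the long-run mean optimum, an optimum that varies among generations with variance τ², and a population large enough that within-generation fitness equals its expectation. Under those assumptions, long-run success is governed by the expected log fitness, ½ ln[ω²/(ω²+σ²)] − τ²/[2(ω²+σ²)], where σ² is the developmental-instability variance. All names and parameter values are hypothetical and are not taken from the cited models.

```python
import math


def expected_log_fitness(noise_var, optimum_var, width_sq=1.0):
    """Expected log of per-generation mean fitness (log geometric mean fitness).

    noise_var   : developmental-instability variance of the genotype
    optimum_var : among-generation variance of the optimum phenotype
    width_sq    : squared width of the Gaussian fitness function
    """
    s = width_sq + noise_var
    return 0.5 * math.log(width_sq / s) - optimum_var / (2.0 * s)


def best_noise_var(optimum_var, width_sq=1.0):
    """Noise variance that maximizes expected log fitness under these assumptions."""
    return max(0.0, optimum_var - width_sq)


# Strongly fluctuating optimum: some developmental noise is favored.
fluct_stable = expected_log_fitness(0.0, optimum_var=4.0)
fluct_noisy = expected_log_fitness(3.0, optimum_var=4.0)

# Nearly constant optimum: the developmentally stable genotype does better.
const_stable = expected_log_fitness(0.0, optimum_var=0.5)
const_noisy = expected_log_fitness(3.0, optimum_var=0.5)

print(fluct_stable < fluct_noisy, const_stable > const_noisy)
```

In this simplified setting, instability pays only when the optimum fluctuates more widely than the fitness function itself (τ² > ω²); otherwise the stable genotype has the higher geometric mean fitness.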
In some circumstances, though, plasticity may enhance adaptation by increasing the amount of expressed genetic variation (Paenke et al. 2007; Lande 2009; Chevin and Lande 2010). In extreme habitats that exert strong stresses on organismal function, loss of canalization can sometimes lead to increased genetic and phenotypic variation (Waddington 1953; Sangster et al. 2008) that is then available to selection (Waddington 1953; West-Eberhard 2003; Moczek et al. 2011). Whether such increased variation enhances adaptation will depend on the relative magnitudes of genetic variation and developmental noise (Paenke et al. 2007; Lande 2009; Chevin and Lande 2010).
Recent theory (Draghi and Whitlock 2012) suggests that stochasticity in gene expression can accelerate adaptive evolution in some circumstances through a complex interplay among plasticity, robustness, and evolvability, in which gene networks feed back on each other such that all of these attributes of genetic architecture coevolve. A number of studies (notably Hansen et al. 2011) have suggested that developmental instability in the sense of Kilfoil et al.'s (2009) Kaneko's (2012) V_ip (our variance among replicates within RILs) and V_G should be correlated in many circumstances, which may in turn lead to a correlation between developmental instability and evolvability (Kaneko 2007, 2012). In our study, the standardized developmental instability was indeed correlated with the coefficient of genetic variation (compare Figs. 3 and 4). However, because the environmental variance increased more steeply with plasticity than the genetic variance, heritability declined with increased plasticity. Heritability is the necessary measure for predicting short-term evolutionary responses. However, the coefficient of genetic variation may be a better measure of long-term evolvability (Hansen et al. 2011). Thus, while increased plasticity may slow short-term adaptive evolution, it may enhance long-term evolvability and adaptive responses.
We could not discern any general property that determines which traits will show a relationship between plasticity and instability. One explanation may lie with our assumption that the norm of reaction is a continuous function over even very small changes in environment. This assumption is reasonable for environmental factors with continuous effects, such as temperature acting on enzyme velocities. It may not be true, however, for plastic responses that involve a discontinuous effect, such as a switch in the regulation of gene expression. Without knowing the developmental pathways and genes underlying each of the study traits, we cannot test this hypothesis. It may also be that the lack of significance in specific cases had as much to do with measurement limitations as it did with any general characteristics of the biological system.
Despite naïve expectations that adaptive plasticity will be favored in heterogeneous environments, it tends to be much less common than genetic differentiation and local adaptation (e.g., James et al. 1997). Recent theoretical models (Zhang 2005; Draghi and Whitlock 2012) show that environmental variation and uncertainty affect whether trait plasticity is favored over local adaptation. Those models show that different sources of variation arising from the amount and timing of dispersal, from temporal variation, and from the underlying genetic architecture have contrasting and interacting effects that can disfavor adaptive plasticity. Published models of the evolution of plasticity have not thus far incorporated a genetic correlation with developmental instability or explored its consequences.
The results in our article have several implications for population adaptation and fitness in the face of an anthropogenic global increase in N availability. Historically, N has been the most limiting resource in terrestrial systems worldwide. Nevertheless, N availability can be hugely variable among sites and ecosystems. For A. thaliana populations, the natural variation in N availability spans virtually the entire range of N availabilities investigated in this project, from extremely limiting to near saturating. Thus, movement among sites in this ephemeral weed requires the ability to adjust across a remarkable range of N supplies, potentially favoring plasticity despite the costs documented in this study. Recently, industrial farming and burning of fossil fuels have more than doubled atmospheric N deposition rates over preindustrial levels (Vitousek et al. 1997). This may flatten the gradient of N supply rates across sites and decrease the strength of selection favoring highly plastic genotypes. Perhaps a lowered strength of selection favoring plasticity will shift the plasticity optimum in the direction of a more invariant phenotype.