Properties of phenotypic plasticity in discrete threshold traits

Forms of phenotypic plasticity in key traits, and forms of selection on and genetic variation in such plasticity, fundamentally underpin phenotypic, population dynamic, and evolutionary responses to environmental variation and directional change. Accordingly, numerous theoretical and empirical studies have examined properties and consequences of plasticity, primarily considering traits that are continuously distributed on observed phenotypic scales with linear reaction norms. However, many environmentally sensitive traits are expressed as discrete alternative phenotypes and are appropriately characterized as quantitative genetic threshold traits. Here, we highlight that forms of phenotypic plasticity, genetic variation, and inheritance in plasticity, and outcomes of selection on plasticity, could differ substantially between threshold traits and continuously distributed traits (as are typically considered). We thereby highlight theoretical developments that are required to rationalize and predict phenotypic and microevolutionary dynamics involving plastic threshold traits, and outline how intrinsic properties of such traits could provide relatively straightforward explanations for apparently idiosyncratic observed patterns of phenotypic variation. We summarize how key quantitative genetic parameters underlying threshold traits can be estimated, and thereby set the scene for embedding dynamic discrete traits into theoretical and empirical understanding of the role of plasticity in driving phenotypic, population, and evolutionary responses to environmental variation and change.

Plasticity, defined as the ability of single genotypes to produce different phenotypes in different environments, is expected to substantially shape phenotypic, population dynamic, and evolutionary responses to environmental variation and change (Pigliucci 2005;Ghalambor et al. 2007;Wennersten and Forsman 2012;Ehrenreich and Pfennig 2016;Chevin and Hoffmann 2017;Kelly 2019). Specifically, plasticity directly allows short-term phenotypic responses within and/or across generations, which could currently be adaptive, nonadaptive, or maladaptive (e.g., Agrawal et al. 1999;Ghalambor et al. 2007;Beaman et al. 2016;Donelson et al. 2018;Arnold et al. 2019a). Theory also shows how plasticity could affect persistence of populations experiencing directional environmental change (Chevin et al. 2010), and facilitate longer term evolutionary adaptation to novel environments through rapid expression and evolution of plastic responses followed by genetic assimilation (Lande 2009(Lande , 2015. Rationalizing current forms of plasticity, and quantifying drivers of and constraints on future evolutionary changes, will consequently be central to understanding and predicting phenotypic, population dynamic, and evolutionary outcomes across multiple spatial and temporal scales (Gavrilets and Scheiner 1993a;Scheiner 1993;Chevin et al. 2010;Lande 2014;Murren et al. 2014;Ehrenreich and Pfennig 2016;Chevin and Hoffmann 2017;Kelly 2019).
In general, such outcomes depend on numerous properties of focal systems, including (i) forms and magnitudes of plasticity in ecologically relevant traits that affect fitness; (ii) resulting selection acting on such traits and on trait plasticity; and (iii) components of additive genetic (co)variation involving plasticity, and resulting patterns of inheritance and potential for plasticity evolution and genetic assimilation. These properties arise and

Term Definition
Phenotypic plasticity Ability of single genotypes to produce different phenotypes in different environments, resulting in directly observable phenotypic variation across environments. Developmental plasticity Plasticity in traits that are permanently and nonreversibly expressed once in an individual's life, through environment-dependent initial development. Directly evident when individuals from the same lineage are exposed to different initial environmental conditions, resulting in permanent among-individual variation. Labile plasticity Plasticity in traits that are repeatedly and hence potentially reversibly re-expressed multiple times through an individual's life. Directly evident when individuals experience changing environmental conditions, resulting in longitudinal within-individual variation. Threshold trait A trait that shows two (or more) discrete phenotypic states, where phenotypic expression directly depends on the value of an underlying "liability" relative to a threshold. Liability A latent quantitative genetic trait that can comprise genetic and environmental components, and that translates deterministically into expression of alternative discrete phenotypes when above versus below a threshold. Latent plasticity Environmentally induced variation in liability. Such latent plasticity is not necessarily manifested as observable phenotypic plasticity, depending on whether it causes the liability to cross the threshold. Latent plasticity can be developmental and/or labile.
exert their effects in the contexts of environmental variability and predictability across episodes of trait development and expression versus selection (Scheiner 1993;DeWitt et al. 1998;de Jong 2005;Lande 2009Lande , 2014Lande , 2015Chevin et al. 2010;Ashander et al. 2016;Chevin and Hoffmann 2017;Donelson et al. 2018). Accordingly, numerous empirical studies aim to quantify forms of plasticity in focal traits in experimental or wild populations, and to estimate components of selection and underlying additive genetic (co)variances (Scheiner 1993;Pigliucci 2005;Charmantier and Gienapp 2014;Murren et al. 2014;Hendry 2016;Chevin and Hoffmann 2017;Arnold et al. 2019a;Kelly 2019). However, many recent theoretical and empirical studies, and conceptual reviews, invoke a key assumption that relationships between expressed phenotypes and underlying environmental variables are linear on observed phenotypic scales (i.e., linear phenotypic reaction norms, e.g., Nussey et al. 2007;Lande 2009Lande , 2014Lande , 2015Chevin et al. 2010;Dingemanse et al. 2010;Chevin and Hoffmann 2017;Arnold et al. 2019a), with extensions to consider polynomial functions (Gavrilets and Scheiner 1993a,b;Scheiner 1993;de Jong 2005;Morrissey and Leifting 2016). Such linearities are commonly invoked to facilitate mathematical, statistical, and/or verbal tractability, not necessarily because there is strong evidence or belief that phenotype-environment relationships will typically be straightforwardly linear in nature (noted by Gavrilets and Scheiner 1993b;Nussey et al. 2007;Lande 2009Lande , 2014Chevin et al. 2010;Dingemanse et al. 2010;Chevin and Hoffmann 2017). Indeed, many ecologically important traits that could substantially affect population responses to varying and changing environments show markedly nonlinear phenotypic changes, which also do not directly conform to simple polyno-mial functions (Valladares et al. 2006;Chevin et al. 2010;Murren et al. 2014;Beaman et al. 2016;Arnold et al. 2019b).
Not least, many important morphological, behavioral, and life-history traits expressed across the natural world show two (or more) approximately discrete phenotypic states, whose expression partly depends on current and/or previous environments (West-Eberhard 1989;Scheiner 1993;Roff 1996). For example, such traits encompass dichotomous movements, mating behaviors, and reproductive parameters alongside alternative morphological developments. Given some polygenic basis, such traits can be appropriately conceptualized as quantitative genetic "threshold traits," where an underlying "liability" translates into expression of alternative discrete phenotypes when above versus below a threshold (Falconer and Mackay 1996;Roff 1996). Such threshold traits are well known to have distinctive intrinsic genetic and microevolutionary properties compared to standard quantitative traits that are continuously distributed on observed phenotypic scales. For example, they can maintain substantial "cryptic" genetic variation that is not typically phenotypically expressed, and phenotypic heritability, selection intensity, and evolutionary responses all depend on phenotype frequencies (Gianola 1982;Roff 1994aRoff , 1996Roff , 1998aFalconer and Mackay 1996, Ch. 18;Lynch and Walsh 1998, Ch. 25;Moorad and Linksvayer 2008). However, the ways in which threshold traits could also show distinctive intrinsic forms and properties of phenotypic plasticity, and hence foster distinctive patterns of short-term phenotypic variation and longer term microevolutionary outcomes in varying and changing environments, have received surprisingly little explicit attention. Because rapid plastic expression of alternative discrete phenotypes could dramatically affect population outcomes, and even flip populations between alternative ecological states, such properties and their consequences should be fully incorporated into modern treatments of plasticity and its implications.
Accordingly, we highlight multiple intrinsic properties of threshold traits that could cause distinctive patterns of phenotypic plasticity, and drive distinctive phenotypic and microevolutionary outcomes, that differ substantially from those involving traits that are continuously distributed on observed phenotypic scales. We thereby outline key attributes that should be incorporated into (eco)evolutionary theory for dichotomous traits and quantified empirically, and summarize how such advances could be achieved. To provide necessary context for our focus on threshold traits and facilitate comparisons, we first synthesize core principles of plasticity in continuously distributed traits, as are currently widely envisaged. Key terminologies are summarized in Table 1.

Concepts of Plasticity in Continuous Traits
Plasticity in continuously distributed quantitative traits is commonly conceptualized and quantified in two different ways by differing groups of researchers. First, experimental evolutionary biologists commonly measure plasticity by splitting families or clones across different (known) environments and measuring resulting mean phenotypes in each environment (Fig. 1A). Plasticity is therefore typically viewed as among-individual phenotypic variation within families or genotypes and hence within lineages, often focusing on traits that are effectively expressed once as the culmination of initial development (here termed developmental plasticity [ Table 1; Fig. 1A], elsewhere termed fixed, irreversible, or "one-shot" plasticity [Scheiner 1993[Scheiner , 2002van Buskirk and Steiner 2009;Auld et al. 2010;Lande 2015;Hendry 2016]). Such developmental plasticity is also the primary focus of most theory on plasticity evolution and its implications, which typically considers relationships between environments of trait development versus selection and assumes nonzero additive genetic variation in developmental reaction norm slopes (e.g., Gavrilets and Scheiner 1993a;Lande 2009Lande , 2015Ashander et al. 2016). Such genetic variation, and potential covariation with developmental reaction norm intercepts, can be evaluated empirically through quantitative genetic experiments that split sufficient families across environments (Fig. 1A;Scheiner 1993;Pigliucci 2005;Arnold et al. 2019b), and/or through experimental evolution (Scheiner 2002).
Second, in wild animal evolutionary ecology and behavioral ecology, where families or clones often cannot be so readily created or manipulated, plasticity is typically measured as changing phenotypes of single individuals when exposed to differ-ent environments (Fig. 1B). It is consequently viewed as longitudinal within-individual phenotypic variation, evidenced by individual-level reaction norm slopes that differ from zero. It therefore necessarily concerns flexible traits that individuals express multiple times (termed labile plasticity, Table 1; Fig. 1B; Nussey et al. 2007;Dingemanse et al. 2010;Dingemanse and Wolf 2013;Charmantier and Gienapp 2014;Lande 2014Lande , 2015Hendry 2016).
Since developmental plasticity and labile plasticity can induce differing short-term phenotypic dynamics and longer term evolutionary dynamics (e.g., Lande 2009Lande , 2014Lande , 2015Lande , 2019), yet could co-occur and interact, both ideally need to be combined into a single holistic conceptual framework that encompasses overall phenotypic variation resulting from initial development and subsequent environmental impacts (Beaman et al. 2016). This can in principle be achieved through quantitative genetic approaches that fully consider interacting environmental and genetic effects on overall individual reaction norms, encompassing intercepts (i.e., elevations), slopes, and intercept-slope and slope-slope covariances across developmental and labile plasticity ( Fig. 1C; Supporting Information S1).
Here, individual reaction norm intercepts can reflect the outcome of developmental plasticity, resulting in lasting effects on an individual's mean phenotype as shaped by its developmental reaction norm and early-life environmental experience (red, Fig. 1C). The developmental reaction norm intercept and slope could in turn show additive genetic (co)variation, as commonly considered in theoretical and empirical studies focusing solely on developmental plasticity (Fig. 1A). Meanwhile, variation in individual reaction norm slopes (blue, Fig. 1C) could reflect direct additive genetic effects, representing genetic variation in labile plasticity, and/or permanent environmental effects including further legacies of developmental plasticity. For example, an individual's mean phenotype, as shaped by developmental plasticity, could affect its ability to subsequently express labile plasticity. Complex genetic and/or environmental intercept-slope covariances for overall reaction norms could then arise, for example, if individuals with low phenotypic intercepts typically have shallower (or steeper) phenotypic reaction norm slopes than individuals with higher intercepts ( Fig. 1C; Supporting Information S1; Nussey et al. 2007;Dingemanse and Wolf 2013).
In simple cases with linear reaction norms for both developmental and labile plasticity (e.g., Fig. 1A, B), the full quantitative genetic architecture of the overall individual reaction norm (e.g., Fig. 1C) can then be conceptualized as a three-parameter Gmatrix (i.e., additive genetic variance-covariance matrix). Here, the three elementary parameters comprise the intercept and slope of the reaction norm for developmental plasticity (which together shape the intercept of the overall individual reaction norm), and the slope of the reaction norm for labile plasticity (Supporting Information S1). These three parameters could be positively or negatively genetically correlated (or uncorrelated), potentially implying composite constraints on the form of additive genetic variation available to allow evolutionary responses to selection on any characteristic of the overall individual reaction norm (e.g., Gomulkiewicz and Kirkpatrick 1992;Nussey et al. 2007;Walsh and Blows 2009). For example, if reaction norm slopes for developmental and labile plasticity are strongly positively or negatively correlated, then evolution of both could be exacerbated or inhibited given particular regimes of selection. This three-trait conceptualization of overall plasticity combines and advances current standard conceptualizations of linear reaction norms, which typically focus on either developmental plasticity or labile plasticity but not both. Such conceptualizations effectively generate two-parameter models that consider single intercepts and slopes (Supporting Information S1). Yet, the basic premise that environmentally induced phenotypic variation in continuously distributed quantitative traits is adequately described through linear (or polynomial) reaction norms for developmental and/or labile plasticity expressed on observed phenotypic scales implies multiple intrinsic properties of genetic variation in, selection on, and inheritance of plasticity (Table 2). These properties are fundamental to mathematical and verbal theory regarding the implications of plasticity for population dynamic and evolutionary outcomes, and have motivated and underpinned numerous empirical examinations.

Key Properties of Plasticity in Continuous Traits
First, ongoing evolution of the degree of phenotypic plasticity (whether developmental and/or labile) in continuously distributed traits is envisaged to involve evolution of reaction norm slopes. Such evolution in turn requires nonzero additive genetic variation in slope (and/or in other parameters describing polynomial reaction norm shapes; Gavrilets and Scheiner 1993a,b;Scheiner 1993Scheiner , 2002de Jong and Gavrilets 2000;Nussey et al. 2007;Lande 2009Lande , 2014Ashander et al. 2016;Arnold et al. 2019b). This implies existence of gene-by-environment interactions on the observed phenotypic scale (i.e., nonzero G×E shaping developmental and/or labile plasticity), meaning that genetic effects on phenotypes vary among environments ( Fig. 1C; Table 2). Given linear reaction norms, nonzero G×E can readily cause substantially greater additive genetic variation in phenotypes in more extreme (i.e., unusual) environments. This is especially likely given some canalization of mean phenotype (and hence little additive genetic variation) in typical environments that a population has previously frequently experienced, due to long-term stabilizing selection (e.g., Nussey et al. 2007;Lande 2015). Such effects generate potential for rapid evolutionary responses to selection following environmental change (Lande 2009;Ashander et al. 2016). However, there will not necessarily be nonzero G×E in any particular system, and quantifying the form and magnitude of G×E has been a major empirical endeavor in experimental systems (Scheiner 1993(Scheiner , 2002Pigliucci 2005) and, increasingly, in free-living wild populations (Nussey et al. 2007;Charmantier and Gienapp 2014;Ramakers et al. 2018;Kelly 2019;Supporting Information S2).
Second, directional selection for or against phenotypic plasticity is envisaged to effectively exert directional selection on the magnitude of the reaction norm slope (either directly or indirectly through covariance with intercepts and hence trait means; Fig. 1D; Table 2; Scheiner 1993Scheiner , 2002. Selection for increased plasticity (i.e., favoring steeper slopes) can in principle arise if increased phenotypic sensitivity to environmental conditions increases fitness, irrespective of whether the sign of the optimal reaction norm slope is positive or negative (Arnold et al. 2019a). Meanwhile, selection against plasticity (i.e., favoring shallower slopes) can arise if there are fitness costs of expressing different or changing phenotypes or simply of maintaining physiological capacities to do so (often termed induced vs. constitutive costs; DeWitt et al. 1998;van Buskirk and Steiner 2009;Auld et al. 2010;Chevin et al. 2010;Arnold et al. 2019a). Overall stabilizing selection around an optimal degree of plasticity, and hence around an optimal reaction norm slope, could potentially result.
If there is initially nonzero G×E (and hence additive genetic variation in reaction norm slopes), then directional selection could cause evolutionary changes in slope and hence in the degree of phenotypic plasticity. Yet, directional or stabilizing selection could gradually erode additive genetic variation and diminish G×E, reducing subsequent evolutionary responses (Fig. 1D;van Buskirk and Steiner 2009;Ehrenreich and Pfennig 2016;Saltz et al. 2018; but see Lande 2009). Further studies have consequently postulated mechanisms by which genetic variation in plasticity and associated evolutionary potential, and polymorphisms comprising plastic and canalized (i.e., nonenvironmentally responsive) individuals, can be maintained (Wolf and Weissing 2010;Dingemanse and Wolf 2013;Saltz et al. 2018). For example, combinations of negative frequency-dependent selection on plastic versus canalized phenotypes, and positive feedbacks that reduce costs of repeated trait expression, have been invoked to explain ongoing coexistence of responsive and unresponsive lineages (i.e., polymorphism in labile plasticity; Wolf et al. 2008;Wolf and Weissing 2010). In principle, additive genetic variation in reaction norm slopes could be maintained through disruptive selection, where individuals with slopes of either extreme have consistently higher fitness than individuals with intermediate slopes, but such selection seems unlikely to be commonplace (Saltz et al. 2018; Supporting Information S2).

Continuously distributed traits
Threshold traits (A) • Genetic variation in phenotypic plasticity requires genetic variation in reaction norm slopes, and hence requires gene-by-environment interactions (G×E) on the observed phenotypic scale. • Such G×E, and hence genetic variation in phenotypic plasticity, will not necessarily be present. • G×E could be eroded by directional or stabilizing selection on phenotypic plasticity, and hence on reaction norm slope. Long-term maintenance of additive genetic variation (and hence phenotypic variation) in phenotypic plasticity may therefore require further explanations. • With nonzero G×E and linear reaction norms and stabilizing selection on phenotypes in typical environments, additive genetic variation in phenotype can readily be greater in extreme environments, facilitating rapid evolutionary responses to selection.
• Genetic variation in phenotypic plasticity could result from genetic variation in liability-scale reaction norm intercepts. It does not necessarily require genetic variation in reaction norm slopes, or hence require gene-by-environment interactions (G×E), on the liability scale. • If there is genetic variation in liability intercept, and phenotypic variation, there is likely to be genetic variation in phenotypic plasticity. • Substantial additive genetic variation in intercepts and/or slopes of liability-scale reaction norms could potentially be maintained even given consistent strong directional selection. Genetic variation in phenotypic plasticity may therefore be revealed given environmental variation or change, and require little further explanation. • Genetic variation in phenotype will not necessarily increase, and could easily decrease toward zero, in extreme environments. (B) • A persistent mixture of highly phenotypically plastic and strongly environmentally canalized genotypes or individuals requires some further explanation (e.g., disruptive or negative frequency-dependent selection on reaction norm slope). • The degree of phenotypic plasticity or canalization defined by the reaction norm slope can, in principle, be independent of the reaction norm intercept, and hence of population mean phenotype.
• A persistent mixture of highly phenotypically plastic and strongly environmentally canalized genotypes or individuals is likely. It requires little further explanation beyond some mechanism maintaining variation in liability that spans the threshold, and resulting expression of each alternative phenotype. • The mean degree of canalization (and hence repeatability) of a particular discrete phenotype is linked to the mean liability-scale reaction norm intercept, and is expected to increase with phenotype frequency. (C) • Consistent directional selection for or against phenotypic plasticity will cause consistent directional selection on reaction norm slope. • Such selection could gradually erode G×E.
• Consistent directional selection for or against phenotypic plasticity could potentially cause stabilizing or disruptive selection on liability intercept. The form of selection on liability could change as evolution progresses. • Such selection could potentially drive evolution of complex forms of liability-scale G×E and intercept-slope covariances. (D) • Reproduction between highly canalized parents, or parents with opposite reaction norm slopes, will typically result in highly phenotypically canalized offspring. Highly canalized parents cannot readily produce highly phenotypically plastic offspring. • Mutation or gene flow would be required to regenerate plasticity in canalized populations.
• Reproduction between highly phenotypically canalized parents with different fixed phenotypes can readily result in highly phenotypically plastic offspring. Selection for or against plasticity may consequently impose indirect selection on mate choice. • Phenotypic plasticity (or canalization) could be hard to eradicate even given strong selection against it.
Third, reaction norm parameters, and resulting forms of phenotypic plasticity, are expected to adhere to basic principles of additive genetic inheritance given sexual reproduction. Consequently, matings between environmentally canalized parents (i.e., with reaction norm slopes for developmental and/or labile plasticity that are close to zero) will typically produce canalized offspring (i.e., also with expected reaction norm slopes close to zero), as could matings between parents with slopes of similar magnitudes but opposite signs ( Fig. 1E; Table 2). However, highly phenotypically canalized parents are unlikely to produce highly plastic offspring (i.e., with very steep reaction norm slopes), although such offspring could occasionally arise through sampling variance around the expectation. Indeed, populationwide canalization across environments can be viewed as fixation of genetic variants that increase robustness to environmental variation (Ehrenreich and Pfennig 2016). Additional sources of new genetic variation in reaction norm slopes, whether from mutation or gene flow stemming from immigration given local adaptation, are then required to generate new developmental and/or labile phenotypic plasticity in currently phenotypically canalized populations (e.g., Saltz et al. 2018).
The above properties and principles of genetic variation in, selection on, and inheritance of phenotypic plasticity in continuously distributed traits (Table 2) explicitly or implicitly underpin much theory regarding the expression, evolutionary dynamics, and consequences of plasticity. But such properties could be qualitatively different for threshold traits, implying that the dynamics and consequences of plasticity might also differ substantially.

Concepts of Plasticity in Threshold Traits
The concept of a threshold trait (or "threshold character") is long established in quantitative genetics, spanning animal breeding, medicine, and diverse areas of evolutionary biology (Wright 1934;Gianola 1982;Falconer and Mackay 1996, Ch. 18;Roff 1996Roff , 1998aLynch and Walsh 1998, Ch. 25;Tomkins and Hazel 2007;Moorad and Linksvayer 2008;Grossen et al. 2010;Pulido 2011;Dodson et al. 2013;Hadfield 2015). In brief, each individual is envisaged to have an underlying "liability," defined as a latent (i.e., unobserved) continuously distributed quantitative trait that can have multiple genetic and environmental components, and which causes expression of alternative discrete phenotypes when above versus below a threshold value (Table 1; Fig. 2A). Expression of the alternative phenotypes is envisaged to be deterministic conditional on the liability (i.e., outcomes are not necessarily directly probabilistic; Supporting Information S1). Substantial genetic and/or environmental variation in liability (on either side of the threshold) can consequently exist without causing any phenotypic variation.
However, as with many continuously distributed traits, many threshold traits are re-expressed on time frames of hours, days, seasons, or years, potentially allowing labile plasticity manifested as reversible individual expression of alternative phenotypes when exposed to different environments (i.e., within-individual phenotypic variation; Table 1). Obvious examples include dichotomous behaviors, including diel movements and dominant versus subordinate mating tactics (e.g., Harrison et al. 2017; Crocker-Buta and Leary 2018); seasonal migration versus yearround residence (Brodersen et al. 2014;Reid et al. 2018, Reid et al. 2020; and diverse aspects of reproduction including breeding versus skipping, production of twins versus one offspring, or divorce or extra-pair mating versus mate fidelity in iteroparous species (Falconer and Mackay 1996;Duthie et al. 2016;Germain et al. 2018). Key properties of plasticity in such threshold traits could differ quite fundamentally from those for continuously distributed traits where linear (or polynomial) reaction norms are manifested on observed phenotypic scales (Table 2). These properties and differences result from forms of genetic and environmental variation in underlying liability, and their nonlinear translation into discrete observed phenotypes ( Fig. 2A).
Viewed most simply, each individual has an inherited additive genetic value for liability, plus some set of environmental effects on liability that could reflect current and/or previous environmental conditions (Roff 1996). These environmental effects can be envisaged to result from liability-scale reaction norms that shape responses to environmental variation, where reaction norm slopes could show additive genetic variation (i.e., generating G×E in liability). Indeed, the whole structure of reaction norm intercepts, slopes, and covariances that acts on the observed phenotypic scale for continuously distributed traits (Fig. 1C) can be envisaged to similarly act on the latent liability scale for threshold traits (Fig. 2C). The threshold itself can then be viewed as a higher level reaction norm (or "developmental map") that deterministically translates the combined genetic and environmental effects on the continuously distributed latent liability into discrete observed phenotypic outcomes ( Fig. 2A, C; Supporting Information S1; Roff 1996;Lynch and Walsh 1998, Ch. 11). This conceptual model distinguishes plasticity occurring on the latent liability scale (i.e., latent plasticity) from plasticity occurring on the observed phenotypic scale (i.e., phenotypic plasticity; Table 1). It has multiple interesting implications for the expression, form, and maintenance of phenotypic plasticity in relation to underlying genetic variation; for the consequences of selection for or against phenotypic plasticity; and for patterns of phenotypic inheritance (Table 2).

UNDERLYING GENETIC VARIATION
Individuals and lineages with initial liabilities (i.e., liability intercepts) near the threshold are more likely to be phenotypically plastic (i.e., express different discrete phenotypes at different times; Fig. 2B). This is because any subsequent small environmental deviation in liability, whether due to predictable environmental responses shaped by liability-scale reaction norms and/or random microenvironmental deviations, is relatively likely to push individuals across the threshold (in either direction) and hence induce expression of the alternative phenotype (Fig. 2B, C, E). In contrast, individuals and lineages with initial liabilities far from the threshold (in either direction) are likely to be phenotypically canalized, because even sizeable subsequent environmental deviations in liability are unlikely to induce phenotypic shifts (Fig. 2B, C, E). Substantial latent plasticity could then exist in the form of liability-scale reaction norm slopes, but have zero phenotypic effect and hence be decoupled from expression of phenotypic plasticity.
Individuals' expressions of either alternative phenotype, and of labile phenotypic plasticity (i.e., switching between discrete phenotypes), can consequently be intrinsically positively correlated within and across time periods on different scales (e.g., days, seasons, years). This is because individuals with liability intercepts far from or near to a fixed threshold on one occasion will also have such intercepts subsequently. Sequences of small environmental deviations, or of predictable environmental changes, will consequently have phenotypic effects (or lack of effects) that are consistent across occasions within individuals. High individual repeatability of expression of each discrete phenotype, alongside high individual repeatability of phenotypic plasticity, can therefore readily emerge (Fig. 2E). Population-wide repeatability of expression of each discrete phenotype will also be positively associated with phenotype frequency, such that the most frequently expressed phenotype is the most highly repeatable. This is because, given the standard quantitative genetic assumption that the population distribution of liabilities is Gaussian, the most frequent phenotype will encompass more individuals whose liabilities are further from the threshold and hence less likely to be phenotypically plastic (Fig. 2B).
Consequently, if there is nonzero additive genetic variation in liability intercept with a distribution of values that spans or approaches the threshold, the basic implication is that there will be nonzero genetic variation in expression of phenotypic plasticity (Fig. 2C, D). Specifically, genotypes that place an individual's liability intercept near to or far from the threshold are likely to effectively cause plastic and canalized phenotypes, respectively, given some regime of environmental variation. Resulting genetic variation in phenotypic plasticity does not necessarily require genetic variation in any reaction norm slope, or hence require nonzero G×E on the liability scale ( Fig. 2D and Table 2, although such G×E could also exist, Fig. 2C). Rather, genetic and environmental effects that are straightforwardly additive on the liability intercept can induce nonadditive variation in environmental responses in observed discrete phenotypes (e.g., Gianola 1982). This is because any particular additive genetic or environmental increment on liability might or might not cause a change in phenotype, or hence cause expression of phenotypic plasticity, depending on the preexisting liability value. Further, even with G×E and substantial resulting variation in linear reaction norm slopes on the latent liability scale, extreme environments will not necessarily cause increased genetic variation in phenotype, and might even decrease such variation if all individuals are pushed to the same side of the threshold (Supporting Information S3).
Effective genetic variation in phenotypic plasticity might consequently be commonplace, or even ubiquitous, in polymorphic labile threshold traits. This is because such traits are expected to maintain substantial additive genetic variation in liability. This in turn is because selection on genetic variants that do not immediately cause overall liability values to cross the threshold (or hence cause any change in phenotype) is expected to be weak, even given strong directional selection on phenotype, and especially if liability is highly polygenic (Roff 1994a(Roff , 1996(Roff , 1998a. Intrinsic negative frequency-dependent selection on underlying genetic variants can also arise, further contributing to a relatively high expected mutation-selection balance, alongside effects of drift in small populations (Roff 1998a). Resulting maintenance of substantial additive genetic variation has, implicitly, been shown for liability-scale reaction norm intercepts (Roff 1994a(Roff , 1998a, but similar processes might also affect liability-scale reaction norm slopes. This is because considerable genetic variation in slope could exist yet have little or no phenotypic effect, and hence be effectively shielded from selection ( Fig. 2C; Supporting Information S3).

Reaction norm intercepts and slopes (depicted by red points and blue lines) could become (C) negatively or (D) positively correlated due to selection for or against phenotypic plasticity. Intercepts might in turn be shaped by the intercept and slope of the reaction norm for developmental plasticity, and reaction norms for labile and/or developmental plasticity could also be nonlinear.
Given some regime of stochastic and/or predictable environmental variation, populations could therefore readily comprise mixtures of highly phenotypically plastic and highly canalized individuals (Fig. 2C, D, E; Table 2). No further explanations, such as negative frequency-dependent selection on expression of plasticity coupled with positive feedbacks that reduce expression costs (Wolf et al. 2008;Wolf and Weissing 2010), or complex mechanisms driving evolution of multiple distinct tactics (Engqvist and Taborsky 2016), are necessarily required (Hazel et al. 2004, although such effects could additionally apply). Hence, although cross-environment phenotypic canalization of continuously distributed traits effectively requires fixation of genetic variants that reduce reaction norm slopes and make traits robust to environmental variation (Ehrenreich and Pfennig 2016), such fixation is not required to generate a highly phenotypically canalized threshold trait. Co-existence of canalized and phenotypic plastic individuals, and transitions between apparent canalization and plasticity given environmental changes, is consequently relatively straightforward to explain at both population and individual levels, and could rapidly occur without need for complex mechanisms or substantial mutation or gene flow.

SELECTION
The occurrence of directional selection for or against expression of labile phenotypic plasticity in a threshold trait (i.e., higher or lower fitness in individuals that sequentially express both alternative phenotypes) could imply that individuals with liability intercepts close to the threshold have higher or lower fitness, respectively, than individuals with liability intercepts further away. Directional selection for or against phenotypic plasticity could consequently induce stabilizing or disruptive selection on liability intercept if the distribution of liabilities substantially spans the threshold (Fig. 3A, B), or directional selection if the distribution only marginally spans the threshold. The form of selection on liability induced by selection on phenotypic plasticity could therefore depend on the liability distribution and hence change with the progress of evolution, even if the form of selection on expression of phenotypic plasticity does not change at all.
Eras of directional selection on phenotypic plasticity that cause directional, stabilizing, or disruptive selection on the liability intercept could then, respectively, reduce additive genetic variation, or increase additive genetic variation and even generate bimodal distributions (Fig. 3A, B). Unlike with continuously distributed traits, evolution of phenotypic plasticity (whether developmental and/or labile) could then proceed solely through evolution of distributions of reaction norm intercepts, irrespective of slopes (Table 2).
But, complex forms of liability-scale G×E that facilitate or prevent expression of phenotypic plasticity could also conceivably evolve. For example, individuals with liability intercepts close to or far from the threshold could in principle have liabilities that are differentially sensitive to environmental variation, implying evolution of some form of intercept-slope covariance for the liability-scale reaction norm (Fig. 3C, D). Directional selection for or against expression of phenotypic plasticity could therefore potentially generate nonzero liability-scale G×E rather than eradicate it, which is opposite to basic expectations for continuously distributed traits ( Fig. 1D; Table 2). The presence of nonzero liability-scale G×E could then serve to prevent expression of phenotypic plasticity rather than necessarily define variation in its presence.
Because the intercept of the overall individual liabilityscale reaction norm could depend on the form of developmental plasticity (Fig. 2C), such outcomes will depend on the full quantitative genetic architecture of the liability-scale G-matrix, including genetic covariances between reaction norm slopes for developmental versus labile plasticity. Evolution of such complex liability-scale quantitative genetic architectures (Fig. 3) would likely require consistently strong selection for or against phenotypic plasticity. But such selection might be expected to be stronger for labile threshold traits than for continuously distributed traits, because there could be substantial induced and/or constitutive costs, or benefits, associated with switching between alternative phenotypic states.

PHENOTYPIC INHERITANCE
Threshold traits could also generate distinctive patterns of apparent inheritance of phenotypic plasticity or canalization, manifested as parent-offspring resemblance (Table 2). Such patterns could arise because the threshold trait structure translates additive genetic effects on liability into nonadditive genotypic effects on observed phenotypes (Gianola 1982;Lynch and Walsh 1998, Ch. 25).
Specifically, reproduction between parents with different highly canalized discrete phenotypes, resulting from liability intercepts far from the threshold on opposite sides, could readily generate highly phenotypically plastic offspring. This is because resulting offsprings' liability intercepts could be close to the threshold, meaning that their phenotypes change in response to any small environmental deviation. Maintenance (or elimination) of phenotypic plasticity will therefore partly depend on the degree of assortative (or disassortative) mating with respect to canalized parental phenotypes. If both alternative phenotypes are expressed in a population, then phenotypic plasticity could be hard to eradicate, even given very strong selection against it, unless there is strong assortative mating. Equally, given standard inheritance of liability with Mendelian sampling variance (reflecting segregation and recombination), it could be hard to eradicate expression of both (relatively) canalized phenotypes even given strong selection for phenotypic plasticity. Genetic variation for plasticity (and for canalization) in threshold traits is consequently unlikely to be purged as readily as could be the case for continuously distributed traits (e.g., van Buskirk and Steiner 2009;Saltz et al. 2018). Rather, selection for or against phenotypic plasticity could potentially drive indirect selection on the mating system, where optimal mate choice given any regime of selection on plasticity could depend on the degree of phenotypic plasticity of any focal individual.
Finally, if females and males have different mean liability intercepts and hence show different phenotype frequencies (i.e., phenotypic sexual dimorphism), then deceptive patterns of phenotypic variation could arise, which could be incorrectly interpreted to imply sex-specific genetic architectures or other causes of phenotypic variation (e.g., Lynch and Walsh 1998, Ch. 25). For example, individuals of whichever sex has mean liability intercept closer to the threshold could be more phenotypically plastic, even without any sex-specific effects on liability-scale reaction norm slopes. Evidence of apparent sex-specific genetic dominance reversals at large-effect loci could also arise, even in the absence of any strict dominance reversals. This is because genetic effects that are purely additive on the liability scale, and identical in both sexes, can become nonadditive on the observed phenotypic scale, with forms that differ between the sexes if mean liabilities differ (Supporting Information S4).

Discussion
The potential for threshold traits to show distinctive properties of phenotypic plasticity compared to traits that are continuously distributed on observed phenotypic scales, including differing patterns of expression, genetic variation, selection, and apparent inheritance, has numerous implications for rationalizing and predicting phenotypic, population, and evolutionary dynamics in the contexts of environmental variation and change. Core premises of theory concerning continuously distributed traits include that genetic variation in phenotypic plasticity equates to genetic variation in reaction norm slopes, representing G×E; that directional selection for or against plasticity will cause evolutionary changes that respectively increase or decrease reaction norm slopes; that fixation of phenotypic canalization or of some degree of plasticity implies erosion of G×E; and that observed degrees of plasticity are directly inherited following standard (additive) quantitative genetic expectations (Table 2). Such properties underpin mathematical and verbal explorations of how combinations of plasticity and microevolution can allow mean values of continuously distributed traits to rematch changing environmentally determined optima, and thereby "rescue" declining populations by restoring the previous ecological status quo (Nussey et al. 2007;Lande 2009;Ashander et al. 2016;Chevin and Hoffmann 2017;Arnold et al. 2019a;Kelly 2019). But such properties and outcomes do not necessarily apply to threshold traits, where inherited reaction norms are postulated to act on latent liability scales.
For such traits, genetic variation in phenotypic plasticity does not necessarily require genetic variation in liability-scale reaction norm slopes; evolution of the degree of phenotypic plasticity could potentially occur through changing intercepts rather than necessarily slopes; strong directional selection on phenotypic plasticity could conceivably generate rather than erode complex forms of liability-scale G×E; and substantial phenotypic plasticity could readily re-emerge in offspring of highly phenotypically canalized parents, generating persistent population-wide variation in plasticity (Table 2). New theory will consequently be required to rationalize and predict joint phenotypic, population, and evolutionary dynamics involving plastic threshold traits, and thereby discern how such traits could act to rescue populations through flexible switching between alternative phenotypic and ecological states rather than by restoring the status quo.
Some progress can be achieved through mathematical theory, aiming to formalize basic principles that are currently outlined verbally (Table 2) and identify conditions under which they apply (e.g., Hazel et al. 2004). For example, population frequencies of phenotypically plastic versus canalized individuals, and resulting opportunities for selection, will presumably depend on relative magnitudes of additive genetic means and (co)variances in liability-scale reaction norm intercepts and slopes alongside microenvironmental variances. Evolved intercepts and slopes, and resulting environmental variances, will in turn affect the impact of selection and the consequent maintenance of (cryptic) genetic variation. However, the fact that existing evolutionary theory involving plasticity explicitly focuses on linear phenotypic reaction norms to facilitate mathematical tractability (e.g., Lande 2009Lande , 2019Ashander et al. 2016) implies that dynamics involving threshold traits will be less readily tractable. Advances may consequently require numerical or individual-based simulations, which can more readily encompass key features such as intrinsically dynamic impacts of selection and genetic (co)variances in relation to phenotype frequencies, and dependence of plastic-ity on forms of assortative mating. Simulations have previously been used to examine basic properties of threshold traits, such as the maintenance of substantial additive genetic variation in liability through selection-mutation-drift balance despite strong directional selection (Roff 1998a), evolution of threshold traits from continuously varying traits (Chevin and Lande 2013), and interacting effects of indirect selection, environmental variation, and assortative mating (de Zoeten and Pulido 2020). However, such approaches have not yet been applied to explicitly consider evolutionary dynamics of threshold trait plasticity.

APPLICABILITY OF THE THRESHOLD TRAIT MODEL
Justification for developing theory for dynamic threshold traits requires that key traits that show discrete alternative phenotypes do reasonably conform to the threshold trait model. Although latent liabilities cannot, by definition, be directly observed, the basic adequacy of the threshold trait model has been well demonstrated through quantitative genetics studies based in animal breeding and medicine, alongside experiments in evolutionary biology, which show that observed patterns of phenotypic variation and microevolution concur with theoretical predictions (Gianola 1982;Roff 1994bRoff , 1996Falconer and Mackay 1996;Lynch and Walsh 1998). However, such work has predominantly focused on initial development, not on labile phenotypic plasticity (but see Negussie et al. 2012). Formalizing theory and predictions for labile plasticity could therefore provide opportunities to evaluate the applicability of the threshold model to dynamic discrete traits in wild populations, and highlight pertinent parameters that need to be estimated. Such theory could then provide relatively straightforward explanations for complex observed patterns of phenotypic variation that are otherwise puzzling or require some specific or idiosyncratic explanation.
As one broad example, dichotomous expression of seasonal migration versus year-round residence in partially migratory systems is a critical trait in the context of understanding population responses to environmental variation and change. Here, any change in phenotype frequencies, whether due to microevolution, plasticity, and/or microevolution of plasticity, will directly alter seasonal population dynamics and distributions . Migration versus residence has long been conceptualized as a threshold trait, underpinned by combinations of genetic and environmental effects (Berthold 1988;Pulido 2011;Dodson et al. 2013). Classic work on blackcaps (Sylvia atricapilla) used breeding designs and artificial selection to show that liability for migration (as expressed in captivity) is highly heritable and positively genetically correlated with measures of migratory activity, and hence that phenotypic expression of migration or residence could be rapidly eliminated and regained (Berthold 1988;Pulido et al. 1996;Pulido 2011;de Zoeten and Pulido 2020). Recent analyses of partially migratory European shags (Phalacrocorax aristotelis) revealed patterns of among-individual and withinindividual phenotypic variation (i.e., labile plasticity) that are also highly consistent with verbal predictions from the threshold trait model. Specifically, focal populations comprise mixtures of individuals that are nonbreeding season migrants or residents (representing two alternative phenotypes), alongside individuals that switch between resident and migrant states partway through the nonbreeding season, which can be viewed as expressing within-season phenotypic plasticity (Reid et al. 2020). High between-year repeatability is evident, such that individuals that migrate, remain resident, or switch within one nonbreeding season are highly likely to do the same again in subsequent nonbreeding seasons, with the expected positive associations between phenotype frequency and repeatability (Reid et al. 2020;Acker et al. unpubl. ms.). Yet, between-year plasticity also occurs and is correlated with within-season plasticity, such that individuals that switch between residence and migration within any one nonbreeding season are more likely to switch to residence or full-season migration in subsequent nonbreeding seasons. This concurs with the expected correlation in plasticity across temporal scales (Acker et al. unpubl. ms.). These analyses also revealed episodes of selection for or against within-season plasticity that varied substantially among years in relation to environmental conditions. Such fluctuating selection implies variable costs of plasticity, and resulting episodes of directional, stabilizing, and disruptive selection on underlying liability (Reid et al. 2020;Acker et al. unpubl. ms.).
Other partially migratory systems also show mixtures of phenotypically canalized and plastic residents and migrants, with different levels of individual variation and repeatability across timescales, for example, regarding diel feeding movements in burbot (Lota lota, Harrison et al. 2017), and seasonal migration versus residence in roach (Rutilus rutilus, Brodersen et al. 2014). Such patterns are commonly discussed in the broad context of the threshold trait model (e.g., Dodson et al. 2013), but not explicitly tested against qualitative or quantitative predictions. For example, resident elk (Cervus elaphus) were more likely to switch to migration across years than vice versa in a population where residence was the more frequent tactic, and some individuals switched repeatedly (Eggeman et al. 2016). These patterns concur with expectations given the threshold trait model, yet were not explicitly considered in that context.
Further, apparent nonadditive genotypic effects on migration (and associated age at reproductive maturity) have been documented in Atlantic salmon (Salmo salar), and interpreted as sexspecific genetic dominance reversals at a large-effect locus that serve to resolve sexual conflict (Barson et al. 2015;Czorlich et al. 2018). However, because migration timing shows substantial sexual dimorphism, similar patterns of phenotypic variation could readily arise given a predominantly quantitative genetic threshold trait with sex-specific mean liability intercepts and sexindependent co-dominance at the large-effect locus, and hence without requiring any explicit sex-specific dominance reversal (Supporting Information S4). Finally, although assortative mating has been suggested to facilitate evolution of distinct migratory types within sympatric breeding populations (Bearhop et al. 2005;de Zoeten and Pulido 2020), the implication that such mating could reduce phenotypic plasticity in offspring has not been discussed. There is consequently considerable scope for tighter quantitative interpretation, or re-interpretation, of diverse empirical results in the context of the threshold trait model, regarding seasonal migration, and regarding numerous other labile dichotomous traits.

ESTIMATION OF KEY PARAMETERS
Given that observed patterns of phenotypic variation broadly conform to expectations for threshold traits, the next challenge is to explicitly estimate key quantitative genetic parameters comprising additive genetic (co)variances in intercepts and slopes of liability-scale reaction norms. Such estimates have sometimes been obtained by considering other relevant and directly observable continuously distributed phenotypic traits as proxies of the (unobserved) liability. For example, juvenile hormone titers have been interpreted as liability to develop winged versus wingless forms in hemimetabolous insects (Roff 1994b), and body size has been interpreted as liability to produce particular morphological forms or enact particular reproductive strategies in fish (Dodson et al. 2013). However, such observable proxy traits are unlikely to simply equate to liability, and are best viewed as additional traits that could be genetically correlated with liability (Roff and Fairbairn 2007;Debes et al. 2020). In general, such correlated traits could induce or constrain multivariate evolution, including evolution of overall plasticity (e.g., Lande 2019). Resulting indirect selection on threshold trait liabilities could have particularly important effects, by effectively exposing otherwise cryptic genetic variation to selection (e.g., de Zoeten and Pulido 2020).
Indeed, it is not necessary to measure any observable proxy trait to infer key quantitative genetic parameters for the latent liability underlying a threshold trait (Falconer and Mackay 1996). Rather, parameters representing liability-scale reaction norm intercepts and slopes can in principle be directly estimated given observations of dichotomous phenotypes expressed by relatives across environments. This can be achieved by using statistical machinery that is now well established in the form of generalized linear mixed models (de Villemereuil et al. 2016). Here, a specified function links observations of discrete phenotypes to underlying latent values, which could in turn be modeled as any function (i.e., reaction norm) of environmental variables or other covariates of interest. Such formulations of the threshold trait model are well established in statistical and quantitative genetics theory, and are explicitly enacted using a probit link (i.e., representing an inverse cumulative normal distribution, Gianola 1982;Hadfield 2015;de Villemereuil et al. 2016;Germain et al. 2018;Debes et al. 2020). Yet, the implications and interpretations of such analyses for concepts of latent versus phenotypic plasticity are still surprisingly infrequently explicitly discussed.
Rather, evolutionary ecologists interested in environmentdependent expression of dichotomous traits have reformulated the standard threshold trait model into the "(latent) environmental threshold model" (Hazel et al. 2004;Tomkins and Hazel 2007;Pulido 2011;Buoro et al. 2012;Buzatto et al. 2015). Here, the focal evolving trait is defined as the point on some known environmental axis (or "cue") at which each individual or lineage's phenotype changes state, envisaged as a genetically controlled threshold value. Hence, in contrast to the standard threshold trait model, the environmental threshold model assumes that the environmental axis exerts equal effects across individuals and lineages, whereas the threshold value can show additive genetic variation and evolve (Tomkins and Hazel 2007;Buoro et al. 2012;Buzatto et al. 2015). The standard threshold trait model and the environmental threshold model are equivalent (i.e., reparameterizations of the same conceptual model) in the special case where there is zero liability-scale G×E given the standard threshold model, and hence a cross-environment genetic correlation of 1 (i.e., zero variation in latent plasticity, Roff 1994b; Hazel et al. 2004;Tomkins and Hazel 2007). The point at which an individual or lineage's liability crosses the fixed threshold (i.e., its observed threshold value) is then directly related to its liability intercept (e.g., Fig. 2D). Consequently, the environmental threshold model formulation has motivated empirical investigations of whether the condition of zero G×E in liability (or in some proxy trait) is approximately valid (Roff 1994b).
However, if one ambition is to rationalize and predict evolutionary dynamics of plasticity, then it seems highly restrictive to formulate analyses and conceptual developments through a model that assumes zero liability-scale G×E. In fact, there is no need to make such an assumption, which arises because the environmental threshold model considers environmental and genetic effects on separate orthogonal axes of cues versus thresholds. This structure can be eliminated by considering joint genetic and environmental effects on the same latent liability, by formulating liability-scale reaction norms that could include G×E (Fig. 2C). Indeed, the possibility of both environmental and genetic effects on liability was already fully embedded in the original quantitative genetic threshold trait model, for example, yielding theory on heritability on latent liability (and observed phenotypic) scales (Dempster and Lerner 1950;Lynch and Walsh 1998, Ch. 25). Here, the heritability represents the ratio of genetic to total variance, and the total variance typically includes a substantial component of environmental variance (Roff 1996).
Formulation of a distinct (latent) environmental threshold model is consequently not really necessary, and has invoked additional restrictive assumptions (Roff 1994b;Hadfield 2015). Meanwhile, some articles that are phrased in terms of the environmental threshold model in fact approximately implement the standard threshold model, including interacting latent-scale genetic and environmental effects (e.g., Ostrowski et al. 2000).

INTERPRETATIONS AND IMPLICATIONS
Given that intrinsic properties of phenotypic plasticity and plasticity evolution could differ markedly between threshold traits and continuously distributed traits (Table 2), it becomes important that traits of biological interest are not misconceptualized (e.g., true threshold traits treated as continuously distributed traits), at least without considering the implications of such conversions. It might seem implausible that true discrete and continuous traits could be confused or interchanged. However, given labile phenotypic plasticity, the distinction between the two may not always be clear and may be a matter of biological interpretation.
For example, the seasonal timing of key life-history events, such as breeding, is now a major focus for theoretical and empirical work on plasticity and evolutionary dynamics in relation to environmental change (Nussey et al. 2007;Lof et al. 2012;Inouye et al. 2019;Radchuk et al. 2019). Here, the focal trait is commonly defined, measured, and analyzed as the observed date of event (e.g., egg laying in birds, parturition in mammals, flowering date in plants), which is continuously distributed (e.g., Arnold et al. 2019a,b;Inouye et al. 2019;de Villemereuil et al. 2020). However, biologically, these observed events represent the dates on which individuals switch between approximately discrete phenotypic states (e.g., "nonbreeding" and "breeding"). They could consequently be viewed as manifestations of a threshold trait with labile plasticity, where most or all individuals cross the threshold for reproduction at some point during a season. Quantitative genetic analyses that treat observed date as the focal trait are then effectively using the environmental threshold model, implicitly invoking the assumption of zero liability-scale G×E. This could hinder appropriate evolutionary inference, especially when the objective in studying breeding date is to test for, or evaluate the implications of, nonzero G×E (e.g., Nussey et al. 2007;Charmantier and Gienapp 2014). Further, "date" is unlikely to be the primary driving environmental variable; indeed mean breeding dates commonly vary substantially among years, implying that liability-scale reaction norms shaping the transition to breeding respond to other environmental variable(s) (e.g., Lof et al. 2012;Inouye et al. 2019;de Villemereuil et al. 2020). Microevolutionary analyses that treat observed event dates as continuously distributed traits, if in fact the true underlying biology is a threshold trait with labile plasticity, might consequently give incomplete or erroneous predictions of evolutionary dynamics, at least under some circumstances. Indeed, breeding dates often do not show expected microevolutionary responses to observed directional selection, which has been suggested to reflect additional environmental or genetic constraints (de Villemereuil et al. 2020). Yet, although studying phenology in terms of time to event and associated reaction norms rather than simply observed events has been advocated (Gienapp et al. 2005;Inouye et al. 2019), the merits and practicalities of treating events as manifestations of a plastic threshold trait have scarcely been explicitly considered.
Overall, therefore, future empirical ambitions should be to fully exploit recent conceptual and empirical developments in evolutionary quantitative genetics to explicitly estimate key parameters defining liability-scale reaction norms and resulting expression of threshold traits, and estimate genetic covariances with other traits and components of fitness. Such analyses will require substantial data on discrete phenotypes of relatives and nonrelatives distributed across representative environments, and will therefore require appropriate multivariate axes of environmental variation driving developmental and/or labile plasticity to be measured or imposed. Such advances will be challenging, but will be necessary to predict how complex dynamics of discrete traits can drive phenotypic, population, and evolutionary responses to environmental variation and directional change.

AUTHOR CONTRIBUTIONS
JMR conceptualized and wrote the manuscript, with substantial conceptual and editing input from PA.