• Felipe B. Rocha,

    1. Departamento de Genética e Evolução, Instituto de Biologia, Universidade Estadual de Campinas, Unicamp, Cx. Postal 6109, Campinas, 13083–970 SP, Brasil
  • Louis B. Klaczko

    1. Departamento de Genética e Evolução, Instituto de Biologia, Universidade Estadual de Campinas, Unicamp, Cx. Postal 6109, Campinas, 13083–970 SP, Brasil


Two contrasting views can characterize the attitude of many studies toward reaction norms (RNs). An “optimistic” view attempts to use a linear model to describe RN variation, and a “pessimistic” view emphasizes RN complexity without using any model to describe it. Here, we have analyzed the shape of 40 RNs of five traits of Drosophila mediopunctata in response to 11 temperatures. Our results, along with several other studies, show that RNs are typically curves best explained by nonlinear models. Estimating the set of 40 RNs on the basis of three rather than 11 temperatures produces a scenario, typical of the pessimistic view, where the linear model is either nonsignificant or a poor explanatory model. Moreover, we show that RN nonlinearity can significantly affect the conclusions of studies using the linear model. We propose a middle-ground view on RNs that recognizes their general nonlinearity. Such a view could, on the one hand, explain part of the important phenomenon of genotype–environment interaction emphasized by the pessimistic view. On the other hand, it may explain features and patterns that are being ignored by the optimistic view. We suggest the parabolic model as a first step to reveal patterns that were previously ignored or not fully appreciated.

In evolutionary biology, phenotypic plasticity (PP) and reaction norm (RN) are two central concepts connected to the fact that the phenotype of an organism is affected by the environment where development occurs (Fusco and Minelli 2010). PP may be defined as the ability of a given genotype to respond to environmental variation producing different phenotypes (Pigliucci 2005); and RN is the curve that describes the phenotypic response of a given genotype as a function of the environment, thus referring to a phenotypic trait and an environmental factor which are (or can be represented as) continuous variables (Chevin et al. 2010). Thus, RNs show if and how genotypes respond to a given environmental variation.

From the perspective of biometric analysis, the variation in phenotypic responses of different genotypes exposed to diverse environments (the variation in RN response) is statistically represented as a nonadditive effect of genotype and environment in an analysis of variance (ANOVA), that is, a genotype–environment interaction (G×E) (Via and Lande 1985). Hence, these concepts correspond to one another: variation in PP among genotypes is variation among their RNs, which is detected statistically as G×E.


Despite this correspondence, the “analysis of phenotypic variance is not a sufficient substitute for knowing the actual norms of reaction of genotypes” (Gupta and Lewontin 1982). Therefore, the understanding of G×E and its many evolutionary implications—from the evolution of adaptive PP to the maintenance of genetic variation for quantitative traits—requires the ability to describe the variation of RNs in a comprehensible manner, that is, describe G×E not only as a factor in an ANOVA, but in terms of RNs that may cross each other in a predictable manner. Ideally, this description should be made with a model, thus allowing for outcome predictions of natural selection acting on different genotypes occurring in diverse environments.
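The nonadditive genotype-by-environment term discussed above can be illustrated with a minimal sketch. It assumes a complete table of genotype means per environment; the function name `interaction_terms` and the two small datasets are hypothetical and are not taken from any study cited here. Each interaction residual is the cell mean minus the genotype mean, minus the environment mean, plus the grand mean.

```python
def interaction_terms(means):
    """Return the G x E interaction residual for every cell.

    means[g][e] is the mean phenotype of genotype g in environment e.
    All residuals are zero when genotype and environment act additively.
    """
    n_g, n_e = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (n_g * n_e)
    g_mean = [sum(row) / n_e for row in means]
    e_mean = [sum(means[g][e] for g in range(n_g)) / n_g for e in range(n_e)]
    return [[means[g][e] - g_mean[g] - e_mean[e] + grand
             for e in range(n_e)] for g in range(n_g)]

# Hypothetical means for two genotypes in three environments.
parallel = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]   # same slope, different elevation
crossing = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]   # opposite slopes, RNs cross

parallel_gxe = interaction_terms(parallel)
crossing_gxe = interaction_terms(crossing)
print(parallel_gxe)
print(crossing_gxe)
```

Parallel RNs leave every residual at zero, whereas crossing RNs produce nonzero residuals. The residuals alone do not show where or how the curves cross, which is the limitation of the variance analysis noted by Gupta and Lewontin (1982).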

In this sense, many efforts have been made toward describing actual RNs. Often, such studies have used a simple experimental design: individuals of each genotype are allowed to develop in two, or at most three, different environments, and the RNs are described as the lines connecting the phenotypic values of each genotype (Thomas and Barker 1993; Noach et al. 1997; Bitner-Mathé and Klaczko 1999; Pérez and Garcia 2002; Elberse et al. 2004; Andrade et al. 2005; Gutteling et al. 2007; Ellers and Driessen 2011). This two/three-point-curve experimental design has produced a variety of empirical data, which led to the emergence of different views on G×E, RN, and PP. For clarity, we characterize here the two ends of the continuum that contains these views. We call these epistemological views “optimistic” and “pessimistic,” referring to their attitude toward the possibility of describing and understanding RN variation. The optimistic view seeks to understand G×E by using a simple model to describe RN variation. The pessimistic view, in contrast, states that RNs vary in such complex ways that it is impossible to model and generalize G×E.

In addition to using a model, the optimistic view is characterized by the belief that the simplest possible RN model, the linear equation, is a legitimate simplification of actual RNs. RN variation would then be summarized by the parameters $b_0$ (intercept or elevation) and $b_1$ (slope) in the RN function $P = b_0 + b_1 E$, where $P$ is the phenotypic value and $E$ the environment. Genotypes with different elevations but the same slope would have RNs that run side by side, keeping the same rank order along the environmental axis (Fig. 1A), whereas genotypes with varying slopes but constant elevation would respond differently to the same environmental variation (different PPs), and their curves would cross each other, leading to predictable changes in rank order as the environment varies (Fig. 1B). Hence, the optimistic view promotes the building of models of RN and PP evolution (de Jong 1990; Gavrilets and Scheiner 1993; Scheiner 1993; de Jong 2005; Zhang 2006; Ghalambor et al. 2007; Nussey et al. 2007; Aubin-Horth and Penn 2009; Lande 2009; King and Roff 2010; Reed et al. 2010) by feeding characteristic RN parameters into models.
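Under the linear model, the environment at which two RNs change rank follows directly from their parameters. The sketch below is only an illustration: the function name `crossing_environment` and the parameter values are hypothetical.

```python
def crossing_environment(b0_a, b1_a, b0_b, b1_b):
    """Environment E at which two linear RNs P = b0 + b1*E intersect.

    Returns None when the slopes are equal: parallel RNs never cross,
    so the rank order of the two genotypes is the same everywhere.
    """
    if b1_a == b1_b:
        return None
    return (b0_b - b0_a) / (b1_a - b1_b)

# Hypothetical genotypes: A has the lower elevation but the steeper slope.
cross_at = crossing_environment(10.0, 0.5, 14.0, 0.25)
print(cross_at)                                        # single crossing point
print(crossing_environment(10.0, 0.5, 14.0, 0.5))      # parallel lines: None
```

Two straight lines intersect at most once, so any pair of RNs that changes rank order more than once along the environmental axis cannot both be linear.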

Figure 1.

Different types of reaction norm variation: (A) linear RNs with varying elevation values and constant slope; (B) linear RNs with varying slope values and the same elevation; (C) RN variation typically found in studies which use the three-point-curve experimental design.

On the other hand, the pessimistic view is characterized by the lack of an attempt to adjust any model, due to the perception that RNs show complex variation, crossing each other more frequently than expected by the linear model (Byers 2005) (Fig. 1C). Consequently, under this view it is not possible to build models to predict the evolution of RN shape (PP), and G×E is a synonym of unpredictable and ubiquitous rank order changes among environments. Thus, it perceives RN complexity as leaving no hope for further understanding, leading to the classical aphorism “it is impossible to say which genotype is better or worse” (Lewontin 1974; Gottlieb 2007; Vale et al. 2008).

Both the optimistic and the pessimistic views find support in empirical data. Some of the two/three-point-curve studies seem to provide empirical support for the optimistic view, showing RNs that appear to be linear (e.g., Coyne and Beecham 1987; Liefting et al. 2009). However, the results of other studies seem to support the pessimistic view: three-point-curve RNs that vary in a seemingly random manner, with each genotype curve having its maximum and minimum values at any of the three environments, and RNs crossing each other more frequently than expected by the linear model (e.g., Fig. 2 in Lewontin 1974).

Figure 2.

Mean reaction norms of different traits of Drosophila mediopunctata in response to temperature (left column) along with the parabolic adjustments to each RN curve (middle column) and the reduced RNs with only three temperatures (right column). A, B, and C: Development time (days); D, E, and F: Thorax length (mm); G, H, and I: Aristal branches number; J, K, and L: Sternopleural bristle number; M, N, and O: number of abdominal dark spots (from Rocha et al. 2009).

Both views are represented and reinforced by studies which have addressed key issues on RN evolutionary genetics, arriving at important conclusions. Using a measure of plasticity derived from the optimistic view, Scheiner and Lyman (1991) performed a selection experiment to test the genetic independence between the mean value and the plasticity of a trait. They found that the responses to selection on the mean thorax length of Drosophila melanogaster and on its plasticity were independent, and interpreted these results as supporting the epistasis model, that is, that the plasticity of a trait and the trait mean value are determined by different loci. On the other hand, the pessimistic view is clearly present in the work of Gupta and Lewontin (1982), which paradoxically made an emphatic defense of the importance of knowing RNs, as opposed to relying on the analysis of phenotypic variance. After analyzing the response of three traits of D. pseudoobscura to three temperatures, they stressed that “the essential feature of the norms of reaction” is “that they cross each other so that the large main effect of genotype does not allow one to assume that there are really ‘better’ or ‘worse’ genotypes” (Gupta and Lewontin 1982).

Unmistakably, the results which support the pessimistic view are evidence against the central assumption of the optimistic view: RNs that cross each other more than once are necessarily nonlinear. Indeed, since the pioneer work of Krafka (1920), a number of studies which have described RNs using more than three environments (Thomas 1993; Rocha et al. 2009; and many papers from J. David's group, see below) often show RNs which are clearly nonlinear, even for RNs that had previously been treated as linear (e.g., Coyne and Beecham 1987). Nonetheless, despite the potential consequences of these findings, the two prevailing views on RNs, PP, and G×E seem to remain unchanged.

In a previous work, using 11 different temperatures to test each strain, we have shown that the RNs of the number of dark spots in the abdomen of D. mediopunctata are well described by parabolas (Rocha et al. 2009). Using these curves to test the consensus on the independence between the mean value and the plasticity of a trait, we found a significant correlation between the mean number of spots of each strain and its RN shape (curvature). The RNs change from bowed downward to bowed upward parabolas as the mean number of spots across all temperatures of each strain increases (Rocha et al. 2009). Hence, in contrast with Scheiner and Lyman (1991), who favored the epistasis model, our results supported the pleiotropy model (Via 1993), which states that the same locus determines both the plasticity and the mean value of a trait. By using a nonlinear model for RN description, we could find a pattern of genetic association between the mean value and the plasticity of a trait that is not accessible to linear models, thus highlighting the method's power and providing evidence that its use may lead to a different view on RNs.

Taken together, these findings bring up some questions: (1) Are RNs typically linear or nonlinear curves? (2) Is the pessimistic view a consequence of a linear simplification of nonlinear RNs? and (3) What is being lost with the RN linear model that nonlinear models reveal? To address question (1), we described and analyzed a set of 40 RNs of five traits of D. mediopunctata in response to 11 temperatures. We tried to find, for each of the 40 curves, the polynomial with best significant fit, that is, the function which would get closest to the underlying RN shape. The generality of our findings was assessed by comparing our results to published RNs with more than three points. To investigate question (2), we have reduced the 11 temperature RNs to three temperature RNs and compared the results of the analysis of these two sets of curves as to the possibility of describing RN variation. Question (3) was addressed by verifying if the conclusions of two published studies based on the linear model may be affected by the nonlinearity of the RNs. Finally, we tried to delineate a view on RN, PP, and G×E which takes into account the nonlinearity of RNs, pointing out methodological issues and questions which would result from the adoption of such view.

Material and Methods



We used the same flies examined by Rocha et al. (2009) to describe the RNs of the number of abdominal spots of D. mediopunctata in response to temperature. They belong to a group of eight strains with different second chromosome inversions (PA0 or PC0) but otherwise the same genetic background, produced by Hatadani et al. (2004). These strains were sampled according to a design intended to include the variation of the whole abdominal pigmentation phenotype spectrum present in each chromosomal inversion, while minimizing the observed association between second chromosome inversions and the number of abdominal spots (Hatadani et al. 2004). They showed a marked division in two groups, one with low mean number of abdominal spots, and the other with high mean number of spots. Since the sole trait considered in this sampling design was the abdominal pigmentation phenotype, these strains should represent a random sample with regard to other genetically uncorrelated traits.

Thermal gradient

First instar larvae were collected from each strain and groups of 15 larvae were transferred to vials with 5 mL of culture medium (Rocha et al. 2009). Eleven vials from each strain were kept in a thermal gradient (modified from Fogleman 1978, see Fig. S1 for a picture of the apparatus), varying between 14 and 24°C, with 1°C increments. Three replicates were carried out simultaneously (8 strains × 11 temperatures × 3 replicates = total of 264 vials), and 1122 flies were analyzed. Imagoes of each vial were transferred daily to a new vial after the first emergence. The number of spots was counted on adult flies that were at least three days old, by which time color intensity stabilizes. After that, each fly was put in 70% ethanol, allowing the analysis of the other morphological traits.


We have examined five different traits:

  1. Development time, from the day in which larvae were put into the thermal gradient to the day of emergence.
  2. Thorax length, from the anterior margin of the thorax to the tip of the scutellum.
  3. Number of aristal branches, including the major and the small terminal branches on both sides.
  4. Number of sternopleural bristles, counted on both sides of the fly.
  5. Number of dark spots on the abdominal tergites (data from Rocha et al. 2009).

Analysis of RN shape

A major problem of RN analysis lies in determining the underlying RN shape that produces the observed phenotypic values. David et al. (1990, 1997) proposed a two-step procedure that tests for an environmental effect on the RN slope along the environmental gradient. However, this analysis depends on a set of RNs that can be treated as repeats, an assumption that is not warranted for genotypes whose RNs vary in shape, as found in our previous analysis (Rocha et al. 2009).

To circumvent this difficulty, we used a “forward selection” procedure of curve fitting to obtain the polynomial with the best significant fit for each individual curve. This method consists of testing, for each RN curve, polynomials of increasing order. For each strain and trait, we began by fitting a linear regression of the mean phenotypic value on the temperature, adjusting the linear model $P = b_0 + b_1 T$. Then, we proceeded to the quadratic model, adjusting the function $P = b_0 + b_1 T + b_2 T^2$. To test whether the addition of the quadratic term ($b_2 T^2$) led to a significant improvement of fit (increment of $R^2$), we used the corresponding F value, calculated using the results of the respective ANOVA of each regression, as

$$F = \frac{SS_{\mathrm{reg}}(k+1) - SS_{\mathrm{reg}}(k)}{MS_{\mathrm{res}}(k+1)},$$

where $SS_{\mathrm{reg}}(k)$ is the regression sum of squares of the polynomial of degree $k$ and $MS_{\mathrm{res}}(k+1)$ is the residual mean square of the higher degree model, with 1 numerator degree of freedom and denominator degrees of freedom equal to the residual degrees of freedom of the higher degree model. Thus, even if the linear model had a good fit to an RN curve, we tested whether a quadratic model fit significantly better.

The same procedure was used to test the fit of a third degree (cubic) function ($P = b_0 + b_1 T + b_2 T^2 + b_3 T^3$), and so on, for increasingly higher order polynomials. For each RN, this procedure was carried forward until the addition of two further terms proved nonsignificant, to ensure that relevant terms were not inadvertently neglected. Then, the last polynomial with a significant term was assigned as the best fit polynomial for that RN. This is a standard curve fitting procedure, which uses sequential multiple regression analysis to choose the predictor variables that lead to a significant increase in the coefficient of determination, and is recommended by Sokal and Rohlf (1995) and Zar (1999).
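The forward selection procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names (`residual_ss`, `p_value_f1`, `best_degree`), the 0.05 significance level, the cap on the polynomial degree, and the example data are all hypothetical. The F test for one added term is evaluated through the equivalent two-sided t test (F = t² with 1 numerator degree of freedom).

```python
import math

def residual_ss(x, y, degree):
    """Residual sum of squares of the least-squares polynomial of a given degree."""
    mean_x = sum(x) / len(x)
    half = max(abs(v - mean_x) for v in x) or 1.0
    z = [(v - mean_x) / half for v in x]          # centre and scale for stability
    k = degree + 1
    # Augmented normal equations [A | c] for the coefficients in z.
    a = [[sum(v ** (i + j) for v in z) for j in range(k)]
         + [sum(yi * v ** i for v, yi in zip(z, y))] for i in range(k)]
    for col in range(k):                          # elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, k):
            f = a[r][col] / a[col][col]
            a[r] = [u - f * p for u, p in zip(a[r], a[col])]
    b = [0.0] * k
    for i in reversed(range(k)):                  # back substitution
        b[i] = (a[i][k] - sum(a[i][j] * b[j] for j in range(i + 1, k))) / a[i][i]
    return sum((yi - sum(bj * v ** j for j, bj in enumerate(b))) ** 2
               for v, yi in zip(z, y))

def p_value_f1(f_stat, df):
    """Upper-tail p-value of F(1, df), from the t density integrated by Simpson's rule."""
    theta0 = math.atan(math.sqrt(f_stat / df))    # substitution t = sqrt(df) * tan(theta)
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    n = 1000                                      # even number of Simpson intervals
    h = theta0 / n
    g = lambda th: math.cos(th) ** (df - 1)
    area = g(0.0) + g(theta0) + sum((4 if i % 2 else 2) * g(i * h) for i in range(1, n))
    return max(0.0, 1.0 - 2.0 * const * area * h / 3.0)

def best_degree(x, y, alpha=0.05, max_degree=6):
    """Degree of the last polynomial whose added term was significant (0 = no response).

    Stops once two consecutive added terms are nonsignificant.
    """
    best, misses = 0, 0
    rss_low = residual_ss(x, y, 0)
    for degree in range(1, min(max_degree, len(x) - 2) + 1):
        rss_high = residual_ss(x, y, degree)
        df = len(x) - degree - 1                  # residual df of the higher degree model
        f_stat = max(0.0, rss_low - rss_high) / (rss_high / df) if rss_high > 0 else math.inf
        if p_value_f1(f_stat, df) < alpha:
            best, misses = degree, 0
        else:
            misses += 1
            if misses == 2:
                break
        rss_low = rss_high
    return best

# Hypothetical mean phenotypes at 14-24 degrees C (a noisy parabola).
temps = list(range(14, 25))
means = [7.8, 11.8, 15.6, 17.6, 19.7, 20.0, 19.2, 18.4, 15.4, 12.2, 7.3]
print(best_degree(temps, means))
```

The difference in residual sums of squares between successive models equals the difference in regression sums of squares, so the statistic computed here is the same increment-of-fit F described in the text. Solving the normal equations directly is adequate for low degrees on scaled temperatures; a dedicated least-squares routine would be preferable for higher degrees.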


To investigate whether the typical RN results that sustain the pessimistic view may emerge as a result of oversimplifying RN nonlinearity, we investigated whether the number of temperatures could affect the possibility of describing and understanding RN variation. Since it is not possible to “fill in” the blanks of RN curves from studies which support this view, we carried out an analysis in the opposite direction, reducing our dataset to just three temperatures: 14, 19, and 24°C.

First, a linear regression was performed for the same type of data used in the analysis of the full dataset (mean phenotypic value per strain and temperature), with only the three temperatures. However, this analysis detected significant linear regressions in only two among 40 RNs, hindering the comparison of the results between the two approaches. Thus, to enable the comparison of two sets of mostly significant polynomials, we performed linear regressions for each three-point-curve RN using the individual phenotypes per strain and temperature.
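The reduction step can be sketched as below, using strain means for simplicity (the final analysis described above used individual phenotypes). The functions `reduce_rn` and `r_squared_linear` and the example curve are hypothetical; the curve is an asymmetric parabola chosen only to show how a three-point linear fit behaves when the underlying RN is nonlinear.

```python
def reduce_rn(temps, values, keep=(14, 19, 24)):
    """Keep only the phenotypic values measured at the selected temperatures."""
    return [(t, v) for t, v in zip(temps, values) if t in keep]

def r_squared_linear(x, y):
    """Coefficient of determination of the simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    return 0.0 if syy == 0 else sxy ** 2 / (sxx * syy)

# Hypothetical 11-point RN: a parabola with its maximum at 20 degrees C.
temps = list(range(14, 25))
values = [20 - 0.5 * (t - 20) ** 2 for t in temps]

points = reduce_rn(temps, values)
x3, y3 = zip(*points)
r2_three = r_squared_linear(x3, y3)
print(points, round(r2_three, 3))
```

In this example the three retained points lie exactly on a parabola, yet the straight line explains only about a third of their variation.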


Scheiner and Lyman (1991) performed six selection experiments on D. melanogaster populations: for increased and decreased thorax length at 19°C, increased and decreased thorax length at 25°C; and for increased and decreased plasticity (difference between 19 and 25°C means), and tested each population for direct and indirect responses.

To carry out an exploratory analysis of the effects of changes in RN parameters for each type of response, we examined how the variation of the underlying RN would affect the mean values at 19 and 25°C and the difference between them. Karan et al. (2000) have described the RNs of thorax length of D. melanogaster as parabolic curves bowed upward, using the characteristic values proposed by David et al. (1997) and Gibert et al. (1998): maximum value (MV), temperature of maximum value (TMV), and curvature (g2). Using the mean value for each parameter reported by Karan et al. (2000), we drew three sets of five curves. Each set consisted of the mean RN (μ) and four other curves that differed from it in a single parameter: for MV, μ + 0.5, μ − 0.5, μ + 1, and μ − 1; for g2, μ + 0.005, μ − 0.005, μ + 0.01, and μ − 0.01; and for TMV, μ + 0.5, μ − 0.5, μ + 1, and μ − 1. For each set of parabolic curves, we deduced which linear curves would result if only 19 and 25°C were used to describe each RN.
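A minimal sketch of this deduction is given below. It assumes that the parabola can be written in vertex form, P(T) = MV + g2·(T − TMV)², with g2 < 0 for a curve that has a maximum, and that plasticity is taken as the mean at 19°C minus the mean at 25°C. The baseline parameter values are hypothetical placeholders, not the estimates of Karan et al. (2000); only the increments follow the text.

```python
def two_point_summary(mv, tmv, g2, t_low=19.0, t_high=25.0):
    """Means at two temperatures, and their difference, for a parabolic RN.

    Assumes the vertex form P(T) = MV + g2 * (T - TMV)**2.
    """
    low = mv + g2 * (t_low - tmv) ** 2
    high = mv + g2 * (t_high - tmv) ** 2
    return low, high, low - high

# Hypothetical baseline parameters (placeholders only).
MV, TMV, G2 = 100.0, 20.0, -0.2

mv_set = [two_point_summary(MV + d, TMV, G2) for d in (-1, -0.5, 0, 0.5, 1)]
g2_set = [two_point_summary(MV, TMV, G2 + d) for d in (-0.01, -0.005, 0, 0.005, 0.01)]
tmv_set = [two_point_summary(MV, TMV + d, G2) for d in (-1, -0.5, 0, 0.5, 1)]

for name, curves in (("MV", mv_set), ("g2", g2_set), ("TMV", tmv_set)):
    print(name, [tuple(round(v, 3) for v in c) for c in curves])
```

With these placeholder values, changing MV alone shifts both means by the same amount and leaves their difference untouched, whereas shifting TMV moves the two means in opposite directions.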

A more recent study by Liefting et al. (2009) analyzed the RNs of D. serrata in response to three temperatures (16, 22, and 28°C). They used population samples from a latitudinal gradient in Australia to study the variation of the RN slope as a function of latitude, and found a contrast between one fitness trait (developmental rate) and two morphological traits (wing size and wing:thorax ratio). For the morphological traits, they observed an increase in RN slope with increasing latitude, whereas for the lower half (16–22°C) of RNs of developmental rate they found the opposite result: a decrease in the slope with increasing latitude. The developmental rate was estimated as the reciprocal of development time, 1/(development time), yielding nearly linear RNs which facilitate the use of RN slope as a direct measure of plasticity. To examine whether this transformation could affect the variation of RNs, we used the parabolic function adjusted to the mean development time RN from our data. Each parameter was changed separately, producing three sets of development time curves, each varying in a single RN parameter, which were then transformed into developmental rate RNs for comparison.
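The transformation can be sketched as follows. The quadratic coefficients are hypothetical placeholders, not the function fitted to our data, and the helper names are assumptions; the example only shows the mechanics of changing one parameter of a development time parabola and converting the result into a developmental rate.

```python
def development_time(coeffs, t):
    """Parabolic development time (days): b0 + b1*T + b2*T**2."""
    b0, b1, b2 = coeffs
    return b0 + b1 * t + b2 * t * t

def rate_slope(coeffs, t_low=14.0, t_high=24.0):
    """Slope of developmental rate (1 / development time) between two temperatures."""
    rate_low = 1.0 / development_time(coeffs, t_low)
    rate_high = 1.0 / development_time(coeffs, t_high)
    return (rate_high - rate_low) / (t_high - t_low)

# Hypothetical development time parabola, and the same curve raised by 2 days.
base = (60.0, -3.5, 0.06)
raised = (62.0, -3.5, 0.06)

slope_base = rate_slope(base)
slope_raised = rate_slope(raised)
print(round(slope_base, 5), round(slope_raised, 5))
```

In this example the two development time curves differ only in elevation, so they are parallel on the original scale, but after the reciprocal transformation their slopes differ.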



Response curves

The RNs show large variability of plasticity to temperature (Fig. 2, left column), depending on the trait analyzed. A simple visual inspection indicates that the RNs of developmental time and thorax length show little variation among strains, whereas the number of aristal branches, number of sternopleural bristles, and the number of abdominal spots show striking differences among RNs.

RN shape

We found polynomials with significant adjustment for 39 RNs in a total of 40, that is, 39 curves showed significant response to temperature (Table 1). Among these curves, 18 RNs were significantly best described by second-order polynomials, seven by third-order polynomials, and five by fourth-order polynomials. In sum, there were 30 significantly nonlinear RNs, and nine curves were described by linear equations with no significant increase of adjustment for higher order polynomials. One curve (number of aristal branches for one strain) did not respond to temperature, and so had no significant polynomial adjustment. Indeed, the number of aristal branches was the trait with the lowest adjustment for all polynomial orders, and was also the sole trait where the direction of response varied, that is, strains could show an increase or decrease of phenotypic value with increasing temperature (Table 2). There were 28 curves for which a very good approximation of the underlying RN shape was obtained, with more than 80% of the phenotypic variation explained as a function of the temperature (R² > 0.8). Among these, 24 curves were significantly nonlinear, whereas only four were described by linear polynomials and showed no significant improvement of adjustment to higher degree models.

Table 1. Number of reaction norms with best significant fit for five traits of Drosophila mediopunctata, according to the reaction norm shape (polynomial degree).

Trait                             No fit  Linear  Quadratic  Cubic  Quartic
Development time                  0       0       6          2      0
Thorax length                     0       3       2          2      1
Number of aristal branches        1       2       4          0      1
Number of sternopleural bristles  0       2       2          2      2
Number of abdominal spots         0       2       4          1      1
Total (overall = 40)              1       9       18         7      5
Total with R² > 0.8               0       4       12         7      5

Table 2. Mean R² values (±SE) of each polynomial degree adjusted to the reaction norms of five traits of Drosophila mediopunctata.

Trait                             1st degree  2nd degree  3rd degree  4th degree
Development time                  0.91±0.01   0.98±0.01   0.98±0.01   0.98±0.01
Thorax length                     0.62±0.07   0.78±0.03   0.84±0.03   0.86±0.03
Number of aristal branches        0.28±0.13   0.57±0.07   0.63±0.07   0.72±0.05
Number of sternopleural bristles  0.62±0.05   0.79±0.01   0.84±0.02   0.88±0.02
Number of abdominal spots         0.85±0.07   0.92±0.03   0.94±0.02   0.95±0.01
Overall mean                      0.66±0.05   0.81±0.03   0.85±0.03   0.88±0.02

The distribution of RN shapes indicates some degree of trait specificity. Development time was the sole trait where only nonlinear curves were assigned as the final RN shape, even though it was also the trait with the highest R² values for the linear regression. The number of sternopleural bristles showed the highest variability of RN shapes, with each polynomial order showing best fit for the RNs of two strains.

Table 2 shows the mean R² values for each polynomial degree for each trait. It shows an interesting feature which was common across all traits: the largest increase in R² values occurred from first to second degree polynomials (from 0.66 to 0.81), whereas third and fourth degrees had similar explanatory power (0.85 and 0.88). Table S1 in the supplementary material shows the R² for all polynomials for each strain and character. Table S2 shows the results of the F tests for each curve and each character.


The right column of Figure 2 shows, for the five characters, the mean RNs using only three temperatures, as is commonly found in the literature (see above). They show a marked contrast with the full curves and suggest that part of the crossings among RNs may be caused by error (due to developmental noise or sampling error) around each mean value, which is compensated when more environments are considered.

The results from the linear regression analysis of the 40 three-point RNs are shown in Table 3, along with the summarized results from the analysis of the shape of the full dataset for comparison. Among the 40 three-point RNs, 28 yielded significant linear regressions of the individual values of each strain on temperature (Table 3). The remaining 12 curves were mostly found among the RNs of the thorax length and of the number of aristal branches. The explanatory power of the linear model was very low: regressions with R² values higher than 0.8 were obtained only in the eight RNs of development time, whereas the majority (24) of the RNs had R² values lower than 0.5 (Table 3).

Table 3. Summarized results of the regression analysis of 40 reaction norms, using three (14, 19, and 24°C) or 11 (14–24°C) temperatures: number of significant polynomial adjustments, mean R² value across all regressions, and number of polynomials according to the R² value. Polynomial data: three temperature RNs, only linear regressions; 11 temperature RNs, polynomials with best significant fit for each RN curve.

                             Development time  Thorax length  Aristal branches no.  Sternopleural bristle no.  Abdominal spots no.  Total
Eleven temperature RNs
No. significant polynomials  8                 8              7                     8                          8                    39
Mean R² value                0.98              0.79           0.57                  0.85                       0.94                 0.83
Polynomials with R² > 0.8    8                 5              1                     6                          8                    28
Polynomials with R² < 0.5    0                 0              3                     0                          0                    3
Three temperature RNs
No. significant polynomials  8                 3              3                     6                          8                    28
Mean R² value                0.87              0.08           0.08                  0.30                       0.59                 0.38
Polynomials with R² > 0.8    8                 0              0                     0                          0                    8
Polynomials with R² < 0.5    0                 8              8                     7                          1                    24



Our results show that, for the traits of D. mediopunctata we studied, RNs are generally best described as nonlinear curves. Significantly fit nonlinear curves were markedly more prevalent than linear curves, whether we consider the whole set of significant polynomials (77% nonlinear against 23% linear) or just the set of curves with a better fit (R² > 0.8: 86% nonlinear against 14% linear). Each trait showed a specific pattern of RN shape and variation, which may suggest that the variation of RNs is constrained, possibly depending on how closely related to fitness each trait is: RNs of development time were all bowed downward and showed small variation at each temperature, whereas the RNs of the number of abdominal dark spots varied from parabolas bowed upward to parabolas bowed downward and were more widely spread over the phenotypic range at each temperature.
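The percentages quoted above follow directly from the counts in Table 1, as the short tally below shows. The variable names are ours; the counts are those of the table.

```python
# Counts per trait from Table 1: (no fit, linear, quadratic, cubic, quartic).
table1 = {
    "Development time": (0, 0, 6, 2, 0),
    "Thorax length": (0, 3, 2, 2, 1),
    "Number of aristal branches": (1, 2, 4, 0, 1),
    "Number of sternopleural bristles": (0, 2, 2, 2, 2),
    "Number of abdominal spots": (0, 2, 4, 1, 1),
}
# Last row of Table 1: curves with R2 > 0.8, by polynomial degree.
well_fitted = (0, 4, 12, 7, 5)

totals = [sum(col) for col in zip(*table1.values())]
significant = sum(totals[1:])                    # all curves with a significant fit
nonlinear = sum(totals[2:])                      # quadratic, cubic, and quartic

pct_nonlinear = round(100 * nonlinear / significant)
pct_nonlinear_well_fitted = round(100 * sum(well_fitted[2:]) / sum(well_fitted))
print(totals, pct_nonlinear, pct_nonlinear_well_fitted)
```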

To evaluate the generality of this pattern, one may examine other studies which allow the distinction between linear and nonlinear curves, that is, which use more than three environments to describe RNs. Noticeably, most of these studies have analyzed the response of different Drosophila species to temperature. Among these, most are attributable to the research group headed by Prof. Jean David (Delpuech et al. 1995; Karan et al. 1999, 2000; Gibert and de Jong 2001; Moreteau et al. 2003; Gibert et al. 2004, 2009; David et al. 2005). So far, this group has examined at least six different Drosophila species, describing with nonlinear models the RNs of up to nine different traits and noticing that the RNs of morphological traits of ectothermic species are generally nonlinear (David et al. 1997). A nonexhaustive survey reveals that several other studies reinforce this observation: various traits of adults and larvae of Lepidoptera species (Windig 1994; Kingsolver et al. 2001, 2004); 12 morphological traits of a Hymenoptera species (Bernardo et al. 2007); body size of four Ephemeroptera (mayflies) species (Cabanita and Atkinson 2006); and egg development rate in seven species of Collembola (Janion et al. 2010) show the prevalence of nonlinear RNs.

Furthermore, one of the most complete studies on RNs is the work of Khan and Bradshaw (1976), which reports 54 RNs of six varieties of Linum usitatissimum in response to six different densities, and shows exactly the same pattern: among nine traits, linear curves are prevalent in two (seed weight and plant height), whereas in the other seven traits almost all curves are nonlinear. It is worth noting that this feature and the complex nature of the RNs were pointed out by Khan and Bradshaw (1976): “There are very clear differences in the response of all the varieties. The most obvious difference is (…) a curious break in the response of all the linseed varieties from S2 to S3 not shown by any of the flax varieties. This is a real effect (…).” Similar results are found in the work of Mal and Lovett-Doust (2005), which shows the RNs of seven traits in response to four different water treatments, thus providing more evidence of a possible prevalence of nonlinear RNs in plants.

Hence, empirical data from a number of other studies which analyzed the RNs of different traits, species, and environmental variables, as well as our results, leave no doubt that the best answer to question (1) is that RNs are typically nonlinear curves.


The RNs of abdominal bristle number and viability of Gupta and Lewontin (1982) are the typical data supporting the pessimistic view: they show an almost random variation of RNs, where curves can have either the minimum or the maximum viability value in any of the three temperatures (14, 21, and 26°C). Such a scenario makes no sense if one tries to describe RN variation with linear models. Given that the underlying RNs are probably nonlinear curves, part of this apparent nonsense can be attributed to the fact that, with only three temperatures, it is impossible to infer the general shape of each RN, which removes the evidence that a more complex model is necessary. The 40 three-point curves derived from the 40 11-point RNs of D. mediopunctata provide evidence of this kind of artifact. In all five traits, RNs crossed each other more than once, making the rank order vary among the three temperatures and leading to Lewontin's (1974) well-known pattern.

The two sets of curves lead to totally different scenarios as to the possibility of describing the shape of the RN curves, which is evident in Table 3. With 11 temperatures, no response (PP) was detected for only one curve, and more than two-thirds (28 of 40) of the polynomials could describe more than 80% of the phenotypic response as a function of the environment. In contrast, with only three temperatures, the description of the RNs was hindered: 12 of 40 RNs showed no PP, and only the RNs of one trait had R² > 0.8. Moreover, the mean R² of the linear regressions for the 40 three-point curves was 38%, whereas that of the best models adjusted to the full dataset was 83%. Thus, in describing RNs with more environments, we could uncover their nonlinearity, which points to the necessity of more complex models to describe their shape and variation. This led to an increase of 45 percentage points in the explanatory power of the RN models, which made the rank order changes (the most important feature of RNs according to the pessimistic view) a phenomenon resulting from the variation of parameters which can be estimated and used to understand and predict the G×E.


Contrary to the pessimistic view, the optimistic view starts from the assumption that RNs can actually be described by a linear model. Even though this view ignores the nonlinearity of RNs, this approach can sometimes lead to interesting results. Such is the case of the work of Scheiner and Lyman (1991).

The indirect responses in their work revealed an intriguing pattern. Selection for mean thorax length at each temperature invariably produced indirect responses in the same direction (increasing or decreasing) at the other temperature. Yet, populations selected for larger thorax length at 19°C had an increase in plasticity, and the same held true for populations selected at 25°C in the opposite direction, that is, for smaller thorax length. Moreover, selection for higher plasticity led to decreased thorax length at 25°C, whereas selection for lower plasticity decreased (although nonsignificantly) thorax length at 19°C.

Given that the variation of mean values at a given temperature is part of the variation of RNs of different genotypes, it is clear that selection for mean value eventually selects also for RN. Thus, the difference between selecting solely the mean value or the plasticity of a trait would be whether the selection affects exclusively the elevation of the RN curve or not.

Figure 3 shows that, for each parameter variation, there would be different responses to the selection regimes of Scheiner and Lyman (1991). If only MV varies, selection for increased or decreased thorax length at either temperature would produce the same response at the other temperature, without changing the plasticity (Fig. 3A and B). If only g2 varies, selection at 25°C would produce a direct response on mean value and an indirect response on plasticity, with plasticity increasing as the mean value at 25°C decreases; whereas at 19°C, there would be none or little direct and indirect response to selection (Fig. 3C and D). Finally, if only the TMV varies, the RN parabolas maintain the same curve shape, changing only the horizontal position (Fig. 3E and F). Thus, selection for increasing mean value at 19°C would lead to decreased mean value at 25°C, increasing plasticity; and selection for increasing mean value at 25°C would lead to decreased mean values at 19°C, thus decreasing plasticity.
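The reasoning above can be reproduced with a minimal sketch. It assumes the parabola is written in vertex form, y = MV + g2 (T − TMV)², with MV and TMV taken to be the coordinates of the vertex, and it measures plasticity as the slope between the means at 19 and 25°C. All parameter values are hypothetical and chosen only for illustration; they are not the estimates of Karan et al. (2000).

```python
# Illustrative sketch: hypothetical parameter values, not published estimates.

def parabolic_rn(temp, mv, tmv, g2):
    """Parabolic reaction norm in vertex form: y = MV + g2 * (T - TMV)**2."""
    return mv + g2 * (temp - tmv) ** 2


def summarize(mv, tmv, g2, low=19.0, high=25.0):
    """Mean values at the two selection temperatures and the slope between them."""
    y_low = parabolic_rn(low, mv, tmv, g2)
    y_high = parabolic_rn(high, mv, tmv, g2)
    slope = (y_high - y_low) / (high - low)
    return y_low, y_high, slope


# Hypothetical baseline curve (concave, maximum near 19.5 degrees C).
BASE = dict(mv=1.00, tmv=19.5, g2=-0.001)

# Vary one parameter at a time, holding the other two at the baseline.
scenarios = {
    "MV": [dict(BASE, mv=v) for v in (0.98, 1.00, 1.02)],
    "g2": [dict(BASE, g2=v) for v in (-0.0005, -0.0010, -0.0015)],
    "TMV": [dict(BASE, tmv=v) for v in (19.5, 20.5, 21.5)],
}

results = {
    name: [summarize(**params) for params in variants]
    for name, variants in scenarios.items()
}

for name, rows in results.items():
    print(f"varying {name}:")
    for y19, y25, slope in rows:
        print(f"  y(19) = {y19:.5f}  y(25) = {y25:.5f}  slope = {slope:+.5f}")
```

Under this parameterization the slope between the two means equals g2 (19 + 25 − 2 TMV), so it does not depend on MV and it reaches zero when TMV lies midway between the two temperatures (22°C). This is consistent with the idea that a limit on how far TMV can increase would also limit how far plasticity can be reduced.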

Figure 3.

Mean reaction norm of thorax length of D. melanogaster females reported by Karan et al. (2000), varying separately for each parameter (left column), along with the deduced variation of the mean values at 19 and 25°C and of the slope between means (right column) for each curve. A and B: reaction norms with varying MV. C and D: reaction norms with varying g2. E and F: reaction norms with varying TMV.

These findings may explain part of the results of Scheiner and Lyman (1991): MV variation could account for the indirect responses between the mean values at each temperature, whereas TMV could be a cause of the indirect responses between plasticity and mean values. Moreover, a possible constraint to the increase of TMV would explain why the authors could not decrease plasticity below zero. Weber and Scheiner (1992) could map part of the different responses to different chromosomes of D. melanogaster, suggesting that parameters with independent genetic basis were changed, perhaps MV and TMV. However, without knowing the final RN shape for each population, it is not possible to know which parameters actually caused the responses in mean value and plasticity, because different combinations of parameters could produce similar results.


Liefting et al. (2009) have used the variation of RN slope along a latitudinal gradient to test the theoretical prediction that fitness traits would have more canalized RNs in more variable environments. Their test was grounded on the results of developmental rate RNs, which showed a pattern of RN variation that they claimed to be evidence favoring this hypothesis.

Liefting et al. (2009) were aware that the RNs of development time are typically nonlinear curves, making it difficult to use a linear model to measure plasticity. To deal with this issue, they transformed the development time into developmental rate. Nevertheless, this was not sufficient to make the RNs completely linear, as they themselves noted: “yet the slopes of the lower part of the reaction norm were clearly steeper than the slopes from the upper part” (Liefting et al. 2009). Arguing that “… small deviations from linearity in reaction norms can have profound effects on overall performance,” they decided to split the RNs into two segments of two temperatures (16–22°C and 22–28°C), and analyzed the slope of each segment separately (Liefting et al. 2009).

Comparing parabolic RNs of development time that vary separately in each parameter with the corresponding transformed RNs of developmental rate reveals that, along with the linearization, this procedure introduces an undesirable effect. This effect is most clearly illustrated by RNs of development time that vary only in the elevation parameter (b0) (Fig. 4A). Since b1 and b2 do not vary, the difference between curves is kept constant along the environmental axis, and there is no variation in plasticity. Figure 4B shows the effect of transforming these curves into developmental rate RNs: as development time increases toward lower temperatures (e.g., 15°C in Fig. 4A), the developmental rates of different RNs converge toward zero (Fig. 4B); conversely, as the development time decreases toward higher temperatures, the developmental rates diverge. Thus, RNs with different slopes were produced by the transformation of curves that were originally identical in their phenotypic responses.
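A small numerical sketch makes the artifact concrete. It assumes development time follows time = b0 + b1 T + b2 T², with hypothetical coefficients (chosen only so that time falls with temperature over 15 to 25°C), and it varies b0 alone before applying the reciprocal transformation. The values are not estimates for D. mediopunctata.

```python
# Illustrative sketch: hypothetical coefficients, not estimates from the study.

def development_time(temp, b0, b1=-5.0, b2=0.1):
    """Parabolic RN of development time (days): b0 + b1*T + b2*T**2."""
    return b0 + b1 * temp + b2 * temp ** 2


temps = [15.0, 20.0, 25.0]
elevations = [80.0, 85.0, 90.0]   # only the elevation parameter b0 varies

# Development time and its reciprocal (developmental rate) for each curve.
times = {b0: [development_time(t, b0) for t in temps] for b0 in elevations}
rates = {b0: [1.0 / v for v in times[b0]] for b0 in elevations}


def end_to_end_slope(values):
    """Slope between the lowest and the highest temperature."""
    return (values[-1] - values[0]) / (temps[-1] - temps[0])


time_slopes = {b0: end_to_end_slope(times[b0]) for b0 in elevations}
rate_slopes = {b0: end_to_end_slope(rates[b0]) for b0 in elevations}

for b0 in elevations:
    print(
        f"b0 = {b0:.0f}: slope of time = {time_slopes[b0]:+.4f}, "
        f"slope of rate = {rate_slopes[b0]:+.6f}"
    )
```

The slopes of development time are identical across the three curves, whereas the slopes of developmental rate differ, and the gap between rates is wider at 25°C than at 15°C. The apparent variation in plasticity on the rate scale therefore comes entirely from the transformation in this example.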

Figure 4.

Effect of the transformation of development time into developmental rate on the plasticity of reaction norms: (A) mean reaction norm for the development time of D. mediopunctata with varying elevation (b0) values and constant plasticity (b1 and b2); (B) developmental rate reaction norms resulting from the transformation of the curves shown in A, as developmental rate = (development time)⁻¹.

These RNs show a similar variation pattern to that found by Liefting et al. (2009), suggesting that the result which supports their work may actually be an artifact. This is a very recent example of a problem which may result from the misunderstanding of the optimistic view: instead of trying to understand the actual RN shape and investigate how it varies, the authors preferred to “force” the RN curve to adjust into the linear model, ultimately arriving at an unsupported conclusion.


Given these findings, we claim that a third view on RNs, PP, and G×E, one which takes into account the nonlinearity of RNs, should be adopted as a middle ground between the two extreme views. Such a view would incorporate part of the complexity emphasized by the pessimistic view, including many aspects of nonlinear curves which have been ignored by the optimistic view, while following the same attitude as the optimistic view in attempting to describe RNs with models whose parameters can be used to describe their variation.

This view would not necessarily entail the use of highly complex models. Our results show that the quadratic function could satisfactorily explain more than half of the significant RNs: there were 24 RNs for which the quadratic model gave a significant fit, including those where higher order polynomials were also significant. Moreover, quadratic polynomials accounted for the majority of the gains in explanatory power conferred by the use of nonlinear models. The increment in R² values due to the increase in polynomial order can be partitioned among the polynomial degrees. Thus, the improvement of fit due to quadratic polynomials represents 68% of the total increase in mean R² (see Table 2).
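One way to write this partition, in notation that is ours rather than the source's, is to let $\bar{R}^2_1$ and $\bar{R}^2_2$ denote the mean R² of the linear and quadratic fits, and $\bar{R}^2_{\text{best}}$ the mean R² of the best polynomial for each RN. Assuming the gain is measured relative to the linear model, the share attributed to the quadratic term would be:

```latex
\[
  \frac{\bar{R}^2_{2} - \bar{R}^2_{1}}
       {\bar{R}^2_{\text{best}} - \bar{R}^2_{1}}
  \approx 0.68
\]
```

The individual mean R² values are those reported in Table 2 and are not reproduced here.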


The middle column of Figure 2 demonstrates this effect, showing that parabolic functions adjusted to the 40 RNs could capture the general pattern of RN variation for all five traits. Therefore, given its generality and explanatory power, the parabolic curve seems to be a reasonable model for studies of RN variation aiming to account for the nonlinearity of RNs. Additionally, it can be described by the characteristic values proposed by David et al. (1997), which summarize the parabolic RN properties in two positional parameters (MV and TMV) and one shape parameter (g2) with straightforward interpretation. An inevitable question which emerges from such a model concerns how to measure plasticity in a parabolic RN. Although it is clear that MV variation affects only the mean value of the genotype and that g2 variation directly affects the amount of change along the whole environmental axis, the variation in TMV produces a more complex scenario. RNs with varying TMVs keep the same curve shape but may or may not present the same total variation (difference between maximum and minimum value) depending on the environmental range.
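The dependence of total variation on TMV and on the environmental range can be sketched as follows. The sketch assumes that MV and TMV are the coordinates of the parabola's vertex, so that for a fitted polynomial b0 + b1 T + b2 T² we have g2 = b2, TMV = −b1 / (2 b2), and MV = b0 − b1² / (4 b2). The function names and all numerical values are hypothetical.

```python
# Illustrative sketch: assumes MV and TMV are the vertex coordinates of the
# parabola; all numerical values are hypothetical.

def characteristic_values(b0, b1, b2):
    """Convert polynomial coefficients (b0 + b1*T + b2*T**2) to (MV, TMV, g2)."""
    tmv = -b1 / (2.0 * b2)              # temperature of the maximum/minimum
    mv = b0 - b1 ** 2 / (4.0 * b2)      # phenotypic value at that temperature
    return mv, tmv, b2


def total_variation(mv, tmv, g2, t_min, t_max):
    """Difference between the largest and smallest phenotypic value in a range."""
    candidates = [t_min, t_max]
    if t_min <= tmv <= t_max:
        candidates.append(tmv)          # the vertex counts only if inside the range
    values = [mv + g2 * (t - tmv) ** 2 for t in candidates]
    return max(values) - min(values)


# Same shape (g2) and same MV; only the horizontal position (TMV) differs.
mv, g2 = 10.0, -0.05
for tmv in (19.0, 16.0, 30.0):
    variation = total_variation(mv, tmv, g2, t_min=14.0, t_max=24.0)
    print(f"TMV = {tmv:4.1f}: total variation over 14-24 = {variation:.2f}")
```

With these hypothetical values, three curves of identical shape show different total variation over the same range purely because the vertex sits at a different position relative to that range. This is why a single plasticity measure is harder to define for TMV variation than for MV or g2 variation.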

The available data on nonlinear RNs allow a preliminary assessment of the patterns that may arise from such an approach. An interesting pattern which our results and those from Prof. David's group show is that each trait shows specific shape and variation of RNs, and the RNs of the same trait seem to keep the overall shape in different species (see David et al. 1997). Thus, there seems to be some level of developmental constraint to the variation of RN shape, an aspect mostly ignored by RN models. Nonlinear RNs introduce new features which are completely ignored by the linear model, such as the existence of a peak (or valley) in the RN curve. Moreover, our previous findings (Rocha et al. 2009) are strong evidence that the use of nonlinear curves may uncover new patterns which would be ignored with a linear perspective: if, instead of parabolas, we had adjusted linear curves to each RN, we would have found that the mean value of the trait was correlated with the curve elevation, the opposite of the conclusion of that work. Finally, while Scheiner and Lyman (1991) could successfully select what may be thought to be different parameters of the thorax RNs, our previous results strongly suggest that, for the number of abdominal dark spots of D. mediopunctata, the three parabolic parameters do not vary independently (Rocha et al. 2009). Hence, perhaps with nonlinear RNs, the controversy on the independence of the genetic basis of trait mean and plasticity (Via et al. 1995) may turn out to be of limited significance.

To investigate these questions, different strategies may be adopted in both modeling and experimental approaches. For model building, the use of a parabolic model as a first approximation to nonlinearity would certainly be profitable, making it possible to verify whether different scenarios of evolution of RNs and PP come forth.

For experimental studies, the minimum number of points required for a significant polynomial adjustment will depend on the order of the polynomial, on the error of each mean phenotypic value for each environment (a function of sample size), and on the underlying shape of the RN. Yet, for a significant parabolic adjustment, at least five different environments should be used, to cover the possible positions of the MV.
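The dependence on polynomial order can be illustrated with the residual degrees of freedom of a least-squares fit to one mean value per environment. This is a general property of polynomial regression and is offered as one consideration only; it is not the authors' stated rationale for the five-environment recommendation. The function name is ours.

```python
# Illustrative sketch: residual degrees of freedom for a polynomial fitted to
# one mean phenotypic value per environment.

def residual_df(n_environments, degree):
    """Data points minus estimated coefficients (degree + 1)."""
    return n_environments - (degree + 1)


for n in (3, 5, 11):
    by_degree = {degree: residual_df(n, degree) for degree in (1, 2, 3)}
    print(f"{n:2d} environments -> residual df by degree: {by_degree}")
```

With three environments, a parabola passes exactly through the three means and leaves no residual degrees of freedom, so its fit cannot be tested against the scatter of the means. Five environments leave two, and 11 leave eight.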


In spite of the general pattern of RN shape we have shown, there are indeed RNs which are nearly linear curves and RNs which may be nonlinear but show linear response within a specific environmental range. In such cases, the linear model and the two/three-point-curve experimental design are undoubtedly appropriate. It could also be argued that nonlinear RNs may be made linear with the appropriate transformation. Yet, this would apply only to monotonically varying RNs which do not change the curvature sign among genotypes. In our data, this applied only to development time, which, as we have shown, is usually transformed into developmental rate, creating an artifact. The question remains of what scales of environmental and phenotypic variables are relevant to each organism. Although we cannot answer it at present, we believe this question should not motivate arbitrarily chosen transformations which, with the aim of coercing nature into models, may introduce more complication into the study of phenotypic determination and make the comparison among genotypes, traits, and species much more difficult.

On the other hand, there are, probably, RN curves which no model can describe. Facing these curves, the attitude of the pessimistic view is justifiable. Luckily, such curves seem to be very rare: here, one of 40 RNs could not be described by any polynomial. Furthermore, for both linear and nonlinear models, the description of the RNs of a given trait in response to more than one environmental variable will certainly introduce more complication into the study of G×E.

Finally, we must recognize that the biological relevance of nonlinearity of RNs will depend directly on the distribution of environments that occur in nature, on the distribution of individuals in natural populations and on what is the relevant environmental range for each species: RNs are just part of the equation that determines whether a linear or nonlinear approach should be used. For instance, if only the extremes of RN curves are nonlinear, the biological importance of nonlinearity will depend on whether natural populations face these extreme environments. In the present work, as in other studies (e.g., Karan et al. 2000), the position of the evidently nonlinear part of RNs depended on the trait: for the number of abdominal dark spots it was on the lower half of the temperature range, for development time it was on the upper half, and for the number of sternopleural bristles it was nearly the middle of the temperature range (Fig. 2, middle column). Moreover, we can confidently assume that D. mediopunctata natural populations actually face the temperature range we studied, as temperature measurements taken in different sites where D. mediopunctata individuals were collected varied between 8 and 26°C. Thus, the whole temperature range of our RNs is likely to be relevant to understand the determination of phenotypic variation in the field. Yet, this is not necessarily valid for other species, and RN studies should always consider these factors.


Here, more than answering questions, we aimed to emphasize the urgent necessity of facing RN features and questions which have been ignored up to now, or not appreciated, and suggest that a possible manner of dealing with them is adopting a middle ground view on RN variation, considering slightly more complex models than the straight line.

Models aim at making simplifications of natural phenomena and patterns while capturing their essential features (Levins 1966). This is in conformity with Whitehead's view (1920) that “the aim of science is to seek the simplest explanations of complex facts.” Certainly, the linear model was the first step of a research program on the genotype–environment determination of the phenotype. It enabled us to have a first glimpse of the G×E and to discover genes that act on different parameters of the RN, such as in Weber and Scheiner (1992).

However, as Whitehead (1920) foresaw, with its use “we are apt to fall into the error of thinking that the facts are simple because simplicity is the goal of our quest.” In many cases, the linear model may have misled us (e.g., Liefting et al. 2009) into thinking that RNs should be linear (or linearized) because the RN models were built upon this simplifying assumption; or we simply stopped short of exploring the consequences, both empirical and theoretical, of slightly more complex models. In this article, we urge that a relatively small extra effort should be made to assess this new level of complexity, again repeating Whitehead in his dictum: “seek simplicity and distrust it.”

Associate Editor: R. Bonduriansky


We thank H. F. Medeiros for tips on building and maintaining the thermal gradient; D. H. D. Moraes for technical and collecting help; M. R. D. Batista and I. M. Ventura for help with collecting; L. Nascimento, Coordenador de Pesquisa do Parque Nacional do Itatiaia, for collecting authorization and hospitality at the Park; and Joel Bernardino for help with the field work. We appreciate the technical help of M. S. Couto, C. Couto, and K. A. de Carvalho. We are grateful to F. Boschiero and C. S. V. de Oliveira from Espaço da Escrita, CGU-Unicamp, who were in charge of reviewing the English version of the manuscript. H. Montenegro gave cogent suggestions to improve the manuscript. R. Bonduriansky and two anonymous referees gave us important suggestions on a previous version of the manuscript. We acknowledge the financial support of Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Fundação de Amparo ao Ensino e Pesquisa (FAEP), and Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP).