Present address: Department of Ecology, Evolution and Behavior, University of Minnesota, 1987 Upper Buford Circle, St. Paul, Minnesota 55108–6097.


A key aspect of biodiversity is the great quantitative variation in functional traits observed among species. One perspective asserts that trait values should converge on a single optimum value in a particular selective environment, and consequently trait variation would reflect differences in selective environment, and evolutionary outcomes would be predictable. An alternative perspective asserts that there are likely multiple alternative optima within a particular selective environment, and consequently different lineages would evolve toward different optima due to chance. Because there is evidence for both of these perspectives, there is a long-standing controversy over the relative importance of convergence due to environmental selection versus divergence due to chance in shaping trait variation. Here, I use a model of tree seedling growth and survival to distinguish trait variation associated with multiple alternative optima from variation associated with environmental differences. I show that variation in whole plant traits is best explained by environmental differences, whereas in organ level traits variation is more affected by alternative optima. Consequently, I predict that in nature variation in organ level traits is most closely related to phylogeny, whereas variation in whole plant traits is most closely related to ecology.

One of the most fascinating challenges in biology is to explain the large quantitative variation in trait values observed in a group of organisms (Thompson 1917; Stebbins 1950). This variation is evident both in groups that are defined by phylogenetic relatedness and groups that are defined by ecological function. For example, some functional traits of plants can vary by several orders of magnitude even within the same site (Westoby et al. 2002).

Theoretical biologists have traditionally explained trait variation as arising predominantly as a consequence of variation in external selection factors such as differences in climate and levels of predation (Darwin 1859; Levin and Muller-Landau 2000). Implicit in this perspective is the assumption that evolution should tend to converge on a single optimum strategy for a particular selective environment (Maynard Smith 1982; Sutherland 2005; Vermeij 2006; Weinreich et al. 2006). However, a few evolutionary biologists have pointed out that there could be multiple solutions to the same evolutionary challenge (Bock 1959; Lewontin 1978; Körner 1991; Price et al. 2004). If there are multiple local optima, then the particular optimum strategy that a lineage arrives at, and the particular path that the lineage takes to reach this local optimum, will depend on chance (Bock 1976; Gould 1989; Pál et al. 2006). Chance in this case includes both the randomness of genetic recombination and mutation, as well as the many contingencies encountered by a lineage over its evolutionary history (Gould and Lewontin 1979).

Although most biologists acknowledge that both chance and environmental selection affect trait variation, there is a long-standing controversy around the relative importance of these two evolutionary forces (Kimura 1968; King and Jukes 1969; Bock 1976; Gould and Lewontin 1979; Mayr 1983; Maynard Smith et al. 1985; Travisano et al. 1995; Losos et al. 1998; Joshi et al. 2003; Wagenaar and Adami 2004; Lehman 2004; Bradley and Folk 2004; Vermeij 2006; Pál et al. 2006). Here I use an original approach to modeling tree seedling trait variation to quantitatively investigate the following questions bearing directly on this controversy: (1) Does a trait difference between two species represent adaptation to environmental differences, or is it a consequence of evolution toward different optima due to chance, or is the difference caused by a combination of the two effects? (2) How does the relative importance of environmental differences and alternative optima change with the trait considered? (3) What distinguishes traits that are more affected by environmental selection from traits that are more affected by alternative optima?

Using a simplified generic model of trait selection, I will generalize the results obtained with the biologically realistic tree seedling model. The greater simplicity of this model will also allow investigation of the mechanism causing the emergence of multiple alternative trait optima in organisms. Finally, I will use the insights gained from this investigation to make predictions for trait variation that can be tested in comparative studies.


Most approaches to investigating the evolution of trait diversity have focused on the radiation of a clade (Losos et al. 1998; Travisano and Rainey 2000; Givnish et al. 2000; Schluter 2000a,b). Although this approach has been insightful in investigating many evolutionary questions, it cannot be used to answer the question of the relative importance of environmental selection versus chance as causes of diversity, because the trait changes in the evolution of each branch are a consequence of both causes. To circumvent this problem of covariation, I used a model in which the two causes of diversification can be varied independently, allowing them to be distinguished in the analysis.

My approach was to search for optimal strategies within two environmental treatment levels using a realistic model of tree seedling form and function (Marks and Lechowicz 2006a). This model was used previously to show that multiple optimal trait strategies exist within a single environment, which makes it an ideal choice for the present investigation (Marks and Lechowicz 2006b). Each optimization run of the model was started with a different set of random initial trait value combinations. In this way any similarities in traits between different trait strategies optimized for a particular environment will be due to convergent selection rather than common ancestry, thus avoiding the problem of covariation. Furthermore, the random combinations span most of the range of possible trait values to ensure that most of the trait strategy space is explored, which is essential for drawing general conclusions (Fig. 1).

Figure 1.

Diagram illustrating a hypothetical phylogeny in trait strategy space. If a clade originated a very long time ago (time 1 in the diagram), then it has had sufficient opportunity to explore the entire breadth of the trait strategy space by time 3. In contrast, if a particular subclade originated relatively recently at time 2, then it will only have had the opportunity to explore a small section of the trait strategy space by time 3. Thus, to ensure that the entire strategy space is explored while still using a recent starting time (time 2 in the diagram), one needs to run multiple replicate simulations starting at random locations spanning the breadth of trait strategy space. I took this random starting point approach to be able to generalize over the full range of optimal strategy outcomes. In contrast, when analyzing field data, one is usually restricted to a single subclade, and a restricted range of potential strategy outcomes is inevitable. Conclusions from field studies are thus case specific and cannot be generalized as easily as theory requires.

I used analysis of variance (ANOVA) to partition the variation associated with the environmental difference from the variation associated with alternative optima for each trait in the model output. Specifically, I used the r2 value from an ANOVA as a measure of the degree to which the variation in a particular trait is explained by the environmental treatment. I considered the remaining unexplained variation as due to the trait variation within the simulated environments caused by the existence of multiple optimal strategies. Thus, in model results the r2 value can be used as a measure to compare the relative importance of the environmental treatment differences and the effect of alternative optima in explaining the variation in a trait. By examining the general pattern of variation in these r2 values across all of the ANOVAs, I gained general insights about the relative influences of environmental differences and alternative optima in shaping trait variation. Based on these general patterns I make several concrete predictions that I offer as testable hypotheses to guide future field research into trait diversity.
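The variance partitioning described above can be illustrated with a minimal Python sketch. The function name and the trait values are hypothetical and are not taken from the original analysis; the sketch computes the unadjusted r2 of a one-way ANOVA, whereas the reported values are adjusted r2.

```python
from statistics import mean

def anova_r2(groups):
    """Share of total trait variation explained by treatment (one-way ANOVA r2).

    `groups` holds one list of trait values per treatment level.
    Assumes the trait is not constant across all values (total SS > 0).
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

# Hypothetical trait values of optima found under two treatment levels.
low_nitrate = [1.0, 2.0, 3.0]
high_nitrate = [3.0, 4.0, 5.0]

r2 = anova_r2([low_nitrate, high_nitrate])   # variation tied to the environment
within_share = 1.0 - r2                      # variation left to alternative optima
```

In this toy case the two shares sum to one by construction, which is the sense in which r2 is read here as the relative weight of environmental differences versus alternative optima.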



The simulations in this article were done using the tree seedling adaptive designs model (or TAD model). The TAD model and its realism have been described in detail elsewhere, and I will only introduce its main features here (Marks and Lechowicz 2006a). Although a seedling model cannot include trade-offs between seedling traits and the traits of adult trees or seeds, the mechanisms operating to produce multiple optima in the more special case of a seedling model would also operate in a more complicated adult tree model. In the section following the results from the TAD model, I will use a simplified generic model of trait optimization to demonstrate that the general patterns illustrated by this investigation with the TAD model should also appear in holistic models of any other type of organism.

The TAD model is an optimization model. Unlike many previous optimization models, which studied a few traits related to a single trade-off in isolation, TAD sacrifices some accuracy in individual traits in favor of a more holistic approach that includes the interactions and trade-offs among multiple traits at the whole plant level, where selection operates. In particular, growth in TAD does not depend only on the economy of a single resource; rather, it is affected by carbon uptake, water balance, nitrate uptake, and light interception, as well as their interactions. The uptake of each of the resources depends on the interactions among multiple individual traits as well as the environmental conditions. For example, root length and distribution, xylem traits, and stomatal behavior, as well as the water potential in the soil and the vapor pressure deficit of the air, interact to affect water use. Furthermore, the environment is modeled explicitly to include the effects of the seedling and its competing neighbors on resource availability. Thus, there is a complex network of interactions in the TAD model.

Another important feature of the TAD model is that it has a single fitness measure that is biologically realistic for tree seedlings. Specifically, selection maximizes growth rate, but this optimization is subject to constraints. I use the word constraint in the sense that it is used in constrained optimization by mathematicians. In optimization, a constraint is a side condition that restricts the values for individual variables or more typically combinations of variables (traits in the case of TAD). In TAD for example, survival acts as a constraint that requires the tree seedling strategies to prevent dehydration, avoid carbon starvation, and prevent mechanical failure. Thus, as in nature, plant organs must be competent at performing multiple functions. Seedling stems should supply leaves with water at a minimal cost, but they also need to be strong enough to support the crown mechanically; and leaves must maximize net carbon gain, while at the same time avoiding permanent wilting. The optimal combinations of trait values are a compromise between maximizing growth, and satisfying these constraints. Due to the limited space here I refer the interested reader to Marks and Lechowicz (2006a,b) for a comprehensive discussion of the many constraints, trade-offs, feedbacks, and traits included in the TAD model.

Simulations with the TAD model search for an optimal combination of values for 34 individual tree seedling traits. In addition to these 34 independent traits, the model also calculates many other traits and performance measures that depend on these 34 traits in the model. The 34 independent traits that are collectively optimized include four parameters related to seed reserve allocation, six to carbon allocation, three involved in nitrogen allocation, three for stomatal control, nine leaf traits, five root traits, and four wood traits (Table A1). Historically, a set of trait values specifying the morphology of an organism has sometimes been referred to as a “Bauplan,” the German word for an engineering design (Riedl 1978; Gould and Lewontin 1979). However, the traits in TAD also include behavioral parameters such as those specifying stomatal control, and consequently I will instead use the more modern term “evolutionary strategy” from evolutionary game theory (Maynard Smith 1982; Dieckmann and Ferrière 2004).

Table A1. List of treatment effects on tree seedling traits (based on a separate ANOVA for each trait). The sign gives the direction of the effect, and the strength of the effect is indicated by the magnitude of the r2 value. The table lists all effects that were significant at P < 0.05. The analyses were done on all 100 replicates and repeated using just the optima that were in the top quartile of fitness, because one can argue that the less fit optima are not well adapted to the environment. Traits are arranged by level of integration from individual traits to whole plant traits. The first 34 individual traits listed in the table are the trait variables that are optimized in the model. The values of all of the other traits are derived from these 34 trait variables during the seedling growth simulation. Note that because the optimal strategies are all able to survive and remain mechanically intact, fitness is equal to average growth rate, which is listed at the end of the table.
Treatment: Nitrate availability treatment effect
Trait | All 100 optima: Effect, P, Adj. r2 | 25 best optima only: Effect, P, Adj. r2
Individual traits
 Carbon allocation multiplier
 Carbon allocation exponent
 Carbon allocation constant
 Nitrogen allocation multiplier
 Nitrogen allocation exponent
 Nitrogen allocation constant
 Carbon storage allocation multiplier
 Carbon storage allocation exponent
 Carbon storage allocation constant
 Max root depth | + <0.001 0.37 | + <0.001 0.60
 Max lateral crown extent
 Max lateral root system extent | 0.036 0.02 | 0.043 0.08
 Max leaf intensity parameter | 0.003 0.04
 Max fine root intensity parameter | 0.026 0.03 |
 Stem taper | <0.001 0.09
 More twig than petioles | 0.01 0.03 |
 Vertical fine root distribution exponent | 0.018 0.03
 Max stomatal conductance
 Stomatal control constant | 0.015 0.12
 Stomatal control exponent
 Minimum stem cross section area per leaf area | 0.01 0.03
 Minimum thick root cross section area per fine root length
 Xylem cell wall thickness to cell diameter ratio | 0.047 0.02
 Leaf cuticle thickness
 Mesophyll cell nitrogen content per volume | <0.001 0.09 | 0.046 0.08
 Mesophyll cell wall thickness | | <0.001 0.37
 Sclerenchyma fraction of leaf cross section area | <0.001 0.08
 Mesophyll cell diameter
 Number of mesophyll cell layers
 Fine root nitrogen content | <0.001 0.12 | 0.017 0.11
 Initial root to shoot ratio | 0.001 0.05 | <0.001 0.43
 Initial leaf to stem ratio | 0.023 0.02 |
 Initial fine to thick root ratio
 Initial carbon allocation to storage | + 0.044 0.02 |
 Minimum xylem water potential
 Wood density | + 0.037 0.02 |
 Maximum rate of active nitrate uptake | <0.001 0.12 | 0.017 0.11
 Specific root length | 0.001 0.05 |
 Leaf cuticular conductance | 0.028 0.02
Composite traits
 Huber value | <0.001 0.08 | 0.038 0.09
 Final height to basal diameter ratio
 Leaf blade nitrogen content | <0.001 0.13 | <0.001 0.46
 Leaf carbon construction cost | 0.046 0.02 | <0.001 0.40
 Specific leaf area | 0.004 0.04 | <0.001 0.36
 Leaf blade thickness
 Minimum leaf osmotic potential | <0.001 0.25
 Average xylem sap-flow rate per leaf area
 Average xylem sap-flow rate per xylem area
 Average xylem sap-flow rate per fine root surface area
 Proportion of nitrate uptake that was passive | <0.001 0.39
 Realized root depth | + <0.001 0.36 | + <0.001 0.52
 Root area index | <0.001 0.07
 Leaf area index
 Maximum realized net photosynthetic rate per leaf mass | <0.001 0.09 | <0.001 0.41
 Maximum realized net photosynthetic rate per leaf area | + <0.001 0.08 | + <0.001 0.33
 Maximum realized stomatal conductance
 Leaf longevity | <0.001 0.28 | <0.001 0.22
 Time to first root extension growth | <0.001 0.07 | 0.019 0.11
 Average leaf level nitrogen-use efficiency
 Average leaf level water-use efficiency | 0.004 0.04 | <0.001 0.25
Whole plant traits
 Final nitrogen storage relative to dry mass | <0.001 0.08
 Final carbon storage relative to dry mass | <0.001 0.24 | 0.002 0.19
 Potential carbon uptake wasted due to lack of carbon sink strength | <0.001 0.16 | 0.044 0.08
 Final whole plant respiration rate per mass | + <0.001 0.07 | + <0.001 0.50
 Carbon-use efficiency | <0.001 0.59 | <0.001 0.52
 Light-use efficiency | + <0.001 0.74 | + <0.001 0.81
 Nitrogen-use efficiency | <0.001 0.24
 Water-use efficiency | + <0.001 0.65 | + <0.001 0.58
 Average carbon use rate of seedling | <0.001 0.08 | <0.001 0.72
 Average light use rate of seedling | | + <0.001 0.30
 Average nitrogen use rate of seedling | <0.001 0.98 | <0.001 1.0
 Average water use rate of seedling
 Average growth rate | <0.001 0.91 | <0.001 0.97
 Fitness (average growth rate and survival) | + <0.001 0.91 | + <0.001 0.97

Optimization of the 34 traits requires a powerful algorithm. A genetic algorithm (GA) was chosen for its efficiency at this task (Goldberg 1989). An interesting feature of genetic algorithms is that GAs have many similarities with biological evolution and thus include many of the same qualitative limitations (Farnsworth and Niklas 1995; Wagner and Altenberg 1996). Specifically, the particular optimum found in a simulation with a GA is affected by the initial conditions, and especially by the randomness of genetic mutation and recombination, similar to the effect of chance and history in biological evolution. Although it is tempting to interpret GAs as a model of the evolutionary process, it is important to recognize that GAs are not an accurate description of evolution in all its details; GAs only contain a qualitative analogy to evolution's most essential mechanisms. GAs are primarily designed for maximum mathematical efficiency, not biological realism. However, a precise model of the evolutionary process was not needed here, because this experiment is about the differences among optimal evolutionary outcomes rather than about the process of approaching those outcomes. Other powerful optimization techniques such as simulated annealing could have worked just as well. Recall that in the model, any consistent variation in optimal trait outcomes among different environmental niches is due to convergent selection, whereas variation within an environmental niche is due to the existence of multiple optima. The relative contribution of the various historical chance processes in affecting which optimum a lineage arrives at will be case specific both in the model and in nature, and need not concern us in examining the general pattern of the outcomes.
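To make the role of chance in a GA concrete, the following is a minimal, generic sketch in Python. It is not the GA used in TAD: the two-peaked `toy_fitness` landscape, the single trait, the population size, and the mutation step are all hypothetical choices made for illustration only.

```python
import random

def toy_fitness(traits):
    """Hypothetical landscape with two peaks of similar height (not the TAD model)."""
    x = traits[0]
    return max(1.0 - 16.0 * (x - 0.25) ** 2, 0.9 - 16.0 * (x - 0.75) ** 2)

def run_ga(fitness, n_traits=1, pop_size=20, cycles=200, seed=0):
    """Very small GA: truncation selection, uniform recombination, Gaussian mutation."""
    rng = random.Random(seed)
    # Random initial trait values spanning the allowed range [0, 1].
    pop = [[rng.random() for _ in range(n_traits)] for _ in range(pop_size)]
    for _ in range(cycles):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]   # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Recombination: each trait is copied from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: small random change, clipped to the allowed range.
            child = [min(1.0, max(0.0, t + rng.gauss(0.0, 0.02))) for t in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Replicate runs differ only in their random seed.
optima = [run_ga(toy_fitness, seed=s) for s in range(10)]
```

Because mutation steps are small relative to the distance between the peaks, which peak a run settles on depends on the random initial population and the sequence of mutations, loosely mirroring the effect of chance and history described above.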


I chose differing levels of nitrate availability as the environmental treatment because nitrogen availability is believed to affect traits not only directly (Aerts and Chapin 2000), but also indirectly through effects on competition and productivity (Grime 1977; Huston 1994). The nitrate availability treatments were 10 and 20 g N/m2 per year applied at a constant daily rate. These annual values are typical of the natural range for forests dominated by angiosperm broadleaf trees (Larcher 2003).

The other environmental conditions were the same across all treatments. Specifically, I set the model to use a loam soil, humid air, and high light in a warm climate. Soil water was replenished to 95% of field capacity every 15 days. The water in the top soil layer was replenished each time, the layer below it every second time, the layer below that every third time, and so on, to imitate the natural variation in the amount of rainfall. Furthermore, the simulations included a drought period in which no new water was added to the soil for 60 days starting on day 90 of each seedling growth simulation. The atmospheric CO2 concentration was set to the current ambient level of 375 ppm so that modeled trait values could be compared to current field measurements. The interested reader can find the details of how these environmental conditions are implemented elsewhere (Marks and Lechowicz 2006a).
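The watering schedule can be sketched as a small Python function. Several details are assumptions not stated in the text: the number of soil layers, that the first watering falls on day 15, that the drought covers days 90 through 149, and that watering events skipped during the drought still count toward the "every second time, every third time" sequence. The function name is hypothetical.

```python
def layers_replenished(day, n_layers=5, interval=15,
                       drought_start=90, drought_length=60):
    """Return the soil layers (1 = top) rewetted to 95% of field capacity on `day`.

    Assumptions: n_layers is arbitrary, watering opportunities fall on
    multiples of `interval`, and opportunities inside the drought are skipped
    but still counted when deciding which deeper layers are due.
    """
    if day <= 0 or day % interval != 0:
        return []                      # not a watering day
    if drought_start <= day < drought_start + drought_length:
        return []                      # drought: no new water is added
    event = day // interval            # 1st, 2nd, 3rd, ... watering opportunity
    # Layer k is replenished at every k-th watering opportunity.
    return [k for k in range(1, n_layers + 1) if event % k == 0]
```

Under these assumptions, day 30 rewets the top two layers, whereas day 90 rewets none because it falls inside the drought.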

Unlike results from the TAD model presented previously (Marks and Lechowicz 2006a,b), the current simulations include the effects of competing neighbors. Under competition, different trait strategies compete against each other directly, rather than in terms of comparisons of their performance when grown individually. Although these strategies are completely different at the beginning of an optimization run, the competing strategies eventually become more and more similar as the genetic algorithm gradually converges on an optimal strategy. Consequently, it is most appropriate to think of the competition in the model as intraspecific. I focused on intraspecific competition in this study because in nature seedlings are consistently exposed to competition with conspecifics; whereas the identity of competitors from other species varies widely throughout the range of a species and over time, making interspecific competition a much less consistent selection force (Wright 1982; Niklas 1997c; Dieckmann and Ferrière 2004; Canham et al. 2006). A minor change made to the TAD model for the current simulations is that the diameter of fine roots was reduced from 0.5 mm (Marks and Lechowicz 2006a) to 0.2 mm, a value that was deemed more representative of most tree species. This minor change improved the realism of the modeled seedlings slightly, but did not affect the general behavior of the model, the main subject of this research.

I ran the GA in TAD 100 times in each environmental treatment to search for optimally fit tree seedling strategies. As in field experiments, in the simulated environments the seedlings display differences in growth rate or productivity due to the differences in resource availability among treatment levels. Consequently, if simulations were run for the same number of days in all of the treatment combinations, then the final size of the seedlings would differ substantially between environments. Because it is well known that the traits of plants change with size, this size difference could potentially confound effects due to environmental differences (Coleman et al. 1994; Weiner 2004). To avoid this problem I ran the simulations not for a set number of days, but rather until the seedling reached a threshold size, as also recommended for field experiments (Coleman et al. 1994). A size of 120 g total dry mass was used as the threshold. The average growth rate was then calculated as the threshold size divided by the number of days that it took seedlings of the particular strategy to reach that size. If that seedling strategy was also able to avoid mechanical failure and dehydration, this growth rate was then used as the estimate of the strategy's fitness. The GA would then search for similar strategies that might be of higher fitness in the next cycle. It typically took the GA several hundred cycles to arrive at a local optimum (i.e., a strategy where there are no similar strategies of higher fitness). Thus, to ensure that the GA had stabilized on a local optimum, the GA searched for 2000 cycles during each run.
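The fitness calculation described above can be written as a short Python sketch. The function and argument names are hypothetical, and assigning zero fitness to strategies that fail a survival constraint is an assumption; the text only states that growth rate is used as fitness when the constraints are met.

```python
THRESHOLD_MASS_G = 120.0   # threshold total dry mass given in the text

def strategy_fitness(days_to_threshold, avoided_dehydration, mechanically_intact):
    """Average growth rate (g per day) for a surviving strategy.

    Assumption: a strategy that dehydrates or fails mechanically gets
    zero fitness; the original model may penalize failure differently.
    """
    if not (avoided_dehydration and mechanically_intact):
        return 0.0
    return THRESHOLD_MASS_G / days_to_threshold

# A hypothetical strategy that needs 240 days to reach 120 g.
example = strategy_fitness(240, True, True)
```

Measuring time to a fixed size rather than size at a fixed time keeps all compared seedlings at the same final mass, which is the point of the threshold approach.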


For each trait listed in Table A1, a separate ANOVA was done to test for the effects of the nitrate availability treatment (SYSTAT version 9, SPSS Inc., 1998). Prior to running an ANOVA, the data for each trait were plotted to confirm that the data in each treatment level were reasonably normal and homoscedastic and thus met the assumptions of ANOVA. Although I report all results that were significant at P < 0.05, I only discuss the strongest results among them, because the weaker ones may have been significant due to chance alone when running such a large number of ANOVAs.



Each of the 100 optimization runs within each of the treatment levels stabilized on a different optimum. The fitness (growth rate and survival) of many of the 100 optima within a treatment approached a maximum. Thus many of the alternative strategies in a particular simulated environment are of approximately equal fitness. However, not all optima approached this upper fitness limit. Because I consider only the optima near this upper fitness limit to represent a well-adapted strategy, I will focus on the optima in the top quartile of fitness in each treatment in the analyses below. However, I also report the results for the full set of 100 replicates in the Appendix. Using either 25 or 100 optima for each treatment level keeps the number of replicates equal across treatment levels, as required by the assumptions of ANOVA.
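Selecting the top quartile of optima within a treatment is a simple ranking step; a minimal Python sketch follows. The dictionary layout and the function name are hypothetical.

```python
def top_quartile(optima, key="fitness"):
    """Return the fittest quarter of the optima found in one treatment level."""
    ranked = sorted(optima, key=lambda o: o[key], reverse=True)
    return ranked[: len(ranked) // 4]

# 100 hypothetical optima for one treatment, with made-up fitness values.
optima = [{"run": i, "fitness": float(i)} for i in range(100)]
best = top_quartile(optima)   # 25 optima, the same count in every treatment
```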

Figure 2 illustrates that fitness of alternative trait strategies can be quite similar in a given environment despite large differences in important functional traits such as maximum net-photosynthetic rate. It also shows that the upper fitness limit approached by the alternative optima roughly doubled as the nitrate availability was doubled, suggesting that maximum fitness is a function of environmental conditions, particularly resource availability.

Figure 2.

The average growth rate (a fitness measure for tree seedlings) plotted versus the maximum net-photosynthetic rate for the different optimal trait strategies. The two discrete bands of growth rates are a consequence of the differences in the upper limit of growth rate achievable in the different environmental treatments. Only optima from the top quartile in fitness rank are plotted for each treatment.

The effect of environmental treatment on traits was calculated for each trait by comparing trait values between treatment levels using ANOVA. Table A1 in the Appendix lists the results of these ANOVAs. Many traits show significant effects of the environmental treatment, thus supporting the view that trait variation reflects adaptation to different environments. However, most of the r2 values are modest, as is often the case in empirical data as well. If there were only a single optimal strategy, then one would expect complete convergence in trait values within an environment. However, if there are multiple trait strategies that optimize performance in a particular environment, then within-treatment variation is expected. Thus the modest r2 values imply an important role for such alternative strategies in explaining trait variation.

To gain insight into the relative importance of environment versus alternative optima, I examined the trends in the variation of ANOVA r2 values (Table A1) across traits. It quickly became apparent that the degree of integration of the trait played a role. Therefore in the presentation of Table A1, I organized the traits according to a hierarchy of integration. In particular, I differentiated between individual traits, composite traits, and whole plant traits. The individual traits include the 34 traits being optimized, as well as any other traits that are a function of only one of these 34 independent traits in the model. I defined composite traits as traits that depend on several individual traits. These composite traits include measures of organ performance such as maximum net-photosynthetic rate. Finally, whole plant traits are performance measures that depend on the interactions of many organ level composite traits. I summarized this information in box plots where the r2 values from Table A1 are plotted versus the level of integration of the traits (Fig. 3). The hierarchy is an idealization of course, and traits only approximately fit into the different categories. Despite the idealized nature of the trait hierarchy, it provides an insightful framework for analyses. Specifically, Figure 3 shows a clear trend of an increase in the effect of environmental treatments with increasing level of integration (i.e., higher r2 values). The trait variation tends to be better explained by environmental differences the higher the trait is within the hierarchy of integration, whereas the variation tends to be affected more by alternative optima the lower the trait is within the hierarchy of integration. Thus the relative importance of alternative optima versus environmental differences in explaining trait variation depends on the level of functional integration of the trait.

Figure 3.

Box plot indicating the mean, standard deviation, maximum, and minimum of the adjusted r2 values in the ANOVAs for traits at various levels of integration (using the top quartile of optima only). The hierarchy of trait integration starts with individual traits, followed by composite traits that depend on more than one individual trait, and finally whole plant traits that are affected by many composite and individual traits.


Even in relatively realistic models such as TAD (Marks and Lechowicz 2006a), there are inevitably inaccuracies due to simplifying assumptions and errors in the estimation of parameters. The key question when using a model is whether these inaccuracies affect the results of primary interest. In other words, are the results of multiple optima and the hierarchy of effects robust to changes in the model? In preliminary simulations, I discovered that these primary results are robust to a range of model changes. These included changes to the trade-offs and interactions, changes to parameter values, and changes to the environmental conditions. For example, if parameters are given unrealistic values or particular trade-offs are modeled with incorrect relationships or are left out, the behavior of model tree seedlings will become unrealistic, but there will still be multiple optima in the results and a hierarchy pattern in the effects of environmental selection on trait values. Changing the particular environmental treatment also does not change the pattern that whole plant traits are more affected by the environmental treatment than composite traits, which in turn are on average more affected than individual traits. However, changing the environmental factor that is varied changes which of the composite traits is most affected by the treatment, as one would expect based on field observations. I also tried running a set of simulations where each simulation started with the same initial trait values rather than a range of random values as in the present study. The trait variation in the optima found in these simulations was just as great as among optima with random initial trait values (Marks 2005). This robustness of the main results to model changes suggests that they are valid generally.


To gain a further insight into the generality of the multiple optima and hierarchy of trait integration results, I investigated the mechanisms underlying these patterns. In particular, I asked what elements are needed in a simple generic model to produce multiple optima and a hierarchy of trait convergence.

Simple analytically solvable models for optimizing an evolutionary trade-off usually do not have multiple optima (Parker and Maynard Smith 1990; Levin and Muller-Landau 2000; Mäkelä et al. 2002; Sutherland 2005). Therefore, I examined which ways of making optimization models more complex result in the emergence of multiple optima. In doing so, I made use of the model in Figure 4. This model is a generic example of models containing multiple traits, each subject to a trade-off, that interact with each other in a hierarchy of integration, as described in the Appendix.

Figure 4.

The schematic diagram illustrates the hierarchical network of trait interactions that was used to investigate the cause of multiple optima. When signs are not given next to arrows the effect is positive. The constraints on the optimization on traits at level C in the hierarchy that were used to test their effect are not shown, because they were different during each of the 21 sets of optimizations. More information on these optimizations is given in the text as well as in the Appendix.

I discovered that the way to consistently obtain multiple optima of similar fitness in this kind of model was by adding a constraint on combinations of traits. Changing the number of traits in the generic model, the particular interactions among them, or the values of the coefficients in the equations had little effect. Without constraints, replicate optimizations of this model all converge on a single global optimum. To show this effect of constraints explicitly, I ran 21 sets of 20 optimizations each, where each set was subject to a different number of constraints on the traits at level C in the hierarchy (Fig. 4). The constraints took the form of a sum, as would be the case in the allocation of a limited resource among different functions (see the Appendix for details). Note that constraints do not have to have the form of a sum to cause the emergence of multiple optima. The number of constraints ranged from 0 to 20. The results showing the effect of changing the number of constraints are plotted in Figure 5. The figure shows that as the number of constraints increases, the average fitness of the optima declines and the trait differences among the optima increase. The plot at the bottom of the figure illustrates how the degree of convergence dramatically declines with decreasing level of trait integration in the hierarchy. It also shows that by the time the second constraint is added, the optimization is sufficiently constrained to have multiple optima, and variation jumps by close to two orders of magnitude. Thus the generic model implies that the general cause of multiple optima in trait optimization models such as TAD is the inclusion of constraints.

Figure 5.

The plots show how the generic multitrait trade-off model responds to increasing the number of constraints on the traits in level C of the hierarchy from 0 to 20. For each of these 21 cases, a set of 20 replicate optimizations was run to determine if there are multiple optima. The top plot shows how the average fitness for these 20 replicate optimizations declines as the number of constraints increases. The middle plot shows that the fitness variation among these replicates increases with the number of constraints, a reflection of the increasing number of optima and the increasing differences among them. Specifically, the y-axis plots the number of replicates where the fitness of the optimum found was over 98% of the fitness of the best optimum found in the set. The lower plot shows the variance within a hierarchical level for each set. Note that the y-axis is logarithmic, indicating a dramatic increase in variance as the level of trait integration decreases. Note also that level A of the hierarchy is the same as fitness. The sharp rise initially is a reflection of the transition from a single optimum to multiple optima.
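
The statistic on the y-axis of the middle plot can be sketched in a few lines. This is a minimal illustration, not the code used for the figure: the function name `count_near_best` and the example fitness values are hypothetical, and fitness is assumed to be positive.

```python
def count_near_best(fitnesses, threshold=0.98):
    """Count replicates whose fitness exceeds `threshold` times the best fitness in the set.

    Assumes all fitness values are positive (an assumption of this sketch).
    """
    best = max(fitnesses)
    return sum(1 for f in fitnesses if f > threshold * best)


# Hypothetical fitness values for a small set of replicate optimizations.
example_set = [1.00, 0.99, 0.985, 0.90, 0.50]
near_best = count_near_best(example_set)
```

A count equal to the number of replicates indicates that all replicates reached optima of nearly the same fitness, whereas a lower count indicates larger fitness differences among the optima found.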



Each of the 100 simulations within each of the treatment environments stabilized on a different optimal combination of values for the 34 functional traits optimized in TAD. Thus there was little convergence in terms of the individual seedling traits within a treatment due to the effect of alternative optima. Furthermore, many of these alternative strategies have approximately equal growth rate despite much variation in important functional traits such as maximum net-photosynthetic rate (Fig. 2). Accepting seedling average growth rate and survival as key components of fitness in trees (Van Valen 1975; Harcombe 1987; Oliver and Larson 1996; Caspersen and Kobe 2001), I can conclude that within each of these environments there is substantial potential for the evolution of alternative trait strategies of approximately equal fitness.

A number of empirical studies have found biological examples of multiple distinct strategies for achieving a similar high level of performance under the same conditions. These examples include jaw mechanics in labrid fish (Alfaro et al. 2005), body shape in Gambusia fish (Langerhans and DeWitt 2004), light interception by plants (Hirose and Werger 1995; Valladares et al. 2002), water use by trees (Goulden 1996; Becker et al. 1999), growth rate of bacteria (Korona 1996; Pál et al. 2006), circuits of gene regulation in yeast (Tsong et al. 2006), and various anatomical adaptations in mammals and birds (Bock 1976; Lewontin 1978). This wide range of examples suggests that multiple evolutionary basins of attraction are a phenomenon that is ubiquitous in biological organisms and should receive greater attention in interpreting experimental results.

For example, if there were only a single optimal response to a given environmental difference, one would expect all species to respond in the same way. However, in long-term laboratory experiments with elevated atmospheric CO2 concentrations, replicate lineages of algae responded in individualistic ways that increased trait variation (Collins and Bell 2004), consistent with my theoretical results. Similarly, the method of phylogenetically independent contrasts (PICs) tests if different lineages would respond to a given environmental difference in the same direction once phylogeny is controlled for (Harvey and Pagel 1991; Losos and Miles 1994). In field data, PICs have shown many cases where trait values respond to the same environmental difference in opposite directions in different clades (e.g., Westoby et al. 1998; Wright et al. 2002), again consistent with the view of multiple optima presented here.

The existence of a multitude of alternative optima can also explain a number of evolutionary patterns. For instance, in evolution experiments with replicate lines of initially identical populations of bacteria, the different lineages adapted to the same environmental change in different ways, implying that they evolved toward different optima (Korona 1996). Now consider what happens if selection acts in one direction in an initial evolutionary experiment and the direction is then reversed in a second experiment. The populations would be expected to evolve toward different optima in the second experiment. Consequently the replicate populations would not be expected to return to the original trait optimum, and the probability of not returning would be higher the more optima there are. Here I have shown that there are multitudes of alternative optima in a biologically realistic model, greatly exceeding the conventional expectation. Thus this result can also explain why evolutionary outcomes are irreversible (Stebbins 1950; Gould 1989) and irreproducible (Travisano et al. 1995; Korona 1996).


Although the existence of alternative optimal strategies caused much variation in individual tree seedling traits, it did not prevent the emergence of consistent patterns in trait variation between environmental treatment levels (Table A1). For example, adaptation to greater nitrate availability tended to increase tissue nitrogen concentrations particularly for leaf mesophyll cells, as also seen in field studies (Aerts and Chapin 2000). Both the whole plant respiration rate and maximum photosynthetic rates increased as a consequence of these higher tissue nitrogen contents, consistent with experiments (Wright et al. 2004; Lambers and Poorter 2004; Reich et al. 2006). There is also evidence for a trade-off between water uptake and nitrate uptake given the greater availability of water in deep soil layers and greater nitrate availability in the topsoil. In particular, higher nitrate availability increased root depth and increased leaf level water-use efficiency, consistent with the view that the relative availability of multiple resources should be considered in studies of plant trait adaptation.


These results show that many tree seedling traits were significantly affected by the environmental treatments, which is consistent with conventional theory (Levin and Muller-Landau 2000). However, as in nature, the environmental treatments account for only part of the variation in most traits (see Table A1) (Niinemets 2001; Schenk and Jackson 2002; McDonald et al. 2003; Santiago et al. 2004; Maherali et al. 2004; Wright et al. 2005). The residual variation is a result of different model searches stabilizing on different optimal strategies. Thus my analysis so far suggests that both environmental differentiation and alternative optima (i.e., chance) are important in explaining trait variation. Although this qualitative insight is already interesting, it is more interesting to investigate quantitatively what determines the relative importance of these two factors in explaining variation in different traits.

Typically environmental selection causes trait values to be more convergent near the top of the hierarchy of trait integration, whereas multiple optima cause increasing variation in trait values among optima lower in the hierarchy of trait integration (Fig. 3). The lack of strong relationships between individual traits and environment is consistent with empirical field studies of organ level traits that are lower in the hierarchy of integration (Niinemets 2001; Schenk and Jackson 2002; McDonald et al. 2003; Santiago et al. 2004; Maherali et al. 2004; Wright et al. 2005). For example, leaf level nitrogen-use efficiency was unaffected by the nitrate treatment even though nitrate treatment had an effect on whole plant nitrogen-use efficiency, a pattern that is also observed in field measurements (Aerts and Chapin 2000). This lack of a close correspondence between individual traits and environment in both the model and field data suggests a prominent role for alternative evolutionary strategies in nature. Thus for composite or individual traits further insight into interspecific variation is more likely to come by studying the relationships among closely related traits (e.g., Ryser 1996; Hacke et al. 2001; Wright et al. 2004) rather than the relationships among traits and environmental gradients.

In relating plant adaptations to plant distributions, it will be more insightful to measure whole plant traits because these tend to show much greater convergence with respect to environment. The difficulty in measuring whole plant traits presents a considerable challenge for field ecologists. One promising solution could be a hybrid approach that combines measurement of individual traits in the field with their integration in a computer model (Valladares et al. 2002; Pearcy et al. 2004). Due to the within-treatment variation in organ level traits associated with alternative optima, the most reliable tests could compare community averages for these diagnostic traits from sites contrasting in the environmental conditions of interest. This technique of correlating the average trait value for a community with climate rather than studying species individually has been employed successfully by paleontologists (Royer et al. 2005), biogeographers (ter Steege et al. 2006), and plant ecologists (Ackerly et al. 2002). Another possibility is to work with the composite traits whose averages, in exception to the overall trend, showed a relatively large response to the environmental difference. Theoretical models such as TAD could be used to determine which composite traits will make the best diagnostics as indicators of adaptation to a particular difference in environmental conditions. Models would also have the advantage that environmental conditions can be varied independently, whereas multiple environmental factors typically covary in the field.


The experimentation with various types of simple generic models of trait optimization revealed that multiple optima emerge as a consequence of the inclusion of constraints. In TAD as well as in nature, seedlings need to prevent dehydration, avoid mechanical failure, and prevent running out of carbon in addition to maximizing growth. In addition to these obvious constraints at the whole plant level, there are many organ level and developmental constraints that are also included in TAD (Marks and Lechowicz 2006a). In natural organisms, one would expect still more constraints at the cellular, molecular, and genetic levels, potentially resulting in an even greater propensity for alternative evolutionary strategies in nature than in TAD.

The cause of the convergence hierarchy can be explained conceptually. Consider the simplest possible hierarchy with only two levels. Selection is maximizing fitness at the top of this hierarchy and consequently the top level should tend to be convergent. In contrast a change in the value of a trait in the lower level of the hierarchy can be compensated for by a change in the value of another trait that it interacts with, assuming that there are multiple optima in the model. Consequently there is considerable leeway for divergence in the values for the traits in the lower level. If we insert intermediate levels into the hierarchy as in the generic model (Fig. 4), then these intermediate level traits also take an intermediate position in trait convergence (Fig. 5).
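
The compensation argument can be made concrete with a toy two-level example. This is only a sketch under stated assumptions: the upper-level trait is taken to be the geometric mean of two lower-level traits (the interaction form the Appendix describes for the generic model), and the three strategies and their trait values are hypothetical.

```python
import math
import statistics

# Three hypothetical strategies, each a pair of lower-level trait values (D1, D2).
strategies = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]


def upper_trait(d1, d2):
    """Upper-level trait as the geometric mean of two lower-level traits (assumed form)."""
    return math.sqrt(d1 * d2)


upper_values = [upper_trait(d1, d2) for d1, d2 in strategies]

# Variance among strategies at each level of the toy hierarchy.
lower_variance = statistics.pvariance([d1 for d1, _ in strategies])
upper_variance = statistics.pvariance(upper_values)
```

In this toy case, a fourfold range in each lower-level trait leaves the upper-level trait identical across strategies, because a low value of one trait is offset by a high value of the other.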


Other biological models have also found multiple alternative optima but differ in subtle yet important ways from the approach presented here. In particular, Niklas' pioneering research into alternative strategies took a multi-objective optimization approach to explore variation in tree architectures (Niklas 1994, 1997a,b, 1999). In single objective optimization without constraints there is usually a single optimum. In contrast, in multi-objective optimization there is an optimum associated with each objective. If these two or more objectives are conflicting, then one cannot optimize both objectives simultaneously, and any optimum must be a compromise between maximizing these two or more objectives. Because there are infinitely many possible compromises, there are also an infinite number of optima. Each compromise solution is associated with a different weighting assigned to each of the multiple objectives (Deb 2001). Therefore, each of the multiple optima found by a multi-objective optimization is associated with a different selective environment (Farnsworth and Niklas 1995). For example, in Niklas' study the objectives were mechanical stability, light interception, and seed dispersal. The optimal architectures associated with environments that selected for greater light interception were very different from the architectures associated with environments that selected for greater mechanical stability because these two objectives are opposing. Similarly, trade-offs in conventional trade-off models optimizing life history traits can result in alternative strategies of equal fitness, but each of these alternative strategies is associated with a different selective environment (e.g., r versus K selection) (Givnish 1986; Roff 1992). Those approaches thus inform about alternative optima across environmental gradients rather than within the same selective environment.
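
The link between weightings and optima can be illustrated with a weighted-sum example. This is a minimal sketch, not Niklas' model: the two quadratic objectives, the single trait `x`, and the function name `weighted_optimum` are hypothetical choices made for illustration, and the optimum is located by a simple grid search.

```python
def weighted_optimum(w, grid_size=1001):
    """Return the trait value x in [0, 1] that maximizes w*f1(x) + (1 - w)*f2(x).

    f1 and f2 are hypothetical conflicting objectives: f1 is best at x = 0
    and f2 is best at x = 1, so no x maximizes both at once.
    """
    def f1(x):
        return -(x - 0.0) ** 2

    def f2(x):
        return -(x - 1.0) ** 2

    xs = [i / (grid_size - 1) for i in range(grid_size)]
    return max(xs, key=lambda x: w * f1(x) + (1.0 - w) * f2(x))


# Each weighting (standing in for a selective environment) has its own compromise optimum.
optima = {w: weighted_optimum(w) for w in (0.0, 0.25, 0.5, 0.75, 1.0)}
```

For these two objectives the compromise optimum works out analytically to x = 1 - w, so every weighting yields a different optimum, but each weighting yields only one. This is the sense in which such optima differ across selective environments rather than within one.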

In terms of understanding multiple alternative optima within the same selective environment, Kauffman's NK model is instructive (Kauffman 1993). Specifically, by using a model of random epistatic gene interactions, Kauffman showed that as the number of epistatic interactions increases the number of local optima in the fitness landscape becomes extremely large and the fitness differences among optima decline. There is some similarity between this result and the TAD results, which suggests that there may be a rough analogy between the effect of increasing the number of constraints in TAD, increasing the number of epistatic interactions in the NK model, and increasing the number of objectives in a multi-objective optimization. In all three cases, increasing the number of conflicting requirements that the solution must satisfy increases the propensity for variation among optima. This similarity in the findings from different modeling approaches hints at a synthesis of the theoretical underpinnings for the evolution of functional diversity. I expect that such a synthesis will have interesting insights into complex evolutionary systems not just in biology but also in engineering and economics (Bentley 1999; Kauffman 2000; Ottino 2004; Price et al. 2004).
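
The dependence of the number of local optima on epistasis can be checked by brute force on a small NK landscape. This is a minimal sketch under stated assumptions: each locus interacts with its K cyclically adjacent loci (one common variant of the model), fitness contributions are drawn from a uniform distribution, and the function name, genome size, and seed are illustrative.

```python
import itertools
import random


def count_local_optima(n, k, seed=0):
    """Count local fitness optima on a random NK landscape with n binary loci.

    Assumption of this sketch: locus i interacts with the k loci that follow it
    (cyclically). A genome is a local optimum if no single-locus change
    increases its fitness.
    """
    rng = random.Random(seed)
    # One random contribution per locus for each state of the locus and its k neighbors.
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]

    def fitness(genome):
        total = 0.0
        for i in range(n):
            index = 0
            for j in range(k + 1):
                index = (index << 1) | genome[(i + j) % n]
            total += tables[i][index]
        return total / n

    count = 0
    for genome in itertools.product((0, 1), repeat=n):
        f = fitness(genome)
        mutants = (genome[:i] + (1 - genome[i],) + genome[i + 1:] for i in range(n))
        if all(fitness(m) <= f for m in mutants):
            count += 1
    return count


smooth = count_local_optima(8, 0)   # no epistasis
rugged = count_local_optima(8, 7)   # every locus interacts with all others
```

With no epistasis each locus can be optimized independently, so the landscape has a single peak. With maximal epistasis the landscape is effectively random and has many peaks.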


The hierarchy result has important implications for phylogenetic approaches to studying trait variation (Stebbins 1950; Felsenstein 1985; Lauder 1990; Harvey and Pagel 1991; Losos and Miles 1994; Niklas 1997c; Westoby et al. 1998; Martins 2000; Webb et al. 2002; Matos et al. 2004). Specifically, evolutionary biologists believe that evolution proceeds predominantly via many small incremental changes (Ganong 1901; Wright 1982; Dieckmann and Ferrière 2004). Thus, a species is highly unlikely to jump from one optimum to another because they are located far apart in trait space, with regions of lower fitness in between. It is much more likely that speciation occurs by a slight adjustment in the trait values of the current strategy to make the strategy feasible in a slightly different environment, a view supported by field data (Ackerly 2004). It is only through the accumulation of many such small changes via different evolutionary histories that different lineages can eventually reach alternative optima for the same environment. In diverse ecosystems, evolution has plenty of different evolutionary histories to work with to reach alternative optima. Consider for example that the composition of diverse forests usually includes species from different plant families and from many different genera (Webb et al. 2002). Therefore, I make two predictions. First, for species that are very closely related, trait differences are likely dominated by environmental niche differentiation. Second, for co-occurring species that show large environmental niche overlap, trait differences are likely dominated by alternative optima, particularly at higher taxonomic levels. 
Thus variation in traits at the whole plant level of the hierarchy should tend to be best explained by ecology, whereas at the composite trait and lower levels of integration, trait variation should tend to be better explained by phylogeny, although exceptions are expected as indicated by the large within integration level variances (Fig. 3).

This predicted dichotomy in terms of ecological closeness versus phylogenetic closeness is evident in some empirical studies of tree species traits in relation to habitat and phylogeny. For example, in a study comparing Floridian oaks (Quercus), Cavender-Bares et al. (2004) found that species within the white oak clade diversified along a soil gradient correlated with fire regime, moisture, and nutrient availability. Species within the red oak clade were also differentiated along this environmental gradient. However, within any of the individual communities, species from both clades were represented implying that the two clades may represent alternative strategies. Whole plant traits that were closely related to fitness such as growth rate, canopy height, and fire survival were best explained by the environmental differences and showed little variation within environments; whereas the values of leaf level traits such as specific leaf area tended to be conserved within a clade (i.e., their variation was better explained by phylogeny).

Thus the hierarchy result presents a useful advance toward resolving the long-standing debate among evolutionary biologists about whether functional traits are determined primarily by selection pressure or by random historical accident (Kimura 1968; King and Jukes 1969; Bock 1976; Lewontin 1978; Gould and Lewontin 1979; Mayr 1983; Kimura 1983; Maynard Smith et al. 1985; Gould 1989; Parker and Maynard Smith 1990; Kimura 1991; Travisano et al. 1995; Losos et al. 1998; Joshi et al. 2003; Lehman 2004; Bradley and Folk 2004; Wagenaar and Adami 2004; Vermeij 2006; Pál et al. 2006). Here I have gone beyond the obvious answer that both factors are important causes of diversity, and shown how the relative importance of the two factors changes with the level of integration of the trait. I present the hierarchy of trait integration as a clear conceptual framework to guide future studies of the evolution of functional diversity.


The existence of multiple optima and differences in selective environment jointly affect trait variation. The effects of multiple optima predominate at the organ level and lower levels of trait integration, whereas the effects of environmental differences predominate at the level of whole tree seedlings. One advantage of this hierarchy of effects is that from the ecosystem perspective it permits aggregating species into functional types or guilds that ignore much of the variation in individual traits. Furthermore the hierarchy of trait integration provides a useful framework not just for comparative studies of functional plant traits but also to help in choosing appropriate traits to define functional types.

The patterns of trait variation in the model results were similar to relationships observed in nature. In particular, the variation associated with multiple alternative optima is consistent with the large quantitative variation in individual traits observed within environments. The relative degree to which this variation obscured the effect of environmental differences on different traits depended on the level of the trait within the hierarchy of integration. Because such hierarchies of functional trait integration exist in all biological organisms, the potential for alternative strategies should also exist in many other groups of organisms. Thus my predictions that whole organism traits should be most closely related to ecology, whereas composite and individual traits should be more closely related to phylogeny, should apply to organisms in general. Furthermore, I predict that within species and among closely related species trait variation should be mostly explained by environmental variation, whereas alternative strategies likely explain most trait variation at higher taxonomic levels.

Associate Editor: E. Conti


I would like to thank Ehab Abouheif, David Ackerly, Ken Arii, Graham Bell, Frank Berninger, Sinead Collins, Elena Conti, Thomas DeWitt, Raj Dhindsa, Marisha Futer, Andy Gonzalez, Fred Guichard, Andrew Hendry, Stephen Hubbell, Marty Lechowicz, Hafiz Maherali, Brian McGill, Christian Messier, Helene Muller-Landau, Claudia Neuhauser, Karl Niklas, Rick Roy, and an anonymous reviewer for commenting on earlier versions of the manuscript. I am grateful to the Natural Sciences and Engineering Research Council of Canada, McGill University, and the Groupe de Recherche en Écologie Forestière interuniversitaire of Montreal for funding.


Appendix: Details of Generic Model

The following describes the details of the generic model used as an example to illustrate the effect of constraints on the number of optima and the hierarchy of convergence. The number of traits was chosen so that the traits could be integrated into several hierarchical levels of organization in a generic way. As illustrated by the arrows in Figure 4, each of the individual traits D1 to D10 had at least one positive effect (via B1 or B2) and one negative effect (via B3) on fitness (A1). The positive effects (e_n) were non-linear, whereas the negative effects (a_n) were linear as follows:


A Michaelis–Menten form was chosen for the non-linearity because many biological relationships follow this form. Having both a negative and a positive effect introduces a trade-off for each trait. A trade-off that includes a non-linearity is a necessary condition for the value of an independent trait to stabilize in an optimization. Without a trade-off the value for the trait would drift randomly. The coefficients (p_n, h_n, g_n) were chosen from a uniform random distribution ranging between 0 and 1. The same coefficients were used for all simulations presented here. The interactions among trait effects were modeled using geometric means to keep magnitudes normalized as follows:


Only at the top of the hierarchy were the interactions modeled with a sum to incorporate the negative and positive effects needed for the trade-off requirement:


Note that this generic model is only an example. I have tried changing the values of coefficients, the types and number of trait interactions, etc. However the only change that consistently produces a transition from a single global optimum to multiple local optima is the introduction of constraints. All of the constraints were modeled as follows in the simulations presented here:


For each constraint a different combination of two C level traits was chosen. Note that other types of constraints were also tried and caused multiple optima. In general, constraints need to restrict an optimization just enough to prevent it from accessing the single global optimum, which causes multiple alternative optima to emerge.
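
The role of the non-linear trade-off in stabilizing a trait value can be illustrated for a single trait. This is a minimal sketch under stated assumptions and not a reproduction of the generic model's equations: the saturating positive effect is assumed to be p*D/(h + D), the linear negative effect is assumed to be g*D, and the net benefit is taken as their difference. The function names and coefficient values are illustrative.

```python
import math


def net_benefit(d, p, h, g):
    """Saturating (Michaelis-Menten type) benefit minus a linear cost (assumed forms)."""
    return p * d / (h + d) - g * d


def analytic_optimum(p, h, g):
    """Trait value where marginal benefit equals marginal cost.

    Setting the derivative p*h / (h + d)**2 equal to g gives
    d = sqrt(p*h/g) - h, which is positive only when p > g*h.
    """
    return math.sqrt(p * h / g) - h


def grid_optimum(p, h, g, d_max=5.0, steps=5001):
    """Locate the optimum by brute force on an evenly spaced grid over [0, d_max]."""
    grid = [d_max * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda d: net_benefit(d, p, h, g))


# Illustrative coefficients (the Appendix draws them uniformly between 0 and 1).
p, h, g = 1.0, 0.25, 0.25
d_analytic = analytic_optimum(p, h, g)
d_grid = grid_optimum(p, h, g)
```

Under these assumed forms the net benefit has a single interior maximum, so an unconstrained optimization settles on one trait value. If the benefit were linear as well, the net benefit would be linear in the trait and would have no interior optimum.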