Abstract
Simulation models of the evolution of genes in a branched metabolic pathway subject to stabilizing selection on flux are described and analyzed. The models are based either on metabolic control theory (MCT), with the assumption that enzymes are far from saturation, or on Michaelis–Menten kinetics, which allows for saturation and near saturation. Several predictions emerge from the models: (1) flux control evolves to be concentrated at pathway branch points, including the first enzyme in the pathway. (2) When flux is far from its optimum, adaptive substitutions occur disproportionately often in branching enzymes. (3) When flux is near its optimum, adaptive substitutions occur disproportionately often in nonbranching enzymes. (4) Slightly deleterious substitutions occur disproportionately often in nonbranching enzymes. (5) In terms of both flux control and patterns of substitution, pathway branches are similar to those predicted for linear pathways. These predictions provide null hypotheses for empirical examination of the evolution of genes in metabolic pathways.
The past decade has seen a resurgence of interest in developing a general theory of adaptation. Much of this effort has been directed at understanding the distribution of the magnitude of fitness effects on adaptive walks and the nature of sequence evolution on rugged adaptive landscapes (Orr 2005). By contrast, an equally important issue for such a theory, namely whether certain genes are more likely than others to contribute to adaptive change and, if so, which ones, has received considerably less attention. Yet being able to predict which genes are “targeted” by selection in this way would be helpful in interpreting a number of evolutionary patterns, including differences in evolutionary rates among genes in metabolic pathways (e.g., Rausher et al. 1999; Ramsay et al. 2009) and why the probability of fixation of mutations in some genes may be greater than that of mutations in other genes (Stern and Orgogozo 2008, 2009; Streisfeld and Rausher 2009, 2011).
Genomic analyses have revealed a number of interesting patterns suggesting that the positions of genes or their products in networks and pathways may affect the rates of adaptive or deleterious substitutions in those genes. For example, fewer substitutions tend to occur in genes whose products are centrally located in networks (Hahn and Kern 2005), interact with more other gene products (Fraser et al. 2002; Hahn et al. 2004), or are upstream in metabolic pathways (Rausher et al. 1999; Ramsay et al. 2009). Despite these empirically derived patterns, however, there is little theoretical analysis to explain why these patterns may exist. In this report, I partially address this lacuna by modeling the evolution of genes in branched metabolic pathways.
This analysis builds on a previous simulation analysis of the evolution of genes associated with linear metabolic pathways, which revealed that adaptive substitutions in upstream genes occur more frequently than in downstream genes, whereas slightly deleterious mutations are fixed by genetic drift more frequently in downstream genes (Wright and Rausher 2010). These substitution patterns arise because under stabilizing selection on metabolic flux, the pattern of flux control across pathway genes evolves in a predictable manner: the system tends to evolve toward states in which upstream genes have greater flux control than downstream genes. Given this pattern, adaptive substitutions are expected to be concentrated in enzymes that exert the greatest control over flux (Hartl et al. 1985; Eanes 1999; Watt and Dean 2000). An adaptive mutation with a given effect on enzyme kinetics will produce a greater adaptive change in flux, and hence in fitness, when it occurs in upstream genes. Because the probability of fixation is proportional to the magnitude of selection, adaptive mutations in upstream genes will have a higher probability of fixation than those in downstream genes, resulting in the greater number of mutations fixed in upstream genes. By contrast, the fitness effect of a deleterious mutation will be smaller for downstream genes because of their lower control over flux, resulting in a higher fixation probability because the probability of fixation of a deleterious mutation is inversely related to the magnitude of selection.
It is not clear, however, whether these conclusions are expected to hold for branched pathways. Moreover, several empirical observations on branched pathways are at odds with the expectations derived for linear pathways. One is the general belief that flux control in branched pathways tends to be concentrated at the branch points (Eanes 1999; Flowers et al. 2007), which implies that it does not decrease monotonically from upstream to downstream genes as found by Wright and Rausher (2010). However, no formal theoretical justification has been provided for this belief. Another observation is that, in at least one study, adaptive substitutions tend to be concentrated at branch-point genes, rather than at the most upstream genes of the pathway (Flowers et al. 2007). Finally, one investigation has demonstrated that nonsynonymous substitutions occur less frequently in branch-point genes, reflecting greater selective constraint in these genes (Yang et al. 2009). In the analyses presented here, I ask whether models of the evolution of genes associated with branched pathways predict these patterns, and, more generally, whether the conclusions derived from models of evolution of branched pathways differ from those derived for models involving linear pathways. Specifically, I consider several questions:
- Prediction 1:
When fluxes in the branches of a pathway deviate substantially from the environmental optimum, a disproportionate share of adaptive substitutions will occur in genes coding for enzymes with the greatest flux control. In this situation, mutations of large effect can be adaptive, and, as argued above, a disproportionate share will occur in genes of enzymes with the largest control over flux.
- Prediction 2:
Regardless of deviation from optimal fluxes, slightly deleterious substitutions should occur disproportionately often in enzymes with little control over flux. In this situation, a given change in activity of these enzymes will have a smaller detrimental effect on fitness. Such a change will then have a higher probability of fixation because for deleterious alleles, probability of fixation is inversely related to the magnitude of the selection coefficient.
- Prediction 3:
When fluxes are near their optima, adaptive substitutions are expected to occur disproportionately often in enzymes with little flux control. This is because near the optimum, only mutations with small effects on fitness will be adaptive (Fisher 1930). Because mutations in enzymes with little flux control have less of an effect on fitness than mutations in enzymes with substantial flux control, a greater proportion of the former is expected to be adaptive. Consequently, the probability of fixation, once a mutation has arisen, is expected to be higher for enzymes with less control over flux.
- Question 3:
Do the relative magnitudes of flux down different pathway branches influence either the evolved pattern of flux control or the relative numbers of substitutions that occur in different genes? Intuitively, it seems reasonable to expect that addition of a side branch through which there is little flux to a linear pathway would have little effect on either the pattern of flux control or the pattern of substitutions in that pathway. By contrast, a side branch through which there is substantial flux would likely have a substantial effect. I ask whether this hypothesis is supported by the model.
General Modeling Approach
I focus initially on a simple branched metabolic pathway in which there are two enzymes above the branch point, and two enzymes in each branch (Fig. 1A). Subsequently, I consider a branched pathway with a longer terminal branch (Fig. 1B), and one with a longer internal branch (Fig. 1C). In all three cases, I focus on control of flux by, and substitutions in, enzymes in branches 1 and 2, which constitute the “focal” path (Fig. 1). As in Wright and Rausher (2010), I analyze two different types of model: one based on metabolic control theory (MCT model) (Kacser and Burns 1973; Heinrich and Rapoport 1974) and one based on Michaelis–Menten kinetics (SK, or “saturation kinetics,” model). The former approach assumes that enzymes are very far from saturation, whereas the latter allows for saturation and near-saturation. I present both models because in any given real case it is often not known which assumption is more realistic. Details of both types of model are given in the Appendix.
Both types of model incorporate a set of kinetic parameters, one for each enzyme (see Table 1 for a list of parameters). In the MCT model, these parameters are the $k_i$, the rates of conversion of substrate into product. In the SK model, the parameters are the $V_i$, the maximum velocities of the reactions. In both models, these are the parameters that are allowed to evolve, and they determine the magnitudes of the fluxes, J, down each branch of the pathway. Changes in either parameter may be due to changes in kinetic properties or to changes in enzyme abundance, and I do not distinguish between these possibilities. In addition, there are fixed (nonevolving) parameters that also influence flux: α, the ratio of the reverse to the forward rate of the reaction associated with an enzyme (see Appendix); and A, the concentration of the initial substrate in the pathway. Other fixed parameters are described below. In these analyses I assume α = 0.01, corresponding to largely, but not completely, irreversible reactions, because most metabolic reactions are largely irreversible (Wright and Rausher 2010), and because preliminary analyses indicated that smaller values of α produce qualitatively very similar results. I also arbitrarily set A = 10, since the flux equations (see Appendix) can be scaled in arbitrary units that make this parameter take on any value. A is held constant because the initial substrate is assumed to be well buffered (Wright and Rausher 2010).
Table 1. List of parameters used in the models.

| Parameter | Description |
|---|---|
| $k_i$ | Activity of enzyme i in MCT model |
| $V_i$ | Maximum velocity of enzyme i in SK model |
| $J_{2opt}$ | Optimal flux through branch 2 of the pathway |
| $J_{3opt}$ | Optimal flux through branch 3 of the pathway |
| $\sigma_2^2$ | Strength of stabilizing selection on branch 2 flux |
| $\sigma_3^2$ | Strength of stabilizing selection on branch 3 flux |
| | Variance of mutational effects on kinetic parameters |
| α | Ratio of rate of reverse reaction to rate of forward reaction |
| T | Threshold concentration of intermediates that reduces fitness |
| $J_{2opt}/J_{3opt}$ | Optimal flux ratio |
Next, the fitnesses associated with the new and old fluxes are calculated based on the assumption that stabilizing selection acts on the flux down each branch of the pathway. In the SK model, fitness is also decreased by high concentrations of intermediate substrates (for explanation, see below). The mutation is accepted (a substitution occurs) based on the probability of fixation of a mutation with a given selection coefficient (determined by difference in fitness). This constitutes one mutation cycle. A trial consists of numerous cycles (typically 50,000).
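As an illustration only, one mutation cycle of this kind can be sketched as follows. Several ingredients are assumptions that the text above does not specify: fitness is taken to be a product of two Gaussian terms, one per branch; the selection coefficient is taken to be the relative fitness difference; the fixation probability uses Kimura's diffusion approximation for a new mutation in a diploid population of size `N`; mutations are Gaussian perturbations of one randomly chosen kinetic parameter; and `toy_fluxes` is a hypothetical resistance-style stand-in for the flux equations of the Appendix. All function and parameter names are likewise hypothetical.

```python
import math
import random

def toy_fluxes(k, A=10.0):
    """Hypothetical stand-in for the flux equations: two upstream enzymes
    (k[0], k[1]) feed two branches (k[2], k[3]) and (k[4], k[5])."""
    r_up = 1.0 / k[0] + 1.0 / k[1]
    r2 = 1.0 / k[2] + 1.0 / k[3]
    r3 = 1.0 / k[4] + 1.0 / k[5]
    total = A / (r_up + r2 * r3 / (r2 + r3))
    return total * r3 / (r2 + r3), total * r2 / (r2 + r3)  # (J2, J3)

def gaussian_fitness(J2, J3, J2opt, J3opt, var2, var3):
    """Assumed form: stabilizing selection acts independently on each branch."""
    return (math.exp(-(J2 - J2opt) ** 2 / (2.0 * var2))
            * math.exp(-(J3 - J3opt) ** 2 / (2.0 * var3)))

def fixation_probability(s, N):
    """Kimura's diffusion approximation (an assumption, not from the text)."""
    if abs(s) < 1e-12:
        return 1.0 / (2 * N)          # neutral limit
    x = -4.0 * N * s
    if x > 700.0:                     # strongly deleterious; avoid overflow
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(x)

def mutation_cycle(params, flux_fn, optima, variances, N, mut_sd, rng):
    """Propose one mutation; return (parameters, index of substituted enzyme or None)."""
    i = rng.randrange(len(params))
    mutant = list(params)
    mutant[i] = max(1e-9, params[i] + rng.gauss(0.0, mut_sd))  # keep activity positive
    w_old = gaussian_fitness(*flux_fn(params), *optima, *variances)
    w_new = gaussian_fitness(*flux_fn(mutant), *optima, *variances)
    s = w_new / max(w_old, 1e-300) - 1.0   # assumed definition of the selection coefficient
    if rng.random() < fixation_probability(s, N):
        return mutant, i                   # substitution occurs
    return list(params), None              # mutation is lost
```

A trial would then call `mutation_cycle` repeatedly (for example 50,000 times), feeding each returned parameter list into the next call and recording which enzyme, if any, experienced a substitution.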
For each model, two types of simulations were conducted. In the first type (constant environment), the starting kinetic parameters were chosen randomly from a uniform distribution of values between 0 and 1 that produced fluxes from just above 0 to a maximum of A (Fig. S1). This limit was imposed because it seems unreasonable to believe that flux would be greater than the buffered concentration of the initial substrate. The purpose of this type of simulation was to explore which patterns of flux control are possible (represented by the flux controls associated with the starting kinetic parameters) and which patterns of flux control evolution produces from the universe of possible kinetic parameters. Because the assumption of random initial points may be inappropriate, a second type of simulation (fluctuating environment) was conducted. In these simulations, starting values were randomly chosen from the endpoints generated in the constant environment simulations, which are evolutionarily attainable points. The system was allowed to evolve to an equilibrium determined by the initial optimal fluxes along each branch. The optimum for branch 2 was then shifted by randomly choosing a new value, and the system was allowed to evolve for 50,000 mutation cycles. The optimum was then shifted again and the process repeated. A total of 20 shifts in optimal flux were conducted for each starting point. This type of simulation represents long-term evolution with periodic environmental change.
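The fluctuating-environment schedule can be outlined in a few lines. This is a minimal sketch under stated assumptions: `run_cycles` is a hypothetical callable that evolves the kinetic parameters for a given number of mutation cycles under a given branch 2 optimum; the new optimum is drawn uniformly from a caller-supplied range `opt_range`, because the text does not state the sampling distribution; and the initial approach to equilibrium is assumed to use the same number of cycles as each later epoch.

```python
import random

def fluctuating_environment(start_params, j2opt_initial, run_cycles, rng,
                            opt_range, n_shifts=20, cycles_per_epoch=50_000):
    """Evolve to an initial equilibrium, then shift the branch 2 optimum
    n_shifts times, evolving for cycles_per_epoch mutation cycles after each shift."""
    low, high = opt_range                      # hypothetical sampling range
    params = run_cycles(start_params, j2opt_initial, cycles_per_epoch)
    history = [(j2opt_initial, params)]
    for _ in range(n_shifts):
        j2opt = rng.uniform(low, high)         # assumed: uniform draw of the new optimum
        params = run_cycles(params, j2opt, cycles_per_epoch)
        history.append((j2opt, params))
    return history
```

Each entry of `history` pairs an optimum with the parameters reached under it, so substitution patterns can be summarized per epoch.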
For each of the four categories of models (MCT vs. SK, constant vs. fluctuating environment), 2000 (MCT model) or 200 (SK model) trials were conducted for each combination of parameters. For the MCT model, I examined all factorial combinations of two parameters that each took the values {0.3, 1, 3} and a pair of parameters that took one of the values {(1.6, 0.4), (1.3, 0.7), (1, 1), (0.7, 1.3), (0.4, 1.6)}. For the SK trials, the combinations involving the pairs (1.3, 0.7) and (0.7, 1.3) were omitted because these simulations require much more time than the MCT simulations. SK simulations included a “cost” parameter, T, which penalizes fitness when intermediate products accumulate (see Appendix). This parameter is included because in models of linear pathways, such a cost is necessary to cause equilibrium flux control coefficients (CCs) to differ among enzymes (Wright and Rausher 2012). This parameter was varied from 2 (high cost) to 50 (low cost).
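To make the size of the factorial design concrete, the combinations can be enumerated directly. The names `levels`, `pairs`, and `sk_pairs` are hypothetical labels for the two three-level factors and the paired factor; only the numerical values come from the text. The SK count below does not include the additional variation in T, whose levels are not listed.

```python
from itertools import product

levels = [0.3, 1.0, 3.0]                      # values taken by each of the two single factors
pairs = [(1.6, 0.4), (1.3, 0.7), (1.0, 1.0),  # values taken by the paired factor
         (0.7, 1.3), (0.4, 1.6)]
sk_pairs = [p for p in pairs if p not in {(1.3, 0.7), (0.7, 1.3)}]

mct_design = list(product(levels, levels, pairs))
sk_design = list(product(levels, levels, sk_pairs))

print(len(mct_design), len(sk_design))  # 45 27
```

With 2000 trials per combination, the MCT design therefore amounts to 45 × 2000 trials in each environment type.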
The models generate two types of output. The first consists of the set of flux CCs. (These CCs are equivalent to the sensitivity coefficients of Kacser and Burns 1973.) There are two CCs for each enzyme. One is the CC for flux down branch 2, which is the proportional change in flux down that branch caused by a given proportional change in enzyme activity. Symbolically, this is $(\partial J_2 / J_2) / (\partial k_i / k_i)$ for enzyme i, where $k_i$ is the activity of that enzyme ($V_i$ in the SK model). The second CC is the corresponding effect on flux down branch 3 (side branch; see Fig. 1). For most of the analysis, I will be concerned primarily with CCs associated with branch 2. For each trial, CCs were calculated at the beginning and the end of the trial.
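A control coefficient defined this way can be estimated numerically by perturbing one activity at a time. The sketch below uses a central finite difference; `toy_linear_flux` is a hypothetical stand-in (a simple resistance-style linear pathway), not the branched flux equations of the Appendix, and the step size is an arbitrary choice.

```python
def toy_linear_flux(k, A=10.0):
    """Hypothetical flux function: J = A / sum(1 / k_i) for a linear pathway."""
    return A / sum(1.0 / ki for ki in k)

def control_coefficients(flux_fn, k, rel_step=1e-6):
    """Estimate (dJ / J) / (dk_i / k_i) for each enzyme by central differences."""
    J0 = flux_fn(k)
    ccs = []
    for i in range(len(k)):
        up = list(k)
        down = list(k)
        up[i] = k[i] * (1.0 + rel_step)      # proportional increase in activity
        down[i] = k[i] * (1.0 - rel_step)    # proportional decrease in activity
        dJ = flux_fn(up) - flux_fn(down)
        ccs.append((dJ / J0) / (2.0 * rel_step))
    return ccs

ccs = control_coefficients(toy_linear_flux, [0.5, 1.0, 2.0])
```

For this toy function the enzyme with the lowest activity has the largest CC, and the CCs sum to 1. For a branched pathway, the same routine would be applied separately to the branch 2 and branch 3 fluxes to obtain the two CCs per enzyme.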
The second set of outputs consists of the proportions of substitutions associated with each enzyme in the pathway. Substitutions were divided into four categories determined by current fitness when the substitution occurred and whether it was an advantageous or deleterious substitution. In the MCT models, substitutions that occurred when fitness was less than 0.95 were considered to be in the “adaptive phase,” that is, the period in which the population is still climbing the adaptive landscape. By contrast, substitutions that occurred when fitness was greater than 0.95 were considered to be in the “equilibrium phase,” when the population is near the optimum. Although the precise fitness threshold separating these two phases is somewhat arbitrary, the choice of 0.95 produced results that seem reasonable.
Exploratory trials with the SK models indicated that evolution quickly brings populations near the optimal flux allocation, but the optimal sizes of the intermediate pools are approached much more slowly. In most runs, even after 100,000 mutation cycles, the sizes of these pools were still evolving. Consequently, I divided substitutions into adaptive and equilibrium phases using only criteria based on the deviations of $J_2$ and $J_3$ from their optima. In particular, fitness in this model is the product of three quantities. The first two are Gaussian fitness functions based on these deviations, whereas the third is a factor that penalizes high intermediate concentrations (see Appendix). I considered a substitution to have occurred in the adaptive phase when the product of the first two terms was <0.99; otherwise I considered it to have occurred in the equilibrium phase.
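The four substitution categories can be expressed as a small classification rule. This is a sketch with hypothetical names. It assumes that the fitness measure is evaluated for the population just before the substitution, that a value exactly at the threshold counts as equilibrium phase, and that a substitution with a selection coefficient of zero or less counts as deleterious. The threshold would be 0.95 applied to total fitness in the MCT models and 0.99 applied to the product of the two Gaussian terms in the SK models.

```python
def classify_substitution(fitness_measure, s, threshold):
    """Return (phase, effect) for one substitution.

    fitness_measure: total fitness (MCT) or product of the two Gaussian
                     flux terms (SK), evaluated before the substitution (assumed).
    s:               selection coefficient of the substituted mutation.
    threshold:       0.95 for MCT models, 0.99 for SK models.
    """
    phase = "adaptive" if fitness_measure < threshold else "equilibrium"
    effect = "advantageous" if s > 0 else "deleterious"
    return phase, effect
```

Tallying the returned pairs per enzyme over a trial gives the proportions of substitutions in each category.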