• Daniel M. Weinreich,

    1. Department of Ecology and Evolutionary Biology, and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
    Search for more papers by this author
  • Jennifer L. Knies

    1. Department of Ecology and Evolutionary Biology, and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
    Current affiliation:
    1. Department of Molecular Biology and Chemistry, Christopher Newport University, Newport News, Virginia
    Search for more papers by this author


The functional synthesis uses experimental methods from molecular biology, biochemistry and structural biology to decompose evolutionarily important mutations into their more proximal mechanistic determinants. However these methods are technically challenging and expensive. Noting strong formal parallels between R.A. Fisher's geometric model of adaptation and a recent model for the phenotypic basis of protein evolution, we sought to use the former to make inferences into the latter using data on pairwise fitness epistasis between mutations. We present an analytic framework for classifying pairs of mutations with respect to similarity of underlying mechanism on this basis, and also show that these data can yield an estimate of the number of mutationally labile phenotypes underlying fitness effects. We use computer simulations to explore the robustness of our approach to violations of analytic assumptions and analyze several recently published datasets. This work provides a theoretical complement to the functional synthesis as well as a novel test of Fisher's geometric model.

A mutation's effect on phenotype commonly depends on the presence or absence of other mutations in the genome, a phenomenon called epistasis. This word was coined 100 years ago by William Bateson (1909) to describe departures from the Mendelian 9:3:3:1 ratios expected in the F2 generation of a dihybrid cross involving unlinked loci (Phillips 1998, 2008). The principled analyses of such departures can yield insights into the mechanisms that determine phenotype (Avery and Wasserman 1992; Griffiths et al. 2002; Roth et al. 2009). For example in the blue-eyed Mary (Collinsia parviflora), crossing true-breeding plants with white and magenta flowers yields 100% blue flowers in the F1, and blue, magenta, and white flowers at ratios 9:3:4 in the F2. Thus, a mutation at one locus is masked in organisms homozygous mutant at the other, implying that the gene product of the masked locus acts downstream of the gene product of the masking locus.

Mendelian phenotypes are discrete, but epistasis is also widespread in continuous phenotypes and has important implications for human medicine, agriculture, and evolutionary biology (Mackay 2001). Population genetics traditionally define epistasis for fitness between two mutations i and j as

display math(1)

where Wx is the absolute fitness of an organism carrying mutation(s) x and W0 is the fitness of the ancestor. We further partition epistasis (ε ≠ 0) into positive and negative epistasis (meaning that the fitness of the double-mutant is higher and lower, respectively, than expected).

The principled analyses of the sign of fitness epistasis between deleterious mutations in pairs of genes can also yield insights into the topology of an organism's metabolic network. Building on the metabolic control analysis of Kascer and Burns (1973), Szathmáry (1993) showed that mutations in pairs of genes whose products act in the same linear pathway should exhibit positive epistasis, whereas mutations in pairs of genes whose products act in parallel, redundant pathways should exhibit negative epistasis (Segrè et al. 2005; D. M. Weinreich, unpubl. ms.). Recently, this reasoning has been used with remarkable success to correctly reconstruct metabolic networks from microbial growth rate data for pairwise gene deletions estimated computationally (Segrè et al. 2005) or measured experimentally (Costanzo et al. 2010; He et al. 2010).

Although metabolic control analysis suggests that deleterious mutations in the same gene should only exhibit negative fitness epistasis (Szathmáry 1993), experimental work illustrates that the situation is more complicated (e.g., Lunzer et al. 2005a; Weinreich et al. 2006; Bridgham et al. 2007; Lozevsky et al. 2009). Indeed, these results have stimulated experimental interest in developing a quantitative understanding of how mechanistically more proximal phenotypes drive protein evolution (the functional synthesis of Dean and Thornton 2007). We therefore sought a theoretical perspective complementary to this recent empirical work. We were specifically motivated by the fact that in many systems high-throughput reverse genetics and fitness assays are now practical, whereas the dissection of the underlying molecular biology of fitness effects remains more challenging. Our approach yields insight into more proximal determinants of fitness using only data on pairwise epistasis between deleterious mutations.

We begin here from the observation (Weinreich 2010) that a principled model of protein evolution first proposed by DePristo et al. (2005; see also Camps et al. 2007; Gu 2007; Tokuriki et al. 2008) bears strong formal similarities to Fisher's geometric model (Fisher 1930; hereafter the FGM). The FGM maps from an abstract representation of the space of all conceivable organismal phenotypes to fitness, and recently Martin et al. (Martin and Lenormand 2006; Martin et al. 2007; Chevin et al. 2010) have developed quantitative predictions for patterns of pairwise fitness epistasis under the premise of this abstract phenotypic space. We inverted this logic by asking whether data on fitness epistasis might allow us to quantitatively characterize the actual phenotypic space of the FGM in specific experimental systems. Note too that although motivated by a model of protein evolution, our approach can be applied at any level of organismal organization.

We first present a novel analytic framework that uses epistatic interactions to characterize pairs of deleterious mutations with respect to similarity of phenotypic effect in the FGM. In addition, given data on pairwise epistatic interactions among some set of deleterious mutations, our framework yields an estimate of the total number of mutationally labile dimensions in the phenotypic space of the corresponding FGM. We then use computer simulations to explore the robustness of results to violations of analytic assumptions. Finally, we examine several published datasets. Taken together this study offers a novel theoretical complement to ongoing efforts at dissecting the phenotypic determinants of fitness as well as a quantitatively rigorous examination of Fisher's geometric model.


DePristo et al. (2005) assert three facts about protein biology. First, that in addition to affecting functional properties (e.g., catalytic activity of an enzyme or binding specificity of a transcription factor), mutations in protein coding genes influence organismal fitness via several other molecular phenotypes. These may include gene transcription and message translation rates, the protein's native-form folding kinetics and thermodynamics, and its misfolding, aggregation, and degradation rates. Second, many of these phenotypes are commonly under stabilizing selection, that is intermediate values are optimal. For example, although too little folding stability might render most copies of an enzyme nonfunctional, too much stability will also hinder function (Malcolm et al. 1990) as some flexibility, termed “breathing,” is essential. (Zhou et al. 1998; Wang et al. 2002; Tomatis et al. 2005) (But see Bloom et al. 2005; Zeldovich et al. 2007 for an alternative perspective.) Finally, many mutations act pleiotropically within a protein (e.g., Raquet et al. 1995; Sideraki et al. 2001; Wang et al. 2002; Tomatis et al. 2005). Thus, DePristo et al. (2005) argue that protein evolution is influenced by stabilizing selection acting on multiple phenotypes responding pleiotropically to mutation. Those authors note that fitness epistasis emerges as a consequence of the nonlinear mapping from phenotype to fitness inherent in stabilizing selection (see also Brodie 2000; Martin et al. 2007).

These facts are illustrated in Figure 1 with data for two missense mutations in the TEM-1 β-lactamase allele, responsible for bacterial resistance against β-lactam drugs such as penicillin and cephalosporins. Introducing the G238S mutation (representing serine for glycine at residue 238; amino acid numbering as in Ambler et al. 1991) into the TEM-1 allele increases resistance against the drug cefotaxime (Wang et al. 2002; Weinreich et al. 2006). In vitro this mutation increases catalytic activity against this drug (Raquet et al. 1994; Sideraki et al. 2001; Wang et al. 2002), but it reduces thermodynamic stability (Raquet et al. 1995; Wang et al. 2002). In contrast, the M182T mutation reduces cefotaxime resistance when introduced into TEM-1 but increases resistance in the presence of G238S (Wang et al. 2002; Weinreich et al. 2006). M182T reduces in vitro activity while increasing stability (Huang and Palzkill 1997). Thus, interplay between the pleiotropy of these mutations is thought to render M182T conditionally beneficial because the double mutant enjoys increased hydrolysis (Wang et al. 2002) without loss of stability (Huang and Palzkill 1997). (See Weinreich et al. 2005, 2006 for the evolutionary significance of conditionally beneficial or “sign epistatic” mutations.)

Figure 1.

Effect of two β-lactamase mutations on cefotaxime drug resistance (minimum inhibitory concentration, in μg/mL), catalytic activity (kcat/KM in M−1·s−1), ­and thermodynamic folding stability (ΔG in kcal/mol). Values relative to the TEM-1 wild-type allele. M182T only conditionally improves cefotaxime resistance: it reduces drug resistance (gray bars) on the TEM-1 background but increases it sharply in the presence of G238S. (Data from Wang et al. 2002; Weinreich et al. 2006; see Weinreich et al. 2005 for the evolutionary implications of such epistasis.) In contrast, note that for both mutations the effect on catalytic activity and thermodynamic folding stability is roughly additive. Figure adapted from Weinreich (2010).

Fisher's geometric model (Fig. 2A) begins from the premise that the fundamental attribute of adaptation are the many simultaneous “features of conformity” (Fisher 1930, p. 38), both within an organism and between the organism and its environment. Fisher first imagined a continuous, multidimensional phenotype space; any organism can be represented by some point in this space. The biologically optimal combination of phenotypes is at the origin, and fitness elsewhere is a declining function of the distance to the origin. Thus, stabilizing selection acts simultaneously on multiple phenotypes. Second, Fisher assumed that mutations displace an organism in an arbitrary direction in phenotype space; i.e., mutations usually act pleiotropically. Thus, the FGM captures the essential elements of protein evolution outlined above (Weinreich 2010), and this point is illustrated qualitatively in Figure 2B for the G238S and M182T mutations of the TEM-1 allele of β-lactamase. The FGM is one of the few phenotypic models of evolution (reviewed in Orr 2005a) and has received extensive theoretical (Kimura 1983; Hartl and Taubes 1996; Hartl and Taubes 1998; Orr 1998, 1999, 2005b; 2006; Poon and Otto 2000; Welch and Waxman 2003; Waxman and Welch 2005; Martin and Lenormand 2006; Waxman 2006; Sella 2009; Chevin et al. 2010; Le Nagard et al. 2011; Sellis et al. 2011) and experimental (Burch and Chao 1999; Martin et al. 2007; Tenaillon et al. 2007; MacLean et al. 2010; Rokyta et al. 2011) attention.

Figure 2.

Fisher's geometric model of adaptation for n = 2 phenotypes. (A) All combinations of phenotype values are represented in n-dimensional space (here, the plane); the optimal value is represented by the point labeled zopt and an individual-labeled z is displaced from this optimum (as perhaps following an environmental perturbation). To be beneficial, a mutation dz on wild-type z must yield a phenotype z + dz lying within the circle passing through z and centered at zopt. Inset: y represents another organism with fitness equal to that of z. Note, however, that mutation dz is seen to exhibit epistasis: it is beneficial on z but deleterious on y. (B) Mutational interactions in TEM-1 β-lactamase between catalytic activity and thermodynamic stability in determining cefotaxime resistance (see Fig. 1), represented qualitatively in Fisher's geometric model. Figure adapted from Weinreich (2010).

Together these facts motivate this study, which explores pairwise epistasis for fitness between deleterious mutations under the FGM. Specifically any two mutational vectors in phenotype space define an angle θ between them, and θ influences their fitness epistasis as illustrated in Figure 3A and B. We now invert this logic to make inferences about the geometry of the phenotypic space defined by the FGM from data on epistasis for fitness between deleterious mutations.

Figure 3.

Phenotypic angle θ between mutations dzi and dzj under Fisher's geometric model. Placing any two mutational vectors dzi and dzj on a common ancestral phenotype defines a plane and an angle θ between vectors. (A, B) ∥dzi∥ and ∥dzj∥ are both constant, where ∥x∥ is the norm of vector x defined as xTSx. Thin circles have radii ∥dzi∥ and ∥dzj∥; heavy circles have radii ∥dzi+dzj∥. Note dependence of ∥dzi+dzj∥ (and hence, of epistasis) on θ. (C) Maximum and minimum values of ∥dzi+dzj∥ allowed under the model (eq. 5) correspond respectively to θ = 0° and 180°. Experimental error in fitness assays and violations of model assumptions can yield imaginary estimates of θ. Specifically when Max(∥dzi∥, ∥dzj∥) − Min(∥dzi∥, ∥dzj∥) > ∥dzi+dzj∥, the phenotypic vector representing dzi+dzj is inferred to lie inside the inner heavy circle and the arccos(·) in equation (4) will yield θFGM values of the form 180° + ci. Conversely, when ∥dzi+dzj∥ > ∥dzi∥ + ∥dzj∥, this vector is inferred to lie outside the outer heavy circle and θFGM will be of the form 0° + ci. (D) Real (dashed) and imaginary (solid) parts of θ = arccos(x)·180°/π. Note that in any case we can view results as lying on a single continuum: as already suggested in C: values of θFGM of the form 0° + ci lie beyond θ = 0° and values of θFGM of the form 180° + ci lie beyond θ = 180°. See Results and Discussion for further details.



We begin by making two assumptions: that fitness is a Gaussian function of phenotypic distance from the optimum, and that phenotypic space is a vector space, that is that mutations are additive in phenotype space (e.g., Lande 1980; Martin and Lenormand 2006; Martin et al. 2007; but see computer simulations later). The first assumption approximates many functions close to the optimum (Lande 1980; Waxman and Welch 2005; Martin and Lenormand 2006). The second assumption is equivalent to supposing that epistasis acts only on fitness and not on underlying phenotypes. There is good empirical support for this assumption among missense mutations in proteins (Lunzer et al. 2005a; Zeldovich et al. 2007; although see Yokoyama et al. 2008 for an important counterexample) and moreover recent theoretical results were found to be robust to its violation (Martin et al. 2007).

Following Martin et al. (2007; see also Fig. 2 here), given n phenotypes under selection, we represent each organism by a column vector z whose ith component (1 ≤ in) gives the organism's deviation at the ith phenotype from the optimal value. (Thus, the optimal phenotype zopt is the column vector of all zeros.) Similarly, we represent mutations by column vector dz, whose ith component gives the mutation's perturbation of the ith phenotype. Writing the ancestral organism's phenotype z0, the absolute fitness of an offspring carrying mutation dz is then

display math(2a)

where superscript T represents transposition and S is the positive semidefinite n × n selective covariance matrix. For notational convenience, we shall also sometimes employ log-transformed absolute fitness

display math(2b)

Now writing the weighted inner product 〈dzi, dzj〉 = dziTSdzj and corresponding vector norm ∥dzi2 = 〈dzi, dzi〉 = dziTSdzi, the phenotypic angle between two mutational vectors dzi and dzj is defined by

display math(3)

the right-hand side of which can in some cases be written in terms of empirical fitness data. To see this note first that under equation (2b), w(z0 + dzi + dzj) + w(z0) − w(z0 + dzi) − w(z0 + dzj) = −dziTSdzj. This expression captures the epistasis between mutations dzi and dzj, and while slightly different than the classical definition (our equation (1), it is identical to eij of Martin et al. (2007; see their eq. 2, although that derivation is given in terms of relative rather than absolute fitnesses; see also Bonhoeffer et al. 2004). Importantly, assuming a Gaussian fitness function this empirical expression for dziTSdzj is correct for all ancestral phenotypes z0 because it simply reflects the (invariant) curvature of the function, as already noted (Martin et al. 2007). In contrast the denominator of equation (3) can be expressed in terms of empirical fitness data only by assuming z0 = zopt. In this case, we invert our equation (2b) to find 2w(dzi) = −dziTSdzi. Substituting these results into equation (3) yields

display math(4)

as the phenotypic angle between mutations dzi and dzj.

Equation (4) thus uses fitness data to give insight into the phenotypic space defined by the FGM; we write θFGM to highlight this connection. Specifically, the cosine of θFGM is Martin et al. (2007)'s epistasis eij between mutations i and j, normalized by the magnitude of their individual effects. Consequently pairs of strongly epistatic mutations are parallel or antiparallel in phenotype space (i.e., have correlated phenotypic effects, although the converse is not true in general; see Results).

Importantly, our norm in phenotypic space must obey the triangle and reverse triangle inequalities:

display math(5)

These bounds correspond respectively to θ = 180° and θ = 0° and are shown as heavy lines in Figure 3C. Violating this assumption (here as a consequence of experimental error in fitness assays or violations of assumptions leading to eq. 4) causes the expression inside the curly brackets of equation (4) to fall outside the interval [−1, 1] and will yield imaginary values of θFGM.

However as illustrated by Figure 3D, all real and imaginary values of θFGM lie on a continuum: values of θFGM of the form 0° + ci are simply more extreme than θ = 0° and values of θFGM of the form 180° + ci are simply more extreme than θ = 180°. (Here i is the imaginary unit defined by (−1)½ and c is some real constant.) We therefore treat such observations as categorical data lying on either side of the continuous interval [0°, 180°].


The distribution of values of θFGM observed also contains information about n, the number of mutationally labile dimensions in the underlying phenotype space. It has previously been shown that that the probability density function for angles θ between some reference direction and a set of vectors of random orientation in an n-dimensional space is

display math(6)

(Hartl and Taubes 1998; Poon and Otto 2000, where Z is the scaling constant required to get a proper probability density function over [0°, 180°]). This is also the probability density function for θ between vectors sampled from a multivariate Gaussian distribution with mean 0 and covariance matrix M = c·I, where c is any positive scalar and I is the identity matrix (not shown). Thus, assuming this sampling scheme, equation (6) gives the expected distribution of θFGM between some focal mutation dzi and all others in the dataset. But because this distribution is independent of the choice of focal mutation, it applies equally to the unconditioned distribution of θFGM across an entire dataset. After discarding imaginary values of θFGM we thus employed this density function in a likelihood framework to estimate math formula together with 95% confidence intervals (CIs).


Computer simulations were run using MATLAB R2011a (Mathworks, Natick, MA) to confirm analytical results and to explore the effect of experimental noise in fitness assays and relaxations of model assumptions. All code has been archived at Following Martin et al. (2006, 2007), we assume that the mutational probability density function is Gaussian with covariance matrix M. For the purpose of exploring the robustness of equation (4) we further assume a rotationally symmetric mutational covariance matrix M = m × I (because its derivation is independent of M) with standard deviation m = 0.1 phenotypic units. We relaxed this assumption to assess our ability to estimate phenotypic dimensionality. Because equation (4) is also independent of selection matrix S, we set S = I throughout.


In each run, we sampled 200 random mutational vectors dzi and dzj and computed θtrue between each of the resulting math formula pairs of mutations using equation (3) and S = I. We then tabulated the mean and standard deviation for corresponding values of θFGM (as described next) within a sliding window of θtrue values 10° wide. We also tabulated the frequency of each kind of imaginary value of θFGM observed in each sliding window.

To explore the sensitivity of equation (4) to experimental noise in fitness assays, we first added a normal deviate with mean zero and standard deviation σnoise to fitness equation (2a). For each pair of mutations, we then computed θFGM with equation (4) using the average of R replicate noisy fitness measures for each mutational vector. We explored values of experimental noise σnoise = 0.01 and 0.02, experimental replication R = 10 and 40, and phenotypic dimensions, n = 2, 5, and 20. Because fitness values in our simulations are in the range of 0.8–1.0 (not shown), values of σnoise here are roughly comparable to the CV values shown in Table 1 (experimental standard error normalize by experimental mean values, see Analysis of Empirical Data). Simulated levels of replication are consistent with current experimental protocols (e.g., Sanjuán et al. 2004), and simulated phenotypic dimensionality values are consistent with estimates here (Fig. 6) and elsewhere (Tenaillon et al. 2007).

Table 1. Published biological datasets analyzed
 Number of    
Organism,variable sites orSource of   
gene(s)loci analyzedmutationsFitness assayFitness assay CVCitation
  • 1

    Reported in Lunzer et al. (2002).

Escherichia coli, isopropylmalate dehydrogenase6Site-directedChemostat competition assay against the wild-type0.00371Lunzer et al. (2005a, b)
Escherichia coli, β-lactamase5Site-directedMinimum inhibitory concentration against cefotaxime0.087Weinreich et al. (2006)
Plasmodium falciparum dihydrofolate reductase4Site-directedMinimum inhibitory concentration against pyrimethamineNo data reportedLozovsky et al. (2009)
Escherichia coli, gyrA, rpoB, and rpsL3RandomPaired growth assays against the wild-type in LB0.68Trindade et al. (2009a, b)
Eschericia coli, Mini-Tn10 mutants8RandomPaired growth assay against ancestor in DM250.018Elena and Lenski (1997a, b)
Vesicular stomatitis virus N, P, M, G, and L28RandomIntrinsic growth rate in mammalian cell culture0.056Sanjuan et al. (2004)
Aspergillus niger, arg, pyr, leu, phe, oli, and crn7RandomRadial growth rate of colony0.030de Visser et al. (2009)

To test consequences of our assumption that the ancestral phenotype is at the fitness optimum, we also ran simulations in which W(z0) = 0.99 and 0.90 for phenotypic dimension n = 2, 5, and 20. We accomplished this by setting each component of z0 to (2ln[W(z0)]/n)½. To relax the assumption that mutations are additive in phenotype space, we added a phenotypic epistasis vector E to each double mutant before applying equation (2a). Each component of E was drawn from a normal distribution with mean 0 and standard deviation σepistasis, parameterized as a proportion of mean mutational vector length m. Values of σepistasis = 0.1m and 0.5m were examined.


To explore the sensitivity of our approach to estimating phenotypic dimensionality n in the face of experimental noise and violations of model assumptions, we simulated samples of 10 or 25 mutations each, setting n = 2, 5, or 20, and allowing experimental noise in fitness assays, nonoptimal ancestral phenotypes, or phenotypic epistasis as described earlier. We also examined violations of the assumption of rotational symmetry in mutational matrix M underlying equation (6). Here again, samples of 10 or 25 mutations were generated in n = 2, 5, or 20 dimensions with mutational matrix M = m × I as above, except that we set matrix entries m1,1 = m/2, or m1,2 = m/2, or both.

For each sample, we discarded imaginary values of θFGM, calculated the likelihood surface for a range of values of n between 2 and 40 in steps of 0.05, and recorded the maximum likelihood estimate together with the 95% CI bounds. These simulated datasets are comparable in size (10 and 25 mutations yield at most 45 and 300 comparisons, respectively) and phenotypic dimensionality to those in Table 3.


Theoretical results developed here were applied to the seven published datasets shown in Table 1. Only deleterious mutations described in Elena and Lenski (1997a,b) and Sanjuán (2004) were considered, and in all datasets lethal mutations were removed (as required by eq. 4; for this reason we analyzed MIC values presented in Lozevsky et al. 2009 rather than IC50 values). We then used equation (4) to compute θFGM between all pairs of remaining mutations in each dataset, and the number of dimensions of the underlying phenotypic space was estimated as described. Finally, we used the Kolmogorov–Smirnov goodness-of-fit test (Sokal and Rohlf 1995) as implemented in Matlab to compare the observed distributions of θFGM for each dataset with model expectations (eq. 6) for given math formula.


Under analytic assumptions, Fisher's geometric model allows us to use epistasis for fitness between pairs of deleterious mutations to characterize those pairs with respect to similarity of underlying mechanism. Specifically, pairs of mutations with very similar mechanisms will exhibit strong epistasis and values of θFGM near 0° or 180°, although the converse need not be true. (As the numerator of eq. 3 illustrates, epistasis will only be absent if the two mutational vectors are orthogonal in the space defined by matrix S, but such mutations will only exhibit uncorrelated phenotypes if all off-diagonal elements of S are 0.) Moreover, the distribution of epistasis across a panel of deleterious mutations can yield an estimate of the number of mutationally labile phenotypes contributing to fitness.


Under analytic assumptions (i.e., setting σnoise = σepistasis = 0, z0 = zopt and random positive semidefinite S) simulation results were precisely concordant with equation (4) for all parameter values examined (not shown). Figures 4 and 5 present simulation results that explore sensitivity of equation (4) to experimental noise and to relaxing model assumptions, respectively. Each panel displays the mean real θFGM values (± 1 SD) across a sliding window of θtrue values 10° wide. In addition, we show the frequency of each of the two classes of imaginary values of θFGM observed each in each sliding window. Finally, in each panel, we report the r2 value for a linear regression of real estimates of θ forced through the origin (see Discussion).

Figure 4.

Estimation of θFGM in computer simulations allowing experimental error in fitness assay. Equation (4) can return real and imaginary values. Heavy lines represent mean ±1 SD of real values of θFGM (eq. 4) for a given θtrue (computed with eq. 3) in a sliding window 10° wide; heavy dashed line represents θFGM = θtrue. (Scale on left y-axis.) Light lines represent fraction of imaginary values of θFGM's in the same sliding window (solid: frequency of imaginary θFGM with real part = 0°; dashed: frequency of imaginary θFGM's with real part = 180°; scale on right y-axis). All parameter values as shown (phenotypic dimensionality n, experimental noise σnoise, experimental replicates R). Phenotypic epistasis (σepistasis) equal to 0, z0 = zopt, and S = I.

Figure 5.

Estimation of θFGM in computer simulations when model assumptions are relaxed. (A) Relaxing z0 = zopt. (B) Relaxing phenotypic additivity. Note that as n increases, values of θtrue become increasingly concentrated around 90° (Hartl and Taubes 1998; Poon and Otto 2000). See Figure 4 legend for additional details.

We note first that in all cases, imaginary values of θFGM of the form 0° + ci are primarily observed when θtrue is small, whereas imaginary values of the form 180° + ci are associated with large values of θtrue. This is to be expected given our proposed interpretation of such observations (Fig. 3D). Moreover, conditioned on θFGM being real, θFGM almost always overestimates θtrue when θtrue is small and underestimates it when large. This is a simple consequence of our conditioning: the “missing,” more extreme estimates of θFGM are to be found in the imaginary realizations. Finally, in almost all cases summing across all values of θtrue, the frequency of imaginary values of θFGM of the form 0° + ci exceeds the frequency of values of the form 180° + ci. This can be understood by examining Figure 3C, which illustrates that for all parameter values the area outside the larger heavy circle [representing ∥dzi∥ + ∥dzj∥)] exceeds that inside the smaller heavy circle [representing Max(∥dzi∥, ∥dzj∥) – Min(∥dzi∥, ∥dzj∥)]. Put another way, experimental noise and violations of model assumptions perturb our estimate of where the vector dzi + dzj lies. But when θtrue is near 180° such perturbation is less likely to result in an imaginary estimate of θFGM than when it is near 0°.

Figure 4 illustrates that conditioning on the observation of real values of θFGM, equation (4) can estimate θtrue quite well from modestly noisy fitness data. Strong, nearly unbiased predictive power is observed in the first panel and standard deviation declines as phenotypic dimension increases (the trend continues for n = 20; not shown). This can be understood as a consequence of two facts. First, as dimension increases the probability distribution of realized θFGM becomes increasingly concentrated around 90° (Hartl and Taubes 1996; Poon and Otto 2000; see also Figs. 5, 6 here). This alone accounts for the drop in frequency of imaginary values of θFGM as dimension rises. Second, because S = I in simulations, as θFGM approaches 90° W(dzi + dzj) approaches W(dzi)W(dzi). Thus, even after allowing noise in fitness assays, the numerator of the expression inside curly braces in equation (4) will tend to be small, minimizing the standard deviation in θFGM. Increasing experimental error (σnoise) increases standard deviation of θFGM for all values of θ, although as expected this effect can be offset by additional fitness assay replicates (R).

Figure 6.

Frequency of real pairwise angles between pairs of deleterious mutations in published datasets shown in Table 1. Filled: θFGM computed with equation (4). Open: expectation as described, corresponding to phenotypic dimensionality math formula estimated for each dataset. (A) Lunzer et al. (2005a, b), (B) Weinreich et al. (2006), (C) Lozovsky et al. (2009), (D) Trindade et al. (2009a, b), (E) Elena and Lenski(1997a, b), (F) Sanjuan et al (2004), (G) de Visser et al. (2009), (H) Bottom-up estimates of θFS derived from Lunzer et al. (2005b). (See text for details.)

Sensitivity of equation (4) to violations of two assumptions is explored in Figure 5. Figure 5A illustrates that when the ancestor is not at the phenotypic optimum (i.e., z0­zopt), values of θFGM are skewed toward intermediate values. As noted earlier, this effect is entirely the result of errors introduced through the denominator of equation (4) because −dziTSdzj is only equal to 2ln[W(dzi)] when z0­ = zopt. Consequently, the magnitude of the effect is seen to increase as the fitness of z0 declines. However, the effect is moderated with increasing phenotypic dimension because the phenotypic perturbation introduced by any given fitness drop in z0 is proportional to n−½.

Figure 5B illustrates the consequences of phenotypic epistasis. Again, increased deviations from model assumptions increase the error in equation (4). We also note that in contrast to the situation in Figure 5A, increasing phenotypic dimensions render results more sensitive to violations of model assumptions. This is to be expected as we add a normally distributed random deviate to each component of the double-mutant's phenotype. Thus, the net error in z0 + dzi + dzj is proportional to n½.


Sensitivity of our estimator for phenotypic dimensionality in the face of experimental noise and violation of model assumptions is shown in Table 2. In general, we note that the maximum likelihood estimate is a bit below the center of the 95% CI, signaling a slightly positive skew to the likelihood function. We also note a slight bias toward overestimating phenotypic dimensionality in the face of experimental error, most notably when z0zopt. This is to be expected: by conditioning on real values of θFGM we are discarding realizations that otherwise would have fallen in the tails of the distribution of angles (Fig. 3), making phenotypic dimensionality appear higher than it is (eq. 6). However, deviations from rotationally symmetric mutations bias dimensionality estimates downward. This is also to be expected: as mutational variance in one dimension is reduced (here, m1,1 has been reduced from 0.1 to 0.05) mutations will become increasingly skewed in the remaining dimension(s). And similarly, as mutational covariance is increased (here m1,2 = m2,1 has increased from 0.0 to 0.05), mutations will more often have a common orientation.

Table 2. Maximum likelihood estimate and confidence intervals on phenotypic dimensionality math formula in computer simulations
  • 1

    Default values used throughout except where noted. σnoise = experimental noise (default: 0.0); R = experimental replicates (default: 1), W(z0) = fitness of ancestor (default: 1.0), σepistasis = phenotypic epistasis (default: 0.0m), M = m × I mutational covariance matrix (default: m 0.1), selection covariance matrix S = I throughout.

True phenotypic dimensionality, n25202520
Number of mutations sampled101010252525
Using default values11.954.1017.652.005.0518.70
 (1.70 – 2.35)(3.45 – 6.10)(12.20 – 24.00)(1.90 – 2.10)(4.50 – 5.65)(16.15 – 21.70)
σnoise = 0.01, R = 102.554.7525.052.705.1020.80
 (2.05 – 3.20)(3.60 – 6.45)(16.80 – 29.60)(2.50 – 2.95)(4.60 – 5.75)(17.90 – 23.95)
σnoise = 0.02, R = 102.705.3520.752.504.9519.15
 (2.20 – 3.50)(4.05 – 7.35)(14.00 – 24.65)(2.30 – 2.75)(4.45 – 5.55)(16.55 – 22.20)
σnoise = 0.02, R = 402.355.4515.902.304.4520.15
 (1.95 – 2.95)(4.10 – 7.50)(11.05 – 23.20)(2.15 -2.50)(4.00 – 4.95)(17.40 – 23.35)
W(z0) = 0.992.856.1521.803.655.1020.60
 (2.25 – 3.70)(4.55 – 8.50)(14.95 – 29.05)(3.30 – 4.10)(4.55 – 5.70)(17.75 – 23.90)
W(z0) = 0.907.0012.5025.205.607.7521.90
 (5.00 – 10.15)(8.80 – 18.20)(17.20 – 33.85)(4.90 – 6.45)(6.80 – 8.90)(18.90 – 24.45)
σepistasis = 0.1m2.154.4515.852.205.0019.30
 (1.85 – 2.60)(3.40 – 6.00)(11.05 – 23.15)(2.05 – 2.35)(4.45 – 5.65)(16.70 – 22.40)
σepistasis = 0.5m2.505.5525.202.754.8519.55
 (2.05 – 3.20)(4.15 – 7.65)(17.20 – 33.85)(2.55 – 3.05)(4.35 – 5.45)(16.90 – 22.70)
m1,1 = m/21.855.0516.051.904.4518.40
 (1.60 – 2.15)(3.85 – 6.90)(11.15 – 23.40)(1.80 – 2.00)(4.00 – 5.00)(15.90 – 21.35
m1,2 = m2,1 = m/21.603.2016.151.653.8516.55
 (1.45 – 1.85)(2.60 – 4.15)(11.25 – 23.60)(1.60 – 1.75)(3.50 – 4.30)(14.30 – 19.15)
m1,1 = m1,2 = m2,1 = m/21.555.2020.351.704.2017.45
 (1.40 – 1.75)(3.90 – 7.10)(14.05 – 29.90)(1.65 – 1.80)(3.80 – 4.70)(15.10 – 20.25)


We computed θFGM between all pairs of mutations in each of the seven published datasets shown in Table 1, and used those distributions to compute the estimated phenotypic dimensionality math formula together with the 95% CI, as described. Observed and expected distributions, math formula values and confidence intervals, and P-values for goodness-of-fit to equation (6) are all shown for each dataset in Figure 6. Further details of these analyses are provided in Table 3.

Table 3. Analyses of published datasets.1
  Number ofNumber ofKolmogorov–Smirnov 
Organism,Number oftimes θFGM =times θFGM =test statistic andMean
gene(s)comparisons0 + ci observed180 + ci observedassociated P-valueθFGM
  • 1

    See Table 1 for citations.

  • 2

    Bottom-up estimates based on angles between phenotypic vectors. See text for details.

Escherichia coli, isopropylmalate dehydrogenase674300.502, 4.1 × 10−658.4°
Escherichia coli, β-lactamase10100.27, 0.4694.8°
Plasmodium falciparum dihydrofolate reductase6110.59, 0.07133.0°
Escherichia coli, gyrA, rpoB, and rpsL980160.39, 8.4 × 10−12117.4°
Escherichia coli, Mini-Tn10 mutants21640.37, 0.07104.5°
Vesicular stomatitis virus31150.122, 0.8092.7°
Aspergillus niger, arg, pyr, leu, phe, oli, and crn14000.652, 2.4 × 10−6120.1°
Escherichia coli, isopropylmalate dehydrogenase, bottom-up θFS267N/AN/A0.422, 4.4 × 10−1177.4°


Evolution by natural selection enriches populations for those organisms with higher reproductive success (Darwin 1859), but what is the underlying biology responsible for these differences? Fitness is a complex property that emerges from many mechanistically more proximal phenotypes, and R.A. Fisher's geometric model (1930; here, the FGM) provides a popular, explicit, and abstract representations of that mapping (although see Kimura and Maruyama 1966; Kondrashov 1988; Rice 2000 for other mappings). Noting the close qualitative parallels to a recent model of protein evolution (DePristo et al. 2005; see Fig. 2 here), we thought to use the FGM to explore our ability to make inferences into this underlying phenotypic space from data on epistasis for fitness between deleterious mutations. We were motivated by the many low-throughput (e.g., Lunzer et al. 2005a; Weinreich et al. 2006; Lozevsky et al. 2009) and high-throughput (e.g., Costanzo et al. 2010; He et al. 2010) datasets now being developed. We sought two practical contributions for the experimentalist: the ability to group mechanistically related mutations (i.e., those with highly correlated phenotypic effects) and the ability to make gross statements about the phenotypic complexity of the system (i.e., the dimensionality of phenotypic space).


Importantly however, our approach is complicated by the fact that equation (4) can return biologically ambiguous results in the form of imaginary values (Table 3). Indeed, all datasets considered here except one (de Visser et al. 2009) include some imaginary values of θFGM; these range in frequency from 10% (Weinreich et al. 2006) to 64% (Lunzer et al. 2005a). In simulations, we observed that imaginary values occur as a consequence of measurement error in fitness assays (Fig. 4) as well as violations of model assumptions (Fig. 5). However, no one of these factors appears to fully explain the source of these observations in the data.

Simulations demonstrate that the fraction of imaginary values should rise with experimental error in fitness assays and drop with phenotypic dimensionality (Fig. 4). We thus first tabulated experimental error (CV) for six of the seven datasets examined (Table 1; no error estimates are given in Lozevsky et al. 2009), but contrary to the suggestion from simulation results, the fraction of imaginary values of θFGM observed is weakly and negatively correlated with experimental CV (r2 = 7.3%). We next regressed the fraction of imaginary values of θFGM against our estimated phenotypic dimensionality for each dataset. Although this correlation is also negative, consistent with expectation, it is also only weakly so (r2 = 17.7%). Thus, although some imaginary values of θFGM may be the consequence of experimental error in fitness assays, we wondered whether they may not also derive in part from violations of model assumptions: z0 = zopt, additivity among phenotypic vectors, and a Gaussian fitness function.

For three datasets (Lunzer et al. 2005a; de Visser et al. 2009; Trindade et al. 2009bb), we computed epistasis on a naturally occurring wild-type isolate, which we might reasonably expect to be at or near the (local) fitness optimum, and these studies describe no beneficial mutations on those backgrounds. Two datasets (Weinreich et al. 2006; Lozevsky et al. 2009) employ drug resistance as a proxy for fitness, and we computed epistasis on the highest-resistance variant described. Again, these studies describe no beneficial mutations on those backgrounds. (And in one case extensive mutagenesis has failed to reveal a higher-resistance variant; Orencia et al. 2001; Salverda et al. 2011.) Although beneficial mutations on the ancestor were described in the last two datasets (Elena and Lenski 1997b; Sanjuán et al. 2004) we believe it is unlikely that z0zopt is responsible for many of the imaginary values of θFGM in Table 3. Indeed, in the case where the starting genotype is demonstrably far the fitness optimum (Sanjuán et al. 2004), the frequency of imaginary values of θFGM is comparatively modest.

To our knowledge, data on phenotypic additivity are scarce. One case however comes from Lunzer et al. (2005a,b). In addition to measuring organismal fitness, those authors performed in vitro assays for four functionally proximal biochemical phenotypes likely responsible for fitness effects of mutations in their system. They find that mutational effects on log-transformed values of these phenotypes are very nearly additive (r2 ≥ 0.92). Thus at present, we find little evidence to suggest that phenotypes are not additive.

Finally, we acknowledge that non-Gaussian fitness functions can also give rise to imaginary values of θFGM. Although sensitivity of our approach to this effect has not been explored, principled models of the phenotype-to-fitness function for both IMDH (eq. 3 in Lunzer et al. 2005a) and β-lactamase (eq. 4 in Brown et al. 2009) are distinctly non-Gaussian. Following others (e.g., Perelson and Oster 1979; Lapedes and Farber 2001) we are presently pursuing an alternative approach that relieves us of the burden of this assumption.


Here we have employed what may be regarded as a top-down approach to quantitatively characterize the phenotypic space underlying the FGM. The functional synthesis (Dean and Thornton 2007) represents a complementary bottom-up approach: characterization of the mechanistically proximal phenotypic traits responsible for fitness effects of particular mutations. As already noted, in addition to fitness Lunzer et al. (2005a, b) characterized mutational perturbation in four candidate proximal phenotypes related to enzyme–substrate affinity and enzyme kinetics. These data allow further examination of equation (4), because we can explicitly represent mutations as phenotypic vectors math formula where Δ is relative to the allele of highest fitness. Application of equation (3) to these vectors thus yields independent measures of pairwise angles θ between mutations in phenotypic space. We designate these values θFS to reflect their connection to the functional synthesis.

To determine how well θFS predicts θFGM, we regressed the latter on the former (Fig. 7). Consistent with expectations we observe a significantly positive slope between real values of θFGM and θFS (Spearman rank correlation coefficient ρ = 0.408, P = 0.024 by a permutation test). The slope of the regression through the origin (0.69) is different from 1.0, although this should perhaps not surprise us because the numeric angle θ between two vectors is sensitive to scaling of phenotypic space. For example, Lunzer et al. (2005a) report values of Vmax/KM in units of M−1·s−1; rescaling Vmax/KM in units of mM−1·s−1 would change all values of θFS without changing any of the biology. Indeed, we are substantially heartened by this result for two reasons. First, rescaling dimensions of phenotypic space can even perturb rank orders of angles when n > 2. Second, although θFS offers direct access to mutational vectors in phenotype space, θFGM values are influenced by the (unknown) asymmetries of the true selection covariance matrix S. Figure 7 also shows that the frequency of imaginary values of θFGM of the form 0° + ci declines as θFS rises, consistent with model expectations. (No values of θFGM of the form 180° + c were observed in this dataset; Table 3.)

Figure 7.

Bottom-up analysis of phenotypic angles among mutations in isopropylmalate dehydrogenase examined in Lunzer et al. (2005a, b). For each pair of mutations, x-axis represents pairwise phenotypic angle θFS computed with equation (3) using phenotypic perturbations in log-transformed cofactor performance and binding affinity estimated by Lunzer et al. (2005a, b). Left y-axis represents real θFGM values computed from fitness data using equation (4); dashed line shows linear regression through the origin. Right y-axis represents the fraction of imaginary values of θFGM in a sliding window 10° wide. Only imaginary θFGM values of the form 0 + ci were observed in this dataset. Cf. Figures 4 and 5.


Our second objective was to provide some access to the phenotypic complexity underlying fitness. Interestingly, values reported here (Fig. 6) fall between those previously developed under the FGM by others (0.2–2.5 in Martin and Lenormand 2006; and 10–45 in Tenaillon et al. 2007; see also Lourenço et al. 2011 for an attempt to bridge those divergent estimates). Although equation (4) examines curvature of the fitness function at the fitness optimum, Martin and Lenormand (2006) employ data from mutation accumulation experiments and their results thus characterize the fitness function over a wider mutational scope. Finally, estimates in Tenaillon et al. (2007) are based on differences in fitness at mutation/selection/drift equilibrium as a function of population size, and may therefore represent the analysis covering the largest genetic scale.

As above, phenotypic data in Lunzer et al. (2005b) also allow us to perform a bottom-up complement to our top-down estimation of math formula. The distribution of θFS (Fig. 6H) implies that these vectors lie in a higher dimensionality space than do results based on θFGM (Fig. 6A; math formula = 10.90 vs. 3.45), although the two distributions are statistically indistinguishable (G = 2.45, P > 0.98, df = 9).

A bias in math formula derives from our assumption that mutational vectors are uniformly distributed in phenotypic space. To the extent that they are not, ours are underestimates (Table 2). We thus regard our approach as estimating an “effective” number of dimensions in phenotype space (i.e., the number of dimensions of equal and uncorrelated mutational lability that would give rise to the given distribution of θFGM). Although an analytic analog to equation (6) for arbitrary mutational covariance matrix M appears possible, it would not be helpful here because we would then be faced with estimating both math formula and math formula from single distributions of θFGM.

In addition, because equation (6) is undefined when θ is imaginary, it was necessary to omit such values of θFGM when we estimated phenotypic dimensionality. This protocol biases math formula upward. Recognizing the diversity of datasets in our study, we however observe no signal of this bias: the correlation between math formula and the frequency of imaginary values of θFGM is in fact weakly negative.

Finally, the distribution of θFGM deviates significantly from the expectation given by equation (6) in three cases (Fig. 6; Table 3). In one (Lunzer et al. 2005a, b), the mean (58.4°) is far less than expectation (as is the mean θFS from the same system­: 77.4°), implying that these mutations are underdispersed in phenotype space. This effect is already visible in Figure 3 of Lunzer et al. (2005a), which shows that all phenotypic mutations lie in a narrow sector emanating from the wild type. Means are larger than expectation in the other two cases (de Visser et al. 2009; Trindade et al. 2009b; 120.2° and 117.4°, respectively), implying mutational overdispersion, and we imagine that in these systems ridges of high mutational density radiate from the wild type, separated by valleys of low mutational density approximately 120° wide. Both of these situations imply correlations between mutations (positive in the first case and negative in the latter two; note that these correlations are distinct from correlations within mutations represented by the matrix M), although the biological and evolutionary implications are as yet unclear to us. No particular association between these deviations and the sources of mutations (Table 1) was observed.


Using Fisher's geometric model we have developed a novel approach to characterizing aspects of the phenotypic determinants of deleterious mutations using data on pairwise epistasis. Despite our original motivation for adopting the FGM (Fig. 2), we are generally heartened by the apparent utility of our approach when applied to the heterogeneous datasets (Table 1: four from bacteria, two from eukaryotes and one from a virus). Moreover, although growth rate is the fitness assay in five cases, drug resistance is employed in two. And finally, three of the datasets consider mutations in a single gene while the other four use mutations more broadly distributed across the genome.

We conclude by noting that much of evolutionary biology rightly focuses on beneficial mutations, those which fuel adaptation. In contrast, we have only considered epistasis between deleterious mutations, both because they are far more plentiful and because our algebraic framework cannot accommodate beneficial mutations. Moreover statistical patterns of fitness epistasis among deleterious mutations do not always mirror those among beneficial mutations (Sanjuán et al. 2004; Rokyta et al. 2011). However, our interest in epistasis here is limited to making inferences into the mechanistically proximal phenotypes underlying mutational effects on fitness. And as it seems reasonable to assume that deleterious and beneficial mutations operate in the same phenotypic space, we suggest that application of the framework developed here will yield dividends equally applicable to the molecular basis of Darwinian evolution.

Associate Editor: J. Kelly


L. Chao first drew our attention to the FGM and B. Hickey proposed experiments that triggered this line of thinking. Several suggestions by J. Kelly and two very generous reviewers substantially improved this manuscript. This work was supported by DMW's startup award from Brown University, National Science Foundation award 1038657 to DMW, and National Institutes of Health NRSA 1 F32 GM086105 to JLK.