We used joint-scaling analyses in conjunction with rearing temperature variation to investigate the contributions of additive, non-additive, and environmental effects to genetic divergence and incipient speciation among 12 populations of the red flour beetle, Tribolium castaneum, with small levels of pairwise nuclear genetic divergence (0.033 < Nei's D < 0.125). For 15 population pairs we created a full spectrum of line crosses (two parental, two reciprocal F1's, four F2's, and eight backcrosses), reared them at multiple temperatures, and analyzed the numbers and developmental defects of offspring. We assayed a total of 219,388 offspring from 5147 families. Failed crosses occurred predominately in F2's, giving evidence of F2 breakdown within this species. In all cases where a significant model could be fit to the data on offspring number, we observed at least one type of digenic epistasis. We also found maternal and cytoplasmic effects to be common components of divergence among T. castaneum populations. In some cases, the most complex model tested (additive, dominance, epistatic, maternal, and cytoplasmic effects) did not provide a significant fit to the data, suggesting that linkage or higher order epistasis is involved in differentiation between some populations. For the limb deformity data, we observed significant genotype-by-environment interaction in most crosses and pure parent crosses tended to have fewer deformities than hybrid crosses. Complexity of genetic architecture was not correlated with either geographic distance or genetic distance. Our results support the view that genetic incompatibilities responsible for postzygotic isolation, an important component of speciation, may be a natural but serendipitous consequence of nonadditive genetic effects and structured populations.

Because the genetic architecture underlying a trait governs its response to selection, drift, and mutation, much of evolutionary theory is predicated on assumptions about the way genes act alone or in concert with their genetic and environmental context to produce a phenotype. At present, evolutionary biologists do not agree upon which assumptions most accurately reflect the way genes behave in nature (e.g., Fisher 1930; Wright 1931; reviewed in Provine 1985; Coyne et al. 1997; Wade and Goodnight 1998).

It is well established that adaptation proceeds most efficiently when the effect of an allelic substitution is independent of the genetic context (i.e., additive effect, Fisher 1941; Crow and Kimura 1970; Lande and Arnold 1983; Falconer and Mackay 1996; Lynch and Walsh 1998). The consequences of gene interactions (epistasis) are also well documented, although less simple. Depending on the type of genetic interaction and population demography modeled, epistasis can impede the response to selection by changing the relative fitness of genotypes, or accelerate it by conversion of epistatic to additive variation (Goodnight 1988, 2000; Wade 2000a, 2002; Carter et al. 2005; Carlborg et al. 2006). Epistasis may also facilitate several evolutionary phenomena that are difficult to explain with purely additive models, such as the origin of sex and recombination (Charlesworth 1990; Barton 1995; Peters and Lively 1999), mating system evolution (Charlesworth and Charlesworth 1990; Schierup and Christiansen 1996; Jacobs and Wade 2003), and developmental robustness and canalization (Rice 1998; Bergman and Siegal 2003; de Visser et al. 2003; Flatt 2005).

Additionally, there is a general tendency to transition from theory emphasizing within locus effects to theory emphasizing interactions among loci as divergence between mates increases (Lynch 1991). For example, inbreeding depression and heterosis between closely related individuals are traditionally credited to dominance effects (Bruce 1910; Crow 1948; Lande and Schemske 1985; Charlesworth and Charlesworth 1999; but see Moorad and Wade 2005), whereas outcrossing depression and speciation are typically attributed to the breakup of coadapted gene complexes (Mayr 1970; Carson and Templeton 1984) and/or negative interactions between loci (Dobzhansky 1937; Muller 1940, 1942; Gavrilets 1997, 2003; Demuth and Wade 2005). This dichotomy in emphasis is nowhere more evident than in the standard model for the origin of postzygotic isolation where it has been argued that epistasis is irrelevant within lineages but causes incompatibility between lineages (Coyne et al. 1997; Orr 2001). Reconciling whether there is a real change in genic effects with change in context will require relevant empirical measures of genetic architecture along a continuum of divergence (Lynch 1991).

The difficulty in obtaining, and consequent rarity, of adequate empirical data is a major reason for lack of consensus concerning the evolutionary importance of epistasis (Barker 1979; Lynch 1991; Whitlock et al. 1995). Traditional variance partitioning methods and most quantitative trait locus studies are plagued by low statistical power to detect nonadditive effects, but are the most widely adopted means of dissecting genetic architectures within populations or species (reviewed in Demuth and Wade 2006a). An alternative method involving analysis of line means (see Joint-Scaling Analysis) may be more powerful (Fenster et al. 1997; Kelly 2005), but has only been employed by evolutionary biologists in a few systems (e.g., Lair et al. 1997; Edmands 1999; Fenster and Galloway 2000b; Kelly 2005).

The contributions of other nonadditive effects such as maternal, cytoplasmic, and genotype-by-environment interaction (G × E) effects have been studied even less well than dominance and epistasis. Maternal effects are a common consequence of anisogamy in many plants and animals and the maternal environment and offspring genotype may often be equally important to offspring fitness (reviewed in Wade 1998). Population genetics theory also predicts that genes under purifying selection will evolve more rapidly when expressed only by mothers (Barker et al. 2005; Demuth and Wade 2006b). Cytoplasmic effects are a product of mitochondrial and/or chloroplast genes and their interactions with the nuclear genome. Haploidy, uniparental inheritance, and mutation rate differences all contribute to different expectations for the evolutionary dynamics of cytoplasmic genes relative to their nuclear counterparts (Lynch et al. 2006; Wade and Goodnight 2006). Finally, G × E is theoretically similar to epistasis except that environmental variation, rather than genetic background, mediates changes in genotypic value (Wade 2000b). With G × E, the contributions of genetic effects measured in one environment may not be indicative of their contributions in other environments. G × E may also accelerate speciation rates because allelic differences and epistasis between ancestral genotypes may contribute to genetic incompatibility (Bordenstein and Drapeau 2001).

We present an extensive effort to describe the genetic basis of divergence among populations of the flour beetle Tribolium castaneum. We ask: (1) what are the relative contributions of additive and nonadditive effects to hybrid traits across populations separated by a wide range of geographic distances (830–16,336 km); (2) is there a relationship between the complexity of genetic architecture and geographic or genetic distance; and (3) is there an interaction between genetic architecture and the rearing environment? To answer these questions, we use a global sample of T. castaneum populations and employ analysis of line means (i.e., joint scaling; see below) to estimate additive and nonadditive effects contributing to hybrid offspring production and developmental defects. We find that dominance and epistasis typically contribute most to divergence among populations, whereas maternal and cytoplasmic effects also contribute substantially. Because nonadditive effects are prevalent even among geographically proximate populations, we do not find a strong relationship between geographic or genetic distance and complexity of genetic architecture. However, we do show that G × E influences the perceived genetic effects underlying divergence between many populations, particularly for developmental defects. We discuss how these findings bear on the importance of nonadditive effects at all levels of evolutionary divergence.

Materials and Methods

The red flour beetle, T. castaneum, is a cosmopolitan human commensal and major pest of stored cereal crops. Previous studies demonstrate that variation for the extent of reproductive isolation between T. castaneum and T. freemani segregates within and among T. castaneum populations (Wade and Johnson 1994; Wade et al. 1997; Wade et al. 1999). Specifically, hybrid offspring number, sex ratio, and developmental defects depend on both hybrid rearing temperature and the geographic origin of the T. castaneum male parent. In this study, we use crosses among T. castaneum populations, rather than interspecific crosses, to estimate the contributions of additive, dominance, epistatic, maternal, cytoplasmic, and genotype-by-environment interaction (G × E) effects to characterize divergence over a much shorter time-scale of evolution.


All populations used in this study originated from wild collections of > 50 adults. Each population has been maintained in large numbers (> 200) on standard medium (20:1, flour: brewers yeast, by weight) under standard environmental conditions (24 h dark, 29°C, approx. 70% relative humidity) for > 50 generations except for the United States (Indiana), Ecuador, and Mexico, which were collected < 25 generations prior to initiation of experimental crosses. We crossed 12 populations in 15 combinations constituting a broad range of latitudinal and longitudinal spatial separation (Table 1). Populations paired with more than one other population help determine whether distance from a given reference population is correlated with genetic architecture. In a few cases, information about nuclear marker variation (amplified fragment length polymorphism, AFLP) and mtDNA sequence variation (AT-rich control region) were also available to investigate relationships between geographic distance, genetic divergence, and genetic architecture.

Table 1. Approximate geographic and genetic distances between pairs of populations.

| Parent 1 | Parent 2 | Geographic distance (km) | Nuclear distance¹ (Nei's D) | mtDNA distance² (substitutions/site) |
|---|---|---|---|---|
| Croatia | Portugal | 830 | | |
| Colombia | Ecuador | 931 | | 0.004 |
| Nigeria | Croatia | 3,963 | 0.043 | |
| Ecuador | Mexico | 5,313 | 0.050 | |
| Tanzania | India | 5,324 | | 0.012 |
| Peru | Mexico | 6,295 | 0.067 | |
| Ecuador | Portugal | 8,351 | | 0.000 |
| Peru | Portugal | 9,029 | | |
| Malaysia | Croatia | 9,770 | 0.111 | |
| U.S.A. (IN) | Malaysia | 15,158 | 0.034 | |
| Tanzania | Mexico | 16,336 | | 0.002 |

¹ Nei's D based on amplified fragment length polymorphism (AFLP) markers.
² mtDNA sequence divergence based on control-region sequences.

For each population pair, we made 16 cross types: two parental (P1 and P2), two reciprocal F1's, four F2's, and four backcrosses in each direction (BC1 and BC2; detailed in Supplementary Table 1 online). To control for temporal differences inherent in comparing first- and second-generation crosses, we performed an initial round of crosses between individual virgin beetles to generate F1 and reciprocal F1 (rF1) offspring. These initial F1s are not included in any analyses, so results reflect data for all 16 cross types conducted simultaneously in the same generation. Outcrossing was imposed throughout the design.

For the Ecuador × Mexico, India × Mexico, and Tanzania × Mexico population pairs, we replicated all cross types using 21 mating pairs, except the four F2s, which were each replicated 14 times. For these three population pairs, we placed a virgin male and female in eight-dram vials containing 8 g of standard medium. Each pair mated and laid eggs for one week under standard environmental conditions for stock maintenance (see above); we then removed the mating pair and evenly distributed vials containing eggs and medium across the 26°C, 29°C, and 35°C rearing temperatures. To avoid potential confounding effects of age and mating duration, the protocol was repeated for an additional two weeks so that each replicate for each cross type was present at all rearing temperatures.

Power analysis based on the observed cross and temperature effects in the three population pairs above suggested that 50% differences in offspring number could be detected with 90% power (α = 0.05) using only two replicates. However, given the necessity to later correct for multiple tests, we replicated all subsequent population pairs with seven or eight mating pairs per cross type and reared offspring at 29°C and 35°C. Although mating is considered to be indiscriminate within the genus Tribolium (Park 1933; Sokoloff 1974), we also changed the mating protocol to compensate for potential effects of premating behavioral isolation. Virgin males and females were allowed to mate in empty eight-dram vials for three days. This method increases mating opportunities by decreasing the search time associated with potential differences in tunneling behavior between populations or cross types. The absence of food for this short duration does not affect subsequent reproductive output. After three days in empty vials, mating pairs were placed on standard medium and allowed to lay eggs for four days before being transferred to new flour.

Upon eclosion from the pupal stage, we counted and examined offspring for the presence of developmental defects (deformities). Deformities include missing or fused limb segments, missing or fused antennal segments, and failure to properly eclose (cf. Wade et al. 1999). We scored all deformities as presence/absence for an individual to avoid subjectivity as to the severity of deformities. To facilitate analysis of sex differences in hybrids (Demuth and Wade 2007), offspring were killed by microwaving for 90 sec prior to data collection in all population pairs except Ecuador × Mexico, India × Mexico, and Tanzania × Mexico. Although scoring dead beetles was necessary for sex determination on such a large number of beetles, the additional manipulations often resulted in limb damage that could not be distinguished from some types of true deformities. The bias introduced by this manipulation makes testing for treatment effects more conservative because treatments with low numbers of true deformities will be inflated proportionally more than treatments with high numbers of true deformities.


Joint scaling is a method by which the net additive and non-additive genetic effects can be inferred from phenotypic differences among lines of known pedigree (Cavalli 1952; Cockerham 1954; Kempthorne 1954; Hayman 1958; Cockerham 1980; Mather and Jinks 1982; Lynch 1991; Lynch and Walsh 1998). Following the derivation of Mather and Jinks (1982), the difference in additive and nonadditive effects between two populations (P1 and P2) is scaled to the mean (μ) of the multigenerational population derived from an initial cross between diverged lines. This “F-metric” is equivalent to the mid-parent mean when interactions are absent, and has the advantage of remaining constant independently of the number of genetic interactions (Mather and Jinks, 1982, pp. 90–91).

To determine the most parsimonious fit to the observed generation means, we used the following weighted least-squares model (Lair et al. 1997; Lynch and Walsh 1998):

$$\hat{\mathbf{a}} = (\mathbf{C}^{T}\mathbf{E}^{-1}\mathbf{C})^{-1}\mathbf{C}^{T}\mathbf{E}^{-1}\mathbf{y}$$

where $\hat{\mathbf{a}}$ is the vector of parameter estimates for the composite genetic contributions to each generation, $\mathbf{C}$ is the matrix of coefficients for the parameters in $\hat{\mathbf{a}}$ (supplementary Table S1), and $\mathbf{y}$ is the vector of observed generation means. The diagonal of matrix $\mathbf{E}$ consists of the squared phenotypic standard errors. Variance in $\hat{\mathbf{a}}$ is given by the diagonal elements of the matrix $(\mathbf{C}^{T}\mathbf{E}^{-1}\mathbf{C})^{-1}$, and $\hat{\mathbf{y}} = \mathbf{C}\hat{\mathbf{a}}$ is the vector of predicted line means based on the specified model. Model-wide goodness of fit was tested using the chi-square statistic:

$$\chi^{2} = \sum_{i=1}^{16} \frac{(y_i - \hat{y}_i)^{2}}{\mathrm{SE}_i^{2}}$$

with degrees of freedom equal to the number of phenotypic means (16 in all cases) minus the number of model parameters.
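The weighted least-squares fit and goodness-of-fit test described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the Mathcad program used in the study: the function name `joint_scaling_fit`, the six-line additive-dominance coefficient matrix `C_demo`, and all numerical values are hypothetical and chosen only to show the mechanics (the actual 16-line coefficients are those of supplementary Table S1).

```python
import numpy as np

def joint_scaling_fit(C, y, se):
    """Weighted least-squares fit of composite genetic effects to line means.

    C  : coefficient matrix (one row per line mean, one column per parameter)
    y  : observed line means
    se : standard errors of the line means
    """
    C = np.asarray(C, dtype=float)
    y = np.asarray(y, dtype=float)
    # E holds squared standard errors on its diagonal, so E^-1 holds 1/SE^2.
    E_inv = np.diag(1.0 / np.asarray(se, dtype=float) ** 2)
    cov = np.linalg.inv(C.T @ E_inv @ C)   # sampling (co)variances of the estimates
    a_hat = cov @ C.T @ E_inv @ y          # parameter estimates
    y_hat = C @ a_hat                      # predicted line means
    resid = y - y_hat
    chi2 = float(resid @ E_inv @ resid)    # sum of squared, SE-scaled residuals
    df = len(y) - C.shape[1]               # number of means minus number of parameters
    return a_hat, np.sqrt(np.diag(cov)), chi2, df

# Hypothetical example: P1, P2, F1, F2, BC1, BC2 under a mean/additive/dominance model.
C_demo = [[1, 1, 0], [1, -1, 0], [1, 0, 1],
          [1, 0, 0.5], [1, 0.5, 0.5], [1, -0.5, 0.5]]
true_effects = np.array([50.0, 5.0, 10.0])   # made-up mean, additive, dominance values
y_demo = np.array(C_demo) @ true_effects     # noise-free line means for illustration
se_demo = [2.0, 2.0, 1.5, 1.5, 1.8, 1.8]     # made-up standard errors

a_hat, a_se, chi2, df = joint_scaling_fit(C_demo, y_demo, se_demo)
```

Because the demonstration means are generated without noise from the same model, the fit recovers the input effects and the chi-square statistic is effectively zero; with real data, a large chi-square relative to its degrees of freedom would indicate that the model is inadequate.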

To determine the genetic architecture that best described the observed generation means, parameters were fit hierarchically in four increasingly complex models. First, only additive and dominance parameters were included, then the three digenic epistasis terms, followed by two maternal effect terms, and finally a parameter for cytoplasmic effects. To test whether the more complex models were a significant improvement over more parsimonious models, we used the likelihood ratio test statistic

$$\Lambda = \chi^{2}_{i} - \chi^{2}_{j}$$

where $\chi^{2}_{j}$ and $\chi^{2}_{i}$ are the goodness of fit statistics for the current model and the next most parsimonious model, respectively. This statistic is asymptotically $\chi^{2}$ distributed with degrees of freedom equal to the difference in the number of parameters included in the two models (Lynch and Walsh 1998). When no further improvement was made by the addition of parameters, the significance of each parameter estimate $\hat{a}_k$ in the best fit model was assessed as

$$t = \frac{\hat{a}_k}{\sqrt{\mathrm{Var}(\hat{a}_k)}}$$

with degrees of freedom greater than 100 in all cases. Starting with the parameter having the highest t-value, parameters were removed and the models tested again using equation (5). This procedure produces the most parsimonious model with the least multicollinearity among parameters and the least model-wide inflation of the significance threshold due to multiple testing. To maintain a significance threshold near α = 0.05 for joint-scaling parameter estimates, sequential Bonferroni correction was applied within crosses (Rice 1989).
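The model comparison and the sequential Bonferroni step can be illustrated with a short, standard-library-only sketch. The function names and all numbers below are hypothetical; the sequential procedure shown is the usual step-down form (smallest P value tested at α/k, the next at α/(k − 1), and so on, stopping at the first nonsignificant test), which we assume matches the procedure cited from Rice (1989).

```python
def likelihood_ratio(chi2_simple, n_params_simple, chi2_complex, n_params_complex):
    """Return (Lambda, df) for comparing a simpler model with a more complex one.

    Lambda is the drop in the goodness-of-fit chi-square when parameters are
    added; df is the number of added parameters. Converting Lambda to a P value
    requires a chi-square distribution function (not included in this sketch).
    """
    return chi2_simple - chi2_complex, n_params_complex - n_params_simple

def sequential_bonferroni(p_values, alpha=0.05):
    """Step-down sequential Bonferroni; returns one True/False flag per test."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])  # indices, smallest P first
    significant = [False] * k
    for rank, idx in enumerate(order):
        # The threshold relaxes from alpha/k toward alpha as tests are passed.
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # all larger P values are also declared nonsignificant
    return significant

# Hypothetical values: adding three epistasis terms to a three-parameter model.
lam, lam_df = likelihood_ratio(40.0, 3, 12.5, 6)

# Hypothetical P values for four parameter estimates within one cross.
flags = sequential_bonferroni([0.001, 0.04, 0.03, 0.2])
```

In this made-up example only the first parameter survives correction, because the second-smallest P value (0.03) exceeds its threshold of 0.05/3.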

It is important to note that we are measuring between-population divergence in the genetic basis of traits; therefore, demonstration of an epistatic contribution to offspring number in crosses between populations is not indicative of epistasis contributing to offspring number within populations. For example, interacting loci may be fixed differently in alternate populations, so that their epistatic effects are only manifest when made to segregate in hybrid generations. Additionally, all of the composite genetic effects measured by joint scaling should be viewed as conservative estimates of the absolute contributions of genetic effects, because positive and negative effects may cancel each other in hybrid generations.

All joint-scaling analyses were computed using a Mathcad 2000 program. The Mathcad program was implemented as an “add-in” for Microsoft Excel 2003 to facilitate interactive model fitting. Significant but different joint-scaling models for the same population pair reared at different temperatures are indicative of G × E. However, to test directly for G × E, we computed a two-way analysis of variance (ANOVA) for each population pair using rearing temperature and cross type as fixed main effects. Proportions of deformities were analyzed in the same way. The proportion of failed crosses was analyzed using population pair, rearing temperature, and cross type as main effects in a factorial ANOVA. Where significant treatment effects were detected by ANOVA, we tested differences among means using Tukey's HSD. All statistical tests not directly related to joint-scaling analyses were computed in STATISTICA 6.1. Proportion data were arcsine square root transformed. Other transformations required for the assumptions of parametric tests are noted in the results.
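For readers unfamiliar with it, the arcsine square root transformation applied to proportion data can be written as a one-line function. This is a generic sketch of the standard transformation, not code from the study; the function name is hypothetical.

```python
import math

def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion p in [0, 1] (result in radians)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))

# Example: a hypothetical cross in which 25% of offspring are deformed.
transformed = arcsine_sqrt(0.25)
```

The transform stretches the scale near 0 and 1, which helps stabilize the variance of proportions before ANOVA.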


We use regressions of the number of significant joint-scaling parameters against geographic distance and genetic distance to test whether genetic or geographic distance measures are correlated with complexity of genetic architecture. For population pairs reared at multiple temperatures, we average the number of parameters across environments. Analysis of the entire dataset includes multiple overlapping uses of several populations. Therefore, to correct for nonindependence of data points, we also analyze groups of populations crossed to the same parental population to observe the relationship between genetic architecture and distance from a given reference population. Each such group can be considered an independent test of the relationship between complexity of the genetic architecture and distance. We also examined the relationship between distance and the relative magnitudes of additive versus nonadditive contributions to genetic architecture. To do this, we compute a regression of the proportion of the total joint-scaling model that is attributable to each type of genetic effect (e.g., absolute value of the additive effect/sum of absolute values of all effects) against geographic distance.
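The proportion of the joint-scaling model attributable to each type of effect, as defined above (absolute value of an effect divided by the sum of absolute values of all effects), can be sketched as follows. The function name and the example estimates are hypothetical.

```python
def relative_contributions(effects):
    """Map each composite effect to |effect| / sum of |all effects|."""
    total = sum(abs(value) for value in effects.values())
    if total == 0:
        raise ValueError("at least one effect must be nonzero")
    return {name: abs(value) / total for name, value in effects.items()}

# Made-up significant parameter estimates for one cross at one temperature.
example = relative_contributions(
    {"additive": 2.0, "dominance": -6.0, "epistasis": 2.0}
)
```

Using absolute values means that effects of opposite sign do not cancel, so the proportions describe the magnitude of each contribution rather than its direction.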


In total, the 15 population pairs resulted in 213,346 offspring among 5142 families. An additional 345 families (6.7%) yielded no offspring. Because complete failures were rare, we pooled data within parent, F1, F2, and backcross generations for analysis. Failure to produce offspring is significantly affected by population pair (F14,125 = 17.29, P < 0.001), cross type (F3,125 = 6.53, P < 0.001), and the population pair × cross type interaction (F27,125 = 4.04, P < 0.001). The significant interaction between population pair and cross type indicates that the proportion of failures within cross types is not consistent across all population pairs; however, there is a clear increase in the proportion of failures in F2 crosses relative to all others, and pure parent crosses fail less often than hybrid crosses (Fig. 1B).

Figure 1.

Distribution of failed crosses among population pairs (A) and cross types (B). The overall height of each bar is the number of failures divided by the total number of crosses within each category. (A) Each bar is partitioned into the relative proportion of failures occurring in the cross types represented by shading in (B). Numbers below the bars indicate rearing temperature. Geographic distances for each pair are reported in Table 1. Analysis of variance (ANOVA) was computed using all 16 cross types, but we combine crosses into four categories: parental (P), F1, F2, and backcrosses (BC), for illustrative purposes. Horizontal lines below the x axes indicate significant differences among means. (B) Because rearing temperatures do not significantly affect the proportion of failures, cross types are pooled across temperatures. Only Ecuador × Mexico, India × Mexico, and Tanzania × Mexico included a 26°C rearing temperature.

Cross failure is not affected by rearing temperature (F1, 125= 0.000, P= 0.989), or other interactions. In fact, 89.4% (330/345) of families that failed to produce offspring at one temperature, failed at all temperatures. Malaysia × Croatia, a pair representing intermediate geographic distance, failed more frequently than any other population pair; whereas three of the four most distant pairs are in the group with the lowest failure rate (Fig. 1A). Total failure of a cross could result from several causes besides offspring inviability (e.g., behavioral isolation, sterility of one or both parents, experimental error), and because the experimental design does not provide a mechanism for distinguishing among these causes, we exclude them from the generation means in subsequent analyses.


Among pairs that yield offspring, the number of offspring produced differs significantly among cross types in 80% (12/15) of population pairs (Fig. 2A). Rearing temperature affects only ∼33% (5/15) of population pairs, and Malaysia × Croatia is the only pair to exhibit G × E for offspring production (complete ANOVA results available as Supplementary Table 2 online). The trend across generations is to produce significantly more F1 hybrid offspring and significantly fewer F2 and backcross offspring relative to mean numbers produced by pure parent crosses (Fig. 2B), but this trend is not present in all population pairs. Three population pairs (Ecuador × Japan, Ecuador × Portugal, and Croatia × Portugal) exhibited no significant effect of rearing temperature, cross type, or G × E on the number of offspring produced (Fig. 2A).

Figure 2.

Distribution of numbers of offspring produced among population pairs (A) and cross types (B). The overall height of each bar is the total number of offspring for each category. Each bar in (A) is partitioned into the number of offspring in each of the cross types represented by shading in (B). Letters above bars indicate significant treatment effects in two-way analysis of variance (ANOVA) (t, temperature; c, cross type; i, temperature × cross interaction). Bars without letters indicate no significant effects. Numbers below the bars indicate rearing temperature. Geographic distances for each pair are reported in Table 1. ANOVA was computed using all 16 cross types, but we combined crosses into four categories: parental (P), F1, F2, and backcrosses (BC), for illustrative purposes. (B) Horizontal lines below the x-axis indicate significant differences among means. Because rearing temperatures do not significantly affect the number of offspring produced, cross types are pooled across temperatures. (Complete ANOVA results are reported in Supplementary Table 2 online.)

Figure 3 illustrates examples of the general patterns of variation observed in hybrid offspring number (complete data available as Supplementary Fig. 1 online). The most common pattern is increased scatter of hybrid means while remaining centered around the additive expectation (Fig. 3A). Second generation hybrid breakdown (i.e., significantly fewer F2 and/or backcross hybrids relative to parental means) is the second most common pattern observed (Fig. 3B and C). In a subset of population pairs where second generation hybrid breakdown is observed, we also see F1 hybrid vigor (i.e., significantly more F1 hybrids relative to parental means; Fig. 3C). Hybrid vigor extending into second generation crosses also occurs in some cases (Fig. 3D).

Figure 3.

Examples of the patterns of hybrid offspring production in crosses between T. castaneum populations. Plots show the mean number of offspring produced in each cross type ± 2SE. Shapes indicate different generation means: ○, Parental; ▪, F1; ▴, F2; ◆, Backcross (BC). Open shapes indicate reciprocal crosses. Solid lines indicate the least-squares expectation under a purely additive model. (Complete results are reported in Supplementary Fig. 1 online.)

Joint-scaling analyses show that interactions within and/or between loci (dominance and/or epistasis, respectively) contribute most to the observed patterns of population differentiation (Fig. 4). In many cases the proportion of genetic effects attributable to dominance or epistasis is several times greater than the additive effect. Maternal and cytoplasmic effects are also frequent components of divergence (Fig. 4; also illustrated by differences between open and filled symbols for each cross type in Fig. 3). In some cases differences between reciprocals of a given cross type, which are indicative of maternal effects, encompass the total range of variation among hybrid and parental lines (e.g., Fig. 3A), and in 14 of 30 cases maternal effects contribute more to divergence than additive effects.

Figure 4.

Results of joint-scaling analyses on offspring numbers. Bars indicate the relative proportions of each type of composite genetic effect. Only the significant parameters in the joint-scaling model are included in the calculations. (Detailed joint-scaling results are reported in Supplementary Table 3 online.)

Finally, although ANOVA indicates that rearing temperature is unimportant for most population pairs, all pairs where a significant model could be fit require different joint-scaling parameters to explain offspring numbers at each rearing temperature. For example, only single-locus effects are significant in the Croatia × Portugal cross when reared at 29°C, but at 35°C epistasis and maternal effects contribute significantly whereas additive effects do not. Complete joint-scaling results are reported in Supplementary Figure 1 and Supplementary Table 3 online.


Overall, deformed offspring are rare in crosses between T. castaneum populations. In 27 of 33 cases (three population pairs at three rearing temperatures and 12 population pairs at two rearing temperatures), less than 15% of all beetles suffered any deformities (Fig. 5A). The rarity of deformed offspring resulted in violations of parametric test assumptions in six population pairs even after transformation (indicated in Fig. 5A). Because most effects were highly significant (P < 0.0001) for both transformed and raw data, we believe that our interpretations are likely to be robust in the six cases where data were nonnormal (ANOVA results are reported in Supplementary Table 2 online). The Ecuador × Japan pair resulted in too few deformed offspring to conduct ANOVA.

Figure 5.

Distribution of proportions of deformed offspring among population pairs (A) and cross types (B). The overall height of each bar is the number of offspring with developmental defects divided by the total number of offspring within each category. (A) Each bar is partitioned into the relative proportion of deformities within each of the cross types represented by shading in (B). Letters above bars indicate significant treatment effects in two-way analysis of variance (ANOVA) (t, temperature; c, cross type; i, temperature × cross interaction). Bars without letters indicate no significant effects or not tested due to insufficient numbers of deformities. Numbers below the bars indicate rearing temperature. Geographic distances for each pair are reported in Table 1. ANOVA was computed using all 16 cross types, but we combined crosses into four categories, parental (P), F1, F2, and backcrosses (BC), for illustrative purposes. (B) Horizontal lines below the x axis indicate significant differences among means. (Complete ANOVA results are reported in Supplementary Table 2.)

Rearing temperature significantly affects the proportion of deformities in 73% of population pairs (11/15; Fig. 5A). However, the temperature yielding the greatest proportion of deformities differs among pairs (35°C = 6 pairs, 29°C = 4 pairs, and 26°C = 1 pair) such that the mean number of deformities from 29°C versus 35°C is not different within each cross type (Fig. 5B). Cross type significantly affects the proportion of deformities in all population pairs (Supplementary Table 1 online), and pure parent crosses from 29°C have fewer deformities than all second generation crosses despite large variances owing to differences among population pairs (Fig. 5B).

The complexity of temperature and cross effects on the proportion of deformities is reflected in the presence of a significant interaction between rearing temperature and cross type (G × E) in 80% (12 of 15) of the population pairs (Fig. 5A). Significant G × E means that, within population pairs, cross types yield significantly different proportions of deformities depending on the offspring's rearing temperature. The interaction is also apparent in the differences among joint-scaling models fit to data from the different temperatures.

Similar to our results for offspring numbers, dominance, epistasis, maternal effects, and cytoplasmic effects are common components contributing to developmental abnormalities. In most cases, even where additive effects are significant, they make a minor contribution to the sum of composite genetic effects. Of 17 population pair × rearing temperature combinations that produced sufficient deformities to test joint-scaling models, 15 were dominated by within and between locus interactions (Fig. 6). In the remaining two cases, Peru × Portugal reared at 29°C is explained entirely by additive effects, and India × Mexico reared at 35°C is explained by only maternal and cytoplasmic effects (Fig. 6). Complete joint-scaling results for the deformity data are reported in Supplementary Table 4 online.

Figure 6.

Results of joint-scaling analyses on the proportion of developmental defects. Bars indicate the relative proportions of each type of composite genetic effect. Only the significant parameters in the joint-scaling model are included in the calculations. Numbers in parentheses indicate rearing temperature. (Detailed joint-scaling results are reported in Supplementary Table 4 online.)


Geographic distance among population pairs ranges from 830 km to 16,336 km (Table 1). For eight pairs of populations where genetic distance based on AFLP markers is known, Nei's D ranges from 0.022 to 0.125. In the four population pairs for which mtDNA sequence divergence is known, substitutions per site range from 0 to 0.012 (Table 1). We do not find a significant correlation between molecular measures of distance and geographic distance in the populations sampled here (AFLP Nei's D, r=−0.002, P= 0.356). This lack of relationship is true for the full set of comparisons as well as when corrected for non-independence.

Regression of the number of significant joint-scaling parameters on measures of geographic and genetic distance shows no relationship for offspring numbers (geog. dist. r=−0.004, P= 0.587; AFLP Nei's D, r=−0.012, P= 0.067) or deformities (geog. dist. r= 0.011, P= 0.287; AFLP Nei's D, r=−0.02, P= 0.316). The lack of relationship in this measure of complexity results from the frequent occurrence of nonadditive effects even among populations that are genetically and geographically proximate (Figs. 4 and 6). The only significant trend in the regression of proportions of total composite effects against genetic and geographic distance is a negative correlation between additive effects and geographic distance (r=−0.522, P= 0.046).
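One way to examine a relationship of this kind is to correlate the count of significant parameters for each population pair with the distance separating that pair. The sketch below computes only a Pearson correlation coefficient. It is not a reconstruction of the analysis reported here, it does not reproduce the significance tests or the correction for non-independence, and the distances and counts are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, one entry per population pair (not data from this study).
distance_km = [900, 3500, 7200, 11000, 16000]
n_significant_parameters = [4, 3, 5, 4, 4]

r = pearson_r(distance_km, n_significant_parameters)
print(f"r = {r:.3f}")
```

When nearly every pair requires a similar number of parameters, as in the hypothetical counts above, there is little variation in complexity for distance to explain.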

The proportion of failed crosses is significantly correlated with genetic distance between the eight population pairs with nuclear distance estimates (AFLP Nei's D, r= 0.647, P= 0.016). This relationship does not hold for the four crosses incorporating divergence from Mexico (r= 0.490, P= 0.188) or from Ecuador (r=−0.388, P= 0.726). Furthermore, there is no relationship between proportion of failed crosses and geographic distance (r= 0.049, P= 0.428).



Dominance, epistasis, and maternal effects tend to make much larger contributions to differentiation in genetic architecture among T. castaneum populations than additive effects (Figs. 4 and 6). Additionally, cytoplasmic differentiation is important to loss of developmental stability in hybrids between some population pairs. The most pervasive observation in population crosses from all levels of divergence is the release of substantial cryptic variation apparent in the increased variation among the means of hybrid generations relative to the parent populations. The distribution of additive and nonadditive contributions to genetic architecture differs for each population pair, and similarity of parental phenotypes is not indicative either of the outcome of hybridization or of the absence of divergence in underlying genetic architectures (e.g., Fig. 3C).

Several studies using similar line-cross methodologies have also found significant contributions of nonadditive genetic effects in divergence between both plant and animal populations (e.g., Hard et al. 1993; Edmands 1999; Fenster and Galloway 2000b; Carroll et al. 2001). For example, epistasis is required to explain divergence in fitness-related traits among populations of the marine copepod Tigriopus californicus from the west coast of North America (Edmands 1999). Reduced F2 fitness in T. californicus hybrids has been further shown to result from coadapted cytonuclear combinations (Willett and Burton 2001). In T. castaneum, cytoplasmic effects, which may include cytonuclear interactions, contributed significantly to variation in offspring numbers in five population pairs and to deformities in four population pairs (Supplementary Tables 3 and 4 online). Kin-structured colonization is known to differentially affect mitochondrial and nuclear differentiation among populations (Wade and McCauley 1988; Whitlock and McCauley 1990; Wade et al. 1994), and because both T. castaneum and T. californicus are dependent upon transient patches of resources with variable rates of renewal, the observed importance of cytoplasmic effects may indicate an important role for colonization and extinction in the genetic differentiation of these species.

Similar nonadditive contributions to divergence in fitness-related traits have also been found among populations of the annual legume Chamaecrista fasciculata separated by 100 m to 2000 km (Fenster and Galloway 2000a, b; Galloway and Fenster 2001). As in T. castaneum, phenotypic similarity of C. fasciculata parental populations is not a good indicator of genetic differentiation. Importantly, the C. fasciculata studies demonstrate that nonadditive differentiation and release of cryptic variation occur when crosses are reared in nature. Furthermore, C. fasciculata populations reared in their native environment have a fitness advantage, suggesting that differentiation is due, in part, to local adaptation. Together these data support a view of adaptive landscapes where response to selection is governed by the local genetic context and different genetic combinations result in arrival at adaptive peaks of similar height, with fitness valleys between them (Wade and Goodnight 1998).

Maternal effects commonly make a significant contribution to differentiation among T. castaneum populations, often contributing more than additive effects (Figs. 4 and 6). Other studies incorporating appropriate reciprocal crosses also commonly find a maternal effect contribution to trait differentiation (Lair et al. 1997; Edmands 1999; Carroll et al. 2001). Strict maternal effect genes are expected to maintain twice the genetic variance at mutation–selection balance as genes expressed in both sexes (Wright 1969; Crow and Kimura 1970), and they may evolve more rapidly under some conditions (Whitlock and Wade 1995; Barker et al. 2005; Demuth and Wade 2006b). In this study, differences between reciprocal crosses (where maternal effects are most obvious) are sometimes as large as the differences among the rest of the hybrid generation means combined (e.g., Fig. 3A).

Some populations of T. castaneum harbor a maternal effect selfish genetic element, Medea (maternal effect dominant embryonic arrest). In populations with Medea, homozygous non-Medea offspring from heterozygous mothers fail to develop early in larval life (Beeman et al. 1992; Beeman and Friesen 1999). Although population genetic theory predicts that even limited gene flow should allow Medea to spread rapidly among T. castaneum populations (Wade and Beeman 1994), regional differences in Medea's presence suggest cryptic barriers to gene flow (Beeman 2003). Despite the strong signature of maternal effects in our study, neither the phenotypes of deformed offspring nor the pattern of cross failure are consistent with the presence of Medea in any of these populations.

Complete failure of individual mating pairs was infrequent in T. castaneum population hybrids, but was significantly more likely to occur in the F2 generation than in other cross types. Similar types of F2 breakdown have also been observed in T. californicus and C. fasciculata, and in a variety of other systems (reviewed in Endler 1977; Burton et al. 1999). F2 breakdown suggests that the deleterious effects of breaking up parental coadapted gene complexes outweigh the beneficial effects of dominance and/or dominance × dominance epistasis that are apparent from our common observation of F1 hybrid vigor (Supplementary Fig. 1 online). From a speciation genetics perspective, our results suggest that incompatibilities separating closely related populations appear primarily as a consequence of combining homozygous loci from both parents in the hybrid genome, rather than from the heterozygous incompatibilities involving dominance that are typically associated with F1 sterility and inviability between species. The observed pattern of hybrid dysgenesis is an expected consequence of additive × additive epistasis (Iαα), which we also find to be a common component of differentiation for offspring numbers and deformities (Supplementary Tables 3 and 4 online). Employing a different experimental method, Wade (1985, 2000b) also found that the effect of Iαα, estimated as the sire × deme component of variance, was much larger than the additive variation among sires for T. castaneum offspring numbers.
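The coexistence of F1 hybrid vigor and F2 breakdown can be illustrated with expected generation means. The sketch below assumes the common textbook line-cross coding relative to the midparent value m, with composite additive (a), dominance (d), additive × additive (aa), and dominance × dominance (dd) effects; the parameterization used in this study may differ, and the numerical values are hypothetical.

```python
def generation_means(m, a, d, aa, dd):
    """Expected means of the parental, F1, and F2 generations.

    Assumed coding (midparent reference m):
      P1 = m + a + aa        P2 = m - a + aa
      F1 = m + d + dd        F2 = m + d/2 + dd/4
    """
    return {
        "P1": m + a + aa,
        "P2": m - a + aa,
        "F1": m + d + dd,
        "F2": m + 0.5 * d + 0.25 * dd,
    }


# Hypothetical composite effects (not estimates from this study).
means = generation_means(m=100.0, a=0.0, d=20.0, aa=15.0, dd=0.0)
for generation, value in means.items():
    print(generation, value)

f1_vigor = means["F1"] > max(means["P1"], means["P2"])
f2_breakdown = means["F2"] < min(means["P1"], means["P2"])
print("F1 hybrid vigor:", f1_vigor)
print("F2 breakdown:", f2_breakdown)
```

Under this coding the parents carry the full additive × additive term, the F1 carries the full dominance term, and the F2 retains only half of the dominance effect and none of the parental additive × additive effect. An F1 advantage followed by F2 breakdown therefore arises whenever aa lies between d/2 + dd/4 and d + dd.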

Significant Iαα epistasis has been found to underlie phenotypes in T. californicus, C. fasciculata, corn, lima beans, tomatoes, tobacco, Drosophila spp., cave fish, mice, chickens, and pitcher-plant mosquitoes (reviewed in Lynch and Walsh 1998). This type of epistasis is interesting because it can change the sign of allelic values from one homozygous background to another (Wade 1992; Wade and Goodnight 1998; Wade 2002). Furthermore, Iαα has been shown on theoretical grounds to play a large role in “conversion” (Goodnight 1988, 1995; Goodnight and Wade 2000), the process that occurs in subdivided populations wherein random genetic drift or selection affects the frequency of one member of a pair of interacting loci. In this case, the epistatic genetic variance “is converted to” additive variance at the remaining segregating locus and it can undergo selection based entirely on this “additive effect” derived from Iαα. Thus, when Iαα changes the sign of allelic effects among populations, it may simultaneously promote adaptation within populations and accelerate the origin of genetic barriers to gene flow among them (Wade 2000a; Demuth and Wade 2005).
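The sign change and the logic of conversion can be seen in a deliberately simple two-locus sketch. It assumes a pure additive × additive model in which genotypes at each locus are coded −1, 0, +1 and the genotypic value is the product of the two codes scaled by a hypothetical epistatic effect `i_aa`, with Hardy-Weinberg proportions at locus B. The function names are illustrative and do not come from the cited work.

```python
def mean_value_given_a(x_a, p_b, i_aa=1.0):
    """Mean genotypic value of an A-locus genotype, averaged over locus B.

    x_a  -- A-locus code: -1 (aa), 0 (Aa), +1 (AA)
    p_b  -- frequency of allele B; locus B is assumed to be in
            Hardy-Weinberg proportions
    i_aa -- additive x additive effect (hypothetical scale)
    """
    q_b = 1.0 - p_b
    b_genotype_freqs = {1: p_b ** 2, 0: 2.0 * p_b * q_b, -1: q_b ** 2}
    return sum(freq * i_aa * x_a * x_b
               for x_b, freq in b_genotype_freqs.items())


def substitution_effect_a(p_b, i_aa=1.0):
    """Change in mean value when one a allele is replaced by A (Aa -> AA)."""
    return mean_value_given_a(1, p_b, i_aa) - mean_value_given_a(0, p_b, i_aa)


for p_b in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"freq(B) = {p_b:.2f}  effect of A = {substitution_effect_a(p_b):+.2f}")
```

In this model the effect of allele A works out to i_aa × (2p_B − 1). It is positive where B is common, negative where b is common, and zero at intermediate frequency. Once drift or selection fixes one allele at locus B within a deme, the interaction behaves as a purely additive effect at locus A, and demes fixed for alternative B alleles favor opposite alleles at A.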

Conversion of epistatic to additive variation is hypothesized to have facilitated the evolution of an important fitness trait (photoperiodic response) during recent range expansion of the pitcher-plant mosquito, Wyeomyia smithii (Hard et al. 1993). Joint-scaling analyses show epistatic differences among more ancestral W. smithii populations (Lair et al. 1997) as well as epistasis within populations (Bradshaw et al. 2005). Evidence for epistasis underlying fitness-related traits within populations is also found in joint-scaling analyses of crosses between inbred lines of monkeyflowers (Mimulus guttatus) derived from a single natural population (Kelly 2005), and in significant differences in offspring numbers of T. castaneum males mated to groups of sisters from different families in the same population (Lopez and Wade unpubl. ms.). These studies are important because they provide evidence that the epistasis observed between populations and species may be a natural consequence of differentially sorting epistatic loci in response to population subdivision and local selection pressures.

There are two principal caveats to our general interpretations concerning the prevalence of non-additive effects in our study. First, the traits we observe, offspring numbers and deformities, may represent composites of several interacting traits that individually may not show such complex differentiation (e.g., Galloway and Fenster 2001). Second, we interpret offspring numbers strictly as viability of offspring genotypes when parental fecundity and mating preference may also influence offspring numbers. We do not believe that mate preference is a concern given our experimental protocol and the promiscuity of beetle mating (see Materials and Methods). Some insight into whether F1 fecundity/fertility has a general effect on our conclusions is provided by inclusion of backcrosses in both directions as well as reciprocals for all crosses in the joint-scaling analyses. Although we do not see a clear signature of different average fecundity/fertility in F1s for any of our population pairs, direct measures of egg production and larval survivorship in future studies will be necessary to fully address the impact of parental phenotypes on our conclusions.


Architectural complexity, as measured by number of joint-scaling parameters, is not correlated with geographic and/or genetic distance among T. castaneum populations because differentiation is uniformly complex. Although most models of genetic architecture require similar numbers of genetic components, the magnitudes of those components are highly variable and the resulting hybrid phenotypes are unpredictable (e.g., Fig. 3). The sole trend for diminished contributions of additive effects to hybrid offspring numbers as geographic distance increases is puzzling in light of the lack of relationship between geographic distance and genetic divergence among the populations in this study. Perhaps the molecular markers measured do not correspond well with loci important for the fitness traits measured, but the available measures are based on ∼400 AFLP markers spread randomly throughout the genome (Demuth unpubl. data).

If the early stages of speciation were a gradual process, one might expect to see correlations between molecular divergence, developmental defects, offspring numbers (negative correlation), and cross failure rate. We find only a weak positive correlation between failure rate and genetic distance. Instead, the outcome of T. castaneum hybridizations appears to depend on the serendipitous particulars of divergence separating any given pair. Although our study does not reflect a correlation between hybrid breakdown and divergence, the two strongest cases of hybrid breakdown are also the most divergent population pairs for which we have AFLP data (Malaysia × Croatia and India × Mexico; Table 1 and Supplementary Fig. 1 online).

In toto, our findings are consistent with the observation that the fitness consequences of hybridization between populations or species are unpredictable during the early period of divergence (reviewed in Edmands 2002). This unpredictability even in crosses between genetically and geographically proximate populations suggests that divergence may result more frequently from genes of large effect rather than from the slow accumulation of many genes of small effect (Edmands 2002). However, the greater the extent to which non-additive effects are involved in population divergence, the less uniform and predictable will be the additive effects of the genes themselves, even in the initial stages of population genetic divergence.

The absence of a simple linear relationship reflecting isolation-by-distance with increasing contributions of non-additive effects was also a feature of similar studies of W. smithii (Lair et al. 1997) and C. fasciculata (Fenster and Galloway 2000b). In W. smithii this is likely the result of recent, post-glacial, range expansion (Armbruster et al. 1998) but may also be a general outcome of a stepping stone model of colonization as proposed for C. fasciculata (Fenster and Galloway 2000a). The overall low levels of divergence in T. castaneum may also be a result of range expansion in conjunction with human commensalism, a hypothesis supported by well-defined, yet shallow, phylogenetic clustering among populations (Demuth unpubl. ms.).


Our results demonstrate that offspring rearing temperature has a profound effect on the perception of genetic architecture underlying differentiation. Joint-scaling parameters differ between rearing temperatures for nearly all population pairs for both phenotypes, regardless of whether a significant effect was detected by ANOVA. G × E is particularly strong in the case of developmental defects and in several cases hybrid dysgenesis occurs only under one rearing temperature (Fig. 5A). Observation of G × E implies that the effects of genes are not only dependent on genetic context but also on the environmental context.

Most studies that attempt to measure the genetic architecture of a trait do not systematically account for G × E in terms of changes in genetic architecture across environments (e.g., Wade 1990). However, the strength of cytonuclear effects in T. californicus population hybrids is dependent on temperature and light (Willett and Burton 2003), and the genetic architecture of photoperiodic response in W. smithii also shows significant G × E in response to population density (Bradshaw and Holzapfel 2000). Lack of replication in many QTL studies conducted under lab versus field conditions or in different years may be a consequence of G × E. Indeed, QTL studies that attempt to quantify G × E often find that it is prevalent and may change the attributes (i.e., magnitude, dominance, epistasis) and/or presence of QTL (e.g., Vieira et al. 2000; Kamoshita et al. 2002).

The strong effect of G × E on developmental defects may have implications for the speciation process. Expression of incompatibility phenotypes is often overlooked in empirical studies of population divergence. However, when G × E is present, postzygotic isolation may evolve more rapidly than under the standard Dobzhansky–Muller model because allelic incompatibilities may form within loci (Bordenstein and Drapeau 2001). To our knowledge the existence of G × E for postzygotic isolation has not previously been documented at such an early stage of divergence (e.g., Colombia × Ecuador mtDNA distance = 0.004 subs/site). In an accompanying paper we analyze the genetic basis of sex differences in hybrid breakdown and G × E (Demuth and Wade accompanying manuscript). Possible causes of the effect of temperature on proportions of hybrid deformities could be deactivation of molecular chaperones, such as heat shock proteins (Sung et al. 2003), or activation of transposable elements (Regner et al. 1999), both of which have been shown to affect developmental stability in response to rearing temperature.


Our study demonstrates widespread cryptic genetic variation affecting offspring numbers and developmental robustness in a global sample of T. castaneum populations. Even among closely related populations where phenotypic means of the pure parent populations are the same, we find complex differentiation in underlying genetic architectures involving dominance, epistasis, maternal, and cytoplasmic effects. Furthermore, G × E affects measures of genetic architecture for all population pairs. Some of the differentiation among populations results in hybrid breakdown, but the occurrence of reduced fitness is not predictable from geographic or genetic distance. Our results are consistent with the view that genetic incompatibilities responsible for postzygotic isolation, an important component of speciation, may be a natural but serendipitous consequence of nonadditive genetic effects and structured populations.

Associate Editor: C. Goodnight


We thank K. Hoyt, D. Dohl, Z. Wendling, M. Robertson, E. Barajas, L. Gallinot, K. Harris, J. Katsahnias, P. Morone, M. Aldulescu, J. Howell, and T. Webb for assistance in beetle maintenance and data collection. N. Johnson, D. McCauley, C. Goodnight, E. D. Brodie, J. Wolf, L. Rieseberg, M. Lynch, T. Linksvayer, J. Lopez, J. Moorad, M. Saur, and T. Wood provided discussion and lab assistance throughout the preparation of this work. This study was financially supported by: Sigma Xi Grants-in-Aid of Research, Indiana University President's Summer Undergraduate Research Initiative, National Science Foundation Integrative Graduate Education and Research Traineeship in Evolution Development and Genomics (9972830), National Science Foundation Doctoral Dissertation Improvement grant (0206628), and National Institutes of Health grant (GM065414-01A).