• We have archived our data on sequences, microsatellites and CHD markers, geographic coordinates, climatic variables, sex, etc. in Dryad and GenBank.


Relationships among multilocus genetic variation, geography, and environment can reveal how evolutionary processes affect genomes. We examined the evolution of an Australian bird, the eastern yellow robin Eopsaltria australis, using mitochondrial (mtDNA) and nuclear (nDNA) genetic markers, and bioclimatic variables. In southeastern Australia, two divergent mtDNA lineages occur east and west of the Great Dividing Range, perpendicular to latitudinal nDNA structure. We evaluated alternative scenarios to explain this striking discordance in landscape genetic patterning. Stochastic mtDNA lineage sorting can be rejected because the mtDNA lineages are essentially distinct geographically for > 1500 km. Vicariance is unlikely: the Great Dividing Range is neither a current barrier nor was it at the Last Glacial Maximum according to species distribution modeling; nuclear gene flow inferred from coalescent analysis affirms this. Female philopatry contradicts known female-biased dispersal. Contrasting mtDNA and nDNA demographies indicate their evolutionary histories are decoupled. Distance-based redundancy analysis, in which environmental temperatures explain mtDNA variance above that explained by geographic position and isolation-by-distance, favors a nonneutral explanation for mitochondrial phylogeographic patterning. Thus, observed mito-nuclear discordance accords with environmental selection on a female-linked trait, such as mtDNA, mtDNA–nDNA interactions or genes on W-chromosome, driving mitochondrial divergence in the presence of nuclear gene flow.

Evolutionary processes (e.g., genetic drift, natural selection, and gene flow) operate at different times and places in species’ histories, differentially affecting genomes and phenotypes; thus, understanding their roles in speciation remains challenging (Coyne and Orr 2004; Price 2008; Butlin et al. 2009; Smadja and Butlin 2011). Discordant geographic patterns within and among genomes in a species could result from coalescent variance (Lohse et al. 2010), genetically localized selective sweeps (Bensch et al. 2006), life-history traits (e.g., sex-biased philopatry/dispersal; Turmelle et al. 2011), different modes of marker inheritance (and thus effective population sizes), or differential marker behavior following secondary contact (Petit and Excoffier 2009). A recent review (Toews and Brelsford 2012) found that mito-nuclear discordance, a major difference in the patterns of differentiation between mitochondrial and nuclear DNA, is a common phenomenon in animal systems. In the great majority of reviewed cases, mito-nuclear discordance arose following allopatric divergence and secondary contact (Toews and Brelsford 2012). This is expressed either in more structuring and/or narrower geographic clines in mtDNA compared with nDNA (usually explained by nuclear introgression and/or sex-biased asymmetries including dispersal, mating, or offspring survival) or relatively less structuring and/or wider geographic clines of mtDNA, usually explained by adaptive introgression of mtDNA, demographic disparities including genetic drift, or sex-biased asymmetries. Historic isolation and secondary contact can also result in coexistence of deeply divergent mitochondrial lineages in sympatry in one panmictic population (Zink et al. 2008; Webb et al. 2011; Hogner et al. 2012). However, in four cases, all involving avian systems, strong mitochondrial but not nuclear structure has arisen in the absence of obvious geographic isolation (Irwin et al. 2005; Cheviron and Brumfield 2009; Ribeiro et al. 2011; Spottiswoode et al. 2011). Furthermore, in two bird species, mitochondrial but not nuclear structure correlated with environmental variation (Cheviron and Brumfield 2009; Ribeiro et al. 2011), suggesting that mtDNA haplotypes might be differentially adapted to environmental conditions. Growing evidence of selection on mtDNA (Ballard and Whitlock 2004; Bazin et al. 2006; Meiklejohn et al. 2007) indicates that this process might be more common than recognized, although it is rarely tested for in phylogeographic studies (Toews and Brelsford 2012). Links between mitochondrial variation and energy metabolism (Mishmar et al. 2003; Tieleman et al. 2009) suggest that mtDNA data should be routinely examined in relation to environmental variables in species distributed over wide ranges of climatic conditions. Multilocus investigation of spatial genetic structure in relation to environmental variation should lead to better understanding of evolutionary forces shaping molecular variation.

The eastern yellow robin Eopsaltria australis (Passeriformes: Petroicidae) is a common and widespread bird of mesic woodlands and forests of tropical to temperate eastern Australia (Barrett et al. 2003). The species’ distribution spans 20° of latitude, associated with a wide range of climates (Higgins and Peter 2002). Superimposed on this climatic gradient, the Great Dividing Range, a mountain chain of modest elevation peaking at 2228 m above sea level, runs the entire length of the eastern coastline, and imposes climates that are generally drier and warmer further inland (Bowman et al. 2010; Byrne et al. 2011). Comprehensive, continent-wide bird surveys at 1° grids (1998–2002) showed E. australis to be evenly distributed and breeding throughout the Great Dividing Range (Barrett et al. 2003). Dispersal of E. australis is limited and significantly female-biased (Higgins and Peter 2002; Debus and Ford 2012; Harrisson et al. 2012), thus local adaptation is feasible (Schneider et al. 1999). So, too, is neutral divergence in allopatry: aridification and cooling during Pleistocene glacial cycles have promoted divergence and speciation of some forest and woodland Australian mesic organisms (Sunnucks et al. 2006; Symula et al. 2008; Malekian et al. 2010; Byrne et al. 2011).

A preliminary molecular dataset (Loynes et al. 2009) showed two highly divergent mitochondrial ND2 haplogroups in E. australis (> 6% divergence), whereas one polymorphic nuclear gene and two nuclear introns were not structured. Using mitochondrial COI, Christidis et al. (2011) confirmed this mitochondrial subdivision and suggested that, pending further phylogeographic study, the haplogroups may comprise separate species. Notably, the haplogroups did not correspond to the recognized subspecies, northern E. a. chrysorrhoa and southern E. a. australis (Schodde and Mason 1999), suggesting a possibility of mito-nuclear discordance. Interpreting strong mtDNA divergence as indicative of speciation is based on the usually untested assumption that mitochondrial lineages will behave as if under neutral evolution. Although in some cases this assumption might be valid (Harrison 1989; Zink and Barrowclough 2008), simulations show that even weak selection for local adaptations can result in strong phylogeographic structure of a uniparentally inherited locus, where divergent clades are geographically localized and differently adapted (Irwin 2012). Here we report an intriguing phylogeographic pattern in E. australis, where two major mtDNA lineages were distributed roughly eastward or westward of the Great Dividing Range in southeastern Australia, perpendicular to roughly latitudinal isolation-by-distance (IBD) of some nuclear loci, and we test rigorously whether selection or one or more neutral processes better explain this mito-nuclear discordance. We used multiple types of loci (mitochondrial ND2 gene, nuclear microsatellites, and introns) having different rates of evolution and patterns of inheritance (autosomal, Z-linked, and female-limited W-linked), and climatic variables, applied over the full range of the species in landscape genetic and genealogical frameworks. 
First, we tested if mitochondrial haplogroups diverged by drift in allopatry followed by secondary contact (as in the majority of cases reviewed by Toews and Brelsford 2012). Under this hypothesis, we expected to observe low habitat suitability for E. australis on and along the Great Dividing Range (an agent of vicariance for some taxa along the length of Australia's eastern seaboard; Byrne et al. 2011) during the Last Glacial Maximum (LGM). We used species distribution modeling to test this. Second, we tested if mitochondrial divergence in E. australis occurred in the presence of nuclear gene flow. Under divergence with gene flow (Pinho and Hey 2010), some nuclear gene flow between mtDNA haplogroups is expected. We tested this by fitting an isolation-with-migration (IMa) model to nDNA data. Finally, we tested the hypothesis that female-linked environmental selection (including selection on mtDNA and linked nuclear genes; Rand et al. 2004; Meiklejohn et al. 2007; Dowling et al. 2008; Tieleman et al. 2009; Shen et al. 2010; Moghadam et al. 2012) resulted in mtDNA divergence in the presence of nuclear gene flow between populations defined by their mtDNA haplogroups. This hypothesis includes the case where dispersal (which is female-biased in E. australis) between environmentally divergent areas is lower than dispersal between similar areas. Under the female-linked selection hypothesis, decoupled evolutionary history of mtDNA and nDNA is expected to result in different patterns of geographic structure for these genomes, and mtDNA may show associations with environmental variation, after spatial autocorrelation has been controlled for. We tested these predictions using phylogeographic, landscape genetic, and distance-based redundancy analyses.



To understand how current and late Pleistocene climatic variation in temperature and humidity might have impacted the spatial distribution of E. australis (and its genetic variation), and to test for a vicariant effect of the Great Dividing Range, we modeled current and LGM distributions of this species using the machine-learning maximum entropy model implemented in MaxEnt 3.3.3e (Phillips et al. 2006). Nineteen bioclimatic variables for the present and the LGM (Busby 1991) at a resolution of 2.5 arc-min (∼5 km² grid) were obtained from the WorldClim 1.4 database (Hijmans et al. 2005). Layers were visualized and cropped to span latitude 46–5°S and longitude 130–156°E using DIVA-GIS 7.4.0 (Hijmans 2009). Presence data comprised 454 presence localities of E. australis in the Atlas of Living Australia. A set of 341 randomly chosen presence points (75%) was used to train the MaxEnt model under current climatic conditions, and a set of 113 points (25%) was used to test the model. The fit of several preliminary MaxEnt models generated with different combinations of variables was compared using the area under the receiver-operating curve (AUC). The final model of current species distribution was generated using six variables that maximized AUC: max temperature of warmest month, min temperature of coldest month, mean diurnal range, precipitation of wettest month, precipitation of driest month, and precipitation seasonality. LGM distribution was modeled by projecting the final current model onto the set of these six variables estimated for the LGM (see online Supplementary Material S1 for details).
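The 75%/25% partition of presence points and the AUC criterion used to compare candidate models can be illustrated with a minimal Python sketch. It is not the MaxEnt implementation: the record identifiers and suitability scores are hypothetical, the function names are ours, and AUC is computed in its rank-based form as the probability that a presence site receives a higher score than a background site.

```python
import math
import random

def split_presences(points, train_fraction=0.75, seed=0):
    """Randomly partition presence records into training and test sets."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    # Round up so that 454 records give 341 training and 113 test points.
    n_train = math.ceil(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

def auc(presence_scores, background_scores):
    """Probability that a presence site outscores a background site (ties count 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical record IDs standing in for the 454 presence localities.
records = list(range(454))
train, test = split_presences(records)
print(len(train), len(test))  # 341 113

# Hypothetical suitability scores for test presences and background cells.
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.4]))
```

An AUC of 0.5 corresponds to a model that ranks presences no better than chance, and values approaching 1 indicate increasingly good discrimination.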


Genomic DNA was extracted from 63 frozen tissues of E. australis spanning the species’ entire geographic range (Australian National Wildlife Collection, CSIRO Ecosystems Sciences, Canberra) following Kearns et al. (2009), and from 44 blood samples collected in the south and north of the species’ range following Harrisson et al. (2012) (Fig. 1, Supplementary Material S2). Individuals were screened for eight autosomal nuclear microsatellites, Cpi3, Cpi8 (Doerr 2005), Escmu6 (Hanotte et al. 1994), HrU2 (Primmer et al. 1995), Pocc6 (Bensch et al. 1997), Smm7 (Maguire et al. 2006), Pgm1, and Pgm7 (Dowling et al. 2003) following Harrisson et al. (2012), and for allele-length polymorphism in the female-specific CHD-W locus and its Z-chromosome homolog CHD-Z (Griffiths et al. 1998; Supplementary Material S3). Pgm1 had suspected null alleles, and Pgm7 was found to be Z-linked and extremely variable, thus neither were used in analyses. The other six microsatellite loci were previously shown to be in Hardy–Weinberg equilibrium and independently segregating in E. australis (Harrisson et al. 2012).

Figure 1.

(a) Sampling localities (dots) and geographic distribution of mitochondrial haplogroups (colors): light red dots—A1, dark red dots—A2, black dots—B; numbers show distribution of haplotypes (labeled as in Fig. 3). Insets show two locations where both haplogroups were sampled: in Tenterfield, a haplogroup A male was collected on the same day from the same group of birds as a haplogroup B male and female; in Shelbourne, a haplogroup B male and a haplogroup A male were caught in May and October 2008, respectively. (b) Distribution of Q-values (pie charts within circles) for two microsatellite genetic clusters (“red” and “black”), detected by structure; each circle's outer border color shows the mitochondrial haplogroup for each individual: light red—A1, dark red—A2, black—B; samples are spaced apart for easier viewing. Altitudinal layers of the Great Dividing Range are: light gray—200 to 500 m, gray—500 to 1000 m, dark gray—above 1000 m.

The mitochondrial (mtDNA) ND2 gene was amplified using primers L5215 (Hackett 1996) and H6313 (Sorenson et al. 1999) following Kearns et al. (2009) and sequenced commercially in both directions using the amplification primers (Macrogen, Korea). Six nuclear introns (aldolase B intron 4 [AB4], glyceraldehyde-3-phosphate dehydrogenase intron 11 [GAPDH], phosphoenolpyruvate carboxykinase intron 9 [GTP], Z-linked muscle skeletal receptor tyrosine kinase intron 3 [MUSK-I3], rhodopsin intron 2 [RI2] and transforming growth factor-β2 intron 5 [TGFβ2]) were amplified in 47 individuals (Table 1, Supplementary Material S2) using published primers (Waltari and Edwards 2002; Sorenson et al. 2004; McCracken and Sorenson 2005; Vallender et al. 2007). All nuclear DNA (nDNA) PCRs contained MgCl2 (1.5 mM), 1× KCl buffer, 0.125 μM each dNTP, 0.4 μM primers, 0.01 units/μL Taq polymerase, and 0.2 mg/mL BSA, and the microsatellite touchdown protocol above was applied. Nuclear loci were sequenced using forward primers by the UK NERC Genepool sequencing facility (University of Edinburgh, UK). For a subset of individuals representing all inferred haplotypes, nuclear loci were further sequenced in the reverse direction to check that the haplotypes did not represent sequencing errors.

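Table 1. Descriptive statistics for all samples and members of distinct mitochondrial clades for sequenced loci (wet tropics individuals belonging to clade A2 were not sequenced for nDNA). N ind = number of individuals, N alleles = number of alleles, bp = base pairs, S = polymorphic sites, H = number of haplotypes, Hd = haplotype diversity (expected heterozygosity), π = nucleotide diversity, Rm = minimum number of recombinant events, n/s = not significant (P > 0.05), n/a = not applicable. Significant ΦST values are in bold.

| Locus | Marker type | Data | N ind | N alleles | Length, bp | Indels, bp | S | H | Hd | π | Fu's Fs; P | Tajima's D; P | ΦST A1 vs. B | Rm |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AB4 | Autosomal intron | All | 44 | 88 | 196 | 1 | 2 | 3 | 0.447 | 0.0025 | 0.613 n/s | 0.343 n/s | **0.047; P = 0.027** | 0 |
| | | Clade A | 20 | 40 | 196 | 1 | 1 | 3 | 0.405 | 0.0022 | −0.078 n/s | 0.565 n/s | | |
| | | Clade B | 22 | 44 | 195 | 0 | 2 | 3 | 0.519 | 0.0030 | 0.609 n/s | 0.474 n/s | | |
| GAPDH | Autosomal intron | All | 46 | 92 | 317 | 0 | 8 | 10 | 0.818 | 0.0047 | −2.217 n/s | −0.147 n/s | **0.065; P = 0.009** | 1 |
| | | Clade A | 21 | 42 | 317 | 0 | 6 | 8 | 0.866 | 0.0054 | −1.263 n/s | 0.577 n/s | | |
| | | Clade B | 23 | 46 | 317 | 0 | 6 | 6 | 0.723 | 0.0036 | −0.742 n/s | −0.432 n/s | | |
| GTP | Autosomal intron | All | 42 | 84 | 456 | 1 | 8 | 10 | 0.410 | 0.0011 | −8.403; P < 0.001 | −1.942; P < 0.001 | **0.034; P = 0.018** | 0 |
| | | Clade A | 19 | 38 | 456 | 0 | 4 | 5 | 0.292 | 0.0007 | −3.671; P = 0.001 | −1.630; P = 0.016 | | |
| | | Clade B | 21 | 42 | 456 | 1 | 7 | 8 | 0.535 | 0.0018 | −4.128; P = 0.004 | −1.502; P = 0.056 | | |
| MUSK-I3 | Z-linked intron | All | 31 | 51 | 519 | 0 | 21 | 15 | 0.810 | 0.0078 | −2.240 n/s | −0.416 n/s | 0.061; P = 0.072 | 3 |
| | | Clade A | 20 | 32 | 519 | 0 | 17 | 9 | 0.673 | 0.0065 | −0.114 n/s | −0.671 n/s | | |
| | | Clade B | 11 | 17 | 519 | 0 | 18 | 10 | 0.919 | 0.0102 | 0.516 n/s | −0.033 n/s | | |
| RI2 | Autosomal intron | All | 47 | 94 | 292 | 0 | 2 | 2 | 0.021 | 0.0002 | −1.421 n/s | −1.386 n/s | −0.001 n/s | 0 |
| | | Clade A | 22 | 44 | 292 | 0 | 0 | 1 | 0.000 | 0.0000 | 0 n/a | 0 n/s | | |
| | | Clade B | 23 | 46 | 292 | 0 | 2 | 2 | 0.044 | 0.0003 | −0.783 n/s | −1.473; P = 0.03 | | |
| TGFβ2 | Autosomal intron | All | 39 | 78 | 595 | 1 | 24 | 19 | 0.870 | 0.0070 | −3.420 n/s | −0.443 n/s | 0.007 n/s | 4 |
| | | Clade A | 20 | 40 | 595 | 0 | 23 | 16 | 0.868 | 0.0075 | −3.465 n/s | −0.599 n/s | | |
| | | Clade B | 17 | 34 | 594 | 1 | 12 | 9 | 0.857 | 0.0066 | 0.553 n/s | 0.935 n/s | | |
| ND2 | mt coding | All | 100 | 100 | 1002 | 0 | 102 | 40 | 0.908 | 0.0334 | 2.785 n/s | 2.303; P = 0.01 | **0.968; P < 0.001** | n/a |
| | | Clades A1+B | 97 | 97 | 1002 | 0 | 94 | 38 | 0.902 | 0.0332 | 3.382 n/s | 2.704; P = 0.002 | | |
| | | Clade A1 | 53 | 97 | 1002 | 0 | 26 | 20 | 0.738 | 0.0017 | −16.418; P < 0.001 | −2.276; P < 0.001 | | |
| | | Clade A2 | 3 | 97 | 1002 | 0 | 2 | 2 | 0.667 | 0.0013 | | | | |
| | | Clade B | 44 | 97 | 1002 | 0 | 16 | 18 | 0.901 | 0.0023 | −10.728; P < 0.001 | −1.158 n/s | | |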

Chromatograms were edited and aligned in Geneious Pro 4.8.5 (Drummond et al. 2010). The mitochondrial origin of ND2 sequences was supported by lack of stop-codons, insertions/deletions, or sequence ambiguities. For nuclear introns containing indels, only the first fragments of sequence, where the majority of sequences were unambiguous, were analyzed (Table 1), and individuals heterozygous for indels within these fragments were removed from analysis (one for GAPDH, five for GTP, eight for TGFβ2). Gametic phases of heterozygous sequences were reconstructed using the Phase 2.1 algorithm (Stephens et al. 2001; Stephens and Donnelly 2003) implemented in DnaSP version 5.10.1 (Librado and Rozas 2009). Because of presence of many rare alleles for some loci, haplotypes for small proportions of individuals (0.07 for GTP, 0.19 for MUSK-I3, 0.01 for RI2, 0.18 for TGFβ2, and 0.16 for MC1R) were resolved with < 80% probability. We used all best pairs of haplotypes for analyses to avoid systematic bias in estimates of population genetic parameters (Garrick et al. 2010), although incorrectly resolved haplotypes could slightly influence estimates of linkage disequilibrium, recombination, and coalescence analyses (Balakrishnan and Edwards 2009). Introns were checked for recombination in DnaSP using the four-gamete test (Hudson and Kaplan 1985).
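The logic of the four-gamete test can be shown with a short Python sketch. It assumes phased haplotypes given as equal-length strings and reports every pair of biallelic sites at which all four two-site combinations occur, which under an infinite-sites model implies at least one recombination event between them. It does not compute Rm itself, and it is an illustration of the criterion, not the DnaSP routine.

```python
from itertools import combinations

def four_gamete_pairs(haplotypes):
    """Return pairs of site indices at which all four two-site gametes are observed."""
    n_sites = len(haplotypes[0])
    # Only sites with exactly two alleles are informative for the test.
    biallelic = [i for i in range(n_sites)
                 if len({h[i] for h in haplotypes}) == 2]
    incompatible = []
    for i, j in combinations(biallelic, 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            incompatible.append((i, j))
    return incompatible

# Hypothetical phased haplotypes; the middle site is monomorphic and is ignored.
toy = ["AAC", "AAT", "GAC", "GAT"]
print(four_gamete_pairs(toy))  # [(0, 2)]
```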


Relationships among haplotypes for mtDNA and nDNA loci were visualized on a median-joining network (Bandelt et al. 1999) constructed in the program Network. Owing to recombination, nuclear gene networks may not represent true genealogical relationships, but are nonetheless useful for assessing allele clustering with respect to mtDNA-defined lineages. Support for mitochondrial ND2 haplogroups and their divergence times were estimated in Beast 1.6.1 and 1.7 (Drummond and Rambaut 2007; Drummond et al. 2012) using all 100 ND2 sequences (Supplementary Material S4). Structure 2.3.3 (Pritchard et al. 2000) and Tess 2.3.1 (Chen et al. 2007) were used to explore genotype substructuring in microsatellites and autosomal introns (Supplementary Material S5). Spatial principal component analysis (sPCA; Jombart et al. 2008) implemented through the functions spca and global.rtest in the package adegenet version 1.2-8 (Jombart 2008) for R (R Development Core Team 2011) was used to investigate spatial patterns of genetic variation. sPCA maximizes the product of spatial autocorrelation and genetic variance to summarize multivariate data on allele frequencies into a few uncorrelated axes. This approach is powerful for detecting cryptic spatial patterns that are not associated with high genetic variation (Jombart et al. 2008); unlike Structure or Tess, it does not require Hardy–Weinberg or linkage equilibria, and can be applied to sequence data. We used a Gabriel graph and introduced a small amount of random noise to coordinates of multiple individuals collected at the same location (function jitter, factor = 1, amount = 0.1; the network was robust to the addition of noise). Analyses were run separately for ND2, each of the sequenced nuclear markers, and the dataset of six microsatellites (removing nine individuals with missing scores for a locus).
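To make the connection network concrete, the following Python sketch jitters coincident coordinates and builds a Gabriel graph, in which two samples are neighbors when no third sample falls strictly inside the circle whose diameter is the segment joining them. It treats coordinates as planar, uses hypothetical sample positions, and is a stand-in for the adegenet functions named above, which are not reproduced here.

```python
import random

def jitter(coords, amount=0.1, seed=0):
    """Add uniform noise in [-amount, amount] to each coordinate."""
    rng = random.Random(seed)
    return [(x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
            for x, y in coords]

def gabriel_graph(coords):
    """Return edges (i, j) with no third point strictly inside the circle on diameter i-j."""
    def sq_dist(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    n = len(coords)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = sq_dist(coords[i], coords[j])
            # Point k lies strictly inside the circle iff d_ik^2 + d_jk^2 < d_ij^2.
            if all(sq_dist(coords[i], coords[k]) + sq_dist(coords[j], coords[k]) >= d_ij
                   for k in range(n) if k not in (i, j)):
                edges.append((i, j))
    return edges

# Hypothetical (longitude, latitude) pairs; two birds share one site.
samples = [(150.0, -30.0), (150.0, -30.0), (151.0, -31.0), (149.0, -33.0)]
print(gabriel_graph(jitter(samples)))
```

Jittering matters because samples with identical coordinates have zero distance between them, which makes the neighbor relation among coincident points depend on tie-breaking.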

Multiple processes, such as IBD and local selection (e.g., isolation-by-adaptation; Nosil et al. 2008), can simultaneously shape spatial structure of variation of different molecular markers. To separate the effect of geographic position, which may reflect local processes, from any underlying effect of IBD, we applied redundancy analysis (RDA, function rda, R package vegan; Dixon 2003; Oksanen et al. 2011). RDA is a constrained ordination technique that tests whether a given predictor explains residual variation in response after fitting (conditioning on) other predictors first. We tested whether geographic position can explain genetic variance of nuclear loci beyond what can be explained by the IBD effect. As the response, we used data on presence or absence of each identified allele (microsatellites) or variant nucleotide at polymorphic sites (sequences) in each individual, computed using a genind constructor and centered and scaled using scaleGen (R package adegenet). The two predictors were geographic position (latitude and longitude, analyzed together for simplicity of interpretation) and pairwise geographic distance between locations. For RDA (and also distance-based RDA, below), the information about an individual's pairwise geographic distances to other individuals was reduced to a few columns using the following procedure. First, geographic distances between locations, initially computed using the R package fields, were expressed as a rectangular matrix using principal coordinates of the neighborhood matrix procedure (pcnm; Borcard and Legendre 2002) implemented in function pcnm in vegan. Then this rectangular matrix was reduced to ≤ 3 columns best explaining genetic variance. For that, we fitted all pcnm axes as predictors of allele presence using RDA, and chose up to three axes significant at P < 0.05 to represent geographic distances in the final RDAs. 
This approach is more powerful than Mantel tests, which require multivariate environmental data (such as latitude, longitude, and environmental variables) to be expressed as distances (Legendre and Fortin 2010). For the final RDAs, we tested for the ability of each predictor to explain significant variance in allele presence–absence data alone (marginal tests), and after fitting the other predictor first (conditional tests). Significance was assessed with 999 permutations (of the rows and columns of the predictor matrix for marginal tests, or of the rows and columns of the multivariate residual matrix for conditional tests). A predictor variable that explained significant (P < 0.05) genetic variance in marginal and conditional tests was inferred to have major effect on genetic structure. If both marginal tests were significant but conditional tests were not, both variables were inferred to influence genetic variation. RDAs were performed on each sequenced nuclear marker, and complete genotypes at six microsatellites. For mitochondrial ND2 sequences, effects of geographic position and distance were explored using distance-based RDA (dbRDA) as explained later.
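The pairwise geographic distances that feed the pcnm step can be illustrated with a great-circle (haversine) calculation. The Python sketch below is an assumption-laden stand-in, not the R routine used in the study: the Earth radius is a conventional mean value and the site coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # conventional mean radius; an assumed constant

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_matrix(sites):
    """Symmetric matrix of great-circle distances among (lat, lon) sites."""
    return [[great_circle_km(a[0], a[1], b[0], b[1]) for b in sites] for a in sites]

# Hypothetical sampling sites as (latitude, longitude).
sites = [(-29.05, 152.02), (-36.88, 144.02), (-17.27, 145.48)]
for row in distance_matrix(sites):
    print([round(d) for d in row])
```

A matrix of this kind is square, with one row and column per location; the pcnm procedure then re-expresses it as a rectangular set of axes that can be entered as ordinary predictors.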


For each sequenced locus, we calculated haplotype diversity (Hd), nucleotide diversity (π), pairwise ΦST between haplogroups, Tajima's D (Tajima 1996), and Fu's Fs (Fu 1997) tests for selective neutrality using Arlequin 3.11 (Excoffier et al. 2005). DnaSP was used to test for selection on ND2 itself using the McDonald–Kreitman (MK) test (McDonald and Kreitman 1991); two individuals with missing data for two sites were removed from this analysis (for additional tests for selection, see online Supplementary Material S7). Arlequin was used to compute expected heterozygosity (He) and FST between haplogroups for six microsatellites (Supplementary Material S6). Calculations were performed for all data, and for the two populations defined by mtDNA haplogroups.
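For readers unfamiliar with these summary statistics, the Python sketch below computes haplotype diversity, nucleotide diversity, and Tajima's D from a toy alignment using the standard textbook formulas. It assumes aligned sequences of equal length with no missing data and at least three sequences, returns D = 0 when there are no segregating sites (a convention of this sketch), and does not provide the significance tests that Arlequin performs.

```python
import math
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))

def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

def nucleotide_diversity(seqs):
    """pi: mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])

def segregating_sites(seqs):
    """Number of alignment columns with more than one state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))

def tajimas_d(seqs):
    """Tajima's D; requires n >= 3. Returns 0.0 if no sites segregate."""
    n, s = len(seqs), segregating_sites(seqs)
    if s == 0:
        return 0.0
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical alignment in which every variant is a singleton.
singletons = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
print(haplotype_diversity(singletons), nucleotide_diversity(singletons), tajimas_d(singletons))
```

An excess of rare variants, as in the toy alignment, gives a negative D, which is the pattern commonly read as a signature of population expansion or purifying selection; a positive D indicates an excess of intermediate-frequency variants.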


We fitted the model of isolation-with-migration (Hey and Nielsen 2004) as implemented in IMa2.0 (Hey and Nielsen 2007) to the multilocus dataset comprising the largest nonrecombining fragments of four sequenced introns that had no signature of selection (autosomal AB4, GAPDH, and TGFβ2, and Z-linked MUSK-I3), the size-variable Z-linked intron (CHD-Z), and six autosomal microsatellites for the two populations defined by mtDNA haplogroup membership (A1 and B; see Results), omitting three individuals from an isolated northernmost population. Under this classification, any migration detected between haplogroups would indicate presence of nuclear gene flow inconsistent with vicariant divergence. The fit of a full migration model was compared to that of a no-migration submodel using a likelihood ratio test. Parameter estimates were converted to demographic units (number of years or individuals) using a generation time of 3.5 years (calculated as average age of reproductive females, assuming age of first reproduction of 1 year and lifespan of 6 years) and mutation rates of Lerner et al. (2011) (see online Supplementary Material S8 for details of analyses).
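The arithmetic behind the generation time and the conversion to demographic units can be sketched in Python as follows. The sketch assumes the usual isolation-with-migration scaling, in which population size parameters are expressed as 4Nμ and splitting time as tμ; the mutation rate and scaled estimates are hypothetical placeholders, not values from this study, and inheritance scalars for Z-linked loci are ignored.

```python
def mean_reproductive_age(first_breeding, lifespan):
    """Average age of breeding females, assuming equal weight for each year of age."""
    ages = range(first_breeding, lifespan + 1)
    return sum(ages) / len(ages)

def scaled_time_to_years(t_scaled, mu_per_year):
    """Convert a mutation-scaled splitting time (t * mu) to years."""
    return t_scaled / mu_per_year

def theta_to_effective_size(theta, mu_per_year, generation_time):
    """Convert theta = 4 * N * mu (mu per generation) to an effective size N."""
    mu_per_generation = mu_per_year * generation_time
    return theta / (4 * mu_per_generation)

generation_time = mean_reproductive_age(1, 6)
print(generation_time)  # 3.5

# Hypothetical per-locus mutation rate (per year) and scaled estimates.
mu = 1e-6
print(scaled_time_to_years(0.5, mu))
print(theta_to_effective_size(2.8, mu, generation_time))
```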


We used dbRDA (Legendre and Anderson 1999) implemented in the capscale function in the R package vegan to test whether mitochondrial divergence occurred as a result of selection acting upon a female-linked trait (broadly defined, i.e., mtDNA, mtDNA–nDNA interaction, or genes on the W-chromosome). We examined whether the two bioclimatic variables that were most informative in predicting the E. australis distribution (Supplementary Material S1) explain mitochondrial variation above that explained by geographic position and/or distance between individuals (Legendre and Fortin 2010). In this analysis, individuals were the units of observation, and pairwise interindividual TN93 + G mitochondrial genetic distances (calculated in Mega5 [Tamura et al. 2011] excluding two individuals with missing sites; Supplementary Material S4) were treated as information on multivariate response (and not as a single univariate response variable as in Mantel tests; Geffen et al. 2004). Predictors (standardized to a zero mean and unit variance) were two environmental variables: maximum temperature of warmest month and precipitation of driest month, and three geographic variables: latitude, longitude, and geographic distance between samples represented by the first significant (P < 0.05) pcnm axis (as explained above for RDA). The final dbRDA analyses included (1) marginal tests, where the relationship between mtDNA genetic distances and each of the predictors was analyzed separately; (2) conditional tests, where significance of environmental variables (maximum temperature or precipitation) as predictors of mtDNA distances was assessed after first fitting other variables one at a time or together; and (3) the reciprocal conditional tests, where significance of geographic variables was similarly assessed by first fitting environmental variables. Significance was assessed with 999 permutations. 
Environmental variables were inferred to explain mtDNA variance over that explained by geographic position and distance if marginal as well as conditional tests were significant. Significant association of mitochondrial variation and climatic variables beyond its correlation with geographic position and distance would provide strong evidence against hypotheses assuming neutral behavior of mtDNA.



Current and LGM species distribution models (Supplementary Material S1) predict the northernmost (Wet Tropics) range of E. australis to be isolated from the remainder. Thus, for this outlying region, vicariant effects might have acted at some time. In contrast, the species is predicted under LGM and current conditions to have a continuous distribution on and along the Great Dividing Range, except for a proportionally small area of habitat unsuitability (localized glaciation) during the LGM in the mountains of southeastern Victoria and southern New South Wales (Barrows et al. 2001). These predicted distributions provide no basis to expect vicariant patterns due to the Great Dividing Range (arguing against allopatric divergence). Two variables describing extremes of climatic conditions, maximum temperature of warmest month and precipitation of driest month, explained the majority of variance in E. australis presence (50% and 38%, respectively, in the final model).


The 100 individuals sequenced for 1002 base pairs of ND2 revealed 102 polymorphic sites (80 parsimony informative), defining 40 haplotypes (GenBank accession numbers KC466740–KC466839). All individuals fell into one of haplogroups A or B, between which there was 6.6% net nucleotide divergence (Bayesian posterior probability = 1, MCC Beast tree; Supplementary Material S4, and ND2 network; Fig. 2). Assuming neutral evolution with rates similar to ND2 rate of the Hawaiian honeycreepers (Lerner et al. 2011), the haplogroups A and B diverged in the early Pleistocene, ∼1.5 (95% highest posterior density [HPD] = 0.975–2.147) million years ago. Five of 53 fixed differences between A and B represented nonsynonymous but biochemically conservative amino acid substitutions (Grantham 1974; Supplementary Material S7). Haplogroup A was widespread north to south, and inland of the Great Dividing Range in southeastern Australia (Fig. 1a, red points), whereas haplogroup B was principally coastward and restricted to southeastern Australia (Fig. 1a, black points). Two locations, Tenterfield and Shelbourne (Fig. 1a) had individuals of both haplogroups. Haplogroup A comprised haplogroups A1 (widespread except Wet Tropics) and northernmost A2 (three Wet Tropics individuals; Bayesian posterior probability = 1, MCC tree; net divergence 1%; Supplementary Material S4), which diverged in the mid-Pleistocene, ∼212 (95% HPD = 104–341) thousand years ago, assuming neutral evolution. One of 10 fixed differences between A1 and A2 was non-synonymous. Consistent with these results, sPCA of mtDNA (Fig. 3) detected two significant global structures (P < 0.001), the first (mt-sPC1) reflecting sharp A versus B subdivision, and the second (mt-sPC2) subdivision between A2 (Wet Tropics) and the rest (A1 + B).

Figure 2.

Haplotype networks for mitochondrial ND2 and six nuclear introns. Circles indicate unique haplotypes, with area proportional to haplotype frequency. Connections between circles are a single mutation unless indicated otherwise (italic). Colors correspond to the haplogroups: gray—A (light gray—A1, corresponding to light red dots in Figure 1A, dark gray—A2 [two haplotypes, H9 and H12, on mtDNA network], dark red dot in Figure 1A), white—B (black dots in Fig. 1A); black pies on nuclear networks indicate individuals sequenced for nDNA but not for mtDNA (thus, mtDNA haplogroup for these individuals was unknown).

Figure 3.

Geographic distribution of spatial principal component (sPCA) scores for mitochondrial ND2 (mt-sPC1 and mt-sPC2), microsatellites (msat-sPC1 and msat-sPC2), GAPDH (GAPDH-sPC1), and MUSK-I3 (MUSK-I3-sPC1). Large black squares indicate samples well differentiated from those denoted by large white squares, small squares are less differentiated. Lines indicate values of sPCA scores for interpolated surface (lags). Narrowly spread lines and a lack of small squares indicate discontinuity or sharp geographic transition between two groups (as for mtDNA). Presence of smaller squares and broadly spread lines indicate a cline (as for microsatellites).


For 32 females and 64 males sequenced for ND2 and sexed by CHD variation, the distribution of two W-linked (female-limited) alleles exactly matched mtDNA haplogroup variation (allele 366 was restricted to haplogroup A, allele 362 to B), whereas the two Z-linked alleles were widespread and unstructured with respect to haplogroups (Supplementary Material S3). Ninety-two individuals were genotyped for six microsatellites. Genotypic analyses in Structure and Tess suggested two genetic clusters, with a gradual north to south change of individual cluster probabilities (Fig. 1b, Supplementary Material S5). sPCA on microsatellites detected two significant global structures (P < 0.001; Fig. 3). The strongest structure (msat-sPC1) showed a roughly north–south cline similar to that detected by Structure and Tess; the second, subtle structure (msat-sPC2) showed a cline across the Great Dividing Range perpendicular to msat-sPC1, directionally consistent with the major mitochondrial lineage subdivision (Fig. 3). IBD was a dominant process shaping microsatellite variation: geographic distances alone could explain the variance in microsatellite frequencies (P = 0.005, marginal RDA test; Table 2), and remained significant even after geographic position was fitted first (P = 0.017 conditional RDA test). In contrast, geographic position, although significant when fitted alone (P = 0.005, marginal RDA test), did not explain microsatellite structure after geographic distances were controlled for (P = 0.1, the reciprocal conditional RDA test).

Table 2. Redundancy (RDA) analysis: separating the effect of IBD and geographic position (latitude and longitude analyzed together) on multivariate distribution of allele frequencies (presence–absence) for six microsatellites and nuclear introns. Distance = geographic distance (represented as a rectangular matrix of pcnm axes), geography = joint effect of latitude and longitude, N pcnm axes = number of significant pcnm axes representing geographic distance. Marginal tests show the relationship between allele frequencies and either distance or geographic position alone. Conditional tests show the same relationships, but having first fitted the alternative variable (geographic position and distance, respectively) as the covariates in the analyses. Significant P-values (P < 0.05) are in bold
Locus (Loci) | N pcnm axes | Predictor variable tested | Marginal tests: F | P | % Variance explained | Conditional tests: Variables fitted first | F | P | % Variance explained


Sequences were obtained from six nuclear introns in 31–47 individuals; the introns had two to 19 alleles (Table 1; GenBank accession numbers KC466694–KC466739 and KC466694–KC467012). No intron displayed clusters or allele-sharing concordant with mitochondrial haplogroup designations (Fig. 2). Two genetic clusters detected by genotypic analysis of the autosomal introns in Tess were consistent with the north–south gradient in cluster membership detected from microsatellites (Supplementary Material S5). Marginally significant spatial structure was suggested by sPCA for GAPDH (P = 0.053) and for MUSK-I3 (P = 0.065; Fig. 3), in which southwestern individuals were somewhat differentiated from the rest; no structure was found for the other sequenced introns (P > 0.15; RI2 was not tested as only two haplotypes were present). RDA (Table 2) showed that IBD as well as geographic position explained variation in GTP (both P < 0.05 on marginal tests, P > 0.05 on conditional tests); geographic position explained variation in AB4, GAPDH, and MUSK-I3 beyond that explained by IBD (P < 0.05 on marginal and conditional tests); and IBD but not geographic position explained patterns in TGFβ2 (P < 0.05 on marginal test).


Within-haplogroup nucleotide diversity was lower for ND2 than for four (of six) nuclear loci (AB4, GAPDH, MUSK-I3, and TGFβ2; Table 1); π for nuclear loci ranged from 0 to 0.0075 and from 0.0003 to 0.0102, and for ND2 was 0.0013 and 0.0023, for A1 and B, respectively. Significant differentiation between A1 and B was detected for ND2 (ΦST = 0.968; P < 0.001), inevitable under reciprocal monophyly. Much lower but significant A1–B differentiation was also found for nuclear introns AB4 (ΦST = 0.047; P = 0.03), GAPDH (ΦST = 0.065; P = 0.01), and GTP (ΦST = 0.034; P = 0.02), and for microsatellites (FST = 0.029; P < 0.001), but not for MUSK-I3, RI2, or TGFβ2 (P > 0.05; Table 1). Significantly negative Tajima's D and/or Fu's Fs, indicative of population expansion, purifying selection, or genetic hitchhiking, were found in the three loci with the lowest nucleotide diversity (for A1 in ND2 and GTP, and for B in ND2, GTP, and RI2). Furthermore, an MK test for selection on ND2 detected a significant deficiency of nonsynonymous changes fixed between A1 and B relative to polymorphism within groups (Fisher's exact test P = 0.047): 50 synonymous and five nonsynonymous changes were fixed between groups, compared to 30 synonymous and 10 nonsynonymous polymorphisms present within groups. MK tests between haplogroups A and B or between A1 and A2 were not significant (P > 0.05). Thus, purifying selection is indicated to have acted upon ND2; positive selection can be rejected (additional tests in Supplementary Material S7).
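The MK comparison above reduces to a 2 × 2 contingency table of fixed versus polymorphic and synonymous versus nonsynonymous changes. The following is a minimal sketch, not the software used in the study, that applies a two-sided Fisher's exact test to the four counts reported for A1 versus B and computes a neutrality index. The function names are illustrative, and the neutrality index is an added summary that the text does not report.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def neutrality_index(dn, ds, pn, ps):
    """NI = (Pn/Ps) / (Dn/Ds); values above 1 suggest excess nonsynonymous polymorphism."""
    return (pn / ps) / (dn / ds)


# Counts reported in the text for ND2, haplogroups A1 versus B
ds, dn = 50, 5   # synonymous / nonsynonymous changes fixed between groups
ps, pn = 30, 10  # synonymous / nonsynonymous polymorphisms within groups

p_value = fisher_exact_two_sided(ds, dn, ps, pn)
ni = neutrality_index(dn, ds, pn, ps)
print(f"Fisher's exact P = {p_value:.3f}, neutrality index = {ni:.2f}")
```

A neutrality index above 1 means that nonsynonymous changes are relatively more common as within-group polymorphisms than as fixed differences, which is the direction expected under purifying selection.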


Coalescent analysis of nuclear data in IMa2 could not definitively estimate some model parameters for two populations defined by their mitochondrial haplogroup membership (details in online Supplementary Material S8). For this reason, and because assumptions of the model might not be met, IMa2 results should be interpreted with caution. Based on all nuclear loci, populations A and B were estimated to have diverged in the late Pleistocene or Holocene, ∼7,000 (95% HPD = 700–149,000) years ago. Posterior distributions of some Θ parameters did not reach zero; nevertheless, each of them had a distinct peak within the range of prior parameter distributions. Demographic estimates of population sizes from all nuclear data were NA = 14 (95% HPD = 3–86), NB = 13 (2–101), and NANC = 186 (122–222) thousand individuals (for A, B, and ancestral populations, respectively). Although nonzero probabilities were associated with all prior values of coalescent migration parameters, peak posterior distribution estimates suggested some migration between populations. Migration estimates from all nuclear data converted to population migration rate (2Nm) corresponded to forward-in-time gene movement of 6.2 (95% HPD = 0–32) genes per generation from A to B and 3.8 (0–22) genes per generation from B to A. IMa2 was unable to conclusively reject the no-migration submodel based on a likelihood ratio test.


DbRDA (Table 3) showed that maximum temperature of warmest month (max.temp), precipitation of driest month, and latitude explain a significant amount of variance in mtDNA genetic distances (marginal tests P < 0.005), accounting for 51%, 47%, and 24% of genetic variance in marginal tests, respectively, whereas longitude and geographic distance do not (P = 0.59 and 0.072, respectively). We therefore excluded longitude from partial tests but retained geographic distance, which was close to significance. There was evidence of strong correlation of environmental variation with mtDNA haplogroups: max.temp remained a significant predictor of genetic distances even when other variables were fitted first, alone or together (P = 0.005; Table 3). Precipitation seems somewhat less influential than max.temp: it remained significant when the three other variables were fitted individually, or when latitude and geographic distance were fitted first (P = 0.005), but not when max.temp plus the two geographic variables were fitted first (P = 0.38; Table 3). Latitude remained significant after first fitting the other variables alone or together (P < 0.05; Table 3). Overall, these results suggest that max.temp and latitude may play an important role in shaping mitochondrial genetic structure of E. australis. Haplogroup A individuals occupied locations with higher maximum temperatures of the warmest month and lower precipitation of the driest month than those of haplogroup B (Fig. 4).
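To make the logic of a marginal dbRDA test concrete, the sketch below partitions the variation in a distance matrix by a single predictor using the trace formulation of distance-based redundancy analysis (Gower-centered distances projected onto the predictor), with a permutation P-value. It is an illustration under stated assumptions, not the authors' pipeline: the six sites, their coordinates, the `max_temp` values, and the function names are all hypothetical toy inputs, and conditional (partial) tests are not covered.

```python
import math
import random


def gower_center(dist):
    """Gower-centre a distance matrix: G = C(-0.5 * D^2)C."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in a]  # the matrix is symmetric, so row means = column means
    grand = sum(row) / n
    return [[a[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def marginal_dbrda(dist, x):
    """Return (proportion of variation explained, pseudo-F) for one predictor."""
    n = len(x)
    g = gower_center(dist)
    mean_x = sum(x) / n
    xc = [v - mean_x for v in x]
    ss_x = sum(v * v for v in xc)
    # tr(HG) for a single centred predictor equals xc' G xc / (xc' xc)
    explained = sum(xc[i] * g[i][j] * xc[j] for i in range(n) for j in range(n)) / ss_x
    total = sum(g[i][i] for i in range(n))
    residual = total - explained
    pseudo_f = explained / (residual / (n - 2)) if residual > 1e-12 else float("inf")
    return explained / total, pseudo_f


def permutation_p(dist, x, n_perm=199, seed=1):
    """Permutation P-value: share of shuffled predictors with pseudo-F >= observed."""
    rng = random.Random(seed)
    observed = marginal_dbrda(dist, x)[1]
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if marginal_dbrda(dist, shuffled)[1] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: six sites on a 2 x 3 grid in "genetic space"
coords = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
dist = [[math.dist(p, q) for q in coords] for p in coords]
max_temp = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]  # hypothetical predictor values

prop_explained, pseudo_f = marginal_dbrda(dist, max_temp)
p_perm = permutation_p(dist, max_temp)
print(f"explained = {prop_explained:.3f}, pseudo-F = {pseudo_f:.2f}, P = {p_perm:.3f}")
```

In this toy case the predictor coincides with one of the two axes that generate the distances, so it accounts for that axis's share of the total variation. With only six sites the permutation P-value has little resolution, so it is shown only to illustrate the procedure.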

Table 3. Results of dbRDA analysis: tests of relationship between multivariate mtDNA genetic variation (defined by pairwise mtDNA genetic distances) and several predictor variables (geographic and environmental). The first row for each tested predictor is a marginal test for that variable, the next four rows are conditional (partial) tests where variables given in second column are partialed out (fitted before the variable being tested). Last column indicates percentage of multivariate genetic variation explained by variable being tested. Max.temp = maximum temperature of warmest month, precipitation = precipitation of driest month, All = all variables except the one being tested. Longitude was not a significant predictor of genetic distance in the marginal test (F = 0.34, P = 0.59), and was not used for partial tests. Significant P-values (P < 0.05) are in bold
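Predictor variable tested | Variables fitted first in partial tests | F | P (> F) | % variance explained
Geographic distance | (marginal) | 3.3 | 0.072 | 3.3
Latitude | (marginal) | 31.1 | 0.005 | 24.5
Latitude | Geographic distance | 60.9 | 0.005 | 39.0
Max.temp | (marginal) | 100 | 0.005 | 51.1
Max.temp | Geographic distance | 94.1 | 0.005 | 49.7
Max.temp | Geog. distance + latitude | 57.8 | 0.005 | 38.1
Precipitation | (marginal) | 86.4 | 0.005 | 47.4
Precipitation | Geographic distance | 82.4 | 0.005 | 46.5
Precipitation | Geog. distance + latitude | 18.8 | 0.005 | 16.7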
Figure 4.

Distribution of two major mtDNA lineages of Eopsaltria australis in environmental space, as defined by two BIOCLIM variables, max temperature of warmest month and precipitation of driest month. Two localities where both haplogroups occurred together are marked with arrows.


We attempted to clarify the evolutionary processes shaping a profound mito-nuclear discordance in the Australian bird, the eastern yellow robin E. australis, in which the major patterns of spatial variation in mtDNA and nDNA are perpendicular. We addressed several evolutionary scenarios, applying a series of landscape-genetic, spatial-genetic, and genealogical tests.

Deep mitochondrial divergence can arise by chance in a continuous population, when dispersal and population size are low, due to the stochastic nature of the coalescent process (Irwin 2002). However, such an effect is highly unlikely in E. australis, where the two haplogroups have a sharp spatial division within this continuously distributed, common, widespread species (Barrett et al. 2003). The haplogroups abut for > 1500 km with substantial potential for mixing, yet only in two sampled locations did they co-occur (Fig. 1). Random processes also cannot explain the correlation of mtDNA lineages with environmental variables.

Under a scenario of allopatric divergence followed by secondary contact, common neutral explanations for more structured mtDNA than nDNA include incompletely sorted nuclear lineages, nuclear introgression, and/or sex-biased asymmetries such as dispersal, mating, or offspring production/survival (Toews and Brelsford 2012). Although the species distribution models did indicate vicariance as a plausible explanation of the A1 and A2 split at or after the LGM, there is little basis for vicariance in the range of E. australis south of the Wet Tropics. The species is common and widespread along the Great Dividing Range (Barrett et al. 2003), and its likely presence there during the LGM is inferred from our species distribution models (although the LGM populations on parts of the Range could have been sparse, according to relatively low probability of occurrence; Supplementary Material S1). Geographic inconsistencies between mtDNA and nDNA structure rule out incomplete lineage sorting as a cause of discordance (Toews and Brelsford 2012). Historical lack of gene flow across the Great Dividing Range but not in other directions would be expected to structure both mitochondrial and microsatellite variation: yet no geographically distinct genetic clusters were detected for microsatellites. Instead, the major spatial pattern for microsatellites is latitudinal IBD, with only mild IBD across the Great Dividing Range. This suggests that distance, rather than a barrier, has structured nuclear variation. This, and inference of some nuclear gene exchange from coalescent analysis in IMa2 (tempered by parameter uncertainty), suggests that some nuclear gene flow between haplogroups has occurred. 
Different demographic histories were inferred from mtDNA and nDNA data: significant Tajima's D and Fu's Fs neutrality tests, suggestive of population expansion, purifying selection, or hitchhiking, were detected for mtDNA and the two least variable nuclear loci, but not for any of the more variable nuclear markers. Population expansion would be expected to yield generally consistent signals across different markers; thus, nonneutral evolution or hitchhiking of some loci appears more likely, an interpretation also supported by the MK test for ND2. This suggests that evolutionary histories of mitochondrial and nuclear genomes are decoupled. The same conclusion arises from contrasting A versus B divergence times estimated from the two genomes while accounting for coalescent variance: the late Pleistocene–Holocene divergence of ∼7000 (700–149,000) years ago estimated by IMa2 from nuclear data (Supplementary Material S8) is much more recent than the early Pleistocene mitochondrial lineage divergence of ∼1.5 (1–2.1) million years ago estimated by Beast from ND2 (Supplementary Material S4). Although we note that the divergence time estimates of Beast and IMa2 are not strictly comparable because IMa2 incorporated migration into the model, the difference represents evidence that mtDNA and nDNA cannot both be responding solely to neutral processes.

Very strong female philopatry unrelated to selection, and selection correlated with mtDNA, are both expected to limit female but not male movement and gene flow across populations and thus to impact mtDNA more and nDNA less than expected in the absence of these effects. However, the female philopatry needed to explain observed patterns in the data contrasts strongly with what is known about the biology of E. australis: field observations (Debus and Ford 2012) and population genetic data (Harrisson et al. 2012) show that females disperse significantly further than males (at least ∼7 and ∼1 km, respectively; Debus and Ford 2012). It is possible that females prefer to settle in habitats similar to their natal habitats, whereas males are less discriminating (Tonnis et al. 2005). However, behavior where females were so unable or unwilling to disperse or settle between adjacent haplogroup ranges over > 1500 km of contact seems likely to be linked to some major fitness advantage, implying female-linked selection.

If we reject the selectively neutral hypotheses as unparsimonious, it remains to consider possibilities involving female/mtDNA-linked selection. Geographically localized haplogroups and very shallow coalescent times within haplogroups compared to between-haplogroups observed in E. australis are consistent with simulation scenarios under local selection and low to moderate dispersal (Irwin 2012). The correlation of environmental variables with the spatial distribution of mtDNA variation, after controlling for IBD and geographic position (dbRDA results; Table 3) is consistent with environmental selection. The most informative bioclimatic variables with regards to species occurrence (maximum temperature of the warmest month and, to a lesser extent, precipitation of the driest month) explained significant mitochondrial patterning beyond that explained by latitude and IBD, suggesting that these climatic factors may be associated with female-linked selection.

These results, coupled with the arguments against selectively neutral explanations, lead us to propose that the most likely solution to the geographically perpendicular mito-nuclear discordance in E. australis is female-linked selection (broadly defined), taken here to include selection on one or more of mtDNA, nuclear-encoded genes for proteins involved in oxidative phosphorylation, nuclear-encoded genes for mitochondrial proteins, joint mito-nuclear genotype or W-linked parts of the genome (Rand et al. 2004; Meiklejohn et al. 2007; Dowling et al. 2008; Tieleman et al. 2009; Shen et al. 2010; Moghadam et al. 2012), female biology including biased habitat selection (as in Tonnis et al. 2005) and “divergence hitchhiking” (gene flow reduced as a side effect of strong divergent selection on genes involved in local adaptation; Via 2012). Nuclear gene flow within E. australis appears to be structured by IBD along the gradients in two perpendicular directions: latitudinal (main microsatellite structure, msat-sPC1; Fig. 3) and across the Great Dividing Range (the cryptic second structure in the microsatellites, msat-sPC2, directionally consistent with the distribution of Tess clusters from analysis of sequenced introns assuming three genetic clusters [Supplementary Material S5], which could be a signature of gene flow impeded by arrested dispersal or survival of females across the Range). The inference of IBD is also supported by congruent geographic structure in color of the upper tail-coverts, which changes from bright yellow to olive-green from north to south and east to west in E. australis (Ford 1979). Keast (1958) and Ford (1979) suggested that this color variation arose due to local ecotypic selection where brighter yellow is preferred in darker environments where it would be more useful for signaling. Thus, selective pressures driving evolution of female-linked traits and color variation could be different and their evolution unlinked.
Overall, dramatic mito-nuclear discordance in E. australis suggests that natural selection constrains mitochondrial but not selectively neutral nuclear gene flow along environmental gradients.

More research is needed to understand the mechanisms through which putative selection on female-linked traits in E. australis operates. Passerine birds from hotter and drier environments have lower basal metabolic rates and evaporative water loss (Tieleman et al. 2003; Williams and Tieleman 2005). Selection on mitochondrial haplotypes can drive such adaptive responses because the mitochondrial genome plays an important role in the regulation of energy metabolism in birds (Tieleman et al. 2009). For E. australis, the MK tests suggested purifying selection on ND2; thus, positive selection appears not to impact ND2 directly, but could act upon other mtDNA-encoded genes, or indirectly on nDNA-encoded genes that produce structural proteins imported into the mitochondria or on the joint mito-nuclear genotype, mito-nuclear hybrids being selected against in the contact zone between two haplogroups (Dowling et al. 2008; Ballard and Melvin 2010). Of all reviewed cases of mito-nuclear discordance (Toews and Brelsford 2012), only two other studies, both involving birds, report an association between mtDNA and environmental variation in the presence of nuclear gene flow when evidence for previous isolation is lacking. In the Rufous-collared Sparrow Zonotrichia capensis, mitochondrial but not nuclear gene flow was significantly reduced along elevational gradients, suggesting that mitochondrial haplotypes could be locally adapted (Cheviron and Brumfield 2009). A relationship between a gradient of aridity and mitochondrial but not nuclear or morphological diversity has been reported in the South African arid-zone endemic Karoo scrub-robin Cercotrichas coryphaeus, raising the possibility of common mechanisms for adaptations to extreme environmental conditions (Ribeiro et al. 2011).
Due to linked inheritance with the W-chromosome, mtDNA may also reflect female-specific selection on W-linked genes, as has been demonstrated for the evolution of gene expression on the W-chromosome (Moghadam et al. 2012). Thus, it may not be coincidental that all three studies (Cheviron and Brumfield 2009; Ribeiro et al. 2011; this study) reporting association between mtDNA and environmental variation in the presence of nuclear gene flow are from birds. Whatever the mechanism, any environmental selection acting on female-linked traits in E. australis would strongly inhibit the sympatry of mitochondrial lineages across a climatic gradient associated with the Great Dividing Range.


Our study of E. australis benefited from a combination of landscape genetic, spatial genetic, and coalescent analyses of multiple locus types and environmental variables in building an integrated understanding of evolutionary processes operating at different scales. It also presents a striking example of how geographically structured mtDNA diversity can unreliably reflect species’ evolutionary history when neutrality is assumed and not thoroughly tested. It is beneficial to incorporate tests for the effect on genetic variation of geographic position and environmental variation, in addition to IBD, into phylogeographic studies, and to try and tease apart their almost inevitable correlations. Correlation between mitochondrial genetic variation and geographic position, when distance is controlled for, warrants additional exploration of potential selection pressures driving evolution of female-linked traits. Discovery of two highly divergent haplogroups apparently under strong environmental selection within a putatively continuous population of E. australis (possibly a common pattern for species whose range spans environmental gradients) provides another example of divergence with gene flow (Pinho and Hey 2010), which has potentially profound implications for management and taxonomy. The evidence for female-linked environmental selection implies that mtDNA haplogroups are not ecologically exchangeable sensu Crandall et al. (2000). For example, translocation among haplogroup regions may result in negative fitness consequences, and on the evidence here would be an inadvisable management strategy. Our study illustrates how the use of mtDNA as a barcoding tool to define species or lineages (Baker et al. 2009) can be severely compromised without the knowledge to appreciate the role of mtDNA-correlated selection in such units.


This work was funded by the Australian Research Council Linkage Grant (LP0776322), the Victorian Department of Sustainability and Environment (DSE), Museum of Victoria, Victorian Department of Primary Industries, Parks Victoria, North Central Catchment Management Authority, Goulburn Broken Catchment Management Authority, CSIRO Ecosystem Sciences, and the Australian National Wildlife Collection Foundation. NA was funded by a Monash University Faculty of Science Dean's Postgraduate Scholarship and Birds Australia with additional support from the Holsworth Wildlife Research Endowment. Blood samples from Victoria were collected under DSE permits (number 10004294 under the Wildlife Act 1975 and the National Parks Act 1975, and number NWF10455 under section 52 of the Forests Act 1958). The authors thank A. Lill, N. Takeuchi, and other collectors of specimens and all agencies who granted permission to collect specimens, R. Palmer for curatorial assistance, S. Metcalfe for help with mitochondrial sequencing, and K. Harrisson for help with nuclear sequencing. Computationally intensive analyses were performed on the Monash Sun Grid courtesy of Monash eResearch. The authors thank the UK NERC Genepool facility for sequencing support. The authors are grateful to D. Irwin, J. Endler, the Evolution editorial team including K. Petren, and three anonymous reviewers for their comments on earlier drafts of the manuscript; to T. Jombart for advice on analyses of spatial genetic structures; and to G. Dolman, H. Ford, and P. Teske for helpful discussions.