•Even in cases in which geographic isolation appears to have driven the speciation of regional endemics, range shifts during the Pleistocene climatic oscillations may also have influenced their evolutionary history. Elucidating speciation history can provide novel insights into evolutionary dynamics following climatic oscillations.
•We demonstrated a sister relationship between the Japanese alpine endemic Cardamine nipponica and the currently allopatric, widespread arctic-alpine Cardamine bellidifolia (Brassicaceae) based on internal transcribed spacer (ITS) sequences and 10 other nuclear genes. Speciation history was inferred using demographic parameters under the isolation with migration model.
•The estimated demographic parameters showed that the population size of C. nipponica was similar to that of C. bellidifolia and that gene flow occurred exclusively from C. nipponica to C. bellidifolia after speciation.
•The inferred speciation history, which included gene flow, suggests that geographic barriers between the peripheral C. nipponica and the widespread C. bellidifolia were reduced during the Pleistocene. The asymmetric introgression implies that genetic isolation may have been involved in the speciation of C. nipponica. Our results suggest that even currently allopatric species may not have diverged solely under geographic isolation, and that their evolutionary history may have been influenced by Pleistocene range dynamics.
Geographic isolation reduces opportunities for gene flow within a species. Therefore, genetic differences accumulate across the species’ range and may lead to allopatric speciation. Given that marginal populations of widespread species are more likely than other populations to become isolated, they can play an important role in speciation. Consequently, peripheral speciation might be a common way by which regional endemics originate.
Population dynamics during the Pleistocene, however, had a large influence on evolutionary history. Species that avoided extinction expanded and retreated as a result of environmental changes during the Pleistocene, resulting in their current distribution and genetic structure throughout their range (Avise, 2000; Hewitt, 2000). A simple evolutionary consequence is that range contractions of widespread species have left isolated populations in marginal areas, leading to peripheral speciation of regional endemics. In this case, the population sizes of endemics would be much smaller than those of their ancestral and sister species, and a lack of migration would be observed after speciation (Coyne & Orr, 2004). However, Pleistocene population dynamics, such as range expansion, contraction and/or population extinction, could blur distribution patterns and complicate the evolutionary history of a species. Indeed, it is now recognized that introgression may have occurred between species during past Pleistocene range expansions and contractions at times when they were sympatric (Kikuchi et al., 2010) and that, in general, the effects of historical hybridization between species that are currently allopatric might be more frequent than previously thought (Cronn & Wendel, 2004; Pelser et al., 2011). Thus, the speciation process cannot simply be inferred based on the current range of a species. A better understanding of the evolutionary history of regional endemics is likely to provide novel insights into the evolutionary dynamics of species during climatic oscillations.
Here we assess the potential influence of Pleistocene population dynamics on the evolutionary history of a species through an examination of the speciation history of a Japanese alpine endemic, Cardamine nipponica (Brassicaceae). The high mountains in Japan range from 2000 to 3000 m in altitude and 35° to 44° in latitude, and harbor an alpine flora consisting of species that also occur in the Arctic (e.g. Empetrum nigrum, Diapensia lapponica, and Vaccinium vitis-idaea) as well as endemics that appear to be closely related to arctic-alpine species. Specifically, one-third of the Japanese alpine plant species are either endemic to this archipelago, but with closely related arctic species, or are themselves widely distributed in the Arctic (Shimizu, 1982, 1983). C. nipponica, a perennial, mostly autogamous herb with compound leaves (Kitakawa, 1999; H. Ikeda, pers. obs.), is a representative of such endemics. The narrow range of this endemic species is located at the periphery of the range of an apparently closely related, widespread arctic–alpine species, Cardamine bellidifolia, which also is perennial and mostly autogamous, but with simple leaves (Hultén & Fries, 1986; Brochmann & Steen, 1999). The current range of C. bellidifolia extends throughout the entire Circum-Arctic and into the high mountains of North America and East Asia (Fig. 1). Both species are diploid (2n = 16; Kučera et al., 2005; Warwick & Al-Shehbaz, 2006; Y. I. Iwatsubo, pers. obs.). The current distributions of C. nipponica and C. bellidifolia do not overlap, but C. bellidifolia extends southward to Sakhalin Island north of Japan (Fig. 1). Thus, one may reasonably hypothesize that the Japanese endemic C. nipponica is sister to the arctic-alpine C. bellidifolia and has diverged in the periphery of the range of this widespread species. However, a previous study inferred a nonsister relationship between these two species based on internal transcribed spacer (ITS) sequences (Carlsen et al., 2009). That study used DNA extracted from an old herbarium specimen of C. nipponica, and thus the risk of contamination or other errors during experimental procedures may have been high. Therefore, re-evaluating the phylogenetic position of C. nipponica and its relationship with C. bellidifolia would be worthwhile.
In this study, we first re-examined whether the Japanese endemic C. nipponica is sister to the arctic–alpine C. bellidifolia. Initially, we reanalyzed the previously published as well as new ITS sequences of the genus Cardamine. Because phylogenetic relationships inferred between close relatives are sometimes inconsistent across genes as a result of incomplete lineage sorting and/or gene flow, multilocus analyses are necessary to determine a robust sister species relationship. Therefore, using several putative sister species identified in the ITS analysis, further phylogenetic analyses were conducted based on 10 other nuclear genes. Given that C. nipponica and C. bellidifolia were indeed identified as sister taxa in our study, we next explored their speciation history statistically using demographic parameters under the isolation-with-migration (IM) model (Nielsen & Wakeley, 2001; Hey & Nielsen, 2004). Based on these parameters, we attempted to evaluate the potential peripheral speciation between the sister species and infer the evolutionary influence of Pleistocene climatic oscillations.
Materials and Methods
Sample collection and DNA extraction
One individual of C. nipponica from each of six populations from northern Japan and 12 populations from southern Japan was selected based on previous studies (Ikeda et al., 2008, 2009a,b, 2011), and previously extracted DNA or obtained sequences were used. These samples covered the entire range of the species and provided a random representation of its polymorphisms (Fig. 1; Ikeda et al., 2008). For C. bellidifolia, 20 individuals representing the entire distribution range were used (Fig. 1), and DNA was extracted from silica-dried leaf materials or leaf samples from herbarium specimens using the DNeasy Kit (Qiagen, Hilden, Germany). In addition, six and eight individuals of Cardamine alpina and Cardamine resedifolia, respectively, were included to test whether C. bellidifolia is a sister species of C. nipponica, and their DNA was extracted from silica-dried leaf materials using the DNeasy Kit (Qiagen). The detailed localities of the samples are shown in the supporting information (Supporting Information, Table S1). Our analyses included sequencing of multiple nuclear loci to reduce the bias found in single-locus genealogies as a result of incomplete lineage sorting and the effects of our relatively small sample size for each species. Cardamine glauca was selected as an outgroup for phylogenetic and population genetics analyses because this species is sister to the clade that includes C. bellidifolia L. (Carlsen et al., 2009) and C. nipponica Franch. et Savat. (see the Results section). The DNA of C. glauca was extracted from silica-dried leaf materials of eight individuals using the DNeasy Kit (Qiagen).
Sequencing ITS in C. nipponica and phylogenetic analyses
We sequenced the internal transcribed spacer (ITS) region from C. nipponica to reassess the phylogenetic position of this species. PCR amplification of ITS from C. nipponica was conducted according to a previous phylogenetic analysis (Carlsen et al., 2009). After gel purification with glass powder using GeneClean II (Bio 101, Vista, CA, USA), PCR products were sequenced directly from both directions using an ABI 3130 Genetic Analyzer (POP-7 polymer and 80 cm capillary; Applied Biosystems, Foster City, CA, USA). The sequences were aligned with those of other Cardamine species (88 species) from a database (Table S2) using Clustal X (Thompson et al., 1997). In total, 98 sequences were included, most of which had also been used in the previous phylogenetic study (Carlsen et al., 2009). The newly obtained sequences were deposited in the DDBJ (AB639120–AB639136). Rorippa indica and Rorippa islandica were used as outgroups (Table S2).
A maximum-likelihood (ML) tree was estimated with TREEFINDER (Jobb et al., 2004) using the default setting. Different substitution models were applied to the 5.8S rDNA gene (TIM2 + G) and its flanking regions (ITS1 and ITS2; GTR + G), which were estimated using the partition option of the software. The significance of branches was evaluated by 1000 bootstrap resamplings. In addition, the Bayesian tree was estimated using MrBayes, ver. 3.1.2 (Huelsenbeck & Ronquist, 2001; Ronquist & Huelsenbeck, 2003). Applying the partitioned models of substitutions, Markov chain Monte Carlo (MCMC) searches were run for 5 000 000 generations with four chains, and trees were sampled every 100 generations. Of the four chains, one was kept cold, and the other three were heated at a low-temperature setting (temperature = 0.02). After the first 50 000 trees were discarded as burn-in based on the stationarity of the likelihood values, a majority-consensus tree and posterior probabilities were obtained from the estimated distribution of trees. Because two replicate runs resulted in a consistent tree, posterior probabilities were summarized from the two runs.
PCR and sequencing of multiple nuclear genes
Polymerase chain reaction amplification of nuclear genes and sequencing procedures were conducted following a previous study (Ikeda et al., 2009a). PCR amplification was attempted for 10 loci, including six loci with previously reported primers (COP1, DET1, GA1, TFL1, CHS, and F3H; Kuittinen et al., 2002), three loci from portions of entire genes using previously designed primers (PHYA, PHYC, and CRY1; Table S3; Ikeda et al., 2009b; 2011), and one locus using primers designed specifically for this study (CO; Table S3). All PCR products were directly sequenced using an ABI 3130-avant Genetic Analyzer (POP-7 polymer and 36-cm capillary; Applied Biosystems). The entire sequences of the amplified PCR products were determined using both PCR primers for six loci (COP1, DET1, F3H, PHYA, PHYC, and CRY1). Sequences of portions of the amplified PCR fragments were determined using one of the PCR primers and an internal primer (Table S3) for four loci (CO, CHS, TFL1, and GA1), and these partial sequences were used for further analysis. In total, c. 5800 bp from 10 loci (491–750 bp per locus) were analyzed. These amplified loci are located in portions of the coding regions of genes (Kuittinen et al., 2002; Ikeda et al., 2009b, 2011), suggesting they were not tightly linked to one another.
Polymorphisms were confirmed by sequencing several independent PCR products. As expected from the reported diploidy of all four species (2n = 16; Kučera et al., 2005; Warwick & Al-Shehbaz, 2006; Y. I. Iwatsubo, unpublished), we obtained unambiguous electropherograms, and no sequences had more than one heterozygous position. Therefore, alleles of each sample were visually determined. Sequences of each locus were aligned using the Auto Assembler software (Applied Biosystems). All new sequences were deposited in the DDBJ (AB607347–AB607800). The relationships among alleles at each locus were determined based on sequences excluding insertions and deletions (indels) using TCS1.06 (Clement et al., 2000).
Multilocus analyses for determining phylogenetic relationships
The phylogenetic relationships among C. nipponica, C. bellidifolia, C. alpina, and C. resedifolia were estimated using Bayesian approaches with the sequences of 10 nuclear loci. In total, 42 samples with no missing data for any loci were used. In addition, one sample representing the sequence of C. glauca was used as an outgroup for each locus. Because samples of C. glauca frequently had heterozygous indels, we failed to obtain unambiguous sequences of all loci from a single sample. Instead, each locus was sequenced from several samples, and the resulting sequences were used for the outgroup sequences.
First, to assess to what degree the genetic composition of each individual was influenced by introgression, Bayesian clustering, which estimates clusters assuming Hardy–Weinberg equilibrium, was conducted based on combinations of alleles from multiple loci using STRUCTURE, ver. 2.1 (Pritchard et al., 2000; Falush et al., 2003). The probability of assigning individuals into clusters was estimated using an admixture model with correlated allele frequency using 250 000 generations, following a 100 000 generation burn-in period. The number of clusters (K) was set from one to seven, and 20 runs were repeated for each K. The symmetric similarity coefficient (SSC: H’) was calculated between all pairs of runs for the same K using CLUMPP (Jakobsson & Rosenberg, 2007). The final assignment of each individual was estimated as the mean overall configuration. In addition, the plausible number of clusters was evaluated according to the ad hoc statistic ΔK, which is based on the second-order rate of change of the likelihood function (Evanno et al., 2005). Because C. nipponica and C. bellidifolia are mainly autogamous, the assumption of Hardy–Weinberg equilibrium might be violated, which may lead to spurious signals of population substructure (Falush et al., 2003). Bayesian clustering was therefore also implemented using the program InStruct (Gao et al., 2007), which extends STRUCTURE by eliminating the assumption of Hardy–Weinberg equilibrium, using the same settings as for STRUCTURE.
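The ΔK criterion can be illustrated with a minimal Python sketch. It assumes the commonly used form of the statistic, in which the absolute second-order difference of the mean log-probability across replicate runs is divided by the standard deviation of the log-probability at that K; the function name and the input layout are hypothetical and are not part of STRUCTURE or CLUMPP.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both K-1 and K+1 available.

    lnp_by_k maps each tested K to a list of ln P(D | K) values, one per
    replicate run (hypothetical input layout, at least two runs per K).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # deltaK is undefined at the smallest and largest K
        # Absolute second-order rate of change of the mean log-probability.
        second_order = abs(means[k + 1] - 2.0 * means[k] + means[k - 1])
        # Scale by the run-to-run standard deviation at this K.
        result[k] = second_order / stdev(lnp_by_k[k])
    return result
```

The K with the largest ΔK would then be taken as the most plausible number of clusters. Note that ΔK cannot be evaluated for the smallest or largest K tested (here, K = 1 and K = 7).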
To construct the phylogenetic tree, concatenated data sets for 10 nuclear loci were analyzed using partitioned Bayesian and ML analyses. Each locus was a single partition with its own substitution model and unlinked parameters, with the exception of those used for trees. Substitution models for each partition were estimated using TREEFINDER (Jobb et al., 2004). The Bayesian tree was estimated using MrBayes, ver. 3.1.2 (Huelsenbeck & Ronquist, 2001; Ronquist & Huelsenbeck, 2003). MCMC searches were run for 1 000 000 generations with four chains, and trees were sampled every 100 generations. Of the four chains, one was kept cold, and three were heated at a low-temperature setting (temp = 0.005). After the first 100 000 trees were discarded as burn-in based on the stationarity of the likelihood values, a majority-consensus tree and posterior probabilities were obtained from the estimated distribution of trees. Because two replicate runs resulted in a consistent tree, posterior probabilities were summarized from the two runs. The ML tree was estimated with TREEFINDER (Jobb et al., 2004) using the default setting. The significance of each branch was evaluated by 1000 bootstrap resamplings. We conducted both Bayesian and ML analyses using two data sets that either did or did not include corresponding database sequences of Arabidopsis thaliana (ecotype = Columbia). When A. thaliana was included, only exon sites were used, whereas all sites were used for the data set without A. thaliana.
In addition to the concatenated analyses, a species tree was estimated using a Bayesian hierarchical method implemented by BEST, ver. 2.3 (Liu, 2008). This approach employs coalescent theory across multiple loci to generate a posterior distribution of species trees and can represent the independent evolutionary history across loci (Liu et al., 2008). The same data sets used in the concatenated analyses were used for estimating the species tree, while all parameters, including those for trees, were unlinked. MCMC searches ran for 50 000 000 generations with eight chains, in which one was kept cold and seven were heated at a low-temperature setting (temp = 0.10). Trees were sampled every 1000 generations, and the first 5000 trees were discarded as burn-in based on the stationarity of likelihood values. A majority-rule consensus tree was obtained from the estimated distribution of species trees. Because two replicate runs resulted in consistent topology, the posterior probabilities of the species tree were summarized from the two runs. A species tree was solely obtained from data sets without A. thaliana.
Summary statistics and neutrality tests
The average number of pairwise nucleotide differences per site (π; Nei, 1987), the genetic divergence from C. glauca (KCg), and the minimum number of recombinations (RM) following the four-gamete test (Hudson & Kaplan, 1985) were calculated for C. nipponica and C. bellidifolia. To evaluate whether evolution at each locus follows a standard neutral model, Tajima’s D (Tajima, 1989) was estimated, and any deviation from neutral expectation was evaluated using 10 000 coalescent simulations. These summary statistics were estimated using DnaSP, ver. 5.0 (Librado & Rozas, 2009). Furthermore, the HKA test (Hudson et al., 1987) was conducted to evaluate whether all loci evolved under neutral equilibrium, using the HKA program (http://lifesci.rutgers.edu/~heylab/HeylabSoftware.htm#HKA). Using the HKA test, we evaluated whether nucleotide diversities within a given species (i.e. C. nipponica or C. bellidifolia) followed expectations based on divergences from C. glauca.
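As an illustration of the two main summary statistics, the following Python sketch computes π and Tajima's D from a set of aligned sequences using the standard formulas of Tajima (1989). It is not the DnaSP implementation: it assumes equal-length sequences without gaps or missing data, counts every polymorphic column as one segregating site, and does not perform the coalescent simulations used to assess significance. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """pi: mean pairwise differences divided by the alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D; returns None when there are no segregating sites."""
    n = len(seqs)
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    if s == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

Negative values of D indicate an excess of rare variants relative to neutral equilibrium expectations, and positive values an excess of intermediate-frequency variants.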
Demographic parameters of the IM model
To determine speciation history, the demographic parameters of the IM model (population size: θ1, θ2, θA; migration rate: m1, m2; divergence time: t), which are expressed in terms of a mutation rate (μ), were calculated using the IMa program (Hey & Nielsen, 2007). The IM model assumes that an ancestral species with population size NA (= θA/4μ) split into two species at T (= t/μ) yr before present (BP). After speciation, the descendants had distinct population sizes (N1 = θ1/4μ, N2 = θ2/4μ) and experienced constant migrations (m1, m2; migration rates per mutation event). Because the model assumed panmixia for each population, the estimated migration rates (m1, m2) represent gene exchange between sister species. The program IMa estimates probability density functions for parameters under the model and assesses the posterior probability densities for model parameters using MCMC methods (Hey & Nielsen, 2004, 2007). In addition, the likelihood of the nested models, such as the model without migration (m1 = m2 = 0) or that with equal population sizes (θ1 = θ2 = θA), was calculated using the estimated functions. In comparing the likelihood of the IM model with each nested model using the likelihood ratio test, the significance of the IM model, as well as the estimated demographic parameters, was evaluated. The IM model is based on several assumptions such as neutrality of loci, no recombination within loci, free recombination across loci, and no migrations from an unsampled third species to the focal species (Nielsen & Wakeley, 2001; Hey & Nielsen, 2004). However, a recent simulation study showed that estimated demographic parameters were quite robust to a violation of assumptions such as the population structure within species (Strasburg & Rieseberg, 2010), while a violation in the mode of speciation such as a change in migration rates resulted in a false estimation of demographic parameters (Becquet & Przeworski, 2009). In our study system, none of the loci appeared to violate these assumptions (see the Results section), and the other candidate species, C. resedifolia and C. alpina, are restricted to central and southern European mountains; gene exchange with C. nipponica and C. bellidifolia thus appears unlikely.
Running IMa involved two steps: M-mode and L-mode. First, functions of the model parameters were estimated in M-mode. After several preliminary runs using a wide range of prior probability densities (m1 = m2 = 20; q1 = 20; t = 20), demographic parameters were estimated by 5 × 10⁷ MCMC steps following a burn-in period of 500 000 steps with Metropolis coupling implemented using 25 chains under the geometric increment model (h1 = 0.925; h2 = 0.625) and with a narrower prior probability density (m1 = m2 = 2.5; q1 = 2.5; t = 5), from which 2.5 × 10⁵ genealogies were saved. The M-mode run was repeated three times with different random seeds to check for convergence. Using the functions of model parameters estimated in three independent runs in M-mode, the marginal posterior distribution and the maximum likelihood estimates (MLEs) of demographic parameters were predicted by running IMa in L-mode. In addition, likelihood ratio tests were conducted in L-mode to evaluate whether the IM model fitted better than the nested models assuming unidirectional migration (m1 or m2 = 0) or equal population sizes (θ1 = θ2, θ1 = θA, θ2 = θA, or θ1 = θ2 = θA). Given that our sequence data had no sites with multiple recurrent mutations, an infinite site model of mutation was applied to all loci. To remove intralocus recombinations from loci with recombinations (COP1, DET1, and GA1), the longest blocks of sequences without recombinations were selected based on a four-gamete test (Hudson & Kaplan, 1985) and used for parameter estimation. The estimated divergence time and population sizes were scaled by the mutation rate for C. nipponica used in our previous study (8.67 × 10⁻⁹ substitutions per site yr⁻¹; Ikeda et al., 2009a). For scaling population sizes, we assumed 1 yr per generation because both C. nipponica and C. bellidifolia can flower and fruit a few months after germination in the laboratory (H. Ikeda, pers. obs.).
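The selection of a recombination-free block can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure actually scripted for the study: sequences are aligned, of equal length, and free of gaps; only biallelic sites are tested; and the function names are hypothetical. Two sites are treated as incompatible when all four two-site haplotypes (gametes) are observed, and the longest run of alignment columns that contains no incompatible pair is returned.

```python
from itertools import combinations


def incompatible_pairs(seqs):
    """Pairs of biallelic columns (i, j), i < j, showing all four gametes."""
    columns = list(zip(*seqs))
    biallelic = [i for i, col in enumerate(columns) if len(set(col)) == 2]
    pairs = []
    for i, j in combinations(biallelic, 2):
        # Four distinct two-site haplotypes imply recombination (or a
        # recurrent mutation) somewhere between columns i and j.
        if len(set(zip(columns[i], columns[j]))) == 4:
            pairs.append((i, j))
    return pairs


def longest_compatible_block(seqs):
    """Return (start, end), inclusive, of the longest block of columns
    that does not contain both members of any incompatible pair."""
    length = len(seqs[0])
    pairs = incompatible_pairs(seqs)
    # A maximal block starts at column 0 or just after the left member
    # of some incompatible pair.
    starts = {0} | {a + 1 for a, _ in pairs}
    best = (0, -1)
    for start in sorted(starts):
        limits = [b - 1 for a, b in pairs if a >= start]
        end = min(limits) if limits else length - 1
        if end - start > best[1] - best[0]:
            best = (start, end)
    return best
```

When several blocks have the same length, this sketch keeps the leftmost one; a real analysis might instead prefer the block retaining the most segregating sites.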
ITS phylogeny of the genus Cardamine including C. nipponica
Internal transcribed spacer (ITS) sequences were obtained from all samples of C. nipponica except Cn_17, in which 10 distinct types were found among 615 bp sequences (Table S1). Although most branches of the ML tree were not well supported, all sequences of C. nipponica formed a well-supported monophyletic group with C. bellidifolia, C. alpina, and C. resedifolia except for the previously reported sequence (EU819349) that grouped with C. microzyga in agreement with the previous paper (Carlsen et al., 2009) (bootstrap, 100%; posterior probability, 1.00; Fig. 2). In our analysis, the C. nipponica-bellidifolia-alpina-resedifolia clade (bootstrap, 100%; posterior probability, 1.00) was sister to another well-supported monophyletic group including C. glauca and C. pancicii (bootstrap, 93%; posterior probability, 1.00). Notably, C. nipponica and C. bellidifolia formed a well-supported monophyletic group (bootstrap, 96%; posterior probability, 1.00), supporting a sister relationship between these two species. Furthermore, all our new C. nipponica sequences except ITS_10 formed a well-supported monophyletic group (bootstrap, 97%; posterior probability, 1.00), whereas ITS_10, which was detected in Shokanbetsudake (Table S1), grouped with C. bellidifolia (bootstrap, 52%; posterior probability, 0.99). Most of the C. nipponica samples from southern Japan (ITS_6–ITS_9) formed a monophyletic group (bootstrap, 99%; posterior probability, 1.00).
Sister relationship between C. nipponica and C. bellidifolia inferred from other nuclear regions
Sequences of the 10 nuclear genes (in total c. 5800 bp) were obtained from most samples of C. nipponica, C. bellidifolia, C. alpina, and C. resedifolia, and the alleles in each sample were determined. Although the alleles in each species have similar sequences, many individual alleles were shared among two or more of the species (Fig. S1). For example, C. bellidifolia shared alleles with other species at seven of the 10 loci. At five of these seven loci, C. bellidifolia shared alleles with C. nipponica, whereas it shared alleles at only one or two loci with C. alpina and C. resedifolia.
Despite the occurrence of shared alleles, Bayesian clustering and the phylogenetic tree inferred from the concatenated data sets showed that the four species could be unambiguously distinguished from each other. In the Bayesian clustering implemented by STRUCTURE, highly consistent configurations of individual assignments were detected at K = 5 and 6 (H’ = 0.95 and 0.96, respectively; Fig. S2). In particular, the ΔK was the highest at K = 5, representing the most probable number of clusters (Fig. S2). At K = 5, the genetic composition of each species was unambiguously distinguished from that of the others (Fig. S2). C. nipponica was divided into two distinct clusters, one corresponding to northern Japan and the other to southern Japan (Fig. S2). These patterns of individual assignments were consistent with the results of the clusterings using InStruct (Fig. S3).
The inferred phylogenetic relationships were mostly consistent between the Bayesian and ML approaches (topologies of the Bayesian analyses not shown). Whether A. thaliana was included or not, the phylogenetic analyses of the concatenated data sets consistently resolved three major clades: C. alpina, C. resedifolia, and C. nipponica–C. bellidifolia (Figs 3, S4; bootstrap, > 95%; posterior probabilities, 1.00). In addition, C. nipponica was consistently monophyletic (bootstrap, > 99%; posterior probabilities, 1.00), whereas C. bellidifolia was paraphyletic (Fig. 3) or did not form a well-supported clade (bootstrap, < 50%; but posterior probability, 0.97; Fig. S4). When A. thaliana was included, Cb_3, an individual from arctic Russia (Table S1) clustered with C. nipponica, but this relationship was not well supported (bootstrap < 50%). Moreover, Cb_3 clustered with other C. bellidifolia in the Bayesian tree (data not shown) and trees estimated excluding A. thaliana (Fig. S4). C. nipponica from southern Japan (Cn_7–Cn_18) consistently formed a monophyletic group (bootstraps, 90%, 87%; posterior probabilities, 1.00, 1.00), but C. nipponica from northern Japan was not monophyletic (Cn_1–Cn_6). The species tree estimated using the Bayesian hierarchical method also showed a sister relationship between C. nipponica and C. bellidifolia (posterior probability, 1.00; Fig. 3).
Analysis of polymorphisms and neutrality tests
Nucleotide diversity was variable across loci (Table 1; C. nipponica, π = 0.0002–0.0049; C. bellidifolia, π = 0.0006–0.0029), whereas the degrees of diversity within species were not significantly different (Wilcoxon rank-sum test: P > 0.3). The patterns of polymorphisms across loci did not significantly deviate from those expected from neutral equilibrium (Table 1). In addition, under the assumption of neutral evolution at all loci, the degree of polymorphisms across loci did not deviate from expectations based on divergences from C. glauca (HKA test, C. nipponica, χ² = 6.00, P = 0.74; C. bellidifolia, χ² = 6.53, P = 0.69). Deviations from neutral evolution can be caused by various factors such as population size changes, population subdivisions, inbreeding, or natural selection. Several biological characteristics of C. nipponica and C. bellidifolia, such as population differentiation between northern and southern Japan, frequent inbreeding in small alpine or arctic populations, or range shifts following the Pleistocene climatic oscillations may have caused deviation from neutral evolution. Moreover, the loci analyzed here are part of genes that are involved in ecologically important traits such as flowering time (e.g. PHYC), and therefore may not follow neutral evolution. However, the lack of a significant rejection of the neutrality tests suggests that such biological factors did not detectably influence the evolutionary patterns of polymorphisms at the analyzed loci. Therefore, all loci appeared to have evolved in a neutral manner and were thus appropriate for estimating demographic parameters of the IM model.
Table 1. Summary statistics for each locus in Cardamine nipponica and Cardamine bellidifolia
n, number of sequences for analysis; bp, length of sequences for analysis; π, average number of pairwise nucleotide differences per site calculated based on all sites; D, Tajima’s D; KCg, genetic divergence from Cardamine glauca; RM, minimum number of recombinations. *RM = 1 when all sequences of C. nipponica and C. bellidifolia are included.
Demographic parameters of the IM model
For three independent M-mode runs of IMa, consistent MLEs and highest posterior densities (HPDs) were obtained for each parameter (Table S4). In the L-mode run, the posterior probability distributions of all parameters except for population size in the ancestral populations exhibited single peaks (Fig. 4), indicating the robustness of the MLEs of each demographic parameter. For ancestral population size, the peak was located near 0 (θA = 0.001; Table S4), and a high posterior probability plateau was found (c. 0.2; Fig. 4).
Likelihood ratio tests showed that the IM model was a better fit than the nested model when assuming that migration from C. nipponica to C. bellidifolia was zero (m2 = 0, P < 0.05; Table 2), whereas it was not significantly better when assuming that migration from C. bellidifolia to C. nipponica was zero (m1 = 0, P = 0.48; Table 2). Thus, migrations after speciation appeared to have occurred unidirectionally from C. nipponica to C. bellidifolia (m2 = 0.19, 90% HPD = 0.04–0.47; Table S4). In addition, the IM model was a better fit than the nested model when assuming that all population sizes (θ1, θ2, θA) were equal (P < 0.05; Table 2), whereas the fit was not significantly better than that of the model that assumed equal population sizes for C. nipponica (θ1 = 0.88) and C. bellidifolia (θ2 = 0.86, P = 0.95; Table 2). Therefore, the current effective population sizes of C. nipponica and C. bellidifolia were inferred to be similar to each other and larger than the population size of their ancestral species. The scaled effective population sizes are 51.8 × 10³ and 50.3 × 10³ for C. nipponica and C. bellidifolia, respectively, and 0.07 × 10³ for the ancestral population (Table S4). The current population sizes are higher than those of the annual selfing Capsella species (8.1–32.9 × 10³; Slotte et al., 2008), but smaller than that of the annual selfing Oryza nivara (c. 210 × 10³; Zheng & Ge, 2010).
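For tests in which a migration rate is fixed at zero, the constrained value lies on the boundary of the parameter space, so the null distribution of the test statistic is a mixture rather than a plain χ² distribution. The Python sketch below shows how such a P-value could be obtained, assuming the usual 50:50 mixture of χ² distributions with zero and one degrees of freedom for a single boundary parameter. The function names are hypothetical, the input is the statistic 2 × (ln L of the full model − ln L of the nested model), and no values from Table 2 are used.

```python
from math import erfc, sqrt


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    # If Z is standard normal, P(Z**2 > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(x / 2.0))


def lrt_pvalue_boundary(statistic):
    """P-value when one parameter is fixed on its boundary (e.g. m = 0).

    Assumes a 50:50 mixture of chi-square(0) and chi-square(1); the
    chi-square(0) component is a point mass at zero and contributes
    nothing to the tail for a positive statistic.
    """
    if statistic <= 0.0:
        return 1.0
    return 0.5 * chi2_sf_df1(statistic)
```

Under this assumption the 5% critical value is about 2.71 rather than the 3.84 of an ordinary one-degree-of-freedom test, so ignoring the mixture would make the test for zero migration conservative.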
Table 2. Likelihood ratio tests of the standard isolation-with-migration (IM) model to nested models with restrained migration rates or population sizes
North and south
*χ2 distribution is expected to be a mixture. LLR, log-likelihood ratio.
Standard IM model
θ1, θ2, θA, m1, m2 = 0
θ1, θ2, θA, m1 = 0, m2
θ1 = θ2 = θA, m1, m2
θ1 = θ2, θA, m1, m2
θ1 = θA, θ2, m1, m2
θ2 = θA, θ1, m1, m2
The time of the split between C. nipponica and C. bellidifolia was estimated as t = 0.89 (90% HPD = 0.56–1.27; Table S4, Fig. 4). After calibration with the mutation rate (8.67 × 10⁻⁹ substitutions per site yr⁻¹; Ikeda et al., 2009a), the estimated divergence time between C. nipponica and C. bellidifolia was 208 000 (90% HPD = 132 000–299 000) yr BP (Table S4), coinciding with the penultimate interglacial period (180 000–230 000 yr BP, Gibbard & Van Kolfschoten, 2004).
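The conversion from IMa's scaled parameters to demographic units can be checked with a short Python sketch. It assumes the standard scaling, in which θ = 4Nμ and t is the divergence time multiplied by μ, where μ is a per-locus mutation rate per year, combined with the 1 yr per generation stated in the Methods. The per-locus rate is not reported directly in the text, so here it is back-calculated from the reported θ1 and N1; that back-calculation, the function names, and the constant names are illustrative assumptions.

```python
# Values as reported in the text.
THETA_NIPPONICA = 0.88       # theta1
THETA_BELLIDIFOLIA = 0.86    # theta2
N_NIPPONICA = 51.8e3         # scaled effective size of C. nipponica
T_SCALED = 0.89              # divergence time in mutational units
MU_PER_SITE = 8.67e-9        # substitutions per site per year


def implied_mu_per_locus(theta, effective_size):
    """Per-locus mutation rate implied by theta = 4 * N * mu."""
    return theta / (4.0 * effective_size)


def effective_size(theta, mu_locus):
    """N = theta / (4 * mu)."""
    return theta / (4.0 * mu_locus)


def years_since_split(t_scaled, mu_locus):
    """T = t / mu, in years when mu is a per-year rate."""
    return t_scaled / mu_locus


mu_locus = implied_mu_per_locus(THETA_NIPPONICA, N_NIPPONICA)
split_years = years_since_split(T_SCALED, mu_locus)
n_bellidifolia = effective_size(THETA_BELLIDIFOLIA, mu_locus)
# Effective locus length implied by the per-site rate (an inference,
# not a value given in the text).
implied_locus_length = mu_locus / MU_PER_SITE
```

With these inputs the sketch gives a divergence time of roughly 2.1 × 10⁵ yr and an effective size for C. bellidifolia of roughly 5.1 × 10⁴, both within about 1% of the reported values; the small differences are consistent with rounding of the published θ and t estimates.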
Sister relationship between C. nipponica and C. bellidifolia
Our results contradict the finding of the previous phylogenetic analysis of the genus Cardamine based on ITS (Carlsen et al., 2009), in which C. nipponica was reported to form a clade with the Chinese species C. microzyga. In our ITS phylogenetic tree, no sequences of C. nipponica except for the previously published sequence (EU819349) grouped with C. microzyga; instead, C. nipponica was assigned to a clade including C. bellidifolia, C. alpina, and C. resedifolia (Fig. 2). The previous investigation used C. nipponica DNA extracted from a single old herbarium specimen. Although this specimen appeared morphologically to belong to C. nipponica, its ITS may not have been correctly amplified. By contrast, the present study inferred a consistent phylogeny from several independent samples. We conclude that the present phylogenetic analyses give a more reliable phylogenetic position of C. nipponica.
Our new ITS data revealed a robust sister relationship between C. nipponica and C. bellidifolia (Fig. 2). This sister relationship was initially inferred from a single-locus phylogeny, but was thereafter supported by phylogenetic analyses of a concatenated data set of 10 other nuclear genes (Fig. 3) and the estimated species tree (posterior probability, 1.00; Fig. 3). Consequently, we consider the inferred sister relationship between C. nipponica and C. bellidifolia to be robust, although it might be argued that interspecific introgression may have caused an apparent close relationship between the two species. For example, the Bayesian species tree estimation assumes that discrepancies between gene trees and species trees are the result of incomplete lineage sorting rather than gene transfer such as interspecific introgression (Liu, 2008). Indeed, the parameters based on the IM model suggested the occurrence of introgression from C. nipponica to C. bellidifolia (Fig. 4, Table 2). The individuals used for the phylogenetic analyses nevertheless lacked unambiguous signals of genetic admixture from other species (Fig. S2). Thus, we conclude that our species tree estimated from multilocus data, and particularly the inferred sister relationship between C. nipponica and C. bellidifolia, were not biased by interspecific introgression.
Evolutionary history of the endemic C. nipponica and the widespread C. bellidifolia
The present demographic parameters cannot directly demonstrate any spatial context of speciation. Considering the current, exclusively peripheral occurrence of C. nipponica, we can simply hypothesize peripheral speciation involving divergence of C. nipponica in marginal populations of a widespread ancestral arctic species. Because arctic-alpine plants at lower latitudes experienced range contraction during interglacial periods (e.g. Kropf et al., 2003; Ikeda et al., 2006; Ikeda & Setoguchi, 2007), the inferred penultimate interglacial divergence of C. nipponica and C. bellidifolia (c. 210 000 yr BP, Table S4; Gibbard & Van Kolfschoten, 2004) supports peripheral speciation. In addition, according to the migration dynamics of range expansions, colonizing species tend to suffer from introgression from local species, resulting in asymmetric introgression (Currat et al., 2008; Petit & Excoffier, 2009). Given that arctic plants had opportunities to migrate southward during glacial periods, the presently observed asymmetric introgression (Fig. 4, Table 2) can be interpreted as resulting from colonization of C. bellidifolia into the range of C. nipponica following peripheral speciation.
However, we found that our results were not sufficient to demonstrate a peripatric speciation model. First, the HPD of the divergence time is wide (132 000–299 000 yr BP) and encompasses both glacial and interglacial periods. Secondly, the mutation rate used for scaling the divergence time was calculated based on the silent mutation rate of CHS (1.5 × 10⁻⁸ (95% CI = 1.0–2.0 × 10⁻⁸) substitutions per site yr⁻¹; Koch et al., 2000). According to several studies on Arabidopsis (Ossowski et al., 2010), Brassicaceae (Beilstein et al., 2010), and angiosperms (Wolfe et al., 1989), the rate of CHS is rather high, and its use may therefore lead to an underestimate of the divergence time. Moreover, the mutation rate applied here is itself uncertain, as shown by its confidence interval (calculated 95% CI = 4.78–11.56 × 10⁻⁹ substitutions per site yr⁻¹; Koch et al., 2000; Ikeda et al., 2009a). Furthermore, asymmetric gene flow could have resulted from other factors causing C. nipponica and C. bellidifolia not to cross in a symmetric manner. In angiosperms, phenological differences in flowering time, morphological differences in floral organs, and/or physiological incompatibility between related species can cause asymmetric crossing (Tiffin et al., 2001). In addition, the signature of gene flow from C. bellidifolia to C. nipponica may have been selectively eliminated, resulting in the asymmetry. This hypothesis would be plausible if C. nipponica were better adapted than C. bellidifolia to the environment in the Japanese archipelago (e.g. positive selection for PHYE along a latitudinal gradient; Ikeda et al., 2009b).
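To illustrate how uncertainty in the mutation rate propagates to the divergence time, the sketch below rescales the point estimate of 208 000 yr across the 95% CI of the rate quoted above. It assumes that absolute time scales inversely with the rate while the scaled estimate t is held fixed, and it ignores the HPD of t itself. The function name is illustrative.

```python
def rescale_time(time_years: float, rate_used: float, rate_new: float) -> float:
    """Rescale a divergence time when the assumed mutation rate changes.

    Assumes time is inversely proportional to the rate, with the
    mutation-scaled estimate held fixed.
    """
    return time_years * rate_used / rate_new


RATE_USED = 8.67e-9            # substitutions per site per year (from the text)
RATE_CI = (4.78e-9, 11.56e-9)  # calculated 95% CI of the rate (from the text)
T_YEARS = 208_000              # point estimate of the divergence time

youngest = rescale_time(T_YEARS, RATE_USED, RATE_CI[1])  # fastest rate
oldest = rescale_time(T_YEARS, RATE_USED, RATE_CI[0])    # slowest rate
print(f"{youngest:,.0f} to {oldest:,.0f} yr BP")
```

Under these assumptions the point estimate alone would range from roughly 156 000 to 377 000 yr BP, an interval wider than the reported HPD.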
In addition, if peripheral speciation did occur, the population size of the peripheral C. nipponica would be expected to be smaller than that of the widespread C. bellidifolia, which would have a population size similar to that of the ancestral population. In fact, the estimated population size of C. nipponica was not significantly different from that of C. bellidifolia, whereas they were both significantly larger than their ancestral population (Fig. 4, Table 2). This pattern of population sizes is apparently contradictory to the expected situation under peripheral speciation. In addition, we cannot with certainty reject an alternative hypothesis that C. nipponica once had a wider range but suffered from population extinction following Pleistocene environmental change. Consequently, we cannot draw definite conclusions about peripheral speciation of C. nipponica from the results of this study.
Gene flow between currently allopatric species
In accordance with recent findings that speciation may occur with gene flow (e.g. cultivated tomato and its wild relatives, Städler et al., 2005; Capsella bursa-pastoris and Capsella rubella, Slotte et al., 2008; Helianthus annuus and Helianthus petiolaris, Strasburg & Rieseberg, 2008; Oryza rufipogon and O. nivara, Zheng & Ge, 2010), our results showed that gene flow can occur even in cases involving a peripheral endemic and a widespread species. A recent review showed that many studies implementing the IM model found nonzero gene flow (Pinho & Hey, 2010), calling for careful interpretation of migration rates. In this study, the estimated migration rate (m2 = 0.19) was significantly different from zero (P < 0.05; Table 2), suggesting the robustness of the inference of gene flow after speciation. Thus, our finding is reasonable given that range shifts during the Pleistocene did reduce geographic barriers between currently allopatric species and allowed interspecific gene flow. Notably, the introgression occurred solely from C. nipponica to C. bellidifolia, implying that genetic isolation may have been involved in the speciation of C. nipponica.
Historical gene flow, however, violates the assumption of the IM model, that is, that migration occurred constantly between two descendant species after they diverged (Hey & Nielsen, 2004). According to a simulation study (Becquet & Przeworski, 2009), this type of violation of the speciation model results in a lack of detection of migrations. Thereby, speciation with gene flow would be falsely estimated as occurring without gene flow. By contrast, we detected gene flow between these currently allopatric species (Fig. 4, Table 2), indicating a robust history of speciation with gene flow for C. nipponica and C. bellidifolia. Migration from C. bellidifolia to C. nipponica, however, might be underestimated because of violation of the model, resulting in apparent lack of migration. Nevertheless, a speciation history with gene flow can be robustly inferred for these allopatric species, a finding that contrasts with the previous finding of intraspecific differentiation of C. nipponica originating from complete geographic isolation between northern and southern Japan (Ikeda et al., 2009a).
Accordingly, even if two sister species are currently allopatric, they may not have diverged from each other in geographic isolation or subsequently always been allopatrically distributed. A historical widespread range followed by population extinction of C. nipponica and/or widespread range expansion of C. bellidifolia would have resulted in the current distribution of the sister species, and therefore Pleistocene population dynamics may have complicated their evolutionary history. The IM model assumes a simple demographic history such as constant population sizes and migration rates. These estimated parameters cannot be expected to sufficiently reflect complicated demographic histories affected by the recurrent Pleistocene glacial cycles. Future challenges such as the use of even larger numbers of loci and a parameter-rich evolutionary model implemented with Approximate Bayesian Computation (Bertorelle et al., 2010) may resolve the apparent contradictions observed in our analyses.
Our phylogenetic analyses inferred a robust sister relationship between the regional endemic C. nipponica and the widespread C. bellidifolia. However, our analysis of demographic parameters throughout speciation, which showed the two species to have equal population sizes, did not support peripheral speciation of these species, which had appeared plausible based on their current ranges. In fact, the analysis showed that speciation occurred with gene flow, consistent with a scenario in which Pleistocene climatic oscillations reduced geographic barriers between these currently allopatric species. In particular, the inferred asymmetric introgression from C. nipponica to C. bellidifolia implies that genetic isolation was involved in the evolution of C. nipponica. Consequently, our study suggests that even if two sister species are currently allopatric, they may not have diverged solely under geographic isolation. Pleistocene population dynamics, such as a historical widespread range and population extinction of C. nipponica and/or widespread expansion of C. bellidifolia, appear to have influenced their evolutionary history and allowed historical gene flow between them.
We are grateful to K. Marhold and J. Lihova for providing the leaf materials of C. alpina, C. resedifolia, and C. glauca, to L. E. Johannessen for assistance, Y. Iwatsubo for determining the chromosome number of C. nipponica, and H. Tachida and Y. Mitsui for advice regarding the IM analysis. We also thank anonymous reviewers and the editor for their constructive comments on the manuscript. This study was supported by a Grant-in-Aid for Scientific Research (H.S.) and a Grant-in-Aid for JSPS Fellows (H.I.).