Holocene chloroplast genetic variation of shrubs (Alnus alnobetula, Betula nana, Salix sp.) at the siberian tundra‐taiga ecotone inferred from modern chloroplast genome assembly and sedimentary ancient DNA analyses

Abstract Climate warming alters plant composition and population dynamics of arctic ecosystems. In particular, an increase in relative abundance and cover of deciduous shrub species (shrubification) has been recorded. We inferred genetic variation of common shrub species (Alnus alnobetula, Betula nana, Salix sp.) through time. Chloroplast genomes were assembled from modern plants (n = 15) from the Siberian forest‐tundra ecotone. Sedimentary ancient DNA (sedaDNA; n = 4) was retrieved from a lake on the southern Taymyr Peninsula and analyzed by metagenomics shotgun sequencing and a hybridization capture approach. For A. alnobetula, analyses of modern DNA showed low intraspecies genetic variability and a clear geographical structure in haplotype distribution. In contrast, B. nana showed high intraspecies genetic diversity and weak geographical structure. Analyses of sedaDNA revealed a decreasing relative abundance of Alnus since 5,400 cal yr BP, whereas Betula and Salix increased. A comparison between genetic variations identified in modern DNA and sedaDNA showed that Alnus variants were maintained over the last 6,700 years in the Taymyr region. In accordance with modern individuals, the variants retrieved from Betula and Salix sedaDNA showed higher genetic diversity. The success of the hybridization capture in retrieving diverged sequences demonstrates the high potential for future studies of plant biodiversity as well as specific genetic variation on ancient DNA from lake sediments. Overall, our results suggest that shrubification has species‐specific trajectories. The low genetic diversity in A. alnobetula suggests a local population recruitment and growth response of the already present communities, whereas the higher genetic variability and lack of geographical structure in B. nana may indicate a recruitment from different populations due to more efficient seed dispersal, increasing the genetic connectivity over long distances.


| INTRODUC TI ON
Ongoing climate change is altering plant composition and biomass in arctic ecosystems. A major contributor to the observed changes is the increase in relative abundance and cover of deciduous shrub species including alder, birch, and willow. This process is colloquially known as shrubification (Crofts et al., 2018;Myers-Smith et al., 2011). Increased shrub cover alters ecosystem properties.
Changes in biodiversity and carbon budgets due to shrubification have been reported, potentially affecting climate warming effects (Blok et al., 2010;Crofts et al., 2018;Lantz et al., 2013;McLaren & Turkington, 2010;Myers-Smith et al., 2011;Myers-Smith & Hik, 2013). Species richness, for instance, can decline with increasing shrub dominance across the Arctic (Pajunen et al., 2011). This has been linked to the competitive exclusion of shade-intolerant species, in particular lichens and mosses growing under shrub canopies (Pajunen et al., 2011;Walker et al., 2006). On the other hand, low densities of shrubs have been associated with increased understory species richness (Crofts et al., 2018).
Drivers of shrubification and its consequences are still poorly understood, particularly with respect to the vast sub-arctic Siberian areas (Niemeyer et al., 2017). These areas are dominated by alders (Alnus), dwarf birches (Betula), and willows (Salix). For all these taxa, temperature limitation of reproduction most likely determines their current northern extent in the low Arctic (Myers-Smith et al., 2011), but how they may respond to current changes will probably differ.
Shrub species such as Betula nana seem to respond fast and tend to outcompete other tundra plants under more favorable growing conditions (e.g., increased air temperatures, higher nitrogen availability; Bret-Harte et al., 2001, 2002Myers-Smith et al., 2011). Alder also shows positive growth responses to increased temperatures; an increase in annual growth of Alnus fruticosa was recorded at the southernmost site on Siberia's Yamal Peninsula (Forbes et al., 2010).
Shrub willows are also involved in the shrubification of the Russian artic (Myers-Smith et al., 2011). A strong relationship was found between shrub ring width in an abundant tundra willow and summer temperature for the period 1942-2005 (Forbes et al., 2010). Hence, while a recent increase in shrub biomass is rather well documented, it remains an open question whether the recruitment is a phenological response of the local communities already present or due to recruitment from remote communities.
The expansion or contraction of a species range after changes in biotic or abiotic conditions is the main driver of current genetic structure (Clarke et al., 2019;Epp et al., 2015). Glacial refugia and secondary contact zones are the principal sources of hybridization which increase genetic diversity (Havrdová et al., 2015;Hewitt, 1999;Taberlet et al., 1998), whereas range contraction is most likely to cause loss of genetic diversity, reducing the species' ability to adapt to a changing climate (Alsos et al., 2012;Vranckx et al., 2012). Decreasing genetic diversity is also associated with increasing distance from refugia (Stamford & Taylor, 2004) as well as with founder effects and bottlenecks following postglacial dispersal (Hewitt, 1996;Jadwiszczak, 2012). However, studies have shown that dispersal ability can counterbalance the loss of genetic diversity due to range contraction in several shrub species (Alsos et al., 2009(Alsos et al., , 2012.
Changes in shrub abundance and composition along with Siberian treeline changes are well known from the paleo-record on millennial timescales. Alnus sp. is, for instance, known to have retreated southwards during glacial periods and shifted northwards during the late glacial and early Holocene (Klemm et al., 2016;Myers-Smith et al., 2011;Naito & Cairns, 2011;, and retreated since then. Continuous vegetation turnover from open forest to single-tree tundra in southern Taymyr over the last 7,100 years was reflected by a decrease of Alnus and the increase of Salix, both in a palynological study (Niemeyer et al., 2017) and a metabarcoding study using sedimentary ancient DNA (sedaDNA) records (Epp et al., 2018). These alternating contractions and expansions of shrubs due to climatic fluctuations during the Quaternary have probably affected the geographical distribution of plastid haplotypes in modern shrub populations (Casazza et al., 2016).
The chloroplast genome is haploid and maternally inherited in Alnus, Betula, and Salix. It is more prone to stochastic events (i.e., genetic drift) because its effective population size is half that of diploid genomes. In addition, when chloroplast genomes are maternally inherited, the dispersal is limited to seed dispersal that generally occurs over shorter distances than pollen (Lopez et al., 2008). For these two reasons, analyses of genetic variation based on chloroplast genomes usually reveal pronounced geographic structure (Magri, 2008;Petit et al., 1993) that can be suitable for phylogeographic analyses (Gryta et al., 2017).
Since sedaDNA is predominantly of local origin (Parducci et al., 2013(Parducci et al., , 2015, the comparison of modern DNA from focal species with sedaDNA retrieved from lake sediment cores has K E Y W O R D S chloroplast genome, genetic variation, hybridization capture, lake sediments, sedimentary ancient DNA (sedaDNA) great potential to reveal paleovegetation changes in periglacial regions (Jørgensen et al., 2012;Willerslev et al., 2003;. Hitherto, the metabarcoding approach was mainly used to evaluate biodiversity and population dynamics despite its limited information (Epp et al., 2018). Current advances in high-throughput DNA sequencing technologies allow comprehensive population studies over the whole chloroplast genome, for instance by the combination of shotgun sequencing and hybridization capture. Schulte et al. (2020) show that shotgun sequencing can result in low sequencing coverage of specific taxa of interest due to the extremely high presence of unknown DNA sequences in sedaDNA. To address this problem, they used a hybridization capture approach specifically designed for the Larix chloroplast genome. This method successfully enriched targeted sequences and improved accurate detection of genetic variants in sedaDNA (Schulte et al., 2020). However, conserved chloroplast DNA (cpDNA) regions from nontargeted taxa were also captured through the cp conserved regions.
In this study, we compare genetic variation through time to infer shrub population dynamics in common shrub species of the Siberian forest-tundra ecotone (Alnus alnobetula, Betula nana, Salix sp.). We sequenced and assembled 15 chloroplast genomes based on modern DNA (seven A. alnobetula, seven B. nana, one Salix sp.). The samples of A. alnobetula and B. nana were sampled along a west-east gradient of the Siberian forest-tundra ecotone. We identified genetic variations and estimated haplotype networks among the chloroplast genomes of each taxa. To infer historical genetic variation, we used both shotgun and hybridization capture sequencing data retrieved from four sedaDNA samples by Schulte et al. (2020). The samples were obtained from four different age segments of a sediment core from lake CH12 in the Taymyr region. Since the hybridization capture baits were originally designed for the Larix chloroplast, only the conserved chloroplast sequences of nontargeted taxa were captured. The sedaDNA sequencing data were taxonomically assigned to Alnus, Betula, and Salix with a lowest common ancestor approach using a chloroplast genome database. Absolute and relative abundances of each of the taxa were identified in order to investigate shrub population dynamics through time. Genetic variations identified in modern individuals were compared with ancient sedaDNA variants. Our study presents new insights of the genomic diversity in A. alnobetula, B. nana, and Salix sp. in relation to vegetation patterns through space and time.

| Plant material
Fresh plant material from seven Alnus alnobetula ssp. fruticosa (Rupr.) Raus (sample name: A01-A07) and seven Betula nana ssp. exilis (Sukaczev) Hultén (sample name: B01-B07) samples were collected and identified from three regions located along a west-east gradient running from the southern Taymyr Peninsula (~97-105°E), the Lower Omoloy River (~132°E), and the Lower Kolyma River (~161°E; Figure 1). In addition, one unknown species of Salix sp. (sample name: S01) was collected at the Lower Omoloy River (Table 1). Satellite images and vegetation maps (Stone & Schlesinger, 2003) were used to select uniform vegetation stands within three vegetation types: 'single-tree tundra', 'forest-tundra' (open stands at the northern forest margin), and 'closed forest' (Wieczorek et al., 2017). Short twigs with leaves were collected and placed in separate filter bags and dried on silica gel during fieldwork. A reference collection is stored at the Alfred-Wegener-Institute Helmholtz Centre for Polar and Marine Research, Potsdam.

| Sequence processing and de novo assembly of chloroplast genomes
The bioinformatic pipeline followed the procedure described in

| Chloroplast genome annotation, variant detection, and haplotype network
The draft chloroplast genomes were annotated integrating cpGAVAS (Liu et al., 2012)

Single-nucleotide polymorphisms (SNPs) and insertions and deletions
(InDels) among the individual chloroplast genomes were checked through visual inspection. Haplotypes were defined as unique combinations of variants in one individual (Clement et al., 2000). The SNPbased haplotype networks for the chloroplast (cpDNA) sequences of each taxon were generated by computing absolute pairwise distances and a statistical parsimony approach with TCS (Clement et al., 2000) implemented in PopART v. 1.7 (Population Analysis with Reticulate Trees [Leigh & Bryant, 2015]).

| Analyses of sedimentary ancient DNA
Core samples were obtained from a sediment core from lake CH12 Salix ssp. grows predominantly along the river and lake shorelines (Klemm et al., 2016).
Laboratory work and data processing were performed by Schulte et al. (2020). Core samples were collected at the following depths/ages: 121.5 cm/~6,700 cal yr BP, 87.5 cm/~5,400 cal yr BP, 46.5 cm/~1900 cal yr BP, and 2.5 cm/~60 cal yr BP. The four sam-   The rps12, which encodes the 30S ribosomal protein S12, was a trans-spliced gene with the 5′-end located in the LSC region and the duplicated 3′-end located in the IR regions. The trnK-UUU had the largest intron (2,552 bp), which contains the matK gene.
Transversions made up 58% (Vranckx et al., 2012) of the SNPs, while 42%  of the SNPs are transitions. 14 SNPs were located within genes, of which five were located in introns. Two SNPs led to changes in the translated amino acid sequence. In one case, the amino acid property also changed (Table 2). One SNP was located on ndhB, which was doubled in the IRs regions. Three SNPs were located on ycf1, which crossed the IRb/SSC boundary region, resulting in the incomplete duplication of this gene in two IRs. Furthermore, two SNPs were detected in transfer RNA genes, trnL-UAA, and trnK-UUU. Two InDels were in intragenic regions of rpl16 and clpP genes.

F I G U R E 5
Single-nucleotide polymorphisms (SNPs) detected in the whole chloroplast gnome alignment of the seven Alnus alnobetula individuals. The reference chloroplast genome and the sedaDNA retrieved from core samples through hybridization capture and shotgun sequencing methods were mapped subsequently in order to evaluate their variants corresponding to the SNPs' positions. An SNP's position corresponds to the reference chloroplast genome. If an SNP is located within a gene, the corresponding gene name is given in the first row. If no reads were retrieved from core samples, no variant is reported. Color code: Taymyr specific variation = yellow; Omoloy specific variation = orange; Kolyma specific variation = green; potential marker for Taymyr geographic discrimination = position highlighted in red; potential marker for Kolyma geographic discrimination = position highlighted in blue; potential marker for Omoloy geographic discrimination = position highlighted in light green TA B L E 2 Chloroplast genes with nonsynonymous single-nucleotide polymorphisms (SNPs), the SNP position on the respective reference chloroplast genome, and the changes in amino acid properties  (Table S3). The hybridization capture data had a lower breadth of coverage then the metagenomics shotgun data on average (Table 3). Despite this, the ratio of read counts and breadth of coverage was proportional and comparable between the hybridization capture and shotgun data (Table 3; Figure 9).
Overall, Salix and Betula chloroplast genomes had the highest and the lowest breadth of coverage, respectively. Hybridization capture sequencing data had a higher rate of PCR duplicates due to additional 18 amplification cycles performed after capturing.
To evaluate the abundance of Alnus, Betula, and Salix, we calculated the percentage of sample reads relative to the corresponding sample read sum of Alnus, Betula, and Salix (Table 3).
Alnus and Salix showed comparable relative percentages of reads at 6,700 and 5,400 cal yr BP (Table 3; Figure 10)  with the individual S01 or the reference Salix purpurea, respectively.
Whereas, ten variations (32%) showed both the individual and the reference variants.
To further substantiate the reliability of the sediment core data, the complete dataset of sedaDNA reads were aligned to each of the modern genomes. The results reported in Table S4 and Figure S1 show that the number of "identical sites" are consistently higher in the individuals from the Taymyr region in comparison to the individuals from Omoloy and Kolyma regions.

| Genomes and genetic variation
The assembled chloroplast genomes of all A. alnobetula, B. nana, and Previous studies proposed ycf1 as a promising cpDNA barcode of land plants due to its species-level resolution (Dong et al., 2015). The SNPs identified in our study are therefore evaluated as candidate biogeographic markers for A. alnobetula individuals from the Kolyma region and Salix sp. individuals from Omoloy.

| Phylogeography
The genetic variations detected in the assemblies can be used to explore their biogeographical relationships (Wu, 2015;Yang et al., 2018), although due to the small sample size, our results represent a population subset and a partial genetic characterization.
Parsimony-based chloroplast haplotype networks of A. alnobetula  (Douda et al., 2014;Gryta et al., 2017;Hantemirova et al., 2018), which all report low intrapopulation variability and pronounced phylogeographic structure in this species. Considering F I G U R E 1 0 Percentage of sample reads relative to the corresponding sample read sum of Alnus, Betula and Salix in hybridization capture and shotgun dataset the extensive geographic distances between Taymyr, Omoloy, and Kolyma, we detected a lower intrapopulation variability than studies on smaller areas of Europe and southern Asia (Douda et al., 2014;Gryta et al., 2017;Hantemirova et al., 2018). Overall, we identified 11 SNPs and 2 insertions as potential candidate biogeographic markers for the Taymyr region, 1 SNP for Omoloy, and 1 SNP for Kolyma which can be screened in future studies with larger sample sizes covering more populations.
In B. nana, we identified 9 SNPs as potential biogeographic markers for Kolyma, while no specific variant was found for Taymyr or Omoloy.
We found higher genetic variation but no clear spatial structure among B. nana individuals. Although B. nana produces smaller amounts of pollen and seeds compared to tree birches (Wang et al., 2014), its seeds are smaller and lighter in comparison to A. alnobetula, which might maintain higher genetic variability in B. nana chloroplast genomes (Alsos et al., 2012). Previous studies show that the cpDNA haplotypes can be extensively shared among populations of Betula, both within and between species (Palmé, 2003), which may confound phylogeographic patterns. However, in our study areas, B. nana is not growing in sympatry with their southerly neighboring tree birch species (B. pubescens and B. pendula) (Andreev et al., 2002;Niemeyer et al., 2015;Safronova & Yurkovsksya, 2019). The study areas are mostly isolated from tree birches by monodominant larch forests. Therefore, we assume that the chance of modern hybridization is negligible. Moreover, previous studies have shown that gene flow into B. nana from B. pubescens is less likely due to the reproductive barrier between diploids and tetraploids (Zohren et al., 2016). On the other hand, there is evidence that between 8,000 and 9,000 yr BP B. pendula spread to 72°N in the Taymyr region, due to warmer summer temperatures (Kremenetski et al., 1998). Hybridization and introgression processes with diploid neighboring species (e.g., B. pendula) could therefore have happened in the past and be a possible source of haplotype sharing in Betula (Da̧browska et al., 2006;Eidesen et al., 2015;Jadwiszczak et al., 2012;Salojärvi et al., 2017;Li et al., 2005;Palmé, 2003;Thomson et al., 2015).
Although we did not perform sequence divergence or allele frequency distribution analysis with neighboring birches, our hypothesis might serve as a stimulus for future studies.
Overall, the low genetic diversity among A. alnobetula from Taymyr, Omoloy, and Kolyma suggests that the recent northward shrubification might be due to local population recruitment and growth response of the already present local community. In contrast, the higher genetic variability in B. nana suggests that recruitment might originate from different populations due to more efficient seed dispersal, increasing the genetic connectivity over long distances, whereas A. alnobetula seeds are larger and heavier and consequently have a shorter dispersal distance. The de novo assembled chloroplast genomes, the potential biogeographic markers, and the chloroplast haplotype networks provided in our study provide a basis for further investigations of past and present shrub population dynamics in northern Siberia and their phylogenetic relationship. Due to small sample size, we used the modern genomes mainly to guide the analyses of ancient genomes rather than having a comprehensive genome characterization of modern individuals.

| Past shrub population dynamics at the Siberian tundra-taiga ecotone
In order to investigate shrub paleodynamics, we used both metagenomic shotgun sequencing and hybridization capture enriched sequencing data retrieved from four lake sedaDNA samples by Schulte et al. (2020). The reads were taxonomically assigned to Alnus, Betula, and Salix genus. The hybridization capture data show a lower breadth of coverage then shotgun data on average due to the specificity of the hybridization capture baits for targeting the chloroplast genome of L. gmelinii (Schulte et al., 2020;Zimmermann et al., 2019). Only the cp conserved regions of Alnus, Betula, and Salix were captured, of which only a reduced pool can be uniquely assigned to specific taxa using the lowest common ancestor approach. However, the proportions of read counts and breadth of coverage among the taxa are comparable between the capture and shotgun data. Previous studies have shown the potential of the hybridization capture method to retrieve diverged sequences up to a maximum of 25%-40% in different studies on chloroplast, mitochondrial, and nuclear genomes (Li et al., 2013;Paijmans et al., 2016;Peñalba et al., 2014 (Epp et al., 2018;Klemm et al., 2016). A similar timing of strong change has also been reported from other sites in the Taymyr region (Andreev et al., 2011;Klemm et al., 2016) and throughout most circumarctic environments (Epp et al., 2018;Kaufman et al., 2004;Klemm et al., 2016;Luoto et al., 2014;Salonen et al., 2011). The beginning of the Holocene was characterized by an increase in summer temperature by 5-6°C in Taymyr followed by a climate deterioration and the reduction of the Alnus fruticosa range after 5,000 cal yr BP (Kremenetski et al., 1998). Alnus may have retreated southwards together with Larix while Salix spp. and Betula spp. replaced Alnus spp. in its once favored places. However, possible overrepresentation of Salix sp. via both the source signal (i.e., Salix growing close to the lake and tributary rivers rims) and preservation signal (i.e., differences in the original DNA content of the leaves and DNA degradation) need to be taken into account (Forbes et al., 2010;Niemeyer et al., 2017).

| CON CLUS IONS
Current environmental changes, particularly climate warming and its feedback mechanisms are affecting plant composition and biomass of arctic and alpine ecosystems. The increase in relative abundance and cover of deciduous shrubs in sub-arctic Siberian areas is rather well documented (Frost & Epstein, 1960s). To investigate the dynamics of shrubification, we sequenced and assembled 15 chloroplast genomes based on modern DNA (7 A. alnobetula, 7 B. nana, 1 Salix sp. individual). We identified genetic variations and estimated haplotype networks for the chloroplast genomes of each taxon. To infer prehistorical genetic variation and past population dynamics, we used taxonomically assigned metagenomics shotgun and hybridization capture sequencing data retrieved from four sedaDNA samples by Schulte et al. (2020).
The samples were obtained from four different age segments of a sediment core from lake CH12 in the Taymyr region. Genetic variations identified in modern individuals were then compared with sedaDNA variants.
The low genetic diversity and the spatial structure of haplotypes among modern A. alnobetula individuals suggest that the recent northward shrubification might be due to local population recruitment and growth response of already present local communities, whereas the higher genetic variability and absence of a

CO N FLI C T O F I NTE R E S T
The authors declare no potential conflicts of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
Chloroplast genome assemblies: The samples have been submitted to NCBI under the GenBank accession numbers available in Table 1 and