Evidence of hybridization between genetically distinct Baltic cod stocks during peak population abundance(s)

Abstract Range expansions can lead to increased contact of divergent populations, thus increasing the potential of hybridization events. Whether viable hybrids are produced will most likely depend on the level of genomic divergence and associated genomic incompatibilities between the different entities as well as environmental conditions. By taking advantage of historical Baltic cod (Gadus morhua) otolith samples combined with genotyping and whole genome sequencing, we here investigate the genetic impact of the increased spawning stock biomass of the eastern Baltic cod stock in the mid 1980s. The eastern Baltic cod is genetically highly differentiated from the adjacent western Baltic cod and locally adapted to the brackish environmental conditions in the deeper Eastern basins of the Baltic Sea unsuitable for its marine counterparts. Our genotyping results show an increased proportion of eastern Baltic cod in western Baltic areas (Mecklenburg Bay and Arkona Basin)—indicative of a range expansion westwards—during the peak population abundance in the 1980s. Additionally, we detect high frequencies of potential hybrids (including F1, F2 and backcrosses), verified by whole genome sequencing data for a subset of individuals. Analysis of mitochondrial genomes further indicates directional gene flow from eastern Baltic cod males to western Baltic cod females. Our findings unravel that increased overlap in distribution can promote hybridization between highly divergent populations and that the hybrids can be viable and survive under specific and favourable environmental conditions. However, the observed hybridization had seemingly no long‐lasting impact on the continuous separation and genetic differentiation between the unique Baltic cod stocks.

By taking advantage of historical Baltic cod (Gadus morhua) otolith samples combined with genotyping and whole genome sequencing, we here investigate the genetic impact of the increased spawning stock biomass of the eastern Baltic cod stock in the mid 1980s. The eastern Baltic cod is genetically highly differentiated from the adjacent western Baltic cod and locally adapted to the brackish environmental conditions in the deeper Eastern basins of the Baltic Sea unsuitable for its marine counterparts.
Our genotyping results show an increased proportion of eastern Baltic cod in western Baltic areas (Mecklenburg Bay and Arkona Basin)-indicative of a range expansion westwards-during the peak population abundance in the 1980s. Additionally, we detect high frequencies of potential hybrids (including F1, F2 and backcrosses), verified by whole genome sequencing data for a subset of individuals. Analysis of mitochondrial genomes further indicates directional gene flow from eastern Baltic cod males to western Baltic cod females. Our findings unravel that increased overlap in distribution can promote hybridization between highly divergent populations and that the hybrids can be viable and survive under specific and favourable environmental conditions. However, the observed hybridization had seemingly no long-lasting impact on the continuous separation and genetic differentiation between the unique Baltic cod stocks.

K E Y W O R D S
Baltic Sea, contact zone, Gadus morhua, hybridization, inversions, population genetics 1 | INTRODUC TI ON Hybridization, i.e. interbreeding, between species or genetically divergent populations (jointly referred to as "divergent entities" in the following) is a well-known biological phenomenon (Arnold, 1997;Macholán, 2013). The impact of hybridization will rely on several factors, including the genomic divergence as well as ecological selection on the parental populations or species (Pekkala et al., 2012;Sambatti et al., 2008). Hybridization often results in individuals with lower survival due to hybrid incompatibilities (postzygotic barriers) caused by deleterious genetic interactions between the two parental genomes (Presgraves, 2010;Pryke & Griffith, 2009;Rogers & Bernatchez, 2006). Lower fitness in intermediate phenotypes compared to their parental phenotypes is also sometimes reported (Hermansen et al., 2011). In some systems, however, hybridization, i.e. genetic exchange between divergent entities, is demonstrated to be globally neutral or beneficial. The maintenance of novel genetic variation (or introgressed regions) will by large depend on its adaptive significance, that is whether these regions include adaptive alleles resulting in phenotypes that are more advantageous than the parental phenotypes (Giska et al., 2019;Norris et al., 2015).
Additionally, introgressive hybridization may increase the standing genetic variation (Glasheen et al., 2020;Ronco et al., 2021) and give rise to a higher degree of adaptability and evolutionary potential (Grant & Grant, 2019;Meier et al., 2017), and thus be crucial for overcoming environmental stressors, including those related to ongoing climate change.
In the wild, hybrid zones are established where closely related species or divergent populations of a species overlap in geographical distribution and hybridize (Barton & Hewitt, 1985;Johannesson et al., 2020;Nielsen et al., 2004). Hybrid zones can be formed through either (i) secondary contact after previous separation, where the divergent entities have diverged (by genetic drift or selection) during the period of separation, or via (ii) primary contact where the initial divergence emerge from a common background and is caused by varying selection regimes across an environmental gradient (Endler, 1977;Mayr, 1942). Alterations in environmental conditions can lead to shifts of geographical distribution ranges, and loss of ecological and geographical barriers that have kept closely related species or divergent populations apart, and thus facilitate new contact zones and hybridization events (Brennan et al., 2014;Chunco, 2014;Taylor et al., 2015). Hybrid zones have also been described to be dynamic and more prone to change over time and space than previously thought (Buggs, 2007;Krosby & Rohwer, 2010;Wielstra, 2019). However, the long-term evolutionary consequences of such contemporary hybridization events directed by environmental changes are hard to predict. For instance, if these events will nurture climatic adaptations of critically endangered and affected populations, likely depend on whether or not the selection regimes favor the hybrids over the parental phenotypes.
The Baltic Sea is a relatively young brackish intercontinental sea, dating back to approx. 8000 BP, and comprises a series of relatively shallow basins formed as the ice retreated at the end of the last ice age (12,600 BP) (Björck, 1995;Leppäranta & Myrberg, 2009).
During the transformation, the region proceeded from being fully freshwater to become connected with the marine environment in Kattegat and Skagerrak, through a few narrow straits (Björck, 1995;Leppäranta & Myrberg, 2009), and was colonized by numerous marine species. Most of these species are found in the outer more saline regions (Bonsdorff, 2006). However, some species have successfully colonized also the less saline regions and basins, where signs of local adaptation and thus, changes in important phenotypic traits are recorded (Johansson et al., 2017;Kautsky et al., 1990), including salinity tolerance (Johansson et al., 2017;Renborg et al., 2014;Wood et al., 2014). For most of these species, a genetic cline is observed following the decreasing salinities into the Baltic Sea from the marine conditions in the Skagerrak via the transition zone in the Kattegat and Belt Sea towards the lower saline basins in the Baltic Sea, i.e. Bornholm Basin and Gotland Deep, indicating low or no gene flow between the locally adapted populations (Johannesson et al., 2020;Le Moan et al., 2019;Nielsen et al., 2004). One of the ecologically and economically important marine species that has successfully colonized the Baltic Sea is the Atlantic cod (Gadus morhua Linnaeus, 1758). Numerous of reports have documented on the drastic phenotypic modifications that the eastern Baltic cod have undergone, such as changes in egg buoyancy, sperm motility, general osmoregulation as well as spawning depth and season, enabling it to successfully reproduce and cope with the challenges living in a dynamic brackish-water environment (Nissling et al., 1994;Nissling & Westin, 1997;Petereit et al., 2014). Differences between the eastern and western Baltic cod include a higher salinity requirement for activation of spermatozoa in the western Baltic cod, ranging from 15‰ to 16‰ as opposed to 11‰ to 12‰ for eastern Baltic cod (Nissling & Westin, 1997). Egg buoyancy is also different between the two, with western Baltic reaching neutral buoyancy at 20‰-22 ‰ and eastern at 13.3‰-15.7 ‰ (Nissling et al., 1994;Nissling & Westin, 1997). Differentiation in adaptation to salinity has also been documented for other life stages, including juvenile and adult individuals (Kijewska et al., 2016;Malachowicz & Wenne, 2019).
Moreover, the eastern Baltic cod is found to be highly divergent at the genome-wide level from its marine counterparts including western Baltic cod Berg et al., 2015;Poćwierz-Kotus et al., 2015;Wenne et al., 2020) and is one of several examples of potential ongoing speciation in the Baltic Sea (Johannesson et al., 2020;Momigliano et al., 2017). It should also be noted that two of the four larger chromosomal inversions identified in Atlantic cod which discriminate between populations throughout its geographical distribution Sodeland et al., 2016), are seemingly under strong selection in the Baltic Sea (Berg et al., 2015).
The inversion on linkage group (LG) 2 seems to be linked to osmoregulation and oxygen adaptation, while the inversion on LG12 seems to be linked to temperature adaptation (Berg et al., 2015). Such larger chromosomal inversions-or supergenes-are likely to facilitate the observed divergence between the locally adapted populations, and thus, hindering gene flow over the contact zone (Barth et al., 2017Johannesson et al., 2020).
Today, the two genetically divergent cod populations found in the Baltic region are managed as separate stocks: the western Baltic cod (WBC) mainly inhabiting the shallower western Baltic Sea (ICES, 2019a) (ICES subdivisions (SD) 22-24; see Figure 1b,c), and the eastern Baltic cod (EBC) inhabiting the deeper basins of the eastern Baltic Sea (ICES SD 24-32; see Figure 1b,c). Additionally, the distribution of the two stocks overlaps in the Arkona Sea (ARK; SD 24). From scientific trawl catches the mixing ratios of WBC and EBC in ARK (determined by otolith shape combined with genetics) are relatively stable over time, with about two thirds belonging to EBC and one third to WBC both in the 1980s and in recent years (ICES, 2019a;Schade et al., 2022). It should be noted, that a slight shift in the mixing ratio was seen in the 1990s, where the WBC was dominating (or at least) representing half (50%) of the catches (Schade et al., 2022).
The observed mixing ratios in this area could potentially be linked to increases and/or declines in the different stocks during the time period Schade et al., 2022;Stroganov et al., 2018). Despite this reported co-existence, there are to date no reports of hybridization ICES, 2021;Weist et al., 2019). Absence of hybridization could be linked to the non-overlap in spawning time between the two stocks, as the majority of the eastern Baltic cod spawn during summer, i.e. from June to August, whereas western Baltic cod mainly spawn during spring (February-April) (Bleil et al., 2009;Bleil & Oeberst, 2004;Hüssy, 2011;Tomkiewicz & Köster, 1999;Vitale et al., 2005).
However, historically (in the 1960s-1980s) a large proportion of EBC spawned during spring (Baranova et al., 2011). In the 1980s, the EBC stock was one of the most productive cod stocks worldwide with annual landings around 400.000 t (tonnes) at its highest and a spawning stock biomass around 600.000 t (ICES, 2019a). During the same period the growth rate recordings of Baltic cod was increasing and thought to be linked to high food availability (Mion et al., 2021;Svedäng & Hornborg, 2017). The combination of overlapping spawning periods and increased population abundance could have led to potential hybridization events and thus, genetic exchange between the two stocks.
In this study, the overall goal was to identify the genetic impact of the recorded historical peak in spawning stock biomass of EBC during the mid 1980s, i.e. the potential hybridization between the two stocks, by taking advantage of geographically resolved historical otolith archives combined with genotyping (Weist et al., 2019) and whole genome sequencing of both historical and contemporary samples.

| Sample collection and DNA extraction
Otolith samples with adherent blood and/or tissue (see Figure 1a (Vihtakari, 2022)). The MEC samples were collected in January, ARK samples mainly in January and February as well as some in November, and BOR samples in February as well as November-December.
The otoliths had been stored individually in paper envelopes at room temperature until processing. DNA was extracted from the dried tissue and/or blood still attached to the archived otoliths. We used a modified protocol with an overnight incubation step during the initial lysis (Cuveliers et al., 2009;Hutchinson et al., 1999). For each specimen, the tissue from both otoliths was used, no matter if they were whole or broken. In total, sufficient DNA was obtained for genotyping of 436 individuals.

| Genotyping of historical and contemporary samples
The historical samples were genotyped with the diagnostic SNP panel (n = 20, Table S2) designed to discriminate between WBC and EBC together with 7 SNPs located within inversions on three linkage groups: LG02, LG07 and LG12 (Table S2)

| Population structure analyses and genetic assignment of historical Baltic cod
First, assessment of population assignment of EBC and WBC during the early 1980s at the three different locations: MEC, ARK and BOR were analysed with Principal Component Analysis (PCA) using the EIGENSOFT software (SMARTPCA v.6.1.4 and v.7.2.1-intel-2018b) (Patterson et al., 2006). To account for high rates of missing genotypes in the historical data, as done previously in datasets with low coverage historical DNA , we calculated the First Principal Component using a set of unambiguously assigned contemporary reference individuals from KIE (N = 65; reference to be mostly WBC) and BOR (N = 37; reference to be mostly EBC) that had already been genotyped (Weist et al., 2019). Reference individuals are displayed in grey in Figure 2. PCA analyses were conducted by using all 27 SNPs as well as the smaller SNP panel using 19 of the 20 SNPs outside the inversion (Weist et al., 2019). The reduction from 20 to 19 SNPs is based upon STRUCTURE analyses where it was revealed that one of the SNPs in the diagnostic panel had substantial missing data (51.6%), and percentage missing data was further assessed per site in the SNP set with VCFtools 0.1.16 (Table S3).
Second, STRUCTURE (Falush et al., 2003;Pritchard et al., 2000) analyses using the 19 SNPs outside the inversions were performed to further assess the population origin of the historical specimens. The analyses were run with STRUCTURE v. 2.3.4 with FPRIORMEAN set to 0.39 (F ST obtained from Weist et al., 2019), with prior population information given for 15 randomly chosen contemporary individuals from BOR and MEC, i.e. belonging to either EBC or WBC respectively. Individuals with an intermediate scoring (0.3-0.7) of EBC or WBC ancestry signal were assigned as putative hybrids (HYB).
The selected threshold was chosen to account for some degree of backcrossing as well as hybrids not necessarily scoring at 0.5 (Barth et al., 2020;Elgvin et al., 2017;Zwahlen et al., 2022). Proportions over time in each area (MEC, ARK and BOR) was compared using Fisher's-exact tests with Bonferroni correction applied.
Third, validation of the 19 SNPs ability to score hybrids was performed by using a modified version of Hybridlab (Nielsen et al., 2006), for more information, see https://github.com/salan ova-ellio tt/recom -sim. Theoretical hybrids and backcrosses were obtained using the same 15 predefined EBC and 15 predefined WBC as for the STRUCTURE analyses parents and simulating F1, F2 (firstand second-generation hybrids) as well as B2 and B3 backcrossess (second-and third-generation backcrosses), 100 individuals for each category. The simulated data was then analyzed using STRUCTURE with identical settings as for the real dataset (see above), and the HYB boundaries of 0.3-0.7 were further evaluated.
Additionally, heterozygosity per individual was calculated in VCFtools (using the 20 SNPs), and used for calculation of average heterozygosity for the two populations (EBC and WBC) as well as for the potential hybrids (HYB). These estimations were further used to test if the heterozygosity level was different between the historic and contemporary datasets as well as across cod types in the historical dataset, using two-way ANOVA, Shapiro-Wilk normality test, Levene's Test for Homogeneity of Variance and Kruskal-Wallis test.

| Genome sequencing and hybrid validation
Based upon the STRUCTURE analyses, 11 individuals were chosen for low coverage whole genome sequencing (WGS) to approx. ~0.5-2X coverage (Table S4) See Table S5 for full overview. The samples were chosen randomly from a larger collection (the Aqua Genome project), with an approximately equal number of males and females. The number of contemporary samples were kept on the lower side mainly to not bias the genotyping (i.e. trying also to get a good representation of historically important polymorphic sites), but at the same time with enough power to discriminate between contemporary populations.
The sequencing reads were processed using the Paleomix pipeline (Schubert et al., 2014). Two different forms of BWA (Burrows-Wheeler Aligner) were used. BWA aln backtrack  (BWA v. 0.5.9-r26-dev) was used for historical samples and BWA mem (Li, 2013) for modern (BWA v.0.7.12-r1039). Both forms were used with alignment to the second version of the cod reference genome: gadMor2 (Tørresen et al., 2016), with the nuclear and the mitochondrial genome given as separate entities. In short, Paleomix pipeline included adapter removal (Lindgreen, 2012)  and filtering of unmapped reads. Subsequently, the bam-files were further processed following in-house processing and filtering pipelines Star et al., 2017). This included removal of soft-clipped reads with SAMTOOLS v.1.3.1 . Modern samples were down-sampled to a coverage of roughly 5X (Samtools v.1.3.1), except for samples that already had 5X coverage or lower (1 sample from BOR and 3 from ORE).
For the genome-wide detection of SNPs, haplotype and genotype calling was conducted using GATK (v. 3.8) and the gadMor2 genome. Haplotype calling was performed using linear index type and index parameter 128,000. Genotype calling was run using max alternative alleles set as 3. Filtering was performed using BCFtools It should be noted that a more stringent setting-removing sites with more than 4 individuals missing-was also tested. This filter-setting, however, reduced the number of sites (n = 5259, after pruning for LD) to the point where the separation between modern clusters started to break down (see Figures S2 and S3), so missingness of 5 individuals per site was chosen, which also gave higher resolution with a higher number of SNPs as limit (n = 17,359, after pruning for LD). Furthermore, hard to map regions and known repeats were removed (VCFtools v.0.1.14), followed by removal of deaminated sites (awk). For heterozygous sites, we also tested for read depth bias, expecting that true heterozygotes at a site should have an equal probability for each allele. Those sites that fell within an accepted 0.5 frequency using a binominal test using all heterozygote individuals (per site) were kept, whereas those falling outside (p < 0.05) were removed, as described in Pinsky et al., 2021. No filtering was made using minimal allele frequency due to small sample sizes. The F I G U R E 2 PCA for individual cod in SNP dataset (N = 539). PCA is split per sampling location and coloured by genotype. Genotype scoring is divided into WBC (western Baltic cod), HYB (putative hybrids) and EBC (eastern Baltic cod). Modern samples from KIE and BOR used for projection are shown in background regardless of sampling area. unplaced scaffolds in the genome were kept throughout all analysis due to the potential of essential genes for Baltic cod not conforming to the reference genome, which is based on the migratory Northeast Arctic cod. Pruning for LD was made using PLINK (v1.90b5.2) with settings -allow-extra-chr -chr-set 24 no-xy -indep-pairwise 10 1 0.8, after removal of inversions at LG02, LG07 and LG12 (Table S6)

| MT genome analyses
The reads that aligned to mitochondrial (MT) genome-extracted from the WGS dataset using the Paleomix pipeline (see above for details)-were used in haplotype and variant calling (GATK v. Phylogenetic analyses were made in order to look at clustering of maternal linages. To reconstruct the phylogeny of mitochondrial protein coding genes (PCGs), fasta sequences were extracted from the VCF using PPP v0.1.12 (Webb et al., 2021). PCGs were extracted using BLASTN (Camacho et al., 2009), aligned with MAFFT v7.480 (Katoh & Standley, 2013), manually corrected for length, and ND6 was reverse complemented as it is encoded on the light strand. The maximum likelihood analysis, the substitution models, partitioning scheme, and phylogenetic analysis were conducted in IQ-tree v1.6.12 (Chernomor et al., 2016;Kalyaanamoorthy et al., 2017;Nguyen et al., 2015). Branch support values were obtained using 1000 replicates of ultrafast boostrap (UFBoot) (Hoang et al., 2018), see Figure 5b and Figure S5 (for support values >90). To complement our Maximum likelihood (ML) analysis, a Bayesian phylogeny was inferred using BEAST v2.6.6 (Bouckaert et al., 2019). The PCGs were partitioned into codon positions and the substitution models were inferred using bModelTest (Bouckaert & Drummond, 2017). Analysis was run under the Coalescent Constant Population Tree prior using a strict clock with a chain length of 100,000,000 generations sampling at every 10,000 iterations for a total of 100,000 trees. Convergence was assessed using Tracer 1.7.2 (Rambaut et al., 2018), and a burn in of 10% was selected. TreeAnnotator was used to make a consensus tree (see Figure S6 with posterior probabilites >0.90). Final trees were visualized using the interactive Tree of Life v6.3.1 (iTOL) (Letunic & Bork, 2021).
Phylogenetic analysis was also performed on the complete mitochondrial genome to assess the effect on the topology. For the ML analysis the substitution model was inferred using ModelFinder, before running IQ-Tree v1.6.12 with 1000 replicates of UFBoot (see Figure S7). Analysis was also run in BEAST v2.6.6 under the Coalescent Constant Population Tree prior using a strict clock with a chain length of 100,000,000 sampling at every 10,000 iterations for a total of 100,000 trees. The substitution model for the Bayesian phylogeny was again inferred using bModelTest. Tracer v1.7.2 was used to assess convergence, and TreeAnnotator with a burn in of 10% was used to create the consensus tree (see Figure S8).

| Inversion scoring
The four known inversions in Atlantic cod on LG01, LG02, LG07 and LG12, were scored in the WGS dataset by running local PCAs (with SMARTPCA) using the identified SNPs localized inside the inversions. The inversions were scored either as REF (homozygous for the reference genotype as in gadMor2), HET or NON_REF (homozygous for the non-reference genotype). The inversion boundaries applied here were already defined in another study with historical samples from the Northeast Arctic ecotype (see Table S6). The inversion on LG02 is defined by two ranges due to a miss-assembly of the last part of this inversion when comparing to the newest version of the Atlantic cod genome gadMor3 (Matschiner et al., 2022). The number of heterozygous sites using VCFtools v.0.1.14 were calculated at the individual level as in equation: Of the total n = 17,359 SNPs, 537 SNPs were localized within the inversion on LG01. For LG01 no specific clustering was detected (as expected) since the southernmost Atlantic cod populations are fixed for the non-reference allele of the inversion on LG01 (Barth et al., 2017;Berg et al., 2015). However, the three latter inversions (on LG02, LG07 and LG12), that have been found to covary in its southernmost distribution (Barth et al., 2017Berg et al., 2015;Sodeland et al., 2016Sodeland et al., , 2022, did display the expected three clusters of the (i) homokaryotypes of the reference genotype, the heterokaryotypes and the homokaryotypes of the alternative genotype arrangement (Mérot, 2020). Within the inversion on LG02, 58 SNPs were identified, whereas 307 and 390 SNPs were identified within the inversions on LG07 and LG12, respectively. In the historical samples, the heterozygote cluster displayed a lower degree of heterozygosity than the contemporary samples (see Figure S9); this was mainly linked to the low coverage combined with a higher degree of missingness per site in the historical dataset (see Figure S10). For the heterozygote cluster on LG07, a tailing pattern was observed, and thus, motivated for further inspection of the genotypes within the inversion (at different filtering steps) using Integrative Genomics Viewer (IGV) v.2.8.9. This investigation resolved 3 out of the 5 individuals with uncertain inversion genotype.
The inversion status for all individuals in the lager historical dataset (n = 436) was also investigated using the 7 SNPs localized within the inversions on LG02, LG07 and LG12 (as described above and see lower part of Table S2). First, we did an assessment of the scoring ability of these SNPs (n = 2 SNPs on LG02, n = 3 SNPs on LG07 and n = 2 on LG12). This was done by comparing the results with the scoring conducted on the historical whole genome sequenced individuals (Table S7). Error rate estimations were calculated (not taking missingness into account) as the number of mismatches divided by the total number (of individuals). For further analyses, we selected only the sites with minimal error rate (≤9%). Thus, for LG07 none of the sites were of good quality with error rates ranging from 27% to 45%. For LG02 only one of the sites was selected (at the position 20,868,512), while both sites on LG12 (11,630,885 and 12,529,238) were included (see Table S7). The error rate in the modern WGS dataset was also calculated (Tables S8 and S9), and found to be at a lower level (roughly 4.8%). For LG12, the site with the least missingness in the total historical dataset (see Table S9) at the position 11,630,885 was chosen for the inversion scoring. In the cases of known mismatches scoring from WGS and/or IGV vs. the SNP scoring, the WGS data was more reliable and selected over the SNP scoring.   Figure 2). The PCA analysis for the SNP set was significant on only the first principal component (Table S10).

| Population structure and hybrid assessment
In concordance with the PCA results, STRUCTURE showed a clear separation between WBC (majority purple fraction) and EBC (majority orange fraction), with few individuals showing mixed genetic origin (Figure 3). These individuals were assigned as HYB.
It should be noted that even though the classification of the individuals was mostly similar between PCA and STRUCTURE, it did differ for some individuals (Figure 2). Statistically, the genotype proportions (EBC, HYB, WBC) were significantly different over the time period in each area (Table S11). When conducting the pairwise comparisons the statistically differences between timepoints for each area (Tables S12 and S14) could be pinpointed, except for ARK (Table S13).
The simulation analysis revealed that the 19 SNPs are powerful enough to detect hybrids and backcrosses ( Figure 4) and 34% of B2 individuals (with EBC as the refence population), see  (Table S15). The historical and contemporary WBC and EBC were separated on the first component axis (Figure 5a), supported by the ANOVA statistics from SMARTPCA (Tables S16 and S17). The historical WBC samples were not significantly different from the modern cod caught in KIE (p = 0.96). However, one of the modern KIE individuals was genetically identified as an EBC, based on the PCA (Figure 5a), but not affecting the outcome in the ANOVA.
The historical EBC samples were not significantly different from the contemporary cod caught in BOR, but the results were less clear (p = 0.10), possibly due to the low sample size (N = 3).
However, the slight off placement of the historical EBC closer to the "putative" hybrids could be indicative of some divergence in re-

F I G U R E 4 STRUCTURE plots
showing fractions (f Structure ) for K = 2, of Western (purple) and Eastern (orange) genetic signal for theoretical hybrids and backcrosses as obtained with hybrid simulation tool. WBC is the western parents, B3 is the third-generation backcrosses, B2 is the second-generation backcrosses, F1 is the first-generation hybrids, F2 is the second-generation hybrids and EBC is the eastern parents. Hybrids within fraction 0.3-0.7 are shadowed and fraction of simulated hybrids picked up given below f HYB .
looking at all eigenvectors together-generally supported the results from the first component (Table S18). None of the 10 sites identified as driving the PC1 of the WGS in SMARTPCA (eigbestsnp) were the same as the sites in the diagnostic SNP panel (Tables S2 and S19). The eigbest were chosen by SMARTPCA, and all had snpweights >4.64 on PC1. Among the eigbestsnp sites only 6 sites were located on the same linkage groups as the sites in the diagnostic SNP panel. The site closest to a SNP in the panel was located 37,841 bp downstream. All the 10 eigbestsnps were located within 2 kb of coding regions (Table S20), 8 were located within introns, one was located between two genes and one was located outside a gene.  (Figure 6; Table S22).

| Inversion frequencies
Calculation of the inversion frequencies on LG02 and LG12 for the two cod types (EBC and WBC) as well as in the HYB individuals at the different locations (MEC, ARK and BOR) indicates an east-west differentiation for LG02 in all cod types whereas for LG12 only for WBC and HYB (see Figure 7; Tables S23-S25). Almost all comparisons of inversion frequencies on LG02 were significant (Table S24) When testing for difference in inversion frequencies between the cod types irrespective of location, we found for LG02 significant differences that were pinpointed to be between the WBC and HYB (p = 1.26e-04) as well as between WBC and EBC (p = 2.23e-06), whereas no difference was found between EBC and HYB (p = 0.23). The HYB and EBC individuals displayed a higher frequency of the REF variant than the WBC (see pie-charts in Figure 7; Tables S26 and S27). For LG12, the results were not as clear-cut, with WBC deviating from HYB (p = 2.09e-03), but not from EBC (p = 0.13), and HYB not being significantly different from EBC (p = 0.02). However, in the latter case there might be a trend towards a difference in inversion frequencies on LG12 between EBC and HYB, with HYB displaying a higher number of the REF variant (as also seen for the WBC). When testing inversion frequencies for HWE for the different cod types, only the inversions in EBC were found to be in disequilibrium, both on LG02 and LG12 (Figure 7; Tables S28 and S29). Additionally, we plotted the inversion frequencies for the low coverage WGS dataset (see Figure S15), but due to the low number of historical individuals, no statistical testing was conducted.

| Geneflow and demographic inference by MT genome analyses
The mitochondrial analysis displayed two major branches (see   1980s (1988 and 1989). This is the first genetic evidence of a regular occurrence of adult (and mature) EBC this far west, in higher saline environmental conditions in historical times. There are, however, numerous of reports documenting the coexistence of the two stocks in the Arkona Basin (SD24), where the mixing ratio seems to be relatively stable, but to some degree impacted by population stock dynamics Hüssy, 2011;Hüssy et al., 2016;ICES, 2019b;Schade et al., 2022). Our data confirm that are seemingly found at lower frequencies in this area (Hüssy, 2011;Hüssy et al., 2020;Schade et al., 2022).

| Evidence of hybridization between the divergent EBC and WBC populations
Due to the well documented coexistence of EBC and WBC mainly in the Arkona Basin, the potential impact on recruitment and/or gene flow between the two stocks have been debated (Heidemann et al., 2012;Hüssy et al., 2016 in modern times could indicate that hybridization may still be occurring but to a lesser extent than historically, due to (i) lower population abundance and maybe most important (ii) the major divergence in spawning time between the two stocks (Hüssy et al., 2016;ICES, 2019b;Köster et al., 2017;Wieland et al., 2000). Abiotic factors are thought to be the major limitation for eastern Baltic cod recruitment (Kosior & Netzel, 1989;Plikshs et al., 1993). Poorer environmental conditions due to, that is, reduced inflow of saline waters during the past decade have resulted in only the southern deeper parts of the Baltic Sea at the halocline (above hypoxic and anoxic depths) available for successful cod spawning and producing vital larvae (Plikshs et al., 1993;Svedäng et al., 2022), and thus could have resulted in lower viability in general including the hybrid individuals.

| Inversion genotypes linked to genetic origin and location
We scored the two well-known inversions found in Atlantic cod of special importance in the Baltic (Barth et al., 2017Berg et al., 2015;Matschiner et al., 2022;Sodeland et al., 2016), and inferred the inversion status over the time period and location for the WBC, EBC as well as the hybrids. There was little variation of the inversion frequencies over the time (data not shown), but sig-

| MT genome analyses indicate directional gene flow from east to west
Phylogenetic analyses of the MT genome for the whole genome sequencing data-set, including both historical and contemporary samples, gave further insight into the potential gene flow between the two stocks during the assumed range expansion of EBC in the mid 1980s. The phylogenetic tree separated the samples into two main clusters where the contemporary samples are found mixed in both.
The historical samples of WBC and EBC are found in separate clusters, while the hybrids all cluster with the historical WBC samples.
This indicates that surviving hybrids had a WBC mother and an EBC father. Caution should however be taken here, due to the low sample size. That said, these results are in concordance with previous phylogenetic analyses, where the overall mitogenomic variation in modern Atlantic cod is demonstrated to be quite high, while the eastern Baltic cod were mainly found in two of the clusters (Jørgensen et al., 2018;Lait et al., 2018;Martínez-García et al., 2021).
Further evidence for directional gene flow is supported by the phenotypic difference between the two stocks in terms of egg buoyancy and sperm motility (Nissling et al., 1994;Nissling & Westin, 1997;Westin & Nissling, 1991), both important factors for spawning success and survival. WBC is not able to successfully reproduce in the central and eastern Baltic Sea, due to the fact that the eggs (adapted to higher saline environmental conditions) will sink to the deeper layers in the water column with no or low oxygen content and thus fail to develop to hatching (Nissling et al., 1994;Westernhagen, 1970).
Experimental studies have demonstrated that sperm activity for EBC is highest between 15.5% and 26‰ salinity, whereas the sperm motility is drastically reduced at lower salinities, which also negatively impact the fertilization (Westin & Nissling, 1991). However, EBC has been shown to spawn sporadically with success in western Baltic areas suggested to be linked to favourable hydrographic conditions and/or spill over from neighbouring spawning areas (Stroganov et al., 2018). Taken this into consideration together with coexistence of the two stocks (in the western Baltic regions in the 1980s) when spawning times were more or less overlapping, could result in hybridization events between the two stocks. Furthermore, hybrids resulting from crosses of female WBC and male EBC would have a higher chance of survival due to the neutral buoyancy of those eggs even in water of lower salinity (Nissling & Westin, 1997). Moreover, mapping of spawning habitat suitability in the western Baltic regions during the mid 1980s to the early 1990s uncovered poor conditions during the main spawning season in January-March, whereas it was considered to be more favourable during the late spawning season in April/May . These findings could indicate a higher survival of eggs and larvae of the late spawners, i.e. such as the EBC (Wieland et al., 2000;Wieland & Horbowa, 1996), and thus, give rise to a higher probability of successful hybridization events during that time period.

| CON CLUDING REMARK S
In this study, we demonstrate how ecosystem alterations, such as increased recruitment and spawning stock biomass of a marine fish stock (due to potential favourable environmental conditions), could mediate (i) range expansion, (ii) increased degree of mixing, and (iii) hybridization between genetically divergent populations.
An increased understanding of such shifts in the ecosystem dynamics and cryptic population division is relevant for stock assessment(s) and can support the development of sustainable management programmes for marine resources (Reiss et al., 2009). For instance, it is likely that many of the tagging experiments and observations conducted in the Baltic region have been confounded by this mixing of the two stocks, which have probably fluctuated with changes in EBC and WBC spawning stock biomass, as indicated by the results in this study and other studies reporting that the population dynamics is a key factor for the degree of mixing of the two stocks Schade et al., 2022;Stroganov et al., 2018). From a management perspective, it is important to know the extent of hybridization over time as well as degree of physical mixing of divergent populations so that harvesting regimes do not have unintended consequences.

ACK N OWLED G M ENTS
The authors thank the technical staff at the Thünen Institute of Baltic Sea Fisheries for assistance in sample processing as well as