Genetic analysis provides insights into species distribution and population structure in East Atlantic horse mackerel (Trachurus trachurus and T. capensis)

Abstract Two sister species of horse mackerel (Trachurus trachurus and T. capensis) are described that are intensively harvested in East Atlantic waters. To address long‐standing uncertainties as to their respective geographical ranges, overlap and intraspecific population structure this study combined genetic (mitochondrial DNA and microsatellite) analysis and targeted sampling of the hitherto understudied West African coast. mtDNA revealed two reciprocally monophyletic clades corresponding to each species with interspecies nuclear differentiation supported by F ST values. The T. trachurus clade was found across the north‐east Atlantic down to Ghana but was absent from Angolan and South African samples. The T. capensis clade was found only in South Africa, Angola and a single Ghanaian individual. This pattern suggests that both species may overlap in the waters around Ghana. The potential for cryptic hybridization and/or indiscriminate harvesting of both species in the region is discussed. For T. capensis mtDNA supports high gene flow across the Benguela upwelling system, which fits with the species' ecology. The data add to evidence of a lack of significant genetic structure throughout the range of T. trachurus though the assumption of demographic panmixia is cautioned against. For both species, resolution of stock recruitment heterogeneity relevant to fishery management, as well as potential hybridization, will require more powerful genomic analyses.

with the Benguela Upwelling System (BUS) and is most often described as extending from southern Angola, throughout Namibia and into the Eastern Cape of South Africa (Axelsen et al., 2004;Barange et al., 1998). T. trachurus is widely described throughout the NE Atlantic as far north as Norway and Iceland (Knijn et al., 1993) as well as the Mediterranean basins (Fischer, 1973). Both species are of great commercial importance and are the most valuable horse mackerel species in these regions (Murta, 2000) as exemplified by the Namibian T. capensis fishery, which reports the highest biomass and catch of all fish species landed within the region and is the second highest economic contributor to Namibian fisheries (Kirchner et al., 2010).
T. trachurus has demonstrated severe reductions in total annual catch with the stocks in the Mediterranean, Alboran and Black seas considered fully exploited, while the NE Atlantic stock is deemed to be overexploited (ICES, 2003(ICES, , 2005. This has led to T. trachurus being listed as "vulnerable" on the IUCN's red list of threatened species (Smith-Vaniz et al., 2015) and added further impetus to align management and biological units for both species to try and ensure fishery sustainability (reviewed in: Abaunza et al., 2008a) and maintenance of biodiversity.
In the absence of robust evidence of population structure the species is managed using regionally designated stocks (ICES, 2003(ICES, , 2005. Similarly, population genetic studies of T. capensis have reported no evidence of population structure (Karaiskou et al., 2004;Naish, 1990), with the species currently managed as arbitrarily delineated northern and southern BUS stocks (Hecht, 1990;Naish et al., 1991).
Studies to date have focused on either the NE Atlantic/Mediterranean or the south-west African coast, meaning that T. trachurus and T. capensis diversity within the Angolan/North African region is poorly resolved. Such a geographical sampling gap also adds ambiguity as to the southern and northern limits of T. trachurus and T. capensis, respectively. Although the majority of studies cite the distribution of T. trachurus as extending only as far south as Senegal and the Cape Verde islands (e.g., Abaunza et al., 2008a;Cimmaruta et al., 2008;Comesaña et al., 2008;Sala-Bozano et al., 2015), others recognize a far wider range, encompassing the west African tropics and ranging as far south as South Africa and southern Mozambique (Boateng, 2019;Koranteng, 2002;Smith-Vaniz et al., 2015). Considering the extensive fishing pressure on T. capensis in Namibia, the potential for overlap and thus a cryptic mixed stock with T. trachurus must be assessed.
Additionally, the available genetic data do not sufficiently test the validity of the currently recognized north and south BUS stocks for T. capensis.
To address these fundamental knowledge gaps for both species this study focused on collecting samples of putative T. trachurus/ T. capensis from the understudied west African coast alongside better studied, and thus reference, NE Atlantic/Mediterranean (T. trachurus) and South African (T. capensis) waters. Genetic analysis of sequence variation at two mtDNA regions and 10 microsatellite loci were used to delineate species and assess population structure within both species. The results provide new insight into the distribution of both species and their eco-evolutionary dynamics while also providing information as to the relationship between population genetic structure and current management units.

| Sample acquisition and DNA extraction
This study complied with all ethical requirements of the Journal of Fish Biology and local authorities. No fish were killed or interfered with in any way as part of this study as all samples were in the form of ethanolpreserved fin clips of adults acquired solely from artisanal fish markets to ensure local providence of individuals. Samples were acquired from locations ( Essaouria and Tiznit), the tropical/subtropical Atlantic coasts of West Africa (Ghana: Tema, Angola: Namibe), and the cool temperate west coast of South Africa (Saldanha) between July and September 2017.

| mtDNA amplification and analysis
Two mtDNA regions were sequenced. A fragment of the cytochrome oxidase subunit 1 (COI) was PCR amplified using the Fish F1 and Fish R1 universal primers (Ward et al., 2005) and a fragment of the control region (CR) was PCR amplified using primers (TrachurusCRF2: 5 0 -CCAGTCCAACCTGCAAAGGA-3 0 and TrachurusCRR2: 5 0 -CACAGGGA GGCGGATACTTG-3 0 ) designed during this study. Individual PCRs were performed in 20 μl total volumes, containing 50-100 ng template DNA, 2 pmol of each primer, 10 μl of 2XBioMix (Bioline,UK) and 2 μl of ddH 2 O. The PCR thermoprofile was 3 min at 95 C, followed by 35 cycles of 95 C for 30 s, 45 s annealing at 54 C for COI or 52 C for CR amplification and 1 min at 72 C, followed by a final extension stage at 72 C for 3 min. Amplicons were purified and sequenced with respective forward and reverse primers using BIG DYE technology. Sequence identity was confirmed by BLAST [standard database-nucleotide collection (nr/nt)] and sequences were aligned using the CLUSTALW program (Thompson et al., 1994), implemented in BIOEDIT (Hall, 1999).
Phylogenetic relationships among haplotypes were inferred using (a) a minimum spanning network constructed in NETWORK (www. fluxus-engineering.com/sharenet.htm); and (b) both maximum likelihood (ML) and Bayesian inference (BI) trees constructed in PHYML (Guindon et al., 2010) and MrBayes (Ronquist & Huelsenbeck, 2003), respectively. ML and BI trees were generated using the optimal substitution model (HKY + G) identified by ModelTest (Posada & Crandall, 1998). BI trees were run using three heated and one cold chain over 5,000,000 generations, with the Markov chain sampled at 1000 generation intervals, and the first 15% of trees discarded as burn-in. BI posterior probabilities were calculated over the 5,000,000 generation run and ML support was calculated based on measures of bootstrap (BS), inferred after 1000 replicates.
ARLEQUINv3.5.2.2 (Excoffier & Lischer, 2010) was used to estimate indices of haplotype diversity (h) and to quantify differentiation Fu's Fs (Fu, 1997) and Tajima's D (Tajima, 1989) tests were used to test for deviations from mutation-drift equilibrium that could be attributed to selection and/or population size changes. Mismatch distributions, the frequency distributions of pairwise differences between haplotypes within a sample and simulated distributions under a model of demographic expansion, were compared using the sum of squared deviations (SSD) as a test statistic with significance assessed after 10,000 bootstrap replications. The timing of expansions (T) was estimated using the formula T = τ/2u (Rogers & Harpending, 1992). IMa2 (Hey, 2009) was used to estimate divergence times between groups with analyses being run for 1,000,000 burn-in generations and >5,000,000 sampling generations so that the minimum ESS across parameters was >50 (Hey & Nielsen, 2004).
Mismatch and IMa2 analyses were performed using both a universal divergence rate for marine fish species mtDNA of 1.5% per million years (Bermingham et al., 1997) and a fossil calibrated Trachurus specific divergence rate of 0.14% per million years proposed by Cardenas et al. (2005).
Genetic variation within samples was characterized using the number of alleles (N A ), allelic richness (A R ), observed heterozygosity (H O ) and expected heterozygosity (H E ), all calculated using GENALEX 6.2 (Peakall & Smouse, 2006). GENALEX was also used to test for genotype frequency conformance to Hardy-Weinberg expectations (HWE) and genotypic linkage equilibrium between pairs of loci. As locus × sample tests of HWE yielded a large number of deviations due F I G U R E 2 Median joining haplotype networks of T. trachurus and T. capensis based on (a) 605 bp of mtDNA COI, (b) 470 bp of mtDNA CR and (c) 1075 bp of concatenated sequence. Each dash represents a single site difference, node sizes are proportional to the overall haplotype frequency and pie discs correspond to within sample frequencies. , MH; , PO; , ME; , MT; ; GT; , AN; , SAS to heterozygote deficits, the software FreeNA (Chapuis & Estoup, 2006) was used to estimate frequencies of null alleles. Genetic differentiation among samples was quantified using F ST values with significance assessed following 9999 permutations in GENALEX with pairwise F ST visualized using a principal coordinate analysis (PCoA).
Global and pairwise F ST values were also recalculated using the null allele correction in FreeNA. Genetic structure was also investigated with/without prior sample definition and a combination of admixture/ no-admixture models, using the model-based Bayesian clustering programme STRUCTURE V2.3.4. (Pritchard et al., 2000). Simulations were run over 500,000 Markov chain Monte Carlo iterations with the first 100,000 repetitions discarded as burn-in. Optimal K values (from a range of 1-5) were identified using the log probability of data (Pr(X/K)).
Individual assignment tests with Bayesian allele frequency estimation (Rannala & Mountain, 1997) were performed in GENECLASS 2 (Piry et al., 2004) to assess both the pattern and power with which individuals were assigned to species groups. Following Bekkevold et al. (2015) a two-step approach was employed. In the first step, each individual was assigned to the species group from which its genotype had the largest probability of occurring (standard assignment). The second step aimed to disentangle cases where individuals were weakly assigned from "confident assignment" by imposing a more restrictive criterion wherein only assignments with probabilities >0.95 (following Hauser et al., 2006) were used. Self-assignment tests using the leave one out method were first used to assess the power of the analysis and then assignment of individuals treated as of unknown origin were performed as directed by the other results.  2 and 3). There was a clear spatial separation of these clades with them only co-occurring in the Ghana sample. One was found across the NE Atlantic and Ghana (and was absent from Angola and South Africa). The other clade was found only in South Africa, Angola and a single Ghanaian individual (Figures 2 and 3). These clades are hereafter referred to as the T. trachurus and T. capensis clades, respectively, based on BLAST. Based on the concatenated sequences mean estimates of divergence time between the clades were 1.08 (95% probability, range 0.6-1.5) and 0.1 (95% probability, range 0.06-0.14) million years before present (BP), assuming 0.14% and 1.5% divergence rates, respectively. Levels of intrasample variation were similar among all samples, as reported in Table 1. Pair-wise tests (

| Microsatellites
In total 10 loci were screened across 189 individuals and seven sample locations (Table 1). Allele sizes differed in expected multiples of the microsatellite repeat motifs. Thirty out of 70 locus by sample location comparisons deviated significantly from Hardy-Weinberg expectations (critical P = 0.05). Heterozygote deficits were less prominent for three loci, with the loci Tt48, TmurB116 and TmurC4 reporting significant test results for 0, 1 and 2 sample locations, respectively (Supporting Information Table S1). These three loci also reported the lowest average (across sample site) estimated null allele frequencies (NAF): (NAF)Tt48 = 0.008, (NAF)TmurB116 = 0.002 and (NAF) TmurC4 = 0.05). Excluding these three loci the average null allele frequency per locus ranged from 0.08 (TT113) to 0.26 (A115) with a mean of 0.14 (Supporting Information Table S2).  (Table 3).

STRUCTURE analyses performed for the unedited and three loci
data sets all failed to resolve any structure, with K = 1 unanimously supported. The lowest P for K = 1 was 0.89 for the analysis performed with the three loci subset and using LocPrior and assuming no admixture. Classical assignment tests differ from STRUCTURE in that they require some a priori classification of baseline samples for individuals to be assigned to. We were particularly interested in assessing whether the T. capensis mitotype bearing individual was assigned to either T. trachurus or T. capensis. To assess the power of the analysis we first performed species self-classification tests wherein we set up a "pure" T. trachurus baseline group by pooling the Portuguese (PO) and Moroccan (MH, ME and MT) samples and a "pure" T. capensis baseline group by pooling the Angolan (AN) and  (Hoarau et al., 2002). The association of the heterozygote deficits with certain loci and not others in this study would suggest a prominent role for null alleles. In their population genetic study of T. trachurus Kasapidis & Magoulas (2008)  that commonly inferred and even fossil obtained "species" level mutation rates may be considerably lower than those experienced at intraspecific levels and/or over more recent evolutionary time frames (1-2 MYA). This can lead to an over-estimation of the ages of divergence and demographic events (Grant, 2015;Ho et al., 2005Ho et al., , 2007Hoareau, 2015).

| Population structure within both species
There was no significant mtDNA differentiation between the T. capensis samples from Angola and South Africa. This is a striking result in light of the high level of mtDNA divergence, including examples of reciprocal monophyly, between Angolan and South African sites for a number of marine taxa due to the BUS biogeographic boundary (Gwilliam et al., 2018;Henriques et al., 2012Henriques et al., , 2014Reid et al., 2016). This could be due to a combination of the previously mentioned slow mtDNA mutation rates for Trachurus as well as effective gene flow across the BUS as might be expected based on the species' life history. Specifically, while the cool upwelled waters of the BUS are a dispersal barrier for many species, T. capensis is an upwelling associated species with peak abundance observed across the main upwelling regions and is expected to freely disperse across the BUS (Axelsen et al., 2004;Barange et al., 1998;McLaverty, 2012 An important consideration previously highlighted by Kasapidis & Magoulas (2008) is that the absence of genetic structure in T. trachurus and T. capensis does not mean there is no significant isolation of stocks on timescales of interest to management (Hauser & Carvalho, 2008). Resolution of such stock structure will require more powerful genomic approaches, such as RADseq, which offer the  (Cimmaruta et al., 2005), cannot be discounted the data indicate that local populations retain high levels of genetic variation and if current effective population sizes are maintained genetic drift alone will be insufficient to erode variation.

| CONCLUSIONS
The results of this study both inform conservation management strategies and direct future research. The data support the view that T. trachurus is not found in Angolan and South African waters as has sometimes been suggested. The co-occurrence of both species' clades in Ghana also reveals this area to be an apparent suture zone between their ranges. Further sampling of the region is required to assess the extent of spatial overlap between both species. The results here suggest both species are likely being indiscriminately harvested within this region, which may severely compromise stock sustainability (Healey et al., 2018a,b;McKeown et al., 2015). More powerful genomic-based assays will also be useful to permit individual-based analyses of hybridization and fish traceability (Helyar et al., 2011;Hemmer-Hansen et al., 2019). Such genomic assays will also be necessary for both species to align spatial management units with patterns of stock recruitment. Until such information is available the current designation of regional management units in T. trachurus should be maintained as a precautionary measure. Paradoxically, for T. capensis the likelihood of connectivity between the northern and southern BUS operational stocks should be considered in attempts to understand potentially cohesive stock dynamics (Frisk et al., 2008).