Species identification and comparative population genetics of four coastal houndsharks based on novel NGS‐mined microsatellites

Abstract The common smooth‐hound (Mustelus mustelus) is the topmost bio‐economically and recreationally important shark species in southern Africa, western Africa, and Mediterranean Sea. Here, we used the Illumina HiSeq™ 2000 next‐generation sequencing (NGS) technology to develop novel microsatellite markers for Mustelus mustelus. Two microsatellite multiplex panels were constructed from 11 polymorphic loci and characterized in two populations of Mustelus mustelus representative of its South African distribution. The markers were then tested for cross‐species utility in Galeorhinus galeus, Mustelus palumbes, and Triakis megalopterus, three other demersal coastal sharks also subjected to recreational and/or commercial fishery pressures in South Africa. We assessed genetic diversity (N A, A R, H O, H E, and PIC) and differentiation (F ST and D est) for each species and also examined the potential use of these markers in species assignment. In each of the four species, all 11 microsatellites were variable with up to a mean N A of 8, A R up to 7.5, H E and PIC as high as 0.842. We were able to reject genetic homogeneity for all species investigated here except for T. megalopterus. We found that the panel of the microsatellite markers developed in this study could discriminate between the study species, particularly for those that are morphologically very similar. Our study provides molecular tools to address ecological and evolutionary questions vital to the conservation and management of these locally and globally exploited shark species.


| 1463
MADUNA et Al. genetic diversity among populations (Dudgeon et al., 2012). It is likely that sharks may not respond well to population declines compared to other marine fishes owing to their K-selected life-history traits, i.e., slow growth, late maturity, and low reproductive outputs (Compagno, 1984;Ebert, Fowler, Compagno, & Dando, 2013). This highlights the need for conservation and management measures to ensure the sustainable utilization of these fishery resources. Implementing such measures often requires information on fishery dynamics, biological and baseline ecological data which in most cases are not yet available (Velez-Zuazo, Alfaro-Shigueto, Mangel, Papa, & Agnarsson, 2015).
Despite ongoing sampling difficulties, population genetics studies of bio-economically important sharks are now fast increasing due to molecular genetic markers becoming more readily available. For example, next-generation sequencing (NGS) has become a common approach to developing microsatellites in nonmodel organisms as it enables the recovery of thousands of repeat-containing sequences at a reduced time and cost (Blower, Corley, Hereward, Riginos, & Ovenden, 2015;Chabot & Nigenda, 2011;Pirog, Blaison, Jaquemet, Soria, & Magalon, 2015). Also, newly developed microsatellites for source species can be assessed for cross-species transferability in congeneric and confamilial (target) species and have shown to have a high success rate in elasmobranchs (Blower et al., 2015;Boomer & Stow, 2010;Chabot, 2012;Maduna, Rossouw, Roodt-wilding, & Bester-van der Merwe, 2014;Pirog et al., 2015). This allows for the development of a standardized panel of microsatellite multiplex PCRs for comparative population genetics studies and identification of species.
Identification of bio-economically important sharks during port inspections is very difficult (or even impossible) when using traditional taxonomic tools because of carcass processing at sea, where the head and fins are removed (Abercrombie, Clarke, & Shivji, 2005;Akhilesh et al., 2014;Stevens, 2004). During processing morphological and meristic criteria which are pivotal to the accurate identification of specimens are lost (Mendonça et al., 2010;da Silva & Bürgener, 2007).
South Africa is an ecologically and evolutionarily dynamic region with a diverse elasmobranch fauna (Bester-van der Merwe & Gledhill, 2015;Compagno, 1984;Ebert et al., 2013) and is located in the transition zone between the Atlantic and Indo-Pacific biomes (Briggs & Bowen, 2012). The Atlantic/Indian Ocean boundary in this region is characterized by two ocean basins, the Southeast Atlantic Ocean (SEAO) and Southwest Indian Ocean (SWIO) with two major currents, the cold Benguela Current and the warm Agulhas Current (Briggs & Bowen, 2012;Hutchings et al., 2009). Thus far, only a few regional population genetics studies related to sharks have been conducted in southern Africa but have shed some light on the possible impact of oceanographic features on gene flow patterns of species affected by fisheries, including the tope shark (Galeorhinus galeus), common smooth-hound (Mustelus mustelus), and spotted gully shark (Triakis megalopterus) (Bitalo, Maduna, da Silva, Roodt-Wilding, & Bester-van der Merwe, 2015;Maduna, da Silva, Wintner, Roodt-Wilding, & Bester-van der Merwe, 2016;Soekoe, 2016). These studies showed that the interaction between the two ocean currents plays a prominent role in limiting dispersal around the southern tip of Africa, particularly in an eastward direction for the common smooth-hound shark for example. Given that single-species conservation strategies do not adequately protect the biological and ecological needs of multiple species within threatened ecosystems, the focus has shifted toward multispecies approaches.
The local distribution ranges of all the triakid species (family Triakidae) investigated here, the tope shark, common smooth-hound, whitespotted smooth-hound (M. palumbes), and the spotted gully shark, extend across the Atlantic/Indian Ocean boundary. This presents an ideal opportunity to test whether the interplay of oceanographic features and life-history traits are the drivers of population subdivision in these sharks. The tope shark is a highly mobile semipelagic demersal species that is widely distributed in temperate waters (Ebert et al., 2013). Although sexual maturity depends on the ocean basin of origins, females reach sexual maturity at a total length (L T ) of 118-150 cm and males at 107-135 cm L T . Reproduction is viviparous (no yolk-sac placenta) with a triennial reproductive cycle (Ebert et al., 2013;Lucifora, Menni, & Escalante, 2004;McCord, 2005). Conversely, smooth-hounds are relatively small and less mobile epibenthic sharks (<170 cm L T ) (da Silva et al., 2013;Smale & Compagno, 1997). The common smooth-hound ( Figure 1) is a cosmopolitan species distributed across the Mediterranean Sea, the eastern Atlantic Ocean, and the Southwest Indian Ocean, whereas the whitespotted smoothhound is endemic to southern Africa and is found from Namibia to northern KwaZulu-Natal (Ebert et al., 2013;Smale & Compagno, 1997). Reproduction in the common smooth-hound is characterized by placental viviparity and a seasonal reproductive cycle whereby each cycle may take 1 year or longer. Sexual maturity is reached at 70-112 cm L T for males and 107.5-124 cm L T for females (Saïdi, Bradaï, & Bouaïn, 2008;Smale & Compagno, 1997). For the whitespotted smooth-hound, reproduction is characterized by aplacental viviparity and an aseasonal reproductive cycle although the timing of reproductive cycles is presently unclear. Sexual maturity is reached at 75-85 cm L T for males and 80-100 cm L T for females (Ebert et al., 2013;Smale & Compagno, 1997). Similar to smooth-hounds morphologically but with a larger body size, the spotted gully shark is endemic to southern Africa and is found from southern Angola to Coffee Bay, South Africa. Reproduction is ovoviviparous with a biennial to triennial reproductive cycle (Smale & Goosen, 1999;Soekoe, 2016). Sexual maturity is reached at 94-130 cm L T for males and 140-150 cm L T for females. Anecdotal evidence based on tagging data suggests that the spotted gully sharks exhibit a high degree of site fidelity or residency because ca. 80% of these animals were recaptured close to their release site (within a 20-km radius), regardless of the time at liberty (Dunlop & Mann, 2014;Soekoe, 2016).
Here we characterize a set of NGS-mined microsatellites in common smooth-hound and evaluate the potential of cross-species utility of these markers in species identification and assessing the distribution of genetic variation across populations sampled along the South African coast.

| Sample collection and genomic DNA extraction
A total of 144 finclip samples from four coastal shark species (the tope shark, common smooth-hound, whitespotted smooth-hound, and the spotted gully shark) were examined (Table 1). We included samples from the west and east coasts, representing the two main ocean basins (SEAO and SWIO) spanning the South African coastline ( Figure 2   and conducted according to the manufacturer's instructions except for varying primer concentrations (Table 3) and T A , 56°C for MPS1 and 57°C for MPS2. For subsequent analysis on an ABI 3730XL DNA Analyzer, PCR products were diluted in distilled water and fragment analysis performed together with the LIZ600 internal size standard. Individual genotypes were scored based on fragment size via GENEMAPPER v. 4.0 (Life Technologies, South Africa). To determine the utility of these markers for future regional studies of intra-and interspecific genetic diversity in houndsharks (Triakidae), we also tested the 11 microsatellite loci on the blackspotted smooth-hound, spotted gully shark, starry smooth-hound, tope shark, and whitespotted smooth-hound using the PCRs and microsatellite genotyping conditions described previously.

| Microsatellite validation, cross-species amplification, and species identification
To evaluate the reliability of using cross-amplified microsatellites for species identification, we conducted multivariate clustering analysis using the discriminant analysis of principal components (DAPC) implemented in the R package ADEGENET (Jombart, 2008

| Microsatellite characterization
For the four study species, we tested all loci for scoring errors and allelic dropout using MICRO-CHECKER v. matching alleles were excluded from further analyses. Using FREENA (Chapuis & Estoup, 2007), we estimated the frequency of null alleles following the expectation maximization (EM) method described by Dempster, Laird, and Rubin (1977). We calculated deviations from Hardy-Weinberg equilibrium (HWE) for each locus using the exact probability test based on 10,000 iterations (10,000 dememorization, 500 batches) in GENEPOP v. 4.0 (Rousset, 2008 (Ryman & Palm, 2006) to assess the statistical power of the loci for F ST tests (i.e., rejection of the null hypothesis H 0 of genetic homogeneity among two subpopulations when it is false) and the α level (i.e., rejection of H 0 when it is true) using a sampling scheme of two subpopulations with 20 individuals each. The analyses were conducted using 10,000 dememorizations, 100 batches, and 1,000 iterations per batch with the allele frequencies observed for the complete dataset of 11 microsatellite loci and our reported sample sizes for each species.

| Within-species population genetic analysis
Pairwise F ST (Weir & Cockerham, 1984) and Jost's D est (Jost, 2008) were calculated using the DIVERSITY package, and the analysis of molecular variance (AMOVA) was calculated using ARLEQUIN. To account for our sampling strategy, the measures of genetic differentiation comparisons were considered significant if the lower CI was >0, and p-values were <.05 following FDR correction. To visualize population distinctness, we used ADEGENET to perform discriminant analysis of principal components (DAPC) on clusters defined by ocean basin.
The number of clusters was assessed using the find.clusters function, which runs successive K-means clustering with increasing number of clusters (k). For selecting the optimal k, we applied the Bayesian information criterion (BIC) for assessing the best supported model, and therefore the number and nature of clusters, as recommended by Jombart, Devillard, and Balloux (2010). DAPC scatter plots were only drawn for k > 2. We also used a Bayesian clustering model-based method implemented in STRUCTURE 2.3 (Pritchard, Stephens, & Donnelly, 2000) to detect the most probable number of genetic clusters (K) present in each species. We applied an admixture model with correlated allele frequencies for 10 replicates across K = 1 to K = 10 with each run consisting of 1,000,000 Markov chain Monte Carlo (MCMC) iterations and an initial burn-in phase of 100,000 iterations assuming no prior population information. Given that only two groups of samples were compared for each species, the ad hoc statistic ∆K described in Evanno, Regnaut, and Goudet (2005) and commonly used to identify the likely number of genetic clusters was not considered appropriate for our study. This ∆K statistic never assigns K = 1 (Evanno et al., 2005). Here, the posterior probability of the data (X) for a given K, Pr(X|K), calculated by STRUCTURE was used to compute the mean likelihood L(K) over 10 runs for each K to identify the likely K for which L(K) was highest (Pritchard et al., 2000) as implemented in STRUCTURE HARVESTER 0.6.94 (Earl & vonHoldt, 2012). CLUMPAK (Kopelman, Mayzel, Jakobsson, Rosenberg, & Mayrose, 2015) was used for the graphical representations of the STRUCTURE results. Given that we were uncertain about sampling locations of several individual Mustelus palumbes, we also used the program GENECLASS2 v2.0 (Piry et al., 2004), to examine genetic structure based on assignment tests for this species. Assignment probabilities of individuals were calculated using a Bayesian procedure (Rannala & Mountain, 1997) and Monte Carlo resampling using 100,000 simulated individuals and a threshold of 0.01.
Finally, we used the coalescence-based method in the program MIGRATE-N 3.6.11 (Beerli, 2006;Beerli & Palczewski, 2010) implemented on the CIPRES Portal v3.3 at the San Diego Supercomputer Center (Miller, Pfeiffer, & Schwartz, 2010) to compare alternative migration pattern across oceans. We evaluated four migration models: (1) a full model with two population sizes and two migration rates (from SEAO to SWIO and from SWIO to SEAO); (2)

| Microsatellite multiplex assays, cross-species amplification, and species identification
The two sequencing runs of the Nextera™ library for Mustelus mustelus generated 35 GB of raw reads. After trimming the raw sequences that included removal of adapters, N-containing reads, and low-quality reads, we retained a total of 17 GB clean reads. After the de novo assembly of the Illumina paired-end reads, we recovered a total of 27,512,666 contigs. We identified a total of 82,879 contigs that were longer than 250 bp, of which 2,572 (3.1%) contained microsatellites.
Dinucleotide repeats were the most frequent (1,629 or 86.1%), followed by trinucleotide repeats (232 or 12.3%), and tetranucleotide repeats (31 or 1.6%). We selected 15 microsatellite containing contigs for primer design with an expected PCR product size ranging between 112 and 431 bp. Of the 15 loci tested, all were successfully amplified while only 11 were polymorphic based on initial screening via polyacrylamide gels (Table 2). These loci were fluorescently labeled to construct a 5-plex and 6-plex assay that were both validated over 48 individuals from two populations of the common smooth-hound ( Figures A1 and A2, Appendix).
The genetic diversity summary statistics for both multiplex assays are presented in Table 2. All markers were polymorphic and produced a total of 74 alleles (mean 6.2). There was no evidence of stutter products or significant allelic dropout based on the MICRO-CHECKER results, but null alleles were detected at two loci (Mmu5 and Mmu14) with high frequencies estimated in FREENA relative to the rest of the loci ( and cross-species amplification rate of success ranged from 72% to 100% (Table 4).
Additionally, to validate the potential of these markers for withinspecies population genetic analysis, we inferred genetic variation in samples collected from two different ocean basins for each respective species (

| Tope shark Galeorhinus galeus
Population differentiation between the SEAO and SWIO was significantly greater than zero (F ST = 0.034, lower 95% CI > 0), while similar to M. mustelus, Jost's D est indicated less pronounced levels of T A B L E 4 Cross-species transferability results of 11 microsatellites tested among six triakid species  Coalescent analyses for migration model comparison highly supported (P Mi = 1.0) Model 2 (i.e., migration from SWIO to SEAO) and showed that Θ was highest in the SWIO (Θ = 6.820) and lowest in the SEAO (Θ = 1.380) (Tables A5 and A6).

| DISCUSSION
Recent advances in next-generation sequencing technologies have considerably accelerated the mining of species-specific microsatellite loci in shark species generally devoid of molecular markers (Blower et al., 2015;Chabot & Nigenda, 2011;Pirog et al., 2015). In this study, the use of Illumina HiSeq ™ 2000 for reduced genome sequencing was successful regarding speed, accuracy, and cost in generating microsatellites. It provided an efficient way to develop microsatellite markers, even though some factors such as library preparation, read length, and precision of the assembly can be improved in future studies. The Similar to the studies of the Australian gummy shark Mustelus antarticus (Boomer & Stow, 2010), the tope shark (Chabot & Nigenda, 2011), and the brown smooth-hound shark M. henlei (Chabot, 2012), we found that dinucleotide microsatellite repeats were the most frequent repeat type present in the common smooth-hound shark genome.
Furthermore, we successfully constructed and optimized two polymorphic multiplex assays for the common smooth-hound shark. The validation of our multiplex assays in the common smooth-hound revealed similar genetic diversity indices as found in a previous study of the same species using cross-amplified loci (Maduna et al., 2016).
Given that in sharks, microsatellite flanking sequences are conserved owing to low mutation rates (Martin, Pardini, Noble, & Jones, 2002) we tested for the cross-species amplification of orthologous microsatellite loci in other Triakidae species. We observed a high cross-species amplification rate of success (>70%) across all microsatellite loci. Such findings were similar to those previously reported on sharks (Blower et al., 2015;Chabot & Nigenda, 2011;Giresi, Renshaw, Portnoy, & Gold, 2012).
All the species that were included in this study were closely related and accordingly the high performance of cross-species amplification was expected, albeit the blackspotted smooth-hound had the lowest transferability rate possibly due to the presence of null alleles. These loci, nevertheless, could prove useful in elucidating patterns of population genetic structure and gene flow within other Triakidae species.
Besides the comparison of population genetic parameters among multiple closely related species, cross-species microsatellites can also be applied for species identification based on species-specific allele sizes at multiple loci, a technique that has rarely been used for forensic studies of sharks (Giresi et al., 2015;Maduna et al., 2014;Marino et al., 2014). Indeed, our multiplex assays proved useful in discriminating between the study species, particularly for those that are morphologically very similar.
Our assessment of the distribution of genetic diversity of the four codistributed coastal sharks (the common smooth-hound, spotted gully shark, tope shark, and the whitespotted smooth-hound) based on the newly developed multiplex assays indicated that the microsatellite loci are informative for species identification as well as for population genetic analysis. Our preliminary population genetics estimates hinted at the combined effects of oceanographical barriers and life-history differences (e.g., mobility and sex-specific dispersal strategies) to be the major factors influencing the patterns of regional  (Boomer, 2013;Pereyra et al., 2010). In such systems genetic structure is usually reflected by a combination of effective population size, individual movements and migrations, seascape feature, and habitat preferences, e.g., the narrownose smooth-hound M. schmitti (Pereyra et al., 2010), the Australian gummy shark (Boomer, 2013), the rig M. lenticulatus (Boomer, 2013), and the brown smooth-hound shark (Chabot et al., 2015;Sandoval-Castillo & Beheregaray, 2015). Pereyra et al. (2010) and Boomer (2013) found no evidence of population genetic structure, while Chabot et al. (2015) and Sandoval-Castillo and Beheregaray (2015) provided compelling evidence for the interplay of oceanography and dispersal differential between sexes in shaping genetic structure. In agreement with Maduna et al. (2016), our study found asymmetric gene flow that predominantly occurs from the Southwest Indian Ocean to Southeast Atlantic Ocean for the common smooth-hound, and a similar trend was observed for the whitespotted smooth-hound. Granted, the reproductive and seasonal behavior of the two study smooth-hounds remain for the most part unknown (sensu Smale & Compagno, 1997;da Silva et al., 2013), particularly for the whitespotted smooth-hound, but it appears that genetic structure in these species is highly similar (at least in the samples investigated here).
Results from previous research indicated that levels of gene flow across the Atlantic/Indian Ocean boundary for the tope shark were relatively high (Bitalo et al., 2015), yet we found significant interoceanic genetic structure with two genetic clusters characterized by lower levels of admixture (SEAO and SWIO). The Bitalo et al. (2015) study, however, included only one Indian Ocean population (Struis Bay) in close proximity to the proposed boundary and noted significant population differentiation between this SWIO sampling site and a SEAO sampling site, Robben Island. In addition, Bitalo et al. (2015) did note that overall samples collected west of the Atlantic/Indian Ocean boundary exhibited a more significant level of admixture than those collected east of the boundary. We conclude that the genetic structure observed in our study is in agreement with that of the previous study given our sampling locations for the species. Similarly, for smooth-hounds, long-term gene flow estimates between ocean basins were asymmetrical and mainly occur from the Southwest Indian Ocean to Southeast Atlantic Ocean. The homogenous population structure observed here for the spotted gully shark was unexpected, given the available tagging data which indicate possible philopatric behavior for the species, although, it freely travels across the Atlantic/Indian Ocean boundary (Dunlop & Mann, 2014;Soekoe, 2016 Coalescent analyses for migration model comparison highly supported the model of the southward flux of migrants (i.e., migration from SWIO to SEAO) and showed that Θ was highest in the SWIO and lowest in the SEAO populations in all study species. Our finding of similar asymmetric migration patterns in these species might suggest that such patterns arose from the action of shared physical boundaries. Also, water temperature changes have been shown to influence movement of these triakid sharks and other closely related species (Chabot & Allen, 2009;Espinoza, Farrugia, & Lowe, 2011;da Silva et al., 2013;Soekoe, 2016;West & Stevens, 2001). From the perspective of thermal physiology, albeit speculative, individuals from subtropical and/or warm-temperate bioregions can more easily colonize the cool-temperate bioregions as opposed to the reverse. Nevertheless, it is apparent that the cold Benguela Current and its interplay with the warm Agulhas Current also influence the patterns of gene flow in these coastal sharks as evident in a variety of other regional coastal fish species (Henriques, Potts, Santos, Sauer, & Shaw, 2014;Henriques, Potts, Sauer, & Shaw, 2012, 2015 as well as passively dispersing marine species (Teske, Bader, & Rao Golla, 2015). Although our population and genetic sampling are limited, the Agulhas Current presents a significant barrier to the northward migration in smaller coastal sharks. In summary, the newly developed multiplex assays will provide valuable molecular tools for species identification, assessing the distribution of genetic diversity and determining the directionality of gene flow, factors which are all vital for the conservation and management of these local exploited shark species.

ACKNOWLEDGMENTS
Anelda van der Walt is immensely acknowledged for training Simo N.