Chimerism and population dieback alter genetic inference related to invasion pathways and connectivity of biofouling populations on artificial substrata

Abstract Disentangling pathways by which nonindigenous species expand and spread regionally remains challenging. Molecular ecology tools are often employed to determine the origins and spread of introduced species, but the complexities of some organisms may be reducing the efficacy of these tools. Some colonial species exhibit complexities by way of chimerism and winter colony regression, which may alter the genetic diversity of populations and mask the connectivity occurring among them. This study uses nuclear microsatellite data and simple GIS‐based modeling to investigate the influence of chimerism and winter regression on the genetic diversity and patterns of genetic population connectivity among colonies of Didemnum vexillum on artificial substrates. Colonies sampled in summer were shown to form a metapopulation, with high levels of admixture, extreme outcrossing, and some substructure. These patterns were consistent within the subsampled winter colonies and with the inclusion of chimeric data. However, allelic richness and diversity were significantly different between winter and summer samples, altering interpretations relating to population connectivity and pelagic larval duration. This study demonstrates the importance of including seasonal sampling and imperative life history traits in genetic studies for clear interpretations and the successful management of introduced species.

may lead to a better understanding of the nonindigenous communities, and, therefore, more effective management tools. However, the incorporation of such information into the identification of pathways and spread of nonindigenous species (NIS) remains unclear, when we are yet to incorporate known phenomena, such as the stochasticity and complexity of broadcast spawning and fragmentation (Eldon & Wakeley, 2009;Zhu et al., 2015), increased genetic diversity due to chimerism (Ben-Shlomo, 2017), and the seasonality of occurrence (Fletcher, Forrest, Atalah, & Bell, 2013).
Chimerism is particularly problematic for genetic studies, but is yet to be comprehensively investigated and incorporated into analyses and interpretations. In a chimeric colony, two copies of a gene can occur in a single zooid, resulting in multiple copies within a small section of the colony tissue. Multiple studies have now identified chimerism in NIS, such as the colonial Botryllid and Didemnid ascidians (Ben-Shlomo, 2017;Pérez-Portela et al., 2012;Sheets, Cohen, Ruiz, & Rocha, 2016;Smith, 2012;Smith et al., 2015), but each researcher faced with multiple copies of the target sequence is forced to treat them in different ways due to limited analytical tools dealing with such intracolony variation. For example, Sheets et al. (2016) treated mitochondrial homoplasmy in Botrylloides nigrum (Herdman, 1886) individuals as two individuals, thereby artificially inflating the population size, whereas Smith (2012) highlighted the number of potential chimeras in Didemnum vexillum (Kott 2002), from New Zealand and Japan, but treated these as a separate experimental study due to the infancy of investigations of chimerism for this species. The main issue posed by chimerism for pathway identification is the potential for misinterpretation of genetic diversity and downstream population structure analyses.
Confounding the chimeric effect is the complex life history of colonial NIS. For instance, the life history of D. vexillum includes sexual and asexual reproductive strategies (Sakai et al., 2001), with seasonal regression in abundance and growth during winter months (Fletcher, Forrest, Atalah, et al., 2013;Valentine, Carman, Blackwood, & Heffron, 2007). The pelagic larval duration of most ascidians is limited to a few hours, reducing the potential for natural dispersal (Herborg, O'Hara, & Therriault, 2009;Morris & Carman, 2012;Osman & Whitlatch, 2007), but D. vexillum has been recorded as viable in the water column for up to 36 hr under laboratory conditions . Asexual reproduction by means of budding and fragmentation can also accelerate this species' spread, as fragments may be less susceptible to competition and predation, and brooded larvae in fragments could be released prior to, or following, reattachment (Bullard et al., 2007). Furthermore, Morris and Carman (2012) found that under laboratory conditions, D. vexillum fragments can remain viable after suspension in the water column for three weeks, with only one zooid required for successful colony regrowth.
It is not surprising that we have limited power to resolve pathways of spread for NIS given the challenges posed by complex life histories, integrated with anthropogenic vector transport. In this study, we have used D. vexillum to investigate the effects of chimerism and seasonal regression on downstream population genetic analyses, to better understand their influence on interpretations and management decisions. As a dominant biofouler of artificial structures and due to its enhanced potential for spread , common chimeric formations (Smith, 2012), and winter regression, D. vexillum represents a valuable model for this study.

The New Zealand incursion of D. vexillum in the Marlborough
Sounds region also provides a model system for this study. In New Zealand, D. vexillum has established in six major ports and an important aquaculture hub in the Marlborough Sounds region (Fletcher, Forrest, Atalah, et al., 2013;McDonald & Acosta, 2012) since its initial discovery in a North Island Harbour in 2001 (Coffey, 2001). Following its first recorded occurrence in the South Island in Shakespeare Bay, two events involving anthropogenic vectors were instrumental to the spread of D. vexillum around the Marlborough Sounds: the movement of an infected aquaculture structure to the outer regions of the Queen Charlotte Sound; and the regional transfer of infected mussel seed stock in the Pelorus Sound (Forrest & Hopkins, 2013; Figure 1). Subsequent local spread has been tentatively attributed to the movement of aquaculture equipment and vessels, as well as natural "stepping-stone" dispersal between man-made structures throughout the Sounds . However, this has not yet been investigated with genetic tools, providing an opportunity to test different scenarios of spread.
The primary goal of this region-specific study was to investigate standard population genetic measures of an invader with a relatively well-mapped pathway of invasion, but complex life history and genomic arrangement. Microsatellite markers were employed for finescale spatial and temporal resolution. The four objectives were to: (a) investigate the genetic structure of D. vexillum colonies sampled in the Austral summer using a diploid dataset obtained from standard microsatellite scoring; (b) compare the observed genetic connectivity with predictions of dispersal generated from mathematical modeling; (c) assess the influence of chimerism on the genetic diversity and observed genetic connectivity; and (d) assess the influence of seasonal regression on the genetic structure of a subset of D. vexillum colonies.

| Sample collections
Tissue samples from colonies morphologically identified as D. vexillum were collected from artificial structures (mussel ropes, floats, wharfs) between January-March 2013 (austral summer, Figure 2) and early in August 2014 (austral winter, Figure 2). Thirty tissue samples were collected from structures at each site within a main aquaculture region and nearby port areas in the top of the South Island, New Zealand. Sampled sites in these areas included Port Nelson (N = 1 site, in summer), Pelorus Sound (N = 9 sites in summer and N = 6 sites in winter), and Queen Charlotte Sound (N = 5 sites in summer). The Port Nelson site represented a younger incursion, outside of the Marlborough Sounds region, sampled in summer to provide a more complete broad-scale resolution for assignment tests. It was not used in the connectivity or seasonal comparisons. Samples were collected from colonies or colony fragments situated ≥2 m apart to minimize the chance of pseudoreplication due to sampling from clonally related colonies . Tissue samples (ca. 100-500 mg) were preserved in approx. 1.0 ml of 100% (v/v) ethanol and stored at −20°C.

| DNA extraction and amplification
For each specimen, a 2-mm 2 section of tissue was cut for DNA extractions. The tissue was macerated using flame-sterilized forceps and a scalpel. Total genomic DNA for polymerase chain reaction (PCR) amplification was extracted using a lithium chloride/chloroform extraction protocol (Gemmell & Akiyama, 1996). Extracted DNA samples were stored at −20°C.

| Microsatellite processing
Thirteen polymorphic microsatellite loci were amplified in all samples. PCR amplifications were done using the Qiagen Type-it Microsatellite PCR Kit and M13 tags to label each forward primer, with the addition of M13 fluorescent-labeled primers (FAM, PET, VIC, NED). Loci were pooled for multiplexing (Schuelke, 2000) and assigned using the MULTIPLEX MANAGER 1.0 software (Holleley & Geerts, 2009) (Supporting information Table S1). The multiplexing protocol was performed with a final volume of 4 μl and contained 2x Type-it Multiplex PCR Master Mix, 0.0216 µm of each M13-labeled, locus-specific forward primer, 0.0864 µm of each locus-specific reverse primer, 0.135 µm of M13 5'-end labeled with an Applied Biosystems (ABI) dye (FAM, NED, PET, or VIC), 0.82 μl of RNase-free water, and 2 μl of 5.5-8.5 ng/µl diluted DNA.
F I G U R E 1 Map of the Marlborough Sounds, indicating the subsequent spread of Didemnum vexillum (red arrows) within Queen Charlotte Sound, following its initial incursion into Shakespeare Bay (black square), and within Pelorus Sound, following its incursion into Te Puraka Bay (black square). Spread was facilitated by the movement of aquaculture infrastructure and infected seed stock; images are located on the map in accordance with the areas influenced by these movements (map recreated, with photograph credit to Cawthron Institute, Lauren Fletcher). Unfilled red circles indicate the location of marine farms that were instrumental in the spread of D. vexillum The thermocycling parameters for all PCRs included an initial denaturation at 94°C for 15 min, 94°C for 30 s, 58°C for 90 s, and 72°C for 60 s for 8 cycles, followed by 89°C for 30 s, 56°C for 90 s, and 72°C for 60 s for 25 cycles, with a final 30-min extension at 60°C (Schuelke, 2000). Post-PCR products were diluted with 5 μl of Milli-Q water, producing a total volume of 9 μl. A volume of 2 μl of the diluted PCR products was then taken from each multiplexed loci and pooled with other multiplexed loci to form genotyping groups (Supporting information Table S1), to a volume of 6-8 μl. To each genotype group, 10 μl formamide and 0.4 μl of GeneScan 500LIZ internal size standard ABI (per individual) were added for genotyping. Genotyping was resolved by the University of Canterbury Sequencing Service (Christchurch, New Zealand) on an ABI 3100 DNA analyzer. Alleles for each locus were scored using GENE MARKER v.1.6 (SoftGenetics LLC).
Replicates (minimum 10% of the sample size) were assessed for amplification errors and the repeatability of scoring throughout the experimental process. One individual was used in every run as a positive control. Scores from individual amplifications were also compared to those acquired via multiplexing to detect possible multiplexing errors. To ensure genotyping accuracy, one negative and one positive control sample were included with each PCR run. Given D. vexillum is able to form chimeric colonies, with the potential for more than two alleles per locus to be observed in a colony sample, the locus scores from the winter/summer samples were separated into two datasets. One dataset incorporated the two alleles with the highest peaks (diploid dataset), as per Smith (2012), and the other incorporated all detected peaks within 50% of the height of the two main peaks to produce the chimeric data.

| Standard microsatellite summary
For the diploid dataset, all microsatellite markers and sampled populations (winter and summer) were assessed for deviations from Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) using ARLEQUIN v.3.5.1.3 (Excoffier & Lischer, 2010). All calculations were conducted per locus and sampled population. For analyses conducted in ARLEQUIN v.3.5.1.3, 10,000 permutations were used and 95% confidence intervals for F-statistics were obtained by bootstrapping over loci 20,000 times. In addition, MICRO-CHECKER v.2.2.3 (Arif et al., 2010;Van Oosterhout, Hutchinson, Wills, & Shipley, 2004) was used to assess all loci for null alleles, as well as genotyping errors, such as large allele dropouts and stutter (1,000 randomizations). To account for multiple comparisons and control alpha inflation when two or more statistical tests on the same data were performed, a false discovery rate (FDR) correction was applied (Benjamini & Yekutieli, 2001;García, 2004;Narum, 2006). Using the following equation: CPV = / ∑ k i=1 (1∕i), the corrected p-value (CPV) was obtained, where α = 0.05; k = the number of tests performed, and i = the ith observation (Narum, 2006). Values below this were considered significant for multiple comparisons.
Objective 1: Standard population genetic structure (summer diploid data) To investigate the genetic structure among founder populations (early introductions into Queen Charlotte Sound) and secondary introductions (adjacent populations in Pelorus Sound), likelihood F I G U R E 3 Maps of Pelorus Sound, displaying variable amounts of connectivity between mussel farms as generated by a connectivity matrix. Four different pelagic larval duration times and their associated clusters are presented (a = 2 hr, b = 12 hr, c = 24 hr, and d = 36 hr). Each cluster is represented by a different color, and connected clusters are denoted by lines and an overlaid, similar-colored image assignments were conducted in GenALEx v.6.5. Isolation by distance (IBD) among populations within Queen Charlotte Sound and Pelorus Sound was assessed using MANTEL tests in GenALEx v.6.5 (9,999 permutations) with a geographic Euclidean distance matrix (kilometers). The Port Nelson population was excluded from the IBD assessment, as spatial disharmony with sites would generate structural information, not pattern, in the results. No more than two loci were missing per individual, and 10% missing data per loci were selected as an acceptable missing data rate (Pritchard, Stephens, & Donnelly, 2000).
Objective 2: Population connectivity assessed by genetic and modeled data (summer diploid data) To generate the connectivity model, connectivity scores for artificial substrates throughout Pelorus Sound were obtained using a GISbased connectivity matrix. This matrix was constructed by incorporating estimated current velocities (Knight & Beamsley, 2012) and pelagic larval duration (PLD) (Fletcher, Forrest, Atalah, et al., 2013; ;. Figure 3 provides a visualization of potential "stepping-stone" movements between farms, with connectivity represented by lines and colors representing isolated "clusters". Locations sampled within Pelorus Sound were used to assess genetic differentiation between populations in relation to clusters generated from the mathematical connectivity model (testing 2, 12, 24, and 36-hr PLD). Two of the PLD models (12-and 24-hr) were tested with separate AMOVAs to assess genetic differentiation among sampled locations within and among resulting clusters.
The clusters formed in the 24-hr PLD were almost identical to the sampled farms within the 36-hr PLD model (generated by the matrix), both models were represented by the 24-hr model for further analysis. An AMOVA was not done on the 2-hr PLD model as 273 clusters were formed, with each population sampled representing unique clusters.
Objective 3: Chimeric comparison of standard population genetic structure and connectivity (summer chimera dataset) To assess the influence of chimerism, the raw allele data were rescored so that additional allele peaks detected by the ABI 3100 DNA analyzer within 50% of the height of the two main allele peaks (the diploid dataset) were incorporated into one dataset.
These were then treated as polyploid data as it was not possible to isolate the different individuals or genotypes in chimeric results. POLYSAT in R v.3.0.2 (Clark & Jasieniuk, 2011) was then used to determine differences in the allelic richness and diversity of D. vexillum colonies and PLD models when incorporating polyploid data compared to excluding it. In the POLYSAT analysis, all individuals were assumed to be tetraploid. Ploidy was set to four, and missing values were given a set value. POLYSAT was also used to generate pairwise FST comparisons for the winter and summer polyploid data. A single-factor analysis of variance test (ANOVAs) computed in R v.3.0.2 was used to test the significance of any differences.
To visualize the impact of incorporating chimeric data, population distance matrices were used to generate separate principal coordinates analyses (PCoA) in GENALEX v.6.5 for both diploid and polyploid datasets derived from the summer sampling period. resampling were based on industry boat movements and the presence of Didemnum. All genetic analyses, except for isolation by distance, were repeated for the winter diploid dataset. A winter polyploid dataset was created, and the influence of chimeric data was analyzed as previously described. Genetic differentiation between seasons was determined using a hierarchical analysis of molecular variance (AMOVA) performed in ARLEQUIN. No more than two loci were missing per individual, and 10% missing data per loci were selected as an acceptable missing data rate (Pritchard et al., 2000).

| Standard microsatellite summary
Eight of the thirteen polymorphic microsatellite loci amplified consistently across all sampled locations (for summer and winter samples), with 3 (Dvex01) to 13 (Dvex30 and Dvex33) alleles observed per locus (Supporting information Table S1). Allele scoring was repeatable at 95% for the positive control across all loci. Chimeric colonies were identified across all populations (ranging from 17% to 48%;  (Watts, 2014, unpublished). Therefore, the full set of loci was used for all final analyses.

| Objective 1: Standard population genetic structure (summer diploid data)
Of the of 420 linkage disequilibrium tests, 11 (2.6%) showed evidence for loci pairs with linkage disequilibrium after FDR correction (CPV = 0.008). The test for linkage disequilibrium in ARLEQUIN assumes Hardy-Weinberg proportions of genotypes and therefore the significance of these tests may be a result of departures from HWE (Excoffier & Slatkin, 1998). However, loci pairs were not consistently linked across populations, and therefore physical chromosomal linkage appears unlikely, and all markers were considered as independent replicates of the D. vexillum genome.
There was a total range of 4 (Dvex01) to 13 (Dvex30 and Dvex33) alleles for the eight polymorphic loci (Supporting information Likelihood assignments, where the three founder populations were preset and Pelorus Sound samples were assigned to these, indicated patchy substructure and high levels of admixture among all populations. Consequently, it was not possible to definitively identify a single founder population for Pelorus Sound (Figure 4a).
However, all three founder populations had ≥50% "self" assignment ( Figure 4a). In addition, there was evidence of clustering in the assignment, with the outer sound populations predominantly assigned to Te Aroha Bay, and the inner sound populations, as well as the Port of Nelson population predominantly assigned to Picton ( Figure 4a).

| Objective 3: Chimeric comparison of standard population genetic structure and connectivity (summer chimera dataset)
All populations exhibited chimerism at 17%-48% (Table 1). While technical artifacts can sometimes be created by programs such as POLYSAT, the allelic diversity for the chimeric dataset (polyploid data) was not significantly different from the diploid dataset (t = 2.15, p = 0.803), providing some confidence in the program.
However, for chimeric colonies, the frequency of more common alleles was reduced, less common allele frequencies were increased, and rare alleles were introduced (Supporting information Table S3 and S4).

Winter diploid
For the winter samples, there was a total range of 3 (Dvex01) to 9 (Dvex03) alleles for the eight polymorphic loci (Supporting information   (Table 2, winter results, 2.81% (F CT ) and 5.54% (F SC ), respectively).
Significant variation was also detected within the summer and winter groups, which explained 91.8% of the variation in the data (Table 2, Table 2, winter results). There was also significant genetic differentiation among populations within clusters for the 24-hr PLD model (F SC = 0.07305, p < 0.001; Table 2).
Similarly, significant genetic differentiation was not detected among clusters for the 12-hr PLD models (F CT = −0.02741, p > 0.05; Table 2).  (Supporting information Table S5a, b, and c), and was similarly reflective of the PLD and current movement represented by the connectivity model.

Winter chimeric data
The proportion of chimeric colonies identified in winter (17%-48%) was similar to summer (19%-47%). Overall, no significant differentiation was found between summer and winter populations when incorporating chimeric data (F ST = 0.02, p > 0.05), and significant differences in the overall allele diversity and frequency were also not found.
Results generated from the winter diploid PCoA analysis showed four separate "groups" (Figure 5c). Melville Cove appeared as an outlier with less similarity to all other sampled populations, as did Goulter Bay (Figure 5c). Hallam Cove and Forsyth were grouped together, as were Yncyca and Hikapu Reach, each appearing more similar to one another than all other populations ( Figure 5c). In contrast, the chimeric data changed these groupings, all samples appeared different to each other, with no obvious "groups" appearing in the winter polyploid PCoA analysis (Figure 5d).

| D ISCUSS I ON
In this study, we have assessed the influence of chimerism and winter dieback on the interpretation of standard population genetic structure and connectivity analyses of the nonindigenous colonial ascidian Didemnum vexillum. Although overall allelic changes were not significant with the inclusion of chimerism and winter dieback, the changed distribution of alleles did alter downstream population analyses and interpretation for both sets of data.
In contrast to our study, deviations from HWE were not detected within D. vexillum populations from samples taken across New Zealand. Those results were pooled from across New Zealand to present one "area," so it is plausible that the scale of sampling re- to only 27% of colonies tested in Japan. This high fusion potential of the New Zealand colonies suggests 17%-48% chimerism observed in our data was within an expected range.
To our knowledge, only one other study has partially incorporated chimerism into downstream analyses, and this was for the colonial ascidian Botrylloides nigra (Sheets et al., 2016). Homoplasy observed in B. nigra samples was incorporated by treating homoplasmic individuals as two separate individuals. This inclusion added three unique haplotypes to the data, which were spread between two oceans. When the chimeric individuals were removed from diversity analyses, two populations greatly decreased in diversity.
The downstream analyses were not conducted with and without the chimeras so the influence on statistical interpretations was not assessed (Sheets et al., 2016).
In our study, the inclusion of chimeras also altered the distribution of allelic diversity, such that the frequency of more common alleles in some populations was reduced, less common allele frequencies were increased, and rare alleles were introduced. The introduction of rare alleles (frequencies <0.01) is a common feature of microsatellite loci, and very rare alleles are often uninformative for population-based analyses, as their presence may be due to reoccurring mutations rather than historical association or contemporary gene flow (Hale, Burg, & Steeves, 2012). However, in this case, downstream analyses were affected by the shift in allelic variation, whereby similarities among populations were altered to become more dissimilar or more similar than observed in the "diploid" dataset. In addition, the influence of chimeras, specifically on the assessment of dispersal clusters and principal component analyses, was more pronounced for the winter samples. This study has highlighted subtle, but important shifts in the downstream analysis of population genetic data when chimeras and seasonal dieback are considered. These shifts become more important when assessing the potential influence on management options.
For instance, the assignment of source populations informs authorities of specific vector potential from an area, therefore spurious results due to sampling error may identify false introduction points and pathways of local spread, as well as inflating or understating the number of potential incursions into a newly invaded region.
Similarly, the alignment of genetic structure to natural dispersal potential may divert management tools away from vector management to node clearance, yet temporal sampling indicates that this result cannot be used as conclusive if sampling is restricted to a snapshot moment.
Fletcher, Forrest, Atalah, et al. (2013) showed that D. vexillum reproduce for a longer period than normally considered, for 9 months of the year, but not through the winter months. They also suggest that periods of increased recruitment and growth as seen in D. vexillum will influence the level of connectivity and potential spread of established populations, with higher biomass leading to increased propagule pressure. Furthermore, our study suggests that the winter dieback period provides an opportunity for rare alleles to propagate through the region when growth increases again the following season, thereby suggesting that a winter removal program presents an opportunity to further reduce biomass and potentially the genetic diversity of the spreading population. While further study would be required to ensure removal does not encourage supercolonies by further reducing the genetic diversity and thereby promoting allorecognition and enhanced fusion (Smith, 2012;Smith et al., 2015), the reduction in biomass may provide enough space to boost the chances of native early colonizers.ss Recent work by  indicated that natural dispersal of D. vexillum could exceed 1 km under high current flow regimes. Using Bayesian analyses based on these predictions and other parameters, we tested the potential for genetic connection over these distances with distinct dispersal clusters. We observed conflicting results between our summer and winter samples. The summer samples suggest genetic connectivity over 12 and 24 hr of modeled dispersal, but the winter samples were differentiated with a 24-hr modeled period, which alone would suggest limited dispersal potential. However, in both datasets, the genetic structure was consistent with this previous study in suggesting a greater potential for long-distance dispersal and connectivity than previously considered possible for this species .
Genetic data as a tool to advise management are limited by the resolution at which the organisms and their populations are sampled. Here, we have highlighted two important components of sampling and processing genetic data for colonial nonindigenous species that can alter the interpretation of genetic structure and of dispersal pathways; two primary aspects of invasion ecology that assist with the management of global and regional spread. Therefore, we would argue that seasonal sampling and the inclusion of important life history traits, such as chimerism, in genetic studies, are worth the effort for refinement of interpretations and for the management of NIS.

CO N FLI C T O F I NTE R E S T
None Declared.

DATA ACCE SS I B I LIT Y
MSTAT sequences and supporting information have been submitted to Dryad.