Sunshine versus gold: The effect of population age on genetic structure of an invasive mosquito

Abstract The genetic diversity and structure of invasive species are affected by the time since invasion, but it is not well understood how. We compare what are likely the oldest populations of Aedes aegypti in continental North America with some of the newest to illuminate the range of genetic diversity and structure that can be found within the invasive range of this important disease vector. Aedes aegypti populations in Florida have probably persisted since the 1600s-1700s, while populations in southern California derive from new invasions that occurred in the last 10 years. For this comparison, we genotyped 1,193 individuals from 28 sites at 12 highly variable microsatellites and a subset of these individuals at 23,961 single nucleotide polymorphisms (SNPs). This is the largest sample analyzed for genetic structure for either region, and it doubles the number of southern California populations previously analyzed. As predicted, the older populations (Florida) showed fewer indicators of recent founder effects and bottlenecks; in particular, these populations have dramatically higher genetic diversity and lower genetic structure. Geographic distance and driving distance were not good predictors of genetic distance in either region, especially southern California. Additionally, southern California had higher levels of genetic differentiation than any comparably sized documented region throughout the worldwide distribution of the species. Although population age and demographic history are likely driving these differences, differences in climate and transportation practices could also play a role.

bottlenecks and small population size is thought to harm populations through inbreeding depression and an inability to evolve to new environments (Allendorf & Lundquist, 2003). However, some IAS not only survive bottlenecks, but they go on to flourish and outcompete the outbred and highly adapted native species. To make matters more complex, some IAS do not show lower genetic diversity at all.
In fact, when an invasive population derives ancestry from multiple invasions, its genetic diversity can be even higher than any of the source populations (Allendorf & Lundquist, 2003; Hänfling, 2007).
The number of founders, the number of invasions, the time since invasion, local adaptation, gene flow, and hybridization with local species are a few of the factors that can ultimately affect the genetic diversity and structure of an IAS (Allendorf & Lundquist, 2003; Hänfling, 2007).
We investigate the genetic diversity and structure of the invasive Aedes aegypti mosquito, specifically by comparing well-established and newly founded populations in North America. Ae. aegypti, the primary vector of yellow fever, Zika, dengue, and chikungunya, originated in Africa and has since spread throughout much of the tropics and parts of the subtropics. Ae. aegypti first reached North America during the 1500s via the Atlantic slave trade, and it established overwintering populations in the US southeast that have likely persisted until today (Powell, Gloria-Soria, & Kotsakiozi, 2018). The species is distributed in urban areas throughout the southern tier of the United States and parts of Mexico. Its average active dispersal is no greater than ~200 m (Honorio et al., 2003; Reiter, 2007; Russell, Webb, Williams, & Ritchie, 2005), but it can also disperse by "hitchhiking" via human transportation (Fonzi, Higa, Bertuso, Futami, & Minakawa, 2015; Goncalves da Silva et al., 2012; Guagliardo et al., 2014). In this study, we compare the population genetics of the likely oldest populations in continental North America (in Florida, the "Sunshine State") with some of the youngest (in the southern portion of California, the "Golden State"; Figure 1).
With its tropical and subtropical climate, Florida has one of the highest densities of Ae. aegypti populations in the United States (Dickens, Sun, Jit, Cook, & Carrasco, 2018; Hahn et al., 2016), which increases the risk of disease transmission, as illustrated by occasional outbreaks of Aedes-borne disease (Kuehn, 2014; Likos et al., 2016; Teets et al., 2014). Populations of Ae. aegypti have persisted in Florida for more than 200 years, likely making them the longest established populations in continental North America; other contenders for oldest populations were eliminated by vector control in the 1950s-1960s or replaced by Ae. albopictus in the 1980s-1990s (Lounibos, Bargielowski, Carrasquilla, & Nishimura, 2016; Slosek, 1986; Soper, 1965). Given that this mosquito has 6-12 generations per year depending on location, hundreds of years translate to thousands of generations. Ae. aegypti populations in southern California, on the other hand, are very young, and they face a more temperate climate, which is generally predicted to be less suitable for the species (Dickens et al., 2018). Ae. aegypti were first reported in California in 2013 in the central California counties of Madera, Fresno, and San Mateo (Metzger, Hardstone Yoshimizu, Padgett, Hu, & Kramer, 2017). In 2014 and 2015, the species was detected in many more counties, primarily in southern California (Metzger et al., 2017).

Bargielowski, Lounibos, and Reiskind (2018) show differentiation between the four Florida populations included in their analysis.
Previous work with microsatellite and SNP chip data (Gloria-Soria, Brown, Kramer, Yoshimizu, & Powell, 2014; Pless et al., 2017), as well as whole-genome sequencing data (Lee et al., 2019), indicates multiple invasions of Ae. aegypti into California, probably at least one from the US southeast and one from the US southwest and/or

FIGURE 1 Aedes aegypti (a) sampling sites in southern California (b) and Florida (c), showing region size and sampling design. Ae. aegypti photo by James Gathany/CDC. See Table 1

Note: Sampled locations, corresponding abbreviation, sampling year, number of individuals genotyped for microsatellites (N), observed heterozygosity, expected heterozygosity, allelic richness (N = 30), inbreeding coefficient, and whether the sample is being published for the first time. In the "New" column, "SNPs" means the SNP data are being published for the first time, and "Micr." means the microsatellite data are being published for the first time. All populations were genotyped at microsatellites, and those followed by an asterisk (*) also have SNP data.
to the populations in Florida that were tested in multiple years (Palm Beach County and Key West).
In line with previous work (Sherpa et al., 2018) and the expectations of recent founder effects (Nei, Maruyama, & Chakraborty, 1975), we tested the following predictions: (a) southern California would have lower genetic diversity and a higher amount of genetic structure than Florida, (b) geographic and driving distance would be more important predictors of genetic distance in Florida than in southern California, and (c) populations that were sampled more than once (in different years) would be more likely to be temporally stable in Florida than in southern California.

| Mosquito collections
Mosquitoes from a total of 28 sites, 14 in Florida and 14 in south- have samples from two years, bringing our total number of "populations" analyzed to 31. Since some analyses are sensitive to large differences in sample size, populations with more than 55 individuals are represented by a random selection of 50 individuals. After this correction, the mean sample size was 38. All mosquitoes were collected as adults or eggs from traps and were shipped as adults to Yale University for analysis. No more than six individuals were used from a single ovitrap to minimize the chance of over-sampling siblings.
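The sample-size correction described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (populations with more than 55 individuals are reduced to a random selection of 50); the function name, the fixed seed, and the use of Python's `random` module are assumptions made for the example and are not taken from the study.

```python
import random

def cap_sample_size(individuals, threshold=55, target=50, seed=0):
    """Return every individual if the population has at most `threshold`
    members; otherwise return a random selection of `target` individuals.

    `threshold` and `target` follow the values stated in the text.
    `seed` is a hypothetical choice that only makes the sketch reproducible.
    """
    individuals = list(individuals)
    if len(individuals) <= threshold:
        return individuals
    rng = random.Random(seed)
    # Sampling without replacement, so no individual is chosen twice.
    return rng.sample(individuals, target)

# Hypothetical populations of 80 and 40 individuals.
large = cap_sample_size(range(80))
small = cap_sample_size(range(40))
print(len(large), len(small))
```

Sampling without replacement keeps each retained genotype unique, which matters for downstream estimates that assume independent individuals.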

| DNA extraction and genotyping
Whole genomic DNA was extracted from all mosquitoes using the Qiagen DNeasy Blood and Tissue kit according to manufacturer instructions, including the optional RNase A step. As in Brown et al. (2011), all individuals were genotyped at 12 highly variable microsatellite loci: four with trinucleotide repeats (A1, B2, B3, and A9) and eight with dinucleotide repeats (AC2, CT2, AG2, AC4, AC1, AC5, AG1, and AG4). Any individuals that genotyped at fewer than 10 loci were excluded from analysis.
Additionally, a total of 156 individuals from ten Florida sites and four southern California sites were genotyped for single nucleotide polymorphisms (SNPs) using Axiom_aegypti, a high-throughput genotyping chip that has 50,000 probes (Evans et al., 2015).
Genotyping was conducted by the Functional Genomics Core at University of North Carolina, Chapel Hill. To prune the SNPs, we first excluded 2,166 that failed a test of Mendelian inheritance (Evans et al., 2015). Since some analyses can be confounded by SNPs in linkage disequilibrium (Alexander, Novembre, & Lange, 2009), we excluded tightly linked SNPs with Plink 1.9 using the command "--indep-pairwise 50 5 0.5" (Gloria-Soria et al., 2018). We also excluded any SNPs that genotyped in <98% of the individuals and those with a minor allele frequency of <1%, as these could be genotyping errors.
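The last two filters (call rate and minor allele frequency) are simple per-SNP thresholds, and a minimal NumPy sketch may make them concrete. It is not the pipeline used in the study, which relied on Plink; the function name and the input encoding (0/1/2 alternate-allele counts, with `NaN` for missing calls) are assumptions for illustration, and linkage-disequilibrium pruning is not covered here.

```python
import numpy as np

def filter_snps(genotypes, min_call_rate=0.98, min_maf=0.01):
    """Return a boolean mask of SNPs that pass both thresholds.

    genotypes: (individuals x SNPs) array of alternate-allele counts
    (0, 1, or 2), with np.nan marking a missing call. This encoding is
    an assumption of the sketch, not a format described in the paper.
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)
    call_rate = called.mean(axis=0)          # fraction of individuals genotyped
    n_called = called.sum(axis=0)
    alt_count = np.nansum(g, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        alt_freq = alt_count / (2 * n_called)  # diploid: two alleles per call
    maf = np.minimum(alt_freq, 1 - alt_freq)
    # Keep SNPs genotyped in >=98% of individuals with MAF >=1%.
    return (call_rate >= min_call_rate) & (maf >= min_maf)

# Hypothetical data: 100 individuals, 3 SNPs.
g = np.zeros((100, 3))
g[:50, 0] = 1        # SNP 0: polymorphic, fully genotyped
g[:50, 2] = 1
g[95:, 2] = np.nan   # SNP 2: 5% missing calls
print(filter_snps(g))
```

In this toy input, the second SNP is monomorphic and the third falls below the call-rate threshold, so only the first is retained.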

| Genetic diversity
All microsatellite loci were tested for within-population deviations from Hardy-Weinberg equilibrium and for linkage disequilibrium among loci pairs using the R package Genepop v. 1.1.4 with 10,000 dememorizations, 1,000 batches, and 10,000 iterations per batch for both tests (Raymond & Rousset, 1995) (Kalinowski, 2005).
The measurements were not calculated using the SNP dataset, because the SNP chip was designed to show equal genetic diversity across different populations (Evans et al., 2015).
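For readers less familiar with the diversity measures reported in Table 1, the sketch below shows how observed heterozygosity, expected heterozygosity, and an inbreeding coefficient can be computed for a single locus in a single population. The estimator choices are assumptions: the text does not say which formulas were used, so the small-sample correction factor 2n/(2n − 1) and the definition F_IS = 1 − Ho/He are illustrative, as are the function name and the toy genotypes.

```python
from collections import Counter

def locus_diversity(genotypes):
    """Compute (Ho, He, Fis) for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid
    individual. Alleles can be any hashable label, e.g. fragment sizes.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals with two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Expected heterozygosity with a small-sample correction (an assumption).
    he = (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs))
    # Inbreeding coefficient: heterozygote deficit relative to expectation.
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis

# Hypothetical genotypes at one microsatellite locus.
ho, he, fis = locus_diversity([(1, 1), (1, 2), (2, 2), (1, 2)])
print(round(ho, 3), round(he, 3), round(fis, 3))
```

A positive F_IS indicates fewer heterozygotes than expected, which is the pattern interpreted as evidence of inbreeding. Multi-locus values would be obtained by combining per-locus results, which this sketch leaves out.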

| Genetic structure
We calculated pairwise genetic differentiation (F_ST) for microsatellites with Genepop v. 1.1.4 and tested for significance using an exact conditional contingency-table test with the following parameters: 10,000 dememorizations, 500 batches, and 10,000 iterations per batch (Raymond & Rousset, 1995). We calculated F_ST for SNPs using the same method, and we used 1,000 permutations to test for significance in Arlequin v. 3.5 (Excoffier, Laval, & Schneider, 2005). Within each region, we tested for a relationship between linearized F_ST (F_ST / (1 − F_ST)) and geographic and driving distances using a Mantel test with 9,999 permutations. Driving distance was calculated by finding the fastest driving routes between pairs of sites using Google Maps.
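The linearization and the Mantel test can be illustrated with a short NumPy sketch. It is not the software used in the study, which is not named in this passage; the function names, the one-sided test for a positive correlation, and the Pearson statistic are assumptions chosen to show the general technique.

```python
import numpy as np

def linearize_fst(fst):
    """Linearized F_ST, i.e. F_ST / (1 - F_ST), applied element-wise."""
    fst = np.asarray(fst, dtype=float)
    return fst / (1.0 - fst)

def mantel_test(x, y, permutations=9999, seed=0):
    """Permutation Mantel test between two symmetric distance matrices.

    Returns (r, p): the Pearson correlation of the upper-triangle entries
    and a one-sided p-value for a positive association (an assumption).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    iu = np.triu_indices(n, k=1)          # each site pair counted once
    r_obs = np.corrcoef(x[iu], y[iu])[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(permutations):
        perm = rng.permutation(n)
        # Permute rows and columns together so site identities stay paired.
        y_perm = y[np.ix_(perm, perm)]
        if np.corrcoef(x[iu], y_perm[iu])[0, 1] >= r_obs:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical example: 8 sites with random coordinates.
rng = np.random.default_rng(1)
pts = rng.random((8, 2))
geo = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
fst = geo / (1.0 + geo)                   # toy F_ST that tracks distance
r, p = mantel_test(linearize_fst(fst), geo, permutations=999)
print(round(r, 3), p)
```

Rows and columns are permuted jointly because the entries of a distance matrix are not independent; shuffling individual cells would break the pairing between sites and give a misleading null distribution.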
To explore genetic structure, we conducted twenty independent runs of STRUCTURE v. 2.3.4 for K = 1-12 for the complete microsatellite dataset and for each region (Pritchard, Stephens, & Donnelly, 2000). We used 600,000 generations, with the first 100,000 discarded as burn-in.
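Replicate runs across a range of K are often summarized with the ΔK statistic of Evanno et al. (2005), one of the guidelines cited for choosing K. The sketch below shows one common formulation, computed from the mean log-likelihood of the replicate runs at each K. It is an assumption that this matches how K was chosen here; the function name, the input layout, and the use of the sample standard deviation are illustrative choices, and the log-likelihood values are invented for the example.

```python
import statistics

def evanno_delta_k(loglik_by_k):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    loglik_by_k: dict mapping K to a list of log-likelihood values, one
    per replicate run. Returns a dict mapping K to Delta K, where
    Delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    """
    means = {k: statistics.mean(v) for k, v in loglik_by_k.items()}
    delta = {}
    for k in sorted(loglik_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue                      # first and last K have no Delta K
        sd = statistics.stdev(loglik_by_k[k])   # sample SD (an assumption)
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd if sd > 0 else float("nan")
    return delta

# Invented log-likelihoods for two replicate runs at K = 1..4.
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -892.0],
    4: [-885.0, -887.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Because ΔK is undefined for the smallest and largest K tested, it cannot by itself support K = 1, which is one reason to consult it alongside the other guidelines.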
We visualized the results using the programs Clumpak and DISTRUCT v. 1.1 (Kopelman, Mayzel, Jakobsson, Rosenberg, & Mayrose, 2015; Rosenberg, 2004), and we inferred the optimal value of K using relevant guidelines (Cullingham et al., 2020; Earl, 2012; Evanno, Regnaut, & Goudet, 2005). For the SNP dataset, we used the maximum likelihood software Admixture v. 1.3.0 and the cross-validation (CV) error method described in the software's manual (Alexander et al., 2009). Additionally, we ran principal component analysis (PCA) for both datasets and discriminant analysis of principal components (DAPC) for the microsatellite dataset using the R package Adegenet v. 2.1.1 (Jombart, 2008). DAPC is a multivariate method for identifying genetic clusters which seeks to maximize variance between inferred groups (with inferred groups selected using a clustering algorithm, k-means).

There was a slight and marginal correlation between linearized

Although these expectations were generally met, there were also surprises, including the dramatic and unique extent of differentiation and structure in southern California.

| DISCUSSION
In line with our expectations, the newly established populations in southern California had significantly lower genetic diversity (allelic richness, observed heterozygosity, and expected heterozygosity), and more southern California populations showed evidence of inbreeding (eight in southern California vs. four in Florida; Table 1).

See Table 1 for full names of each site and additional information.

FIGURE 3
Southern California had more genetic structure and higher pairwise F_ST (Figures 4 and 6), and there was no increase in genetic distance with geographic (or driving) distance (Figure 2). While the two populations in Florida that were resampled in separate years appeared to be temporally stable, the population resampled in southern California showed high genetic differentiation and change over just two years (e.g., Figure 4c,d).

(0.23) is higher than mean F_ST values between Africa and other continents (0.11-0.14).
Overall, the results point to very different invasion timelines and histories for these two regions. Southern California shows signs of a recent bottleneck, inbreeding, serial founder effects, and possibly multiple invasions from different regions. Florida is also in the invasive range of Ae. aegypti, and indeed its allelic richness is lower than that of populations sampled in Africa (Gloria-Soria et al., 2016).
Although Florida populations were subject to bottlenecks, ~2,000-5,000 generations (assuming ten generations/year) of mutation and admixture, likely involving numerous introduction events from Africa, have muted those effects, especially compared to southern California. Additionally, a relatively high amount of gene flow (≥1 individual per generation) likely still occurs among most Florida populations, preventing distinct population structure from forming (Nathan, Kanno, & Vokoun, 2017). This gene flow is probably mediated by stochastic human movement, since geographic and driving distance are not good predictors of genetic distance.
In addition to time since invasion, the number of invasions, number of propagules during invasions, and other population bottlenecks or demographic history events could affect the genetic diversity and structure patterns we see here. Although we did not detect differences in effective population size or number of recent bottleneck events between these two regions (results not shown), more work and demographic history inference is needed, and we are currently analyzing these regions further in the context of North America more broadly.
Although we believe time since invasion is the most important factor driving the differences between Florida and southern California, the differences in climate between the regions could also have an effect. Florida is more tropical: it has more precipitation, higher humidity, a smaller daily temperature range, warmer winters, and wetter summers (Figure 8). It is rated as having higher habitat suitability than southern California in all studies we are aware of (Dickens et al., 2018), and there is evidence that genetic differentiation can change depending on season (Huber et al., 2002; Sayson, Gloria-Soria, Powell, & Edillo, 2015).

Another study compared genetic structure and diversity of new (Europe) and old (Réunion Island) populations of a similar species, Ae. albopictus (Sherpa et al., 2018). Like these authors, we found higher diversity and lower amounts of structure in the older invasive populations; however, we did not find strong evidence of isolation by distance in the older populations (Sherpa et al., 2018). This could be caused by differences in the organism (Ae. aegypti is more anthropophilic and has a shorter active dispersal range than Ae. albopictus) or the regions (Florida is larger and has different transportation patterns than Réunion Island) (Chouin-Carneiro et al., 2016; Vavassori, Saddler, & Müller, 2019).
We provide this case study to illustrate that even within its invasive range, the population genetics and structure of an IAS can vary dramatically. As such, a "one size fits all" control measure may not be appropriate for controlling an invasive species; rather, the control methods should be tailored to the region in question and may need to be adjusted over time. Moreover, the marked differences between the two regions considered here reflect the diversity of invasion histories and novel environments that Ae. aegypti has experienced during its global expansion. The unusual genetic patterns in southern California compared to other regions around the world make it especially intriguing for further study.

ACKNOWLEDGMENTS
We thank Vicki Kramer and the California Department of Public Health for facilitating sample submission and collaborating mosquito and vector control agency staff for providing California sam-

CONFLICT OF INTEREST
The authors have no competing interests to declare. However, we would like to include this disclaimer due to an author affiliation: The findings and conclusions in this publication are those of the authors and should not be construed to represent any official USDA or US Government determination or policy.

OPEN RESEARCH BADGES
This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at Microsatellite dataset: https://doi.org/10.5061/dryad.83bk3j9p7, SNP dataset (unfiltered and filtered): https://doi.org/10.5061/dryad.8gtht76m8.