Using citizen science data to assess the population genetic structure of the common yellowjacket wasp, Vespula vulgaris

Monitoring insect genetic diversity and population structure has never been more important to manage the biodiversity crisis. Citizen science has become an increasingly popular tool to gather ecological data affordably across a wide range of spatial and temporal scales. To date, most insect‐related citizen science initiatives have focused on occurrence and abundance data. Here, we show that poorly preserved insect samples collected by citizen scientists can yield population genetic information, providing new insights into population connectivity, genetic diversity and dispersal behaviour of little‐studied insects. We analysed social wasps collected by participants of the Big Wasp Survey, a citizen science project that aims to map the diversity and distributions of vespine wasps in the UK. Although Vespula vulgaris is a notorious invasive species around the world, it remains poorly studied in its native range. We used these data to assess the population genetic structure of the common yellowjacket V. vulgaris at different spatial scales. We found a single, panmictic population across the UK with little evidence of population genetic structuring; the only possible limit to gene flow is the Irish sea, resulting in significant differentiation between the Northern Ireland and mainland UK populations. Our results suggest that queens disperse considerable distances from their natal nests to found new nests, resulting in high rates of gene flow and thus little differentiation across the landscape. Citizen science data has made it feasible to perform this study, and we hope that it will encourage future projects to adopt similar practices in insect population monitoring.


INTRODUCTION
There is no doubt that anthropogenic action is having widespread impacts on biodiversity. However, biodiversity research is often taxonomically biased towards large and/or charismatic animals.
Most insects have received very little attention (Leather, 2018; Rocha-Ortega et al., 2021; Sumner et al., 2018). This is of particular concern as insects constitute a significant proportion of biodiversity, with some estimates exceeding 5.5 million species (Stork, 2018). In addition, insects provide important ecosystem services, including provisioning (e.g., food, pharmaceuticals), regulating (e.g., pollination, predation), cultural (e.g., tourism) and supporting (e.g., nutrient cycling, food source) services (Cardoso et al., 2011). Although insect populations have been widely reported as declining worldwide (Hallmann et al., 2017; Powney et al., 2019), recent research has shown that some taxa may instead benefit from some forms of anthropogenic action (Bowler et al., 2021; Crossley et al., 2020; Macgregor et al., 2019; Outhwaite et al., 2020). Overall, it is clear that we currently have little understanding of how different insect species are influenced by the complexities of anthropogenic change.

Danai Kontou, Alejandro Maeda-Obregon and Monika Yordanova contributed equally to this study.
There is evidently a need to shift the emphasis of recording and reporting to include insects that have historically received less attention. One of the biggest challenges is finding ways to do this quickly and affordably.
Citizen science has become increasingly popular over the past decade or so as a method to help scientists gather information on biodiversity rapidly and across large spatial scales. Some of the biggest initiatives, such as iNaturalist (www.inaturalist.org) or eBird (www.ebird.org), have contributed millions of biological occurrence records, providing insights into important ecological questions, for instance species distributions (Comont et al., 2012; Zapponi et al., 2017); invasive species monitoring (Brown et al., 2018); effects of climate change and land-use (Whitehorn et al., 2022); and changes in population abundance over time (Hallmann et al., 2017), thus demonstrating the value of citizen scientists to mainstream science. An exciting development of citizen science in recent years has seen members of the public extending their contribution to collecting samples. Freshwater research has been particularly proactive in recruiting volunteers to collect samples, to the extent that citizen scientists now play an integral role in freshwater ecosystem monitoring and research (Metcalfe et al., 2022; Thornhill et al., 2019). Furthermore, having citizen scientists willing to provide samples opens the possibility of extracting genetic information. Such data can help understand (and predict) population composition and change (Schwartz et al., 2007), yielding insights into demography and population dynamics, including population structuring, differentiation and rates of gene flow across the landscape (Bohonak, 1999). This can help answer questions relating to geographic barriers to dispersal (e.g., Brunke et al., 2019), effects of land-use change on dispersal and survival (e.g., Dreier et al., 2014), population size (e.g., Furlan et al., 2012) or invasion history (e.g., Ascunce et al., 2011).
To our knowledge, there are currently only a few examples of genetic studies that have used citizen science/volunteer-collected samples in ecological research (Agersnap et al., 2022; de Virgilio et al., 2020; Guindon et al., 2015; Neveceralova et al., 2022; Skrbinšek et al., 2019). However, the reach of these studies remains limited given that volunteers are required to possess specialist skills/knowledge and/or be specifically trained to process the samples. On the other hand, insect-based research has the potential for more inclusive and extensive sampling using citizen science: insects are small and typically easy to catch in traps. Samples can then be transferred to laboratories for more extensive analyses, including identification to lower taxonomic levels, which often requires expert input (Breeze et al., 2021; Le Féon et al., 2016; Lucky et al., 2014; O'Connor et al., 2019). Some studies focusing on insects have already used these attributes to their advantage and have coupled them with genetic analyses. These include the UK Pollinator Monitoring Scheme (https://ukpoms.org.uk/), which performs DNA barcoding on pan-trap collected samples to assess species diversity (Breeze et al., 2021; Creedy et al., 2020) and the School of Ants project (https://andrealucky.com/school-of-ants/), which characterised the population genetic structure of the invasive ant Tetramorium immigrans in North America (Zhang et al., 2019).
Vespine wasps (yellowjackets and hornets) are among those insects that remain poorly studied (Sumner et al., 2018) despite increasing evidence of their importance in supporting ecosystem functioning (Brock et al., 2021). Most recent research has focused on populations in non-native regions due to their high success as invasive species around the world with large economic and ecological impacts (Lester & Beggs, 2019). These wasps are eusocial with typically annual colonies (spring to autumn in native regions) that can host tens of thousands of workers which forage from a fixed nest. Sexuals (gynes and males) emerge in the autumn and, after mating, males die; mated gynes on the other hand disperse and overwinter before commencing their own nesting cycle in the spring (Lester & Beggs, 2019). Indeed, vespine wasps are well-known for their dispersal abilities, with data from invasive regions showing that they can spread at rates exceeding tens of kilometres per year (Masciocchi & Corley, 2013; Robinet et al., 2017). Population genetics studies have shown that populations are generally panmictic across continuous landscapes (i.e., unbroken by large geographical barriers such as oceans or seas) (Arca et al., 2015; Chau et al., 2015; Eloff et al., 2020; Hoffman et al., 2008; Schmack et al., 2019), suggesting high dispersal rates. It must be noted, however, that this is not always the case, for instance, in Vespa mandarinia at the continental scale in its native range (Arca et al., 2015) and Vespula germanica in South Eastern Australia (invasive range) across only a few tens of kilometres (although the extent to which this might be an effect of genetic bottlenecks and invasion history is unclear) (Goodisman et al., 2001). Overall, however, this indicates that whilst workers forage intensively as close to the nest as possible, probably within a few hundred metres (Akre et al., 1975; Greenleaf et al., 2007; Matsuura & Yamane, 1990), gynes are likely to disperse further from their natal nests before founding their own. Thus, the behavioural differences between these two castes may generate different population structuring at different spatial and temporal scales; however, we lack comprehensive tests of these conjectures. Additionally, there are currently no comprehensive studies on the population genetic structure of native populations of vespines in Europe, particularly the UK, which is the source country for multiple invasive populations around the world (Brenton-Rule et al., 2018; Lester & Beggs, 2019; Schmack et al., 2019).
An understanding of the genetic structure of their native populations is essential to predict how their ecosystem services will be/are being affected by anthropogenic change (Brock et al., 2021).
Here, we used citizen science-collected samples from the Big Wasp Survey (BWS) (www.bigwaspsurvey.org) to analyse the UK-wide population genetic structure of the common yellowjacket wasp Vespula vulgaris using polymorphic microsatellite markers. The BWS is a citizen science project that has run annually in the UK since 2017. It aims to improve our knowledge of social vespine wasps and encourage people to gain a better understanding of these little-loved insects. Members of the UK's public were invited to hang a simple homemade bottle trap in their gardens for a week in late August and then post the samples to identification hubs where experts could verify species identity. Since 2020, participants have identified their own samples using online identification tools developed for the project (Perry et al., 2021). The project has proved highly successful in engaging with the public: between 2017 and 2021, 3389 unique participants/participant groups set out 7916 traps across the UK, catching over 62,000 wasps. The data have produced reliable species distribution maps that are comparable in quality to those generated from four decades' worth of data collected by expert recorders (Sumner et al., 2019). Moreover, the project stores nearly 50,000 physical wasp samples (collected between 2017 and 2019) now preserved in alcohol, which have the potential to provide molecular insights into native vespine wasps across the UK.
Because of the way the samples were collected, the DNA was likely to be quite degraded. Given budget restrictions, our first aim was to determine a cost-effective method to obtain DNA from individual wasps that would be suitable for analyses at multiple microsatellite markers (Aim 1). We tested two DNA extraction methods and then assayed existing microsatellite markers developed for different vespine species for their utility in studying the population genetics of V. vulgaris. We then used these data to determine the population structure of V. vulgaris in the UK at four different geographical scales: within traps, local (<10 km), regional (South-East of England, 185 km) and national (across the whole of the UK, 850 km) (Aim 2).
Different scales can inform on different temporal trends, including long-term dispersal patterns at larger scales (regional and national) and recent dispersal events at the finer scales (local) (Bluher et al., 2020). In this case, we were interested in testing hypotheses about how the contrasting dispersal patterns of queens and workers might give rise to different patterns of genetic structure (Figure 1). Specifically, we predict that there will be evidence of viscous population structure at a fine (local) spatial scale, given that workers are assumed to forage as close to the nest as possible. Accordingly, at the finer spatial scales (within traps and between nearby traps), we expect to detect siblings (or half-siblings) among foraging workers. At the intermediate spatial scale (<200 km distance), we expect little or no population differentiation given that gynes are assumed to be good dispersers. We expect similarly low or absent genetic differentiation at the national scale, unless geographical features (e.g., sea bodies) limit dispersal. This study is (to our knowledge) the first comprehensive population genetic study of one of the most invasive Vespula species (V. vulgaris) in its native range and provides much-needed insights into its ecology and dispersal behaviour. It is also a rare example of how poorly preserved citizen science-collected insect samples can be used for genetic analyses.

Sampling
We selected a total of 393 samples collected in two consecutive years of the survey: 2017 and 2018. These were selected using a hierarchical geographical design at different scales (see Introduction), limited by the locations that had been sampled by BWS participants.
At the national scale, 137 wasps from the 2017 samples were selected: 91 individuals were randomly selected, and a further 46 individuals were selected from under-sampled areas (Northern Ireland and Northern England). These samples were divided into six different regional subsets defined by county lines and latitudinal gradient, with an additional separation between East and West England (Figure 1).
For the purpose of these analyses, we treated each region as a putative 'population'.
For the finer scale analyses (regional, local and within trap), we selected 256 samples from 2018; samples were selected from this year rather than the previous one as the BWS had resulted in higher participation in 2018, giving us more choice regarding sample selection. To create putative 'populations' for these analyses, we defined 10 clusters of 4-6 traps each, with any two traps within the cluster separated by up to 10 km, and with at least one pair of traps per cluster separated by less than 1 km. These distances were selected based on reported foraging distances of social wasps and other Hymenoptera (Akre et al., 1975; Greenleaf et al., 2007; Matsuura & Yamane, 1990). Any samples that did not belong to a cluster were removed. For the local scale analyses, we treated each trap within each of the 10 clusters as a 'population'. For the regional scale analyses, we treated each of the clusters located in Southern England (across 185 km) as a 'population'; note that one of the clusters was omitted from this analysis as it was located in Edinburgh, Scotland and outside the geographical scope of this analysis. This analysis therefore consisted of nine putative 'populations'.
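The cluster definition above can be read as a simple rule on pairwise trap distances. The following is a minimal sketch of that rule, not the procedure actually used for the study: the function name, the trap labels and the distances are hypothetical and chosen only for illustration.

```python
def cluster_meets_criteria(pair_distances_km, min_traps=4, max_traps=6,
                           max_span_km=10.0, close_pair_km=1.0):
    """Check a candidate trap cluster against the distance rules in the text.

    pair_distances_km maps (trap_a, trap_b) tuples to a distance in km and is
    assumed to hold every unordered pair of traps exactly once.
    """
    traps = {trap for pair in pair_distances_km for trap in pair}
    n = len(traps)
    if not (min_traps <= n <= max_traps):
        return False
    if len(pair_distances_km) != n * (n - 1) // 2:
        return False  # some pairwise distances are missing
    distances = pair_distances_km.values()
    all_within_span = all(d <= max_span_km for d in distances)  # "up to 10 km"
    has_close_pair = any(d < close_pair_km for d in distances)  # "less than 1 km"
    return all_within_span and has_close_pair


# Hypothetical four-trap cluster (distances in km are invented).
example_cluster = {
    ("A", "B"): 0.6, ("A", "C"): 4.2, ("A", "D"): 7.9,
    ("B", "C"): 3.8, ("B", "D"): 8.1, ("C", "D"): 9.5,
}
print(cluster_meets_criteria(example_cluster))
```

A cluster in which any pair exceeds 10 km, or in which no pair is closer than 1 km, would be rejected under this reading.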
Aim 1: Determining a cost-effective but reliable method for genotyping degraded citizen science samples
We sampled a total of 393 wasps: 302 were genotyped at up to 15 loci using a modified Chelex protocol (256 from 2018 and 46 from 2017) (henceforth referred to as the Chelex method) and 91 (from 2017) were genotyped at up to 12 loci using a DNeasy® Blood and Tissue kit (QIAGEN) (the DNeasy method). We were unsuccessful in extracting DNA from 10 of the DNeasy samples. We tested the influence of extraction method, number of days the trap was set out and liquid in the trap on amplification success using a generalised linear model (GLM). Amplification success was significantly influenced by the DNA extraction method (p = 7.5 × 10⁻³), with the DNeasy method being more efficient than the Chelex method. There was no effect of the liquid (either orange juice or beer) used in the trap nor the number of days the trap was set out for. We did not include year in this analysis given that year and method were strongly associated (χ² = 203.26). Locus D2-185 showed high genotyping error rates (0.312) and was removed from all further analyses (Table S2). Locus LIST2004 failed to amplify successfully in 62% of Chelex samples. It was subsequently removed from all 2018 analyses but was retained for the 2017 analyses given that, in combination with DNeasy samples, it amplified successfully in 68% of all 2017 samples.

FIGURE 1 Summary of sampling scheme and research questions addressed in Aim 2. Each geographical scale considered in this study is presented here along with information about the samples pertaining to each scale, and specific hypotheses, rationales and predictions. At the national scale, SE = Southern England, EE = Eastern England, WE = Wales and West England, NE = Northern England, SC = Scotland, NI = Northern Ireland. At the local scale, the Hastings cluster is presented as an example (image pulled from Google Maps); each pin corresponds to the location of a trap. Table S4 contains more detailed information relating to sample sizes.
We estimated sibship between individuals with a primary aim to remove the confounding effects of high relatedness on analyses of genetic structure (Rodríguez-Ramilo & Wang, 2012). In total, 25 pairs of siblings (involving 34 individuals in total) were inferred with a probability of >0.85 (Table S3). One individual from each sibling pair (8 individuals from the 2017 dataset and 10 from the 2018 dataset) was removed from subsequent analyses at the national and regional scales; all siblings were retained for within-trap and local scale analyses.
The total number of alleles (N_A) per locus varied between 6 and 18 in the 2017 dataset and between 4 and 16 in the 2018 dataset (Table 1; Table S2). Locus-specific F_IS values ranged between −0.093 and 0.352 (Table 1).
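For readers less familiar with the statistic, F_IS at a single locus can be written as 1 − H_O/H_E, where H_O is the observed and H_E the expected heterozygosity. The sketch below illustrates that definition on invented genotypes. It assumes diploid workers and the simple (uncorrected) expected heterozygosity. The estimator and software used in the study are not stated in this section and may differ (e.g., a sample-size-corrected estimator), so the function is illustrative only.

```python
from collections import Counter


def f_is(genotypes):
    """Single-locus F_IS = 1 - H_O / H_E for diploid genotypes.

    genotypes is a list of (allele_1, allele_2) tuples, e.g. fragment sizes.
    H_E is the uncorrected expectation 1 - sum(p_i ** 2).
    """
    n = len(genotypes)
    observed_het = sum(a != b for a, b in genotypes) / n
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    expected_het = 1 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    if expected_het == 0:
        raise ValueError("F_IS is undefined for a monomorphic locus")
    return 1 - observed_het / expected_het


# Invented genotypes at one locus (allele sizes in base pairs).
in_equilibrium = [(100, 100), (100, 102), (102, 102), (100, 102)]
homozygote_excess = [(100, 100), (100, 100), (102, 102), (102, 102)]
print(f_is(in_equilibrium), f_is(homozygote_excess))
```

Positive values indicate fewer heterozygotes than expected, which is the direction of deviation that null alleles, allelic dropout, inbreeding or substructure would all produce.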
Multiple factors could contribute to these deviations from Hardy-Weinberg Equilibrium (HWE) and positive F_IS values. Genotyping errors and null alleles (NAs; Morin et al., 2009; Waples, 2015) could be a possible cause. However, genotyping error rates and NA frequencies were overall low compared with those from non-invasive sampling such as faecal samples (Carlsson, 2008; Dakin & Avise, 2004; Rico et al., 2017), with genotyping error rates ranging between 0 and 0.115 and NA frequency estimates between 0 and 0.155. Another explanation for the deviation from HWE and positive F_IS values is inbreeding or the presence of population substructuring (the latter known as a Wahlund effect; Waples, 2015). However, structuring was not detected at the regional or national levels using unsupervised Bayesian structure analysis (see Results: Aim 2).

Aim 2: Population structure of V. vulgaris across the UK
We analysed the population genetic structure of V. vulgaris at four different geographical scales: within-trap, local (between traps up to 10 km apart), regional (between clusters of traps up to 185 km apart) and national (between traps up to 850 km apart). We expected to find evidence of high population structuring at the fine scale (Figure 1) because Vespula are central-place foragers. Conversely, we expected to find less structuring at the regional and national scales if queens can disperse widely and if dispersal is unrestricted by geography.
Evidence of siblings/half-siblings at finer spatial scales
Across all samples (105 from 2017 and 196 from 2018), we found evidence for eight pairs of full siblings within traps in the 2017 dataset and 16 pairs of full siblings within traps in the 2018 dataset (Table S3). There was evidence for only one pair of half siblings, in traps 3253 and 5597 in the 2018 dataset, with a probability of 0.869; however, we chose to retain both of these individuals for further analyses as the traps were located over 135 km apart, so a true family relationship was considered unlikely.

Little evidence of population differentiation at the local scale
We analysed the population genetic structure between traps within 10 different clusters with a total sample size of 154 wasps (see Table S4 for sample size breakdown). Siblings were retained for this analysis.

No evidence of population differentiation at the regional scale
We analysed the population genetic structure of 136 wasps between nine clusters of traps (mean number of individuals per cluster: 15). Siblings were not included in this analysis. We found no evidence for population structure: F_ST values were consistently low, ranging between 0 and 0.035 (mean: 0.010; Table 2) and there was no significant isolation by distance (IBD) (Mantel test: r = 0.007, p = 0.374; Figure 2a). Our STRUCTURE analysis suggested that the highest ΔK was 119.46 at K = 2. However, individual admixture estimates from STRUCTURE analyses showed no evident structuring for any K from K = 2 to K = 9 (Figure 2b). Overall, unsupervised population structure analysis and classical F_ST analysis are concordant, suggesting the absence of regional population structuring.
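The isolation-by-distance result above rests on a Mantel test, which correlates a matrix of pairwise genetic distances with a matrix of pairwise geographic distances and assesses significance by permuting the population labels. The sketch below is a generic, one-sided permutation version written for illustration; the software and settings used in the study are not given in this section, and the toy matrices are invented.

```python
import random
from math import sqrt


def _upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, p) for a one-sided Mantel test (H1: positive correlation)."""
    n = len(genetic)
    identity = list(range(n))
    geo_vector = _upper_triangle(geographic, identity)
    r_observed = pearson(_upper_triangle(genetic, identity), geo_vector)
    rng = random.Random(seed)
    at_least_as_large = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of one matrix together
        if pearson(_upper_triangle(genetic, order), geo_vector) >= r_observed:
            at_least_as_large += 1
    return r_observed, at_least_as_large / (permutations + 1)


# Toy example: four sites on a line, genetic distance proportional to geography.
positions_km = [0.0, 1.0, 3.0, 7.0]
geo = [[abs(a - b) for b in positions_km] for a in positions_km]
gen = [[0.01 * d for d in row] for row in geo]
r, p = mantel(gen, geo, permutations=199)
print(round(r, 3), p)
```

In the toy example the two matrices are perfectly correlated by construction; real F_ST matrices are far noisier, so small r values such as those reported here are typical.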

Limits to dispersal at the national scale
We analysed the population genetic structure of 97 wasps between the six putative regions across the UK. This analysis did not include siblings. These analyses indicated low but significantly positive IBD (Mantel test: r = 0.059, p = 0.038; Figure 2a). Our STRUCTURE analysis yielded the highest ΔK of 23.10 at K = 2 (Figure 2b). The analysis at K = 2 showed that most of the samples from Northern Ireland clustered together; indeed, F_ST values indicated that the Northern Irish samples were slightly, but significantly differentiated from the others, although these values remained low (<0.035; Table 3). We repeated the Mantel test without the samples from Northern Ireland to determine the importance of this location in the previous result. IBD without the Northern Irish samples was no longer significantly positive (r = 0.038, p = 0.20), confirming that these samples were driving the previous result. This suggests that wasps disperse widely across the UK, but that the Irish Sea presents a slight barrier to dispersal. However, this differentiation remains weak and there remains evidence of gene flow between these locations.

DISCUSSION
Vespine wasps have a notoriously rocky relationship with humans (Sumner et al., 2018). They are considered a health risk, highly successful as invasive species and are ubiquitous across the globe. Moreover, most people can recognise them as 'a wasp'. Despite this, vespine wasps remain poorly studied relative to other insects, like their cousins the bees and ants (Sumner et al., 2018). This means that key biological traits, like population structure, are little studied. This is especially true for most of the vespine wasps (which include the wasps best recognised by people) in their native ranges. We address this data gap by analysing samples collected by citizen scientists to reveal the population genetic structure of the common yellowjacket wasp V. vulgaris across the UK at different spatial scales. Our analyses reveal a panmictic population across 850 km, with little evidence for any structuring at finer scales. Our findings provide evidence for two important aspects of the ecology and behaviour of V. vulgaris. First, queens disperse considerable distances from their natal nest to found new nests, resulting in high rates of gene flow and thus little differentiation across the landscape. Second, foragers forage close to their nests: worker siblings could be detected within traps but not between closely situated traps. Our study also showcases the potential for using citizen scientist-collected samples for molecular analyses, even when samples are poorly preserved. We anticipate our study will incentivise further engagement of citizen scientists in the urgent quest to enrich our understanding of insect populations, better equipping us in assuring future security of the natural capital these insects provide (Brock et al., 2021).

Population structure
First, we found that V. vulgaris shows little to no population genetic structure at the two broader spatial scales studied here and forms an effectively single panmictic population across the UK. These results provide insights into the dispersal biology of this species. Queens are known to disperse far from their natal nests (based on rates of spread of invasive populations; Arca et al., 2015; Masciocchi & Corley, 2013) to set up colonies the following spring. These high rates of dispersal appear to be driven by human mediation; wasps have repeatedly shown a propensity to expand their geographic range through human-assisted means (Chau et al., 2015; Crosland, 1991; Masciocchi & Corley, 2013; Veldtman et al., 2021). There is no reason to believe that human transport does not also occur across their native ranges, and it would contribute to the lack of population structuring found in this study. The extent to which dispersal is sex-biased or not remains to be determined (although see Martínez et al., 2021; Masciocchi et al., 2020), but since we found little genetic structure at the regional scale, and little evidence of inbreeding, it is probable that both sexes are dispersing widely from their natal nests. However, our analyses also revealed a clear barrier to dispersal: there was differentiation between the mainland UK and Northern Ireland samples, demonstrating that water bodies (in this case, the Irish Sea) can act as barriers to dispersal. This is in accordance with previous studies that have analysed gene flow of Vespula populations across islands (e.g., invasive V. pensylvanica across Hawaiian islands [Chau et al., 2015] or native V. germanica across the English Channel [Schmack et al., 2019]). The apparent ease with which these wasps can disperse across the landscape is likely a major contributing factor to the success of invasive wasps around the world (Moller, 1996) and highlights the challenge in containing future introductions. Future studies should concentrate on larger-scale comparisons of genetic diversity and structure between native and invasive V. vulgaris populations (as has been done for some other Vespula and Vespa species as detailed previously) to deepen our understanding of their invasion process.
It is not surprising that genotyping errors and NAs were detected across loci; this is partly due to the low quality of DNA extracted from citizen-collected samples, but is also symptomatic of using microsatellite markers developed for non-target species. NAs and allelic dropouts are expected to lead to excess homozygosity and a deficiency in heterozygosity, and thus deviations in genotype frequency from Hardy-Weinberg equilibrium and a positive F_IS, as observed in this study. However, these effects are accounted for in our analyses, which take account of genotyping errors. Sibship inference can be vulnerable to mistyping and NAs (Wang, 2004), but we adopted the likelihood method developed to handle noisy data with genotyping error rates much higher than observed in our data (Wang, 2019). Equally, we are confident in our sibship analysis results as all but one of the sibling pairs were detected within traps. Indeed, F_ST and STRUCTURE analyses are both robust to mistyping and NAs, as they work with allele frequencies that are not much affected by genotyping errors. This said, the lack of any population structuring of V. vulgaris at regional and national levels revealed by our microsatellite data does not preclude the possibility that a weak genetic structure might exist, which might be detected by higher-resolution analyses, for example using many hundreds of single nucleotide polymorphisms (SNPs) spread across the genome. With the use of dense genomic SNPs, populations with extremely weak differentiation (e.g., F_ST = 0.0007) can be reliably distinguished by unsupervised structure analysis (Leslie et al., 2015).

Figure caption fragment: Presented here at the national scale at K = 2 (best ΔK) and for K = 6 (number of putative 'populations', here: regions); at the regional scale at K = 2 (best ΔK) and K = 9 (number of putative 'populations', here: clusters of traps).

Use of citizen science
Samples collected through the BWS allowed us to overcome common sampling limitations relating to finance, time, effort and geographical coverage (Conrad & Hilchey, 2011; Dickinson et al., 2010). In light of the recent Covid-19 pandemic, this is particularly encouraging for future research, demonstrating that samples may continue to be collected despite potential pandemic restrictions, and that these can be used for molecular studies. Additionally, our study tested a key challenge of citizen science: data quality. BWS-collected wasps had sat in beer or orange juice traps for up to a week, undergone at least two freeze-thaw cycles, and spent several days at room temperature during transit. Yet, we were successful in extracting the DNA from most samples used in this study. We benchmarked two DNA extraction methods: a commercial kit (DNeasy) and a cheaper, less complicated and less time-consuming method (Chelex).
Using prices of the reagents used at the time of study (2018–2019), we calculated that DNA extractions using the DNeasy kit (for 50 samples) cost £3.94/sample (accounting for failed extractions for ~10% of samples), whilst the Chelex method cost £0.07/sample (including Chelex and Proteinase K). The Chelex method was therefore substantially cheaper than the DNeasy method. However, the DNeasy method yielded higher quantity and quality DNA, and none […] study insect populations at the molecular level with samples collected by citizen scientists using unconventional methods, we suggest that the commercial kit is a more reliable option. However, in light of recent advances that now allow DNA to be extracted from ancient and museum material (e.g., Rohland & Hofreiter, 2007; Straube et al., 2021), other, lower-cost DNA extraction protocols (including Chelex) could be further optimised and costs further reduced.
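To put the per-sample prices in context, the short calculation below scales them to the 393 wasps sampled in this study. It is a hypothetical comparison only: in practice the two methods were applied to different subsets of samples, and the per-sample prices are the only inputs taken from the text. Amounts are held in pence to avoid floating-point rounding.

```python
# Per-sample reagent costs reported in the text, in pence.
DNEASY_PENCE_PER_SAMPLE = 394   # GBP 3.94
CHELEX_PENCE_PER_SAMPLE = 7     # GBP 0.07
N_SAMPLES = 393                 # total wasps sampled in the study


def total_pence(pence_per_sample, n_samples):
    """Total reagent cost in pence if every sample used one method."""
    return pence_per_sample * n_samples


def as_pounds(pence):
    """Format an integer number of pence as a GBP string."""
    return f"GBP {pence // 100}.{pence % 100:02d}"


dneasy_total = total_pence(DNEASY_PENCE_PER_SAMPLE, N_SAMPLES)
chelex_total = total_pence(CHELEX_PENCE_PER_SAMPLE, N_SAMPLES)
cost_ratio = DNEASY_PENCE_PER_SAMPLE / CHELEX_PENCE_PER_SAMPLE

print(as_pounds(dneasy_total), as_pounds(chelex_total), round(cost_ratio, 1))
```

On these figures the commercial kit is roughly 56 times more expensive per sample, which is the trade-off to weigh against its higher amplification success.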
We hope that the results of this study will encourage other citizen science-based projects to use their samples for molecular studies, for instance to monitor pollinator genetic diversity and population structures.Pollinator population health is of particular concern, and pollinators can be easily collected, as demonstrated by the UK Pollinator Monitoring Scheme (https://ukpoms.org.uk/), a national scheme in the UK that collects pollinators using pan-traps.

CONCLUSIONS
In this study, we showed the feasibility of using citizen science samples to provide a complete assessment of the population genetic structure of V. vulgaris across the UK. This is even more noteworthy […]

TABLE 3 Genetic variation at the national level.

Sampling
BWS volunteers set up home-made beer or orange juice-baited bottle traps for 7 days at the end of the summer, when Vespula colonies are close to the end of their life cycle, to target late-season workers (Sumner et al., 2019). Insects were removed from the trap after 7 days, washed in tap water and stored dry in household freezers (approx. −20°C) until the samplers were ready to post the samples to experts at University College London (UCL) for identification.
Samples were sent via first-class post as dry specimens, wrapped in tinfoil inside a padded envelope for protection. On arrival at UCL, samples were relocated to −20°C laboratory-grade freezers as soon as possible. For identification, wasps were thawed and examined under a microscope, before being stored in 80% ethanol at room temperature until genetic analyses.
In total, across the UK in 2017, 2942 V. vulgaris individuals were collected from 407 traps; and in 2018, 14,804 wasps from 1275 traps.
Volunteers provided their post codes with the traps; we transformed these into coordinates using the website UK Grid Reference Finder (www.gridreferencefinder.com). We used these coordinates to calculate the distances between traps to select traps for the population genetic analyses at different geographical scales, as defined in the Results.
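One common way to obtain distances between traps from latitude/longitude coordinates is the haversine (great-circle) formula. The sketch below assumes that approach; the text does not state how distances were computed, and the trap identifiers and coordinates are invented (they approximate central London and Edinburgh).

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(point_a, point_b):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(radians, point_a)
    lat2, lon2 = map(radians, point_b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def pairwise_distances_km(traps):
    """Map each unordered pair of trap IDs to their distance in km."""
    return {
        (a, b): haversine_km(traps[a], traps[b])
        for a, b in combinations(sorted(traps), 2)
    }


# Hypothetical traps with approximate coordinates (not BWS data).
example_traps = {
    "trap_london": (51.5074, -0.1278),
    "trap_edinburgh": (55.9533, -3.1883),
}
distances = pairwise_distances_km(example_traps)
print(distances)
```

Because post codes resolve to an area rather than an exact garden, distances computed this way carry some positional uncertainty, which matters most at the sub-kilometre scale.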

DNA extraction
Two conflicting issues were at play in our choice of DNA extraction method. On the one hand, samples collected by the BWS were likely to have been quite degraded, having been stored in warm beer/orange juice for a week, endured 2-4 days at room temperature during posting, and undergone at least two freeze-thaw cycles, before finally being stored in alcohol. On the other hand, we required a low-cost extraction method that would permit analysis of large sample sizes on the limited budget that is common across citizen science projects (Hecker et al., 2018). Consequently, we tested two different DNA extraction methods: a commercial kit (DNeasy® Blood and Tissue kit, ID: 69504, Qiagen) and a less time-consuming and less costly (Lienhard & Schäffer, 2019) modified Chelex100 (BioRad) protocol (Gadau, 2009; Moreau, 2014). For both protocols, we used the abdomen, thorax and legs (but not the head and wings) for DNA extraction.

Genotyping
We assayed 34 microsatellite markers that had been previously developed for other vespine species (Arca et al., 2012; Daly et al., 2002; Hasegawa & Takahashi, 2002; Thorén et al., 1995) (see S1 for information on primers and assay results). To understand the variables affecting amplification success, we used a GLM that accounted for DNA extraction method (DNeasy or Chelex), liquid in the trap (orange juice or beer), the number of days that the trap was set out and the trap ID (as a random-effects variable). We did not include year in this analysis because year and method were strongly correlated.
We estimated sibship within traps and within clusters with two aims: first, to remove the confounding effects of high relatedness on analyses of genetic structure (Rodríguez-Ramilo & Wang, 2012), and second, to infer Vespula foraging distances. We considered that two individuals were siblings if the probability of inference of either full or half sibship was 0.85 or higher, calculated using COLONY 2.0.6.5 (Jones & Wang, 2010) with a full-likelihood method, medium run lengths and no sibship prior. We performed analyses of population genetic structure (see below) without siblings at the national and regional scales but retained them for the local scale analysis given that our aim was to understand the movements of worker wasps.
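The sibling-removal step can be sketched as follows. This is a hedged illustration, not the authors' script: it assumes the pairwise sibship probabilities have already been read into `(id_a, id_b, probability)` tuples, that siblings are linked transitively into family groups, and that one individual is retained per group. None of these details are specified in the text, and the sample IDs are hypothetical.

```python
SIBSHIP_THRESHOLD = 0.85  # probability cut-off stated in the text


def sibling_groups(individuals, pair_probs, threshold=SIBSHIP_THRESHOLD):
    """Group individuals linked by sibship probability >= threshold (union-find)."""
    parent = {i: i for i in individuals}

    def find(x):
        # Follow parent links to the group root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, prob in pair_probs:
        if prob >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for i in individuals:
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def drop_siblings(individuals, pair_probs, threshold=SIBSHIP_THRESHOLD):
    """Keep one representative per sibling group (assumption: lowest ID)."""
    return sorted(min(group) for group in sibling_groups(individuals, pair_probs, threshold))


# Hypothetical sample IDs and probabilities (illustrative only)
individuals = ["w1", "w2", "w3", "w4", "w5"]
pair_probs = [("w1", "w2", 0.92), ("w2", "w3", 0.85), ("w4", "w5", 0.60)]
kept = drop_siblings(individuals, pair_probs)
```

Here `w1`, `w2` and `w3` form one group because each link meets the 0.85 threshold, whereas `w4` and `w5` stay separate at 0.60.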
We used an admixture model with a burn-in period of 10,000 iterations followed by 20,000 Markov chain Monte Carlo iterations, and set the possible number of populations (K) between one and six for the national-scale analysis (number of regions) and between one and nine for the regional analysis (number of clusters). Each analysis was repeated three times for each value of K. We used the R package pophelper (Francis, 2017) to implement Evanno's method (Evanno et al., 2005), which identifies the most likely number of clusters from ΔK, and to generate graphical outputs. All R analyses were performed with R version 4.1.2 (R Core Team, 2022).
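To make the ΔK statistic concrete, the sketch below computes it from replicate log-likelihoods per K. It is an illustration in Python rather than the pophelper code used in the study. It takes the second-order difference from the replicate means, a common simplification, and the log-likelihood values are invented for the example.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Evanno's delta K from replicate log-likelihoods.

    lnp_by_k: dict mapping K -> list of ln P(D) values, one per replicate run.
    Returns a dict K -> delta K for every K that has both neighbours K-1 and K+1.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta_k = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the smallest and largest K tested
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta_k[k] = second_diff / sd
    return delta_k


# Invented log-likelihoods, three replicates per K (illustrative only)
lnp_by_k = {
    1: [-5000.0, -5001.0, -4999.0],
    2: [-4800.0, -4802.0, -4798.0],
    3: [-4790.0, -4795.0, -4785.0],
    4: [-4788.0, -4780.0, -4796.0],
}
delta_k = evanno_delta_k(lnp_by_k)
best_k = max(delta_k, key=delta_k.get)
```

Because ΔK needs values at both K − 1 and K + 1, it cannot be evaluated at K = 1, so it cannot on its own support a single panmictic population.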
Two further DNeasy samples and 80 Chelex samples (20 from 2017 and 60 from 2018) did not amplify at seven or more loci. These samples were removed from subsequent analyses. The final sample sizes were 105 individuals from 2017 and 196 from 2018; of these, 79 were DNeasy samples and 222 were Chelex samples.
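The exclusion rule above can be expressed as a simple filter. The following is a minimal sketch under stated assumptions: genotypes are held as a dictionary of locus to allele pair, with `None` marking a locus that failed to amplify. The sample IDs and allele sizes are hypothetical; only the locus names and the seven-locus threshold come from the text.

```python
MAX_FAILED_LOCI = 7  # samples failing at this many loci or more are removed

# The 12 loci used for DNeasy samples, as listed in the text
LOCI = [
    "LIST2007", "LIST2003", "LIST2013", "LIST2018", "VMA3", "LIST2004",
    "LIST2001", "LIST2011", "VMA6", "R1-169", "LIST2017", "Rufa19",
]


def count_failed_loci(genotype):
    """Number of loci with no scored alleles (None) for one sample."""
    return sum(1 for alleles in genotype.values() if alleles is None)


def filter_samples(genotypes, max_failed=MAX_FAILED_LOCI):
    """Split samples into (kept, removed) by the number of failed loci."""
    kept, removed = {}, {}
    for sample_id, genotype in genotypes.items():
        target = removed if count_failed_loci(genotype) >= max_failed else kept
        target[sample_id] = genotype
    return kept, removed


def make_genotype(n_failed):
    """Build a toy genotype in which the first n_failed loci did not amplify."""
    return {
        locus: (None if i < n_failed else (150, 154))  # hypothetical allele sizes
        for i, locus in enumerate(LOCI)
    }


# Hypothetical samples: s2 fails at seven loci and is removed; s3 fails at six and is kept
genotypes = {"s1": make_genotype(0), "s2": make_genotype(7), "s3": make_genotype(6)}
kept, removed = filter_samples(genotypes)
```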
of the 81 PCRs required repeating; by contrast, 414 PCRs were necessary to genotype the 196 Chelex individuals used in this study. PCR reagents (based on the price of the Type-it Microsatellite PCR kit [Qiagen] at the time of this study) cost £38.64 per plate and sequencing cost £115 per plate. Accounting for the number of failed PCRs, we calculated that the total cost of DNA extraction, PCR and sequencing was £12.02 per DNeasy sample and £17.39 per Chelex sample. In terms of labour costs, DNA extraction was less time-consuming with the Chelex method than with the DNeasy one; however, because of the number of PCR repeats associated with Chelex samples, labour costs with Chelex exceeded those of DNeasy. For future studies wishing to
We used the DNeasy kit for a subset of 2017 samples (n = 91), and the Chelex method for the others (n = 46 samples from 2017, n = 256 samples from 2018). The DNeasy method followed the manufacturer's instructions, except for the amount of Proteinase K (20 μL here) and that samples were incubated overnight. The Chelex method consisted of adding 450 μL of 10% Chelex 100 solution to each sample before manually crushing the sample with sterile pestles, then adding 5 μL Proteinase K and incubating the samples at 57 °C for 3 h. After incubation, each sample was vortexed and boiled at 95 °C for 8-10 min. To separate the supernatant from the wasp particles and the Chelex 100 beads, the samples were centrifuged at 16,873 × g (14,000 rpm) for 15 min, then ~200 μL of supernatant was transferred to fresh tubes for PCR. Samples were stored overnight at 4 °C or at −20 °C for longer periods of time when necessary. We examined the success of DNA extraction using the DNeasy method with a Nanodrop where possible. Samples from both years were stored for similar periods of time and treated in the same way.
on seven BWS V. vulgaris individuals using Qiagen's Multiplex PCR kit, and determined amplification success using gel electrophoresis. Each reaction contained 6.25 μL master mix, 5 μL DNA and 0.21 μL of each forward and reverse primer (with three primer pairs per reaction). The annealing temperature (Ta) was 52 °C for all loci. Forward primers were fluorescently tagged with the dyes 6FAM or HEX. Sequencing was performed with an ABI3730xl DNA Analyzer and peaks were scored by two people independently using the Geneious Prime Microsatellite Plugin (v1.4.6). Twenty-eight markers amplified, and of these, 19 were polymorphic. We selected 12 primers for the DNeasy samples (LIST2007, LIST2003, LIST2013, LIST2018, VMA3, LIST2004, LIST2001, LIST2011, VMA6, R1-169, LIST2017 and Rufa19), with the addition of D3-15, VMA4 and D2-182 for the Chelex samples (Table ).
Values from pairwise FST tests. Column and row names refer to regions: SE, Southern England; EE, Eastern England; NE, Northern England; WE, Western England + Wales; SC, Scotland; NI, Northern Ireland. Figures in brackets are 95% confidence intervals of FST obtained by bootstrapping.