Population genetic structure and predominance of cyclical parthenogenesis in the bird cherry‐oat aphid Rhopalosiphum padi in England

Abstract Genetic diversity is the determinant for pest species’ success and vector competence. Understanding the ecological and evolutionary processes that determine the genetic diversity is fundamental to help identify the spatial scale at which pest populations are best managed. In the present study, we present the first comprehensive analysis of the genetic diversity and evolution of Rhopalosiphum padi, a major pest of cereals and a main vector of the barley yellow dwarf virus (BYDV), in England. We have used a genotyping‐by‐sequencing approach to study whether (a) there is any underlying population genetic structure at a national and regional scale in this pest that can disperse long distances; (b) the populations evolve as a response to environmental change and selective pressures; and (c) the populations comprise anholocyclic lineages. Individual R. padi were collected using the Rothamsted Insect Survey's suction‐trap network at several sites across England between 2004 and 2016 as part of the RIS long‐term nationwide surveillance. Results identified two genetic clusters in England that mostly corresponded to a North–South division, although gene flow is ongoing between the two subpopulations. These genetic clusters do not correspond to different life cycle types, and cyclical parthenogenesis is predominant in England. Results also show that there is dispersal with gene flow across England, although there is a reduction between the northern and southern sites with the south‐western population being the most genetically differentiated. There is no evidence for isolation by distance and other factors such as primary host distribution, uncommon in the south and absent in the south‐west, could influence the dispersal patterns. Finally, results also show no evidence for the evolution of the R. padi population, and it is demographically stable despite the ongoing environmental change. These results are discussed in view of their relevance to pest management and the transmission of BYDV.


| INTRODUC TI ON
Insect pests are responsible for the loss of up to 20% of the major grain crops in the world with some models predicting that this yield loss will increase to 19%-46% driven by a 2°C rise of the average global surface temperature (Deutsch et al., 2018). The efficacy and sustainability of new smarter control programmes will be contingent on the genetic variation of pest populations (Powell, 2018). This is because genetic variation determines the potential for adaptation and the rate of evolution in populations (i.e. how likely is that insecticide resistance will evolve; Hawkins, Bass, Dixon, & Neve, 2019), and the likely virus transmission dynamics as vector competence can differ between genotypes (Jacobson & Kennedy, 2013). Evolutionary and ecological factors, such as gene flow and distribution of hosts, affect genetic variation in populations that influences the adaptive responses to environmental change and selective pressures (Caprio & Tabashnik, 1992;Davis & Shaw, 2001;Dong, Li, & Zhang, 2018).
Genetic diversity across a species range can be used to infer ecological and evolutionary aspects of pest organisms that are difficult to study directly in wild populations. These inferences include but are not limited to levels of gene flow between populations and dispersal pathways, the population size and demographic responses to different environmental processes, and the ecological factors influencing genetic diversity and adaptation across space (Lushai & Loxdale, 2004;Roderick, 1996). Therefore, the study of the longitudinal and spatial distribution of the genetic diversity is critical for developing preventive strategies to control and manage pest and pathogens. This information will improve our capacity to monitor and model pest dynamics, which is fundamental for the development of new, more efficient smart crop protection approaches, promoting informed decision support.
Aphids are major pests of crops and vectors of some of the world's major plant viruses (Nault, 1997). They have a complex life cycle, which includes asexual reproduction, host alternation and winged/wingless morphs (Van Emden & Harrington, 2017), and this variation has been suggested to play an important role in their success (Loxdale, Edwards, Tagu, & Vorburger, 2017). Many aphid species have a wide range of hosts, ranging from generalists such as Myzus persicae to host specialist that can comprise host-specific strains with varying degrees of genetic divergence, such as Acyrthoshipon pisum (Blackman & Eastop, 2000;Peccoud, Maheo, De La Huerta, Laurence, & Simon, 2015;Peccoud, Ollivier, Plantegenest, & Simon, 2009). In addition, aphids show great plasticity in terms of reproductive type at the intraspecific level; lineages in some species range from cyclical parthenogenesis with one sexual generation in autumn-winter (holocycly) to obligate parthenogenetic (Blackman, 1974;Leather, 1993). This variation results in a great capacity to adapt to different environmental conditions. For example, it has been shown that populations of holocyclic species remain entirely parthenogenetic during mild winters. As a result of this climatic-induced response, a correlation between the proportion of obligate parthenogenetic and latitude has been shown (Blackman, 1974;Llewellyn et al., 2003;Simon et al., 1999). This variation in life cycle drives abundance and plays a central role in the transmission of viruses to crops. Primary infection is transmitted by winged aphids that move between fields and landscapes; however, secondary infection is evident at the field level by wingless individuals that spread the infection between neighbouring plants (Jepson & Green, 1983;Ribbands, 1964). In heteroecious species, virus transmission occurs during the parthenogenetic phase until autumn, and it is interrupted later by winter and the return of the gynoparae (asexual females that produce the sexual females) to the primary host, usually a woody plant not suitable for the viruses to persist. In milder climates where individuals reproduce parthenogenetically all year round, the virus transmission in the crops may continue throughout the winter. Therefore, it is fundamental for virus risk evaluation to monitor the predominant mode of reproduction of aphids and the proportion of anholocyclic lineages. The variation in life cycle leaves a signature in the genome of aphids, and the use of population genetics approaches can be used to infer the predominant mode of reproduction in populations (Halkett, Simon, & Balloux, 2005).
The bird cherry-oat aphid, Rhopalosiphum padi, is one of the major pests of cereals in the temperate regions, and it is a main vector of the barley yellow dwarf virus (BYDV) that causes cereal yield losses of between 20% and 80% (Vickerman & Wratten, 1979). In the UK, R. padi is a heteroecious species alternating between the primary host Prunus padus (bird cherry tree) and the secondary host, cereals and other grasses (Rogerson, 1947). While R. padi is predominantly holocyclic, with one generation of sexual reproduction to produce diapause eggs that overwinter on the primary host, in some regions of continental Europe and perhaps even in southern Britain, mild winters favour permanent asexual reproduction on the secondary grass host (Leather, Walters, & Dixon, 1989). As a result, a cline in the proportion of parthenogenetic clones related to winter temperature has been described in France; in addition, sexual and asexual forms are genetically differentiated although some gene flow occurs through the occasional generation of males by asexual clones (Delmotte, Leterme, Gauthier, Rispe, & Simon, 2002;Martinez-Torres, Moya, Hebert, & Simon, 1997). In the UK, it has also been suggested that there is an increase in the number of anholocyclic clones towards the south (Williams & Dixon, 2007 Durrant & Caudullo, 2016). In Great Britain, R. padi showed a homogeneous geographic genetic structure observed in alloenzyme studies suggesting that the aphids undergo a long-distance movement (Loxdale & Brookes, 1988). The long-distance dispersal in the UK mostly occurs in the gynoparae individuals as a result of their flight to the sparsely dispersed primary host, P. padus (Tatchell, Parker, & Woiwod, 1983;Tatchell, Plumb, & Carter, 1988). Thus, the host plant distribution influences the dispersal behaviour and ultimately the population genetic diversity of the species (Loxdale & Brookes, 1988).
In this study, we analysed the distribution of the genetic variation in time and space of R. padi in Great Britain to provide further insight into key aspects of this pest ecology and evolution. We demonstrate that genotyping-by-sequencing (GBS) approaches can detect population structure in regions where species disperse long distances that would show, a priori, a weak structure signal. Thus, we have analysed the population structure and gene flow levels and inferred the predominant mode of reproduction in the English populations. We have also analysed the proportion of gynoparae and virginoparae females collected during the autumn dispersal to confirm the genetic results. This study will be relevant to improve the management and control of cereal viruses vectored by this pest species, such as BYDV.

| Samples
A total of 316 individuals of R. padi from the RIS biological archive collected in seven locations (Starcross, Wye, Writtle, Hereford, Preston, York and Newcastle; Figure 1) in 2004,2007,2010,2013 and 2016 (Table 1 and Table S1) were used in the present study.
Samples in the biological archive of the RIS were preserved at room temperature in a solution containing 95% ethanol and 5% glycerol, although for this study the 2016 samples were stored at −20°C two weeks after collection rather than being committed to the archive.

| DNA extraction
DNA was extracted from single individuals using two commercial kits, Qiagen's DNeasy Blood & Tissue Kit or QIAamp DNA Micro Kit, to identify the method that provided better DNA yield and quality from the archive samples. DNA extractions were done following the manufacturer's protocol but with the following modifications: individual aphid specimens were placed in 1.5-ml microcentrifuge tubes and submersed in liquid nitrogen prior to homogenization with a sterile pestle; 180 µl of buffer ATL was added afterwards to the homogenized sample. Samples that were extracted with the Blood & Tissue Kit were incubated with 5 µl of RNase A (100 mg/ ml) for 30 min at 37°C, followed by a 3 hr' incubation with 20 µl of Proteinase K at 56°C in a shaker at 180 rpm. The DNA Micro Kit samples were instead incubated overnight at 56°C with 20 µl of Proteinase K in a shaker at 180 rpm. but without a previous incubation with RNase A. Carrier RNA was added to Buffer AL as recommended in the DNA Micro Kit manual for small samples, at a final concentration of 5 ng/µl of carrier RNA. The elution of the DNA was done with prewarmed (56°C) buffer AE, leaving the column stand for 10 min at room temperature before centrifugation; a second elution was performed using the first eluate. DNA yield was quantified using Qubit's dsDNA assay.

| Genome sequencing and assembly
In order to improve the available genome of R. padi, long reads were generated using a MinION (ONT). DNA was extracted from a pool of F I G U R E 1 Map of Great Britain showing the location of the suction traps from which the samples were used four individuals of R. padi using Qiagen's DNeasy Blood and Tissue Kit. A total of 2.9 µg of DNA was used to prepare a genomic library following the ONT 1D ligation library protocol SQK-LSK108. A total of 1 µg of genomic library was loaded on a FLO-MIN107 flow cell, and the sequencing run was performed with live base calling option on using MinKNOW v1.11.5.
Gene prediction was done with Augustus v3.3.1 (Stanke, Diekhans, Baertsch, & Haussler, 2008) using the same trained set of genes used by Thorpe et al. (2018) and RNAseq exon and intron hints. These hints were created by mapping RNAseq data available for R. padi PRJEB9912 and PRJEB24317 (ERR2238845-ERR223884) (Thorpe, Cock, & Bos, 2016) to the genome assembly using the splice aware aligner STAR v2.4.0h (Dobin et al., 2013). Hints were created from the aligned bam and wig files using bam2hints (for the intron hints) and wig2hints (for the exon hints); these were merged to run Augustus. Gene models were annotated using Blast2GO 5.0.22 (Conesa et al., 2005). Blast was run using the Arthropoda database. and York (n = 10) during March-October in 2004, 2010, 2013 (Table 2; Table S2). To increase the amount of DNA obtained from individual aphids, which in most cases was lower than 500 ng, whole-genome amplification (WGA) was performed using Qiagen's Repli-G UltraFast Mini Kit. The WGA reactions consisted of an equal volume of DNA (1, 2 or 4 µl depending on the DNA template concentration) and buffer D1 (prepared following the Kit's manual), which was incubated at room temperature for 3 min; after this incubation, an equal volume of buffer N1 was added (2, 4 or 8 µl) to the reaction. This was mixed by flicking and spun down before adding 15 µl of reaction buffer and 1 µl of REPLI-g UltraFast DNA Polymerase. Reactions were incubated 2 hr at 30°C followed by 3 min of incubation at 65°C. DNA quantification was done using Qubit's dsDNA BR assay. In some instances, a second WGA using 1 µl of whole-genome amplified DNA as template was performed when the DNA amount obtained after a first WGA was not enough for sequencing.

| Genotyping of samples
Samples were genotyped using genome-wide single nucleotide polymorphisms (SNPs) identified following a genomic reduced representation sequencing method (genotyping by sequencing, GBS).
Sequencing of Writtle and York samples was done separately to TA B L E 1 Summary of number of samples used for DNA extraction, kits used and the mean amount of DNA obtained per aphid Hereford Mean DNA yield (

| Analyses of population structure
SNPs called with FreeBayes were filtered using VCFtools v0.1.14 (Danecek et al., 2011) before the markers were used in subsequent analyses. Different filtering schemes were used to obtain a data set that maximized the quality of the SNPs and genotypes while minimizing the missing data at marker and individual levels (Table   S3), as recommended by O'Leary, Puritz, Willis, Hollenbeck, and Portnoy (2018).
The population structure was investigated using the Bayesian genetic clustering algorithm implemented in Structure 2.3.4 (Pritchard, Stephens, & Donnelly, 2000). We used the admixture model with correlated frequencies. To detect any potential subtle genetic structure, we ran Structure with the sampling locations set as priors (locprior = 1); this model has the power to detect a weak structure signal and does not bias the results towards detecting genetic structure when there is none. The K parameter was tested for values ranging from 1 to 6 with 10 simulations for each. We used 100,000 samples as burn-in and 200,000 samples per run for the Monte Carlo Markov Chain (MCMC) replicates. Parameter convergence was inspected visually. We ran the Structure simulations using a multicore computer with the R package ParallelStructure (Besnier & Glover, 2013). The number of K groups that best fitted the data set was estimated using the method of Evanno, Regnaut, and Goudet (2005) using Structure Harvester Web v0.6.94 (Earl & Vonholdt, 2012). Cluster assignment probabilities were estimated using the programme Clumpp (Jakobsson & Rosenberg, 2007) as implemented in the web server CLUMPAK (Kopelman, Mayzel, Jakobsson, Rosenberg, & Mayrose, 2015). The genetic diversity measures of the populations were estimated using Arlequin 3.5.2.2 (Excoffier, Laval, & Schneider, 2005). Genetic variation among populations was investigated using an analysis of the molecular variance (AMOVA) with 10,000 permutations using Arlequin 3.5.2.2. We used hierarchical AMOVA to test the following population structures: (a) differentiation between the north (Newcastle, Preston and York) and south (Wye, Starcross and Writtle) regions; (b) differentiation between genetic clusters identified with Structure; (c) differentiation of Newcastle populations from different years (2004, 2007, 2010, 2013 and 2016); (d) Starcross samples by year (2004, 2007, 2010, 2013 and 2016); and (e) differentiation of populations in different seasons (spring, summer and autumn) in the north (Newcastle and Preston), south (Wye and Starcross) and all locations together. Population pairwise divergence was investigated using F ST , and the significance was evaluated with 10,000 permutations in Arlequin. We ran a Mantel test as performed in Arlequin to test for correlation between the genetic distances (F ST ) and the geographic distance between sampling locations estimated using Google maps. Demographic events such as population expansion or bottleneck were inferred using Tajima's D (Tajima, 1989) and Fu's F S (Fu, 1997) as estimated in Arlequin.

| Identification of reproductive types in autumn dispersing females
The identification of virginoparae (asexual females producing asexual progeny) and gynoparae (asexual females that produce sexual females) individuals dispersing in autumn is an indirect method to estimate the proportion of anholocyclic and cyclical parthenogenetic lineages. This is because gynoparae females fly to the primary host to produce the sexual forms, while the virginoparae are flying to winter cereals and other grasses to produce females that will reproduce parthenogenetically throughout winter. The RIS has been monitoring the autumn dispersal of R. padi

| Genome sequencing, assembly and annotation
The genome of R. padi has been recently sequenced (Thorpe et al., 2018). To improve the continuity of the published assembly, genomic DNA from R. padi was sequenced using a MinION (Oxford Nanopore Technology, ONT). The number of reads obtained that passed the quality control was 1,111,646, with a distribution of lengths ranging from 63 bp to 74,319 bp (median length: 3,006 bp, mean length: 3,625; Figure S1). These long reads were used together with the short Illumina reads generated by Thorpe et al. (2018) to assemble a new genome using MaSuRCA. The resulting assembly has a length of 321,589,008 bp, which is similar to the length of the previous genome version (

| Genotyping by sequencing of historical samples
DNA yield was low in general (Table 1, Table S1), but there was a significant difference in the amount of genomic DNA obtained between samples collected in different years (Appendix S1: Results, Figure S4). We performed whole-genome amplification of the DNA for most samples to obtain the required DNA amount for GBS (Table   S2). In the case of 21 samples, it was necessary to perform a second WGA reaction using as template 1 µl of the DNA product from the first WGA reaction. Of the 210 samples chosen for genotyping, 175 provided enough DNA for library construction and sequencing (  Figures S5 and S6). Different filtering schemes were applied to the SNP data set to identify the one that maximized the quality of the called SNPs and resulted in the highest number of called genotypes in the maximum number of individuals (Table S3).

| Genetic diversity of Rhopalosiphum padi in England
When all the samples were included in the analyses, filtering scheme 7 (FS7) resulted in 4,802 SNPs in 86 individuals (Table 4). The final proportion of missing data per locus was < 5% and per individual was < 45% ( Figure 2); hence, the subsequent analyses were carried out using the FS7 data set.
A total of 172 different haplotypes were observed in the population. The nucleotide diversity in all samples was π 0.21 and θ w 0.17 (Table 5). The gene diversity (also called expected heterozygosity, He) was higher than the observed heterozygosity (He = 0.24, Ho = 0.13), and the inbreeding coefficient (F IS ) was 0.42 (p = 0) (  Figure S7). The mean Ho was lower than the mean expected heterozygosity (He) in all locations, and the inbreeding coefficient (F IS ) observed in all sampling locations was positive and significant (Table 5).
Nevertheless, there was no deviation from the HWE at haplotype level in any of the populations (p = 1), although 37% (Newcastle), 25% (Preston), 17% (Starcross), 13% (Wye), 30% (Writtle) and 21% (York) of individual loci deviated significantly from the HWE. The F IS levels were higher in York and Writtle, which could be because the samples from these two locations were collected during a small period of two to three weeks in late July to early August (Table S4) (Table   S4), reducing the probability of resampling individuals from the same clones because the sampling period was spread over several months.

| Geographic structure of Rhopalosiphum padi
Results from the Structure analyses with the FS7 data set identified a K = 2 as the most probable number of genetic groups in the R.
padi population in England ( Figure S8). These two genetic clusters corresponded broadly to the south and north regions (Figure 3), although there were individuals in these two regions that had a higher probability of genetic membership to the alternative genetic cluster. This admixture is more common in the samples of southern origin (Starcross, Wye and Writtle), in which more individuals had a higher probability of membership to the northern genetic cluster.
The genetic differentiation (F ST ) within R. padi inferred by a nonhierarchical AMOVA was low but significant (0.041, p = .007). The north-south genetic differentiation was further explored using analyses of the molecular variance (AMOVA) ( Note:: minDP-includes only genotypes greater or equal to the value; min-meanDP-retains loci with a minimum mean depth of the given value; minQ-includes sites with quality above the value; mac-includes sites with minor allele count greater or equal to the value; max-missing-retains loci that have been successfully genotyped in the given proportion of individuals; imiss-retains individuals with a proportion of missing data smaller than the value; thin-removes loci that are closer than the given number of bp. This weak support for the phylogenetic clades is expected because, despite the high level of genetic differentiation between the two genetic groups, there is still gene flow between the locations.

TA B L E 4
Estimates of the Tajima's D and Fu's F S for the different populations did not deviate significantly from neutrality (

| Temporal differentiation of Rhopalosiphum padi populations
To explore the temporal differentiation of populations, we ana-    (Table 4), although the proportion of missing data per individual was still high in a few samples ( Figure S9). Thus, seven and 13 individuals had more than 70% of the loci missing in the Newcastle and Starcross data sets, respectively; still, 29 individuals from Newcastle and 17 from Starcross had less than 50% of missing data. In the case of Starcross, no samples from 2007 were kept after filtering. In both populations, the pairwise differentiation (F ST ) between years was low and nonsignificant (Table 8) (Table 9). When the differentiation was estimated between location season, the pattern becomes more complex (Table S5). The autumn populations were significantly differentiated from spring and summer populations, except in Starcross.
The spring and summer populations were not significantly differentiated in Preston and Wye, but have significant F ST values in Newcastle (0.057, p = .01) and Starcross (0.079, p = .049), although it should be noted that the seasonal analyses at population level could be biased by the different number of samples, which in some cases was low. It is interesting to note that the gene diversity varied throughout the season in all locations, although the variation pattern was different for each of the locations ( Figure 5). In general, the genetic diversity as measured by the He was higher in the spring than in summer and autumn when the samples were grouped by geographic region (north and south). Nevertheless, the He in the south decreased only in the autumn sample, while in the north the He decayed from spring to summer and increased again in the autumn to a similar level to that of the southern population ( Figure 5). The reason for this variation is unknown. A combination of photoperiod and temperature determines the timing of the production of gynoparae and males, and the mean temperature of July was correlated with their first appearance in the RIS suction traps (Dixon & Glen, 1971;Ward, Leather, & Dixon, 1984). The mean

| D ISCUSS I ON
The present study represents the first longitudinal analysis of the population genetic diversity of R. padi using a genomic approach with samples from the biological archive of the RIS that goes back to 2004. This study shows that reduced genome sequencing approaches can detect population genetic structure in species such as R. padi in regions where they disperse long distances (Delmotte et al., 2002;Loxdale & Brookes, 1988), which a priori should result in a weak signature of genetic differentiation, and help identify predominant mode of reproduction in the population by estimating the inbreeding coefficient and test for HWE, which is essential to understand plant virus transmission. This approach has also been able to detect weak signatures of seasonal genetic variation, which could be due to selection pressure changes.
The genome of R. padi was recently made publicly available (Thorpe et al., 2018), which facilitates the identification of genome-wide markers. In the present study, we have used long reads to improve the assembly and quality of the available genome. Thus, the number of scaffolds has been reduced more than 7x while maintaining a similar genome size as that of the previous version. In addition, the quality has increased as measured by the proportion of single ortholog genes identified, which has increased from 82% complete genes to 94% in the present assembly. Nevertheless, the number of genes annotated in the present genome is similar to that of the previous version. The availability of a quality genome assembly has helped with the identification and analyses of genome-wide molecular markers for the population genetic analyses of this species in England.
The study has identified two genetic clusters that correspond broadly to a geographic differentiation between the south and the north locations. These two genetic clusters explain 17% of the genetic variation, which is higher than the source of variation between sexual and asexual populations in France that was shown to be 11% (Delmotte et al., 2002) and in China, where it was 12.31% (Duan, Peng, Qiao, & Chen, 2017). The two clusters identified in England could also correspond to sexual and asexual forms; however, the inbreeding coefficient F IS is positive and significant in both clusters indicating a deficiency in the observed heterozygosity, which is normally explained by factors such as inbreeding or admixture of subpopulations (Wahlund effect). Instead, asexual reproduction results in an excess of heterozygotes and therefore a negative F IS , as observed in the French asexual population of R. padi (Halkett et al., 2005). In addition, the presence of individuals with mixed southern and northern genotypes is most likely the result of sexual reproduction between individuals from the northern and southern clusters, indicating cyclical parthenogenesis. This is also apparent in the phylogenetic tree, which shows short internal branches with low bootstrap support. This uncertainty in the phylogenetic relationships is possibly due to incongruent phylogenetic signal across the SNP data set, which could arise as a result of recombination (Brito & Edwards, 2009  Cyclical parthenogenetic populations of R. padi, S. avenae and other aphids have been shown previously to have a significant homozygosity excess (Delmotte et al., 2002;Duan et al., 2017;Hebert, Finston, & Foottit, 1991;Simon et al., 1999;Simon & Hebert, 1995;Wang, Hereward, & Zhang, 2016). Four hypotheses have been proposed to explain this in R. padi (Delmotte et al., 2002): the presence of null alleles, which is unlikely the case in our data set as filters have been applied to minimize the number of loci with missing genotypes; selection due to differential winter survival between lineages; inbreeding; and allochronic isolation due to different timing in the production of sexual aphids between different lineages. Differential survival between lineages was discarded by Delmotte et al. (2002) due to their sampling scheme.
Here, we can also rule out this explanation as natural selection would result in allele frequency variation and genetic differentiation between years, which is not observed in this study. Inbreeding linked to a patchy distribution of host plants was proposed to be the cause of an excess of homozygosity in Melaphis rhois (Hebert et al., 1991). In the case of R. padi, the distribution of the primary host P. padus is also patchy across Great Britain and specifically sparse in the south (Leather, 1996). The inbreeding coefficient is higher in the north, where individuals do not have to disperse long distances to find a P. padus tree to reproduce and overwinter, increasing the probability of related individuals mating. On the other hand, the individuals from the south will need to disperse longer distances as P. padus is rarer, decreasing the probability of inbreeding, although the lack of significant genetic differentiation within the geographic regions indicates that the distribution of P.
padus is not the only factor influencing the significant excess of homozygosity. While the distribution of the primary host is likely to influence the genetic variation of R. padi, further studies of the biology and diversity should be performed to determine the factors resulting in the observed pattern of deviation of HWE.
In addition to the two genetic clusters, it was observed that the levels of genetic differentiation varied geographically. There was no significant differentiation between the two northernmost locations (Preston and Newcastle) or the two southernmost (Wye and Starcross), but there was a significant differentiation between the northern and southern sites with the south-west being the most genetically differentiated population. The two locations of York and Writtle showed high genetic differentiation with respect to all the other populations, which was unexpected given their geographic location. There was no correlation between the geographic and genetic distances between the locations, so there is no isolation by distance. This suggests that there are other factors than just geographic distance that are reducing the dispersal of individuals between the north and the south, and mostly in the south-west. Ecological factors such as the distribution of the primary host, P. padus, are likely to influence the dispersal capacity of individuals although, as discussed above, it is probably not the only feature influencing the dispersal of this species. It should be noted that the two genetic clusters account for most of the differentiation within the species, and these correspond to the north and south regions. Thus, even when there is dispersal between the geographic regions as evidenced by individuals from each genetic cluster being collected in the alternative geographic region, the genetic differentiation is due to a reduction in gene flow between the two genetic forms. Landscape genetic analyses should be carried out with more dense sampling to determine the resistance surfaces that reduce dispersal between regions.
Climate change is shifting the distribution and reducing the genetic diversity in many species (Balint et al., 2011). In aphids, it has been observed that their phenology has changed during the last 50 years, with an earlier first flight (Bell et al., 2015). Nevertheless, there has been no significant change in their long-term population size despite yearly cycles in abundance (Bell et al., 2015). Thus, there is no indication that climate change has affected the abundance of this pest in Britain, which is consistent with the observations of Bell et al. (2015); there is also no evidence for a reduction in the genetic diversity of the populations through time, which could suggest that they are stable and resilient to changing climatic conditions.
Another aspect of temporal dynamics is the seasonal change in populations. This study has identified low but significant differentiation levels between samples collected in spring, summer and autumn, with this latter being the most differentiated. There is also a variation in the observed heterozygosity and genetic diversity from spring to autumn. Heterozygosity has been observed to decrease from the primary to the secondary host populations of R. padi in Canada (Simon & Hebert, 1995), padi are a tree, the bird cherry, and the secondary host is herbaceous, cereals (Rogerson, 1947), but the R. padi emigrants that come from the primary host during spring colonize grasses readily and are well-adapted, also there does not seem to be much effect of predators on the spring population of R. padi on P. padus (Dixon, 1971;Leather & Dixon, 1982). Therefore, the reduction in genetic diversity (He) in relation to the spring samples is not explained by the dispersal from first host to secondary host. A very low proportion of gynoparae and males flying in autumn to the primary host survive the dispersal phase (Ward, Leather, Pickup, & Harrington, 1998), which could explain in part the differentiation of the autumn population. In addition, insecticide use on crops during the growing season can result in a reduction in the heterozygosity and the allelic diversity during summer. Nevertheless, no insecticide resistance has been observed in populations of R. padi in the UK and it would be expected a larger reduction in genetic diversity in the autumn dispersing population than that observed in the study. The possibility remains that a large proportion of the R. padi population remains on other secondary host plants than crops and maintains the diversity of the species, serving as the source of variability.
In conclusion, the use of genomic approaches has allowed the detection of geographic structure signature in a pest species at a national and regional scale in a region where they disperse long distances. The study has identified two genetic clusters in this aphid pest in England, which could play different roles as vectors of plant viruses. In addition, there is significant genetic differentiation across the distribution of R. padi, being the south-west population the most differentiated of these. This genetic differentiation cannot be explained just by geographic distance, which suggests that there are other factors that prevent complete panmixia. The dominant form of reproduction in English populations of R. padi is mostly cyclical parthenogenetic, which has an impact on the transmission of BYDV, although population monitoring is still recommended to identify any possible shift towards anholocycly. There is no evidence for long-term demographic changes in the populations of R.
padi, which is consistent with previous studies of the population dynamics of the species, indicating that environmental change and seasonal selective pressures as insecticide application will have little impact on the genetic diversity of the species. These results have direct implications in control and management of R. padi, but further studies are needed to fully understand the diversity and dynamics of the species to improve control programmes and prediction models.

ACK N OWLED G EM ENTS
This study has been possible thanks to Rothamsted Smart Crop for useful discussions about the genome assembly.