Contrasting population structure and demographic history of cereal aphids in different environmental and agricultural landscapes

Abstract Genetic diversity of populations has important ecological and evolutionary consequences, whose understanding is fundamental to improve the sustainability of agricultural production. Studies of how differences in agricultural management and environment influence the population structure of insect pests are central to predict outbreaks and optimize control programs. Here, we have studied the population genetic diversity and evolution of Sitobion avenae and Sitobion miscanthi, which are among the most relevant aphid pests of cereals across Europe and Asia, respectively. We have used genotyping by sequencing (GBS) to identify genome‐wide single nucleotide polymorphisms (SNPs) to infer the geographic structure and migration patterns. In the present study, we show that the population structure in present‐day populations is different from that described in previous studies, which suggest that they have evolved recently possibly as a response to human‐induced changes in agriculture. This study shows that S. avenae in England is predominantly parthenogenetic and there has been a demographic and spatial expansion of a single genetic cluster, which could correspond with the insecticide resistance superclone identified in previous studies. Conversely, in China, S. miscanthi populations are mostly cyclical parthenogenetic, with one sexual stage in autumn to produce overwintering eggs, and there are six genetically differentiated subpopulations and high genetic differentiation between geographic locations, which suggests that further taxonomical research is needed. Unlike S. avenae in England, there is no evidence for insecticide resistance and there is no predominance of a single lineage in S. miscanthi in China.


| INTRODUC TI ON
A major challenge in agricultural entomology is to develop efficient control strategies for pest organisms. For this, it is important to understand how environmental and anthropogenic factors influence the genetic structure and the evolutionary dynamics of insect populations (Pelissie, Crossley, Cohen, & Schoville, 2018). The level of genetic structure and diversity is the result of a combination of several factors which include selection, migration, and life history (i.e., reproduction mode) (Leffler et al., 2012), and studying their consequences on insect populations is of great interest to improve ecological agricultural practices (Pelissie et al., 2018).
The use of pesticides remains a necessary way to control and manage pests in agriculture. However, their use imposes a strong selection pressure on pest populations and resistance to different types of insecticides has, therefore, evolved in many insects (Bass, Denholm, Williamson, & Nauen, 2015;Georghiou, 1972).
Designing new strategies of pest control that rationalize the use of insecticides and reduce the likelihood of an evolution of resistance is key for the development of sustainable agriculture practices that reduce the environmental footprint (Wijnands, 1997), and understanding how pest populations respond to selective pressure and adapt to ecological changes is key to design rational strategies of management and control that are more targeted. In addition, a better understanding of the geographic connectivity between populations and the dispersal capacity of pests provides valuable information to control their abundance and distribution, while preventing also the spread of adaptive genetic variation, such as insecticide resistance, across their geographic range (Mazzi & Dorn, 2012). Therefore, it is essential that we incorporate the fundamental knowledge about population genetics into agricultural entomology.
Aphids comprise some of the most pernicious species of crop pests. In cereals, Sitobion avenae and Sitobion miscanthi are two of the most economically important species in Europe and Asia, respectively, and they are major vectors of the barley yellow dwarf virus (BYDV), which can severely reduce cereal yield (Vickerman & Wratten, 1979). Both Sitobion species are monoecious, nonhost-alternating, feeding only on Poaceae grasses and cereals (Blackman & Eastop, 2017). Like many aphids, S. avenae and S. miscanthi show different levels of variation in their life cycle, from individuals that are obligate cyclical parthenogenetic and have a generation that undergoes sexual reproduction in the primary host (holocycly), to clones that are obligate parthenogenetic and reproduce asexually all year round (anholocycly) (Dedryver, Le Gallic, Gauthier, & Simon, 1998). In addition, individuals can remain asexual in the cereal crops during winter as a response to environmental cues such as warmer temperatures and day length (Dedryver et al., 1998).
As a result, a geographic cline in the reproductive type has been described in the S. avenae populations of UK and France, with increasing proportion of sexual reproduction toward the north of the countries (Llewellyn et al., 2003;Simon et al., 1999). In the case of S. miscanthi, variation in the life cycle has also been described.
Populations from this species in Australia and New Zealand are anholocyclic, while they are holocyclic in Taiwan (Sunnucks, England, Taylor, & Hales, 1996;Wilson, Sunnucks, & Hales, 1999). In China, S. miscanthi has been traditionally reported to be anholocyclic (Guo, Shen, Li, & Gao, 2005;Zhang, 1999). However, contrary to the observations in S. avenae and other species, a recent population genetics study has observed signatures of cyclical parthenogenesis in the southern populations of the country while obligate parthenogenetic reproduction would be dominant in the north (Wang, Hereward, & Zhang, 2016).
Resistance to pyrethroids was first detected in the UK populations of S. avenae in 2011. This was due to a knockdown resistance (kdr) mutation (L1014F) in the sodium channel gene . This knockdown appeared as a heterozygous mutation in one clone of S. avenae, known in the literature as clone SA3, and rapidly increased its abundance in the UK population from 2009 to 2014, although in variable proportions in different locations and years (Dewar & Foster, 2017;Malloch, Foster, & Williamson, 2016;Malloch, Williamson, Foster, & Fenton, 2014). The spread of the mutation in the UK was limited by the fact that the SA3 clone is anholocyclic, so pyrethroid resistance has not spread to other lineages through sexual recombination. In addition, the high connectivity of the UK populations revealed using four microsatellite loci (Llewellyn et al., 2003), probably facilitated the geographic spread of the resistant clone from its location of origin. Therefore, the continued use of pyrethroids combined with the long dispersal capacity of S. avenae has likely favored the spread of this clone across the UK. Nevertheless, the clonal diversity inferred using microsatellites remained high and similar to the diversity before the evolution of pyrethroid resistance, and other susceptible clones and phenotypes were still present in different proportions in British populations by 2015 (Llewellyn et al., 2003;Malloch et al., 2016). In the case of S. miscanthi, there is no available information in the literature regarding the evolution of insecticide resistance. However, understanding the dynamics and movement of the species can help manage and control the damage in cereal crops, and establish management programs to reduce the likelihood of insecticide resistance evolution. In China, previous studies have shown high levels of genetic diversity in S. miscanthi using a panel of five microsatellites (Guo et al., 2005;Wang et al., 2016), similar to those reported for S. avenae in the UK and France, and there is genetic differentiation between north and south of the country but low differentiation within each region, suggesting free gene flow within geographic regions (Guo et al., 2005;Wang et al., 2016).
In the present study, we analyze the population genetics and demographic history of S. avenae and S. miscanthi in England and China, respectively, using genotyping by sequencing (GBS) to identify potential weak differentiation. We discuss the results in view of the differences in life-history types and the evolution of insecticide resistance, which may be limited by the reproductive type. These genomics approaches have identified genetic variation at a national and regional scale for other aphids in regions where they disperse long distances (Morales-Hojas et al., 2019).

| Samples
Individuals of S. avenae were collected using the network of 12.2 m high suction traps that is run by Rothamsted Insect Survey (RIS). The RIS suction traps are continuously collecting flying insects, and during the aphid season, aphid samples are identified daily to species level (Morales-Hojas, 2017;Storkey et al., 2016); of the identified aphids, 10 individuals of S. avenae collected during June-July 2018 with suction traps located in 12 sites across England (Starcross, Wye, Writtle, Broom's Barn, Kirton, Rothamsted, Silwood Park, Wellesbourne, Hereford, Preston, York, and Newcastle; see Table 1 and Figure 1) were used for this study. Individuals of S. miscanthi were collected in 10 sites (Kunming, Mianyang, Wuhan, Qingdao, Tai'an, Pingliang, Yinchuan, Langfang, Taigu, and Suzhou) across the cereal growing areas of China between February and June of 2017 (Table 1, Figure 1). The 10 individuals of S. miscanthi were collected from the same wheat field but from plants separated by 10 m to reduce the probability of sampling the same clone.

| DNA extraction and SNP genotyping
DNA was extracted from samples using Qiagen's DNeasy Blood and Tissue kit following the manufacturer's protocol. Identification of SNP loci was done using GBS. Library preparation and sequencing of samples were outsourced commercially to Novogene LTD in the case of S. avenae and Allwegene Technology LTD S. miscanthi. Briefly, genomic DNA was digested with MseI in the case of S. avenae and with ApeKI in the case of S. miscanthi. The library preparation was performed following the standard Illumina pair-end (PE) protocol, and PE sequencing of 150 bp was performed on an Illumina HiSeq platform. Read quality was assessed with FastQC v0.67, and in the case of S. miscanthi, the first 10 bases were trimmed due to low quality using trimmomatic 0.36.1 (Bolger, Lohse, & Usadel, 2014). Reads were mapped to a draft of the S. avenae genome using BWA-MEM 0.7.16.0 letting BWA choosing the best algorithm to construct the index and the option of setting read group information Picard style.

| Analyses of population structure
The population structure of both species was investigated using the Bayesian genetic clustering algorithm implemented in Structure 2.3.4 (Pritchard, Stephens, & Donnelly, 2000). We used the admixture model with correlated frequencies, and to detect any potential subtle genetic structure, we ran Structure with the sampling locations set as priors (locprior = 1); this model has the power to detect a weak structure signal and does not bias the results toward detecting genetic structure when there is none. Analyses for the two species were performed separately and with different parameters as the datasets were obtained following different protocols (different sequencing companies) and differed in number of individuals, markers, and quality. In the case of S. miscanthi, a first run of Bayesian clustering analyses of the population structure was carried out with five independent simulations with 100,000 burnin and 100,000 mcmc chains for each of K 1-10. An additional run of five independent simulations with 100,000 burn-in and 500,000 mcmc chains was carried out for K 5-10 to confirm the results of the first run and ensure convergence in the mcmc step. In the analyses of S. avenae from the UK, Structure was run with 5 replicates of 500,000 burn-in and 1,000,000 mcmc chains for K ranging from 1 to 12. Summary statistics (alpha and likelihood parameters) convergence was inspected visually to confirm that the burn-in and run lengths were adequate. We ran the Structure simulations using a multicore computer with the R package ParallelStructure (Besnier & Glover, 2013) in the CIPRES science gateway server (Miller, Pfeiffer, & Schwartz, 2010). The number of K groups that best fitted the dataset was estimated using the method of Evanno, Regnaut, and Goudet (2005) using Structure Harvester Web v0.6.94 (Earl & Vonholdt, 2012). Cluster assignment probabilities were estimated using the program Clumpp (Jakobsson & Rosenberg, 2007) as implemented in the webserver CLUMPAK The genetic diversity measures of the populations were estimated using Arlequin 3.5.2.2 (Excoffier, Laval, & Schneider, 2005); Fis was also estimated and tested for using the exact probability test in the R version of Genepop v. 4.7.5 (Rousset, 2008). Genetic variation among populations was investigated using an analysis of the molecular variance (AMOVA) with 10,000 permutations using Arlequin 3.5.2.2. We used hierarchical AMOVA to test the population structures resulting from the Structure runs. Population pairwise divergence was investigated using F ST , and the significance was evaluated with 10,000 permutations in Arlequin. We ran Mantel tests as performed in Arlequin to evaluate the correlation between the genetic distances (F ST ) and the geographic distance between sampling locations. The geographic distances (in Km) were estimated using Google maps (Tables 5 and 10). The demographic history was also explored using Arlequin. For this, we first estimated the gametic phase from the multilocus diploid data using the ELB algorithm with the default parameter values in Arlequin 3.5.2.2 (Excoffier, Laval, & Balding, 2003). Population expansion or bottleneck was inferred using Fu's F S (Fu, 1997), and mismatch analyses were run with 1,000 bootstrap replicates to estimate Harpending's raggedness index and the sum of squared deviation (SSD) given the population expansion and spatial expansion models.
Phylogenetic trees were constructed using maximum likelihood (ML) with RAxML 8.2.12 (Stamatakis, 2014) run in the server CIPRES (Miller et al., 2010). The data matrix used was the phased haplotypes.
RAxML was run with 1,000 bootstrap inferences with subsequent ML search using the gtrgamma model.

| Genetic diversity and population structure of S. miscanthi in China
A total of 14,520 SNPs with less than 20% of missing data per indi- A first run of Bayesian clustering analyses of the population structure was carried out with five independent simulations with 100,000 burn-in and 100,000 mcmc chains for each of K 1-10.
Analyses of the results following the Evanno method (Evanno et al., 2005) indicated that the most likely number of clusters was K = 6. An additional run of five independent simulations with 100,000 burn-in and 500,000 mcmc chains carried out for K 5-10 confirmed that the most likely number of K was 6 (Table 3,  Taigu (TG1, TG3, TG4, TG5, TG6, TG7, TG8, TG9, and TG10) and despite some of these locations being more than 1,000 km apart.
The genetic differentiation between the genetic clusters is higher and significant in all pairwise comparisons (Table 6).
F I G U R E 2 Structure analysis based on 14,520 SNPs across 10 Chinese populations, with K = 6. Each bar represents one individual and the colors of the bars the posterior probability that each belongs to one of the six genetic clusters. GC1-blue; GC2-magenta; GC3-yellow; GC4-green; GC5-purple; GC6-red

TA B L E 6 Genetic differentiation (pairwise F ST ) between the six genetic clusters (GC) identified with Structure
England. Approximately 19% of these loci are deviated from the HWE after Bonferroni correction. The gene diversity, estimated as the He, is high across all populations (Table 7), although it is lower than the He observed in a previous analysis of the UK populations using four microsatellites (Llewellyn et al., 2003), although comparison of the results obtained with genome-wide markers obtained in the present analysis and that of microsatellites has to be done with care. The H o was higher in all populations compared to the H e , resulting in high negative F IS estimates which indicate that the S. avenae in England is not in HWE (Table 7). This contrasts with the situation of the UK population 15-20 years ago where most markers in the analyzed populations were in HWE (Llewellyn et al., 2003). Although the negative F IS values were not significant according to the permutation test performed by Arlequin 3.5, it should be noted that this test evaluates the probability of obtaining random values that are higher than the observed ones rather than the probability of random values being more negative. Therefore, it is most probable that the p values reported are not correct for these values. When Fis were estimated using Genepop, all were also negative and most were significant using the exact test. The negative F IS is the result of an excess of heterozygotes, and it is considered a signature of clonal reproduction.
Bayesian clustering analysis was run with 5 replicates of 500,000 burn-in and 1,000,000 mcmc chains to ensure convergence. Results were analyzed following the Evanno method (Evanno et al., 2005), which suggested that the most likely number of genetic clusters is K = 2 (Table 8). However, this method does not estimate the deltaK for K = 1 and the mean LnP(K) is maximized for K = 1 suggesting that the most likely number of clusters is 1 (Table 8). In addition, the standard deviation increases from K = 2, which usually happens after reaching the best K. Finally, there is no clear distribution of individuals into 2 differentiated clusters in the bar plot resulting from the analyses with K = 2, with all of them having some probability of belonging to each cluster (Figure 4). An additional nonmodel-based analysis to verify the number of clusters identified by Structure in the data was performed with an unsupervised clustering approach (DAPC) (Jombart et al., 2010). For this, the "find.clusters" function of the R package adegenet 2.1.2 was used (Jombart, 2008;Jombart & Ahmed, 2011). This function applies a DAPC method to identify the optimal number of clusters using a model selection criterion, in this case the Bayesian information criterion (BIC). The BIC is minimized when K = 3 (BIC = 384.6787) but the difference with K = 1 (BIC = 386.0667) is of 1.388 and differences smaller than 2 between two models are considered to be weak evidence (Raftery, 1995).
When automatic selection was applied using the good fit criterion, the number of optimal clusters retained was K = 1. Taking all the F I G U R E 3 Midpoint rooted phylogenetic tree estimated with RAxML for the S. miscanthi phased haplotypes from China using a dataset of 14,520 SNPs. The six genetic clusters are highlighted with different colors, corresponding to the colors in the barplot of Figure 2. Labels on branches are bootstrap values >90%. GC1blue; GC2-magenta; GC3-yellow; GC4-green; GC5-purple; GC6-red results in consideration, they suggest that there is no population structure but just one genetic cluster.
Analyses of the molecular variance (AMOVA) also supported the existence of one single gene cluster. The overall diversity of S. avenae in England was low F ST = 0.001 and nonsignificant among populations, indicating a low differentiation level. When the individuals were clustered according to the Structure results for K = 2, assigning individuals to each genetic cluster according to the membership coefficient estimated with clump, the amount of genetic variation that was explained between groups was 2.33% (p = 0) ( Table 9). These results support those of Structure with one single genetic cluster and no population structure. In addition, the genetic differentiation estimates (F ST ) were all negative (Table 10) Phylogenetic analyses were run using the two multi-SNP haplotypes of each individual. The inferred ML phylogeny comprised two major clades highly supported by bootstrap ( Figure 5). Each of these two clades included one of the multi-SNP haplotypes from each individual, so that each multi-SNP haplotype is more closely related to a haplotype from another individual than to the second haplo-

| D ISCUSS I ON
The results from the present study provide information about the evo- Results of the study indicate that S. miscanthi in China has a higher diversity than previously identified (Guo et al., 2005 TA B L E 7 Mean genetic diversity indices estimated for each S. avenae population in England and overall et al., 2016). This can be explained by the higher number of genome-wide molecular markers that have been used in the present analysis in comparison with the five microsatellites of previous studies, but the estimated levels of genetic differentiation were similar to those observed in S. avenae in China using the same five microsatellites (Xin, Shang, Desneux, & Gao, 2014). However, S. avenae is known to be present only in Yili, Xinjiang region in the northwest (Zhang, 1999), and the two Sitobion species are morphologically very similar (Choe, Lee, & Lee, 2006;Hales, Foottit, & Maw, 2010), so it is likely that samples from China have been identified as S.
avenae in some studies and as S. miscanthi in others (Blackman & Eastop, 2017). In addition, it could also be the case that there is undescribed taxonomic diversity within S. miscanthi in China, and the high levels of genetic differentiation observed could be explained by  While S. miscanthi in Australia and New Zealand is predominantly functional parthenogenetic (Sunnucks et al., 1996;Wilson et al., 1999), the present study indicates that the populations in or had an excess of heterozygotes, although the significance was not tested (Guo et al., 2005). The contrasting results between the present and previous analyses are also at the population subdivision.
Thus, we observe no significant north-south differentiation at the QHL traditional geographic division of the country, identified in one previous study (Wang et al., 2016), and there is no evidence for isolation by distance (Guo et al., 2005) in cyclical parthenogenetic proportion with latitude (Llewellyn et al., 2003), this study shows that the present English population has an excess of heterozygotes and indicates strong clonality.
This suggests that anholocycly is predominant across its range, and there is no evidence for cyclical parthenogenesis occurring toward the north of the country as expected. This change in the S. avenae population could be the result of insecticide resistance evolution. In 2011, a knockdown resistance (kdr) mutation to pyrethroids was detected in England's population of S. avenae, and studies showed that the clone that gained this mutation spread and increased its proportion in the population from 2009 to 2014, and was also observed in Ireland from 2013 (Dewar & Foster, 2017;Foster et al., 2014;Malloch et al., 2014Malloch et al., , 2016Walsh et al., 2019).
It is therefore likely that this pyrethroid-resistant clone, which is a facultative parthenogenetic clone (Walsh et al., 2019), has continued to spread and increase in proportion, being now dominant in the English population. The single genetic cluster identified by the Bayesian and AMOVA analyses in this study could correspond to this clone. This is supported by the phylogenetic analysis, which shows a topology that can be explained by asexuality of the S. avenae population. Also, the levels of gene diversity (as measured by the He) are now lower than those observed in 2003 in the panel of four microsatellites (Llewellyn et al., 2003). In addition, analyses of the demographic history of the population in England indicate that there has been a population demographic and spatial expansion.
Thus, as one clone gained the resistance to pyrethroids in a given location, it increased in number in the location but also expanded its distribution as it spread to other regions via migration. However, further analyses will need to be carried out to demonstrate this.
Overall, this study shows that the populations of two species of cereal aphids of the genus Sitobion have evolved in recent years in two geographically distant regions under different environmental and human-influenced conditions. The diversity of S. miscanthi in China needs to be investigated more comprehensively, as the high level of genetic differentiation suggests the existence of yet unidentified forms. In contrast, the diversity of S. avenae has been affected by the evolution of pyrethroid resistance as shown in previous studies, and a single genetic cluster is now dominating the English population as shown in this present study. Although it is possible that the identified genetic cluster corresponds to the insecticide resistance clone, further analyses are needed to demonstrate this. In contrast to this, S. miscanthi has not gained insecticide resistance despite having been subject also to its use.
In England, the bird cherry-oat aphid, R. padi, has not evolved resistance to insecticides either, despite being sympatric with S.
avenae and therefore subject to the same agricultural practices.
Why some species evolve resistance while other do not it is still a matter of study. It has been shown in Drosophila melanogaster that thermotolerance influences the development and spread of insecticide resistance (Fournier-Level et al., 2019). Similarly, the distribution of cyclical and obligate parthenogenetic aphids is strongly influenced by temperature, so it is possible that there is an indirect relationship between life-cycle type and insecticide resistance evolution in aphids. Further studies in aphids would need to be carried out to test whether there is a relationship between thermotolerance, life-cycle types, and the evolution of the kdr and super-kdr mutations; for this, the population of S. avenae in England, where it is predominantly anholocyclic, its sympatric population R. padi and the related species S. miscanthi in China, which are predominantly cyclical parthenogenetic, can be a useful system to test the hypothesis.

ACK N OWLED G M ENTS
This study was made possible by a Biotechnology and Biological

CO N FLI C T O F I NTE R E S T
The authors declare that there is no conflict of interest.