Population genetics and ecological niche modelling provide insights into management strategies of the herbivorous pest Phytomyza horticola (Diptera: Agromyzidae)

Research on population genetic patterns and potential distribution dynamics can provide insights into the development of pest management strategies. Herein, we integrated population genetic analyses with the climatic niche approach to investigate spatial population genetic variations and potential geographical distribution (PGD) of the herbivorous pest Phytomyza horticola. We also analysed its population response patterns to both late Pleistocene climatic events and future climate change.


| INTRODUC TI ON
Contemporary patterns of genetic variations in a species across different geographic areas are affected by historical factors (Avise, 2000;Hewitt, 2004).Past climatic changes, particularly during the late Pleistocene, have been important drivers of variations in genetic structure, demographic fluctuations and the current distributions of many extant species (Hewitt, 2000(Hewitt, , 2004;;Ye et al., 2014Ye et al., , 2022;;Zhang et al., 2022).Geological events such as the formation of mountain systems can generate physical barriers to dispersal, resulting in population differentiation and speciation (Che et al., 2010;Jiang et al., 2022;Zhu et al., 2022).Contemporary factors, including human-mediated factors, can also strongly affect geographic distributions and genetic variations within a population (Cao et al., 2017(Cao et al., , 2021;;Drahun et al., 2021;Xu et al., 2022).Determining how past and present factors have contributed to population genetic patterns enables a deeper understanding of the evolutionary history and ecological adaptations of a species, which contributes to the development of appropriate conservation and management strategies (Drahun et al., 2021;Jiang et al., 2022;Sherpa et al., 2022;Wang et al., 2021;Ye et al., 2022).
Population genetics has been used to understand patterns of genetic variation, such as genetic diversity, population structure and population responses to past climate change (Fedorov et al., 2020;Jiang et al., 2022;Ye et al., 2014Ye et al., , 2022;;Zhang et al., 2022).The characteristics of mitochondrial DNA, including maternal inheritance, moderate evolutionary rates, clear evolutionary patterns and nearly neutral evolution, make it suitable for studying population-level evolutionary patterns (Avise et al., 1987;Barraclough & Nee, 2001); thus, it has been widely used as a molecular marker in population genetic studies of insects and other fauna (Drahun et al., 2021;Jiang et al., 2022;Pérez-Alquicira et al., 2019;Sun et al., 2022;Ye et al., 2022).In recent years, ecological niche modelling (ENM) has become a popular tool for coupling analyses with population genetic studies.Before making spatial and/or temporal predictions, the ENM approach can be used to model a suitable climatic envelope for a particular species based on species occurrence records and climate data, thereby predicting the impacts of future climate scenarios on the species (Capblancq et al., 2020).Overall, a coupled population genetics and ENM approach can provide coherent insights into the impacts of past-to-future climatic changes on species populations.
The combination of these two methods has played an important role in pest research.It has allowed researchers to better understand their evolutionary process and ecological adaptability, and to predict their suitable conditions and spatial distributions.All of these insights are useful in the development of pest management strategies (Poveda-Martínez et al., 2022;Shabani et al., 2013;Zhu et al., 2016).
The host range of this polyphagous pest comprises over 300 plant species from 35 families worldwide, including numerous cultivated plants belonging to families such as Brassicaceae, Fabaceae, Cucurbitaceae and Asteraceae.Thus, this pest causes serious economic losses in important crop areas worldwide (Benavent-Corai et al., 2005;Gençer, 2005;Griffiths, 1967;Spencer, 1973).Adult females cause direct damage by feeding on the plant sap exuded from leaf surface punctures made by their ovipositors (Spencer, 1973).The major damage to the plants is the curved mining of leaves on which the larvae feed, which causes characteristic symptoms in plants, such as leaf mines that vary in appearance, resulting in injury to the plant, reduced photosynthetic capacity of the mined leaf areas and occasionally structural damage, all of which lead to declines in plant vigour, growth and yield (Spencer, 1973;Winkler, Scheffer, et al., 2009).
In China, diverse geographical environments, various climatic types and abundant host plants provide an ideal environment for the occurrence of P. horticola (Liu, Wang, et al., 2013).As crop planting areas have continued to expand in recent years, P. horticola has spread to most regions of China and caused serious damage to economically important crops, particularly vegetables and ornamental plants, which has hindered agricultural development (Chen et al., 2003;Liu, Wang, et al., 2013;Wang et al., 2005).However, little information is available regarding the patterns of genetic variation and potential geographical distribution (PGD) of P. horticola.

| Specimen collection
We conducted field surveys and sampling of P. horticola in China between 2016 and 2020, as shown in Figure 1 and Table S1.Host plants infested with larvae were collected, transferred to the laboratory and placed in clean cages until the adults emerged.Adult specimens were placed on absorbent cotton containing anhydrous ethanol in Petri dishes.Morphological identification was performed using Stemi 508 (ZEISS) and SZX16 (Olympus) microscopes.All specimens of P. horticola were stored in anhydrous ethanol and maintained at −20°C until molecular analysis.Specimens were deposited at the Institute of Plant Protection of the Chinese Academy of Agricultural Sciences, Beijing, China.A total of 440 individuals from 29 populations in 19 regions were used for the population genetic analysis (Figure 1; Table S2).The sampling location map was produced using the ArcGIS platform (version 10.8; ESRI) based on the longitudes and latitudes of the localities.

| DNA extraction, amplification and sequence alignment
Genomic DNA was extracted from a single adult using the rapid grinding method (De Barro & Driver, 1997) with minor modifications and the DNeasy Blood and Tissue Kit (Qiagen).The mitochondrial markers COI, COII and Cytb were used.The primers and the annealing temperatures used for DNA amplification and sequencing are listed in Table S3.Polymerase chain reaction (PCR) was carried out in a 25 μL reaction volume comprising 2.5 μL of 10× buffer (containing Mg 2+ ), 0.2 μL of Taq polymerase (2.5 U/μL), 0.5 μL of dNTPs (2.5 mmol/L), 0.4 μL each of forward and reverse primers (10 pmol/L), 1 μL of DNA and 20 μL of ddH 2 O. PCR was conducted under the following thermal conditions: initial denaturation at 94°C for 4 min, followed by 35 cycles at 94°C for 30 s, primer-specific annealing temperature for 30 s and 72°C for 90 s, and a final elongation at 72°C for 10 min.The PCR products were examined by 1% agarose gel electrophoresis, and the gels were stained using the nucleic acid staining solution GoldView II (Yeasen).The positive PCR products were sent to Sangon Biotech for sequencing.
The newly generated DNA sequences were visually checked and aligned using cluStAlW in MEGA version 7.0 (Kumar et al., 2016) and then deposited in GEnBAnk (Table S2).All DNA sequences were translated into amino acid sequences using the ExPASy web portal (https://web.expasy.org/translate/) to verify functional copies.

| Genetic diversity, population structure and Mantel test
The genetic diversity indices, including the number of haplotypes (N), haplotype diversity (Hd), nucleotide diversity (Pi) and the average number of nucleotide differences (K) of the 29 populations and total population, were estimated using DnASP 6.11 (Rozas et al., 2017) based on the concatenated gene sequences.
The Bayesian analysis of population structure (BAPS) model for the clustering of individuals implemented in BAPS 6.0 (Cheng et al., 2013) was used for the concatenated matrix.To ensure the consistency and convergence of the results, 10 runs for K = 1-10 were performed.Additionally, pairwise fixation (F ST ) and genetic flow index (Nm) matrices were estimated using the ArlEquIn software (version 3.5) (Excoffier & Lischer, 2010).Genetic differentiation was regarded as low for F ST < 0.05, moderate for 0.05 < F ST < 0.15, high for 0.15 < F ST < 0.25 and very high for F ST > 0.25 (Govindaraju, 1989).
Genetic flow was regarded as low for Nm < 1, high for 1 < Nm < 4 and very high for Nm > 4 (Boivin et al., 2004).Analysis of molecular variance (AMOVA) was conducted with 10,000 permutations to estimate the hierarchical genetic variations among and within populations using ArlEquIn version 3.5 (Excoffier & Lischer, 2010).Principal coordinate analysis (PCoA) based on the variance-covariance matrix calculated from the concatenated sequence data was performed using GEnAlEx 6.5 (Peakall & Smouse, 2012).
The relationships between genetic distance and both geographical distance (isolation by distance, IBD) and environmental distance (isolation by environment, IBE) were examined with a Mantel test based on the concatenated gene sequences; the R package ade4 (Dray & Dufour, 2007) was used for this.Significance tests were performed with 10,000 permutations.Genetic distance was estimated with F ST /(1 − F ST ), and the matrix of values was calculated using Ar-lEquIn version 3.5 (Excoffier & Lischer, 2010).Pairwise geographic distances among populations were calculated in the R package raster based on the sample site coordinates.The IBE was estimated using the six climatic variables applied in the ENM analysis (see ENM analysis in Section 2.6).The climate variables were extracted based on sample site coordinates using ArcGIS 10.8 (ESRI).Additionally, a principal component analysis (PCA) was performed on these climate variables using the ade4 R software package (Dray & Dufour, 2007).
The first two principal components were used to calculate the pairwise environmental distances between populations.

| Haplotype network and phylogenetic analysis
To construct the evolutionary relationships among the individual haplotypes of P. horticola, haplotype networks were generated using a median-joining algorithm in the PoPArt 1.7 software (Leigh & Bryant, 2015).Phytomyza syngenesiae (deposited in GEnBAnk under accession nos.OQ807656 (COI), OQ805409 (COII) and OQ805850 (Cytb)), P. lactuca and P. fuscula (Scheffer et al., 2007) were selected as outgroups for P. horticola for the phylogenetic analysis.We used MoDElFInDEr 2 (Kalyaanamoorthy et al., 2017) and allowed the merging of partitions to select the best-fit partitioning schemes and evolutionary models for the concatenated matrix based on the Bayesian information criterion.The software proposed one best-fitting partition and the substitution model HKY + F + I + G4.Haplotype phylogenetic analysis was performed using the Bayesian inference (BI) approach, which was conducted using the corresponding partitioning scheme in MrBAyES 3.2.2(Ronquist et al., 2012).We used two parallel runs, each of which consisted of four Markov chain Monte Carlo (MCMC) runs for 35 million generations.Trees were sampled every 100 generations, and the first 25% of the generations were discarded as burn-in.When the average standard deviation of split frequencies fell below 0.01, the analyses terminated.A 50% majority-strict consensus tree was calculated using the posterior probability method.were performed using ArlEquIn version 3.5 to calculate Fu's F S and Tajima's D values.In the neutrality test, a significantly negative value indicated that the population had expanded or sustained growth in the past.Mismatch distributions were calculated for the observed and expected mismatch distributions and used as test statistics in DnASP 6.11.According to the simulations, demographically stable or admixed populations usually present a multi-modal distribution, whereas populations exhibit unimodal mismatched distributions following recent population demographics or range expansion (Harpending et al., 1998).The BSPs were created using BEASt version 2.6.6 (Bouckaert et al., 2014)  A coalescent Bayesian skyline model was implemented as the prior tree.The MCMC was run for 200 million generations, and the trees were sampled every 1000 steps.We analysed the BSP results using trAcEr version 1.7 with a burn-in of 10% (Rambaut et al., 2018).
We used eight algorithms to model the PGDs of P. horticola: FDA, GAM, ctA, rF, GlM, MArS, GBM and MAxEnt.PhIllIPS.2.For each of the eight sub-models, 4-fold cross-validation and 10,000 pseudoabsence points were randomly selected and repeated once (Xian et al., 2023).The distribution records were divided randomly into 75% training data and 25% validation data to test the predictive power of the model (Xian et al., 2023).Each sub-model was run 10 times, and the three most accurate models were selected to construct the ensemble model (EM; Thuiller et al., 2016).True skill statistics (TSSs) and the area under the receiver operating characteristic (ROC) curve (AUC) were used to evaluate model accuracy (Thuiller et al., 2016).
TSS values ranging from 0.2 to 0.5 indicated poor performance, from 0.6 to 0.8 indicated useful performance, and values higher than 0.8 indicated excellent performance (Coetzee et al., 2009).AUC values below 0.6 were poor, from 0.6 to 0.9 were useful, and those greater than 0.9 were excellent (Phillips et al., 2006).
The grid calculator in ArcGIS 10.8 (ESRI) was used to visualize the expansion, contraction and constancy of the PGD for P. horticola under different climatic scenarios from the present to the future.

| Genetic diversity of populations
COI, COII and Cytb gene sequences with lengths of 765, 702 and 406 base pairs (bp), respectively, were successfully obtained from 440 P. horticola individuals (Table S2).We combined these three genes into a concatenated matrix of length 1873 bp for subsequent analysis.Of the 1873 base sites, 1616 were conserved and 257 were variable, which included 106 singleton polymorphic and 151 parsimonious informative sites.
A large number of haplotypes were observed in the total population (Figure 2; Table 1).The COI, COII and Cytb alignment showed 155, 132 and 91 different haplotypes in the 440 individuals, respectively (Figure 2a-c).When based on concatenated gene sequences, a total of 269 haplotypes were observed, and the number of haplotypes in each population ranged from 5 to 23.Among the 269 haplotypes, 214 were unique to an individual, 38 were shared haplotypes, and 231 were exclusive haplotypes.H10 was the most common haplotype, which was shared by 24 samples and present in 16 of the 29 populations (Figure 2d).The second most frequent haplotype was H254, which was shared by 17 individuals, but was present in only three populations (i.e.DL (C), SN (C) and LS (D)), and was the dominant haplotype in populations of the Tibet region (XZ group).In addition, the haplotypes shared by more than 10 individuals were H13, H35 and H67.The genetic diversity levels of the total population and among the different populations of P. horticola estimated based on the concatenated matrix are shown in Tables 1 and 2.   1).The highest genetic diversity was observed in population DL (C) (Hd = 1; Pi = 0.00641; K = 12.013).Populations SN (C), LS (D) and DZ (A) exhibited low genetic diversity.We observed higher haplotype diversity in the northern population than that in the southern population (Tables 1 and 2).

| Networks and phylogenetic relationships among haplotypes
The median-joining haplotype network exhibited star-like patterns, with some common haplotypes in the centre of the star (Figure 2); these may be the ancestral haplotypes.Although some haplotypes were missing in the centre of the star based on concatenated gene sequences, many haplotypes, such as H10, H13 and H35, were located in the centre of the star and surrounded by many other haplotypes (Figure 2d).In addition, haplotypes were connected by no more than three mutational steps in most cases (Figure 2a-d), which is often associated with demographic expansion.However, the haplotypes were distributed randomly among groups.In the haplotype network or phylogenetic tree, no obvious clustering was observed among the exclusive haplotypes in each population or between the haplotypes in populations with similar geographical ranges (Figure 2; Figure S2).
In addition, many unique haplotypes with close relationships were distributed among different groups, and many other haplotypes were also shared by different groups, including geographically distant groups (Figure 2).This suggested that the haplotypes had a great deal of exchange among populations in different regions.

| Population genetic structure and Mantel test
The BAPS analysis based on concatenated genes identified two main clusters (pink and orange).The main genetic component for populations in XZ contained an orange cluster; remaining populations in other groups contained a pink cluster (Figure 3a).Additionally, the pairwise F ST differences also exhibited high levels of genetic differentiation and limited gene flow between the populations SN (C) and LS (D) in XZ and all the other populations (Figure 3b; Table S5).
Population DZ (A) in HN exhibited moderate genetic differentiation.
However, low degrees of differentiation and high Nm values were observed among pairs of other populations, including different host populations of P. horticola under sympatric conditions.Therefore, populations from the same region were categorized in the same group to display the results of population structure (Figure 3a-c; Table S2).The PCoA based on Nei's genetic distance exhibited different patterns of genetic structures for the XZ group and the remaining groups (Figure 3c).This was in accordance with the results of BAPS and pairwise F ST analysis, indicating that the XZ group represents a genetic lineage that clearly diverged from the lineage comprising the remaining groups over a vast area (Figure 1).The first and second axes of the PCoA explained 38.09% and 17.21% of the overall variance, respectively.Most populations that were geographically close resembled each other, as observed in the analysis.The AMOVA results indicate that the molecular genetic variation within populations was 72.98%, that among populations and within lineages was 1.01%, and that among lineages was 26.01%, indicating that the genetic differentiation of P. horticola originated primarily within the population and that strong genetic differentiation occurred when the populations were divided into the two lineages (Table 3).
The Mantel test for IBE was performed for all populations of P. horticola.Since the high-altitude geographical barriers of the Tibet region may have an impact on IBD analysis, the Mantel test for IBD was performed for 27 populations (except SN (C) and LS (D)).If the dispersal of P. horticola was limited by distance, then genetic distance and geographical distance should be positively correlated, resulting in a pattern of IBD.However, no statistically significant positive relationship was observed between genetic distance and geographical distance (r = .164,p = .081)(Figure 3d) as well as environmental distance (r = .104,p = .164)(Figure 3e), indicating both geography and environment are not related to genetic differentiation of P. horticola populations.

| Demographic history
Neutrality tests were conducted using Tajima's D and Fu's Fstatistics.For most populations in China, the Fu's Fs values were significantly negative (Table 1).When all populations were considered as a whole, significant negative values were obtained (Tajima's  2).Moreover, the mismatch distributions of the total populations in China and the total populations from northern and southern China, obtained via concatenated and singlegene sequences, were unimodal (Figure 4; Figure S3), suggesting that the P. horticola populations recently underwent demographic expansion.The results of the BSP also supported the hypothesis that populations expanded in size (Figure 4; Figure S4).Based on

| Potential geographical distributions from the present to the future
In this study, RF, MAxEnt and GBM were the three best-performing sub-models (Figure S5).Therefore, we selected RF, MAxEnt and GBM to construct an EM to analyse the PGD of P. horticola in China.The results of the EM were highly reliable (AUC = 0.989).HII is the most powerful variable among the selected variables for predicting the current PGD of P. horticola, and the probability of P. horticola occurrence increased to a large extent as HII increased (Figure 5a;

| DISCUSS ION
Genetic variability is crucial for species survival and adaptation to different environmental conditions (Hofinger et al., 2011;Pérez-Alquicira et al., 2019;Scheffer & Lewis, 2006;Wang et al., 2021;Ye et al., 2022).For example, two spatially distinct genetic clusters of Aphis gossypii have been detected in eastern and western China, with significantly different haplotype compositions (Wang et al., 2017).Similarly, host-associated haplotype clusters have been detected in other leaf-mining flies that divide them into clades that feed on different host plants (Pérez-Alquicira et al., 2019;Scheffer & Lewis, 2005, 2006).Although P. horticola is a cosmopolitan pest with a wide host range, little is known about its genetic characteristics.
Overall, this study determined that abundant P. horticola haplotypes and high levels of genetic diversity are present in China, indicating that this species has strong adaptability to different local habitats, which may be an important reason for its demographic expansion and vast PGD.In addition, the EM results indicate that northern China is more suitable for the survival of P. horticola than southern China, and our field surveys and sampling also found that P. horticola is generally more harmful and distributed more densely in northern regions, in accordance with the higher level of demographic expansion of the northern population as revealed by BSPs, which may be a potential factor behind the higher haplotype diversity of the entire northern population.
Studies of population genetic structure provide insights into ecologically adaptive evolutionary processes of species (Andras et al., 2018;Maiakovska et al., 2021).When gene exchange among populations is limited by geographical and climatic barriers, natural selection under different ecological environments may lead to local adaptation, with genetic structures associated with different environments (Cao et al., 2021;Egger et al., 2017;Ye et al., 2018Ye et al., , 2020)).
For example, the uplift of the Qinghai-Tibet Plateau has formed a natural geographic barrier, remodelling the geological features of Chinese mainland (Harrison et al., 1992).The unique geographical and climatic features of this region have played an important role in driving the speciation and genetic divergence of populations (Che et al., 2010;Chen et al., 2022;Ding et al., 2018;Lu et al., 2014;Zhu et al., 2022).This may also help explain the great genetic differenti-  Winkler, Scheffer, et al., 2009;Xuan et al., 2022); however, we did not observe an obvious influence of host plants on the differentiation of P. horticola populations.In general, high levels of gene flow among genetically divergent populations could offset random genetic drift, selection and mutation, and ultimately homogenize the populations (Endersby et al., 2005;Lyons et al., 2012;Wang et al., 2017Wang et al., , 2022)).
P. horticola is a small-bodied insect with moderate flight and dispersal capabilities; thus, it exhibits only short-distance movements.Generally, IBD effects are pronounced in moderately mobile species, but weak in low-and high-mobility species (Peterson & Denno, 1998).Among populations of strong dispersers, extensive gene flow homogenizes genetic differentiation across large geographic areas.Intermediate dispersers can achieve genetic homogeneity at small spatial scales, but their limited dispersal ability makes homogenizing the differentiation difficult over long distances (Peterson & Denno, 1998).However, neither IBD nor IBE pattern was detected in our data, which supports the result that frequent gene flow homogenizes most populations in vast areas of China.Although there is a large geographical distance between the XJ group and the groups in Northeast China (i.e.HLJ, JL and NMG), no obvious genetic differentiation was observed among these groups, and many haplotypes were shared by them.Considering the moderate migration ability of P. horticola, we infer that human-mediated dispersal in recent times was likely an important mean of its spread and distribution, which has been reported in leaf-mining flies and other species (Cao et al., 2017(Cao et al., , 2021;;Wang et al., 2017;Xu et al., 2022).This is also in accordance with the other findings obtained herein, which demonstrate that the HII factor was the strongest variable for predicting the current PGD of P. horticola, and the probability of its occurrence increased as HII increased.In recent years, increasingly frequent agricultural trade and expanding crop planting areas in China may have facilitated the dispersal and distribution of this polyphagous species.P. horticola can move to different places via the trade and transport of infested plant material, which likely facilitates this process.The eggs and larvae of P. horticola can be embedded internally within plant leaves, thereby making it easy to transfer them to new regions without being detected.
In Europe and North America, faunal population demographic histories and changes in suitable habitats are often closely related to Pleistocene glacial cycles (Hewitt, 2000;Kimmitt et al., 2023;Kotsakiozi et al., 2018;Ye et al., 2022) et al., 2009;Gasse, 2000).In this study, four lines of evidence (haplotype network characteristics and demographic history inference) indicate a recent rapid expansion in the P. horticola population size.
BSP analyses suggest that this expansion occurred during the LGM in the late Pleistocene.These findings do not explain the population dynamics of the above species or the classic post-LGM expansion observed in many other organisms (Hewitt, 1999), reflecting an inconsistent faunal response pattern to late Pleistocene climatic changes.
Many factors can contribute to the persistence and the expansion of interconnected populations.First, P. horticola exhibits high ecological flexibility, strong cold adaptation ability and high reproductive and biotic potentials (Fathi, 2010;Iwasaki et al., 2008).Second, the high degree of polyphagicity of P. horticola is notable, as it has been reported to infect plants belonging to over 35 different families (Benavent-Corai et al., 2005;Spencer, 1973).Some of these plants were dominant in China during the LGM (Jiang et al., 2013) and may have served as food sources for P. horticola.Third, the milder Pleistocene climate in East Asia may have reduced the severity of demographic selective pressures faced by P. horticola.On the East Asian continent, temperatures during the LGM were 2-4°C colder than the present (Ju et al., 2007;Weaver et al., 1998).However, unlike Europe and North America, most areas of China were not covered by ice sheets (Liu, 1988;Shi et al., 1986;Zhu et al., 2016), which may have paved the way for the demographic expansion and diversity of the evolutionary lineage of cold-adapted P. horticola.
This finding corroborates the consensus on cold-adapted species in Europe, where biogeographical and phylogeographical analyses (Hewitt, 2004) have indicated that cold-adapted species might have had larger population sizes during glacial periods.
Although insects rely on the ambient temperature to regulate their physiological functions, their ability to regulate body temperature is limited (Colinet et al., 2015;Heinrich, 1993).Therefore, to adapt to global warming, insects living in different ecological environments have moved to cool areas at high latitudes and altitudes (Parmesan, 2006;Sánchez-Guillén et al., 2016).Similarly, in this study, the EM results indicate that the cold northern China is currently more suitable for the survival of P. horticola.In addition, under the impact of future global warming, P. horticola will likely shift towards high latitudes in China where the climate is more suitable.Northeast China will be the most suitable region for their survival; however, the degree of suitability will decrease in most southern regions.These results are supported by the cold-adaptive biological characteristics of P. horticola and reinforce the inference that the P. horticola population expanded during the LGM.Although many regions may become unsuitable for P. horticola in the future, studies have shown that the expansion of host-plant cultivation areas and centres of frequent crop trade have contributed to the expansion of pest populations, including those of Empoasca onukii and Frankliniella occidentalis (Cao et al., 2017;Li et al., 2022;Reitz et al., 2020).Therefore, to prevent and control P. horticola in the future, the focus should be on cold areas at high latitudes.It is necessary to be vigilant regarding its spread and potential for outbreak through human activity.
However, it should still be noted that due to the maternal inheritance of mitochondria, mitochondrial genes cannot provide complete information on the evolutionary history of both parents.
and ecological importance for agriculture, comprehensive analyses of the genetic diversity, differentiation and demographic history of P. horticola populations in China are required, as well as an understanding of future prospective changes in PGD owing to the impacts of climate change.This knowledge will help to explore their potential evolutionary patterns and the responses of populations to climate change, thereby facilitating the development of appropriate management strategies.The main objective of this study was to investigate the genetic variation patterns and PGD of P. horticola populations in China, as well as their influencing factors.We conducted distribution surveys and sampling across a wide geographic area in China.Samples of 440 individuals from 29 populations were used to investigate the genetic changes, prevention and control should focus on high-latitude regions, and vigilance regarding human-mediated pest dispersals and outbreaks should be maintained.K E Y W O R D S climate change, demographic history, ecological niche model, pest management strategies, Phytomyza horticola, population genetics, potential geographical distribution variation patterns.Three mitochondrial genes (i.e.cytochrome c oxidase subunit I (COI), cytochrome c oxidase subunit II (COII) and cytochrome b (Cytb)) were used as molecular markers to examine the genetic diversity, population structure and demographic history of P. horticola.This information could then be used to analyse the population evolution and further assess its ecological adaptability.We then analysed the population genetic structure and the factors that affect the population demographic dynamics.Finally, based on global distribution records, bioclimatic variables, elevation factors and the human influence index (HII), the PGD of P. horticola and its influencing factors were investigated to understand the geographical suitability pattern.The PGD responses to future global warming under conditions of two shared socioeconomic pathways (SSPs) (i.e.2-4.5 and 5-8.5) were then simulated.The results obtained herein provide a useful theoretical basis for developing appropriate pest control strategies.
Neutral analyses, mismatch distributions and Bayesian skyline plots (BSPs) were used to infer the demographic history.Neutral analyses F I G U R E 1 Occurrence sites of Phytomyza horticola in China based on field investigations and sampling from 2016 to 2020.Triangles represent the locations of the distribution.Circles represent the populations in corresponding locations selected for population genetic analysis, and different colours represent different genetic lineages obtained from population genetic structure analysis.
to speculate on the population history by assessing temporal changes in the effective population size using a general time-reversible model.An uncorrelated relaxed lognormal clock model was selected, and the substitution rate for COI was set to 1.77% per million years, as estimated byPapadopoulou et al. (2010).Automatic estimates were selected for COII and Cytb.
China under current and future climate scenarios.The package provided an algorithm platform for ENM, which integrated a maximum entropy model (MaxEnt), generalized linear model (GLM), multiple adaptive regression spline (MARS) model, flexible discriminant analysis (FDA), categorical regression tree analysis (CTA), gradient boosting model (GBM), random forest (RF), generalized additive model (GAM), surface distance envelope (SRE) and artificial neural network (ANN) to reduce the error of individual model prediction results and improve reliability

F
Haplotype network of Phytomyza horticola based on mitochondrial cytochrome c oxidase subunit I (COI) (a), cytochrome c oxidase subunit II (COII) (b), cytochrome b (Cytb) (c) and concatenated (d) gene sequences.The haplotype circle size represents the number of individuals.Mutational steps between haplotypes are represented by crossed lines.Individuals from the same region were categorized in the same group in this analysis, with each group corresponding to different colours.The group codes are shown in TableS2.TA B L E 1 Genetic diversity parameters and neutrality test results of 29 Phytomyza horticola populations in China based on concatenated gene sequences.

D
p < .01;p < .05).The total P. horticola populations from southern and northern China that fed on Pisum sativum and Brassica napus also yielded significantly negative Tajima's D and Fu's Fs values (Table

F I G U R E 3
Genetic structure analysis and Mantel test for isolation by distance (IBD) and isolation by environment (IBE) of the Phytomyza horticola populations.(a) Bayesian analysis of population structure (BAPS) for K = 2 of 19 groups.(b) Heatmap of pairwise F ST matrix for groups.Dark red colour indicates higher genetic differentiation between populations.Pairwise F ST numeric values are shown in Table S5.(c) Principal coordinate analysis (PCoA) shows clustering of groups.(d, e) Scatter plots show the relationships between genetic distance and geographical/environmental distance.the concatenated genes, the BSPs indicate that the effective population sizes of the total population and the total northern population remained stable over a long period of time, then expanded rapidly (approximately 0.025 Ma) during the Last Glacial Maximum (LGM) in the late Pleistocene (Figure 4d,e), while the size of the southern population underwent a relatively slow expansion (Figure 4f).
Hierarchical analysis of molecular variance (AMOVA) for Phytomyza horticola divided into two genetic lineages based on concatenated mitochondrial gene sequences.F I G U R E 4 Mismatch distribution analyses (a-c) and Bayesian skyline plots (BSPs) (d-f) of Phytomyza horticola, based on concatenated genes.(a, d) Total population.(b, e) Northern China.(c, f) Southern China.BSPs show effective population size as a function of time.The upper and the lower limits of the purple belt represent 95% confidence intervals of the highest probability density analysis.The orange shaded areas mark the Last Glacial Maximum (LGM, 16-28 ka; constrained by Zhao et al., 2011).

Figure
Figure S6).Predictions of the PGD of P. horticola indicate that the current potentially suitable range for this species covers most regions of China, including a small suitable range across Tibet, Qinghai, and Xinjiang (Figure5b).Moderately and highly suitable habitats were located in large areas of both northern and southern China at present, with a larger area of highly suitable habitats in the northern regions, such as Henan, Hebei, Shandong and Liaoning.Suitable habitats are expected to expand mainly in the northern regions (Heilongjiang, Jilin, Liaoning, Neimenggu, Qinghai and Gansu) and contract in the southern regions (Guangdong, Guangxi, Jiangxi, ation of the P. horticola XZ lineage in this study.The presence of the Qinghai-Tibet Plateau probably reduces the opportunities for gene exchange between lineages, whether through P. horticola migration or external transmission, thereby leading to the formation of longterm phylogeographic patterns.However, low levels of differentiation and high levels of gene flow have been observed among most geographical populations within the large genetic lineages.In addition, studies have shown that host specialization may lead to genetic differentiation in leaf-mining fly populations and may even promote the development of many new species(Scheffer et al., 2007;Scheffer & Lewis, 2005, 2006;Spencer, 1964; Winkler, Mitter, et al., 2009; F I G U R E 6 Variations in the potential geographical distribution of Phytomyza horticola in China from the present to the 2050s under the conditions of shared socioeconomic pathway (SSP) 2-4.5 (a) and SSP 5-8.5 (b).
Genetic diversity parameters and neutrality test results of the total Phytomyza horticola populations in northern and southern China based on concatenated gene sequences.
TA B L E 2