A genomic investigation on the origins of the Korean brown planthopper, Nilaparvata lugens (Stål) (Hemiptera: Delphasidae)

Nilaparvata lugens, the brown planthopper (BPH), is a serious pest species. BPHs cause significant damage to rice plants in Korea as well as other countries in East and Southeast Asia. As BPHs cannot survive winter in Korea, they annually migrate into Korea from foreign countries. The BPHs found in Korea are believed to originate from China, but most BPHs in China are also known to originate from Southeast Asia. To understand the origin of Korean BPHs, we conducted a genotyping‐by‐sequencing (GBS) study. We sampled BPHs from five locations in Korea and five countries in Southeast Asia and analyzed the GBS and sequencing results using various methods based on the f statistics and admixture graph analyses. We confirmed that the domestic BPHs shared a greater genetic drift than the BPHs from Southeast Asian populations over several years, implying that a continuous genetic substratum of Korea exists. This genetic substratum is genetically closer to BPHs from the southern part (Thailand and Cambodia) of Southeast Asia than to BPHs from the northern part (Bhutan, Myanmar and Laos) of Southeast Asia. In addition, direct gene flows from Southeast Asia seem possible, so Korea is considered a hotspot where diverse BPH populations mix. Therefore, the origin of Korean BPHs extends beyond China and as far as southern Southeast Asia. This result will help to understand and control the population dynamics of the Korean BPH population.


Introduction
Nilaparvata lugens (Stål) (brown planthoppers, BPHs) pose a significant threat to rice production, with their presence dating back to the 1960s in tropical regions (Bottrell & Schoenly 2012).However, recent BPH outbreaks in various parts of Asia have intensified the problem.These voracious pests not only feed on the sap of rice plants but also serve as vectors for viral diseases, resulting in reduced grain yields and substantial economic losses, amounting to billions of dollars annually (Denno & Roderick 1990).Notably, the BPH infestation in Korea in 1975 caused a substantial 38 % reduction in the harvest from the affected area, leading to an estimated damage cost of $10 million (Dyck & Thomas 1979).Over the years, both domestic and international efforts have been implemented to combat BPH infestations (Lao et al. 2015;Lamba & Dono 2021;Shin et al. 2021).
The BPHs have a wide distribution across South, Southeast and East Asia, including the islands of the South Pacific and Australia (Dyck & Thomas 1979).They can be found year-round in tropical regions, extending from Bhutan to the Red River Delta of Vietnam, and even into the winter rice-growing areas of southernmost China (Kisimoto 1976;Cheng et al. 1979;Horgan et al. 2020).However, north of these regions in East Asia, such as Korea, BPHs cannot endure the cold winters and cannot establish long-standing populations; instead, they undertake long-distance migrations to Korea annually in early summer.
These migrations are significantly influenced by wind patterns, such as the south-westerly winds of the East Asian summer monsoon (EASM) (Jiang et al. 1981;Kisimoto & Sogawa 1995;Rosenberg & Magor 1983;Pender 1994;Volonté et al. 2022). For instance, between 1977and 1979, Dung (1981) used a specialized trap attached to an airplane to collect 946 BPHs out of a total of 2102 planthopper specimens.Otuka et al. (2005) also found that BPHs traveling from the coastal region of Fujian Province in China to Kyushu, Japan (a distance of about 1400 km) took 31-38 h.Similarly, Turner et al. (1999) reported that migration durations range from 24 to 45 h from China to Korea.These studies support the hypothesis that BPHs can and do make a long-distance dispersal to Korea across oceans in a reasonable time frame from southern regions such as China and Southeast Asia.
So far, limited genetic information has been utilized to understand the origins of Korean BPHs, their genetic structure and intraspecific variation.For example, Mun et al. (1999) analyzed mitochondrial DNA cytochrome c oxidase subunit I (COI) sequences obtained from 48 individuals collected from seven Asian countries: Bangladesh, China, Korea, Malaysia, Philippines, Thailand and Vietnam.They found only three distinct haplotypes were recovered from the 48 individuals, with one of them being dominant in all countries except the Philippines (38/48 individuals in total) (Mun et al. 1999).All three haplotypes were found in China, whereas only one haplotype was found across Bangladesh, Malaysia, Thailand and Vietnam.Based on the findings of two of three haplotypes in Korea, they suggest that China is a likely source of the yearly migration into Korea (Mun et al. 1999).However, because of the limited sample size and low resolution of the COI gene sequences (Tyagi et al. 2022), as well as the presence of all three haplotypes in the Philippines, this study cannot pinpoint the origins of Korean BPH populations.
Considering the direction of the monsoon and the flying capability of BPHs, direct migration from Southeast Asia to Korea is also plausible.To test this hypothesis robustly, more comprehensive genetic information is required.In our study, we examined the origins of Korean BPHs over 3 years by capturing individuals and comparing them with the genetic profiles of Southeast Asian populations.To enhance the resolution of the population dynamics, we employed genotyping-by-sequencing (GBS) data and conducted population allele frequency-based analyses.Our results show that the Korean BPH populations contain diverse genetic profiles but that, overall, the majority of them are more closely related to each other than they are to the Southeast Asian populations sampled so far, suggesting the existence of a continuous genetic substratum of Korean BPHs.Although we cannot pinpoint where this substratum exists, we find that a direct long-distance migration from Southeast Asia and unsampled regions to Korea is possible, calling for further investigation.

Collecting samples
For 3 years, starting in 2020, specimens of BPHs were collected directly in various places in Korea and abroad, or secured with the cooperation of member countries under the Asian Food & Agriculture Cooperation Initiative (AFACI) project promoted by the Rural Development Administration as well as the National Institute of Agricultural Sciences and Agricultural Technology Centers.In addition to the data from two samples in China where full-genome sequences were presented, overseas countries that have secured samples for the study include Bhutan, Cambodia, Laos, Myanmar and Thailand (Table 1).
The samples collected went through a process to identify species and secure the quantity and quality of the genes, which included morphological identification and molecular biological identification using the COI barcode gene, and a process of genomic DNA quality checking for the GBS library, using a NanoDrop 8000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and electrophoreses in 1.2 % agarose gel (200 V, 30 min) and subsequent purification.

Library construction and GBS analysis
To prepare the GBS libraries, we digested the DNA using the restriction enzyme ApeKI (GCWGC) and ligated the DNA from each sample to a set of 96 barcoded adapters using a modified version of a previously published protocol (Elshire et al. 2011).We then pooled the libraries from each sample and sequenced them in 151-bp reads using TrueSeq 3.0 paired-end sequencing on the Illumina platform HiSeq X Ten, performed by Seeders Inc. (Daejeon, Korea).

Sequencing data processing
We generated GBS data from 88 captured and incubated BPHs.In addition, we downloaded the fastq files of two Chinese BPHs and included them as outgroup individuals.
We mapped sequence data on the BPH reference genome (Ma et al. 2021) with the "mem" module in the Burrows-Wheeler Aligner program BWA 0.7.17 (Li et al. 2009).We removed individuals with mapping rates of lower than 0.6 as being misclassified.We removed reads with a Phred-scaled mapping quality score of <30 using SAMtools 1.9 (Li et al. 2009) and calculated bacterial contamination with kraken 2.1.2(Wood et al. 2019) and the minikraken2_v1_8GB database.

PCA
As the genotype missingness for each BPH individual is high, it is hard to visualize the genotype data with standard PCA methods.This is because only SNP sites that are complete for all populations are used in a standard PCA.To overcome this issue, we centralized the genotype matrix by subtracting two times the mean allele frequency from the genotype value and computed the genetic similarity matrix by calculating the mean product of the centralized genotype values between each pair of individuals using all complete pairs of observations on those variables in R.Then, we used the "eigen" function to decompose the similarity matrix.We used eigenvalues and eigenvectors to plot the PCA result.

f statistics
We calculated the f statistics with qp3pop and the f 4 functions implemented in the R library ADMIXTOOLS2 2.0.0.(https:// github.com/uqrmaie1/admixtools). Two artificially incubated Chinese BPHs are used as an outgroup to measure the shared genetic drift between the populations.We calculated all pairs of BPH populations, and the results are available in Table S4.Likewise, these samples were used as an outgroup to calculate all combinations of f 4 statistics in the form of f 4 (Outgroup, X; Y, Z), for testing the cladality between targets or searching additional admixture sources.The results of the f 4 statistics are summarized in Table S5.

Graph-based analysis
To test the admixture models comprehensively, we used graph-based analysis with the qpgraph function implemented in the R library ADMIXTOOLS2 2.0.0.Before using the qpgraph function, f 2 statistics between all pairs of BPH populations were calculated with the extract_f2 function in the same library with the "max_miss = 0" option.The number of SNPs remaining when applying this option was 29 115.Incubated samples were also used as an outgroup in this analysis, and we reconstructed the backbone tree of Southeast Asian populations and Korea_Taen_o based on f 4 statistics.Then, Korean BPH populations were systematically added in the order of the year that they were captured: Korea_Group1 (Korea_Namhae and Korea_Goseong), Korea_Geoje and Korea_Taean.

Genome-scale variation data of Korean BPHs
In this study, we produced GBS data of wild BPH individuals collected in various regions of Korea and several Southeast Asian countries, including Bhutan, Cambodia, Laos, Myanmar, and Thailand, spanning the period from 2020 to 2022 (Fig. 1A; Table 1) (Elshire et al. 2011).Among the 62 individuals initially included in the GBS experiment, we removed three misidentified non-BPH individuals, leaving 59 individuals in the analysis (Table S1).Considering the low coverage of GBS data, we produced haploid genotypes by randomly sampling a single high-quality base per individual per position, to minimize reference bias.Keeping polymorphic bi-allelic single-nucleotide polymorphisms (SNPs) with less than 50 % genotype missingness, we retained 513 898 SNPs, with each individual covering 58 299-460 371 SNPs.We identified two pairs of first-degree relatives and excluded one from each pair from the group-based analyses (Table S1).In addition, we produced GBS data for seven wild individuals of a closely related species, Nilaparvata muiri, captured along with the Korean BPH individuals, and processed whole-genome sequence data of two published Chinese BPH individuals (Ma et al. 2021) for use as outgroups in the population genetic analyses.from more southern regions, including Cambodia, Laos and Thailand.This north-south genetic gradient in Southeast Asia aligns well with findings from a previous study (Hu et al. 2019).

Substantial genetic heterogeneity across Korean and Southeast Asian BPHs
Using PCA, we classified Korean individuals visually into three groups (Fig. 1B).First, the Korean BPH individuals sampled for 3 years along the southern coastal region, encompassing individuals from Namhae (Korea_Namhae, n = 7, captured in 2020), Goseong (Korea_Goseong, n = 7, captured in 2020), Geoje Island (Korea_Geoje, n = 2, captured in 2021) and Goheung (Korea_Goheung, n = 2, captured in 2022), are downwardly separated from the other BPH individuals along PC2 (Fig. 1B).Second, the Korean BPH individuals sampled in the western coastal city of Taean (Korea_Taean, n = 9, captured in 2022) are clearly separated from the first cluster along both PC1 and PC2.Notably, Korea_Goheung individuals captured in 2022 fall closer to the other southern coastal individuals than Korea_Taean individuals captured in the same year of 2022.Finally, one Korea_Taean BPH individual, identified as "Taean_outlier" (Korea_Taean_o), falls close to the Southeast Asian individuals, and is clearly separated from all other Korean individuals.This individual provides possible evidence for multiple sources of Korean BPHs, with at least one closely related to the observed Southeast Asian BPHs.

The shared genetic profile of Korean BPHs over three consecutive years
To quantify the population structure of planthoppers on a global scale, we employed the outgroup f 3 statistic in the form of f 3 (outgroup; population 1, population 2) (Patterson et al. 2012).A higher outgroup f 3 value signifies a greater degree of shared genetic drift between the two populations being compared.For the outgroup population, we decided to use the Chinese BPH individuals, although we acknowledge that the outgroup position may largely reflect the technical bias between WGS and GBS rather than the true genetic distance.Although the biologically relevant outgroup of N. muiri is available, there is a high level of genotype missingness (72 %-82 % per individual) for this outgroup, which leads to a large reduction in the number of SNPs available for the analysis.We confirmed the outgroup position of the Chinese BPH individuals using the following statistics.First, when calculating outgroup f 3 in the form of f 3 (N.muiri; BPH 1, BPH 2), the f 3 statistics for BPH pairs including Chinese individuals show substantially lower values than for pairs without Chinese individuals (Fig. S3; Table S2).Second, we confirm that almost all BPH pairs are symmetrically related to the Chinese individuals by calculating the f 4 statistic in the form of f 4 (N.muiri, Chinese; BPH 1, BPH 2) (Patterson et al. 2012).If this value does not significantly deviate from zero, it suggests that BPH 1 and BPH 2 form a clade to the exclusion of the Chinese individuals.On the other hand, this value becomes significantly negative if there is substantial gene flow between BPH 1 and the Chinese individuals; likewise, a significantly positive value suggests gene flow between BPH 2 and the Chinese individuals.Only one out of 55 BPH pairs show a significant deviation from zero (|Z| > 3), consistent with the basal outgroup position of the Chinese individuals (Fig. S3; Table S3).Considering these results, we decided to use the Chinese BPHs as the outgroup to maximize the statistical power of the population genetic analyses, but acknowledging that our data cannot provide information for the true genetic relationship between Chinese and non-Chinese BPH populations.
We computed all possible pairs of outgroup f 3 statistics in the form of f 3 (Chinese; BPH 1, BPH 2) between the 11 analysis groups (Fig. 2; Table S4).It is worth noting that Bhutan (n = 1, captured in 2021) and Laos_2022 (n = 1) exhibited consistently higher outgroup f 3 statistics, whereas the Korea_Taean_o displayed consistently lower outgroup f 3 statistics compared with other populations, positioning them as outliers (Fig. 2A).As there was only one Bhutan individual and Laos_2022 individual in our data set, we decided to exclude Bhutan and Laos_2022 from further f statistics analyses as outliers, although it would be worthwhile investigating these areas in a future study to see whether this unique pattern is replicated in a larger sample set and thus reflects a genuine genetic relationship.
Consistent with our PCA findings, we observed that, with the exception of Bhutan, Korea_Taean_o and Laos_2022, planthoppers could primarily be classified into two major groups: Korean individuals and Southeast Asian individuals (Fig. 2B).The shared genetic drifts between populations were notably higher among the Korean individuals than among the Southeast Asian individuals, suggesting the presence of additional shared genetic drift that is specific to Korean planthoppers, and not seen in Southeast Asian populations.If the sampled Southeast Asian populations include the true source populations of the Korean populations over the 3 years of sampling, such a shared genetic drift in the Korean populations is not expected to be observed (Fig. 3A).We speculate that such a pattern may be explained if a yet-tobe-sampled source population, distinct from the sampled Southeast Asian populations, has repeatedly founded Korean populations for the 3 year period between 2020 and 2022 (Fig. 3B).

The fine-scale genetic differences between the Korean BPH populations over time
To explicitly model the genetic relationship between various Korean and Southeast Asian BPH groups, we utilized the f 4 Population genomic analysis of Nilaparvata lugens Entomological Research 54 (2024) e12722 symmetry test.Among the Southern Southeast Asian (SSEA) groups detected in PCA, we find that Cambodia and Thailand form a clade against all the other groups, as shown by |f 4 (Outgroup, Others; Cambodia, Thailand)| < 2.0 standard error measure (SEM) (Table S5).The Northern Southeast Asian (NSEA) group, Myanmar, seems to be a common outgroup to (Cambodia, Thailand) because |f 4 (Outgroup, Myanmar; Cambodia, Thailand)| = 1.90 SEM (Table S5).We then built a population graph of Southeast Asian BPHs based on the observed f 4 statistics using qpGraph (Patterson et al. 2012).Including Korea_Taean_o, which falls close to Southeast Asian BPHs in the PCA, we find that a simple tree-like topology (Korea_Taean_o; Myanmar, SSEA) without any gene flow between these groups adequately fits these populations (Fig. S4).We obtain the identical tree topology using TreeMix, an automated population tree searching FIGURE 3 Two possible population trees explaining consecutive introductions of Korean brown planthoppers (BPHs).As an overview of the hypothetical models for the introduction of Korean BPHs, we propose two possible trees and expected outgroup f 3 results, respectively.If Korean populations were split from one of the sampled Southeast Asian populations (SEA), the genetic drift shared between SEA and Korean BPHs would be higher than that shared between two Korean BPHs (A).In contrast, if Korean populations were split from the unsampled ghost population (GHOST), the opposite result would be expected (B).

FIGURE 2
The shared genetic drift between brown planthopper populations.To investigate the shared genetic drift pattern among populations, we calculated the outgroup f 3 statistics of each population.We visualized the results in a box-and-whisker plot (A) and heat map (B).
program (Fig. S5).However, this also suggests that there is a substantial difference between the observed and the expected allele frequency covariance between Southeast Asian and Korean populations, calling for further investigation into gene flow among BPH populations.
To validate this pattern within the admixture graph, we sequentially incorporated Korean populations into the existing Southeast Asian population graph constructed beforehand (Fig. 5).To increase the number of individuals included in each group, we combined the cladal groups into one: Thailand and Cambodia for SSEA; and Korea_Namhae and Korea_Goseong for Korea_Group1.Korea_Group1 was found to diverge from the Southeast Asian populations, with their genetic make-up explained without any admixture.Subsequently, we introduced Korea_Geoje into the model and observed significant gene flow from the Myanmar population, effectively explaining the asymmetric genetic affinities we had identified in the f 4 symmetry test.Finally, the Korea_Taean population, which exhibited a strong genetic affinity with SSEA, was suitably represented as an admixture between SSEA and the Korea_Group1 populations.

Discussion
In our study, we have identified a close genetic relationship between Korean BPH populations across a 3 year period between 2020 and 2022, hinting at the repeated dispersal of BPHs into Korea from a source metapopulation/region that has not been sampled.Although more countries have been investigated in our study than in previous studies, we could not pinpoint this primary source.As Chinese BPHs are known to be genetically close to those in the southern part of Southeast Asia (Hu et al. 2019), they may possibly be the main source of Korean individuals.However, considering that no sampling was conducted at all for a large area of China in this study, it is clearly necessary to conduct sampling in multiple regions, FIGURE 4 The extra genetic affinity between Southeast Asian and Korean brown planthopper.To investigate the gene flow from Southeast Asia to Korea, we calculated the f 4 statistics in the form of f 4 (China, Southeast Asia; X, Korea).We divided the Southeast Asian populations into two subpopulations: Myanmar (A) and the southern Southeast Asia (SSEA; Thailand and Cambodia) (B).The Z scores of the f 4 statistics are visualized in box-and-whisker plots.even in one country.Therefore, our next step involves capturing planthoppers from China and other Southeast Asian regions to pinpoint the main source of the Korean substratum.
Although the precise origins of the Korean population remain elusive, we did discover significant gene flow from Southeast Asia.Our examination unveiled a distinct gene pool extending from north to south in Southeast Asian populations, and possible contributions from both Myanmar and SSEA to Korea.Given the long-distance migratory nature of planthoppers, seasonal winds and typhoons could potentially bring about direct genetic exchanges with these populations.As the flight time of BPHs arriving in Korea from Fujian Province, China, or from the south was estimated to be 24-45 h (Turner et al. 1999), we may also expect that BPHs can fly continually from countries of the Indochinese Peninsula to Korea via China along with the EASM wind in less than a week, if not a few days.
Furthermore, our analysis identified gene flow from an unidentified region when studying Korea_Taean_o.Positioned on the basal branch of Southeast Asia in our graph analysis, Korea_Taean_o is unequivocally from outside Korea but does not exhibit a clear genetic affinity to a specific Southeast Asian population.These doubts lead to the possibility that BPHs from areas outside the Indochinese Peninsula, such as the Philippines, may also be introduced into Korea.Typhoons passing through the Philippines, the country most exposed to tropical storms in the world (Santos 2021), typically move from east to west, and gradually turn north or west as a result of the Coriolis effect, so they mainly travel to the coast of Vietnam or southern China.If BPHs of the Philippines are carried to Vietnam by typhoons and are subsequently carried by the EASM wind to Korea, this may be another rare (Hereward et al. 2020) but possible event.
Expanding our study sites in Vietnam and the Philippines is essential for comprehending the origins of Korea_Taean_o.
Our findings shed light on the population structure of Korean BPHs and underscore Korea as a hotspot for East Asian BPH populations.Consequently, the Korean BPH population presents an ideal model for deciphering the genetic architecture of small, annually migrating invasive insect populations.The migratory nature of BPHs, with potential contributions from various sources to Korea, poses a considerable challenge to controlling BPH outbreaks in the country.As genetic differences in the populations can cause differences in the susceptibility to pesticides, it is imperative to identify the principal source of the Korean population and gain a deeper understanding of the genetic structure to develop strategies for preventing and managing Korean BPH populations.
FIGURE 1 (A) Geographic sites of brown planthoppers (BPHs) captured from the Indochinese Peninsula and the Korean Peninsula in 2020-2022.Each captured individual is marked by a circle, square or triangle, respectively.The base map data was downloaded from Natural Earth (https:// www.naturalearthdata.com/downloads/).(B) Principal component analysis results.We performed the principal component analysis with genotype data for BPHs and plotted the principal components 1 and 2 (PC1 and PC2) of each individual.

FIGURE 5
FIGURE 5 Admixture graph explaining the population dynamics between Southeast Asia and Korea.To test the population dynamics comprehensively, we systematically added the Korean populations to the backbone tree of the Southeast Asian populations (Fig.S3) and searched for the best graph.To increase the number of individuals in each group, we additionally combined the cladal groups into one: Thailand and Cambodia for SSEA; Korea_Namhae and Korea_Goseong for Korea_Group1.Drift lengths and admixture proportions parameterized by the given graph topology are optimized with the qpgraph function.The worst Z score of differences between optimized and observed f 4 statistics was À2.55.

Table 1
Meta information for the collected brown planthoppers The meta information includes country of collection, locality, date and number of samples.