Screening potential insect vectors in a museum biorepository reveals undiscovered diversity of plant pathogens in natural areas

Abstract Phytoplasmas (Mollicutes, Acholeplasmataceae), vector‐borne obligate bacterial plant parasites, infect nearly 1,000 plant species and unknown numbers of insects, mainly leafhoppers (Hemiptera, Deltocephalinae), which play a key role in transmission and epidemiology. Although the plant–phytoplasma–insect association has been evolving for >300 million years, nearly all known phytoplasmas have been discovered as a result of the damage inflicted by phytoplasma diseases on crops. Few efforts have been made to study phytoplasmas occurring in noneconomically important plants in natural habitats. In this study, a subsample of leafhopper specimens preserved in a large museum biorepository was analyzed to unveil potential new associations. PCR screening for phytoplasmas performed on 227 phloem‐feeding leafhoppers collected worldwide from natural habitats revealed the presence of 6 different previously unknown phytoplasma strains. This indicates that museum collections of herbivorous insects represent a rich and largely untapped resource for discovery of new plant pathogens, that natural areas worldwide harbor a rich but largely undiscovered diversity of phytoplasmas and potential insect vectors, and that independent epidemiological cycles occur in such habitats, posing a potential threat of disease spillover into agricultural systems. Larger‐scale future investigations will contribute to a better understanding of phytoplasma genetic diversity, insect host range, and insect‐borne phytoplasma transmission and provide an early warning for the emergence of new phytoplasma diseases across global agroecosystems.

Phytoplasmas are transmitted from plant to plant by phloem-feeding hemipteran insect vectors, mainly leafhoppers, in a persistent-propagative manner (Hogenhout et al., 2008; Lee et al., 2000; Weintraub & Beanland, 2006). After acquisition of phytoplasmas from an infected plant by a hemipteran insect, the phytoplasma cells must cross the midgut epithelium, then multiply in the hemolymph in order to invade the salivary glands before being inoculated into another host plant (Hogenhout et al., 2008; Huang et al., 2020; Koinuma et al., 2020).
Attempts to culture phytoplasmas in vitro have, thus far, not succeeded. Thus, phytoplasmas are currently assigned to the provisional genus "Candidatus (Ca.) Phytoplasma," and 45 "Ca.
The intimate tritrophic interaction among phytoplasmas, host plants, and insect vectors defines a complex of multiple pathosystems worldwide (Trivellone, 2019). Almost all phytoplasma-host associations have been characterized by testing plants showing symptoms of diseases in agroecosystems. However, because the association between phytoplasmas, plants, and insect vectors has been evolving for at least 300 million years (Cao et al., 2020), phytoplasmas and their vectors should also be widespread and diverse in nonmanaged, native habitats. Indeed, current theories of infectious disease evolution suggest that most epidemic diseases afflicting humans, livestock, and crops emerge as a result of potentially pathogenic organisms "jumping" from a native host to a new host following anthropogenic disturbance of natural habitats (Brooks et al., 2019).
About 100 insect species have been recorded as competent vectors of phytoplasmas; however, for most of the described "Ca. Phytoplasma" species and 16S rRNA subgroups, the suite of vectors is still unknown (overview in Trivellone, 2019). Because insects are often difficult to identify and individuals infected with phytoplasmas cannot be distinguished from noninfected individuals except through microscopy, molecular screening, or pathogen transmission trials, efforts to identify competent phytoplasma vectors have lagged far behind efforts to characterize phytoplasmas and their host plants. Due to the mobility of insect vectors, spillovers of vector-borne phytoplasmas from adjacent highly diverse natural habitats into agroecosystems were hypothesized to play an important role in the emergence of new phytoplasma diseases (see Brooks et al., 2021). However, few attempts have been made to study phytoplasma diversity in natural habitats. Therefore, the diversity, plant host range, and insect vector range of phytoplasmas are probably significantly underestimated.
Due to increased awareness of the importance of wildlife as pathogen reservoirs (Brooks et al., 2020), the use of museum biorepositories to discover and track pathogens is a critical step for anticipating the emergence and re-emergence of infectious diseases (DiEuliis et al., 2016; Dunnum et al., 2017). The high levels of biodiversity and geographic coverage represented in such repositories can also help unveil the evolutionary history of pathogens and reveal previously unknown interactions with actual or potential hosts.
In this study, we analyzed specimens of Deltocephalinae leafhoppers (Hemiptera: Cicadellidae: Deltocephalinae) preserved in the collection of the Illinois Natural History Survey (INHS) (http://inhsinsectcollection.speciesfile.org/InsectCollection.aspx). The INHS leafhopper collection is one of the largest in the world, with more than 400,000 specimens stored either pinned or in ethanol at −20°C. In 2018, a subsample of ethanol-preserved leafhoppers collected in natural habitats was tested for the presence of phytoplasmas. The results revealed that about 3% of tested insect specimens harbored phytoplasmas. The newly discovered phytoplasmas group with phytoplasma strains belonging to three distinct taxonomic (16Sr) groups.
Phytoplasmas were detected from a total of six leafhopper species including five known and one recently described species, all recorded for the first time as potential phytoplasma vectors. These results indicated that phytoplasma diversity and potential insect host range are indeed underestimated, and further large-scale investigation of leafhopper samples collected from natural habitats is needed.

| Collection and preservation of leafhoppers
More than 3,000 bulk samples of sap-feeding hemipteran insects were obtained between 1998 and 2018 through fieldwork by the last author, his students, and colleagues during surveys that aimed to document poorly studied insect faunas in various parts of the world and to obtain representatives of all major lineages of Cicadellidae for use in phylogenetic and systematic studies. This material was supplemented by the first author's collections in Europe between 2001 and 2018. Specimens were collected using various methods, including sweeping and vacuuming of vegetation, night collecting at lights, and Malaise (flight intercept) trapping. Specimens were collected directly into 95% ethanol in the field, returned to the laboratory, and stored in −20°C freezers at the INHS. Voucher specimens were also pinned for species identification and reference. Some samples included undescribed species from under-investigated areas, which await description in the context of other projects. In 2018, screening was carried out on a subset of 227 samples from independent sampling events in 28 countries (six continents) worldwide (Argentina, Australia, [...] thematic maps within a geographic information system (QGIS 3.8, 2019; Figure 1). Although 98% of the collections were intentionally obtained from natural areas or patches of native vegetation within more anthropogenic landscapes, we evaluated the land cover of a larger area including each sampling site using the raster layer Cropland and Pasture area (resolution 10 × 10 km; Ramankutty et al., 2008).
In total, the 227 samples encompassed about 1,000 specimens, with each species (or morphospecies) represented by 1 to 20 specimens belonging to the phloem-feeding leafhopper subfamily Deltocephalinae (except 1 sample belonging to the related hemipteran family Membracidae), which includes most of the previously documented vectors of phytoplasmas (Table S1). At least one specimen from each sample was selected randomly (with preference for males when present because species identification usually requires examination of male genitalia) for the molecular analyses.

| DNA extraction
Total DNA was extracted from individual leafhoppers using a nondestructive method to preserve the specimen exoskeletons as vouchers and for subsequent morphological study. For each specimen, the abdomen was dissected, transferred to a 1.5-ml tube containing 400 µl 1X TES pH 7.8 buffer (20 mM Tris, 10 mM EDTA, 0.5% SDS) and 4 µl Proteinase K (20 mg/µl), and incubated at 56°C overnight. The abdomen was then removed and preserved in ethanol for morphological study.
The buffer solution was then blended for 10 min using a mixer (MixMate), and the solution was transferred to a new 1.5-ml tube with 400 µl of chloroform, mixed, and centrifuged for 10 min at 4°C at 13552 RCF. The supernatant was transferred to a new tube, and the chloroform wash was repeated. DNA was then transferred to a new tube, and 400 µl of ice-cold isopropanol was added followed by mixing and centrifuging for 15 min at 4°C at 16128 RCF. Supernatant was discarded, and the DNA pellet was washed twice using 500 µl of ice-cold 96% ethanol. The DNA pellet was then dried for 20 min and resuspended in 50 µl of TE buffer (pH 7.8). To each leafhopper sample, a molecular code was assigned: For example, LH078 stands for LeafHopper followed by an ordinal number indicating the collection event.
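The molecular code scheme described above can be sketched in a few lines. This is only an illustration: the three-digit zero padding is an assumption inferred from codes such as LH078 and LH133, and the helper names `make_code` and `parse_code` are hypothetical.

```python
import re

# Assumed pattern: the prefix "LH" (LeafHopper) followed by an ordinal number
# of at least three digits, as in LH078, LH133, LH139, LH143.
CODE_PATTERN = re.compile(r"^LH(\d{3,})$")


def make_code(event_number: int) -> str:
    """Build a molecular code from the ordinal number of a collection event."""
    if event_number < 0:
        raise ValueError("event number must be non-negative")
    return f"LH{event_number:03d}"


def parse_code(code: str) -> int:
    """Recover the ordinal number of the collection event from a code."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a valid molecular code: {code!r}")
    return int(match.group(1))
```

For instance, `make_code(78)` yields `"LH078"` and `parse_code("LH143")` yields `143`.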

| Leafhopper species identification
Specimens were sorted to morphospecies and tentatively identified (Emeljanov, 1967; Fletcher, 2000; Stiller, 2010; Zahniser, 2008). One of them was a species new to science and was recently described by the last author (Dietrich, 2021). The abdomens of voucher specimens (males) were dissected to study the genitalia under an Olympus SZX10 stereoscopic microscope. Habitus photographs of voucher specimens were taken at INHS with a Canon SLR camera and 65-mm macro lens mounted on an automated lift.

| DNA amplification and sequencing of phytoplasmas
[...] Christensen et al., 2004) were tested using nested PCR of the 16S ribosomal RNA gene to confirm the phytoplasma identity. In the 16S rRNA region, nested PCR was performed using universal primer pair P1/P7 (Deng & Hiruki, 1991; Smart et al., 1996) followed by F2n/R2 (Gundersen & Lee, 1996). Amplicons were visualized on 1% agarose gel stained with GelRed (Biotium Inc.) under a Gel Doc XR UV transilluminator (Bio-Rad). The DNA of ALY (Italian alder yellows) phytoplasma, obtained from experimentally infected periwinkle (Catharanthus roseus), was used as a positive reference strain in all the amplification reactions. Sequencing of the F2n/R2 amplicons was carried out in both directions using automated equipment (BMR Service, Padua, Italy). Forward and reverse reads were assembled using Gap4 and Pregap (Bonfield et al., 1995), followed by manual editing. Nucleotide sequences were deposited in the GenBank database under the accession numbers listed in Table 1.
An initial BLAST query (Altschul et al., 1990) was performed to evaluate the similarity of the newly obtained sequences to the five most similar phytoplasma sequences, which were evaluated for inclusion (Table S2) in the final dataset for phylogenetic analyses. The final reference sequence dataset consisted of 21 sequences obtained from the National Center for Biotechnology Information (NCBI) database (Federhen, 2012). The ingroup included 20 phytoplasma strains (11 described as "Ca. Phytoplasma" species, including an incidental citation) representing different countries and isolated from distantly related hosts (Table S3), and the outgroup included Acholeplasma palmae (Acholeplasmataceae). Electropherograms were corrected and aligned using the MUSCLE algorithm as implemented in MEGA 7.0 (Edgar, 2004; Kumar et al., 2016) [...] model (Kimura, 1980) and neighbor-joining (NJ) method using the maximum composite likelihood model (Tamura et al., 2004). Branch support was measured using a bootstrap test with 1,000 replicates.
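To make the substitution model concrete, the following is a minimal sketch of a pairwise distance under the two-parameter model of Kimura (1980), assuming that this is the model referred to above. It is not the MEGA implementation; the function name `k2p_distance` and the handling of gaps (sites with any non-ACGT character are skipped) are illustrative assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q), where P and Q are the
    proportions of transitional and transversional differences.
    Raises ValueError if the sequences are too divergent for the formula
    (the argument of a logarithm becomes non-positive).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip gaps and ambiguous bases (pairwise deletion, an assumption).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

A matrix of such pairwise distances is the kind of input that a distance-based method such as neighbor-joining operates on; tree building and bootstrapping are deliberately left out of this sketch.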

| Taxonomic diversity of tested leafhopper samples
The 227 specimens analyzed belong to 9 tribes (Athysanini, Chiasmini, [...] are represented by multiple specimens (Table S1).
GIS analyses with the Cropland and Pasture overlay confirmed that the sampling sites were located mainly in natural areas, with average raster values of 0.091 ± 0.13 (compared with cropland raster value = 1).

| Detection and phylogenetic analysis of phytoplasmas
Using qPCR on 227 leafhoppers, a positive signal was detected in 111 specimens. Only 14 samples with Cq value ≤ 30.38 were selected for further analysis (Table S1). The nested PCR primed by F2n/R2 ampli- [...] (Table 1). The distance between these two sampling sites is about 1,120 km.
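The Cq-based selection of qPCR-positive samples for nested PCR amounts to a simple threshold filter. The sketch below uses the cutoff stated above (Cq ≤ 30.38); the sample codes and Cq values in the usage note are invented for illustration and are not the study's data, and `select_for_nested_pcr` is a hypothetical helper name.

```python
from typing import Optional

CQ_CUTOFF = 30.38  # cutoff reported in the text


def select_for_nested_pcr(
    cq_by_sample: dict[str, Optional[float]],
    cutoff: float = CQ_CUTOFF,
) -> list[str]:
    """Return sample codes whose qPCR Cq is at or below the cutoff.

    A Cq of None stands for "no amplification" (qPCR negative).
    Results are sorted by Cq, lowest (strongest signal) first.
    """
    selected = [
        (cq, code)
        for code, cq in cq_by_sample.items()
        if cq is not None and cq <= cutoff
    ]
    return [code for cq, code in sorted(selected)]
```

With hypothetical readings such as `{"LH001": 27.5, "LH002": 34.1, "LH003": None, "LH004": 30.38}`, the function returns `["LH001", "LH004"]`: the sample exactly at the cutoff is retained, while the late-amplifying and negative samples are dropped.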
Samples from China (LH143) and Australia (LH139) are polyphyletic, with LH139 branching more deeply than LH143. A recent comprehensive ML tree for phytoplasmas recovered members of 16SrXI as paraphyletic with respect to 16SrXIV (Cao et al., 2020). [...] (Figure 1 and Table 1).

Both
In the second cluster (B), LH133 is sister to "Ca. Phytoplasma

| Ecological and evolutionary context of the new phytoplasma-host associations
FIGURE 3 Maximum-likelihood tree based on 952 positions of the F2n/R2 fragment of the 16S rRNA gene obtained from 6 samples of the present study (in bold), 20 phytoplasma strains from GenBank (used as references), and Acholeplasma palmae (outgroup). Bootstrap values (> 63%) are shown above or below the branches. Branch lengths are proportional based on the scale indicated. Initial tree(s) for the heuristic search were obtained automatically by applying the Maximum Parsimony method. GenBank accession numbers and details of the reference phytoplasma strains are listed in Table S3. The names at the tips of the tree include the following: the phytoplasma strain (acronym or Candidatus species name), the 16Sr phytoplasma group in parentheses or the name of the insect species host, and the country
None of the leafhopper species that tested positive for the presence of phytoplasmas in the present study were previously reported as hosts or vectors of phytoplasmas (Trivellone, 2019). These 6 [...] (Kruger et al., 2015). Thus, this is the first record of phytoplasma strains in the clade 16SrXI/16SrXIV in South Africa. The leafhopper fauna of Africa is diverse but remains poorly known, with new genera and species continuing to be discovered (e.g., Stiller, 2019, 2020). Pravistylus exquadratus and other members of the same genus have never been reported as pests, except for single records of this species on the Korog wheat cultivar and on ryegrass (Stiller, 2010). [...] (Novikov et al., 2006), including four that are competent vectors of 16SrI phytoplasmas in Europe, although 16SrI phytoplasmas have not been previously recorded from this country (Trivellone, 2018, 2019).
Our discovery of a new association between a Macrosteles species not previously recorded as a phytoplasma host and a new 16SrI group strain or host suggests that further surveys and phytoplasma screening in Kyrgyzstan may be important for assessing the potential threat of emerging phytoplasma diseases in this region of Central Asia. Among the species collected in Australia, Mayawa capitata (LH133) belongs to the grass-specialist leafhopper tribe Paralimnini and reportedly occurs on grasses and Sida acuta (Malvaceae) (Fletcher, 2000). Mayawa affinifacialis (LH139) has been recently described (Dietrich, 2021), and little is known about its ecology; however, because the species was collected in grassland, it is likely a grass feeder. A specimen of the first species (LH133) was infected with a phytoplasma strain closely related to strains classified in the 16SrXV group, and a specimen of the second (LH139) with a phytoplasma strain closely related to strains classified in the 16SrXI group. Only 3 competent vectors of phytoplasmas (all in group 16SrII) were previously known for this country: two species of Orosius, tribe Opsiini (Deltocephalinae), and Batracomorphus angustatus (Osborn) in the subfamily Iassinae (for an overview, see Trivellone, 2019). A recent review of Australian phytoplasma pathosystems revealed an important knowledge gap, with several recorded phytoplasma strains not yet assigned to 16Sr groups and subgroups (Liu et al., 2017).
Moreover, information on competent vectors is scarce with many species still undescribed, hampering the understanding of epidemiological cycles. Our results expand the spectrum of potential vectors recorded in Australia to include species from the tribe Paralimnini, and reveal new possible epidemiological routes that require further investigation.
The specimen of Acharis ussuriensis (LH143) testing positive was infected with a strain closely related to strains in the 16SrXI/16SrXIV groups (Fig. 3, cluster A). Although both phytoplasma groups were previously detected in China, further investigation on the pattern of transmission and host plants involved in this pristine area will provide useful insights into the characterization of phytoplasma-host relationships in natural areas.

| Underestimated phytoplasma diversity in natural areas
Phytoplasmas are a highly diverse group of plant pathogens, and new strains continue to be discovered at a steady pace worldwide. Most such discoveries still result from screening of plants showing "typical" phytoplasma disease symptoms in human-managed ecosystems.
By screening leafhopper specimens from natural habitats, we revealed new associations between phytoplasmas and their insect hosts, documenting new phytoplasma group records for 3 countries.
The phytoplasma strains newly detected here have been further characterized and represent multiple subgroup lineages (2021).
Our results highlight the fact that potential vectors in natural areas are poorly studied (as suggested by [...]) and may harbor phytoplasma species not yet discovered and described.
Discovery of new phytoplasmas in natural areas worldwide is not surprising, given the > 300-million-year history of coevolution between phytoplasmas, their plant hosts, and insect vectors and the lack of extensive screening for phytoplasmas in nonmanaged ecosystems (Cao et al., 2020). According to a recent molecular timetree (Cao et al., 2020), the earliest diver-

| Museum biorepository as source of unknown phytoplasmas
Previous research showed that integrating different sources of knowledge is of paramount importance for discovering potentially emergent pathogens. Studies on zoonotic diseases showed that museum biorepositories represent an invaluable but still poorly utilized resource for pathogen discovery, due to the wealth of species represented and prevalent best practices of specimen preservation, identification, and collecting event description (Dunnum et al., 2017). Furthermore, existing databases and traditional ecological knowledge can contribute to discovery of the location and timing of potential spillover of pathogens into human-managed systems worldwide (Brook et al., 2009; Kutz et al., 2009). Plant, fungal, and animal specimens deposited in natural history museums and public or private collections are becoming increasingly accessible due to Web-based interfaces. These collections represent the most comprehensive available sources of data documenting the diversity of life and have proven useful for many purposes beyond their traditional applications to comparative morphology, taxonomy, and biogeography. Until recently, species interactions documented by collections were mainly investigated using metadata (e.g., Bartomeus et al., 2019). The advent of increasingly sensitive molecular methods has recently allowed more cryptic symbiotic associations to be explored directly by testing preserved tissues of potential hosts for the presence of microbes and other symbionts (e.g., Daru et al., 2019). To our knowledge, this is the first time that phytoplasma-insect associations have been documented using museum specimens. However, because most collections of leafhoppers and other terrestrial insects consist of dried, pinned specimens, the ethanol-preserved specimens screened for our study are not typical of the material usually available in museums.
Nevertheless, dried, pinned insect specimens have been shown to yield DNA of sufficient quality for use in various applications, including DNA barcoding and phylogenetics (e.g., Blaimer et al., 2016; Mitchell, 2015). We did not attempt to screen such specimens for phytoplasmas in our study. The ethanol-preserved specimens tested for this study ranged in age from 1 to 20 years, and we detected no correlation between the quantity or quality of DNA obtained and the age of the specimen.
Our screening confirmed the presence of phytoplasmas in 6 leafhopper specimens (accounting for ~3% of the subset of 227 leafhoppers analyzed). Because we mostly tested single specimens from collecting events spread over 20 years on multiple continents, it is not surprising that most of our samples tested negative for the presence of phytoplasmas. Our data do not allow us to speculate on local infection rates of the new strains detected. However, considering the spatial, temporal, and taxonomic scale of the samples available in museum biorepositories, our results can be taken as a very rough, preliminary estimate of phytoplasma prevalence in natural areas worldwide and suggest that the undiscovered diversity of phytoplasmas in natural areas worldwide is substantial.
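The arithmetic behind the ~3% figure, and the uncertainty attached to it, can be illustrated with a short calculation. The Wilson score interval used here is our own illustrative choice and is not an analysis reported in the study; it treats the 227 specimens as independent draws, which is a strong simplifying assumption given the heterogeneous sampling.

```python
import math


def wilson_interval(positives: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if total <= 0 or not 0 <= positives <= total:
        raise ValueError("need 0 <= positives <= total and total > 0")
    p = positives / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the text: 6 confirmed positives among 227 leafhoppers.
POSITIVES, TESTED = 6, 227

prevalence = POSITIVES / TESTED
low, high = wilson_interval(POSITIVES, TESTED)
print(f"prevalence: {prevalence:.1%}")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

The width of the resulting interval relative to the point estimate is consistent with the caution expressed above that these data support only a very rough, preliminary estimate.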
Given the success of our approach, larger-scale studies of museum biorepositories have strong potential to fill major gaps in our knowledge of phytoplasma diversity, the evolution of phytoplasma-plant-vector associations, and the potential for emergence of new pathogens of agricultural importance.

| Potential impact of vector-borne phytoplasma spillovers and large-scale future study
Centuries of homogenization of agricultural production systems led to decreased genetic and species diversity of crops. Such general biological depletion was previously associated with increased pathogen outbreaks and serious economic losses in agroecosystems (King & Lively, 2012; Newton, 2016). Earlier research recognized the role of wildlife as natural reservoirs where infections are often asymptomatic. The onslaught of emerging infectious diseases in crops often involved alternative sources of inoculum and creation of new ecological interfaces, and global changes (e.g., land use or climate warming) set the stage for new associations to occur. Spillover events from natural habitats in direct contact with cultivated fields have been documented for several plant pathogens (Brooks et al., 2021; McCann, 2020), and the involvement of vectors may facilitate host shifts, accelerating the spread of diseases at the regional level. The phytoplasmas associated with Flavescence dorée disease, and related strains (FDp), represent one of the best-studied pathosystems (Malembic-Maher et al., 2020), providing a good example of spillover from wild plants to a crop (Vitis vinifera) through efficient insect vectors (Brooks et al., 2021). For other phytoplasma pathosystems, epidemiological information and characterization of strains associated with crops have accumulated for over forty years. However, information on the genetic diversity, host range, and ecological characteristics of phytoplasma spread in natural habitats is still largely missing.
This knowledge gap hinders basic understanding of the evolution of phytoplasmas in association with their hosts and hampers the implementation of proactive measures to cope with emerging pathogens.

ACKNOWLEDGEMENTS
The authors would like to thank Dr. Nadia Bertazzon for her support during DNA extraction, and Dr. Elisa Angelini for the insightful comments and suggestions to the manuscript. We also gratefully acknowledge the assistance of numerous colleagues and collectors listed in Table S1 who helped obtain the specimens used in this study. This study was supported by the Swiss National Science Foundation (P2NEP3_168526) and partially by the US NSF grant DEB-1639601. This study was also supported by the US Department of Agriculture, Agricultural Research Service (Project number 8042-22000-306-00D).

CONFLICT OF INTEREST
None declared.

DATA AVAILABILITY STATEMENT
The sequences supporting the conclusions of this article were deposited in GenBank under the accession numbers MW473669-MW473674.