Molecular markers reveal diversity in composition of Megastigmus (Hymenoptera: Megastigmidae) from eucalypt galls

Abstract Since outbreaks of the invasive blue gum chalcids Leptocybe spp. began, the genus Megastigmus (Hymenoptera: Megastigmidae) has been increasingly studied as containing potential biocontrol agents against these pests. Megastigmus species have been collected and described from Australia, the presumed origin of Leptocybe spp., with M. zvimendeli and M. lawsoni reported as Leptocybe spp. parasitoids established outside of Australia. Parasitic Megastigmus have been reported to occur locally in the Neotropics, Afrotropic, Palearctic, and Indomalaya biogeographic realms, and in many cases described as new to science. However, molecular tools have not been used in studying parasitic Megastigmus, and difficulties in morphological taxonomy have compromised further understanding of eucalypt‐associated Megastigmus as well as the Megastigmus‐Leptocybe association. In this study, we used molecular markers to study the species composition and phylogeny of Megastigmus collected from eucalypt galls in Australia and from Leptocybe spp. galls from South Africa, Kenya, Israel, China, and Vietnam. We record thirteen discrete species and a species complex associated with eucalypt galls. A summary of morphological characters is provided to assist morphological delimitation of the studied group. A phylogeny based on 28S rDNA identified species groups of importance to Leptocybe spp. biocontrol agents from four clades with nine species. Relationships between Megastigmus from eucalypt galls and their phytophagous congeners were unresolved. Further molecular work is needed to clarify the identity of many species.


Megastigmus zebrinus was described from specimens from South
Africa and Australia (Grissell, 2006) and was later reported to occur in Thailand and Argentina (Doğanlar, 2015). Several species of Megastigmus have been recorded as associated with Leptocybe spp. in the Indomalaya, Palearctic, Afrotropic, and Neotropic biogeographic realms, and in many cases described as new species to science (Le et al., 2018).
Discrimination of Megastigmus associated with Leptocybe spp.
has so far relied largely on morphology (Doğanlar, 2015;Doğanlar & Hassan, 2010). This has caused uncertainty and impeded further understanding of phylogenetic relatedness, since variation in color and sizes of specimens has long been known to challenge taxonomists, particularly those working with Australian species (Bouček, 1988;Milliron, 1949). While molecular markers have been used in systematic studies and species delimitation of the Palearctic and Afrotropical fauna (Auger-Rozenberg et al., 2006;Roques, Copeland, Soldati, Denux, & Auger-Rozenberg, 2016), continued use of these markers is expected to assist species delimitation and give further insight into the phylogeny of this genus.
Incorporation of generated DNA sequences with published molecular data is expected to provide knowledge of the relationships between phytophagous and entomophagous species, and between Australian and non-Australian Megastigmus (Le et al., 2018;Roques et al., 2016).
Here we present a study of eucalypt-associated Megastigmus, with a focus on species of potential biocontrol use against invasive Leptocybe spp. Research materials included Megastigmus specimens from eucalypt galls collected in Queensland (QLD), New South Wales (NSW), Australian Capital Territory (ACT), Victoria (VIC), and, where possible, specimens associated with Leptocybe spp. galls in their exotic ranges provided by international colleagues. Species delimitation and phylogenetic reconstruction were completed using a combination of the Clyde-Bonnie fragment of mitochondrial DNA coding cytochrome c oxidase subunit 1 (COI mtDNA) and the partial nuclear DNA fragment coding 28S ribosomal RNA (28S rDNA).
This study is the first to report a molecular sequence comparison of gall-associated Megastigmus in the eucalypt gall system.

| Insect collection
Galled eucalypt material was collected in road-side surveys in QLD, NSW, ACT, and VIC between March 2014 and June 2019. Parts of eucalypt plants bearing galls on young shoots or leaves were collected and transferred to the laboratory within seven days of collection. Galls were placed in zip-lock bags and stored in a cooled insulated box or fridge (4°C) during transportation and then transferred to separate plastic vials (Φ30 mm × H100 mm) containing moistened tissue paper. These vials were kept in a controlled temperature cabinet, at 25 ± 2°C, 50% to 70% RH, and emerged insects were collected every 2-3 days over ~30 days. Emerged Megastigmus were transferred to glass vials (Φ11.6 mm × H32 mm) containing 100% ethanol (volume of insect: ethanol <1:10) and stored at −20°C.
Gall types bearing Megastigmus emergence were recorded and preliminarily sorted by gall morphology (see Appendix S1). Individual wasps were examined, photographed, and DNA was extracted when (a) specimens emerged from material from a new collection site; (b) specimens emerged from the same collection site but from different gall types or different eucalypt species; or (c) specimens appeared superficially different from those collected in (a) and (b). Species of Bootanomyia spp., characterized by the knobbed stigmal vein and exerted ovipositor like Megastigmus spp. but with a metallic body color (Bouček, 1988;Doğanlar, 2011), were also collected and included in DNA analyses but were not examined further morphologically.
PCR thermo-cycling was carried out in a Bio-Rad T100 (Greenslopes, QLD, AUS) using the setup 95°C for 1 min then 35 cycles of 95°C for 1 min + 55°C for 1 min and 72°C for 1 min then 72°C for 5 min for final extension before holding at 10°C. When the primer 1775-COI-F was used, the annealing temperature was reduced to 50°C.
The primers used (Table 1) were either from previous work on Megastigmus (Roques et al., 2016;Scheffer & Grissell, 2003) or designed in the course of this study. The targets sequenced were a partial region of the COI mitochondrial DNA (the "Clyde-Bonnie") and a fragment from the D1 to D3 region of the nuclear 28S rDNA. The primer pairs 1775-COI-F/2773-COI-R (amplicon size 1,040 bp) and 28S-D1F/28S-D3R (amplicon size ca. 1,090 bp) (Boivin et al., 2014;Roques et al., 2016) were attempted first. The alternative combination 28S-D1F/28S-1059R (amplicon size 1,078 bp) was used to amplify the 28S fragment if initial amplification failed. For M. zvimendeli specifically, to avoid pseudogene co-amplification, the target COI fragments were amplified by replacing 1775-COI-F with the upstream forward primer LCO1490 (Folmer, Black, Hoeh, Lutz, & Vrijenhoek, 1994). With this primer pair, the amplicon size (1,304 bp) was too long for completely overlapping forward and reverse read in Sanger sequencing, but the reverse primer gave clean capillary separation for the targeted fragment. Alternatively, the 1,304 bp COI fragments were obtained by manually assembling shorter fragments amplified using combinations of LCO1490 and 2773-COI-R with internal primers (Table 1). PCR products were visualized by electrophoresis on 1 × TBE and agarose gel with GelRed® (Biotium, California, USA). Products with a single band at the desired fragment size were sent to Macrogen Inc.
(Seoul, ROK) for purification and Sanger sequencing. Alternatively, purification and sequencing reactions were conducted on-site using ExoSAP-IT (Thermo Fisher Scientific, MA, USA) and a BigDye Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific).
On-site sequenced products were forwarded to the Australian Genome Research Facility (QLD, Australia) for electrophoresis capillary separation.

| Sequence alignment and species delimitation
Sequence alignment was performed using the built-in Geneious Alignment program in Geneious Prime (Biomatters, Auckland, NZ).
The paired forward and reverse reads were aligned and edited to unambiguous sequences unless otherwise specified. Primer sequences were removed from reads, and multiple sequences were Genetic species delimitation was determined using the mtDNA COI sequences using the web version of the Automatic Barcode Gap Discovery (ABGD) tool (Puillandre, Lambert, Brouillet, & Achaz, 2012) and the General Mixed Yule Coalescent (GMYC) method (Pons et al., 2006). ABGD examines the frequency distribution of pairwise differences to find the gap separating intragroup and intergroup differences and partitions the dataset by recursive application of a range of user-given thresholds P (the maximum divergence of intraspecific sequences). The model of evolution was KP80, which is a common parameter in mtDNA-based species delimitation (Boykin, Armstrong, Kubatko, & De Barro, 2012;Evans & Paulay, 2012).

Primer name Sequence Reference
CGAATAAATAATATAAGATTTTG Scheffer and Grissell (2003) LCO1490 (Forward) GGTCAACAAATCATAAAGATATTGG Folmer et al. (1994) 2222-COI-F (Forward) ATATTTTAATTTTACCAGGATTTGG Scheffer and Grissell (2003) 2399-COI-R (Reverse) TGTAGCTGAAGTAAAATAAGC Authors 2413-COI-R (Reverse) TCATCTAAAAACTTTAATTCCTGT Scheffer and Grissell (2003) 2773-COI-R (Reverse) population. An ultrametric tree was generated using the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) software package family (Bouckaert et al., 2019) version 2.6.3. Alignment of COI sequences obtained in the study was imported using the componential program BEAUTi, the selected substitution model was HKY + G+I (the first model determined using jModeltest for the analyzed dataset that is available for analysis in the software package), selected tree prior was Speciation: Yule Process (Gernhard, 2008), the maximum clade credibility (MCC) tree was generated using the program TreeAnnotator. GMYC analysis was performed using the function gmyc in the package splits (Ezard, Fujisawa, & Barraclough, 2017).
The identification of Leptocybe specimens to lineage B was based on the barcoding region and compared with published data (Dittrich-Schröder et al., 2018). Specimens of Leptocybe lineage B obtained in this study grouped with Leptocybe sp. lineage B with >99.3% identity. Other specimens referred to as "local" Leptocybe sp. were identified by morphology (Kim, 2008;Mendel et al., 2004), including one specimen extracted for DNA, compared with available Leptocybe barcoding sequences and confirmed to be an unpublished Leptocybe species.

| Phylogenetic inference
Placement of the sequences obtained in this study into a larger phylogeny was completed by incorporating sequences from Auger-  Roques et al., 2016), which was identical to sequences of M. zebrinus obtained in this study. One taxon was randomly selected if more than one sequence existed for a species-level taxon, similar to Roques et al. (2016). Alignments were trimmed to the same length for analysis.
Data were partitioned into three blocks for COI and four blocks for the concatenated alignment (three codon positions and one for 28S DNA The transition/transversion ratios were calculated and plotted using functions in the R-based package SPIDER (Brown et al., 2012).

| Morphological identification
Species identified using molecular evidence were labeled with numbers, including the four collected Bootanomyia spp. A minimal character set is provided for the collected Megastigmus species to assist future diagnostic works. Terminology follows Graham (1969), Bouček (1988), Gibson, Read, and Fairchild (1998) Ethanol-preserved insect specimens and DNA vouchers were deposited at the Insect collection, Queensland Department of Agriculture and Fisheries, Dutton Park, QLD, Australia.

| DNA sequences
A total of ninety-six specimens were extracted for DNA. Regarding  (Clary & Wolstenholme, 1985;Crozier & Crozier, 1993). The number of variable sites was 234/849 (191 sites at the third codon position). The observed pairwise transition/transversion ratio (R) was 0.50, and the maximum pairwise distance was 11.78%. The best fit evolution model was TIM2 + G + I (Transition Model 2 with gamma-distributed among-site rate variation and a proportion of invariable sites (Posada, 2003) and phytophagous Megastigmus are compared in Figure 1.

| COI DNA based species delimitation
The ABGD analysis of the COI mtDNA alignment identified a clear barcoding gap between KP80 distance 0.02 and 0.05, that is, two sequences with KP80 distances of greater than 0.05 can be confidently assigned to different groups. The stable number of retrieved groups was 20, obtained when data were partitioned using P values from 0.0049 to 0.0414 ( Figure 2). GMYC analysis suggested similar results with 20 Ml entities and the assignment of sequences to species identical to those suggested by ABGD (Figure 2

| Morphological species delimitation
A set of morphological characters was proposed (Table 2) to assist in distinguishing female specimens for species identified by ABGD.  Examination of M. zebrinus was based on ANIC and QM paratypes and South African alcohol preserved specimens were presented for females only, except for the distinct black patch in males of M. lawsoni .

| Phylogeny of eucalypt gall-associated Megastigmus
PHYML inference based on the 28S rDNA, applying the Jmodeltest2-suggested model (TPM1 + I + G) constructed a phylogenetic tree with 30 taxa including the outgroup with log-likelihood −2861 (Figure 3a). The species with metallic color from Australia, identified as Bootanomyia (Doğanlar, 2011), formed a highly supported group (98% bootstrap) and nested inside other Megastigmus taxa. The Megastigmus species were considerably divergent with some taxa forming well-supported groups, although most associations had bootstrap support values below 50% (hereby referred as unsupported) or from 50% to less than 70% (weakly supported).
Two clades of well-supported Leptocybe associates were the M. and factors influencing evolution rates was not possible with the limited taxon sampling and markers.

F I G U R E 3
Phylogeny of Megastigmus and Bootanomyia species associated with eucalypt galls in reference to noneucalypt phytophagous Megastigmus. (a) 28S, 865 bp including alignment gaps, model of evolution TPM1 + G+I, using PHYML; (b) concatenated 28S and COI, 1,620 bp including gaps, model of evolution GTR + I + G, using RAxML, data partitioned into four blocks (28S + 3 codon positions). Analyses were performed in the Geneious Primer platform, and branch support was determined using 1,000 bootstraps. Taxa associated with Leptocybe spp. in Australia or Leptocybe spp. in their invasive range are marked with black dots. Taxa in blue were eucalypt gall associates. Nodes marked with stars were additionally supported by ≥0.95 posterior probability in Bayesian analysis. Bootstrap supports <70% in ML analysis were excluded unless nodes were supported by Bayesian analysis. Tips with black triangles were taxa that successfully reproduced upon exposure to Leptocybe sp. lineage B gall under laboratory conditions (except Megastigmus sp. 5 tested and failed to reproduce). Outgroup was from genome of B. dorsalis (Bunnefeld et al., 2018). Reference sequences were obtained from Auger-Rozenberg et al.

| Molecular markers delimited Megastigmus and confirmed species associated with eucalypt galls
Except M. zvimendeli (discussed in Le et al., 2020) and M. zebrinus (discussed herein), the COI sequences obtained from eucalypt galls did not match any published sequence of phytophagous Megastigmus.
The number of species and assignment of sequences to species was consistent for the two species delimitation methods GMYC and ABGD. Eucalypt-associated species are separated from each other by a clear barcoding gap between the maximum intraspecies distance of 1.8% and the minimum interspecies distance of 5.6% (KP80 distance, Appendix S3). However, that barcode gap was not applicable for the phytophagous group. Several pairs of phytophagous species (Boivin et al., 2014) were recorded with distances of 2.6%, 2.5%, and even as low as 1.7%. This can partly be explained by the higher divergence in mtDNA of parasitic wasps than in nonparasitic wasps (Castro, Austin, & Dowton, 2002;Dowton & Austin, 1995). Figure 1 further illustrates the distribution pattern of pairwise distances between species associated with eucalypt galls (as presumptive Wales, Australia. This species has been released as a biocontrol agent against Leptocybe spp. in Israel (Mendel et al., 2017), and our recent study revealed the establishment of this species in Israel, Kenya, China, and India . Megastigmus pretorianensis was first described from South Africa as a Leptocybe spp. gall associate. In Australia, the species was recorded in association with a local Megastigmus lawsoni, established in Israel as a Leptocybe spp.
biocontrol agent (Mendel et al., 2017), represents at least three and likely four cryptic species differing in both COI mtDNA and 28S rDNA sequences. Female specimens of this group bear a single pair of setae on the scutellum, and male specimens have a distinct black patch around the median part of the transscutal articulation.
We were unable to find reliable morphological characters assisting delimitation of species in this group and therefore treat M. lawsoni as a species complex. In our collection, the most common species In the study, the COI-based species delimitation results were confirmed by a unique 28S sequence for each species and were therefore unlikely to be misinterpreted. However, the phylogeny inferred from COI data was unstable and highly unresolved, which can be explained by the saturation of COI sequences and possible violation of orthology rule. Saturation, indicated by the very low transition/transversion ratio (T i /T v = 0.50, plotted in Appendix S7), occurred when the sequences undergo excessive mutations so that the estimation of mutational changes is no longer accurate and the sequences lost their phylogenetic values (Duchene, Ho, & Holmes, 2015;Purvis & Lindell, 1997;Yang & Yoder, 1999).
Furthermore, violation of orthology, which is a factor contributing to phylogenetic incongruence (Ballard & Whitlock, 2004;Bensasson, Zhang, Hartl, & Hewitt, 2001;Som, 2014), cannot be excluded. In the study, pseudogenes have been found for M. zvimendeli and M. and Neomegastigmus were differently inferred in three analyses (Janšta et al., 2018). Askew et al. (2013) believed the use of the genus name Bootanomyia lead to paraphyletic status of Megastigmus. With genome sequencing technology, complete mitochondrial genome (Lee, Choi, Kim, Jeon, & Kim, 2018) and Ultra-Conserved Elements (Cruaud et al., 2019) have become accessible for phylogenetic study. An investigation using these tools at family level is expected to fully investigate the evolution history of COI and phylogeny of Megastigmus/Megastigmidae.

Megastigmus zebrinus was described as a gall-former from South
Africa and Australia associated with fruits of Syzygium cordatum and E. camaldulensis, respectively (Grissell, 2006). It was later recorded to associate with Leptocybe galls from Thailand (Doğanlar & Hassan, 2010), Argentina (Hernández, Aquino, Cuello, Andorno, & Botto, 2015), and South Africa, where its status was reclassified from primary galler to probable parasitoid (Klein, Hoffmann, Neser, & Dittrich-Schröder, 2015 and Israel. However, we were unable to obtain M. zebrinus specimens from Thailand and Argentina and failed to find the species in Australia. With a record of association with a noneucalypt host plant and a multicontinental distribution, M. zebrinus could be the model insect for an origin tracing study using mitochondrial marker, like Scheffer and Grissell (2003), and for understanding the host-shifting process in Megastigmus.
Specimens of M. zebrinus from Israel were strongly discolored, and their DNA was damaged during preservation. Amplification of the desired long segment of COI mtDNA and 28S rDNA failed for these specimens, but a short fragment (190 bp excluding primer binding sites) was amplified using the internal primer pair 2222-COI-F/2413-COI-R. These DNA fragments were aligned and were identical to the matching region of M. zebrinus haplotype 1. We herein argue for the occurrence of M. zebrinus in Israel based on this evidence: • The PHYML-based tree built from the obtained 190 bp COI fragment separated this species well from others (Appendix S8). In identification of parasitic Hymenoptera, diagnoses have been possible with COI fragments of as short as <150 bp (Andersen & Mills, 2012).
• For DNA barcoding, misidentification risk may result from the presence of a pseudogene that preferentially amplify over the target barcode gene (Bensasson et al., 2001;Song, Buhay, Whiting, & Crandall, 2008). It is unlikely that a pseudogene was the case here as the sequence chromatograms were free of double peaks, indicating the presence of a single PCR product. No disruption to the amino acid sequence was observed, suggesting that the gene is coding for protein and functional.
• Regarding morphology, specimens from Israel displayed the important characters found in South African M. zebrinus: two interocellar setae close to the midocellus (Grissell, 2006)

| Molecular work is required to overcome the uncertainty of Megastigmus taxonomy
Morphological data assisted in matching the species recognized in this study with those described in multiple works of Girault (Girault, 1915(Girault, , 1925(Girault, , 1929 and redescribed by Doğanlar and Hassan (2010). For example, the original description of M. eucalypti (Girault, 1915) described a species with "flagellum black, funicle 1 a little shorter than the pedicel, 1 somewhat longer than wide. Distal funicle joint a little wider than long. Sometimes, the propodeum is wholly black. Head lemon-yellow" and females with "length, 2.25 mm., exclusive of ovipositor which is extruded for a length somewhat over half that of the body." These were important characters observed in the large form of Megastigmus sp. 1.
The smaller form of Megastigmus sp. 1 is similar to the QM specimens identified as M. fieldingi by Grissell (2006) in shape, size, and body color. The scutellar setae form "3b" in Megastigmus sp. 6 and Megastigmus sp. 8 (Table 2) was an important character, "second setae of the scutellum twice closer to 3 than to 1," in the original description of M. amamoori (Girault, 1925) and M. pallidiocellus (Girault, 1929) (QM holotypes T5011 and T5021). However, for females of most species, the color and shape of collar and propodeum and visibility of mesoscutal pattern appeared to change when body size and color varied. Identification to established names was therefore confounded by the high variation in size and color of species in our study, and the poor condition of Girault's type specimens.
Bouček (1988)    could set an example of this approach.
Despite the discussed limitations, our study has successfully contributed to the understanding of species composition and species delimitation for eucalypt-associated Megastigmus. The constructed phylogeny identified several species groups of importance to Leptocybe spp. biocontrol. Data of COI mtDNA sequences clearly delimited species and can be further applied in designing diagnostic primer for use in monitoring of Leptocybe spp. biocontrol programs, while the presented morphological characters form a new baseline in understanding the morphological variation within species and between eucalypt gall-associated Megastigmus.

ACK N OWLED G M ENTS
Sincere thanks are expressed to colleagues who supplied specimens:

CO N FLI C T O F I NTE R E S T
The authors declared that there is no financial, general, and institutional competing interests regarding the publication of this article.