Molecular biogeography and host relations of a parasitoid fly

Abstract Successful geographic range expansion by parasites and parasitoids may also require host range expansion. Thus, the evolutionary advantages of host specialization may trade off against the ability to exploit new host species encountered in new geographic regions. Here, we use molecular techniques and confirmed host records to examine biogeography, population divergence, and host flexibility of the parasitoid fly, Ormia ochracea (Bigot). Gravid females of this fly find their cricket hosts acoustically by eavesdropping on male cricket calling songs; these songs vary greatly among the known host species of crickets. Using both nuclear and mitochondrial genetic markers, we (a) describe the geographical distribution and subdivision of genetic variation in O. ochracea from across the continental United States, the Mexican states of Sonora and Oaxaca, and populations introduced to Hawaii; (b) demonstrate that the distribution of genetic variation among fly populations is consistent with a single widespread species with regional host specialization, rather than locally differentiated cryptic species; (c) identify the more‐probable source populations for the flies introduced to the Hawaiian islands; (d) examine genetic variation and substructure within Hawaii; (e) show that among‐population geographic, genetic, and host song distances are all correlated; and (f) discuss specialization and lability in host‐finding behavior in light of the diversity of cricket songs serving as host cues in different geographically separate populations.

The shift from katydids to crickets and mole crickets represents a significant shift in female fly hearing toward lower frequency sounds (ca. 4-5 kHz in crickets and ca. 2-3 kHz in mole crickets) than are typical of most katydids (often >> 10 kHz). Utilization of katydids with relatively low frequency calls may have facilitated the evolutionary transition to crickets and mole crickets. Certain katydid hosts of Ormiines have relatively low frequency calls, for example, ca. 5-6 kHz in Sciarasaga quadrata (host of Homotrixa alleni; Allen, Kamien, Berry, Byrne, & Hunt, 1999); ca. 7 kHz in Neoconocephalus robustus (host of O. brevicornis; Nutting, 1953); and ca. 8 kHz in Orchelimum pulchellum (one of several hosts of O. lineifrons; Shapiro, 1995).
Within Ormia, O. ochracea has been most extensively studied.
Peak sensitivity of female fly hearing closely matches or is at slightly higher frequencies than typical male host calling song (Robert, Amoroso, & Hoy, 1992). The current geographic range attributed to this species extends from Florida (Walker & Wineriter, 1991), across the southern Gulf States (Henne & Johnson, 2001), into Texas (Cade, 1975), Arizona (Sakaguchi & Gray, 2011), California (Wagner, 1996), and Mexico (Sabrosky, 1953b); throughout this range, it parasitizes various species of Gryllus field crickets (see below). In addition, O. ochracea was introduced to Hawaii by at least 1989 (Evenhuis, 2003), where it parasitizes Teleogryllus oceanicus, itself introduced to Hawaii from Australia via Oceania by at least 1877 (Kevan, 1990) and possibly earlier, perhaps facilitated by Polynesian settlement (Tinghitella, Zuk, Beveridge, & Simmons, 2011). Localized populations of O. ochracea show varying degrees of host specialization: Flies in Florida almost exclusively parasitize Gryllus rubens (Walker, 1993; Walker & Wineriter, 1991); flies in Texas primarily parasitize G. texensis (Cade, 1975); flies in Arizona regularly parasitize multiple Gryllus species (Sakaguchi & Gray, 2011); flies in southern California primarily parasitize G. lineaticeps (Wagner, 1996; Wagner & Basolo, 2007); and as noted above, Hawaiian flies parasitize T. oceanicus. Remarkably, playback experiments in Florida, Texas, California, and Hawaii, which simultaneously presented the songs of G. rubens, G. texensis, G. lineaticeps, and T. oceanicus, revealed that each fly population showed a significant (but not exclusive) preference for the song of its primary local host species of cricket (Gray, Banuelos, Walker, Cade, & Zuk, 2007). This suggests an even further degree of host specialization in these flies, possibly indicative of cryptic host races or species, as has been found in other Tachinids (Smith et al., 2008; Smith, Woodley, Janzen, Hallwachs, & Hebert, 2006).
Determining the extent to which geographic and host range subdivision is coupled with genetic subdivision is thus one of the goals of this study.
Successful establishment of O. ochracea in Hawaii represents a significant expansion of both the geographic and host range of the fly. How can such a specialist invade, switch to a novel host with a strongly divergent song structure, and in the course of a few decades come to prefer that novel host's song to the songs of ancestral hosts? Two of our aims in this paper are to use mitochondrial and nuclear markers both to examine genetic variation within Hawaii and to identify the more-likely continental source population(s) of those Hawaiian flies, and thereby the most likely types of recent ancestral host songs. This necessitates broad sampling of continental populations, and we therefore expand upon the previous work in the United States and include flies from populations in both northern and southern Mexico, as well as catalog the confirmed host species and their songs in each of these areas. We apply standard phylogeographic analyses to mitochondrial DNA sequence data, including outgroup species of Ormia, and we adopt a population genetic approach to analysis of microsatellite nuclear markers.

| Fly collection
We collected flies at mesh screen and/or bottle traps using playbacks of cricket songs (Walker, 1989). The songs played to attract flies varied among populations and across years, but for mainland sites always included songs of 2-4 species of crickets at least one of which was a known local host; for Hawaiian sites, some collections (WHC, 2003) were made with playbacks of four cricket songs (see Gray et al., 2007), whereas later collections used T. oceanicus song (the only Hawaiian host). We also collected a small number of flies at lights or as they emerged from field-collected crickets. Table 1 provides details of locations and dates of sampling. Collected flies were preserved in ethanol until DNA extraction and further analysis. We extracted DNA using a Qiagen DNeasy tissue kit according to the manufacturer's instructions.
We used entire flies as source tissue for all of the mainland and 13 of the Hawaiian flies, and head and thorax tissue for the remainder of the Hawaiian flies. In theory, the whole tissue extractions could include DNA from larvae, although the amounts of such DNA would be trivial compared to maternal DNA. We quantified DNA using a NanoDrop system and adjusted concentrations to between 20 and 75 ng/μl.

| Genetic markers and analysis
We analyzed population structure using both mitochondrial and nuclear markers. For mtDNA, we analyzed a section of Cytochrome C Oxidase subunit I (hereafter COI) PCR amplified in two overlapping fragments with "universal" primer pairs Jerry-Pat and Ron-Nancy (Simon et al., 1994), resulting in 1,111 bp after alignment. In addition, we developed nuclear microsatellite markers de novo for this project. Marker discovery was performed by 454 sequencing at the Cornell University Life Sciences Core Laboratories Center with further validation done by SLB and HDK. We identified and tested 17 msat markers from this dataset consisting of 3, 4, and 6 bp repeats. PCR conditions followed a "touchdown" protocol of 95°C for 40 s, 66°C for 45 s, and 72°C for 45 s. The annealing temperature was reduced by one degree every cycle for the first seven cycles. Cycles 8-35 followed a pattern of 95°C for 40 s, 58°C for 45 s, and 72°C for 45 s. PCR products were stored at −20°C until genotyped. Individuals were genotyped at microsatellite loci by the University of Minnesota Genomics Center on an Applied Biosystems 3730xl DNA Analyzer. We scored alleles for fragment size manually using Peak Scanner 2.0 software.
Multiple independent analysts scored the same products to assure veracity of the calls. If no clear designation could be made or alleles did not amplify, we scored the data as missing.
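The touchdown annealing schedule described above can be written out cycle by cycle. The following Python sketch is illustrative only: the function name and its parameters are hypothetical, and it assumes that cycle 1 anneals at 66°C with a one-degree drop in each of the following touchdown cycles, which is one reading of the protocol rather than a confirmed thermocycler program.

```python
def touchdown_annealing_schedule(start_temp=66, touchdown_cycles=7,
                                 hold_temp=58, total_cycles=35):
    """Return the annealing temperature (deg C) for each PCR cycle, in order.

    Assumption: cycle 1 anneals at start_temp and each subsequent
    touchdown cycle is one degree lower; later cycles use hold_temp.
    """
    schedule = []
    for cycle in range(1, total_cycles + 1):
        if cycle <= touchdown_cycles:
            schedule.append(start_temp - (cycle - 1))
        else:
            schedule.append(hold_temp)
    return schedule


schedule = touchdown_annealing_schedule()
print(schedule[:8])  # annealing temperatures for cycles 1-8
```

Denaturation (95°C, 40 s) and extension (72°C, 45 s) are constant across cycles, so only the annealing step is modeled here.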

| Population genetics analyses
Prior to analysis of microsatellite fragments, we filtered individuals and loci for missing data. A strict cutoff of >25% missing data led to the exclusion of six loci. Following this filter, we excluded any individuals with missing data at three or more loci, resulting in the removal of 52 samples. The final dataset included 274 individuals genotyped at 11 loci with between 6 and 17 alleles per locus (Table 2). To estimate allelic richness and the number of private alleles accurately given unequal sample sizes per population, we performed a rarefaction analysis in HP-Rare (Kalinowski, 2005), using the population with the smallest sample size (Oaxaca, 13 samples) to calculate adjusted values.
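The two-step missing-data filter can be sketched as follows. This is a minimal Python illustration, not the code used in the study: the data layout (a dictionary of individuals mapping to per-locus genotype calls, with None for missing calls), the function name, and the toy individuals and loci are all assumptions made for the example.

```python
def filter_missing(genotypes, locus_cutoff=0.25, max_missing_loci=2):
    """Drop loci with more than locus_cutoff missing calls, then drop
    individuals missing calls at more than max_missing_loci retained loci."""
    individuals = list(genotypes)
    loci = sorted({locus for calls in genotypes.values() for locus in calls})

    # Step 1: keep loci whose fraction of missing calls is within the cutoff.
    kept_loci = []
    for locus in loci:
        missing = sum(1 for ind in individuals
                      if genotypes[ind].get(locus) is None)
        if missing / len(individuals) <= locus_cutoff:
            kept_loci.append(locus)

    # Step 2: keep individuals with fewer than three missing retained loci.
    kept = {}
    for ind in individuals:
        n_missing = sum(1 for locus in kept_loci
                        if genotypes[ind].get(locus) is None)
        if n_missing <= max_missing_loci:
            kept[ind] = {locus: genotypes[ind].get(locus) for locus in kept_loci}
    return kept, kept_loci


# Hypothetical toy data: allele sizes in bp, None marks a missing call.
toy = {
    "ind1": {"A": (150, 154), "B": (200, 200), "C": (98, 101), "D": (120, 126), "E": (77, 80)},
    "ind2": {"A": (150, 150), "B": (200, 204), "C": (98, 98), "D": (120, 120), "E": None},
    "ind3": {"A": (154, 154), "B": (204, 204), "C": (101, 101), "D": (126, 126), "E": None},
    "ind4": {"A": None, "B": None, "C": None, "D": (120, 126), "E": (77, 77)},
}
kept, kept_loci = filter_missing(toy)
print(kept_loci, sorted(kept))
```

In the toy data, locus E is dropped first (missing in half of the individuals), and ind4 is then removed because it lacks calls at three of the retained loci.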
We visualized population genetic variation using a discriminant function analysis of principal components (DAPC) with 80 principal components and four discriminant functions using the adegenet (Jombart, 2008; Jombart & Ahmed, 2011) and pegas (Paradis, 2010) packages in R.
To visualize genetic structure, we implemented the Bayesian analysis program STRUCTURE v2.3.4 using an admixture model with correlated allele frequencies and without using source population as a prior. We used a burn-in of 50,000 steps and 100,000 MCMC iterations. We conducted separate runs for the full dataset, a mainland dataset with the Hawaiian samples excluded, and a dataset of Hawaiian samples only. For the 8-locus dataset, we performed 20 runs each for k = 1-9; for the 11-locus dataset, we performed five runs each for k = 2-9. To infer the likely number of genetic clusters, we used both the Ln estimated probability of the data from STRUCTURE and the Evanno method utilizing Δk (Evanno, Regnaut, & Goudet, 2005).
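For readers unfamiliar with the Evanno statistic, the following Python sketch shows one way ΔK can be computed from replicate STRUCTURE runs: the absolute second-order difference in mean Ln P(D) across successive K, divided by the standard deviation of Ln P(D) among runs at that K. The function name is hypothetical, the second difference is taken on run means as a simplification, and the log-likelihood values are invented for illustration; they are not results from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbors K-1 and K+1.

    lnp_by_k maps K to a list of Ln P(D) values from replicate runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            sd = stdev(lnp_by_k[k])
            if sd > 0:  # delta K is undefined when replicate runs are identical
                delta[k] = second_diff / sd
    return delta


# Invented replicate log-likelihoods (three runs per K), for illustration only.
toy_runs = {
    1: [-7000.0, -7002.0, -6998.0],
    2: [-6300.0, -6302.0, -6298.0],
    3: [-6100.0, -6105.0, -6095.0],
    4: [-6000.0, -6010.0, -5990.0],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Because ΔK needs both neighbors, it cannot be evaluated at the smallest or largest K tested, which is one reason to inspect Ln P(D) alongside it.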
We calculated pairwise estimates of FST (Weir & Cockerham, 1984) and Nei's genetic distance between populations using the R packages adegenet and ade4 (Chessel, Dufour, & Thioulouse, 2004), and we calculated expected and observed heterozygosity using adegenet. We tested if loci met Hardy-Weinberg expectations within each population (Hawaiian islands pooled) using an exact permutation test (Table 2).
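As a reminder of what the two heterozygosity measures represent, the sketch below computes both for a single locus in a single population. It is a simplified Python illustration with a hypothetical function name and invented genotypes; expected heterozygosity is computed as 1 minus the sum of squared allele frequencies without a small-sample correction, so values may differ slightly from those reported by adegenet.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes is a list of (allele_a, allele_b) tuples, one per diploid
    individual, with no missing calls.
    """
    n = len(genotypes)
    # Observed: fraction of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Expected under Hardy-Weinberg: 1 - sum of squared allele frequencies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    expected = 1 - sum((c / total_alleles) ** 2 for c in counts.values())
    return observed, expected


# Invented genotypes (fragment sizes in bp) for four individuals.
toy_locus = [(150, 150), (150, 154), (154, 154), (150, 154)]
h_obs, h_exp = heterozygosity(toy_locus)
print(h_obs, h_exp)
```

A large gap between the two values at a locus is the kind of pattern that the Hardy-Weinberg test formalizes.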
To test for bottlenecks during a potential range expansion,

| Host ranges and songs
To provide context for understanding the degree of host specialization, we present in this paper the songs of confirmed hosts in each of the geographic regions studied. We present only hosts confirmed to be naturally parasitized by the development of O. ochracea from field-collected crickets. We suspect that a few additional host species will be confirmed in the United States, especially if the species is only occasionally parasitized, and we expect that many more spe- which would be preferable but is not currently possible. Our coding of song characters is only one of many possible coding schemes; our goal was to capture the major structural differences among cricket songs (Alexander, 1962) while attempting to have song features coded in such a way that comparisons across species represent "homologous" traits in song space, see Desutter-Grandcolas and Robillard (2003).
We used Mantel tests implemented in the R package ecodist (Goslee & Urban, 2007) to relate the cricket host song distances

| Nuclear and mitochondrial genetics
Three loci (Oo022, Oo024, and Oo035) showed significant departure from Hardy-Weinberg expectations in five or more populations (Table 2); subsequent analyses were done both including and excluding these three loci. Following filtration at missing data cutoffs, 274 individuals and either 11 or 8 loci (see above) were included in the final msat dataset, with 1.86% data missing. Heterozygosity across all individuals was 50.9% (11 loci) or 56.0% (8 loci). The Hawaiian populations showed a drastic decrease in heterozygosity (Table 3).
The rarefaction analysis also suggested a substantial decrease in both total and private allelic diversity within the Hawaiian populations (Table 2).
Analysis of Nei's genetic distances documented a clear split between Hawaiian and mainland populations (Table 4).
For the 8-locus dataset, with all samples, STRUCTURE analyses indicated the strongest support for k = 2 genetic clusters (mean LnP(K) = −6286.49) separating Hawaiian from mainland populations (Figure 2); however, support for k = 3 clusters was also high (mean LnP(K) = −6028.0), which further divided the mainland populations into eastern and western subsets (Figure 2). The Evanno method indicated the strongest support for k = 2 clusters (Table S1).

| Host range and song structures
Confirmed host species, geographic range information, and host calling song type, frequency, pulse rate, and pulses/chirp are presented in Table 5. Songs of confirmed host species vary dramatically, from simple chirps to complex trills; see waveform oscillograms and frequency spectrograms in Figures 6 and 7, respectively (prepared using the R package seewave).
FIGURE 7 Spectrogram representations of 0.2 s of song from confirmed host species showing fine-scale song structure (pulses)
Mantel tests showed strong associations between geographic, genetic, and song distances (Figures 9 and 10). To explore these patterns further, we repeated the analyses excluding the comparisons based on Hawaiian samples, that is, Mantel tests just for mainland population comparisons. Using average song distances among common hosts, song distance was correlated with genetic distance both when considering all comparisons and when considering only mainland comparisons (Figure 9c); the same was true when using minimum song distance among common hosts (Figure 10a), but not minimum song distance among any hosts (Figure 10b).
Partial Mantel tests gave somewhat inconsistent results (Table 6). Across all analyses, generally, it appears that the correlation between genetic and geographic distances persists even after conditioning on song distance. Song distance was significantly correlated with genetic distance, after conditioning on geographic distance, only for mainland comparisons using average song distance among commonly used hosts. The same pattern was not significant but somewhat suggested for all comparisons using average song distance among commonly used hosts, and for mainland comparisons using minimum song distance among commonly used hosts. Using minimum song distances among any hosts resulted in no relationship, or even a negative relationship, between song distance and genetic distance after conditioning on geographic distance.
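The logic of a simple (non-partial) Mantel test can be sketched in a few lines. The Python example below is an illustration under stated assumptions, not a reimplementation of the ecodist routine used here: the function names are hypothetical, the test is one-tailed for a positive association, the correlation is Pearson's r on the lower triangles, and the two distance matrices are invented.

```python
import random
from statistics import mean


def _lower_triangle(matrix, order):
    """Flatten the lower triangle of a square matrix after reordering rows/columns."""
    return [matrix[order[i]][order[j]]
            for i in range(len(order)) for j in range(i)]


def _pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-tailed permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    b_vec = _lower_triangle(dist_b, identity)
    observed = _pearson(_lower_triangle(dist_a, identity), b_vec)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of dist_a together
        if _pearson(_lower_triangle(dist_a, perm), b_vec) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Invented example: five populations on a line, genetic distance
# exactly proportional to geographic distance (km).
geo = [[abs(i - j) * 1000.0 for j in range(5)] for i in range(5)]
gen = [[abs(i - j) * 0.02 for j in range(5)] for i in range(5)]
r, p = mantel(geo, gen)
print(round(r, 3), p)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent; shuffling individual cells would overstate significance. A partial Mantel test extends this idea by first removing the effect of a third (conditional) matrix.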

| Discussion
Our results suggest the following: (a) O. ochracea is a single widespread species with regional host specialization, not a complex of  (Table 4) and the DAPC (Figure 1a) show that these two populations are genetically rather homogenous.
Both the mtDNA and msat data also inform the broader geographic history of the fly within North America. There is a clear east-west differentiation among samples, consistent with isolation by distance (Figure 9a). Moreover, the pattern of allelic variation in the msat loci (e.g., Figures 4 and S3-S12) suggests serial founder effects as flies colonized the western continental United States and then Hawaii; this interpretation is supported by Garza and Williamson's M (Figure 5). The mtDNA similarly suggests that the older fly lineages are to be found within the southeastern US populations (Figures 1b and S13). In this light, it is interesting to note that Florida is home to two Gryllus species, G. ovisopis and G. cayensis, which lack a normal calling song (Gray, Hormozi, Libby, & Cohen, 2018; Walker, 1974, 2001), range. This may be true in a strict sense, but frequency is clearly not the only song recognition feature. Multiple studies have shown that the temporal pattern of sound pulses is also important (Gray & Cade, 1999; Sakaguchi & Gray, 2011; Wagner, 1996; Wagner & Basolo, 2007; Walker, 1993). Moreover, fly populations prefer the temporal structure of their most common host species, even when dominant frequencies are similar (Gray et al., 2007). Perhaps most remarkably, Gray et al. (2007) (Vincent & Bertram, 2009) and species not normally used as hosts (Thomson, Vincent, & Bertram, 2012) including Acheta domesticus (Paur & Gray, 2011a, 2011b; Wineriter & Walker, 1990), which is more distantly related to Gryllus than is Teleogryllus (D. A. Gray, D. B. Weissman, E. M. Lemmon, A. R. Lemmon, unpublished data). This latitude probably results from the generalized nature of the cricket immune encapsulation response (Vinson, 1990), which is exploited by Ormiines to develop a respiratory spiracle. Given this latitude, we expect that physiological compatibility with T. oceanicus was unlikely to be a significant factor in terms of host suitability.
Our results suggest that host specialization in O. ochracea is not at odds with rapid exploitation of novel hosts, as might be expected from evolutionary theory (Jaenike, 1990;Kelley & Farrell, 1998;Raia & Fortelius, 2013), despite associations between song divergence and genetic divergence independent of geography. But how can highly regional host song specificity (Gray et al., 2007), even to the point of flies having song preferences for certain intraspecific song variants (Gray & Cade, 1999;Sakaguchi & Gray, 2011;Wagner, 1996;Wagner & Basolo, 2007), be compatible with flexible and rapid adoption of novel hosts? We expect that behavioral plasticity coupled with local host learning (Paur & Gray, 2011a) may be the mechanism that enables flies to escape the "dead end" of specialization.

ACKNOWLEDGMENTS
We are grateful for the advice and/or assistance of Steve Bogdanowicz; Thomas Chaffee; Thomas J. Walker; and David B. Weissman. Three reviewers made some really outstanding suggestions which greatly improved the final version.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTIONS
DAG, SLB, MZ, and WHC conceived of the study and collected flies; DAG performed the mtDNA sequencing; SLB and HDK performed the msat amplification and analysis; and all authors contributed to the writing and editing of the manuscript.

DATA AVAILABILITY STATEMENT
The COI sequence data have been deposited in GenBank with ac-
Note: Comparisons in bold are significant at p < .05. Abbreviations: Gen, genetic distance (Nei); Geo, geographic distance (km); M1, response matrix; M2, explanatory matrix; M3, conditional matrix; Song, song distance.