Development of a Nasonia vitripennis outbred laboratory population for genetic analysis

The parasitoid wasp genus Nasonia has rapidly become a genetic model system for developmental and evolutionary biology. The release of its genome sequence led to the development of high-resolution genomic tools, for both interspecific and intraspecific research, which has resulted in great advances in understanding Nasonia biology. To further advance the utility of Nasonia vitripennis as a genetic model system and to be able to fully exploit the advantages of its fully sequenced and annotated genome, we developed a genetically variable and well-characterized experimental population. In this study, we describe the establishment of the genetically diverse HVRx laboratory population from strains collected from the field in the Netherlands. We established a maintenance method that retains genetic variation over generations of culturing in the laboratory. As a characterization of its genetic composition, we provide data on the standing genetic variation and estimate the effective population size (Ne) by microsatellite analysis. A genome-wide description of polymorphism is provided through pooled resequencing, which yielded 417 331 high-quality SNPs spanning all five Nasonia chromosomes. The HVRx population and its characterization are freely available as a community resource for investigators seeking to elucidate the genetic basis of complex trait variation using the Nasonia model system.


Introduction
The parasitoid wasp genus Nasonia has rapidly become a genetic model system for developmental and evolutionary biology (Desplan & Beukeboom 2003;Werren & Loehlin 2009;Brown 2010;Godfray 2010;Muers 2010). As a genetic model, it has inherent advantages that rival even the current model insect systems with a much longer history and tradition. Its haplodiploid reproduction offers advantages of haploid genetics (e.g. lack of genetic dominance) in a complex eukaryotic system (Pultz et al. 2000). Several other important characteristics (easy husbandry, visible mutation markers, four interfertile species, short-generation time and large family sizes, long-term storage after diapause induction) have laid the foundations for the large-scale study of Nasonia. However, the most important step in the development of Nasonia as a genetic model has been the availability of its genome sequence , which largely expanded the understanding of basic biological processes in this species complex.
In this study, we describe the establishment of the genetically diverse outbred HVRx laboratory population from strains that were collected from birds nest boxes and carrion baits in the Netherlands. Furthermore, we determine its (population) genetic characteristics and establish a breeding method that retains genetic variation over generations. To characterize the HVRx population, we describe the standing genetic variation and effective population size by microsatellite analysis. Pooled resequencing further yielded 417 331 high-quality SNPs, which are used for a genome-wide, high-resolution characterization of the population polymorphism. The HVRx population and its characterization are freely available as a community resource for investigators seeking to elucidate the genetic basis of complex trait variation using the Nasonia model system, for instance as a stock for association mapping in experimental evolution studies.

HVRx population establishment and maintenance
In 2001, five Nasonia vitripennis strains were collected from different localities within 20 km radius from longitude 5.3 E and latitude 52.1 N in the Netherlands: Elspeet (strain 'B5'), Bussum (strain 'Bussum') and the Hoge Veluwe (strains 'HV55', 'HV287' and 'HV295'). Strains from the Hoge Veluwe were collected from bird nest boxes, while those from Elspeet and Bussum were collected from baits. Baits consisted of mesh screen bags, containing 25 laboratory hosts (Calliphora vicina) and a piece of liver. Each field strain was initiated by collecting all emerging female wasps from a nest box or bait and providing these with hosts. The number of founding females was not known, but host patches are usually colonized by multiple foundresses (Grillenberger et al. 2009b). After four generations of culturing (hosting about 100 females), ten mated females were randomly collected from each strain and individually provided with two hosts. From the offspring of each mated female, three virgin females and one male were isolated and mixed to initiate the HV population. Thus, a total 30 virgin females and 10 males from each strain were mixed in a 250-mL culture bottle and allowed to mate randomly. After 24 h, 200 hosts were added and the wasps were placed at 25°C, under 16:8 light:dark conditions. After emergence, the HV population was split into two replicate populations (HV1 and HV2), by randomly isolating 100 mated females for each replicate and providing them with 200 hosts per replicate. Both replicates were maintained at 15°C, 16:8 light:dark conditions.
To maximize genetic diversity, we merged the HV1 and HV2 replicates to form one single HVRx population after approximately 36 generations of separate mass culturing. From both replicates, we hosted 80 mated females on 200 hosts, and the offspring was allowed to mate randomly before dividing the wasps over four mass culture tubes (70 9 20 mm) each containing 50 fly hosts for oviposition. After 1 week, the parasitized hosts were distributed over four new mass culture tubes. After emergence, approximately 30 mated females from each mass culture tube were transferred to new mass culture tubes to initiate the next generation ( Fig. 1). This breeding procedure ensures the maintenance of a large outbred population by allowing random mating between wasps of different tubes and was designed to preserve genetic diversity across generations. The HVRx population was maintained on C. vicina pupae as hosts, at 25°C, 16: 8 light:dark conditions.

Microsatellite analysis
To determine and monitor the genetic diversity of the HVRx population over time, we isolated DNA from females from the HV1 (n = 14) and HV2 (n = 14) strains, and from the HVRx population in generations 1 (n = 12), 5 (n = 24), 10 (n = 24) and 32 (n = 24), using a standard high-salt-chloroform protocol (Maniatis et al. 1982). A set of 40 microsatellite markers, evenly distributed over the five chromosomes of N. vitripennis and combined into six multiplex sets, was used to determine genetic variation Pannebakker et al. 2010;Koevoets et al. 2012). Details of these microsatellite markers are listed in Table S1 (Supporting information). Microsatellite markers were amplified using the Qiagen multiplex PCR kit (Qiagen, Hilden, Germany) according to the manufacturer's recommendations (PCR profile: 15 min at 95°C, followed by 30 cycles of 30 s at 94°C, 1.5 min at annealing temperature and 1 min at 72°C, followed by 45 min at 72°C). Amplification was in 5 lL volumes using Applied Biosystems Veriti or Applied Biosystems 9700 thermocyclers (Applied Biosystems, Foster City, CA, USA). Fragments were diluted 400 times, separated on a Applied Biosystems 3730 DNA Analyzer and analysed using GeneMapper v4.0 (Applied Biosystems). Population genetic statistics (allelic richness R, heterozygosity H E , F ST ) were calculated using FSTAT 2.9.3 (Goudet 2001). Statistical analyses were carried out using R 2.14.1 (R Development Core Team 2012).

Theoretical estimate of effective population size
To evaluate the expected effective population size (N e ) achieved by our set-up, we simulated a population propagated under the four-tube breeding schedule that was used for maintenance of the HVRx population, as well as a population of identical census size but propagated as a single large one (i.e. in a single large tube). We modelled the sampling of a single biallelic locus in a haplodiploid system with 30 (four-replicates) or 120 (single-replicate) mated females per tube, to match the approximate number of mated females transferred each generation. For each replicate, mating between diploid females and haploid males was simulated, with females producing offspring from a single male selected with replacement from a vector of males of length rN f , where N f is the number of females and r is the observed male over female ratio. For the multiple-replicate scheme, mated females from different tubes were mixed in equal proportions into new tubes (following Fig. 1).
For each simulation, we calculated F t , the allelic differentiation with respect to the founding population, as: (Nicholson et al. 2002), where p t is the allele frequency in generation t and p 0 is the frequency in the founding population. N e was then estimated by solving the following nonlinear equation: where F t was obtained by averaging F t over 100 000 independent simulations. This simulated estimate of N e was compared with the classic theoretical prediction of inbreeding effective size in a haplodiploid populations derived by Wright (1933) as: where N f and N m are the number of females and the number of males, respectively.

Empirical estimation of effective population size
We used approximate Bayesian computation (ABC, Tavare et al. 1997;Marjoram et al. 2003) to obtain estimates of the posterior distribution of realized N e over the past 32 generations from the microsatellite data. We estimated the following summary statistics from the data in generation 5, 10 and 32: number of alleles, expected heterozygosity and F t (as defined above and averaged over all loci and alleles). This resulted in a vector of length 9, O, with each element O ij containing the observed value for summary statistic i in generation j. We then simulated 10 000 000 instances of multilocus genotypes sampled from a diploid Wright-Fisher population subject to drift over 32 generations, with proposal values of N e drawn from a uniform distribution between 0 and 1000. Starting allele frequencies were set to match those observed in the base population. In each independent simulation, the same summary statistics as above were measured in generation 5, 10 and 32, yielding a vector S of simulated summary statistics, with each element S ij corresponding to a value in O ij . Following Pritchard et al. (1999), we only accepted iterations for which | S ij -O ij |/O ij < e for each element of S ij and O ij . The threshold value e was set to 0.11 to achieve an acceptance rate of 0.001. Fig. 1 Nasonia vitripennis HVRx outbred laboratory population maintenance schedule. In each of four mass culture tubes, 40 mated female N. vitripennis wasps of generation x are provided with 50 Calliphora spp. hosts. After oviposition, the parasitized hosts are redistributed over four clean mass culture tubes, to ensure optimal mixing of the wasps over all four culture tubes and to allow mating between all wasps emerging within generation x + 1.

Genome resequencing and analysis
To resequence the HVRx base population, DNA from 20 pooled females from generation four was extracted and purified using Qiagen DNeasy columns (Qiagen). A 101bp unpaired and a 76-bp paired-end library were constructed and subsequently sequenced by the University Medical Centre Groningen (UMCG) Genome Analysis Facility according to standard Illumina protocols. The unpaired library was run on a single lane and the paired-end library on five lanes of an Illumina Genome Analyzer II (Illumina, San Diego, CA, USA). Sequence data were provided to NasoniaBase (Munoz-Torres et al. 2011) and submitted to the NCBI Short Read Archive with accession no. SRP022050.
Mean SNP densities were calculated in nonoverlapping 100 kb windows, with a minimum mean coverage of 8. Genome-wide diversity analysis was performed using the popoolation pipeline (Kofler et al. 2011). We estimated nucleotide diversity p in nonoverlapping 100 kb windows, using the same parameters augmented with a maximum coverage of 1 000 000, a fraction of the window that has a coverage between the minimum and maximum coverage of 0.6. Additional statistical analyses were carried out using R 2.14.1 (R Development Core Team 2012).
The HV1 and HV2 replicates were merged to start the HVRx population with maximum genetic diversity. We subsequently cultured the HVRx population following a breeding procedure designed to retain genetic diversity over generations (Fig. 1). To confirm the effectiveness of our breeding procedure, we monitored the population genetic characteristics of the HVRx population at generations 1, 5, 10 and 32 after establishment.

Genetic differentiation
Already at generation 1, the HVRx population had significantly diverged from both original populations (F ST = 0.04 (HV1) and 0.16 (HV2), P < 0.05, Table 1). After generation 5, the HVRx population started to diverge modestly but significantly between subsequent generations (Table 1). However, 32 generations of controlled breeding resulted in a F ST value of 0.07 between generation 1 and generation 32, which is considerably lower than that of the differentiation between the original HV1 and HV2 populations after 36 generations (Table 1).

Genetic variation
Mean heterozygosities (H E AE SE) for HVRx generations 1, 5, 10 and 32 did not change and were 0.58 AE 0.04, 0.60 AE 0.03, 0.56 AE 0.03 and 0.56 AE 0.03, respectively. An overall comparison, including also the founding populations HV1 and HV2, showed that the only significant difference in heterozygosity was between HV2 and HVRx generation 5 (Tukey's HSD critical value = 0.127, Table 2). Our results indicate that heterozygosity was effectively maintained over 32 generations. This pattern is also reflected in the per-locus heterozygosity (Table S2, Supporting information). Mean allelic richness (R AE SE) showed a slight decrease over generations 1, 5, 10 and 32 and was 3.52 AE 0.22, 3.45 AE 0.20, 3.40 AE 0.18 and 3.34 AE 0.21, respectively. An overall comparison, including also the founding populations HV1 and HV2 only, showed significant differences in allelic richness between HV2 and all HVRx generations (One-way AOV; F 5,234 = 3.87, P = 2.170e-3; Tukey HSD critical value = 0.741, Table 2). Overall, our results indicate that our controlled breeding method effectively limits the decrease in allelic richness over 32 generations.

Effective population size
The empirical estimation of effective population size (N e ), based on the diversity parameters above, yielded a posterior mode of 195 and a mean of 236 diploid individuals. The computer simulations of the different breeding schemes yielded nearly identical effective population sizes of N e = 177 and N e = 173 for the four-replicate and single-replicate mating scheme, respectively, entirely consistent with the theoretical prediction of N e = 173 and falling within 0.9 and 0.3 standard deviations of the posterior mean and mode, respectively (Fig. 2).

Genomic variation
We found considerable genetic variability in the HVRx population, with a total of 417 331 SNPs distributed over the five chromosomes, or about 1 SNP per 796.7 bp [based on a physical genome size of 332.5 Mb (Gregory, 2013)]. SNP density varied between the chromosomes (one-way AOV; F 4,1880 = 9.58, P = 1.16e-07, Table 3), with the lowest mean density per 100 kb window on chromosome 1. The pattern in SNP density differentiation is retained (one-way AOV; F 4,1880 = 1105.1, P < 2.2e-16, Table 3, Fig. 3) upon adjusting for differences in read depth between the different chromosomes (one-way AOV; F 4,1880 = 12.268, P = 7.55e-10, Table 3, Fig. S1, Supporting information). Interestingly, the highest SNP density per 100 kb window was also found on chromosome 1 at 25.3 Mb showing 438 SNPs (or 39.84 SNPS after adjusting for 10.99X read depth).
The mean nucleotide diversity per 100 kb window over all chromosomes was 0.13%, showing significant differentiation by chromosome (one-way AOV; F 4,155 = 9.97, P = 5.76e-08, Table 3), with chromosome 1 showing the lowest nucleotide diversity compared with other chromosomes. The distribution of genomic variation over the chromosomes showed similar patterns for both SNP density as for nucleotide diversity. Both measures were higher towards the tips of the chromosomes than the centres (Figs 3 and 4). Estimates of nucleotide diversity were lacking in areas where the read depths were low, such as the near the centre of the chromosomes (Fig. 4).

Discussion
The HVRx laboratory population captures a large amount of genetic variation. With over 400 000 SNPs, it is a valuable resource for investigating the genetic architecture of complex trait variation in the Nasonia genetic model system. The HVRx population provides   HVRx is the first available for the Hymenoptera genetic model Nasonia.
The HVRx population is maintained at a breeding schedule that retains the available genetic variation, evidenced by the fact that over 32 generations, both the heterozygosity H E and the allelic richness R had not changed significantly. While there was greater differentiation between both founding HV replicates (F ST = 0.26) than between European and North American Nasonia vitripennis populations (F ST = 0.15, Raychoudhury et al. 2010a), our breeding schedule effectively prevented further differentiation over generations (Table 1). Using a simple breeding procedure, our set-up results in a simulated population size of N = 177, and an effective population size N e of 236. This equals to an N e /N ratio close to 1.5, which is high compared with previously reported estimates for natural and captive populations of other species (Briscoe et al. 1992;Frankham 1995;Simoes et al. 2010). While effective population size is influenced by many factors, such as fluctuations in population size over generations, variation in family size and sex ratio, mating system and selection, work by Simoes et al. (2010) has shown that careful maintenance procedures  can efficiently reduce the loss of genetic variability in laboratory populations in organisms that generate a high number of offspring. The results described here for the HVRx population corroborate this finding.
The effective population size is an important parameter for predicting the potential response to selection (Falconer & Mackay 1996). Because the effect of drift is reduced in larger populations, allelic diversity persists longer allowing for greater and faster responses to selection (Weber 2004). While the exact response to selection is determined by intensity of selection, heritability of the trait of interest and effective population size, the empirical estimate of N e = 236 for HVRx is on upper range used in artificial selection studies (Weber & Diggins 1990;Weber 2004;Zhang & Hill 2005). The population sizes of laboratory populations also received attention in the biological control literature, where the maintenance of genetic variability is an important issue. The population size of the HVRx exceeds the general recommendation to start and maintain natural enemy cultures with an effective population size of N e > 100 (Roush 1990;Bartlett 1993;Nunney 2003).
The genetic diversity of the HVRx laboratory population is slightly lower than that observed in studies of natural N. vitripennis populations. Evaluating 12 overlapping microsatellite loci, the heterozygosity of the HVRX population after 32 generations (mean 0.52 AE 0.07) is lower than that observed in natural N. vitripennis populations directly upon collection from the field (range of H E : 0.46-0.95 Table S3, Supporting information; Grillenberger et al. 2009a;Raychoudhury et al. 2010b;Paolucci et al. 2013). It is, however, considerably higher than in the highly inbred standard laboratory strain AsymCX . A similar pattern is observed in the genome-wide nucleotide diversity of the HVRx population (p = 0.0013), which is lower than the levels of synonymous nucleotide diversity (ranging from p = 0.0016 to 0.0026) observed in~16 kb sequence from 27 genes in multiple strains of N. vitripennis (Raychoudhury et al. 2010b;Werren et al. 2010). The genome-wide nucleotide diversity of the HVRx population, however, included both synonymous and nonsynonymous polymorphisms, which result in a lower estimate than when just synonymous polymorphisms were used. Overall, the levels of nucleotide diversity in Nasonia are much lower than those observed in natural Drosophila melanogaster populations (e.g. Mackay et al. 2012;Orozco-terWengel et al. 2012), which could be explained by founder events, but also by effects more specific to haplodiploids such as the purging of deleterious mutations in haploid males or high levels of inbreeding (Hedrick & Parker 1997).
Analysis of the genomic variation in the HVRx population shows that genomic polymorphism, both in terms of SNP density as well as of nucleotide diversity, is lower in the centre of the chromosomes and higher towards the distal ends. While we have no exact data on the location of the centromeres, the fact that all Nasonia chromosomes are meta-or submetracentric (Gokhman & Westendorff 2000) strongly suggests the centromeres are at or near the centre of the chromosomes. These regions not only show reduced nucleotide diversity in the HVRx population, but also show reduced recombination rates in Nasonia Desjardins et al. 2013), consistent with the correlation between recombination rate and nucleotide diversity observed in a wide range of organisms (Begun & Aquadro 1992;Nachman 2002). Unfortunately, the nucleotide diversity data are not complete throughout the HVRx genome, due to low read depths at several regions. While we cannot exclude sequence quality issues for these regions, low read depth regions colocate with gaps in the Nvit 2.0 reference sequence ) that make reference mapping impossible in those regions.
The HVRx laboratory population is a genetically diverse N. vitripennis outbred laboratory population that fills a gap between conventional inbred Nasonia laboratory strains and free-living natural populations. It offers higher levels genetic variation than inbred laboratory strains and easier experimental manipulation than freeliving populations, by capturing intraspecific genetic variation in a tractable laboratory population that is easily maintained using the provided breeding schedule. While inbred laboratory strains are a valuable tool for studying interspecific differences in Nasonia (e.g. Desjardins et al. 2010;Werren et al. 2010;Koevoets et al. 2012;Loehlin & Werren 2012;Gibson et al. 2013;Niehuis et al. 2013), the intraspecific context is crucial to understand the evolutionary forces underlying adaptation of complex (life-history) traits. The HVRx adds a novel resource for both inter-and intraspecific research in Nasonia, thereby further extending the qualities of the Nasonia genetic model system for developmental and evolutionary research. The combination of a genetically diverse outbred population with full genomic information makes HVRx a solid starting point for studying the genomic changes underlying the evolution of complex traits, for instance as a stock population for experimental evolution in combination with association mapping studies.
L.Z. and B.A.P. conceived and performed the research and wrote the paper. S.F. performed the microsatellite analyses. A.H. performed maintenance of HVRx and prepared the material for sequencing. L.W.B. collected the original field strains. J.H., L.Z. and B.A.P. performed statistical analyses.

Data Accessibility
Sequence data were provided to NasoniaBase (http:// hymenopteragenome.org/Nasonia/) and submitted to the NCBI Short Read Archive with accession no. SRP022050. SNP data were provided to NasoniaBase and deposited in Dryad (Dryad Entry doi:10.5061/dryad. f2330).

Supporting Information
Additional Supporting Information may be found in the online version of this article: