A new method of metabarcoding Microsporidia and their hosts reveals high levels of microsporidian infections in mosquitoes (Culicidae)

Abstract DNA metabarcoding offers new perspectives, especially with regard to the high‐throughput identification and diagnostics of pathogens. Microsporidia are an example of widely distributed, opportunistic and pathogenic microorganisms in which molecular identification is important for both environmental research and clinical diagnostics. We have developed a method for parallel detection of both microsporidian infection and the host species. We designed new primer sets: one specific for the classical Microsporidia (targeting the hypervariable V5 region of small subunit [ssu] rDNA), and a second one targeting a shortened fragment of the COI gene (standard metazoan DNA‐barcode); both markers are well suited for next generation sequencing. Analysis of the ssu rDNA data set representing 607 microsporidian species (120 genera) indicated that the V5 region enables identification of >98% species in the data set (596/607). To test the method, we used microsporidians that infect mosquitoes in natural populations. Using mini‐COI data, all field‐collected mosquitoes were unambiguously assigned to seven species; among them almost 60% of specimens were positive for at least 11 different microsporidian species, including a new microsporidian ssu rDNA sequence (Microsporidium sp. PL01). Phylogenetic analysis showed that this species belongs to one of the two main clades in the Terresporidia. We found a high rate of microsporidian co‐infections (9.4%). The numbers of sequence reads for the operational taxonomic units suggest that the occurrence of Nosema spp. in co‐infections could benefit them; however, this observation should be retested using a more intensive host sampling. Our results show that DNA barcoding is a rapid and cost‐effective method for deciphering sample diversity in greater resolution, including the hidden biodiversity that may be overlooked using classical methodology.


| INTRODUC TI ON
proposed next-generation biodiversity assessment using DNA metabarcoding. Since then automated identification of multiple species from a single bulk sample has become more accessible through increased availability of next-generation sequencing (NGS) platforms and bioinformatic tools for NGS data analysis (e.g., Krehenwinkel, Pomerantz, & Prost, 2019;Ruppert, Kline, & Rahman, 2019;Zepeda Mendoza, Sicheritz-Pontén, & Thomas Gilbert, 2015). DNA metabarcoding, which allows for rapid and cost-effective identification of the biodiversity of the entire sample across a wide range of taxa and habitats, has proved its usefulness for prokaryotic and eukaryotic species identification. Many examples of this approach have been provided in biodiversity assessment of bacteria (e.g., Adamczyk et al., 2019;Cordier, 2019;Thapa, Zhang, & Allen, 2019), fungi (e.g., Hu et al., 2019;Luis et al., 2019;Xiao et al., 2019), plants (e.g., Bell et al., 2019Tnah et al., 2019;Veldman et al., 2020) and animals from micro-to megafauna (e.g., Dopheide et al., 2019;Head et al., 2018;Lynggaard et al., 2019). However, there are still groups of organisms for which metabarcoding could offer new perspectives, especially with regard to high-throughput identification and diagnostics of pathogens and pests (Tedersoo,  The microsporidian spores are the only developmental stage with the ability to survive outside the host cell (Franzen, 2004). Infectious spores accumulate in the environment, when they are released in faeces or when an infected host dies (Campbell, van Frankenhuyzen, & Smith, 2007;Goertz & Hoch, 2008;Hoch, D'Amico, Solter, Zubrik, & McManus, 2008).
The real species diversity in this group is probably largely unknown, as Microsporidia are generally studied as zoonotic and/ or waterborne agents of human disease or veterinary parasites.
Standard Microsporidia detection methods are based on ultrastructural assessment of an infected material. Light microscopy-based methods mainly consist of the detection of a thick chitin wall of spores using different stains (Field, Hing, Milliken, & Marriott, 1993;Ignatius et al., 1997;Moura et al., 1997;Peterson, Spitsbergen, Feist, & Kent, 2011;van Gool et al., 1993). Staining for light microscopy rarely enables species identification, and electron microscopy has low sensitivity because only a small amount of sample can be examined (Weber, Bryan, Schwartz, & Owen, 1994). To identify pathogenic microsporidia in clinical samples, antigen-based detection assays are used to recognize characteristic pathogen-specific antigens, mostly located in the spore wall or the polar tube (del Aguila et al., 1998;Furuya et al., 2008;Luján et al., 1998;Zhang et al., 2005).
However, in the case of co-infection with different microsporidian species, such an approach based on direct amplicon sequencing using a conventional method could fail due to ambiguous Sanger sequencing results or can yield false results if one of the species dominates in the sample.
The first attempt to use high-throughput sequencing for microsporidian DNA detection was performed by Williams et al. (2018).
They applied the standard V1F and 530R primers to check the diversity of Microsporidia in environmental samples. Using this primer set, they were able to uncover a previously unknown microsporidian diversity; however, their raw data contained mostly nontarget sequences (see figure 2 in Williams et al., 2018). The high percentage of nonmicrosporidian sequences found in their study suggests that the standard V1F/530R primer set is not specific enough to amplify exclusively microsporidian DNA.
The present study aimed to develop a new molecular method for the rapid and sensitive detection of microsporidia using a DNA marker better suited for an NGS approach. Additionally, we present a new primer set to amplify the short DNA barcode based on the mitochondrial cytochrome c oxidase subunit I (COI) gene for parallel identification of the microsporidian host species. As a model, we used microsporidia that infect mosquitoes (Culicidae). Microsporidia are common in mosquitoes: over 90 microsporidian species have been recorded worldwide from this host. In addition, some microsporidians parasitic in mosquitoes also infect other species of insects, crustaceans and vertebrates (Andreadis, 2007;Becnel, White, & Shapiro, 2005;Vossbrinck et al., 2004).

| DNA extraction
Before DNA extraction, mosquitoes were washed in 96% ethyl alcohol (washing solution). Individual mosquitoes were placed in 1.5-ml Eppendorf tubes until DNA extraction, while the washing solution was filtered through the MF-Millipore Membrane Filter, 0.22 µm pore size (Merck). The filter was then cut and placed in an Eppendorf tube containing 180 µl ATL lysis buffer (Qiagen) and incubated with 0.2 mg of Proteinase K (Bio Basic) for 48 hr at 56°C. Later, 100 µl of the lysate was used for column-based DNA extraction using the DNeasy Blood & Tissue Kit (Qiagen) according to the manufacturer's protocol for animal tissues.
Genomic DNA was isolated from spores and mosquitoes using a modified ammonium hydroxide method (Rijpkema & Bruinink, 1996).
Two hundred microlitres of 0.7 M ammonium hydroxide (POCH S.A.) was added to 100 µl of the spore suspension or to one mosquito individual and homogenized for 30 s using the Pellet Cordless Motor Pellet (DWK Life Sciences) with disposable micro pestles (Scientific Specialties Inc.). Samples were incubated for 20 min at 99°C with shaking and the tubes were then opened and further left under the same conditions to concentrate the lysate to about 100µl volumes.
The samples were then centrifuged for 5 min at 14,100 x g, and the supernatant was collected. Before PCR amplification, DNA extracts from mosquitoes were normalized with sterile water to a concentration of ~10 ng/µl.

| Standard molecular approach for microsporidia detection
We compared the performance of our new metabarcoding approach with the standard PCR-based detection method. We amplified and sequenced the V1-V3 region of ssu rDNA using primers V1F (CACCAGGTTGATTCTGCCTGAC) and 530R (CCGCGGCKGCTGGCAC). PCRs were prepared in two technical replicates, each in a total volume of 10 µl containing Hot FIREPol DNA Polymerase (Solis BioDyne), 0.25 µm of each primer and 1 µl of template DNA. The amplification programme was as follows: 12 min at 95°C, followed by 35 cycles of 15 s at 95°C, 1 min at 60°C and 1 min at 72°C, with a final extension step at 72°C for 10 min. DNA isolated from 100 spores of E. intestinalis was used as a positive control. After amplification, technical replications were pooled.
Standard COI-barcode (Hebert, Cywinska, Ball, & deWaard, 2003) was amplified using primers bcdF01 and bcdR04 (Dabert, Witalinski, Kazmierski, Olszanowski, & Dabert, 2010). PCRs were prepared in a volume of 10 µl containing Hot FIREPol DNA Polymerase, 0.5 µm of each primer and 1 µl of template DNA. The PCR programme was as follows: 12 min at 95°C, followed by 35 cycles of 15 s at 95°C, 30 s at 50°C and 1 min at 72°C, with a final extension step at 72°C for 5 min. DNA isolated from Aedes aegypti was used as a positive control.
Five microlitres of the PCR was analysed by electrophoresis on a 1.5% agarose gel stained with GelRed (Biotium) according to the manufacturer's instruction. Samples containing visible bands were purified with Escherichia coli exonuclease I and FastAP Alkaline Phosphatase (Thermo Scientific) and sequenced using a BigDye version 3.1 kit and ABI Prism 3130XL Genetic Analyzer (Applied Biosystems), following the manufacturer's instructions. Sequence chromatograms were checked for accuracy and, if necessary, manually edited in geneious R11.1.5 (Biomatters).

| Designing new primers for microsporidia detection
A new microsporidia-specific primer set was developed based on ssu rDNA sequence data published in GenBank. In total, 1,133 sequences representing 120 genera belonging to the classical Microsporidia were aligned using mafft and L-INS-i algorithm (Katoh, Misawa, Kuma, & Miyata, 2002;Katoh & Standley, 2013) as implemented in geneious R11.1.5. Primers were manually designed to cover an ~200bp fragment coding for the helices 27-34 in the Heterosporis anguillarum ssu rDNA secondary structure (Tsai, Kou, Lo, & Wang, 2002); according to Neefs, Van de Peer, De Rijk, Chapelle, and de Wachter (1993) this region covers the V5 hypervariable region of ssu rRNA (V5 region). The primer sequences were analysed in oligo analyzer The percentage identities of the aligned V1-V3 and V5 regions were estimated using the Kolmogorov-Smirnov statistical test in the genedoc version 2.7 sequence editing tool (Nicholas & McClain, 1995). The final data set for V1-V3 and V5 alignments used in the analysis consisted of 649 and 607 available microsporidian sequences, respectively. Duplicate reads were extracted from the data set using the default settings in Dᴇᴅᴜᴘᴇ Dᴜᴘʟɪᴄᴀᴛᴇ Rᴇᴀᴅ Rᴇᴍᴏᴠᴇʀ version 37.64 implemented in geneious R11.1.5.

| Amplification of the V5 region and mini-COI for NGS sequencing
The microsporidian V5 region was amplified using primers CM-V5F (GATTAGANACCNNNGTAGTTC) and CM-V5R (TAANCAGCACAMTCCACTC) developed in this study. Mosquito species were determined by DNA-barcoding using the shortened (373-bp) fragment of the mitochondrial COI gene (hereafter: mini-COI) covering 5' fragment of the standard DNA-barcode.
The mini-COI was amplified using a primer pair developed in this study: bcdF01 (CATTTTCHACTAAYCATAARGATATTGG) (Dabert et al., 2010) and bcdR06 (GGDGGRTAHACAGTYCAHCCNGT). All PCR primers for NGS sequencing used in this study were tailed at 5' ends with dual-indexed Ion Torrent adapters (forward tail 5'-CCATCTCATCCCTGCGTGTCTCCGACTCAG-index-GAT, reverse tail 5'-CCTCTCTATGGGCAGTCGGTGAT-index) for sequencing using the Ion Torrent system (Life Technologies).
The V5 region was amplified in two technical replications, each in a total volume of 10 µl containing Hot FIREPol DNA Polymerase, 0.25 µm of each tailed primer and 1 µl of template DNA. The PCR programme was as follows: 12 min at 95°C, followed by 40 cycles of 15 s at 95°C, 30 s at 50°C and 30 s at 72°C, with a final extension step at 72°C for 5 min.
PCR amplification of mini-COI was performed in a volume of 5 µl containing Hot FIREPol DNA Polymerase, 0.25 µm of each tailed primer and 1 µl of template DNA. The PCR programme was as follows: 12 min at 95°C, followed by 35 cycles of 15 s at 95°C, 30 s at 50°C and 45 s at 72°C, with a final extension step at 72°C for 5 min.

| Library construction and NGS sequencing
The V5 region and mini-COI libraries were prepared separately ( Figure 1). For each PCR, 3 µl was electrophoresed on a 2% agarose gel to check amplification efficiency. The V5 region amplicons were pooled based on DNA band intensities. The sample volumes ranged from 5 µl, where the amplicons were invisible on the gel, to 1 µl in the case of brighter bands. The rare samples which had very high intensity were diluted 100-fold with sterile water before pooling.
For the mini-COI library, all samples were pooled using 1 µl of each PCR. The V5 region and mini-COI libraries were purified separately using a 2% E-Gel SizeSelect II Agarose Gels system (Invitrogen), ac-

| Read processing and data analysis
Raw sequence data were pre-filtered by Iᴏɴ Tᴏʀʀᴇɴᴛ Sᴜɪᴛᴇ software version 5.10.1 (Life Technologies) to remove polyclonal and lowquality sequences. Further bioinformatic analysis was conducted using a fastq data and custom workflow. Sequence reads shorter than180 bp were removed from the data set. Leading and trailing low-quality bases or Ns were removed using trimmomatic version 0.39 (Bolger, Lohse, & Usadel, 2014). The fastx toolkit (Hannon, 2010) was used to extract sequences with a minimum of 50% of bases with a quality score ≥ 25. Quality filtered sequences were separated into individual combinations of indexes in geneious R11.1.5. Chimeras were removed using the default settings in uchime2 version 4.2.40 (Edgar, 2016) and SILVA database for ARB for SSU rRNAs version 132 (Glöckner et al., 2017;Quast et al., 2013;Yilmaz et al., 2014) as implemented in geneious R11.1.5. Next, the sequences were trimmed at the 5' and 3' ends to exclude PCR primers. F I G U R E 1 Protocol for metabarcoding of microsporidia and their hosts. Total genomic DNA is extracted from each previously washed host specimen and from the medium used for host preservation and washing. The mini-COI barcode and the V5 region of ssu rDNA are amplified using PCR primers fused with doubleindexed NGS adaptors. The mini-COI and V5 libraries are prepared separately, and after quality control, pooled in a ratio of 1:10, respectively. The libraries are NGS-sequenced using a threshold of at least 10,000 reads per sample. After data quality filtering, the OTUs are clustered, and then compared to databases with reference sequences Operational taxonomic unit (OTU) clustering was done in usearch version 11.0.667 (Edgar, 2010). Singletons (<10 reads) were removed, then OTUs were clustered from the sequences whose abundance exceeded a threshold of 10 counts using the cluster _ otus algorithm (Edgar, 2013). The OTU consensus sequences were compared to GenBank using blastn (Zhang, Schwartz, Wagner, & Miller, 2000) optimized for highly similar sequences (megablast algorithm) (Morgulis et al., 2008).
We used a 97% identity threshold to determine mosquito species, and 100% identities for the identification of microsporidian species. The sequences generated are available in GenBank under accession nos. MT001301-MT001427, MT015707-MT015901 and MT075548-MT075550 (Table S1).

| Phylogenetic analyses
To confirm the taxonomic position of the Microsporidia detected in field-collected mosquitoes, we used 122 ssu rDNA sequences representing all known clades of the classical Microsporidia (74), Metchnikovellida (four) and Chytridopsida (one). As close outgroups, we used Rozellomycota (six) and Fungi (28). For details concerning all ingroup and outgroup taxa see Table S2. Sequences were aligned using the l-ins-i algorithm in mafft version 7.388 (Katoh et al., 2002;Katoh & Standley, 2013) as implemented in geneious R11.1.5. The final alignment consisted of 2,718 nucleotide positions. The best fit model of DNA evolution (GTR + I + G) was chosen by partitionfinder2 (Lanfear, Calcott, Ho, & Guindon, 2012). Phylogenetic trees were reconstructed using maximum likelihood (ML) in garli version 2.0 (Zwickl, 2006) and Bayesian inference (BI) in mrbayes 3.2.6 (Ronquist et al., 2012). Each BI run of four independent chains was performed in 2 × 10,000,000 generations, and the trees were sampled every 1,000 generations. The final consensus tree was generated after discarding the burn-in fraction of 0.25% of initial trees; the average standard deviation of split frequencies dropped below .003.
Bootstrap support for the ML tree was calculated by using 1,000 data replicates as implemented in garli. The trees were edited in figtree 1.4.4 (Rambaut, 2018) and further in corel draw X4.

| Statistical analyses
Sequence reads from microsporidia were normalized via the otutab _ rare algorithm (Edgar & Flyvbjerg, 2018) to compare sample diversities. The diversity of OTUs in individual samples were calculated using the alpha _ div algorithm (Edgar & Flyvbjerg, 2018). Due to the lack of a near-normal distribution in any sample (p < .05) in the initial Shapiro-Wilk test (Shapiro & Wilk, 1965), the nonparametric Kruskal-Wallis test (Kruskal & Wallis, 1952) was used to compare technical repetition series of spore dilutions and control co-infections, as well as to compare microsporidian species detected in mosquitoes. Rarefaction curves were generated using past software version 3.23 (Hammer, Harper, & Ryan, 2001). The level of significance of the microsporidian infection frequencies was tested with the Dunn test (Dunn, 1964) constituting the post-hoc test for Q-Cochran analysis (Cochran, 1950). A heatmap was prepared using heatmapper tools (Babicki et al., 2016). The chi-squared test statistic (Pearson, 1900) was used to evaluate whether there is an association between the detected microsporidia and their occurrence in different host species or in mixed infections, and the relationship between numbers of sequence reads per OTU in the metabarcoding approach and a successful amplification of the V1-V3 region. Pearson's correlation coefficient was used to determine the correlation between numbers of reads of individual microsporidians (Pearson, 1895). The results of Pearson's correlation were visualized in gephi software version 0.9.2 (Bastian, Heymann, & Jacomy, 2009).
Microsporidian DNA was considered as incidental in field-collected mosquitoes when noted in <1% of all analysed individuals, its OTU was covered by fewer than 50 sequences and the species has not been previously reported from mosquitoes.

| Applying the metabarcoding method for fieldcollected mosquitoes
Using mini-COI data, all field-collected mosquitoes were unambiguously assigned to seven species: Aedes cinereus (13) Table S1).
In total, the V5 region library of the field-collected mosquitos (212) and negative controls (five) generated about 1,330,000 reads after quality filtering. Negative controls yielded no sequence data using our default threshold. In microsporidia-free mosquitoes the number of reads never exceeded 626 per sample (median = 101).
Together, nonmicrosporidian sequences accounted for only 12% of all quality filtered reads; most of them were of host origin (11.9%) and coded ssu rRNA genes (9.8%) or a collagen alpha-1 chain gene (2.1%; 100% identity with GenBank accession no. XM_021840458).
The less abundant OTUs represented ssu rDNA fragments from F I G U R E 2 Relationship between the numbers of spores of different species in mixed samples and their relative sequence read abundance in quality-filtered sequence data. DNAs from Encephalitozoon cuniculi and E. hellem spores were detected even when they were mixed in a ratio of 1 to 1,000 (10:10:10,000) with E. intestinalis  fungi (in three mosquito individuals; <.03% of sequence reads), gregarine (in one individual; <0.002%) and human (in one individual; <0.001%) (Table S1).
In field-collected mosquitoes, microsporidian DNA was found in almost 60% (127/212) of individuals representing all analysed species (  Table S1). Rarefaction curve analysis showed that the read depth was sufficient to recover all microsporidian species in the tested individuals ( Figure S2). The analysis revealed that 10,000 reads per sample is required to identify all microsporidian diversity in the tested host.

| Co-occurrence of different Microsporidia in single mosquito individuals
The co-occurrence of DNAs representing different microsporidian species in one host individual was recorded in 20 field-collected mosquitoes (9.4%); this relationship concerned all microsporidian and host species ( A statistically significant relationship (p < .05) was found between the numbers of reads representing a given co-infecting microsporidian species. An almost perfect correlation was observed in increasing numbers of sequence reads between N. chrysorrhoeae/portugal and Nosema sp. CHW-2007a (r = .9); also, a similarly high correlation was observed between Amblyospora sp. and Amblyospora salinaria (r = .51) ( Figure 5; Table S3).  N. a., Nosema adaliae; N. c., N. ceranae; N. p., N. pieriae; N. t., N. thomsoni; N. ch./p., N. chrysorrhoeae and/or N. portugal; N. sp

| Comparison of metabarcoding and standard molecular approach
The Kolmogorov-Smirnov plots for the microsporidian V1-V3 and V5 sequence alignments showed that the proposed V5 marker is only 4% less variable than the standard sequence covering V1-V3 regions (about 40% and 36% of different nucleotide positions, respectively) ( Figure 6). The extraction of duplicate reads from the alignment comprising the V5 region representing 607 microsporidian species belonging to 120 genera indicated that the marker enables the proper TA B L E 2 Numbers of sequence reads representing microsporidian OTUs in co-infected mosquitoes

F I G U R E 5
Correlation network between the numbers of reads representing each microsporidian species found in co-infected mosquitoes. Blue lines-statistically significant correlation. Almost perfect correlation (thick blue line) was observed when Nosema chrysorrhoeae/portugal occurred in mixed infection with Nosema sp. CHW-2007a. High correlation (thin blue line) was found between co-infecting Amblyospora sp. and Amblyospora salinaria (for correlation values see
Several sequences that were found in duplicates represented species mainly of the genus Nosema (e.g., N. chrysorrhoeae and N. portugal or a group consisting of N. antheraeae, N. trichoplusiae, N. spodopterae, and N. philosamiae showed no variation in the V5 region). However, in these sample groups extremely low or no variation was also observed in other regions of the ssu rDNA, including V1-V3 marker (data not shown). In three cases, the same V5 region sequences were found in species representing different genera (Conglomerata obtusa and Berwaldia schaefernai; Ameson portunus, Ameson pulvis and Nadelospora canceri; Tetramicra brevifilum and Microgemma caulleryi).
In total, 42 microsporidia-positive samples were obtained using the standard V1F/530R primer set and DNA extracts from field-collected mosquitoes previously used to test the metabarcoding method ( Figure S3, Table S1). Using the same DNA templates we successfully amplified a full-length COI-barcode (about 670 bp), which means that all DNA isolates were suitable for PCR amplification ( Figure S4).
As PCR products were found among 127 microsporidia-positive mosquitoes identified previously by the metabarcoding method, the standard PCR approach gave almost 67% false-negative samples (Table S4). We found a statistically significant correlation (χ 2 = 68.59, p < .05) between the detection of microsporidia using the standard molecular approach and obtaining at least 300 reads per OTU found by the metabarcoding method. Direct Sanger sequencing of the  Table 2).

| Effectiveness of metabarcoding microsporidia and their hosts
The metabarcoding approach proposed in this study can be used for fast, accurate and sensitive detection of microsporidia in all types of DNA samples. Our data show that the new marker covering the hypervariable V5 region of ssu rDNA allows correct identification of almost all classical microsporidian species known from rDNA sequence data.
The V5 region is flanked by group-conserved sequences that allowed us to design the CM-V5F/CM-V5R primer set with great specificity towards microsporidian ssu rDNA. Indeed, the specificity of our primers seems to be higher than the commonly used V1F/530R pair applied by Williams et al. (2018), because in our experiments, the nontarget sequences never exceeded 12% of quality-filtered sequence data. Results of our phylogenetic analysis, which included the Percentage of identity Fraction of scores as low or lower than the score being plotted Fraction of scores as low or lower than the score being plotted V5 region of ssu rDNA ( Figure 4). However, recently published first sequence data for the two remaining microsporidian lineages-the metchnikovellid Amphiamblys sp. (Galindo et al., 2018;Mikhailov, Simdyanov, & Aleoshin, 2017) and chytridiopsid Chytridiopsis typographi (Corsaro et al., 2019)-suggest that the CM-V5F/CM-V5R primer set is rather group-specific for the classical Microsporidia. This indicates that there may be a need to develop additional primers that will be group-specific for both the Metchnikovellida and the Chytridiopsida to study species diversity in the whole phylum.
Our metabarcoding approach allowed us to detect 100 spores/ ml, which is comparable with other PCR-based methods (Menotti et al., 2003;Rinder et al., 1998;Wolk et al., 2002) or microarray techniques developed for Encephalitozoon spp. in clinical samples (Wang et al., 2005). Usually, clinical samples are not accessible; therefore, the screening of different microsporidian species in the same sample at the same time may be of clinical benefit. The advantage of the metabarcoding is that it allows simultaneous screening of all species, without the need to detect particular species in separate PCRs. For example, methods commonly used for microsporidia detection in water are based on spore staining in smears from concentrated water samples, followed by the PCR amplification of marker sequences using species-specific primers (e.g., Ben Ayed et al., 2012;Izquierdo et al., 2011;Li et al., 2012). The metabarcoding approach could help overcome these difficulties, reducing the time and costs of the analysis. In addition, the metabarcoding of concentrated water samples could be standardized, which can help to develop good laboratory methods for more accurate waterborne disease risk assessment.
The comparison of our new method with a standard PCR-based approach clearly shows that metabarcoding is more sensitive and accurate in detecting microsporidian infections. Our results show that direct amplicon sequencing would be impractical in co-infected samples where one microsporidian species was present at a much higher level of infection than the other co-infecting species. Although amplicon cloning in a plasmid vector and subsequent Sanger sequencing of several clones can detect mixed samples in the cases of relatively balanced co-infections, the dominance of one microsporidian species would require sequencing a large number of clones, which is not usually performed.
Based on the same mosquito DNA samples, we successfully amplified both the full-length COI-barcode (about 670 bp) and the mini-COI fragment (373 bp) for metabarcoding. COI sequence analysis showed that both fragments allowed us to unambiguously assign all mosquitoes to the correct species. Therefore, the mini-COI fragment can be successfully used in NGS approaches to determine microsporidian host species.

| Microsporidia in field-collected mosquitoes
More than half of the field-collected mosquitoes analysed in this study were positive for microsporidian DNA. Reports of the prevalence of microsporidia in adult mosquito populations in nature are sparse and focused on particular microsporidians and their primary mosquito hosts. In one experiment conducted for almost 20 years, the prevalence of Amblyospora stimuli infections in Aedes stimulans adult female populations was relatively low and ranged from 1% to 9.6% (Andreadis, 1999). On the other hand, in the study concerning the life cycle and ecology of Amblyospora khaliulini, prevalence rates of Amblyospora khaliulini infections in adult mosquito females ranged from 16.4% to 50% (Andreadis, Thomas, & Shepard, 2018).
In our study, all mosquito species showed a much higher rate of microsporidian-positive individuals, ranging from 54.0% to 92.3% (Table 1). However, in the studies conducted by Andreadis et al. (1999Andreadis et al. ( , 2018 The presence of microsporidian DNA does not necessarily result from infection. Therefore, we excluded E. hellem, E. artemiae and N. ceranae as infecting factors because they had been noted in fewer than 1% of all analysed individuals and their OTUs were covered by low numbers of reads. However, there is no empirical basis to exclude Nosema spp. or Amblyospora spp. even though in the present study we recorded them in single host individuals and/ or their OTUs were represented by low numbers of reads. To our knowledge, the remaining microsporidian species, found by us in field-collected mosquitoes, have been recorded in Culicidae for the first time. Our observations based on numbers of reads representing the newly recorded microsporidian species OTUs or their occurrence in multiple mosquito individuals support the hypothesis that these species can indeed infect mosquitoes. A wide range of hosts for members of the genus Nosema also supports this hypothesis. For example, N. thomsoni has been noted in Andrena vaga (Hymenoptera) (Ravoet et al., 2014), Harmonia axyridis (Coleoptera) (Vilcinskas, Stoecker, Schmidtberg, Röhrich, & Vogel, 2013) and Among the other microsporidians detected in our study, N. adaliae was found in Adalia bipunctata (Coleoptera) (Steele &Bjørnson, 2014), while N. pieriae, N. chrysorrhoeae, N. portugal andNosema sp. CHW-2007a were noted in different lepidopteran hosts (Huang et al., 2008;Hyliš et al., 2006;Maddox et al., 1999;Yaman, Bekircan, Radek, & Linde, 2014). Amblyospora sp. recorded in our study had been previously found in Cyclops strenuous (Copepoda) from the Czech Republic (Vossbrinck et al., 2004). The dominant presence of one microsporidian species-Microsporidium sp. PL01in all mosquito species analysed in this study suggests that there may be no "primary host" in microsporidia-mosquito relationships and the observed prevalence reflects only the abundance of a generalist microsporidian species infecting mosquito larvae in the water environment. Nevertheless, ultrastructural assessment, based on histology and transmission electron microscopy, should be carried out to confirm the actual infection.
The new microsporidian ssu rDNA sequence found in this study is most similar to that of Microsporidium sp. 1199, which was recorded in freshwater populations of Gammarus duebeni (Crustacea, Amphipoda) in a rivulet from Holyhead Island, Wales, UK, during research into the molecular characterization of the microsporidians of this host across its natural range (Krebes et al., 2010). Our phylogenetic analysis results strongly support that Microsporidium sp. PL01 and Microsporidium sp. 1199 represent different species in the same genus, which belongs to one of the two main clades in the Terresporidia. Members of this clade mostly parasitize terrestrial arthropods, but this clade also groups microsporidians that can infect vertebrates, including humans (e.g., Enterocytozoon bieneusi).

| Microsporidia in co-infections
Our metabarcoding method allowed us to detect a relatively high level (9.4%) of co-occurrence of DNAs representing different microsporidian species in the same host individual. We cannot exclude accidental intake of spores in food; however, high numbers of sequence reads representing N. thomsoni or N. chrysorrhoeae/portugal in samples where they co-occurred with Microsporidium sp. PL01 are evidence of mixed infection by these species.
It is noteworthy that we found Nosema spp. in field-collected mosquitoes mostly in co-infections. Furthermore, we noted that three out of five Nosema species-N. adaliae, N. pieriae and N. thomsoni-in our study were found only in co-infections. Additionally, we found a positive correlation between numbers of reads for OTUs representing N. chrysorrhoeae and/or N. portugal and Nosema sp.
CHW-2007a, which suggests that these species support each other.
A similar relationship was found for the Amblyospora spp. where the occurrence of Amblyospora salinaria was associated with the occurrence of Amblyospora sp., although in this case the numbers of reads suggest a lower level of infection. However, quantitative analyses should be carried out to confirm these relationships. To conclude, we believe that our observations based on the microsporidian DNA occurrences reflect the actual microsporidian diversity in a tested sample; however, the hypotheses concerning infections and mutual relationships of co-infecting species should be tested using a more intensive host sampling with the use of ultrastructural data.

| CON CLUS IONS
We have proposed a new molecular approach for the detection of microsporidian DNA in samples extracted from their hosts. The method uses NGS sequence data from the hypervariable V5 region of the ssu rDNA and the shortened fragment of the standard metazoan DNA-barcode (i.e., the mitochondrial COI gene) for microsporidian and host species identification, respectively.
We have tested this metabarcoding approach on microsporidians infecting mosquitoes; however, the method can be applied to all types of DNA extracts, including clinical and environmental samples. We compared our new method with the standard molecular approach for microsporidian DNA detection, which uses endpoint PCR and Sanger sequencing. This comparison showed that the metabarcoding is more sensitive and accurate than the traditional method.
An additional advantage of the new method is its effectiveness in identifying microsporidian species that co-infect the same host individual. Moreover, our data concerning numbers of sequence reads for the microsporidian OTUs found in co-infected hosts suggest that the occurrence of some microsporidian species in mixed infections could benefit them; however, this observation should be retested using a more intensive host sampling.
Our results show that next-generation biodiversity assessment is a rapid and cost-effective method for deciphering sample diversity in greater resolution, including the hidden biodiversity that may be overlooked in approaches based on classical methodology. In addition, our data indicate that information about the relative abundance of a specific pathogen species (number of reads for a specific OTU) obtained by use of the NGS approach can help to discover new ecological relationships among parasite species co-infecting the same host.

ACK N OWLED G EM ENTS
We thank Edward Baraniak from Adam Mickiewicz University, and Piotr Rzymski from the University of Medical Sciences for collecting mosquitos. This work was supported by Leading National Research Centre-KNOW RNA Research Centre in Poznan.  Figures S1-S4 and Tables S1-S4. Sequence alignments and the remaining data that support the findings of this study are available from the corresponding author upon request.