Bioaerosol biomonitoring: Sampling optimization for molecular microbial ecology

Abstract Bioaerosols (or biogenic aerosols) have largely been overlooked by molecular ecologists. However, this is rapidly changing as bioaerosols play key roles in public health, environmental chemistry and the dispersal ecology of microbes. Due to the low environmental concentrations of bioaerosols, collecting sufficient biomass for molecular methods is challenging. Currently, no standardized methods for bioaerosol collection for molecular ecology research exist. Each study requires a process of optimization, which greatly slows the advance of bioaerosol science. Here, we evaluated air filtration and liquid impingement for bioaerosol sampling across a range of environmental conditions. We also investigated the effect of sampling matrices, sample concentration strategies and sampling duration on DNA yield. Air filtration using polycarbonate filters gave the highest recovery, but due to the faster sampling rates possible with impingement, we recommend this method for fine-scale temporal/spatial ecological studies. To prevent bias for the recovery of Gram-positive bacteria, we found that the matrix for impingement should be phosphate-buffered saline. The optimal method for bioaerosol concentration from the liquid matrix was centrifugation. However, we also present a method using syringe filters for rapid in-field recovery of bioaerosols from impingement samples, without compromising microbial diversity for high-throughput sequencing approaches. Finally, we provide a resource that enables molecular ecologists to select the most appropriate sampling strategy for their specific research question.

Despite bioaerosols being an important transmission route for infectious and sensitization agents, information on the microbiological components of bioaerosols from different environments is scarce (Blais-Lecours, Perrott, & Duchaine, 2015). This lack of knowledge hampers our ability to address both key public health questions (Douglas, Hayes et al., 2017; Pearson et al., 2015; WHO, 2009) and broad ecological questions relating to species dispersal and biogeography (Dumbrell, Nelson, Helgason, Dytham, & Fitter, 2010a, 2010b; Zhou & Ning, 2017). Traditionally, bioaerosols have been studied using culture-based methods, but culturing captures only a small fraction of the total microbial community. In aquatic and terrestrial environments, molecular methods are routinely used (Clark et al., 2018; Mommer, Dumbrell, Wagemaker, & Ouborg, 2011) and could also provide a rapid, sensitive and specific approach to analysing airborne microorganisms, especially from low-concentration environments (Colbeck & Whitby, in press). Yet, optimum methods for collecting nucleic acid material from air samples remain under investigation, and standardized sampling procedures have not yet been established (Hoisington, Maestre, King, Siegel, & Kinney, 2014; Mbareche, Brisebois, Veillette, & Duchaine, 2017; Reponen, 2017).
Due to the low environmental concentrations of bioaerosols, collecting sufficient genetic material for molecular methods is problematic and is dependent on the biomass present, which varies between environments. Consequently, collection devices with high flow rates or high collection efficiencies are necessary (Morgan, Darling, & Eisen, 2010). Moreover, nucleic acid yield, and hence microbial diversity recovered, will depend on the nucleic acid extraction protocol used (Luhung et al., 2015; Morgan et al., 2010; Peccia & Hernandez, 2006). This is particularly important for low-biomass bioaerosol samples, where the DNA extraction method applied needs to have high recovery efficiencies (Morgan et al., 2010). Indeed, often only dominant sequences are recovered and the "rare biosphere" is missed (Colbeck & Whitby, in press).
Characterizing microorganisms from bioaerosols is further hindered by high concentrations of PCR inhibitors (e.g., humic acids and inorganic particles), which impede downstream molecular analysis (Luhung et al., 2015; Peccia & Hernandez, 2006). The matrix used for collection is also known to introduce biases (Adams, Tian et al., 2015; Aguayo, Fourrier-Jeandel, Husson, & Ioos, 2018; Castaño et al., 2017; Wang et al., 2015). Thus, it is difficult to compare bioaerosol studies, due to differences in samplers, collection time, airflow rate and analysis methods, and consequently many basic questions remain unanswered. For example, how does sampler choice influence the results? How long should sampling last, and what volume of air should be sampled? How does the filter or liquid matrix affect DNA yield? How should the sample be concentrated for analysis?
There are three main classes of filter: fibrous, membrane and flat filters. In this study, we used a representative of the most commonly used filter classes: glass fibre (GF), polycarbonate (PC) and gelatin (Gel). Fibrous filters trap particles within a matrix of randomly orientated fibres (e.g., glass fibre and cellulose). Membrane filters have a complex internal structure of pores within which particles are deposited (e.g., gelatin and polyvinyl chloride). Flat filters collect particles on the filter surface with the air passing through pores in the membrane (e.g., polycarbonate). The way the filter traps the particle influences both what is collected (e.g., spores or cells) and how easily it is released for downstream analysis (Duquenne, Coulais, Bau, & Simon, 2018; Dybwad, Skogan, & Blatny, 2014a; Yoo et al., 2017). The advantages of filters for molecular analyses are that the captured microorganisms remain viable and nucleic acid extraction occurs directly from the filter (Yoo et al., 2017). However, one problem with using filters for molecular methods is that spore-forming microorganisms may be preferentially recovered, depending on filtration time, pore size and filter type (Yoo et al., 2017). Impactors collect particles by depositing them onto a surface transverse to the airflow. The main advantage is that the particle size collected can be controlled by varying the flow rate (which is typically between 10 and 700 L/min). However, with impactors, cell viability is lost due to impact stress and recovery efficiency is often reduced due to low flow rates and particle bounce (Griffin, 2007). Impingers use a cyclone to deposit bioaerosols into a liquid.
Impingers generally have lower collection efficiencies than filters, especially for small particles (~50% at 0.5-5 µm and ~90% at 10 µm; Carvalho et al., 2008; Dybwad et al., 2014a), but airflow rates for impingers are generally higher (300-600 L/min, compared to 2-300 L/min for filtration), permitting shorter sampling periods. Impingement also overcomes the problems associated with organism desiccation that can occur with filters and impactors. However, depending on the liquid matrix used, cell growth and lysis during storage have a demonstrable effect on culture-based studies (Chang & Wang, 2015). It is unknown whether this is also true for molecular studies, or what the optimum method is for concentrating samples for downstream nucleic acid extraction after impingement.
Here, we investigated the optimal methods for the biomonitoring of bacterial bioaerosol samples. Specifically, we evaluated the suitability of air filtration and liquid impingement as bioaerosol collection methods across differing environmental settings and temporal/spatial scales. We compared GF, PC and Gel filters in relation to DNA yield. We also investigated the effect of three liquid matrices for use with liquid impingement, namely deionized water (DI), phosphate-buffered saline (PBS) and Tris hydrochloride buffer (Tris-HCl), and whether biomass recovery was greater with centrifugation or filtration. Finally, we provide a new resource that enables molecular ecologists and air regulators to select the most appropriate bioaerosol sampling strategy for their research questions. We address the following questions (summarized in Figure 1):

[...] was added in ten 1-µl aliquots evenly across the surface of each filter (see Supporting information Appendix S1 for bacterial growth conditions). Filters were then rolled (so the bacteria were on the inside surface), placed into 2-ml microcentrifuge tubes and stored at −20°C overnight to simulate storage after sampling. Before DNA extraction, all filters were thawed to room temperature. To prevent the filters from obstructing bead lysis, the following was performed: (a) the PC filters were placed into the lysis tubes intact; (b) the GF filters were placed into a Petri dish, cut into fifths and placed into lysis tubes; and (c) the Gel filters were fragmented using sterile tweezers and then placed in the lysis tubes. Procedural blanks comprising empty tubes and sterile filters were also included. To determine the DNA extraction efficiency, 10 µl of each bacterial culture (at 1 × 10⁷ CFU/ml) was added directly to a lysis tube. DNA extraction and qPCR followed methods described in the Supporting information Appendix S1 and qPCR analysis section (below) respectively.
All conditions were carried out twice, in triplicate (n = 3) to determine reproducibility.

[...] (v/v) sodium dodecylsulphate buffer by aspiration in a 1-ml pipette, vortexed for 2 × 10 s and transferred to a bead-lysis tube. For each culture, 10 µl (at 1 × 10⁷ CFU/ml) was also added directly to a lysis tube to determine the DNA extraction efficiency. Procedural controls (comprising buffer with no culture) were also performed. DNA extraction and qPCR followed methods described in the Supporting information Appendix S1 and qPCR analysis section (below) respectively. All conditions were carried out twice, in triplicate (n = 3) to determine reproducibility.

| Experiment 3 applied lab and field experiments (Figure 2) to address Question 3: What is the best way to recover bacteria from liquid impingement samples: filtering or centrifugation?
In the lab experiment, based on the data obtained from Experiments 1 and 2, the methods determined to be optimal (i.e., PC filters, and pelleting in PBS) were compared to a rapid in-field method using a syringe filter. In all cases, 10 µl of a mixed culture of E. coli (Gram negative) and [...] Figure S1). The syringe filter samples were immediately filtered. All samples were immediately frozen on dry ice and stored at −20°C for ≤2 weeks before DNA extraction. DNA extraction and qPCR followed methods described in the Supporting information Appendix S1 and qPCR analysis section (below) respectively. HiSeq sequencing was performed as described in the DNA sequencing section below.

| Experiment 4 addresses Question 4: How long and what volume of air should be sampled in order to obtain sufficient DNA yields for downstream molecular processing in different environmental contexts?

| Experiment 5 addresses Question 5: How do DNA yields vary between air filtration and liquid impingement sampling methods in different environmental contexts?
Air samples were collected at nine sites (A-I) in southeast England, comprising a mixture of urban, industrial and agricultural locations (Supporting information Table S1). At each site, nine sets of triplicate 20-min samples were collected by liquid impingement using a Coriolis µ wet cyclone impinger (Bertin, Air Monitors) as described previously. At each site, three sets of triplicate air samples were collected by air filtration onto PC filters at 28 L/min for 120 min using a Gast vacuum pump (Cole-Parmer) with the filters placed in 47-mm Swin-Lok plastic filter holders (Whatman). The sampling cones and filter holders were sterilized in 1% (w/v) sodium hypochlorite and Milton liquid as described previously. All samples were frozen on dry ice and stored at −20°C. DNA extraction and qPCR followed methods described in the Supporting information Appendix S1 and qPCR analysis section (below) respectively. The concentration of 16S rRNA gene copies was normalized for the volume of air sampled and compared across concurrent samples at the same site to determine differences in yield between methods.
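The volume normalization described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the gene copy numbers are hypothetical, and the impinger flow rate of 300 L/min is an assumption (it is not stated in this paragraph, although it is consistent with the 6 m³ per 20 min quoted in the recommendations).

```python
def air_volume_m3(flow_l_per_min: float, minutes: float) -> float:
    """Volume of air sampled in cubic metres (1 m^3 = 1000 L)."""
    return flow_l_per_min * minutes / 1000.0


def copies_per_m3(gene_copies: float, flow_l_per_min: float, minutes: float) -> float:
    """16S rRNA gene copies recovered per cubic metre of air sampled."""
    return gene_copies / air_volume_m3(flow_l_per_min, minutes)


# Filter sampling as described in the text: 28 L/min for 120 min.
filter_volume = air_volume_m3(28, 120)

# Impinger sampling for 20 min at an ASSUMED flow rate of 300 L/min.
impinger_volume = air_volume_m3(300, 20)

# Hypothetical gene copy numbers, for illustration only.
filter_conc = copies_per_m3(3.36e6, 28, 120)
impinger_conc = copies_per_m3(6.0e5, 300, 20)

print(f"filter: {filter_volume:.2f} m^3, {filter_conc:.3g} copies/m^3")
print(f"impinger: {impinger_volume:.2f} m^3, {impinger_conc:.3g} copies/m^3")
```

Expressing both methods per cubic metre is what makes concurrent filter and impinger samples comparable despite their different flow rates and durations.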

| qPCR analysis of the 16S rRNA genes
DNA standards for qPCR analysis were created from PCR-ampli-

| Statistical analysis
Statistical analysis was carried out in R (R Development Core Team, 2015) and the cited associated packages. Means testing was carried out with a linear mixed model fitted in the R package "lme4" (Bates, Machler, Bolker, & Walker, 2015). 16S rRNA gene copy numbers were log10-transformed to approximate a normal distribution, and experimental repeats, or samples that were temporally or spatially separated, were added to the model as random effects. The model was fitted with restricted maximum likelihood and degrees of freedom were estimated by the Satterthwaite approximation with the R package "lmerTest" (Kuznetsova, Brockhoff, & Christensen, 2017). When random effects were not required, means testing was carried out with ANOVA, or a Student's t test if there was only one factor with two treatments. To test differences between individual group means, pairwise comparisons with Tukey's HSD (honestly significant difference) test were used (with least square mean estimations for the mixed effects models). To compare the sampling efficiencies of impingers and air filtration in the field, a Pearson's correlation coefficient was calculated between the log10-transformed 16S rRNA gene copy numbers recovered by each method (normalized for the volume of air sampled) for each of the sites.
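As an illustration of the final comparison described above, the following Python sketch computes a Pearson correlation coefficient between log10-transformed copy numbers from two methods. The analysis in the study was run in R; this is only a sketch, and the per-site values below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log10_all(values):
    """log10-transform strictly positive copy numbers."""
    return [math.log10(v) for v in values]


# Hypothetical gene copies per m^3 at four sites (illustration only).
filter_copies = [1e6, 1e5, 1e4, 1e7]
impinger_copies = [1e5, 1e4, 1e3, 1e6]

r = pearson_r(log10_all(filter_copies), log10_all(impinger_copies))
print(f"r = {r:.3f}")
```

In this made-up example the impinger values are exactly one order of magnitude below the filter values at every site, so the log-transformed series are perfectly correlated even though the absolute yields differ.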
Analysis of the sequencing data was carried out using the R package vegan (Oksanen et al., 2015) as described by Dumbrell et al. (2017). Sequence libraries were rarefied to the smallest library size, as this method is compatible with the statistical methods used, whereas alternative methods require operational taxonomic units (OTUs) to be present in all samples (Weiss, Boyd et al., 2017; Weiss, Xu et al., 2017).
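Rarefying to the smallest library size can be sketched as random subsampling of reads without replacement. The Python below is an illustrative stand-in, not the R code used in the study; the sample names, OTU names and counts are hypothetical.

```python
import random


def rarefy(counts: dict, depth: int, rng: random.Random) -> dict:
    """Subsample `depth` reads without replacement from an OTU count table."""
    if depth > sum(counts.values()):
        raise ValueError("depth exceeds library size")
    # Expand the table into one entry per read, then draw without replacement.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    rarefied = {}
    for otu in rng.sample(reads, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


def rarefy_to_smallest(libraries: dict, seed: int = 0) -> dict:
    """Rarefy every library to the size of the smallest one."""
    rng = random.Random(seed)
    depth = min(sum(counts.values()) for counts in libraries.values())
    return {name: rarefy(counts, depth, rng) for name, counts in libraries.items()}


# Hypothetical libraries (illustration only).
libraries = {
    "syringe_filter": {"otu_a": 50, "otu_b": 30, "otu_c": 20},
    "centrifugation": {"otu_a": 120, "otu_b": 60, "otu_d": 20},
}
rarefied = rarefy_to_smallest(libraries)
for name, counts in rarefied.items():
    print(name, sum(counts.values()), counts)
```

The smallest library is returned unchanged, while larger libraries lose reads at random, so rare OTUs in deep libraries may drop out. That is the trade-off accepted in exchange for equal sampling effort across samples.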
Differences between alpha diversity metrics were evaluated using means testing with mixed effects models, as previously described. To evaluate changes in bacterial community composition (beta diversity), a distance matrix using the Jaccard index was calculated and visualized with nonmetric multidimensional scaling (NMDS).
To test for differences between sampling groups, permutation-based multivariate analysis of variance (PERMANOVA) on the distance matrix was carried out with 1,000 randomizations (Anderson & Walsh, 2013). For all tests, a significance threshold of p < 0.05 was used.
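The beta diversity workflow above (Jaccard distance matrix followed by a permutation test) can be sketched in plain Python. This is a simplified one-factor illustration, not the vegan implementation: it assumes the presence/absence form of the Jaccard index, uses the standard pseudo-F statistic for a single grouping factor, and runs on hypothetical samples.

```python
import random
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """Presence/absence Jaccard distance: 1 - |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(samples):
    """Symmetric matrix of pairwise Jaccard distances."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = jaccard_distance(samples[i], samples[j])
    return d


def pseudo_f(d, groups):
    """One-factor PERMANOVA pseudo-F computed from a distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(d[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(d[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permanova(d, groups, permutations=1000, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(d, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(d, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical OTU presence/absence per sample (illustration only).
samples = [
    {"a", "b", "c"}, {"a", "b", "d"}, {"a", "b", "c", "d"},
    {"x", "y", "z"}, {"x", "y", "w"}, {"x", "y", "z", "w"},
]
groups = ["centrifugation"] * 3 + ["syringe_filter"] * 3

d = distance_matrix(samples)
f_stat, p_value = permanova(d, groups)
print(f"pseudo-F = {f_stat:.2f}, p = {p_value:.3f}")
```

With only three samples per group there are just 20 distinct ways to split the labels, so the p-value cannot fall much below 0.1 even for completely separated communities. This illustrates why replication matters for permutation tests.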

| Experiment 1: Does the filter material affect DNA yield?
The type of filters significantly influenced the recovery rates for

| Experiment 3: What is the best way to recover bacteria from liquid impingement samples: filtering or centrifugation?
In the lab experiment, the syringe filters recovered significantly fewer 16S rRNA gene copies compared to the other methods (PC filters or centrifugation to recover a bioaerosol pellet; [...] beta diversity, F1,8 = 1, p = 0.5, R² = 0.1). However, the relative abundance of Enterobacteriales was twofold higher in the centrifugation treatment than the syringe filter treatment, so there may have been subtle differences in community structure driven by some taxa (Supporting information Figure S2). No background contaminants were detected in the blanks (all had Ct values within the range of the NTCs during qPCR, and none of the sequences recovered from blanks was of high enough quality to form contiguous reads; see Supporting information Appendix S1).
Recommendation: Use centrifugation to recover bacteria from liquid impingement samples. However, syringe filters can be considered for use in the field.

| Experiment 4: How long and what volume of air should be sampled to obtain sufficient DNA yields for downstream molecular processing in different environmental contexts?
For all conditions, there was a significant increase in DNA recovery with increased time/volume of air sampled (Figure 7): for filters at high bioaerosol concentration (F2,6 = 276.5, p < 0.001) and at low bioaerosol concentration (F2,6 = 24.8, p = 0.001), and for impingers at high bioaerosol concentration (F2,6 = 21.8, p = 0.002) and at low bioaerosol concentration (F2,6 = 8.2, p = 0.02).
Current guidelines in the Nextera DNA Library Prep Reference Guide (Illumina) for preparing libraries for PCR-free metagenome sequencing call for 50 ng of DNA per sample. Our best cases recovered less than this threshold: for example, from a high-biomass environment, 13 ng (SD 1.1) dsDNA when sampling with filters for 120 min, and 9 ng (SD 2) dsDNA when sampling with impingers for 20 min.
The yields achieved here would be sufficient for metagenomics using the Nextera XT kits (Illumina), which require 1 ng DNA per sample (according to current guidelines), or for metabarcoding with a dual PCR approach using the Nextera XT indices (Illumina). However, pooling of samples would be required for an amplification-free metagenome sequencing strategy.
Recommendation: Sample for at least 120 min (3.36 m³ air) with filters, and 20 min (6 m³ air) with impingement for metagenomics or metabarcoding approaches, but pooling of samples may still be required for an amplification-free metagenome sequencing strategy.
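To make the pooling requirement concrete, the short Python sketch below estimates how many samples would need to be pooled to reach the 50 ng input quoted above, using the best-case mean yields reported in this section (13 ng for filters, 9 ng for impingers). It is a rough calculation under the assumption that every sample yields the mean amount and that no DNA is lost during pooling; the function name is hypothetical.

```python
import math


def samples_to_pool(target_ng: float, yield_per_sample_ng: float) -> int:
    """Minimum number of samples whose combined yield reaches the target input."""
    if yield_per_sample_ng <= 0:
        raise ValueError("yield per sample must be positive")
    return math.ceil(target_ng / yield_per_sample_ng)


PCR_FREE_INPUT_NG = 50.0    # amplification-free library input quoted in the text
NEXTERA_XT_INPUT_NG = 1.0   # Nextera XT input quoted in the text

filter_yield_ng = 13.0      # best-case mean yield, filters, 120 min
impinger_yield_ng = 9.0     # best-case mean yield, impingers, 20 min

pool_filters = samples_to_pool(PCR_FREE_INPUT_NG, filter_yield_ng)
pool_impingers = samples_to_pool(PCR_FREE_INPUT_NG, impinger_yield_ng)

print(f"filters: pool {pool_filters} samples for PCR-free sequencing")
print(f"impingers: pool {pool_impingers} samples for PCR-free sequencing")
print(f"Nextera XT: {samples_to_pool(NEXTERA_XT_INPUT_NG, impinger_yield_ng)} sample is enough")
```

Under these assumptions roughly four filter samples or six impinger samples would be needed for amplification-free sequencing, whereas a single sample exceeds the Nextera XT input. Real yields from low-biomass sites would push the pooled number higher.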

| Experiment 5: How do DNA yields vary between air filtration and liquid impingement sampling methods in different environmental contexts?
Filters recovered significantly more 16S rRNA gene copies than liquid impingement by just over an order of magnitude ( Figure 8,

| DISCUSSION
In this study, we evaluated different sampling methods for collecting airborne bacteria from a range of different environments with the aim of maximizing DNA yield. Based on our results, we have made a number of recommendations for selecting the most appropriate bioaerosol sampling method (summarized in Table 2 and Figure 9).
Under the conditions tested, it is possible to collect sufficient bacterial genetic material from bioaerosols for molecular analysis across different environments (Experiments 4 and 5). Of the matrices tested, PC filters and PBS had the highest DNA recovery rates for air filtration and liquid impingement respectively. Centrifugation was also a better method for recovering bacterial bioaerosols from impingement liquids than syringe filters in terms of DNA yield.
However, as diversity is not significantly affected, syringe filters are a good option in certain circumstances (e.g., for rapid in-field analysis). Furthermore, we recommend air samples should be collected for at least 120 min (3.36 m³ air) with filters and 20 min (6 m³ air) with impingement. However, pooling of multiple samples may still be required for amplification-free metagenomics.

| Choice of filter for air sampling
Our results show that the type of filter used is a key consideration, as recovery rates differed by more than an order of magnitude between filter types, probably due to differential release of bacteria from the filter during DNA extraction. Other studies also concluded that recovery of bioaerosols from the sampling matrix can be a major limitation for both bacteria and fungi (Adams, Tian et al., 2015; Aguayo et al., 2018; Castaño et al., 2017; Wang et al., 2015). On flat filters (e.g., PC) the particles are easy to recover as they remain on the filter surface. In contrast, particles are not easily released from fibrous filters (e.g., GF) where they are trapped between filaments. In addition, during extraction, PC filters are dissolved in phenol/chloroform, releasing the bacteria and ensuring efficient bead lysis. GF filters, however, are more robust and do not disintegrate in phenol/chloroform, consequently obstructing bead lysis. Some studies have mitigated the lower recovery rates from GF filters by cutting them up (Cao et al., 2014; Pankhurst et al., 2012) or vortexing the filters to resuspend the bacteria into a liquid prior to DNA extraction (Madsen, Zervas, Tendal, & Nielsen, 2015). However, to date, there is no information available on the efficiency of vortexing for cell recovery from filters. It also seems an unnecessary extra step, unless there is a specific reason not to use [...] or endotoxins [...]. However, note that [...] measured the loading capacity of the filter, rather than the ability to release particles. As bioaerosol concentrations are typically low in the environment, loading is not as important a factor as particle release. In our study, it is the combination of sampling procedure and extraction method that is crucial.

[Figure caption fragment: [...] Table S1). The mean value is marked by the point and the whiskers are the minimum and maximum range for each site (n = 9).]

[Figure text, summary of recommendations. Question 2: PBS is the best liquid matrix for impingement. Question 3: Centrifugation is the best way to recover bacteria from air samples, but filters can be used for rapid recovery in the field. Question 4: At least 120 min with filters, and 20 min with impingement. Question 5: Air filtration for maximum recovery and liquid impingement for short snapshots. Which method? Impingement / Air filtration.]

[...] Yamamoto, Kimura, Matsuki, & Yanagisawa, 2010). We found the Gel filters were brittle and lost consistency in damp conditions, which may make them unsuitable in some environments, but enables them to release collected particles effectively. Moreover, Gel filters used in this study had trace amounts of contaminating bacterial DNA (Supporting information Figure S3), and therefore were inappropriate for culture-independent analysis, supporting previous findings (Fahlgren, Hagstrom, Nilsson, & Zweifel, 2010). It is unclear if this extends to other filters of biological origin that are used for bioaerosol studies, such as cellulose (Adams, Bhangar et al., 2015; Bowers, McLetchie, Knight, & Fierer, 2011; Cho & Hwang, 2011). Although Gel filters may remain useful for culture or microscopy studies, we would not recommend using them for molecular methods. Our results highlight the importance of running blank extractions and field blanks (Nehmé et al., 2009) as contamination can be introduced both at the sampling stage (e.g., Gel filter contamination) or in the lab during sequence library preparation (e.g., kit contamination; Adams, Miletto, Taylor, & Bruns, 2013; Nguyen, Smith, Peay, & Kennedy, 2015).

| Gram-negative bias in impingement samples
For culture-based analyses, the matrix used for impingement affects recovery of bacterial cells due to differential growth and lysis during storage at >4°C (Chang & Wang, 2015). For molecular analysis, samples are frozen, so growth is not expected to be an issue, but the differing resistance to lysis between groups of bacteria is a key consideration in any microbial ecology study (Guo & Zhang, 2013; Kennedy et al., 2014). In this study, we found significantly lower recovery of Gram-negative bacteria when not using PBS. This indicates that Gram-negative bacteria could be under-represented after liquid impingement, as they are less resistant to lysis during storage (e.g., freeze-thawing) than Gram-positive bacteria (Salton, 1953).
Once cells are lysed, genetic material is harder to collect by centrifugation due to its lower mass than intact cells, resulting in the lower recovery rates observed. Using a buffer such as PBS may have reduced bacterial cell lysis during storage and associated freeze-thawing, thus improving DNA recovery. A possible alternative to PBS could be using an additive such as Tween or glycerol (Le Goff, Bru-Adan, Bacheley, Godon, & Wéry, 2010; Le Goff et al., 2012). For culture-based methods, an impingement mixture containing Tween 80, peptone and Antifoam Y-30 marginally outperformed PBS for recovery of Staphylococcus aureus (Chang & Wang, 2015). However, it resulted in faster rates of evaporation during sampling, which is a major limitation with impingement, and the peptone acted as a substrate for bacterial growth during storage (Chang & Wang, 2015).

| Syringe filters versus centrifugation for recovery of cells from liquid impingement samples
In the field, centrifugation is not always logistically tractable, whereas pre-sterilized syringe filters are easily deployed. Filters may also recover small components such as DNA and spores that are harder to recover by centrifugation, making the results more representative (Mbareche et al., 2017). In this study, syringe filters recovered significantly fewer bacteria than centrifugation (Figure 5); however, there was no significant effect on bacterial alpha or beta diversity. Indeed, there was a nonsignificant trend towards higher diversity with the syringe filters (Figure 6b-d), supporting findings elsewhere (Mbareche et al., 2017). There were also some changes in the relative abundance, but not presence/absence, of specific taxa (e.g., increase in Enterobacteriales for syringe filters, Supporting information Figure S2). Recovery with syringe filters could be used for rapid species-specific identification of bioaerosol agents in minutes if coupled with a portable analysis method such as loop-mediated isothermal amplification (LAMP), or film array-based PCR (Al-Sheikh, 2015; Lu, Mo, Zhao, Yan, & Shi, 2011; Weller et al., 2012

| Is air filtration or liquid impingement the best method for air sampling?
Air filtration collected approximately an order of magnitude more 16S rRNA gene copies than liquid impingement across environmental contexts with varying bioaerosol concentrations. However, our results only considered total DNA yield and further investigation is needed to determine whether differences in microbial diversity exist between methods. The collection of specific microbial targets may also be influenced by other factors such as their dispersal mechanisms or weather conditions. Thus, information on method-dependent patterns of microbial diversity may be vital for developing taxon-specific sampling methods (e.g., targeting a pathogen). Frankel, Timm, Hansen, and Madsen (2012) also found that filters are more effective at collecting various bioaerosols than impingement. One possible reason for this is the relative efficiency of recovering cells from liquid versus filters. Results from Experiment 3 showed a nonsignificant trend towards lower recovery from a liquid by centrifugation than direct extraction from a PC filter. Another possibility is the cut-off size (0.5 µm with the Coriolis µ) and decreasing collection efficiency for smaller particles with impingers (Dybwad et al., 2014a).
Collecting the highest DNA yield is not the only consideration in obtaining a representative sample. Bioaerosols can show high spatial and temporal heterogeneity, which cannot be captured with long sampling periods (Dybwad, Skogan, & Blatny, 2014b;Emerson et al., 2017). In soils, it has been proposed that a large number of low-volume samples are preferable to a few large-volume samples to capture high heterogeneity in microbial communities (Ranjard et al., 2003). The higher sampling rates achievable with (some) impingement systems (>100 L/min), compared to air filtration (2-30 L/min), make impingement suitable for collecting snapshot samples (Blais Lecours et al., 2012;Bowers et al., 2011;Le Goff et al., 2012;Madsen et al., 2015;Pankhurst et al., 2012;Robertson et al., 2013;Shin et al., 2015;Triadó-Margarit et al., 2016). However, caution should be taken when comparing air samples of varying duration and flow rate, as collecting 300 L over 10 min may not be the same as collecting 300 L over 2 hr and could represent fundamentally different microbial communities.
For example, when a large amount of material is collected, the rare members of a community might be overlooked. [...] Table S2). Either there is a tradeoff with efficiency at higher flow rates, or 300 L/min was already sufficient to collect the available material. In addition, it may not be appropriate to increase the sampling time with filters due to sample desiccation. For example, Luhung et al. (2015) found that with long sampling periods the bacteria on filters degraded quickly, and as a result the DNA yield did not increase with time and the sample was only representative of the bacteria collected during the latter stages of sampling. This is especially important for determining microbial functional activity with RNA analyses, as RNA is degraded quickly and the bacteria will start to transcribe genes related to this stress.

| Determining optimal sampling time with impingement and filters
The variable nature of bioaerosol concentrations over very short time scales (minutes) also needs to be considered. Our results in Experiment 4 show that the increase in yield is not directly proportional to the time sampled (Figure 7). Bioaerosol concentrations, rather than being constant, are liable to sudden and short peaks of high concentrations. Supporting information Figure S4

| Assessing health risk of bioaerosols with molecular methods
The negative health effects of bioaerosols are a large driver for their research (Douwes et al., 2003). However, determining the health relevance of data from molecular bioaerosol studies is a challenge.
Molecular methods are liable to false positives as they have low detection limits and collect genetic material from dead cells. Often the call from regulators is that we should only be interested in "viable" microorganisms, by which they mean culturable. It is a misconception that the subset of the community that is culturable reflects the active/infective microorganisms. Strategies that attempt to combine culture-based and molecular methods should be treated with caution as they combine biases inherent to both (Duquenne, 2018). Rather, we could look to RNA-based methods, such as metatranscriptomics, to determine the active proportion of the community.
A common way of assessing the health relevance of air pollution is to collect material from the inhalable/respirable size range (e.g., [...]

Our ability to detect pathogenic bioaerosols with molecular methods is improving rapidly, but we cannot utilize these data unless we know what concentrations are meaningful from a health perspective.

| Limitations of the study
In this study, not all available sampling methods were tested. For example, we used only one representative of each class of filter and therefore we can only make recommendations based on the conditions we tested. However, we do present methods for field sampling with filters or impingers that work across a wide range of environmental contexts (e.g., levels of biomass, different inhibitors and weather conditions). We aimed to optimize methods based on obtaining the maximum DNA yield possible, as low bioaerosol concentration across environmental settings is currently a key obstacle (Aguayo et al., 2018;Castaño et al., 2017).
Determining the total concentration of bioaerosols accurately is important to public health. However, we are unable to draw any conclusions with respect to recovering maximal microbial diversity. Optimizing sampling based on obtaining the maximum bacterial diversity alone would not have been sufficient for developing the quantitative sampling methods required for public health monitoring. Notwithstanding this, the effects of sampling methods on diversity need further investigation.
We used pure cultures of bacteria as surrogate pathogens to test bioaerosol sampling procedures. Escherichia coli and Bacillus subtilis were selected as they are commonly found in bioaerosols (Degois et al., 2017;Dubuis et al., 2017;Pankhurst et al., 2012). However, it is unknown if these bacteria would behave in the same way when part of bioaerosol communities or when in combination with airborne contaminants. The structure of bacterial bioaerosol communities varies between sites/seasons, as does the concentration and composition of particulate matter. Different concentrations or types of particulate matter may repress the release of bacteria from filters and inhibit molecular analysis. One option is to use chamber studies, which are ideal for assessing collection efficiency, to mimic the behaviour of bacteria in bioaerosols (Carvalho et al., 2008;Dybwad et al., 2014a;Miaskiewicz-Peska & Lebkowska, 2012). However, it is difficult to control bacterial concentrations for accurate quantification when using chambers, which was important for the goals of this study. Furthermore, chamber studies do not truly represent the form of environmental bioaerosols and the range of weather conditions and contaminants that may be found. Despite these limitations, we have shown (Experiments 4 and 5) that our methods can be translated to the field to recover genetic material across a range of environmental contexts.

| CONCLUSIONS
Air filtration using PC filters gives the greatest DNA recovery, as air filtration collected an order of magnitude more bacteria per m³ of air sampled in comparison with impingement. Therefore, PC gives the "best" quantitative data, but due to the long sampling times required, this method may average out temporal variations. In contrast, given the faster sampling rates with impingement, we recommend this method for fine-scale temporal/spatial ecological studies. With impingement, the liquid matrix should be PBS to reduce possible biases in recovering Gram-negative and Gram-positive bacteria. The optimal way to collect bacteria from the liquid matrix is centrifugation. However, for rapid recovery and on-site analysis in the field, syringe filters are a viable alternative. Importantly, it is not the sampling procedure alone, but the combination of the sampling procedure and extraction method that is crucial. Although we assessed bioaerosol sampling across different environments (Experiments 3-5), further method optimization is needed to cover other environments and to consider microbial diversity and DNA yield together. Ultimately, molecular ecologists need to consider the conditions of their specific environment in conjunction with their study aims in order to make an informed decision about which methods to use, and this study provides a resource to facilitate this.

ACKNOWLEDGEMENTS
We thank NERC (Ref: NE/M010813/1) for funding this work. We thank all site operators for providing site access.

DATA ACCESSIBILITY
Sequences from this study are available through the European Nucleotide Archive under Project accession number PRJEB26329 and the individual sequences are ERS2414111--23. All other data generated or analysed during this study are included in the Supporting information Appendix S1 files.