Improved detection of influenza A virus from blue‐winged teals by sequencing directly from swab material

Abstract The greatest diversity of influenza A virus (IAV) is found in wild aquatic birds of the orders Anseriformes and Charadriiformes. In these birds, IAV replication occurs mostly in the intestinal tract. Fecal, cloacal, and/or tracheal swabs are typically collected and tested by real‐time RT‐PCR (rRT‐PCR) and/or by virus isolation in embryonated chicken eggs in order to determine the presence of IAV. Virus isolation may impose bottlenecks that select variant populations that are different from those circulating in nature, and such bottlenecks may result in artifactual representation of subtype diversity and/or underrepresented mixed infections. The advent of next‐generation sequencing (NGS) technologies provides an opportunity to explore to what extent IAV subtype diversity is affected by virus isolation in eggs. In the present work, we evaluated the advantage of sequencing by NGS directly from swab material of IAV rRT‐PCR‐positive swabs collected during the 2013–14 surveillance season in Guatemala and compared to results from NGS after virus isolation. The results highlight the benefit of sequencing IAV genomes directly from swabs to better understand subtype diversity and detection of alternative amino acid motifs that could otherwise escape detection using traditional methods of virus isolation. In addition, NGS sequencing data from swabs revealed reduced presence of defective interfering particles compared to virus isolates. We propose an alternative workflow in which original swab samples positive for IAV by rRT‐PCR are first subjected to NGS before attempting viral isolation. This approach should speed the processing of samples and better capture natural IAV diversity. OPEN RESEARCH BADGES This article has earned an Open Data Badge for making publicly available the digitally‐shareable data necessary to reproduce the reported results. The data is available at https://doi.org/10.5061/dryad.3h2n106.


| 6535
FERRERI Et al. and shorebirds), are considered the natural hosts for 16  and 9 NA (N1-9) subtypes. In these birds, IAVs replicate primarily in the gastrointestinal tract in the absence of overt signs of disease.
Influenza A viruses are excreted in fecal material and are naturally perpetuated through fecal-oral transmission. The segmented genome allows IAVs to exchange gene segments, and thus, strains representing most HA/NA combinations are found in nature (Munster & Fouchier, 2009;Webster, Bean, Gorman, Chambers, & Kawaoka, 1992). Additional HA and NA subtypes have been identified in IAV associated with fruit bats (H17N10 and H18N11), but there is no evidence of exchange of genetic material between bat and avian IAVs (Tong et al., 2012(Tong et al., , 2013. For IAV surveillance in wild birds, fecal, cloacal, oropharyngeal, and/or tracheal swabs are tested by rRT-PCR and/or by virus isolation in embryonated chicken eggs (ECE) and/or tissuecultured cells (TC) (Webster, Cox, & Stohr, 2005). Virus isolation (VI) strongly depends on the quality of the swab material and the presence of an adequate amount of infectious virus. This approach can impose selective bottlenecks that may either change the viral genome consensus sequence (Varble et al., 2014), allow competition between strains present in the original sample, lower the detection of mixed infections (Lindsay et al., 2013), and/or generate a bias in favor of HA and NA subtype combinations that are better fit for replication in ECE. In addition, VI strongly depends on the preservation of viable viral particles, and depending on the number of passages required, it may take up to 2 weeks to obtain a positive virus isolate (Webster et al., 2005). Previous studies have reported that rRT-PCR is more sensitive than VI for detection of IAV (Gonzalez-Reiche, Muller, Ortiz, Cordon-Rosales, & Perez, 2016;Munster & Fouchier, 2009;Runstadler et al., 2007) but differences may reflect the ability to detect RNA from noninfectious IAV (Brown, Poulson, Carter, Lebarbenchon, & Stallknecht, 2013).
The utility of next-generation sequencing (NGS) and the ability to sequence directly from the original swab material for rapid IAV detection will produce more complete and accurate data reflective of naturally occurring subtypes and genetic diversity (McGinnis, Laplante, Shudt, & George, 2016;Zhao et al., 2016). In the present report, we show how performing NGS from original swab (swab-NGS) material can improve IAV characterization from field samples.

| Sample collection, rRT-PCR, and virus isolation (VI)
Samples were collected from hunter-killed ducks during the winter migration season 2013-2014 in the villages of El Pumpo, in the department of Santa Rosa; Pasaco, in the department of Jutiapa and La Gomera, in the department of Escuintla, Guatemala. Sampling sites and tracheal and cloacal swab collection methods from birds followed previous descriptions (Gonzalez-Reiche et al., 2012).
Permits for sampling different bird species at sampling sites were obtained from the Center for Conservation Studies (CECON) and the National Council of Protected Areas (CONAP). Samples were preserved in 1 ml of virus transport medium (VTM,Medium 199 with Hanks balanced salt solution, 2 mM l-glutamine, 0.5% bovine serum albumin, 0.35 g/L sodium bicarbonate, 2 × 10 6 IU/L penicillin, 200 mg/L streptomycin, 2 × 10 6 IU/L polymyxin B, 250 mg/L gentamycin, 0.5 × 10 6 IU/L nystatin, 60 mg/L ofloxacin, and 0.2 g/L sulphamethoxazol) (Gonzalez-Reiche et al., 2012) and stored at −70°C until used. All samples were tested for the presence of the IAV matrix (M) gene RNA by real-time reverse-transcriptase polymerase chain reaction (rRT-PCR) (Spackman et al., 2002). Swabs showing Ct values <40 were considered positive. The details of the methods for RNA extraction, molecular testing, and virus isolation have been described elsewhere (Gonzalez-Reiche et al., 2012). Viral isolation was attempted for rRT-PCR IAV-positive samples. Briefly, 200 µl of VTM from cloacal swab samples was inoculated into the allantoid cavity of 9-day-old specific pathogen-free ECEs, and the eggs were then incubated for 72 hr and harvested in accordance with standardized protocols described in the WHO Manual on Animal Influenza Diagnosis and Surveillance (Webster et al., 2005). Collected allantoid fluids were tested by the hemagglutination assay to assess for presence of IAV as described (Webster et al., 2005). Briefly, allantoid fluid-containing virus was serially diluted twofold using phosphate-buffered saline (pH 7.4) and mixed 1:1 with 0.5% chicken red blood cells in V-bottom 96well plates. Reading was performed after 45 min. Allantoid fluids that tested negative were diluted in half with phosphate-buffered saline (pH 7.4) and inoculated in a new batch of ECE. The process was repeated up three times as needed.

| Amplicon purification and library preparation
Amplicons from MS-RT-PCRs were cleaned by 0.45× of Agencourt AMPure XP Magnetic Beads (Beckman Coulter) according to manufacturer's protocol and eluted in 30 µl of HyClone molecular biology water (Genesee Scientific). Concentration of the eluate was measured using the Qubit dsDNA HS Assay kit (ThermoFisher) on the Qubit 3.0 fluorometer (ThermoFisher). Amplicons were normalized to 0.2 ng/µl. Adapters were added by tagmentation using Nextera XT DNA library preparation kit (Illumina). The reaction was set up using 40% of the suggested final volume. Libraries were purified using 0.7× Agencourt AMPure XP Magnetic Beads, and fragment size distribution was analyzed on the Agilent Bioanalyzer using the High Sensitivity DNA kit (Agilent). Next, samples were normalized to 4 nM and pooled. The loading concentration of the pooled libraries was 15 pM. Libraries were sequenced using the MiSeq v2, 300 cycle reagent Kit (Illumina) in a paired-end fashion (150 × 2).

| Genome assembly and variant analysis
Genome assembly was performed using a pipeline previously developed at Icahn School of Medicine at Mount Sinai by Harm Van Bakel and described (Mena et al., 2016). Briefly, low-quality sequences and adapters were removed by Cutadapt (Martin, 2011) from paired fastq files. An initial assembly was done using the inchworm component of Trinity (Grabherr et al., 2011) and viral contigs bearing internal deletions were identified by BLAT (Kent, 2002) mapping against nonredundant IRD reference sequences. Afterward, breakpoint-spanning kmers from the assembly graph were removed by repeating the inchworm assembly, and the resulting contigs were then oriented and trimmed to remove low-coverage ends and any extraneous sequences beyond the conserved IAV termini. To improve contiguity, the CAP3 (Huang & Madan, 1999) assembler was used and contigs from the same segment were merged if their ends overlapped by at least 25 nt. Finally, assembly contigs and contiguity were assessed for all segments by mapping sequence reads back to the final assembly using Burrows-Wheeler Alignment (Li & Durbin, 2009). For the H5 HA segments, we also performed de novo assembly using Trinity (Haas et al., 2013). Pairwise alignment did not show any differences in the HA segment regardless of the assembly approach. Nucleotide variant analysis was performed using the program LoFreq (Wilm et al., 2012) following the Genome Analysis Toolkit best practices (Van der Auwera et al., 2013). The cutoff for minor variant analysis was arbitrarily set at 1,000× based on similar analysis found in the literature (Grubaugh et al., 2019;McCrone & Lauring, 2016;Wilm et al., 2012). Briefly, after removing adapters using Cutadapt, reads were mapped back to their reference using the option mem from BWA (Li & Durbin, 2009). Formatting the data for input to GATK was made using Picard (http://broad insti tute.github. io/picar d/). Reads were realigned using RealignerTargetCreator and IndelRealigner from GATK. Finally, base's qualities were recalculated using BaseRecalibrator from GATK. The resulted bam file was used to perform variant calling analysis by LoFreq.

| Plots
Data analyses were performed using GraphPad Prism software,

| Influenza A virus detection from field samples was increased by the swab-NGS protocol
During the 2013-2014 IAV surveillance season in Guatemala, 579 paired tracheal/cloacal swab samples were obtained from wild aquatic birds, particularly blue-winged teals (Table 1). Screening for the IAV matrix (M) gene by rRT-PCR resulted in 74 IAV-positive samples with Ct values ranging from 18.5 to 39.3 (prevalence of 12.8%).
Positive samples were subsequently tested by H5 subtype-specific rRT-PCR (Spackman et al., 2002); nine samples (1.5% prevalence) yielded Ct values ranging from 28.1 to 36.6. This set of 74 IAV rRT-PCR-positive samples was further characterized by and compared under two NGS protocols. In the first, we followed a traditional protocol by attempting VI in ECE. Virus isolates were subsequently subjected to RNA extraction, followed by IAV genome amplification using multisegment RT-PCR (MS-RT-PCR) and sequencing by NGS (VI-NGS). In the second protocol, RNA was extracted directly from the original swab material, and the influenza genome was amplified by MS-RT-PCR and then sequenced by NGS (swab-NGS).
VI in ECEs resulted in 21 positive samples (28.4% of IAV rRT-PCR-positive samples). The number of passages needed for VI varied between 1 and 3 among samples (Table 1)

| Molecular analysis shows markers of resistance to NA/M2 inhibitors, increased virulence in mammals, and two different PB1-F2 ORFs
In order to better define the animal and public health risk of these viruses, molecular markers associated with drug resistance, enhanced transmission, and virulence to mammals were analyzed.

| Swab-NGS detects an unusual cleavage site motif for low-pathogenicity H5 subtype viruses
Transition from low to high pathogenicity depends on the acquisition of several basic amino acids on the cleavage site of the HA protein.
Subtypes that can naturally acquire this property have been so far restricted to H5 and H7. Protein sequence analyses of the different H5 (116-96, 116-07, 116-22, 116-50, 117-42, and 116-26)  Unique identifier with an "_I" and a "_S" indicates pair samples that produce NGS data by both VI-NGS and swab-NGS, respectively. b Genome assembly is indicated by either complete or incomplete genomes. Gene segment number with missing sequence information is indicated in parentheses and correspond to 1 (PB2), 2 (PB1), 3 (PA), 4 (HA), 5 (NP), 6 (NA), 7 (M), and 8 (NS). c VI, virus isolation. Number in parentheses indicate number of ECE passages that were required to produce a virus isolate. A (−) indicates a sample that was negative for virus isolation after three blind passages in ECE (but positive for swab-NGS). A n/a corresponds to pair "_S" samples. d Virus titers in ECE's allantoic fluid as measured by hemagglutination units (HAU) using chicken red blood cells.
TA B L E 1 (Continued) this molecular marker in these samples. The biological significance of such motif in nature and/or for in vitro or in ovo growth remains to be determined.

| Comparison of VI-NGS and swab-NGS shows differences in nucleotide and amino acid consensus sequences
Since modern IAV surveillance efforts have relied on sequencing of virus isolates, it has not been possible to ascertain the impact of ECEs (or tissue culture) on the selection of variants and/or subtype combinations. In this report, we sequenced 12 paired samples by Altogether, these results suggest that not all field samples are under selective pressure during VI in ECE.

| Fixed amino acid substitutions after ECE were detected as minor variants by swab-NGS pior isolation
The presence of minor variants helps the population to adapt to environmental changes. To assess the presence of minor variants that were later selected under VI, we carried out minor variant analysis. We used paired samples from swab-NGS and VI-NGS on sites where amino acid differences were detected. We analyzed

| Swab-NGS allows for assessment of genome integrity in the natural host
Genome integrity in IAV viruses may be disrupted by the generation of defective interfering particles (DIP) (Von Magnus, 1954). but carrying intact packaging signals (Davis, Hiti, & Nayak, 1980;Nayak, 1980;Nayak, Chambers, & Akkina, 1985;Odagiri & Tashiro, 1997). Previously, DIPs were detected in influenza viruses obtained from naturally infected human and chicken samples (Chambers & Webster, 1987;Saira et al., 2013). Within NGS data, DIPs are usually inferred by the formation a valley-like shape in the coverage distribution plot, produced by a bias of reads toward the 3′ and 5′ ends of the gene segments (Lee, Lee, Tang, Loh, & Koay, 2016;Saira et al., 2013). We compared the coverage distribution of PB2, PB1, and PA produced by swab-NGS and VI-NGS. The mean coverage for each segment was calculated and depicted in order to highlight the topography of the coverage plots ( Figure 3).
As it is easily observed, swab-NGS revealed no indication of major DIP populations in the samples. In contrast, VI-NGS showed reads compatible with the presence of DIPs in the sample. These observations suggest that DIPs observed from VI-NGS may not reflect the production of such particles in nature.

| D ISCUSS I ON
Historically, IAV detection in wild bird samples has been performed by virus isolation in ECE with or without prior testing of the samples by classical RT-PCR or rRT-PCR. The entire process is time-consuming and occasionally requires multiple ECE passages before a virus is isolated. Our group has been performing IAV surveillance in wild birds in Guatemala and Argentina for more than a decade (Gonzalez-Reiche et al., 2012, 2017Pereda et al., 2008;Rimondi et al., 2018Rimondi et al., , 2011Xu et al., 2012). In both of these countries, we have described the circulation of highly diverse IAVs in terms of subtype combinations as well as phylogenetic characteristics. Due to multiple limitations including reliable access to specific pathogen-free ECEs in other countries, increasing importation costs, and regulatory restrictions that prevent rapid sample evaluation, we sought to improve virus characterization by performing NGS directly from swab samples. Although NGS has been previously used to characterize IAV from original samples (Ren et al., 2013;Seong et al., 2016;Wang et al., 2008;Zhao et al., 2016), Even though total numbers favored swab-NGS, the subtype landscape depicted by both protocols showed substantial differences. found discrepancies in the resolution of mixed infections, which is in agreement with previous reports (Lindsay et al., 2013;Wang et al., 2008). Competition between subtypes with variable fitness for ECE isolation may account for the discrepancies observed between swab-NGS and VI-NGS with respect to mixed infections (Stallknecht et al., 2012;Varich, Gitelman, Shilov, Smirnov, & Kaverin, 2008).
At least one virus field sample (117-123) showed molecular markers of antiviral resistance to adamantane and oseltamivir, stressing natural circulation of drug-resistant IAV strains in wild aquatic birds. This sample also contained the PB1-F2 S66 marker, which has been associated with increased pathogenicity in mammals (Conenello et al., 2011). All sequenced PB1 segments showed either the 87 or 90 amino acid-long PB1-F2 protein sequence, which is common in isolates from avian species (James et al., 2016). We also found N and T at position 66; interestingly, PB1-F2 T66 was only found in samples (n = 3 samples) sequenced through swab-NGS. The presence of PB1-F2 T66 in databases is low, suggesting either low circulation of this marker in avian species and/or detection bias. In chickens, PB1-F2 has been implicated in increasing the duration of virus shedding upon infection and lowering pathogenicity (Kamal et al., 2015;Krumbholz et al., 2011;Zell et al., 2007). The consequence of the total length and residue preference at position 66 in PB1-F2 for fitness in natural hosts remains to be determined.
The VI-NGS coverage topography suggested the presence of DIPs in the samples Saira et al., 2013). In this case, synthetic DNA technology and reverse genetics offer the possibility to rescue a virus isolate from genomic information.
The swab-NGS proposed in this report has already been successfully used with human samples (Meinel et al., 2018). In our hands, it has been successfully applied on field samples from a variety of species such as A. discors, Anas platyrhynchos, Anas georgica, Netta peposaca, and Amazonetta brasiliensis as well as swine samples (data not shown). Hence, the swab-NGS workflow can be successfully applied to a variety of species. Newer sequencing technologies, such as nanopore sequencing (Keller et al., 2018a(Keller et al., , 2018b, and protocol refinements could result in improvements of this workflow in order to speed up virus genomic characterization from original swab samples.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R S ' CO NTR I B UTI O N S
LMF designed studies, developed methods, analyzed data, and wrote and edited the manuscript. LO, DM, ASGR, and CCR collected and processed field samples, and edited the manuscript. GPB and ASGR performed phylogenetic analysis, RP, JAC, and DS analyzed data and edited the manuscript. DR edited the manuscript. DRP designed studies, analyzed data, and wrote and edited the manuscript.