Full‐length 16S rRNA gene classification of Atlantic salmon bacteria and effects of using different 16S variable regions on community structure analysis

Abstract Understanding fish‐microbial relationships may be of great value for fish producers as fish growth, development and welfare are influenced by the microbial community associated with the rearing systems and fish surfaces. Accurate methods to generate and analyze these microbial communities would be an important tool to help improve understanding of microbial effects in the industry. In this study, we performed taxonomic classification and determination of operational taxonomic units on Atlantic salmon microbiota by taking advantage of full‐length 16S rRNA gene sequences. Skin mucus was dominated by the genera Flavobacterium and Psychrobacter. Intestinal samples were dominated by the genera Carnobacterium, Aeromonas, Mycoplasma and by sequences assigned to the order Clostridiales. Applying Sanger sequencing on the full‐length bacterial 16S rRNA gene from the pool of 46 isolates obtained in this study showed a clear assignment of the PacBio full‐length bacterial 16S rRNA gene sequences down to the genus level. One of the bottlenecks in comparing microbial profiles is that different studies use different 16S rRNA gene regions. Comparisons of sequence assignments between full‐length and in silico derived variable 16S rRNA gene regions showed different microbial profiles with variable effects between phylogenetic groups and taxonomic ranks.

provided a lower taxonomic profiling of the human feces, partly due to sequence accuracy and low coverage of terminal regions in the 16S rRNA databases (Whon et al., 2018). Still, long-read sequencing such as PacBio circular consensus sequencing (CCS) applied on the 16S rRNA gene provides a promising approach to increase the taxonomic resolution of microbial communities. The CCS technology generates a consensus sequence from a single molecule by reading a ligated circular DNA template multiple times, achieving high read accuracy (Travers, Chin, Rank, Eid, & Turner, 2010). However, few studies have investigated advantages and disadvantages of PacBio CCS for such analyses. A recent study showed that PacBio sequencing error rates were in the same range as Roche 454 and MiSeq platforms (Wagner et al., 2016). The authors reported inconsistencies in species-level analysis between full-length 16S rRNA gene sequences obtained from Sanger and PacBio sequencing comparisons and that more sample types were needed to determine whether partial or full-length 16S rRNA gene sequences was superior in terms of taxonomic profiling effectiveness. The first metagenomic marine environmental samples from PacBio CCS sequences provided a superior taxonomic resolution to the species level compared to in silico derived partial regions of the gene (Pootakham et al., 2017).
In fish, both the skin epithelial surface and the gastrointestinal tract are covered by a mucosal layer. The main components of this mucus layer are secreted glycoproteins called mucins, which are differentially regulated between tissue types in Atlantic salmon, Salmo salar (Sveen, Grammes, Ytteborg, Takle, & Jørgensen, 2017). These glycosylated proteins might be a highly attractive substrate for the attachment and settlement of microorganisms.
Interaction studies between cutaneous microbiota and the fish surface are scarce, but mutualistic relationships (Beklioglu, Telli, & Gozen, 2006) and the existence of a resilient microbiome (Larsen, Bullard, Womble, & Arias, 2015) has been suggested. Salinity acclimation results in turnover of dominant bacterial taxa within the host microbiome that are unrelated to changes in water microbiota (Schmidt, Smith, Melvin, & Amaral-Zettler, 2015), which is also reported for Atlantic salmon (Karlsen et al., 2017;Lokesh & Kiron, 2016). Microorganisms populating the gastrointestinal tract are believed to take part in digestive functions and contribute to fish health (Nayak, 2010). Many fish gut bacteria are not detected from water samples (Sullam et al., 2012), and gut microbiota profiles in Atlantic salmon change between intestinal compartments (Gajardo, Rodiles, et al., 2016), rearing environments (Dehler, Secombes, & Martin, 2017), and diets (Schmidt, Amaral-Zettler, Davidson, Summerfelt, & Good, 2016). Sequencing the 16S rRNA genes is a powerful tool that provides a comprehensive picture of the phylogenetic diversity and composition of the microorganisms present in a sample as many of the microbial groups are absent or difficult to cultivate. Traditionally, diagnosis of suspected Atlantic salmon bacterial infections has relied on clinical signs and symptoms, and microbiological culture-dependent methods. Many bacteria associated with salmonids have been considered difficult to cultivate and more accurate culture-independent diagnostic procedures are developed (Grove, Reitan, Lunder, & Colquhoun, 2008;Sepúlveda, Bohle, Labra, Grothusen, & Marshall, 2013).
However, culturing is an important step to better understand effects of microorganisms. Recovering isolates of microbial symbionts is increasing in focus as they can be used to study activity and functional relationships to a host (Esteves, Amer, Nguyen, & Thomas, 2016;KleinJan, Jeanthon, Boyen, & Dittami, 2017). The present study was also designed to recover and identify members of the Atlantic salmon microbial communities, characterized by 16S rRNA gene sequencing, for future studies.
The changing environment in the salmon production, utilization of different feeds, use of closed or semi-closed recirculation systems and transfer to unprotected seawater environment at the final stage of production affects skin and gut of complex microbial communities. Taxonomic profiling that can reveal alterations or deviations from "normal" microbial communities may advance our understanding of any functional effects of detected microbiota. Appropriate methods that accurately generate and analyze the microbial communities would be an important tool to help improve understanding of microbial effects in the aquaculture industry. Here, we applied longread technology to demonstrate its utility, as a proof of concept, in characterizing the microbial profiles of the skin surface and the bulk intestinal content of Atlantic salmon.

| Sample collection and DNA extraction
This study utilized fish from an industry research study conducted at 12 ppt salinity with recirculated water with temperature of 13°C at the Research Station for Sustainable Aquaculture (Sunndalsøra, Norway) in accordance with regulations of the Norwegian Food Safety Authority. As proof of concept, six Atlantic salmon were selected for sampling, anesthetized with benzocaine and killed by a blow to the head. The six sampled individuals were split between two dietary groups. Three had been fed fish meal (mean ± SEM body weight was 193 ± 41 g), and three were from a diet where fish meal was substituted with krill meal (mean ± SEM body weight was 181 ± 20 g). Fish were not fed for 48 hr prior to sampling. Skin mucus samples were obtained by swiping a cell swiper across the left side of the fish and concentrating mucus, which was aspirated by pipetting.
The abdominal cavity was then opened and an incision was made to open the distal intestine. The intestinal content was collected by gently scraping the intestine to collect bulk feces including the intestinal mucus layer. All tissue material was stored in 96% EtOH. DNA was extracted from 200 mg skin mucus samples and 100 mg intestinal samples. The protocol was performed using the PowerLyzer® PowerSoil® DNA Isolation Kit (MoBio) according to the manufacturer's specification with the following amendments: samples after adding Solution C1 were heated at 70°C for 10 min. Samples were homogenized with the mechanical bead beater device Precellys®24 (Bertin Technologies) for 1 × 20 s at 5,000 rpm. The DNA was resuspended in 30 μl of DNase/RNase free molecular water and concentration determined using a Thermo Scientific Nanodrop 2000c.

| Bacterial isolation and identification
Bacteriology was performed on the same sampled individuals for the total bacterial DNA sequence analysis. Skin mucus (200 mg) and bulk intestinal feces/mucus (100 mg) were separately vortexed and suspended in total of 1 ml 0.9% NaCl saline solution. In addition, gill mucus material and water were similarly suspended in saline solution to retrieve bacterial isolates, included in the Appendix A.
A 100 μl aliquot of the content was serial diluted up to 10 -6 and plated in duplicate onto R2A (BD Difco), representing low-nutrient conditions and MacConkey agar (CM007 Oxoid), representing highnutrient conditions, under aerobic incubation at 12°C for 9 days.
Plates were inspected and colony numbers were counted based on morphological characteristics, that is, pigmentation, colony form, elevation, surface appearance, and texture. The relative distribution in percentage between colony morphologies is provided as an average of each sample type. Representative colonies were selected according to dominant morphologies and then identified by 16S rRNA gene sequence analysis. Briefly, representative colonies were selected for purity plating onto R2A or MacConkey plates.
Genomic DNA isolation was performed using PureLink® Genomic DNA Mini Kit (Invitrogen). PCR amplification of the 16S rRNA gene with primers 27F (5′-AGAGTTTGATCMTGGCTCAG) and 1492R (5′-TACCTTGTTACGACTT) was identical to previous descriptions (Karlsen et al., 2014). Products were visualized following agarose (1.0%) gel electrophoresis and RedSafe (Chembio), before being purified using a QIAquick Gel Extraction Kit (Qiagen) followed by Sanger sequencing (sequenced at GATC Biotech, DNA sequencing services and bioinformatics, Germany). The forward and reverse sequences were assembled in Bioedit (Hall, 1999) and consensus sequences deposited in GenBank (submission: SUB3162162), with accession numbers MG263463-MG263508 (Table A1, Appendix). Sequences were aligned with type strain reference sequences using Sequence Match software from The Ribosomal Database Project II (RDP II) web site (Cole et al., 2014). The phylogenetic relationships between sequences were constructed utilizing selected 16S rRNA sequences of type strains in each genus ( Figure A1). Sequences were aligned using the ClustalW algorithm in BioEdit (Hall, 1999). The phylogenetic relationships were determined using maximum likelihood (ML) based on the Tamura 3-parameter model including all coding positions (total of 1,425) with 1,000 bootstrap trials (neighbor-joining tree) in MEGA6 (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013).

| PCR amplification, barcoding, and PacBio sequencing
To analyze the microbial population associated with the skin mucus and intestine, sequencing of the 16S rRNA gene was performed using the PacBio sequencing technology (Pacific Biosciences). The full-length 16S rRNA gene was amplified using degenerated versions of the universal bacterial 16S rRNA gene primers 27 F (5′-AGRGTTTGATYMTGGCTCAG) and 1492 R (5′-GGYTACCTTGTTACGACTT). In accordance with "Guidelines for Using PacBio® Barcodes for SMRT® Sequencing" guide, a 5 nt (5′-GGTAG) padding sequence was added to each unique 16 nt barcode to allow all barcodes to ligate to the SMRTbell™ adapter with equal efficiency. The utilization of barcodes allowed the multiplex sequencing of amplicons from several samples in one library using SMRT®. Primers were synthesized and HPLC-purified as recommended in PacBio's SMRT guidelines by Invitrogen Custom DNA Oligos (Thermo Fisher Scientific). Barcoded 16S rRNA amplicons were obtained by a two-step amplification protocol using Phusion® High-Fidelity PCR Master Mix (Thermo Fisher Scientific). The first PCR was performed in triplicate with the 27 F and 1492 R universal bacterial 16S rRNA gene primers using 100 ng of extracted total DNA, 1 × Phusion Master Mix, 0.5 μmol/L 16S-F forward primer, 0.5 μmol/L 16S-R reverse primer, in a 50 μl reaction volume. Samples were prepared on ice and amplified in the thermocycler with the block preheated to 98°C. The reactions were performed using the following cycling conditions: preincubation at 98°C for 2 min, followed by 25 cycles of denaturation at 98°C for 10 s, annealing at 55°C for 15 s, elongation at 72°C for 60 s, and a final extension step at 72°C for 3 min. Triplicate samples were pooled, and amplification was verified by 1% agarose gel electrophoresis before the reaction products were purified with an Invitrogen™ PureLink™ PCR Purification Kit. The purified PCR products were diluted and triplicates of 1 ng DNA was used as template for the second amplification reactions to generate padded barcoded products, with reagent concentrations as described above. Product amplification was as above with the following changes: 14 cycles with annealing at 60°C for 15 s. Triplicates were pooled, and products were verified by agarose gel electrophoresis and padded barcoded 16S rRNA gene amplicons from the reaction were purified using PureLink™ PCR Purification Kit and quantified using a NanoDrop spectrophotometer. Purified barcoded amplicons from the 12 samples (skin and intestine from 6 fish) were then pooled in equimolar concentrations, and 250 ng of DNA was used for library preparation at the Norwegian Sequencing Centre (www.seque ncing.uio.no). Briefly, library was prepared using PacBio 2 kb library preparation protocol. Size selection was performed using Ampure beads. Adapters were ligated onto the barcoded amplicons, and the library was sequenced on a PacBio RSII system using the P6-C4 polymerase and chemistry with a 360-min movie time, using one SMRT cell.
Sequences positively identified with both v-regions were advanced to generate two new trimmed data subsets for the v3-v4 and v5-v6 regions. Sequences were discarded if a flanking region could not be determined, suggesting a missing or partial v-region for a given sequence.
LCAClassifier (Lanzén et al., 2012) was applied to obtain the taxonomic mapping of the datasets (sample acronyms I1-I6 and SM1-SM6) and their respective data subsets. These were individually aligned against the SilvaMod database by MEGABLAST Finally, the analysis was carried out using default settings in the LCAClassifier program. Numerical data output from taxonomic classification was ordered based on ranks of taxonomy and the assignments obtained for the 12 datasets and data subsets. This was used to calculate the variation in taxon mapping between fulllength PacBio CCS 16S rRNA gene sequences and their respective v-regions. To detail taxonomic variations as principal components linked to the sampling sites of intestine and skin mucus only the full-length 16S rRNA gene sequences were considered. The class level was used due to a 98.92% or greater successful assignment of the filtered full-length sequences of any sample. The sample dataset sizes were downscaled to sample SM2, which had the lowest number of successfully assigned sequences. The scaled values of class data, treating zero as "NA," were uploaded to ClustViz (Metsalu & Vilo, ) with parameters set to not perform row scaling and the method set to SVD with imputation. Rarefaction analysis on the obtained operational taxonomic units (OTUs) was conducted using MicrobiomeAnalyst (Dhariwal et al., 2017). Unscaled OTU counts from mapping of reads with USEARCH were provided alongside their taxonomy lineage to genus level. Filtering of data was set using default parameters. Data were not rarefied or transformed, but total sum scaling was applied. The rarefaction curve was obtained with the filtered data using 20 steps.
Cultivability was determined using the 43,910 pooled full-length sequences combined with the 46 plate-isolated strains. Next, the USEARCH function cluster_fast was applied to cluster all sequences within an identity threshold of 97%. Sequences in clusters containing at least one of the cultivated strains were included when computing the cultivability percentage. OTUs were also determined for the pooled dataset of 43,910 full-length 16S rRNA gene sequences positive for v3v4 and v5v6 regions. USEARCH was applied following dereplication of the dataset, clustering OTUs at a 97% identity threshold, keeping parameters at default. 31,634 (~72%) of the full-length sequences were successfully reassigned to the 10 OTU representatives found and counted according to sample origin. The sum of sample specific assignments was scaled to fit the smallest sample size of SM2 and incorporated as pie charts in the phylogenetic inference described below.
The OTU centroid sequences were further used to infer phylogenetic relationship between these representatives along with the 46 plateisolated strains and 10 reference type strains. The 66 sequences were aligned with Clustal Omega v.1.2.1 (Sievers & Higgins, 2014) using default nucleotide parameters. The complete alignment was inferred using NeighborNet network with Uncorrected P distances in SplitsTree v.4.13.1 (Huson & Bryant, 2006).

| Characteristics of full-length 16S rRNA gene sequencing
Full-length 16S rRNA gene sequences were amplified from DNA extracted from sampled bulk intestinal content and skin mucus of six Atlantic salmon. Amplicons were generated using a twostep PCR approach with asymmetrical primers during the second round of amplification for a more cost-effective way to multiplex amplicons from several samples. A total of 110,818 reads were obtained with an average read length of 23,881 nt. Of these were 76,723 assembled and demultiplexed into CCS reads with an average accuracy of 98.56% and a mean length of 1,252 nt.
The average number of full passes of the CCS reads was 15.6. The number of processed full-length 16S rRNA sequences per sample ranged from 1,483 to 7,634 reads (Table 1). The microbiota composition was determined by a phylotyping approach directly allocating sequences into taxonomic groups based on bitscore and identity threshold. Of the trimmed reads used for assignment, an average of 96.9% of the skin mucus and 78.8% of the intestine were aligned to the taxonomic level order. According to the taxonomic resolution from order to genus, this method assigned more reads to taxonomic references for skin mucus samples compared to intestinal samples ( Figure 1). The discriminant taxon, based on the generated reads from the intestine samples was the prominent Clostridiales that became unassigned at higher resolution ranks.
The rarefaction analysis ( Figure A2) of clustered OTUs showed that the bacterial communities of the intestines are less diverse compared to the skin mucus samples. Convergence were reached (sample I4, I5, I6, SM1, and SM3) but seven out of twelve samples, including both intestine and skin mucus, did not converge properly in the analysis. To determine whether longer sequences of the 16S rRNA gene would be advantageous to assign sequences to the lower ranks of taxonomic affiliation, partial sequences spanning the v3v4 and v5v6 regions were extracted in silico from the full-length 16S rRNA gene sequence dataset. The proportion of assigned sequences between the datasets at the class, order, family, genus, and species levels were evaluated (Figure 3a). Using the SilvaMod database showed that overall sequence assignments varied between the full-length 16S rRNA sequences and partial 16S rRNA sequences in addition to differences in the proportions of the assigned sequences at the different taxonomic ranks ( Table 2). The v3v4 sequences had the highest proportion of assignment at the family and genus levels. At the species level, the full-length dataset had the highest proportion of assigned sequences (mean 8.9%). genera. Thirty-six genera were present in all datasets. The relative proportion of total sequences assigned to the 36 jointly shared genera was 49.04%, 59.62%, and 43.44% for the full, v3v4 and v5v6 datasets, respectively. In contrast, the proportion of sequences assigned to the 42 remaining ancillary genera ranged between 0.01% and 0.05% ( Figure 3b). To further compare taxonomic profiling, the number of sequences assigned to each bacterial taxon was identified (Table 2). This revealed differences in efficiency of read assembly in the different datasets. The discriminant taxon at the order level was Clostridiales with 64.5% and 63.2% higher assignments in the v3v4 and v5v6 data subsets, respectively. Discrepancy is also seen between short sequence length data subsets. An apparent example, which is observed down to the genus level, is sequences assigned to Carnobacterium where 23.1% more v3v4 sequences are assigned compared to the full-length, while the v5v6 data-subset had only 0.3% assigned compared to the fulllength dataset. The data further suggest differences in the proportion of assigned sequences between sequence length and the taxonomic rank genus and species. Both v3v4 and v5v6 region sequences have a higher proportion assigned to the genus Flavobacterium compared to the full-length dataset. This is opposite to species level where no v3v4 or v5v6 sequences are assigned while 8.3% of the full-length sequences are assigned to Flavobacterium frigidarium.

| Bacteria recovered by culture-dependent methods
The culture-based method aimed to provide representative isolates within anticipated genera to compare sequence information against the recovered CCS pool of sequences. Bacterial colonies were phenotypically categorized, and a total of 46 representative colonies were further identified and allocated to seven different genera based on the comparative 16S rRNA gene sequence alignment (Table A1).
The relative distribution between colony morphologies on plates retrospectively identified by phylogeny is provided as an average of each sample type in Table A2, Appendix. Isolate information, sample origin, diet group, growth medium used and the most closely related type strains to each genus group is shown in Figure A1, Appendix.

| Assignment of CCS reads to the bacterial isolates
The distribution of the 46 identified bacteria isolates among the pool of sequences is shown in Figure 4, where the displayed OTUs  Figure A1). An exception within the tree is cre-

| D ISCUSS I ON
The skin mucus microbial community was dominated by Proteobacteria within the genus Flavobacterium and Psychrobacter.
Of note is the indication of more predominant Carnobacterium in the fish fed fish meal. This corroborates a previous study that also substituted fish meal with krill meal followed by culture-dependent techniques on gut microbiota (Ringø et al., 2006) where Carnobacterium was present in non-krillmeal-fed fish. A large proportion of the Clostridiales CCS 16S rRNA gene sequences could not be taxonomically assigned above the order rank, indicating the presence of so far uncharacterized bacteria.
Bacteria in the seawater column often inhabit a nutrient poor environment and a large number of the marine bacteria that occur in this environment are suggested to be nonculturable using standard culture-based techniques (Giovannoni & Stingl, 2005). Bacteria F I G U R E 4 Phylogenetic relationships as NeighborNet network of representative OTUs (red), cultivated bacteria sequences (black), and reference strains (blue). Pie charts detail the given OTU by indicating the proportion of reassigned full-length 16S reads (gray) and how these are distributed from the samples (colored). The sample values are shown scaled to the minimal sample SM2. Scale bar shows distance as number of nucleotide substitutions per site associated with the Atlantic salmon integument are also considered difficult to cultivate. This includes pathogens such as Tenacibaculum (Olsen et al., 2011) and Moritella (Grove et al., 2008), and culturebased diagnostics of Atlantic salmon are considered unreliable (Grove et al., 2008). Early studies reported relative high cultivability of bacteria from salmonid intestines (Huber et al., 2004;Spanggaard et al., 2000), but also inconsistencies between culture-dependent and molecular-based methods (Hovda, Lunestad, Fontanillas, & Rosnes, 2007). Our approached did not aim to assess culturability, but was applied to recover Atlantic salmon isolates for future studies. Dominant bacteria in the intestine within the order Clostridiales and OTUs assigned to Aeromonas and Mycoplasma were not isolated, likely due to the growth media and aerobic condition used. Still, an overall ratio of culturability of 3.99% was demonstrated based on the presence/absence of all assigned CCS 16S rRNA gene sequences corresponding to cultivated isolates with 97% sequence identity. This dropped to 0.13% at 99% sequence identity. Both the dominating genera Flavobacterium and Psychrobacter are studied for their degrading properties (Lasa & Romalde, 2017;Loch & Faisal, 2015), and Carnobacterium is isolated from a range of environments (Leisner, Laursen, Prévost, Drider, & Dalgaard, 2007). It is possible that these genera are well-adapted to grow on the laboratory cultivation media used in this study.
Like many genera associated with the Atlantic salmon, the phylogenetic assignment is often based on the 16S rRNA gene.
However, there is a lack of properly defined bacterial species within the aquatic environment that in general hamper our ability to understand and organize bacterial diversity to these environments. There are also technological limitations that may impact on the ecological data description of these analyses (Schmidt, Matias, & Mering, 2015). Related to Atlantic salmon, many of the dominant groups or recovered isolates cannot be discriminated at the species level by their 16S rRNA gene sequences (Grove et al., 2010;Småge et al., 2015). Furthermore, our data suggest that the salmon contains more than one species within observed genera such as Psychrobacter. Covering the full-length sequences of 16S rRNA genes is expected to be advantageous for inferring phylogenetic affiliations and provide a more precise microbial community profiling of Atlantic salmon. However, our partial gene sequences were assigned in a higher abundance compared to the full-length sequences. Only at the species level was full-length sequences assigned in a higher proportion (mean 8.9% compared to 0.01% and 0.005% for v3v4 and v5v6 data subsets, respectively). Effects on abundance profiling and discrepancy with an overestimation in bacterial diversity between full-length 16S rRNA gene and partial variable datasets are in similar accordance with other studies (Singer et al., 2016;Sun et al., 2013;Wagner et al., 2016;Whon et al., 2018). The taxonomic resolution in amplicon sequencing might be affected by several factors. In our study, where we generated different sequence lengths in silico one possibility is differences of the intravariability within each targeted hypervariable region (v3v4 and v5v6) in comparison with the full-length sequence (Kumar, Brooker, Dowd, & Camerlengo, 2011). Another influencing factor could be the reference sequences in the choice of database used (Werner et al., 2012). Evaluating the species richness, we could not fully describe the community in all samples. The proportion of pooled sequences and the sample sizes obtained can be contributing factors to this lack of convergence where potentially important taxa have not been identified. Using the pooled Atlantic salmon community, we observed discrepancy in community structure and phylogenetic resolution across multiple taxonomic levels between full-length and partial sequences. Differences were revealed more clearly in some of the phylogenetic lineages. In our data, especially Clostridiales at the order level but also genera within  Using the full-length 16S rRNA gene sequence has the potential to become a tool for more precise microbial community profiling that better allows global comparisons of microbiome studies. The Arctic University of Norway.

CO N FLI C T O F I NTE R E S T S
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

AUTH O R CO NTR I B UTI O N S
CK initiated the study and performed the laboratory work. TK conducted and performed the bioinformatic analyses. NW coordinated the work. CK and TK drafted the manuscript and all authors provided critical feedback and helped shape the research, analysis, and manuscript.

E TH I C S S TATEM ENT
Fish in this study was based on post mortem sampling of material from fish harvested from a different industry research study for other purposes. Fish at the Research Station for Sustainable Aquaculture (Sunndalsøra, Norway) was reared by trained professionals in accordance with guidelines and regulations of the Norwegian Food Safety Authority.

DATA AVA I L A B I L I T Y S TAT E M E N T
The datasets generated and analyzed in this study are available in the BioProject repository at https ://www.ncbi.nlm.nih.gov/biopr oject/ PRJEB 28410 . F I G U R E A 1 Unrooted neighbor-joining tree constructed from maximum likelihood distances between 16S rRNA gene sequences obtained from isolates of this study with the most closely aligned type strain obtained from the RDP. Type strains are displayed with strain ID followed by the GenBank 16S rRNA gene accession number. GenBank accession number for isolates obtained in this study is provided in Table A1. Bootstrap values ≥ 70% based on 1,000 replicates are indicated, and the scale bar represents the number of substitutions per site.

Terje
The right color-coding columns of the phylogenetic clades show the tissue type each strain was isolated from and culture medium used. Skin is dark blue, gill is light blue, intestine is red, and water is yellow. Culture medium R2A is dark green, and MacConkey is light green F I G U R E A 2 Rarefaction plots based on OTU analysis for intestine and skin samples. Underlaying data represent the number of sequences in each dataset mapped to OTUs clustered with a 97% identity threshold. Plot was generated using total sum scaling of filtered data showing 20 steps