Complete genome sequence and potential pathogenic assessment of Flavobacterium plurextorum RSG‐18 isolated from the gut of Schlegel's black rockfish, Sebastes schlegelii

Abstract Flavobacterium plurextorum is a potential fish pathogen of interest, previously isolated from diseased rainbow trout (Oncorhynchus mykiss) and oomycete‐infected chum salmon (Oncorhynchus keta) eggs. We report here the first complete genome sequence of F. plurextorum RSG‐18 isolated from the gut of Schlegel's black rockfish (Sebastes schlegelii). The genome of RSG‐18 consists of a circular chromosome of 5,610,911 bp with a 33.57% GC content, containing 4858 protein‐coding genes, 18 rRNAs, 63 tRNAs and 1 tmRNA. A comparative analysis was conducted on 11 Flavobacterium species previously reported as pathogens or isolated from diseased fish to confirm the potential pathogenicity of RSG‐18. In the SEED classification, RSG‐18 was found to have 36 genes categorized in ‘Virulence, Disease and Defense’. Across all Flavobacterium species, a total of 16 antibiotic resistance genes and 61 putative virulence factors were identified. All species had at least one phage region and type I, III and IX secretion systems. In pan‐genomic analysis, core genes consist of genes linked to phages, integrases and matrix‐tolerated elements associated with pathology. The complete genome sequence of F. plurextorum RSG‐18 will serve as a foundation for future research, enhancing our understanding of Flavobacterium pathogenicity in fish and contributing to the development of effective prevention strategies.

Here, we present the first complete genome of the F. plurextorum species, isolated from the gut of Schlegel's black rockfish, Sebastes schlegelii. To investigate its pathogenic potential, we conducted a comparative genome analysis with previously reported F. plurextorum genomes and other major flavobacterial fish pathogens. The genetic characteristics of F. plurextorum revealed in this study will broaden our understanding of the relationship between fish and Flavobacterium.

Genome characteristics
Strain RSG-18 was isolated from the gut of Schlegel's black rockfish, S. schlegelii. The gut samples were serially diluted in R2A broth. An aliquot of 200 μL was spread onto R2A agar and incubated at 20 °C for 3 days. After 3 days, yellow, rough and irregular colonies were observed (Figure S1). Genomic DNA was extracted using a genomic DNA extraction kit (RBC Bioscience©) following the manufacturer's bacterial protocol. Library preparation was performed using the SMRTbell® Template Preparation Kit (v.1.0), followed by sequencing on the PacBio RS II platform (Pacific Biosciences©).
All bioinformatics tools used in this study are summarized in Table S1. De novo assembly was conducted with Flye (v.2.8.3) using the --asm-coverage 100 option, followed by polishing with GCpp (v.2.0.2) to further correct errors in the assembled genome. The polished genome was rotated to start at the dnaA gene using the fixstart function in Circlator (v.1.5.5). Assembly completeness was assessed with BUSCO (v.5.2.2) using the flavobacteriales_odb10 dataset.
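The rotation step can be illustrated with a minimal sketch. This is not Circlator's implementation: it assumes the start coordinate of the anchor gene is already known, treats the sequence as a plain string and ignores strand orientation. The function name `rotate_circular` and the toy sequence are hypothetical.

```python
def rotate_circular(seq: str, start: int) -> str:
    """Rotate a circular sequence so that 0-based index `start` becomes position 0."""
    if not seq:
        return seq
    # Wrap the coordinate so values beyond the sequence length still work.
    start %= len(seq)
    return seq[start:] + seq[:start]


# Toy 12-bp "chromosome" (hypothetical) whose anchor gene begins at index 5.
toy = "AAAAACGTGGGG"
rotated = rotate_circular(toy, 5)
print(rotated)  # CGTGGGGAAAAA
```

Because the chromosome is circular, the rotation changes only the coordinate origin; the sequence content and length are unchanged.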
De novo assembly and polishing produced a single circular chromosome without gaps. The complete genome sequence of RSG-18 was covered to a depth of 763× with a total size of 5,610,911 bp and a GC content of 33.57%. BUSCO evaluation with the flavobacteriales lineage dataset indicated 100% completeness and a duplication rate of 1.2%. Genome annotation by Prokka (v.1.14.6) yielded 4858 coding sequences (CDS), 18 rRNAs, 63 tRNAs and 1 tmRNA (Table 1, Figure 1). The web-based annotation tool eggNOG-mapper (v.2.1.6) (http://eggnog-mapper.embl.de/) was used for additional annotation of putative gene functions based on orthologs. The largest protein-coding category (except 'Function unknown') in RSG-18 was 'Cell wall/membrane/envelope biogenesis (M)' (Table S2).
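As a rough illustration of how the summary statistics above are defined, the following sketch computes GC content and mean sequencing depth. The function names are hypothetical, and the total read-base figure in the usage example is back-calculated from the reported 763× depth and genome size rather than taken from the study.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def mean_depth(total_read_bases: int, genome_length: int) -> float:
    """Mean depth = total sequenced bases divided by genome length."""
    return total_read_bases / genome_length


# Toy sequence: 4 of 8 bases are G or C.
print(gc_content("ATGCGCAT"))  # 50.0

# Assumed read total, back-calculated as 763 x 5,610,911 bp.
print(mean_depth(4_281_125_093, 5_610_911))  # 763.0
```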

Taxonomic status
Barrnap (v.0.9) was used to extract 16S rRNA gene sequences from the assembled genome. The sequences predicted by Barrnap were submitted to BLASTN (https://blast.ncbi.nlm.nih.gov) and the EzBioCloud database (https://www.ezbiocloud.net) to identify phylogenetic neighbours. Subsequently, the 16S rRNA gene sequences of 18 neighbouring Flavobacterium strains (14 species), including type strains, and Myroides guanonis IM13ᵀ were obtained. SINA aligner (v.1.2.11) and trimAl (v.1.4.rev15) were used for multiple sequence alignment and alignment trimming, respectively. The phylogenetic tree was built using the maximum-likelihood (ML) algorithm in IQ-TREE (v.2.1.4) with 1000 bootstrap replicates (Minh et al., 2020). The phylogenetic analysis based on the 16S rRNA gene sequence revealed that RSG-18 clustered with the species F. plurextorum (Figure S2A). The 16S rRNA gene sequence of RSG-18 was completely identical to that of F. plurextorum CCUG 60112ᵀ, isolated from trout (O. mykiss) eggs in Spain in 2008 (Table S3) (Zamora et al., 2013). RSG-18 was isolated from the gut of S. schlegelii in 2013, indicating that the 16S rRNA gene is mutationally robust despite differences in isolation time, host species and other factors. Such conservation is not uncommon and has also been reported for F. psychrophilum (Apablaza et al., 2013).
Average nucleotide identity (ANI) was calculated using Pyani (v.0.2.11) to determine the taxonomic status based on the whole-genome sequence. A total of 13 Flavobacterium genomes, collected from the NCBI RefSeq database (https://www.ncbi.nlm.nih.gov/refseq/) (Table 2), were compared. RSG-18 exhibited ANI values above 98% with two strains of F. plurextorum. Additionally, ANI values were 92.5% with Flavobacterium oncorhynchi and between 71.4% and 82.9% with the remaining Flavobacterium species (Figure S2B). According to the currently accepted criterion (Figueras et al., 2014), which considers two strains to belong to the same species when they exhibit an ANI value above 95%-96%, RSG-18 belongs to the species F. plurextorum. Flavobacterium plurextorum CCUG 60112ᵀ and F. plurextorum 2 have genome sizes of 5.06 and 5.04 Mbp, respectively, so the RSG-18 genome is about 0.6 Mbp longer. The GC content of all three strains is similar, at approximately 33%.
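The species-assignment rule described above can be sketched as a simple threshold check. The function name, the choice of 95% (the lower bound of the cited 95%-96% range) as the cut-off, and the use of 98.0 as a stand-in for "above 98%" are assumptions made for illustration; this is not part of Pyani.

```python
# Lower bound of the 95%-96% species boundary cited in the text (assumed cut-off).
SPECIES_ANI_THRESHOLD = 95.0


def same_species(ani_percent: float, threshold: float = SPECIES_ANI_THRESHOLD) -> bool:
    """Return True when an ANI value exceeds the species-level threshold."""
    return ani_percent > threshold


# ANI values of RSG-18 against other genomes, as reported in the text.
comparisons = {
    "F. plurextorum (reported as above 98%; 98.0 used as a floor)": 98.0,
    "F. oncorhynchi": 92.5,
    "remaining species (upper bound of reported range)": 82.9,
}

for name, ani in comparisons.items():
    verdict = "same species" if same_species(ani) else "different species"
    print(f"{name}: {ani}% -> {verdict}")
```

Under this rule, only the comparisons with the two F. plurextorum genomes pass the threshold, in line with the assignment made in the text.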
Rapid Annotations using Subsystems Technology (RAST) (v.2.0) was used with default parameters to categorize the SEED subsystems of the Flavobacterium genomes (Aziz et al., 2008). In all Flavobacterium species, no genes associated with 'cell division and cell cycle', 'motility and chemotaxis' and 'photosynthesis' were identified. However, genes corresponding to the other 24 SEED subsystems were found in all or most of the species (Figures 2A, S3A,B). Notably, all analysed genomes contained genes classified under the 'Virulence, Disease and Defense' category. However, even for established pathogens, no genes related to adhesion or toxins were annotated in this category, and most were categorized as genes for resistance to heavy metals or antibiotics (Figure S4). For example, metallo-β-lactamase (MBL) fold metallo-hydrolase, which has catalytic activity against a wide range of β-lactam antibiotics (Colson et al., 2020), and gyrA/gyrB, associated with quinolone resistance, were identified in all genomes. While gyrA and gyrB typically encode DNA gyrase subunits involved in regulating DNA supercoiling, functional mutations in these genes can confer antibiotic resistance (Mata et al., 2018). However, none of the Flavobacterium species used in the comparison, including RSG-18, exhibited variants known to induce quinolone resistance in these genes (Table S4) (Declercq et al., 2021; Izumi & Aranishi, 2004; Shah et al., 2012).
For the detection of antibiotic resistance genes (ARGs) and putative virulence factors, ABRicate (v.1.0.1) (https://github.com/tseemann/abricate) was used with the CARD and VFDB databases, respectively (Table S5). A total of 16 ARGs were detected in nine Flavobacterium genomes based on the CARD database. The JOHN-1 gene was detected in nine genomes, including RSG-18. JOHN-1 is a β-lactamase that was initially discovered in Flavobacterium johnsoniae and has been reported to confer resistance to penicillin, cephalosporin and carbapenem antibiotics (Naas et al., 2003).
From the VFDB, a total of 61 putative virulence genes were predicted from 13 Flavobacterium genomes. The high temperature protein B (htpB) gene was identified in all genomes, including RSG-18. However, it showed less than 70% nucleotide identity, and the actual gene annotated by Prokka was groL, which encodes the chaperonin GroEL. According to Valenzuela-Valderas et al., GroEL in Escherichia coli differs by several amino acids from HtpB in Legionella pneumophila (the reference species for htpB in the VFDB), leading to a functionally different protein folding that may not be conducive to infection (Valenzuela-Valderas et al., 2022). Therefore, it is not expected to be virulent and has only been reported to be immunogenic in Flavobacterium (Liu et al., 2012; Valenzuela-Valderas et al., 2022).
TABLE 2 General information on Flavobacterium species used in this study.
In addition, the ATP-dependent Clp protease proteolytic subunit, clpP, was identified only in the F. plurextorum genomes. In certain bacteria, clpP is important for the degradation of proteins involved in nutrient deficiency, heat-stress reaction, stationary phase adaptation, cell cycle progression, cell motility, biofilm formation, nutrition and metabolism (Moreno-Cinos et al., 2019). Since it has a widespread effect on proteins, clpP function plays an important role in infectivity and virulence in a number of bacterial pathogens (Bhandari et al., 2018). In summary, reference-based analysis shows that F. plurextorum has several virulence-related genes, suggesting that it has pathogenic potential like other fish pathogens. However, as in the case of htpB, candidates from in silico analyses must be verified experimentally to confirm that they are actually expressed during infection of fish.
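The htpB case, where a database hit fell below 70% nucleotide identity, suggests one simple way to triage in silico candidates before experimental work: separate hits by identity to the reference. The sketch below assumes a 70% cut-off purely for illustration. The `Hit` record, the function name and all gene names and values are hypothetical and are not results from this study or ABRicate output.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    gene: str
    identity: float  # percent nucleotide identity to the database reference
    coverage: float  # percent of the reference sequence covered


def split_by_identity(
    hits: List[Hit], min_identity: float = 70.0
) -> Tuple[List[Hit], List[Hit]]:
    """Split hits into (at or above cut-off, below cut-off) by percent identity."""
    confident = [h for h in hits if h.identity >= min_identity]
    doubtful = [h for h in hits if h.identity < min_identity]
    return confident, doubtful


# Made-up example hits; names and numbers are placeholders only.
hits = [Hit("geneA", 68.0, 95.0), Hit("geneB", 75.0, 98.0)]
confident, doubtful = split_by_identity(hits)
print([h.gene for h in confident])  # ['geneB']
print([h.gene for h in doubtful])   # ['geneA']
```

A low-identity hit is not necessarily a false positive, so the second list is best read as candidates that need closer inspection of the underlying annotation, as was done for htpB and groL.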
Prophages have a key role in bacterial pathogenicity. They typically encode virulence genes and make significant contributions to the strains' genetic distinctiveness (Canchaya et al., 2004). To identify prophage sequences within genomes, the PHAge Search Tool Enhanced Release (PHASTER) (https://phaster.ca) was used. All 13 Flavobacterium genomes submitted to the PHASTER server had at least one phage region. A total of 21 incomplete and one questionable prophage were identified. The GC content of the phage regions fell within a similar range of 29.52%-36.28%, with the exception of one phage region found in RSG-18, which had a GC content of 42.77%. RSG-18 hosted three incomplete prophage sequences, which included a 10.1 kb Escherichia phAPEC8 (NCBI accession: NC_020079), an 8.2 kb Paenibacillus Likha (NC_048693) and a 9.9 kb Bacillus vB_BsuM-Goe3 (NC_048652).
FIGURE 2 Summary of comparative genomic analysis. (A) The number of genes belonging to 27 RAST SEED subsystem categories for 13 Flavobacterium strains is displayed. (B) Secretion systems in Flavobacterium species. The number of elements identified in each genome is displayed. In a specific category, a higher gene count is depicted with a brighter green colour, while a lower gene count is represented by a brighter red colour.
The secretion system plays a crucial role in bacterial growth and diverse cellular processes, primarily by transporting proteins from the cytoplasm to the external environment, including into other bacteria and eukaryotic cells (Costa et al., 2015). To investigate which secretion systems Flavobacterium possess, 22 models of protein secretion systems were searched using TXSScan (v.1.0.5), a MacSyFinder-based detection program. All Flavobacterium species contained the type I secretion system (T1SS; omf, mfp, abc), type III secretion system (T3SS; sctN) and type IX secretion system (T9SS; porV, sprE, sprA, gldN, gldK, sprT, gldM, gldL) (Figure 2B). This finding is consistent with a previous study by Kumru et al., which reported that all genomes of 86 Flavobacterium strains isolated from aquatic hosts, mainly fish, possessed T1SS and T9SS secretion systems (Kumru et al., 2020). Additionally, three strains of F. plurextorum, F. oncorhynchi and F. succinicans were found to contain a flagellum-related gene (sctN_FLG), while most genomes exhibited type IV secretion system (T4SS) accessory genes and type VI secretion system (T6SS) mandatory genes.
This study offers valuable insights into the potential pathogenicity of F. plurextorum, contributing to the prevention and management of fish diseases in the fisheries and aquaculture industries. This study holds significant importance as the first complete genome sequence report on F. plurextorum and could lead to more in-depth research in the future as multiple strains accumulate. For example, multilocus sequence typing (MLST), which is based on nucleotide polymorphisms in genes, could be used to classify strains and investigate the evolution of the bacteria (Nicolas et al., 2008). Unfortunately, the health status of S. schlegelii was not documented at the time RSG-18 was isolated. However, the RSG-18 genome showed significant similarity to the type strain of F. plurextorum (GCF_002217395), which was isolated from rainbow trout (O. mykiss) with bacterial septicemia. To accurately assess pathogenic potential, extended experimental validation based on these in silico findings is necessary. This can provide insights into host-pathogen interactions and their association with environmental conditions.
FIGURE 3 Pangenome of 13 Flavobacterium genomes. The outermost red rings represent information from three Flavobacterium plurextorum genomes and the blue rings represent 10 Flavobacterium genomes previously reported as fish pathogens. At the top right, genomic features are plotted, including the number of virulence factors (VFs), antibiotic resistance genes (ARs), total genes, the percentage of carbohydrates, protein, 'Virulence, Disease and Defense' within the SEED category, GC content and genome length.
TABLE 1 General features of Flavobacterium plurextorum RSG-18 and the minimum information about a genome sequence (MIGS) mandatory information.