Taxonogenomics description of Parabacteroides timonensis sp. nov. isolated from a human stool sample

Abstract Intensive efforts have been made to describe the human microbiome and its involvement in health and disease. Culturomics has recently been adapted to target formerly uncultured bacteria and other unclassified bacterial species. In the current study, this approach enabled us to isolate a new bacterial species, Parabacteroides timonensis strain Marseille-P3236T, from a stool sample of a healthy 39-year-old pygmy male. This strain is an anaerobic, gram-negative, nonspore-forming, motile rod. Its genome comprises 6,483,434 bp with 43.41% G+C content, 5,046 protein-encoding genes, and 84 RNA genes. We herein provide the full description of Parabacteroides timonensis strain Marseille-P3236T through the taxonogenomic approach.


| INTRODUCTION
The gut microbiota is well-known for its microbial diversity and its role in health as well as in diseases. Even though scientific technologies have been greatly developed over the past years and have drastically facilitated the description of the gut microbiota, it still remains a challenging task (Turnbaugh et al., 2007) as the massive data generated over the last decade do not yet allow the clear depiction of the gut microbiota composition (Lagier et al., 2012).
Nevertheless, the fact that 1 g of human stool might contain up to 10^12 bacteria drives us to pursue our efforts in describing the human gut microbiota, for which only around 2,776 species have been reported (Bilen et al., 2018; Hugon et al., 2015). Consequently, our laboratory has developed a new approach called culturomics, which aims to isolate previously uncultured bacteria using sophisticated culture methods (Lagier et al., 2012). In doing so, culturomics has expanded our capabilities in human gut microbiota description and has therefore led to the isolation of a significant number of new genera and species. The process begins with the cultivation of stool samples under varying conditions, and bacterial growth is assessed over 30 days. At this point, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is primarily used for colony identification, and 16S rRNA sequencing is used when MALDI-TOF MS identification fails. Subsequently, unidentified species are subjected to taxonogenomic description (Fournier & Drancourt, 2015). The genome of the species concerned is then sequenced for a genomic description, followed by phenotypic and biochemical analysis (Fournier & Drancourt, 2015; Fournier, Lagier, Dubourg, & Raoult, 2015; Lagier et al., 2012). By following this procedure, we isolated a new species named Parabacteroides timonensis (P. timonensis), a member of the genus Parabacteroides, which is known to be gram-negative, obligately anaerobic, nonmotile, rod-shaped, and nonspore-forming (Sakamoto & Benno, 2006). To date, eight Parabacteroides species have been isolated, six of them from the human gut (www.bacterio.net). We demonstrate here the description of P.

| Ethics and sample collection
Before stool sample collection in Congo, the donor signed an informed consent form. The donor was a healthy 39-year-old pygmy male, and the collected stool samples were stored at −80°C for further analysis. In addition, approval from the ethics committee of the Institut Fédératif de Recherche IFR48 (Marseille, France), under number 09-022, was obtained before launching the study.

| Strain isolation
A loop of stool sample was diluted in phosphate-buffered saline (Life Technologies, Carlsbad, CA, USA) prior to incubation in a blood culture bottle (BD BACTEC®, Plus Anaerobic/F Media, Le Pont de Claix, France), supplemented with 5% sheep blood and 5% filtered rumen, at 37°C under anaerobic conditions. Bacterial growth and isolation were carried out by subculturing samples after 5 days on 5% sheep's blood-enriched Columbia agar solid medium (bioMérieux, Marcy l'Etoile, France).

| Colony identification
Identification of isolated bacterial colonies was first attempted by MALDI-TOF MS analysis as previously described (Elsawi et al., 2017). When MALDI-TOF MS identification failed, complete 16S rRNA sequencing was performed for further analysis with the same protocol used in our previous studies (Elsawi et al., 2017).
Complete 16S rRNA nucleotide sequences were assembled and edited using CodonCode Aligner software (http://www.codoncode.com) and compared by BLAST against the online National Center for Biotechnology Information (NCBI) database for phylogenetic analysis. According to Kim, Oh, Park, & Chun (2014), a threshold of 98.65% 16S rRNA gene sequence similarity was used to classify a new species, whereas a threshold of 95% 16S rRNA gene sequence similarity was used for new genus classification. The mass spectrum generated for the species concerned was added to our custom database, and its 16S rRNA gene sequence was submitted to the EMBL-EBI database.
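The two similarity thresholds described above amount to a simple decision rule. The following is a minimal sketch of that rule, not the authors' pipeline: the function name, the returned labels, and the handling of values exactly at a threshold (treated here as not novel) are assumptions made for illustration, and the input is the best-hit similarity to a validly named species.

```python
# Thresholds as stated in the text (Kim et al., 2014, as cited above).
SPECIES_THRESHOLD = 98.65  # % 16S rRNA gene sequence similarity
GENUS_THRESHOLD = 95.0     # % 16S rRNA gene sequence similarity


def classify_by_16s_similarity(similarity_percent: float) -> str:
    """Return a provisional taxonomic call from the best-hit 16S similarity.

    Hypothetical helper: values exactly equal to a threshold are treated
    as NOT novel, which is an assumption of this sketch.
    """
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_percent < GENUS_THRESHOLD:
        return "putative new genus"
    if similarity_percent < SPECIES_THRESHOLD:
        return "putative new species"
    return "known species"


if __name__ == "__main__":
    # Illustrative values only; these are not measurements from this study.
    for value in (93.0, 97.0, 99.5):
        print(value, "->", classify_by_16s_similarity(value))
```

Such a call is only provisional; in the taxonogenomic approach it is followed by the genomic and phenotypic comparisons described in the following sections.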

| Growth conditions
In order to determine the optimal growth environment, strain Marseille-P3236T was cultured under different conditions of temperature, pH, atmosphere, and salinity. To begin with, this strain was cultured under anaerobic, aerobic, and microaerophilic conditions on 5% sheep's blood-enriched Columbia agar (bioMérieux) at 28, 37, 45, and 55°C. GENbag anaer and GENbag microaer systems (bioMérieux) were used to establish the anaerobic and microaerophilic environments, respectively. Furthermore, salt and acidity tolerance were evaluated using NaCl concentrations of 0%, 5%, 15%, and 45% and pH values of 6, 6.5, 7, and 8.5.
FIGURE 1 Reference mass spectrum representing Parabacteroides timonensis strain Marseille-P3236T

| Morphological and biochemical assays
Biochemical characteristics of strain Marseille-P3236T were determined using different API galleries (20A, ZYM, and 50CH, bioMérieux) according to the manufacturer's instructions. In addition, sporulation ability was tested by culturing this strain after exposing a bacterial suspension to a heat shock of 80°C for 10 min. The morphology of strain Marseille-P3236T was determined as previously described (Elsawi et al., 2017). Additionally, a DM1000 photonic microscope (Leica Microsystems, Nanterre, France) was used to observe the motility of strain Marseille-P3236T from a fresh culture with a 100× objective lens. Cellular fatty acid methyl ester (FAME) analysis was performed using around 43 mg of bacterial biomass per tube as formerly described (Elsawi et al., 2017).
FIGURE 2 Phylogenetic subtree highlighting the position of Parabacteroides timonensis strain Marseille-P3236T relative to other close species
FIGURE 3 Gel view comparing the mass spectrum of Parabacteroides timonensis strain Marseille-P3236T to other species. The gel view displays the raw spectra of loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray scale scheme code. The right y-axis indicates the relation between the color of a peak and its intensity, in arbitrary units. Displayed species are indicated on the left
Quantification was done using the Qubit assay with the high sensitivity kit (Life Technologies, Carlsbad, CA, USA) and determined to be 134 ng/μl. gDNA sequencing, library preparation, fragmentation, and tagmentation were performed as previously described, with an optimal DNA fragment size of 8.675 kb. A total of 5.1 Gb of information was generated from a cluster density of 542 K/mm², with 95.7% of clusters passing the quality control filters (10,171,000 passing filter paired reads). Within this run, the index representation for strain Marseille-P3236T was determined to be 7.69%. The 782,587 paired reads were trimmed and then assembled. Genome assembly, annotation, and comparison were done using the same pipeline and tools as previously described (Elsawi et al., 2017).
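As a quick consistency check on the sequencing figures above, the reported index representation can be recomputed from the two read counts given in the text. This is a small illustrative calculation only; the function name is hypothetical and it assumes that index representation is the strain's share of passing-filter paired reads.

```python
def index_representation(strain_reads: int, passing_filter_reads: int) -> float:
    """Share of a run's passing-filter paired reads assigned to one index, in %.

    Assumption of this sketch: index representation is defined as
    strain reads divided by total passing-filter reads.
    """
    if passing_filter_reads <= 0:
        raise ValueError("passing_filter_reads must be positive")
    return 100.0 * strain_reads / passing_filter_reads


# Read counts as reported in the text.
PASSING_FILTER_READS = 10_171_000
STRAIN_READS = 782_587

share = index_representation(STRAIN_READS, PASSING_FILTER_READS)
print(f"Index representation: {share:.2f}%")
```

Under that assumption, the two read counts agree with the 7.69% stated above.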

| Strain identification and phylogenetic analysis
The identification of strain Marseille-P3236T using MALDI-TOF MS failed because its mass spectrum was absent from the current databases.
However, a typical spectrum was added to our custom database

| Phenotypic and biochemical characterization
Strain Marseille-P3236T is a gram-negative, motile rod that is unable to sporulate and grows anaerobically between 25 and 42°C, with optimal growth at 37°C (Table 1). It tolerates a pH range of 6 to 8.5 and a salinity of no more than 5%. Strain Marseille-P3236T is catalase positive and oxidase negative, and forms smooth colonies with a diameter of 0.9-1 mm. Under electron microscopy, bacterial cells have a length of 1.4-2.7 μm and an average diameter of 0.5 μm (Figure 4).
The major fatty acid found in this strain was 12-methyl-tetradecanoic acid (46%). Several specific 3-hydroxy branched structures were described. Minor amounts of unsaturated, branched, and other saturated fatty acids were also detected (Table 4).

| Comparison of genome properties
The draft genome sequence of strain Marseille-P3236T was compared to those of Parabacteroides gordonii (P. gordonii) (AUAE00000000),
The distribution of functional classes of predicted genes of strain Marseille-P3236T according to the COGs database is presented in Figure 6. The distribution was similar in all the studied genomes.
Strain Marseille-P3236T shared the highest number of orthologous proteins with P. gordonii (2,614 with 84.62% similarity at the nucleotide level,  Wayne et al., 1987). According to our results, strain Marseille-P3236T shared a dDDH value of less than 70% with all of its phylogenetically closest species with standing in nomenclature, thus confirming it as a new species (Table 8). The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.
FIGURE 5 Graphical circular map of the genome of Parabacteroides timonensis strain Marseille-P3236T. From outside to the center: contigs (red/gray), COG category of genes on the forward strand (three circles), genes on the forward strand (blue circle), genes on the reverse strand (red circle), COG category on the reverse strand (three circles), G+C content

| DISCUSSION
The human gut microbiota has been extensively studied by the scientific community and has already been correlated with several health conditions such as obesity (Million et al., 2016), gastrointestinal diseases (Guinane & Cotter, 2013), and nonalcoholic fatty liver disease (Abu-Shanab & Quigley, 2010). Profiling the bacterial content of the human gut and its ratios has led to the development of several therapeutic strategies, such as probiotics, and to therapeutic improvements, as in the case of CTLA-4-based cancer immunotherapy (Vétizou et al., 2015). Hence, describing the human gut microbiota without neglecting any part of its population is essential. Culturomics was developed for the purpose of isolating previously uncultured organisms and attempting to correlate their sequences with operational taxonomic units (Lagier et al., 2012). This work adds to the previous descriptive studies of the human gut microbiota performed via culturomics by isolating a new bacterial species belonging to the genus Parabacteroides (Sakamoto & Benno, 2006). However, Parabacteroides has been previously isolated from clinical cases (Kierzkowska et al., 2017) and was the causa-

| CONCLUSION
In conclusion, describing the human microbiota remains a challenging task requiring intensive efforts. Even though sequencing approaches have proved efficient in this field, culturomics has reemphasized the importance of culture in deciphering the dark matter of the human microbiome by shedding light on previously unidentified and uncultured species. Herein, we report the isolation of a new bacterial species from the human gut, Parabacteroides timonensis strain Marseille-P3236T, which represents the ninth Parabacteroides species and the seventh found in the human gut. Strain Marseille-P3236T (= CSUR P3236 = CCUG 71183) was isolated from the stool sample of a healthy 39-year-old pygmy male from Congo.

ACKNOWLEDGMENTS
The authors thank Xegen (http://www.xegen.fr/) for performing the genomic analyses and Claudia Andrieu for administrative assistance.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA ACCESSIBILITY
The 16S rRNA gene sequence was deposited under accession number LT598573. The genome bioproject was deposited under accession number PRJEB18032. The strain was deposited under the following strain deposit numbers: CSUR P3236 and CCUG 71183.