Eggerthella timonensis sp. nov., a new species isolated from the stool sample of a pygmy female

Abstract
Eggerthella timonensis strain Marseille-P3135 is a new bacterial species, isolated from the stool sample of a healthy 8-year-old pygmy female. This strain (LT598568) showed a 16S rRNA sequence similarity of 96.95% with its phylogenetically closest species with standing in nomenclature, Eggerthella lenta strain DSM 2243 (AF292375). This bacterium is a non-spore-forming, Gram-positive, nonmotile rod with catalase but no oxidase activity. Its genome is 3,916,897 bp long with a G+C content of 65.17 mol%. Of the 3,371 predicted genes, 57 were RNAs and 3,314 were protein-coding genes. Here, we report the main phenotypic, biochemical, and genotypic characteristics of E. timonensis strain Marseille-P3135 (=CSUR P3135, =CCUG 70327).


| INTRODUCTION
The human gut microbiota has drawn increasing attention with the development of new sequencing techniques (Gill et al., 2006; Ley, Turnbaugh, Klein, & Gordon, 2006; Ley et al., 2005). Yet these techniques face several limitations, notably depth bias, incomplete databases, and the difficulty of obtaining raw material for further analysis (Greub, 2012). Moreover, cultivating and isolating pure colonies is mandatory for describing the human gut microbiota, hence the need for a technique that makes both cultivation and isolation more efficient (Lagier et al., 2015). Stool samples are the best representatives of the human gut microbiome, since 1 g of human stool may contain up to 10^11-10^12 bacteria (Raoult & Henrissat, 2014). Before the introduction of culturomics, only 688 bacteria and 2 archaea had been recognized in the human gut. Culturomics was developed to optimize the growth conditions of previously uncultured bacteria in order to fill the gaps in our knowledge of the human microbiome (Lagier et al., 2012a). In general, culturomics consists of culturing samples under 18 different conditions and isolating pure colonies for identification by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and 16S rRNA gene sequencing. Any unidentified colonies are subjected to 16S rRNA gene sequencing and to a series of descriptive experiments targeting their phenotypic, biochemical, and genomic characteristics (Lagier et al., 2012b; Seng et al., 2009). Using this methodology, we isolated a new species, Eggerthella timonensis, a member of the genus Eggerthella (Bilen, Cadoret, Daoud, Fournier, & Raoult, 2016).
Eggerthella lenta, formerly known as Eubacterium lentum, is the type species of the genus Eggerthella and was first reported in 1935 by Arnold Eggerth (Eggerth, 1935; Kageyama, Benno, & Nakase, 1999; Moore, Cato, & Holdeman, 1971). The species belongs to the genus Eggerthella, in the family Coriobacteriaceae of the phylum Actinobacteria, and is known for its ability to grow under anaerobic conditions (Kageyama et al., 1999). Moreover, Eggerthella species have been reported to colonize the human gut and have been associated with several health problems such as anal abscess and ulcerative colitis (Lau et al., 2004a).

| Strain isolation
Before stool sample collection in Congo in 2015, approval was obtained from the ethics committee (09-022) of the Institut Hospitalo-Universitaire Méditerranée Infection (Marseille, France). The stool sample was collected from a healthy 8-year-old pygmy female in accordance with the Nagoya Protocol. Stool samples were shipped from Congo to France in the protective medium C-Top Ae-Ana (Culture Top, Marseille, France) and stored at −80°C for further study and analysis.
Samples were inoculated in a blood culture bottle (BD BACTEC®, Plus Anaerobic/F Media, Le Pont de Claix, France) supplemented with 5% rumen and 5% sheep blood and incubated at 37°C. Bacterial growth and isolation were assessed over 30 days on 5% sheep blood-enriched Columbia agar solid medium (bioMérieux, Marcy l'Etoile, France).
MALDI-TOF MS was used for colony identification. When MALDI-TOF MS failed to identify the tested colonies, 16S rRNA gene sequencing was used (Lagier et al., 2012b; Seng et al., 2009). On average, 10,000 colonies were tested for each stool sample.

| MALDI-TOF MS and 16S rRNA gene sequencing
Bacterial colonies were spotted on an MSP 96 MALDI-TOF target plate and identified by MALDI-TOF MS using a Microflex LT spectrometer as previously described (Seng et al., 2009). When MALDI-TOF MS identification failed because the database lacked a reference strain, 16S rRNA sequencing was performed using the GeneAmp PCR System 2720 thermal cyclers (Applied Biosystems, Foster City, CA, USA) and the ABI Prism 3130xl Genetic Analyzer capillary sequencer (Applied Biosystems) (Morel et al., 2015).
Sequences were assembled and edited using CodonCode Aligner software (http://www.codoncode.com) and then compared by BLAST against the online National Center for Biotechnology Information (NCBI) database (http://blast.ncbi.nlm.nih.gov.gate1.inist.fr/Blast.cgi). A sequence similarity of less than 98.65% with the closest species was used to define a new species, and less than 95% to define a new genus (Kim, Oh, Park, & Chun, 2014).
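The similarity cutoffs described above can be expressed as a simple decision rule. The following Python sketch is only an illustration and not the pipeline used in this work: the function name and the returned labels are hypothetical, and only the two cutoffs (98.65% and 95%) are taken from the text.

```python
# Cutoffs taken from the text (Kim et al., 2014); everything else is illustrative.
SPECIES_CUTOFF = 98.65  # % 16S rRNA similarity below which a new species is proposed
GENUS_CUTOFF = 95.0     # % 16S rRNA similarity below which a new genus is proposed


def classify_16s_similarity(similarity_pct: float) -> str:
    """Return a tentative taxonomic call from the best-hit 16S similarity (%)."""
    if not 0.0 <= similarity_pct <= 100.0:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_pct < GENUS_CUTOFF:
        return "putative new genus"
    if similarity_pct < SPECIES_CUTOFF:
        return "putative new species"
    return "known species"


if __name__ == "__main__":
    # 96.95% is the similarity reported for strain Marseille-P3135 vs. E. lenta.
    print(classify_16s_similarity(96.95))
```

With the 96.95% similarity reported for strain Marseille-P3135, the rule places the isolate between the two cutoffs, that is, a candidate new species within an existing genus.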

| Phylogenetic analysis
The 16S rRNA sequences of the strain's closest species were obtained from the database of "The All-Species Living Tree" Project of Silva (LTPs121) ("The SILVA and 'All-species Living Tree Project (LTP)' taxonomic frameworks," n.d.) and aligned with Muscle software, and phylogenetic inferences were made using FastTree with the approximately maximum-likelihood method (Price, Dehal, & Arkin, 2009).
Moreover, the Shimodaira-Hasegawa test was used to compute the local support values shown at the nodes. Taxonomically unreliable reference strains and duplicates were removed using Phylopattern (Gouret, Thompson, & Pontarotti, 2009). This pipeline was run with the DAGOBAH software (Gouret et al., 2011), which includes the Figenix (Gouret et al., 2005) libraries.

| Growth conditions
To determine the optimal growth conditions, the strain was cultured under several conditions of temperature, atmosphere, pH, and salinity. First, the strain was cultured and incubated under aerobic, anaerobic, and microaerophilic conditions on 5% sheep blood-enriched Columbia agar (bioMérieux) at the following temperatures: 28°C, 37°C, 45°C, and 55°C. Bacterial growth under anaerobic and microaerophilic atmospheres was tested using the GENbag anaer and GENbag microaer systems (Thermo Fisher Scientific, Basingstoke), respectively. Furthermore, salinity tolerance was tested by assessing growth at 37°C under anaerobic conditions at different NaCl concentrations (0, 5, 10, 50, 75, and 100 g/L NaCl). Likewise, the optimal pH for growth was evaluated by testing pH 6, 6.5, 7, and 8.5.
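The test matrix described above can be written out explicitly, which makes the number of plates per series easy to check. This is a minimal sketch under stated assumptions: the variable and function names are hypothetical, the value "6,5" is read as pH 6.5, and because the text does not specify the atmosphere or temperature used for the pH series, only the pH value is recorded there.

```python
from itertools import product

# Values listed in the text; names are illustrative only.
ATMOSPHERES = ["aerobic", "anaerobic", "microaerophilic"]
TEMPERATURES_C = [28, 37, 45, 55]
NACL_G_PER_L = [0, 5, 10, 50, 75, 100]
PH_VALUES = [6, 6.5, 7, 8.5]  # assumption: "6,5" in the source means 6.5


def build_conditions():
    """Return the three test series as lists of condition dictionaries."""
    # Every atmosphere is combined with every temperature.
    atmosphere_temperature = [
        {"atmosphere": atm, "temperature_c": temp}
        for atm, temp in product(ATMOSPHERES, TEMPERATURES_C)
    ]
    # Salinity was tested at 37°C under anaerobic conditions only.
    salinity = [
        {"atmosphere": "anaerobic", "temperature_c": 37, "nacl_g_per_l": nacl}
        for nacl in NACL_G_PER_L
    ]
    # Incubation settings for the pH series are not stated, so none are assumed.
    ph = [{"ph": value} for value in PH_VALUES]
    return atmosphere_temperature, salinity, ph


if __name__ == "__main__":
    for name, series in zip(("atmosphere x temperature", "salinity", "pH"),
                            build_conditions()):
        print(f"{name}: {len(series)} conditions")
```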

| Morphological and biochemical assays
To describe strain Marseille-P3135 biochemically, different API tests (ZYM, 20A, and 50CH, bioMérieux) were used. The sporulation ability of this bacterium was tested by exposing a bacterial suspension to a thermal shock at 80°C for 10 min and then culturing it on COS medium. Moreover, the motility of the strain was assessed using a DM1000 photonic microscope (Leica Microsystems, Nanterre, France) with a 100× objective lens. In addition, a bacterial suspension was fixed with a solution of 2.5% glutaraldehyde in 0.1 mol/L cacodylate buffer for more than 1 hr at 4°C for observation under the Morgagni 268D (Philips) transmission electron microscope. Finally, Gram staining results and images were obtained with a DM1000 photonic microscope (Leica Microsystems) using a 100× oil-immersion objective lens.

| Fatty acid methyl ester (FAME) composition of strain Marseille-P3135
Cellular FAME analysis was performed by gas chromatography/mass spectrometry (GC/MS). Two samples were prepared with <1 mg of bacterial biomass per tube, harvested from several culture plates, and then FAME preparation and GC/MS analysis were carried out as previously described.

| DNA extraction and genome sequencing
To extract the genomic DNA (gDNA) of strain Marseille-P3135, a FastPrep BIO 101 instrument (Qbiogene, Strasbourg, France) was used for mechanical treatment with acid-washed beads (G4649-500 g, Sigma).
Samples were then incubated with lysozyme for 2.5 hr at 37°C, and an EZ1 biorobot (Qiagen) was used for DNA extraction according to the manufacturer's guidelines. Qubit was used for DNA quantification (69.3 ng/μl).
Genome sequencing was performed with MiSeq Technology (Illumina Inc, San Diego, CA, USA) using the mate-pair and paired-end methods. The Nextera XT kit (Illumina) and the Nextera Mate Pair kit (Illumina) were used for sample barcoding. The DNA of the strain was mixed with 11 paired-end projects and 11 mate-pair projects. Paired-end libraries were prepared from 1 ng of gDNA, which was fragmented and tagged. Twelve PCR amplification cycles completed the tag adapters and added dual-index barcodes. Subsequently, the libraries were purified using AMPure XP beads (Beckman Coulter Inc, Fullerton, CA, USA) and normalized as described in the Nextera XT protocol (Illumina) for pooling and sequencing on the MiSeq. Paired-end sequencing and cluster generation were carried out in a single 39-hr run in 2 × 250-bp mode. This library was loaded on two flowcells. Total information of 6.5 and 4.3 Gb was obtained from cluster densities of 685 and 446 K/mm², with 95.1% and 94.8% of clusters passing quality filters (12,615,000 and 8,234,000 passed-filter clusters). Index representation for strain Marseille-P3135 was 4.57% and 3.83%. The 576,647 and 315,481 paired-end reads were filtered according to read quality.
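The reported paired-end figures are internally consistent if index representation is taken as the share of passed-filter clusters assigned to the strain's barcode. That definition is an assumption made here for illustration, since the text does not spell it out, and the function name below is hypothetical. A short Python check under that assumption:

```python
def index_representation(strain_reads: int, passed_filter_clusters: int) -> float:
    """Percentage of passed-filter clusters attributed to one barcoded sample.

    Assumed definition; the source text reports the values without a formula.
    """
    if passed_filter_clusters <= 0:
        raise ValueError("passed_filter_clusters must be positive")
    return 100.0 * strain_reads / passed_filter_clusters


# Paired-end figures quoted in the text for the two flowcells.
RUNS = [
    {"strain_reads": 576_647, "passed_filter_clusters": 12_615_000, "reported_pct": 4.57},
    {"strain_reads": 315_481, "passed_filter_clusters": 8_234_000, "reported_pct": 3.83},
]

if __name__ == "__main__":
    for run in RUNS:
        computed = index_representation(run["strain_reads"],
                                        run["passed_filter_clusters"])
        print(f"computed {computed:.2f}% vs reported {run['reported_pct']}%")
```

Under this reading, both computed percentages round to the values given in the text.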
For mate-pair library preparation, 1.5 μg of gDNA was used with the Nextera mate-pair Illumina protocol. Mate-pair junction adapters were used to tag the fragmented gDNA, and Agilent 2100 BioAnalyzer [...] passing filtered paired reads). Index representation of the studied strain was 8.53% and 9.24%. The 867,401 and 511,563 paired reads were trimmed and then assembled.
Genome assembly, annotation, and comparison were performed with the same pipeline as described in our previous work (Elsawi et al., 2017).

| Strain Marseille-P3135 identification
Comparison of the 16S rRNA gene sequence of the present strain with those of other organisms showed a sequence similarity of 96.95% with E. lenta (DSM 2243; AF292375), its phylogenetically closest species with standing in nomenclature (Figure 1).
The phylogenetic analysis clearly supports the placement of the studied strain in the genus Eggerthella. Because the strain shows more than 1.3% sequence divergence from its closest species, we suggest that the isolate represents a new species named E. timonensis (Bilen et al., 2016).

| Phenotypical and biochemical analysis of strain Marseille-P3135
The strain is a nonmotile Gram-positive rod, unable to sporulate, oxidase negative, and catalase positive. It grows under anaerobic conditions between 25°C and 42°C but optimally at 37°C. As for acidity tolerance, this strain was able to survive in media with pH ranging between 6 and 8.5, and it tolerated NaCl only up to a concentration of 5 g/L.
Colonies have a smooth appearance with a mean diameter of 0.5 mm. Traits examined using API20A, API50CH, and APIZYM are detailed in Supplementary Table 1. Table 1 compares some biochemical features of the studied strain with literature data for closely related species (Lau et al., 2004b; Würdemann et al., 2009).
Composed of two scaffolds, the genome of strain Marseille-P3135 is 3,916,897 bp long with 65.17 mol% G+C content. Of the 3,371 predicted genes, 57 were RNAs (2 genes are 23S rRNA, 2 genes are 5S rRNA, 2 genes are 16S rRNA, and 51 genes are tRNA genes) and 3,314 were protein-coding genes. Moreover, 2,524 [...] Table 3.

FIGURE 2 Electron micrographs of Eggerthella timonensis strain Marseille-P3135 generated with a Morgagni 268D (Philips) transmission electron microscope operated at 80 keV. Scale bar = 200 nm

FIGURE 3 Gel view comparing mass spectra of Eggerthella timonensis strain Marseille-P3135 to other species by displaying the raw spectra of different species in a pseudo-gel like arrangement. The x-axis represents the m/z value and the left y-axis corresponds to the running spectrum number deriving from subsequent spectra loading. Intensities of the peaks are represented by a gray scale. Also, the correlation between the peak color and its intensity is represented in the right y-axis with arbitrary units. Species shown for this analysis are noted on the left

| Comparison of genome properties
The draft genome sequence of the present new species was compared with that of G. pamelaeae (FP929047), which is close to but outside the genus Eggerthella, and with that of E. lenta (ABTT00000000), the closest species and the only member of the genus for which a genome is available. The draft genome of our strain was larger than those of G. pamelaeae and E. lenta (3.608 and 3.632 Mb, respectively). Its G+C content was also higher (64% and 64.2%, respectively), and its gene content was larger than that of G. pamelaeae and E. lenta (2,027 and 3,070, respectively). The distribution of the predicted genes of the present genome into functional classes according to the COGs of proteins is shown in Figure S3. This distribution showed an identical profile for the three compared strains.
Subsequently, the DNA-DNA hybridization values between E. timonensis and other species with standing in nomenclature were 43.6 with E. lenta and 21.2 with G. pamelaeae (Table 5). Interestingly, these data show that the genome of the strain is closer to that of E. lenta and more distant from that of G. pamelaeae, supporting the hypothesis that strain Marseille-P3135 represents a distinct species that is close to species of the genus Eggerthella (Kim et al., 2014; Tindall, Rosselló-Móra, Busse, Ludwig, & Kämpfer, 2010; Wayne et al., 1987).
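The reasoning in this comparison can be made explicit with a short sketch. It assumes that the reported values are percentages and applies the conventional 70% DNA-DNA hybridization cutoff for species delineation that is usually attributed to Wayne et al. (1987). The text cites that reference but does not state the cutoff, so the constant below is an assumption, and the function names are hypothetical.

```python
# Assumption: conventional species-delineation cutoff, not stated in the text.
DDH_SPECIES_CUTOFF = 70.0

# DDH values reported in the text for strain Marseille-P3135 (assumed to be %).
DDH_VALUES = {
    "Eggerthella lenta": 43.6,
    "Gordonibacter pamelaeae": 21.2,
}


def same_species(ddh_pct: float, cutoff: float = DDH_SPECIES_CUTOFF) -> bool:
    """True if the DDH value meets or exceeds the species cutoff."""
    return ddh_pct >= cutoff


def closest_relative(ddh_values: dict) -> str:
    """Name of the reference genome with the highest DDH value."""
    return max(ddh_values, key=ddh_values.get)


if __name__ == "__main__":
    for name, value in DDH_VALUES.items():
        verdict = "same species" if same_species(value) else "distinct species"
        print(f"{name}: {value} -> {verdict}")
    print("closest relative:", closest_relative(DDH_VALUES))
```

Both values fall well below the assumed cutoff, and the higher value is obtained with E. lenta, which matches the interpretation given above.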

| CONCLUSION
In conclusion, culturomics enabled the isolation of a previously uncultured new species from the normal human gut flora, and a taxonogenomics approach enabled its description. Given that its 16S rRNA gene sequence diverges by more than 1.3% from that of its phylogenetically closest species with standing in nomenclature, we propose a new species, E. timonensis, type strain Marseille-P3135 (=CSUR P3135, =CCUG 70327).

| E. timonensis sp. nov. description
E. timonensis (ti.mo.nen′sis, N.L. fem. adj., with timonensis referring to La Timone, which is the name of the hospital in Marseille (France) where this work was performed).
TABLE 1 Differential characteristics of Eggerthella timonensis strain Marseille-P3135 with Eggerthella lenta strain NCTC 11813 (Kageyama et al., 1999), Eggerthella sinensis (Lau et al., 2004b), and Gordonibacter pamelaeae strain 7-10-1-b (T) (Würdemann et al., 2009)

It is a nonmotile Gram-positive rod, unable to sporulate, oxidase negative, and catalase positive. It grows under anaerobic conditions optimally at 37°C. Colonies have a smooth appearance with a mean diameter of 0.5 mm. Moreover, cells had a length of 0.7-1.6 μm when seen under the electron microscope and an average diameter of 0.4 μm.
It is able to produce esterase C4, esterase lipase C8, acid phosphatase, and naphthol-AS-BI-phosphohydrolase. As well, it can