Genome sequence and description of Gracilibacillus timonensis sp. nov. strain Marseille‐P2481T, a moderate halophilic bacterium isolated from the human gut microflora

Abstract Microbial culturomics represents an ongoing revolution in the characterization of the human gut microbiota. By using three culture media containing high salt concentrations (10, 15, and 20% [w/v] NaCl), we attempted an exhaustive exploration of the halophilic microbial diversity of the human gut and isolated strain Marseille‐P2481 (= CSUR P2481 = DSM 103076), a new moderately halophilic bacterium. This bacterium is a Gram‐positive, strictly aerobic, spore‐forming rod that is motile by means of a flagellum and exhibits catalase but not oxidase activity. Strain Marseille‐P2481 was cultivated in media containing up to 20% (w/v) NaCl, with optimal growth obtained at 37°C, pH 7.0–8.0, and 7.5% (w/v) NaCl. The major fatty acids were 12‐methyl‐tetradecanoic acid and hexadecanoic acid. Its draft genome is 4,548,390 bp long, composed of 11 scaffolds, with a G+C content of 39.8%. It contains 4,335 predicted genes (4,266 protein‐coding genes, including 89 pseudogenes, and 69 RNA genes). Strain Marseille‐P2481 showed 96.57% 16S rRNA sequence similarity with Gracilibacillus alcaliphilus strain SG103T, the phylogenetically closest species with standing in nomenclature. On the basis of its specific features, strain Marseille‐P2481T was classified as the type strain of a new species within the genus Gracilibacillus, for which the name Gracilibacillus timonensis sp. nov. is formally proposed.

Using the taxonogenomics approach, which combines phenotypic features, proteomic information obtained by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), and analysis of the complete genome sequence (Pagani et al., 2012; Ramasamy et al., 2014; Sentausa & Fournier, 2013), we present here the characterization of a new halophilic species for which we formally propose the name Gracilibacillus timonensis sp. nov.

| Sample collection and culture conditions
A stool sample was collected from a healthy 10-year-old Senegalese boy living in N'diop (a rural village in the Guinean-Sudanian zone of Senegal). The boy's parents gave informed consent, and the study was approved by the National Ethics Committee of Senegal (N° 00.87 MSP/DS/CNERS) and by the local ethics committee of the IFR48 (Marseille, France) under agreement 09-022. The stool sample was collected immediately after defecation into a sterile plastic container, preserved at −80°C, and transported to Marseille for further analysis.
The salinity of the sample was measured using a digital refractometer (Fisher Scientific, Illkirch, France), and its pH was measured using a pH meter (Eutech Instruments, Strasbourg, France).
Strain Marseille-P2481 was isolated in aerobic conditions on a home-made culture medium consisting of Columbia agar enriched with 10% (w/v) NaCl (Sigma-Aldrich, St. Louis, MO, USA), as previously described (Diop et al., 2016). Briefly, 1 g of stool sample was inoculated into 100 ml of our home-made liquid medium and incubated aerobically at 37°C. Subcultures were conducted after 1, 3, 7, 10, 15, 20, and 30 days of incubation. Serial dilutions from 10⁻¹ to 10⁻¹⁰ were then performed in the home-made liquid culture medium and plated on Columbia and Chapman agar plates (Oxoid, Dardilly, France). After 2 days of incubation at 37°C, all apparent colonies were picked and subcultured several times to obtain pure cultures.

| MALDI-TOF MS strain identification
Briefly, one isolated bacterial colony was picked from a Chapman agar plate using a pipette tip and spread as a thin film on an MTP 96 MALDI-TOF target plate for identification with a Microflex MALDI-TOF MS spectrometer (Bruker Daltonics, Leipzig, Germany). In total, 12 distinct deposits for strain Marseille-P2481 were made from 12 individual colonies in duplicate. After air-drying, 2 μl of matrix solution was applied per spot, as previously reported. All spectra were recorded in positive linear mode for the mass range of 2,000-20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). The obtained protein spectra were compared with the 2,480 spectra in the Bruker database enriched with our own database (Lagier, Hugon, et al., 2015). The strain was identified at the species level if the MALDI-TOF MS score was greater than 1.9. If the score was lower than this threshold, the identification was not considered reliable and the 16S rRNA gene was sequenced.
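The identification rule above can be written as a small decision function. This is only an illustrative sketch: the function name and the returned labels are hypothetical, and treating a score of exactly 1.9 as not reliable is an assumption, since the text only covers scores above and below the threshold.

```python
# Threshold quoted in the text for species-level identification.
SPECIES_SCORE_THRESHOLD = 1.9


def identification_route(maldi_score: float) -> str:
    """Return the next step for an isolate given its MALDI-TOF MS score.

    Assumption: a score exactly equal to the threshold is treated as
    not reliable, because the text does not specify this case.
    """
    if maldi_score > SPECIES_SCORE_THRESHOLD:
        return "species-level identification"
    return "16S rRNA gene sequencing"


if __name__ == "__main__":
    for score in (2.3, 1.9, 1.2):
        print(score, "->", identification_route(score))
```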

| Phylogenetic analysis
The 16S rRNA gene sequences from the type strains of the species with a validly published name that exhibited the highest BLAST scores with our new strain were downloaded from the NCBI ftp server (ftp://ftp.ncbi.nih.gov/Genome/). Sequences were aligned using the CLUSTALW 2.0 software (Larkin et al., 2007), and phylogenetic inferences were obtained using the neighbor-joining method and the maximum likelihood method within the MEGA software, version 6 (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013). The evolutionary distances were computed based on the Kimura 2-parameter model (Kimura, 1980) with partial deletion at a 95% site coverage cutoff, and bootstrapping analysis was performed with 500 replications.
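For reference, the Kimura 2-parameter distance between two aligned sequences is d = −(1/2) ln(1 − 2P − Q) − (1/4) ln(1 − 2Q), where P and Q are the proportions of sites showing transitions and transversions. The sketch below is not the MEGA implementation; the function name is hypothetical, and as a simplifying assumption it skips gapped or ambiguous sites pair by pair rather than applying the 95% site coverage cutoff.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with a gap or ambiguous base are ignored.
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (non-positive argument).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


if __name__ == "__main__":
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

One transition in ten sites gives a distance slightly above the raw 10% difference, because the model corrects for multiple substitutions at the same site.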

| Morphological observation
To observe the cell morphology, transmission electron microscopy of the strain was performed using a Tecnai G20 Cryo (FEI Company, Limeil-Brevannes, France) at an operating voltage of 60 kV after negative staining. Gram staining was performed and observed using a Leica DM2500 light microscope (Leica Microsystems, Nanterre, France) with a 100× oil-immersion objective (Atlas & Snyder, 2011). The motility of the strain was assessed by the hanging drop method. The slide was examined using a DM1000 light microscope (Leica Microsystems) at 40×. Sporulation was tested following a thermal shock at 80°C for 20 min, and endospore formation was visualized using the same Tecnai G20 Cryo transmission electron microscope at an operating voltage of 60 kV after negative staining.
Biochemical tests were performed using the API ZYM, API 50 CH, and API 20 NE strips (bioMérieux, Marcy-l'Étoile, France), according to the manufacturer's instructions. The API ZYM strip was incubated for 4 hr and the other two strips for 48 hr.

| Fatty acid methyl ester (FAME) analysis by GC/MS
For the cellular FAME analysis, which was performed by gas chromatography/mass spectrometry (GC/MS), strain Marseille-P2481 was cultivated on Chapman agar (7.5% NaCl) (Oxoid, Dardilly, France) at 37°C under aerobic atmosphere for 2 days. Two samples were prepared with approximately 70 mg of bacterial biomass per tube harvested from several culture plates. FAMEs were prepared as described by Sasser (1990). GC/MS analyses were carried out as previously described. Briefly, FAMEs were separated using an Elite 5-MS column and monitored by mass spectrometry.

| Extraction and genome sequencing
After a pretreatment by lysozyme incubation at 37°C for 2 hr, the DNA of strain Marseille-P2481 was extracted on the EZ1 biorobot (Qiagen) with the EZ1 DNA Tissue kit. The elution volume was 50 μl. A total sequencing output of 6.52 Gb was obtained from a 696 K/mm² cluster density, with 95.6% of clusters passing quality control filters (12,863,388 passing-filter paired reads). Within this run, the index representation for strain Marseille-P2481 was determined to be 9.39%. The 1,207,306 paired reads were trimmed and then assembled.
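The read counts quoted above are consistent with the reported index representation, as the short check below shows. It assumes that index representation means the share of passing-filter paired reads assigned to this strain; the variable names are illustrative only.

```python
# Figures quoted in the text above.
passing_filter_paired_reads = 12_863_388
strain_paired_reads = 1_207_306

# Assumption: index representation = strain reads / passing-filter reads.
index_representation_pct = 100 * strain_paired_reads / passing_filter_paired_reads

print(f"{index_representation_pct:.2f}%")  # prints 9.39%
```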

| Genome annotation and comparison
Prodigal was used for open reading frame (ORF) prediction (Hyatt et al., 2010) with default parameters. Predicted ORFs spanning a sequencing gap region were excluded. The predicted bacterial protein sequences were searched against the Clusters of Orthologous Groups (COG) database using BLASTP (E-value 1e−03, coverage 0.7, and identity 30%). If no hit was found, a search against the nr database (Benson et al., 2015) was performed using BLASTP with an E-value of 1e−03, a coverage of 0.7, and an identity of 30%. If sequence lengths were smaller than 80 amino acids, we used an E-value of 1e−05. Pfam conserved domains (PFAM-A and PFAM-B domains) were searched on each protein with the hmmscan tool (Finn et al., 2015). RNAmmer (Lagesen et al., 2007) and tRNAScan-SE (Lowe & Eddy, 1997) were used to identify ribosomal RNAs and tRNAs, respectively. We predicted lipoprotein signal peptides and the number of transmembrane helices using Phobius (Käll, Krogh, & Sonnhammer, 2004). ORFans were identified if none of the BLASTP searches gave a positive result (E-value smaller than 1e−03 for ORFs longer than 80 aa, or E-value smaller than 1e−05 for ORFs shorter than 80 aa). Artemis (Carver, Harris, Berriman, Parkhill, & McQuillan, 2012) and DNA Plotter were also used. Annotation and comparison processes were performed using the multi-agent software system DAGOBAH (Gouret et al., 2011), which includes Figenix (Gouret et al., 2005) libraries that provide pipeline analysis. We also estimated the degree of genomic sequence similarity among the compared genomes using the following tools. First, we used the MAGI home-made software (Padmanabhan, Mishra, Raoult, & Fournier, 2013). This software calculates the average genomic identity of orthologous gene sequences (AGIOS) among compared genomes (Ramasamy et al., 2014).
It combines the Proteinortho software (Lechner et al., 2011) for detecting orthologous proteins in pairwise genomic comparisons, then retrieves the corresponding genes and determines the mean percentage of nucleotide sequence identity among orthologous ORFs using the Needleman-Wunsch global alignment algorithm. Second, digital DNA-DNA hybridization was performed using the GGDC (Genome-to-Genome Distance Calculator) web server, as previously reported (Klenk, Meier-Kolthoff, & Göker, 2014). Finally, the average amino acid identity (AAI) was calculated, based on the overall similarity between two genomic datasets of proteins (Rodriguez-R & Konstantinidis, 2014), using the tool available at http://enve-omics.ce.gatech.edu/aai/index.

FIGURE 1 Reference mass spectrum from Gracilibacillus timonensis strain Marseille-P2481T

FIGURE 2 Gel view comparing Gracilibacillus timonensis strain Marseille-P2481T with other species within the genera Gracilibacillus and Bacillus
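The AGIOS calculation described in this section, a mean nucleotide identity over orthologous gene pairs aligned globally, can be illustrated with a minimal sketch. It is not the MAGI software: the function names are hypothetical, the scoring scheme (match +1, mismatch −1, gap −1) and the definition of identity as matches divided by alignment length are assumptions, and ortholog detection (done with Proteinortho in the text) is taken as already completed.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of two sequences; returns the two aligned strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    x, y = needleman_wunsch(a, b)
    if not x:
        raise ValueError("cannot compare two empty sequences")
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)


def agios(ortholog_pairs):
    """Mean percent identity over a list of (gene_a, gene_b) pairs."""
    return sum(percent_identity(a, b) for a, b in ortholog_pairs) / len(ortholog_pairs)


if __name__ == "__main__":
    toy_pairs = [("ACGT", "ACGT"), ("ACGT", "ACGA")]  # toy data, not real genes
    print(agios(toy_pairs))  # 87.5
```

The quadratic-memory matrix is adequate for short toy sequences; whole-genome comparisons would call for an optimized aligner.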

| Physiological and biochemical characteristics
Isolated for the first time in our home-made halophilic medium with 10% (w/v) NaCl, strain Marseille-P2481 was able to grow in media containing up to 20% (w/v) NaCl under aerobic conditions, with a minimal NaCl concentration for growth of 7.5%, but was also able to grow in anaerobic and microaerophilic atmospheres (at 37°C). After 2 days of growth at 37°C, colonies were creamy orange and circular, with a mean diameter of 0.2 μm. Cells were Table 1. Using an API ZYM strip, positive results were obtained for esterase, esterase lipase, acid phosphatase, naphthol-AS-BI-phosphohydrolase, β-galactosidase, β-glucosidase, and α-glucosidase activities, but no reaction was observed for alkaline phosphatase, lipase, leucine arylamidase, valine arylamidase, cystine arylamidase, α-galactosidase, β-glucuronidase, trypsin, α-chymotrypsin, α-mannosidase, α-fucosidase, and N-acetyl-β-glucosaminidase. The API 50CH strip revealed that strain Marseille-P2481 exhibited esculin hydrolysis, but negative reactions were obtained for d-arabitol, l-arabitol, d-glucose,

FIGURE 3 Phylogenetic tree highlighting the position of Gracilibacillus timonensis strain Marseille-P2481T relative to other closely related species. GenBank accession numbers of each 16S rRNA are indicated after each species name. Sequences were aligned using CLUSTALW, and the evolutionary history was inferred using the neighbor-joining method (a) and the maximum likelihood method (b) with the Kimura 2-parameter model within MEGA6 software. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches. The analysis involved 24 nucleotide sequences. All positions with less than 95% site coverage were eliminated. There were a total of 1,404 positions in the final dataset. The scale bar represents a 2% nucleotide sequence divergence.

Phylum: Firmicutes TAS (Skerman & Sneath 1980, Murray, 1984, Gibbons and Murray, 1978, Garrity and Holt, 2001)
Class: Bacilli TAS (Ludwig, Schleifer, & Whitman 2009)
Order: Bacillales TAS (Skerman & Sneath 1980, Prevot, 1953)
Family: Bacillaceae TAS (Skerman & Sneath 1980, Fischer, 1985)
Genus: Gracilibacillus TAS (Wainø et al., 1999)
Species: Gracilibacillus timonensis IDA

| Genome properties
The genome is 4,548,390 bp long with a 39.8% G+C content. It is composed of 11 scaffolds (composed of 12 contigs). Of the 4,335 predicted genes, 4,266 were protein-coding genes and 69 were RNAs (4 complete 16S rRNA genes, 6 complete 5S rRNA genes, 2 complete and 2 partial 23S rRNA genes, and 51 tRNA genes, as well as 4 additional rRNAs). A total of 3,043 genes (70.24%) were assigned a putative function (by COGs or BLAST against nr). A total of 214 genes were identified as ORFans (6.94%). The remaining genes were annotated as hypothetical proteins (861 genes, 19.92%).
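As a quick consistency check, the RNA gene categories listed above can be tallied against the reported totals. The dictionary keys below are illustrative labels, not annotation identifiers; the counts are those given in the text.

```python
# RNA gene counts as listed in the text.
rna_genes = {
    "16S rRNA (complete)": 4,
    "5S rRNA (complete)": 6,
    "23S rRNA (complete)": 2,
    "23S rRNA (partial)": 2,
    "tRNA": 51,
    "other rRNA": 4,
}
protein_coding_genes = 4_266

total_rna_genes = sum(rna_genes.values())
total_predicted_genes = protein_coding_genes + total_rna_genes

print(total_rna_genes)        # 69
print(total_predicted_genes)  # 4335
```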
The genome statistics are presented in Table 4, and the distribution of genes into COGs functional categories is summarized in Table 5.

| DISCUSSION
Thanks to the concept of microbial culturomics, which aims at exploring the diversity of the human microbiota as exhaustively as possible, many new bacterial species have been discovered over the past 5 years. This concept is based on the diversification of the physicochemical parameters of culture conditions (Lagier et al., 2012; Lagier, Hugon, et al., 2015) to mimic as closely as possible the entirety of selective constraints that have shaped the human flora. To date, 329 new species have been characterized. These new species include 52 species belonging to the order Bacillales, which is one of the most represented bacterial orders.

| CONCLUSION
The moderately halophilic strain Marseille-P2481 was isolated from a stool sample of a 10-year-old healthy Senegalese boy as part of a study of halophilic bacteria from the human gut. Based on its phenotypic, phylogenetic, and genomic characteristics, this strain is proposed to represent a novel species in the genus Gracilibacillus, for which the name Gracilibacillus timonensis sp. nov. is proposed. Strain Marseille-P2481T is the type strain of Gracilibacillus timonensis sp. nov.
Using an API 20NE strip, fermentation of glucose, urease activity, and metabolism of l-arginine, esculin, and 4-nitrophenyl-β-D-galactopyranoside were positive. In contrast, nitrate and indole production, gelatinase activity and metabolism of d-glucose, l-arabinose,

ACKNOWLEDGMENTS
This study was funded by the Méditerranée-Infection foundation and the French Agence Nationale de la Recherche under reference Investissements d'Avenir Méditerranée Infection 10-IAHU-03.

CONFLICT OF INTEREST
The authors declare no competing interest in relation to this research.