Genetic diversity of vaginal lactobacilli from women in different countries based on 16S rRNA gene sequences


Lin Tao University of Illinois at Chicago, College of Dentistry, M/C 690, 801 South Paulina Street, Chicago, IL 60612, USA (e-mail:


Aims: Lactobacilli are widely distributed in food and the environment, and some colonize the human body as commensal bacteria. The aim of this study was to determine the species of lactobacilli that colonize the vagina and compare them with those found in food and the environment.

Methods and Results: Thirty-five Lactobacillus strains from women from seven countries were isolated, and sequences from 16S rRNA genes were determined and compared with existing data in GenBank. A phylogenetic tree was constructed using the Neighbour-Joining method based on the analysis of 1465 nucleotides. The results showed that most vaginal isolates were L. crispatus, L. jensenii and L. gasseri. Some were L. vaginalis, L. fermentum, L. mucosae, L. paracasei and L. rhamnosus. Two isolates from a Native American woman formed distinct branches, indicating novel phylotypes. Few vaginal isolates matched food or environmental Lactobacillus species.

Conclusions: Most women worldwide were colonized by three common Lactobacillus species: L. crispatus, L. jensenii and L. gasseri.

Significance and Impact of Study: Knowledge of vaginal Lactobacillus species richness and distribution in women worldwide may lead to the design of better probiotic products as bacterial replacement therapy.


Lactobacilli are ubiquitous in the environment. They colonize plants, animals and humans (Collins et al. 1991). They are also widely used to process foods and beverages, including beer (Bohak et al. 1998), fermented olives (Duran Quintana et al. 1999) and dairy products (Miteva et al. 1992). In the human body, lactobacilli may colonize three anatomic regions: the oral cavity, the intestines and the vaginal tract. Although the lactobacilli that inhabit the vaginae of mothers contaminate the infants’ mouths during delivery, they do not appear to colonize the oral cavity (Carlsson and Gothefors 1975) or the intestines of the infants (Tannock et al. 1990) after birth. It is unknown, however, whether food or the environment could be a source of the lactobacilli that colonize humans.

Lactobacilli play an important role in maintaining the vaginal health of women (Redondo-Lopez et al. 1990). They produce lactic acid, hydrogen peroxide (H2O2), bacteriocins and other antimicrobial substances to inhibit pathogenic organisms in the vagina. In healthy women, lactobacilli are usually the dominant bacterial species. When lactobacilli are reduced or absent, other micro-organisms, such as anaerobes, may overgrow. As a result, a common disorder called bacterial vaginosis may occur (Sobel 1997). To develop suitable bacterial replacement therapy for the treatment of vaginosis, it is desirable to identify the correct species of vaginal lactobacilli.

Despite its importance to women’s health, the taxonomy of vaginal lactobacilli has not been extensively studied. Historically, the dominant species in the human vagina was considered to be Lactobacillus acidophilus (Thomas 1928; Rogosa and Sharpe 1960). This view inspired the production of many over-the-counter Lactobacillus pills, tablets and suppositories intended for vaginal colonization but containing an unsuitable species (Hughes and Hillier 1990), as well as a home remedy for vaginitis, namely vaginal instillation of yoghurt (Nyirjesy et al. 1997). Unfortunately, neither the Lactobacillus products nor the home remedy has been very effective.

The application of genomic analyses has advanced the taxonomy of lactobacilli. Based on DNA homology studies, the previous L. acidophilus species has been divided into six DNA homology groups that could not be distinguished biochemically (Johnson et al. 1980). These homology groups were characterized later as six distinct species: L. acidophilus, L. crispatus, L. amylovorus, L. gallinarum, L. gasseri and L. johnsonii (Du Plessis and Dicks 1995). Recent studies using the DNA–DNA hybridization method in reference to type strains (Giorgi et al. 1987; Antonio et al. 1999; Song et al. 1999) revealed that L. crispatus, L. gasseri and/or L. jensenii are the most common species found in the vagina. The less common species include L. ruminis, L. oris, L. reuteri and L. vaginalis. A new species, L. iners, which also colonizes the human vagina, has recently been reported (Falsen et al. 1999).

At present, more advanced genotypic methods are available to study microbial taxonomy. These include the analysis of 16S rRNA (Collins et al. 1991), 16S–23S and 23S–5S spacer regions (Nour 1998), pulsed-field gel electrophoresis (Chevallier et al. 1994), randomly amplified polymorphic DNA (RAPD)-PCR (Van Reenen and Dicks 1996), M13 fingerprinting and ribotyping (Miteva et al. 1992). Among these molecular techniques, 16S rRNA analysis has been accepted as a more reliable method (Collins et al. 1991). Therefore, the aims of the present study were to determine the species, including previously unrecognized species, termed phylotypes, of vaginal lactobacilli isolated from women of different countries, and to establish a phylogenetic tree including lactobacilli from food and the environment by analysing their 16S rRNA genes (rDNA).



From a total of 400 Lactobacillus isolates, 35 representative strains were selected for this study. These strains were collected from vaginal swabs of reproductive-aged women from seven countries: Argentina, Brazil, China, India, South Korea, Turkey and the United States. Two of the strains were from a mother (45 years old) and her daughter (23 years old); the daughter had been delivered vaginally. This study was approved by the Institutional Review Board at the University of Illinois at Chicago and by the collaborating foreign sites. All human subjects gave informed consent to the project. Four Lactobacillus vaginal reference strains (L. jensenii ATCC 25258, L. crispatus ATCC 33197 and 33820, and L. vaginalis ATCC 49540) were obtained from the American Type Culture Collection (ATCC). Lactobacillus Rogosa and MRS agars (Difco) were used for the initial isolation, and MRS broth was used for subsequent culture. All isolates were presumptively identified as Lactobacillus based on their ability to grow on Rogosa agar, Gram-positive staining, rod-shaped cells and catalase-negative phenotype. Sugar fermentation tests were performed for initial characterization. Final confirmation of their identity was obtained by 16S rDNA sequence analysis.

Genomic DNA extraction

To isolate DNA from lactobacilli, 5 ml of an overnight culture of each strain grown in MRS broth were diluted with 10 ml pre-warmed MRS broth. After about 4 h of incubation at 37°C, the bacterial cells were harvested by centrifugation and the chromosomal DNA was isolated as described by Chassy et al. (1976).

Amplification of 16S rDNA

To amplify nearly full-length 16S rDNA, universal primers corresponding to six conserved regions (Escherichia coli numbering system) were used. The names and sequences of the primers, and their corresponding E. coli positions, are listed in Table 1. Polymerase chain reaction (PCR) was performed using a Techne thermal cycler (Techne, Princeton, NJ, USA). The reaction mixture (final volume 50 μl) contained: 100 ng template DNA; 1 unit Taq DNA polymerase (Biolase, Bioline, Reno, NV, USA); 1 × reaction buffer; 1·5 mmol l–1 MgCl2; deoxynucleoside triphosphates, 0·1 mmol l–1 each; primers, 50 pmol each; and bovine serum albumin, 2 μg. The thermal cycling programme was as follows: initial denaturation at 94°C for 3 min; 35 cycles of 94°C for 45 s, 50°C for 1 min and 72°C for 2 min; and a final extension step at 72°C for 10 min. The PCR products were analysed for correct size and purity on a 1·2% agarose gel.
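As a quick arithmetic check of the reaction set-up and cycling programme described above, the following minimal sketch computes the final primer and template concentrations and the total programmed hold time. The function and variable names are illustrative only, and ramp times between temperatures are ignored, which is an assumption not stated in the text.

```python
# Illustrative arithmetic only: names are hypothetical, and ramp times
# between temperature steps are ignored (an assumption).

REACTION_VOLUME_UL = 50.0   # final reaction volume
PRIMER_PMOL = 50.0          # amount of each primer
TEMPLATE_NG = 100.0         # template DNA


def primer_concentration_umol_per_l(pmol, volume_ul):
    """pmol per microlitre is numerically equal to micromol per litre."""
    return pmol / volume_ul


def total_hold_time_s(initial_s, cycles, step_seconds, final_s):
    """Sum of all programmed holds: initial step, repeated cycle, final step."""
    return initial_s + cycles * sum(step_seconds) + final_s


primer_um = primer_concentration_umol_per_l(PRIMER_PMOL, REACTION_VOLUME_UL)
template_ng_per_ul = TEMPLATE_NG / REACTION_VOLUME_UL
total_s = total_hold_time_s(
    initial_s=3 * 60,            # 94°C for 3 min
    cycles=35,
    step_seconds=(45, 60, 120),  # 94°C 45 s, 50°C 1 min, 72°C 2 min
    final_s=10 * 60,             # 72°C for 10 min
)

print(f"primer: {primer_um} umol/l each")
print(f"template: {template_ng_per_ul} ng/ul")
print(f"programmed hold time: {total_s} s ({total_s / 60:.1f} min)")
```

Under these assumptions each primer is present at 1 μmol l–1 and the programmed holds total 8655 s, or roughly 144 min, before any ramping overhead.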

Table 1.   Oligonucleotide primers for amplification of 16S rDNA in lactobacilli

Sequence determination of 16S rDNA

The PCR products were purified from the agarose gel using the GeneClean kit (Bio101, Vista, CA, USA). The sequences were determined using an ABI 377 automated DNA sequencer (Perkin Elmer) at the Molecular Core Facility of the University of Illinois at Chicago.

Phylogenetic analysis

The sequence data of the 35 vaginal isolates and 63 reference strains were aligned using the CLUSTAL X programme (version 1·8) (available from the web server (Thompson et al. 1994). The sequence data of 59 Lactobacillus reference strains were obtained from GenBank; the sequences of four reference strains (L. jensenii ATCC 25258, L. crispatus ATCC 33197 and 33820, and L. vaginalis ATCC 49540) were determined in this study. The average number of bases among the 16S rDNA sequences analysed was 1465. Based on the numbering system for the 16S rDNA from E. coli, the sequences covered the gene fragment from position 45 to position 1510. Signature base analyses were also conducted. Phylogenetic trees were constructed with the Tree Explorer programme (available from the web server by the Neighbour-Joining method in PHYLIP format (Felsenstein 1990). Bootstrap values were obtained by the TreeTop-Phylogenetic Tree Prediction program on the Internet (
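To make the Neighbour-Joining step concrete, the sketch below implements the standard algorithm on a small distance matrix. It is not the software used in this study; the function name, the internal node labels (N1, N2, ...) and the five-taxon toy matrix are all hypothetical and serve only to illustrate how pairs are chosen and branch lengths assigned.

```python
from itertools import combinations


def neighbour_joining(labels, dist):
    """Return (node, node, branch_length) edges of the unrooted NJ tree.

    labels: list of taxon names.
    dist:   dict mapping frozenset({a, b}) -> pairwise distance.
    """
    d = dict(dist)        # working copy, extended with internal nodes
    nodes = list(labels)  # currently unjoined nodes
    edges = []
    next_id = 1

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # r[x] is the sum of distances from x to every other active node
        r = {x: sum(get(x, y) for y in nodes if y != x) for x in nodes}
        # join the pair minimising Q(i, j) = (n - 2) * d(i, j) - r_i - r_j
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * get(*p) - r[p[0]] - r[p[1]],
        )
        u = f"N{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        len_i = 0.5 * get(i, j) + (r[i] - r[j]) / (2 * (n - 2))
        len_j = get(i, j) - len_i
        edges += [(i, u, len_i), (j, u, len_j)]
        # distance from the new node to every remaining node
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (get(i, k) + get(j, k) - get(i, j))
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    a, b = nodes
    edges.append((a, b, get(a, b)))  # connect the last two nodes
    return edges


# Toy distances (not data from this study).
toy = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
toy_dist = {frozenset(pair): value for pair, value in toy.items()}
toy_edges = neighbour_joining(list("abcde"), toy_dist)
for x, y, length in toy_edges:
    print(f"{x} - {y}: {length}")
```

In practice the input distances would be computed from the aligned 16S rDNA sequences, and a bootstrap analysis would repeat the procedure on resampled alignment columns; both steps are omitted here for brevity.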

Nucleotide sequence accession number

The 16S rDNA sequences determined in this study have been deposited in the GenBank database. The accession numbers assigned to the clinical strains are from AF243141 to AF243175. The accession numbers of reference strains sequenced in this study are: L. crispatus ATCC 33197, AF257096; L. crispatus ATCC 33820, AF257097; L. jensenii ATCC 25258, AF243176; and L. vaginalis ATCC 49540, AF243177.


Overall phylogenetic analysis

For most species, only one representative type strain was selected for sequence analysis, whereas for two vaginal Lactobacillus species, L. crispatus and L. vaginalis, two or more representative type strains were studied. The aligned sequences were analysed for phylogenetic relationships. The phylogenetic tree of lactobacilli based on the Neighbour-Joining method is shown in Fig. 1. Most vaginal isolates (25/35) fell within three major clusters: L. crispatus, L. jensenii and L. gasseri.

Figure 1.   Phylogenetic tree showing the positions of the vaginal Lactobacillus isolates among the known Lactobacillus species. The tree was obtained by the Neighbour-Joining method from 1465 aligned nucleotide positions in 16S rDNA. Bootstrap values (from 100 replicates) greater than 40% are shown at the branch points. The source of each strain is indicated by the name of a country or ethnicity. The homofermentative cluster (a) and heterofermentative cluster (b) are separated as two major subtrees. Three compressed subtrees are indicated as triangles. The L. delbrueckii group includes three subspecies: L. delbrueckii subsp. bulgaricus, L. delbrueckii subsp. delbrueckii and L. delbrueckii subsp. lactis. The L. salivarius group includes 16 species: L. agilis, L. animalis, L. aviarius, L. bifermentans, L. confusus, L. coryniformis, L. divergens, L. mali, L. maltaromicus, L. minor, L. murinus, L. ruminis, L. sake, L. salivarius, L. thermophilus and L. viridescens. The L. brevis group also includes 16 species: L. alimentarius, L. brevis, L. buchneri, L. collinoides, L. farciminis, L. fructivorans, L. hilgardii, L. kefiri, L. kimchii, L. lindneri, L. paraalimentarius, L. pentosus, L. perolens, L. plantarum, L. sanfrancisco and L. vermiforme. Lactobacillus crispatus DSM 20584 and ATCC 33820 had an identical sequence, so only one was listed. The scale represents 0·02 nucleotide substitutions per position.

The first group included 11 clinical isolates originating from several different countries (one from China, two from South Korea, two from the United States and five from Turkey) and three L. crispatus type strains (DSM 20584, ATCC 33197 and ATCC 33820). Two type strains (DSM 20584 and ATCC 33820) and two clinical isolates from different countries (TL25a from Turkey and KLB79 from South Korea) had identical 16S rDNA sequences. The second group included seven isolates related to L. jensenii ATCC 25258. It included three strains from the United States, one from Brazil, one from China, one from India and one from Turkey. The third group included seven isolates related to L. gasseri DSM 20243. It included two strains from Brazil, three from Turkey, two from the United States and one from India.

The 16S rDNA sequences of fewer than a quarter (8/35) of the clinical isolates matched five other species. Two isolates (one from Argentina and one from the United States) matched L. fermentum. One (from China) was close to the two L. vaginalis type strains. One (from Argentina) matched L. casei subsp. rhamnosus, and two (from Argentina and South Korea) matched L. paracasei subsp. paracasei. None of the reference strains originally isolated from environmental sources, animals or plants matched any of the human vaginal isolates analysed, with the exception of L. mucosae (Roos et al. 2000). This Lactobacillus species, originally isolated from pig intestines, matched two human vaginal isolates (one from Brazil and one from Argentina).

Two major phylogenetic clusters are illustrated in Fig. 1. Cluster A includes Lactobacillus species belonging to the homofermentative group, while cluster B includes Lactobacillus species belonging to the heterofermentative group based on bacterial classification criteria of this genus found in Bergey’s Manual (Kandler and Weiss 1986).

Possible novel phylotypes

Based on the tree topology (Fig. 1), three possible novel phylotypes were apparent. These included the two vaginal isolates from the Native American woman, KC45a and KC45b, and the type strain L. crispatus NCTC 4. Among the four L. crispatus type strains, three (DSM 20584, ATCC 33197 and ATCC 33820) are genetically close to each other, but the fourth, NCTC 4, is not closely related to them. Therefore, the type strain L. crispatus NCTC 4 might have been assigned to the wrong species; it may actually represent a new, separate phylotype. A fourth possible novel phylotype was KC38 in the second group, because its phylogenetic distance from the L. jensenii type strain is greater than the distance between L. gasseri and L. johnsonii. Likewise, the difference in percentage nucleotide homology (Table 2) between KC38 and L. jensenii (1·5%) is greater than that between L. gasseri and L. johnsonii (1·3%).

Table 2.   Percentage homology for a 1465-nucleotide region of 16S rRNA gene of some representative Lactobacillus species

Percentage homology

As shown in Table 2, the 1465-nucleotide region of the 16S rDNA of 23 strains representing different Lactobacillus species was compared for percentage homology. These strains represented food lactobacilli, such as L. acidophilus, L. delbrueckii subsp. bulgaricus and L. pontis; known vaginal species, such as L. crispatus, L. jensenii and L. vaginalis; and possible novel phylotypes, such as KC45a and KC45b. The representative strains are listed in descending order of homology with L. crispatus DSM 20584.
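A percentage homology of this kind can be computed from two aligned sequences as sketched below. This is a minimal illustration, not the procedure used to produce Table 2: the function names and the short toy sequences are hypothetical, and skipping alignment columns that contain a gap is an assumption about gap handling.

```python
ALIGNED_LENGTH = 1465  # length of the 16S rDNA region compared in the text


def percent_homology(seq_a, seq_b, gap="-"):
    """Percentage of identical bases over alignment columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not scored
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def approx_differences(percent_difference, length=ALIGNED_LENGTH):
    """Approximate number of differing bases for a given percentage difference."""
    return round(length * percent_difference / 100)


# Toy sequences (not data from this study).
print(percent_homology("ACGTACGTAC", "ACGTTCGTAC"))  # one mismatch in ten
print(percent_homology("ACGTACGTAC", "ACG-ACGTAC"))  # gapped column skipped

# Differences reported in the text, expressed as approximate base counts.
print(approx_differences(1.5), approx_differences(1.3))
```

On this reading, a 1·5% difference over 1465 nucleotides corresponds to roughly 22 differing bases and a 1·3% difference to roughly 19, which shows how small the margins between closely related species are.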

Sequence signatures

Signatures are positions within the sequence where the base composition differs from that found in other bacteria. Identifying the signatures of a specific species can therefore facilitate its identification. Single-base and base-pair sequence signatures for two closely-related Lactobacillus species, L. fermentum and L. mucosae, were clearly noticeable and are presented in Table 3. Table 3 lists the position number, the base(s) found in each of the two species and the bases found in other lactobacilli.

Table 3.   Signature analysis of Lactobacillus fermentum and L. mucosae
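The idea of a single-base signature can be sketched as a scan over alignment columns: a position qualifies when every sequence of the target species shares one base and no other sequence carries it. The function name, strain labels and toy alignment below are hypothetical, and positions are reported as alignment columns rather than E. coli numbering.

```python
def signature_positions(alignment, target):
    """Find columns where the target group has one base that no other sequence has.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    target:    set of sequence names belonging to the species of interest.
    Returns a list of (1-based column, target base, bases seen in the others).
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same aligned length")
    signatures = []
    for pos in range(length):
        target_bases = {alignment[n][pos] for n in target}
        other_bases = {alignment[n][pos] for n in names if n not in target}
        if len(target_bases) == 1 and not (target_bases & other_bases):
            signatures.append(
                (pos + 1, next(iter(target_bases)), "".join(sorted(other_bases)))
            )
    return signatures


# Toy alignment (not data from this study).
toy_alignment = {
    "fermentum_1": "ACGTAGCT",
    "fermentum_2": "ACGTAGCT",
    "mucosae_1":   "ACCTAGTT",
    "other_1":     "ACCTAGCT",
    "other_2":     "ACCTAGCA",
}

fermentum_sigs = signature_positions(toy_alignment, {"fermentum_1", "fermentum_2"})
mucosae_sigs = signature_positions(toy_alignment, {"mucosae_1"})
print(fermentum_sigs)
print(mucosae_sigs)
```

Base-pair signatures would additionally require the paired positions from the rRNA secondary structure, which this sketch does not model.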

Species distribution among vaginal lactobacilli isolated from the same women and from the mother–daughter pair

Although most women hosted only one vaginal Lactobacillus strain, some women carried two or more strains. Among the 35 vaginal strains analysed in the present study, 12 were isolated from six women, two of whom were a mother and her daughter. The sources, strain names and species identifications are listed in Table 4. Multiple strains from the same woman were all of different species, except for KC12a and KC12b from the mother. Although KC12a and KC12b were both L. crispatus, they were different strains because they had different colonial morphologies, protein profiles (not shown) and 16S rDNA sequences (Fig. 1). On the other hand, the daughter had a different vaginal Lactobacillus species from her mother.

Table 4.   Species distribution among multiple vaginal lactobacilli isolated from the same women and from the mother–daughter pair


The results showed that most vaginal Lactobacillus strains from women of multiple geographically-separated countries belonged to three species: L. crispatus, L. gasseri and L. jensenii. The high degree of species consistency in the indigenous vaginal lactobacilli among several different populations of women is significant for the future development of bacterial replacement therapy. This is of special interest in comparison with the great microbial species diversity found in the gut of a single individual (Suau et al. 1999).

The present study with 16S rDNA sequence analysis confirms the DNA homology studies of Giorgi et al. (1987), Antonio et al. (1999) and Song et al. (1999), who found that the most prevalent species of vaginal lactobacilli in women from Italy, the United States and Japan, respectively, were homologous to the type strains of L. crispatus, L. gasseri and/or L. jensenii, and none of the strains tested was homologous to L. acidophilus. Additionally, this study identified several novel phylotypes of vaginal lactobacilli.

To our knowledge, this is the first application of 16S rDNA analysis to human vaginal lactobacilli isolated from several different countries, compared against a large number of known species within the genus Lactobacillus. Previous studies have used methods based on physiological and biochemical criteria to classify vaginal lactobacilli (Rogosa and Sharpe 1960; Reid et al. 1996), or have used reference strains as a standard to identify the species of clinical isolates by DNA–DNA hybridization (Giorgi et al. 1987; Antonio et al. 1999; Song et al. 1999). Although the chromosomal DNA hybridization method has been largely reliable for identifying bacterial species, it has several limitations. For example, the range of bacterial species that can be identified is limited by the number of labelled DNA probes. Since these probes are made from reference strains representing existing species, the DNA hybridization method can only identify known species. Also, correct species identification depends on the reliability of the reference strains used. However, many reference strains were characterized previously with non-genetic methods and thus may themselves be unreliable. Finally, DNA hybridization data are not sufficient to establish a phylogenetic tree illustrating the genetic distances or relatedness among the Lactobacillus species that colonize the human vagina and those that exist in food or the environment. In contrast, data from 16S rDNA analysis do not have these limitations.

The phylogenetic tree in Fig. 1 shows that geographical distances among the sources of the vaginal lactobacilli do not influence their phylogenetic distances. Many vaginal lactobacilli from women of different continents were closely related or even identical (e.g. TL25a from Turkey and KLB79 from South Korea). Most of these strains belonged to three species, L. crispatus, L. gasseri and L. jensenii, which are homofermentative lactobacilli. These lactobacilli metabolize glucose only to lactic acid. Fewer than a quarter of the strains tested belonged to five other species, L. vaginalis, L. fermentum, L. rhamnosus, L. paracasei and L. mucosae, all of which are heterofermentative lactobacilli. They metabolize glucose to a mixture of products including lactic acid, acetic acid and CO2. The two Lactobacillus clusters are well separated in the phylogenetic tree.

Overall, the majority of human vaginal strains were different from the food and/or environmental Lactobacillus species in the phylogenetic tree (Fig. 1). Food species include L. amylolyticus from beer malt (Bohak et al. 1998), L. plantarum from fermented green olive (Duran Quintana et al. 1999) and the group of dairy species, such as L. acidophilus and L. delbrueckii subsp. bulgaricus (Miteva et al. 1992). Some non-vaginal species, such as the dairy Lactobacillus species, were phylogenetically closer to the common vaginal species L. crispatus than were the other vaginal species, namely L. jensenii and L. gasseri in the homofermentative group (Fig. 1a) and the less common vaginal species in the heterofermentative group (Fig. 1b). Phylogenetic distance alone may therefore not be sufficient to determine whether a Lactobacillus species can colonize the human vagina. Other species-specific factors, such as the ability to adhere to mucosal membranes, may be required; consistent with this, two strains isolated from women in Brazil and Argentina were identified as L. mucosae, a newly characterized species originally isolated from the pig intestine (Roos et al. 2000). Therefore, multiple genetically-distant Lactobacillus species may colonize the human vagina.

Based on the phylogenetic positions, several novel phylotypes were identified. Most notably, two strains (KC45a and KC45b) isolated from a Native American woman may represent two distinct phylotypes. Possible additional phylotypes included the reference strain L. crispatus NCTC 4 and one strain from China (KC38). On the other hand, only two Lactobacillus species, L. fermentum and L. mucosae, had good single-base and base-pair sequence signatures (Table 3). These signatures can be used to differentiate the two closely-related bacterial species.

Although most women hosted only one Lactobacillus species or strain, some women were simultaneously colonized by multiple genetically-distant Lactobacillus species or strains (Table 4). Recently, Antonio et al. (1999), using DNA–DNA hybridization, identified two species, L. jensenii and L. crispatus, from 11 women. Walter et al. (2001), using PCR and denaturing gradient gel electrophoresis, showed that multiple Lactobacillus species co-exist in the human gut. These results, together with our observation, indicate that different Lactobacillus species may co-exist in the same environment.

Although the source of vaginal lactobacilli in women has not yet been identified, it is reasonable to speculate that lactobacilli from the mother’s vaginal tract may inoculate the infant during birth. However, studies by Carlsson and Gothefors (1975) and by Tannock et al. (1990) on bacterial transmission showed that vaginal lactobacilli from the mothers failed to establish colonization in the oral cavity or in the digestive tract of the infants. The result of the present study on the mother–daughter pair showed that the daughter, who was born by vaginal birth, was colonized by a different vaginal Lactobacillus species from her mother. This observation indicated that vaginal lactobacilli of adult women may not necessarily be acquired from their mothers during birth. Vaginal colonization of lactobacilli might occur later in life, probably after puberty. Obviously, a study with an increased number of mother–daughter pairs will be needed to confirm this observation.

In summary, most vaginal Lactobacillus strains from women of geographically-separated countries belonged to three species, L. crispatus, L. gasseri and L. jensenii, indicating a high degree of species consistency in vaginal lactobacilli among women worldwide. Novel phylotypes were identified. Studies of a larger number of women subjects in different countries and geographical areas, especially those living in relatively isolated communities, would be required to identify additional species or phylotypes. Because lactobacilli are critical to women’s vaginal health, the results from this study provide important information for our understanding of the vaginal colonization of lactobacilli and their species richness, relative abundance, geographical distributions and phylogenetic relationships. This knowledge may help guide future development of bacterial replacement therapy to control vaginal infections.


This work was supported in part by Public Health Service grant R03 AI 45127 from the National Institute of Allergy and Infectious Diseases, and by a seed grant 2-2-25521 from the Center for Research on Women and Gender, University of Illinois at Chicago. The authors are grateful to Dr Susan Mou for her assistance in obtaining samples from American women. J.-S. So acknowledges the support of the Korea Research Foundation (1999-042-E00071).