Genomics of lactic acid bacteria


  • Editor: Rustam Aminov

Correspondence: Todd Klaenhammer, Department of Food, Bioprocessing and Nutrition Sciences, North Carolina State University, PO Box 7624, Raleigh, NC 27695-7624, USA. Tel.: +1 919 515 2972;
fax: +1 919 513 0014;


Lactic acid bacteria (LAB) occupy a variety of ecological niches including fermented foods as well as mucosal surfaces of humans and other vertebrates. This review focuses on the genomic content of LAB that is responsible for the functional and ecological diversity of these bacteria. These genomes reveal an ongoing process of reductive evolution as the LAB have specialized to different nutritionally rich environments. Species-to-species variation in the number of pseudogenes as well as genes directing nutrient uptake and metabolism reflects the adaptation of LAB to food matrices and the gastrointestinal tract. Although a general trend of genome reduction is observed, certain niche-specific genes appear to be recently acquired and are located on plasmids or adjacent to prophages. Recent work has improved our understanding of the genomic content responsible for various phenotypes that continue to be discovered, as well as those that have been exploited by man for thousands of years.


We are exposed to a huge variety of microorganisms on a daily basis; one group of bacteria with which humans have developed a particularly intimate relationship is the lactic acid bacteria (LAB). The LAB group is composed of micro-aerophilic, nonsporulating rods and cocci that are functionally linked by their common capacity to produce primarily lactic acid from hexose sugars (Makarova & Koonin, 2007). The vast diversity of LAB allows them to inhabit a variety of ecological niches ranging from food matrices such as dairy products, meats, vegetables, sourdough bread, and wine to human mucosal surfaces such as the oral cavity, vagina, and gastrointestinal tract (Pfeiler & Klaenhammer, 2007). The metabolic characteristics of LAB have been exploited for the preservation of foods and have been passed down from generation to generation through food ‘traditions’ that continue to flourish in many cultures to this day. Foods fermented using LAB are still widely consumed, with worldwide sales of fermented foods reaching tens of billions of dollars per year. Recently, commensal LAB have been given increased attention due to evidence suggesting their important roles in the maintenance of health and the prevention of infection (Reid et al., 2003; Klaenhammer et al., 2005).

Currently, 31 complete LAB genomes have been sequenced and are publicly available. Furthermore, considerable comparative and functional genomic analyses have accompanied the appearance of genomic sequence information. The goal of this review is to highlight some of the interesting outcomes from recent work relating to the genomics of LAB and to put these into context with the relationship between LAB and humans.

General genome features and history

The availability of sequenced genomes has allowed for a deeper understanding of the evolutionary divergence of the LAB, and reveals a trend of relatively recent and ongoing reduction in genome size (van de Guchte et al., 2006). The last common ancestor of Lactobacillales appears to have lost c. 600–1200 genes and gained <100 during its divergence from the Bacilli ancestor (Makarova & Koonin, 2007). The extent of genome reduction varies greatly among LAB, with Oenococcus oeni having only c. 1700 predicted ORFs compared with the c. 3000 of Lactobacillus plantarum (Pfeiler & Klaenhammer, 2007). Analysis of the available genomes of LAB suggests that the bulk of the genes were lost through adaptation to nutrient-rich food environments, particularly in those organisms that have adapted to milk and other food environments rich in protein and carbohydrates. The yogurt bacterium Lactobacillus delbrueckii ssp. bulgaricus shows a large difference in G–C% content (49.7%) from the closely related species Lactobacillus acidophilus (34.7%), a gastrointestinal commensal organism. Interestingly, the difference was primarily in the less conserved third codon position, which had a 65% G–C content, implying rapid ongoing evolution to a higher G–C content. Furthermore, the number of rRNA and tRNA genes in L. bulgaricus is c. 50% higher than the average for a genome of its size. These numbers would correspond to a genome of 3–4 Mb, significantly larger than its actual size of 1.8 Mb (van de Guchte et al., 2006). The specialized adaptation to milk is particularly interesting because this fermentation environment would not exist without human intervention. The selective pressure came not only from the natural environment, but also from anthropogenic environments created by humans, which essentially domesticated these organisms over the last 5000 years through repeated transfer of LAB cultures for production of fermented dairy products.
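The G–C comparison above distinguishes overall G–C content from G–C content at the third codon position. As a minimal sketch of how the two quantities differ, the following Python computes both for an in-frame coding sequence. The function names and the toy sequence are illustrative assumptions and are not taken from the cited genomes.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds: str) -> float:
    """G-C fraction at the third position of each complete codon.

    Assumes `cds` is an in-frame coding sequence; trailing bases that do
    not form a full codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_positions = cds[2:usable:3]
    return gc_fraction(third_positions)


# Toy in-frame sequence (hypothetical, for illustration only).
toy_cds = "ATGGCCAAGCTGTAA"
print(f"overall G-C: {gc_fraction(toy_cds):.2f}")
print(f"third-position G-C: {gc3_fraction(toy_cds):.2f}")
```

Because the third codon position is often synonymous, a shift in G–C content can show up there before it is visible in the overall value, which is the pattern the paragraph describes for L. bulgaricus.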


Further evidence of the recent and ongoing genome reduction of LAB is the presence of pseudogenes, often in relatively high numbers compared with other groups of bacteria. This feature is particularly common in organisms that are associated with nutrient-rich food environments such as L. bulgaricus, Lactobacillus helveticus, Lactococcus lactis, Streptococcus thermophilus, and O. oeni. The yogurt cultures L. bulgaricus and S. thermophilus are reported to have the largest number of pseudogenes, 270 and 182, respectively, showing evidence of recent specialization to the nutrient-rich milk environment (Makarova et al., 2006; van de Guchte et al., 2006). A study of S. thermophilus strains found that the most decayed genes were predicted to encode proteins for carbohydrate transport and metabolism. For example, phosphotransferase system transport proteins responsible for the uptake of glucose, fructose, β-glucoside, and trehalose were encoded by pseudogenes in all eight strains of S. thermophilus studied, while these were all present in a functional form in Streptococcus salivarius, a closely related oral commensal organism. Conversely, a specific lactose symporter is present in S. thermophilus but is missing in other streptococcal species (Bolotin et al., 2004).

This trend of genome reduction and specialization is also observed in L. bulgaricus, which has several incomplete transport systems along with pseudogenes involved in carbohydrate metabolism, amino acid/cofactor biosynthesis, and competence (van de Guchte et al., 2006). Gains in lactose transport and loss of other carbohydrate transport and metabolism genes, as well as the loss of amino acid biosynthesis genes, provide evidence for the adaptation to milk, an environment rich in lactose and protein.


Although a large number of transport systems appear to be lost in some LAB, transporters still make up 13–18% of their genomes, a proportion larger than that found in many other bacteria. The diverse environments occupied by LAB require the ability to transport and utilize a variety of substrates in order to survive. The large number of transporters corresponds to the adaptation to nutrient-rich environments and subsequent loss of biosynthetic pathways. Amino acid transporters represent the largest number of uptake systems, which also include sugar, cation/anion, and peptide transporters (Lorca et al., 2007). These transporters are particularly important for low G–C% LAB, which are highly auxotrophic and must scavenge nutrients from the environment (Klaenhammer et al., 2005). In contrast, a more metabolically capable organism such as L. plantarum retained a larger number of sugar uptake systems and complete pathways for biosynthesis of most amino acids. The maintenance of these systems reflects the ability of this organism to metabolize a wide variety of substrates in a primarily plant environment that is not as nutritionally rich (Kleerebezem et al., 2003). Specific transporters have been identified in some LAB that are needed for survival in specialized environments such as the gastrointestinal tract. For example, gastrointestinal-related organisms such as L. acidophilus and Lactobacillus paracasei contain transporters able to take up fructo-oligosaccharides, a group of nondigestible carbohydrates found primarily in the gastrointestinal tract (Barrangou et al., 2003; Altermann et al., 2005; Goh et al., 2006).

Probiotic and gastrointestinal-related genes

In addition to the adaptation of LAB to various food environments, a great deal of interest is currently being paid to those LAB that have adapted to mucosal surfaces. Much of the interest in these organisms is due to relatively recent evidence indicating the important role they play in the maintenance of health. These organisms, referred to as probiotics, are defined as ‘Live microorganisms which when administered in adequate amounts confer a health benefit on the host’ (FAO/WHO, 2001). Some potential beneficial effects of probiotics include exclusion of pathogens, mucosal immunomodulation, and reduction of carcinogens (Reid et al., 2003). A variety of LAB cultures are now being added to foods and consumed as supplements for the sole purpose of eliciting health-promoting benefits. The genome sequence of probiotic organisms has shed light on important functions needed for survival in the gastrointestinal tract. Genes have been identified that encode for proteins involved in probiotic functions including acid/bile tolerance, surface proteins/adherence, gene transfer, and carbohydrate utilization (Klaenhammer et al., 2005).

The human gut microbiota is extraordinarily complex and is receiving increased attention for its role in maintenance of health. The microbial load of the human gut reaches 10^12–10^14 CFU g^−1 of luminal content and is important for digestion, protection against pathogens, and maintenance of mucosal immunity. Mucosal surfaces such as the gastrointestinal tract are the places where the body comes in contact with the majority of antigens and infectious agents. The gastrointestinal tract produces 70–85% of the immune cells of the body and balances anti- and proinflammatory factors to respond to the presence of these antigens. Circumstances such as intestinal infection, changes in diet, and antibiotic treatment can influence the makeup of the intestinal microbiota (Candela et al., in press). Although lactobacilli and other LAB make up a small portion of the total gastrointestinal microbial community, they are predominant microbiota in the small intestine and considered to play a pivotal role in its protection (Heilig et al., 2002; Hayashi et al., 2005).

Adhesion to intestinal cells is considered an important property of a probiotic strain because it allows the organisms to persist in the intestinal tract and potentially exclude pathogens. Many probiotic species, including L. acidophilus, Lactobacillus johnsonii, and Lactobacillus gasseri, encode a series of cell surface mucus-binding proteins that bind mucin glycoproteins. These species also contain a predicted fibronectin-binding protein, FbpA, able to bind to fibronectin, another cell surface protein (Pridmore et al., 2004; Altermann et al., 2005; Klaenhammer et al., 2005). Interestingly, L. helveticus, which appears to have diverged from other lactobacilli via adaptation to a milk environment, does not contain any mucus-binding proteins and encodes less than half the cell wall proteins of the closely related gastrointestinal commensal L. acidophilus (Callanan et al., 2008). Another study in L. johnsonii using sequencing and microarray data identified specific genes involved in increased persistence in the gastrointestinal tract. Three genes were identified that were important for a long gut persistence phenotype: two genes within an operon for a mannose-specific transporter and one gene with similarity to an immunoglobulin A protease. Reduced gastrointestinal transit time was observed when each of the three genes was deleted independently (Denou et al., 2008).

In addition to competitive exclusion as one means to inhibit intestinal pathogens, some probiotic LAB also produce compounds that inhibit the growth of pathogens. Bacteriocins, for example, are small peptides produced by some bacteria that are toxic to other competing microorganisms. A recent study revealed that Lactobacillus salivarius UCC118 was able to protect mice from infection by Listeria monocytogenes, a common food-borne pathogen. The protective effect was clearly shown to be due to production of a specific bacteriocin that was lethal for L. monocytogenes (Corr et al., 2007). Bacteriocins, including those produced by LAB, are now being applied as food preservatives due to their ability to selectively inhibit pathogens such as L. monocytogenes, Bacillus cereus, Clostridium botulinum, and Staphylococcus aureus, common pathogens causing food-borne infections (Gálvez et al., 2007; Sit & Vederas, 2008).

Resistance to bile is another trait observed in gastrointestinal-related LAB. Bile-specific hydrolases and transporters have been found in many probiotic species such as L. johnsonii, L. acidophilus, Lactobacillus reuteri, and L. plantarum (Leer et al., 1993; Pridmore et al., 2004; McAuliffe et al., 2005; Pfeiler et al., 2007; Martoni et al., 2008). Microarray expression analysis of L. acidophilus NCFM found 289 genes that were differentially expressed in the presence of bile. Of these genes, 168 were downregulated while 78 were upregulated. The upregulated genes encoded proteins involved in carbohydrate uptake/metabolism, stress responses, and adhesion to intestinal cells (Pfeiler et al., 2007). Notably, genes for lactose metabolism were induced in the presence of bile, suggesting close coevolution of the mammalian gastrointestinal tract and a resident microbiota adapted for utilization of milk. A study in L. reuteri identified an operon containing a multidrug resistance transporter and a gene of unknown function that was differentially expressed in the presence of bile. When these genes were deleted, the strain demonstrated a reduced ability to recover in the presence of bile, suggesting that transport of bile salts plays an important role in bile tolerance (Whitehead et al., 2008). LAB have maintained and acquired a variety of genes that allow them to survive in and interact with the human gastrointestinal tract in an amazingly complex way.


Plasmids are found in many LAB and vary in size and gene content. Important plasmid-encoded genes in Lactococcus species were discovered decades ago and include genes involved in lactose/galactose utilization, proteolysis, oligopeptide transport, bacteriophage resistance, citrate utilization, bacteriocin production, and stress response (McKay, 1983; Gasson, 1990). Many of the plasmid-encoded genes have functions not found previously in L. lactis and have significantly different G–C% contents. These features of plasmid-encoded genes suggest recent acquisition via horizontal gene transfer from other dairy organisms such as enterococci, streptococci, and lactobacilli. The common dairy strain L. lactis ssp. cremoris SK11 contains four plasmids that were sequenced and shown to encode a large number of genes that reflect the organism's adaptation to the milk environment and confirm their critical roles in the industrial performance and fermentative ability of L. lactis in cheese and cultured dairy products (Siezen et al., 2005). The probiotic strain L. salivarius contains the only characterized megaplasmid in the LAB, which is 242 kb in size and comprises c. 11% of the genome. Although the megaplasmid does not encode any essential single copy genes, it does encode important genes for additional amino acid and carbohydrate metabolism. The megaplasmid also encodes genes potentially important to the probiotic status of this organism including the ABP118 bacteriocin, a bile salt hydrolase, and a putative conjugation locus. Functions were determined for many of the genes on the megaplasmid; however, the megaplasmid also contains a relatively large number of pseudogenes (Claesson et al., 2006). Plasmids are vehicles for rapid genetic transfer and typically encode genes that are important to a particular strain and its competitiveness in a specific environment.
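The reasoning that atypical G–C% content points to recent horizontal acquisition can be sketched as a simple compositional outlier screen. The Python below is an illustrative assumption only: the function names, the z-score threshold, and the toy gene sequences are hypothetical, and published analyses typically combine G–C content with other signals such as codon usage and gene context.

```python
from statistics import mean, pstdev


def gc_percent(seq: str) -> float:
    """G-C content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(b in "GC" for b in seq) / len(seq) if seq else 0.0


def flag_gc_outliers(genes: dict[str, str], threshold: float = 1.5) -> list[str]:
    """Names of genes whose G-C% lies more than `threshold` population
    standard deviations from the mean G-C% of all genes supplied."""
    values = {name: gc_percent(seq) for name, seq in genes.items()}
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return []  # all genes have identical G-C%, nothing to flag
    return [n for n, v in values.items() if abs(v - mu) / sigma > threshold]


# Hypothetical toy genes: four A-T-rich sequences and one G-C-rich one.
toy_genes = {
    "geneA": "ATATATGCAT",
    "geneB": "ATTAGCATAT",
    "geneC": "TTATGCAATA",
    "geneD": "ATGCATTATA",
    "geneE": "GCGCGGCCAT",
}
print(flag_gc_outliers(toy_genes))
```

In this toy set only the G–C-rich gene is flagged; with real data the threshold would need to be chosen against the background variation of the host genome.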

Prophage and clustered regularly interspaced short palindromic repeats (CRISPRs)

Bacteriophages present a significant challenge in industrial fermentations using LAB. Phage and phage remnants are found in the genomes of most LAB and play a prominent role in species-to-species and strain-to-strain variability. Prophage and remnants can also encode genes directing phenotypes important for host survival or functions. Notable examples are virulence-related genes in pathogenic Streptococcus pyogenes and bacteriophage resistance in dairy-related organisms such as S. thermophilus and L. lactis (Canchaya et al., 2003; Ventura et al., 2006; Wegmann et al., 2007).

CRISPRs have been observed in the genomes of a number of LAB (Barrangou et al., 2007). CRISPR loci generally contain a series of short palindromic repeats separated by spacer sequences and lie adjacent to CRISPR-associated, or cas, genes. Although the function of these structures is not fully understood, the spacer regions share significant homology with foreign DNA elements. It is suspected that these regions are involved in protecting the host from invasion by potentially harmful foreign DNA, including that from bacteriophages and plasmids. Evidence suggests that the CRISPR regions and cas genes provide a type of phage immunity via an RNA interference mechanism. This theory is substantiated by the presence of additional spacers in the CRISPR regions of phage-resistant strains of S. thermophilus. Furthermore, when S. thermophilus strains were challenged with phage, additional spacer regions homologous to the tested phage were observed. This evidence suggests that CRISPR regions act as a type of primitive bacterial immune system against invading DNA.
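The repeat-spacer organization and the spacer-to-phage homology described above can be illustrated with a toy Python sketch. It assumes a known, exactly conserved repeat and exact spacer matches; real CRISPR analyses tolerate degenerate repeats and mismatches. All sequences and function names here are hypothetical.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences found between consecutive copies of `repeat`."""
    parts = array_seq.upper().split(repeat.upper())
    # parts[0] and parts[-1] flank the array, so they are not spacers.
    return [p for p in parts[1:-1] if p]


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def spacers_matching(spacers: list[str], foreign_dna: str) -> list[str]:
    """Spacers with an exact match on either strand of `foreign_dna`."""
    target = foreign_dna.upper()
    target_rc = reverse_complement(target)
    return [s for s in spacers if s in target or s in target_rc]


# Hypothetical toy locus: flank + (repeat + spacer) units + repeat + flank.
repeat = "GTTTTAGAGC"
locus = "AAAA" + repeat + "ACGTACGTAC" + repeat + "TTGACCGGTA" + repeat + "CCCC"
phage = "GGGTTGACCGGTAAAA"  # toy phage fragment containing the second spacer

spacers = extract_spacers(locus, repeat)
print(spacers)
print(spacers_matching(spacers, phage))
```

In the phage-challenge experiments summarized above, the newly observed spacers are those that would be reported by a comparison of this kind against the challenging phage genome.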

Conclusions and future issues

As genome sequence and functional genomic information continues to explode, key features of the genomes of LAB continue to be discovered. The analysis of these features leads to a greater understanding of the physiology and metabolism of organisms that are connected so intimately to humans and their food. The role of LAB in food and health continues to expand and evolve as new discoveries are made and new applications are explored. Currently, the LAB are receiving significant attention as vehicles for delivery of biotherapeutics because of their ability to reach the gastrointestinal tract and interact with the host immune system. Potential biotherapeutic applications being explored currently include using LAB for drug and vaccine delivery vehicles, where the drug is produced directly within the gastrointestinal tract in a proximal position to the immune cells present in the human gut (Delcenserie et al., 2008).

Furthermore, researchers are only beginning to scratch the surface of the complex relationship between humans and their microbiota. As discoveries in metagenomics continue to be made, the picture becomes more and more complex. The makeup of the intestinal microbiota has even been linked to fat deposition and obesity in mouse and human models, further illustrating the relationship between host microbiota, diet, and energy balance (Turnbaugh et al., 2006, 2008). The improved understanding of the genomics of LAB not only answers many questions but also raises many new ones, helping to expand our knowledge of their relationship with mankind.


Research at NCSU on the genomics of probiotic lactobacilli is supported by the NC Dairy Foundation, Danisco USA Inc., and Dairy Management Inc. J.S. is supported by a NIH-Molecular Biotechnology Training Fellowship.