• Open Access

Genomics of dairy fermentations


  • Roland J. Siezen,

    Corresponding author
    1. Kluyver Centre for Genomics of Industrial Fermentation, TI Food and Nutrition, 6700AN Wageningen, The Netherlands.
    2. NIZO Food Research, 6710BA Ede, The Netherlands.
    3. Center for Molecular and Biomolecular Informatics, Radboud University Nijmegen Medical Centre, 6500HB Nijmegen, The Netherlands.
  • Herwig Bachmann

    1. Kluyver Centre for Genomics of Industrial Fermentation, TI Food and Nutrition, 6700AN Wageningen, The Netherlands.
    2. NIZO Food Research, 6710BA Ede, The Netherlands.

*E-mail r.siezen@cmbi.ru.nl; Tel. (31) 2436 19559; Fax (31) 2436 19395.


Microbes are used to make industrial and artisanal fermented dairy products, such as cheese, yoghurt, sour cream and fermented milks (Fig. 1). These microbes are predominantly lactic acid bacteria (LAB), such as lactococci, lactobacilli and streptococci. For quality and consistency, industrial production requires the use of starter cultures, which are very carefully created, cultivated and maintained (Fig. 2). What happens in the fermentation process? Milk sugars (mainly lactose) are fermented, with lactic acid as the major final product. Lactic acid not only inhibits the outgrowth of other organisms but also lowers the pH of the food product. Taste and texture, the feeling of food in the mouth, are also important. Lactic acid bacteria make the specific end-products that impart flavour and modify the texture of the final product. Cheese production relies predominantly on Lactococcus lactis. It is the major component of cheese starter cultures and, as the worldwide cheese market is huge, it is one of the most important microbes for the food industry. Several functions important for fermentation are encoded on conjugative plasmids in these bacteria, among them lactose metabolism and the breakdown of milk proteins during cheese production (Siezen et al., 2005; Shearman et al., 2008). The lactobacilli are also important players in dairy fermentations, with Lactobacillus bulgaricus mainly used in yoghurt manufacture, together with Streptococcus thermophilus. This use of microbial consortia adds yet another degree of complexity to an already complex production process.

Figure 1.

From milk to fermented dairy product.

Figure 2.

Starter cultures for dairy fermentations.

There are now over 20 genomes of LAB published and annotated, providing insight into their metabolic capabilities, as reviewed by Pfeiler and Klaenhammer (2007) and Mayo et al. (2008). Comparing these genomes for shared or unique genotypes is a start, but the world of dairy fermentation is not content just with comparison. The real questions that are being asked are: what makes my yoghurt or cheese different, and how can I develop new flavours, textures and products? This is increasingly being investigated by natural diversity analysis of microbes, and by in situ omics measurements in dairy products. These and many other studies were reported at the 9th Symposium on LAB: Health, Evolution and Systems Biology, held in September 2008 (http://www.lab9.nl). Here we highlight some of the latest developments in genomics in these areas.

Genome sequencing and diversity

An overview of genome sequences of some of the microbes used in dairy fermentations is given in Table 1. The most recent additions are Lactobacillus helveticus DPC4571, a starter/adjunct culture with traits that are extremely desirable in Swiss cheese production, which include autolysis, reduced bitterness and enhanced flavour development (Callanan et al., 2008), and also the industrially important plasmid pLP712 of L. lactis, encoding lactose catabolism and proteolytic enzymes (Shearman et al., 2008). Propionibacterium freudenreichii ssp. shermanii CIP 103027 is a member of the dairy propionibacteria, commonly isolated from cheese and other dairy products, and is important for the development of flavour and the characteristic holes formed by CO2 in Emmental (Swiss-type) cheese (Dherbecourt et al., 2008). Brevibacterium linens BL2 is used as an adjunct culture in the ripening stage of soft cheddar-type cheese production. It produces an enzyme that converts L-methionine into methanethiol, an important aromatic component of the cheddar cheese aroma. Multiple strains of L. lactis, S. thermophilus, Lb. delbrueckii, Lb. helveticus and Lb. casei have now been sequenced (Table 1), providing deeper insight into their genomic diversity. A pangenome sequencing analysis of 11 strains of S. thermophilus has identified 65 kb of DNA, in regions > 1.5 kb, that was not previously found in the three sequenced genomes (Danielsen and Rasmussen, 2008).

Table 1. Microbial genome sequencing relevant to dairy fermentations.

Species | Subspecies/strain | GenBank code | Size (kb) | GC% | Origin/use | Reference
Lactococcus lactis | lactis/IL1403 | NC_002662 | 2365 | 35 | Cheese | Bolotin et al. (2001)
Lactococcus lactis | cremoris/SK11 | NC_008527 | 2641 | 35 | Cheese | Makarova et al. (2006)
Lactococcus lactis | cremoris/MG1363 | NC_009004 | 2530 | 35 | Cheese | Wegmann et al. (2007)
Streptococcus thermophilus | CNRZ1066 | NC_006449 | 1796 | 39 | Yoghurt, cheese | Bolotin et al. (2004)
Streptococcus thermophilus | LMG18311 | NC_006448 | 1797 | 39 | Yoghurt, cheese | Bolotin et al. (2004)
Streptococcus thermophilus | LMD9 | NC_008532 | 1864 | 39 | Yoghurt, cheese | Makarova et al. (2006)
Streptococcus thermophilus | 11 strains | | | | | Danielsen and Rasmussen (2008)
Lactobacillus delbrueckii | bulgaricus/ATCC11842 | NC_008054 | 1865 | 49 | Yoghurt | van de Guchte et al. (2006)
Lactobacillus delbrueckii | bulgaricus/ATCC-BAA365 | NC_008529 | 1857 | 49 | Yoghurt | Makarova et al. (2006)
Lactobacillus helveticus | DPC4571 | NC_010080 | 2081 | 38 | Cheese | Callanan et al. (2008)
Lactobacillus helveticus | CNRZ32 | n.a. | ∼2280 | 37 | Cheese | J. Broadbent et al., Utah State University, USA
Lactobacillus helveticus | R0052 | n.a. | n.a. | n.a. | Dairy | J. Broadbent et al., Utah State University, USA
Lactobacillus helveticus | CM4 | n.a. | 2028 | n.a. | n.a. | Calpis; Kitasato University, Japan
Lactobacillus casei | BL23 | NC_010999 | 3079 | n.a. | Cheese | J. Deutscher et al., Centre National de la Recherche Scientifique, France
Lactobacillus casei | ATCC334 | NC_008526 | 2924 | 46 | Cheese | Makarova et al. (2006)
Lactobacillus casei | casei/ATCC393 | n.a. | ∼3000 | n.a. | Cheese | M. Hatori, University of Tokyo, Japan
Leuconostoc mesenteroides | mesenteroides/ATCC8293 | NC_008531 | 2075 | 37 | Olives | DOE Joint Genome Institute, USA
Brevibacterium linens | BL2 | AAGP01000000 | 4366 | 63 | Cheese | DOE Joint Genome Institute, USA
Propionibacterium freudenreichii | shermanii/CIP103027 | n.a. | ∼2500 | 67 | Cheese | Dherbecourt et al. (2008)
Propionibacterium freudenreichii | ATCC6207 | n.a. | ∼2640 | 67 | Cheese | DSM and Friesland Foods, the Netherlands

n.a., not available, either proprietary or incomplete.

The diversity of LAB natural isolates is being studied by comparative genome hybridization (CGH) using microarrays based on a single reference genome, e.g. for Lb. casei (Cai et al., 2008) and Lb. helveticus (Broadbent et al., 2008), or multiple reference genomes, e.g. S. thermophilus (Rasmussen et al., 2008) and L. lactis (Ganesan et al., 2008) (G. Felis, pers. comm.). A novel genotype-calling algorithm PanCGH has been developed to analyse these pangenome arrays, and this has been applied to L. lactis strains (Bayjanov et al., 2008).
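The idea behind presence/absence calling on a multi-reference (pangenome) array can be illustrated with a deliberately simplified sketch. It is not the PanCGH algorithm of Bayjanov et al. (2008); the ortholog-group names, the normalised signal scale and the 0.5 cut-off are all assumptions chosen for illustration. Here a gene family is called present in the query strain when the best-hybridizing probe from any reference strain reaches the cut-off.

```python
from typing import Dict, List


def call_presence(signals: Dict[str, List[float]], cutoff: float = 0.5) -> Dict[str, bool]:
    """Call an ortholog group present if its strongest probe signal reaches the cut-off.

    `signals` maps each ortholog group to the normalised hybridization signals
    of its probes (one or more per reference strain). The cut-off is hypothetical.
    """
    return {group: max(values) >= cutoff for group, values in signals.items() if values}


# Hypothetical signals (scaled 0..1) for a single query strain.
query_signals = {
    "OG1": [0.91, 0.12],  # strong signal on the probe from one reference strain
    "OG2": [0.08, 0.15],  # weak signal on every probe
    "OG3": [0.55],        # only one reference strain carries this gene family
}

calls = call_presence(query_signals)
print(calls)
```

Taking the maximum over probes reflects the reason for using several reference genomes: a divergent gene may hybridize well to the variant from only one of them.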

Genome mining

Genome sequence analysis can provide the first insight into metabolic potential. An excellent, albeit older, example is the prediction that Lb. casei, a non-starter LAB that increases in number during the later stages of cheese ripening, has the potential to use citrate as an alternative energy source when lactose has been depleted (Diaz-Muniz et al., 2006).

A putative complete citric acid cycle (TCA) was reconstructed from the genome sequence, and experimentally shown to be active under simulated cheese ripening conditions, converting citrate mostly to acetic acid instead of lactic acid, yielding 2 ATP per molecule of citric acid.

The potential to form flavours from amino acids was compared in all sequenced LAB by searching their genomes for enzymes involved in proteolysis and amino acid conversions (Liu and Siezen, 2006; Liu et al., 2008). Focusing on enzymes involved in metabolism of the sulfur-containing amino acids methionine and cysteine, which are known precursors of many dairy flavours, the largest set of enzymes was found in typical dairy LAB such as L. lactis, S. thermophilus and Lb. casei. The genome sequence of Lactobacillus helveticus DPC4571 (Callanan et al., 2008) revealed a number of formerly unknown endopeptidases with potential roles in hydrolysis of proline-rich caseins and bitter peptides. These peptidases were cloned, overexpressed and further characterized with synthetic peptide substrates and in a cheese model system (Slattery et al., 2008). Amino acid auxotrophy in Lb. helveticus CNRZ32 was predicted from its genome sequence, and agreed well with phenotypic amino acid requirements (Christiansen et al., 2008).

Lipolysis of milk fat also contributes to flavour formation in cheese. By combining several bioinformatics methods, 23 putative esterases for lipolysis were identified in the genome of Propionibacterium freudenreichii CIP103027, the main agent for lipolysis in Emmental cheese (Dherbecourt et al., 2008). Twelve of these putative esterases were selected and expressed in E. coli, of which six showed esterase activity on short-chain naphthyl esters, thereby confirming the efficiency of genome mining.
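As a minimal illustration of sequence-based genome mining, the sketch below scans protein sequences for the G-x-S-x-G pentapeptide that surrounds the active-site serine in many esterases and lipases. This is only one possible first-pass filter and is not claimed to be among the methods combined by Dherbecourt et al. (2008); the protein names and sequences are hypothetical.

```python
import re
from typing import Dict, List

# G-x-S-x-G motif; the lookahead lets overlapping occurrences be reported too.
GXSXG = re.compile(r"(?=(G.S.G))")


def find_esterase_motif(proteins: Dict[str, str]) -> Dict[str, List[int]]:
    """Return 1-based start positions of G-x-S-x-G in each protein that has a hit."""
    hits: Dict[str, List[int]] = {}
    for name, sequence in proteins.items():
        positions = [match.start() + 1 for match in GXSXG.finditer(sequence.upper())]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical protein fragments, for illustration only.
proteins = {
    "hypothetical_1": "MKTLAVGHSMGGALAT",
    "hypothetical_2": "MSNQLLKKAAEE",
}

motif_hits = find_esterase_motif(proteins)
print(motif_hits)
```

A motif hit is only a candidate: as in the study above, where six of twelve expressed candidates were active, predictions of this kind need experimental confirmation.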

The putative transport capabilities of eleven Gram-positive bacteria, including the dairy LAB Lb. casei, L. lactis, Lc. mesenteroides, Lb. delbrueckii and S. thermophilus, have been predicted using extensive comparative genome analysis (Lorca et al., 2007). This study has provided detailed information on the potential uptake systems for carbohydrates, peptides and amino acids in each species, as classified according to TCDB, the membrane transport protein classification database (http://tcdb.ucsd.edu).

One of the most exciting and useful aspects of having full-genome sequences is the ability to construct genome-scale metabolic models. They enable input and output fluxes, ATP production, growth rate, biomass yields and product formation to be predicted, and then experimentally tested (for LAB examples see Smid et al., 2005; Notebaart et al., 2006; Teusink et al., 2006). New genome-scale models have now been made for S. thermophilus (Pastink et al., 2008), and the pangenome (multiple strains) of L. lactis (Wels et al., 2008). Individual genome-scale models of the three sequenced L. lactis strains have been reconstructed using Pathway Tools and the BioCyc database (Ganesan et al., 2008) (http://www.biosystems.usu.edu/cibcyc).
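Genome-scale metabolic models rest on a steady-state assumption: for every internal metabolite, production and consumption fluxes must cancel. The toy sketch below checks that condition for a lumped homolactic fermentation of glucose and reads off the ATP yield. The five-reaction network, the reaction names and the flux values are illustrative assumptions; the published models contain hundreds of reactions and are typically solved by linear programming rather than checked by hand.

```python
from typing import Dict

# Toy network (lumped, illustrative only). For each reaction, the coefficient of
# every metabolite: negative = consumed, positive = produced.
REACTIONS: Dict[str, Dict[str, float]] = {
    "glucose_uptake": {"glucose": 1},
    "glycolysis": {"glucose": -1, "pyruvate": 2, "atp": 2, "nadh": 2},
    "lactate_dehydrogenase": {"pyruvate": -1, "nadh": -1, "lactate": 1},
    "lactate_export": {"lactate": -1},
    "atp_demand": {"atp": -1},
}


def net_production(fluxes: Dict[str, float]) -> Dict[str, float]:
    """Sum coefficient x flux over all reactions, per metabolite."""
    balance: Dict[str, float] = {}
    for reaction, flux in fluxes.items():
        for metabolite, coefficient in REACTIONS[reaction].items():
            balance[metabolite] = balance.get(metabolite, 0.0) + coefficient * flux
    return balance


# Candidate flux distribution (arbitrary units per unit of glucose taken up).
fluxes = {
    "glucose_uptake": 1.0,
    "glycolysis": 1.0,
    "lactate_dehydrogenase": 2.0,
    "lactate_export": 2.0,
    "atp_demand": 2.0,
}

balance = net_production(fluxes)
is_steady_state = all(abs(value) < 1e-9 for value in balance.values())
atp_per_glucose = fluxes["atp_demand"] / fluxes["glucose_uptake"]
print(is_steady_state, atp_per_glucose)
```

In a full model the flux values are unknowns, and an optimiser searches among all distributions that satisfy these balances for the one that maximises, for example, biomass formation.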

Experimental omics

In situ transcriptome analysis

Most of the omics data related to dairy fermentations have been obtained from in vitro experiments designed to mimic a dairy product environment (Kok et al., 2005; Neves et al., 2005; Kilstrup, 2006). Experimental data obtained from the product environment itself are limited. The major problem is that dairy environments such as fermented milk, and especially cheese, have a very rich protein and fat content. This makes the isolation of bacterial RNA, proteins or metabolites extremely difficult. In a recent study, the transcriptome profile of Lb. helveticus CNRZ32 grown in milk was compared with that during growth in a defined medium (Smeianov et al., 2007). The cells grown in milk had 42 upregulated genes, encoding cell-envelope proteinases, oligopeptide transporters, endopeptidases and enzymes involved in lactose, cysteine and purine metabolism. A DNA microarray time series was analysed during the first 20 h of a batch fermentation of L. lactis in milk (De Jong et al., 2008). The data were used to reconstruct gene regulatory networks and revealed a number of unknown regulons and DNA motifs in the genome of L. lactis.

Recently, the first methodological studies were reported on the extraction of RNA from cheese, either directly (Monnet et al., 2008a) or after separation of bacterial cells from the cheese (Makhzami et al., 2008; Ulvé et al., 2008). An alternative approach was developed to follow gene expression directly in cheese using recombinant in vivo expression technology (R-IVET). R-IVET does not depend on RNA isolation; rather, it ‘records’ in situ promoter activity throughout the incubation period by the irreversible excision of a marker fragment from the genome. Genome-scale analysis of in situ gene expression was developed for L. lactis, and allowed the identification and validation of positively regulated promoters in a product environment (Bachmann et al., 2008a). For the evaluation of in situ activated target sequences, a high-throughput cheese-manufacturing model, termed MicroCheese, was developed (Bachmann et al., 2008d). This MicroCheese system in combination with the R-IVET toolbox was used to identify and validate L. lactis promoters induced during the manufacturing and ripening of a Gouda-type cheese made with a mixed starter culture (Bachmann et al., 2008b).

In situ proteomics and metabolomics

Hannon and co-workers described the preparation of an aqueous phase of cheddar cheese and the subsequent separation of bacterial proteins from milk proteins by affinity chromatography and gel filtration (Hannon et al., 2008). Proteome analysis identified bacterial proteins from cheese manufactured with pure cultures of either S. thermophilus or L. lactis but also from cheeses manufactured with a mixed culture of both strains. The analysis showed that many genes involved in stress response and energy generation were upregulated during the cheese fermentation. Yvon and co-workers separated bacterial cells from the cheese matrix, determined the activity of eight flavour-forming enzymes and investigated the proteome and metabolome of the cell extracts (Yvon et al., 2008). Minor differences were found in the proteome between 1 and 7 days after cheese manufacturing, but important differences were seen in bacterial metabolites.

Bacterial interactions in dairy consortia

The impact of genomic approaches on the elucidation of microbial interactions was reviewed recently (Sieuwerts et al., 2008a). Current developments in the dairy environment include transcriptome and proteome studies on mixed cultures of S. thermophilus and Lb. bulgaricus in milk (Monnet et al., 2008b; Sieuwerts et al., 2008b). This bacterial consortium represents a typical yoghurt culture, and the results reveal new insights into interactions between the two bacteria (Fig. 3). The measurement of volatile bacterial metabolites in mixed-culture dairy fermentations may also permit the identification of bacterial interactions (Janssen et al., 2008).

Figure 3.

Microbial interactions in yoghurt (adapted from Sieuwerts et al., 2008a). Reprinted with permission from the American Society for Microbiology.

Evolutionary aspects of dairy fermentation

A comparison of nine genome sequences of LAB revealed extensive gene loss and horizontal gene transfer during the evolutionary adaptation to their habitat (Makarova et al., 2006). Evolutionary genomic studies of LAB pointed to substantial gene loss, especially in the Lactococcus–Streptococcus branch (Makarova and Koonin, 2007). Gene loss in relation to dairy niche adaptation was reported for L. lactis, Lb. helveticus and the yoghurt bacteria (Bolotin et al., 2004; van de Guchte et al., 2006; Callanan et al., 2008; Siezen et al., 2008). The genome sequences of S. thermophilus and Lb. bulgaricus revealed that > 10% of all potential coding sequences are pseudogenes, indicating that evolutionary processes to adapt to the dairy environment are still very actively ongoing (Bolotin et al., 2004; van de Guchte et al., 2006). Loss of genes for carbohydrate metabolism and amino acid biosynthesis in Lb. bulgaricus reflects an adaptation to the protein-rich milk environment.

Most studies with L. lactis were carried out with strains isolated from the dairy environment. The diagnostic sequencing of two L. lactis plant isolates has now shown that these strains contain many genes never before reported as part of the genome of L. lactis. These genes are mainly involved in the utilization of complex carbohydrates, which typically occur in plant material (Siezen et al., 2008). In a follow-up study, one of these plant isolates was adapted to growth in milk by propagating it for 1000 generations in milk. Three independently evolved strains were extensively characterized and revealed interesting insights into evolutionary aspects of this adaptation process (Bachmann et al., 2008c).

The acquisition of new genes via horizontal gene transfer has been proposed for several dairy-specific LAB (Bolotin et al., 2004; Siezen et al., 2005; Makarova and Koonin, 2007; Callanan et al., 2008), and includes transfer between S. thermophilus, L. lactis and Lb. bulgaricus (Bolotin et al., 2004). Recently, a genomic island of 100 kb, with deviant GC content and flanked by IS elements, was found in the genome of Lb. helveticus DPC4571, and included fatty acid and amino acid metabolism genes (Callanan et al., 2008). One mechanism of horizontal gene transfer is the phage-mediated transduction of DNA. Recently, it was shown for the first time that this mechanism allows the transfer of plasmids from the genus Streptococcus to the genus Lactococcus (Ammann et al., 2008). As bacteriophages can cause cell lysis, they can have a big impact on the performance of starter cultures and they are responsible for substantial financial losses to the dairy industry. Resistance to phage infection can be conferred by CRISPRs (clustered regularly interspaced short palindromic repeats), which are variable repeats separated by DNA spacers found in the genomes of many prokaryotes, including LAB (Barrangou et al., 2007). A recent comparative genome analysis identified 66 CRISPR loci in LAB (Horvath et al., 2008a). A poor correlation of CRISPR families with bacterial phylogeny supports the notion that CRISPRs are acquired via horizontal gene transfer and have further evolved independently. This evolution is mainly determined by phage predation and it forms an important part of the ecology between phages and their hosts. CRISPR sequences were further studied in S. thermophilus (Horvath et al., 2008b) and it was shown that they are responsible for increased phage resistance achieved by successive phage challenges (Deveau et al., 2008). It is suggested that the directed evolution of strains with multiple phage resistances should be possible, which forms an attractive approach for stabilizing industrial fermentation processes.
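The repeat-spacer architecture of a CRISPR locus lends itself to a simple illustration. The sketch below splits a locus on a known repeat to list its spacers, and then compares two strains to find spacers present in only one of them. The repeat, spacer and flanking sequences are invented and far shorter than real ones, and the approach assumes perfectly conserved repeats, which real loci do not always have.

```python
from typing import List


def extract_spacers(locus: str, repeat: str) -> List[str]:
    """Return the sequences lying between consecutive copies of `repeat`."""
    parts = locus.split(repeat)
    # parts[0] is the sequence before the first repeat and parts[-1] the sequence
    # after the last repeat; everything in between is a spacer.
    return parts[1:-1]


def new_spacers(parent: List[str], derivative: List[str]) -> List[str]:
    """Spacers found in the derivative strain but not in its parent."""
    return [spacer for spacer in derivative if spacer not in parent]


# Hypothetical sequences, for illustration only.
REPEAT = "GATTACAG"
parent_locus = "CCAT" + REPEAT + "AAACCC" + REPEAT + "TTTGGG" + REPEAT + "TTAA"
derivative_locus = (
    "CCAT" + REPEAT + "CTCTCT" + REPEAT + "AAACCC" + REPEAT + "TTTGGG" + REPEAT + "TTAA"
)

parent_spacers = extract_spacers(parent_locus, REPEAT)
derivative_spacers = extract_spacers(derivative_locus, REPEAT)
acquired = new_spacers(parent_spacers, derivative_spacers)
print(parent_spacers, acquired)
```

Comparing spacer content between a parent strain and a derivative selected under phage challenge is one way such loci can be used to track how resistance arose.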

When a new process or product is being developed in an industrial setting, the initial stages involve setting up small-scale experiments and then a small-scale pilot plant to mimic the industrial environment. Intelligent use of genomics data should give a competitive edge as it can provide detailed information on the spatio-temporal aspects of the process. It is no surprise then that the number of omics studies performed in a product-like environment is rapidly increasing. A comparison of the data is difficult as most studies use different bacterial strains or methodologies, but the principal discoveries will form the basis of detailed descriptions as to what is happening in these complex environments. It is beyond doubt that the elucidation of the in situ behaviour of bacterial cultures in the post-genomics era will lead to a better insight into dairy fermentations and help to improve industrial fermentation processes.


We thank Sander Sieuwerts and Wouter Lublink (C.S.K.) for permission and contributing their figures for re-use, and Greer Wilson and Michiel Kleerebezem for critically reading and correcting the manuscript. This project was carried out within the research programme of the Kluyver Centre for Genomics of Industrial Fermentation, which is part of the Netherlands Genomics Initiative/Netherlands Organization for Scientific Research.