Lipoproteins in bacteria: structures and biosynthetic pathways


  • Hiroshi Nakayama,

    Corresponding author
    • Biomolecular Characterization Team, RIKEN Advanced Science Institute, Saitama, Japan
  • Kenji Kurokawa,

    1. Global Research Laboratory of Insect Symbiosis, College of Pharmacy, Pusan National University, Busan, Korea
  • Bok Luel Lee

    Corresponding author
    1. Global Research Laboratory of Insect Symbiosis, College of Pharmacy, Pusan National University, Busan, Korea
    • Biomolecular Characterization Team, RIKEN Advanced Science Institute, Saitama, Japan


H. Nakayama, Biomolecular Characterization Team, RIKEN Advanced Science Institute, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan

Fax: +81 48 462 4704

Tel: +81 48 462 1419


B. L. Lee, National Research Laboratory of Defense Proteins, College of Pharmacy, Pusan National University, Jangjeon Dong, Geumjeong Gu, Busan 609-735, Korea

Fax: +82 51 513 2801

Tel: +82 51 510 2809



Bacterial lipoproteins are characterized by the presence of a conserved N-terminal lipid-modified cysteine residue that allows the hydrophilic protein to anchor onto bacterial cell membranes. These proteins play important roles in a wide variety of bacterial physiological processes, including virulence, and induce innate immune reactions by functioning as ligands of the mammalian Toll-like receptor 2. We review recent advances in our understanding of bacterial lipoprotein structure, biosynthesis and structure–function relationships between bacterial lipoproteins and Toll-like receptor 2. Notably, 40 years after the first report of the triacyl structure of Braun's lipoprotein in Escherichia coli, recent intensive MS-based analyses have led to the discovery of three new lipidated structures of lipoproteins in monoderm bacteria: the lyso, N-acetyl and peptidyl forms. Moreover, the bacterial lipoprotein structure is considered to be constant in each bacterium; however, lipoprotein structures in Staphylococcus aureus vary between the diacyl and triacyl forms depending on the environmental conditions. Thus, the lipidation state of bacterial lipoproteins, particularly in monoderm bacteria, is more complex than previously assumed.


Abbreviations

ABC, ATP-binding cassette
Lgt, preprolipoprotein diacylglyceryl transferase
Lnt, apolipoprotein N-acyltransferase
Lol, localization of lipoprotein
Lsp, lipoprotein signal peptidase
MALP-2, macrophage-activating lipopeptide-2 kDa
PBP, penicillin-binding protein
T2SS, type II secretion system
Tat, twin arginine translocation
TLR, Toll-like receptor


In 1973, Hantke and Braun [1] reported biochemical evidence that an abundant Escherichia coli lipoprotein, Lpp, contains an unusual S-glycerylcysteine residue modified with three fatty acids (N-acyl-S-diacylglyceryl cysteine) at its N-terminus (Fig. 1). They also determined the positional distribution of fatty acids in this lipoprotein (Table 1). In recognition of this pioneering work, the triacylated E. coli Lpp lipoprotein is also called Braun's lipoprotein. Additional extensive biochemical studies [2-4] have established that bacterial lipoproteins are a major class of membrane proteins and share type II signal peptide sequences with a conserved lipid-modified cysteine residue at the N-terminus, which allows them to anchor onto the periplasmic leaflet of the plasma membrane or outer membrane. Further studies also showed that some lipoproteins are transported to the outer surface of the outer membrane or secreted into the extracellular milieu. A large number of available genome sequences confirm that bacterial lipoprotein genes are universally distributed in bacteria and constitute 1–3% of their total genes [5, 6].

Table 1. Classification of lipoprotein structures
The Total, R1, R2 and R3 columns list the modified groups to the lipidated cysteine. Letters in parentheses refer to the footnotes below the table.

| Bacterial species | Major habitat | Protein name | Total (a) | R1 (b) | R2 (b) | R3 (b) | Reference |
Class A: N-acyl-S-monoacyl-glyceryl-cysteine structures
A-a: Lyso form
| B. cereus | Intestine, foods (c) | BC0200 | 32 : 0 | 17 : 0 | H | 15 : 0 | [16] |
| | | OppA | 32 : 0 | 17 : 0, 18 : 0 | H | 14 : 0, 15 : 0 | [16] |
| | | PrsA | 32 : 0 | 17 : 0 | H | 15 : 0 | [16] |
| E. faecalis | Intestine | EF2256 | 34 : 1 | 18 : 1 | H | 16 : 0 | [16] |
| | | EF3256 | 34 : 1 | 18 : 1 | H | 16 : 0 | [16] |
| | | PnrA | 34 : 1 | 18 : 0, 18 : 1 | H | 16 : 0, 16 : 1 | [16] |
| L. bulgaricus | Intestine | Ldb0202 | 36 : 2 | 18 : 1 | H | 18 : 1 | [16] |
| | | Ldb2183 | 36 : 2 | 18 : 1 | H | 18 : 1 | [16] |
| S. sanguinis | Oral cavity | SSA_0375 | 34 : 1 | 18 : 1 | H | 16 : 0 | [16] |
| | | SSA_1038 | 34 : 1 | 18 : 0, 18 : 1 | H | 16 : 0, 16 : 1 | [16] |
Class B: N-acyl-S-diacyl-glyceryl-cysteine structures
B-a: N-acetyl form
| B. halodurans | Soil | BH3460 | 32 : 0 | 15 : 0 | 15 : 0 | 2 : 0 | [16] |
| | | MalE | 32 : 0 | 15 : 0 | 15 : 0 | 2 : 0 | [16] |
| B. licheniformis | Soil | MntA | 34 : 0 | 17 : 0 | 15 : 0 | 2 : 0 | [16] |
| | | OppA | 34 : 0 | 17 : 0 | 15 : 0 | 2 : 0 | [16] |
| B. subtilis | Soil | BSU01630 | 34 : 0 | 17 : 0 | 15 : 0 | 2 : 0 | [16] |
| | | PrsA | 34 : 0 | 17 : 0 | 15 : 0 | 2 : 0 | [16] |
| G. kaustophilus | Deep sea | GK0969 | 34 : 0 | R1 + R2: 32 : 0 (d) | | 2 : 0 | [16] |
| | | GK1283 | 34 : 0 | 17 : 0 | 15 : 0 | 2 : 0 | [16] |
| O. iheyensis | Deep sea | CtaC | 32 : 0 | 15 : 0 | 15 : 0 | 2 : 0 | [16] |
B-b: Conventional triacyl form
| A. laidlawii | Waste water | ACL_1223 | 48 : 0, 50 : 0 (f) | 16 : 0 | 16 : 0 | 16 : 0, 18 : 0 | [14] |
| | | | 50 : 0 | (18 : 0) (e) | (16 : 0) (e) | (16 : 0) (e) | Present review |
| | | ACL_1410 | 48 : 0, 50 : 0 (f) | 16 : 0 | 16 : 0 | 16 : 0, 18 : 0 | [14] |
| | | | 50 : 0 | (18 : 0) (e) | (16 : 0) (e) | (16 : 0) (e) | Present review |
| E. coli | Intestine | Lpp | | R1 + R2: 16 : 0, 18 : 1, 17 : Cyc, 16 : 1 (f, h) | | 16 : 0, 16 : 1, 18 : 1 (f) | [1] |
| M. smegmatis | Soil, water, plants | LppX (g) | 51 : 0 | R1 + R2: 16 : 0, 19 : 0 (h) | | 16 : 0 | [30] |
| | | | 51 : 0 | (19 : 0) (e) | (16 : 0) (e) | (16 : 0) (e) | Present review |
| | | LprF (g) | 51 : 0 | R1 + R2: 16 : 0, 19 : 0 (h) | | 16 : 0 | [44] |
| | | | 51 : 0 | (19 : 0) (e) | (16 : 0) (e) | (16 : 0) (e) | Present review |
| M. genitalium | Genital | MG_040 | 50 : 1 | 18 : 1 | 16 : 0 | 16 : 0 | [16] |
| M. pneumoniae | Respiratory tracts | MPN052 | 50 : 1 | 18 : 1 | 16 : 0 | 16 : 0 | [16] |
| | | MPN415 | 50 : 1 | 18 : 1 | 16 : 0 | 16 : 0 | [16] |
| P. gingivalis | Oral cavity | PG1828 | 47 : 0 | R1 + R2: 32 : 0–33 : 1 (d) | | 15 : 0, 14 : 0 | [24] |
| S. aureus | Nares, skin | SA0739 | 52 : 0 | R1 + R2: 32 : 0–37 : 0 (d) | | 15 : 0–20 : 0 | [15] |
| | | SA0771 | 52 : 0 | R1 + R2: 32 : 0–36 : 0 (d) | | 16 : 0–20 : 0 | [15] |
| | | SA2074 | 51 : 0 | R1 + R2: 31 : 0–36 : 0 (d) | | 15 : 0–20 : 0 | [15] |
| | | SA2202 | 52 : 0 | R1 + R2: 32 : 0–36 : 0 (d) | | 16 : 0–20 : 0 | [15] |
| | | SitC | 53 : 0 | R1 + R2: 33 : 0–37 : 0 (d) | | 16 : 0–20 : 0 | [15] |
| S. epidermidis | Skin | SitC | 53 : 0 | R1 + R2: 32 : 0–35 : 0 (d) | | 17 : 0–20 : 0 | [15] |
Class C: S-diacyl-glyceryl-cysteine structures
C-a: Conventional diacyl form
| B. viridis | Sea, soil | Cytochrome c | | R1 + R2: 18 : 1, 18 : 1; 18 : OH, 18 : OH; 18 : OH, 18 : 1 (h) | | H | [25] |
| L. monocytogenes | Foods (c) | Lmo0135 | 32 : 0 | 17 : 0 | 15 : 0 | H | [16] |
| | | Lmo2196 | 32 : 0 | 17 : 0 | 15 : 0 | H | [16] |
| | | Lmo2219 | 32 : 0 | 17 : 0 | 15 : 0 | H | [16] |
| M. fermentans | Throat | MBIO_0763 | 34 : 0 | R1 + R2: 34 : 0 (d) | | H | [16] |
| | | MBIO_0869 | 34 : 0 | R1 + R2: 34 : 0 (d) | | H | [16] |
| S. aureus | Nares, skin | SA1659 | 33 : 0 | 18 : 0 | 15 : 0 | H | [17] |
| | | SitC | 32 : 0–35 : 0 | 17 : 0–20 : 0 | 15 : 0 | H | [17] |
C-b: Peptidyl form
| M. fermentans | Throat | MBIO_0319 | 34 : 0 | 18 : 0 | 16 : 0 | Ala-Ser- | [16] |
| | | MBIO_0661 | 34 : 0 | 18 : 0 | 16 : 0 | Ala-Gly- | [16] |

(a) The total carbon number of modified acyl groups on the lipidated cysteine is calculated from the MS/MS-analyzed lipopeptide peak(s). The most abundant N-terminal lipopeptide ion in MALDI-TOF MS was usually analyzed by MS/MS.
(b) R1 and R2 denote a hydrogen or an acyl group attached to the sn-1 and sn-2 positions of the S-glyceryl group of the lipidated cysteine, respectively. R3 denotes a hydrogen, an acyl group or a dipeptide attached to the α-amino group of the lipidated cysteine (Fig. 1).
(c) Foods probably contained human, animal and/or soil contaminants.
(d) The total carbon number of modified acyl groups on R1 and R2 is shown.
(e) Our interpretation is shown in parentheses.
(f) Descending order in abundance.
(g) M. tuberculosis lipoprotein was heterologously expressed in M. smegmatis.
(h) Fatty acid species were determined, but their positions were not.
Figure 1.

The canonical biosynthetic pathway of bacterial lipoproteins. After preprolipoprotein translocation by the Sec or Tat machinery into the outer leaflet of the plasma membrane, the first enzyme, Lgt, transfers a diacylglyceryl moiety from a membrane phospholipid to the sulfhydryl group of the +1 cysteine of the conserved lipobox motif in the type II N-terminal signal peptide, generating a thioether linkage. The second enzyme, Lsp, then cleaves the signal peptide at the N-terminus of the +1 S-diacylglyceryl cysteine of the prolipoprotein. Finally, the third enzyme, Lnt, transfers an acyl group from another phospholipid to the newly generated α-amino group of the S-diacylglyceryl cysteine of the apolipoprotein, resulting in the mature triacylated lipoprotein.

These lipoproteins play important roles in bacterial physiology and virulence, including nutrient uptake, cell wall metabolism, cell division, transmembrane signal transduction, antibiotic resistance and adhesion to host tissues during infection. In addition, bacterial lipoproteins function as trigger molecules for the activation of host innate immune responses via Toll-like receptor (TLR)2, and TLR signals contribute to the establishment of adaptive immunity [7]. Therefore, it is important to understand the molecular relationship between bacterial lipoprotein structures and their biological functions in bacterial cell growth and host innate immune responses.

As shown in Fig. 1, the canonical biosynthetic pathway of bacterial lipoproteins, which was determined in the E. coli system, consists of three sequentially acting enzymes: preprolipoprotein diacylglyceryl transferase (Lgt), prolipoprotein signal peptidase (Lsp) and apolipoprotein N-acyltransferase (Lnt) [8]. All three enzymes have been shown to play essential roles in the survival of E. coli and other Gram-negative bacteria. By contrast, in Gram-positive bacteria, Lgt and Lsp appear to be essential in at least some of the tested Actinobacteria [high-guanine + cytosine (GC)-content species] but not in Firmicutes (low-GC-content species). In the present review, we refer to the apolipoprotein containing N-acylation-free S-diacylglycerylcysteine as diacyl lipoprotein because it is the mature form in some bacteria, and we refer to the hololipoprotein containing N-acyl-S-diacylglycerylcysteine as triacyl lipoprotein (Fig. 1).
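The nomenclature used in this pathway can be made concrete with a small model. The following Python sketch is only an illustration of the three sequential steps described above; the class name `Lipoprotein`, its fields and the strict ordering checks are assumptions introduced here for illustration, not part of any published tool, and the model ignores the unconventional forms discussed later in the review.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Lipoprotein:
    """Hypothetical record of the lipidation state of one lipoprotein."""
    signal_peptide: bool = True      # type II signal peptide still attached
    s_diacylglyceryl: bool = False   # diacylglyceryl on the +1 cysteine thiol
    n_acyl: bool = False             # acyl group on the cysteine alpha-amino group

    def __post_init__(self):
        # Reject states that the canonical pathway cannot produce.
        if self.n_acyl and self.signal_peptide:
            raise ValueError("N-acylation requires a free alpha-amino group")
        if not self.signal_peptide and not self.s_diacylglyceryl:
            raise ValueError("Lsp cleavage follows Lgt modification")

    @property
    def form(self):
        if self.signal_peptide and not self.s_diacylglyceryl:
            return "preprolipoprotein"
        if self.signal_peptide:
            return "prolipoprotein"
        if not self.n_acyl:
            return "apolipoprotein (diacyl)"
        return "hololipoprotein (triacyl)"

    @property
    def acyl_chains(self):
        # Two ester-linked chains from Lgt, one amide-linked chain from Lnt.
        return 2 * int(self.s_diacylglyceryl) + int(self.n_acyl)


def lgt(protein):
    """Step 1: attach the diacylglyceryl moiety to the +1 cysteine."""
    if protein.form != "preprolipoprotein":
        raise ValueError("Lgt acts on the preprolipoprotein")
    return replace(protein, s_diacylglyceryl=True)


def lsp(protein):
    """Step 2: cleave the signal peptide from the prolipoprotein."""
    if protein.form != "prolipoprotein":
        raise ValueError("Lsp acts on the prolipoprotein")
    return replace(protein, signal_peptide=False)


def lnt(protein):
    """Step 3: N-acylate the freed alpha-amino group of the apolipoprotein."""
    if protein.form != "apolipoprotein (diacyl)":
        raise ValueError("Lnt acts on the apolipoprotein")
    return replace(protein, n_acyl=True)


diacyl = lsp(lgt(Lipoprotein()))
triacyl = lnt(diacyl)
print(diacyl.form, diacyl.acyl_chains)
print(triacyl.form, triacyl.acyl_chains)
```

In this sketch, a bacterium lacking Lnt activity simply stops at the diacyl apolipoprotein, which is why that form is treated as a mature product in some species.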

Because Firmicutes (including low-GC-content Gram-positive bacteria) and Tenericutes do not contain E. coli-type Lnt orthologue genes in their genome sequences, and lipoproteins purified from cell wall-deficient mycoplasmas (which belong to Tenericutes) have been shown to be of the diacyl form [9-11], it has been widely assumed over the last decade that these bacterial species possess only diacylated lipoproteins [12, 13]. However, recent studies using MALDI-TOF MS have provided clear biochemical evidence of the existence of the triacyl form of lipoproteins in staphylococci and in some Tenericutes [14-16]. Subsequent studies have determined unconventional but conserved lipid-modified structures, leading to the discovery of a novel class of bacterial lipoprotein structures in Firmicutes [16]. These novel structures strongly suggest the presence of an unidentified N-acyltransferase and another enzyme necessary for the biosynthesis of lipoproteins in some Firmicutes and Tenericutes. Moreover, environment-mediated structural variation between triacylated and diacylated structures was also demonstrated in Staphylococcus aureus [17]. These recent findings are now prompting us to change the conventional views of the structures and biosynthetic processes of bacterial lipoproteins. Thus, in the present review, we summarize the recent research progress made in this field and discuss the biological significance of newly identified bacterial lipoprotein structures, biosynthetic pathways and biological functions.

Bacterial lipoprotein structures

Determination of lipid-modified structure of lipoproteins by MS

The lipid-modified structure of E. coli Lpp was determined by the combination of a number of experimental methods, including chemical degradation, fatty acid analysis by GLC, amino acid analysis, incorporation studies of radioisotope-labelled compounds, MS and chemical synthesis [1]. Although each of these methods provided only a partial structure, an appropriate combination of them allowed a consistent lipid modification structure to be determined. However, this approach requires a large amount of highly purified lipoprotein. Because Lpp is the most abundant small protein in E. coli and is covalently attached to the peptidoglycan layer, this lipoprotein can be purified relatively easily by extensive washing of insoluble peptidoglycan to remove contaminating proteins, lipids and other impurities. By contrast, it is often difficult to purify other lipoproteins to homogeneity, resulting in ambiguous or inconsistent lipoprotein structures.

By contrast, recent advances in MS-centred analyses, along with an increase in information about bacterial genomes, have enabled us to determine the lipid-modification structures of bacterial lipoproteins clearly and directly. Remarkably, the development of gel electrophoresis-based MS analyses has allowed us to analyze crude lipoproteins with significantly smaller amounts of starting material (i.e. 10- to 100-fold less than conventional methods) [14, 15, 18, 19]. In MS analysis, native lipoproteins are usually digested with trypsin to obtain smaller N-terminal lipopeptides and then the precise molecular mass of these lipopeptides is determined. The N-terminal lipopeptides are often separated from other nonlipopeptide fragments by organic solvent extraction or adsorption to hydrophobic media, and are discriminated from them in MS by elemental composition analyses. Determination of the accurate molecular mass of N-terminal lipopeptides by MS allows estimation of the putative numbers of total carbon atoms and double bonds in the fatty acid modifications. Further MS/MS analysis of the N-terminal lipopeptide ions provides information not only on the amino acid sequence of the peptide moiety, but also on the identities and composition of the amide-linked fatty acid and ester-linked R1 (sn-1) and R2 (sn-2) fatty acids (Fig. 2). Fatty acid compositions can also be determined based on preferential losses from precursor ions of neutral functional group(s), such as acylated thioglyceride and whole fatty acid(s), in MS/MS because the dissociation efficiencies of the neutral groups depend on the property of each chemical bond. Recently, we developed a simple method for enhancing the efficiency of neutral loss of the acylated or unacylated thioglyceride group by H2O2 oxidation of the lipopeptides before MS measurement, which allows a clear determination of the composition of amide-linked fatty acids [15].
Moreover, recent studies regarding MS analyses of glycerophospholipids [20] provided evidence that the fatty acid substituted on the primary alcohol (R1, sn-1 or sn-3 position) of the glyceryl group is more stable than that on the secondary alcohol (R2, sn-2 position) in MS/MS experiments. This tendency can also be applied to the MS/MS analysis of N-terminal lipopeptides, resulting in a stronger signal for the R1 (sn-1) fatty acid-containing lipopeptide ion than that for the R2 (sn-2) fatty acid-containing lipopeptide ion [17, 21]. Based on such knowledge, we also determined the positional distribution of esterified fatty acids by MS/MS analysis of N-terminal lipopeptides (Fig. 2). Furthermore, MS analysis of the N-terminal lipopeptides after digestion with Pseudomonas lipoprotein lipase, which cleaves only esterified fatty acids with preferential cleavage of sn-1 or sn-3 (R1) fatty acids over sn-2 (R2) [22], provided clues about the positional distribution of esterified fatty acids. The determination of all positional distributions of fatty acids by the two methods has yielded consistent results [15, 16]. Therefore, based on our recent accumulated knowledge, the present review carefully re-interprets the MS/MS data previously reported by other groups.
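The mass arithmetic behind these assignments can be sketched in a few lines of Python. This is a minimal illustration, not the analysis software used in the cited studies: the function names are hypothetical, the atomic masses are standard monoisotopic values, and only unmodified chains of formula CnH(2n-2d)O2 are handled (so cyclopropane or hydroxy fatty acids such as 17 : Cyc or 18 : OH are out of scope). Branched and straight chains give identical masses here, which mirrors the isomer limitation of MS.

```python
import re

# Standard monoisotopic atomic masses in Da.
MASS_C = 12.0
MASS_H = 1.00782503
MASS_O = 15.99491462
MASS_WATER = 2 * MASS_H + MASS_O


def parse_acyl(label):
    """Parse the 'carbons : double bonds' shorthand, e.g. '18 : 1' -> (18, 1)."""
    match = re.fullmatch(r"\s*(\d+)\s*:\s*(\d+)\s*", label)
    if match is None:
        raise ValueError(f"not a carbons : double-bonds label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def fatty_acid_mass(carbons, double_bonds):
    """Monoisotopic mass of the free fatty acid CnH(2n-2d)O2 (whole-acid loss)."""
    hydrogens = 2 * carbons - 2 * double_bonds
    return carbons * MASS_C + hydrogens * MASS_H + 2 * MASS_O


def ketene_mass(carbons, double_bonds):
    """Mass lost when the same acyl group leaves as a ketene (acid minus water)."""
    return fatty_acid_mass(carbons, double_bonds) - MASS_WATER


def total_composition(labels):
    """Sum several acyl labels into one total, e.g. 16 : 0 + 18 : 1 -> 34 : 1."""
    pairs = [parse_acyl(label) for label in labels]
    carbons = sum(c for c, _ in pairs)
    double_bonds = sum(d for _, d in pairs)
    return f"{carbons} : {double_bonds}"


for label in ("16 : 0", "18 : 1"):
    c, d = parse_acyl(label)
    print(label, round(fatty_acid_mass(c, d), 4), round(ketene_mass(c, d), 4))
print(total_composition(["16 : 0", "18 : 1"]))
```

The whole-acid mass of a 16 : 0 chain computed this way is about 256.24 Da, which is the "MH minus 256" loss used later in the text to recognize palmitic acid.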

Figure 2.

The novel lyso, N-acetyl and peptidyl forms of bacterial lipoproteins. Structural determination of the lyso form: MALDI-TOF MS/MS spectrum of the N-terminal lipopeptide of E. faecalis PnrA (m/z 997.7) (A) and the determined structure (B) are shown. The C-terminus-containing y-series ions and product ions that have lost a fatty acid (m/z 715.5), ketene (m/z 733.5 or 759.6) or monoacyl(18 : 1)-thioglycerol (m/z 625.4) are highlighted. The position of O-acylation has not been determined (for details, see text). Structural determination of the N-acetyl form: MS/MS spectrum of the N-terminal lipopeptide of B. licheniformis MntA (m/z 1016.7) (C) and the determined structure (D) are shown. The y-series ions and other major product ions are highlighted. Structural determination of the peptidyl form: the MS/MS spectrum from a fraction containing in-gel-digested MBIO_0319 from M. fermentans (m/z 1071.7) (E) and the determined structure (F) are shown. The y-series ions, y* ions that have lost ammonia from y ions and other major product ions are highlighted. The asterisk indicates a contaminating ion derived from the matrix. This research was originally published in the Journal of Biological Chemistry. Kurokawa K, et al. (2012) Novel bacterial lipoprotein structures conserved in low-GC content Gram-positive bacteria are recognized by Toll-like receptor 2. J Biol Chem 287, 13170–13181. © the American Society for Biochemistry and Molecular Biology [16].
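As a worked check of the m/z values quoted in this legend for E. faecalis PnrA, the sketch below matches each observed neutral loss to a candidate acyl chain. It assumes singly charged ions (so differences in m/z equal neutral masses), a hypothetical candidate list of common chains and an arbitrary 0.3 Da tolerance; none of these choices comes from the original study.

```python
# Standard monoisotopic atomic masses in Da.
MASS_C, MASS_H, MASS_O = 12.0, 1.00782503, 15.99491462
MASS_WATER = 2 * MASS_H + MASS_O

# Hypothetical candidate chains as (carbons, double bonds).
CANDIDATES = [(14, 0), (15, 0), (16, 0), (16, 1), (17, 0), (18, 0), (18, 1), (18, 2)]


def neutral_loss_mass(carbons, double_bonds, kind):
    """Mass of an acyl chain lost as a whole fatty acid or as a ketene."""
    acid = carbons * MASS_C + (2 * carbons - 2 * double_bonds) * MASS_H + 2 * MASS_O
    if kind == "fatty acid":
        return acid
    if kind == "ketene":
        return acid - MASS_WATER
    raise ValueError(f"unknown loss type: {kind!r}")


def assign_loss(precursor_mz, product_mz, kind, tolerance=0.3):
    """Return candidate chains whose loss mass matches precursor - product."""
    observed = precursor_mz - product_mz
    return [
        f"{c} : {d}"
        for c, d in CANDIDATES
        if abs(neutral_loss_mass(c, d, kind) - observed) <= tolerance
    ]


PRECURSOR_MZ = 997.7  # PnrA N-terminal lipopeptide, from the legend
for product_mz, kind in [(715.5, "fatty acid"), (733.5, "ketene"), (759.6, "ketene")]:
    print(product_mz, kind, assign_loss(PRECURSOR_MZ, product_mz, kind))
```

Under these assumptions the whole-acid loss and one ketene loss both point to an 18 : 1 chain and the other ketene loss points to a 16 : 0 chain, in line with the 16 : 0 and 18 : 1 acyl groups listed for PnrA in Table 1.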

Although recent MS-based technologies for bacterial lipoproteins are quite advanced, there are still some limitations: (a) quantitative analysis of lipoproteins is sometimes difficult because of differences in ionization efficiencies and (b) MS cannot discriminate between stereoisomers or between some constitutional isomers, such as double-bond positional isomers of unsaturated fatty acids and branched- and straight-chain isomers of saturated fatty acids. However, discrimination between double-bond positions of fatty acids in phospho(glycero)lipids has been achieved recently via ozone-induced double-bond-specific dissociation in an in-house-modified MS instrument [23].

Bacterial lipoprotein structures in Gram-negative bacteria

After the first determination of a triacylated structure of E. coli Braun's lipoprotein [1], the structure of the peptidoglycan-associated lipoprotein Pal, obtained from both E. coli and Pseudomonas aeruginosa, was next determined as a lipidated protein [3]. Subsequently, similar lipoproteins were also identified in several other Gram-negative Proteobacteria, including the Haemophilus influenzae type b strain [4]. These studies on bacterial lipoprotein structures and concomitant studies on the three essential enzymes for their maturation provided evidence of the common existence of the N-acyl-S-diacylglycerylcysteine residue in the lipoproteins of Gram-negative γ-Proteobacteria.

Similar triacyl lipoproteins have also been identified in other Gram-negative species. A lipoprotein (PG1828) prepared from the lipopolysaccharide fraction of a periodontopathic Bacteroidetes bacterium, Porphyromonas gingivalis, was structurally characterized [24]. After trypsin digestion and further purification by normal phase LC, the N-terminal lipopeptide of this lipoprotein was analyzed by MALDI-quadrupole-TOF MS, revealing that the major forms of this lipoprotein were 46 : 0- (where 46 and 0 refer to the numbers of total carbons and double bonds, respectively) and 47 : 0-triacylated. Further MS/MS analysis of the 47 : 0-triacylated form revealed that the lipopeptide contains the N-acyl(14 : 0 or 15 : 0)-S-diacyl(33 : 0 or 32 : 0) glycerylcysteine residue at its N-terminus. To our knowledge, this is the first example of an MS/MS-based determination of native bacterial lipoprotein structure. However, because the recovery of the characterized P. gingivalis lipoprotein was not described, it is uncertain whether the triacyl structure is the common and/or major lipidation structure in P. gingivalis. Nevertheless, this study [24] is the first report of triacyl structures in Gram-negative bacteria outside the γ-Proteobacteria, namely in Bacteroidetes.

Additionally, a few exceptional structures have also been reported in other Gram-negative bacteria. The cytochrome c subunit in the photosynthetic reaction centre of the α-proteobacterium Blastochloris viridis, formerly known as Rhodopseudomonas viridis, was determined to be a diacyl lipoprotein with two different fatty acids (18 : OH and 18 : 1; where 18 : OH refers to hydroxyoctadecanoic acid) [25]. The quantitative recovery of diacyl lipoprotein suggested that the diacyl structure is a major form, at least in this subunit, and thus does not appear to be an intermediate for the triacyl form. Although the B. viridis genome sequence is unavailable at present, α-Proteobacteria appear to possess the E. coli-type lnt gene, indicating that the lnt gene of B. viridis might not be expressed or functional under the conditions examined.

Furthermore, an unconventional lipidation structure was reported from the spirochete Borrelia burgdorferi, which is the causative agent of tick-borne Lyme disease [26]. An organic phase obtained from a proteinase K-treated lipoprotein fraction was further purified by high-performance TLC to obtain the lipid component of lipoproteins, and the purified fraction was found to contain N-acyl-S-diacylglycerylcysteine in which one of the two O-acylated fatty acids was probably substituted with an acetyl group (O-acetyl form) [26]. Of note, no direct evidence was provided of either O-acetylation or glycerylcysteine in the purified fraction. The total fatty acid compositions were 16 : 0, 18 : 0, 18 : 1 and 18 : 2, whereas the compositions of the amide-linked fatty acids were 16 : 0 and 18 : 0; however, the fatty acid compositions are inconsistent with the determined molecular weights in the corresponding conditions. The inconsistencies could be partly resolved by the assumption of the O-acetyl form. The fatty acid compositions are also inconsistent with previous work in which palmitic acid (16 : 0) was reported to be the predominant fatty acid (~ 80% of total fatty acids) in a B. burgdorferi lipoprotein fraction that was isolated by standard Triton X-114 partitioning and then delipidated by extensive washes with organic solvent [27]. Although the recovery of their lipoprotein fraction was not described [26], the recovery may be assumed to be low as a result of the use of the unusual purification method of organic solvent partitioning (a modified Folch method that is typically used for lipid extraction) and filtration to isolate the lipoprotein fraction [26]. To date, no lipoprotein species with the O-acetyl form have been identified in Borrelia or any other bacterium. Obviously, further direct analysis is required to confirm this unique lipidation structure of Borrelia lipoproteins.

Interestingly, genomic analyses of some symbiotic bacteria, including the α-proteobacterium Wolbachia pipientis, an endosymbiont of filarial nematodes [28], and the γ-proteobacterium Buchnera sp., an endosymbiont of aphids [29], showed a loss of Lnt orthologues in their genomes. These data suggest that these symbiotic bacteria may have diacylated lipoproteins, although the number of predicted lipoprotein genes in their genomes is limited.

In summary, the triacylated bacterial lipoprotein structure is widely accepted as the common structure in Gram-negative bacteria, with some exceptions.

Bacterial lipoprotein structures in Gram-positive bacteria

Diacyl lipoproteins in some Firmicutes and Tenericutes

Low-GC-content single-membrane-containing (monoderm) bacteria such as Firmicutes and Tenericutes had been assumed to possess only diacyl lipoproteins for two reasons: (a) Firmicutes and Tenericutes do not encode an apparent E. coli Lnt orthologue in their genomes [12, 13, 30] and (b) lipoproteins in Tenericutes have been shown to be of the diacyl form [9-11]. For example, a macrophage-stimulating molecule in Mycoplasma fermentans was isolated and determined to be a diacyl lipopeptide, named macrophage-activating lipopeptide-2 kDa (MALP-2), by MALDI-TOF MS [9]. The two esterified fatty acids were described as a mixture of 16 : 0/18 : 0 and 16 : 0/18 : 1 [9], although the resolution of the mass spectrum was too low to determine the exact composition of the fatty acid species. The 18 : 1 fatty acid was not observed in a recent study in which the precursor protein of MALP-2 (MBIO_0763 of strain PG18) and MBIO_0869 were shown to adopt the S-diacyl(34 : 0 in total)glycerylcysteine structure [16]. In addition, Listeria monocytogenes, a bacterium that causes food poisoning and is capable of living as an intracellular parasite, was recently demonstrated to have only diacyl lipoproteins. MS and MS/MS analysis showed that L. monocytogenes Lmo2196 and two other lipoproteins had an α-amino group-free S-diacyl(R1-17 : 0, R2-15 : 0)glyceryl cysteine at their N-termini [16]. This fatty acid distribution of the S-glyceryl group is consistent with the literature on L. monocytogenes membrane fatty acids, which are dominated (> 95%) by branched 15 : 0 and 17 : 0 fatty acids [31].

Although S. aureus lipoproteins were recently reported to be of the N-acylated triacyl form [15, 19], Kurokawa et al. [17] also provided solid biochemical evidence that N-acylation-free diacyl lipoproteins accumulated when acidic pH conditions were combined with a post-logarithmic growth phase. In addition, either high temperatures or high salt concentrations additively accelerated the accumulation of the diacyl form. Until this study, the lipidated structure of bacterial lipoproteins had been considered to be constant in each bacterium and, consequently, each bacterium was assumed to possess a static biosynthetic pathway. However, these results suggested that bacterial lipoprotein structures can vary in response to environmental conditions. The diacyl SitC lipoprotein in S. aureus was modified with 17 : 0 to 20 : 0 fatty acids at the R1 (sn-1) position and a 15 : 0 fatty acid at the R2 (sn-2) position (Table 1). By contrast, Tawaratsumida et al. [32] reported that an S. aureus diacyl lipoprotein assumed only one fatty acid-modified structure in which both the R1 and R2 positions were modified with palmitic (16 : 0) acid. The dipalmitoylglyceryl structure is a very minor form in S. aureus phospholipids and was never detected in our S. aureus lipopeptide analyses [15, 17, 19]; in addition to their failure to detect other major diacylglyceryl structures, this raises the question of where their S-dipalmitoylglyceryl peptide originated.

Peptidyl lipoproteins in M. fermentans

In contrast to the conventional diacylated form, an unusual diacyl lipoprotein structure was recently identified in the M. fermentans lipoproteins MBIO_0319 and MBIO_0661 (Table 1) [16]. MS/MS analysis and Edman degradation of the N-terminal lipopeptide prepared from MBIO_0319 demonstrated the existence of additional alanyl-serine residues in front of the lipidated cysteine residue (Fig. 2E,F). Similarly, MBIO_0661 has alanyl-glycine residues in front of the S-diacylglycerylcysteine. This unusual structure was named the ‘peptidyl form’ (Table 1). Each of the four M. fermentans lipoproteins analyzed takes only one of two structures: either the conventional diacylated form or the peptidyl form. Because the N-terminal alanyl-serine or alanyl-glycine sequence is identical to the deduced amino acid sequence of MBIO_0319 or MBIO_0661, respectively, unusual cleavage-site selectivity of M. fermentans Lsp against the lipobox sequences may generate these peptidyl forms (see below) [16]. This finding goes beyond the commonly accepted notion that the N-terminal residue of a bacterial lipoprotein is always the lipidated cysteine.

N-Acylated triacyl lipoproteins in some Firmicutes and Tenericutes

Earlier studies indicated the existence of N-acylated lipoproteins in two typical low-GC-content Gram-positive bacteria, Bacillus subtilis and S. aureus, based on the examination of electrophoretic mobility and on the amounts of residual radioisotope-labelled lipoproteins after alkaline hydrolysis, which cleaves esterified fatty acids, respectively [33, 34]. Additionally, immunological studies suggested that Mycoplasma genitalium and Mycoplasma pneumoniae possess triacylated lipoproteins [35, 36]. However, insufficient evidence of lipidated structures, along with the two reasons noted earlier, raised doubts about the real existence of triacyl structures [12, 13, 37, 38].

Under these circumstances, recent MS-based biochemical analysis provided the first solid evidence of the triacylated form in the Gram-positive bacterium S. aureus [15, 19]. The N-terminal peptide of the SitC protein, an abundant lipoprotein in S. aureus, was analyzed using MALDI-TOF MS, demonstrating that the SitC lipoprotein had a triacyl structure modified with various fatty acids (the fatty acid composition depends on the strain, although it is typically 46 : 0–53 : 0 in total). Subsequently, the triacyl forms of the SitC lipoprotein were further confirmed by MALDI-TOF MS/MS in combination with the lipoprotein lipase digestion or H2O2-oxidation method. These analyses revealed that the α-amino groups of lipidated cysteine residues in S. aureus lipoproteins are modified with fatty acids of varying lengths, from 15 : 0 to 20 : 0. Moreover, four other S. aureus lipoproteins were also determined to have N-acylated triacyl forms [15]. Unsaturated fatty acids are mostly incorporated into the esterified acyl chains of lipoproteins of the S. aureus SA113 strain, although the positional distribution has not been clearly determined as a result of these unsaturated fatty acids being only minor species [15].

Additionally, the SitC protein [39] purified from Staphylococcus epidermidis, an opportunistic microorganism residing on skin that is one of the five most common nosocomial infectants, was found to be triacylated with various fatty acids (the sum of the carbon number ranges from 49 : 0 to 55 : 0) (Table 1). The MS/MS spectrum of a major 53 : 0-triacylated lipopeptide ion showed that the S. epidermidis SitC lipoprotein is also an N-acylated (17 : 0 to 20 : 0) triacyl form [15].

Besides these staphylococci, some Tenericutes also contain the N-acylated triacyl form. Serebryakova et al. [14] developed an MS-based method that uses a hydrophobic membrane surface as an adsorbent for lipopeptides and applied this method to characterize lipoproteins prepared from Acholeplasma laidlawii, a noncholesterol-requiring model organism for mycoplasma. All eight A. laidlawii lipoproteins analyzed had N-acylated triacyl forms that possessed various fatty acids ranging from 46 : 0 to 52 : 0 in total carbon number. The most intense molecular species in each lipoprotein was the 48 : 0-triacyl form and the second was the 50 : 0 form; the sum of these two species comprised ~ 70% of the total. The study demonstrated that most of the 48 : 0 (in total)-triacyl form is tripalmitoylated, and suggested the preferential occurrence of palmitic acids in the two esterified (R1 and R2) fatty acids and stearic acid in the amide-linked (R3) fatty acid of the 50 : 0-triacyl form (Table 1). However, the characteristic N-palmitoyl dehydroalanyl peptide ions generated by the neutral loss of O-diacylthioglyceride from the molecular ions can easily be identified in the published MS/MS spectra of both the 50 : 0-triacyl and tripalmitoyl forms, indicating that the major amide-linked (R3) fatty acid should be palmitic acid in both. In addition, their MS/MS spectra demonstrated the presence of a product ion generated by the neutral loss of whole palmitic acid (MH minus 256), suggesting that the major esterified R2 fatty acid is likely to be palmitic acid. These data in turn indicate that R1 is occupied by stearic acid. Taken together, their 50 : 0 forms are likely to be mainly N-palmitoyl-S-diacyl(R1-stearoyl, R2-palmitoyl)glycerylcysteine. Our interpretation suggests that the unidentified Lnt may have strict substrate specificity, as shown by the preferential incorporation of palmitic acid into the α-amino group (Table 1).
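The last step of this re-interpretation is simple bookkeeping: once the total composition and two of the three acyl groups are fixed, the third follows by subtraction. A minimal Python sketch of that step is given below; the function names are hypothetical and the code assumes that every label is a plain "carbons : double bonds" pair.

```python
def parse_acyl(label):
    """Parse 'carbons : double bonds', e.g. '50 : 0' -> (50, 0)."""
    carbons, double_bonds = (int(part) for part in label.split(":"))
    return carbons, double_bonds


def remaining_acyl(total, assigned):
    """Subtract already assigned acyl groups from the total composition."""
    carbons, double_bonds = parse_acyl(total)
    for label in assigned:
        c, d = parse_acyl(label)
        carbons -= c
        double_bonds -= d
    if carbons <= 0 or double_bonds < 0:
        raise ValueError("assigned acyl groups exceed the total composition")
    return f"{carbons} : {double_bonds}"


# 50 : 0 form of A. laidlawii: R3 and R2 both assigned as palmitic acid (16 : 0).
print(remaining_acyl("50 : 0", ["16 : 0", "16 : 0"]))
# 48 : 0 form: the same assignment leaves a third palmitoyl chain.
print(remaining_acyl("48 : 0", ["16 : 0", "16 : 0"]))
```

With R3 and R2 both taken as 16 : 0, the 50 : 0 form leaves 18 : 0 for R1, which corresponds to the stearoyl assignment argued above.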

The lipoprotein structures of two closely-related mycoplasmas, M. genitalium and M. pneumoniae, were determined as N-acylated triacyl forms by MS/MS analyses in combination with lipoprotein lipase digestion or H2O2 oxidation [16]. Lipoproteins from these mycoplasmas possess N-acyl(16 : 0)-S-diacyl(R1-18 : 1, R2-16 : 0)glyceryl cysteine at their N-termini, as demonstrated by selective neutral losses in their MS/MS spectra. This positional distribution of fatty acids in the diacylglycerol moiety was consistent with previous studies on membrane phosphatidylglycerol from cholesterol-requiring mycoplasmas, including M. pneumoniae; the sn-1 (R1) position of the membrane phospholipid is dominated by unsaturated fatty acid and the sn-2 (R2) position is dominated by saturated fatty acids [40]. This distribution is in contrast to other bacterial (not mycoplasmal) phospholipids in which the sn-1 (R1) position is substituted with saturated fatty acids and the sn-2 (R2) position is substituted with unsaturated, branched or cyclopropane-containing fatty acids [41].

Lyso and N-acetyl lipoproteins in some Firmicutes

The presence of an unexpected triacyl lipoprotein structure in staphylococci prompted us to determine whether other bacteria in Firmicutes also have N-acylated triacyl lipoproteins. We attempted to answer this question using MS-based structural analyses of lipoproteins purified from many Firmicutes species and found the widespread existence of the N-acylated form, including two unexpected novel N-acylated lipoprotein structures [16].

The purine nucleoside receptor PnrA (EF0177 in strain V583) of Enterococcus faecalis, a well-known intestinal bacterium, was shown to have a lipidated cysteine residue with two fatty acids of 34 : 1 in total, and no conventional triacyl form was detected. Further MS/MS analyses in combination with lipoprotein lipase digestion or H2O2 oxidation revealed an unusual structure containing N-acyl(16 : 0)-S-monoacyl(18 : 1)glycerylcysteine at the N-terminus (Fig. 2A,B). This lipoprotein structure lacked one of the two esterified fatty acids of the S-diacylglyceryl group and was thus named the ‘lyso form’. Although we could not determine the exact O-acylation position in the lyso structure, we assumed that the R1-acylated lyso form is the major species because the acyl group of lysophospholipids is known to migrate between the R1 and R2 positions at neutral pH and ambient temperatures, with the R1-acyl form predominating [20]. The same lyso forms were also identified in other E. faecalis lipoproteins, such as EF2256 and EF3256 (Table 1).

In addition to E. faecalis, the same lyso form lipoprotein structures were also found in three other bacterial species: Bacillus cereus, a component of human gut microbiota and a causative agent of food poisoning; Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus), a probiotic strain originating from Bulgarian yogurt; and Streptococcus sanguinis, a member of human indigenous oral microflora (Table 1). Although the composition of fatty acids was different for these bacteria, the results suggested that the lyso form is a well-distributed lipoprotein structure in the Class Bacilli in Firmicutes and that at least several Gram-positive bacterial species in the gut (or of a probiotic strain) produce the lyso form of lipoproteins.

Another lipoprotein structure has been identified from Bacillus licheniformis, a soil bacterium capable of causing food poisoning. The N-terminus of the manganese-binding subunit of the ATP-binding cassette (ABC) transporter MntA contains the N-acetyl-S-diacyl(R1-17 : 0, R2-15 : 0)glycerylcysteine residue (Fig. 2C,D and Table 1). This structure was named the ‘N-acetyl form’. The determined fatty acid positions of the S-glyceryl group were consistent with published data on Bacillus phospholipids [42]. Similar analyses demonstrated the presence of the same N-acetyl form in another B. licheniformis lipoprotein OppA (Table 1). Furthermore, neutrophilic B. subtilis, a model Gram-positive bacterium used in the production of Natto, a traditional Japanese dish of fermented soya beans, produced the N-acetyl form of lipoproteins (Table 1). The positional distribution of fatty acids in the diacylglyceryl moiety was consistent with those of phospholipids [43]. The N-acetyl form was further isolated from three other extremophilic strains, namely the alkaliphilic Bacillus halodurans grown at pH 9.5, the alkaliphilic and extremely halotolerant Oceanobacillus iheyensis grown at pH 9.5 and the thermophilic Geobacillus kaustophilus isolated from the deep sea of the Mariana Trench and grown at 60 °C (Table 1). These results strongly suggest that the N-acetyl form is also a well-distributed bacterial lipoprotein structure in the Family Bacillaceae, although B. cereus produces the lyso form.

Lipoprotein structures in Actinobacteria

Triacylated lipoproteins were identified in high-GC-content Gram-positive diderm Mycobacteria [30, 44]. Recombinant forms of the lipoproteins LppX and LprF of Mycobacterium tuberculosis, the etiological agent for tuberculosis, were heterologously expressed in Mycobacterium smegmatis, a fast-growing model mycobacterium. They were analyzed by MALDI-TOF MS and MS/MS after trypsin digestion. The LppX was triacylated with fatty acids containing a total carbon number of 51 : 0. MS/MS analysis demonstrated that the modifying fatty acids are 16 : 0 for N-acylation and 16 : 0 and 19 : 0 for O-acylation [30]. The neutral losses in the MS/MS spectra clearly suggest that the 19 : 0 fatty acid is at the R1 position and the 16 : 0 fatty acid is at the R2 position, although the exact positional distribution was not noted [30]. This interpretation of the MS/MS spectra is consistent with the previously reported positional distribution of fatty acids in mycobacterial phospholipids: 18 : 0, 18 : 1 and 19 : 0-branched (tuberculostearic acid) at sn-1 and 16 : 0 and 16 : 1-branched at sn-2 [45]. Another M. tuberculosis lipoprotein, LprF, expressed in M. smegmatis, was triacylated with the same fatty acids as LppX but was also probably O-glycosylated with hexose at a serine or threonine residue [44]. In addition, the monoderm Streptomyces scabies, a plant pathogen, also exhibits triacylated lipoproteins when a Streptomyces coelicolor lipoprotein is heterologously expressed [46]. Because Mycobacteria possess an outer membrane-like structure around their cells [47-49], the triacylated structure might play a role in the translocation of lipoproteins to the outer membrane in a manner similar to that seen in E. coli, as discussed below. However, the exact biological function of N-acylation in these species has not been determined.

Biosynthetic pathways of bacterial lipoproteins

The canonical biosynthetic pathway of E. coli lipoproteins and putative new enzymes in low-GC-content bacteria

The canonical biosynthetic pathway of bacterial lipoproteins was determined in E. coli following pioneering work by Wu and colleagues [8, 50-53]. After translocation by Sec or twin arginine translocation (Tat) machinery, lipoproteins mature in three sequential steps catalysed by Lgt, Lsp and Lnt (Fig. 1), all of which are integral membrane enzymes embedded in the cytoplasmic membrane and are essential for the growth of E. coli. All three enzymes are also conserved in high-GC-content Gram-positive Actinobacteria. Although the E. coli-type Lnt is absent in low-GC-content Firmicutes and Tenericutes, an unidentified Lnt must be involved in the production of the N-acylated triacyl and N-acetyl forms. An additional new enzyme should contribute to the biosynthesis of the lyso form. In addition, an unusual Lsp that has unique cleavage-site specificity may contribute to the production of the peptidyl form in M. fermentans (Fig. 3).

Figure 3.

Bacterial lipoprotein biogenesis in low-GC-content monoderm bacteria. The dotted lines indicate uncharacterized or putative modification steps. In staphylococci and some mycoplasmas, sequential reactions by Lgt (step 1) and Lsp (step 2) produce diacyl lipoprotein and an unidentified N-acylation enzyme (step 3) probably causes its maturation into the triacyl form. To generate the lyso form lipoprotein in B. cereus, E. faecalis and other bacteria, a putative deacylase may work on the triacyl form to remove one of two ester-linked fatty acids (step 4). Alternatively, a possible transacylase may work on the diacyl form to translocate one of two ester-linked fatty acids to the α-amino group of the cysteine residue (step 5). In M. fermentans, peptidyl forms are generated by an unusual Lsp enzyme that may have unique cutting-site specificity (step 6). In Bacillaceae, the N-acetyl form is generated by an unusual Lnt enzyme that adds an acetyl group to the α-amino group of diacyl lipoprotein from an unknown substrate (step 7). This research was originally published in the Journal of Biological Chemistry. Kurokawa K, et al. (2012) Novel bacterial lipoprotein structures conserved in low-GC content Gram-positive bacteria are recognized by Toll-like receptor 2. J Biol Chem 287, 13170–13181. © the American Society for Biochemistry and Molecular Biology [16].

Translocation by Sec or Tat machinery

A nascent polypeptide chain of a bacterial lipoprotein, called a preprolipoprotein, is exported by the Sec or Tat translocases. The Sec machinery exports unfolded nascent chains using the chemical energy generated by ATP hydrolysis, whereas the Tat translocase exports already-folded proteins containing a twin-arginine motif at their N-termini depending on the proton motive force [54]. The majority of preprolipoproteins are translocated by Sec. Components of the Sec machinery (SecA, SecD, SecE, SecF and SecY) are required for translocating preprolipoproteins [55, 56], as well as other proteins. The Sec machinery also affects the Lsp-mediated processing of lipid-modified prolipoproteins that are formed by Lgt (Fig. 1) [57]. Bioinformatic and biochemical studies have shown that a significant portion of preprolipoproteins are translocated via Tat in high-GC-content Gram-positive Streptomyces [12, 58].

Lgt in the canonical biosynthetic pathway

After export, the lipoprotein biosynthesis machinery processes these preprolipoproteins depending on a conserved motif in their N-terminal Type II signal peptides consisting of three parts: the n-region, h-region and c-region. The c-region has the consensus [LVI]–3[ASTVI]–2[GAS]–1C+1 sequence, which is called the ‘lipobox’ [5]. The sulfhydryl group of the C-terminal cysteine residue of the lipobox is crucial for further lipid modification by Lgt.
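As an illustration of how the lipobox consensus can be used to screen sequences, the following Python sketch encodes the pattern quoted above as a regular expression. It is a minimal sketch, not a validated predictor: the function names, the 40-residue search window and the example sequences are all hypothetical, and the n-region and h-region features that dedicated lipoprotein prediction tools also evaluate are ignored here.

```python
import re

# Consensus from the text: [LVI] at -3, [ASTVI] at -2, [GAS] at -1, Cys at +1.
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def find_lipobox_cys(sequence, window=40):
    """Return the 0-based index of the +1 Cys of the first lipobox found in
    the N-terminal window, or None if no lipobox matches.

    The window size is an arbitrary assumption for this sketch.
    """
    match = LIPOBOX.search(sequence[:window])
    return None if match is None else match.start() + 3


def mature_n_terminus(sequence, window=40):
    """Return the sequence starting at the +1 Cys (the Lsp cleavage product),
    or None if no lipobox is found."""
    cys_index = find_lipobox_cys(sequence, window)
    return None if cys_index is None else sequence[cys_index:]


# Made-up sequences for illustration only (not real proteins).
with_lipobox = "MKKTLLALSVLLLAGCSSDK"
without_lipobox = "MKKTLLALSVLLLAGASSDK"

print(find_lipobox_cys(with_lipobox))      # 15
print(mature_n_terminus(with_lipobox))     # CSSDK
print(mature_n_terminus(without_lipobox))  # None
```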

Lgt transfers the diacylglycerol moiety from membrane phospholipids to the cysteine residue of preprolipoproteins and produces prolipoproteins (Fig. 1). Lgt utilizes negatively-charged phospholipids, phosphatidylglycerol in particular, as its lipid substrate [8]. Cell fractionation experiments have demonstrated that the Lgt enzyme is an inner membrane protein [59]. Its membrane topology was determined by fusing Lgt to β-galactosidase or alkaline phosphatase and by the substituted cysteine accessibility method. The data obtained revealed that Lgt has seven transmembrane segments, and that its N-terminus faces the periplasm, whereas its C-terminus faces the cytoplasm [59]. Several highly-conserved regions were determined to be important for substrate recognition [60]. Through site-directed mutagenesis studies, H103 and Y235 of E. coli Lgt were implicated as essential for Lgt enzymatic activity [61], and a recent study suggested that the highly-conserved residues Y26, N146 and G154 are indispensable in vivo [59]. The catalytic mechanism of the Lgt-mediated unique thioether linkage formation has not been determined.

In contrast to the situation in E. coli and other Gram-negative bacteria, Lgt appears to be dispensable in the Firmicutes and high-GC-content Actinobacteria examined to date, except M. tuberculosis and S. coelicolor. The biological roles of Lgt in Gram-positive bacteria determined by mutant research have been reviewed previously [6, 12, 13]. Lgt has been reported to be essential for bacterial growth in some Actinobacteria, including S. coelicolor [58] and M. tuberculosis [62]. In M. tuberculosis, the requirement of Lgt could be a result of the mislocalization of some essential lipoproteins [63]. Therefore, Lgt may be a valuable target enzyme for developing anti-tuberculosis drugs. In Firmicutes, although Lgt is dispensable for growth, studies using lgt mutants have shown that lipoprotein lipidation is required for the full virulence of Bacillus anthracis, the causative agent of anthrax. Spores of an lgt mutant of B. anthracis showed inefficient germination both in vitro and in the mouse skin of a subcutaneous infection model, resulting in attenuated virulence; however, vegetative cells remained virulent as a result of the anthrax toxin [64]. This observation is in accordance with the impaired germination of a B. subtilis lgt mutant [65].

Most bacteria contain a single lgt gene; however, some species in both Gram-negative and Gram-positive bacteria have two or more lgt paralogues [6, 12, 13, 64]. In S. coelicolor, each of the two lgt genes could be disrupted, although no double mutant was obtained, suggesting that lipidation by Lgt might be essential for bacterial growth [58].

Lsp in the canonical biosynthetic pathway

Lsp, a type II signal peptidase, cleaves the signal peptide present in front of the lipidated cysteine residue of prolipoproteins (Fig. 1). E. coli Lsp is also an integral membrane protein with four transmembrane segments. Both its N-terminus and C-terminus face the cytoplasm [66]. Two conserved aspartic acid residues (D102 and D129 in B. subtilis Lsp) in the type II signal peptidases of 19 bacterial species including E. coli are critical for the Lsp activity of both B. subtilis [67] and S. coelicolor [58]. These two aspartic acids might act as a catalytic dyad for a pepsin-type aspartic protease. E. coli Lsp strictly cleaves peptide bonds at the N-terminus of the lipid-modified cysteine residue, whereas Lsps from some Gram-positive bacteria may have a lower specificity or a different recognition mode for the substrate (see below). E. coli Lsp is highly resistant to both high temperatures and a wide range of pH values but is highly sensitive to detergents [68].

The biological function of Lsp for bacterial physiology and virulence, as revealed by mutation studies, has been reviewed previously [6, 12, 13]. In all Firmicutes examined to date, the lsp gene is dispensable for bacterial growth. The enzymatic activity of Lsp is inhibited noncompetitively by the cyclic depsipeptide antibiotic globomycin, which causes the accumulation of prolipoproteins in the inner membrane [69-71]. The inhibition of Lsp activity is proposed as an alternative means of antimicrobial chemotherapy targeting the endosymbiont Wolbachia bacteria of filarial nematodes, resulting in the prevention of lymphatic filariasis or onchocerciasis [72, 73]. As in the case of the lgt genes, most bacteria have a single lsp gene, whereas some bacteria possess two paralogous lsp genes [6, 12, 13].

Unusual Lsp with different cleavage-site specificity

Although E. coli Lsp was originally assumed to cleave only lipid-modified prolipoproteins, Lsps from L. monocytogenes and group B streptococci have been demonstrated to cleave the peptide bond at the N-terminus of the unmodified cysteine residue in the lipobox [74, 75]. As noted above, all of the examined M. fermentans lipoproteins possess either the conventional diacyl or the peptidyl form and are cleaved at the peptide bond just after serine, which strongly suggests that the substrate specificity of M. fermentans Lsp is not restricted to cleavage just before the lipidated cysteine and that cleavage might also occur after a serine residue near the lipidated cysteine residue. Consistent with this expectation, all 27 putative lipoproteins of M. fermentans have a serine residue between the −3 and −1 positions (where +1 is the conserved Cys) in the lipobox [16].

Lnt in the canonical biosynthetic pathway

E. coli Lnt transfers an sn-1 acyl chain of phospholipids to the α-amino group of the lipidated cysteine residue of the apolipoprotein (diacyl lipoprotein) that is generated by the Lsp-mediated cleavage of a prolipoprotein (Fig. 1) [76, 77]. E. coli Lnt utilizes all three major phospholipids of E. coli in vivo as its fatty acid source: phosphatidylethanolamine, phosphatidylglycerol and cardiolipin [78-80]. E. coli Lnt is an integral membrane protein. Topology mapping of Lnt fused to β-galactosidase or alkaline phosphatase indicated the presence of six membrane-spanning segments, with both the N-terminus and C-terminus facing the cytoplasm [81]. The last and largest periplasmic domain of Lnt belongs to the nitrilase superfamily [82] containing the probable catalytic triad E267-K335-C387. Recent elegant biochemical analyses revealed that this triad is involved in the two-step ping-pong reaction mechanism. The initial formation of an acyl-enzyme intermediate in which an sn-1-acyl chain of a phospholipid forms a thioester bond with the thiol of C387 (ping) is followed by N-acyl transfer to the apolipoprotein (pong) [83, 84]. E267 is required for the formation of the acyl–enzyme intermediate, probably by enhancing the basicity of the sulfur of C387. Subsequent analysis also revealed that both the phospholipid head group and the acyl chain composition affect N-acyltransferase activity in vitro [84]. The substrate head group preferred by E. coli Lnt was determined to be phosphatidylethanolamine, although Lnt accepted other phospholipids [84]. The preferred acyl chain composition on phosphatidylethanolamine was determined to be saturated fatty acids (16 : 0 or 18 : 0) at sn-1 and unsaturated fatty acid (18 : 1) or short chain (12 : 0) at sn-2. The length of the sn-1 acyl chain was not important [84]; however, the nontransferred sn-2 acyl moiety plays a critical role in the substrate selectivity of E. coli Lnt [76, 84].

N-Acylation by Lnt is essential for bacterial cell growth [80, 81], although overexpression of the LolCDE proteins, which are components of the lipoprotein translocation machinery, allows the deletion of the lnt gene [85]. Therefore, the fundamental role of Lnt-mediated N-acylation in E. coli is assumed to be the acceleration of lipoprotein translocation onto the outer membrane.

Lnt of Actinobacteria

Recently, Lnt orthologues of M. smegmatis, M. tuberculosis and S. scabies were also reported to have N-acyltransferase activity [30, 46]. Unexpectedly, the above-mentioned N-acylated fatty acid species of lipoproteins in M. smegmatis and the positional distribution of fatty acids in mycobacterial phospholipids [45] indicated that the substrate for M. smegmatis Lnt should be from the sn-2 position of phospholipids. This specificity is different from that of E. coli Lnt, which transfers fatty acids from the sn-1 position to diacyl lipoproteins [76].

All genomes of the Streptomyces species sequenced to date contain two lnt genes [46]. Although both the Lnt paralogues in S. scabies were functional, they were not essential for cell growth and virulence [46].

Unidentified Lnt in staphylococci and mycoplasmas

Structural evidence indicates that staphylococci and some Tenericutes have an unidentified enzyme that catalyses the α-aminoacylation of diacylated lipoproteins. Although we do not know which gene encodes this enzyme, structural information on lipoproteins provides some hints about the fatty acid preference of this unidentified N-acyltransferase. Almost half of the fatty acids bound to the α-amino group of the diacylated cysteine in S. aureus lipoproteins are 18 : 0. By contrast, the 18 : 0 fatty acid is known to constitute only 3.2% of the total membrane or 13.4% of phosphatidylglycerol in S. aureus. These data suggest that the Lnt candidate in S. aureus may preferentially incorporate the 18 : 0 fatty acid, although the source of the fatty acids cannot yet be determined. The structural data also suggest that, if phospholipids are substrates, these should be derived from sn-1 fatty acids because the sn-1 position in phospholipids is predominantly substituted with 16 : 0 to 20 : 0 fatty acids in S. aureus, whereas the sn-2 position is substituted with a 15 : 0-branched fatty acid [86]. This apparent sn-1 specificity is similar to that seen in E. coli Lnt [76] but not in M. smegmatis Lnt [30, 58].
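The argument for a fatty acid preference rests on a simple enrichment comparison, which can be made explicit as follows. This Python sketch is illustrative only: the function name is hypothetical, and the value of 0.5 for the N-acyl 18 : 0 fraction is an assumption standing in for the "almost half" stated above, so the resulting ratios should be read as rough orders of magnitude rather than measured values.

```python
def fold_enrichment(observed_fraction, background_fraction):
    """Ratio of a fatty acid's share among N-acyl groups to its share in a
    reference lipid pool (values are fractions between 0 and 1)."""
    if not 0 < background_fraction <= 1:
        raise ValueError("background_fraction must be in (0, 1]")
    return observed_fraction / background_fraction


# ASSUMPTION: "almost half" of N-acyl groups being 18:0 is approximated as 0.5.
N_ACYL_18_0 = 0.5
# Background shares of 18:0 quoted in the text for S. aureus.
TOTAL_MEMBRANE_18_0 = 0.032
PHOSPHATIDYLGLYCEROL_18_0 = 0.134

vs_membrane = fold_enrichment(N_ACYL_18_0, TOTAL_MEMBRANE_18_0)
vs_pg = fold_enrichment(N_ACYL_18_0, PHOSPHATIDYLGLYCEROL_18_0)

print(f"versus total membrane: about {vs_membrane:.1f}-fold")
print(f"versus phosphatidylglycerol: about {vs_pg:.1f}-fold")
```

Under this assumption, 18 : 0 is over-represented among N-acyl groups relative to either reference pool, which is the basis for inferring a preference rather than random sampling of available fatty acids.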

Similar fatty acid preferences are found in mycoplasmas. The α-aminoacylation enzyme(s) of A. laidlawii have strict fatty acid preferences for palmitic acid (16 : 0). The N-acylated fatty acid of M. pneumoniae lipoproteins was determined to be 16 : 0 only, whereas the total membrane of M. pneumoniae has two other major fatty acids: 18 : 0 and 18 : 1 [87]. Thus, Lnt in M. pneumoniae may prefer 16 : 0 fatty acids for lipoprotein biosynthesis. Identical fatty acid distributions are also observed in M. genitalium lipoproteins. In M. pneumoniae, 16 : 0 fatty acids are predominant at the sn-2 position of the phosphatidylglycerol. Therefore, the putative N-acylation enzyme may transfer a fatty acid from the sn-2 position of a phospholipid to the α-amino group of the diacylglyceryl cysteine residue if phospholipids are the substrates of M. pneumoniae Lnt.

N-Acetylation in Bacillus-related species reveals another interesting characteristic about the substrate specificity of the putative N-acetyltransferase. The substrate for this short chain acylation may originate from sources other than phospholipids because, to our knowledge, there is no report of the existence of an acetyl-containing phospholipid in Bacillus and related bacteria. Hence, acetyl-containing phospholipids, even if they exist, may be present only in small amounts, and/or their half-lives must be relatively short. Recent advances in lipidomic technology [88] should reveal whether this unusual phospholipid(s) exists. Otherwise, the as yet unidentified enzyme(s) may use an acetyl donor other than phospholipids. Most cytoplasmic N-acetyltransferases use high-energy acetyl-CoA as the acetyl donor [89]; however, this cytosolic donor would need to be exported across the cytoplasmic membrane. Another possible substrate is the N- or O-acetyl group attached to extracellular components such as peptidoglycan. Further studies will help to reveal how N-acetylated lipoproteins are produced.

Recently, the structural variation of lipoproteins between the triacyl and diacyl forms was discovered in S. aureus, as described above [17]. Diacyl lipoproteins accumulated in a combination of conditions, including acidic pH and a post-logarithmic growth phase. Intriguingly, pH-up-shift experiments revealed that protein synthesis was also required for the structural alteration of the diacyl form into the triacyl form [17]. Thus, the N-acylated state of S. aureus lipoproteins may be modulated by the expression level of the unidentified N-acyltransferase. Alternatively, the activity or substrate specificity of unidentified modification enzyme(s) may be altered under specific growth conditions. Another possibility is that lipoproteins cannot be further N-acylated as a result of the unavailability of the substrate for N-acylation under specific conditions.

Putative transacylase or deacylase involved in the lyso form structure

To generate the lyso form lipoproteins, at least one additional enzyme is necessary. After the production of diacyl lipoproteins by Lgt and Lsp, either a transacylase, which acts on diacyl lipoproteins, or a combination of Lnt and a deacylase, which acts on triacyl lipoproteins, may be involved in the synthesis of the lyso form (Fig. 3). At present, we do not know which pathway is more feasible for the generation of the lyso forms of lipoproteins in vivo. However, structural data may resolve this issue because lyso form lipoproteins exhibit a fatty acid preference [16]. In L. bulgaricus, the 18 : 1 fatty acid is predominant in amide-linked fatty acids of lipoproteins (Table 1) and is the major species in the sn-1 acyl chain of phospholipids as well. Therefore, the putative transacylase of L. bulgaricus may transfer the 18 : 1 acyl group from the R1 (sn-1) position of diacyl lipoproteins [16]. However, the ester-linked fatty acid of the lyso form lipoproteins in L. bulgaricus is also 18 : 1, which is not a major sn-2 acyl chain of phospholipids, implying that Lnt (but not a transacylase) might carry out the N-acylation of lipoproteins in L. bulgaricus, followed by the preferential removal of the R2 (sn-2) acyl chain from the triacylated intermediates, probably by a deacylase. By contrast, because 15 : 0 or 16 : 0 fatty acids are predominant among the amide-linked fatty acids of B. cereus or E. faecalis lipoproteins, respectively (Table 1), and are the major species in the sn-2 acyl chain of the phospholipids in the respective bacteria, the putative transacylase may transfer the 15 : 0 (or 16 : 0) fatty acid to the α-amino group from the R2 (sn-2) position of the S-diacylglyceryl group of the diacylated intermediates. The remaining R1 (sn-1) acyl chain composition is consistent with the R1 acyl chain composition of the lyso form of lipoproteins. Alternatively, the putative Lnt may transfer the fatty acid from the R2 (sn-2) position of phospholipids, if phospholipids are substrates of Lnt, and a deacylase may remove the R2 acyl chain (15 : 0 or 16 : 0) from the triacylated intermediate.

Translocation of lipoproteins to the outer cell envelope in Gram-negative bacteria

Translocation by the Lol system

In Gram-negative diderm Proteobacteria, most mature lipoproteins are translocated from the outer leaflet of the cytoplasmic membrane to the inner leaflet of the outer membrane via the localization of lipoprotein (Lol) machinery. This translocation mechanism has been studied extensively and reviewed previously [90, 91]. In E. coli, the Lol system consists of an ABC transporter-like LolCDE complex that is embedded in the inner membrane [92], the soluble chaperone protein LolA in the periplasm [93] and a receptor lipoprotein LolB that anchors on the inner leaflet of the outer membrane [94]. Initially, a triacyl lipoprotein is recognized by LolCDE and is transferred to LolA depending on ATP hydrolysis by LolCDE. The lipoprotein and LolA form a water-soluble complex that traverses the periplasm to the outer membrane by simple diffusion. Then, LolA transfers the lipoprotein to LolB in a mouth-to-mouth manner at the outer membrane. Finally, the lipoprotein is translocated to the outer membrane from LolB via an unknown mechanism. All of the components in the Lol system are essential for E. coli growth [95-97].

The residue next to the lipid-modified cysteine functions as the sorting signal for lipoproteins in E. coli [98]. Lipoproteins having Asp at the +2 position are retained in the inner membrane, whereas lipoproteins having other amino acids at position +2 are translocated via the Lol system. Thus, the Asp at +2 functions as a LolCDE avoidance signal. By contrast, Lys at +3 and Ser at +4 constitute the avoidance signal in P. aeruginosa [99]. This type of sorting signal encoded in the N-terminal amino acid(s) of lipoproteins is seen in Enterobacteriaceae [100].
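The sorting rules described above can be written down as a small decision function. The following Python sketch is a deliberately simplified reading of the text, not a validated predictor: the function names and example peptides are hypothetical, only the residues named above are inspected, and any modulating effect of other positions is ignored. Positions are counted from the lipid-modified Cys at +1, so position +2 is index 1 of the mature sequence.

```python
def _check_mature(mature_sequence, min_length):
    """Validate that the sequence starts at the +1 Cys and is long enough."""
    if len(mature_sequence) < min_length or mature_sequence[0] != "C":
        raise ValueError("expected a mature sequence starting with the +1 Cys")


def ecoli_lol_sorting(mature_sequence):
    """Simplified E. coli '+2 rule': Asp at +2 acts as a LolCDE avoidance signal."""
    _check_mature(mature_sequence, 2)
    if mature_sequence[1] == "D":
        return "retained in inner membrane"
    return "translocated via Lol"


def paeruginosa_lol_sorting(mature_sequence):
    """Simplified P. aeruginosa rule: Lys at +3 and Ser at +4 signal avoidance."""
    _check_mature(mature_sequence, 4)
    if mature_sequence[2] == "K" and mature_sequence[3] == "S":
        return "retained in inner membrane"
    return "translocated via Lol"


# Made-up mature N-termini for illustration only.
print(ecoli_lol_sorting("CDSAK"))        # retained in inner membrane
print(ecoli_lol_sorting("CSSNA"))        # translocated via Lol
print(paeruginosa_lol_sorting("CGKSE"))  # retained in inner membrane
```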

When Sutcliffe et al. [6] carried out bioinformatics analyses, they observed that the LolA orthologue is conserved in Proteobacteria and is also found in other bacteria, including Bacteroidetes and Spirochaetes. Unexpectedly, a small subset of Firmicutes possess a LolA homologue in their genomes despite the absence of an outer membrane. On the other hand, LolB orthologues are only detected in β-, γ- and δ-Proteobacteria. This phylogenetic analysis suggests that there may be another translocation system in other bacteria [6].

Translocating lipoproteins to the cell surface of Gram-negative bacteria

Several lipoproteins in some Gram-negative diderm bacteria are detected on the surface of the cells, indicating the existence of machinery translocating lipoproteins to the outer surface of the outer membrane. Some of these surface lipoproteins are transported across the outer membrane by the type II secretion system (T2SS) from the periplasm [101, 102]. For example, the decaheme c-type cytochromes MtrC and OmcA of Shewanella oneidensis, a Gram-negative bacterium known for its metal-reduction capability, are cell surface-exposed lipoproteins that are involved in extracellular electron transfer. Translocating the cytochromes to the outer surface of the cells is directly dependent on the T2SS because deletion of the T2SS genes (gspD or gspG) resulted in the reduced translocation of the cytochromes [102]. Quantitative analysis of cell surface membrane proteins also revealed reduced abundances of many outer membrane proteins, including the MtrC and OmcA lipoproteins in gspD mutant cells [103]. By contrast, in another bacterium, Caulobacter crescentus, T2SS does not affect outer membrane surface translocation but does affect the subsequent cleavage and release of the lipoprotein ElpS under phosphate starvation conditions [104].

The pathogenic Borrelia spirochetes possess abundant surface-exposed lipoproteins that function as major antigens, indicating the existence of the lipoprotein translocation machinery that enables the translocation of lipoproteins to the cell surface. Currently, the molecular mechanisms underlying this export remain largely unknown despite studies carried out by Zückert and colleagues [105-108], who analyzed chimeras between borrelial lipoproteins and a monomeric fluorescent protein reporter and revealed that lipoproteins are transferred to the cell surface ‘by default’. Sequences at the N-terminal ‘tether’ domain [108] allow them to be retained in the periplasm, which is different from the established Lol avoidance signal [98]. In addition, translocation of monomeric lipoproteins to the cell surface requires an unfolded conformation and can be initiated at the C-terminus [107]. Furthermore, dimeric lipoproteins are shown to be exported as unfolded monomeric intermediates and assembled into a dimeric form at the cell surface [106]. The identification of the molecular machinery responsible for these processes will help us understand the transport mechanism of bacterial lipoproteins.

Biological functions of bacterial lipoproteins and their biosynthetic enzymes

Physiological roles of lipoproteins and their lipidation

In E. coli, 90% of more than 100 putative lipoproteins are expected to localize to the inner leaflet of the outer membrane [109]. Most outer membrane lipoproteins play important roles in producing and maintaining the outer membrane in Gram-negative diderms. Among them, three lipoproteins, BamD (YfiO) [110, 111], LolB [96] and LptE (RlpB), are essential for bacterial cell growth. BamD is a component of the BamABCDE complex involved in β-barrel protein insertion into the outer membrane [112]. LolB functions in lipoprotein localization, as described above. LptE complexed with LptD (Imp) is involved in transporting lipopolysaccharide to the cell surface [113]. The essentiality of lipid modification and the Lol system in E. coli may be a result of the lethal effects of mislocalization of these essential outer membrane lipoproteins.

Recently, two outer membrane lipoproteins were shown to activate peptidoglycan synthesis in E. coli [114, 115]. At the final stage of peptidoglycan synthesis, penicillin-binding proteins (PBPs) polymerize and cross-link monomeric units of peptidoglycan precursors into a mature peptidoglycan sacculus. Two lipoproteins, LpoA and LpoB, form a complex with PBP1a and PBP1b, respectively, to activate peptidoglycan synthesis. Braun's lipoprotein (Lpp) maintains connections between the outer membrane and peptidoglycan, which stabilizes the outer membrane envelope [116, 117]. Pal is also an abundant outer membrane lipoprotein in E. coli and is involved in outer membrane invagination during the process of cell division [118].

By contrast, lipoproteins in Gram-positive monoderm bacteria are localized on the outer surface of the cytoplasmic membrane or secreted into the extracellular milieu. Orthologues for ABC transporter periplasmic substrate-binding proteins that are not lipoproteins in Gram-negative bacteria are found as lipoproteins in Gram-positive bacteria [119, 120]. These solute-binding lipoproteins are the most abundant lipoproteins in Gram-positive bacteria [12]. A recent review explains the function of such solute-binding lipoproteins in staphylococcal iron acquisition [121]. Other functional groups of lipoproteins in Gram-positive bacteria, such as β-lactamase for antibiotic resistance, adhesins, and PrsA for protein folding and secretion, have been noted and reviewed previously [12, 13, 120]. Interestingly, B. subtilis PrsA is the only known essential lipoprotein for cell growth [122]; however, the lipidation enzymes Lgt and Lsp are dispensable. Thus, lipid modification of PrsA is not essential for its function. In M. tuberculosis, seven lipoproteins are essential for cell growth [65]. Of these, LpqW is involved in the synthesis of lipoarabinomannan, a major cell wall component that functions as a virulence factor in this bacterium [123].

Lipoproteins as TLR2 ligands

Some lipoproteins of pathogenic bacteria are known to interact with host molecules and function as virulence factors against hosts. In addition, the role of lipidation of lipoproteins in inflammation and bacterial pathogenesis has been evaluated using bacteria with mutations affecting lipoprotein biosynthesis. Because the roles of lipoproteins in virulence have been reviewed previously [12, 13, 124, 125], we focus on the TLR2 stimulation abilities of bacterial lipoproteins by new lipidation structures.

Mammalian TLRs recognize specific pathogen-associated molecular patterns derived from various microorganisms, including bacteria, viruses, protozoa and fungi [126]. TLRs induce inflammatory cytokine secretion for host innate immune defence, and are also involved in the establishment of adaptive immunity [7, 126]. Among them, TLR2 plays a major role in the recognition of Gram-positive bacteria [127]. To date, many TLR2 ligand molecules have been suggested, such as lipoproteins, lipopeptides, peptidoglycan, lipoteichoic acid, lipomannans and lipoarabinomannans [128], whereas other TLRs essentially recognize a single class of pathogen-associated molecular pattern molecules, such as lipopolysaccharide for TLR4. Because these different TLR2 ligands show significant variations in their structures, it is questionable whether TLR2 really interacts with all these suggested ligands [128]. Recent biochemical studies using cell wall component-deficient mutant bacteria clearly demonstrated that bacterial lipoproteins, but not lipoteichoic acid or peptidoglycan, act as the real native TLR2 ligand molecules [19, 129, 130]. This is supported by analyses of crystal structures of a TLR2 ectodomain in complex with a lipopeptide as described below [131, 132], and by studies using lgt mutants of bacteria such as S. aureus [19, 37, 129, 130, 133], L. monocytogenes [134] and group B Streptococcus [75], and using a lipoteichoic acid-deficient S. aureus ltaS mutant [19, 129, 130]. Therefore, only the lipoproteins are real TLR2 ligands, and preparations of the other suggested ligands may have contained lipoproteins as contaminants.

TLR2 is unique in its ability to form heterodimer complexes with TLR1 or TLR6. Triacyl and diacyl synthetic lipopeptides, such as N-palmitoyl-S-dipalmitoylglyceryl CSK4 and MALP-2, have been used as TLR2/1 and TLR2/6 agonists, respectively, leading to a model in which triacylated lipopeptides/lipoproteins activate through the TLR2/1 heterodimer, whereas diacylated lipopeptides/lipoproteins activate through the TLR2/6 heterodimer [135-137]. Recently published crystal structures of TLR2/1 and TLR2/6 heterodimerized with a synthetic triacyl and diacyl lipopeptide, respectively, demonstrated that TLR2 interacts with the O-esterified fatty acids, the glyceryl group and the thioether moiety of the S-glycerylcysteine residue of triacyl and diacyl lipopeptides, and that TLR1 recognizes the amide-linked fatty acid of the triacyl form via its hydrophobic cavity, whereas TLR6 does not have such a cavity. However, other biochemical and cell biological studies did not match this model [19, 138-140]. For example, native triacylated SitC lipoprotein purified from S. aureus cells stimulated immune cells via both the TLR2/1 and TLR2/6 heterodimers [19, 138-140]. These studies demonstrate that not only the lipidation structure, but also the amino acid sequence after the lipidated cysteine residue affects the selectivity of TLR2 heterodimers.

How are immune cells stimulated by the three new lipid-modified structures: the lyso, N-acetyl and peptidyl forms? The question of whether or not lyso form lipoproteins, which lack one of the two esterified fatty acids, are recognized by TLR2 heterodimers is an intriguing one. Interestingly, a synthetic lyso form lipopeptide was inactive against TLR2-expressing cells [141] or TLR2/6-expressing cells, although it activated TLR2/1-expressing cells [140]. However, unexpectedly, two native lyso form lipoproteins stimulate mouse thioglycolate-induced peritoneal macrophages. Additionally, the TLR2 heterodimer selectivities of two different lyso form lipoproteins, namely B. cereus OppA and E. faecalis PnrA, are different from each other: B. cereus OppA stimulates immune cells via both the TLR2/1 and TLR2/6 heterodimers, whereas E. faecalis PnrA mediates the immune signal via the TLR2/6 heterodimer [16]. Therefore, the unusual lyso form lipoproteins can function as TLR2 ligands, and TLR2 heterodimer selectivity may be determined not only by the position of the acyl chains, but also by protein sequences and/or lipid compositions (Table 1). Because the lyso form synthetic peptide was not active in TLR2-expressing cells [141], it can be assumed that the TLR2/1 and TLR2/6 heterodimers (but not the TLR2 homodimer) detect the lyso form structure: in other words, the acquisition of TLR1 and TLR6 via the duplication of the gene for TLR2 [142] might enable host organisms to sense lyso form producing bacteria. Further analyses of structure–function relationships using synthetic lyso form lipopeptides with a variety of different peptide sequences and/or with different fatty acyl groups are required to solve the question of how TLR2 heterodimers recognize the bacterial lyso form lipoprotein at the molecular level.

The TLR2 stimulation ability of another new bacterial lipidation structure, the N-acetyl form, was also examined using mouse macrophages. As expected, the native N-acetyl form lipoproteins and a synthetic N-acetyl form lipopeptide stimulated peritoneal macrophages via TLR2/6 [16], which is consistent with a previous study that used a synthetic N-acetyl form lipopeptide [139]. Peptidyl form lipoproteins were not examined because of difficulties in obtaining sufficient amounts of native lipoprotein for experiments [16].

Because typical Gram-positive bacteria are surrounded by a thick cell wall, their lipoproteins are embedded under the wall, and thus they appear to be protected from any physical interaction with the host cell surface TLR2. Consistent with this, the in vitro TLR2 stimulation activity of a crude peptidoglycan fraction of staphylococci was improved by enzymatic degradation of the peptidoglycan [143]. In the innate immune response, macrophages engulf invading bacteria and deliver them to the phagosome for degradation, where the TLR complexes are recruited [144-146]. Recent studies have reported that phagocytosis of S. aureus cells together with the subsequent degradation of the cell wall in acidic phagosomes is important for an efficient lipoprotein/TLR2 interaction [147, 148]. Moreover, a fluorescent-labelled purified SitC lipoprotein from S. aureus induced the intracellular accumulation of TLR2 in primary murine keratinocytes, and SitC colocalized with TLR2 but not with TLR4 or nucleotide-binding oligomerization domain 2 (Nod2) [149].

Watanabe et al. [150] demonstrated that TLR2 is involved not in the elimination but, instead, in the prolonged intracellular survival of S. aureus in macrophages. This intracellular survival has been explained by the TLR2-dependent inhibition of superoxide production, which is augmented by another bacterial cell wall component, d-alanylated wall teichoic acid [151]. Interestingly, another study has reported that the lipoprotein-dependent intracellular survival of S. aureus requires TLR2 but not myeloid differentiation primary response gene 88 (MyD88) [152].

Because an acidic pH results in S. aureus lipoproteins of the diacyl form [17], it will be interesting to determine whether S. aureus accumulates diacylated lipoproteins in acidic phagosomes and whether the diacylation is relevant to the immune cell activation and/or intracellular survival of S. aureus.


Unexpectedly, 40 years after the first determination of the E. coli triacyl structure, three novel bacterial lipoprotein structures were identified in Gram-positive monoderms. At present, five different lipoprotein structures (conventional diacyl, peptidyl, conventional triacyl, N-acetyl and lyso forms) have been reported. All five of these subclasses have been identified in monoderms such as Firmicutes and Tenericutes. Furthermore, environment-dependent regulation of N-acylation is unexpectedly observed in S. aureus. On the other hand, diderm bacteria, including Gram-negative and high-GC-content Gram-positive bacteria, except for B. viridis and possibly B. burgdorferi, produce triacyl lipoprotein structures. In these diderm bacteria, lipoproteins are translocated from the inner membrane to the periplasmic or extracellular leaflet of the outer membrane, or secreted into the extracellular milieu. The molecular mechanisms underlying the canonical biosynthesis and several sorting pathways have been studied using both genetic and biochemical methods, although our knowledge of these fields is still limited. Along with genetic and biochemical approaches, structural biological analyses of the biosynthetic enzymes will provide deep insights into the reaction mechanisms; however, it is difficult to characterize these proteins because they are integral membrane proteins. Importantly, the recently identified new bacterial lipoprotein structures indicate new modes of lipid modification by unidentified enzymes. We have discussed the putative lipoprotein biosynthetic pathway and proposed plausible biochemical properties for the responsible enzymes based on lipoprotein structures determined by extensive MS/MS analyses. Molecular and biochemical characterization studies of the unidentified key enzymes are still necessary.


This work was supported by grants of a Bio-Program (2008-2004086) and the BK21 program of the National Research Foundation of Korea to B.L.L.