Microtubule-associated protein light chain 3 (LC3), a mammalian homologue of yeast Atg8, plays an essential role in autophagy, which is involved in the bulk degradation of cytoplasmic components by the lysosomal system. Here, we report the crystal structure of LC3 at 2.05 Å resolution with an R-factor of 21.8% and a free R-factor of 24.9%. The structure of LC3, which is similar to those of Golgi-associated ATPase enhancer of 16 kDa (GATE-16) and GABAA receptor-associated protein (GABARAP), contains a ubiquitin core with two α helices, α1 and α2, attached at its N-terminus. Some common and distinct features are observed among these proteins, including the conservation of residues required to form an interaction among α1, α2 and the ubiquitin core. However, the electrostatic potential surfaces of these helices differ, suggesting particular roles in selecting specific binding partners. Hydrophobic patches on the ubiquitin core of LC3, GABARAP and GATE-16 are well conserved and are similar to the E1 binding surface of ubiquitin and NEDD8. Therefore, we propose that the hydrophobic patch is a binding surface for mammalian Atg7 similar to a ubiquitin-like conjugation system. We also propose the functional implications of the ubiquitin fold as a recognition module of target proteins.
Autophagy is a process that involves the bulk degradation of cytoplasmic components by the lysosomal/vacuolar system (Baba et al. 1994; Klionsky & Ohsumi 1999) and is responsible for the majority of intracellular protein degradation (Mortimore & Poso 1987). Autophagy is a cellular survival response to starvation and plays an essential role in developmental processes and differentiation of cells. During autophagy in yeast, cytoplasmic constituents, often including organelles, are first surrounded by a membrane sac called an isolation membrane, which then closes to form a double-membrane structure called an autophagosome. The outer membrane of the autophagosome subsequently fuses with the vacuole, and the released inner membrane together with the sequestered components, called the autophagic body, is degraded in the vacuole for recycling.
Genetic approaches in the yeast Saccharomyces cerevisiae have isolated at least 16 autophagy-defective (Atg) genes that regulate autophagy under nutrient-starved conditions (Tsukada & Ohsumi 1993; Thumm et al. 1994). Among them, four Atg proteins, namely Atg3, Atg4, Atg7 and Atg8, were shown to be involved in a novel ubiquitin-like conjugation system that mediates protein lipidation (Ohsumi 2001; Ichimura et al. 2000).
Atg8, a novel ubiquitin-like protein, undergoes cleavage of its C-terminal arginine residue by the novel cysteine protease Atg4 (Kirisako et al. 2000). The C-terminal-cleaved Atg8 is activated by Atg7, a novel E1-like protein, and is transferred subsequently to Atg3, a novel E2-like enzyme, through a thioester bond. Intriguingly, Atg8 is finally conjugated to phosphatidylethanolamine (PE) by a covalent bond (Ichimura et al. 2000). Although Atg8 exists in a free/peripherally membrane-bound form, the Atg8-PE conjugate exists in a tightly membrane-associated form (Kirisako et al. 2000) and is located on the autophagosome. Thus, the Atg8-PE conjugate is thought to have a critical role in autophagosome formation.
Recent studies have suggested that the molecular machinery of autophagosome formation is evolutionarily conserved from yeast to higher eukaryotes. Mammals also have an Atg8-like conjugation system, called the LC3 system. Microtubule-associated protein light chain 3 (LC3), the first identified mammalian homologue of Atg8, was originally identified as a light chain of the microtubule-associated proteins 1A and 1B in the rat brain (Mann & Hammarback 1994). As in the Atg8 system in yeast, the C-terminal region of LC3 is cleaved by mammalian Atg4 (mAtg4) homologues (Kabeya et al. 2000). The processed form, called LC3-I, has a glycine residue at the C-terminus (Kabeya et al. 2000) and resides in the cytosol. After activation by mammalian Atg7 (mAtg7) and mammalian Atg3 (mAtg3) homologues (Tanida et al. 2001, 2002), LC3-I is further modified to another form, called LC3-II, which is most likely the PE-conjugated form (Kabeya et al. 2004) as in the Atg8 system. Because LC3-II is localized to the autophagosomal membrane, it is now widely used as a key molecule to monitor autophagosomes and autophagic activity in mammalian systems (Kabeya et al. 2000).
In addition to LC3, two mammalian Atg8 homologues have been reported: Golgi-associated ATPase enhancer of 16 kDa (GATE-16) and GABAA receptor-associated protein (GABARAP) (Legesse-Miller et al. 1998; Wang et al. 1999). Recently, they were shown to be substrates of mAtg4, mAtg7 and mAtg3 in a manner similar to LC3 (Tanida et al. 2003; Scherz-Shouval et al. 2003; Hemelaar et al. 2003), and they may be conjugated to PE. However, considering that LC3 is the most sensitive to C-terminal cleavage by mAtg4 and is the most abundant in autophagosomal membranes among the three homologues, LC3 is thought to play a major role in mammalian autophagosome formation.
Here, we report the crystal structure of LC3 at 2.05 Å resolution. LC3 has a ubiquitin fold at the C-terminal region and two helices at the N-terminal region, as seen in GATE-16 and GABARAP. We discuss the structural similarities and differences among these three LC3 family proteins and their functional implications. This is the first report of our studies to elucidate the molecular mechanism of autophagy on a structural basis.
Overall structure of LC3
In the present study, we used rat LC3-I (1-120), which we call LC3 hereafter. The final refined LC3 structure has an R-factor of 0.215 and a free R-factor of 0.249. The region corresponding to amino acids 5-117 was modelled along with 66 water molecules. Four N-terminal residues and three C-terminal residues were disordered in the crystal. The X-ray data collection and refinement statistics are summarized in Tables 1 and 2, respectively.
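As an illustration of what the R-factor and free R-factor measure, the following minimal Python sketch applies the conventional crystallographic definition, R = Σ||Fo| − |Fc|| / Σ|Fo|. The function name `r_factor` and the toy amplitudes are hypothetical and are not taken from the LC3 data; the sketch also assumes that the calculated amplitudes are already on the same scale as the observed ones. The free R-factor uses the same formula, applied only to reflections that were excluded from refinement.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum(| |Fo| - |Fc| |) / sum(|Fo|).

    f_obs and f_calc are equal-length sequences of observed and
    calculated structure-factor amplitudes (assumed already scaled).
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Toy amplitudes, invented for illustration only.
work_obs, work_calc = [100.0, 80.0, 60.0], [90.0, 85.0, 66.0]
free_obs, free_calc = [40.0], [36.0]

r_work = r_factor(work_obs, work_calc)  # reflections used in refinement
r_free = r_factor(free_obs, free_calc)  # reflections held out of refinement
print(r_work, r_free)
```

A free R-factor that is only slightly higher than the working R-factor, as reported here, is generally taken as a sign that the model has not been over-fitted.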
Table 1. Statistics of X-ray data collection
Rmerge(I) = ΣΣ|Ii − <I>| / ΣΣIi, where Ii is the intensity of the ith observation and <I> is the mean intensity. Values in parentheses refer to the outer shell.
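The Rmerge definition in the footnote above can be illustrated with a minimal Python sketch. The function name `r_merge`, the input layout and the toy intensities are hypothetical and are not taken from the LC3 data set. The outer sum runs over unique reflections and the inner sum over the repeated observations of each reflection.

```python
def r_merge(observations):
    """Rmerge = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i.

    `observations` maps each unique reflection (h, k, l) to the list of
    intensities measured for it (hypothetical input layout).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)  # <I> for this reflection
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Toy data, invented for illustration only.
toy = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 1, 1): [50.0, 50.0],
}
print(r_merge(toy))
```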
The structure of LC3 contains a five-stranded central β-sheet (β1–β5) as a core, which is flanked by two pairs of α-helices (α1 and α2, α3 and α4) (Fig. 1). The major portion of the structure (30-117), comprising five β-strands and two α-helices (α3 and α4), closely resembles a ubiquitin fold, and two of the α-helices (α1 and α2) associate with its N-terminus. Comparison of the LC3 structure with entries in the PDB using the DALI search engine (Holm & Sander 1996) revealed that LC3 shows strong structural similarity to GATE-16 (PDB code 1EO6) (Paz et al. 2000) and GABARAP (PDB code 1KJT) (Bavro et al. 2002), with root mean square deviations (r.m.s.d.) of 1.1 Å and 1.3 Å, respectively, for 110 α-carbons. Figure 2A shows the superposition of GATE-16 and GABARAP on the LC3 structure.
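For readers less familiar with the r.m.s.d. values quoted above, the following minimal Python sketch shows how the metric is computed for paired α-carbon coordinates. It assumes that the two structures have already been aligned and superposed; DALI performs its own alignment and superposition, which this sketch does not reproduce. The function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (Å) between paired, pre-superposed atoms.

    Each argument is a list of (x, y, z) tuples; atom i in coords_a is
    assumed to be the structural equivalent of atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Invented coordinates (Å), for illustration only.
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
b = [(1.0, 0.0, 0.0), (3.8, 1.0, 0.0)]
print(rmsd(a, b))
```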
Although the overall structure of LC3 is very similar to those of other homologues (Paz et al. 2000; Bavro et al. 2002; Coyle et al. 2002; Knight et al. 2002; Stangler et al. 2002), LC3 differs structurally from GABARAP and GATE-16 in two regions. One region is the loop between β1 and β2, and the other is the region between β3 and β4. Each of these loops in LC3 has a one-residue insertion compared to those in GABARAP and GATE-16. Gln43 is inserted in the former loop, and Gly85 is inserted in the latter loop. The latter insertion makes it possible to form a hydrogen bond between the main-chain carbonyl group of His86 and the main-chain amide group of Val83, resulting in the formation of an extra strand, β4, by residues 86-88. A salt bridge between the side-chains of His86 and Glu102 also contributes to the stabilization of the β4 strand. The β4 strand is observed in ubiquitin, which also has a Gly insertion at the same position.
α1 and α2, characteristic features of LC3 family proteins
As shown in Fig. 2A, the presence of α1 and α2 is a structural feature of LC3 family proteins. These helices are attached to the ubiquitin core by a number of interactions. Figure 3 focuses on the hydrophilic and hydrophobic interactions among α1, α2 and the ubiquitin core. Hydrophilic interactions, including hydrogen bonds and salt bridges, tether these helices to the ubiquitin core; Lys8 and Arg16 form salt bridges with Asp104 and Asp106 on the loop between α4 and β5, respectively. In addition, the salt bridge between Arg11 and Asp19 maintains the relative orientation of α1 and α2. Besides these salt-bridge interactions, hydrogen bonds are formed between Phe7 and Glu36, Arg11 and Arg16, Thr6 and Glu36, and Arg10 and Thr50. α2 is attached to the ubiquitin core mainly through several hydrophobic interactions, involving Ile23 and Val20 on α2 and Leu53, Pro32 and Phe108, while Phe7 on α1 interacts with Phe108, Ile34 and Tyr110 on the core. Remarkably, most of the residues involved in these hydrophilic and hydrophobic interactions are strictly conserved in LC3 family proteins, including Lys8, Arg16, Ile23, Pro32, Ile34, Glu36, Leu53, Asp106, Phe108 and Tyr110. The conservation of these residues implies that α1 and α2 are indispensable for biological function. It should be noted that in GABARAP, two conformations (open and closed) are observed for α1, and the open conformation was suggested to be involved in tubulin polymerization (Coyle et al. 2002). In the open conformation, the residues corresponding to the α1-helix are flipped almost 180° relative to their position in the closed conformation and are extended. LC3 can probably also adopt an open conformation, although only a closed conformation was observed in our crystals.
Figure 4A compares the electrostatic surface potential of LC3, GATE-16 and GABARAP. Although the interactions among α1, α2 and the ubiquitin core are a common feature of the LC3 family, the electrostatic surface potential distributions on α1 and α2 are quite different. The surface of the α1 moiety of LC3 is basic, in contrast to the acidic nature of GATE-16 and GABARAP. The surface of the α2 moiety, in turn, is acidic, neutral and basic in LC3, GATE-16 and GABARAP, respectively. LC3 family proteins are reported to interact with different target proteins. For example, GATE-16 interacts with N-ethylmaleimide-sensitive fusion protein (NSF), which catalyses SNARE complex disassembly via its ATPase activity (Sagiv et al. 2000). In contrast, GABARAP can interact with the γ2 subunit of GABAA receptors both in vivo and in vitro (Wang et al. 1999). Since GABARAP and GATE-16 co-localize to LC3-positive autophagosomes that are induced by starvation, it remains possible that they participate in autophagy in addition to the roles originally described for them. The difference in electrostatic surface potential of α1 and α2 in LC3 family proteins may instead confer specificity toward their respective target proteins.
The E1 recognition site of LC3 family proteins
Recently, the structure of NEDD8 bound to its E1 enzyme, the APPBP1-UBA3 complex, was reported (Walden et al. 2003). In the complex structure, the hydrophobic patch, including three hydrophobic residues, Leu8, Ile44 and Val70 of NEDD8, played an essential role in binding to the adenylation domain of the E1 protein (Walden et al. 2003). In LC3, Tyr38, Leu82 and Ala114 align structurally with these residues, so they are suggested to be involved in the binding to mAtg7. Val36, Phe79 and Ser110 in GATE-16 and Ala36, Phe79 and Ser110 in GABARAP structurally correspond to these residues. Electrostatic surface potentials around these residues of LC3, GATE-16 and GABARAP are compared in Fig. 4B. The hydrophobic patch including the above residues is localized on one surface of the ubiquitin moiety, supporting the notion that the hydrophobic surface is responsible for the E1 binding in a manner similar to that of ubiquitin or NEDD8. Since the molecular surface of the putative E1 binding site is quite similar among LC3 family proteins, the recognition mechanism by mAtg7 is probably common to these proteins. We superimposed the structure of LC3 onto the NEDD8 structure bound to the adenylation domain of the APPBP1-UBA3 complex. The hydrophobic patch is located on the interface, and the above three residues are located near the contact surface, while α1 and α2 are located on the opposite side. Although the present model is crude, we can speculate that LC3 family proteins interact with their E1-like protein, mAtg7, using the hydrophobic surface, and that α1 and α2 do not contribute to the binding to E1-like enzymes.
The regions with high B-factors are localized at the loop regions between β1 and β2 and between α3 and β3 as well as in the N-terminal and C-terminal regions in the LC3 family proteins (Fig. 1). In particular, both termini have high B-factors, and they are suggested to take flexible conformations. Interestingly, the regions with high B-factors are localized on the upper half of the putative hydrophobic surface (circled in red in Fig. 4B). The C-terminus of each protein is positioned at the centre of the surface and is surrounded by loops with high B-factors. The residues comprising these loops are not conserved among LC3 family proteins and are longer than those observed in ubiquitin and other ubiquitin-like proteins, such as NEDD8. The highly flexible nature of the C-terminal region in the LC3 family proteins, as well as the flexible hydrophobic surface, would facilitate the interaction with mAtg7 to allow a glove-like fit.
Implication of the ubiquitin fold as protein interacting modules
The structures of a number of proteins and protein modules with a ubiquitin fold have been determined. Figure 5 compares the structure of LC3 (Fig. 5B) with ubiquitin (Fig. 5A), NEDD8 (Fig. 5C), the Ras-binding domain of c-Raf1 kinase (Fig. 5D) and the PB1 domains of Bem1p (Fig. 5E) and Cdc24p (Fig. 5F). In spite of their low sequence homology, all of the proteins and modules have common structural features classified as the ubiquitin fold. Because of its robustness, the ubiquitin fold may be widely used as a ubiquitous scaffold to present the residues or motifs necessary for interaction with target proteins in various sites. In the case of NEDD8, as shown previously, the hydrophobic region centred around Leu8, Ile44 and Val70 has been identified as the binding site for the adenylation domain of E1-like enzymes (Walden et al. 2003) (Fig. 5C). Similarly, in LC3, the corresponding hydrophobic patch is assumed to be responsible for the binding to mAtg7 (Fig. 4B). The PB1 domains are novel modules that contain a ubiquitin scaffold and are responsible for homo- and heterodimerization of PB1 family proteins (Yoshinaga et al. 2003). For the PB1 domain of Bem1p, the binding site is localized on β1, β2 and the C-terminal region of α1, and the domain is classified as a Type-II PB1 domain (Terasawa et al. 2001) (Fig. 5E). A similar region is also utilized as the Ras-binding site of c-Raf1 kinase (Nassar et al. 1995) (Fig. 5D). The Cdc24p PB1 domain has been identified as a binding partner of the Bem1p PB1 domain, and the interaction is crucial for establishment of the cell polarity of budding yeast. The interaction surface comprises the region including β4, α3 and the loop between β3 and β4, and the domain is classified as a Type-I PB1 domain (Fig. 5F). This region was initially identified as the OPCA (OPR/PC/AID) motif, in which acidic and hydrophobic residues are characteristically aligned. Therefore, the Cdc24p PB1 domain presents the OPCA motif on the ubiquitin scaffold as the interaction site with the Bem1p PB1 domain (Ponting et al. 2002). LC3 also appears to use the same strategy, presenting the additional N-terminal α1 and α2 helices on the ubiquitin scaffold as the interaction site with its target proteins, while the hydrophobic patch is used as the binding site for the E1-like enzyme. Taken together, these considerations lead to the notion that the ubiquitin scaffold presents several surfaces as interaction sites for cognate partners, thus expanding the versatility of the protein interaction modes.
The expression, purification and crystallization of rat LC3-I (molecular weight 14 555) were previously described (Sugawara et al. 2003). Diffraction data were collected at 90 K to 2.05 Å resolution on a Mar CCD165 detector using SPring-8 beamline BL41XU at a wavelength of 1.000 Å. The data were collected over a total oscillation range of 180° in steps of 1.0°, with an exposure time of 10 s per step. The crystal belongs to space group P43 with unit cell dimensions of a = b = 60.48 Å and c = 35.28 Å, and it contains one molecule in an asymmetric unit. Diffraction data were processed using the HKL2000 program suite (Otwinowski & Minor 1997). The data collection statistics are summarized in Table 1.
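The crystal parameters above can be checked for consistency with one molecule per asymmetric unit by a simple Matthews-coefficient calculation, sketched below in Python. Several inputs are assumptions rather than statements from the text: a = b is taken from the tetragonal symmetry of P43, the cell holds four asymmetric units (the number of symmetry operators in P43), and the solvent fraction uses the common approximation 1 − 1.23/V_M. The function names are hypothetical.

```python
def matthews_coefficient(a, b, c, asu_per_cell, molecules_per_asu, mol_weight):
    """V_M in Å^3/Da for a cell with all angles 90° (volume = a * b * c)."""
    cell_volume = a * b * c
    return cell_volume / (asu_per_cell * molecules_per_asu * mol_weight)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M, using 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / v_m


# Cell edges (Å) and molecular weight (Da) as given in the text;
# a = b and asu_per_cell = 4 are assumptions that follow from space group P43.
v_m = matthews_coefficient(60.48, 60.48, 35.28,
                           asu_per_cell=4, molecules_per_asu=1,
                           mol_weight=14555)
solvent = solvent_fraction(v_m)

# Data collection: 180° total rotation in 1.0° steps, 10 s per exposure.
frames = round(180 / 1.0)
total_exposure_s = frames * 10

print(v_m, solvent, frames, total_exposure_s)
```

Under these assumptions the sketch gives a V_M of roughly 2.2 Å³/Da and a solvent fraction of about 0.45, values typical of protein crystals, together with 180 frames and 1800 s of total exposure.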
Structure solution and refinement
The structure was solved by molecular replacement with the CNS program (Brünger et al. 1998) using the GATE-16 structure (39% sequence identity, PDB code 1EO6) (Paz et al. 2000) as a search model. Initial refinement was performed by torsion-angle molecular dynamics simulated annealing with bulk-solvent correction against the maximum-likelihood amplitude target. The model was rebuilt manually using the molecular modelling program TURBO-Frodo (Cambillau & Roussel 1997) in several steps that alternated with cycles of automated refinement using data to 2.05 Å resolution. The atomic coordinates and structure factors of LC3 have been deposited in the Protein Data Bank (http://www.rcsb.org/pdb) with accession code 1UGM.
We thank Dr M. Kawamoto and Dr H. Sakai of the Japan Synchrotron Radiation Research Institute (JASRI) for their kind help in the X-ray diffraction experiment at the beamline BL41XU, SPring-8. We also thank Dr Z. Elazar of Weizmann Institute of Science and Dr S. Yoshinaga of Hokkaido University for helpful discussions.
This work has been supported by a Grant-in-Aid for Scientific Research on Priority Areas and National Project on Protein Structural and Functional Analyses from the Ministry of Education, Culture, Sports, Science and Technology, Japan.