Trafficking of soluble proteins to the apicoplast in Plasmodium falciparum is determined by an N-terminal transit peptide (TP) which is necessary and sufficient for apicoplast import. Apicoplast precursor proteins are synthesized at the rough endoplasmic reticulum, but are then specifically sorted from other proteins in the secretory pathway. The mechanism of TP recognition is presently unknown. Apicoplast TPs do not contain a conserved sequence motif; therefore, we asked whether they contain an essential structural motif. Using nuclear magnetic resonance to study a model TP from acyl carrier protein, we found a short, low-occupancy helix, but the TP was otherwise disordered. Using an in vivo localization assay, we blocked TP secondary structure by proline mutagenesis, but found robust apicoplast localization. Alternatively, we increased the helical content of the TP through mutation while maintaining established TP characteristics. Apicoplast import was disrupted in a helical mutant TP, but import was then restored by the further addition of a single proline. We conclude that structure in the TP interferes with apicoplast import, and therefore TPs are functionally disordered. These results provide an explanation for the amino acid bias observed in apicoplast TPs.
Akin to the majority of other parasites in the phylum Apicomplexa, Plasmodium falciparum contains a plastid organelle named the apicoplast. The organelle's 35 kb circular genome is phylogenetically related to red algae (1), but the apicoplast genome reveals little about the role of the apicoplast. The majority of proteins that function in the apicoplast are encoded in the nuclear genome and are trafficked to the organelle post-translationally (2). Several metabolic pathways, including those responsible for synthesizing fatty acids, heme, isoprenoids, and iron sulfur clusters, rely on nuclear-encoded apicoplast proteins (3).
Nuclear-encoded apicoplast proteins are directed to the apicoplast by a bi-partite N-terminal targeting motif. This consists of an endoplasmic reticulum (ER) signal sequence, followed by an apicoplast transit peptide (TP). Together, this composite N-terminal sequence is necessary and sufficient for soluble protein import into the apicoplast (4). The ER signal sequence is cleaved during cotranslational import of proteins into the rough ER, exposing the TP in the ER lumen. The outer membrane of the apicoplast is in close contact with the ER, although the two compartments are known to be distinct (5).
Transport from the ER lumen to inside the outer apicoplast membrane is mediated by vesicular transport for membrane-bound proteins (6), and this mechanism is likely to be responsible for soluble protein trafficking as well (1). Apicoplast proteins arriving at the apicoplast periphery are postulated to cross the remaining three membranes through translocon complexes that have yet to be fully defined (7). After import, the TP is thought to be cleaved by a stromal processing protease (SPP) to reveal the mature protein (8).
Apicoplast proteins must be distinguished from other secretory pathway proteins. During the erythrocytic stages, Plasmodium secretory proteins are trafficked to a variety of destinations other than the apicoplast, including the host erythrocyte, the parasitophorous vacuole, and organelles such as micronemes and rhoptries (9). Specificity for the apicoplast is conferred by the apicoplast TP. While conserved sequence motifs have been identified for some trafficking destinations, such as export to the erythrocyte (10), apicoplast TPs do not contain a conserved sequence motif (11). TPs vary widely in length, ranging from 24 to several hundred amino acids (12). Although TP sequences are diverse, the presence of positively charged amino acids near the N-terminus is known to be of particular importance (13,14).
Like apicoplast TPs, the TPs of chloroplasts and mitochondria are also N-terminal peptides with no conserved sequence motifs. Mitochondrial TPs form amphipathic helices, and this secondary structure element is recognized by import receptors (15). Chloroplast TPs are also capable of forming alpha helices (16) and these sequences are recognized and imported into the chloroplast (17). Interestingly, the chloroplast is distantly related to the apicoplast (1). Orthologous components of the chloroplast translocon complex have been identified in P. falciparum (18) and in the related apicomplexan parasite, Toxoplasma gondii (19). In select instances, an apicoplast TP has been shown to mediate import into chloroplasts (20) and a chloroplast TP was found to mediate apicoplast import (21). It is unknown if further similarities exist between chloroplast and apicoplast import. Aside from the aforementioned proteins, orthologs of many critical components of the chloroplast translocons have not been identified in apicomplexan parasites.
Components of the ER-associated degradation (ERAD) pathway have also been proposed to mediate apicoplast import (22), and offer an alternative mode of TP recognition. While the ERAD pathway is generally responsible for the identification and degradation of misfolded proteins in the ER (23), a second set of ERAD proteins localizing to the apicoplast has recently been identified in both P. falciparum and T. gondii (18,24,25). In T. gondii, loss of the putative ERAD channel protein, Derlin-1, results in loss of apicoplast protein import and parasite death (24). In contrast to chloroplast and mitochondrial translocons, the ERAD system recognizes structurally disordered or misfolded proteins for translocation.
In this work, we examine the role of secondary structure in a TP for protein trafficking to the apicoplast of P. falciparum. Using computational and experimental techniques we show that the apicoplast TP of acyl carrier protein (ACP) is disordered but may contain a low-occupancy alpha helix nucleation site. We used proline mutagenesis to block the formation of secondary structure in the TP, and determined the effect of these mutations on apicoplast protein trafficking in vivo. Surprisingly, proline mutations did not disrupt apicoplast localization, suggesting that structure is not required for apicoplast import. We mutagenized the TP to increase the alpha helical propensity, and found apicoplast import to be blocked. Finally, we restored apicoplast import to the engineered helical TP by addition of a single proline, suggesting that mislocalization was due to the structure and not the specific mutated residues. Overall, we conclude that the formation of structure in the TP interferes with apicoplast import, and that recognition of apicoplast TPs requires an unstructured state. Analysis of TP sequences shows that amino acids with high helical propensity are underrepresented, suggesting that structural considerations play a role in the evolution of TP sequences in P. falciparum.
The TP of ACP has been used to study apicoplast import. Apicoplast protein import was first described using the TP of ACP (2), and it was later used to develop a general method to identify apicoplast TPs from P. falciparum (13), and to demonstrate a non-specific role for positive charge in TPs (14). Here, we made use of the TP of ACP as a model system to further probe the role of structure in apicoplast TP recognition.
Conformation of TP-ACP measured by nuclear magnetic resonance
Three-dimensional (3D) nuclear magnetic resonance (NMR) was used to measure the secondary structure of the TP. Rather than using the TP peptide alone (residues F17–F40), we used full-length TP-ACP because this construct should be identical to the trafficking intermediate found in the parasite. This approach allowed us to characterize the ‘intervening region’ (IV) after the TP cut site (residues L41–P55) and to compare our NMR results with our crystal structure of the ACP domain (residues S57–Q137) (26).
The TP-ACP protein construct was labeled with the isotopes 15N and 13C during protein expression and purified to homogeneity for high-field NMR spectroscopy. Backbone resonances for Cα, Cβ, HN and N were assigned using standard methods. CCO and HCα assignments were not obtained because of problematic spectral overlap. All residues were assigned with the exception of the N-terminal F17 and the phosphopantetheine prosthetic group. NMR assignments were deposited with the Biological Magnetic Resonance Data Bank (BMRB accession number 16661). Assignments for the ACP domain agree closely with previously published results (27), with only three appreciable deviations that do not impact the interpretation of the spectra (Figure S1). Backbone dihedral angles, and therefore secondary structure, affect the NMR chemical shifts of backbone nuclei. As Cα chemical shift differences from documented random coil values are an excellent indicator of alpha helix (28), these values were tabulated and plotted by residue number (Figure 1) using random coil values from Wishart and Sykes (29). Similar results were obtained using the contemporary chemical shift analysis program talos+ (Figure S2), which incorporates a variety of corrective terms and employs a neural network to assign secondary structure classes (30).
Cα chemical shift differences (Figure 1A) indicate helical structure by positive values, extended or strand structure by negative values and random coil by values near 0. The four main helices of the ACP domain are delineated by consecutive amino acids with chemical shifts between 2 and 5 ppm. This is in good agreement with the previously reported crystal structure (26), depicted at the top of Figure 1A with the helices shown as cylinders. The linkers between helices in the ACP domain have negative Cα chemical shift deviations, which indicate extended or turn structure. Chemical shift differences between helices 1 and 2 are highly variable because of the ordered loop structure that packs atop the four helix bundle of ACP, and possibly because of the phosphopantetheine prosthetic group that is covalently attached to this region.
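The reading of Cα chemical shift differences described above can be sketched in a few lines of Python. This is only an illustration: the random coil values and observed shifts are placeholders invented for the example (they are not the Wishart and Sykes table or the deposited BMRB data), and the 0.7 ppm cutoff and the function names are assumptions rather than parameters used in this work.

```python
# Illustrative sketch: classify residues from C-alpha secondary chemical shifts.
# All numbers below are hypothetical placeholders, not data from this study.

# Placeholder random coil C-alpha shifts (ppm), keyed by one-letter residue code.
RANDOM_COIL_CA = {"A": 52.5, "L": 55.0, "K": 56.5, "N": 53.0}


def secondary_shift(residue, observed_ca):
    """Observed C-alpha shift minus the random coil value for that residue type."""
    return observed_ca - RANDOM_COIL_CA[residue]


def classify(delta, cutoff=0.7):
    """Positive deviations suggest helix, negative ones extended structure,
    and values near zero random coil. The cutoff is an assumed threshold."""
    if delta >= cutoff:
        return "helix"
    if delta <= -cutoff:
        return "extended"
    return "coil"


# Hypothetical (residue, observed C-alpha shift in ppm) pairs.
observed = [("A", 55.3), ("L", 58.1), ("K", 56.6), ("N", 52.0)]
calls = [classify(secondary_shift(res, ca)) for res, ca in observed]
```

A per-residue threshold like this is far cruder than talos+, which combines several nuclei and sequence context; the sketch is meant only to show the sign convention.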
Cα chemical shift differences of the TP have small deviations from random coil values (Figure 1B). Residues between N36 and L49 have an average chemical shift difference of 0.01 ppm, indicative of complete disorder proximal to the proteolytic cleavage site. Residues 23–35 of the TP show an average Cα chemical shift difference of 0.41 ± 0.30 ppm, suggesting that these residues are primarily unstructured, but this region may act in concert to form a minor population of alpha helix. The chemical shift analysis program talos+ (30) assigned a putative short helix at residues 27–30 (Figure S2), but classified the rest of the TP as random coil structure. The beginning and end of the TP show negative chemical shifts of greater variability, which may be due to either end effects of the N-terminus, or an ordered turn connecting the TP to ACP. In an effort to identify long-range interactions involving the TP, either within itself or with the ACP domain, a 15N-edited nuclear Overhauser effect spectroscopy (NOESY) experiment was performed. This technique experimentally measures inter-residue distances (31). NOESY spectra identified sequential residues, but failed to identify long-range interactions involving the TP.
A lack of long-range NOEs attributed to the TP suggests that it is structurally dynamic. The chemical shift analysis software talos+ can estimate a backbone order parameter S2 from the random coil chemical shifts by the method of Berjanskii and Wishart (32). The backbone order parameter S2 ranges from 0 for complete isotropic disorder to 1 for a static bond vector. The estimated S2 value ranges from 0.8 to 0.9 for the ACP domain, but drops to 0.4–0.6 in the TP and intervening region (IV) (Figure S2). Small increases in S2 are located at the putative helix in the TP at position 29, and also at positions 23 and 46. While talos+ can estimate an S2 value from chemical shifts, chemical shifts classically report on the conformation of a particular residue, and dynamics are best assessed by separate NMR relaxation experiments.
Backbone dynamics can be directly observed by NMR 15N relaxation experiments. Unfortunately, spectral overlap of backbone amide cross-peaks in TP-ACP NMR spectra made quantitative analysis of backbone dynamics of the TP intractable. Qualitatively, two populations of cross-peaks are readily distinguishable and suggest that regions of the spectra corresponding to the TP and intervening region before the ACP domain are highly dynamic. 15N relaxation experiments measure the rate of signal decay. Slow signal decay is due to the motion of a particular residue being uncorrelated with those around it, therefore making it highly dynamic. Conversely, a structured residue has many neighbors moving in a correlated fashion, and it experiences a faster rate of signal decay. In the spectra of backbone 15N nuclei acquired with no relaxation delay, all protein cross-peaks were visible. When using a comparatively long relaxation delay of 165 milliseconds, cross-peaks unambiguously assigned to the ACP domain relax, but several cross-peaks persisted (Figure 2). These slow-relaxing cross-peaks coincide with assignments to amino acids within the TP and IV before the ACP domain. Therefore, backbone relaxation data are consistent with the TP of ACP being primarily unstructured and highly dynamic.
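The qualitative argument above can be illustrated with a single-exponential decay model. In this sketch only the 165 ms delay comes from the text; the two decay time constants are hypothetical values chosen for illustration and were not measured in this work.

```python
import math


def remaining_signal(delay_s, time_constant_s):
    """Fraction of cross-peak intensity left after a relaxation delay,
    assuming single-exponential decay."""
    return math.exp(-delay_s / time_constant_s)


DELAY = 0.165  # relaxation delay from the text, in seconds

# Hypothetical decay time constants (seconds), for illustration only.
structured = remaining_signal(DELAY, 0.05)  # fast-relaxing, ordered residue
disordered = remaining_signal(DELAY, 0.30)  # slow-relaxing, dynamic residue
```

With these assumed constants the ordered residue retains only a few percent of its signal while the dynamic residue retains more than half, which is the kind of contrast that lets the two populations of cross-peaks be told apart by eye.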
Apicoplast trafficking does not require TP structure
To test the structural requirements of the ACP TP we developed a system to observe apicoplast protein trafficking in cell culture similar to that employed by Foth et al. (13). We generated a construct for in vivo expression in P. falciparum based on residues 1–55 of ACP, which contains both the TP and IV, fused to the green fluorescent protein (GFP) (Figure 3). This construct was designed for chromosomal integration with the mycobacteriophage Bxb1 integrase system (33,34), resulting in stable parasite lines containing a single copy of the transgene at a unique genomic locus. Transgenic parasites expressing the wild-type (WT) ACP leader sequence exhibited a single point of green fluorescence in the ring and early trophozoite stages of the life cycle (Figure 4). GFP fluorescence of the WT construct was distinct from the mitochondrion, as visualized with MitoTracker. During the late trophozoite stage, the GFP fluorescence formed extended and reticulated patterns that were distinct from the single mitochondrion (Figure 4), and were consistent with the morphology of the apicoplast as it prepares for schizogony (35). A negative control for apicoplast localization, designated ‘ΔT’, was generated by mimicking a previously described construct in which the TP was deleted. This construct retains the ER signal sequence and the first four amino acids of the TP, which were fused directly to GFP (13) (Figure 3). Transgenic parasites expressing the ΔT construct displayed a GFP fluorescence pattern distinct from WT (Figure 4). In both early- and late-stage parasites, fluorescence appears to localize to the parasitophorous vacuole and secretory space.
To predict the effect of mutation on the helical content of the TP, we employed the prediction tool AGADIR (36). AGADIR is based on helix-coil transition theory of isolated peptides, but does not take into consideration any stabilizing effects of protein tertiary structure. Based on our NMR experiments, we concluded that the TP of ACP behaves as a peptide and does not interact with the ACP domain. Thus, AGADIR should serve as a good tool for predicting TP structure. Indeed, the helical content predicted by AGADIR for the TP of WT ACP is very similar to what we observed by NMR.
We designed mutant TPs linked to a GFP reporter to probe the role of structure during any of the steps of apicoplast trafficking. We used proline mutations to block the formation of TP secondary structure, regardless of whether the structure is static or induced by binding to a receptor. Proline is unique among amino acids for its limited conformational flexibility. The pyrrolidine ring of proline prevents rotation around the main chain bond between the alpha carbon and the imino nitrogen. Unlike all other amino acids, proline cannot donate a main chain hydrogen bond to stabilize secondary structures such as alpha helices and beta sheets. These two characteristics greatly diminish the propensity of proline to form alpha helix. For comparison, the energy assigned by AGADIR to proline in helix is 2.7 kcal/mol higher than that of alanine (37), which equates to being 80-fold less favorable. Beta sheet structure is completely disrupted by proline. The backbone dihedral angle φ is fixed near −60° for proline, while beta sheet conformation requires a φ angle of −120 to −140° (38). Thus, proline mutagenesis is the most effective way to disrupt hydrogen-bonded secondary structure in the TP. We designed our proline substitutions (Figure 3) so that they are still predicted to localize to the apicoplast by PATS (12) and PlasmoAP (13) (Table 1). The mutant TPs retain net positive charge and HSP70 binding character (Figure S3) previously identified as features important for apicoplast import (13,14).
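The conversion from a 2.7 kcal/mol energy difference to an approximately 80-fold penalty follows from the Boltzmann factor exp(ΔΔG/RT). The short sketch below reproduces that arithmetic; the temperature of 37 °C is an assumption made here (the text does not state one), and the function name is purely illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_penalty(ddg_kcal_per_mol, temp_kelvin):
    """Boltzmann factor exp(ddG / RT): how many times less favorable a state is."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temp_kelvin))


# 2.7 kcal/mol is the proline-versus-alanine difference quoted in the text.
# The temperature of 37 degrees C (310.15 K) is an assumption of this sketch.
fold_at_37c = fold_penalty(2.7, 310.15)
```

Under that assumption the factor comes out close to 80. At 25 °C the same energy difference would correspond to a factor closer to 95, so the quoted value depends on the temperature chosen.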
Table 1. Properties of transit peptide constructs used for subcellular localization
We first designed a single proline mutation L30P (Figure 3) to disrupt helix in the region where helix was indicated by AGADIR and NMR. The colored heat map in Figure 3 depicts AGADIR calculations of helical content and shows how the L30P mutation drastically reduces predicted helicity throughout the first 24 amino acids of the TP and up to the cleavage site. Predicted HSP70 binding (39) (Figure S3) and apicoplast localization by PlasmoAP (13) are preserved for this and all other mutants described. Interestingly, the transgenic parasites expressing the L30P mutation showed a similar fluorescence pattern as the WT construct (Figure 4). Late rings and early trophozoites contained a single point of fluorescence, which grew to an extended and branched pattern in late trophozoites. These results indicated that a helix in the vicinity of L30 was not necessary for trafficking to the apicoplast. While prolines are unfavorable in the body of an alpha helix, proline can be favorable at the N-terminus of a helix. Specifically, the N1 position of an alpha helix is favored by proline where it functions as a helix cap (40), and when positioned there, proline may actually promote helix formation. Further mutations in the TP were generated to exclude this and other possibilities.
To rule out proline acting as an N1 anchor for helix formation, we created two separate constructs by adding two C-terminal prolines to the L30P proline, spaced either one or two residues apart. These constructs, designated ‘3PxP’ and ‘3PxxP’ (Figure 3), prevent helix formation in a region of the TP containing four consecutive positive charges, which were previously observed to be important for apicoplast trafficking (14). The 3PxP and 3PxxP constructs retain two or three of the four positive charges, respectively, while blocking secondary structure formation throughout this region. Despite this disruption, both mutants displayed fluorescence similar to WT and consistent with apicoplast localization (Figure 4).
The N-terminus of apicoplast TPs is known to be critical for apicoplast localization (14). Amino acids before position 22 are predicted to be unstructured by AGADIR and NMR. However, structure could form N-terminal to the proximal proline at position 30. It is also possible that structure could be induced in an otherwise unstructured region upon binding to a trafficking receptor. To address these issues, we added two or three additional prolines to the 3PxxP construct to generate constructs ‘5PxxP’ and ‘6PxxP’ (Figure 3). GFP fluorescence from transgenic parasites expressing the 5PxxP and 6PxxP mutants was again consistent with apicoplast localization (Figure 4). In the 6PxxP mutant, 6 out of 24 amino acids of the TP were substituted with proline. At this density of prolines, the possibility of forming alpha helix anywhere in the TP is precluded.
Apicoplast trafficking does not require structure after the TP
The IV (Figure 3) between the TP cleavage site and the ACP domain has not been studied to determine whether it plays a role in apicoplast trafficking. NMR chemical shifts for these residues (41–55) indicate that they are disordered in solution (Figure 1), but the amino acid composition closely matches the characteristics of TPs. We therefore extended the proline mutagenesis into this region. An additional PxxP motif of three prolines was added C-terminal to the proteolytic cleavage site (inverted triangle, Figure 3) to yield the ‘9PxxP’ construct. The pattern of fluorescence was again consistent with apicoplast localization for both early- and late-stage parasites (Figure 4). In the 9PxxP construct, 9 out of the 40 residues between the ER signal sequence and GFP were mutated to proline, substituting 22.5% of the amino acids present. As the ΔT construct is secreted from the parasite, the determinants necessary for apicoplast trafficking must lie within these 40 amino acids. Localization of GFP to the apicoplast by the 9PxxP construct demonstrates that hydrogen-bonded secondary structure is not required in any part of the peptide responsible for apicoplast trafficking.
Proline mutants are imported into the apicoplast
ACP is located in the apicoplast during the erythrocytic stages, and antibodies specific for ACP can be used to identify this organelle (2). As the GFP constructs contain ACP residues 1–55, it is critical that the αACP antibodies do not cross-react with GFP constructs in order to demonstrate apicoplast colocalization. We generated αACP antibodies that were affinity purified with the ACP domain (residues 57–137) to avoid cross-reactivity, as previously described (26). Fixed transgenic parasites were probed with αACP, and colocalization with GFP fluorescence was examined. Despite some non-specific binding of the antibody, GFP was found to always colocalize with αACP for the WT construct as well as all six proline mutants (Figure 5A).
A hallmark of targeting to the apicoplast is the subsequent proteolytic processing of the TP (8,19). This can be observed by αGFP western blot with our TP-GFP constructs. Samples were prepared from late-trophozoite-stage parasites when GFP protein levels are the highest. The WT sequence is cleaved after residue F40 (8), leaving residues L41–P55 fused to GFP. By western blot, all six proline mutants displayed similar gel migration to the WT sample, indicating that all are processed (Figure 5B). As a negative control, an unprocessed construct, ‘TP-GFP’, containing residues F17–P55 fused to GFP was expressed in Escherichia coli. Pure recombinant TP-GFP appeared as a higher molecular weight band on the western blot, providing a marker for unprocessed protein. As expected, the ΔT construct appeared as a lower molecular weight band, because it contains 12 fewer residues compared to the processed forms of the WT and proline constructs. Interestingly, subtle variations in protein size amongst the proline mutants suggest that these mutations may have affected the TP cleavage site.
Apicoplast trafficking is inhibited by TP structure
The series of proline mutants demonstrated that the ACP TP can function in an extended, disordered state. If this lack of structure is a requirement of apicoplast import, then mutant TPs which possess significant structure would be expected to block apicoplast import. Guided by AGADIR predictions of helical content, we selectively mutated the ACP TP so as to preserve TP physicochemical characteristics while also biasing the TP to form alpha helix. We designed two such peptides. The first TP, named ‘α31’, contains six amino acid substitutions compared to WT and has a predicted helical content of 31% in the TP region. As shown in Figure 3, four of the six mutation sites were previously used in the proline mutation series, demonstrating that these sites can be mutated without altering apicoplast trafficking. We designed a second sequence, named ‘α62’, to maximize helical content. The design favors residues with helical propensity while also exploiting electrostatic and hydrophobic interactions between side chains to stabilize alpha helix. The α62 construct contains five amino acid substitutions in addition to those found in the α31 construct and is predicted to have a 62% average helical content in the TP region (Figure 3). Both designed TPs preserve apicoplast TP characteristics and are predicted by PATS (12) and PlasmoAP (13) to localize to the apicoplast (Table 1). To demonstrate that any observed changes in localization are due to the stabilized helical structure, a single proline mutation was introduced into each of the engineered TPs at residue 30, producing the α31P and α62P constructs. The L30P mutation reduces the predicted average helical content for the α31P and α62P constructs to 3% and 8%, respectively. Previous results from the proline mutant series (described above) show that the L30P mutation does not interfere with TP function.
The α31 and α62 constructs were expressed as GFP fusion proteins in P. falciparum parasites to assess the effects of the helix-stabilizing mutations. Transgenic parasites expressing the α31 construct displayed a pattern of GFP fluorescence similar to the WT construct, suggesting that apicoplast import occurred normally in these parasites (Figure 6A). Similar results were obtained for α31P. In contrast, a strikingly different pattern of fluorescence was observed in parasites expressing the α62 construct. GFP fluorescence was observed around the periphery of the parasite, consistent with accumulation in the parasitophorous vacuole, as observed for the ΔT construct (shown in Figure 4). In addition to vacuolar localization, the α62 construct was also found in perinuclear compartments, consistent with ER localization (Figure 6A). Importantly, addition of the single L30P mutation in the α62P construct completely reversed this phenotype and restored a fluorescence pattern consistent with apicoplast localization. We used the αACP antibodies described above to determine apicoplast localization for all four constructs. The α31, α31P and α62P constructs all colocalized with ACP in the apicoplast organelle; however, the α62 construct did not (Figure 6B). These results show that the helix-stabilizing mutations in α62 prevented this construct from localizing to the apicoplast. Instead, α62 remains in the secretory pathway and ultimately accumulates in the parasitophorous vacuole.
TP processing was analyzed by αGFP western blot as described above for the proline mutant series. The α62 construct migrates at a rate comparable to recombinant TP-GFP, indicating that it does not undergo processing in vivo (Figure 6C). The other three constructs are primarily found in the processed state, consistent with their apicoplast localization. Parasites expressing the α31 construct contain a minor population of unprocessed TP which migrates with the TP-GFP control. Unprocessed TP was also observed in α62P, but was not evident in α31P, even when 10-fold more signal was collected from the western blot (Figure 6C, bottom panel). The processing of the four constructs shown in Figure 6C seems to parallel the predicted helical content of their TPs with α31P displaying full processing (3% helix) while α62 is not processed at all (62% helix). α62P (8% helix) and α31 (31% helix) display intermediate levels of TP processing. These results suggest that partially stabilized helices might affect the kinetics of apicoplast trafficking or TP processing without fully blocking apicoplast import.
Parasite cell lines were validated by two methods. Genomic integration via the Bxb1 integrase system was confirmed for each of the constructs by isolating DNA from their respective P. falciparum cell lines and observing characteristic PCR amplicons (Figures 5C and 6D). In addition, DNA obtained from P. falciparum cell lines was sequenced to confirm the intended TP sequence. For all of the cell lines described in this work, transgenic parasites grew at a rate equivalent to WT and displayed no physical abnormalities.
The TP of ACP is primarily disordered
Apicoplast TPs have not been structurally characterized, and it is presently unknown whether structure plays a role in protein trafficking to the apicoplast or import into the apicoplast. We extended previous studies using the TP of ACP as a model of apicoplast import by asking if the TP contains structure, and whether structure is important for apicoplast import. NMR has previously been used to demonstrate that chloroplast TPs are able to form helical structures (41,42). However, these experiments were typically conducted in the presence of lipids or osmolytes, which are thought to mimic protein or lipid interactions (16). While chloroplast TPs are known to interact with galactolipids prior to import (43), work in the related apicomplexan T. gondii showed that galactolipids are not present on the apicoplast (44). Furthermore, apicoplast TPs are hydrophilic and unlikely to insert into a lipid bilayer. Thus, we sought to characterize the structure of TP-ACP in aqueous solution, perhaps offering a glimpse at the conformation of TP-ACP that is recognized as a trafficking intermediate in the ER lumen (8).
We used 3D NMR to structurally characterize TP-ACP and provide the first structural information about an apicoplast TP. The structural assignments made for the ACP domain agree well with our crystal structure of this protein (26). The TP of ACP is predominantly unstructured, with only a small degree of helical structure in the region of TP residues Q28, I29 and L30 (Figures 1 and S2). This description of the TP is in good agreement with predictions by AGADIR (36), a secondary structure prediction tool for peptides lacking tertiary structure. Additional NMR experiments examining 15N relaxation rates, an indication of how structurally dynamic a region is, corroborate that the TP of ACP is structurally disordered (Figure 2). There may, however, be additional conditions or factors that induce TP structure in the parasite ER which are not included in the NMR experiments. In particular, receptors could induce structure in the TP that is required for trafficking. Both mitochondrial and chloroplast TPs can form alpha helices which have been proposed to be recognized by import receptors (15–17).
Proline mutations disrupt secondary structure, but do not perturb apicoplast trafficking
Proline residues do naturally occur within chloroplast TPs, but in such examples, two independent helices are formed by the TP and the proline participates in the kink that separates them (45). In our proline mutants, particularly the 6PxxP and 9PxxP constructs (Figure 3), the high density of prolines precludes their participation in N-terminal helix caps and kinks. Similarly, the spacing of prolines is too close to allow the formation of beta strands. However, a possible side effect of substituting a large number of residues with proline is that the conformation of the peptide will be more likely to adopt polyproline II structure. Polyproline II is an extended conformation that does not contain intramolecular hydrogen bonds (46). In our studies, proline mutations may stabilize the formation of polyproline II, but they preclude the formation of any hydrogen-bonded secondary structure, such as alpha helix and beta sheet.
The 6PxxP mutation completely prohibits the formation of secondary structure in the TP and provides insights into apicoplast import. Substitution of 6 out of the 24 TP residues without perturbing trafficking to the apicoplast illustrates dramatic plasticity in TP recognition (Figures 4 and 5A). After recognition and trafficking to the apicoplast, the TP is imported across the apicoplast membranes and cleaved by a SPP. In T. gondii, this processing step does not occur unless the TP has crossed the innermost membrane of the apicoplast (19). Analysis of GFP construct processing (Figure 5B) suggests that import proceeds despite the substitution of 25% of the TP residues with proline. The Ramachandran space of proline indicates that fewer conformations are accessible to this amino acid (38), potentially increasing the rigidity of the TP. Multiple prolines in polyproline II structure can be approximated as a rigid rod, for lengths up to 12 consecutive residues (47). To the extent that our six proline substitutions in the TP contain these properties, the TP recognition and import machinery is completely tolerant to them.
We can draw similar conclusions about the intervening region which lies between the TP and the ACP domain. Although this region is not strictly considered part of the TP, it could play a role in apicoplast trafficking. A similar intervening region was observed between the TP of enoyl reductase (48) and the known functional domain of this enzyme (49). The 9PxxP construct disrupts secondary structure in the TP and the intervening region simultaneously (Figure 3) without affecting protein localization (Figures 4 and 5A) and without blocking TP processing (Figure 5B). Interestingly, different proline mutations appear to be processed at slightly different locations, suggesting that the proline mutations might have some effect on the selection of the TP cleavage site. In plants, the chloroplast SPP is thought to use nearby structure as a cue to locate poorly conserved TP cleavage sites (50). The analogous SPP found in apicomplexan parasites (8) could be similarly affected by multiple proline mutations. Indeed, alternative processing sites were observed in a series of TP truncation mutants in T. gondii (51).
Engineered TP helix blocks apicoplast trafficking
The proline mutant series demonstrated that TP secondary structure is not required for any step of apicoplast trafficking. However, these experiments did not determine whether structure would interfere with trafficking. We addressed this question by mutating the ACP TP so that it had a high helical propensity while still maintaining all of the known characteristics of an apicoplast TP. Carefully preserving the overall positive charge, hydrophilicity and HSP70 binding properties (Figure S3) as much as possible, we modified the ACP TP to maximize the helix content as predicted by AGADIR (Table 1). As amino acids branching near the beta carbon have less helical propensity than those that do not, we favored amino acids like methionine, leucine, arginine and glutamate over valine, isoleucine, asparagine and aspartate. Mutations were also chosen to form salt bridges and hydrophobic interactions, further stabilizing the helical conformation. For our experiments, we chose two constructs, α31 and α62, which differ considerably in predicted helical content (31% versus 62%). In construct α31, four of the six mutation sites had previously been mutated to proline without affecting apicoplast trafficking. Construct α62 included the α31 mutations, plus five additional mutations which doubled the predicted helical content. Unlike the α31 construct, α62 is not trafficked to the apicoplast, but instead remains in the secretory pathway and accumulates in the parasitophorous vacuole (Figure 6). Importantly, PATS and PlasmoAP do not distinguish between α62 and α31 or any of the other TP constructs (Table 1), demonstrating that there are key determinants of apicoplast trafficking which are not recognized by these programs.
Based on the hypothesis that the α31 and α62 constructs form an alpha helix, we reintroduced the L30P proline mutation to disrupt the helix. As shown in Figure 6, this single mutation reverts the phenotype observed with the α62 construct. The L30P mutation is an excellent control for α31 and α62 because every construct in the proline mutant series contained L30P, and none of these showed impaired apicoplast trafficking. The resulting α31P and α62P constructs are similar to the α31 and α62 constructs in terms of their physicochemical properties and their predicted scores as apicoplast TPs (Table 1). The primary difference between these constructs is the predicted helicity, which drops to 8% for α62P and 3% for α31P. Thus, it appears that the high structural content of the α62 TP is responsible for blocking apicoplast import.
The cleavage of TPs during apicoplast import appears to be correlated with predicted helix content. The α62 construct has the highest predicted helicity and we did not detect any TP processing with this construct (Figure 6). The α31 and α62P constructs (31 and 8% helix) were predominantly processed; however, a significant fraction of uncleaved TP was observed. In contrast, the α31P construct (3% helix) was completely processed even though it only differs from the α31 sequence by one proline mutation (Figure 6). Complete TP processing was also observed for the WT and proline mutant constructs shown in Figure 5. It is interesting to note that constructs with 3% or less predicted helical content were efficiently processed while those with intermediate levels of predicted helix content were not (Table 1). As the α31 and α62P constructs were ultimately imported into the apicoplast, it is not clear whether they were inefficiently imported or inefficiently cleaved during (or after) import. In either case, it appears that intermediate levels of helix which do not block apicoplast import may interfere with aspects of apicoplast trafficking.
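The trend described above can be laid out as a small table. The following Python sketch uses only the predicted helicities and processing outcomes stated in the text; the ASCII construct names (for example `alpha62` for α62) and the labels "none", "partial" and "complete" are shorthand introduced here for illustration, not terms from the study.

```python
# Predicted helicity (percent, AGADIR values quoted in the text) and the TP
# processing outcome described for each helix-series construct.
# Names and outcome labels are illustrative shorthand, not the authors' terms.
constructs = {
    "alpha62":  {"helix_pct": 62, "processing": "none"},
    "alpha31":  {"helix_pct": 31, "processing": "partial"},
    "alpha62P": {"helix_pct": 8,  "processing": "partial"},
    "alpha31P": {"helix_pct": 3,  "processing": "complete"},
}

# Higher rank means more complete processing.
PROCESSING_RANK = {"none": 0, "partial": 1, "complete": 2}


def by_helicity(table):
    """Return construct names sorted from lowest to highest predicted helicity."""
    return sorted(table, key=lambda name: table[name]["helix_pct"])


def processing_never_improves_with_helicity(table):
    """True if processing stays equal or gets worse as predicted helicity rises."""
    ranks = [PROCESSING_RANK[table[name]["processing"]] for name in by_helicity(table)]
    return all(a >= b for a, b in zip(ranks, ranks[1:]))
```

With these four constructs the check returns `True`, which matches the qualitative statement that processing becomes less complete as predicted helix content rises. Four data points cannot establish a quantitative relationship.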
We analyzed the collection of 35 apicoplast protein sequences and 102 non-apicoplast protein sequences used to develop the PATS algorithm (12). Using AGADIR, we found that the highest average level of predicted helix content in the first 24 amino acids of the TP sequences (the size of the ACP TP) was 2.2%. In contrast, 21 of the 102 non-apicoplast proteins contained higher levels of predicted helix in the first 24 amino acids. This comparison suggested that TP sequences have evolved to limit helical content. Analysis of the amino acid content of the TP sequences supports this supposition. The frequency with which the 20 amino acids are used in TP sequences was compared to their frequency of occurrence in the P. falciparum genome as a whole. Ten amino acids are overrepresented in TP sequences while the other 10 are underrepresented. For most amino acids, the relative abundance in the TP sequences is correlated with the Chou–Fasman helical propensity of these amino acids (53) (Figure 7). The abundance of charged amino acids is independent of helical propensity. Thus, the amino acid content of TP sequences can largely be explained by three underlying phenomena: the need for overall positive charge, the ability to interact with chaperones and the need to avoid helical structure.
Possible mechanisms of TP recognition
Several features of TPs are consistent with the recently discovered role of ERAD proteins in apicoplast protein import. Malaria parasites contain a second set of ERAD proteins which appear to have been inherited from the algal progenitor of the apicoplast. ERAD proteins colocalize with the apicoplast of P. falciparum (18,25), and are essential for apicoplast protein import (24). Classically, ERAD substrates are unfolded and bound by chaperones, two features that are likely to be shared by apicoplast TPs. HSP70 binding sites are already known to be important features of apicoplast TPs (13,54). Our proline mutants eliminate the possibility that hydrogen-bonded secondary structure is required for TP recognition. Moreover, the helix stabilization mutants show that TPs must be unfolded in order to function in apicoplast import. Therefore, the lack of TP secondary structure may be a structural determinant of apicoplast import, consistent with recognition of unfolded peptides by ERAD-like machinery. Chloroplast translocon orthologs already identified in P. falciparum and T. gondii may then play a subservient role in apicoplast import, facilitating translocation across the innermost apicoplast membrane, as previously hypothesized (24,55).
If apicoplast TPs are recognized directly by ERAD machinery, further gaps in understanding come to the forefront. ERAD substrates are surveyed by their glycosylation status in yeast and mammalian cells. The kinetics of glycosylation and deglycosylation allow finite time for folding before triggering the ERAD pathway for export and degradation (56). Certain protist parasites are believed to have lost the N-glycan dependence of the ERAD pathway (57), presumably recognizing substrates by their chaperone-binding properties alone. Many non-apicoplast proteins passing through the secretory pathway are likely to have difficulty folding, becoming classical ERAD substrates themselves. At this time it is unclear how apicoplast proteins are distinguished from misfolded proteins intended for other compartments.
Apicoplast TPs contain all of the determinants for protein trafficking to the organelle, but no conserved sequence motifs have been discerned. We investigated the role that TP secondary structure plays in apicoplast trafficking of the ACP—a well-studied apicoplast protein. Three-dimensional NMR studies showed that the TP of ACP was intrinsically disordered in solution, but we could not rule out the formation of induced structure during trafficking. Using a GFP-based localization assay, we blocked potential secondary structure formation in the TP with proline mutations, but found that apicoplast localization and processing of the TP were unaffected. We postulated that TPs may function in an extended state, and could be inhibited by structure in the TP. We engineered two TPs to contain alpha helical structure, and found that the TP with the highest helical content could not be imported into the apicoplast. Introduction of a single proline into the engineered TP restored apicoplast localization, demonstrating that its structure, and not its sequence, was the impediment to apicoplast trafficking. Other constructs with intermediate levels of helical content were ultimately trafficked to the apicoplast, but appeared to be processed less efficiently. Taken together, these data showed that the ACP TP was intrinsically disordered and that increased structural order reduced the efficiency of apicoplast import, and at higher levels, blocked trafficking altogether. Analysis of other P. falciparum TP sequences is consistent with this conclusion. TP sequences contain very low levels of predicted helical structure and are enriched with amino acids with low helical propensity. In general, the requirement for structural disorder may be a significant driving force in the evolution of TP sequences.
Materials and Methods
Cloning of TP-ACP
To generate TP-ACP, amino acids 18–137 of ACP (GenBank entry AAC71866) were amplified from plasmid pSPr020 (58) using primers P1 and P2 (Table S1), which introduce a proximal EcoRI site and a distal SalI site in the PCR product. The amplicon was inserted into the pMALc2x vector (New England Biolabs). The resulting plasmid, pSPr024, encodes a maltose binding protein (MBP) domain which can be cleaved from ACP with factor Xa protease. Cleaved ACP contains vector-derived amino acids ISEF at the amino terminus. Nucleotides encoding residues ISE were removed from pSPr024 using QuikChange (Stratagene) mutagenesis with primers P3 and P4. The resulting plasmid, pSPr025, encodes residues 17–137 of ACP (including F17) immediately after the factor Xa cleavage site. Noncanonical cleavage of ACP by factor Xa was found to be problematic (data not shown). This problem was addressed by inserting a tobacco etch virus (TEV) protease cleavage site, a strategy that had previously been shown to be successful (59). Primers P5 and P6 were used to insert nucleotides encoding the TEV protease site using QuikChange (Stratagene) mutagenesis. The final product, pSPr026, was designed to produce TP-ACP (ACP with its TP) with the exact sequence described by van Dooren et al. (8).
TP-ACP purification for biophysical studies
Plasmid pSPr026 (encoding MBP-TP-ACP) was transformed into BL21 Star(DE3) cells (Invitrogen) which already harbored the pRIL plasmid from BL21-CodonPlus(DE3) cells (Stratagene). Cells were grown by shaking in lysogeny broth (LB) medium at 37°C to an optical density at 600 nm (OD600) of 0.8, the temperature was reduced to 20°C, and protein expression was induced with 0.4 mM isopropyl-beta-D-thiogalactopyranoside (IPTG) for 10 h. Cells were pelleted by centrifugation followed by resuspension in 20 mL of lysis buffer (20 mM Tris pH 7.5, 1 mg/mL lysozyme, 2.5 µg/mL DNAse I, 1 mM phenylmethylsulphonyl fluoride, 0.5 mM DTT) per liter of cell culture. Cell lysate was clarified by centrifugation at 30 000 ×g for 20 min at 4°C and immediately loaded onto an amylose column (New England Biolabs) equilibrated with 30 mM Na/K phosphate pH 7.5. After washing with one column volume of buffer, a HiTrap Q Sepharose Fast Flow anion exchange column (GE Healthcare) was attached after the amylose column, and MBP fusion protein was eluted from the amylose column onto the Q column with 100 mM maltose in equilibration buffer. The amylose column was removed, and the Q column was eluted with an NaCl gradient. Fractions containing fusion protein were pooled and digested with 20 µg/mL TEV protease and 1 mM DTT. After digestion at 4°C, the TP-ACP and MBP mixture was desalted and loaded onto a HiTrap SP Sepharose Fast Flow cation exchange column (GE Healthcare) in 20 mM Na/K phosphate pH 7.5. Protein bound to the SP column was eluted with an NaCl gradient. TP-ACP was further purified using a Sephacryl S-100 gel filtration column (GE Healthcare) equilibrated in 20 mM Na/K phosphate pH 7.0, 50 mM NaCl and 0.1 mM DTT. The resulting protein was pure as determined by SDS–PAGE with SimplyBlue stain (Invitrogen). Protein was concentrated in a 5 kDa cutoff centrifugal concentrator (Vivaspin) and flash frozen before storage at −80°C.
We generated TP-ACP for 3D NMR experiments as described above with the following modifications for isotope labeling. Two liters of cell culture grown to OD600 of 0.8 in LB medium was spun down and resuspended in 1 L of M9 salts supplemented with 100 µM CaCl2, 200 µM FeCl3, 2 mM MgSO4, 1 µg/mL thiamine, 1 g/L 15NH4Cl and 5 g/L U-13C glucose. Cells were incubated for 20 min in the new medium, and then induced with 0.4 mM IPTG while shaking at 20°C for 10 h. Protein was purified as described above. TP-ACP was dialyzed into H2O, concentrated to 2 mg/mL and then lyophilized. The final protein sample labeled with 15N and 13C was prepared for NMR data collection by dissolving lyophilized protein in 30 mM NaCD3CO2, 20 mM CaCl2, 5 mM DTT, 1 mM NaN3, 10% D2O, pH 5.5 to a final protein concentration of 940 µM, and loading it in a Shigemi NMR tube (Shigemi).
Helix prediction and measurement
The program AGADIR (36) was used to calculate per residue helix content for TP sequences. AGADIR predictions were performed at pH 7.4 and 310 K with 100 mM NaCl. The N-terminus was specified to be unprotected, representative of the free N-terminus of the protein construct used for biophysical studies. The C-terminus was specified as amidated as an approximation of the continuing peptide chain.
NMR data were collected using either a Varian Unity Inova 500 MHz fitted with a triple band probe, or a Varian Unity Inova 600 MHz with a cryoprobe. A standard backbone resonance assignment scheme (60) was applied using combined data from HNCA, HNCACB, HN(CO)CA and CBCA(CO)NH pulse sequences. An 15N-edited HSQC NOESY was performed with an NOE mixing time of 100 milliseconds. Spectra were processed with NMRPipe (61) and analyzed with NMRView (62). MONTE (63) was used to identify self-consistent backbone assignments. T2 relaxation experiments were performed using a standard CPMG pulse sequence including adiabatic pulse compensation (64), modified with a watergate sequence for water suppression.
Generation of P. falciparum transfection constructs
Twelve constructs based on the first 55 amino acids of ACP fused to GFP were generated in plasmid pLN-ENR-GFP (GenBank accession number DQ813653), which facilitates site-specific integration into the attB site of Dd2attB parasites (33). Nucleotides encoding the first 55 amino acids of full-length ACP were PCR amplified to add flanking AvrII and BsiWI restriction sites. The ACP amplicon was inserted in place of enr to create the WT pLN-TP-ACP-GFP expression construct. The proline mutations were introduced by PCR mutagenesis of full-length ACP in a cloning vector and subsequently ligated into pLN-GFP (see Supporting Methods for complete details). The program AGADIR (36) was used iteratively to design the helix-stabilizing mutant constructs α31 and α62 with predicted average helicities of 31 and 62%, respectively. The L30P mutation found in all of the proline mutants was introduced into α31 and α62, creating the α31P and α62P constructs with predicted average helicities of 3 and 8%, respectively. Plasmid DNA encoding all four constructs was synthesized by GeneArt with flanking AvrII and BsiWI restriction sites for insertion into pLN-ENR-GFP as described above. Each mutant construct was sequence verified.
P. falciparum transfection
P. falciparum transfections were performed using the Bxb1 mycobacteriophage integrase system in Dd2 strain parasites containing the attB recombination site (33) in combination with the red blood cell (RBC) preloading technique (65). As described previously (34), the plasmid encoding the integrase (pINT-REP20) was not maintained with G418. Blasticidin drug pressure was relieved 1–2 weeks after transfection to facilitate the loss of ectopic copies of the pLN plasmid.
Transgenic parasite lines were analyzed for the genomic integration and correct nucleotide sequence of the transgene. P. falciparum genomic DNA was purified using QiaAmp Blood DNA purification mini-kit (Qiagen). Integration into the genomic attB site was verified by using PCR primers S1 and S2 (Table S1) which amplify a 161 bp amplicon including the 5′ integration site. PCR primers S3 and S4 amplify a 297 bp amplicon including the 3′ integration site. The pLN plasmid as originally described by Nkrumah et al. contains two attP integration sites in tandem, either of which can facilitate genomic integration. The unused pLN attP site can increase the size of either the 5′ or the 3′ diagnostic PCR fragment by 56 bp. Sequencing results were obtained by PCR amplifying the transgene from genomic DNA using sequencing primers S5 and S6, followed by DNA sequencing with primer S5.
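The expected sizes of the diagnostic PCR products follow directly from the numbers above. The Python sketch below is illustrative only: it assumes that exactly one of the two junction products carries the extra 56 bp from the unused attP site, and the function and key names are hypothetical.

```python
# Base amplicon sizes stated in the text: primers S1/S2 span the 5' integration
# site and primers S3/S4 span the 3' integration site.
BASE_SIZES_BP = {"5prime": 161, "3prime": 297}

# Extra length contributed by the unused tandem attP site (from the text).
EXTRA_ATTP_BP = 56


def expected_amplicons(extra_on):
    """Return expected amplicon sizes (bp) for both junctions.

    extra_on: "5prime" or "3prime", the junction assumed to carry the unused
    attP site. Assumption: exactly one junction is lengthened.
    """
    if extra_on not in BASE_SIZES_BP:
        raise ValueError("extra_on must be '5prime' or '3prime'")
    return {
        site: size + (EXTRA_ATTP_BP if site == extra_on else 0)
        for site, size in BASE_SIZES_BP.items()
    }
```

Under this assumption a correctly integrated line gives either 217 and 297 bp, or 161 and 353 bp, for the 5′ and 3′ products respectively.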
Epifluorescent microscopy and western analysis
Parasites were prepared for live fluorescence microscopy by incubating 100 µL of parasite culture at high parasitemia with 1 µg/mL 4′,6-diamidino-2-phenylindole (DAPI) and 12.5 nM MitoTracker CMX-Ros (Invitrogen) for 30 min at 37°C. Cells were washed for 5 min in complete media three times at 37°C, and then streaked on a slide for observation on a Nikon Eclipse 90i equipped with an automated z-stage. A series of images spanning 4–5 µm were acquired with either 0.2 µm or 0.5 µm spacing and images were deconvoluted with Volocity software (Improvision) to report a single combined z-stack image.
Parasites were fixed and permeabilized for colocalization studies. The method used in Figure 5A relied on intrinsic GFP fluorescence and proved inferior to the method used in Figure 6B which is described here. Infected RBCs from 250 µL of parasite culture were harvested by centrifugation and resuspended in PBS containing 4% paraformaldehyde and 0.0075% glutaraldehyde. Resuspended cells were immediately spotted onto polylysine-treated slides and allowed to fix for 30 min. The cells were washed with PBS and permeabilized with PBS containing 1% Triton-X-100 for 10 min. After another PBS wash, the cells were treated with 0.1 g/L NaBH4 for 10 min to reduce unreacted aldehydes. The cells were then washed in PBS, blocked in PBS containing 30 g/L BSA for 1 h, and incubated overnight at 4°C with 1:500 Rabbit αACP (26) and 1:100 Living Color mouse αGFP (Clontech). After three PBS washes, the cells were incubated with 1:1000 goat anti-mouse Alexa 488 (Invitrogen) and 1:3000 goat anti-rabbit Alexa 594 (Invitrogen) for 1 h. After three more PBS washes, cells were treated with Gold antifade DAPI (Invitrogen) and sealed under a coverslip for observation as described above for live fluorescence microscopy. All steps were carried out at room temperature except for the incubation with the primary antibodies.
A negative control for TP processing was designed for expression in E. coli to yield amino acids 17–55 of ACP fused to GFP. This construct mimics the trafficking intermediate of the WT TP-GFP construct used for subcellular localization. TP-GFP was expressed as an N-terminal GST fusion (see Supporting Methods for complete details) using expression plasmid pGEX-4T3 (GenBank entry U13855), and purified with a GSTrap affinity column (GE Healthcare). GST fusion protein was proteolytically cleaved using TEV protease, and the digested fusion protein was purified by a GSTrap column and SP cation exchange column run in tandem. TP-GFP was eluted from the SP column, concentrated, and flash frozen for later use.
Western blot samples were generated from high-parasitemia P. falciparum cultures of predominantly trophozoite-stage parasites. Parasites were pelleted, treated with 0.2% saponin in PBS for 5 min on ice, and then washed repeatedly in PBS until the supernatant was clear. Lysis buffer was prepared in 156 µL with the following recipe: 50 µL 4× NuPAGE LDS gel loading buffer (Invitrogen), 40 µL 500 mM ascorbic acid, 40 µL 20 mM bathophenanthrolinedisulfonic acid disodium salt, 8 µL 500 mM ethylenediaminetetraacetic acid (EDTA) pH 8.0, 4 µL 1 mg/mL pepstatin, 10 µL 20× Complete protease inhibitor (Roche) and 4 µL 5% Triton-X-100. Lysis buffer was added to a parasite pellet of 50–80 µL, and then alternately vortexed and heated to 95°C to break DNA and denature the sample before storing at −20°C. Western analysis was performed by SDS–PAGE, followed by transfer to a 0.2-µm polyvinylidene fluoride (PVDF) membrane by semi-dry electrophoretic transfer at 15 V for 1 h in Bjerrum and Schafer-Nielsen transfer buffer (66) composed of 48 mM Tris, 39 mM glycine, pH 9.2, 20% ethanol. Protein was cross-linked to the membrane with 4% formaldehyde and 0.1% glutaraldehyde for 1 h at 37°C. The blot was blocked with 5% non-fat milk in PBS for 1 h, washed three times in 1% milk in PBS, and then probed with 1:1000 αGFP JL-8 (Clontech) in 1% milk in PBS for 7 h rocking at 23°C. The membrane was again washed three times in 1% milk in PBS, probed with 1:3300 αMouse horseradish peroxidase (HRP) secondary antibody (GE Healthcare) for 2 h rocking at 23°C, washed three times as before, visualized with enhanced chemiluminescence (ECL) western substrate (Pierce), and exposed to film sequentially for 10 min, 20 min and overnight.
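The lysis buffer recipe gives stock concentrations and volumes rather than final concentrations. As a worked check, the Python sketch below computes the concentration of each component in the 156 µL of buffer by simple dilution (stock × volume / total volume). The values are those listed in the recipe; the dictionary layout and function names are illustrative, and the further dilution that occurs when the buffer is added to the 50–80 µL parasite pellet is not modeled.

```python
# Each entry: (volume in microliters, stock concentration, unit of the stock).
# Volumes and stocks are taken from the recipe in the text.
RECIPE = {
    "NuPAGE LDS loading buffer": (50, 4.0, "x"),
    "ascorbic acid": (40, 500.0, "mM"),
    "bathophenanthrolinedisulfonic acid": (40, 20.0, "mM"),
    "EDTA pH 8.0": (8, 500.0, "mM"),
    "pepstatin": (4, 1.0, "mg/mL"),
    "Complete (Roche)": (10, 20.0, "x"),
    "Triton X-100": (4, 5.0, "%"),
}


def total_volume(recipe):
    """Sum of all component volumes in microliters."""
    return sum(volume for volume, _, _ in recipe.values())


def final_concentrations(recipe):
    """Concentration of each component in the mixed buffer, before adding sample."""
    total = total_volume(recipe)
    return {
        name: (stock * volume / total, unit)
        for name, (volume, stock, unit) in recipe.items()
    }


if __name__ == "__main__":
    for name, (conc, unit) in final_concentrations(RECIPE).items():
        print(f"{name}: {conc:.3g} {unit}")
```

The component volumes sum to the stated 156 µL. In the mixed buffer this gives roughly 128 mM ascorbic acid, 5.1 mM bathophenanthrolinedisulfonic acid, 26 mM EDTA and 0.13% Triton X-100.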
TP sequence analysis
The PATS (12) training set of apicoplast targeted proteins was used to extract a set of 35 P. falciparum TP sequences. These sequences were trimmed to the length of the ACP TP (24 amino acids) and the frequency of occurrence for each of the 20 amino acids was tabulated. The fold overrepresentation for each amino acid was calculated by dividing the frequency of occurrence in TPs by the frequency of occurrence in the P. falciparum genome. These statistics were then compared to the amino acid helical propensities calculated by Chou and Fasman (53).
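The tabulation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, the input is assumed to be TP sequences that have already been extracted, and the genome-wide frequencies and Chou–Fasman propensities must be supplied by the user (no values are reproduced here).

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_frequencies(sequences, length=24):
    """Pooled amino acid frequencies over the first `length` residues of each sequence.

    `length` defaults to 24, the size of the ACP TP given in the text.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(aa for aa in seq[:length].upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found in the input sequences")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def fold_overrepresentation(tp_freqs, genome_freqs):
    """Frequency in TPs divided by genome-wide frequency, per amino acid.

    Amino acids with a zero or missing background frequency are skipped.
    """
    return {
        aa: tp_freqs[aa] / genome_freqs[aa]
        for aa in AMINO_ACIDS
        if genome_freqs.get(aa, 0) > 0
    }
```

As a toy usage example with made-up input, `residue_frequencies(["KKNN", "KNSS"], length=4)` gives a lysine frequency of 0.375; against a uniform background of 0.05 per residue, that corresponds to a 7.5-fold overrepresentation. The resulting fold values would then be compared against a published helical propensity scale.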
NMR assignments are archived at the Biological Magnetic Resonance Data Bank under accession number 16661. The authors would like to thank Anaya Majumdar and the Johns Hopkins Biomolecular NMR center for providing indispensable technical assistance relating to NMR data collection. This work was supported by the National Institutes of Health (R01 AI065853), and the Johns Hopkins Bloomberg School of Public Health Faculty Innovation Fund. This work was also made possible by UL1 RR 025005 from the NIH National Center for Research Resources, and by the Johns Hopkins Malaria Research Institute. K. A. M. was supported by T32 AI007417.