E. Sonnhammer, Center for Genomics and Bioinformatics, Karolinska Institutet, S-171 77 Stockholm, Sweden Fax: +46 8337983 Tel: +46 852486395 E-mail: email@example.com
The transmembrane topology of presenilins is still the subject of debate despite many experimental topology studies using antibodies or gene fusions. The results from these studies are partly contradictory and consequently several topology models have been proposed. Studies of presenilin-interacting proteins have produced further contradiction, primarily regarding the location of the C-terminus. It is thus impossible to produce a topology model that agrees with all published data on presenilin. We have analyzed the presenilin topology through computational sequence analysis of the presenilin family and the homologous presenilin-like protein family. Members of these families are intramembrane-cleaving aspartyl proteases. Although the overall sequence homology between the two families is low, they share the conserved putative active site residues and the conserved ‘PAL’ motif. Therefore, the topology model for the presenilin-like proteins can give some clues about the presenilin topology. Here we propose a novel nine-transmembrane topology with the C-terminus in the extracytosolic space. This model has strong support from published data on γ-secretase function and presenilin topology. Contrary to most presenilin topology models, we show that hydrophobic region X is probably a transmembrane segment. Consequently, the C-terminus would be located in the extracytosolic space. However, the last C-terminal amino acids are relatively hydrophobic and in conjunction with existing experimental data we cannot exclude the possibility that the extreme C-terminus could be buried within the γ-secretase complex. This might explain the difficulties in obtaining consistent experimental evidence regarding the location of the C-terminal region of presenilin.
Presenilins (PSs) are transmembrane (TM) proteins that are highly conserved throughout evolution. Elucidating the TM topology of PSs is vital for understanding their function. Several different models have been suggested based on different experimental findings. It is not possible to produce a membrane topology model that agrees with all published data. In humans there are two PSs (PS1 and PS2) whose protein sequences are > 65% identical to each other. Mutations in these genes cause the majority of early onset familial Alzheimer's disease cases. PSs are also involved in the proteolytic processing of the Notch receptor, which is responsible for critical signaling events during development. Other substrates have also been identified and there are probably still more to discover [1,2].
PSs are part of γ-secretase, a multisubunit protease that also includes nicastrin, aph-1 and pen-2 [3,4]. This complex is responsible for the intramembrane proteolysis of type I membrane proteins, such as amyloid-β precursor protein (APP) and Notch. PS has 10 hydrophobic regions (HRs), which may or may not be true TM regions. In the γ-secretase complex, PS exists in its active form as a heterodimer of an N-terminal and a C-terminal fragment (NTF and CTF, respectively). Endoproteolysis of PS by an unknown ‘presenilinase’ results in these two fragments [5–7]. The major site of endoproteolytic cleavage in PS1 is between T291 and M292, which is located in HRVII. The first evidence that the NTF–CTF heterodimers are the biologically active form of PS and part of the γ-secretase complex came from inhibition studies using transition state analogues designed to target the diaspartyl putative active site of the protease [8,9]. The two conserved aspartate residues are located in HRVI and HRVIII, respectively (D257 and D385 in PS1). They have been shown to be required for γ-secretase activity. It has also been shown that the PS C-terminus is involved in γ-secretase complex assembly and activity [11–16] and is required for endoplasmic reticulum (ER) retention. It is the conserved ‘PAL’ sequence in the C-terminal part of the protein that is believed to be essential for the activity. In addition, a wide range of PS-interacting proteins has been identified.
All previously proposed PS topology models agree that HRI–VI are true TM regions. The debate concerns which of HRVII–X are true TM regions, and also the location of the N- and C-termini. It is impossible to produce a membrane topology model that concurs with all previously published models. Most studies conducted are gene fusion experiments using different reporters to indicate cytosolic or noncytosolic location. Such analyses are known to suffer frequently from artifacts induced by the truncation of the protein studied and the nature of the reporter gene used, and therefore these studies must always be interpreted with caution. Antibody experiments on native proteins have been performed by Doan et al. and Dewji et al. [19,20]. Although antibodies are generally considered more reliable, these two studies produced contradictory results.
The first and most widely accepted model for PS topology is the eight-TM model (excluding HRVII and HRX) with both the N- and C-termini located in the cytosol [21,22]. The experimental results from one antibody study supported this model, while at the same time the authors also speculated about several different possible topology models. A seven-TM model with an extracytosolic N-terminus has also been proposed [19,20]. Other models that have been suggested include a six-TM model with cytosolic N- and C-termini, and a ‘seven-TM and one membrane-embedded’ model with an extracytosolic C-terminus. The eight-TM model is primarily based on studies on the Caenorhabditis elegans ortholog of human PS, Sel-12. The other studies have mainly used human PS1, and to some extent human PS2. However, the topology of both human and worm PSs is considered to be the same.
Recently, a new family of presenilin-like (PSL) proteases (also called presenilin-homologs, PSH; intramembrane proteases, IMPAS) has emerged [25–27]. In humans there are five members of this family, the best known of which is the SPP (signal peptide peptidase). These proteins were discovered due to their sequence homology with PSs. Although the overall sequence homology between the PSs and PSLs is low, they share the conserved aspartate residues that are presumed to be the active site, and the conserved PAL sequence in the C-terminus. Indeed, SPP has been identified as an aspartyl protease. The topology of the PSL proteins has been predicted to be a nine-TM topology with the N-terminus in the extracytosolic space and the C-terminus in the cytosol [28,29]. The aspartate residues are located in TM regions six and seven. This topology has recently been verified experimentally.
When comparing the performance of topology prediction methods and low-resolution experiments (such as gene fusion studies and antibody experiments), it has been shown that most predictors are not significantly less accurate than low-resolution experiments. The conflicting results from published studies on PS topology illustrate the limitations of low-resolution experiments. In this study we have used computational prediction methods to analyze PS topology, and have combined these results with previous data from low-resolution experiments and functional data on γ-secretase to produce a novel topology model that is well supported by both prediction methods and experimental results. We propose a novel nine-TM topology model with the C-terminus in the extracytosolic space and the two putative active site aspartate residues in TM regions six and seven, respectively.
Results and Discussion
All of the experimentally inferred topology models agree that HRI–VI are indeed TM regions [18–24]. A majority of models agree on the localization of the N-terminus and the loops between HRI and HRVI [18,21–24]. Only one model suggests that the N-terminus is located in the extracytosolic space [19,20], leading to opposite localization of the loops between HRI and HRVI compared to the other models. Most of the contradictory results concern the C-terminal part, and consequently several models have been proposed.
We analyzed the PS sequences with five different predictors using the sfinx tool. Figure 1A shows the output for human PS1. The results for the other PS sequences were essentially the same (not shown). The major insight is that HRX is probably a TM region, which is not the case in most previously published models, including the most widely accepted eight-TM model [21,22]. A minority of the methods predict that HRVIII is a TM region when used with default settings: hmmtop2.1 considers HRVIII to be a TM region, whereas memsat predicts both HRVII and HRVIII to be TM regions. However, the hypothesis that PS is an aspartyl protease performing intramembrane proteolysis of its substrates strongly suggests that the two putative active site aspartate residues are located in TM regions. Therefore, we ran a Phobius constrained prediction with both aspartate residues constrained to be in TM segments. The result was a nine-TM topology with the C-terminus in the extracytosolic space (Figs 1A and 2B). This is identical to the topology predicted by hmmtop2.1. This topology was essentially consistent for all PS sequences analyzed.
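The link between the number of TM segments and the location of the termini can be illustrated with a short Python sketch. Each TM segment crosses the membrane once, so the side alternates along the chain and an odd number of TM segments places the two termini on opposite sides. The function name `loop_sides` and the side labels are illustrative assumptions; they are not part of Phobius, hmmtop or any other predictor used in this study.

```python
# Minimal sketch: derive the side of every terminus/loop from the number of
# TM segments and the side of the N-terminus. Names are hypothetical.

OPPOSITE = {"cytosol": "extracytosol", "extracytosol": "cytosol"}


def loop_sides(n_tm, n_terminus="cytosol"):
    """Return the sides of the N-terminus, each loop and the C-terminus.

    The returned list has n_tm + 1 entries: index 0 is the N-terminus,
    the last entry is the C-terminus.
    """
    sides = [n_terminus]
    for _ in range(n_tm):
        # Every TM segment flips the chain to the other side of the membrane.
        sides.append(OPPOSITE[sides[-1]])
    return sides


if __name__ == "__main__":
    # Nine TM segments with a cytosolic N-terminus (the model proposed here).
    print(loop_sides(9, "cytosol")[-1])
    # Eight TM segments with a cytosolic N-terminus (the eight-TM model).
    print(loop_sides(8, "cytosol")[-1])
```

Under these assumptions, nine TM segments with a cytosolic N-terminus give an extracytosolic C-terminus, whereas eight TM segments return the C-terminus to the cytosol.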
Our model is supported by 70% of the experimentally determined loop locations (Fig. 2B). The main conflicts between our model and those proposed previously concern HRX and to some extent HRVIII. There are several grounds for HRVIII being a TM region. The most compelling evidence comes from studies showing that the two conserved aspartate residues in HRVI and HRVIII indeed seem to be the active site [8–10]. The proteolysis that these residues perform is intramembrane, implying that they probably are located in TM regions. The relatively weak hydrophobicity of HRVIII can be explained by this embedded catalytic moiety. Furthermore, the membrane topology of the PSL family supports that the putative active site aspartate residues are located in adjacent TM regions (Fig. 1B) [28–30]. Finally, one of the previously published PS topology models supports HRVIII as a TM region [21,22]. In our model, as well as in the latter, the orientation of the PS active site is inverted compared to the PSL. A rationale for this is that the PS substrates are of inverse orientation compared to PSL substrates; they are type I and type II membrane proteins, respectively [1,2].
There are several reasons why we believe that HRX is a TM region. Undoubtedly, the membrane topology model for the PSL proteins [28–30] supports HRX as a TM region (Fig. 1B). The PAL motif that is conserved between the PS and PSL is located in the proximity of, or within, HRX. In addition, HRX is a strongly hydrophobic region and all the predictors consider it to be a TM region. The PAL sequence in the C-terminus has been shown to be essential for γ-secretase activity [11–13,15,16]. In fact, it may constitute part of the active site, as mutations in these residues exhibit the same phenotypes as mutations in the aspartate residues. This implies that the PAL sequence might be located in a TM region. The C-terminus has also been shown to interact with a number of other proteins, which could give an indication as to whether it is located in the cytosol or extracytosolic space. One of the members of the γ-secretase complex, nicastrin, has been shown to bind at the C-terminus of PS. The PS binding site in nicastrin is likely to be located in its TM region, suggesting that the PS C-terminus dives into or penetrates the membrane. Telencephalin (TLN) and APP have also been shown to interact with PS. Once again, PS binds with its C-terminus to the TM domain of TLN and APP. Furthermore, the topology model proposed by Nakai et al. supports HRX as a TM region.
In the model proposed here, only approximately 14 amino acid residues in the C-terminus are located in the extracytosolic space. Considering the available experimental data and the fact that these residues are relatively hydrophobic, it is possible that the extreme C-terminus might be buried in the γ-secretase complex. The PS C-terminus has also been shown to be required for ER retention. The sequence identified by Kaether et al. consists of 22 amino acids that include the PAL motif and HRX, and it is well conserved between species. As noted by Kaether et al., this sequence is not similar to any known cytoplasmic ER retention signal. However, subunits of multiprotein complexes possess retention signals in their TM domains, which do not have the classical three-to-four amino acid motif. These retention signals prevent the export of unassembled subunits from the ER. Once the complex is assembled, the retention signal is masked and the protein complex can be exported. In another study the PS C-terminus was also found to be required for efficient transport; however, no details regarding the experiment were given.
Altogether, there are experimental data that support HRX as a TM region. At the same time, there are experimental studies that contradict HRX as a TM region. Only the topology model by Nakai et al. supports HRX as a TM region, although Doan et al. speculate that HRX might be a TM region without presenting any experimental evidence. A reason why HRX has not been considered a TM segment is that in most gene fusion studies the loop before HRX and the C-terminus were determined to be located in the cytosol. However, as pointed out by Nakai et al., this could be due to the nature of the reporter gene used. The C-terminus of PS is relatively hydrophobic and fusion to a reporter gene with a relatively hydrophobic N-terminus could create an artificial TM region at the fusion point, leading to incorrect localization of the reporter gene. The contradicting results regarding the C-terminal location from the antibody studies [18–20] might be an artifact due to the overexpression of PS.
There are also protein interaction experiments that indicate a cytosolic location of the C-terminus. PS has a potential PDZ domain recognition sequence in its extreme C-terminus, and PDZ domain-containing proteins that recognize PS through this motif have been reported [36–39]. Taken together, the experimental data regarding the location of the C-terminus are conflicting. We believe that the topology predictions for PS and the confirmed topology for the PSL proteins, combined with published PS experimental studies, suggest that the C-terminus is located in the extracytosolic space. A possible explanation for the conflicting results is that the C-terminus could be buried in the γ-secretase complex, as discussed previously. Another possibility, although less likely, is that there are two molecular species of PS coexisting in the cell, differing in the location of the C-terminus. This would imply that the C-terminal region could interact with factors in both the cytosol and the extracytosolic space.
The antibody studies by Dewji et al. [19,20] have the most conflicts with our model. They propose a seven-TM topology model with the N-terminus and the long loop between HRVII and HRVIII located in the extracytosolic space, and consequently the C-terminus in the cytosol. The localization of the N-terminus and the long loop not only conflicts with our model, but also with all other experimental studies. The proposition by Dewji et al. that PSs are G protein-coupled receptors (GPCRs) and have a seven-TM topology with an extracytosolic N-terminus is unlikely for several reasons. First, there are eight very hydrophobic regions in the PS sequence (HRI–VI and HRIX–X). Second, the N-terminus is probably located in the cytosol as shown by four experimental studies [18,21–24] and our analysis. Third, the PSs do not exhibit topological characteristics commonly found in different subfamilies of known GPCRs. For example, loop six (extracytosolic loop three) in the Dewji et al. model is very long (approximately 140 residues), which has not been found in GPCRs. Also, the first cytosolic loop is longer than those usually found in GPCRs.
In summary, we propose a novel nine-TM topology for the PSs with the C-terminus located in the extracytosolic space (Fig. 2B). The nine TM regions correspond to HRI–VI and HRVIII–X. The putative active site aspartate residues are located in TM regions six and seven in our model. It is impossible to decide on a single topology model that agrees with all published experimental data on PSs. Determining their topology is critical for understanding their normal and pathogenic functions. However, the ultimate solution to the topology of PSs can only be achieved through atomic resolution studies of the whole γ-secretase complex. Meanwhile, we offer an alternative topology model of PSs, reconciling the previously published experimental data with results from our topology prediction analysis.
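The correspondence between hydrophobic regions and TM numbering in the proposed model can be written out as a small Python sketch. The helper `tm_numbering` is a hypothetical name introduced only for illustration; the set of excluded regions (HRVII) and the positions of D257 and D385 in HRVI and HRVIII are taken from the text above.

```python
# Minimal sketch: number the TM segments of a model by skipping the
# hydrophobic regions (HRs) that the model does not treat as TM regions.

HYDROPHOBIC_REGIONS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"]


def tm_numbering(excluded):
    """Map each HR that is a TM region to its TM index (1-based)."""
    numbering = {}
    for hr in HYDROPHOBIC_REGIONS:
        if hr not in excluded:
            numbering[hr] = len(numbering) + 1
    return numbering


# Nine-TM model: HRI-VI and HRVIII-X are TM regions, HRVII is not.
nine_tm = tm_numbering(excluded={"VII"})

# HRs carrying the putative active site aspartates in PS1 (from the text).
aspartate_hr = {"D257": "VI", "D385": "VIII"}
aspartate_tm = {res: nine_tm[hr] for res, hr in aspartate_hr.items()}

if __name__ == "__main__":
    print(len(nine_tm), aspartate_tm)
```

With HRVII excluded, HRVI and HRVIII become TM regions six and seven, which is where the two aspartate residues sit in the model.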
The PS sequences in the full alignment from the Pfam database, version 16.0, were analyzed, excluding sequence fragments. Membrane topology for each sequence was predicted with five different methods, and the sfinx tool (http://sfinx.cgb.ki.se/) was used to display the results. The predictors were Phobius, tmhmm2.0, PHDhtm, hmmtop2.1 and memsat. In addition, a Kyte–Doolittle hydrophobicity curve was constructed for each PS sequence. Phobius also predicts N-terminal signal peptides. Each program was used with default settings. The sequences were also analyzed using Phobius constrained predictions with the two putative active site aspartate residues constrained to be in TM segments.
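A Kyte–Doolittle curve of the kind mentioned above can be sketched in Python as a sliding-window average of per-residue hydropathy values. The scale values are the standard Kyte–Doolittle indices; the window length of 19 residues is an assumption (a value commonly used when looking for TM segments), since the window used in this study is not stated. The function name and the toy sequence are hypothetical and are not PS sequence data.

```python
# Minimal sketch of a Kyte-Doolittle hydropathy profile.
# The window length (19) is an assumption, not a parameter from this study.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Toy sequence (hypothetical): a leucine stretch flanked by charged residues.
    toy = "DDDD" + "L" * 19 + "KKKK"
    profile = hydropathy_profile(toy)
    peak = max(range(len(profile)), key=profile.__getitem__)
    print(peak, round(profile[peak], 2))
```

Windows whose average is strongly positive mark candidate hydrophobic regions; whether such a region is a true TM segment still has to be judged together with the topology predictors and the experimental data.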
This work was supported by a grant from the Swedish Knowledge Foundation and Pfizer Corporation to A.H. and L.K., and by grants from Pfizer Corporation to E.S.