The sterile alpha motif (SAM) domain is one of the most common protein modules found in eukaryotic genomes. Many SAM domains have been shown to form helical polymer structures, suggesting that SAM modules can be used to create large protein complexes in the cell. Because many polymeric SAM domains form heterogeneous and insoluble aggregates that are experimentally intractable when isolated, it is likely that many polymeric SAM domains have gone uncharacterized. We, therefore, developed a method to maintain polymeric SAM domains in a soluble form that allowed rapid screening for potential SAM polymers. SAM domains were expressed as fusions to a super-negatively charged green fluorescent protein (negGFP). The negGFP imparts three useful properties to the SAM domains: (1) the charge helps to maintain solubility; (2) the charge leads to reliable migration toward the anode on native gels; and (3) the fluorescence emission allows visualization in crude extracts. Using the negGFP-SAM fusions, we screened a large library of human SAM (hSAM) domains for polymerization using a native gel screen. A selected set of hSAM domains was then purified and examined for true polymer formation by electron microscopy. In this manner, we identified a set of new potential SAM polymers: ANKS3, Atherin, BicaudalC1, Caskin1, Caskin2, Kazrin, L3MBTL3, L3MBTL4, LBP, Liprinβ1, Liprinβ2, SAMD8, SAMD9, and STIM2. While further characterization will be necessary to verify that the SAM domains identified here truly form polymers, our results provide a much stronger working hypothesis for a large number of proteins than was possible from sequence analysis alone.
Many eukaryotic proteins are constructed in a modular fashion in which distinct functionalities are provided by linking together protein domains.1 As a result, it is often possible to develop general functional hypotheses for a particular protein by knowing its domain structure. Consequently, considerable effort has been devoted to understanding the functions of common protein modules.
Sterile alpha motif (SAM) domains are among the most common protein modules in eukaryotic genomes2 and are somewhat unusual because they exhibit a wide range of different functions. They display a variety of oligomeric forms including both polymers3–8 and closed oligomers,5, 9–12 thereby playing important roles in building cellular complexes. SAM domains also bind to other domains such as MAP kinases,13 as well as to RNA,14 and one SAM domain has been suggested to bind to lipids.15 Because of the large number of known functions, the presence of a SAM domain does not necessarily imply a particular function, but a set of possible functions. It is, therefore, necessary to characterize each SAM domain individually.
Polymerization is perhaps the most unusual and striking function of many SAM domains. SAM polymers have been found in transcriptional repressors where they may mediate spreading along chromatin3, 6 as well as a wide variety of scaffolding proteins where SAM polymers are used to generate large protein complexes.4, 16 SAM domains known to form polymers are assembled by a head-to-tail association of monomeric subunits. These interactions are mediated by two common binding surfaces termed the mid-loop (ML) and end-helix (EH) surfaces3 [Fig. 1(A)]. Another form of SAM polymer utilizing different interaction surfaces has been proposed, but there is still no compelling evidence that this is a biologically relevant structure.17, 18 The SAM domain polymer is quite versatile as the N- and C-termini project away from the axis of the polymer such that the SAM domain can be placed anywhere in a protein and still form a polymer, possibly explaining its wide exploitation in the eukaryotic genome.
Polymerization is a particularly problematic function for biochemical examination because polymers are often insoluble and heterogeneous. As a result, predictions indicate that many polymeric SAM domains remain uncharacterized.19 We, therefore, sought to fill this gap in the functional annotation of this important protein module. Here, we describe the development of a high throughput screen for polymerization and its application to nearly all identified SAM domains in the human genome. We have identified a total of 17 potentially new polymer-forming SAM domains and structurally characterized 14 of them by electron microscopy (EM).
Supercharged GFP fusion approach to solubilizing and screening SAM polymers
Our goal was to comprehensively screen human SAM (hSAM) domains for polymerization. We, therefore, attempted to develop a method that would reliably solubilize polymeric SAM domains so that they could be examined. In addition, we sought a way to rapidly screen for their ability to oligomerize. The approach we took is outlined in Figure 1. We expressed all SAM domains as fusions to a super-negatively charged green fluorescent protein (negGFP)20 with a net charge of −30. As described below, the highly charged fusion effectively solubilizes SAM domains. We also exploited the extreme negative charge of the fusion proteins to develop a rapid initial screen for oligomerization. In particular, the negatively charged fusions reliably run toward the anode in a native gel so that much of the uncertainty about migration in a native gel is eliminated. We also exploited the negGFP fluorescence so that we could screen for migration in native gels and in gel filtration columns using crude extracts, obviating the need for laborious purification. Those SAM domains that appeared to form heterogeneous aggregates in native gels were then examined further by gel filtration and, in selected cases, electron microscopy (EM).
Supercharged GFP fusions solubilize SAM polymers
To test whether the negGFP could solubilize SAM polymers, we chose the SAM domain from TEL (TEL-SAM) because it forms long, very stable polymers that are largely insoluble.3 We also used as controls two monomeric mutants of TEL-SAM: TEL-SAM-A61D, which bears the mutation on the ML surface, and TEL-SAM-V80E, which bears the mutation on the EH surface. The single mutants are soluble because they cannot self-associate, but they bind to each other to form a dimer with a Kd of ∼2 nM via the lone intact ML and EH surfaces.21, 3 We expressed negGFP-TEL fusions in E. coli and observed soluble protein in the cell lysate for all fusions.
To examine whether the soluble wild-type TEL-SAM fusion remained polymeric when fused to the highly charged negGFP, we analyzed the fusion protein by gel filtration chromatography and EM. By monitoring fluorescence, it was possible to observe the elution position of negGFP fusions on a gel filtration column run using crude extract. As shown in Figure 2(A), negGFP-TEL-WT eluted in the void volume of a gel filtration column, while the negGFP-TEL-V80E fusion eluted as a monomer as expected. To verify that the high molecular weight material was polymer and not amorphous aggregates, negGFP-TEL-WT was purified and examined by EM [Fig. 2(B)]. The negGFP-TEL-WT polymers were readily apparent across the entire grid. The diameter of a polymer is roughly 12 nm compared to the 8 nm diameter of the TEL-SAM polymers visualized previously,3 with the extra width presumably supplied by the negGFP fusion protein.
Not all polymers are as stable as TEL-SAM and may be more susceptible to steric and charge repulsion imparted by the negGFP fusion. To test the applicability of the negGFP fusion to weak SAM polymers, we tested the approach on the human DGKδ-SAM polymer, which has an intersubunit Kd of roughly 6 μM,7 more than 1000-fold weaker than TEL-SAM. EM images of purified negGFP-DGKδ-WT are shown in Figure 2(B). The negGFP-DGKδ-WT polymers are not as easily identified as they are shorter than the TEL polymers and infrequently spread across the grid. Many round particles of the same diameter as the polymers can be seen in the background of the grid and may be single turns or very short polymers lying on the grid perpendicular to the polymer axis. Nevertheless, scattered rods are clearly visible on the grid. Thus, negGFP can be used to solubilize SAM polymers and it does not completely eliminate the ability of even weak polymers to self-associate.
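The affinity comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are ours, the Kd values are the approximate figures quoted in the text (about 2 nM for the TEL-SAM mutant pair and roughly 6 μM for DGKδ-SAM), and the free-energy conversion assumes a temperature of 25 °C, which the text does not state.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_weaker(kd_weak_molar: float, kd_strong_molar: float) -> float:
    """Ratio of two dissociation constants (how many times weaker the first is)."""
    if kd_weak_molar <= 0 or kd_strong_molar <= 0:
        raise ValueError("Kd values must be positive")
    return kd_weak_molar / kd_strong_molar


def binding_energy_gap_kcal(ratio: float, temp_kelvin: float = 298.15) -> float:
    """Free-energy difference implied by a Kd ratio: RT * ln(ratio)."""
    return R_KCAL_PER_MOL_K * temp_kelvin * math.log(ratio)


KD_TEL_SAM = 2e-9    # ~2 nM, approximate value quoted in the text
KD_DGKD_SAM = 6e-6   # ~6 uM, approximate value quoted in the text

ratio = fold_weaker(KD_DGKD_SAM, KD_TEL_SAM)
gap = binding_energy_gap_kcal(ratio)  # assumes 25 C (not stated in the text)
print(f"DGKd-SAM is ~{ratio:.0f}-fold weaker (~{gap:.1f} kcal/mol per contact)")
```

With these inputs the ratio is about 3000, consistent with the "more than 1000-fold" statement. Under the assumed temperature this corresponds to roughly 4.7 kcal/mol per intersubunit contact.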
Testing a rapid native gel screen for SAM polymers
To rapidly screen for potential SAM polymers, we used native gel analysis. The approach makes use of two key features of negGFP fusions: (1) the extreme negative charge ensures that virtually all proteins migrate toward the anode and (2) the green fluorescence can be used to monitor the migration of the fusion protein in crude lysates. Figure 2(C) shows the migration of negGFP fusions containing TEL-WT, the monomeric mutants TEL-V80E and TEL-A61D, as well as the mixture of TEL-V80E and TEL-A61D that should form a dimer. These controls demonstrate that monomeric and dimeric populations can be easily distinguished. The polymeric negGFP-TEL-WT exhibited a dramatic shift in migration and largely remained in the gel well. These results indicate that native gels can readily identify highly stable polymers like TEL-SAM in crude cell lysates.
Can weaker polymers still be identified by native gel analysis? Native gel analysis of negGFP fusions containing the weakly associating DGKδ-WT and two DGKδ monomeric mutants, E35G (ML surface) and V52E (EH surface), is shown in Figure 2(C). Like TEL, the DGKδ monomer fusions could easily be distinguished from the DGKδ dimer created by mixing E35G and V52E mutants, indicating that the negGFP fusion does not completely disrupt subunit association. negGFP-DGKδ-WT, unlike negGFP-TEL-WT, entered the gel but displayed a retarded migration relative to the monomer and dimer controls. In addition, both the dimer and polymer bands of DGKδ-SAM are more diffuse than seen for TEL-SAM, which may reflect subunit dissociation during electrophoresis or heterogeneity in the starting population. Nevertheless, the results indicate that native gels can clearly identify possible polymers, even one as weakly assembled as DGKδ-SAM.
Screening human SAM domains for possible polymer formation
To screen for new SAM polymers in the human proteome, we attempted to generate negGFP fusions from as many hSAM domains as possible. hSAM domains were identified through the SMART database22, 23 and annotations in the NCBI gene database. Most of the identified hSAM domains were obtained by PCR amplification from the Stratagene MegaMan Human Transcriptome Library. For some hSAM domains that could not be obtained from the cDNA library, we were able to obtain clones from a variety of sources. Several proteins contain more than one SAM domain placed in tandem within the protein. For these proteins, we attempted to clone the tandem SAMs both in isolation and as a single, combined construct. We initially targeted 114 unique hSAM domains from 92 proteins in the human genome. Of the 114 targets, we were able to prepare negGFP-hSAM fusions for 96 unique hSAM domains from 77 proteins.
All the negGFP-hSAM fusions were expressed in E. coli and the lysates analyzed by native gel electrophoresis. negGFP-DGKδ-V52E and negGFP-DGKδ-WT were used in each gel as monomer and polymer controls. As DGKδ-SAM is the weakest hSAM polymer interaction characterized thus far, most other polymeric SAMs should migrate more slowly through the gel. Concentrations were adjusted to approximately the same level by measuring fluorescence intensity. Some SAM domains that were expressed at low levels were partially purified so that they could be concentrated before loading on a gel (Supporting Information Fig. S1, boxed section). We were unable to analyze the hSAM domains from five proteins because they were either proteolyzed or poorly expressed. Indeed, most constructs containing a single SAM derived from a tandem SAM protein could not be stably expressed in isolation, suggesting that most tandem SAMs form a structural unit. We were able to screen the hSAMs (and tandem hSAMs) from 72 separate proteins for their ability to polymerize. The results, summarized in Figure 3, indicate that close to half of the members of the hSAM domain family may form polymers or homo-oligomers.
All the hSAMs that have been previously characterized to form polymers or have homologues that form polymers behaved as aggregates in the native gel analysis. This group includes diacylglycerol kinase δ (DGKδ),7 diacylglycerol kinase η (DGKη), translocation-ETS-leukemia (TEL/ETV6),3 translocation-ETS-leukemia 2 (TEL2), all polyhomeotic homologues,6 sex comb on midleg homologue 1 (SCMH1),6 Shank-2, Shank-3,8 Tankyrase-1,24 Tankyrase-2, and SCM-like with Four MBT domains 1 (SFMBT1) (Supporting Information Table 1). Ephrin receptor B2 (EPHB2) behaved as a monomer, but this is expected as experimentally observed oligomer forms were only found to occur at very high concentration.18 Likewise, hSAMs that have been previously shown to be monomers, such as V-ETS Erythroblastosis virus E26 oncogene homolog 1 (ETS1),25 behaved as monomers (Supporting Information Table 1). These hSAMs, therefore, serve as additional controls for the technique and indicate that the vast majority of bona fide SAM polymers can be identified if they express at sufficient levels, without false positives.
Overall, hSAMs from 30 proteins displayed a migration pattern in the gel consistent with a polymer, 17 of which had not been previously identified as polymer-forming hSAMs. This group included Ankyrin repeat and sterile alpha motif domain containing 3 (ANKS3), Atherin, BicaudalC1, CASK interacting protein 1 (Caskin1), CASK interacting protein 2 (Caskin2), Kazrin, L(3)mbt-like 3 (L3MBTL3), L(3)mbt-like 4 (L3MBTL4), lipopolysaccharide binding protein (LBP), Liprinβ1, Liprinβ2, sterile alpha motif domain containing 3 (SAMD3), sterile alpha motif domain containing 8 (SAMD8), sterile alpha motif domain containing 9 (SAMD9), sterile alpha motif domain containing 13 (SAMD13), sterile alpha and TIR motif containing 1 (SARM1), and stromal interaction molecule 2 (STIM2).
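The counts reported in this section can be tied together with simple bookkeeping. The sketch below is only a consistency check on the numbers quoted in the text; the class and attribute names are ours and do not correspond to anything in the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenCounts:
    """Protein-level counts quoted in the text for the native gel screen."""
    proteins_with_fusions: int = 77      # proteins with negGFP-hSAM fusions
    proteins_not_analyzable: int = 5     # proteolyzed or poorly expressed
    proteins_polymer_like: int = 30      # polymer-like migration pattern
    proteins_newly_identified: int = 17  # not previously known as polymers

    @property
    def proteins_screened(self) -> int:
        return self.proteins_with_fusions - self.proteins_not_analyzable

    @property
    def polymer_like_fraction(self) -> float:
        return self.proteins_polymer_like / self.proteins_screened

    @property
    def previously_known(self) -> int:
        return self.proteins_polymer_like - self.proteins_newly_identified


counts = ScreenCounts()
print(f"screened: {counts.proteins_screened} proteins")
print(f"polymer-like: {100 * counts.polymer_like_fraction:.1f}% of screened")
print(f"previously known or homologous to known polymers: {counts.previously_known}")
```

The subtraction reproduces the 72 screened proteins stated above, and about 42% of them gave a polymer-like pattern. The "close to half" figure in the text refers to polymers or homo-oligomers together, so it presumably includes additional oligomeric hSAMs beyond these 30.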
Identification of polymeric hSAM domains by EM
To distinguish polymers from amorphous aggregates, we examined selected negGFP-hSAM fusions by negative-stain EM. As we were interested in new polymers, we did not further examine any hSAM domains for which a close homolog had been previously shown to be polymeric (see Supporting Information Table 1), leaving 17 hSAM domains to be characterized. Of these, we found 14 hSAMs that appear able to form polymers rather than amorphous aggregates as judged by the presence of elongated rods. The SAM domains that may be polymer forming are ANKS3, Atherin, BicaudalC1, Caskin1, Caskin2, Kazrin, L3MBTL3, L3MBTL4, LBP, Liprinβ1, Liprinβ2, SAMD8, SAMD9, and STIM2 (Fig. 4). The hSAM domains that formed aggregates by native gel analysis but did not form obvious polymers when examined by EM were SAMD3, SAMD13, and SARM1 (data not shown).
We were intrigued to find that a number of proteins that contain two or three tandem SAM domains formed likely polymeric structures. The tandem hSAM proteins were Caskin1, Caskin2, Kazrin, Liprinβ1, and Liprinβ2. So far, all previously characterized SAM domain polymers are made from single SAM domain units. We, therefore, chose one of these, Caskin1, which contains two tandem SAM domains, to characterize further. We were able to obtain a crystal structure of the Caskin1 tandem SAM domains, revealing a novel helical polymer structure, containing four tandem SAM domains per turn (submitted for publication).
We have identified a number of potential SAM polymers, which provides insight into how these proteins might function. We previously developed a prediction algorithm for possible polymer-forming SAM domains.19 Of the 51 human SAM domains predicted previously that were also examined here, 46 are consistent with our experimental findings (excluding the SAM domains used for algorithm training). The few discrepancies included several SAMs predicted to be nonpolymers: SAMHD1 and SEC23IP, which behaved as borderline oligomers, and SAMD8, which we identified as a polymeric SAM. Additional discrepancies were two SAMs predicted to be polymers: c14orf174, which behaved as a borderline oligomer, and INPPL1, which behaved as a nonpolymer/weak polymer. While it is not possible to rigorously validate polymer formation for any protein without considerably more work, our results provide a much stronger working hypothesis than can be had from sequence analysis alone. Some interesting possibilities emerged.
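The agreement between prediction and experiment can be tallied directly from the discrepancies listed above. This is a minimal sketch whose variable names are ours. It uses only the five discrepant SAM domains and the total of 51 given in the text, and makes no assumption about how the original algorithm scores sequences.

```python
# Each entry maps a discrepant SAM domain to (predicted class, observed behavior),
# exactly as described in the text.
discrepancies = {
    "SAMHD1": ("nonpolymer", "borderline oligomer"),
    "SEC23IP": ("nonpolymer", "borderline oligomer"),
    "SAMD8": ("nonpolymer", "polymer"),
    "c14orf174": ("polymer", "borderline oligomer"),
    "INPPL1": ("polymer", "nonpolymer/weak polymer"),
}

n_examined = 51  # predicted hSAMs also examined here (training set excluded)
n_consistent = n_examined - len(discrepancies)
agreement = n_consistent / n_examined

print(f"{n_consistent} of {n_examined} predictions consistent "
      f"({100 * agreement:.1f}%)")
```

The five listed discrepancies account for the difference between 51 examined and 46 consistent predictions, an agreement of about 90%.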
Atherin is a protein found in atherosclerotic lesions in vessels where it potentially acts to anchor low-density lipoproteins (LDL) to the arterial wall.26 We found that the Atherin hSAM forms polymers, suggesting that SAM-mediated polymers could act as a physical scaffold for atherosclerotic plaques.
Bicaudal C1, which contains another SAM from our list of likely novel polymers, is a conserved RNA-binding protein that has been linked to polycystic kidney disease (PKD). Mutation of the Bicaudal C1 gene which leads to truncation of the protein short of the SAM domain gives rise to a mouse model of PKD.27 Functional Bicaudal C1 containing the SAM domain is required for the alignment of renal primary cilia such that directional fluid flow can be achieved. Bicaudal C1-hSAM concentrates the protein in cytoplasmic RNA-processing bodies, called P-bodies, where it is thought to potentially play a role in downregulation of canonical WNT signaling and thereby facilitate the formation of the planar cell polarity, required for coordination of cilia orientation.28 Localization of the Bicaudal C1 protein at P-bodies could conceivably be organized by SAM-mediated polymers such as those observed in this study. Interestingly, Bicaudal C1 has been shown to physically associate with another SAM-containing protein with a role in PKD, ANKS6. ANKS6 has been shown to self-associate in a manner dependent on both its SAM and K homology domains.29 While the ANKS6 hSAM was classified as a borderline oligomer in this study due to its weak association in the native gel assay, formation of strong polymers might ensue with additional interactions mediated by the K homology domain. As the Bicaudal C1–ANKS6 association is dependent on their respective ankyrin repeats but not their SAMs, the ankyrin repeat binding sites could conceivably join these proteins in a polymer network of Bicaudal C1 and ANKS6 hSAM homopolymers. This topology would be distinct from previously proposed SAM copolymers that are formed by polycomb group proteins.6
STIM1 and 2 are type-I transmembrane proteins that play an essential role in calcium homeostasis and signaling. STIM1 provides a critical role in store-operated calcium entry (SOCE) by sensing calcium depletion in the lumen of the ER and transducing that signal to factors at the plasma membrane to allow influx of extracellular calcium.30, 31 STIM2 is thought to play roles in maintaining basal calcium levels as well as potentially regulating SOCE.32 The ER-signal peptide, calcium-sensing EF domain, and SAM domain reside within the ER lumen while the remainder of the protein projects into the cytosol.33 Previous work showed STIM1 and STIM2 aggregate in a SAM-dependent manner in response to calcium depletion, a step required for relocalization of STIM1 into puncta in close proximity to the plasma membrane.34, 35 This puncta formation is required for signal transduction and the eventual opening of Ca++ channels.36 A crystal structure of a construct containing the STIM1 EF and SAM domains shows that the two domains pack together when the EF hand is in the Ca++-loaded state.37 Oligomerization of both STIM1 and STIM2 appears to be linked to partial unfolding,35, 38 suggesting the possibility that the loss of Ca++ binding might destabilize all or part of the EF domain, exposing a surface of the SAM domain required for oligomer formation. The rod-shaped structures we observed suggest that STIM2 assembly may be well organized rather than just simple aggregation of unfolded polypeptide chains.
A number of proteins containing our newly identified polymeric hSAMs are known scaffolding proteins such as Liprin β1 and β2, Kazrin, Caskin1 and Caskin2. Like other characterized polymeric SAM domains in scaffolding proteins, polymerization provides a way to organize large multiprotein complexes. Unlike previously characterized SAMs found within scaffolding proteins, however, all of these proteins contain multiple, adjacent SAMs. Thus for the first time, we have discovered likely polymeric forms of tandem SAM domains. In one case, Caskin1, we followed up on our findings from this screen and obtained a new polymeric SAM structure made from two tandem SAM domains (submitted for publication). Like the Caskin1 example, we hope that our screen will provide a starting point for further investigations of the structural and functional implications of SAM polymerization.
Materials and Methods
A clone for negGFP including an N-terminal 6xHis tag20 was kindly provided by the Liu lab (Harvard University). The negGFP sequence including the His tag was cloned into the pBAD-HisA vector (Invitrogen) using NcoI and KpnI sites. Next, an oligomer cassette was inserted downstream of the negGFP sequence using KpnI and HindIII restriction sites. The cassette included the sequence for a Gly-Ser linker (GGSGGS) followed by the sequences for the following combination of restriction sites: NotI–AsiSI-PISceI–Asc-FseI.
All hSAMs were identified by either the SMART database or through annotations included in the NCBI database. The SAM domains were then PCR amplified from the MegaMan Human Transcriptome Library (Stratagene) and the correct clones validated by sequencing. Platinum PCR Mix (Invitrogen) was used initially to amplify constructs from the cDNA library. Clones that were not amplified with Platinum PCR mix could often be amplified using Phusion polymerase (New England Biolabs). Several hSAMs were cloned from alternative sources of DNA. Constructs containing the Caskin 2, Liprin α2, Liprin α3, Liprin α4, Liprin β1, and Liprin β2 sequences were obtained from Open Biosystems. A Liprin α1 construct was kindly provided by Michael Streuli. The Caskin 1 (KIAA1306) construct was provided by the Kazusa DNA Research Institute. KazrinE was obtained from ATCC (9891927). Finally, SAM domains were inserted in-frame into a pBAD-HisA vector containing the negGFP sequences using the NotI and FseI sites.
negGFP-hSAM Lysate preparation
The negGFP-hSAM constructs were transformed into ARI814 cells.39 LB media (50 mL) containing 100 μg/mL ampicillin was inoculated with 0.5 mL of a saturated overnight culture grown in the same media and incubated with shaking at 37°C until the cell density reached an OD of 0.8. Cells were then transferred to 16°C, induced with 0.2% arabinose, and incubated for an additional 12 h. After harvesting by centrifugation, cells were resuspended in 0.5 mL of 20 mM Tris (pH 7.5), 1 M NaCl, 1 mM TCEP, 5 mM MgCl2 buffer containing lysozyme (5 mg/mL), DNase I (20 μg/mL), 0.5 mM PMSF, and half a tablet of complete mini protease inhibitor (Roche). Cells were subjected to three rounds of freeze-thaw followed by two 10-s rounds of sonication. The cell lysate was centrifuged at 13,000g for 20 min and the pellet discarded. Lysate fluorescence was measured by transferring 200 μL of lysate into a 96-well clear bottom, black-sided plate (Nunc). negGFP fluorescence was monitored on a Molecular Devices Spectramax M5 plate reader using an excitation wavelength of 488 nm, an emission wavelength of 510 nm, and a cutoff filter of 515 nm.
Gel filtration-fluorescence assay
Cell lysate or purified protein (0.5 mL) was loaded onto a Superdex200-10/300GL gel filtration column (Amersham Biosciences) at 0.5 mL/min in 20 mM Tris (pH 7.5), 500 mM NaCl. Fractions of 0.25 mL were collected, and 0.2 mL of each fraction was transferred to 96-well clear bottom, black-sided plates (Nunc). Fluorescence was monitored on a Molecular Devices Spectramax M5 plate reader using an excitation wavelength of 488 nm, an emission wavelength of 510 nm, and a cutoff filter of 515 nm.
Cell lysate (22.5 μL) prepared as described above was mixed with 7.5 μL 4X RunBlue Native Sample Buffer (Expedeon) and loaded onto a 20% RunBlue 12-well Native gel (Expedeon). The amounts of lysate samples run in the control gel were adjusted such that each sample had an equal fluorescence for comparison purposes. Gels were run in 40 mM Tricine, 60 mM Tris buffer for 20 h at 4°C and visualized on a Bio-Rad Molecular Imager FX Pro-Plus using an excitation wavelength of 488 nm and an emission wavelength of 510 nm. Possibly polymeric SAMs were defined as those that displayed a diffuse and retarded migration compared to the monomeric DGKδ control. Borderline oligomers were defined as those that had either a sharply defined band at a position in the gel that was significantly higher than the monomeric control or a sharp band with a slight diffuse trail. Monomeric/weak polymer SAMs were defined as those with a sharp band roughly consistent with the monomer control (some variation is to be expected as not all constructs are the same molecular weight).
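The loading arithmetic described above can be sketched as follows. This is a hypothetical illustration, not the procedure actually used: the function names, sample names, and fluorescence readings are invented, and the calculation assumes that fluorescence scales linearly with the amount of fusion protein and that weaker lysates are matched by diluting stronger ones with buffer.

```python
LYSATE_UL = 22.5      # lysate volume per lane, from the text
BUFFER_4X_UL = 7.5    # 4X native sample buffer per lane, from the text


def final_buffer_strength(stock_x: float, buffer_ul: float, sample_ul: float) -> float:
    """Final sample-buffer strength after mixing buffer stock with sample."""
    return stock_x * buffer_ul / (buffer_ul + sample_ul)


def equal_fluorescence_volumes(readings: dict[str, float],
                               max_lysate_ul: float = LYSATE_UL
                               ) -> dict[str, tuple[float, float]]:
    """Return (lysate uL, diluent uL) per sample so every lane carries the
    same total fluorescence as the weakest lysate loaded at full volume."""
    if not readings or min(readings.values()) <= 0:
        raise ValueError("need at least one positive fluorescence reading")
    weakest = min(readings.values())
    plan = {}
    for name, reading in readings.items():
        lysate = max_lysate_ul * weakest / reading
        plan[name] = (lysate, max_lysate_ul - lysate)
    return plan


# Hypothetical plate-reader values (arbitrary units per fixed volume).
readings = {"DGKd-V52E": 2400.0, "DGKd-WT": 1200.0, "TEL-WT": 3600.0}
plan = equal_fluorescence_volumes(readings)
strength = final_buffer_strength(4.0, BUFFER_4X_UL, LYSATE_UL)

for name, (lysate, diluent) in plan.items():
    print(f"{name}: {lysate:.2f} uL lysate + {diluent:.2f} uL buffer")
print(f"sample buffer strength in the lane: {strength:.1f}X")
```

Mixing 7.5 μL of 4X buffer with 22.5 μL of sample gives a 30 μL load at 1X. Normalizing to the weakest lysate keeps every lane within the 22.5 μL sample volume.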
Cultures (4 L) of ARI814 cells expressing each construct were grown and lysed in 35 mL of lysis buffer as described above, except that 10 mM imidazole was included in the lysis buffer. Cell lysate was incubated with 2 mL of Ni-NTA Superflow resin (Qiagen) for 1 h at 4°C and washed with 100 mL of 20 mM Tris (pH 7.5), 1 M NaCl, 1 mM TCEP, and 10 mM imidazole followed by 150 mL of 20 mM Tris (pH 7.5), 1 M NaCl, 1 mM TCEP, and 20 mM imidazole. Protein was eluted with 15 mL of 20 mM Tris (pH 7.5), 1 M NaCl, 1 mM TCEP, 200 mM imidazole and then dialyzed by two successive transfers into 2 L of 20 mM Tris (pH 7.5), 300 mM NaCl, 1 mM TCEP. Full-length negGFP-hSAM protein samples contaminated with free, proteolyzed negGFP were subjected to a second, gel-filtration purification step. Protein samples were concentrated to 500 μL using an Amicon Ultra(-10) centrifugal filter concentrator unit (Millipore) and were loaded onto a Superdex200-10/300GL gel filtration column (Amersham Biosciences) at 0.5 mL/min in 20 mM Tris (pH 7.5), 300 mM NaCl, 1 mM TCEP. Fractions of 0.25 mL were collected and assayed for GFP fluorescence as described above. Samples containing GFP fluorescence at an elution volume consistent with the molecular weight for a full-length monomer or higher were pooled. The final pure protein samples were concentrated as described until protein began to fall out of solution or a final volume of 200 μL was achieved.
Carbon-coated parlodion support films mounted on copper grids were made hydrophilic immediately before use. Approximately 3 μL of each protein sample was applied to separate grids and allowed to adhere for several minutes. Grids were rinsed with distilled water and negatively stained with 1% uranyl acetate. Samples were examined in a Hitachi H-7000 electron microscope at an accelerating voltage of 75 kV.
The authors thank Heedeok Hong, Robert Jefferson, Tyler Korman, and Ryan Stafford for helpful comments on the manuscript.