AP-MS, affinity purification combined with mass spectrometry; ICAT, isotope-coded affinity tag; IgG, immunoglobulin G; IMEx, International Molecular Exchange; iTRAQ, isobaric tags for absolute and relative quantitation; LC, liquid chromatography; MAP, mixing after purification; MS, mass spectrometry; MS/MS, tandem mass spectrometry; NFκB, nuclear factor kappa B; SILAC, stable isotope labeling by amino acids in cell culture; PAM, purification after mixing; siRNA, small interfering RNA; TAP, tandem affinity purification; TEV, tobacco etch virus; TNFα, tumor necrosis factor alpha
Many physiological processes are regulated by dynamic protein interaction networks whose characterization provides valuable information on cell biology. Several strategies can be used to analyze protein–protein interactions. Among them, affinity purification combined with mass spectrometry (AP-MS) is arguably the most widely employed technique, not only owing to its high throughput and sensitivity but also because it can answer critical questions such as where, when, and how protein–protein interactions occur. In AP-MS workflows, both the target protein and its interacting partners are isolated before being identified by MS. The main challenge of this approach is to distinguish bona fide binders from background contaminants. This review focuses on the different strategies designed to circumvent this limitation. In this regard, the combination of quantitative proteomics and affinity purification emerges as one of the most powerful, yet relatively simple, strategies to characterize protein–protein interactions. © IUBMB Life, 65(1):9–16, 2013
Cell physiology is governed by the capacity of proteins to form interaction scaffolds that are temporally and spatially regulated. The ability of proteins to interact with each other is largely dictated by their three-dimensional structure and by the presence of specific motifs, often highly conserved between species. Metabolism, cell-cycle regulation, and protein synthesis, folding, and degradation, to name but a few, are processes that rely largely on intricate networks of protein–protein interactions. Characterizing such networks is critical to achieve a global understanding of cell behavior and constitutes a cornerstone of so-called functional proteomics (1), whose major aim is to understand how proteome dynamics control the shaping of the phenotype. Cell map proteomics, also known as interactomics, is one of the bases of functional proteomics and aspires to depict the map of protein–protein interactions occurring within the cell.
As a rule, protein interaction data are stored in public-access repositories available to the scientific community. To date, 177 protein–protein interaction databases are listed in the Pathguide web resource (http://www.pathguide.org). To deal with the difficulty of comparing information derived from so many different sources, the International Molecular Exchange (IMEx) consortium proposes common guidelines for reporting protein interaction studies, termed MIMIx (minimum information required for reporting a molecular interaction experiment) (2). Additionally, IMEx provides a single website to access and query highly curated data from the databases associated with the consortium (3). It is far beyond the scope of this review to describe how to place interactomic data in a biological framework; a comprehensive tutorial on this key issue can be found elsewhere (4).
Different approaches such as the yeast two-hybrid system (5, 6), surface plasmon resonance (7), or protein and peptide arrays (8, 9) have been employed to study protein–protein interactions. Each of these strategies possesses distinct features and can provide complementary information. For example, surface plasmon resonance allows the determination of binding affinity and kinetics (7) and peptide arrays can be used to fine-map sequence motifs that mediate protein–protein recognition (10, 11). Nevertheless, the main limitation of these methods is that interactions are studied under nonphysiological conditions.
Besides these techniques, and sustained by a continuous improvement of mass spectrometers and protein databases, affinity purification in combination with mass spectrometry (AP-MS) has emerged as a powerful and high-throughput technology for interaction studies (12). Unlike the above-mentioned techniques, AP-MS can provide valuable information about protein–protein interactions as they occur in the cell.
In a typical AP-MS experiment, the target protein (bait) is purified almost to homogeneity under conditions that preserve both its native conformation and its binding partners (preys). To that end, affinity purification is the method of choice as it allows high purification yields in a single step. Once isolated, the bait and its preys are trypsin digested and subjected to MS analysis to identify the members of the protein complex studied (Fig. 1). As protein identification strategies are well established and easy to perform, the real challenge is to distinguish true binders from the nonspecific background of proteins that unavoidably result from any purification procedure. Interestingly, this problem is not resolved but exacerbated by the increased sensitivity of modern mass spectrometers, which may detect even minute amounts of contaminating proteins in a sample.
Two main strategies have been implemented to overcome this pitfall, namely tandem affinity purification (TAP) and quantitative AP-MS. The former is based on a two-step purification process that reduces the amount of background proteins, whereas the latter makes use of quantitative proteomics and, unlike the TAP approach, provides evidence to distinguish real interactions from false positives. The present review aims to discuss different AP-MS strategies, stressing the advantages derived from the use of quantitative proteomics in the characterization of protein complexes and interaction networks.
Tandem Affinity Purification
The TAP strategy (13, 14) was the first approach devised to deal with the presence of false positives in AP-MS experiments. This method requires expressing the target protein fused to a dual tag. The original TAP tag, of which several variants have since been introduced (15, 16), consists of a sequence stretch containing a calmodulin-binding peptide, a tobacco etch virus (TEV) protease cleavage site, and the immunoglobulin G (IgG)-binding domain of protein A from Staphylococcus aureus. This tag can be fused to either the C- or the N-terminus of the target protein. The construct is transiently or stably expressed in the host cell and purification is carried out using two sequential affinity steps under mild conditions. First, the bait and its preys are captured with IgG beads and eluted by enzymatic cleavage with TEV protease. Afterward, the eluate is subjected to affinity chromatography on a calmodulin column and released with ethylene glycol tetraacetic acid.
This strategy greatly reduces contaminants and true protein binders are, thus, significantly enriched. However, the presence of false interaction partners cannot be ruled out entirely as the TAP approach offers no evidence regarding the specificity of the characterized interactions. Small amounts of nonspecific background proteins may be present after the second purification step and will be incorrectly assigned as bona fide binders. Another drawback of this technique is the possibility that the tag may hamper the interaction of the bait with some of its physiological partners, through steric hindrance or by preventing proper folding of the protein. Finally, two-step purification can lead to the loss of labile interactors owing to the longer procedure and the different experimental conditions of each affinity column.
Despite these shortcomings, the TAP approach has been successfully used in a number of protein interaction studies. For example, Gavin et al. applied the TAP approach to the in-depth characterization of interaction networks in Saccharomyces cerevisiae (17, 18), identifying about 500 protein complexes, half of them not reported previously. An independent analysis in the same yeast species allowed the tagging and successful purification of more than 2,000 proteins involved in about 7,000 different interactions (18). Applying a similar strategy, Butland et al. mapped the interaction partners of more than 600 soluble proteins from Escherichia coli (19) and Kuhner et al. described more than 150 protein complexes in the bacterium Mycoplasma pneumoniae (20), whose genome contains fewer than 700 protein-coding genes (21). TAP has also been employed in higher eukaryotes to map protein complexes involved in the TNFα/NFκB transduction pathway (22), in transcription and RNA processing (23), or in chromosome segregation (24).
Affinity Purification and Quantitative Proteomics
The relatively recent emergence of MS-based quantitative proteomic techniques has brought about a different way to deal with the presence of false positives in AP-MS experiments. Quantitative proteomics measures differences in protein abundance between two or more samples and, with a proper experimental design, may provide valuable information about protein–protein interactions.
In quantitative AP-MS studies, two affinity purifications are carried out in parallel. In one of them, the bait protein is captured with an antibody, a specific ligand or, if tagged, with the appropriate affinity resin. The second purification serves as a negative control and is designed to yield only background proteins. Then, quantitative proteomics is used to determine the relative amount of each protein in both conditions. False positives are expected to be equally abundant and can be readily distinguished from true binders that will be present in only one of the pull-downs and, consequently, will show extreme quantitative ratios (Fig. 2). The choice of a suitable negative control is a critical issue in any quantitative AP-MS workflow and there are a number of possible experimental designs. A detailed review on this subject can be found elsewhere (25).
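The ratio-based reasoning above can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function name, the pseudocount, the log2 cutoff of 2, and the example intensities are all hypothetical choices made for illustration, and real workflows would typically add replicates and statistical testing.

```python
import math

def classify_by_ratio(bait_intensity, control_intensity,
                      log2_cutoff=2.0, pseudocount=1.0):
    """Flag proteins whose bait/control abundance ratio is extreme.

    bait_intensity, control_intensity: dicts mapping protein name to a
    summed MS intensity in the bait and negative-control purifications.
    log2_cutoff and pseudocount are illustrative assumptions, not
    published thresholds.
    """
    calls = {}
    for protein, bait in bait_intensity.items():
        # Proteins absent from the control get intensity 0; the pseudocount
        # avoids division by zero for them.
        ctrl = control_intensity.get(protein, 0.0)
        log2_ratio = math.log2((bait + pseudocount) / (ctrl + pseudocount))
        label = "candidate binder" if log2_ratio >= log2_cutoff else "background"
        calls[protein] = (log2_ratio, label)
    return calls

# Hypothetical intensities: two preys enriched with the bait, one contaminant
# that is roughly equally abundant in both pull-downs.
bait = {"prey_A": 8000.0, "keratin": 1000.0, "prey_B": 4000.0}
control = {"prey_A": 100.0, "keratin": 1100.0}
calls = classify_by_ratio(bait, control)
for protein, (log2_ratio, label) in calls.items():
    print(f"{protein}: log2 ratio = {log2_ratio:.2f} -> {label}")
```

In this toy data set the contaminant has a log2 ratio close to zero, whereas both preys exceed the cutoff, mirroring the separation sketched in Fig. 2.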
Theoretically, any quantification method can be applied in interaction studies including isotope-coded affinity tag (ICAT) (26), isobaric tags for absolute and relative quantitation (iTRAQ) (27), 18O labeling (28, 29), stable isotope labeling by amino acids in cell culture (SILAC) (30, 31), and label-free approaches (32, 33). For example, chemical labeling with ICAT was used to characterize the components of the RNA polymerase II preinitiation complex (34, 35), to find host proteins interacting with virulence factors from enteropathogenic E. coli (36), to identify protein complexes containing actinin-4 in prostate cancer cells (37), or to characterize binders of the transcription factor AP4 (38). Isobaric chemical labeling with iTRAQ was applied to the simultaneous analysis of protein interactions and their phosphorylation sites (39, 40) and, in combination with ICAT, to unravel the AP4 interaction network (38). Enzymatic labeling with 18O allowed the characterization of the cyclin-dependent kinase 9 interactome (41). Finally, label-free quantitation was used to identify novel targets of deubiquitinating enzymes (42) and can be used by the MasterMap (43) and the QUBIC (44) platforms for the characterization of protein–protein interactions.
SILAC was originally developed, and has been extensively employed, as a tool for expression proteomics (30, 31). In this approach, cell populations to be compared are cultured in medium containing either standard (“light”) or stable isotope-labeled (“heavy”) versions of one or more amino acids. After mixing cell populations in a one-to-one ratio, proteins are extracted and trypsin digested and quantitative differences between both states can be determined by MS. Given that the light and heavy samples are mixed right at the beginning of the proteomic workflow, both control and treated samples undergo the same experimental procedure, which limits technical variation. SILAC is, therefore, less prone to experimental error than other alternatives. Additionally, it is easy to implement and does not involve modifications of sample processing or treatment as labeling occurs prior to protein extraction.
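As a rough illustration of how heavy-to-light ratios might be derived once the labeled samples have been mixed, the following Python sketch summarizes peptide pairs into protein ratios and recenters them on the median. The input layout, the function names, and the median-centering step (which assumes that most proteins are unchanged, so that deviations from a one-to-one mix are corrected) are assumptions made for this example; dedicated quantification software uses more elaborate procedures.

```python
from statistics import median

def protein_ratios(peptide_pairs):
    """Summarize heavy/light peptide ratios into one ratio per protein.

    peptide_pairs: iterable of (protein, light_intensity, heavy_intensity)
    tuples, one per quantified peptide pair (hypothetical input layout).
    """
    per_protein = {}
    for protein, light, heavy in peptide_pairs:
        if light > 0 and heavy > 0:  # skip pairs with a missing partner
            per_protein.setdefault(protein, []).append(heavy / light)
    # The median is used so that a single outlier peptide has little effect.
    return {protein: median(r) for protein, r in per_protein.items()}

def center_on_median(ratios):
    """Rescale ratios so that the median protein ratio equals 1."""
    center = median(ratios.values())
    return {protein: r / center for protein, r in ratios.items()}

# Hypothetical data: the heavy sample was slightly over-represented (about
# 10%) in the mix, and protein P2 is genuinely four-fold more abundant.
pairs = [
    ("P1", 100.0, 110.0),
    ("P1", 200.0, 220.0),
    ("P2", 100.0, 440.0),
    ("P3", 50.0, 55.0),
]
normalized = center_on_median(protein_ratios(pairs))
for protein, ratio in normalized.items():
    print(f"{protein}: normalized heavy/light ratio = {ratio:.2f}")
```

Because both labeled populations travel through the same tube from the mixing step onward, a single correction factor of this kind is usually enough to account for unequal mixing.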
Mainly for these reasons, SILAC has been adopted as the gold standard method for conducting quantitative AP-MS experiments. The first SILAC-based AP-MS workflow reported (45) addressed the analysis of proteins interacting with the activated epidermal growth factor (EGF) receptor. In this study, Blagoev et al. used the SH2 domain of Grb2 to purify the activated form of the receptor and its interacting partners from EGF-treated cells. To assess the specificity of the interactions, nonstimulated cells were included as a negative control. Both cell cultures were subjected to differential SILAC labeling. After affinity purification, 28 out of 228 proteins identified showed extreme quantitative ratios, revealing their direct or indirect association to the activated receptor.
Subsequently, many laboratories have used this strategy coupled to different purification protocols based on the use of antibodies, tags, or specific ligands of the target proteins (Fig. 2A). Using SILAC-based AP-MS, de Hoog et al. identified the proteins involved in focal adhesions in an attachment-specific manner (46), Guerrero et al. mapped the 26S proteasome interaction network (47), and Foster et al. described the insulin-dependent interactome of the glucose transporter GLUT-4 (48). Other groups have used similar strategies to identify members of the COP9 signalosome complex (49), to characterize binders of protein phosphatase 1 (50), to determine the integrin-linked kinase interactome (51), to discover host proteins interacting with bacterial tyrosine kinase substrates (52), or to characterize protein complexes recognizing epigenetic histone marks (53). Additionally, this approach has been extended to the analysis of protein interactions with DNA (54) and RNA (55).
Many SILAC-based AP-MS workflows involve the transfection of the bait fused to a tag to allow its purification. Gene transfection often results in overexpression of the target protein, which may lead to the identification of nonphysiological binders. Such binders are true positives, as they effectively interact with the target protein, although their biological relevance is questionable. To bypass this limitation, Selbach and Mann developed an interaction screening method termed quantitative immunoprecipitation combined with knockdown (QUICK) that combines RNA interference, immunoprecipitation, and SILAC labeling (56). In QUICK experiments, two cell cultures are subjected to SILAC labeling and small interfering RNA (siRNA) is used to knock down the protein of interest in one of them. The second cell culture, in contrast, expresses the endogenous target protein. After mixing both populations, the bait protein and its preys are purified with a suitable antibody and the light-to-heavy ratio of each protein is used to identify the true binders (Fig. 2B). As the target protein is expressed at normal levels, this approach allows the characterization of protein complexes under nearly physiological conditions.
So far, QUICK has been used to characterize the interactome of β-catenin and CBL (56), to identify novel binding partners of 14-3-3ζ (57), or to analyze the interactome of the transcription factor Stat3 (58).
Analysis of Dynamic Interactions
The characterization of weak or dynamic interactions by SILAC-based AP-MS is a challenging task, mainly because of the exchange between light and heavy proteins from a complex during purification (Fig. 3). This phenomenon is not expected to affect the most stable interactors but can lead to the misidentification of true binders as background proteins (false negatives). This problem has been evidenced by several groups (49, 59–61) by comparing the number of identified bona fide binders using two different protocols named purification after mixing (PAM) and mixing after purification (MAP). In PAM–SILAC (Fig. 3A), labeled cells are combined right before purification, giving enough time for protein exchange to occur. Conversely, in the MAP–SILAC workflow (Fig. 3B), labeled proteins are mixed after purification so that the heavy–light swap of interactors is not possible. Weak or transient interactions can only be detected using the MAP–SILAC approach, whereas high-affinity binding partners are identified in both workflows.
Mousson et al. applied these protocols for the identification of proteins interacting with the human TATA-binding protein (TBP) (60). As expected, most known TBP interactors were found to display extreme quantitative ratios. However, the transcription factor BTAF1 was detected as a background binder when the cells were combined before the affinity step but as a true positive when proteins were mixed after purification. Further experiments with synchronized cells showed that BTAF1 is a transient member of the TBP transcription complex that is bound only during mitosis. Wang and Huang used a similar strategy to identify strong and dynamic interactions in the 26S proteasome complex, identifying 67 binders (59). Fourteen of them would have been wrongly catalogued as false positives if the cells had been combined before purification. With a comparable approach, Fang et al. distinguished between stable and dynamic interactors of the COP9 signalosome complex (49). Finally, Kito et al. used the same strategy to characterize the eIF2B–eIF2 and the cyclin–Cdc28 complexes in yeast (61).
In any quantitative AP-MS workflow based on chemical labeling (ICPL, iTRAQ, etc.), enzymatic labeling with 18O, or label-free approaches, sample combination necessarily occurs after purification and, in this regard, weak interactors cannot be wrongly identified as contaminants. However, there is no way of distinguishing dynamic from stable interactors when applying these techniques. In contrast, the combination of MAP and PAM–SILAC constitutes a simple and straightforward alternative to address this issue.
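The logic of combining both protocols can be summarized in a short Python sketch. It is only an illustration under stated assumptions: the function name, the log2 cutoff of 1, and the example ratios are hypothetical and do not come from the studies cited above.

```python
def classify_interactor(log2_pam, log2_map, cutoff=1.0):
    """Classify a protein from its bait/control log2 ratios in both workflows.

    log2_pam: ratio measured when samples are mixed before purification.
    log2_map: ratio measured when samples are mixed after purification.
    cutoff is an arbitrary illustrative threshold.
    """
    enriched_pam = log2_pam >= cutoff
    enriched_map = log2_map >= cutoff
    if enriched_pam and enriched_map:
        return "stable interactor"    # no light/heavy exchange during PAM
    if enriched_map:
        return "dynamic interactor"   # ratio collapses toward 1:1 in PAM
    if enriched_pam:
        return "inconsistent"         # not expected; worth re-examining
    return "background"

# Hypothetical log2 ratios (PAM, MAP) for three proteins.
examples = {
    "stable_prey": (3.2, 3.5),
    "transient_prey": (0.1, 3.0),
    "contaminant": (0.0, 0.1),
}
results = {name: classify_interactor(pam, mapr)
           for name, (pam, mapr) in examples.items()}
for name, label in results.items():
    print(f"{name}: {label}")
```

A protein behaving like BTAF1 in the TBP study, background in PAM but enriched in MAP, would fall into the "dynamic interactor" class under this scheme.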
Protein Interaction Analysis with Chemical Crosslinking
A different approach to the study of weak interactions entails the use of crosslinkers in combination with MS. Chemical crosslinking (62) fixes protein–protein interactions by forming covalent bonds between adjacent amino acid residues and provides a snapshot of the cellular interactome. After crosslinking, protein complexes can be purified under fully denaturing conditions, reducing the risk of losing low-affinity binders.
Crosslinking can be combined with quantitative AP-MS to filter out background contaminants in protein interaction screenings. This strategy has been used in conjunction with SILAC to characterize the proteasome interacting network (47, 63). Furthermore, by applying this approach to synchronized cells, Kaake et al. were able to identify the proteins interacting with the proteasome in a cell-cycle-dependent fashion (64). Label-free quantitation has also been coupled with chemical crosslinking and AP-MS to identify telomere-binding proteins (65) and to map the components of the COP9 signalosome (66).
An additional advantage of the use of crosslinkers is the possibility of determining contact surfaces between proteins, gaining structural information about the complexes identified (67). Nevertheless, this possibility relies on the identification of the crosslinked peptides by MS, which is still a challenging task, although some strategies have been proposed to address this issue (68–70).
Characterization of protein complexes and interaction networks is crucial to understand cell biology at the molecular level. Several strategies can be deployed to achieve this goal. Among them, the combination of affinity purification and protein identification by MS is gaining increased acceptance, mainly for its high throughput and sensitivity and because interactions can be analyzed under close to physiological conditions. The use of quantitative proteomics together with AP-MS allows differentiating true interaction partners from nonspecific background proteins by comparing the relative abundance of purified proteins from the experimental condition and from a negative control. In this regard, SILAC labeling is arguably the most suitable method to carry out quantitative AP-MS experiments owing to its high accuracy and easy implementation. One of the unique features of the SILAC approach is that it allows the combination of the sample and the negative control either before or after performing the pull-down assay. Remarkably, this attribute can be exploited to distinguish dynamic from stable binders and makes SILAC-based AP-MS one of the most powerful methods to screen protein–protein interactions. Finally, the use of chemical crosslinking is a promising strategy to map in detail the contact surfaces involved in protein–protein interactions. However, owing to the intrinsic difficulty of identifying crosslinked peptides by MS, further improvements are required before this technique can be used routinely.
The authors thank Severine Gharbi (Centro Nacional de Biotecnología, Madrid) for critical comments on the manuscript and for fruitful discussions. This work was supported by grant S2010/BMD-2305 from the Comunidad Autónoma de Madrid. The Proteomics Unit of the Centro Nacional de Biotecnología is a member of the Spanish National Institute for Proteomics (ProteoRed-ISCIII).