Structural basis of an epitope tagging system derived from Haloarcula marismortui bacteriorhodopsin I D94N and its monoclonal antibody GD‐26

Specific antibody interactions with short peptides have made epitope tagging systems a vital tool employed in virtually all fields of biological research. Here, we present a novel epitope tagging system comprising a monoclonal antibody named GD‐26, which recognises the TD peptide (GTGATPADD) derived from the Haloarcula marismortui bacteriorhodopsin I (HmBRI) D94N mutant. The crystal structure of the antigen‐binding fragment (Fab) of GD‐26 complexed with the TD peptide was determined to a resolution of 1.45 Å. The TD peptide was found to adopt a 3₁₀ helix conformation within the binding cleft, providing a characteristic peptide structure for recognition by GD‐26 Fab. The structure shows that polar and nonpolar interactions collectively contribute to the strong binding. Attempts to engineer the TD peptide show that the proline residue is crucial for the formation of the 3₁₀ helix that fits into the binding cleft. Isothermal titration calorimetry (ITC) reported a dissociation constant KD of 12 ± 2.8 nM, indicating a strong interaction between the TD peptide and GD‐26 Fab. High specificity of GD‐26 IgG for the TD peptide was demonstrated by western blotting, ELISA and immunofluorescence, as only TD‐tagged proteins were detected, suggesting the effectiveness of the GD‐26/TD peptide tagging system. Whereas existing epitope tags such as the FLAG tag and the ALFA tag adopt either extended or α‐helix conformations, the unique 3₁₀ helix conformation of the TD peptide, together with the corresponding monoclonal antibody GD‐26, offers a novel tagging option for research.


Introduction
Epitope tags are short peptide sequences (usually 6-15 amino acids in length) that are fused to a target protein for specific anti-tag antibody/tag recognition. Since the introduction of an epitope tag for recombinant protein detection [1], epitope tagging systems have become an indispensable tool in many fields of scientific research. Initially, they were developed to facilitate recombinant protein purification and detection. Epitope tagging systems are of great advantage, especially when antibodies to the protein of interest are unavailable. Epitope tags are described as a single tool that can purify numerous proteins. Later, the use of epitope tags has been explored and applied to a wide range of biological applications including immunoprecipitation, immunofluorescence microscopy [2], protein trafficking in cells [3,4], protein crystallisation and structural biology [5].
Antibodies for the most widely used epitope tagging systems such as the FLAG tag [6], the HA tag [7,8], the c-Myc tag [9,10] and the polyhistidine tag [11] are all made commercially available. However, each tag has its own strengths and weaknesses and researchers need to choose the most suitable tagging systems for their experiments [12,13]. For instance, the polyhistidine tag is convenient for protein purification with a mild elution condition, but it lacks high specificity leading to poor results in immunoblotting. On the other hand, although the FLAG tag system is not as convenient as the polyhistidine tag for the purpose of protein purification, it is highly specific and often used for immunoblotting. It is therefore common to include a combination of multiple tagging systems in one expression construct in order to fulfil various needs [14,15].
Epitope tags are short enough that they are less likely to interfere with the structure and the activity of a target protein. However, the presence of epitope tags may still affect the structure [16,17], stability [18] and function [19] of the protein of interest [13]. Besides, the FLAG tag has been found to undergo a posttranslational modification, which disrupts the anti-tag antibody/tag interactions resulting in a decrease in the purification yield [20]. Despite the small size and presumed inertness of epitope tags, the potential negative effects of epitope tags on the protein of interest need to be taken into account. Studying the molecular interaction of an anti-tag antibody/tag complex at the atomic level by X-ray crystallography helps understand, predict and improve the performance of an epitope tag. The development of new epitope tags and epitope engineering remains as an ongoing process in order to offer a greater range of epitope tags with distinct characteristics for researchers to choose from.
In the present study, we produced a recombinant monoclonal antibody named GD-26 in mammalian cells. GD-26 IgG was previously raised by immunising mice with the Haloarcula marismortui bacteriorhodopsin I (HmBRI) D94N mutant expressed in Escherichia coli (E. coli) and purified with detergents [21,22]. Based on the published structure of HmBRI D94N (PDB: 4PXK) [23], the potential epitope sites were predicted to be located at the loop regions or the C terminus (Fig. 1A). The synthetic peptides of these predicted epitopes (Fig. 1A, peptides III-V) were studied by isothermal titration calorimetry (ITC) to determine whether binding took place in the presence of the antigen-binding fragment (Fab) of GD-26. A nine amino acid peptide (GTGATPADD, named the TD peptide) located at the C terminus of HmBRI D94N was determined to be the epitope for GD-26 Fab, exhibiting a high affinity with a nanomolar (nM) dissociation constant KD. The crystal of GD-26 Fab complexed with the TD peptide delivered a high-quality structure at a resolution of 1.45 Å. Further epitope engineering was conducted based on the structure obtained for the GD-26 Fab/TD peptide complex. By fusing the TD peptide to the C terminus of enhanced GFP (eGFP) as a model, the applications of the TD epitope tag were demonstrated by western blotting, ELISA and immunofluorescence, as GD-26 IgG successfully detected TD-tagged eGFP. In summary, the GD-26/TD peptide tagging system offers another tagging option for researchers, as the TD tag adopts a unique 3₁₀ helix structure, and the protocol for the production of recombinant GD-26 IgG and Fab is provided in this report.

Identification of the epitope peptide for GD-26 recognition
The novel monoclonal antibody GD-26, identified as the mouse IgG2a isotype, was generated by immunising mice with HmBRI D94N expressed in E. coli. In this study, GD-26 Fab was expressed in a suspension-adapted HEK293 cell line known as Expi293F. Based on the previously reported structure of HmBRI D94N (PDB: 4PXK) [23] and other microbial bacteriorhodopsins [24,25], it is known that HmBRI D94N is a seven-transmembrane protein whose transmembrane domains are linked by loops that are exposed at the surface. Being exposed to the solvent, the N terminus (Fig. 1A, peptide I), the loop regions (Fig. 1A, peptide II and peptide III) and the C terminus (Fig. 1A, peptide IV and peptide V) are potential hot spots for interactions with GD-26 IgG. Although peptide V is not resolved in the existing HmBRI D94N crystal structure (PDB: 4PXK) [23], it is an extension of peptide IV, which points outwards away from the transmembrane core. Five potential epitope sites were thus highlighted in Fig. 1A because the crystal structure of HmBRI D94N (PDB: 4PXK) [23] suggests that these regions not only extend outwards but also offer sufficient space for antibody binding without steric collisions. Peptide I at the N terminus (Fig. 1A) is so short that it is unlikely to be an epitope for antibody recognition [26]. The hairpin turn region of peptide II (Fig. 1A), which is the most outward-extending part of peptide II, consists of two glycine residues lacking side chains. Therefore, the likelihood of peptide II being a binding site for an antibody is low. Compared with peptides I and II, peptide III (Fig. 1A) is much more likely to be an antibody-binding site due to its length, outward-extending conformation and the presence of polar and hydrophobic residues. The C terminus was split into peptide IV and peptide V (Fig. 1A) because the most commonly found epitope length is around 10 amino acids [26].
Peptide IV exhibits an extended helix conformation consisting of polar and hydrophobic residues, while peptide V contains charged residues. Amino acid residues containing charged and/or hydrophobic side chains are more likely to interact with an antibody. Due to the secondary structure of peptide IV, the three amino acid residues (Glu, Thr and Thr) between peptides IV and V might serve as a linker that creates space around peptide V for antibody binding. Alongside peptide III, peptide IV and peptide V were initially targeted as they showed a better chance of binding. The three potential GD-26-binding sites on HmBRI D94N (Fig. 1A, peptides III-V) were synthesised and investigated by studying their interactions with GD-26 Fab using ITC (Fig. 1B-D). ITC measures the heat changes upon the binding of two molecules by titrating one interaction partner into the other until saturation (Fig. 1B-E: upper panels). These raw data are converted to a binding isotherm (Fig. 1B-E: lower panels), which is fitted to an appropriate binding model to deliver comprehensive thermodynamic information on the studied interaction. The stoichiometry (N-value) is read from the x-axis at the mid-point (inflection point) of the sigmoidal fitting curve. The association constant KA is derived from the slope at the inflection point, and the dissociation constant KD is the reciprocal of KA. While peptide III and peptide IV showed no interaction with GD-26 Fab (Fig. 1B,C), a considerable exothermic reaction (change in enthalpy ΔH = −13.4 kcal·mol⁻¹) was observed upon injecting peptide V into GD-26 Fab (Fig. 1D). This demonstrates that peptide V binds to GD-26 Fab with a strong affinity, as a dissociation constant KD of 19.4 nM was measured with a 1 : 1 stoichiometry (N sites = 0.926). Peptide V (GTGATPADD, named the TD peptide) is thus identified as the epitope peptide on HmBRI D94N for GD-26 recognition. The TD peptide was recognised by GD-26 IgG (Fig. 1E) with a similar affinity (KD = 18.8 nM) but with a 2 : 1 stoichiometry (N sites = 2.27), which reflects the divalency of GD-26 IgG, in contrast to the 1 : 1 stoichiometry generated by monovalent GD-26 Fab.
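The relationships between the ITC-derived quantities described above can be illustrated with a minimal sketch. It takes the reported KD (19.4 nM) and ΔH (−13.4 kcal·mol⁻¹) for peptide V and derives KA, the binding free energy ΔG = RT ln KD and the entropic term −TΔS = ΔG − ΔH. The temperature of 298.15 K is an assumption made for illustration, since the measurement temperature is not stated in this section, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal·mol^-1·K^-1


def binding_thermodynamics(kd_molar, delta_h_kcal, temp_kelvin=298.15):
    """Derive K_A, dG and -TdS from a fitted K_D and a measured dH.

    temp_kelvin defaults to 298.15 K, which is an assumption; the
    experimental temperature is not given in the text.
    """
    ka = 1.0 / kd_molar                                     # K_A is the reciprocal of K_D
    delta_g = R_KCAL * temp_kelvin * math.log(kd_molar)     # dG = RT ln(K_D), kcal/mol
    minus_t_delta_s = delta_g - delta_h_kcal                # from dG = dH - TdS
    return {"K_A": ka, "dG": delta_g, "minus_TdS": minus_t_delta_s}


# Values reported for peptide V binding to GD-26 Fab (Fig. 1D).
peptide_v = binding_thermodynamics(kd_molar=19.4e-9, delta_h_kcal=-13.4)
print(peptide_v)
```

Under the assumed temperature, ΔG comes out at roughly −10.5 kcal·mol⁻¹, so the favourable enthalpy would be partly offset by an unfavourable entropic term of a few kcal·mol⁻¹.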

Overall structure of GD-26 Fab complexed with the TD peptide
To understand the structural basis for the recognition of the TD peptide by GD-26 as detected by ITC, GD-26 Fab was cocrystallised with the TD peptide. The crystal structure of the GD-26 Fab/TD peptide complex was determined by X-ray crystallography, solved by the molecular replacement method, and eventually refined to a resolution of 1.45 Å (Table 1). GD-26 Fab displays a canonical immunoglobulin fold consisting of antiparallel β-sheets, which form the constant and variable domains of the heavy and light chains (VH and VL; Fig. 2A). All the complementarity-determining region (CDR) loops contribute to the formation of the antigen-binding pocket that holds the TD peptide in place (Fig. 2B,C).

Conformation of the TD peptide
The TD peptide found in the GD-26 Fab/TD peptide complex adopts a compact 3₁₀ helix conformation mainly comprising residues Pro6, Ala7 and Asp8 (Fig. 3). This short 3₁₀ helix is formed and stabilised by hydrogen bonds between the carbonyl and amino groups of the backbone. The 3₁₀ helix is buried deepest in the antigen-binding pocket at the VH and VL interface, while the N terminus and the C terminus of the TD peptide both exhibit an extended structure towards the surface of GD-26 Fab (Fig. 2A,C).

Interactions between GD-26 Fab and the TD peptide
In addition to the hydrogen bonds formed by the TD peptide backbone as mentioned above, there is a network of hydrogen bonds contributed by the side chains of the TD peptide and the residues of the surrounding CDR regions (Fig. 4). The side chains of Asp8 and Asp9 of the TD peptide form charge-assisted hydrogen bonds with the side chains of His35 in CDR-H1 and Lys35 in CDR-L1, respectively (Fig. 4A,C). Moreover, the side chain of Thr5 of the TD peptide forms water-mediated hydrogen bonds with the side chains of Trp50 in CDR-H2 and His101 in CDR-L3. The backbone carbonyl groups of Thr2, Gly3, Ala4 and Ala7 of the TD peptide make hydrogen bonds with the side chains of Tyr51, Gln55 and Asn39 in VL and Asn33 in VH. The side chain of Tyr51 in VL is also hydrogen-bonded to the backbone amino group of Ala4 of the TD peptide. Extensive nonpolar interactions (Fig. 4B) also contribute to the high-affinity binding between GD-26 Fab and the TD peptide, along with the hydrogen bond contacts. The position of Pro6 in the TD peptide, the start of the 3₁₀ helix, shows that Pro6 is involved in face-to-face π stacking with Tyr37 and edge-to-face π stacking with Tyr31 in CDR-L1 (Fig. 4A). CH/π interactions are also present between the backbone of the TD peptide and Tyr31, Tyr37, Tyr51 and Phe54 in VL and Trp50 in VH.

Engineering of the TD peptide
Although the preliminary titration of the TD peptide into GD-26 Fab (Fig. 1D) already yielded a full set of thermodynamic parameters, the KD for the TD peptide with GD-26 Fab was determined as 12 ± 2.8 nM (mean ± SEM) from four independent experiments (n = 4; Table 2). Located at the end of the TD peptide, Asp9 is involved in the binding to VL of GD-26 Fab (Fig. 4). Substituting Asp9 with the uncharged isosteric residue Asn resulted in a reduced affinity due to the loss of charge-assisted hydrogen bonds (Table 2, peptide D9N). Removing Asp9 (Table 2, peptide C1) from the TD peptide caused a further loss of interactions with GD-26 Fab, while deleting both Asp9 and Asp8 (Table 2, peptide C2) led to a complete loss of binding to GD-26 Fab. Truncation of the N terminus of the TD peptide improved neither specificity nor binding, as the deletion of both Gly1 and Thr2 (Table 2, peptide N2) caused a substantial increase in the KD from 12 to 548 nM, that is, a marked loss of affinity. Further deletion of Gly3 and Ala4 completely abolished binding to GD-26 Fab (Table 2, peptide N4) even though the C terminus remained intact. The results suggest that the ideal length of the antigen peptide for recognition by GD-26 Fab remains nine amino acids (Table 2, blue zone), ensuring enough effective contacts within the antigen-binding pocket. The residues at the N terminus and the C terminus of the TD peptide show the importance of collective hydrogen bonds and ionic interactions in the binding to GD-26 Fab.
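To put the truncation result on an energetic scale, the change in KD reported above for peptide N2 (12 to 548 nM) can be converted into an approximate change in binding free energy. This is a worked estimate rather than a value from the study: it assumes a temperature of 298.15 K, which is not stated in this section, and it relies on a single measurement (n = 1) for peptide N2.

```latex
\Delta\Delta G
  = RT \ln\!\left(\frac{K_{D,\mathrm{N2}}}{K_{D,\mathrm{TD}}}\right)
  = \left(1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)
    \left(298.15\ \mathrm{K}\right)
    \ln\!\left(\frac{548\ \mathrm{nM}}{12\ \mathrm{nM}}\right)
  \approx 0.592 \times 3.82
  \approx 2.3\ \mathrm{kcal\,mol^{-1}}
```

Under these assumptions, deleting Gly1 and Thr2 would cost roughly 2.3 kcal·mol⁻¹ of binding free energy, corresponding to an approximately 46-fold weaker affinity.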
Based on the structural information on GD-26 Fab complexed with the TD peptide (Fig. 4), we noticed that there was unoccupied space around Ala7. By modelling, amino acids that could potentially replace Ala7 and still fit into the antigen-binding pocket were predicted. Synthetic peptides with the predicted substitutions were titrated against GD-26 Fab by ITC to monitor changes in KD (Table 2, yellow zone). The resulting KD values showed that the substitutions of Ala7 on the TD peptide failed to improve the affinity for GD-26 Fab.
The special position of Pro6 in the centre of the TD peptide was investigated as well. Substitution of Pro6 with either Ala or Gly resulted in a weaker interaction (Table 2, pink zone). Interestingly, when Pro6 was substituted by Phe, which contains a benzene ring, the affinity was partially recovered. This is in agreement with the π-π stacking interactions observed within the antigen-binding pocket, where Pro6 on the TD peptide interacts with Tyr31 and Tyr37 in CDR-L1 through π stacking (Fig. 4A).

Fig. 3. Structure of TD peptide. The TD peptide (GTGATPADD) is shown in stick representation with carbons coloured in pink, oxygens in red and nitrogens in blue. Intrapeptide hydrogen bonds (distance less than 3.5 Å) are indicated by the dashed lines with distances measured. The TD peptide structure was generated by using the UCSF CHIMERA software [57].

A residue of Met was attached to the N terminus of the TD peptide to test the applicability of the TD peptide as an N-terminal epitope tag for protein expression (Table 2, grey zone). The peptide G1M and peptide M1 both showed a strong interaction with GD-26 Fab as indicated by the KD values (46.5 nM and 25.1 nM, respectively), which are comparable with the KD of 12 nM obtained from the TD peptide without the added Met residue, indicating the application of the TD peptide as an N-terminal tag in addition to the C-terminal tag originated from HmBRI D94N. Since the first trials of amino acid replacement stated above did not achieve a greater affinity than the wild-type TD peptide, it was decided not to repeat these trials due to time and cost constraints. Thus, the KD values in Table 2 are from a single experiment (n = 1) without SEM apart from the KD for the wild-type TD peptide.

Applications of the GD-26/TD tagging system
Short epitope peptides and paired monoclonal antibodies have the potential to be developed into a powerful tagging system for protein purification and detection. We tested the feasibility of using the TD peptide and GD-26 IgG as a tagging system. First, a short linker consisting of two glycine residues and the TD peptide were fused to the reporter protein eGFP at the C terminus, while a polyhistidine tag was placed at the N terminus of eGFP. GD-26 IgG was able to recognise not only the synthetic TD peptide (Fig. 1E) but also TD-tagged eGFP expressed in E. coli with a KD value of 129 nM derived from ITC (Fig. 5). Downstream applications of the GD-26/TD tagging system were then studied. Western blotting showed specific and reliable detection of TD-tagged eGFP (~28 kDa) by GD-26 IgG in crude E. coli cell lysates (Fig. 6A). An additional low molecular weight band below TD-tagged eGFP was also detected by GD-26 IgG in Fig. 6A. As mentioned above, a polyhistidine tag was fused to the N terminus of TD-tagged eGFP, and this lower band was no longer observed in western blotting using the anti-polyhistidine antibody for detection (Fig. 6B). Therefore, the additional band below TD-tagged eGFP in Fig. 6A represents an N-terminal degradation product of TD-tagged eGFP whose TD peptide at the C terminus remains intact for detection by GD-26 IgG. Besides cell lysates, as little as 5 ng of purified TD-tagged eGFP was detected by GD-26 IgG using western blotting (Fig. 6C), while eGFP without the TD peptide showed no detection signals as expected (Fig. 6A,C). In addition to western blotting, ELISA revealed that GD-26 IgG was able to detect purified TD-tagged eGFP at picogram level (Fig. 6D). In contrast to wells coated with TD-tagged eGFP, uncoated wells and wells coated with cell lysates of E. coli BL21(DE3) or mammalian Expi293F cells showed negligible absorbance at 450 nm, indicating that detection of TD-tagged eGFP by GD-26 IgG is sensitive and specific. We also generated a construct for expression in human HEK293 cells in which the TD peptide was fused to the C terminus of eGFP. This was to validate whether the GD-26/TD peptide tagging system can be utilised in cell lines that are capable of posttranslational modifications. In immunofluorescence applications on transfected and then PFA-fixed HEK293 cells (Fig. 6E), GD-26 IgG specifically recognised TD-tagged eGFP (Fig. 6E, red) and the signal colocalised with the intrinsic eGFP signal (Fig. 6E, green). eGFP without the TD peptide retained its intrinsic signal in HEK293 cells, but it showed no GD-26 IgG signal as untagged eGFP did not contain the TD peptide tag for GD-26 IgG to bind to. Characterising the interactions between the monoclonal antibody GD-26 and the TD peptide contributes to the employment of this novel peptide tag in scientific research.

Table 2. Changes in the binding to GD-26 Fab due to engineering of TD peptide. Amino acid replacement based on the wild-type TD peptide is indicated in red. The strength of interactions between GD-26 Fab and the engineered peptides is described as KD. The peptides coloured in blue study the minimal and ideal length of an antigen peptide for GD-26 Fab. The peptides coloured in yellow investigate whether minimising the unoccupied space around Ala7 can enhance the binding to GD-26 Fab. The peptides coloured in pink study the importance of π-π interactions at peptide position 6. The peptides coloured in grey test whether the TD peptide can act as an N-terminal tag.

Discussion
As a by-product of developing antibodies against the membrane protein HmBRI D94N, a novel epitope tagging system comprising the TD peptide (GTGATPADD) and the GD-26 monoclonal antibody was established. Taking the archaeal origin of the TD peptide into account, the protein BLAST analysis (Fig. 7) shows a lack of identical sequences within proteins of E. coli and eukaryotic cells, which are common recombinant protein expression systems. This uniqueness makes the TD peptide a potentially advantageous epitope tag. The affinity, specificity and structural basis for the interactions between the TD peptide and GD-26 Fab, as well as the applications of GD-26 IgG and the TD tag, are characterised in the present study.
The crystal structure of GD-26 Fab complexed with the TD peptide revealed that the TD peptide adopted a 3₁₀ helix-like structure (Figs 2 and 3) embedded in a pocket lined by aromatic and side-chain amide residues in the interface between the heavy and light chains on the surface of GD-26 Fab (Fig. 4A). A proline residue is frequently found in the formation of a 3₁₀ helix in a short peptide [27-29]. The unique cyclic conformation of the TD peptide has a propensity to form before encountering GD-26 Fab, attributed to the intrapeptide hydrogen bonds and the presence of proline located at position 6, which is the start of the 3₁₀ helix. Similar to the structure reported in this study, recognition of the CD20 epitope by rituximab [27] shows that CD20 forms a 3₁₀ helix whose proline residue is also present at the turn of the helix and located at the bottom of the binding cleft. Our Fab-peptide complex structure has also complemented an earlier publication, which reports a crystal structure of HmBRI D94N (PDB: 4PXK) with an undetermined C-terminal region due to disorder [23]. The presence of the 3₁₀ helical conformation is critical for the recognition of the TD peptide by GD-26 Fab, as the extended-chain conformation is unlikely to fit into the same binding cleft generated by GD-26 Fab due to shape complementarity and spatial restrictions [30]. The high affinity of this interaction, described by a KD of 12 nM, is explained by the structural basis exhibiting hydrogen bonds and π-stacking (Fig. 4) to anchor the bound peptide ligand. First, we found multiple aromatic residues in the binding cleft, especially Tyr31 and Tyr37 in CDR-L1, which adopt edge-to-face and face-to-face π stacking with the side chain of Pro6 in the TD peptide to stabilise the binding.
Prolines are known to play a role in protein-protein and protein-peptide interactions with aromatic residues from the other molecule via favourable hydrophobic effects, C-H···π and C-H···O interactions [31-34]. Secondly, both the N terminus and C terminus of the TD peptide greatly contribute to the specificity and high affinity for the recognition by GD-26 Fab (Table 2). Aspartic acids are among the most common hydrogen-bond acceptors in charged hydrogen bonds [35,36]. Asp8 and Asp9 in the TD peptide interact with CDR-H1 and CDR-L1, respectively, via charge-assisted hydrogen bonds to dominate the binding. In addition to threonine and aspartic acid residues, the TD peptide also contains two glycine residues at positions 1 and 3. Glycine residues may have the advantage of providing backbone flexibility to the peptide antigen to better satisfy spatial constraints within the binding cleft due to their lack of side chains [35]. As a result, a network of polar and nonpolar interactions gives rise to the compact folding of the TD peptide, the high affinity and the specificity between GD-26 Fab and the TD peptide. In terms of specificity, various amino acid substitutions in the TD peptide retain the ability to interact with GD-26 Fab despite the reduced affinities (Table 2). Substitution of Pro6 with Phe, rather than with Gly or Ala, shows a recovered affinity. This indicates the importance of the π-stacking interactions at position 6 of the TD peptide, while Pro remains the ideal residue at this position in terms of the residue size and side-chain orientation for both edge-to-edge and edge-to-surface π interactions. We have characterised the interaction between the TD peptide and GD-26 Fab both structurally and compositionally.
Compared with the synthetic TD peptide that is freely accessible in solution and binds to GD-26 Fab with a KD of 12 nM (Table 2), TD-tagged eGFP expressed in E. coli binds to GD-26 Fab with a KD of 129 nM (Fig. 5). The decrease in the affinity observed may be caused by the linker between eGFP and the TD peptide. The TD peptide was fused to the C terminus of eGFP with a short linker composed of two glycine residues, which are flexible amino acids due to their lack of side chains. This short linker is intended to provide minimal space for the TD peptide to avoid collision with the linked protein. The accessibility of the TD peptide tag may be hindered by the adjacent amino acids and the protein structure it is inserted into, leading to a reduced affinity. Despite the lower affinity between TD-tagged eGFP and GD-26 Fab, the ELISA results (Fig. 6D) showed that GD-26 IgG was able to detect coated antigen in amounts as low as 1 pg. This strong sensitivity is most likely attributed to the signal amplification from the HRP-conjugated secondary antibody, as well as the avidity arising from the bivalency of GD-26 IgG. When antigens (eGFP) are tethered to a surface in sufficiently close proximity, the two Fab binding sites of GD-26 IgG can be simultaneously occupied, leading to an increase in total binding strength and hence enhanced affinity and stability [37,38] as compared to the monovalent affinity given by GD-26 Fab. The interactions between TD-tagged eGFP and GD-26 IgG are stable enough to withstand extensive washing steps and to be detected in western blotting, ELISA and immunofluorescence (Fig. 6) without cross-reactivity, suggesting the effectiveness of the GD-26/TD peptide tagging system. Importantly, the TD peptide seems compatible with the folding of the tagged proteins, as eGFP retains the intrinsic fluorescence of the chromophore contained within the β-barrel fold (Fig. 6E).
The 3₁₀ helical structure of the TD peptide is involved in the interactions with GD-26 Fab (Fig. 2). Western blotting results on TD-tagged eGFP (Fig. 6A,C) in the presence of GD-26 IgG suggest that after SDS/PAGE, the TD peptide retains its 3₁₀ helical structure for detection by GD-26 IgG. SDS is an anionic detergent that binds to proteins primarily through hydrophobic interactions to denature proteins and make them negatively charged for electrophoresis. (For our sample preparation, we used lithium dodecyl sulfate, which is identical to SDS apart from the counterion.) Proteins treated with SDS adopt an extended form. Instead of unfolding proteins completely, SDS tends to aggregate at hydrophobic sites, generating an extended polypeptide chain containing SDS-coated helical regions connected by uncoated segments [39,40]. Considering that SDS molecules neither bind to nor linearise all regions of a protein uniformly, the TD peptide region is likely to keep its 3₁₀ helical structure in SDS/PAGE, since it seems more difficult for SDS molecules to interact with the TD peptide due to its hydrophilic property and net negative charge. Besides, the proline residue within the TD peptide may also contribute to the conformational rigidity preserving the 3₁₀ helix in the presence of SDS [41]. General requirements for epitope tags are small size, being monomeric, good solubility and stability and, importantly, little structural and functional effect on the fused proteins. Being isolated from the membrane protein HmBRI, the TD peptide has a natural amino acid sequence that is not found in any endogenous proteins of E. coli or mammalian cells (Fig. 7). It is not surprising that the TD peptide has good water solubility, since it originates from the cytoplasmic tail of HmBRI and has a net negative charge at pH 7.0. The TD peptide also shows good cellular protease resistance and proteolytic stability.
Without using protease inhibitors, the TD peptide remains attached and detectable after protein expression and purification from bacterial and mammalian cell systems (Fig. 6). Another advantage of using the TD peptide tag is that GD-26 IgG is specific without cross-reactivity in both bacterial and mammalian lysates (Fig. 6). Besides, GD-26 IgG is stable over the long term (over a year) when stored at 4°C in Tris buffer. The GD-26/TD peptide tagging system possesses all the above-mentioned features, as do existing epitope tags such as the FLAG and c-Myc tags. The affinity of GD-26/TD peptide at nanomolar level is also comparable with that of commonly used epitope tags (Table 3). What makes the TD peptide stand out from other epitope tags is that it is derived from a membrane protein and its interactions with GD-26 IgG are compatible with detergents, similar to the 1D4 tag [42]. The helical propensity of proline has been previously reported to be enhanced in the presence of detergent micelles [43], which may apply to the proline within the TD peptide. Moreover, our TD peptide is one of those rare epitope tags that hold a specific structure in solution (another example being the ALFA tag [2]) rather than the extended conformation found in most of the existing epitope tags including the FLAG, c-Myc, HA and polyhistidine tags.
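The net negative charge of the TD peptide at pH 7.0 noted above can be checked with a rough residue count. The sketch below is a simplification, not a pKa-based calculation: it assigns −1 to Asp/Glu, +1 to Lys/Arg, treats His as neutral, and counts a free N terminus as +1 and a free C terminus as −1. The function name is hypothetical.

```python
# Approximate integer side-chain charges at pH 7 (His treated as neutral).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def approx_net_charge(sequence, free_n_terminus=True, free_c_terminus=True):
    """Estimate the net charge of a peptide at pH ~7 by counting charged groups."""
    charge = sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in sequence.upper())
    if free_n_terminus:
        charge += 1  # protonated alpha-amino group
    if free_c_terminus:
        charge -= 1  # deprotonated alpha-carboxyl group
    return charge


TD_PEPTIDE = "GTGATPADD"

# Free synthetic peptide: both termini are charged and cancel each other.
td_free = approx_net_charge(TD_PEPTIDE)

# As a C-terminal tag the N terminus is in a peptide bond, so only the
# C-terminal carboxylate and the two Asp side chains contribute.
td_as_c_tag = approx_net_charge(TD_PEPTIDE, free_n_terminus=False)

print(td_free, td_as_c_tag)
```

With this approximation the free peptide carries a net charge of about −2, and the peptide contributes about −3 when fused as a C-terminal tag, consistent with the hydrophilic, negatively charged character described in the text.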
Not every epitope tagging system has a solved crystal structure. Crystal structures of the following epitope tags complexed with either Fab or scFv fragments are reported: the c-Myc tag [44], the polyhistidine tag [45], the PA tag [5,46,47], the P20.1 system [48], the SpyTag [49] and the ALFA tag [2] (Table 3). The published structures show that the majority of the epitope peptides adopt an extended conformation. Nevertheless, the PA tag adopts a U-shaped conformation upon binding, and this characteristic allows the PA tag to replace the loop region of the protein of interest so that anti-PA Fab can act as a crystallisation chaperone [5,47].
The ALFA tag forms a stable α-helical structure itself, and ALFA-tagged proteins are recognised by a specialised nanobody for downstream applications such as immunoprecipitations and super-resolution microscopy [2]. One of the most important considerations in choosing epitope tags is the downstream applications of the tagged protein. Recently, grafting the key residues of a donor helix onto an exposed acceptor helix of the target protein has been demonstrated and validated as a powerful tool for X-ray crystallography and electron microscopy in combination with the use of off-the-shelf anti-helix (donor helix) antibodies [50]. This epitope grafting approach facilitates electron microscopy studies by increasing the size of a target protein to over 50 kDa, commonly regarded as the detection limit, without the need to develop target-specific antibodies.
Since the 3₁₀ helix exhibits a rigid conformation for the recognition by GD-26 Fab, we anticipate that our GD-26/TD peptide tagging system may be applied to X-ray crystallography and electron microscopy as a crystallisation chaperone by inserting the TD peptide into the loop/turn region of a target protein. To our knowledge, our GD-26/TD peptide tagging system utilises an intrinsic short 3₁₀ helical structure, not observed in other tags, that remains structured in applications such as SDS/PAGE. We offer a novel epitope tag characterised by a high-affinity interaction between the 3₁₀ helix-forming TD peptide and GD-26 IgG, adding to the current list of choices. As 3₁₀ helices are commonly found in proteins including targets for anticancer [27], antiviral [28,29] and Alzheimer's disease [30] studies, the structural basis for GD-26 Fab complexed with the TD peptide followed by peptide engineering provides a better understanding of this high-affinity interaction and may contribute to other structural studies involving antibodies and 3₁₀ helices.

Expression and purification of GD-26 IgG and GD-26 Fab
The monoclonal antibody GD-26 was generated by immunising mice with HmBRI D94N expressed in E. coli and then sequenced (GenScript, Piscataway, NJ, USA). The plasmid constructs of GD-26 IgG and GD-26 Fab (Fig. 8) were generated based on the previous publication [51] (GenScript) and subcloned into the pRVL-1 plasmid (Addgene, Watertown, MA, USA) [51] digested with KpnI and BamHI restriction enzymes. All plasmids were amplified using the E. coli DH5α strain and purified using the Easy-Prep EndoFree Maxi Plasmid Extraction Kit (BIOTOOLS, New Taipei City, Taiwan). Expi293F cells were transfected following the manufacturer's instructions. To make GD-26 IgG, two plasmids, pRVL1_GD-26 IgG_Heavy Chain and pRVL1_GD-26_Light Chain (Fig. 8), were cotransfected at a 1 : 2 ratio. Likewise, to make GD-26 Fab, pRVL1_GD-26 Fab_Heavy Chain and pRVL1_GD-26_Light Chain (Fig. 8) were cotransfected at a 1 : 2 ratio. Six days after transfection, the medium, into which the antibody and Fab were secreted, was collected. The medium containing GD-26 IgG was diluted with the Tris buffer (50 mM Tris, 150 mM NaCl, pH 7.5) at a 1 : 1 ratio and applied onto a HiTrap Protein G HP column (GE Healthcare, Chicago, IL, USA). GD-26 IgG was eluted with the glycine/HCl buffer (0.1 M, pH 2.5) and dialysed against the Tris buffer for further purification by gel filtration chromatography on a Superdex 200 Increase 10/300 GL column (GE Healthcare) to remove any aggregates. The medium containing GD-26 Fab was diluted with the Tris buffer (50 mM Tris, 150 mM NaCl, pH 7.5) at a 1 : 1 ratio and applied onto a HisTrap Excel column (GE Healthcare). GD-26 Fab was eluted with the Tris buffer supplemented with 500 mM imidazole and then further purified by gel filtration chromatography on a Superdex 200 Increase 10/300 GL column (GE Healthcare).

Expression and purification of TD peptide-tagged eGFP
The eGFP gene encoding residues 1-231 with eight histidines at the N terminus and the TD peptide with a linker consisting of two glycine residues at the C terminus was synthesised and subcloned into the pET11a plasmid digested with NdeI and BamHI restriction enzymes (BIOTOOLS). eGFP was expressed in E. coli BL21(DE3) in Luria-Bertani medium containing 100 µg·mL⁻¹ ampicillin. When the culture reached an OD600 value of 0.6, cells were induced with 1 mM IPTG and incubated overnight at 20°C. Following expression, cell pellets were resuspended in the Tris buffer (50 mM Tris, 500 mM NaCl, pH 7.5) and disrupted by a NanoLyzer N2 homogeniser (Gogene Corporation, Xinpu, Hsinchu County, Taiwan). After ultracentrifugation of the lysates (Beckman Type 45 Ti rotor, Beckman Coulter, Brea, CA, USA) at 186 000 g for 1.5 h at 4°C, the supernatant was applied onto a HisTrap Excel column (GE Healthcare). The column was washed with the Tris buffer supplemented with 20, 50 and 100 mM imidazole and then eluted with 500 mM imidazole. Eluted eGFP was further purified by gel filtration chromatography on a Superdex 200 Increase 10/300 GL column (GE Healthcare) in the Tris buffer (50 mM Tris, 150 mM NaCl, pH 7.5) to remove aggregates and imidazole.

Isothermal calorimetry PEAQ-ITC
The titration experiments were performed with a MicroCal PEAQ-ITC Automated (Malvern Panalytical, Malvern, UK) at 25°C. The ITC sample cell contained 200 µL of GD-26 IgG or GD-26 Fab in the Tris buffer (50 mM Tris, 150 mM NaCl, pH 7.5). From the syringe, 40 µL of synthetic peptides (peptide III-V) or TD-tagged eGFP in the same Tris buffer was injected into the sample cell using 20 injections (0.4 µL for the first injection). Prior to data analysis, heats of dilution generated by injecting the peptide solution into the buffer in the sample cell were subtracted from the experimental data. The experimental raw data were analysed using the MicroCal PEAQ-ITC Analysis software package provided with the instrument. The first data point was removed before fitting the data to a one-set-of-sites model. After the sample concentrations were entered, the built-in fitting function was run by clicking the iteration button until a good fit to the experimental data points of the isotherms was achieved and the chi-squared value no longer decreased. The final fitting parameters, including KD, changes in enthalpy (ΔH), entropy (ΔS) and Gibbs free energy (ΔG), and binding stoichiometry, were generated by the same software.
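The thermodynamic parameters reported by the fitting software are linked by the standard relations ΔG = RT ln(KD) (with KD expressed relative to a 1 M standard state) and ΔG = ΔH − TΔS. The following is a minimal sketch of that arithmetic, not the instrument software's own routine; the function names are hypothetical, and the only values taken from this work are the reported KD of 12 nM and the 25°C experimental temperature.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def delta_g_from_kd(kd_molar: float, temp_kelvin: float = 298.15) -> float:
    """Binding free energy in J/mol from a dissociation constant in mol/L.

    Uses dG = R * T * ln(KD / 1 M); a tighter complex gives a more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R * temp_kelvin * math.log(kd_molar)


def minus_t_delta_s(delta_g: float, delta_h: float) -> float:
    """Entropic term -T*dS in J/mol, obtained from dG = dH - T*dS."""
    return delta_g - delta_h


# KD reported for the TD peptide / GD-26 Fab interaction, measured at 25 degrees C.
dg = delta_g_from_kd(12e-9)
print(f"dG = {dg / 1000:.1f} kJ/mol")
```

For KD = 12 nM at 298.15 K this gives a ΔG of about −45.2 kJ/mol (roughly −10.8 kcal/mol). The entropic term can be obtained with `minus_t_delta_s` once a fitted ΔH is available.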

Crystallisation, data collection, structure determination and refinement
Purified GD-26 Fab (20.8 mg·mL⁻¹) was mixed with the synthetic TD peptide at a molar ratio of 1 : 6 for 1 h at room temperature. The GD-26 Fab/TD peptide complex crystal was grown by mixing 0.3 µL of the mixture with 0.3 µL reservoir solution using the sitting drop vapour diffusion method at room temperature. The crystals were grown at 18°C and obtained in 25% (w/v) PEG 3350, 0.2 M ammonium acetate, and 0.1 M Bis-Tris, pH 5.5.
XRD experiments were carried out at the National Synchrotron Radiation Research Center in Taiwan. GD-26 Fab/TD peptide complex crystals were cryoprotected with 40% PEG 3350 for data collection at cryogenic temperatures. The data set was collected to 1.45 Å resolution by using the NSRRC Taiwan Photon Source beamline TLS-15A1. The diffraction data set was processed by using the software HKL-3000 [52]. The GD-26 Fab/TD peptide crystal belonged to space group P2₁2₁2₁ with unit-cell dimensions of a = 51.43 Å, b = 74.34 Å and c = 108.1 Å. The crystal structure of GD-26 Fab/TD peptide was solved by molecular replacement with the software PHENIX [53] using the structure of PDB ID: 4ZXB [54] as a search model. One Fab molecule was located. The program Coot [55] was used for model building. The peptide antigen model was built according to the GTGATPADD sequence, guided by a 2Fo-Fc map. The structure was refined using REFMAC5 [56], and the refinement gave Rwork and Rfree values of 0.13 [...] protein structures were produced by using the UCSF Chimera software [57].
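As a plausibility check on locating one Fab molecule per asymmetric unit, the unit-cell volume and Matthews coefficient can be estimated from the cell dimensions above. The sketch below is illustrative only: the function names are hypothetical, and the Fab mass of 48 kDa is an assumed typical value, not a figure reported in this work. Space group P2₁2₁2₁ has four asymmetric units per cell, and 1.23 is the constant conventionally used to convert the Matthews coefficient into a solvent fraction for proteins.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Unit-cell volume in cubic angstroms for an orthorhombic cell (all angles 90 degrees)."""
    return a * b * c


def matthews_coefficient(cell_volume: float, asu_per_cell: int,
                         mass_da: float, copies_per_asu: int = 1) -> float:
    """Matthews coefficient Vm in cubic angstroms per dalton."""
    return cell_volume / (asu_per_cell * copies_per_asu * mass_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction from Vm, using the conventional 1.23 constant."""
    return 1.0 - 1.23 / vm


# Cell dimensions reported for the GD-26 Fab/TD peptide crystal (angstroms).
volume = orthorhombic_cell_volume(51.43, 74.34, 108.1)

ASSUMED_FAB_MASS_DA = 48_000  # assumption: typical Fab mass, not taken from this work
ASU_PER_CELL_P212121 = 4      # P2(1)2(1)2(1) has four symmetry-equivalent positions

vm = matthews_coefficient(volume, ASU_PER_CELL_P212121, ASSUMED_FAB_MASS_DA)
print(f"cell volume = {volume:.0f} A^3")
print(f"Vm = {vm:.2f} A^3/Da, solvent fraction = {solvent_fraction(vm):.2f}")
```

Under the assumed mass, one Fab per asymmetric unit gives a Vm of about 2.15 Å³/Da and a solvent fraction of about 0.43, values within the range commonly seen for protein crystals.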

Western blotting
Whole-cell extracts were dissolved in NuPAGE LDS sample buffer (Thermo Fisher Scientific). Volumes of 20, 10, 1, 0.5 and 0.1 µL of the extracts were loaded and fractionated on NuPAGE 12% Bis-Tris gels (Thermo Fisher Scientific) under reducing conditions. Following SDS/PAGE, cell extracts were transferred onto poly(vinylidene difluoride) membranes (Merck Millipore, Burlington, MA, USA). The membranes were blocked with 5% (w/v) nonfat dry milk in the TBST buffer (50 mM Tris, 150 mM NaCl, 1% Tween 20, pH 7.6) for 1 h at room temperature. Following blocking, the membranes were incubated with GD-26 IgG (1 mg·mL⁻¹; 1 : 2000) or antihistidine tag antibody (clone HIS.H8; Merck Millipore; 1 : 2000) for 1 h at room temperature. After washing three times with the TBST buffer for 5 min, the membranes were incubated with a 1 : 100 000 dilution of anti-mouse IgG (whole molecule)-peroxidase antibody (Sigma-Aldrich, St. Louis, MO, USA) for 1 h at room temperature. The membranes were washed three times with the TBST buffer, developed with the chemiluminescent horseradish peroxidase substrate (Merck Millipore) and imaged by iBright Imaging Systems (Thermo Fisher Scientific). The same protocol was applied to western blotting on purified eGFP (5-20 ng).
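The antibody dilutions quoted in these methods can be converted into working concentrations with simple arithmetic. The sketch below assumes that a dilution written as 1 : N means one part stock in N parts final volume; the helper names are hypothetical and the 10 mL final volume in the usage line is only an example.

```python
def working_conc_ug_per_ml(stock_mg_per_ml: float, dilution_factor: float) -> float:
    """Working concentration in ug/mL for a stock in mg/mL diluted 1 : dilution_factor."""
    return stock_mg_per_ml * 1000.0 / dilution_factor


def stock_volume_ul(final_volume_ml: float, dilution_factor: float) -> float:
    """Volume of stock in uL needed to prepare final_volume_ml at 1 : dilution_factor."""
    return final_volume_ml * 1000.0 / dilution_factor


# GD-26 IgG stock at 1 mg/mL, at the dilutions stated for each assay.
for assay, factor in [("western blotting", 2000),
                      ("ELISA", 10_000),
                      ("immunofluorescence", 1000)]:
    conc = working_conc_ug_per_ml(1.0, factor)
    print(f"{assay}: 1 : {factor} -> {conc:g} ug/mL")

# Example: stock needed for 10 mL of the western blotting dilution.
print(f"{stock_volume_ul(10, 2000):g} uL stock in 10 mL")
```

Under this reading, the 1 : 2000 dilution used for western blotting corresponds to 0.5 µg·mL⁻¹ of GD-26 IgG.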

ELISA
Clear flat-bottom Maxisorp Nunc-Immuno plates (Thermo Fisher Scientific) were coated with 100 µL of TD-tagged eGFP ranging from 0.1 µg to 1 pg in the coating buffer (0.1 M Na₂HPO₄, pH 9.6) overnight at 4°C. Following five washes with the PBST buffer (137 mM NaCl, 2.7 mM KCl, 9.7 mM Na₂HPO₄, 2.1 mM KH₂PO₄, 0.1% Tween 20, pH 7.4), the plates were blocked with 200 µL of 5% (w/v) nonfat dry milk in the PBST buffer for 1 h at room temperature. After blocking, the plates were washed five times with the PBST buffer and then incubated with 100 µL of GD-26 IgG (1 mg·mL⁻¹) at a 1 : 10 000 dilution for 1 h at room temperature. After washing five times with the PBST buffer, the plates were incubated with 100 µL of anti-mouse IgG (whole molecule)-peroxidase antibody (1 : 100 000; Sigma-Aldrich) for 1 h at room temperature. The plates were washed five times with the PBST buffer, developed using 100 µL of TMB substrate solution and stopped by 100 µL of 0.5 N HCl solution. Absorbance was read at 450 nm using the Powerwave XS2 Plate Reader (BioTek, Winooski, VT, USA).
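The coating range of 0.1 µg to 1 pg spans five orders of magnitude. The intermediate amounts are not stated here, so the sketch below assumes a ten-fold dilution series purely for illustration; the function names are hypothetical.

```python
def tenfold_series_pg(start_pg: int, end_pg: int) -> list[int]:
    """Ten-fold dilution series in picograms, from start_pg down to end_pg inclusive."""
    if end_pg < 1 or start_pg < end_pg:
        raise ValueError("expected start_pg >= end_pg >= 1")
    series = []
    amount = start_pg
    while amount >= end_pg:
        series.append(amount)
        amount //= 10  # assumed ten-fold step between wells
    return series


def format_amount(pg: int) -> str:
    """Render a picogram amount using the largest convenient unit."""
    if pg >= 1_000_000:
        return f"{pg / 1_000_000:g} ug"
    if pg >= 1000:
        return f"{pg / 1000:g} ng"
    return f"{pg} pg"


# 0.1 ug = 100 000 pg of TD-tagged eGFP per well, diluted down to 1 pg.
coating_amounts = tenfold_series_pg(100_000, 1)
print([format_amount(a) for a in coating_amounts])
```

With a ten-fold step, the stated range corresponds to six coating amounts per series (100 ng, 10 ng, 1 ng, 100 pg, 10 pg and 1 pg).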

Cell culture, transfection and immunofluorescence
The eGFP gene encoding residues 1-231 with eight histidines at the N terminus and the TD peptide with a linker consisting of two glycine residues at the C terminus was optimised for expression in mammalian cells, synthesised and subcloned into the pcDNA3 plasmid digested with KpnI and XbaI restriction enzymes (BIOTOOLS). HEK293 cells were cultured in Dulbecco's modified Eagle's medium (high glucose, pyruvate; Thermo Fisher Scientific) supplemented with 10% FBS (Thermo Fisher Scientific) at 37°C with 5% CO₂. HEK293 cells were seeded onto poly-D-lysine (Thermo Fisher Scientific)-coated coverslips in a 12-well plate at a density of 2.5 × 10⁵ cells per well. After two days of incubation at 37°C with 5% CO₂, HEK293 cells in each well were transiently transfected with 1.6 µg of plasmid DNA using 1.6 µL of Lipofectamine 2000 transfection reagent (Thermo Fisher Scientific) following the manufacturer's instructions. Two days after transfection, cells were rinsed in phosphate-buffered saline (PBS; 3 × 5 min) and fixed for 10 min in 4% (v/v) paraformaldehyde in PBS. Fixed cells were rinsed in PBS (3 × 5 min) and then permeabilised for 20 min in 0.2% (v/v) Triton X-100 in PBS. Permeabilised cells were rinsed in PBS (3 × 5 min) and blocked with 10% (v/v) goat serum (Thermo Fisher Scientific) in the PBT buffer (PBS containing 0.5% (w/v) BSA and 0.1% (v/v) Triton X-100) overnight at 4°C. Following blocking, cells were incubated with GD-26 IgG (1 mg·mL⁻¹; 1 : 1000 in PBT) for 2 h at room temperature. After rinsing for 3 × 5 min in PBS, cells were incubated with Alexa Fluor 568-conjugated goat anti-mouse IgG (H + L) highly cross-adsorbed secondary antibody (Thermo Fisher Scientific; 1 : 200 in PBT) for 1 h at room temperature in the dark.
After removal of the excess secondary antibody by rinsing with PBS (3 × 5 min), coverslips containing immunostained cells were mounted on microscope slides with Vectashield® Vibrance™ antifade mounting media with DAPI (Vector Laboratories, Burlingame, CA, USA). Immunofluorescence images were acquired using a Leica TCS SP5 X confocal microscope (Leica Microsystems, Wetzlar, Germany) at 63× magnification and zoom 2. The excitation wavelengths for DAPI-stained nuclei, eGFP and Alexa Fluor 568-labelled GD-26 IgG were 405, 488 and 561 nm, respectively. Images were processed by IMAGEJ/FIJI software (National Institutes of Health, Bethesda, MD, USA) [58].