Analysis of the crystal structure of a parallel three‐stranded coiled coil

Here, we present the crystal structure of the synthetic peptide KE1, which contains four K‐coil heptads separated in the middle by the QFLMLMF heptad. The structure determination reveals a canonical parallel three‐stranded coiled coil. The geometric characteristics of this structure are compared with those of other coiled coils with the same topology. Furthermore, for this topology, we report an analysis of the propensity of individual amino acids to occupy specific positions in the heptad sequence. A number of viral proteins use specialized coiled coil tail needles to inject their genetic material into host cells. The simplicity and regularity of the coiled coil arrangement make it an attractive system for the de novo design of key molecules in drug delivery systems, vaccines, and therapeutics.


| INTRODUCTION
Seventy years ago, Crick and Pauling, during their preliminary exchange of correspondence on the disputed paternity of the coiled coil model of α-keratin (ropes with two or more α-helical strands), 1 could not have imagined the importance of this protein architecture in nature. It is now recognized that coiled coils play relevant roles in oligomerization, DNA binding and gene regulation, ion transport, structural assembly, and other processes. 2,3 In recent years they have become even more prominent because of their role in the replication of viruses such as HIV 4 and SARS-CoV-2, 5,6 while they are increasingly used in biotechnological applications. 7-10 The first detailed description of the coiled coil assembly was reported in the seminal work of F.H.C. Crick, 11 in which the α-helices wrap around each other, with their side chains packed in a "knobs-into-holes" manner. The two-stranded coiled coil represents the most frequent oligomerization motif encountered in proteins. 12 However, trimeric, 13 tetrameric, 14 pentameric, 15 and hexameric 16 assemblies have been reported, for example, in viral proteins, while heptameric coiled coils have also been characterized. 17 The structural aspects of coiled coil assemblies have been extensively covered in the literature. 3,18 Briefly, the non-integral periodicity (3.63 residues per turn) of the ideal α-helix does not permit a side-by-side arrangement of undistorted α-helical strands. However, in the canonical coiled coil, left-handed super-coiling of the α-helices reduces the periodicity to 3.5 residues per turn when viewed from the perspective of the bundle axis. Therefore, with regard to this super-helix axis, the "knobs-into-holes" scheme is replicated every seven residues of each α-helix. The peptide sequence is characterized by repeated heptads with positions conventionally labeled a, b, c, d, e, f, and g.
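The arithmetic behind this periodicity argument can be checked with a short sketch. It uses only the two values quoted above (3.63 and 3.5 residues per turn). The variable and function names are illustrative, and reading the angular shortfall as the amount that super-coiling must compensate is the standard geometric interpretation rather than a result of this work.

```python
# Residues per turn quoted in the text
IDEAL_HELIX = 3.63   # undistorted alpha-helix
COILED_COIL = 3.5    # helix viewed relative to the bundle axis
HEPTAD = 7           # residues in one heptad repeat


def heptad_rotation(residues_per_turn, n_residues=HEPTAD):
    """Total rotation in degrees accumulated over n_residues."""
    return n_residues * 360.0 / residues_per_turn


ideal = heptad_rotation(IDEAL_HELIX)    # rotation of an undistorted helix per heptad
coiled = heptad_rotation(COILED_COIL)   # rotation per heptad relative to the bundle axis
shortfall = coiled - ideal              # angle made up by the super-coiling

print(f"undistorted helix, one heptad: {ideal:.1f} degrees")
print(f"coiled coil, one heptad:       {coiled:.1f} degrees")
print(f"shortfall per heptad:          {shortfall:.1f} degrees")
```

With 3.5 residues per turn, seven residues complete exactly two turns, so the side-chain packing repeats every heptad; an undistorted helix falls roughly 26° short of two turns over the same seven residues.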
Generally, coiled coil structures of parallel amphipathic α-helices have a continuous nonpolar core formed by the side chains of hydrophobic residues located in the a and d positions, while the residues in positions e and g, which lie alongside this hydrophobic core, are typically charged residues involved in intra- or inter-helical salt bridge interactions. 19 The remaining positions (b, c, and f), located on the solvent-exposed surface of the coiled coil, are generally occupied by hydrophilic residues. It is well known that a subtle mechanism, related to the presence of specific buried hydrophobic residues at the a and d heptad positions, selects for the formation of two-, three-, or higher-stranded helical assemblies. 20 Significant research effort has been dedicated to the de novo design of peptides that are able to form homomeric or heteromeric coiled coils. 21-23 In particular, two sequences named K-coil (KVSALKE) and E-coil (EVSALEK), with complementary charges in positions e, f, and g of the heptad (the sequence starts with the residue in position g of the heptad for historical reasons), were designed for the formation of heterodimeric coiled coils. 24,25 In detail, the K4 peptide, which consists of four copies of the K-coil heptad sequence, forms coiled-coil heterodimers with the complementary E4 peptide, while the K4 and E4 peptides are random coils on their own. 25,26 This opened the possibility of relevant biotechnological applications. For example, the peptide formed by five K-coil heptads, K5, and its partner E5 were used in the multimerization of adenovirus serotype 3 fiber knob domains, with potential application in the therapy of epithelial cancer and in gene/drug delivery to normal epithelial tissues. 27 Recently, it has been shown that the K5 peptide, when fused with proteins, functions as a cell-penetrating peptide to enhance the intracellular delivery of proteins. 28
During our work with the synthetic peptide KE1, which contains four K-coil heptads separated in the middle by a phage-display-selected sequence for xanthine recognition (the QFLMLMF heptad), we characterized the X-ray structure of this peptide at atomic resolution.
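As an illustration of the heptad register described above, the following sketch labels each residue of a KE1-like sequence with its heptad position. The 35-residue sequence is an assumption: it is assembled here by concatenating two K-coil heptads, the QFLMLMF heptad, and two more K-coil heptads, and it may differ from the crystallized construct (for example, in terminal capping). The function names are likewise illustrative.

```python
K_COIL = "KVSALKE"
CENTRAL = "QFLMLMF"

# Assumed sequence: built from the description in the text, not taken from the deposited entry
KE1_ASSUMED = K_COIL * 2 + CENTRAL + K_COIL * 2

# The K-coil heptad is written starting from position g
REGISTER = "gabcdef"


def assign_register(sequence, register=REGISTER):
    """Return (residue number, amino acid, heptad position) triples, numbered from 1."""
    return [(i + 1, aa, register[i % len(register)])
            for i, aa in enumerate(sequence)]


def residues_at(sequence, positions):
    """Return (residue number, amino acid) pairs found at the given heptad positions."""
    return [(n, aa) for n, aa, pos in assign_register(sequence) if pos in positions]


core_a = residues_at(KE1_ASSUMED, "a")
core_d = residues_at(KE1_ASSUMED, "d")
flanking = residues_at(KE1_ASSUMED, "eg")

print("a positions:", core_a)
print("d positions:", core_d)
print("e/g positions:", flanking)
```

Under this assumed sequence, the numbering is consistent with the residue assignments given in the Results (Phe16 at position a, Met20 at position e, Gln15 at position g), with leucine at every d position.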
Here, we present the details of the homotrimeric parallel coiled-coil structure observed for this peptide and compare it with artificial and natural proteins in order to characterize the structural determinants that produce this specific coiled coil assembly.

| MATERIALS AND METHODS
[...] was observed in (NH4)2SO4/PEG400 w/v based conditions. These crystals diffracted at very low resolution (7-9 Å). Using the stock protein solution without theophylline and optimizing the crystallization conditions with the hanging-drop method, single pyramidal crystals grew within 7 days in 0.1 M NaHepes pH 7.5, 1.8-2.5 M (NH4)2SO4 and 1.5-3% w/v PEG400 to dimensions of 0.4 × 0.2 mm. X-ray diffraction experiments were carried out at the XRD1 beamline of the Elettra Synchrotron. Crystals were harvested into mother liquor supplemented with 30% glycerol and flash-frozen at 100 K. Diffraction data were collected at a wavelength of 1.00 Å on a MARCCD (345 mm) image plate detector.
Indexing, integration of reflection intensities, and data scaling (Table S1) were performed using SCALA 29 from CCP4i. 30 The crystals [...] 31 (Table S1).
The structure of KE1 was solved from the twinned data by molecular replacement with AMoRE, 33 starting from a 35-residue polyalanine α-helical model. Merohedral twinning 34 with a twin fraction of 46% was treated using the twin law -k, -h, -l. In the first cycles of model building with Coot 35 and refinement with REFMAC5, 36 the poly-Ala model was mutated to follow the KE1 sequence. Final twin refinement was conducted using SHELXL. 37 Final Rwork and Rfree values are reported in Table S1.
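To make the twin treatment concrete, the sketch below applies the twin law -k, -h, -l to a Miller index and models a twinned intensity as the weighted sum of the two twin-related reflections, using the 46% twin fraction quoted above. The mixing formula is the standard description of merohedral twinning, not code taken from the refinement programs; the function names and the example intensities are hypothetical.

```python
TWIN_FRACTION = 0.46  # twin fraction reported in the text


def twin_mate(hkl):
    """Apply the twin law -k, -h, -l to a Miller index (h, k, l)."""
    h, k, l = hkl
    return (-k, -h, -l)


def twinned_intensity(i_hkl, i_mate, alpha=TWIN_FRACTION):
    """Observed intensity of a reflection in a merohedral twin with twin fraction alpha."""
    return (1.0 - alpha) * i_hkl + alpha * i_mate


def detwin(obs_hkl, obs_mate, alpha=TWIN_FRACTION):
    """Recover the untwinned intensities of a twin-related pair.

    The division by (1 - 2*alpha) amplifies measurement errors as alpha approaches 0.5.
    """
    denom = 1.0 - 2.0 * alpha
    i_hkl = ((1.0 - alpha) * obs_hkl - alpha * obs_mate) / denom
    i_mate = ((1.0 - alpha) * obs_mate - alpha * obs_hkl) / denom
    return i_hkl, i_mate


# Hypothetical untwinned intensities for a reflection and its twin mate
true_hkl, true_mate = 100.0, 40.0
obs_hkl = twinned_intensity(true_hkl, true_mate)
obs_mate = twinned_intensity(true_mate, true_hkl)

print("twin mate of (1, 2, 3):", twin_mate((1, 2, 3)))
print("observed pair:", obs_hkl, obs_mate)
print("detwinned pair:", detwin(obs_hkl, obs_mate))
```

With a twin fraction this close to 0.5, the two observed intensities are nearly equal and the denominator in `detwin` is small, which illustrates why such data are commonly refined against a twin model rather than detwinned beforehand.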
HELANAL 38 was used to calculate the helical geometry of KE1.
The same analysis was carried out for the structural homologs found in the PDB, belonging to the most representative viral fusion coiled coils and identified by the DALI server 39 (Table S2). The list of trimeric coiled coil structures (Table S3) was obtained by searching the CC+ database for the parallel, homo-trimeric fold with sequence length > 28 residues. 40 The resulting assemblies were then analyzed with the CCCP (Coiled-coil Crick Parameterization) tool 41 (Table 1), and the distribution of the amino acids in the coiled coil positions was determined (Table 2).
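The tabulation of amino acids by heptad position can be sketched as follows. This is not the procedure used to build Table 2; it is a minimal illustration that assumes each sequence is supplied together with a register string of the same length (one letter a-g per residue). The function names and the toy input are hypothetical, and the 10% cutoff mirrors the threshold mentioned for Table 2.

```python
from collections import Counter, defaultdict


def position_frequencies(entries):
    """Compute amino acid percentages per heptad position.

    entries: iterable of (sequence, register) pairs of equal length.
    Returns {heptad position: {amino acid: percentage}}.
    """
    counts = defaultdict(Counter)
    for sequence, register in entries:
        if len(sequence) != len(register):
            raise ValueError("sequence and register must have equal length")
        for aa, pos in zip(sequence, register):
            counts[pos][aa] += 1
    freqs = {}
    for pos, counter in counts.items():
        total = sum(counter.values())
        freqs[pos] = {aa: 100.0 * n / total for aa, n in counter.items()}
    return freqs


def abundant(freqs, threshold=10.0):
    """List, per position, the residues whose percentage exceeds the threshold."""
    return {pos: sorted(aa for aa, f in table.items() if f > threshold)
            for pos, table in freqs.items()}


# Toy input: two K-coil heptads with the register starting at g (not database entries)
toy = [("KVSALKE" * 2, "gabcdef" * 2)]
freqs = position_frequencies(toy)

for pos in "abcdefg":
    print(pos, freqs[pos])
```

Grouping the resulting positions into core (a, d), flanking (e, g), and surface (b, c, f) classes gives the layout described for Table 2.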
PyMOL was used to prepare the figures. 42

| RESULTS AND DISCUSSION
The crystal structure of KE1 (PDB code 3TQ2) was solved at a near-atomic resolution of 1.1 Å. Data reduction and refinement statistics are listed in Table S1. The asymmetric unit of the merohedrally twinned trigonal (P3) crystals consists of one peptide chain, a sulfate ion, which forms a salt bridge with Lys27 (2.73 Å), and 29 water molecules. The five heptads of KE1 fold as a single 3.6₁₃ α-helix (Figure 1A) with typical intra-helical hydrogen bonds. Interestingly, the central QFLMLMF heptad, selected for xanthine recognition, maintains the α-helical fold of the four K-coil heptads. This was unexpected because this heptad was selected for its binding abilities rather than its helical propensity. In fact, based on the initial design, 43 we hoped that this region would adopt a loop structure for recognition of xanthine molecules between the aromatic residues. Instead, the hydrophobic heptad inserted in this region led to the formation of a regular α-helix, which could be responsible for the poor and ambiguous results obtained in the recognition studies of KE1. This fold is stabilized by an interaction between the side chains of Phe16 (position a) and Met20 (position e), already reported as a stabilizing factor of an α-helical peptide via shared rotamer preferences. 44 The helix geometry, calculated by HELANAL, 38 [...] Curved helices are more abundant in natural proteins than kinked helices and the even rarer linear helices. 38 The curvature of the helices is important for the efficient knobs-into-holes packing of helices, and it allows the sequestration of both anions 45 and cations 46 into the hydrophobic core. The importance of the flexibility and adaptability of helices has been recognized in the metal-assisted stabilization of collagenous triple helices 47 and in the metal-dependent catalytic activity of designed three-stranded coiled coils. 48
Flexibility is clearly an advantage for viral fusion domains during the transition from the pre- to the post-fusion arrangement. In fact, class I viruses use α-helix-rich domains to fuse their membrane with the host-cell membrane and deliver the viral genome into the host cytoplasm for replication. 49 For instance, the SARS-CoV-2 Spike pre-fusion protein undergoes a conformational change, triggered by a cleavage event at the cell surface, that generates the post-fusion arrangement. 50 Similarly, the inactive HIV gp41-gp120 domain becomes active when the core of gp41 folds as a coiled coil. 51 An analysis of the KE1 helix deformation in comparison with some of the most representative structural homologs, found in the PDB with the DALI server and belonging to viral fusion proteins, is reported in Table S2.
The crystallographic threefold axes observed in the crystal structure of KE1 produce a quaternary structure characterized by parallel three-stranded coiled coils (Figure 1B). Positions a in the trimeric assembly of the K-coil heptads are occupied by valines, whose side chains interact in the hydrophobic core. In contrast, in the central heptads the same positions are occupied by Phe16 residues, whose aromatic groups point toward the external surface. The second core positions, d, crucial for the formation of the hydrophobic core, are occupied in all five heptads by leucine residues whose side chains are tightly packed. Positions e and g are occupied by Lys residues in the K-coil heptads, while the central heptads have Met20 and Gln15 residues, respectively (Figure 1B). Due to the trigonal symmetry and cell translation, KE1 assembles as a super-helix that develops along the c axis, with the C-terminus of one helix adjacent to the N-terminus of the equivalent helix generated by the threefold rotation and cell translation, forming a pseudo-continuous α-helix (Figure 1C). Therefore, the super-helix has a periodicity of about 150 Å, which corresponds to three times the c-axis dimension. Each peptide wraps around the axis of the super-helix by 120°, with a contribution of 24° from each heptad. Although sterically and electrostatically unfavored with respect to the antiparallel fold, 52,53 parallel coiled coils are found in many natural and artificial proteins. 3 For example, a similar crystallographic arrangement was observed for coil-VaLd, a 29-residue designed trimeric coiled coil, 54 while a three-helix parallel coiled coil arrangement has been described for the phage P22 cell envelope-penetrating tail needle gp26 55 (Figure 1D). Moreover, because of its similarity to the class I fusion peptides, this specific arrangement has acquired importance in biomedicine.
An example is given by the use of N3G, which mimics the inner core of post-fusion gp41 and has been designated as a promising therapeutic against HIV infection. 56 This peptide was also highly effective in inhibiting infection by human β-coronaviruses, possibly by binding the HR2 region of the spike protein and blocking formation of their hexameric structure. 56 The KE1 arrangement is remarkably similar to the fusion domain of the Avian Retrovirus 13 (Figure 1E). The two structures overlap with an RMSD value of 0.98 Å.

TABLE 1 Crick parameters calculated with the CCCP tool for KE1 and for the 50 parallel trimeric coiled coil assemblies found in the CC+ database.

TABLE 2 Frequency and distribution of the specific amino acids on the hydrophobic (a, d, grey), charged (e, g, green), and hydrophilic (b, c, f, yellow) coiled coil positions obtained from the 50 parallel trimeric coiled coil assemblies found in the CC+ database.
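The super-helix geometry reported for KE1 (a periodicity of about 150 Å, equal to three times the c axis, with each peptide wrapping 120° around the super-helix axis) can be cross-checked with a few lines of arithmetic. The sketch assumes that each c-axis translation is spanned by exactly five heptads (35 residues) of the pseudo-continuous helix; the derived rise per residue is therefore an estimate under that assumption, not a refined parameter.

```python
SUPERHELIX_PITCH = 150.0       # Å, approximate periodicity quoted in the text
PEPTIDES_PER_TURN = 3          # the pitch is three times the c axis
ROTATION_PER_PEPTIDE = 120.0   # degrees of super-helical rotation per peptide
HEPTADS_PER_PEPTIDE = 5
RESIDUES_PER_HEPTAD = 7        # assumption: 35 residues span one c-axis translation

# Length of the c axis implied by the quoted pitch
c_axis = SUPERHELIX_PITCH / PEPTIDES_PER_TURN

# Super-helical rotation contributed by each heptad
rotation_per_heptad = ROTATION_PER_PEPTIDE / HEPTADS_PER_PEPTIDE

# Heptads and residues needed for one full super-helical turn
heptads_per_turn = 360.0 / rotation_per_heptad
residues_per_turn = heptads_per_turn * RESIDUES_PER_HEPTAD

# Average advance per residue measured along the super-helix axis
rise_per_residue = SUPERHELIX_PITCH / residues_per_turn

print(f"c axis:              {c_axis:.1f} Å")
print(f"rotation per heptad: {rotation_per_heptad:.1f} degrees")
print(f"residues per turn:   {residues_per_turn:.0f}")
print(f"rise per residue:    {rise_per_residue:.2f} Å")
```

The 24° per heptad quoted in the text follows directly from dividing the 120° rotation of one peptide by its five heptads, and fifteen heptads complete one full turn of the super-helix.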
With respect to other coiled coils, the parallel trimeric arrangement is less frequent in nature. Fifty parallel trimeric coiled coil structures with sequence length > 28 residues found in the CC+ database 40 were analyzed for comparison with KE1 (Table S3) [...] Table 2. Those residues with abundance >10% are highlighted and [...]

ACKNOWLEDGMENT
We thank Prof. F. Berti for the chemical synthesis.

CONFLICT OF INTEREST STATEMENT
The authors have declared no conflict of interest.

PEER REVIEW
The peer review history for this article is available at https://www.webofscience.com/api/gateway/wos/peer-review/10.1002/prot.

DATA AVAILABILITY STATEMENT
All data supporting this study are available from the corresponding authors upon request.