Computational backbone design enables soluble engineering of transferrin receptor apical domain

Iron is supplied to human cells by the iron carrier protein transferrin and its receptor, which upon complex formation are internalized by endocytosis. Iron must similarly be delivered to the brain, which requires transport across the blood‐brain barrier. While there are still unanswered questions about these mechanisms, extensive efforts have been made to use the system for delivery of therapeutics into biological compartments. The dimeric form of the receptor, in which each subunit consists of three domains, complicates detailed investigation of the molecular determinants that guide the receptor's interactions with other proteins. The biological function of the apical domain in particular has remained elusive. To further the study of the transferrin receptor, we have computationally decoupled the apical domain for soluble expression and validated the design strategy by structure determination. Besides presenting a methodology for solubilizing domains, the results will allow the function of the apical domain to be studied.


| INTRODUCTION
Transferrin receptor 1 (TfR), together with its ligand transferrin (Tf), supplies cells with iron and is responsible for maintaining physiological levels of iron in humans. 1,2 The receptor is a homodimer in which each subunit consists of helical, protease-like, and apical domains. 3 The dimeric interface is formed by the helical domain of one subunit and the protease-like domain of the second subunit, with the functional interface responsible for binding Tf situated at the helical domains. Upon complex formation, the Tf/TfR system is internalized into cells via receptor-mediated endocytosis. The interaction between the receptor and Tf is fine-tuned by another protein, the hemochromatosis protein (HFE), which keeps the intracellular concentration of iron at physiologically relevant levels. [4][5][6] Both the helical and protease-like domains consequently have biological relevance for either iron delivery or receptor structural cohesion. This does not seem to be the case for the apical domain, although it has been implicated in binding and internalization of the iron carrier ferritin. 7 The recently solved structure of the complex between human ferritin and TfR 8 will support further investigation of the detailed molecular mechanism.
Experimentally determined structures of TfR have been reported both in apo form and in complex with different ligands. 9 These have identified important features of the molecular determinants of interaction between the different receptor domains and their protein binders. Tf and HFE, for example, partially overlap in their binding surfaces, which identifies residues that are important for their interactions with the receptor. 10,11 The central role of TfR and Tf/HFE in iron regulation has led to evolutionary pressure to conserve residues essential for interface formation across mammalian species.
On the other hand, despite the lack of clear functional relevance of the apical domain, its surface has been exploited for binding by exogenous opportunistic pathogens. The domain is used by viruses and parasites to enter cells. Machupo virus, for example, uses its glycoprotein 1 (MGP1) to form a complex with the apical domain. 12 Similarly, the malaria parasite Plasmodium vivax forms the majority of its interactions with the apical domain, in addition to interacting with the protease-like domain of the receptor. 13 The evolutionary pressure for residue conservation in the apical domain is much lower; therefore, exogenous agents often exhibit a clear species preference. 9 The receptor's ability to internalize large protein molecules has found many biotechnological applications, in which efforts have been made to transport small drugs, proteins, modified virus capsids, and nanocarriers across cell membranes. 14

| Computational rebuilding of Tf receptor apical domain
The apical domain of the Tf receptor starts at Gln 197 and ends at Ser 378, as determined by visual assessment of the experimental structure, PDB ID: 3kas. 12 Residues Tyr 219 and Tyr 222, large aromatic amino acids that contribute substantially to a hydrophobic patch interacting with the protease-like domain, were allowed to be designed to polar amino acids in an effort to prevent aggregation. The domain was additionally trimmed between residues Phe 297 and Ile 323, which form an unstructured, exposed loop interacting with the helical domain in the opposite subunit of the homodimer.
Additionally, Phe 297 was fixed to its native identity during the design process, and Phe 298 was allowed to be designed to polar residues. The latter forms an extension of the hydrophobic surface together with the two abovementioned tyrosine residues, Tyr 219 and Tyr 222. Positions Ser 327 and Leu 329 were each fixed to lysine, as they had been found to be solvent exposed in initial loop-modeling trials. The rebuilt loop had a total length of 7 amino acids and was modeled with the RosettaRemodel 20 application of the Rosetta 19 software suite, using the talaris2013 energy function. After limited screening of loop conditions, the final modeling consisted of 100 trajectories. This limited number of trials was sufficient because of the fixed amino acid positions and the short length of the loop. The resulting models were sorted by total energy, and the lowest energy models were visually inspected for optimal backbone conformations. Finally, the sequence from the next best model was chosen, which resulted in the AP01 design.
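The ranking step described above (sort the models by total energy, then shortlist the lowest energy ones for visual inspection) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' script: the two-column table layout, the column names `total_score` and `description`, the model names, and all energy values are assumptions made up for the example.

```python
from typing import List, Tuple


def rank_models(score_text: str, top_n: int = 5) -> List[Tuple[str, float]]:
    """Return up to top_n (model name, total energy) pairs, lowest energy first.

    score_text is assumed to be a whitespace-delimited table whose first
    non-empty line is a header containing 'total_score' and 'description'.
    """
    rows = [line.split() for line in score_text.strip().splitlines() if line.strip()]
    header, records = rows[0], rows[1:]
    name_col = header.index("description")
    energy_col = header.index("total_score")
    scored = [(rec[name_col], float(rec[energy_col])) for rec in records]
    # Lower (more negative) total energy is treated as better.
    return sorted(scored, key=lambda pair: pair[1])[:top_n]


# Made-up energies for four hypothetical loop models (not data from the study).
EXAMPLE_SCORES = """
total_score description
-312.4 model_0003
-305.9 model_0001
-318.7 model_0002
-299.2 model_0004
"""

if __name__ == "__main__":
    for name, energy in rank_models(EXAMPLE_SCORES, top_n=3):
        print(f"{name}\t{energy:.1f}")
```

The shortlist produced this way is only a starting point; as in the text, the final choice still depends on visual inspection of the backbone conformations.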

| Expression and purification of AP01 and AP02
AP01 and AP02 were cloned into pET29b(+) at the NdeI and XhoI restriction sites. NovaBlue competent cells were used for plasmid amplification; a heat-shock protocol was used for transformation with the AP01/pET29b(+) plasmid. The gene insert was confirmed by colony PCR and DNA sequencing (Mix2Seq kit, Eurofins, Luxembourg). Tuner (DE3) competent cells (Invitrogen) were used for subsequent protein expression. An overnight culture of the confirmed clone was diluted in LB to OD600 = 0.1 and grown at 37 °C with shaking at 180 rpm. When OD600 reached 0.6, the temperature was lowered to 20 °C, and protein expression was induced with 0.2 mM IPTG for 18 hours. Cells were collected at 8000g for 10 minutes. Each gram of cell pellet was resuspended in 5 mL of 100 mM HEPES, 500 mM NaCl, pH 7.4, and sonicated four times at 40% amplitude for 20 seconds, with a 20-second pause between pulses, using a Libra Cell 100 sonicator. Lysed cells were centrifuged at 15000g for 30 minutes. The supernatant was filtered through a 0.22 μm filter and loaded on a HisPur Cobalt resin column (Thermo Scientific) preequilibrated with 100 mM HEPES, 500 mM NaCl, pH 7.4. The column was washed with 100 mM HEPES, 500 mM NaCl, 10 mM imidazole, pH 7.4, and eluted with 100 mM HEPES, 500 mM NaCl, 500 mM imidazole, pH 7.4, according to the protocol. Eluates containing protein of the correct size were confirmed by SDS-PAGE, concentrated, and further purified by size exclusion chromatography (SEC) over a HiLoad 16/60 Sephacryl S100HR column. Eluted protein was stored in 20 mM HEPES, 150 mM NaCl, and 10% glycerol. AP01 was analyzed by ESI-MS over an HPLC C3 column (Agilent ZORBAX 300 SB-C3, 4.6 × 50 mm, 5 μm) according to a protocol described previously. 22
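Two quantities in this protocol scale with the size of the preparation: the volume of overnight culture needed to reach OD600 = 0.1, and the volume of lysis buffer at 5 mL per gram of pellet. The following sketch shows the arithmetic. It assumes that OD600 scales linearly on dilution (C1·V1 = C2·V2), which in practice holds only for readings taken in the linear range of the spectrophotometer; the function names and the example numbers are hypothetical and not taken from the study.

```python
def inoculum_volume_ml(overnight_od: float, final_volume_ml: float,
                       target_od: float = 0.1) -> float:
    """Volume of overnight culture (mL) that gives target_od in final_volume_ml.

    Assumes OD600 is proportional to cell density, so C1 * V1 = C2 * V2.
    """
    if overnight_od <= target_od:
        raise ValueError("overnight culture must be denser than the target OD600")
    return target_od * final_volume_ml / overnight_od


def lysis_buffer_volume_ml(pellet_mass_g: float, ml_per_gram: float = 5.0) -> float:
    """Lysis buffer volume (mL) at the stated ratio of 5 mL per gram of pellet."""
    if pellet_mass_g < 0:
        raise ValueError("pellet mass cannot be negative")
    return pellet_mass_g * ml_per_gram


if __name__ == "__main__":
    # Hypothetical example: overnight culture at OD600 = 4.0, 1 L expression culture.
    print(f"inoculum: {inoculum_volume_ml(4.0, 1000.0):.1f} mL")
    # Hypothetical example: 3.2 g of wet cell pellet.
    print(f"lysis buffer: {lysis_buffer_volume_ml(3.2):.1f} mL")
```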

| Structure determination
Additional purification of AP01 was carried out for protein preparations for structure determination. AP01 expressed as described above was purified by Ni-NTA affinity chromatography, and collected frac-
No electron density was observed for the last 13 residues in both chains. Moreover, the density for residues 25-27 and 97-99 in chain A, as well as for residues 102-107 in chain B, was very poor. However, all of these residues were modeled on their NCS counterparts, which were visible. Three sodium ions were included in the model.
Details of data collection, processing, and refinement are summarized in Table 1.

| RESULTS AND DISCUSSION
In this report, we present the design of a solubilized apical domain derived from the human Tf receptor (Figure 1A). Computational protein engineering was carried out to trim the domain and to improve protein solubility by redesigning unstructured loop regions. Furthermore, the hydrophobic surface patches that resulted from decoupling the domain from the receptor protease-like domain were optimized to minimize aggregation by designing them to polar amino acids. In total, two designs were built, AP01 and AP02, with the major difference being in the redesigned loop region between residues Met 283 and Ile 329. This loop region, besides contacting the protease-like domain, naturally extends toward the opposite subunit of the TfR dimer. Both AP01 and AP02 were expressed in bacteria, with AP01 being the more soluble design. To validate our design strategy and confirm the receptor remodeling, we determined the structure of AP01 by X-ray crystallography.

| Computational design of AP01 and bacterial expression
The redesigned apical domain of the receptor was built with the RosettaRemodel application, which is part of the Rosetta protein engineering suite. The application allows for easy access to algorithms to vary protein backbone lengths, in addition to protein sequence optimization (Figure 1B). To decouple the apical domain, we had to carry out

| Assessment of the AP01 experimental structure
To evaluate our design strategy and confirm the structural cohesion of AP01, we experimentally determined its structure. The overall structure of AP01 recapitulated the core of the apical domain well, with a Cα RMSD of 1.2 Å (calculated over 134 residues not part of the restructured loop; Figure 3A). The AP01 design did contain two major
[Figure caption, fragment: Additional trimming resulted in a less defined energy landscape with fewer solutions (loop length of 11). The 12-residue loop was also shown to be sensitive to amino acid changes, which allowed for design; loop length = 12 (1) and 12 (2) describe two different loop design strategies. The loop sequence is defined by the box insert, where X denotes a position allowed to design to any amino acid.]
defined folding funnels. In this way, the loop flexibility will eventually decrease due to the restrictions in available degrees of freedom. The lower limit found to be able to connect the loop backbones during the rebuilding process was as low as 11 amino acids. In addition, to allow for a limited design process, three residue positions C-
optimization resulted in the second design, AP02, which was trimmed by 9 positions. It was also soluble, but seemed more prone to aggregation. When compared with the known interaction partners of the TfR apical domain, AP01 retained most of the structural determinants responsible for complex formation, except for the salt bridge between Glu 294 and the lysine residue in the binding partner. The designed proteins, AP01 and AP02, are thus relevant starting points for further binding optimization, for example, toward pathogenic epitopes in an effort to devise future protein therapeutics.
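The Cα RMSD quoted in this section is computed over core residues only, with the restructured loop left out. A minimal sketch of that kind of calculation is given below. It assumes the two structures have already been superposed and share a residue numbering; the superposition itself (for example by a least-squares fit) is not shown. The function name, the dictionary layout, and the toy coordinates are hypothetical and are not taken from the deposited structure.

```python
import math
from typing import Dict, Iterable, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(ref: Dict[int, Coord], model: Dict[int, Coord],
            exclude: Iterable[int] = ()) -> Tuple[float, int]:
    """RMSD (in the units of the input) over shared C-alpha atoms.

    ref and model map residue number -> (x, y, z) of the C-alpha atom and are
    assumed to be superposed already. Residues listed in exclude (for example
    a rebuilt loop) are skipped. Returns (rmsd, number of residues used).
    """
    skipped = set(exclude)
    shared = sorted((set(ref) & set(model)) - skipped)
    if not shared:
        raise ValueError("no residues in common after exclusion")
    squared_sum = sum(
        (a - b) ** 2
        for res in shared
        for a, b in zip(ref[res], model[res])
    )
    return math.sqrt(squared_sum / len(shared)), len(shared)


if __name__ == "__main__":
    # Toy coordinates in angstroms (made up); residue 3 stands in for a rebuilt loop.
    reference = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0),
                 3: (7.6, 0.0, 0.0), 4: (11.4, 0.0, 0.0)}
    design = {1: (0.0, 0.0, 1.0), 2: (3.8, 0.0, -1.0),
              3: (7.6, 5.0, 0.0), 4: (11.4, 1.0, 0.0)}
    rmsd, n_used = ca_rmsd(reference, design, exclude=[3])
    print(f"CA RMSD over {n_used} residues: {rmsd:.2f}")
```

Excluding the loop keeps a single large deviation from dominating the average, which is why the number of residues used is reported together with the RMSD.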

ACCESSION NUMBER
Coordinates and structure factors have been deposited in the Protein Data Bank with accession number 6y76.