The next generation of protein super‐fibres: robust recombinant production and recovery of hagfish intermediate filament proteins with fibre spinning and mechanical–structural characterizations

Summary Native hagfish intermediate filament proteins have impressive mechanical properties. However, using these native fibres for any application is impractical, necessitating their recombinant production. In the only literature report on the proteins (denoted α and ɣ), heterologous expression levels, using E. coli, were low and no attempts were made to optimize expression, explore wet‐spinning, or spin the two proteins individually into fibres. Reported here is the high production (~8 g l−1 of dry protein) of the hagfish intermediate filament proteins, with yields orders of magnitude higher (325–1000×) than previous reports. The proteins were spun into fibres individually and in their native‐like 1:1 ratio. For all fibres, the hallmark α‐helix to β‐sheet conversion occurred after draw‐processing. The native‐like 1:1 ratio fibres achieved the highest average tensile strength in this study at nearly 200 MPa with an elastic modulus of 5.7 GPa, representing the highest tensile strength reported for these proteins without chemical cross‐linking. Interestingly, the recombinant α protein achieved nearly the same mechanical properties when spun as a homopolymeric fibre. These results suggest that varying the two protein ratios beyond the natural 1:1 ratio will allow a high degree of tunability. With robust heterologous expression and purification established, optimizing fibre spinning will be accelerated compared to difficult to produce proteins such as spider silks.


Introduction
Hagfish are remarkable creatures, not only because of their unique anatomical features but, perhaps most interestingly, because of their unique adaptation to predation. Hagfish are capable of producing a slime that is reinforced by fibrous proteins. When the slime is expelled, it rapidly expands and can clog the gills of a predator, forcing it to release the hagfish (Fernholm, 1981). While the slime is of keen interest, the reinforcing fibres (intermediate filaments) are also exciting due to their remarkable mechanical properties (Fudge et al., 2010). When these fibres are isolated from the slime matrix, draw-processed, and dried, they exhibit mechanical properties similar to those of natural dragline silk from orb-weaving spiders (Gosline et al., 1986; Stauffer et al., 1994). The hagfish fibrous thread is heteropolymeric and composed of two proteins, denoted α and ɣ (Spitzer et al., 1984; Koch et al., 1994; Koch et al., 1995). These proteins share a common structural architecture: an α-helical rod domain and N- and C-termini that are not as predominantly α-helical. In the native fibre, these two proteins coil around each other in a classic coiled-coil conformation, and when these fibres are draw-processed the α-helices convert to β-sheets, which confer the remarkable mechanical properties (Fudge et al., 2003).
In past studies, native hagfish intermediate filament fibres have been dissolved using formic acid and then spun into fibres in an attempt to understand fibre creation and development from spinning systems that are not mimetic to the natural system (Negishi et al., 2012). While the mechanical properties of the fibres fell short of their naturally created counterparts, this past study indicated the potential of these two proteins to assemble and form a fibre using alternative spinning technologies. This initial research led to attempts to generate the two proteins in E. coli, where the proteins were produced as full-length natural analogs (Fu et al., 2015). When the proteins were purified, the authors demonstrated that they would self-assemble at the surface of an electrolyte buffer in a native-like α-helical conformation from which fibres could be pulled. Furthermore, when the fibres were draw-processed, the hallmark α-helix to β-sheet conversion was observed. Additional characterization with X-ray diffraction confirmed that the newly formed β-sheet crystallites were orientated along the axis of the fibre, creating the natural structural elements that confer strength to protein-based fibres (Fu et al., 2017). Again, the mechanical properties fell short of characterized natural fibres. However, this prior study clearly demonstrated the potential of hagfish intermediate filament proteins if the proteins can be efficiently produced, and a spinning method can be developed that generates mechanical properties closer to natural hagfish fibres.
There are intrinsic traits of hagfish intermediate filament proteins that could be beneficial for the efficient production of recombinant fibrous proteins. Hagfish intermediate filament proteins are much smaller (≈ 65 kDa) than the highly studied, and challenging to produce, spider silk proteins (> 300 kDa). They also have a more even distribution of amino acids without the heavy reliance on glycine and alanine found in spider silk. This combination makes them a more amenable target for heterologous expression. Although there are two reports in the literature of recombinant expression of the two hagfish fibrous proteins (Fu et al., 2015, 2017), no attempts have been made at optimizing protein production and output using efficient bioreactors. Efficient protein production in bioreactors, and scaling-up, is mandatory for understanding the potential of these two proteins for incorporation into engineering applications. Additionally, hexafluoroisopropanol (HFIP) has not been explored as a solvent from which to spin hagfish intermediate filament fibres, although it is known to drive α-helical conformations in proteins, recreating the natural structures of these proteins (Hirota et al., 1997; Maiti et al., 2004). While HFIP is not an ideal solvent for commercial or industrial applications, it can aid in producing superior fibre mechanical properties and thereby help determine the functional capacity of the proteins in question.
Supporting this assertion is a recent study that utilized HFIP as a solvent to generate native-like mechanical properties of spider dragline silk from a full-length analog of MaSp1 (one of the two proteins that comprise dragline silk) for the first time (Bowen et al., 2018). Smaller fragments of MaSp1 were expressed and then the full-length analog was assembled using the intein system. This experiment illustrates that the recombinant protein forms should be as native-like as possible with the native structural elements present and that the natural spinning process does not have to be precisely recreated. Rather, HFIP is capable of producing fibres that mimic the mechanical properties of natural fibres.
Recombinant production also allows the unique opportunity to study these two hagfish proteins individually and gain insight into their fibre-forming abilities, structures, and mechanical properties. There are intermediate filament proteins that are homopolymeric, with one of the most studied being vimentin. In one study, the recombinant form of human vimentin was produced in E. coli, purified, and then spun into fibres (Pinto et al., 2014). When the vimentin filaments were allowed to self-assemble, fibres were able to be pulled from the resulting gelatinous film and draw-processed. These vimentin fibres exhibited maximum tensile strengths of 173 MPa. In the only literature report of fibres spun from recombinantly produced hagfish α and ɣ proteins, at their native ratio of 1:1, using a nearly identical fibre spinning process to the vimentin fibres, the tensile strength reported was 150 MPa (Fu et al., 2017).
A further indication that natural arrangements and processes may not need to be exactly replicated to obtain the robust natural properties is the intein spider silk example. The authors created only one of the two proteins of dragline silk and were still able to generate fibre properties that were very close to the native spider silk fibres (Bowen et al., 2018). Therefore, this study sought to understand if the individual recombinant forms of the hagfish α and ɣ proteins would assemble into a fibre and to also gain an initial glimpse of their structural elements, mechanical properties, and the interplay between them.
This work demonstrates that very high expression levels of recombinant hagfish intermediate filament (rHIF) proteins are obtainable in E. coli using small (~1 l) bioreactors. As a proof of concept, the process was scaled in E. coli to the 100 l bioreactor level, which demonstrated similarly high expression. An efficient purification procedure, which is both scalable and economical, is also reported. Finally, fibres that are double draw-processed, from both the individual proteins and in a native-like 1:1 combination, were produced, and their resultant mechanical properties and the structural elements of the fibres are reported.

Production and Recovery of Recombinant Hagfish Intermediate Filament (rHIF) proteins
Proteins were purified using a standard approach to inclusion body purification, in a technique similar to a previous report (Fu et al., 2015). As shown in Fig. 1, both proteins were purified to an appreciable degree. In both the Coomassie and western blot analysis, rHIFα migrated further into the gel than rHIFɣ(C387S) even though rHIFɣ(C387S) has a lower molecular weight, likely due to structural differences between the two proteins. Both expression plasmids were verified via DNA sequencing prior to these expression studies (results not shown). ImageJ analysis of the Coomassie-stained gel indicates that the proteins are at least 70% pure. Western blot analysis indicates that the majority of the protein is expressed as the full-length protein. However, banding attributed to premature termination of protein synthesis is apparent below the primary band, which likely biased the ImageJ analysis of the Coomassie-stained gel towards a lower purity value. Regardless, this level of purity was high enough that the formation and testing of fibres were possible.
© 2021 The Authors. Microbial Biotechnology published by Society for Applied Microbiology and John Wiley & Sons Ltd., Microbial Biotechnology, 14, 1976–1989.
Both proteins were successfully synthesized at high levels at a laboratory scale (BioFlo115 at ~1 l). At the BioFlo115 level of production, both rHIFα and rHIFɣ(C387S) proteins were produced and recovered on average at ≥ 45 g kg−1 cell mass (≥ 8 g l−1, Table 1). There was some variability between runs, as reported in the table, possibly due to minor differences in the OD600 values at the start of induction, the final OD600 values obtained, or the recovery and processing operations. Additionally, rHIFα was induced for 5 h for two runs rather than the standard 4 h induction used for the others. This was an attempt to drive protein expression higher for rHIFα, which was successful. No manipulations were made to the induction time for rHIFɣ(C387S).
When production of the two proteins was scaled to the BioFlo610 bioreactor (~100 l), the production yields, both per cell mass and per volume, remained relatively consistent with those observed in the BioFlo115 bioreactors (~1 l). The production and subsequent recovery of the two proteins resulted in a mass yield of 39 g kg−1 cell mass (7.8 g l−1) for rHIFα and 45 g kg−1 cell mass (8.5 g l−1) for rHIFɣ(C387S) (Table S1). As both rHIFα and rHIFɣ(C387S) were expressed as inclusion bodies, no attempts were made to quantify the protein in the soluble fraction. It is important to note that while the BioFlo115 bioreactors were run in triplicate for each construct to validate protein production, the BioFlo610 (~100 l) bioreactor was only run once for each construct due to cost and time considerations.

Mechanical and structural analysis of rHIF fibres
Both proteins and the 1:1 rHIFα:rHIFɣ(C387S) combination readily spun fibres from HFIP using isopropyl alcohol (IPA) as the coagulant solution. Due to the brittleness of the as-spun fibres when dried, their mechanical properties were not obtained. Images of the fibres using light microscopy at 400× magnification are presented in Fig. 2. As can be seen, fibres from rHIFα and the 1:1 combination were generally smoother, with less surface topography, than rHIFɣ(C387S) fibres. The rough and inconsistent nature of the rHIFɣ(C387S) fibres could have presented as defects, or weak points, which may have contributed to the mechanical underperformance when compared to the rHIFα and 1:1 rHIFα:rHIFɣ(C387S) protein formulations. However, most of the fibres generally appeared smooth on the surface and had a reasonably well-defined circular cross-section. The fibre images in the bottom row of Fig. 2 correspond to the stretch at which that protein, or combination, had the highest tensile strength. Fibre images in the top and middle rows are as-spun and 1X1X stretch (where no deliberate stretch is applied), respectively, for comparison. The resulting mechanical properties from this study exhibited substantial tensile strength, strain, toughness, and elastic modulus (Table 2). Between the homopolymeric fibres, rHIFα fibres demonstrated better mechanical properties than rHIFɣ(C387S). As shown in Table 2 and Fig. S1, rHIFα did not have any significant changes in the characterized mechanical properties between 1X1X and 1.5X1.5X. There was, however, an 8% increase in β-sheet content with the additional stretching from 1X1X to 1.5X1.5X (Table 2 and Fig. 3). At both stretches, rHIFα displayed the highest recorded strains at 0.92 and 0.87 mm mm−1, and appreciable tensile strength at 104 and 106 MPa, respectively.
This combination of the highest strains and higher strengths resulted in the rHIFα fibres having the highest toughness of all fibres tested, with average values of 72 and 73 MJ m−3 for 1X1X and 1.5X1.5X, respectively. A substantial decrease in fibre diameter and significant changes for all mechanical properties (Fig. S1) were observed at the 2X2X stretch without additional β-sheet recruitment. The fibre diameter decreased nearly 30% (16.7 µm), tensile stress increased 60% (169 MPa), strain decreased by 73% (0.23 mm mm−1), toughness decreased by 55% (33 MJ m−3), and the elastic modulus improved by 49% (5.4 GPa). It is notable that, while these fibres had one of the highest tensile strengths and elastic moduli, the reduction in strain resulted in them having the third lowest average toughness of all fibres. When these mechanical properties and the fibre diameter are compared to the 1:1 rHIFα:rHIFɣ(C387S) combination fibres at 1.5X1.5X, it is evident that the mechanical properties of rHIFα are remarkably similar, with only the tensile strength being statistically significantly different (Fig. S1). At the 1X1X stretch factor, the 1:1 rHIFα:rHIFɣ(C387S) combination produced fibres that were not as robust as the rHIFα fibres but were mechanically superior to the rHIFɣ(C387S) fibres. In fact, the mechanical properties of the 1:1 combination fibres fell almost perfectly between the values of the rHIFα and the rHIFɣ(C387S) samples (Table 2). The 1:1 combination fibres at the 1.5X1.5X stretch demonstrated the highest tensile strength (199 MPa), the smallest fibre diameter (16.22 µm), the lowest strain (0.21 mm mm−1), and the highest elastic modulus (5.7 GPa), as well as the highest β-sheet content (49%) of all characterized fibres (Fig. 3). This stretch also yielded the greatest increase, of 12%, for the β-sheet content of all stretched fibres.
The 1.5X1.5X stretch appears to be optimal for obtaining maximum tensile stress and elastic modulus for this particular protein formulation. However, the lower strain resulted in a lower toughness (35 MJ m−3). Interestingly, when the 1:1 combination fibres were stretched at 2X2X, there were significant differences across all mechanical properties (Fig. S1). The tensile strength decreased by 29% (149 MPa), the strain more than doubled (0.56 mm mm−1), which nearly doubled the average toughness (66 MJ m−3), and the elastic modulus was also 33% lower (4.1 GPa). However, even with those reductions, this combination resulted in the third highest tensile strength of all the fibre combinations, demonstrating a relatively high degree of tunability through manipulations of draw-processing.
The rHIFɣ(C387S) fibres did not achieve the highest value for any mechanical property among the fibres tested. The 1X1X rHIFɣ(C387S) fibres consistently presented the lowest mechanical properties of all the characterized fibres (Table 2). The rHIFɣ(C387S) fibres appeared to be optimally stretched at around 1.5X1.5X, as the fibre diameter decreased by 57% (20.2 µm) and the highest tensile strength and strain of all rHIFɣ(C387S) fibres (122 MPa and 0.62 mm mm−1) were obtained, which resulted in an appreciable toughness (61 MJ m−3). Additional stretching to 2X2X resulted in the reduction of all measured mechanical properties, with significant decreases occurring for all properties with the exception of the elastic modulus (Fig. S1). The β-sheet content of the rHIFɣ(C387S) fibres also decreased to nearly the 1X1X level when the fibres were draw-processed further to 2X2X. All rHIFɣ(C387S) fibres consistently demonstrated the lowest β-sheet content (Fig. 3 and Table 2) and the most limited range of elastic modulus (3–3.5 GPa).
All fibres saw substantially increased β-sheet content from the as-spun to 1X1X (Fig. 3 and Table 2), which was simply a result of the fibre being threaded through the instrument and exposed to the stretch bath solutions. As-spun fibres ranged from 9–14% β-sheet content, and all fibres increased nearly identically to 36–37% at 1X1X. Even with this sizeable initial increase, further stretching resulted in additional β-sheet recruitment, although in smaller increments, which resulted in substantial improvements in mechanical properties. Particularly notable is that when the 1:1 combination fibres were optimally stretched at 1.5X1.5X, the β-sheet content rose above both rHIFα and rHIFɣ(C387S), which did not happen at any other stretch, although this was observed for the as-spun fibres that had not been stretched. Additionally, the β-sheet content of both 1:1 rHIFα:rHIFɣ(C387S) combination and rHIFɣ(C387S) fibres decreased when draw-processed to the 2X2X level, which correlates with the measured mechanical properties. In contrast, the β-sheet content and mechanical properties of rHIFα fibres increased at this level.
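The percentage changes quoted in this section follow from the average values before and after a draw step. As a minimal sketch, the Python below recomputes them for the rHIFα fibres going from 1.5X1.5X to 2X2X, using only the rounded averages given in the text. The function and variable names are illustrative and not part of the original analysis, and because the inputs are rounded, the results can differ from the reported percentages by about one percentage point.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("`before` must be non-zero")
    return (after - before) / before * 100.0


# Rounded rHIFα averages quoted in the text: (1.5X1.5X value, 2X2X value).
rhif_alpha = {
    "tensile strength (MPa)": (106.0, 169.0),
    "strain (mm/mm)": (0.87, 0.23),
    "toughness (MJ/m^3)": (73.0, 33.0),
}

changes = {
    name: percent_change(before, after)
    for name, (before, after) in rhif_alpha.items()
}

for name, change in changes.items():
    # A positive sign marks an increase, a negative sign a decrease.
    print(f"{name}: {change:+.1f}%")
```

The same helper applies to any pair of entries in Table 2, for example fibre diameter or elastic modulus, once the corresponding averages are available.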

Discussion
The protein purification process is based on a standard approach to inclusion body purifications. When expressed as inclusion bodies, the insolubility of both proteins allows the use of urea to remove more soluble contaminating proteins. For this study, the process was accomplished using batch centrifugation. However, inclusion body purifications can be scaled with large volume filtration systems (Forman et al., 1990). Large volume filtration systems should also allow for improvements in purity via additional processing steps and, if necessary, refolding of the proteins in a controllable manner. Further refinements of the purification process will predictably reduce the overall yield as more impurities are removed. However, given that this study induced cultures at an OD600 of ≈ 60, there is sufficient room to improve cell density, and therefore the total cell mass produced, through the optimization of bioreactor media and feedstocks. There are several reports in the literature of E. coli in bioreactors achieving optical densities above 100 (Li and Sha, 2017). If cell density correlates in a linear manner with total protein production and recovery, then yields in excess of 13 g l−1 could be expected.
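The linear extrapolation described above can be written out as a short calculation. The sketch below is illustrative only: the function name is hypothetical, the inputs are the approximate values quoted in the text (about 8 g l−1 at an induction OD600 of about 60), and strict proportionality between cell density and recovered protein is the stated assumption, not a measured relationship.

```python
def extrapolate_yield(yield_g_per_l: float, od_observed: float, od_target: float) -> float:
    """Scale a volumetric yield linearly with optical density.

    Assumes recovered protein is strictly proportional to OD600,
    which is the simplifying assumption named in the text.
    """
    if od_observed <= 0:
        raise ValueError("od_observed must be positive")
    return yield_g_per_l * od_target / od_observed


# Approximate values from the text: ~8 g/l recovered at OD600 ~ 60.
projected = extrapolate_yield(yield_g_per_l=8.0, od_observed=60.0, od_target=100.0)
print(f"Projected yield at OD600 = 100: {projected:.1f} g/l")
```

Under these assumptions the projection lands just above 13 g l−1, consistent with the figure quoted in the text. Any loss of per-cell productivity at higher densities would lower it.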
A previous report on recombinantly produced hagfish intermediate filament proteins α and ɣ only achieved yields of 0.01–0.02 g l−1 when using E. coli as a host (Fu et al., 2015). Using methodologies developed during this study, substantial improvements were made to the protein yields, which were increased by 325- to 1000-fold from previous reports. An efficient and scalable purification process also allowed for the production of rHIF fibres with the highest mechanical properties yet reported for recombinant hagfish intermediate filament proteins. Furthermore, these fibres were produced with conventional wet-spinning techniques, instead of solution casting and drawing, which allowed for continuous fibre lengths to be produced that could be draw-processed as desired.
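The fold improvement is a simple ratio of volumetric yields. As a rough sketch, the Python below applies that ratio to the previously reported range (0.01–0.02 g l−1) and to the approximate average of about 8 g l−1 from this study. The names are illustrative, and the per-run yields from Table 1 are not reproduced here, so this is not a reconstruction of the exact 325- to 1000-fold range.

```python
def fold_improvement(new_yield: float, old_yield: float) -> float:
    """Ratio of a new yield to an old yield (both in g/l)."""
    if old_yield <= 0:
        raise ValueError("old_yield must be positive")
    return new_yield / old_yield


PRIOR_LOW, PRIOR_HIGH = 0.01, 0.02  # g/l, previously reported range (Fu et al., 2015)
THIS_STUDY_AVG = 8.0                # g/l, approximate average quoted in the text

# Most conservative comparison: new average against the best prior yield.
fold_min = fold_improvement(THIS_STUDY_AVG, PRIOR_HIGH)
# Most generous comparison: new average against the lowest prior yield.
fold_max = fold_improvement(THIS_STUDY_AVG, PRIOR_LOW)

print(f"Fold improvement using the ~8 g/l average: {fold_min:.0f}x to {fold_max:.0f}x")
```

With the average as the only input, the ratio spans roughly 400- to 800-fold. The wider range quoted in the text presumably reflects the spread of individual run yields reported in Table 1.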
Both rHIFα and rHIFɣ(C387S) were produced and recovered at a laboratory production scale at ≥ 45 g kg−1 cell mass (≥ 8 g l−1). This process was then scaled to a relatively large 100 l bioreactor, where the proteins were again recovered at similarly high levels. This makes them very economical, according to a recent techno-economic report on the production of spider silk (Edlund et al., 2018). A key bottleneck to the production of recombinant spider silk proteins is a general inability to produce native-sized proteins, with all structural elements present, at a sustainable level that supports systematization into engineering applications (Xia et al., 2010). The two recombinant forms of hagfish intermediate filaments investigated in this study do not suffer from this same problem, possibly due in part to the smaller molecular weights and less repetitive amino acid contents. Not only are they readily produced in E. coli, but they are also produced as largely full-length synthetic analogs of the native proteins. Additionally, as Table 3 demonstrates, the levels of expression and recovery surpass, by more than twofold, any reports of other structural fibre-forming proteins when produced in E. coli.
In the only other literature report on the spinning of synthetically produced hagfish intermediate filament proteins α and ɣ into fibres, the authors were able to achieve tensile strengths between 25 and 150 MPa when the fibres were draw-processed and dried (Fu et al., 2017). The study used highly purified proteins and a refolding process to attempt to create a natural coiled-coil structure. The fibre spinning process was performed by hand and utilized the solution casting and drawing procedure, which could only be performed with small volumes (~2 µl) of protein dope solutions. Finally, the fibres were also cross-linked with glutaraldehyde to take advantage of the lysine concentration of the two proteins, and those fibres presented tensile strengths up to 250 MPa with predictably very little strain (Fu et al., 2017). Although the fibres presented in this previous study are impressive, the system has several limitations, including the yields, protein assembly methods, and the spinning process.
This study utilizes HFIP as a solvent due to its high propensity to generate α-helical conformations in proteins, which is the native conformation of the hagfish proteins. It has also been shown to produce protein fibres that recapitulate the natural fibre mechanical properties from recombinant spider silk protein analogs (Xia et al., 2010; Bowen et al., 2018). When these predominantly α-helical rHIF proteins are forced out of solution in the coagulation bath, the proteins must interact or a fibre will not form, possibly creating a native-like coiled-coil structure. Since the two proteins and the 1:1 rHIFα:rHIFɣ(C387S) combination readily formed fibres and demonstrated a substantial proportion of α-helical and random structures in the as-spun fibres, the characteristic coiled-coil structure presumably formed to some extent, although this was not directly measured in this study.
The substantial recruitment of β-sheet demonstrated the conversion of α-helices to β-sheets from as-spun to 1X1X (where no deliberate stretch is applied), which supports the view that the transition occurs even when minimal force is applied to the fibre. The varied compositions of the stretch baths and the observed changes also suggest that environmental conditions can influence this structural transition. Increased stretching, or draw-processing, was successful in recruiting additional β-sheets, which strongly correlates with the impressive gains in mechanical properties, especially tensile strength and elastic modulus (Table 2).
The highest average tensile strength and elastic modulus values were obtained by the 1:1 rHIFα:rHIFɣ(C387S) combination fibres, indicating a synergistic effect of having both proteins present in this system (Table 2 and Fig. 3). The β-sheet content of the 1:1 combination fibres at the 1.5X1.5X stretch is particularly indicative of this synergy. Here, the β-sheet content exceeded that of both rHIFα and rHIFɣ(C387S) when spun individually (Table 2 and Fig. 3). At both 1X1X and 2X2X stretches, the β-sheet content of the 1:1 combination fibres was intermediate between those of the individual rHIFα and rHIFɣ(C387S) fibres. This indicates that, when draw-processed to an optimal degree, more β-sheets can be recruited when both proteins are present than either protein can achieve individually.
Notably, the rHIFα fibres were nearly as impressive as the 1:1 combination fibres and exceeded the mechanical properties reported from the previous study (Fu et al., 2017). The rHIFα fibres demonstrated the broadest range of strain, nearly doubling in initial length for the 1X1X stretch. However, when stretched further to 2X2X, the tensile strength, elastic modulus, strain, and β-sheet content were similar to those of the best-performing fibres in this study, the 1:1 combination at 1.5X1.5X. Unlike the other two protein formulations studied in this investigation, the β-sheet content of rHIFα fibres increased for all stretches and did not drop at the highest stretch performed. This suggests that each protein may respond uniquely to processing and tuning. Particularly indicative of the tunability observed for these proteins is that between the 1X1X fibres and the 2X2X fibres, rHIFα lost 70% strain and gained nearly 63% more tensile strength.
Mechanical properties for rHIFɣ(C387S) fibres were lower than those of both rHIFα and the 1:1 rHIFα:rHIFɣ(C387S) combination fibres. However, they are still mechanically impressive when compared to previously reported fibres. Again, the stretch factor of 1.5X1.5X produced the best fibres for this protein formulation, with some properties exceeding those of the prior synthetic proteins and regenerated natural slime threads (Negishi et al., 2012; Fu et al., 2017). Even though all fibres were spun with the same conditions and parameters, the rHIFɣ(C387S) fibres were generally larger in diameter and displayed larger surface deformities than either rHIFα or the 1:1 combination fibres. Both of these factors likely played a role in the observed mechanical properties. One potential explanation of the higher occurrence of these features for the rHIFɣ(C387S) fibres may be that the protein is less organized or does not assemble as efficiently with this spinning method. The removal of the single cysteine from the sequence is not thought to play a role in the lower mechanical performance observed. In the native fibres, the cysteine is unlikely to form a disulfide bond and contribute mechanically to the performance of the fibres (Fudge and Gosline, 2004; Fudge et al., 2010). However, since both proteins were expressed as inclusion bodies, and the omission of the cysteine from the rHIFɣ sequence was an unsuccessful attempt to produce soluble rHIFɣ, reinserting the cysteine should be explored as another route to obtain specifically tunable fibres.
Although not directly measured in this study, there is suggestive evidence that by stretching the fibres, the β-sheets are being orientated along the axis of stretch, as has been reported for native and recombinant hagfish intermediate filament fibres, spider silks, and others (Lefèvre et al., 2007; Negishi et al., 2012; Sampath et al., 2012; Pinto et al., 2014; Fu et al., 2017). This is particularly apparent for the rHIFα fibres. At the 1.5X1.5X stretch, the diameters averaged ~24 µm with 45% β-sheet content. When stretched to 2X2X, the diameters of the fibres dropped to ~17 µm, and the β-sheet content increased a negligible amount (45–46%), yet the fibres demonstrated a 63% increase in tensile strength and a 40% improvement in elastic modulus. It is unlikely that the improvements were the result of a 1% increase in β-sheet content.
For this study, only the individual proteins and their natural 1:1 combination were explored. However, the stark differences between the mechanical properties of the two proteins as homopolymeric fibres are intriguing. Varying the ratio beyond the natural 1:1 ratio will likely allow for the production of fibres with extensive tunability. For instance, producing a fibre with a high ratio of rHIFɣ(C387S) would likely provide a less rigid fibre (lower elastic modulus), while the presence of rHIFα would improve the tensile strength of the fibre due to its increased ability to form β-sheets. The ratio of the two proteins could be adjusted along with the post-spin draw to produce a fibre that closely matches a mechanical requirement.
While this study indicates that these two proteins can be expressed at high levels in E. coli, the natural properties of hagfish intermediate filaments were not fully reproduced, although the mechanical properties obtained are the highest recorded to date for recombinant hagfish intermediate filament proteins. It should be noted that no attempts to optimize the spinning process were performed, and spinning was solely conducted with the reported parameters. Additionally, both proteins contained a histidine-tag and no attempts were made to understand the influence of the histidine-tag on fibre mechanical properties, if any. Neither the expression vector nor the gene sequences were designed with a proteolytic site to remove the tag.
It is clear that the fibre diameter contributes substantially to the lower mechanical properties observed. When isolated from the slime and dried, natural hagfish fibres are reported at 1.27 µm, and when stretched and dried, the diameter decreases to 1.07 µm (Fudge et al., 2010). The finest fibres produced from this study were 15 times larger in diameter than their natural counterparts. Reduction in the diameter, and thus the cross-sectional area of the fibres, even by relatively small increments, should allow for substantial improvements in the mechanical properties. Supporting this, as indicated in Table 2, is that as the fibre diameters are decreased through draw-processing, the tensile strength and elastic modulus increase, similar to other protein-based fibres. Conversely, as the diameter and strain increase, even if only slightly, there is a corresponding decrease in elastic modulus and tensile strength. Further refinements in the spinning process will likely result in finer fibres with more native-like mechanical properties. A variety of parameters could be explored when spinning these proteins individually or in combination. Previous works have successfully utilized formic acid to solvate the proteins, and this solvent may produce fibres with different mechanical properties in this system. The amount of protein solvated, that is, the dope concentration, could also relate to final fibre properties. Physical spinning parameters, such as needle diameter, coagulation bath extrusion rate, and the speed at which the fibres traverse the system (and thus the exposure time to the stretch bath solutions), could also affect fibre mechanical properties. Additional parameter control could also be achieved by varying the compositions of the stretch baths with alcohols other than isopropyl alcohol, varying the water ratios within the baths, and exploring other bath solutions to further refine the fibre mechanical properties.
Alternative spinning techniques that produce very fine fibres, such as electrospinning, should also be explored.
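To make the diameter argument concrete, the sketch below compares the finest fibre reported in this study (16.22 µm) with the stretched and dried native fibre (1.07 µm). It assumes an ideal circular cross-section, which the micrographs only approximately support, and uses the standard definition of engineering stress as force divided by cross-sectional area. The helper names are illustrative and are not taken from the methods of this study.

```python
import math


def cross_section_area_m2(diameter_um: float) -> float:
    """Area of a circular cross-section in m^2, from a diameter in micrometres."""
    radius_m = diameter_um * 1e-6 / 2.0
    return math.pi * radius_m ** 2


NATIVE_DIAMETER_UM = 1.07       # stretched and dried native fibre (Fudge et al., 2010)
RECOMBINANT_DIAMETER_UM = 16.22  # finest fibre reported in this study

# Diameter ratio, and the corresponding area ratio (area scales with diameter squared).
diameter_ratio = RECOMBINANT_DIAMETER_UM / NATIVE_DIAMETER_UM
area_ratio = (
    cross_section_area_m2(RECOMBINANT_DIAMETER_UM)
    / cross_section_area_m2(NATIVE_DIAMETER_UM)
)

# Breaking force implied by a 199 MPa tensile strength at 16.22 µm,
# from stress = force / area (engineering stress on the initial cross-section).
implied_force_mN = 199e6 * cross_section_area_m2(RECOMBINANT_DIAMETER_UM) * 1e3

print(f"Diameter ratio: {diameter_ratio:.1f}x")
print(f"Cross-sectional area ratio: {area_ratio:.0f}x")
print(f"Implied breaking force: {implied_force_mN:.1f} mN")
```

Because area scales with the square of the diameter, a fibre about 15 times wider has a cross-section more than 200 times larger. This gives a sense of how far the spun fibres are from the native geometry.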

Conclusion
This work has demonstrated that the production of recombinant hagfish intermediate filament proteins, using E. coli as a heterologous host, occurs at high enough levels to be commercially favourable without any additional optimization. Furthermore, the initial scale-up of the production system did not result in any significant reductions in the expression of the recombinant proteins. An effective purification strategy was also developed for the recovery of the rHIF proteins, which resulted in yields that surpass other similarly produced fibrous structural proteins. The large quantities of recovered protein allowed for rapid progress to investigate fibre production methods and ultimately resulted in noteworthy fibres. The fibres spun from these efforts yielded some of the highest yet reported mechanical properties for these recombinantly produced forms of hagfish intermediate filament proteins. Recombinant production also allowed the opportunity to study these proteins in non-mimetic states that do not occur naturally to better understand the individual protein functions and roles. Through these individual and combined studies of the proteins and their fibres, it was determined that a broad range of tunability is possible for the mechanical properties of these synthetic fibres, which can be characteristically controlled by the proteins utilized, the protein structures, the processing methods, and possibly other unexplored factors. This study provides the first evidence that recombinant hagfish intermediate filament proteins can be produced at high enough levels to be a viable source of material for high-performance fibrous protein materials.

Expression vector construction
The genes encoding hagfish α and ɣ proteins have been previously identified (Koch et al., 1994; Koch et al., 1995). One change was made to the natural ɣ sequence. As indicated in Fig. 4, the cysteine that is ordinarily present in the natural hagfish ɣ protein sequence was replaced with serine. The removal of the cysteine was an attempt to improve the purification of the proteins by removing the ability to form disulfide bonds. Given that both proteins were expressed at high levels as inclusion bodies, this may not have been necessary. Finally, the pET19k vector included a 10X histidine-tag at the N-terminus. While this tag was included, it was not utilized for purification. Rather, the histidine-tag was used in western blot analysis to confirm the identity of the produced proteins. The recombinant proteins are denoted as recombinant hagfish intermediate filament alpha (rHIFα) or gamma (rHIFɣ (C387S)). The specific amino acid substitution denoted for rHIFɣ (C387S) is identified in the sequence in Fig. 4.
The full-length gene sequences were codon-optimized for expression in E. coli using ThermoFisher GeneOptimizer™ software and were synthesized by ThermoFisher Scientific (Grand Island, NY, USA).
The pET19k cloning vector was generated in our laboratory by modifying the pET19b vector (Novagen, St. Louis, MO, USA), replacing the ampicillin resistance gene with the kanamycin resistance gene from the pET26b vector (Novagen) (Fig. S2). Synthesized hagfish genes were inserted into the pET19k vector at the NdeI and BamHI restriction sites. The resulting vectors were transformed into E. coli BL21 (DE3) chemically competent cells (New England Biolabs, Ipswich, MA, USA) to produce the two recombinant hagfish proteins.

Expression of rHIFα and rHIFɣ (C387S) proteins
Before scaling to the BioFlo610 (~100 l) level of production, protein expression was validated first in shaker flasks (data not shown) and in New Brunswick Scientific BioFlo115 (~1 l) bioreactors. Both rHIFα and rHIFɣ (C387S) were expressed as inclusion bodies at all scales. Each construct was produced in triplicate using the BioFlo115 bioreactors. From each batch, the protein was purified, lyophilized, and weighed. The protocols to scale up from the BioFlo115 to the BioFlo610 bioreactors remained the same in terms of instrument operation, media formulation, and feeds, given that all of the reported fed-batch fermentations were conducted with New Brunswick bioreactors using the BioCommand software. As such, the protocol and media components are reported only once. The entire cell mass from the BioFlo610 runs could not be purified at one time, as was done with the BioFlo115 runs, due to the relatively large cell masses (19 kg for rHIFα and 13 kg for rHIFɣ (C387S)) and equipment limitations. Instead, multiple purifications were performed on the total cell mass from each run. The protein yield was then averaged across those purifications.
The rHIFα and rHIFɣ (C387S) protein expressions were scaled up in a New Brunswick Scientific BioFlo610 bioreactor. The first seed solution was grown in LB medium (100 ml) plus 10 g l−1 glucose and 100 mg l−1 kanamycin in a 500 ml Erlenmeyer flask with rotary shaking (220 rpm) at 37°C to an OD600 of one. The first seed solution was then inoculated into LB (3 l) with 10 g l−1 glucose and 100 mg l−1 kanamycin in a 10 l bottle to produce the second seed solution. The second seed solution was grown with rotary shaking (130 rpm) at 30°C to an OD600 of one. The second seed was grown at a lower temperature to slow growth and increase the time required to reach the desired OD600 value, for timing purposes. Inoculum culture was pumped into the BioFlo610 fermenter with 50-70 l of sterilized modified K12 medium (Table S2), 100 mg l−1 kanamycin, and 0.02% v/v C-8840 antifoam (New London Chemical, Lakeland, FL, USA). The starting fermentation temperature was set at 37°C, and the pH setpoint was 6.8, which was regulated by the automatic addition of 20% ammonium hydroxide throughout the fermentation. A dissolved O2 level of 80% was cascade controlled by agitation and O2 (0-100%) into the culture. Glucose feeding solution (24-28 l) was fed during the fermentation controlled by the BioCommand software. The glucose level was monitored by a ReliOn Prime Blood Glucose Monitoring System. Induction of protein expression was initiated when an OD600 value of 55-60 was obtained. The target proteins, rHIFα or rHIFɣ (C387S), were induced with 1 mM IPTG at 28°C. After 4 h of induction, when the OD600 value was around 100, the culture was harvested by centrifugation for 15 min at 8000-10 000 rcf at 4°C, and the cell pellets were stored in a −80°C freezer until processed for purification.
Fig. 4. Expressed sequences for both rHIFα (top) and rHIFɣ (C387S) (bottom). The underlined sequences represent the natural sequence. Sequence not underlined represents sequences added as part of the expression vector, including the histidine-tag sequence. Note: the cysteine at position 387 has been replaced with serine (C387S) and is red in the rHIFɣ (C387S) (bottom) sequence.
© 2021 The Authors. Microbial Biotechnology published by Society for Applied Microbiology and John Wiley & Sons Ltd., Microbial Biotechnology, 14, 1976-1989

Protein purification and verification
Cell lysis. The frozen hagfish rHIFα and rHIFɣ (C387S) cell pellets were thawed and resuspended at 10 ml g−1 of cells in lysis buffer (50 mM Tris and 200 mM NaCl at pH 7.9 (rHIFα) or pH 5.5 (rHIFɣ (C387S)) with 200 µg ml−1 lysozyme). The solutions were sonicated for 10 cycles of 10 s, with intervals of 45 s between cycles, with a VCX 1500 (Sonics Vibracell, Newton, CT, USA).
Inclusion body washing. Lysate was centrifuged at 10 000 rcf for 15 min at 4°C, and the resulting pellets (hagfish rHIFα and rHIFɣ (C387S) inclusion bodies) were resuspended in wash buffer 1 (100 mM Tris, 5 mM EDTA, 0.5 M urea, 2% v/v Triton X-100, and 5 mM DTT; pH 7.9 for rHIFα and pH 5.5 for rHIFɣ (C387S)) at 5 ml g−1 cells. After centrifugation, the pellets were rewashed with wash buffer 1, followed by two washes with wash buffer 2 (100 mM Tris, 5 mM EDTA, and 5 mM DTT; pH 7.9 for rHIFα and pH 5.5 for rHIFɣ (C387S)) at 5 ml g−1 cells, all with the centrifugation parameters described above. The pellets were then washed with 1:1 1X TAE:isopropyl alcohol (IPA) (1X TAE: 40 mM Tris, 1 mM EDTA, and 20 mM acetic acid). A final set of washes of 1:1 deionized water:IPA was performed until the conductivity of the supernatant was <20 µS cm−1. The washed proteins were then lyophilized. Production yields were determined by weighing the recovered dry protein and comparing it to either the cell mass or the final working volume in the bioreactor to give grams of protein recovered per kilogram of cell mass (g kg−1) or grams of protein recovered per litre of media (g l−1).
SDS-PAGE Coomassie analysis. The purified proteins were mixed 1:1 v/v with 2X Laemmli Sample Buffer (Bio-Rad, Hercules, CA, USA) and heat-treated at 100°C for 5 min before loading on polyacrylamide gels (Novex 4-20% Tris-Glycine from ThermoFisher). A dual-colour protein standard (Bio-Rad) was included on all gels. The gels were allowed to run for 60 min at a constant 110 V.
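The yield arithmetic described above can be sketched as follows. This is a minimal illustration only: the function names and the sub-batch masses are hypothetical placeholders, not measurements from this study, and scaling the averaged g kg−1 value to the whole run before dividing by the working volume is an assumption about how the per-litre figure could be derived when the cell mass is purified in portions. Only the 19 kg cell mass and 95 l final volume are taken from the text.

```python
def yield_per_kg(dry_protein_g, cell_mass_kg):
    """Grams of dry protein recovered per kilogram of cell mass."""
    if cell_mass_kg <= 0:
        raise ValueError("cell mass must be positive")
    return dry_protein_g / cell_mass_kg


def yield_per_litre(mean_g_per_kg, total_cell_mass_kg, working_volume_l):
    """Scale a per-kilogram yield to the full run, then divide by volume.

    Assumption: every portion of the cell mass yields protein at the
    averaged rate observed in the purified sub-batches.
    """
    if working_volume_l <= 0:
        raise ValueError("working volume must be positive")
    return mean_g_per_kg * total_cell_mass_kg / working_volume_l


# Placeholder sub-batches: (dry protein in g, cell mass processed in kg).
sub_batches = [(15.0, 0.5), (16.0, 0.5), (14.0, 0.5)]
per_kg = [yield_per_kg(g, kg) for g, kg in sub_batches]
mean_per_kg = sum(per_kg) / len(per_kg)

# 19 kg and 95 l are the run totals reported for the alpha protein.
per_litre = yield_per_litre(mean_per_kg, 19.0, 95.0)
print(f"{mean_per_kg:.1f} g/kg, {per_litre:.1f} g/l")
```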
After SDS-PAGE analysis, the gels were rinsed using deionized water before being stained with 5 ml Bio-Rad Bio-Safe Coomassie Stain for 60 min. The gel was then destained using deionized water for 60 min. ImageJ (NIH) was then utilized to determine the protein purity by selecting each individual lane, isolating the main protein band, and then subtracting the main protein band from contaminating bands within the same lane.
Western blot analysis. The protein samples separated by SDS-PAGE were transferred to a PVDF/Immobilon™-P Membrane (Millipore, Burlington, MA, USA) by electroblotting using the Mini Trans-Blot System (Bio-Rad). Blots were set up as specified by the manufacturer (Bio-Rad). All transfers were performed under a constant 200 mA for 50 min. After fixing of the proteins, the membranes were subjected to immunoblotting analyses using 1X TBS-T (1X TBS-T: 20 mM Tris, 140 mM sodium chloride, and 0.05% v/v Tween-20, pH 7.4) and Carnation dehydrated milk as the blocking reagents at a concentration of 5% w/v. The primary antibody was an anti-6X his epitope tag (mouse) antibody (Rockland, Limerick, PA, USA) at a 1:5000 dilution, and the secondary was an anti-mouse IgG (H + L) AP conjugate antibody (Promega) at a 1:10 000 dilution. Each addition of antibody was allowed to mix on the membranes for 30 min, with 15 min rinses of TBS-T in between each antibody and before the development addition. For the detection of alkaline phosphatase activity on the PVDF membrane, 1-Step™ NBT/BCIP Substrate Solution (ThermoFisher) was used as specified by the manufacturer.

Fibre preparation and production
The dried proteins were dissolved in 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP) (Oakwood Chemical, Rochester, MI, USA) at a concentration of 5% w/v. The two protein types were weighed out into a 4 ml capacity glass vial (Wheaton) for a 2 ml dope of 5% w/v concentration, and then 2 ml of HFIP were added. The vial was then capped and sealed with Parafilm. Additionally, a combination dope was made by mixing equal amounts of the rHIFα and rHIFɣ (C387S) proteins at an overall 5% w/v protein concentration in 2 ml of HFIP. The dopes were designated by their protein content as rHIFα, 1:1 rHIFα:rHIFɣ (C387S) or 1:1 combination, and rHIFɣ (C387S).
Protein dope solutions were allowed to solvate for 7 d. Solvation occurred prior to this, but additional time was allowed to ensure that all proteins were dissolved. The dopes were then centrifuged at 18 000 rcf for 10 min to remove any remaining particulates. The supernatant was loaded into a BD 3 ml syringe. A spinning port was placed onto the luer-lok end of the syringe. This was accomplished using the necessary PEEK tubing adapters for connecting PEEK tubing (I.D. 0.254 mm) to a syringe (Fig. S3). The loaded syringe system was placed into a custom extrusion spinning machine, as has been previously described (Fig. S4) (Copeland et al., 2015). The dope was extruded into the coagulation bath at a rate of 25 µl min−1 using a plunger press. Spinning bath contents were: coagulation bath of 99% IPA (high-purity), the first stretch bath at 80% IPA:20% ultra-pure water, and the final stretch bath solution at 20% IPA:80% ultra-pure water. The as-spun fibres were only extruded into the coagulation bath without being threaded through the spinning instrument. The stretches are reported as 1X1X, 1.5X1.5X, and 2X2X. For example, the 2X2X stretch denotes that the fibre was draw-processed to 2X the initial length in the first stretch bath (80% IPA) and then drawn again to 2X the length in the second stretch bath (20% IPA), for a final length of 4X the initial length (300% draw-processed). Stretches are obtained by varying the rotational speed of the godets in the stretch baths; a 2X stretch is obtained when the circumference of the second godet covers twice the distance of the first godet. The 1X1X stretch denotes that no deliberate stretch was applied in either bath, that is, all godets were rotating at the same speed. The fibre was simply pulled from the coagulation bath, threaded through the instrument, and collected on the winding spool. To promote drying and prevent fibres from sticking together or sticking to the collecting spool, drying lamps (DL in Fig. S4) were utilized for each spin between the final godet and the spool. Three stretch factors were investigated for this study: 1X1X (0% draw-processed), 1.5X1.5X (125% draw-processed), and 2X2X (300% draw-processed) and applied to all three protein formulations (rHIFα, 1:1 rHIFα:rHIFɣ (C387S), and rHIFɣ (C387S)).
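The relationship between the two per-bath stretch factors and the quoted draw-processing percentages can be checked with a short sketch. The function name is hypothetical; the only assumption is that the two stretches multiply and that the percentage is the increase over the initial length, which reproduces the 0%, 125%, and 300% values given in the text.

```python
def total_draw(stretch_bath_1, stretch_bath_2):
    """Return (final length as a multiple of initial, percent draw-processed)."""
    ratio = stretch_bath_1 * stretch_bath_2   # sequential stretches multiply
    percent = (ratio - 1.0) * 100.0           # increase over the initial length
    return ratio, percent


for s in (1.0, 1.5, 2.0):
    ratio, percent = total_draw(s, s)
    print(f"{s:g}X{s:g}X: {ratio:g}X final length, {percent:g}% draw-processed")
```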

Fibre mechanical analysis
Fibres were allowed to dry overnight on the collection spools at 24°C and 16% humidity. Individual fibre samples were then mounted across a rectangular 19 mm opening on plastic film C-shaped cards, as has been previously reported (Albertson et al., 2014). Tape and cyanoacrylate were used to secure the fibres to the cards to prevent slipping during measuring and testing. A Motic BA310 microscope coupled with the Motic Image Plus 2.0 program (Schertz, TX, USA) was then used to measure fibre diameters at nine different points along the 19 mm long fibre segments. An MTS Synergie 100 tensile testing instrument with a custom 10 g load cell (Transducer Techniques, Newark, CA, USA) was used to perform the uniaxial tensile test on the fibres at an extension rate of 5 mm min−1 and an acquisition rate of 120 Hz. Sample sizes of twenty to twenty-five individual fibres were tested for every protein and stretch factor combination. The displacement and load data were exported to Microsoft Excel. The raw data in Microsoft Excel, along with average fibre diameter measurements, were used to calculate the ultimate tensile strength, strain, toughness, and elastic modulus for each fibre.
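The calculations were performed in Microsoft Excel. The following is only a sketch of how such values are commonly derived from load and displacement data, not the authors' worksheet. It assumes a circular cross-section from the mean measured diameter, load already converted to newtons, a record that ends at failure, and a hypothetical strain cut-off (`modulus_max_strain`) for the initial linear region used to fit the modulus.

```python
import math

GAUGE_LENGTH_MM = 19.0  # card opening given in the text


def tensile_properties(displacement_mm, load_n, diameters_um,
                       modulus_max_strain=0.01):
    """Return strength, strain, toughness, and modulus for one fibre."""
    mean_diameter_m = sum(diameters_um) / len(diameters_um) * 1e-6
    area_m2 = math.pi * mean_diameter_m ** 2 / 4.0  # circular cross-section

    strain = [d / GAUGE_LENGTH_MM for d in displacement_mm]
    stress_mpa = [f / area_m2 / 1e6 for f in load_n]

    # Toughness: trapezoidal area under the curve (MPa x strain = MJ m^-3).
    toughness = sum(
        0.5 * (stress_mpa[i] + stress_mpa[i + 1]) * (strain[i + 1] - strain[i])
        for i in range(len(strain) - 1)
    )

    # Elastic modulus: least-squares slope over the initial region.
    pts = [(e, s) for e, s in zip(strain, stress_mpa) if e <= modulus_max_strain]
    if len(pts) < 2:
        raise ValueError("need at least two points in the initial region")
    mean_e = sum(e for e, _ in pts) / len(pts)
    mean_s = sum(s for _, s in pts) / len(pts)
    sxx = sum((e - mean_e) ** 2 for e, _ in pts)
    sxy = sum((e - mean_e) * (s - mean_s) for e, s in pts)
    if sxx == 0.0:
        raise ValueError("initial region has no spread in strain")

    return {
        "tensile_strength_mpa": max(stress_mpa),
        "strain_at_break": strain[-1],
        "toughness_mj_m3": toughness,
        "modulus_gpa": sxy / sxx / 1000.0,  # MPa per unit strain -> GPa
    }
```

Calling the function once per fibre with its nine diameter readings gives per-fibre values that can then be averaged for each protein and stretch factor combination.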

Fibre structural analysis
The collected fibre samples were also probed with Fourier-transform infrared (FTIR) spectroscopy to evaluate the compositions of secondary protein structures present. As-spun fibres were gently scooped out of the coagulation bath and allowed to dry in a similar fashion as the stretched fibres prior to analysis. A minimum of 50 individual fibres was obtained from the collection spools for each unique combination of protein (rHIFα, 1:1 rHIFα:rHIFɣ (C387S), and rHIFɣ (C387S)) and stretch factor (1X1X, 1.5X1.5X, and 2X2X). The individual fibres were then twisted together to form a multi-fibre bundle that was used for the spectroscopic analysis. A Varian 660-IR instrument (Agilent, Santa Clara, CA, USA) fitted with a horizontal MIRacle single reflection attenuated total reflectance module (Pike Technologies, Fitchburg, WI, USA) was used to obtain the FTIR spectra. Measurements were obtained for each bundle by wrapping it upon itself, into a small coil, and then securely clamping it directly onto the crystal stage. The collection was performed with Resolution Pro software over the spectral range of 600 to 4000 cm−1, with 32 scans, a resolution of 4 cm−1, and an aperture setting of 4 cm−1 at 4000 cm−1. Background scans were collected before each bundle with the exact conditions that were used for the bundle. Spectral correction and deconvolution for secondary structure quantification was performed at the Amide I region (~1600 to 1700 cm−1) using OriginPro and a similar method as previously described by Böni et al. (2018). The only variation to the described method was the use of Gaussian curves for peak fitting. All secondary structure peak assignments were based upon previous assignments used for the characterization of fibrous proteins utilized by Guo et al. (2018) and based upon the work by Hu et al. (2006).
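Once Gaussian components have been fitted to the Amide I band, the relative content of each secondary structure is commonly taken as the share of the total fitted area belonging to the components assigned to it. The sketch below shows that final step only, under that assumption. The fitting itself was done in OriginPro and is not reproduced here; the function names, labels, and peak parameters are hypothetical placeholders rather than fitted values from this study.

```python
import math


def gaussian_area(height, fwhm):
    """Area under a Gaussian given its peak height and full width at half maximum."""
    return height * fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


def structure_fractions(peaks):
    """Sum component areas per assignment and normalise to the total area.

    peaks: iterable of (assignment label, height, fwhm) tuples.
    """
    areas = {}
    for label, height, fwhm in peaks:
        areas[label] = areas.get(label, 0.0) + gaussian_area(height, fwhm)
    total = sum(areas.values())
    if total <= 0.0:
        raise ValueError("total fitted area must be positive")
    return {label: area / total for label, area in areas.items()}


# Placeholder components; heights and widths are illustrative only.
example_peaks = [
    ("beta-sheet", 0.30, 20.0),
    ("alpha-helix", 0.45, 22.0),
    ("beta-sheet", 0.10, 15.0),
    ("other", 0.20, 18.0),
]
for label, fraction in structure_fractions(example_peaks).items():
    print(f"{label}: {fraction:.1%}")
```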

Statistical analysis
All values presented are provided in mean ± standard deviation format. Analyses were first performed using a two-factor ANOVA to determine if any statistically significant differences were present. The specific significant differences were then determined with a Tukey post hoc analysis test. A P-value of <0.05 was considered statistically significant.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Fig. S1. Pairwise comparisons of mechanical properties for all stretched fibres and β-sheet content for all fibres, including as-spun. An asterisk indicates a statistically significant difference with a P-value ≤0.05, and a blank square indicates no statistically significant difference.
Table S1. BioFlo610 protein yields from cell mass purifications. Final BioFlo610 volumes were 95 and 68 l, and cell masses were 19 and 13 kg, for rHIFα and rHIFɣ (C387S), respectively.