Interaction of two strongly divergent archaellins stabilizes the structure of the Halorubrum archaellum

Abstract Halophilic archaea from the genus Halorubrum possess two extraordinarily diverged archaellin genes, flaB1 and flaB2. To clarify the role of each archaellin, we compared two natural Halorubrum lacusprofundi strains: one contains both archaellin genes, and the other has only the flaB2 gene. Both strains synthesize functional archaella; however, the strain in which both archaellins are present is more motile. In addition, we expressed these archaellins in a Haloferax volcanii strain from which the endogenous archaellin genes were deleted. Three Hfx. volcanii strains expressing Hrr. lacusprofundi archaellins produced functional filaments consisting of only one (FlaB1 or FlaB2) or both (FlaB1/FlaB2) archaellins. All three strains were motile, although there were profound differences in the efficiency of motility. Both native and recombinant FlaB1/FlaB2 filaments have greater thermal stability and resistance to low salinity stress than single-component filaments. Functional supercoiled Hrr. lacusprofundi archaella can be composed of either single archaellin, FlaB2 or FlaB1; however, the two divergent archaellin subunits provide additional stabilization to the archaellum structure and thus adaptation to a wider range of external conditions. Comparative genomic analysis suggests that the described combination of divergent archaellins is not restricted to Hrr. lacusprofundi but also occurs in organisms from other haloarchaeal genera.


| INTRODUCTION
Archaeal flagella (archaella) are morphologically and functionally similar to bacterial flagella. However, the archaellum structure, assembly mechanism, and protein composition are fundamentally different from the flagellum and instead show similarity to type IV pili.
Furthermore, the crystal structure of an N-terminally truncated form of the archaellin FlaB1 from Methanocaldococcus jannaschii has been determined at a resolution of 1.5 Å (Meshcheryakov et al., 2019). The structure of archaeal filaments differs significantly not only from bacterial flagella but also from bacterial type IV pili (Braun et al., 2016; Poweleit et al., 2016). The amino acid residues of archaellins responsible for intersubunit interactions were identified, as well as the protein regions forming the outside surface of the filament. The proposed models for the archaellar filament do not contain the long-pitch protofilaments found in bacterial flagellar filaments. To explain archaella supercoiling, it was proposed to consider them as semiflexible filaments in a viscous medium (Coq, Du Roure, Marthelot, Bartolo, & Ferm, 2008; Tony, Lauga, & Hosoi, 2006; Wolgemuth, Powers, & Goldstein, 2000). For such structures, thrust can be generated by their rotation. Using molecular modeling, it was shown that conformational changes in the globular domain of the archaellin can lead to extension and compression, as well as bending, of the filaments (Braun et al., 2016). However, the detailed mechanism of archaellum supercoiling is not fully understood.
The proposed models of spatial archaellar filament structure do not explain the structural and functional role of multiple archaellins.
Despite the presence of several archaellin genes in the genomes of M. hungatei, M. maripaludis, and P. furiosus, protein products of only one of these genes were found incorporated in their filaments (Daum et al., 2017; Meshcheryakov et al., 2019; Poweleit et al., 2016). This raises the question of why multiple different archaellins are encoded. Interestingly, the presence of several copies of archaellin genes in archaeal genomes is very common.
Currently, almost 3,000 archaeal genomic sequences are deposited in the NCBI database (https://www.ncbi.nlm.nih.gov/genome/browse/#!/prokaryotes/archaea), including about 400 genomes of halophilic archaea. The majority of these archaeal genomes contain archaellin genes. The known genomes of crenarchaeota typically only possess a single archaellin gene. However, the large majority of euryarchaeal genomes contain multiple archaellin genes.
In haloarchaea, multiple archaellin genes appear to have originated from duplication events (Desmond, Brochier-Armanet, & Gribaldo, 2007). Duplicated genes can be located in the same or different operons. For example, in three popular model haloarchaea, the organization of archaellin genes differs significantly. Haloarcula hispanica has three operons with 1 archaellin gene in each of them, two Haloferax volcanii archaellin genes are in one operon, and Halobacterium salinarum has two operons with 2 and 3 archaellin genes each. The similarity between the archaellin paralogs of the above species is still very high, indicating relatively recent duplication. Interestingly, several haloarchaea have multiple archaellin genes (most often, two) that are very divergent (such as Halobiforma, Halopiger, Halorubrum, Natrialba, and Natronolimnobius species).
Even archaea belonging to the same genus can differ drastically from each other in the number and size of the archaellin genes. It has been suggested that archaellum supercoiling, as in bacterial flagella, can be achieved through a combination of subfilaments of different lengths, constructed from different types of subunits (Tarasov, Pyatibratov, Beznosov, & Fedorov, 2004; Tarasov, Pyatibratov, Tang, Dyall-Smith, & Fedorov, 2000). For Hbt. salinarum, it was demonstrated that both archaellins FlgA1 and FlgA2 are necessary for the formation of a functional supercoiled archaellum, and mutant strains with a single FlgA1 or FlgA2 archaellin had straight nonfunctional filaments. In the case of methanogenic archaea, the multiple archaellin genes were shown to encode major and minor structural components of the archaellum filament. A "hook"-like structure was observed in a number of methanogenic archaea. In Methanococcus voltae and M. maripaludis, the archaellum hook segment is built of the FlaB3 archaellin, while the FlaB1 and FlaB2 proteins are the main components of the filament (Bardy, Mori, Komoriya, Aizawa, & Jarrell, 2002; Chaban et al., 2007). Inactivation of either the flaB1 or the flaB2 gene resulted in a loss of motility and cessation of archaellum synthesis (including the hook) (Chaban et al., 2007). Recently, it has been shown that FlaB1 is the predominant component of Methanococcus maripaludis filaments (Meshcheryakov et al., 2019).
Inactivation of the flaB3 gene leads neither to the cessation of filament synthesis nor to a noticeable change in motility on semisolid agar (Chaban et al., 2007). However, time-lapse microscopy showed impaired motility for this deletion strain (movement in a closed circle).
The structures corresponding to the Methanococcales hooks were not found in the native archaella of other archaea. It is possible that the differentiation of one of the archaellins into a "hook protein" with a special structural role is a relatively late evolutionary event in the Methanococcales and is not typical for other archaea.
In Hfx. volcanii, it was shown that the archaellum filament consists of one major (FlgA1) and one minor (FlgA2) component.
However, the structural role of the minor component is unknown and it does not form a hook-like structure (Tripepi, Esquivel, Wirth, & Pohlschröder, 2013). Deletion of the flgA2 gene leads to hypermotile cells by an unknown mechanism (Tripepi et al., 2013).
In Haloarcula marismortui, it was shown that the archaellins function as ecoparalogs; that is, they are expressed under different environmental conditions and provide distinct stability advantages under varying salt concentrations (Syutkin et al., 2014, 2019). In this work, we investigate the role of multiple archaellin genes of the Halorubrum genus. In contrast to the systems studied before, members of the Halorubrum group possess multiple archaellins with highly diverged protein sequences. Our preceding work has shown that functional supercoiled archaella filaments of Hrr. lacusprofundi ATCC 49239 (ACAM 34) are formed from a protein encoded by a single archaellin gene (flaB2) (Syutkin et al., 2012). However, unlike Hrr.
lacusprofundi ACAM 34, other Halorubrum species possess at least two archaellin genes (flaB1 and flaB2) located in one operon. The amino acid sequences of the FlaB1 and FlaB2 archaellins differ significantly from each other (<43% identical residues, the N-terminal region being more conserved). We use the Hrr. lacusprofundi archaella as a model to address the role of multiple highly divergent archaellin genes often found in genomes of haloarchaea.
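The identity figure quoted above can be reproduced from a pairwise alignment as the share of aligned columns that carry the same residue. The following is a minimal sketch and not the authors' procedure: the function name `percent_identity`, the choice of denominator (columns in which neither sequence has a gap), and the toy sequences are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not real archaellin sequences)
toy_a = "MFE-KITDE"
toy_b = "MFEQRLTDD"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, and they give different percentages for the same alignment, so the denominator should be stated alongside any identity value.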
Recently, Hrr. lacusprofundi strains (DL18 and R1S1) with two archaellin genes (which is more typical for Halorubrum species) were isolated (Tschitschko et al., 2018). Since the presence of a single archaellin gene in Hrr. lacusprofundi ACAM 34 is sufficient to form functional supercoiled archaella, the presence of the flaB1 gene may seem redundant. FlaB1 could be responsible for (a) formation of specific filaments that differ from FlaB2 filaments in function (e.g., functioning as ecoparalogs) and (b) stabilization of the filament structure together with FlaB, possibly as a result of constructive neutral evolution (Lukeš, Archibald, Keeling, Doolittle, & Gray, 2011). However, the combination of the two proteins renders the archaellum filament structure much more stable, which can help maintain motility in a wider range of conditions.
TABLE 1 Plasmids and strains

| Strains and growth conditions
The plasmids and strains used in this study are listed in Table 1.

| Preparation of DNA and polymerase chain reaction (PCR)
The plasmids for heterologous archaellin expression were assembled by the SLIC method (Li & Elledge, 2012) with modifications.
These expression vectors included the inducible tryptophanase promoter (tna) to drive expression of these genes. The DNA fragments containing the desired archaellin genes were amplified from the Hrr. lacusprofundi or Hrr. saccharovorum genome with the primers described in and then stopped by the addition of 10 mM dCTP. The resulting mix was used for E. coli transformation. The colonies that appeared on the next day were analyzed by PCR with the pTA1228_seqF and pTA1228_seqR primers. The plasmids from positive colonies were isolated, and correct assembly of the plasmid was confirmed by sequencing with the primers used for colony screening.

| Chromatography mass spectrometry analysis
Protein bands were excised and treated with proteinase K (Promega) and trypsin (Sigma) at 37°C in a Thermo Mixer thermo shaker (Eppendorf, Germany). To stabilize proteinase K, CaCl2 was added to the solution to a final concentration of 5 mM. The molar ratio of enzyme to protein was 1/50. The reaction was stopped by adding trifluoroacetic acid to the solution. Before mass spectrometric analysis, the peptides were separated by reversed-phase high-performance liquid chromatography using an Easy-nLC 1000 nano-liquid chromatograph (Thermo Fisher Scientific). The separation was carried out on a homemade column 25 cm in length and 100 μm in diameter packed with a C18 adsorbent with a particle size of 3 μm and a pore size of 300 Å. The column was packed under laboratory conditions at a pressure of 500 atm.
The peptides were eluted in a gradient of acetonitrile from 3% to 40% for 180 min; the mobile phase flow rate was 0.3 μl/min. Mass spectra of the samples were obtained using an OrbiTrap Elite mass spectrometer (Thermo Scientific, Germany). The peptides were ionized by electrospray at nano-liter flow rates with 2 kV ion spray voltage; ion fragmentation was induced by collisions with an inert gas (collision-induced dissociation in a high-energy cell).
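The chromatographic parameters given above allow a few quick consistency checks. The sketch below assumes a linear gradient (the text states the start and end percentages but not the gradient shape) and treats the column as an empty cylinder, ignoring the packing; the function names are hypothetical.

```python
import math


def acetonitrile_percent(t_min: float, start: float = 3.0, end: float = 40.0,
                         duration_min: float = 180.0) -> float:
    """Acetonitrile percentage at time t, ASSUMING a linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time outside the gradient window")
    return start + (end - start) * t_min / duration_min


def empty_column_volume_ul(length_cm: float = 25.0,
                           inner_diameter_um: float = 100.0) -> float:
    """Geometric volume of the empty column in microliters (1 cm^3 = 1000 ul)."""
    radius_cm = inner_diameter_um * 1e-4 / 2.0
    return math.pi * radius_cm ** 2 * length_cm * 1000.0


flow_ul_per_min = 0.3
total_eluent_ul = flow_ul_per_min * 180.0  # volume delivered over the gradient

print(acetonitrile_percent(90.0))           # composition at the gradient midpoint
print(round(empty_column_volume_ul(), 2))   # geometric column volume, ul
print(total_eluent_ul)                      # eluent volume over 180 min, ul
```

Under these assumptions the gradient delivers many geometric column volumes of eluent, which is the regime in which a shallow gradient of this kind is normally run.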
The mass spectra were processed, and peptides were identified using the Thermo Xcalibur Qual Browser and PEAKS Studio (ver. 7.5) programs based on the sequences of UniRef-100. The parent mass error tolerance was 2.0 ppm, and the fragment mass error tolerance was 0.1 Da. Only peptides with a "-10lgP" score higher than 15 were taken into account.
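The acceptance criteria above can be written as a simple filter. This is an illustrative sketch only: it assumes that the score is the usual -10·log10(P) transform of a P-value, the function names are hypothetical, and the peptide values are invented. It is not the scoring code of the software used in the study.

```python
import math


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def minus_10_lg_p(p_value: float) -> float:
    """Convert a P-value to the -10*log10(P) scale."""
    return -10.0 * math.log10(p_value)


def passes_filters(observed_mz: float, theoretical_mz: float, p_value: float,
                   ppm_tolerance: float = 2.0,
                   score_threshold: float = 15.0) -> bool:
    """True if the parent-mass error and the score meet the stated cut-offs."""
    within_mass = abs(ppm_error(observed_mz, theoretical_mz)) <= ppm_tolerance
    return within_mass and minus_10_lg_p(p_value) > score_threshold


# Hypothetical peptide: values invented for illustration only
print(passes_filters(800.4012, 800.4000, 0.001))
```

On this scale, a threshold of 15 corresponds to a P-value of about 0.03, so higher scores mean more confident identifications.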

| Electron microscopy
The archaellar filament specimens were prepared by negative staining with 2% uranyl acetate on Formvar-coated copper grids. A grid was floated on a 20-µl drop of filament solution (about 0.01 mg/ml, in 20% NaCl, 10 mM Na-phosphate, pH 8.0) for 2 min, blotted with filter paper, placed on top of a drop of 2% uranyl acetate, and left for 1-1.5 min. Excess stain was removed by touching the grid to filter paper, and the grid was air-dried. Samples were examined on a Jeol JEM-1400 transmission electron microscope (JEOL, Japan) operated at 120 kV. Images were recorded digitally using a high-resolution water-cooled bottom-mounted CCD camera.
The measurements and necessary calculations were performed according to Privalov and Potekhin (1986) and described in detail in Tarasov, Kostyukova, Tiktopulo, Pyatibratov, and Fedorov (1995).

| Limited proteolysis
Limited proteolysis by trypsin (Sigma) was performed in 10 mM Na-phosphate buffer, pH 8.0, at 21°C. Aliquots of 20 µl were taken for electrophoresis at defined time points. The reaction was terminated by adding an equimolar amount of ovomucoid trypsin inhibitor (Sigma).
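Trypsin cleaves on the C-terminal side of lysine and arginine residues, so the candidate cleavage sites of a protein can be listed from its sequence alone. The sketch below does only that; the function name, the optional "not before proline" rule, and the toy sequence are assumptions for illustration. Which of the candidate sites are actually cut in an intact filament depends on their accessibility, which is what limited proteolysis probes.

```python
from typing import List


def tryptic_sites(sequence: str, skip_before_proline: bool = True) -> List[int]:
    """1-based positions of K/R residues after which trypsin may cleave."""
    seq = sequence.upper()
    sites = []
    for i, residue in enumerate(seq):
        if residue not in "KR":
            continue
        if i == len(seq) - 1:
            continue  # nothing is released after the last residue
        if skip_before_proline and seq[i + 1] == "P":
            continue  # commonly used rule: K/R followed by P is not cut
        sites.append(i + 1)
    return sites


# Toy sequence (hypothetical, not an archaellin fragment)
toy = "AGKTLRPSDKEE"
print(tryptic_sites(toy))
```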

| Phylogenetic reconstruction
Sequences were aligned with MAFFT (Katoh & Standley, 2013) or PRANK (Löytynoja & Goldman, 2008). For some analyses, unreliably aligned sites were removed using GUIDANCE (Sela, Ashkenazy, Katoh, & Pupko, 2015). The search for the best model to describe sequence evolution and the search for the maximum likelihood tree were performed in IQ-TREE (Nguyen, Schmidt, von Haeseler, & Minh, 2015) using the Bayesian information criterion (BIC). The only differences between the phylogenetic reconstructions from the different alignments are that, for alignments filtered for conserved sites, the branches are shorter (due to the removal of variable sites), the bootstrap support values are lower, and the most appropriate models determined with IQ-TREE are simpler, because removing the variable, more-difficult-to-align sites also removes phylogenetic information.
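Model selection by BIC trades goodness of fit against the number of free parameters: BIC = k·ln(n) − 2·lnL, and the model with the lowest value is preferred. The sketch below only illustrates this formula; the model names, log-likelihoods, parameter counts, and alignment length are invented, and it is not a reimplementation of the model search performed by IQ-TREE.

```python
import math


def bic(log_likelihood: float, n_parameters: int, n_sites: int) -> float:
    """Bayesian information criterion: k*ln(n) - 2*lnL (lower is better)."""
    return n_parameters * math.log(n_sites) - 2.0 * log_likelihood


# Hypothetical candidates: (log-likelihood, free parameters); numbers invented
candidates = {
    "simple_model": (-5230.0, 41),
    "richer_model": (-5205.0, 61),
}
n_sites = 200  # hypothetical alignment length

scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best)
```

In this toy case the richer model fits better, but not by enough to offset the penalty for its extra parameters, so the simpler model is selected. The same trade-off explains why simpler models are chosen once the variable sites have been filtered out.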

| Comparison of natural Hrr. lacusprofundi strains DL18 and ACAM 34
Both Hrr. lacusprofundi strains were isolated from the relict hypersaline Deep Lake in Antarctica (Franzmann et al., 1988;Liao et al., 2016). Due to its high salinity, this lake never freezes and its surface temperature ranges from −20°C to +10°C depending on the season. In the laboratory, Hrr. lacusprofundi cells can grow at temperatures ranging from −1°C to +44°C (optimum temperature is 33°C) (Franzmann et al., 1988).
This strikingly distinguishes them from other haloarchaeal systems.
For example, the identity between the two Hfx. volcanii archaellins
Earlier, we showed that cells of the Hrr. lacusprofundi ACAM 34 strain are motile on semisolid media (Syutkin et al., 2012). We compared the motility of both strains on semisolid 0.19% agar under the same conditions and found that the DL18 strain shows significantly higher motility (Figure 1). It should be noted that the growth rate of DL18 in liquid media is also higher than that of ACAM 34 (Figure A3). The maximum archaella yield in the late stationary phase was approximately 10 mg per liter of culture for both strains. In contrast to ACAM 34, cells of the DL18 strain demonstrate more stable motility and archaella production. To obtain the relatively motile ACAM 34 cells with high archaella yield that were used in the above experiment, it was necessary to pass cells through semisolid (0.19%) agar with 2-3 cycles of selection of the most motile cells (Syutkin et al., 2012). When ACAM 34 cells that had been kept for a long time (about 1 month) in a liquid medium were used as inoculum, the archaella yield decreased dramatically.

| Heterologous expression of Hrr. lacusprofundi archaellins in Haloferax volcanii
The analysis of natural Halorubrum strains allowed us to isolate archaellar filaments consisting of FlaB1/FlaB2 and FlaB2 archaellins.
Next, we aimed to study whether the FlaB1 archaellin is capable of producing functional archaella. To this end, we expressed the dif-

| Comparison of natural and recombinant Hrr. lacusprofundi archaella
The archaella isolated from the natural DL18 and ACAM 34 strains were designated HL-B1B2-N and HL-B2-N, respectively. SDS-PAGE of Hrr. lacusprofundi ACAM 34 archaellum filaments (Figure 3) showed a single major band, corresponding to a molecular mass of ~50 kDa. For the DL18 strain, the same band and an additional major band of ~37 kDa were observed (Figure 3). Mass spectrometry analysis (Figure A13) confirmed that the isolated proteins were Hrr. lacusprofundi archaellins FlaB2 (~50 kDa) and FlaB1 (~37 kDa). The apparent molecular masses determined by SDS gel electrophoresis are higher than the true values (23.6 and 19.8 kDa for FlaB1 and FlaB2, respectively), which is typical for halophilic archaellins due to their high content of acidic amino acid residues and posttranslational modifications (Fedorov, Pyatibratov, Kostyukova, Osina, & Tarasov, 1994; Gerl & Sumper, 1988; Pyatibratov et al., 2008).
Interestingly, the HL-B1B2-R strain has an advantage in motility at 15% and especially at 10% NaCl (Figure 5). Thus, the two-component archaella appear to be better adapted to environmental salinity changes than the one-component archaella, allowing the species to occupy a larger ecological niche.
Interestingly, the staining intensities of natural Hrr. saccharovorum archaellins are noticeably lower than those of the recombinant archaellins. At the same time, both are less glycosylated than natural Hrr. lacusprofundi archaellins.

| Scanning microcalorimetry experiments
To obtain additional information regarding archaella of different composition, we applied differential scanning microcalorimetry (DSC). Isolated Hrr. lacusprofundi archaella were heated in near-natural (20% NaCl) and low (10% NaCl) salt conditions. The DSC data for recombinant filaments are similar to the results obtained for the natural archaella. The temperature of the heat absorption peak maximum of HL-B1B2-R archaella (92°C) in 20% NaCl was noticeably higher than that of recombinant filaments consisting only of either FlaB1 (86°C) or FlaB2 (74°C) (Figure 6d; Table 3).
These temperatures were all slightly lower than those observed for the natural filaments isolated from Hrr. lacusprofundi (HL-B1B2-N, 97.5°C and HL-B2-N, 80°C). On melting at 10% NaCl, an extended heat absorption peak with a maximum at about 42°C was observed for the HL-B2-R archaella (in comparison with two peaks at 39 and 45°C for HL-B2-N). The HL-B1-R and HL-B2-R melting curves are different (Figure 6c; Table 3). The combination of both subunits in one archael- and 10% NaCl, since at 20% NaCl the natural Hrr. saccharovorum archaella melted near the upper limit of the experimental temperature range. In this case, the Tm of all three types of filaments was very similar, which is in line with the findings for Hrr. lacusprofundi archaella (Figure A10).
Thus, in general, the FlaB1/FlaB2 filaments are slightly more stable than the FlaB1 filaments and much more stable than the FlaB2 filaments.
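The stability comparison above rests on the temperature at which the heat absorption peak reaches its maximum. As a minimal sketch, this temperature can be read from a melting curve as shown below. The trace is synthetic (a single Gaussian-shaped transition placed at 92°C for illustration), the function name is hypothetical, and a real analysis would first subtract the baseline and, for overlapping transitions, deconvolute the curve.

```python
import math


def peak_maximum(temperatures, heat_capacity):
    """Return (T, Cp) at the largest excess heat capacity value."""
    if len(temperatures) != len(heat_capacity) or not temperatures:
        raise ValueError("need two non-empty series of equal length")
    index = max(range(len(heat_capacity)), key=heat_capacity.__getitem__)
    return temperatures[index], heat_capacity[index]


# Synthetic trace: one Gaussian-shaped transition centred at 92 degrees C
temps = [60.0 + 0.5 * i for i in range(81)]  # 60.0 to 100.0 in 0.5-degree steps
trace = [math.exp(-((t - 92.0) ** 2) / (2 * 3.0 ** 2)) for t in temps]

t_max, _ = peak_maximum(temps, trace)
print(t_max)
```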

| Limited proteolysis confirms the interaction of Hrr. lacusprofundi archaellins in the filaments
To probe conformational features of two-component and one- are localized between the N-terminal α-helix and β-strand 1, and within β-strand 2, respectively (Poweleit et al., 2016). The presence of FlaB1 seems to protect these sites, possibly by shielding them from trypsin.

| Bioinformatical analysis of Halorubrum archaellins
After analyzing the Halorubrum genomes available on insert date (50 in total), we found that in most species the organization of ar-  Halorubrum species. Thus, identities between the three Hrr. halodurans archaellins are >55%, and those between the four Hrr. vacuolatum archaellins are >60%.
The Hrr. halodurans archaellin genes constitute a single operon: the flaBa and flaBb genes are separated by a CG spacer, and the start codon of the flaBc gene immediately follows the flaBb stop codon. The Hrr. vacuolatum genome contains three archaellin operons, one of which consists of two genes separated by a spacer of four nucleotides (GACC).
Interestingly, for the other haloarchaeal genera (Halopiger, Natrialba, Halobiforma, Natronolimnobius, and Natrarchaeobius) the situation with archaellins is very similar to that of Halorubrum.
Despite the rather high similarity of their archaellins with that of and Natronolimnobius are placed in the Natrialbales. Figure 8 and It should be emphasized that, as can be seen from the evolutionary tree (Figure 8)

| DISCUSSION
The archaeal motility structure, the archaellum, consists of thousands of copies of N-terminally cleaved archaellin subunits. While crenarchaea usually encode a single type of archaellin, the euryarchaea are characterized by the presence of multiple types of archaellin encoding genes. Recently, high-resolution structures of archaellar filaments of methanogens and hyperthermophilic euryarchaea became available (Daum et al., 2017;Meshcheryakov et al., 2019;Poweleit et al., 2016). In these structures, only a single type of archaellin is present in the filament, even though several archaellin genes are present in the genome. Daum et al. suggested that these other archaellins either (a) are minor, and form specific basal or terminal segments of the filament, or (b) that each of the different types of archaellins forms individual filaments (Daum et al., 2017).
We aimed to understand the biological relevance of archaellin  We suppose that FlaB2 adopts a final more stable conformational state by interacting with FlaB1. It can also be assumed that FlaB2 peaks, we observe a new peak of heat absorption that melting point was slightly higher than for FlaB1 filaments and significantly higher than for FlaB2 filaments.
Thus, the two-component composition of Hrr. lacusprofundi archaellar filaments contributes to additional stabilization of the archaellum structure and to adaptation to a wider range of external conditions, but it is not required for archaella supercoiling.
By applying the heterologous expression of the Hrr. lacusprofundi archaellins in Hfx. volcanii, we demonstrated that archaellins can assemble in functional archaella, even in species that possess highly divergent archaellin genes. This suggests that foreign archaellin genes captured via horizontal transfer can quite easily adapt to the assembly and glycosylation system of the new host. Exchange of archaellins could provide an evolutionary advantage as it might allow adaptation to new environments or block the attachment of archaellum specific viruses (Pyatibratov et al., 2008;Tschitschko et al., 2018).
Euryarchaea are characterized by the genomic presence of multiple different archaellin genes. Most euryarchaea have two archaellin genes, but in some cases, the number of different archaellins is even higher (such as Hht. litchfieldiae, which has seven archaellin genes) (Tschitschko et al., 2015). FlgA2, FlgB1, and FlgB3 (Gerl et al., 1989). Earlier, it had been suggested that archaellin multiplicity may cause the archaellum to become supercoiled (Tarasov et al., 2000). This hypothesis was based on an analogy with bacterial flagellar filaments, where two conformational states of flagellin provide filament supercoiling (Calladine, 1978). It was shown that inactivation of the Hbt. salinarum archaellin genes led to a disruption of archaella assembly, while only straight filaments could be formed from the product of a single archaellin gene (flgA1 or flgA2) (Tarasov et al., 2000, 2004). In the case of Hrr. lacusprofundi, the presence of two archaellins is not required for supercoiling, as functional filaments can be formed from each of the single archaellin types. However, the presence of two archaellins provides extra stability to the filament, allowing it to better withstand ionic stress conditions and to provide the highest level of motility. Thus, with this work, we add another aspect of encoding multiple archaellins to the previously discovered mechanisms.
Besides forming specialized minor components of the filament, or acting as ecoparalogs, we now show that multiple archaellins can also be important to form the filament with optimal properties in terms of flexibility and stability. Also, we provide evidence that the exchange of archaellins between different species can result in functional archaellum filaments. Together, these findings sketch an

CONFLICT OF INTEREST
None declared.

ETHICS STATEMENT
None required.

DATA AVAILABILITY STATEMENT
All data are provided in the results section and appendices (Table A1 and Figures A1-A17).

FIGURE A11
WebLogo representation of the alignment of internal sequences of Hrr. lacusprofundi FlaB1 and FlaB2 archaellins. In this representation, the overall height of a stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino acid at that position (Crooks, Hon, Chandonia, & Brenner, 2004). Left column: amino acid residues 50-75; 53 Halorubrum FlaB1 sequences and 55 Halorubrum FlaB2 sequences were used. Right column: C-terminal sequences (~25 amino acid residues from the C-termini); 50 FlaB1 sequences and 53 FlaB2 sequences were used. The tables below show the corresponding characteristic signatures for FlaB1 and FlaB2. Residues conserved in both archaellins are colored red, conserved residues that differ between FlaB1 and FlaB2 are blue, and weakly conserved residues are black.
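The stack height described in this legend is the information content of an alignment column, that is, the maximum possible entropy for the alphabet minus the observed Shannon entropy, with each symbol scaled by its relative frequency. The sketch below illustrates this for a protein alphabet of 20 residues. The function names are hypothetical and the small-sample correction applied by sequence logo software is omitted, so the values are approximate for columns built from few sequences.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Stack height in bits: log2(alphabet) minus Shannon entropy of the column."""
    residues = [r for r in column.upper() if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def symbol_heights(column: str, alphabet_size: int = 20) -> dict:
    """Height of each residue: relative frequency times the stack height."""
    residues = [r for r in column.upper() if r != "-"]
    info = column_information(column, alphabet_size)
    counts = Counter(residues)
    return {res: info * n / len(residues) for res, n in counts.items()}


# Toy columns (hypothetical): fully conserved versus evenly split
print(round(column_information("GGGG"), 3))
print(round(column_information("GGAA"), 3))
print(symbol_heights("GGGA"))
```

A fully conserved column reaches the maximum of log2(20) bits, and every halving of the dominant residue's share between two equally frequent residues costs one bit.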