Cyanobacteria contain a structural homologue of the Hfq protein with altered RNA-binding properties


  • Andreas Bøggild,

    1.  Centre for mRNP Biogenesis and Metabolism, University of Aarhus, Denmark
    2.  Department of Molecular Biology, University of Aarhus, Denmark
  • Martin Overgaard,

    1.  Centre for mRNP Biogenesis and Metabolism, University of Aarhus, Denmark
    2.  Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, Denmark
  • Poul Valentin-Hansen,

    1.  Centre for mRNP Biogenesis and Metabolism, University of Aarhus, Denmark
    2.  Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, Denmark
  • Ditlev E. Brodersen

    1.  Centre for mRNP Biogenesis and Metabolism, University of Aarhus, Denmark
    2.  Department of Molecular Biology, University of Aarhus, Denmark

  • Database
    Structure models and diffraction data are available in the Protein Data Bank under the accession numbers 3HFO (Syn-Hfq) and 3HFN (Ana-Hfq)

D. E. Brodersen, University of Aarhus, Denmark, Gustav Wieds Vej 10c, DK-8000 Aarhus C, Denmark
Fax: +45 8612 3178
Tel: +45 8942 5259


Hfq proteins are common in many species of enterobacteria, where they participate in RNA folding and translational regulation through pairing of small RNAs and messenger RNAs. Hfq proteins share the distinctive Sm fold, and form ring-shaped structures similar to those of the Sm/Lsm proteins regulating mRNA turnover in eukaryotes. However, bacterial Hfq proteins are homohexameric, whereas eukaryotic Sm/Lsm proteins are heteroheptameric. Recently, Hfq proteins with poor sequence conservation were identified in archaea and cyanobacteria. In this article, we describe crystal structures of the Hfq proteins from the cyanobacteria Synechocystis sp. PCC 6803 and Anabaena PCC 7120 at 1.3 and 2.3 Å resolution, respectively, and show that they retain the classic Sm fold despite low sequence conservation. In addition, the intersubunit contacts and RNA-binding site are divergent, and we show biochemically that the proteins bind very weakly to known Escherichia coli Hfq target RNAs in vitro. Moreover, when expressed in E. coli, the proteins cannot mediate Hfq-dependent RNA regulation. It therefore appears that the cyanobacterial proteins constitute a specialized subfamily of Hfq proteins that bind relatively weakly to A/U-rich tracks of regulatory RNAs. The results have implications for our understanding of the evolution of the Sm fold and the Hfq proteins in the bacterial kingdom in general.


Abbreviations

Ana-Hfq, Anabaena PCC 7120 Hfq; Eco-Hfq, Escherichia coli Hfq; IPTG, isopropyl thio-β-d-galactoside; Mja-Hfq, Methanococcus jannaschii Hfq; Pae-Hfq, Pseudomonas aeruginosa Hfq; Sau-Hfq, Staphylococcus aureus Hfq; sRNA, small noncoding RNA; Syn-Hfq, Synechocystis sp. PCC 6803 Hfq


During the last decade, it has become increasingly clear that the Sm-like (Lsm) protein Hfq, initially identified as a host factor required for Qβ bacteriophage replication, acts as a global post-transcriptional regulator in enterobacteria [1–3]. Most of our present knowledge derives from detailed studies of small noncoding RNAs (sRNAs) in Escherichia coli and Salmonella. Here, Hfq has been shown to be intimately linked to the activity of a family of sRNAs that function in translational control by base-pairing with 5′-UTRs or the first few codons of target mRNAs [4–6]. Hfq binds strongly to the sRNAs at single-stranded AU-rich regions, and the interaction stabilizes many of the small RNAs. Hfq also binds to target mRNAs, and the protein has been shown in several cases to directly promote sRNA–mRNA duplex formation [7–9]. Recent data have demonstrated that Hfq strongly enhances association rates of sRNAs and mRNAs, a property that is perfectly suited for establishing efficient regulation by sRNAs that have only short, and often incomplete, target complementarity [10–13]. The simplest interpretation of these results is that Hfq is able to bind both the sRNA and the target mRNA simultaneously, thereby physically increasing the likelihood of annealing. In an alternative model, enhanced hybridization may result from the ability of Hfq to alter the secondary structure of either sRNA or mRNA (chaperone activity) [14–16]. Finally, Hfq has been identified as an important virulence factor in several Gram-negative bacteria (such as species of Salmonella, Yersinia, Pseudomonas, Brucella, Legionella, and Vibrio) [17–21] as well as in the Gram-positive pathogen Listeria monocytogenes [22].

The Sm/Lsm proteins generally found in eukaryotes and archaea are characterized by the presence of two relatively conserved motifs, Sm1 and Sm2, which are separated by a region of variable length and sequence. The bipartite sequence motif constitutes an autonomously folded domain (the Sm fold) composed of an N-terminal α-helix followed by a twisted five-stranded β-sheet, responsible for both protein oligomerization and RNA binding [23–25]. The evolutionary connection between the Sm/Lsm proteins and Hfq was established in 2002 by the finding that Hfq possesses certain structural and functional properties in common with the Sm proteins [7,9]. However, although the Sm1 motif is clearly present in the Hfq proteins, no obvious Sm2 motif can be identified. Rather, these proteins contain a different but conserved motif in the Sm2 region, with the sequence signature [Y/F]KHAI (Fig. 1A). Like its eukaryotic counterparts, Hfq oligomerizes into ring-shaped or doughnut-shaped structures, but its structure is homohexameric rather than heteroheptameric like the eukaryotic Sm/Lsm complexes [7,9,26]. Crystal structures of Hfq are currently available from three bacterial species, Staphylococcus aureus (Sau-Hfq) (Protein Data Bank: 1KQ1), E. coli (Eco-Hfq) (Protein Data Bank: 1HK9), and Pseudomonas aeruginosa (Pae-Hfq) (Protein Data Bank: 1U1S and 1U1T), and one archaeon, Methanococcus jannaschii (Mja-Hfq) (Protein Data Bank: 2QTX) [16,27–29]. All structures contain the distinctive Sm fold backbone conformation and share intersubunit contacts. The archaeal and E. coli Hfq proteins display very similar biochemical and biological properties, and these, as well as Mja-Hfq, can partly or fully complement E. coli hfq deletion strains [27]. 
Furthermore, the crystal structure of Sau-Hfq bound to an AU5G-heptamer RNA (Protein Data Bank: 1KQ2) showed details of RNA binding at the primary binding site and revealed the molecular mechanism of specificity towards A and U oligoribonucleotides [16].

Figure 1.

 Sequence alignment and overall structures. (A) Sequence alignment of the Syn-Hfq and Ana-Hfq sequences, along with those of Eco-Hfq, Pae-Hfq, Sau-Hfq, and Mja-Hfq. Dashed boxes show the location of the Sm1 and Sm2 regions, with the signature sequences in bold yellow letters and additional residues known to interact with RNA in bold red letters. Residues forming the nucleotide-binding pockets in the Sau-Hfq–RNA complex are indicated by an asterisk. For the remaining sequence letters, partially and fully conserved residues are shown in blue and red letters, respectively. The figure was produced with multalin [47] and secseq [48]. (B) Final σA-weighted 2mFo – DFc electron density map for the Syn-Hfq structure. (C) The overall structure of Syn-Hfq. (D) The overall structure of Ana-Hfq. Monomers related by crystal symmetry are shown in the same colours.

In 2002, a standard blast search was able to identify Hfq candidates in about half of the completed or nearly completed bacterial genomes, approximately 140 at the time [30]. On the basis of the apparent lack of Hfq-type proteins in several bacterial phyla, in some proteobacteria, and in a number of Gram-positive bacteria, it was suggested that the Hfq protein either was of ancient origin, but was subsequently lost in some evolutionary branches, or evolved later and spread outside the proteobacterial phyla by lateral gene transfer [30]. More recently, the combined use of motif and pattern sequence searches led to the identification of Hfq orthologues in some of the organisms from which it was thought to be absent, including the cyanobacterium Anabaena PCC 7120, and subsequently in a wide variety of unicellular and filamentous cyanobacteria, including Synechocystis sp. PCC 6803, as well as in a prochlorophyte, Prochlorococcus [3]. This suggested that Hfq proteins are more highly conserved in bacteria than initially anticipated.

The Synechocystis sp. PCC 6803 protein (Syn-Hfq) and Anabaena PCC 7120 protein (Ana-Hfq) constitute a new group of Hfq proteins, quite separate from other bacterial Hfqs. Of 69 aligned residues, 32 are identical between Ana-Hfq and Syn-Hfq, and another 13 can be considered to be conservative substitutions, giving an overall similarity of 65% (Fig. 1A). This makes these Hfq orthologues more similar to each other than to any other Hfq protein for which the structure is known. When the sequences are compared with those of the E. coli, S. aureus and P. aeruginosa orthologues, it is clear that Syn-Hfq and Ana-Hfq stand out as a separate group (Fig. 1A). In particular, the signature sequence [Y/F]KHAI in the Hfq Sm2 region is significantly different, being RLAAI in Syn-Hfq and WKQAI in Ana-Hfq. As this sequence motif is known to form part of the RNA interaction surface of the protein, we decided to study the structures and function of the cyanobacterial orthologues to elucidate the RNA-binding site as well as the maintenance of the overall structure of the proteins. Moreover, interest in these proteins has recently been fuelled by the finding that knock-out of Hfq in Synechocystis results in loss of motility and greatly reduced mRNA levels for a defined set of genes in this organism [31].
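The similarity figure quoted above follows directly from the alignment counts. As a quick arithmetic check (a minimal sketch; the variable names are illustrative and are not taken from the original analysis):

```python
# Alignment counts quoted in the text for Ana-Hfq versus Syn-Hfq.
aligned = 69        # aligned residues
identical = 32      # identical positions
conservative = 13   # conservative substitutions

# Identity counts only identical positions; similarity also counts
# conservative substitutions.
identity_pct = 100 * identical / aligned
similarity_pct = 100 * (identical + conservative) / aligned

print(f"identity   = {identity_pct:.1f}%")
print(f"similarity = {similarity_pct:.1f}%")
```

With these counts, the identity is about 46% and the similarity about 65%, in line with the value cited for Fig. 1A.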

In this article, we present the crystal structures of the Hfq orthologues from Synechocystis sp. PCC 6803 and Anabaena PCC 7120 at 1.3 and 2.3 Å resolution, respectively. The structures reveal that the cyanobacterial Hfqs are remarkably similar in structure to other known bacterial Hfq proteins, despite missing several key sequence elements. However, the proteins possess variant RNA-binding sites and have a significantly lower affinity for known base-pairing sRNAs in vitro. In addition, they cannot mediate Hfq-dependent regulation in E. coli in vivo. The Syn-Hfq structure, to our knowledge, represents the highest-resolution crystal structure of any Hfq or Sm protein to date, and thus provides an unprecedentedly accurate view of details of the Sm fold.


Structure determination and crystal packing

Both Syn-Hfq and Ana-Hfq appear as stable hexamers in solution as determined by gel filtration during purification of the proteins (not shown). Single, three-dimensional crystals of Syn-Hfq grew in 56–62% of a commercial mix of organic acid salts known as Tacsimate at pH 7. The crystals belong to the face-centred space group F222 and diffract beyond 1.3 Å resolution (Table 1). Ana-Hfq crystallized as thin needles in 2 M ammonium sulfate and 0.1 M citric acid at pH 3.5, and never grew to more than 5 μm width. The crystals belong to space group P3, and data could be collected to 2.3 Å when the crystals were carefully mounted in litho loops and exposed to very strong synchrotron radiation. Both structures were determined by molecular replacement, using poly-alanine models of the known structures of Eco-Hfq (for Syn-Hfq) or Pae-Hfq (for Ana-Hfq), and refined iteratively with manual rebuilding. For Ana-Hfq, the cumulative structure factor statistics and phenix.xtriage indicated that the crystals suffered from merohedral twinning, resulting in pseudo-six-fold symmetry. Refinement against twinned data was consequently initiated in phenix and completed using refmac, which is able to use a maximum likelihood target function for twinned data. Automatic detection and refinement of twin domains in refmac yielded twin fractions of 8.5%, 45.5% and 46.0% for the domains <h,k,l>, <−h,k,−l>, and <k,h,−l>, respectively. The final refinement resulted in R/Rfree of 15.2%/20.1% (Syn-Hfq, anisotropic B-factors; Table 1) and 27.2%/28.6% (Ana-Hfq, isotropic B-factors, twinned refinement; Table 1). The high-resolution electron density map for Syn-Hfq is exceptionally clear (Fig. 1B), except in the loop region 51–53, which appears to have an alternative, unmodelled conformation and therefore contributes to the R-factors being slightly on the high side for a 1.3 Å structure. The final model covers residues 5–69 in chain A, and residues 5–70 in chains B and C (Fig. 1C). For Ana-Hfq, the map is also generally very good, despite the crystals being twinned, and the final model covers residues 8–70 of both monomers, apart from the loop residues 52–54 (Fig. 1D). The R-factors for Ana-Hfq are relatively high, presumably because of twinning, but the structure solution was validated early on by the presence of clear side chain density for many residues in the initial electron density map calculated using phases from an all-alanine search model (Fig. S1). Furthermore, the relatively high sequence identity between Syn-Hfq and Ana-Hfq gives us confidence that the structure is correct, but care must, of course, be taken when interpreting the fine details of side chain positions in such a structure. Syn-Hfq crystallizes with three monomers in the crystallographic asymmetric unit, and the hexamer is thus generated by the two-fold crystallographic symmetry of F222 (Fig. 1C). Ana-Hfq, on the other hand, has only two monomers in the asymmetric unit, with the hexamer being generated by the proper crystallographic three-fold axis in P3 (Fig. 1D).
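To illustrate what the twin fractions mean, the sketch below models a measured intensity from a merohedrally twinned crystal as the fraction-weighted sum of the intensities of the twin-related reflections. The twin fractions and operators are those quoted above; the untwinned intensities are hypothetical numbers chosen only for illustration, and the code is not meant to reflect how refmac or phenix implement twin refinement.

```python
# Twin fractions and index operators as reported in the text for Ana-Hfq.
TWIN_DOMAINS = [
    (0.085, lambda h, k, l: (h, k, l)),
    (0.455, lambda h, k, l: (-h, k, -l)),
    (0.460, lambda h, k, l: (k, h, -l)),
]


def twinned_intensity(hkl, untwinned):
    """Observed intensity for reflection hkl as the twin-fraction-weighted
    sum over the twin-related reflections of a single (untwinned) domain."""
    return sum(frac * untwinned[op(*hkl)] for frac, op in TWIN_DOMAINS)


# Hypothetical untwinned intensities (arbitrary units), for illustration only.
untwinned = {(1, 2, 3): 100.0, (-1, 2, -3): 40.0, (2, 1, -3): 10.0}

total_fraction = sum(frac for frac, _ in TWIN_DOMAINS)
i_obs = twinned_intensity((1, 2, 3), untwinned)
print(f"sum of twin fractions = {total_fraction:.3f}")
print(f"modelled twinned intensity = {i_obs:.1f}")
```

Because two of the three domains each contribute close to half of the scattering, the measured intensities are nearly averaged over twin-related reflections, which is consistent with the apparent higher symmetry noted in the text.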

Table 1.   Crystallographic data statistics.

                                            Syn-Hfq                  Ana-Hfq
Crystallographic data statistics
 Crystal size (μm3)                         150 × 200 × 300          5 × 5 × 100
 Beamline                                   911-3 (MAX-Lab)          X06SA (SLS)
 Wavelength (Å)                             0.9000                   0.9184
 Space group                                F222                     P3
 a, b, c (Å)                                81.47, 86.16, 103.18     60.58, 60.58, 32.47
 Asymmetric unit contents                   Three monomers           Two monomers
 Resolution range (Å) (outer shell)         50.0–1.3 (1.4–1.3)       30.3–2.3 (2.4–2.3)
 Redundancy                                 4.9 (4.9)                3.5 (3.4)
 No. of unique reflections                  43 673                   5889
 Completeness (%) (outer shell)             97.9 (96.5)              97.9 (90.6)
 Rmerge (%) (outer shell)                   4.1 (50.7)               6.5 (17.9)
 I/σI (outer shell)                         24.12 (3.15)             14.55 (6.08)
Structure refinement
 Resolution range (Å)                       26.2–1.3                 30.3–2.3
 R-factor (%)                               15.17                    27.18
 Rfree (%)                                  20.05                    28.57
 No. of atoms/solvent/residues              1902/272/197             940/0/120
 rmsd of bond lengths (Å) and angles (°)    0.017/1.798              0.024/2.547
 Average overall B-factor (Å2)              17.73                    18.63
Ramachandran plot (%)
 Residues in core                           92                       84.0
 Generously allowed regions                 8                        15.0
 Additionally allowed regions               0                        1.0
 Disallowed regions                         0                        0.0

In the crystals, Syn-Hfq packs in double layers of two hexamers, with the proximal sides against each other. Between the layers, the hexamers are shifted and interact with their distal sides (Fig. S2A). In the Ana-Hfq crystals, the hexamers are stacked right above each other in layers with no shifts, proximal side against distal side, in long tubes (Fig. S2B). There are no crystal contacts between layers in the built model, but close inspection of the electron density reveals that the N-termini protrude from the distal side and reach across to the next layer, thereby mediating the packing. This part of the structure (residues 1–7), however, could not be reliably modelled.

Overall structure

Despite significant variations in the core Sm sequence motifs, the overall structures of Syn-Hfq and Ana-Hfq are very similar to the previously determined structures of Hfqs from other bacteria (E. coli, P. aeruginosa, and S. aureus), and consist of an N-terminal α-helix followed by a five-stranded antiparallel β-sheet (Figs 1C,D and 2A). The main structural differences between the two cyanobacterial Hfq orthologues are located in the 49–53 loop following β3, a region that generally shows great variability among the Hfq proteins (Fig. 2A). The unusually short conformation of the N-terminal α-helix observed for Mja-Hfq is not seen here, and, generally, the helices align very well with those in the structures of Eco-Hfq, Pae-Hfq, and Sau-Hfq [27].

Figure 2.

 Structural comparison and surface charge. (A) Superposition of monomers from Syn-Hfq and Ana-Hfq with monomers from previously determined Hfq structures. Syn-Hfq is shown in purple, Ana-Hfq in cyan, Eco-Hfq (Protein Data Bank: 1HK9 [29]) in yellow, and Mja-Hfq (Protein Data Bank: 2QTX [27]) in dark blue. (B) Details of the interface between neighbouring subunits in Eco-Hfq (yellow), Mja-Hfq (dark blue), and Syn-Hfq (purple). Structures are shown as Cα traces with all atoms displayed as sticks on the two β-strands that make up the interface (β4 and β5). (C) Surface charge distribution of Syn-Hfq (top half) and Ana-Hfq (bottom half) on both the distal (left) and proximal (RNA-binding, right) sides. The surface charges were calculated using pdb2pqr [49] and apbs [50], and are shown on a scale from −25.0 to +25.0 kbT/ec, where kb is Boltzmann’s constant, T is the absolute temperature, and ec is the elementary charge.

As mentioned, the sequences of Syn-Hfq and Ana-Hfq are quite similar to each other, but quite distant from those of other Hfq proteins for which structures are available (Fig. 1A). In particular, Hfq monomers from the cyanobacterial organisms do not form heterohexamers with endogenous Hfq when expressed in E. coli, unlike, for example, the Hfq orthologues from M. jannaschii and S. aureus (data not shown) [16,27]. A likely reason for this can be found when studying the interface between neighbouring subunits in the hexamer (Fig. 2B). In both Eco-Hfq and Mja-Hfq, the interface is stabilized by aromatic residues near the pore of the ring, some of which are also involved in binding RNA, particularly His57 and Tyr55 of the YKHAI (Eco-Hfq) motif corresponding to His64 and Phe62 in Mja-Hfq (Fig. 2B, left panel). In Syn-Hfq, these residues have been replaced by Ala61 and Arg59, respectively, whereas Ana-Hfq has Gln63 and Trp61 in these positions, and hence has at least the bulky aromatic residue conserved. However, it is likely that the resulting variation in shape complementarity, hydrophobic interaction forces and charges is the main reason why mixed hexamers between cyanobacterial and enterobacterial Hfqs cannot be formed. Interestingly, however, Syn-Hfq and Ana-Hfq have another aromatic residue, Tyr65 (Syn-Hfq)/Tyr67 (Ana-Hfq), which they share with Mja-Hfq (Tyr68) but not with Eco-Hfq (Thr61; Fig. 2B). So, in summary, the hydrophobic core structures that facilitate the subunit interface are not strictly conserved, and only some of these appear to be compatible with each other with respect to oligomerization. Whether or not this observation has any relation to the RNA-binding properties of the different Hfq proteins will probably remain unclear until structural data on the binding site(s) are available.

Another distinguishing feature of the cyanobacterial Hfqs is the presence of tryptophans, which are absent from ‘conventional’ Hfqs. One of these is shared between Syn-Hfq (Trp40) and Ana-Hfq (Trp42), and is involved in anchoring β3 to the helix of the neighbouring subunit via hydrophobic interactions. No similar hydrophobic interactions are found at this location in the other known structures, which have mostly either a serine or a glutamate. The following position (Tyr39 in Sau-Hfq) may, however, confer similar stability, as this residue points in the opposite direction and contacts the helix on the same monomer.

Cyanobacterial Hfqs contain an unusual RNA-binding site

The major RNA-binding site of the Hfq proteins has been identified through the crystal structure of Sau-Hfq in complex with a 5′-AU5G-3′ RNA oligomer [16]. In this structure, the RNA is bound in a circular, unwound manner around the pore of Hfq within a basic path on the proximal side, with the individual uracil bases pointing into pockets in the hexamer. The structure also revealed that this part of the surface was considerably more basic than the rest of the molecule, and particularly the other side (the distal side), which was relatively neutral. Although the primary sequences of Syn-Hfq and Ana-Hfq are very similar, there is a marked difference in their surface charges (Fig. 2C). Both hexamers are more basic on their proximal side, but the difference is most pronounced for Ana-Hfq.

The RNA-binding pocket seen in Sau-Hfq is composed of residues both from the N-terminal helix and the Sm1 and Sm2 motifs (Fig. 1A). Remarkably, of the five residues forming the nucleotide base pocket (Gln8, Lys41, Tyr42, Lys57 and His58 in Sau-Hfq), none is conserved in the cyanobacterial Hfqs. In the structure of the archaeal Mja-Hfq, there are some substitutions, but the main NG and Y residues of the Sm1 motif (Fig. 1A) as well as the [Y/F]KHAI sequence of the Sm2 motif are in place. Despite the variation in the critical amino acids of Syn-Hfq and Ana-Hfq, the RNA-binding pocket appears to be structurally preserved, albeit with different physicochemical properties (Fig. 3A). It may therefore well be that the cyanobacterial proteins do bind to single-stranded regions of RNA, as observed for Sau-Hfq. However, the protein–RNA interactions are likely to be different and, at least in the case of Syn-Hfq, it seems likely that the charge distribution would not favour strong binding of RNA.

Figure 3.

 RNA binding by Syn-Hfq and Ana-Hfq. (A) Surface and charge distribution of the RNA-binding pocket in Syn-Hfq, Ana-Hfq, and Sau-Hfq. Each surface is shown with the uracil bound in the structure of Sau-Hfq (Protein Data Bank: 1KQ2 [16]) as sticks overlaid by alignment of the protein main chain. (B) The RNA-binding pocket as observed in Sau-Hfq (green sticks) shown with the structurally corresponding residues in Syn-Hfq (purple) and Ana-Hfq (cyan) overlaid. Critical residues are shown as thick sticks, and selected hydrogen bonds as dashed, black lines. (C) Specific interactions with the uracil base shown as above. (D) Recognition of the RNA backbone shown as above.

One striking property sets Syn-Hfq and Ana-Hfq apart from the other Hfq proteins, namely the lack of one of the main features of the RNA-binding pocket as described for Sau-Hfq, the aromatic residue within the Sm1 motif, which is highly conserved among Hfq proteins, either as a tyrosine or as a phenylalanine (Fig. 1A). In the RNA-bound structure, the aromatic side chain stacks with the RNA base, thereby holding it tightly in place in the pocket (Fig. 3B). In Syn-Hfq, the residue is replaced by Asp44, and in Ana-Hfq by Thr46; neither of these residues confers stabilization of the base in the form of stacking. The substitutions also have the effect of creating a more open binding site (Fig. 3A). In Sau-Hfq, the preceding residue, Lys41, from the neighbouring subunit, forms the other side of the binding pocket (Fig. 3B). This basic residue is also not conserved in the two cyanobacterial Hfqs, and is even a proline in Ana-Hfq. Direct recognition of the base via hydrogen bonds is achieved by the conserved Gln8 in Sau-Hfq as well as Lys57 of the [Y/F]KHAI motif (Fig. 3C). Although the glutamine is not conserved in any of the cyanobacterial Hfq proteins, both have potential equivalents of the basic residue at the beginning of the motif, which are in place to mediate an interaction equivalent of Lys57, specifically Arg59 (Syn-Hfq) and Lys62 (Ana-Hfq). Of these, however, Arg59 would have to reach from the neighbouring molecule (Fig. 3C).

The conserved [Y/F]KHAI sequence in the Sm2 motif also takes part in contacting the RNA backbone in the Sau-Hfq cocrystal structure (Fig. 3D) [16]. In particular, His58 contacts one of the phosphate oxygen atoms in each binding pocket, so that all phosphates are held tightly (Fig. 3D, where two neighbouring histidines are shown). Whereas this motif is fully conserved in Mja-Hfq, bearing the sequence FKHAI, only the uncharged alanine and isoleucine are left in the cyanobacterial orthologues (Fig. 1). In Ana-Hfq, Gln63 could potentially mediate a contact similar to that of the canonical histidine, but in Syn-Hfq the residue is replaced by alanine. Both of these substitutions would be expected to significantly lower the RNA-binding affinity of the cyanobacterial Hfqs.

Syn-Hfq and Ana-Hfq exhibit lower RNA affinities in vitro

In order to validate the structural hypotheses regarding the RNA affinity of the two cyanobacterial proteins, we tested their binding to two well-characterized Hfq-binding sRNAs, MicM (alias RybC/SroB) and Spot 42 RNA, in vitro [7,32,33]. Samples containing a fixed, low amount of 32P-labelled sRNA, a 500-fold excess of tRNA to prevent nonspecific binding and increasing amounts of Syn-Hfq or Ana-Hfq were incubated, and complex formation was analysed by electrophoretic mobility shift assays. The results demonstrate that both cyanobacterial Hfq proteins are capable of interacting with known E. coli base-pairing sRNAs (Fig. 4A,B). However, the affinities for the sRNAs are approximately 100-fold lower than that of E. coli Hfq (Fig. 4C) [7,33]. This is remarkable, as even very remotely related Hfq variants, such as Mja-Hfq, an Hfq-like protein from the archaeon M. jannaschii, only exhibit a two- to six-fold lower affinity for E. coli RNAs [27].
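To make the consequence of a 100-fold difference in affinity concrete, the sketch below uses a simple 1:1 binding isotherm, assuming that the protein is in large excess over the labelled RNA (as is typical when a fixed, low amount of labelled RNA is titrated). The dissociation constants are hypothetical values chosen purely for illustration; they are not measurements from this study.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of labelled RNA in complex for simple 1:1 binding,
    assuming protein is in large excess over RNA."""
    return protein_nM / (kd_nM + protein_nM)


KD_ECO_NM = 1.0                 # hypothetical Kd, for illustration only
KD_CYANO_NM = 100 * KD_ECO_NM   # 100-fold lower affinity = 100-fold higher Kd

# Compare the expected fraction of shifted RNA over a protein titration.
for protein in (1.0, 10.0, 100.0, 1000.0):
    eco = fraction_bound(protein, KD_ECO_NM)
    cyano = fraction_bound(protein, KD_CYANO_NM)
    print(f"{protein:7.1f} nM protein: Eco-like {eco:.3f}, cyano-like {cyano:.3f}")
```

Under this model, half-maximal complex formation is reached when the protein concentration equals the Kd, so a 100-fold lower affinity shifts the titration midpoint to a 100-fold higher protein concentration.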

Figure 4.

 RNA binding by Syn-Hfq and Ana-Hfq in vitro. (A) Gel mobility shift assays analysing the ability of Syn-Hfq to bind to the Spot 42 and MicM RNAs from E. coli in vitro. (B) Similar experiment analysing the ability of Ana-Hfq to bind to Spot 42 RNA in vitro. (C) Interaction between Eco-Hfq and MicM RNA in vitro. (D) Analysis of the ability of the cyanobacterial Hfq proteins to functionally replace E. coli Hfq in rpoS translation. Wild-type (denoted wt) or the hfq1::kan Δhfq derivative of E. coli (denoted hfq) were transformed with empty vector (pNDM220), or vector expressing wild-type E. coli Hfq (pNDM–hfqEco), Ana-Hfq (pNDM–hfqAna), or Syn-Hfq (pNDM–hfqSyn). Cells harvested at various time points were analysed by quantitative western blotting on equal amounts of cells to determine the relative σS levels at each point during growth. EE, early-log cells; LE, late-log cells; T, cells at transition to stationary phase; and S, stationary-phase cells.

To investigate possible biological roles of the cyanobacterial Hfqs, we generated two low copy number plasmids, pNDM–hfqSyn (Hfq-Syn) and pNDM–hfqAna (Hfq-Ana), expressing the two hfq genes under the control of the inducible lac promoter derivative, PA1/O4 [34,35]. The ability of the two orthologues to support stabilization of the MicA, MicM and Spot 42 RNAs in vivo was then tested in rifampicin run-off experiments in isopropyl thio-β-d-galactoside (IPTG)-induced, exponentially grown E. coli hfq1::kan cells carrying pNDM–hfqSyn or pNDM–hfqAna, and, as controls, the corresponding pNDM–hfqEco expressing wild-type E. coli Hfq or the empty vector pNDM220 (Fig. S3) [27]. Using the same strains, we examined whether the cyanobacterial Hfq proteins could participate in RyhB-mediated downregulation of sodB mRNA expression (Fig. S4) [33,36–38]. Consistent with the very weak binding to the sRNAs, the in vivo results revealed that the two cyanobacterial Hfqs could neither provide stabilization of sRNAs nor promote sRNA-mediated decay of mRNAs in E. coli. In addition, and in contrast to archaeal Hfq, neither Syn-Hfq nor Ana-Hfq could promote translation of the general stress factor σS in E. coli (Fig. 4D) or complement stress phenotypes observed in E. coli hfq deletion strains (data not shown) [27,39].


In this article, we describe the crystal structures of the Hfq orthologues from the cyanobacteria Synechocystis sp. PCC 6803 and Anabaena PCC 7120, and show that, despite large variations in conserved parts of the sequences, these proteins maintain an Sm fold that is nearly indistinguishable from those of the ‘conventional’ Hfq proteins (Syn-Hfq has rmsd values between 0.378 and 0.635 Å when superimposed on the Mja-Hfq, Eco-Hfq or Sau-Hfq structures, and Ana-Hfq has rmsd values between 0.474 and 0.618 Å). However, the cyanobacterial proteins show a very divergent structure of the RNA-binding site described in the RNA-bound Sau-Hfq structure, which has a significant positive charge density only in the case of Ana-Hfq. Although the binding pocket itself is structurally retained, the amino acid composition is very different from that observed in other Hfq proteins. Finally, we show that a functional consequence of these variations is that the binding affinity for conserved Hfq-binding sRNAs of enterobacteria is lowered by two orders of magnitude in vitro and that the proteins consequently cannot replace the endogenous Hfq in E. coli in vivo.

The significant variations observed in primary amino acid sequence between the cyanobacterial and enterobacterial Hfq proteins raise the question of what maintains the bacterial Hfq Sm fold. With respect to oligomerization, we have shown that there appear to be different hydrophobic networks among Hfq proteins that are incompatible with cross-oligomerization. However, the Sm-like fold itself appears rather to be stabilized by a separate network of small hydrophobic residues, particularly isoleucine, valine, and leucine, many of which share positions between, for example, Eco-Hfq and the cyanobacterial counterparts. This may not be immediately evident from the sequence alignment in Fig. 1, as structurally identical positions may contain different small aliphatic amino acids (e.g. Syn-Hfq Ile66 and Eco-Hfq Val62). The side chains of these residues, however, almost always align perfectly in the structures.

Although the networks of small aliphatic amino acids are very similar, the aromatic side chains are distributed almost entirely without structural overlaps. Of the five tyrosine and phenylalanine residues in Eco-Hfq, none overlaps structurally with aromatic residues in Syn-Hfq, and only one overlaps with those in Ana-Hfq (Trp61). Likewise, Syn-Hfq has four aromatic residues, none of which overlaps structurally with those found in Eco-Hfq. Inspection of the structures shows that these residues do not form part of the hydrophobic core of the Sm fold, but are rather placed near or on the surface, where they might be involved in interactions with RNA. Together, these observations indicate that the Hfq proteins primarily use small, aliphatic residues to maintain a stable Sm fold, whereas large, aromatic residues are available for RNA binding.

In this article, we show that the location at the centre of the ring at the proximal side known to bind RNA in Sau-Hfq has an unusual structure in the cyanobacterial orthologues. This finding, along with the observed binding data for sRNAs (Fig. 4), indicates that the structural changes cause significant variations in the RNA-binding properties of Syn-Hfq and Ana-Hfq. In Sau-Hfq, the RNA is held by a combination of aromatic stacking and interactions with basic residues near the core of the ring (Fig. 5, lower right). In Syn-Hfq (Fig. 5, upper right) and Ana-Hfq (not shown), the clustering of aromatic and basic residues at this location is absent, and there is a more even distribution of potential RNA ligands on both the proximal and distal sides. The observed pattern leaves several possibilities for additional RNA-binding sites, on both the rim and distal sides of both proteins. Detailed analysis of such sites in the future, both structurally and functionally, will be of great importance for understanding whether and how Hfq proteins are able to bind two RNA substrates simultaneously or instead work as RNA chaperones. As the cyanobacterial Hfqs are divergent at the described RNA-binding site, a structure of one of these proteins in complex with RNA would be of great value and might reveal the location of a secondary binding site. It has recently been shown that deletion of Hfq from Synechocystis reduces the levels of a specific set of transcripts and strongly affects the phenotype of the bacterium [31]. The work to map which RNAs are targeted by the cyanobacterial Hfqs in vivo should be continued and combined with structural studies of RNA binding to fully elucidate the function of these proteins.

Figure 5.

 Conservation of basic and aromatic residues. Distribution of basic (Lys, Arg, and His) residues on the distal (left) and proximal (right) sides of Syn-Hfq (top half) and Sau-Hfq (bottom half), shown as blue sticks. Aromatic residues (Tyr, Phe, and Trp) are shown as red sticks.

Experimental procedures

Plasmid construction

For overexpression of Syn-Hfq and Ana-Hfq in E. coli, codon-optimized genes flanked by SapI and PstI sites (Syn-hfq, 5′-GCTCTTCCAAC ATG TCC CGT TTC GAC TCA GGT CTA CCG AGT GTC AGA CAG GTG CAG CTG CTT ATC AAA GAT CAG ACG CCT GTT GAA ATT AAG TTA CTC ACC GGG GAT TCT CTG TTT GGC ACG ATT CGC TGG CAG GAT ACT GAT GGC CTG GGT TTG GTG GAC GAC TCG GAG CGT AGC ACC ATC GTA CGG CTG GCC GCA ATT GCG TAT ATT ACA CCA CGC CGT TAA CTGCAG-3′; and Ana-hfq, 5′-GCTCTTCCAAC GCG ATT ACC GAA TTT GAT ACG AGC CTG CCG AGT ATC CGT CAG CTT CAG AAT TTG ATC AAA CAG GCC GCA CCG GTA GAA ATT AAA CTG GTC ACC GGC GAT GCA ATT ACC GGT CGG GTG CTG TGG CAG GAT CCT ACC TGT GTT TGC ATT GCC GAT GAG AAC TCT CGC CAG ACC ACG ATC TGG AAA CAG GCA ATT GCG TAT TTA CAG CCA AAA GGG TAA CTGCAG-3′) were synthesized (Entelechon GmbH, Germany) and cloned into plasmid pTYB11 (New England Biolabs, Ipswich, MA, USA), generating pTYB11–hfqSyn and pTYB11–hfqAna, respectively. The low copy number plasmid, pNDM–hfqSyn, was constructed by replacing the EcoRI–BamHI fragment of pNDM220 with a PCR-generated fragment, prepared using P2-pNDM-Bam (5′-CGG GAT CCA AAG GAG GAA TTA ACT ATG TCC CGT TTC GAC TCA GGT CTA CCG-3′) and P2-pNDM-Rev-Eco (5′-CGG AAT TCT TAA CGG CGT GGT GTA ATA TAC GCA ATT GCG GC-3′) as primers, and pTYB11–hfqSyn as template, followed by digestion with EcoRI and BamHI. Likewise, the low copy number plasmid pNDM–hfqAna was constructed by replacing the EcoRI–BamHI fragment of pNDM220 [34] with a PCR-generated fragment, prepared using P3-pNDM-Bam (5′-CGG GAT CCA AAG GAG GAA TTA ACT ATG GCG ATT ACC GAA TTT GAT ACG AGC CTG CCG-3′) and P3-pNDM-Rev-Eco (5′-CGG AAT TCT TAC CCT TTT GGC TGT AAA TAC GCA ATT GCC-3′) as primers, and pTYB11–hfqAna as template, followed by digestion with EcoRI and BamHI.
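Synthetic inserts of this kind can be sanity-checked computationally before ordering or cloning. The following is a minimal sketch, not part of the original procedure, that confirms an insert begins with the SapI recognition sequence (GCTCTTC) and ends with the PstI site (CTGCAG), and then translates the enclosed reading frame using the standard genetic code. The function names and the 11-nucleotide leader length (taken from the GCTCTTCCAAC prefix shared by both sequences above) are assumptions made for illustration, and the demonstration sequence is a deliberately truncated stand-in rather than the full Syn-hfq gene.

```python
# Illustrative check of a synthetic insert: flanking sites plus translation.
# All names here are hypothetical helpers, not taken from the original work.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SAPI_SITE = "GCTCTTC"   # SapI recognition sequence
PSTI_SITE = "CTGCAG"    # PstI recognition sequence
LEADER_LENGTH = 11      # assumed: length of the GCTCTTCCAAC prefix


def extract_orf(insert: str) -> str:
    """Strip spaces, verify the flanking sites, and return the enclosed ORF."""
    seq = insert.replace(" ", "").upper()
    if not seq.startswith(SAPI_SITE):
        raise ValueError("insert does not start with the SapI site")
    if not seq.endswith(PSTI_SITE):
        raise ValueError("insert does not end with the PstI site")
    orf = seq[LEADER_LENGTH:-len(PSTI_SITE)]
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf


def translate(orf: str) -> str:
    """Translate an ORF codon by codon, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(orf) - 2, 3):
        residue = CODON_TABLE[orf[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Truncated demonstration: the leader, the first five Syn-hfq codons,
# a stop codon, and the PstI site. This is NOT the full gene.
demo_insert = "GCTCTTCCAAC ATG TCC CGT TTC GAC TAA CTGCAG"
demo_protein = translate(extract_orf(demo_insert))
print(demo_protein)
```

Pasting the full sequences from the text in place of the truncated demonstration string would allow the encoded proteins to be compared with the database entries.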

Protein expression and purification

The full-length Hfq proteins were overexpressed in E. coli strain ER2566 and purified using the intein self-cleaving tag system (Impact-CN; New England Biolabs). Cells were grown at 37 °C in 2× YT medium containing 100 μg·mL−1 ampicillin. At a D600 nm of ∼0.6, expression was induced by addition of 0.6 mm IPTG, and growth was continued overnight at 20 °C. Cells were harvested by centrifugation (6000 g for 10 min at 0 °C), and lysed using a French press. Following centrifugation, the initial purification was carried out according to the manufacturer’s recommendations (New England Biolabs). The lysis and wash buffers contained 20 mm Na-Hepes (pH 8.0), 1 mm EDTA (pH 8.0), and 0.5 m or 0.8 m NaCl, respectively. On-column cleavage of the intein fusion tag was carried out for 2 days at room temperature, using a lysis buffer containing 50 mm dithiothreitol. The eluted proteins were precipitated by adding ammonium sulfate to 60% saturation, and resuspended in a small volume of 20 mm Hepes-OH (pH 8.0), 400 mm NaCl (Syn-Hfq) or 200 mm NaCl (Ana-Hfq), and 5 mm β-mercaptoethanol. The clarified sample was then applied to a Superdex 75 10/300 GL column (GE Healthcare) running at 0.5 mL·min−1 in the same buffers. Peak fractions corresponding to monodisperse, hexameric protein, as judged by the gel filtration fractionation, were concentrated using a MilliPore spin filter with a 30 000 molecular weight cut-off to approximate concentrations of 7–8 A280 nm units per millilitre, as estimated by the NanoDrop method (Thermo Fisher Scientific Inc., Waltham, MA, USA). Immediately prior to crystallization, the samples were ultracentrifuged in a Beckman TLA-110 rotor at 70 000 r.p.m. (approximately 200 000 g) at 4 °C for 20 min to remove potential aggregates and other impurities.
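The relationship between rotor speed and relative centrifugal force quoted for the ultracentrifugation step follows the general formula RCF = 1.118 × 10−5 × r × N², where r is the rotation radius in centimetres and N is the speed in r.p.m. The sketch below illustrates the conversion. The radius of 3.65 cm is an assumed value, chosen only because it reproduces the stated figure of approximately 200 000 g at 70 000 r.p.m.; it is not taken from the rotor specification, and the function names are hypothetical.

```python
# Convert between rotor speed (r.p.m.) and relative centrifugal force (x g).
# The radius used in the example is an assumption made for illustration only.

RCF_CONSTANT = 1.118e-5  # (2*pi/60)^2 / 980.665, for radius in cm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (r.p.m.) needed to reach a given RCF at a given radius."""
    return (rcf / (RCF_CONSTANT * radius_cm)) ** 0.5


ASSUMED_RADIUS_CM = 3.65  # hypothetical effective radius, not a rotor datasheet value
force = rcf_from_rpm(70_000, ASSUMED_RADIUS_CM)
print(f"{force:.0f} x g")
```

Because the force scales linearly with radius, the value experienced by the sample differs between the top and the bottom of the tube, which is why the text gives only an approximate figure.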

Crystallization and structure determination

Initial crystallization screens were performed in 24-well standard format using the Index Screen (Hampton Research, Aliso Viejo, CA, USA). Drops were mixed as 1 μL of protein + 1 μL of reservoir at room temperature before incubation at either 4 °C or 19 °C. Crystal hits were subsequently optimized to 56–62% Tacsimate (pH 7.0) at 19 °C for Syn-Hfq, and 2.7–3.2 m NaCl or KCl, and 100 mm citric acid (pH 3.5) at 4 °C for Ana-Hfq. Large, single crystals of Syn-Hfq appeared after 1 day and grew to a maximum size of 150 × 200 × 300 μm. These crystals could be frozen directly from the mother liquor using traditional nylon loops, owing to the high salt concentration. Small, needle-shaped crystals of Ana-Hfq appeared almost immediately, but never grew larger than 5 × 5 × 100 μm. These crystals were mounted using flat litho loops (Hampton Research) and frozen in liquid nitrogen, also directly from the mother liquor.

Data collection was carried out at beamline 911-3 at MAX-lab in Lund, Sweden (Syn-Hfq) and at the microfocus beamline X06SA at SLS, Switzerland (Ana-Hfq), and the data were integrated using xds [40]. Both structures were determined by molecular replacement using the automated procedure (AutoMR) in phenix [41] and poly-alanine models of the known Hfq structures from E. coli (Protein Data Bank: 1HK9 [29]), P. aeruginosa (Protein Data Bank: 1U1S [28]), S. aureus (Protein Data Bank: 1KQ1 [16]), and M. jannaschii (Protein Data Bank: 2QTX [27]), to minimize model bias. An initial model of Syn-Hfq was created by autobuilding in textal [42] and resolve [43], followed by manual rebuilding in coot [44]. Iterative refinement with manual rebuilding was carried out using phenix.refine, and, in the case of Ana-Hfq, finalized in refmac using the twinning operators <h,k,l>, <−h,h+k,−l>, and <k,h,−l> with twin fractions 8.5%, 45.5%, and 46.0%, respectively [45]. Model validation was carried out by submission of the final structures to the molprobity server [46].
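To make the twinning description concrete, the sketch below applies the three index transformations quoted above to a single reflection and combines intensities using the textbook model of merohedral twinning, in which each observed intensity is the fraction-weighted sum of the intensities of the twin-related reflections. This is an illustration of the general model only and does not reproduce the refmac implementation; the function names and the example reflection are assumptions.

```python
# Twin operators and fractions as quoted in the text for Ana-Hfq.
# Each operator maps Miller indices (h, k, l) to a twin-related reflection.

TWIN_DOMAINS = [
    (lambda h, k, l: (h, k, l), 0.085),        # identity operator, 8.5%
    (lambda h, k, l: (-h, h + k, -l), 0.455),  # (-h, h+k, -l), 45.5%
    (lambda h, k, l: (k, h, -l), 0.460),       # (k, h, -l), 46.0%
]


def twin_related(hkl):
    """Return the reflections related to hkl by each twin operator."""
    return [op(*hkl) for op, _fraction in TWIN_DOMAINS]


def twinned_intensity(hkl, untwinned_intensity):
    """Fraction-weighted sum of intensities over the twin-related reflections.

    untwinned_intensity is any callable mapping (h, k, l) to an intensity,
    for example a lookup into calculated structure-factor intensities.
    """
    return sum(
        fraction * untwinned_intensity(op(*hkl))
        for op, fraction in TWIN_DOMAINS
    )


# Hypothetical example reflection.
related = twin_related((1, 2, 3))
print(related)

# The twin fractions should account for the whole crystal.
total_fraction = sum(fraction for _op, fraction in TWIN_DOMAINS)
print(round(total_fraction, 6))
```

With two of the three fractions close to 46%, the contributions of the second and third domains are nearly equal, which is the situation in which twinning is most difficult to remove by detwinning and is instead handled during refinement.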

Functional analysis of Syn-Hfq and Ana-Hfq

In vitro synthesis of RNA was carried out as previously described [27]. For the gel shift analyses, RNA samples containing a 32P-labelled transcript of the sRNAs (at ∼1 nm final concentration) and a 500-fold molar excess of E. coli tRNA were incubated with increasing amounts of the relevant Hfq protein, and complex formation was monitored by separation of the complexes on a 5% nondenaturing polyacrylamide gel running in TBE buffer. For the western blot analysis of σS, wild-type E. coli or the SØ928 hfq1::kan Δhfq strain was transformed with pNDM220, pNDM–hfqEco (Eco-Hfq), pNDM–hfqSyn (Syn-Hfq), or pNDM–hfqAna (Ana-Hfq), and cultivated in LB medium containing 0.5 mm IPTG and 30 μg·mL−1 ampicillin. Cells were harvested at various time points after inoculation, and quantitative western blot analysis was performed on equal amounts of cells to determine the relative σS levels at each point during growth, as previously described [27].

For analysis of the ability of Syn-Hfq and Ana-Hfq to stabilize the Spot 42 or MicM RNAs in vivo, exponentially growing cultures of IPTG-induced (0.5 mm) E. coli SØ928 hfq1::kan cells carrying either pNDM–hfqSyn, pNDM–hfqAna, or pNDM–hfqEco were treated with rifampicin to block new transcription. Samples were removed at the indicated times, and total RNA was extracted. Northern blotting with probes against Spot 42 or MicM RNA was performed as previously described [27].

For examination of the ability of Syn-Hfq and Ana-Hfq to function in sRNA-mediated regulation of mRNA turnover, E. coli SØ928 hfq1::kan cells were transformed with pNDM220, pNDM–hfqEco, pNDM–hfqSyn, or pNDM–hfqAna, and grown in LB medium containing 1 mm IPTG at 37 °C. At a D600 nm of ∼0.4, each of the cultures was split, and 2,2′-dipyridyl was added to one culture. After 10 min, samples were harvested, total RNA was extracted, and the RyhB RNA and sodB mRNA levels were determined by northern blot analysis. The 5′-end labelled DNA probes (EM1 and EM33) used for detection of the RyhB RNA and sodB mRNA are described in [37].


Acknowledgements

We thank the beamline staff at X06SA (SLS) and 911-3 (MAX-lab) for help during data collection. This work was funded by the Danish National Research Foundation (Danmarks Grundforskningsfond) Centre for mRNP Biogenesis and Metabolism, and by grants from the Novo Nordisk Foundation (D. E. Brodersen) and The Danish Natural Science Research Council (P. Valentin-Hansen).