Comparison of repressor and periplasmic binding proteins
To explore the question of which residues are necessary for repressor dimerization, structures of the repressor core domains and several periplasmic binding proteins were used to create a structure-based alignment (Fig. 3; DALI: Holm and Sander 1996). For the repressor proteins, residues that have either side chain or backbone atoms within 3.5 Å of the partner monomer or hydrophobic contacts within 4.5 Å have been mapped onto this alignment. Interface differences between unliganded proteins, which have “open” structures, and those of the “closed” ligand-bound conformations are also noted. Additional water-mediated interactions (at least one atom from each monomer within 3.5 Å of the same water molecule) are listed in Table 1. In general, the latter also make direct contacts and are included in Figure 3, although water molecules mediate additional interactions for TreR 95 to 99 and PurR 284 to 284, 284 to 252, and 285/286 to 260.
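The contact criteria above can be restated as a short distance test. The following Python sketch is illustrative only and is not the procedure used to produce Figure 3 or Table 1. The 3.5 Å and 4.5 Å cutoffs come from the text. The function names, the atom representation, and the approximation of a "hydrophobic contact" as a carbon–carbon pair are assumptions made for the example.

```python
import math

# Cutoffs stated in the text (in angstroms).
DIRECT_CUTOFF = 3.5       # any side-chain or backbone atom pair
HYDROPHOBIC_CUTOFF = 4.5  # hydrophobic contacts

# Assumption for this sketch: an atom is (element, (x, y, z)), and a
# hydrophobic contact is approximated as a carbon-carbon pair.

def residues_in_contact(res_a, res_b):
    """True if two residues from different monomers satisfy either cutoff."""
    for elem_a, xyz_a in res_a:
        for elem_b, xyz_b in res_b:
            d = math.dist(xyz_a, xyz_b)
            if d <= DIRECT_CUTOFF:
                return True
            if elem_a == "C" and elem_b == "C" and d <= HYDROPHOBIC_CUTOFF:
                return True
    return False

def water_mediated(res_a, res_b, water_oxygens):
    """True if one water oxygen lies within 3.5 A of an atom from each residue."""
    for w in water_oxygens:
        near_a = any(math.dist(xyz, w) <= DIRECT_CUTOFF for _, xyz in res_a)
        near_b = any(math.dist(xyz, w) <= DIRECT_CUTOFF for _, xyz in res_b)
        if near_a and near_b:
            return True
    return False
```

In practice the atom lists would be read from the deposited coordinate files; the sketch leaves parsing out so that only the geometric criterion is shown.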
Of the N-subdomain sites that directly contact their partner monomers, 10 are structurally homologous for all three repressors; these positions are indicated with Roman numerals I–X in Figure 3A. However, no discrete pattern emerges from their amino acid identities that can delineate between oligomeric repressors and monomeric binding proteins. None of these sites have identical primary sequences for all three repressors. Amino acid similarity is only well-conserved as hydrophobic residues at sites VII and VIII, but the monomeric binding proteins also have hydrophobic side chains at these positions. In fact, an overall comparison for the N-core subdomain of RBP and the repressor proteins shows no significant deviation at any site. ABP does have a potentially disruptive proline at site II and a lysine at V and VIII. GBP has very low correlation at positions I and V, whereas ALBP is disparate at I, V, VII, and IX. However, given the low similarity between the three repressors, these differences are not remarkable. Sequence examination of this region does not reveal firm guidelines for oligomerization.
Since the primary sequences of these proteins demonstrate no striking patterns, we extended the analysis by looking for conserved interactions between pairs of sites. Contact maps were created for each interface to show the interactions that occur in the open and closed forms of each protein (Fig. 4). The open (ligand-free) structures of LacI and PurR have only one conserved interaction, between position III of monomer A and LacI residue 81 or PurR 79 on monomer B (Fig. 4A,B, dashed line). (The open structure of TreR is not available.) Four additional interactions create similar patterns on the contact maps of these two proteins (Fig. 4A,B, dotted lines): LacI has interactions between A:IV and B:VIII/B:IX that occur symmetrically from B to A. PurR has similar interactions using position V instead of IV. Otherwise, the interactions between unbound monomers produce strikingly different patterns. This variance is probably related to two issues. First, the open states of the two proteins differ the most at the core N-subdomain, both in their final positions and in their axes of rotation describing the motion that occurs on ligand-binding/release (Mowbray and Björkman 1999). Second, the two proteins have transposed functional modes. The open structure of LacI binds DNA, whereas PurR can only bind DNA in the closed complex with corepressor. In that light, both DNA-free forms of the proteins have interactions from A:VII to B:X and A:X to B:X. Further analysis of the N-terminal interface in the open, DNA-bound state must await new structural information from other repressors.
Even though their tertiary structures are more similar (Mowbray and Björkman 1999), the interaction maps of the closed core N-subdomains do not indicate many more common patterns. Only four interactions are conserved among the three repressors: A:VIII to B:IV, A:IV to B:VIII, VII to VII, and VIII to VIII (Fig. 4A–C, thick lines). Of these, position IV is conspicuous because several mutations of LacI 84 are known to have profound effects on interface strength (Chang et al. 1993; Nichols and Matthews 1997; Gerk et al. 2000). Some interactions are conserved between pairs of repressors, although no two proteins are more similar to each other than to the third (Fig. 4A–C, dashed lines). A few more similar but nonidentical patterns are indicated with dotted lines in Figure 4A–C. Therefore, the original sequence-based observation that the N-subdomain yields few strict rules of oligomerization appears to hold on structural analysis.
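The contact-map comparison reduces to set operations on pairs of aligned positions (monomer A position, monomer B position). The Python sketch below shows one way such a comparison could be organized. The four conserved pairs are those listed in the text. The three maps themselves (`map_1` to `map_3`) are hypothetical placeholders and do not reproduce the data of Figure 4.

```python
from itertools import combinations

def conserved_in_all(maps):
    """Position pairs present in every contact map."""
    return set.intersection(*maps.values())

def shared_by_pair_only(maps):
    """For each pair of maps, the contacts they share that are not in all maps."""
    core = conserved_in_all(maps)
    return {
        (a, b): (maps[a] & maps[b]) - core
        for a, b in combinations(sorted(maps), 2)
    }

# The four interactions conserved in all three closed N-interfaces (from the text).
CORE = {("VIII", "IV"), ("IV", "VIII"), ("VII", "VII"), ("VIII", "VIII")}

# Hypothetical placeholder maps; the extra pairs are invented for illustration.
toy_maps = {
    "map_1": CORE | {("I", "II"), ("III", "X")},
    "map_2": CORE | {("I", "II"), ("V", "IX")},
    "map_3": CORE | {("V", "IX"), ("III", "X")},
}

print(sorted(conserved_in_all(toy_maps)))
print(shared_by_pair_only(toy_maps))
```

With real maps, the first function would correspond to the thick lines of Figure 4 and the second to the dashed lines; the "similar but nonidentical" dotted-line patterns require judgment that this sketch does not attempt to capture.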
Examination of the core C-subdomain does indicate a region that may be critical for assembly. This subdomain provides an unmoving anchor for the allosteric conformational changes of LacI and PurR (Schumacher et al. 1994, 1995; Friedman et al. 1995; Lewis et al. 1996; Mowbray and Björkman 1999; Bell and Lewis 2000; Matthews et al. 2000). Although the C-subdomain has fewer amino acids involved in the interface than does the N-subdomain, a greater percentage of sites are structurally conserved. These are labeled with Roman numerals I–VIII in Figure 3A. Of these, positions I, II, V, and VI show little sequence similarity between all three repressors. In contrast, sites III, IV, VII, and VIII are all very hydrophobic. For the periplasmic binding proteins, this hydrophobicity is conserved at III and IV (with the exception of GBP III). However, the two protein families are not at all similar around VII and VIII. In fact, the primary sequences of the periplasmic binding proteins are so different from those of the repressors that their structural equivalence could not be inferred for a previous sequence-based alignment (Nichols et al. 1993). Differences arise not only from the primary sequences between the two groups at VII and VIII, but also from the insertions subsequent to position VIII that occur in all four periplasmic binding proteins examined (Fig. 3A). For ABP, the six amino acids inserted in this region do not align with any other residues and, for simplicity, are indicated with dots.
Primary sequence mutations at site VIII had previously implicated this region in dimer assembly. Monomeric LacI Y282D was one of the earliest characterized mutations of this repressor (Schmitz et al. 1976; Daly and Matthews 1986; Chakerian and Matthews 1991; Chen and Matthews 1992b). In light of the known assembly states of LacI Y282 mutants, the nature of residues for the other aligned proteins is very intriguing. The PurR residue VIII is an F and TreR has an L; both substitutions are precisely those tolerated by LacI (Chakerian and Matthews 1991). Both PurR and TreR also have hydrophobic amino acids at position VII, and LacI C281 also has hydrophobic character (Fig. 3A). None of the periplasmic binding proteins has hydrophobic residues at VII, although the S in ABP might be considered similar to the C of LacI. However, ABP has E at position VIII, where placing a D is fatal to LacI assembly. The VII–VIII regions for RBP and ALBP are also highly charged. Although the GBP sequence in the VII–VIII region is hydrophobic, this monomeric protein has an additional charged residue inserted on the loop subsequent to VIII. Interestingly, residue 289 of the LacI/GalR family member cytidine repressor also aligns with the site immediately following position VIII (Weickert and Adhya 1992). Substitution of a charged residue in the mutation C289R abrogates assembly of this normally oligomeric repressor (Barbier and Short 1993).
Contact maps of the core C-subdomain are also very striking (Fig. 4D–F). Most of the interactions are conserved among all three repressors (Fig. 4D–F, thick lines). The pattern is most noticeable for TreR, which has only four unique interactions (Fig. 4F, thin lines). The open and closed C-interfaces of LacI and PurR exhibit few differences, consistent with previous observations that this subdomain is not affected by the allosteric conformational change (Fig. 4D,E; Schumacher et al. 1994, 1995; Friedman et al. 1995; Lewis et al. 1996; Mowbray and Björkman 1999; Bell and Lewis 2000; Matthews et al. 2000). The five interactions of residue VII are particularly striking and appear to provide the key stabilizing interactions of this interface. Structurally, the side chain of this residue provides the “ball” for a hydrophobic “socket” of its partners (Fig. 5).
However, this site also provides an excellent example of why structural data alone are not sufficient for deciphering the code of protein structure. Both in vivo and in vitro studies demonstrate that position VII of LacI nonetheless tolerates a wide variety of substitutions. LacI C281 mutation to S, A, F, I, or M results in biophysical characteristics similar to wild-type protein (Chakerian and Matthews 1991). Phenotypic analysis of C281 mutations to L, H, Y, F, A, S, Q, R, or E demonstrates that these mutants all maintain their repression function (Suckow et al. 1996), which would be destroyed if assembly were abolished. Only C281 altered to P, K, and G diminished repression (Suckow et al. 1996). C281 has also been the target of extensive chemical modification (Daly et al. 1986). Low reaction rates indicate that C281 is shielded from solvent and is modified by only extremely reactive compounds, generally at high concentrations of reactants. However, the function of the resulting protein is only moderately affected, and the modified interface must still be intact (Daly et al. 1986). Thus, LacI position VII demonstrates plasticity even beyond the options provided by nature.
In contrast, LacI mutations at position VIII, which makes only two contacts in the interface (Fig. 4D), are poorly tolerated. Chakerian and Matthews (1991) demonstrated that only F and L are tolerated; A, S, and E abolish oligomerization. Phenotypic analysis yielded similar results: L and F maintained wild-type phenotypes whereas P, H, C, A, G, S, N, K, R, or E substitutions abolished repression function (Suckow et al. 1996). Our current hypothesis is that the differing mutational sensitivities arise from the proximity of each site to the surface of the interface. Site VII is on the edge, but VIII is buried in the middle and therefore has stricter packing constraints.
Despite mutational sensitivity at site VIII, both assembly and function of the LacI Y282D mutation are restored by a second-site mutation at K84. To determine how many second-site mutations can revert the phenotype of Y282D and thereby explore plasticity of quaternary structure, we developed tools to screen the function of a large number of randomly mutated proteins. The phenotypic screen presented in Figure 6 required several stages of development but evolved into a facile and effective tool with reproducible results even in the hands of novice researchers. The detailed rationale of the screen is as follows. Wild-type LacI (WT) is expressed from the repressor plasmid and binds to its operator site on the pZCam reporter plasmid. This binding represses transcription of β-galactosidase, and colonies remain white in the presence of the indicator Xgal. If the Y282D mutation is present on the repressor plasmid, the resulting LacI protein is monomeric and cannot bind DNA, allowing β-galactosidase transcription and resulting in turquoise blue colonies in the presence of Xgal. If a second mutation restores dimerization of Y282D protein, so that resulting protein can again bind operator DNA and inhibit β-galactosidase transcription, colonies remain white in the presence of Xgal. If the doubly mutated protein exists in equilibrium between monomer and dimer, it will not repress as efficiently as wild-type LacI and may produce light blue colonies. This last feature confers an advantage to the blue/white screen over alternative toxicity selection approaches, which could not detect intermediates. A further benefit of this screen is that the high ratio of repressor protein to reporter plasmid can allow detection of double mutants that may form only weakly associated oligomers.
Results of the genetic screen of >30,000 colonies are presented in Table 2. The mutations that compensate the dimerization defect of Y282D produced either white or very light blue colonies. A large percentage of the light blue colonies identified in the screen, as well as a few of the white colonies, were found to be false positives (Y282D with no further mutations). Although this result raises the possibility that some of the mutations detected by the screen do not actually affect LacI assembly or function, several lines of evidence support the position that they do compensate the Y282D defect. The most substantive evidence is structural analysis; sites are easily assigned to one of three classes (see below). An especially interesting group contains sequential sites: 149, 150, and 151. Furthermore, seven revertants were each detected more than once (Table 2), five of which (M42I, H74R, N246S, T276I, and S354F) were isolated from Y282D plasmids mutated in separate experiments. Finally, four of the sites (133, 223, 274, and 276) had multiple mutations (Table 2).
Since all of the reversion mutations resulted from a single base pair change, we found their number to be surprisingly large. Furthermore, the K84 mutations known to compensate Y282D (Chang et al. 1993) were not obtained, because both require changing more than one base pair, implying that mutagenic saturation has not yet been reached. Although we attempted higher frequency mutation by passaging singly mutated plasmids back through XL1-Red cells, these plasmids were corrupted beyond use (data not shown). Further efforts using PCR-based methods are being pursued to exhaustively identify second-site revertants for the Y282D phenotype.
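The argument that K84L and K84A cannot arise from a single base pair change can be checked against the standard genetic code. The Python sketch below lists the amino acids reachable from a codon by exactly one base substitution. Because the codon used for K84 in the lacI gene is not stated here, both lysine codons are examined; the helper names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def single_base_neighbors(codon):
    """Amino acids (or '*' for stop) reachable by changing exactly one base."""
    reachable = set()
    for i, original in enumerate(codon):
        for base in BASES:
            if base != original:
                mutant = codon[:i] + base + codon[i + 1:]
                reachable.add(CODON_TABLE[mutant])
    return reachable

# Both lysine codons, since the actual K84 codon is not assumed.
for lys_codon in ("AAA", "AAG"):
    neighbors = single_base_neighbors(lys_codon)
    print(lys_codon, sorted(neighbors), "L" in neighbors, "A" in neighbors)
```

Neither lysine codon is one substitution away from any leucine or alanine codon, consistent with the absence of the K84 revertants from a screen that recovered only single base pair changes. The same function shows that either tyrosine codon (TAT or TAC) is a single substitution from an aspartate codon, which is compatible with the recovery of true D282Y revertants.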
One other result that bears note is that some of the wild-type revertants (D282Y) included mutations at one or more of the three cysteine residues (C107, C140, and C281; see Table 2). These were the only revertants containing more than one mutation per LacI gene. We find these results very intriguing in light of challenges encountered when altering these cysteines in other projects (Wycuff 1999).
Global stabilization of oligomerization interfaces: H74R, M98I, M223I/T, M232I, Q248R, and S354F
The original mutations known to compensate Y282D oligomerization (K84L and K84A) appear to function through strengthening the monomer–monomer interface (Nichols and Matthews 1997; Gerk et al. 2000). Therefore, the fact that the first two mutants identified by this study, H74R and M98I, are also located on the core N-subdomain in the monomer–monomer interface was very satisfying (Fig. 2). In fact, the contacts between M98 and K84 are two of the four interactions conserved in the closed core N-interface for all three repressors (Fig. 4A–C). M98I may therefore function by forming a more extensive hydrophobic patch at the interface. Similarly, M223 is involved in a cross-monomer interaction of the core C-subdomain with S280 (Lewis et al. 1996). The LacI reversion mutation M223I/T may also strengthen the hydrophobicity of this contact (Fig. 7).
In contrast, an ionic interaction occurs between H74 and D278 when wild-type LacI binds inducer (Friedman et al. 1995; Lewis et al. 1996). This interaction is the sole contact between the core N-subdomain of one monomer and the core C-subdomain of its partner monomer and is not present when LacI binds DNA (Fig. 2; Lewis et al. 1996; Bell and Lewis 2000). In fact, the cross-monomer partner for D278 in the “open” DNA-bound conformation is Q248 (Fig. 7; Bell and Lewis 2000). Interestingly, both H74R and Q248R compensate the Y282D lesion. These structural observations led to an initial hypothesis that the H74/D278 ion pair is essential for allosteric communication. However, Barry and Matthews (1999) found that, although mutations at position 74 do affect allosteric communication, the ionic bond is not required. Because arginine should form a stronger ion pair than histidine, the original hypothesis would predict that H74R should strengthen the inducer-bound form of LacI, which cannot bind DNA with high affinity. Therefore, the observation that the H74R mutation allows Y282D to bind operator DNA supports the conclusions of Barry and Matthews (1999). The stronger base (arginine) in H74R (or Q248R) may compensate for the additional negative charge of the Y282D mutation. Interestingly, E70 of PurR makes a comparable ionic bond with PurR R278, and an E70A mutation (which removes a negative charge) results in a shift to the closed conformation, which has high affinity for DNA (Lu et al. 1998).
The locations of two other reversion mutations, M232I and S354F, have surprising implications (Fig. 2). S354 is in the LHR tetramerization region of LacI (Fig. 2). In wild-type LacI, residues 351 and 355 of one monomer contact the core domain of its partner monomer (Fig. 3); this interaction is asymmetric in that it occurs only once per dimer. Therefore, we speculate that the S354F mutation may provide an additional LHR-to-core interaction, thus stabilizing the assembled form of the protein. In contrast, M232 is not near any known or potential monomer–monomer interfaces. In fact, residue 232 is on a loop that juts into the space between dimers and may form a second dimer–dimer interface (in addition to the LHR; not shown). We speculate that M232I again provides a tighter hydrophobic contact, stabilizes the tetramer, and by Le Chatelier's principle, strengthens the dimers. This effect on the linked assembly steps is opposite to that observed for the L251A mutation (Dong et al. 1999).
Local structural effects: M223I/T, N246S, Q248R, D274N/G, T276A/I, and P284S
Another simple mechanism for compensating the Y282D mutation is through local adjustments in the structure that accommodate the tyrosine to aspartate change. Indeed, nine mutations—M223I/T, N246S, Q248R, D274N/G, T276A/I, and P284S—all occur in the immediate area of the original Y282D mutation (Fig. 7), and we postulate that they function in this manner. M223I/T and Q248R also participate in the monomer–monomer interface (see earlier). In particular, one means by which Q248R and D274N/G might compensate Y282D is through reducing excessive negative charge created by the aspartate. Several of these “local” sites are directly involved in wild-type LacI inducer binding (Friedman et al. 1995; Lewis et al. 1996). Interestingly, the phenotypes of E. coli expressing genes encoding some of these compensating mutations in a wild-type (Y282) background have been examined by the Miller lab and found to alter allosteric response to inducer (Kleina and Miller 1990; Suckow et al. 1996).
Relevant details for each site follow. N246 directly contacts inducer IPTG (Friedman et al. 1995; Lewis et al. 1996), and the mutation N246S is classified as having weak inducer sensitivity by Kleina and Miller (1990). These authors likewise report that Q248R in a wild-type background is insensitive to inducer (Kleina and Miller 1990). D274 also contacts IPTG (Friedman et al. 1995; Lewis et al. 1996), and D274N/wild type cannot bind inducer in vitro (Chang et al. 1994). D274G also has an inducer-insensitive phenotype, as does T276A (Suckow et al. 1996). Experiments to determine the effects of these mutations within the context of Y282D on inducer binding and allosteric response promise interesting results. However, instead of further phenotypic analysis or approximate assays of β-galactosidase activity, we have opted to begin purification and biophysical characterization of the double revertant/Y282D mutants. This decision was based on our experience with K84L, which has an inducer-insensitive phenotype (Kleina and Miller 1990; Suckow et al. 1996) but wild-type thermodynamics of IPTG binding (Chang et al. 1993). The discrepancy between the results of the two techniques is actually due to extraordinarily slow kinetics of inducer binding (Chang et al. 1993).
Several features of the mutation P284S warrant additional comment. This site is the only “local” residue not located near the inducer binding pocket and may deserve to be categorized separately (Fig. 7). In a wild-type background, P284S cannot bind DNA and is a heat-sensitive mutation (Suckow et al. 1996). Although this site is not in the monomer–monomer interface, it is directly adjacent to the 281–282 region postulated from the structural alignment (Figs. 3, 4) to be critical to dimerization. Perhaps this single mutation also disrupts assembly, whereas the two mutations P284S and Y282D are complementary, allowing assembly and DNA binding.
Putative allosteric mutants: M42I, A133T/V, D149N, V150I, S151P, S191F, L296M, and V321I
The locations of the remaining mutants identified as Y282D revertants are quite surprising (Figs. 8, 9). In fact, when A133V was first detected, we considered the possibility that it could be an artifact of the phenotypic screen until a cluster of revertant sites (A133T/V, D149N, V150I, S151P, and S191F) became apparent behind the inducer binding site on the face opposite the monomer–monomer interface (Fig. 8). In a wild-type background, both D149 and S191 contact inducer (Friedman et al. 1995; Lewis et al. 1996), and the S151P and S191F mutations cause an inducer-insensitive phenotype (Suckow et al. 1996). This region of the protein also appears to be intimately involved in the allosteric conformational change that occurs with inducer binding, in which the core N-subdomains move apart from each other while the core C-subdomains are fixed (Lewis et al. 1996; Bell and Lewis 2000; Matthews et al. 2000). A similar conformational change occurs in PurR, although in this case corepressor is required for assuming the DNA high-affinity conformation (Schumacher et al. 1994, 1995). Interestingly, the PurR mutation W147A shifts the protein to the conformation with high affinity for DNA, even in the absence of corepressor (Lu et al. 1998). Because the PurR residue 147 aligns with LacI 150 (Fig. 3), a similar mechanism may apply for this set of Y282D revertants, in particular the LacI V150I. Stabilization of the LacI DNA-bound form may facilitate detection of extremely low levels of assembled Y282D not perceptible by in vitro characterization.
Although L296M and V321I do not adjoin the cluster around residue 150, they are each on one of the three strands that covalently link the core N- and C-subdomains (Fig. 9) and are affected by induction when the core N-subdomains rotate together. Therefore, we find it reasonable to extend the hypothesis of long-range conformational effect to these residues. As we proceed with in vitro characterization, this group of intriguing mutants will be of particular interest.
Finally, we include M42I in this category because it probably also affects DNA binding. M42 is in the DNA-binding domain (Fig. 8), and the recent high-resolution structure of DNA-bound dimeric LacI shows the M42 side chain pointing toward the C terminus of helix 2 (Chuprina et al. 1993; Slijper et al. 1996; Bell and Lewis 2000). The M42I mutation may therefore stabilize the HTH DNA-binding domain, generating a protein with higher affinity for DNA. This mutation might allow detection of small populations of Y282D oligomer. Alternatively, the DNA binding affinity of M42I might be increased enough to detect binding of LacI monomer, which normally binds with an affinity 10,000-fold lower than that of the assembled species.