One fascinating aspect of protein folding is that a wide range of primary sequences can generate very similar secondary and tertiary structures. For example, Thornton and her colleagues have demonstrated that many sequences with less than 35% identity have the same folding topology (Swindells et al. 1998; this reference also provides a very helpful review of the various structural classification schemes currently available). Plasticity of folding is well illustrated by the LacI/GalR family of repressor proteins.† The breadth of this family has been recognized (von Wilcken-Bergmann and Müller-Hill 1982; Weickert and Adhya 1992), and a recent search of protein sequence databases found more than 30 repressor proteins from various bacterial strains with similarity to LacI (Swiss Institute of Bioinformatics BLAST search, Altschul et al. 1990). Of these, PurR has the highest sequence identity with LacI, ∼32%. Furthermore, the core domain of LacI has 10%–25% sequence identity with periplasmic sugar binding proteins (Müller-Hill 1983; Sams et al. 1984; Mowbray and Björkman 1999).
Predictions of similar tertiary structures between and within these two protein families were validated by comparing the X-ray crystallographic structures of LacI, PurR, and TreR (Schumacher et al. 1994, 1995; Friedman et al. 1995; Lewis et al. 1996; Hars et al. 1998; Bell and Lewis 2000) and GBP, RBP, ABP, and ALBP (Newcomer et al. 1981; Quiocho and Vyas 1984; Vyas et al. 1988, 1991; Mowbray and Cole 1992; Björkman and Mowbray 1998; Chaudhuri et al. 1999). The repressor core domains and the binding proteins consist of two subdomains connected by three strands that cross back and forth between them (see LacI structure in Fig. 1). Each subdomain has a central β sheet surrounded by α helices, and the sugar/effector binding site (Fig. 1, *) is formed by the residues that line a cleft between the two subdomains (Newcomer et al. 1981; Quiocho and Vyas 1984; Vyas et al. 1988, 1991; Mowbray and Cole 1992; Schumacher et al. 1994, 1995; Friedman et al. 1995; Lewis et al. 1996; Björkman and Mowbray 1998; Hars et al. 1998; Chaudhuri et al. 1999; Bell and Lewis 2000). In light of the striking structural similarities of these proteins, the difference in their assembly states is conspicuous: The repressor proteins all form oligomers, with the monomer–monomer interface located on the core domains (Weickert and Adhya 1992; Schumacher et al. 1994, 1995; Friedman et al. 1995; Lewis et al. 1996; Hars et al. 1998; Bell and Lewis 2000), whereas the periplasmic sugar binding proteins are monomeric. This observation raises questions of what determinants are required for LacI/GalR family assembly and whether and how the concept of plasticity in folding can be extended to quaternary structure. A structure-based alignment of three repressors and four periplasmic binding proteins has been created to examine these issues.
Interestingly, wild-type LacI itself shows some quaternary structural plasticity: Two structures have been determined for tetrameric, inducer-bound LacI: that of only the core (Friedman et al. 1995) and that of the complete protein (Lewis et al. 1996). The four monomer–monomer interfaces in these structures all differ slightly, primarily in the number of atomic contacts made by various residues. Most contacts between LacI monomers are mediated through one face of the core domain (Fig. 2). Additional monomer–monomer contacts occur between the DNA binding domain and core N-subdomain and between the core C-subdomain and the LHR (Fig. 1).
Contributions to dimerization from the core C-subdomain have been known to be important for a long time. The significance of the region around 282 was indicated as early as 1976 when the variant designated T41 was determined to be monomeric (Schmitz et al. 1976). The sequence of this mutant was later found to contain the point mutation Y282D (Fig. 2; Chen and Matthews 1992b). In vitro characterization of purified Y282D demonstrated that the monomeric protein was folded and exhibited wild-type inducer binding properties (Daly and Matthews 1986). Later in vitro experiments showed that substituting A, E, and S at this position also disrupted assembly, whereas L and F did not (Chakerian and Matthews 1991). Changes at LacI 281 are not as dramatic but can also compromise assembly (Fig. 2; Chakerian and Matthews 1991; Spott et al. 2000). Although neither single mutation abrogates assembly, the double mutation C281S/Y282L does (Chakerian and Matthews 1991). Combinations of mutations at 281 and other sites such as 223, 255, 282, and 283 also compromise assembly as measured by phenotypic behavior (Spott et al. 2000). A number of other mutations near the core C-subdomain interface have been recently examined and demonstrate varied degrees of tolerance for substitution (Spott et al. 2000).
An extremely interesting demonstration of plasticity in the LacI interface became apparent when attempts were made to use the monomeric nature of Y282D for other in vitro studies. Strikingly, either of two mutations at amino acid 84 (K84L or K84A) can compensate the Y282D dimerization defect, restoring assembly (Fig. 2; Chang et al. 1993). Residues 282 and 84 are spatially separated; in fact, they are located on different core subdomains (Fig. 2; Nichols et al. 1993; Friedman et al. 1995; Lewis et al. 1996; Bell and Lewis 2000). Therefore, this compensation is not due to rearrangements around the primary Y282D mutation. Instead, these K84 mutations must compensate the disruption caused by Y282D through globally strengthening the interface. In fact, both mutations at 84 in a wild-type background are extraordinarily resistant to chemical denaturation (Nichols and Matthews 1997). Hydrophobic mutations at position 84 also cause significant increases in the thermal stability of LacI (Gerk et al. 2000).
We hypothesized that other combinations of residues might compensate the dimerization defect of Y282D, either by increasing the global stability of dimer assembly or through local rearrangements around position 282. Designed mutations provide one avenue for exploring this issue, but such efforts are limited by our incomplete understanding of protein structures. Instead, we used the link between LacI assembly and function, which allows phenotypic screening for DNA binding of randomly mutated LacI. Because a dimer is the minimal unit for binding DNA, the mutation Y282D also destroys DNA binding. When K84L/A restores assembly to Y282D, DNA binding is also recovered. Any other second-site mutations generated through random mutagenesis that restore Y282D assembly should also restore DNA binding. This technique does not rely on structural assumptions and allows rapid screening of many candidates. In fact, the number of mutations, 22, that compensate the Y282D defect is surprisingly large, and many of the sites would not have been chosen for study by conventional techniques. We have examined these mutations in the context of the LacI structure and deduce that they restore effective assembly through multiple mechanisms.
Results and discussion
Comparison of repressor and periplasmic binding proteins
To explore the question of which residues are necessary for repressor dimerization, structures of the repressor core domains and several periplasmic binding proteins were used to create a structure-based alignment (Fig. 3; DALI: Holm and Sander 1996). For the repressor proteins, residues that have either side chain or backbone atoms within 3.5 Å of the partner monomer or hydrophobic contacts within 4.5 Å have been mapped onto this alignment. Interface differences between unliganded proteins, which have “open” structures, and those of the “closed” ligand-bound conformations are also noted. Additional water-mediated interactions (at least one atom from each monomer within 3.5 Å of the same water molecule) are listed in Table 1. In general, the latter also make direct contacts and are included in Figure 3, although water molecules mediate additional interactions for TreR 95 to 99 and PurR 284 to 284, 284 to 252, and 285/286 to 260.
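The contact criteria described above can be illustrated with a minimal Python sketch. It is not the procedure used to build Figure 3: the `Atom` record, the function name, and the toy coordinates are hypothetical, and treating a "hydrophobic contact" as any carbon–carbon pair within 4.5 Å is an assumption made only for illustration.

```python
from math import dist
from typing import NamedTuple


class Atom(NamedTuple):
    residue: int   # residue number within its monomer
    element: str   # element symbol, e.g. "C", "N", "O"
    xyz: tuple     # (x, y, z) coordinates in Å


def interface_residues(monomer_a, monomer_b,
                       direct_cutoff=3.5, hydrophobic_cutoff=4.5):
    """Return residue numbers of monomer_a that contact monomer_b.

    A residue counts as an interface residue if any of its atoms lies
    within direct_cutoff of any partner atom, or (assumption) if one of
    its carbons lies within hydrophobic_cutoff of a partner carbon.
    """
    contacts = set()
    for a in monomer_a:
        for b in monomer_b:
            d = dist(a.xyz, b.xyz)
            both_carbon = a.element == "C" and b.element == "C"
            if d <= direct_cutoff or (both_carbon and d <= hydrophobic_cutoff):
                contacts.add(a.residue)
                break  # one qualifying partner atom is enough
    return contacts


# Toy coordinates (not taken from any deposited structure).
monomer_a = [
    Atom(84, "N", (0.0, 0.0, 0.0)),
    Atom(98, "C", (10.0, 0.0, 0.0)),
    Atom(120, "O", (30.0, 0.0, 0.0)),
]
monomer_b = [
    Atom(98, "O", (3.0, 0.0, 0.0)),    # 3.0 Å from residue 84 N
    Atom(84, "C", (14.0, 0.0, 0.0)),   # 4.0 Å from residue 98 C
    Atom(200, "N", (34.2, 0.0, 0.0)),  # 4.2 Å from residue 120 O
]
print(sorted(interface_residues(monomer_a, monomer_b)))
```

In this toy case residue 120 is excluded because its nearest partner atom is 4.2 Å away and the pair is not carbon–carbon, so only the 3.5 Å criterion applies to it.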
Of the N-subdomain sites that directly contact their partner monomers, 10 are structurally homologous for all three repressors; these positions are indicated with Roman numerals I–X in Figure 3A. However, no discrete pattern emerges from their amino acid identities that can delineate between oligomeric repressors and monomeric binding proteins. None of these sites have identical primary sequences for all three repressors. Amino acid similarity is only well-conserved as hydrophobic residues at sites VII and VIII, but the monomeric binding proteins also have hydrophobic side chains at these positions. In fact, an overall comparison for the N-core subdomain of RBP and the repressor proteins shows no significant deviation at any site. ABP does have a potentially disruptive proline at site II and a lysine at V and VIII. GBP has very low correlation at positions I and V, whereas ALBP is disparate at I, V, VII, and IX. However, given the low similarity between the three repressors, these differences are not remarkable. Sequence examination of this region does not reveal firm guidelines for oligomerization.
Since the primary sequences of these proteins demonstrate no striking patterns, we extended the analysis by looking for conserved interactions between pairs of sites. Contact maps were created for each interface to show the interactions that occur in the open and closed forms of each protein (Fig. 4). The open (ligand-free) structures of LacI and PurR have only one conserved interaction, between position III of monomer A and LacI residue 81 or PurR 79 on monomer B (Fig. 4A,B, dashed line). (The open structure of TreR is not available.) Four additional interactions create similar patterns on the contact maps of these two proteins (Fig. 4A,B, dotted lines): LacI has interactions between A:IV and B:VIII/B:IX that occur symmetrically from B to A. PurR has similar interactions using position V instead of IV. Otherwise, the interactions between unbound monomers produce strikingly different patterns. This variance is probably related to two issues. First, the open states of the two proteins differ the most at the core N-subdomain, both in their final positions and in their axes of rotation describing the motion that occurs on ligand-binding/release (Mowbray and Björkman 1999). Second, the two proteins have transposed functional modes. The open structure of LacI binds DNA, whereas PurR can only bind DNA in the closed complex with corepressor. In that light, both DNA-free forms of the proteins have interactions from A:VII to B:X and A:X to B:X. Further analysis of the N-terminal interface in the open, DNA-bound state must await new structural information from other repressors.
Even though their tertiary structures are more similar (Mowbray and Björkman 1999), the interaction maps of the closed core N-subdomains do not indicate many more common patterns. Only four interactions are conserved between the three repressors: A:VIII to B:IV, A:IV to B:VIII, VII to VII, and VIII to VIII (Fig. 4A–C, thick lines). Of these, position IV is conspicuous because several mutations of LacI 84 are known to have profound effects on interface strength (Chang et al. 1993; Nichols and Matthews 1997; Gerk et al. 2000). Some interactions are conserved between pairs of repressors, although no two proteins are more similar than the third (Fig. 4A–C, dashed lines). A few more similar but nonidentical patterns are indicated with dotted lines in Figure 4A–C. Therefore, the original sequence-based observation that the N-subdomain yields few strict rules of oligomerization appears to hold on structural analysis.
Examination of the core C-subdomain does indicate a region that may be critical for assembly. This subdomain provides an unmoving anchor for the allosteric conformational changes of LacI and PurR (Schumacher et al. 1994, 1995; Friedman et al. 1995; Lewis et al. 1996; Mowbray and Björkman 1999; Bell and Lewis 2000; Matthews et al. 2000). Although the C-subdomain has fewer amino acids involved in the interface than does the N-subdomain, a greater percentage of sites are structurally conserved. These are labeled with Roman numerals I–VIII in Figure 3A. Of these, positions I, II, V, and VI show little sequence similarity between all three repressors. In contrast, sites III, IV, VII, and VIII are all very hydrophobic. For the periplasmic binding proteins, this hydrophobicity is conserved at III and IV (with the exception of GBP III). However, the two protein families are not at all similar around VII and VIII. In fact, the primary sequences of the periplasmic binding proteins are so different from those of the repressors that their structural equivalence could not be inferred for a previous sequence-based alignment (Nichols et al. 1993). Differences arise not only from the primary sequences between the two groups at VII and VIII, but also from the insertions subsequent to position VIII that occur in all four periplasmic binding proteins examined (Fig. 3A). For ABP, the six amino acids inserted in this region do not align with any other residues and, for simplicity, are indicated with dots.
Primary sequence mutations at site VIII had previously implicated this region in dimer assembly. Monomeric LacI Y282D was one of the earliest characterized mutations of this repressor (Schmitz et al. 1976; Daly and Matthews 1986; Chakerian and Matthews 1991; Chen and Matthews 1992b). In light of the known assembly states of LacI Y282 mutants, the nature of residues for the other aligned proteins is very intriguing. The PurR residue VIII is an F and TreR has an L; both substitutions are precisely those tolerated by LacI (Chakerian and Matthews 1991). Both PurR and TreR also have hydrophobic amino acids at position VII, and LacI C281 also has hydrophobic character (Fig. 3A). None of the periplasmic binding proteins has hydrophobic residues at VII, although the S in ABP might be considered similar to the C of LacI. However, ABP has E at position VIII, where placing a D is fatal to LacI assembly. The VII–VIII regions for RBP and ALBP are also highly charged. Although the GBP sequence in the VII–VIII region is hydrophobic, this monomeric protein has an additional charged residue inserted on the loop subsequent to VIII. Interestingly, residue 289 of the LacI/GalR family member cytidine repressor also aligns with the site immediately following position VIII (Weickert and Adhya 1992). Substitution of a charged residue in the mutation C289R abrogates assembly of this normally oligomeric repressor (Barbier and Short 1993).
Contact maps of the core C-subdomain are also very striking (Fig. 4D–F). Most of the interactions are conserved among all three repressors (Fig. 4D–F, thick lines). The pattern is most noticeable for TreR, which has only four unique interactions (Fig. 4F, thin lines). The open and closed C-interfaces of LacI and PurR exhibit few differences, consistent with previous observations that this subdomain is not affected by the allosteric conformational change (Fig. 4D,E; Schumacher et al. 1994, 1995; Friedman et al. 1995; Lewis et al. 1996; Mowbray and Björkman 1999; Bell and Lewis 2000; Matthews et al. 2000). The five interactions of residue VII are particularly striking and appear to provide the key stabilizing interactions of this interface. Structurally, the side chain of this residue provides the “ball” for a hydrophobic “socket” of its partners (Fig. 5).
However, this site also provides an excellent example of why structural data alone are not sufficient for deciphering the code of protein structure. Both in vivo and in vitro studies demonstrate that position VII of LacI nonetheless tolerates a wide variety of substitutions. LacI C281 mutation to S, A, F, I, or M results in biophysical characteristics similar to wild-type protein (Chakerian and Matthews 1991). Phenotypic analysis of C281 mutations to L, H, Y, F, A, S, Q, R, or E demonstrates that these mutants all maintain their repression function (Suckow et al. 1996), which would be destroyed if assembly were abolished. Only C281 altered to P, K, and G diminished repression (Suckow et al. 1996). C281 has also been the target of extensive chemical modification (Daly et al. 1986). Low reaction rates indicate that C281 is shielded from solvent and is modified by only extremely reactive compounds, generally at high concentrations of reactants. However, the function of the resulting protein is only moderately affected, and the modified interface must still be intact (Daly et al. 1986). Thus, LacI position VII demonstrates plasticity even beyond the options provided by nature.
In contrast, LacI mutations at position VIII, which makes only two contacts in the interface (Fig. 4D), are poorly tolerated. Chakerian and Matthews (1991) demonstrated that only F and L are tolerated; A, S, and E abolish oligomerization. Phenotypic analysis yielded similar results: L and F maintained wild-type phenotypes whereas P, H, C, A, G, S, N, K, R, or E substitutions abolished repression function (Suckow et al. 1996). Our current hypothesis is that the differing mutational sensitivities arise from the proximity of each site to the surface of the interface. Site VII is on the edge, but VIII is buried in the middle and therefore has stricter packing constraints.
Despite mutational sensitivity at site VIII, both assembly and function of the LacI Y282D mutation are restored by a second-site mutation at K84. To determine how many second-site mutations can revert the phenotype of Y282D and thereby explore plasticity of quaternary structure, we developed tools to screen the function of a large number of randomly mutated proteins. The phenotypic screen presented in Figure 6 required several stages of development but evolved into a facile and effective tool with reproducible results even in the hands of novice researchers. The detailed rationale of the screen is as follows. Wild-type LacI (WT) is expressed from the repressor plasmid and binds to its operator site on the pZCam reporter plasmid. This binding represses transcription of β-galactosidase, and colonies remain white in the presence of the indicator Xgal. If the Y282D mutation is present on the repressor plasmid, the resulting LacI protein is monomeric and cannot bind DNA, allowing β-galactosidase transcription and resulting in turquoise blue colonies in the presence of Xgal. If a second mutation restores dimerization of Y282D protein, so that resulting protein can again bind operator DNA and inhibit β-galactosidase transcription, colonies remain white in the presence of Xgal. If the doubly mutated protein exists in equilibrium between monomer and dimer, it will not repress as efficiently as wild-type LacI and may produce light blue colonies. This last feature confers an advantage to the blue/white screen over alternative toxicity selection approaches, which could not detect intermediates. A further benefit of this screen is that the high ratio of repressor protein to reporter plasmid can allow detection of double mutants that may form only weakly associated oligomers.
Results of the genetic screen of >30,000 colonies are presented in Table 2. The mutations that compensate the dimerization defect of Y282D produced either white or very light blue colonies. A large percent of light blue colonies identified in the screen were found to be false positives (Y282D with no further mutations), as well as a few of the white colonies. Although this result raises the possibility that some of the mutations detected by the screen do not actually affect LacI assembly or function, several lines of evidence support the position that they do compensate the Y282D defect. The most substantive evidence is structural analysis; sites are easily assigned to one of three classes (see below). An especially interesting group contains sequential sites: 149, 150, and 151. Furthermore, seven revertants were each detected more than once (Table 2), five of which (M42I, H74R, N246S, T276I, and S354F) were isolated from Y282D plasmids mutated in separate experiments. Finally, four of the sites (133, 223, 274, and 276) had multiple mutations (Table 2).
Since all of the reversion mutations resulted from a single base pair change, we found their number to be surprisingly large. Furthermore, the K84 mutations known to compensate Y282D (Chang et al. 1993) were not obtained, because both require changing more than one base pair, implying that mutagenic saturation has not yet been reached. Although we attempted higher frequency mutation by passaging singly mutated plasmids back through XL1-Red cells, these plasmids were corrupted beyond use (data not shown). Further efforts using PCR-based methods are being pursued to exhaustively identify second-site revertants for the Y282D phenotype.
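The single-base-change argument can be checked against the standard genetic code with a short Python sketch. The function name is hypothetical, and because the codons actually used in the lacI gene are not given here, the sketch takes the minimum over all synonymous codons, which gives a lower bound on the number of base changes needed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first position varies slowest).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def min_base_changes(from_aa, to_aa):
    """Smallest number of base substitutions converting any codon for
    from_aa into any codon for to_aa (a lower bound for a real gene)."""
    sources = [c for c, aa in CODON_TABLE.items() if aa == from_aa]
    targets = [c for c, aa in CODON_TABLE.items() if aa == to_aa]
    return min(sum(x != y for x, y in zip(s, t))
               for s in sources for t in targets)


for change in ("YD", "KL", "KA", "MI", "HR"):
    print(change[0], "->", change[1], min_base_changes(*change))
```

Under these assumptions, lysine to leucine and lysine to alanine each need at least two substitutions whichever lysine codon is present, whereas tyrosine to aspartate and revertants such as M42I or H74R are reachable with one.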
One other result that bears note is that some of the wild-type revertants (D282Y) included mutations at one or more of the three cysteine residues (C107, C140, and C281; see Table 2). These were the only revertants containing more than one mutation per LacI gene. We find these results very intriguing in light of challenges encountered when altering these cysteines in other projects (Wycuff 1999).
Sites of reversion
The mutations that compensate for the dimerization defect caused by Y282D are listed in Table 2. These sites were mapped onto the X-ray crystallographic structures for LacI (Friedman et al. 1995; Lewis et al. 1996; Bell and Lewis 2000). The sites of mutations cluster into three groups: (1) residues where changes might globally strengthen the oligomerization interfaces in a manner analogous to K84L; (2) side chains positioned so that mutations might elicit local structural rearrangement and compensate the effects of Y282D; and (3) sites that may either affect long-range structural changes, shifting LacI to a state with high affinity for DNA, or increase the DNA binding affinity of a species for which it was not previously detectable.
Global stabilization of oligomerization interfaces: H74R, M98I, M223I/T, M232I, Q248R, and S354F
The original mutations known to compensate Y282D oligomerization (K84L and K84A) appear to function through strengthening the monomer–monomer interface (Nichols and Matthews 1997; Gerk et al. 2000). Therefore, the fact that the first two mutants identified by this study, H74R and M98I, are also located on the core N-subdomain in the monomer–monomer interface was very satisfying (Fig. 2). In fact, the contacts between M98 and K84 are two of the four interactions conserved in the closed core N-interface for all three repressors (Fig. 4A–C). M98I may therefore function by forming a more extensive hydrophobic patch at the interface. Similarly, M223 is involved in a cross-monomer interaction of the core C-subdomain with S280 (Lewis et al. 1996). The LacI reversion mutation M223I/T may also strengthen the hydrophobicity of this contact (Fig. 7).
In contrast, an ionic interaction occurs between H74 and D278 when wild-type LacI binds inducer (Friedman et al. 1995; Lewis et al. 1996). This interaction is the sole contact between the core N-subdomain of one monomer and the core C-subdomain of its partner monomer and is not present when LacI binds DNA (Fig. 2; Lewis et al. 1996; Bell and Lewis 2000). In fact, the cross-monomer partner for D278 in the “open” DNA-bound conformation is Q248 (Fig. 7; Bell and Lewis 2000). Interestingly, both H74R and Q248R compensate the Y282D lesion. These structural observations led to an initial hypothesis that the H74/D278 ion pair is essential for allosteric communication. However, Barry and Matthews (1999) found that, although mutations at position 74 do affect allosteric communication, the ionic bond is not required. Because arginine should form a stronger ion pair than histidine, the original hypothesis would predict that H74R should strengthen the inducer-bound form of LacI, which cannot bind DNA with high affinity. Therefore, the observation that the H74R mutation allows Y282D to bind operator DNA supports the conclusions of Barry and Matthews (1999). The stronger base (arginine) in H74R (or Q248R) may compensate for the additional negative charge of Y282D. Interestingly, E70 of PurR makes a comparable ionic bond with PurR R278, and an E70A mutation (which removes a negative charge) results in a shift to the closed conformation, which has high affinity for DNA (Lu et al. 1998).
The locations of two other reversion mutations, M232I and S354F, have surprising implications (Fig. 2). S354 is in the LHR tetramerization region of LacI (Fig. 2). In wild-type LacI, residues 351 and 355 of one monomer contact the core domain of its partner monomer (Fig. 3); this interaction is asymmetric in that it occurs only once per dimer. Therefore, we speculate that the S354F mutation may provide an additional LHR-to-core interaction, thus stabilizing the assembled form of the protein. In contrast, M232 is not near any known or potential monomer–monomer interfaces. In fact, residue 232 is on a loop that juts into the space between dimers and may form a second dimer–dimer interface (in addition to the LHR; not shown). We speculate that M232I again provides a tighter hydrophobic contact, stabilizes the tetramer, and by Le Chatelier's principle, strengthens the dimers. This effect on the linked assembly steps is opposite to that observed for the L251A mutation (Dong et al. 1999).
Local structural effects: M223I/T, N246S, Q248R, D274N/G, T276A/I, and P284S
Another simple mechanism for compensating the Y282D mutation is through local adjustments in the structure that accommodate the tyrosine to aspartate change. Indeed, nine mutations—M223I/T, N246S, Q248R, D274N/G, T276A/I, and P284S—all occur in the immediate area of the original Y282D mutation (Fig. 7), and we postulate that they function in this manner. M223I/T and Q248R also participate in the monomer–monomer interface (see earlier). In particular, one means by which Q248R and D274N/G might compensate Y282D is through reducing excessive negative charge created by the aspartate. Several of these “local” sites are directly involved in wild-type LacI inducer binding (Friedman et al. 1995; Lewis et al. 1996). Interestingly, the phenotypes of E. coli expressing genes encoding some of these compensating mutations in a wild-type (Y282) background have been examined by the Miller lab and found to alter allosteric response to inducer (Kleina and Miller 1990; Suckow et al. 1996).
Relevant details for each site follow. N246 directly contacts inducer IPTG (Friedman et al. 1995; Lewis et al. 1996), and the mutation N246S is classified as having weak inducer sensitivity by Kleina and Miller (1990). These authors likewise report that Q248R in a wild-type background is insensitive to inducer (Kleina and Miller 1990). D274 also contacts IPTG (Friedman et al. 1995; Lewis et al. 1996), and D274N/wild type cannot bind inducer in vitro (Chang et al. 1994). D274G also has an inducer-insensitive phenotype, as does T276A (Suckow et al. 1996). Experiments to determine the effects of these mutations within the context of Y282D on inducer binding and allosteric response promise interesting results. However, instead of further phenotypic analysis or approximate assays of β-galactosidase activity, we have opted to begin purification and biophysical characterization of the double revertant/Y282D mutants. This decision was based on our experience with K84L, which has an inducer-insensitive phenotype (Kleina and Miller 1990; Suckow et al. 1996) but wild-type thermodynamics of IPTG binding (Chang et al. 1993). The discrepancy in the results from the two techniques is actually due to extraordinarily slow kinetics of inducer binding (Chang et al. 1993).
Several features of the mutation P284S warrant additional comment. This site is the only “local” residue not located near the inducer binding pocket and may deserve to be categorized separately (Fig. 7). In a wild-type background, P284S cannot bind DNA and is a heat-sensitive mutation (Suckow et al. 1996). Although this site is not in the monomer–monomer interface, it is directly adjacent to the 281–282 region postulated from the structural alignment (Figs. 3, 4) to be critical to dimerization. Perhaps this single mutation also disrupts assembly, whereas the two mutations P284S and Y282D are complementary, allowing assembly and DNA binding.
Putative allosteric mutants: M42I, A133T/V, D149N, V150I, S151P, S191F, L296M, and V321I
The locations of the remaining mutants identified as Y282D revertants are quite surprising (Figs. 8, 9). In fact, when A133V was first detected, we considered the possibility that it could be an artifact of the phenotypic screen until a cluster of revertant sites (A133T/V, D149N, V150I, S151P, and S191F) became apparent behind the inducer binding site on the face opposite the monomer–monomer interface (Fig. 8). In a wild-type background, both D149 and S191 contact inducer (Friedman et al. 1995; Lewis et al. 1996), and the S151P and S191F mutations cause an inducer-insensitive phenotype (Suckow et al. 1996). This region of the protein also appears to be intimately involved in the allosteric conformational change that occurs with inducer binding, in which the core N-subdomains move apart from each other while the core C-subdomains are fixed (Lewis et al. 1996; Bell and Lewis 2000; Matthews et al. 2000). A similar conformational change occurs in PurR, although in this case corepressor is required for assuming the DNA high-affinity conformation (Schumacher et al. 1994, 1995). Interestingly, the PurR mutation W147A shifts the protein to the conformation with high affinity for DNA, even in the absence of corepressor (Lu et al. 1998). Because the PurR residue 147 aligns with LacI 150 (Fig. 3), a similar mechanism may apply for this set of Y282D revertants, in particular the LacI V150I. Stabilization of the LacI DNA-bound form may facilitate detection of extremely low levels of assembled Y282D not perceptible by in vitro characterization.
Although L296M and V321I do not adjoin the cluster around residue 150, they are each on one of the three strands that covalently link the core N- and C-subdomains (Fig. 9) and are affected by induction when the core N-subdomains rotate together. Therefore, we find it reasonable to extend the hypothesis of long-range conformational effect to these residues. As we proceed with in vitro characterization, this group of intriguing mutants will be of particular interest.
Finally, we include M42I in this category because it probably also affects DNA binding. M42 is in the DNA-binding domain (Fig. 8), and the recent high-resolution structure of DNA-bound dimeric LacI shows the M42 side chain pointing toward the C terminus of helix 2 (Chuprina et al. 1993; Slijper et al. 1996; Bell and Lewis 2000). The M42I mutation may therefore stabilize the HTH DNA-binding domain, generating a protein with higher affinity for DNA binding. This mutation might allow detection of small populations of Y282D oligomer. Alternatively, the DNA binding affinity of M42I might be increased enough to detect binding of LacI monomer, which normally binds with an affinity 10,000-fold less than those of assembled species.
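To give a sense of scale for the 10,000-fold affinity difference, the fold change can be converted into a binding free-energy difference with ΔΔG = RT ln(fold). The Python sketch below is only a back-of-the-envelope illustration: the temperature of 25 °C is an assumption not stated in the text, and the function name is hypothetical.

```python
from math import log

R_KCAL = 1.987e-3  # gas constant in kcal/(mol·K)


def fold_to_ddg(fold_change, temperature_k=298.15):
    """Free-energy difference (kcal/mol) equivalent to a fold change
    in binding affinity: ddG = R * T * ln(fold_change)."""
    return R_KCAL * temperature_k * log(fold_change)


# 10,000-fold weaker binding by monomer, at an assumed 25 °C.
print(round(fold_to_ddg(1e4), 2))
```

At the assumed temperature this comes to roughly 5.5 kcal/mol, which is the approximate gain in binding energy a monomer would need to match the affinity of the assembled species.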
The ability of many different amino acid sequences to assume similar folds demonstrates the plasticity of secondary and tertiary structures. The structurally homologous families of LacI/GalR repressors and periplasmic binding proteins provide good examples of this behavior (von Wilcken-Bergmann and Müller-Hill 1982; Müller-Hill 1983; Weickert and Adhya 1992). However, few studies have explored whether the same principles apply to quaternary structure.
Structural alignments of monomeric binding proteins and oligomeric repressors, as well as contact maps of the latter, indicate that the core N-subdomain does not comprise many of the critical dimerization interactions. This arrangement is highly amenable to the concept of quaternary plasticity. Exceptions to these statements are that the positions equivalent to LacI 84, 95, 96, and 98 participate in conserved interactions of the closed conformations. Interestingly, mutations at 84 have previously been identified to have significant effects on interface strength (Chang et al. 1993), and this paper reports that M98I may have similar properties. The core C-subdomain interface, in contrast, is composed of many conserved interactions. Although half of the sites that comprise this interface do not demonstrate primary sequence conservation, both sequence and structural analyses indicate the sites corresponding to LacI 281 and 282 are especially important. From these observations, one might predict that any mutations at these sites would be detrimental to oligomeric assembly. Although this hypothesis is consistent with experimental data for 282, mutations at 281 have little or no effect on LacI oligomerization. This result for 281 provides yet another example of quaternary plasticity in repressor assembly.
The monomeric Y282D mutation of LacI has provided an advantageous context for further examining the issue of plasticity in assembly. Random mutagenesis and phenotypic screening for Y282D revertants have identified a surprisingly broad range of sites that appear to represent several different mechanisms for correcting the oligomerization defect. Many of these sites would have been unlikely targets of “rational” experimental design. The diverse regions identified in this study indicate that determinants of oligomeric assembly and DNA affinity are remarkably flexible. Detailed biochemical and biophysical characterization of these mutant proteins will be required to identify the precise mechanisms by which they mitigate the dramatic effect of Y282D on assembly and to determine any functional consequences that result from changing these amino acids. This information will accrue from measurements of the thermodynamic parameters of assembly and ligand binding of the purified mutant proteins.
The mutagenesis experiments indicate that the current 360-amino acid scaffold for LacI tolerates a broad range of changes without abolishing the potential for assembly. The strength of the LacI interface comes from many small interactions, and the precise nature of the contributors is not stringent. Indeed, energetic loss at one position (e.g., Y282D) can be compensated by changing a variety of second sites. Furthermore, the primary sequences of the sites involved in the repressor interfaces are not significantly distinct from their analogous residues of the monomeric periplasmic binding proteins, except near C-subdomain positions VII/VIII. However, for LacI, position VII (281) tolerates mutagenesis, and a deleterious substitution at position VIII (282) is easily compensated with second-site mutations. Therefore, the binding proteins perhaps need only a few additional interactions to facilitate formation of quaternary structures similar to those of the repressors.
These two families have one other subtle but possibly critical difference that affects not only their primary but also their tertiary structures. The presence of additional amino acids subsequent to position VIII in the core C-subdomains of the monomeric periplasmic binding proteins, but not the oligomeric repressors, may abrogate oligomerization. This situation suggests several interesting experimental questions: Will addition of this loop to LacI abolish assembly? If so, can the monomer–monomer interface be restored by second-site mutations in a manner analogous to the Y282D revertants? Will elimination of this loop in the monomeric binding proteins allow their assembly? The answers to these questions will help define how closely the plasticity demonstrated for LacI quaternary structure is coupled to its underlying tertiary fold.
Materials and methods
Plasmids and random mutagenesis
The reporter plasmid pZCam was constructed by excising the LacZ gene for β-galactosidase and its lac operator from pCRII (Invitrogen) with the restriction enzyme AflIII. The resulting overhanging ends were filled in with Klenow fragment, and the product was inserted into the tetracycline resistance gene of pACYC184 (NEB) at the blunt ends of an EcoRV restriction site. pZCam retains the Cam resistance gene of pACYC184 and an origin of replication that is compatible with the ColE1 origin of the plasmids containing the various LacI mutations. The latter carry the β-lactamase gene for Amp resistance.
The wild-type LacI gene was on plasmid pLS1, which differs from the previously reported pJC1 (Chen and Matthews 1992a) by a silent mutation at amino acid 45 that creates a SacI restriction site. This mutation was introduced by using Stratagene's Chameleon Double-Stranded Site Directed Mutagenesis protocol, the mutagenic oligo 5′-GTTGGGAATGTAATTGAGCTCCGCCATCGC-3′, and the selection oligo 5′-GATCCGTCGACCTCGAGCCAAGCTTG-3′. The latter changes the PstI site of pJC1 to an XhoI site on pLS1.
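As a quick consistency check on the two oligonucleotides, the following minimal sketch searches each sequence for the relevant recognition site. The recognition sequences for SacI (GAGCTC), XhoI (CTCGAG), and PstI (CTGCAG) are standard values; the helper name `has_site` is ours and is not part of the original protocol. Because all three sites are palindromic, searching one strand is sufficient.

```python
# Standard recognition sequences (5'->3'); each is palindromic,
# so a match on one strand implies a match on the other.
SITES = {"SacI": "GAGCTC", "XhoI": "CTCGAG", "PstI": "CTGCAG"}

# Oligonucleotides as given in the text (written 5'->3').
mutagenic_oligo = "GTTGGGAATGTAATTGAGCTCCGCCATCGC"
selection_oligo = "GATCCGTCGACCTCGAGCCAAGCTTG"


def has_site(seq, enzyme):
    """Return True if the enzyme's recognition sequence occurs in seq."""
    return SITES[enzyme] in seq.upper()


# The mutagenic oligo should carry the new SacI site; the selection
# oligo should carry an XhoI site and no PstI site.
print("SacI in mutagenic oligo:", has_site(mutagenic_oligo, "SacI"))
print("XhoI in selection oligo:", has_site(selection_oligo, "XhoI"))
print("PstI in selection oligo:", has_site(selection_oligo, "PstI"))
```

This check only confirms that the sites are present in the oligonucleotides; it says nothing about their position or uniqueness within the full plasmids, whose sequences are not given here.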
The Y282D mutation of LacI was on the plasmid pAC1 (Chakerian and Matthews 1991). The use of this plasmid may have been fortuitous in detecting the large number of revertants. Different LacI plasmids have various numbers of DNA binding sites, and adding even one extra site can change the detection level of the screen (results not shown). The Y282D plasmid was randomly mutated by passaging the DNA through XL1-Red cells (Epicurian Coli, Stratagene). XL1-Red cells are deficient in DNA repair mechanisms, and errors in replication occur approximately once per kilobasepair. Because the LacI gene is 1080 basepairs long, mutagenesis in this system averaged approximately one base change per gene.
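The arithmetic behind this estimate can be sketched as follows. The gene length and the error rate of roughly one per kilobasepair come from the text. Treating the number of mutations per gene copy as Poisson-distributed is our simplifying assumption (independent, rare errors) and is not something the protocol states.

```python
import math

GENE_LENGTH_BP = 1080      # 360 codons x 3 bases
ERRORS_PER_BP = 1 / 1000   # ~1 replication error per kilobasepair (approximate)

# Mean number of base changes expected per copy of the LacI gene.
expected = GENE_LENGTH_BP * ERRORS_PER_BP


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson model (our assumption)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


print(f"Expected mutations per gene: {expected:.2f}")
for k in range(4):
    print(f"P({k} mutations) = {poisson_pmf(k, expected):.2f}")
```

Under this model a substantial fraction of plasmids would carry no change or more than one change in the gene, which is one reason candidate revertants need to be fully sequenced.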
Blue–White phenotypic screen
Many changes in the LacI DNA binding function are readily observed by using a phenotypic screen (Kleina and Miller 1990; Markiewicz et al. 1994; Suckow et al. 1996). However, previously constructed screens rely primarily on nonsense suppression at known amino acid positions and incorporate the LacI mutations into the episome, which fixes the repressor:reporter gene ratio. This work required a screen that allowed random mutagenesis, variability in the repressor:reporter ratio (for future applications), easy purification of modified DNA for sequencing, and the facility to transform the DNA into various E. coli strains (again, for future applications). These constraints are all met by keeping the LacI gene on a plasmid such as pAC1 or pLS1.
The ideal reporter gene would have been incorporated into an E. coli strain with a deleted LacI gene. However, we could locate only interrupted-LacI/LacZ combinations, which we believed would increase the risk of homologous recombination. Therefore, we chose a system in which the LacZ reporter gene was carried on the cotransformable plasmid (pZCam, see earlier). We first transformed pZCam into E. coli DH5α cells (GIBCO BRL), then made these cells competent again (using CaCl2) to create a strain designated DZCam, and finally transformed the various LacI plasmids into DZCam.
Randomly mutated Y282D DNA was transformed into competent DZCam cells, which were plated on Cam/Amp LB agar plates and grown at 37°C overnight. Control plates contained bacterial colonies expressing wild-type and Y282D LacI. All colonies were then sprayed with 40 mg/mL Xgal in dimethylformamide by using a TLC reagent sprayer and incubated at 4°C in the dark. After ∼7 h, colonies were again sprayed with Xgal and incubated ∼16 h under the same conditions. After this process, Y282D control colonies were dark blue, whereas colonies expressing wild-type LacI remained white (Fig. 6). The latter was always verified, because even wild-type LacI will allow expression of a small amount of β-galactosidase over extended periods of time, causing the colonies to become light blue. Any colonies transformed with mutated DNA that were white or lighter blue than the Y282D control were streaked on Cam/Amp LB plates and incubated at 37°C overnight. Xgal was applied in the same manner to verify the original phenotype.
Sequencing positive colonies
If the streaked colonies remained white or lighter blue than Y282D control colonies, they were transferred to 5 mL 2XYT or LB liquid media (with Amp and Cam) and grown overnight. Plasmid DNA was purified from these cells by using the Wizard Plus Miniprep DNA Purification Kit (Promega) and sent for full sequencing to either Genosys Biotechnologies, Inc. or Lone Star Labs, Inc. Although this DNA was a mixture of LacI plasmid and reporter plasmid pZCam, the latter did not interfere with sequencing because of its low copy number. After desirable mutations were identified, the LacI plasmid was separated from pZCam by using the following scheme. The plasmid mixture was incubated with the SacII restriction enzyme, which digests only the pZCam plasmid. The mixture was then transformed into DH5α cells and plated on LB/Amp agarose plates. After an overnight, 37°C incubation, new colonies were gridded onto both Amp/LB and Cam/LB agarose plates. DNA was purified from colonies that grew only on the Amp plate (generally 100% of the colonies gridded).