Evolution is defined as the result of environmental stress that selects the fittest. The still unresolved and fascinating question concerns the origin of the variants from which the fittest survive. Dogma has it that these variants arise from pre-existing ‘random’ mutations that are not directed by the different conditions of stress that select them. In other words, there would be no specific biochemical mechanisms initiated by each stressor that can increase mutation rates in those genes related to that stress. However, specific, stress-directed mutagenesis (SDM) would offer an enormous advantage for evolution and would be selected. Therefore, if SDM exists it should be present today in organisms coping with adverse conditions in their environment. Increased mutation rates would be directed to specific genes that must mutate to alleviate the stress, while avoiding random genome-wide damage. The search for directed mutations has been stymied by the problem posed by Delbrück (1946): if a mutation can only be observed under conditions that may cause it, as well as select it, how can we tell the difference? Does the stress cause the mutation before the mutant is selected? This conundrum has understandably led to confusion, controversy, and, by default, acceptance of the Dogma first articulated more than 100 years ago by Weismann (1893). Recent advances in this field reveal mechanisms by which a specific stress does in fact cause mutations in related genes, thus resolving the issue raised by Delbrück.
This MicroReview attempts to critique and relate three different areas of research concerned with mechanisms underlying adaptive mutations: (i) acquisitive evolution, in which gain-of-function mutations confer new and advantageous capabilities upon organisms; (ii) experimental evolution, observed over hundreds of generations using glucose limited/starved microbial populations, and (iii) biochemical mechanisms underlying SDM.
Investigators in this field have provided beautiful examples of evolving biosynthetic and catabolic pathways responding to environmental stress. These mutations modify existing gene-enzyme systems to initiate pathways using a new carbon source as the previous source becomes limiting to survival (for reviews see Hegeman and Rosenberg, 1970; Lin et al., 1976; Clarke, 1984; Mortlock, 1984). Early investigations in this field demonstrated that cells starved in the presence of a carbon source they cannot use produced mutants that could use it. The most common mechanism involved is gene derepression resulting in the constitutive production of a previously inducible enzyme or an enzyme with altered substrate specificity. Many examples exist; some of the earliest are the use of altrose-galactoside via β-galactosidase (Lederberg, 1951), β-glycerophosphate via alkaline phosphatase (Torriani and Rothman, 1961) and xylitol via ribitol dehydrogenase (Lerner et al., 1964). As discussed below, such derepressed genes are actively transcribed and thereby become targets for SDM. The metabolic steps required to metabolize a new related substrate are similar to those for existing pathways. Therefore, relatively minor changes to a duplicated copy of the existing gene may be required for recruitment to serve a new function (reviewed in Wright, 2000). Other frequently observed mutations in response to carbon source starvation confer increased permeability for the limiting metabolite. Examples of this kind will be discussed below in the section on experimental evolution.
The field of acquisitive evolution has been enhanced by biotechnologists, who have published many papers on the results of SDM. Genetic engineering is used to achieve strain improvement, for example by increasing the yield of a valuable product or by allowing the metabolism of a xenobiotic compound (van der Meer et al., 1992). Although genetically engineered microbial strains are less competitive in natural environments than those that originally evolved and were selected under those conditions, they have provided valuable insight into mechanisms by which new catabolic and biosynthetic pathways arise during evolution. Of particular relevance are studies that are similar to investigations by evolutionary biologists analysing mechanisms of acquisitive evolution. Single point mutations have been shown to extend the substrate range for the degradation of many organic compounds, for example, xylene monooxygenase (Abril et al., 1989) and catechol dioxygenase (Ramos et al., 1987). Gain-of-function mutations for metabolizing a new substrate typically require starvation conditions and long exposure times to that substrate as the sole carbon source (e.g. Nochur et al., 1990).
Some examples of specific SDM in starving bacteria are particularly well-documented and pertinent to this review. Kasak et al. (1997) described the accumulation of phenol-utilizing mutants (Phe+) in starving cultures of P. putida carrying a promoterless phenol degradation operon in the presence, but not the absence, of phenol (the possibility that the absence of phenol eliminates Phe+ mutants from starving populations was excluded). Sequence analysis of these mutants revealed base substitutions, deletions and insertions that in effect created a promoter. These mutations differed in type and frequency in starving as compared to growing populations. Because the frequency of Phe+ mutations found on selective plates from stationary-phase cells was higher than from exponentially growing cells, they concluded that specific factors in starving cells must facilitate mutations conferring the ability to utilize phenol. Other investigators have also observed that mutants arising from SDM in starving bacteria differ from those found during growth, implicating different molecular mechanisms (Prival and Cebula, 1992; Rosenberg et al., 1994; Foster and Trimarchi, 1995; Notley-McRobb and Ferenci, 1999a; Sung and Yasbin, 2002). Although starvation and gene derepression are probably the predominant causes of SDM in nature, others include direct effects on supercoiling by variables such as osmolarity, temperature or anaerobiosis. For example, deletion mutations of the stress-induced spvB gene in Salmonella (Massey et al., 1999) only occur at a specific concentration of NaCl that would presumably induce a particular level of supercoiling, which is known to create DNA secondary structures containing unpaired bases vulnerable to mutations (Ripley and Glickman, 1983; Wright et al., 2002; Wright et al., 2003).
Supercoiling also plays a critical role in selective gene transcription (Lefstin and Yamamoto, 1998), for example, in sensitive genes such as proU (Higgins et al., 1988) and Hg-MerR (Ansari et al., 1992).
A number of laboratories study mutations that arise over many generations in microbial populations limited or starved for a carbon source (for example, Dykhuizen and Hartl, 1983; Death et al., 1993; Notley-McRobb and Ferenci, 1999a, b; Funchain et al., 2000; Giraud et al., 2001; Notley-McRobb et al., 2003). The consequences of glucose starvation (in contrast to less ‘global’ nutrients such as phosphate or an amino acid) are extremely widespread and difficult to sort out, as all areas of metabolism are affected directly or, eventually, indirectly. A multitude of genes are derepressed and activated as a result of the stringent response (Cashel et al., 1996), for example, relA and spoT, which regulate the accumulation of ppGpp that is required for the accumulation of rpoS-encoded σs. This central regulator then initiates a cascade derepression of starvation-responsive genes including those protecting the cells from additional stressors such as heat, osmotic shock, anaerobiosis, and oxidative damage (Gentry et al., 1993; Hengge-Aronis, 1996). The cAMP-CRP complex accumulates and activates about two-thirds of the carbon starvation responsive genes. Derepressed genes include those improving outer membrane permeability and glucose transport, and those encoding enzymes using alternate carbon sources, such as lactose, maltose, galactose and ribose (Saier et al., 1996). A burst of supercoiling accompanies this widespread derepression and a 30-fold higher mutation rate is seen in severely glucose-limited cultures with maximally derepressed genes than in mildly limited cultures (Notley-McRobb and Ferenci, 1999a). As discussed below, such derepressed genes are not only activated and selected in starving cells, but also become targets for SDM that may provide beneficial mutations and increased fitness for the stressed cells.
The majority of mutations found in experimental evolution experiments do not appear to be random, but occur repeatedly in these newly derepressed genes and in replicate cultures.
The most frequently identified mutants in glucose-starved cultures show improved uptake of the limiting nutrient (Dykhuizen and Hartl, 1983). A literature search for large databases of sequenced mutations from such cultures identified the work of Dardonville and Raibaud (1990) and Notley-McRobb and Ferenci (1999a,b). The majority of mutations (17, see Table 2 in Notley-McRobb and Ferenci, 1999b) occurred in the mgl operator and showed greater rates of glucose transport than the mglD null mutations inactivating a repressor that binds to the operator. These authors believe that the mglO mutations, located within a 14 bp sequence, were non-random and somehow directed to the operator. The regulatory mutations increasing glucose transport were analysed using a new computer program, mfg, to determine if SDM were responsible for their appearance (Wright et al., 1999; 2002; 2003; 2004). This program (see below) has successfully predicted mutation frequencies in E. coli and humans, based on the extent to which a mutable base is unpaired during transcription and upon the stability of the DNA secondary structure in which it is unpaired. The eight mglO point mutations were analysed and six were found to be similar to those described in amino acid auxotrophs of E. coli (Wright et al., 2003) and in human p53 cancers (Wright et al., 2002). The mutable sites were located in small loops very near or between stems of DNA stem-loop structures (SLSs) with high stability, and their predicted mutability was high, as they were unpaired and exposed most of the time (averaging 94% of their folds) during transcription (unpublished data). Of the nine mglD-encoded null repressor mutants available for analysis, most were similar to the mglO mutable bases but were less frequently unpaired and had lower predicted mutabilities.
Investigations with experimental evolution in glucose-limited cultures are not designed to determine mutation frequencies. However, we have reported reversion rates of a derepressed lacZ auxotroph which serves as a model for the analysis of ‘forward’ mutations in derepressed genes. In this study relative reversion rates of each base in a lacZ stop codon were determined, following glucose starvation and derepression of this gene (including its defective codon). There was a good correlation between the relative number of mutations in the three bases and mutation frequencies predicted by the mfg program (Wright et al., 2003; and see below).
The most fascinating results were seen in the analysis of 14 point mutations of malT, encoding the specific MalT activator protein (Dardonville and Raibaud, 1990; Notley-McRobb and Ferenci, 1999a) which regulates at least five ‘maltose box’ promoters and LamB glycoporin expression critical to outer membrane permeability (Death et al., 1993). These mutations are observed only in response to severe glucose limitation, and result in constitutive expression of the regulon. Quite unexpectedly, they were seen to share the same characteristics found in the mutable bases of a model system of the immune response (Wright et al., 2004). Ninety per cent of these malT point mutations occurred at G:C base pairs; they were rarely unpaired and located at the ends of SLSs (unpublished data). The most striking characteristic linking these mutable bases to those in the immune system is that the rarely unpaired bases are preferentially influenced by transcription. There is an inverse correlation (P = 0.03) between the extent to which mutable malT bases are unpaired and the increase in stability of their SLSs as a function of increased levels of transcription. This correlation suggests a mechanism by which transcription could determine the order and intensity of base mutability (Wright et al., 2004). The gain-of-function malT mutations are clustered in two small regions of the gene, and they all result in the same discrete and well-defined change in MalT conformation. The altered conformation confers constitutivity on this key activator protein which binds to specific nucleotide sequences in the promoters it activates (Dardonville and Raibaud, 1990). Adaptations enhancing antibody affinity for antigens during the immune response also involve alterations in protein conformation. 
We speculate that such adaptations required the evolution of localized groups of unpaired mutable bases allowing coordinated trials with simultaneous mutations at many sites to achieve a best fit for binding to their target. In any event, it appears that mutations occurring in derepressed genes of glucose-starved cultures are not random, but share the mechanisms of SDM elicited by many kinds of stressors in the environment (see below).
Compared to wild type, the frequency of mismatch repair defective mutator strains is higher in experimental evolution experiments where strong selection is applied (Cox and Gibson, 1974; Notley-McRobb et al., 2003). Funchain et al. (2000) have monitored the function of more than 700 genes (∼15% of the genome) in repair-deficient mutants after 1000 generations in rich media and documented the presence of extensive damage not seen in wild type under the same conditions. As discussed above, the magnitude and variety of mutations in nutritionally stressed mutators are expected to be much greater than in cells grown in rich media. Distinguishing the multitude of specific, transcription-induced mutations expected in glucose-starved mutator cells from the genome-wide random mutations resulting from deficiencies in repair would be impossible. Moreover, the magnitude of specific SDM will be amplified in strains deficient in repair (see below).
Mechanisms of SDM
During cell growth, there are many DNA-destabilizing events that will increase mutation rates: recombination, transcription, error-prone DNA polymerases, proofreading errors during DNA replication, and repair-deficient mutators. In stressed cells, however, when growth is absent or minimized, replication-associated events may not be major sources of mutations compared with SDM. As mentioned earlier, the most efficient process for overcoming stress and accelerating evolution would be the initiation of specific feedback mechanisms by which each kind of stress targets hypermutation to those genes that must mutate to overcome the stress. Although there are a number of examples of specifically directed mutations caused by artificially induced transcription in growing cells (reviewed in Wright, 2000), evolution requires mutations directed by stress. To my knowledge, the first example of SDM was reported by Lipschutz et al. (1965), who examined the effect of phosphate starvation on reversion rates of an alkaline phosphatase mutant in derepressed compared with repressed genes. Reversion rates in the derepressed gene (in the absence of orthophosphate) were 15–20-fold higher than in the repressed gene indicating . . . ‘that the susceptibility of . . . bases to chemical changes increases during synthesis of mRNA . . . from an opening of the DNA helix or from an uncovering of the gene during derepression’. With the advent of gene arrays we now know that each kind of stress is limited to about 1% of the genome and specifically derepresses and activates those genes related to the stress. Examples include phosphate starvation (Antelmann et al., 2000; Ishige et al., 2003), amino acid starvation (Rudner et al., 1999; Wright et al., 1999), superoxide stress (Pomposiello et al., 2001), and osmotic stress (Wolf et al., 2003).
Reversions of amino acid auxotrophs are convenient models for analysing the specific consequences of starvation on mutations in related genes (Wright et al., 1999; 2003). For example, Fig. 1 depicts mechanisms by which gene derepression can initiate a series of events resulting in higher mutation rates in trpA of the trp operon. As the absence of tryptophan removes gene repression, transcription is activated in all the biosynthetic trp genes, including the defective trpA– mutant allele 23 (A-to-G) (Wright et al., 2003). Transcription drives supercoiling (Liu and Wang, 1987), and the DNA secondary stem-loop structures (SLSs) created by supercoiling can be quantitatively measured (Dayn et al., 1992). Stem-loop structures probably occur in the wake of the transcription bubble, as the bubble itself contains too little ssDNA to form predicted structures. Stem-loop structures are sequence-dependent; that is, they form in supercoiled DNA because of the proximity of inverted complements such as the two 5 nt segments forming the stem in Fig. 1. Templated mutations are perhaps the most convincing evidence for the location of mutable bases in SLSs (Ripley and Glickman, 1983). A recent example is the simultaneous mutation of three contiguous bases templated by three unpaired bases in the opposing strand of a SLS in the lacZ gene (Wright et al., 2003).
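The sequence dependence of SLS formation can be illustrated with a minimal Python sketch that scans one DNA strand for nearby inverted complements capable of pairing into a stem. This is an illustration only and is not part of the mfg program: the function names, the loop-length bounds and the requirement for a perfect match are assumptions made here, and only the 5 nt stem length is taken from the example in Fig. 1. No thermodynamics is considered.

```python
# Illustrative sketch only: locate inverted complements that could form the
# stem of a stem-loop structure. Names and loop bounds are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_stems(seq: str, stem_len: int = 5, min_loop: int = 3, max_loop: int = 10):
    """Return (left_start, right_start, loop_len) for every perfect stem.

    A stem is reported when the segment seq[left_start:left_start+stem_len]
    is followed, after a loop of min_loop..max_loop unpaired bases, by its
    reverse complement on the same strand. Positions are 0-based.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for i in range(max(0, n - 2 * stem_len - min_loop + 1)):
        target = reverse_complement(seq[i:i + stem_len])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop          # start of the candidate right arm
            if j + stem_len > n:
                break                        # right arm would run off the end
            if seq[j:j + stem_len] == target:
                hits.append((i, j, loop))
    return hits


if __name__ == "__main__":
    # Toy sequence (not from any gene): GCGCA ... TGCGC can pair as a 5 bp stem.
    print(find_stems("AAGCGCATTTTTGCGCAA"))
```

In this toy sequence the bases between the two arms would remain unpaired in the loop; a realistic analysis would also allow imperfect stems and would rank candidate structures by their folding free energy.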
Supercoiling and SLSs are highly localized to the vicinity of each RNA polymerase complex (Rahmouni and Wells, 1992) and the secondary structures are characterized by unpaired bases located in a loop or at the base of a stem. Increasing levels of transcription will increase supercoiling, the lengths of ssDNA, the size and stability of SLSs (Balke and Gralla, 1987; Figueroa and Bossi, 1988), and thereby increase mutation rates. Unpaired bases in these secondary structures are vulnerable to mutation because of their intrinsic thermodynamic instability (especially Gs and Cs) and their availability to nucleotide-altering enzymes (Lindahl, 1993). The incidence of mutations in such unpaired bases [including, perhaps, ‘contingency’ loci (Moxon et al., 1994)] would of course be amplified in mutator strains lacking mismatch repair (see above).
Because extensive evidence indicates that mutable bases in actively transcribed genes of stressed cells are unpaired and located in SLSs (reviewed in Wright, 2000), their mutation rates could be predicted knowing (i) how frequently a base is unpaired in the successive SLSs that contain it during transcription, and (ii) which of those SLSs is the most stable, as that is the one in which the base would be maximally exposed and most likely to mutate. A new computer algorithm named mfg (http://biology.dbs.umt.edu/wright/upload/mfg.html) was therefore developed to provide this information and to calculate a Mutability Index (MI) for each base, defined as the absolute value of [(percentage of folds in which the base is unpaired) × (highest –ΔG of all folds in which the base is unpaired)] (Wright et al., 2003). Thus, primary importance is given to the most stable SLS in which a given base is unpaired and exposed for the longest period of time; this should be the structure in which the base has the highest probability of mutation.
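The arithmetic behind the Mutability Index can be sketched in a few lines of Python. This is not the mfg program itself: the Fold record, the function name and the assumption that the folds have already been computed elsewhere (each with a folding free energy and a set of unpaired positions) are hypothetical choices made for illustration. The units of ΔG are likewise assumed; only the definition of the MI as a product of the two quantities follows the text.

```python
# Illustrative sketch of the Mutability Index (MI) arithmetic only.
# Fold and mutability_index are hypothetical names, not the mfg interface.
from dataclasses import dataclass, field


@dataclass
class Fold:
    """One predicted stem-loop structure that contains the base of interest."""
    delta_g: float                                # folding free energy; more negative = more stable
    unpaired: set = field(default_factory=set)    # positions left unpaired in this fold


def mutability_index(position: int, folds: list) -> float:
    """MI = |(% of folds in which the base is unpaired) x (highest -dG of those folds)|.

    `folds` is assumed to hold every successive fold that contains `position`.
    Returns 0.0 if there are no folds or the base is never unpaired.
    """
    if not folds:
        return 0.0
    exposed = [f for f in folds if position in f.unpaired]
    if not exposed:
        return 0.0
    percent_unpaired = 100.0 * len(exposed) / len(folds)
    highest_stability = max(-f.delta_g for f in exposed)   # most stable fold exposing the base
    return abs(percent_unpaired * highest_stability)


if __name__ == "__main__":
    # Made-up folds for illustration; these are not measured values.
    folds = [Fold(-10.0, {5}), Fold(-4.0, {5, 6}), Fold(-12.0, {6}), Fold(-2.0, set())]
    print(mutability_index(5, folds))   # unpaired in 2 of 4 folds, most stable has -dG = 10
```

Note that a very stable fold in which the base is paired does not contribute to the second factor: only folds in which the base is exposed are considered when choosing the highest –ΔG.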
The mfg program was used to predict reversion frequencies in derepressed genes of 15 auxotrophs, and the correlation was good between MIs and relative mutation frequencies determined experimentally (Wright et al., 2003). This correlation is shown for a series of trpA– mutant alleles in Fig. 2A and compared with 35 background mutable bases in the constitutively expressed control gene, lacI.
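A comparison of this kind, between predicted MIs and experimentally determined relative mutation frequencies, can be sketched as a simple correlation coefficient. The statistic actually used in the cited studies is not specified here, so the choice of Pearson's r, and the function name, are assumptions made purely for illustration.

```python
# Illustrative sketch: Pearson correlation between predicted Mutability
# Indices and observed relative mutation frequencies. The choice of
# statistic is an assumption; it is not stated in the text.
import math


def pearson_r(xs, ys) -> float:
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)
```

With one list of MIs and one list of relative reversion frequencies for the same bases, `pearson_r(mis, frequencies)` returns a value near 1 when the two rise together; a rank-based statistic could be substituted if the relationship is monotonic but not linear.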
As demonstrated by ‘The Unity in Biochemistry’ [first recognized by Kluyver and Donker (1926)], the process of evolution has bestowed a fundamental unity of biochemical behaviour on all forms of life: the same metabolites, biosynthetic and catabolic pathways, enzymes and regulatory mechanisms. Mutagenesis in microbes and humans, then, should have much in common. Moreover, higher rates of transcription are associated with hypermutation in both cancer (Ghosh and Mitchell, 1999) and the immune response (Bachl et al., 2001), both of which have been analysed with the mfg program (Wright et al., 2002; 2004). The mfg program was used to predict mutation frequencies in an unprecedented database of 14 000 hypermutable bases of the human p53 tumor suppressor gene. As seen in Fig. 2B, an excellent correlation was found for the hypermutable bases as compared to control bases located in nearby codons. Plotting the number of mutations against –ΔG alone did not give a statistically significant correlation, i.e. both variables in the above equation were necessary to predict mutation frequencies successfully (Wright et al., 2002). The p53 data are compelling in demonstrating that the assumptions underlying the mfg program are essentially correct. We conclude that mutable bases in stressed cells are in fact unpaired and located in SLSs caused by supercoiling in the wake of the transcription bubble. The mutability of such a base will be affected by: (i) the per cent of total folds in which the base is unpaired during transcription; (ii) the stability of the most stable SLS in which the base is unpaired; (iii) the intrinsic thermodynamic instability of each base; and (iv) the level of transcription.
As DNA sequence and thermodynamics determine the first three variables, the two extrinsic variables that can influence mutability are transcription and supercoiling. As discussed above, various environmental stressors can enhance mutations directly via transcription and/or supercoiling, and epigenetic factors increasing gene expression are also clearly associated with several tumour models (Ghosh and Mitchell, 1999). The data suggest that, if stressors cause direct damage to the hypermutable bases analysed in these studies, the resulting increase in mutation frequencies must depend upon, and reflect, the intrinsic mutabilities (MIs) of these bases.