Keywords:

  • polymerase fidelity;
  • microsatellite interruption;
  • POLK;
  • DINB1;
  • frameshift mutation

Abstract

  1. Top of page
  2. Abstract
  3. INTRODUCTION
  4. MATERIALS AND METHODS
  5. RESULTS
  6. DISCUSSION
  7. STATEMENT OF AUTHOR CONTRIBUTIONS
  8. Acknowledgements
  9. REFERENCES

Microsatellite tandem repeats are frequent sites of strand slippage mutagenesis in the human genome. Microsatellite mutations often occur as insertion/deletion of a repeat motif (unit-based indels), and increase in frequency with increasing repeat length after a threshold is reached. We recently demonstrated that DNA polymerase κ (Pol κ) produces fewer unit-based indel errors within dinucleotide microsatellites than does polymerase δ. Here, we examined human Pol κ's error profile within microsatellite alleles of varying sequence composition and length, using an in vitro HSV-tk gap-filling assay. We observed that Pol κ displays relatively accurate synthesis for unit-based indels, using di- and tetranucleotide repeat templates longer than the threshold length. We observed an abrupt increase in the unit-based indel frequency when the total microsatellite length exceeds 28 nucleotides, suggesting that extended Pol κ protein–DNA interactions enhance fidelity of the enzyme when synthesizing these microsatellite alleles. In contrast, Pol κ is error-prone within the HSV-tk coding sequence, producing frequent single-base errors in a manner that is highly biased with regard to sequence context. Single-nucleotide errors are also created by Pol κ within di- and tetranucleotide repeats, independently of the microsatellite allele length and at a frequency per nucleotide similar to the frequency of single-base errors within the coding sequence. These single-base errors represent the mutational signature of Pol κ, and we propose that they arise by a mechanism independent of homology-stabilized slippage. Pol κ's dual fidelity nature provides a unique research tool to explore the distinct mechanisms of slippage-mediated mutagenesis. Environ. Mol. Mutagen., 2012. © 2012 Wiley Periodicals, Inc.


INTRODUCTION

The maintenance of genome stability is a critical mission for all living cells. All three domains of life expend considerable cellular energy to ensure the faithful replication of the genome during each cell cycle [Kunkel and Bebenek, 2000]. Since 1966, when the Streisinger strand slippage model was first published [Streisinger et al., 1966], mutagenesis of repetitive sequences has been studied intensively, often using short mononucleotide repeats as mutagenic targets, and focusing on the intrinsic nature of the DNA to promote strand misalignment at repetitive sequences [Kunkel, 1990]. The most basic prediction of the model is that the longer a repetitive sequence is, the more mutable it will be, due to the increased incidence and stability of a misaligned (slipped strand) template. The increased likelihood that the misaligned template can be extended by a DNA polymerase when the misalignment is farther from the primer–template junction also has been proposed to contribute to the observed increase in frameshift mutagenesis as repeat tracts lengthen [Garcia-Diaz and Kunkel, 2006]. In addition to traditional slippage mechanisms that depend on length and homology to stabilize the misalignment, single base frameshifts can be initiated within the polymerase active site by an incoming dNTP aberrantly binding to a neighboring template base [Osheroff et al., 2000]. Finally, misincorporation of an incorrect dNTP opposite a templating base can lead to frameshifts after subsequent strand misalignment pairs the dNTP with a neighboring base to form a correct basepair for extension [Kumar et al., 2011]. In addition to insertion/deletion (indel) errors, base substitution errors also can occur by a slippage-mediated event within tandem repeats. In this case, strand slippage resolves after the last nucleotide of the tandem repeat and the 5′ nucleotide is incorporated, resulting in the formation of an incorrect basepair at the primer-terminus [Bebenek and Kunkel, 2000].

Microsatellite repeats are a significant source of insertion or deletion (indel) mutations genome-wide. As with frameshift mutagenesis, computational and biochemical studies have shown that microsatellite mutagenesis occurs in a length-dependent manner [Kelkar et al., 2010], and the rate of mutations involving the insertion/deletion of a whole repeat motif (unit-based indels) increases dramatically after a threshold length is reached [Kelkar et al., 2010]. Thus, microsatellites longer than the threshold mutate at rates that increase with the length of the allele [Ellegren, 2004].

The integrity of the human genome is maintained through the coordinated effort of 14 nuclear DNA polymerases, divided into distinct polymerase families [Sweasy et al., 2006]. These diverse functions are required for complete and accurate DNA replication, DNA repair, recombination, and specialized functions such as somatic hypermutation [Bebenek et al., 2004]. Each DNA polymerase has an inherent error rate, which can vary depending on the specific DNA substrate. DNA polymerase kappa (Pol κ) is a member of the Y family of polymerases, best known for their well-established roles in translesion synthesis [Lehmann et al., 2007; Sherrer et al., 2011]. Pol κ is error prone on undamaged, random-sequence DNA, and produces frequent single-base insertions, deletions and base substitutions [Ohashi et al., 2000]. Pol κ can readily extend from a misaligned template, increasing the likelihood of slippage-mediated errors [Washington et al., 2002]. Intriguingly, we recently demonstrated that Pol κ can synthesize tandem dinucleotide repeats with relative accuracy, producing fewer unit-based indel errors than the replicative polymerase, Pol δ [Hile et al., 2012]. To explore this dichotomy between accurate and error-prone synthesis, here we examined Pol κ's error profile during in vitro DNA synthesis of several di-, tetra- and complex microsatellite alleles. Again, we show that Pol κ averts unit-based indel errors within di- or tetranucleotide microsatellites that are much longer than the slippage threshold. We also find that slippage events resulting in single-nucleotide indels are created frequently by Pol κ, but independently of the length of the repetitive sequence. Pol κ's unique combination of a high single-nucleotide error rate, lack of a 3′ to 5′ exonucleolytic activity to correct errors, and relatively high-fidelity replication of di- and tetranucleotide tandem repeats makes it a useful tool to explore different mechanisms of slippage-mediated mutagenesis.

MATERIALS AND METHODS

In Vitro Gap-Filling HSV-tk Mutagenesis Assay

Purified full-length human Pol κ (99 kDa) was purchased from Enzymax (Lexington, KY). The construction of microsatellite-containing Herpes simplex virus type 1 thymidine kinase (HSV-tk) vectors has been previously described [Eckert et al., 2002; Hile and Eckert, 2008]. The four-subunit recombinant human Pol δ4 was a generous gift of Marietta Lee (NY Medical College). The MluI (position 83) to StuI (position 180) gapped DNA duplex substrates were created and purified as previously described [Hile et al., 2012]. Microsatellites were inserted in frame and sequences are provided in Table I. The single-stranded mutational target within the gapped molecule contains 89–91 bp of HSV-tk gene coding region sequence, in addition to the inserted microsatellites. In vitro gap-filling reactions contained ∼0.075 pmol of gapped DNA substrate and 250 μM deoxynucleotide triphosphates (dNTPs) in 100 μl total volume. Pol κ reactions contained 25 mM potassium phosphate buffer pH 7.2, 5 mM MgCl2, 5 mM DTT, 100 μg ml−1 nonacetylated BSA, and 10% glycerol. Pol κ is highly active and more processive in phosphate buffer than in Tris buffer [Hile et al., 2012], and more active in the presence of nonacetylated than acetylated BSA (data not shown). Thus, Pol κ was added to the gap-filling reactions at an enzyme: DNA ratio of 1:4 to 1:1 (0.01875–0.075 pmol Pol κ). Reactions were incubated at 37°C for 2 hr. Pol δ4 reactions contained 40 mM Tris pH 7.8, 10 mM MgCl2, 1 mM DTT, 5 mM NaCl, 200 μg ml−1 BSA and 1.2–4 pmol pol δ4. Reactions were incubated at 37°C for 1 hr. All reactions were terminated with 15 mM ethylenediaminetetraacetic acid (EDTA) and processed as described [Hile et al., 2012]. Complete gap filling was verified by agarose gel electrophoresis as described [Abdulovic et al., 2011]. An aliquot of DNA from complete gap-filling reactions was used to transform E. coli strain FT334. 
Subsequent selective plating on VBA + Chloramphenicol and VBA + Chloramphenicol + 5-fluorodeoxyuridine was performed for mutant frequency determination [Eckert et al., 1997]. To control for pre-existing mutations, the HSV-tk mutant frequency of each single-stranded DNA used to make the gapped substrate was determined. The DNA sequences (within the target region) of independent mutants isolated from two polymerase reactions per template were determined as described [Eckert et al., 2002].

Table I. Microsatellite Sequences Examined in This Study

Allele     Sequence (microsatellite allele in bold in the original)         Length (nt)
Wild type  GCG CGT TCT CGA
Complex    GCG CGT TTC TTT CTT TCT TCC TTC CTT CCT CTC TCT CGA              32
[TTCC]6    GCG CGT TTC CTT CCT TCC TTC CTT CCT TCC TCT CGA                  24
[TTCC]9    GCG CGT TTC CTT CCT TCC TTC CTT CCT TCC TTC CTT CCT TCC TCT CGA  36
[GT]10     GCG CGT GTG TGT GTG TGT GTG TGT TCT CGA                          20
[GT]13     GCG CGT GTG TGT GTG TGT GTG TGT GTG TGT TCT CGA                  26
[GT]19     GCG CGT GTG TGT GTG TGT GTG TGT GTG TGT GTG TGT GTG TGT TCT CGA  38
[TC]11     GCG CGT TCT CTC TCT CTC TCT CTC TCT CGA                          22
[TC]14     GCG CGT TCT CTC TCT CTC TCT CTC TCT CTC TCT CGA                  28
[TC]17     GCG CGT TCT CTC TCT CTC TCT CTC TCT CTC TCT CTC TCT CGA          34

Determination of Polymerase Error Frequencies

Overall estimated polymerase error frequencies (Pol EFest) for each template, which include errors created in either the microsatellite or HSV-tk coding sequences, were calculated by the following equation: Pol EFest = (Observed MF) × (# sequenced mutants with an in-target mutation/total number mutants sequenced).
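
The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name and the example numbers are assumptions chosen for clarity, not values from this study, and the sketch omits the later corrections for detectability and multiple nontandem errors.

```python
def pol_ef_est(observed_mf, n_in_target, n_sequenced):
    """Estimated polymerase error frequency (Pol EFest).

    observed_mf -- observed HSV-tk mutant frequency (e.g., 100e-4)
    n_in_target -- sequenced mutants with a mutation inside the target
    n_sequenced -- total number of mutants sequenced
    """
    if n_sequenced <= 0:
        raise ValueError("n_sequenced must be positive")
    if not 0 <= n_in_target <= n_sequenced:
        raise ValueError("n_in_target must lie between 0 and n_sequenced")
    # Scale the observed frequency by the in-target fraction of mutants.
    return observed_mf * n_in_target / n_sequenced


# Hypothetical example: MF of 100 x 10^-4 with 45 of 50 mutants in target.
example = pol_ef_est(100e-4, 45, 50)
print(f"Pol EFest = {example * 1e4:.0f} x 10^-4")
```

Mutants outside the MluI-StuI target lower the in-target fraction, so they reduce Pol EFest relative to the observed mutant frequency.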

Mutants that do not contain a mutation within the MluI-StuI target arise either from errors produced during the creation and purification of the gapped duplex [Eckert et al., 2002], or from polymerase strand displacement synthesis beyond the MluI site. The proportion of outside-target mutants varied among templates, ranging from none to 10.7% (average 6.5%). Because Pol κ can make multiple errors per target sequence, error frequencies were also adjusted to reflect the number of mutational events that were detectable as single mutational events. All frameshift errors and base substitutions that caused an amino acid change or a stop codon were considered detectable. Only detectable events were used to calculate the Pol EFest. Each mutational event was also scored as tandem or nontandem. Tandem events were those adjacent to one another, and nontandem events were those more than one nucleotide apart. Pol EFs were corrected for the existence of multiple nontandem errors as described [Hile et al., 2012].

Mutational events were analyzed based on the location of the error: within the microsatellite or within the HSV-tk coding sequence. The microsatellite mutation frequency was further subdivided, based on whether the mutational event resulted in a unit-based indel or an interruption. A unit-based indel is defined as an error that results in the insertion or deletion of bases that are a multiple of the tandem repeat motif. An interruption is defined as other microsatellite mutations, including detectable base substitutions and indels of nucleotides that are not a multiple of the repeat motif. For the [TC]14 and [TC]17 alleles, the mutation frequency of the single-stranded DNA used to make the gapped DNA duplex was within an order of magnitude of the Pol EF. Therefore, we determined the spectrum of pre-existing mutations by DNA sequence analysis of mutants isolated after electroporation of either ssDNA or unfilled gapped duplex DNA [Abdulovic et al., 2011]. For both alleles, sequencing revealed only unit-based microsatellite indel mutations. Therefore, the mutation frequency of the single-stranded DNA was subtracted from the polymerase unit-based indel frequency, to estimate the true Pol κ error frequency for these two alleles. To compare the interruption error frequency to the coding region mutation frequency, we normalized the Pol EF to the corresponding lengths of each target sequence (microsatellite lengths are given in Table I; the coding region length is 90 bases). The average coding region mutation frequency was determined for all sequenced polymerase reactions and templates.
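
The two microsatellite error classes defined above can be expressed as a simple rule. The sketch below is illustrative only: the function name and the signed length-change convention are assumptions, and it presumes the mutation has already been judged detectable and located within the microsatellite.

```python
def classify_microsatellite_mutation(length_change, motif_length):
    """Classify a detectable mutation within a microsatellite.

    length_change -- net bases inserted (+) or deleted (-); 0 for a
                     base substitution
    motif_length  -- size of the repeat unit (2 for [GT]n, 4 for [TTCC]n)
    """
    if motif_length <= 0:
        raise ValueError("motif_length must be positive")
    # A unit-based indel gains or loses a whole number of repeat units.
    if length_change != 0 and length_change % motif_length == 0:
        return "unit-based indel"
    # Everything else: base substitutions and indels that are not a
    # multiple of the repeat motif.
    return "interruption"


print(classify_microsatellite_mutation(-2, 2))  # loss of one GT unit
print(classify_microsatellite_mutation(-1, 4))  # single-base deletion in [TTCC]n
```

For the complex allele, which contains several motifs of different sizes, the motif length would have to be chosen according to the repeat in which the event occurred.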

Comparisons between Pol δ and Pol κ mutational events were analyzed for statistical significance using Fisher's exact test (two-sided). Observed versus expected values for Pol κ single-base deletions within two-base repeats or noniterated sequence were analyzed using chi-squared tests with three degrees of freedom (dof).

RESULTS

To assess Pol κ errors during DNA synthesis of repetitive sequences, gapped duplex DNA substrates were created that contain various microsatellite sequences inserted in-frame within the HSV-tk coding sequence (Table I). We previously showed that [GT]10 and [TC]11 repeats are synthesized by Pol κ with fewer unit-based indel errors than the replicative polymerase, Pol δ [Hile et al., 2012]. Here, we tested longer [GT] and [TC] alleles to determine the extent of Pol κ's fidelity during microsatellite synthesis. In our previous report, we hypothesized that Pol κ may be more accurate when replicating dinucleotide repeats than mononucleotide repeats because the slippage event required for a dinucleotide indel results in a larger DNA bulge (two nucleotides) than for a mononucleotide indel event (one nucleotide). Because tetranucleotide indel events require an even larger bulge (four nucleotides), we also tested a series of [TTCC]n alleles. Finally, we tested a complex microsatellite allele, for two reasons. First, it contains a [TTCC]3 motif, which is below the previously reported four-unit slippage threshold for tetranucleotide mutagenesis [Kelkar et al., 2008]. This allows us to make comparisons to the longer tetranucleotide alleles that are above the slippage threshold. Second, the complex microsatellite contains two other tandem repeat motifs both below the slippage threshold, [TTTC]3 and [TC]4, that increase the total length of the polypyrimidine tract to 32 nucleotides, which is similar in total length to the [TTCC]9 allele. The nine mutational targets also provided us with robust data characterizing Pol κ strand slippage errors within the HSV-tk coding sequence, which is the same sequence in all templates.

Tetranucleotide Microsatellite Synthesis by Pol κ

To test the hypothesis that slipped intermediates involving longer repeat units are not efficient substrates for Pol κ, we examined the unit-based indel error rates of three tetranucleotide [TTCC]n sequences. The [TTCC]3 sequence was examined within the context of the complex allele, while the [TTCC]6 and [TTCC]9 alleles are pure, simple repeats. Because [TTCC]3 is below the four-unit microsatellite threshold and [TTCC]6 is above, we expected to observe a significant increase in unit-based indel errors comparing the [TTCC]6 allele to the complex allele. However, we found that the overall complex and [TTCC]6 microsatellite mutation frequency differed by less than twofold (Table II). DNA sequence analyses revealed that Pol κ produced no unit-based indels within the [TTCC]3 motif of the complex microsatellite, corresponding to a Pol EF of <4.5 × 10−4. The Pol EF for unit-based indel errors within the [TTCC]6 allele was also very low, ∼7.0 × 10−4, and similar to the frequency of unit-based indel errors at other tandem repeats within the complex allele (Fig. 1). Increasing the [TTCC] tandem repeat to nine units resulted in an increased Pol EF for unit-based indel errors, to 74 × 10−4 (Fig. 1), a dramatic ∼10-fold increase relative to six repeat units. Therefore, at some point between 24 nucleotides ([TTCC]6) and 36 nucleotides ([TTCC]9), the ability of Pol κ to accurately synthesize the microsatellite and avoid unit-based indel errors is lost.

Figure 1. Pol κ abruptly alters its unit-based error frequency (black) when synthesizing long tetranucleotide alleles, but its rate of interruption errors (white) is similar among all alleles. Gap-filling reactions were performed with the indicated microsatellite-containing templates as described in Materials and Methods. DNA sequencing was performed after isolation of independent mutants. Mutants in the microsatellite region of the target were classified as either unit-based indels (black bars) or interruptions (white bars).

Table II. Pol κ Error Rates Within Microsatellite and HSV-tk Coding DNA Sequences

Microsatellite  Microsatellite  Observed HSV-tk              Pol EFest × 10−4 (no. of observations) (c)
allele          length (nt)     frequency (a) × 10−4  N (b)  Coding       Microsatellite
Complex         32              340 ± 130             77     280 (61)     72 (16)
[TTCC]6         24              190 ± 17              100    91 (61)      58 (39)
[TTCC]9         36              500 ± 40              104    320 (68)     150 (36)
[GT]10 (d)      20              140 ± 67              160    140 (144)    16 (16)
[GT]10          20              150 (e)                      n.d.         n.d.
[GT]13          26              100 ± 40              51     75 (42)      16 (9)
[GT]19          38              310 ± 46              54     170 (33)     110 (21)
[TC]11 (d)      22              120 ± 39              75     80 (60)      20 (15)
[TC]11          22              47 (e)                10     38 (8)       9 (2)
[TC]14          28              460 ± 11              46     290 (29)     130 (17)
[TC]17          34              610 ± 120             61     250 (26)     230 (35)

(a) Mutant frequencies represent the mean ± standard deviation of three to six independent reactions.
(b) Number of mutants sequenced with mutations in the target region.
(c) Polymerase error frequency (Pol EFest) was calculated by the following formula: Pol EFest = (Observed MF) × (# sequenced mutants with an in-target mutation/total number mutants sequenced).
(d) Data from published work (Hile et al., 2012).
(e) Single reaction.

Dinucleotide Microsatellite Synthesis by Pol κ

To further determine the relationship between Pol κ unit-based indel fidelity and repeat length, we tested two series of dinucleotide repeats, [TC]n and [GT]n. The shortest repeat lengths in each series (10–11 units) are well above the five-unit threshold length for dinucleotide microsatellite mutagenesis [Kelkar et al., 2010]. Therefore, we expected that the unit-based indel frequency would be progressively higher as the tandem repeat lengthened. Instead, we measured a low frequency of unit-based indel errors within the [GT]13 allele (7 × 10−4) (Fig. 2A), similar to previous measurements for the [GT]10 allele (3 × 10−4) [Hile et al., 2012]. These results differ from those we previously published for Pol β, in which we observed the expected fivefold increase in Pol EF when the [GT] allele lengthened from 10 to 13 units [Kelkar et al., 2010]. When the [GT] allele was further lengthened to 19 units, we observed an abrupt increase of Pol κ unit-based indel errors to 52 × 10−4 (Fig. 2A). The total length for the [GT]19 allele corresponds to 38 nucleotides. Therefore, at some point between 26 nucleotides and 38 nucleotides, Pol κ shifts from relatively accurate to more error-prone microsatellite synthesis. To benchmark Pol κ's fidelity for the [GT]19 template, we performed gap-filling reactions using recombinant, four-subunit human Pol δ. The unit-based indel frequency for Pol δ synthesis of the [GT]19 microsatellite is 100 × 10−4, indicating that Pol κ continues to be more accurate when synthesizing this long microsatellite allele than a replicative polymerase. Another distinction between Pol κ and Pol δ when synthesizing the [GT]19 allele is the relative proportion of errors produced within the microsatellite versus the coding region. The [GT]19 tandem repeat is a hotspot for Pol δ errors, as 94% of the mutants sequenced (45/48) were within the microsatellite and the majority of these (96%) were unit-based indels (Fig. 3). In contrast, only 42% of Pol κ mutants were within the [GT]19 microsatellite, and of these, fewer than half (47%) were unit-based indels (Fig. 3). The differential distribution of Pol κ and Pol δ errors within the [GT]19 microsatellite versus the coding region is highly statistically significant (P = 2.9 × 10−12, Fisher's exact test).
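
For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2 × 2 table (polymerase by error location) can be sketched in pure Python as follows. The function name is an assumption, and the example counts are approximations: the Pol δ counts come from the 45/48 figure in the text and the Pol κ counts from the [GT]19 row of Table II. The exact contingency table used by the authors is not given, so the result of this sketch need not match the reported P value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Approximate counts (assumptions, see lead-in):
# rows = Pol delta, Pol kappa; columns = microsatellite, coding.
p_value = fisher_exact_two_sided(45, 3, 21, 33)
print(f"P = {p_value:.2e}")
```

With counts this lopsided the P value is driven almost entirely by the single most extreme table, which is why small changes in the assumed counts shift it by orders of magnitude.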

Figure 2. Pol κ increases the frequency of unit-based indels only when synthesizing longer dinucleotide microsatellite alleles. Gap-filling reactions were performed with the indicated microsatellite-containing templates as described in Materials and Methods. DNA sequencing was performed after isolation of independent mutants. Mutants in the microsatellite region of the target were classified as either unit-based indels or interruptions. The frequency of unit-based indels for the indicated microsatellite templates is shown for the [GT] series (A), and the [TC] series (B).

Figure 3. Proportion of mutants synthesized by Pol κ and Pol δ that occur within the coding sequence versus the microsatellite. Gap-filling reactions were performed with the noted microsatellite-containing templates as described in Materials and Methods. After DNA sequencing of independent mutants, errors were classified as coding (white bars) or microsatellite (black and gray bars combined). Microsatellite mutations were further classified as unit-based indels (black bars) or interruptions (gray bars). ***P = 2.9 × 10−4, Fisher's exact test.

The unusual absence of an increase in Pol κ unit-based indel errors with repeat length was also observed for the [TC]n microsatellite series. The frequencies of unit-based indels for [TC]11 and [TC]14 were similar (7–9 × 10−4) (Fig. 2B). However, when the allele length increased by only three units to [TC]17, Pol κ's unit-based indel frequency increased dramatically, by >15-fold, to 150 × 10−4 (Fig. 2B). Thus, between a total allele length of 28 nucleotides ([TC]14) and 34 nucleotides ([TC]17), Pol κ again shifted to more error-prone synthesis. The unit-based indel frequency for the [TC]17 allele is approximately threefold higher than for the [GT]19 allele, which may indicate that the bulged bases required for a unit-based indel of a pure pyrimidine tract are a better substrate for Pol κ extension than a bulge containing both a pyrimidine and a purine base.

Microsatellite Interruption Errors

The largest proportion of Pol κ errors within the microsatellite alleles examined are interruptions, defined as indel mutations that are not a multiple of the microsatellite motif, or detectable base substitutions within the microsatellite (e.g., see Fig. 4). For the tetranucleotide series, the Pol EF for microsatellite interruptions was not affected by repeat number or length, as the frequency was similar for the complex, [TTCC]6 and [TTCC]9 alleles (Fig. 1). This contrasts with the frequency of unit-based indel errors within the same series (Fig. 1). Thus, interruptions likely represent a mechanism distinct from unit-based indels. In addition, interruption errors do not strictly depend on homology stabilization of the slippage event by other tandem repeats, as the complex allele has a frequency of interruption mutations equal to the [TTCC]6 and [TTCC]9 alleles (Fig. 1).

Figure 4. Mutational spectra of Pol κ synthesis across complex (A), [TTCC]6 (B), and [TTCC]9 (C) microsatellite alleles. Gap-filling reactions were performed with the indicated microsatellite-containing templates as described in Materials and Methods. DNA sequencing was performed after isolation of independent mutants. For each microsatellite allele, the mutational spectrum from two independent reactions is represented. Lines above the sequence indicate the bases deleted or inserted as unit-based indels. The interruption mutants are annotated below the sequence. Closed triangles indicate insertions, with a following letter showing which base was inserted; open triangles indicate deletions; and letters without symbols specify base substitutions.

We compared the frequency of Pol κ microsatellite interruption errors to the frequency of single-base errors within the coding region of the gapped duplex DNA substrates. The Pol EF per nucleotide within the HSV-tk target sequence (2.0 × 10−4) is the same as that for the tetranucleotide and complex allele interruption frequency per nucleotide (∼2.0 × 10−4) (Table III). This result further supports the notion that microsatellite interruptions are distinct in mechanism from the unit-based indels created by Pol κ within long tetranucleotide alleles, and may represent a mutational signature of Pol κ. The per-nucleotide Pol EF for interruption errors within long dinucleotide alleles ([GT]19 and [TC]17) was ∼2 × 10−4, again similar to the coding sequence mutation frequency per nucleotide (Table III). However, the per-nucleotide Pol EF for interruption errors within the two shorter [GT] alleles and the [TC]11 allele (20–26 nucleotides total length) was lower, 0.3–0.7 × 10−4. Unlike the mid-length [GT]13 allele, [TC]14 shows a frequency of interruptions per nucleotide at least as high as that of the longer dinucleotide alleles. We considered the possibility that a shorter total microsatellite length contributed to the lower frequencies; however, the [GT]13 allele is 26 nucleotides, slightly longer than the [TTCC]6 allele (24 nucleotides).

Table III. Frequency of Single-Nucleotide Errors Produced by Pol κ

Sequence     Single-nucleotide (a)  Mutants observed  Nucleotides  Single-nucleotide
context      Pol EF × 10−4          (no.)             per target   Pol EF/nt × 10−4
Coding (b)   180                    520               90           2.0
Complex      63                     14                32           2.0
[TTCC]6      51                     37                24           2.1
[TTCC]9      76                     16                36           2.1
[GT]10 (c)   13                     13                20           0.7
[GT]13       9                      5                 26           0.3
[GT]19       57                     11                38           1.5
[TC]11 (a)   8                      6                 22           0.4
[TC]14       120                    12                28           4.3
[TC]17       77                     8                 34           2.3

(a) Includes single-base deletions, insertions and detectable base substitutions.
(b) Combined for 19 reactions.
(c) Data from published work (Hile et al., 2012).
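
The per-nucleotide values in Table III follow from dividing each single-nucleotide Pol EF by the length of its target, as described in Materials and Methods. The short sketch below repeats that arithmetic with the tabulated values; the variable and function names are illustrative assumptions.

```python
# (sequence context, single-nucleotide Pol EF x 10^-4,
#  nucleotides per target, reported Pol EF/nt x 10^-4), from Table III
TABLE_III = [
    ("Coding", 180, 90, 2.0),
    ("Complex", 63, 32, 2.0),
    ("[TTCC]6", 51, 24, 2.1),
    ("[TTCC]9", 76, 36, 2.1),
    ("[GT]10", 13, 20, 0.7),
    ("[GT]13", 9, 26, 0.3),
    ("[GT]19", 57, 38, 1.5),
    ("[TC]11", 8, 22, 0.4),
    ("[TC]14", 120, 28, 4.3),
    ("[TC]17", 77, 34, 2.3),
]


def per_nucleotide(pol_ef, target_length):
    """Normalize an error frequency to the length of its target."""
    return pol_ef / target_length


for context, pol_ef, length, reported in TABLE_III:
    value = per_nucleotide(pol_ef, length)
    print(f"{context:8s} {value:.2f} x 10^-4 per nt (reported {reported})")
```

Normalizing in this way is what allows a 90-base coding target to be compared directly with microsatellites of 20 to 38 nucleotides.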

Strand Slippage Errors Within Random Sequence

Pol κ is error prone on undamaged, random-sequence DNA, and produces frequent single-base insertions, deletions and base substitutions [Ohashi et al., 2000]. Our HSV-tk in vitro assay was specifically designed to be biased towards detecting polymerase strand misalignment errors, as the mutational target contains very few detectable base substitution sites [Eckert et al., 2002; Hile and Eckert, 2008]. Thus, our composite coding region spectrum (combined errors from all templates and reactions), which contains 520 mutants, provides a robust dataset for analysis of Pol κ indel mechanisms in nonrepetitive or short repeated sequences. The composite coding spectrum is shown in Figure 5. Visual inspection of the spectrum demonstrates the error-prone nature of Pol κ for single-base indel errors, particularly at nonrepeated and two-nucleotide repeat sequences. Strikingly, the iterated CCC sequence at positions 147–149 is a relatively minor site of frameshift mutations by Pol κ. In contrast, this is a major hotspot for Pol β errors in this assay [Kelkar et al., 2010]. Analysis of the sequence contexts for single-base deletions in nonrepetitive sequences revealed a considerable bias in the identity of both the deleted base and the 5′ template base (Table IV). For this class of error, template G is deleted in 61% of the observed mutants, far more often than expected in the absence of mutational bias (32% expected, based on the number of template G residues in the coding sequence template); this difference is highly statistically significant (P < 0.0001, 3 dof, χ2 test). Also, 58% of this type of error occurs within the sequence context 5′-CX-3′, where X is the deleted base (Table IV). The observed distribution of 5′ neighboring bases is significantly different from that expected for no mutational bias (32% expected, based on the number of C residues; P < 0.0001). We also observed a high frequency of Pol κ single-base deletion errors (84 × 10−4) at two-base repeated sequences (AA, TT, GG, CC) (Table V). Again, we find a sequence context bias, in that template Gs and Cs are far more likely to be deleted at two-base repeats than are As and Ts, a distribution that is significantly different from that expected for no bias (P = 0.004, 3 dof, χ2 test).

Figure 5. Mutational spectra for Pol κ synthesis within the HSV-tk coding sequence. Gap-filling reactions were performed with the indicated microsatellite-containing templates as described in Materials and Methods. DNA sequencing was performed after isolation of independent mutants. This spectrum represents the combined coding sequence mutants of 19 independent reactions. Base substitutions are noted above the sequence. Below the sequence, closed triangles indicate insertions, with a letter following to depict which base was inserted, and open triangles indicate deletions. Two-base repeats are underlined. The stop sign indicates the end of the gapped DNA duplex target. Not included on the spectrum are two tandem mutants and four large deletions.

Table IV. Sequence Context and Identity of Pol κ Single-Base Deletions in Noniterated Sequences

Base      Number of events as  Number of events as
identity  5′ template base     deleted base
G         38                   88
A         10                   8
T         12                   35
C         84                   13
Total     144                  144

Table V. Distribution of Pol κ Single-Base Deletion Errors Within Two-Base Iterated Sequences

Sequence context  Deletions  Number of sites  Deletions/site
AA                30         4                7
TT                8          2                4
GG                116        6                19
CC                96         6                16
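
The percentages quoted in the text for noniterated sequences (61% of deleted bases are G; 58% of 5′ template bases are C) can be recovered directly from the Table IV counts. The sketch below only repeats that arithmetic; the names are illustrative assumptions, and it does not reproduce the chi-squared tests, which also require the base composition of the template.

```python
# Table IV counts: base -> (events as 5' template base,
#                           events as deleted base)
TABLE_IV = {
    "G": (38, 88),
    "A": (10, 8),
    "T": (12, 35),
    "C": (84, 13),
}


def percentages(column):
    """Percentage of events per base for one column of Table IV."""
    total = sum(counts[column] for counts in TABLE_IV.values())
    return {base: 100 * counts[column] / total
            for base, counts in TABLE_IV.items()}


five_prime = percentages(0)  # identity of the 5' template base
deleted = percentages(1)     # identity of the deleted base

for base in TABLE_IV:
    print(f"{base}: 5' base {five_prime[base]:.0f}%, "
          f"deleted base {deleted[base]:.0f}%")
```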

DISCUSSION

The Streisinger strand slippage model [Streisinger et al., 1966] has been the dogma of mutagenesis at repetitive sequences for over 40 years. The prediction that the mutability of a repetitive sequence is length dependent has been shown in biochemical fidelity assays [Kunkel, 1985a; Eckert and Hile, 2009], computational models [Kelkar et al., 2010], genetic assays [Henderson and Petes, 1992], and enzyme crystal structures [Garcia-Diaz et al., 2006]. The mutational signature of human Pol κ that we report here is novel: although it is error prone when synthesizing nonrepeated sequences or short mononucleotide repeats, Pol κ is more accurate than replicative Pol δ within microsatellite repeat motifs larger than mononucleotides. In addition, Pol κ violates the rule of increasing unit-based indel error rates with increasing repeat number, and produces few of these errors until the total length of the microsatellite exceeds 28 nucleotides, much longer than the microsatellite mutagenesis slippage threshold [Kelkar et al., 2010].

Pol κ is surprisingly accurate, preventing the incorporation of unit-based indel errors requiring bulged intermediates of two or more bases. Tandem repeats longer than the threshold are thought to be hotspots for unit-based indel mutations through traditional Streisinger strand slippage [Ellegren, 2004; Kelkar et al., 2008, 2010]. A longer template with more repeat units provides greater opportunity for misalignment, increases the chances for a more stable misaligned substrate, and enhances the probability that the slipped intermediate will be located farther from the primer–template junction, making a better substrate for polymerase extension [Garcia-Diaz and Kunkel, 2006]. Our data demonstrate that Pol κ stabilizes dinucleotide repeats well beyond the threshold length of five units; in fact, microsatellite repeats are stabilized by Pol κ until the total microsatellite length exceeds 28 nucleotides. This behavior contrasts with most previously published studies of DNA polymerases, which show a strict relationship between polymerase indel error rates and repeat number [Garcia-Diaz and Kunkel, 2006]. This unexpected fidelity of Pol κ could be mediated by multiple factors. First, biochemical studies using iterated templates showed that mutability at repeats is inversely related to the processivity of the enzyme [Kunkel, 1985b; Eckert and Kunkel, 1993; Bebenek et al., 1995; Hile and Eckert, 2004]. That is, the more likely the enzyme is to dissociate from the DNA template, the more likely strand slippage mutations will occur. Possibly, dissociation of the polymerase from the DNA allows the primer–template junction to separate and realign improperly. Processivity may be a factor in the fidelity we observe for Pol κ when replicating di- and tetranucleotide repeats. The N-terminal extension (N-clasp) of human Pol κ has been shown to lock the polymerase around the DNA and increase the processivity of the enzyme [Lone et al., 2007]. 
In our studies, we use full-length Pol κ enzyme and buffer conditions that enhance both the activity and the processivity of the polymerase. We showed that, under these reaction conditions, Pol κ displays less template dissociation during synthesis of dinucleotide repeats than does Pol δ [Hile et al., 2012]. Thus, Pol κ's unit-based indel error rate may be minimized because slipped intermediates are less likely to form when the polymerase remains associated with the DNA substrate. Another, not mutually exclusive, mechanism to account for Pol κ's unusual accuracy at microsatellites is that intimate protein-DNA interactions prevent larger bulges (≥ two nucleotides) from acting as substrates for polymerase extension. In this case, the unit-based indel rate is minimized not because slipped intermediates do not occur, but because Pol κ does not efficiently extend from them unless they are far enough from the active site of the enzyme that they do not interfere with the interaction of Pol κ at the primer-template duplex. Polymerases with large, open active sites, such as the archaeal Y family DNA polymerase Dpo4 [Ling et al., 2001], can tolerate misalignments closer to the active site. Polymerases with constricted active sites, such as Pol κ, would be predicted to require more sequence between the misaligned bulge and the active site. Structurally, Pol κ is unique among Y family polymerases in possessing the N-clasp, which creates a steric hindrance and blocks Pol κ extension of some DNA lesion-containing substrates [Jia et al., 2008]. Unfortunately, the currently available Pol κ structures are restricted to the catalytic N-terminal half of the polymerase and short (∼18 nucleotide) DNA templates [Lone et al., 2007]. We observed an abrupt increase in Pol κ-mediated unit-based indel errors when the microsatellite sequence exceeded 28 nucleotides in length.
Perhaps this reflects an extended footprint of Pol κ and loss of long-range interactions between the N- and C-terminal domains of the polymerase that affect proper formation of the active site for catalysis.
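To relate the 28-nucleotide boundary to repeat number, the arithmetic can be sketched as follows. This is only an illustration, assuming a perfect, uninterrupted repeat tract whose total length is the motif size times the number of units; the function names are hypothetical and not part of the study.

```python
# Illustrative arithmetic only; names are hypothetical, not from the study.
POL_KAPPA_LENGTH_LIMIT_NT = 28  # from the text: unit-based indels rise above 28 nt


def tract_length_nt(motif_size: int, units: int) -> int:
    """Total length of a perfect repeat tract, in nucleotides."""
    return motif_size * units


def first_units_exceeding(limit_nt: int, motif_size: int) -> int:
    """Smallest repeat number whose total tract length is strictly above limit_nt."""
    return limit_nt // motif_size + 1


# Dinucleotide: 15 units (30 nt) is the first allele longer than 28 nt.
di_units = first_units_exceeding(POL_KAPPA_LENGTH_LIMIT_NT, 2)
# Tetranucleotide: 8 units (32 nt) is the first allele longer than 28 nt.
tetra_units = first_units_exceeding(POL_KAPPA_LENGTH_LIMIT_NT, 4)
# The five-unit dinucleotide threshold mentioned in the text is only 10 nt.
di_threshold_nt = tract_length_nt(2, 5)
```

Under these assumptions, a dinucleotide allele can carry roughly three times the five-unit threshold before its total length passes 28 nucleotides.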

In contrast, Pol κ is consistently error-prone for single-base mutations, both in noniterated and short iterated sequences. Comparisons of microsatellite interruptions within the tetranucleotide alleles with mutations that occur within the HSV-tk coding sequence demonstrate that these errors are not dependent on the presence of highly repetitive sequence. The interruption frequencies for the dinucleotide alleles of different lengths are not the same, possibly because these alleles are composed of alternating G/T or T/C residues instead of identical bases. The coding region of HSV-tk is rich in two-base repeated sequences, a mutational hotspot for Pol κ (Fig. 5). The sequence composition of the tetranucleotide allele we examined ([TTCC]) may have contributed to the similar mutational signature we observed between the tetranucleotide interruptions and coding region single-nucleotide errors.

We observed a significant sequence context bias for single-base deletions produced by Pol κ within the coding region, for both the identity of the deleted base and the 5′ templating base. The observed bias cannot be explained by published reports of differences in the kinetics of either correct base insertion or base-pair extension [Johnson et al., 2000; Washington et al., 2002]. Alternatively, the biases may reflect the unique structural interactions of Pol κ with the template DNA at or around the active site of the enzyme. In the current Pol κ ternary complex structure [Lone et al., 2007], only the templating base is within the active site. The unpaired template strand is outside of the active site cleft, and the immediate 5′ template base is stabilized by intimate interactions with amino acid residues of the N-clasp and fingers domains. Because single-nucleotide frameshifts and base substitutions only require a single-nucleotide bulge intermediate, perhaps Pol κ's active site can accommodate this bulged substrate, consistent with the finding that Pol κ is a “promiscuous extender” [Washington et al., 2002].

On the basis of the data presented in this study, we suggest that two distinct mechanisms explain Pol κ's enigmatic role in mutagenesis. One mechanism is due to Pol κ's inherent error-prone nature on nonrepetitive sequences and short repeats of as few as two identical nucleotides, and we propose that it occurs at the active site in a dNTP-mediated manner. This mutagenesis is observed within microsatellites (interruptions) but is independent of the microsatellite allele length. The second mechanism, unit-based indel mutagenesis, is mediated by traditional slippage mechanisms involving misalignments of the primer and template strands and requiring two or more bases bulged from the duplex. This distinction in Pol κ fidelity is most evident when comparing the tetranucleotide alleles, because our chosen tetranucleotide unit is composed of two identical bases followed by two different identical bases, ideal substrates for Pol κ-mediated interruption mutagenesis. Because of these dual biochemical properties, Pol κ provides a useful tool in understanding slippage-mediated mutagenesis, both dNTP-mediated and homology stabilized. The unexpected accuracy of Pol κ synthesis using di- and tetranucleotide templates gives insight into the mechanisms of homology-stabilized strand slippage and how that can be tempered by the interactions of individual polymerases with their substrates. Future Pol κ studies may shed light on how protein interactions influence slipped strand mispairings to modulate frameshift mutagenesis.
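The distinction between the two error classes can be made concrete with a small classifier. This is a minimal sketch, assuming the reference allele is a perfect repeat and that a mutant sequence has already been isolated; the function name and the nine-unit example allele are hypothetical and are not the scoring procedure used in the study.

```python
def classify_microsatellite_mutation(motif: str, units: int, mutant: str) -> str:
    """Classify a mutant allele relative to a perfect repeat of motif x units.

    Hypothetical helper for illustration; not the study's scoring method.
    """
    reference = motif * units
    if mutant == reference:
        return "no change"
    delta = len(mutant) - len(reference)
    # Unit-based indel: whole motifs gained or lost, and the tract stays perfect.
    if delta != 0 and delta % len(motif) == 0:
        new_units = len(mutant) // len(motif)
        if mutant == motif * new_units:
            return "unit-based indel"
    # Anything else disrupts the perfect repeat, e.g. a base substitution
    # or an insertion/deletion smaller than one repeat unit.
    return "interruption"


# Loss of one whole [TTCC] unit from a hypothetical nine-unit allele.
unit_loss = classify_microsatellite_mutation("TTCC", 9, "TTCC" * 8)
# Loss of a single C inside the same allele breaks the repeat.
single_base_loss = classify_microsatellite_mutation(
    "TTCC", 9, "TTCC" * 4 + "TTC" + "TTCC" * 4
)
```

Here `unit_loss` is classified as a unit-based indel and `single_base_loss` as an interruption, mirroring the slippage-mediated and single-base classes described above.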

STATEMENT OF AUTHOR CONTRIBUTIONS

B.A.B. executed the studies; B.A.B. and K.A.E. designed the experiments, interpreted the results and wrote the manuscript.

Acknowledgements

The authors are grateful to Dr. Marietta Lee for generously providing purified human DNA polymerase δ. The authors thank Suzanne Hile for technical assistance, and members of the Eckert laboratory for critical discussions and reading of the manuscript.

REFERENCES
