Computational investigations of polymerase enzymes: Structure, function, inhibition, and biotechnology

DNA and RNA polymerases (Pols) are central to life, health, and biotechnology because they allow the flow of genetic information in biological systems. Importantly, Pol function and (de)regulation are linked to human diseases, notably cancer (DNA Pols) and viral infections (RNA Pols) such as COVID‐19. In addition, Pols are used in various applications such as synthesis of artificial genetic polymers and DNA amplification in molecular biology, medicine, and forensic analysis. Because of all of this, the field of Pols is an intense research area, in which computational studies contribute to elucidating experimentally inaccessible atomistic details of Pol function. In detail, Pols catalyze the replication, transcription, and repair of nucleic acids through the addition, via a nucleotidyl transfer reaction, of a nucleotide to the 3′‐end of the growing nucleic acid strand. Here, we analyze how computational methods, including force‐field‐based molecular dynamics, quantum mechanics/molecular mechanics, and free energy simulations, have advanced our understanding of Pols. We examine the complex interaction of chemical and physical events during Pol catalysis, like metal‐aided enzymatic reactions for nucleotide addition and large conformational rearrangements for substrate selection and binding. We also discuss the role of computational approaches in understanding the origin of Pol fidelity—the ability of Pols to incorporate the correct nucleotide that forms a Watson–Crick base pair with the base of the template nucleic acid strand. Finally, we explore how computations can accelerate the discovery of Pol‐targeting drugs and engineering of artificial Pols for synthetic and biotechnological applications.


| Structural domains of polymerases
DNA Pols are structurally diverse in accordance with the differences in their functional roles and properties. 49 For example, A-and B-family DNA Pols have an exonuclease 3 0 ! 5 0 proofreading domain that enhances their fidelity by as much as 100-fold (10 −7 to 10 −8 ). 50,51 X-family DNA Pols have an additional lyase domain involved in the excision of apyrimidinic sites. 58 Y-family DNA Pols have a smaller structure with fewer contacts with the incoming dNTP and DNA, which contributes to their low processivity. 59 Because they have a more open and solvent-exposed active site that can accommodate modified template bases, nucleotide adducts, and noncomplementary nucleotides, they also exhibit low fidelity. Nevertheless, most DNA Pols typically consist of a catalytic domain resembling a right hand ( Figure 1a). 49,50 On the basis of this architecture, the catalytic domain can be further subdivided into palm, fingers, and thumb subdomains, although only the first two are present in DNA Pol X from African swine fever virus. The palm subdomain contains the catalytic residues, while the fingers and thumb subdomains interact with the incoming dNTP and DNA substrates, respectively. The conventional right-hand-based nomenclature, however, is the opposite of the one originally proposed for the X-family DNA Pol, DNA Pol β (i.e., the fingers and thumb subdomains were swapped). 58 To avoid confusion, a function-based nomenclature is adopted for this family: the catalytic domain is referred to as the polymerase domain and the subdomains as the D (DNA binding), C (catalytic), and N (nascent base-pair binding) subdomains.
In comparison, RNA Pols are more structurally conserved. Viral single-subunit DNA-and RNA-dependent RNA Pols also adopt the right-hand architecture with seven conserved structural motifs: motifs A-E are located in the palm subdomain, while motifs F and G are located in the fingers subdomain. 64 On the other hand, DNA-dependent RNA Pols from the three domains of life have multiple subunits (Figure 1b). 65 The subunits of bacterial RNA Pols are named using Greek letters, those of archaeal RNA Pols are named Rpo followed by a number, and those of eukaryotic RNA Pols are named Rpb, again followed by a number. Using the Rpb nomenclature, the conserved core of multi-subunit RNA Pols consists of Rpb1 and Rpb2, which carry out NTP incorporation and translocation, and Rpb3, Rpb11, and Rpb6, which are involved in enzyme assembly and transcription regulation. The catalytic subunits Rpb1 and Rpb2 contain the catalytic residues, bridge and trigger helices, binding sites for downstream DNA and RNA-DNA hybrid, secondary NTP entry channel, and loop and switch regions responsible for handling the nucleic acid scaffold. The assembly platform is formed by the complex of Rpb10-Rpb12 and Rpb3-Rpb11. Together, the catalytic subunits and assembly platform constitute the minimal configuration of multi-subunit RNA Pols capable of RNA polymerization.

| Catalytic cycle of polymerases
Despite their structural diversity, Pols follow a common catalytic mechanism for D(R)NA synthesis, which consists of not only the chemical reaction itself, but also physical steps including substrate binding, conformational change, and translocation along the primer-template ( Figure 2). The catalytic cycle of RNA Pols consists of initiation, elongation (i.e., NTP addition and translocation), and termination phases. 65 In this regard, this review examines the elongation phase of RNA synthesis, which is analogous to the DNA synthetic process in DNA Pol.

| Nucleotide addition cycle
The catalytic cycle of DNA Pols begins with binding to the DNA substrate. 48 The dNTP then binds to the resulting DNA/Pol binary complex to form the initial DNA/Pol/dNTP ternary complex ( Figure 2). This may be followed by an  Figure 1a) and local active site rearrangement. These structural changes render a reactive ternary complex wherein the dNTP is aligned with the 3 0 -end of the primer DNA for nucleophilic attack ( Figure 2). Y-family DNA Pols, however, do not undergo any global conformational change because the protein is already prealigned for dNTP binding and catalysis. 66 After nucleotide incorporation, a pyrophosphate (PPi) group is released and a reverse conformational change occurs ( Figure 2). 48 However, the order of these two events has not been definitely established. 67,68 Subsequently, the DNA Pol translocates by one base pair along the DNA, leaving the active site free for the binding and incorporation of the next dNTP. 48 Alternatively, the DNA Pol can dissociate from the extended DNA and bind to a new DNA substrate.
The elongation phase of RNA synthesis is similar to the catalytic cycle of DNA Pols. 69,70 The NTP binds to the transcription elongation complex, which consists of the open-conformation RNA Pol and the RNA duplex or RNA-DNA hybrid formed in the initiation phase ( Figure 2). The mechanism for this step slightly differs between the RNA Pol classes. The single-subunit RNA Pol from bacteriophage T7 (T7 RNA Pol) has separate preinsertion and insertion sites for the substrate, 69 while in multi-subunit RNA Pols, the positions of the substrate in the preinsertion and insertion states nearly overlap. 71 However, a template-independent entry site, where the NTP initially binds before moving to the insertion site, has been proposed for multi-subunit RNA Pols. 72 Unlike these DNA-dependent RNA Pols, viral RNAdependent RNA Pols do not have a distinct entry or preinsertion site. 73 Once the NTP moves to the insertion site and establishes WC H-bond interactions (Figure 2), the active site closes through the folding of the trigger loop (or rotation of the O helix in single-subunit RNA Pols, Figure 1), leading to the formation of a reactive configuration. 69,70 Subsequently, the nucleotide is incorporated, PPi is released, the active site re-opens, and the RNA Pol translocates along the RNA duplex or RNA-DNA hybrid ( Figure 2). RNA Pols continue to transcribe until it receives a termination signal (termination phase), and unlike DNA Pols, cannot re-associate with the DNA template or RNA once it dissociates. 74

| Active site and chemical reaction mechanism
Pols have been shown to follow the two-metal-ion mechanism for nucleotidyl transfer (Figure 2, center). 75 The two-metal-ion mechanism is also employed by other nucleic-acid-processing enzymes, such as type II topoisomerases, ribonuclease H enzymes, and endonucleases, for DNA and RNA cleavage. [76][77][78][79] The metal ion at the A-site (M A ) is coordinated to two or three conserved acidic residues (Asp or Glu), two of which are also coordinated to the metal ion at the B-site (M B ). 50,64 In the case of multi-subunit RNA Pols, another conserved acidic residue is coordinated exclusively F I G U R E 2 Nucleotide addition cycle of polymerases (Pols). The active site is shown in the center. Multi-subunit RNA Pols often have an additional conserved acidic residue (gray) coordinated exclusively to the B-site metal (M B ). In the chemical step, the 3 0 -OH group of the primer terminus (red) is deprotonated and attacks the P α atom of the incoming nucleotide (blue) to M B . 54 Additionally, M B is coordinated to the triphosphate group of the incoming (d)NTP and in some cases, a backbone carbonyl oxygen. 50,54,64 The reaction begins with the deprotonation of 3 0 -OH to form the nucleophile, which then attacks the P α atom of (d)NTP in an S N 2-like reaction to form a pentacovalent phosphate transition state. 75 A new bond between the O3 0 and P α atoms is formed, thereby extending the primer by one nucleotide. Concurrently, the phosphodiester bond between the αand β-phosphates of (d)NTP is broken, leading to the release of PPi. PPi has been suggested to undergo protonation by a general acid during the reaction on the basis of solvent deuterium isotope effect data for various DNA Pols and RNA Pols. 80

| KEY STRUCTURAL STUDIES OF POLYMERASES
The catalytic cycle and chemical reaction mechanism of Pols discussed in Section 2 have been established primarily through X-ray crystallography and cryogenic electron microscopy (cryo-EM) studies. In this section, we briefly discuss landmark structural studies of Pols that not only provided a detailed picture of the catalytic cycle, but also served as a framework for computational studies aimed at addressing unresolved questions regarding the catalytic cycle and determining the contributing factors to Pol catalytic efficiency, fidelity, and processivity.

| Conformational changes for catalysis
DNA Pol β, of the X-family DNA Pols, has served as the model enzyme for structural studies of the global and local conformational changes in DNA Pols during the catalytic cycle. For this DNA Pol, at least one crystal structure for each catalytic step has been solved. The apoenzyme structure shows an extended protein conformation with the lyase domain positioned away from the polymerase domain (PDB ID 1BPD). 81 In complex with single-nucleotidegapped DNA, the enzyme adopts a doughnut-like conformation as the lyase domain interacts with the Nsubdomain (PDB ID 3ISB). 82 The dNTP/DNA/Pol ternary complex structure indicates a transition from an open to a closed conformation upon dNTP binding through the rotation of the N-subdomain (PDB ID 2FMS). 83 This conformational change is accompanied by local rearrangements that bring the active site to a reactant state, namely, the movements of R283 to H-bond with the templating base, conserved acidic residues (D190, D192, and D256) to coordinate to M A , and N279 and R183 to H-bond with the dNTP. The structure of the ternary product complex with PPi suggests that the enzyme remains in the closed conformation after the reaction (PDB ID 4KLL), 84 while the structure of the binary product complex with only the nicked product DNA (PDB ID 1BPZ) shows a return to the open conformation once PPi is released. 85

| Phosphodiester bond formation for nucleotide insertion
Noteworthy, the time-resolved X-ray crystallography study of DNA Pol η allowed the first real-time monitoring of phosphodiester bond formation. 86 Nucleotidyl transfer was initiated by immersing a nonreactive ground-state crystal of the dNTP/DNA/Pol ternary complex (with only an inert Ca 2+ ion in the B-site, PDB ID 4ECQ) in 1 mM aqueous Mg 2+ . The proper alignment of the primer 3 0 -OH and dNTP was observed once both metal sites were occupied by Mg 2+ (PDB ID 4ECR). The α-phosphate was refined in a pentacovalent transition state with inverted configuration at the peak of bond formation (PDB ID 4ECV), thereby confirming the proposed S N 2-like reaction. The same experimental technique was employed for DNA Pol β to compare phosphodiester bond formation in the matched and mismatched complexes. 84 In the mismatched complex (PDB ID 4LVS), the template strand was observed to shift upstream (relative to its position in the matched complex, PDB ID 4KLD) with concomitant rotation of the primer 3 0 -OH away from M A , resulting in a nonreactive ground state. Although the reactant state with M A -O3' coordination was not captured, the product structure (PDB ID 4KLT) clearly showed the occurrence of phosphodiester bond formation. However, unlike the case of the matched complex, the base of the incorrect dNTP broke its interaction with the templating base during the reaction (PDB ID 4KLQ). Moreover, the α-phosphate of the incorrect dNTP flipped away from the metal ions (PDB ID 4KLS), which was attributed to the strain on the DNA caused by the template shift.

| Sugar pucker conformations during catalysis
The ribofuranose sugar ring in nucleic acids is not flat but puckered with at least one of the five atoms out of plane. 87,88 In solution, dNTP and NTP molecules exist in a dynamic equilibrium between the C3'-and C2'-endo conformations. 89,90 The populations, energetics, and timescale of interconversion of these two conformations are highly dependent on the sugar ring substituents. Notably, in the active site of Pols, the sugar moieties of both the primer terminus and incoming (d)NTP typically adopt the C3'-endo conformation, as evinced by crystallographic data. 86,91 In nature, the most common helical topologies of DNA and RNA are the B-form (C2'-endo) and A-form (C3'-endo), respectively. The barrier for interconversion of these topologies in DNA duplex structures is small, as supported by several NMR and computational studies. 88,[92][93][94] Despite the preferred C2'-endo conformation of the B-form of DNA duplexes, several crystallographic structures of A-, B-, X-, and Y-family DNA Pols show the 3 0 -end of the primer adopting a C3'-endo conformation either in pre-or postreactant state. 62,83,86,[95][96][97][98][99][100] The impact of the sugar pucker conformation on catalytic efficiency (k cat /K M ) was tested for both Pol β 101 and Pol η 86 using a primer with a ribonucleotide (C3'-endo) at the 3 0 end. The catalytic efficiencies were comparable whether the 3 0 -end nucleotide was a ribonucleotide or a deoxyribonucleotide. Additionally, the sugar pucker conformation was demonstrated to have an effect on both nucleotide incorporation and extension by Pols. For example, nucleotides constrained to a C3'-endo conformation (such as 2 0 -fluororibonucleotides) are preferentially incorporated by RNA Pols. Equally, 2 0 -fluoroarabino nucleotides, which prefer the C2'-endo conformation, are preferentially incorporated by DNA Pols. Interestingly, the consequent product of such incorporation is very difficult to extend further. 102 On these bases, it can be inferred that sugar puckering plays a nontrivial role for correct and efficient nucleotidyl transfer reaction catalyzed by Pols.

| Complete transcription cycle of an RNA polymerase
RNA-dependent RNA Pols allow the copying of RNA from RNA and are essential enzymes of RNA viruses, enabling the synthesis of viral proteins by host cells and their assembly with viral genome into new progeny. 103 In the influenza virus, the enzyme is composed of three intertwined polypeptide chains (known as PA, PB1, and PB2) of about 700 amino acids each. 104 The influenza RNA Pol is capable of both replication and transcription of viral RNA. Transcription takes place by a unique process, known as cap-snatching, in which different domains act in concert. The cap-binding domain (from PB2) binds a messenger RNA fragment of the host cell, which is cleaved by the endonuclease domain (from PA) and used as primer at the Pol site (from PB1). Replication is an unprimed process, which is guided by the structural reorganization of a priming loop that folds into the Pol core and aligns the first nucleotides of the nascent viral RNA. Recent studies have provided structural insight into both processes, 105 attesting the tremendous advances achieved by cryo-EM structure determination. In particular, the authors, through carefully designed experiments, captured the influenza RNA Pol in different "instants" of the transcription cycle, 106,107 demonstrating the complex structural reorganization that the enzyme undergoes to perform catalysis. Likely, this structural knowledge will greatly help the discovery of new and effective antiviral drugs against influenza.

| COMPUTATIONAL STUDIES OF POLYMERASE CATALYTIC MECHANISM AND PROPERTIES
Although structural studies have shed light on the catalytic cycle of Pols and, to a certain extent, have addressed the main factors discriminating between the correct and incorrect (d)NTPs, they employ methods that cannot capture transition states or fast dynamic events (e.g., substrate binding, primer 3 0 -OH deprotonation, and PPi release). Moreover, obtaining crystal structures of mismatched complexes with the catalytic metal (M A ) is challenging because of the weak binding of the incorrect (d)NTP. 108 In this regard, computational approaches are a valuable means to fill the gap in our knowledge of the molecular factors influencing the catalytic efficiency, fidelity, and processivity of Pols. In this section, we focus on the application of computational approaches in elucidating key aspects of the chemical reaction that have eluded experimental probes, namely, the identity of the general base that deprotonates the primer 3 0 -OH, the role of the third metal ion observed in DNA Pol crystal structures, and the timing and mechanism of Pol translocation. We then examine diverse mechanistic hypotheses on the origin of Pol fidelity that have arisen from comparative computational studies of correct and incorrect (d)NTP incorporation. 4.1 | Proposed mechanisms for 3 0 -OH deprotonation and nucleotide incorporation

| Classic mechanisms
Coordination to M A is believed to lower the pK a of the primer 3 0 -OH 83 (ribose 3 0 -OH: pK a in water = 13, 109 effective pK a in protein = 8.1 110 ), and the conventional hypothesis is that either a conserved first-coordination-shell ligand or active site water molecule accepts the dissociated proton from 3 0 -OH ( Figure 3). For the model DNA Pol, DNA Pol β, Wilson et al. proposed a conserved aspartate residue bound to M A (D256) as the general base for the deprotonation step on the basis of ONIOM quantum mechanics/molecular mechanics (QM/MM) calculations. 110,111 Florian, Goodman, and Warshel used free energy perturbation/empirical valence bond (FEP/EVB) calculations to investigate alternative pathways for T7 DNA Pol, including proton transfer to bulk solvent and to one of the nonbridging oxygens of the α-phosphate of the incoming dNTP. 112 However, proton transfer to the catalytic residue, D654, remained the most energetically favorable pathway. Similarly, the corresponding catalytic residues of DNA Pol η (E116), 113 DNA Pol λ (D490), 114 and human immunodeficiency virus (HIV) RT (D185) 115 were shown to be the most plausible base by QM/MM studies. However, none of these studies addressed how the pK a of aspartate or glutamate (3.90 and 4.35, respectively, in water 74 ) is perturbed by the protein environment such that either residue is able to act as a proton acceptor. 113,116 On the other hand, a molecular dynamics (MD) study of DNA Pol I showed that a histidine residue (H829, pK a in water = 6.46 74 ), located about 10 Å from 3 0 -OH in the crystal structure of the open-conformation complex (PDB ID 4YFU), moves toward the catalytic site during enzyme closing. Then, it occasionally interacts with 3 0 -OH upon departure of the Na + ion at the A-site. 117 On the basis of this observation, H829 was proposed as a potential proton acceptor, although the energetic feasibility of this mechanism was not assessed. 117 A few of the studies discussed above [112][113][114] demonstrated that proton transfer to bulk solvent via H 2 O is a highenergy process. However, subsequent computational studies of DNA Pol β 118 and DNA Pol η 116,119 demonstrated the feasibility of such proton transfer to a deprotonated Mg A -bound H 2 O (Mg A -bound OH − pK a ≈ 11.2 116 ). In the case of DNA Pol β, the resulting Mg A -bound OH − acts as a conduit for the eventual proton transfer to bulk solvent according to EVB and pK a calculations (using the semi-macroscopic version of the protein dipole Langevin dipole in its linear response approximation [PDLD/S]). 118 On the other hand, for DNA Pol η, Stevens and Hammes-Schiffer 116 demonstrated the feasibility of proton transfer to the Mg A -bound OH − using QM/MM finite temperature string simulations. Subsequent DFTB3/MM metadynamics simulations by Roston, Demapan, and Cui 119 further indicated that this mode of proton transfer occurs concertedly with nucleophilic attack, with the overall process being the rate-limiting step of nucleotidyl transfer. However, in the case of RNA Pol II, Carvalho, Fernandes, and Ramos showed by ONIOM QM/MM calculations that this concerted mechanism is unfavorable and that proton transfer to a bulk OH − (doubly H-bonded to the RNA terminus and near Mg A ) is the lowest-energy pathway. 120 Notably, it was further found that, unlike the case of DNA Pols, the most stable configuration of the RNA Pol II active site is the one where 3 0 -OH is only weakly coordinated to Mg A . Moreover, 3 0 -OH needs to be dissociated from Mg A to be able to attack the P α atom of NTP. Here, it must be noted that, compared with the case in DNA Pol crystal structures (e.g., those discussed in Section 3), M A is far from the RNA terminus and almost directly below the P α atom of the incoming NTP in the RNA Pol II crystal structure (PDB ID 2E2H). 61 A later QM/MM study by Roßbach and Ochsenfeld, using the nudged elastic band approach and a different crystal structure (PDB ID 4A3F 121 ) as the starting point, corroborated this mechanism for RNA Pol II. 56

| Alternative mechanisms
As an alternative to these classic mechanisms, a water-mediated and substrate-assisted (WMSA) mechanism was proposed by Zhang et al. on the basis of QM/MM FEP studies of DNA Pol IV (Dpo4), 122 T7 DNA Pol, 123 and DNA Pol κ ( Figure 3). 124 In the originally proposed WMSA mechanism for Dpo4, the α-phosphate of the incoming dNTP acts as the general base; 122 however, in contrast to the previous studies discussed in Section 4.1.1, the proton from 3 0 -OH is not directly transferred to it but rather through a water molecule. This indirect process was found to be the rate-limiting step and a lower-energy pathway compared with direct proton transfer to α-phosphate. The proton is then relayed to the γ-phosphate of dNTP via a solvent water and finally, directly to the β-phosphate of dNTP during nucleotidyl transfer. Thus, the WMSA mechanism accounts for not only the deprotonation of 3 0 -OH, but also the protonation of PPi, which is believed to be required for its release. 80 A variant of this mechanism was proposed for T7 DNA Pol 123 and DNA Pol κ, 124 wherein the initial base is γ-phosphate and not α-phosphate. For T7 DNA Pol, using the same QM/MM method/theory level, it was shown that the WMSA mechanism has a much lower barrier 123 than direct proton transfer to D654, which was proposed earlier by Florian, Goodman, and Warshel. 112 Indirect proton transfer to the α-phosphate via water and subsequent migration to the O αβ atom of NTP was also shown to be a viable pathway for RNA Pol II by Salahub et al. using QM/MM calculations. 125 However, indirect proton transfer via D483 and direct proton transfer to the α-phosphate were found to be equally favorable pathways. This result differs from the earlier one by Carvalho, Fernandes, and Ramos, 120 although both studies suggested that the RNA 3 0 -OH is weakly coordinated to or unbound from Mg A during nucleotidyl transfer.
Alberts, Wang, and Schlick 127 proposed a similar water-mediated mechanism for DNA Pol β that, however, challenges those previously proposed by Wilson et al. (proton transfer to D256 110,111 ) and Matute, Yoon, and Warshel (proton transfer to bulk solvent via Mg A -bound OH − ). 118 The QM/MM calculations showed that the proton from 3 0 -OH is initially transferred to a water molecule and, during nucleotidyl transfer, migrates to D190 and finally to the γ-phosphate of dNTP as in the WMSA mechanism. In comparison, direct proton transfer to D256 was found to have a much higher barrier, and the proton returns to O3' when the resulting intermediate was subjected to unconstrained minimization.
Finally, our group proposed a unique self-activated mechanism (SAM), 91 which significantly differs from the ones discussed above (Figure 4). The mechanism is "self-activated" because it is initiated by the intramolecular H-bond between the 3 0 -OH and β-phosphate groups of the newly incorporated dNTP, which facilitates the in situ deprotonation of 3 0 -OH. Importantly, the presence of this intramolecular H-bond upon the formation of the Michaelis complex was demonstrated to be a conserved feature in all available crystal structures of (deoxy)ribonucleotides complexed with F I G U R E 3 Proposed mechanisms for nucleotidyl transfer catalyzed by polymerases. In the protein-mediated mechanism, the 3 0 -OH group of the primer terminus is deprotonated by a conserved catalytic residue (Asp or Glu). In the water-mediated and substrate-assisted mechanism, the 3 0 -OH proton is initially transferred to the α-phosphate of the incoming nucleotide via a water molecule and ultimately relayed to the β-phosphate prior to pyrophosphate release and translocation. Reprinted with permission from Reference 126 Copyright 2018 American Chemical Society. https://doi.org/10.1021/acscatal.8b03363. Further permissions related to the material excerpted should be directed to the American Chemical Society Pol/D(R)NA binary complexes in the Protein Data Bank ( Figure 5). 91 This unprecedented structural observation was indeed the basis of the proposed SAM. Car-Parrinello QM/MM simulations of DNA Pol η showed that direct proton transfer from 3 0 -OH to the pro-S oxygen of the β-phosphate occurs simultaneously with both the formation of the leaving PPi group and partial translocation ( Figure 4). As a result, the newly incorporated dNTP is already deprotonated and positioned above M A prior to the binding of the next dNTP, rendering the enzyme ready for a new catalytic cycle of dNTP incorporation. Intriguingly, this mechanism has been recently compared, via DFTB3/MM metadynamics simulations, to the other more conservative ones. 119 Roston, Demapan, and Cui argued that SAM is disfavored because their calculated pK a for 3 0 -OH (lower limit of 9.0 for the posttranslocated state) implies that 3 0 -OH is not likely to remain deprotonated long enough during catalysis. This argument seems, however, marginal since SAM implies a level of synchronicity of a concerted sequence of chemical and physical steps that cannot be easily tested, especially with semiquantitative semiempirical methods. As a matter of fact, SAM remains the only conceptually novel alternative to the classical mechanisms. In fact, as eventually recognized also by Roston, Demapan, and Cui, 119 we emphasize that, unlike the previously proposed mechanisms for nucleotide incorporation (Figure 3), SAM is the only mechanism that enables a synergistic interconnection of the chemical (3 0 -OH deprotonation and nucleotidyl transfer) and physical (translocation) steps for a closed-loop catalytic cycle (Figure 4). 91 Notably, in this case, ab initio QM/MM simulations are essential in capturing the concerted structural rearrangement for nucleophile activation and leaving group formation in SAM during the partial nucleic acid translocation for Pol catalysis. Thus, the coupling between the chemical and physical steps form a closed-loop cycle for Pol catalysis. This is the reason why SAM remains a unique mechanistic hypothesis for the way the chemical and physical steps are linked together, returning the most favored energy path for nucleotide addition in Pols, so far. We close this section with a final consideration. As outlined above, different pathways have been proposed based on computational results for the activation of the 3 0 -OH nucleophile, even for the same Pol enzyme. Unfortunately, at present, it is not possible to discern experimentally which mechanism is operative in Pols at physiological conditions. Actually, some of these mechanisms may even coexist. Furthermore, we note that specific proton transfer pathways may contribute only marginally to the overall catalytic effect as long as the nucleophile is formed. 128 That is, while it is intellectually interesting to discriminate one deprotonation pathway from another, such pathways remain speculative if not corroborated by experimental data. A specific pathway for nucleophile activation (i.e., deprotonation) is actually significant only if proposed as the rate-limiting step, or if it involves a specific and conserved residue (i.e., irreplaceable residue for catalysis) as the proton acceptor.

| Role of the third metal ion in DNA polymerases
Time-resolved crystallography studies of DNA Pol η 86,129 and DNA Pol μ 130 showed that a third metal ion (Mg 2+ or Mn 2+ ) appears near the α-phosphate during phosphodiester bond formation, suggesting that it plays a role in transition-state stabilization. For instance, in another group of nucleotide-processing enzymes, homing endonucleases, the third metal ion was found to be essential to achieving the proper geometry for phosphodiester bond hydrolysis. 131 However, subsequent computational studies of DNA Pol η gave diverging results on the effect of this third metal ion on the chemical barrier. This is likely because these studies used different mechanisms for the chemical  116,119,[132][133][134][135] Stevens and Hammes-Schiffer employed SAM, our proposed mechanism for 3 0 -OH deprotonation, 91 in modeling the chemical reaction in the two-Mg 2+ system. 116 QM/MM finite temperature string simulations showed that nucleotidyl transfer has a higher barrier than 3 0 -OH deprotonation via SAM. On the other hand, for the three-Mg 2+ system, only nucleotidyl transfer could be modeled because the third Mg 2+ prevents the proton transfer from 3 0 -OH through electrostatic effects, which agrees with our earlier finding. 91 Nucleotidyl transfer was found to be a thermodynamically downhill process without a significant barrier 116 because the third Mg 2+ stabilizes the PPi product through electrostatic interactions. 91,116,134 Roston, Demapan, and Cui, on the other hand, modeled 3 0 -OH deprotonation by Mg A -bound OH − and nucleotidyl transfer as a concerted process. 119 QM/MM metadynamics simulations showed that the third Mg 2+ lowers the barrier for this concerted process, lowers the pK a of the Mg A -coordinated H 2 O, and stabilizes the Mg A -bound OH − and negative charge accumulated on the leaving group at the transition state. Interestingly, when a catalytic aspartate (D115) was used as the general base in the proton transfer step, the inclusion of the third Mg 2+ in the model had no significant effect on the calculated barrier. In contrast to these two studies, Yoon and Warshel modeled the chemical reaction as a stepwise process of proton transfer, whose mechanism was unspecified, and nucleophilic attack. 132 FEP/EVB calculations showed that the third Mg 2+ has no effect on the barrier for nucleophilic attack, although the overall reaction is more exothermic than the same reaction in the two-Mg 2+ system. Their study is consistent with that of Wilson et al. 133 on DNA Pol β, wherein ONIOM QM/MM calculations along the P α -O αβ bond reaction coordinate showed that the barriers for nucleotidyl transfer in the two-and three-Mg 2+ systems are the same.
Unlike the case of DNA Pol η and DNA Pol μ, the third metal ion only appeared at the product state in the timeresolved crystallography study of DNA Pol β. 84 Wilson et al. thus postulated that it is likely involved in the reverse reaction (pyrophosphorolysis) and subsequently showed through ONIOM QM/MM calculations that the third Mg 2+ , which is coordinated to the O α and O αβ atoms, prevents the reformation of the P α -O αβ bond. 135 On the other hand, our metadynamics study of DNA Pol η indicated that the third Mg 2+ , in cooperation with R61, acts as an exit shuttle for PPi. 134 The mechanism involves two steps: (1) partial unbinding of the Mg B -Mg C -PPi complex and sidechain conformational change of R61 to interact (along with Y52 and R55) with this leaving group and (2) return of R61 to its original conformation to facilitate the release of the Mg B -Mg C -PPi complex. This proposed role of the third Mg 2+ in PPi release was supported by subsequent FEP/EVB calculations of the overall reaction in DNA Pol η by Yoon and Warshel. 132 Notably, we observed a similar role for such a third transient ion in product release in another nucleotide-processing metalloenzyme, human exonuclease 1. 136 The importance of transient positive ions in nucleic acid synthesis and scission has been demonstrated for two-metal-ion enzymes and ribozymes ( Figure 6). 137,138

| Translocation mechanism of polymerases
Translocation is an important step in the catalytic cycle because it frees the active site for the addition of a new nucleotide. However, its mechanism and timing (i.e., before, after, or concurrent with active site reopening and PPi release) are not well defined by experimental studies. MD simulations of multi-subunit RNA Pols 139,140 and single-subunit T7 RNA Pol 141 suggested that translocation occurs after PPi release. Translocation in RNA Pol was demonstrated to follow the Brownian ratchet mechanism, wherein the system interconverts between the pre-and posttranslocated states until the binding of the next NTP inhibits backward translocation and locks the system in the posttranslocated state. 142,143 Yu et al. 144,145 and Huang et al. [146][147][148] provided a more detailed picture of the RNA Pol translocation mechanism through Markov state models, which enable the modeling of long-timescale dynamics (milliseconds for translocation) from many short MD simulations initiated from different parts of the free energy landscape. In T7 RNA Pol, the translocation of RNA and template DNA was initially unsynchronized because the latter was temporarily hindered by stacking interaction with F644 in the Y helix. 145 Opening of the O helix resulted in the insertion of Y639 into the active site, pushing the RNA-DNA hybrid upstream. In contrast to this system, RNA Pol II was simulated with the active site (i.e., trigger loop) already open prior to translocation 147 on the basis of earlier evidence from fluorescence 149 and MD simulation studies. 143 RNA and template DNA were observed to translocate simultaneously, facilitated by bending motions of the bridge helix (part of which is structurally analogous to the Y helix). The bridge helix was also shown to play a role in the backtracking process in RNA Pol II, which enables the nucleolytic cleavage of an RNA dinucleotide containing the wrong base. 148 Through its bending motions, it was observed to promote the fraying of the RNA terminus away from the template DNA if the base pairing is unstable, which is the case in a mispair.
In comparison, there have been fewer computational studies of the specific steps in DNA Pol translocation. In the case of DNA Pol η, which does not show distinct conformational states during the catalytic cycle, we proposed the coupling of DNA Pol translocation with the deprotonation of the 3 0 -OH group of the newly incorporated dNTP (see Section 4.1.2 and Figure 4, SAM for nucleotide incorporation). 91 On the other hand, in a computational study of the closed conformation of HIV RT employing locally enhanced sampling, steered MD, and Milestoning, translocation was observed to begin as PPi exits the protein. 150 Similarly, a restricted-perturbation targeted MD study of DNA Pol I suggested that PPi release enables the coupled closed ! open conformational transition and translocation. 151 Modeling D(R)NA translocation in DNA Pols and RNA Pols is certainly highly challenging given the global nature (i.e., many atoms and structural motifs involved, all moving in concert) of the structural rearrangements required to advance the Pol along the template strand. In this regard, methodological advances aimed at identifying collective motions in complex molecular systems will contribute to unravel the critical features and molecular determinants of Pol translocation.

| Opposing views on the contribution of prechemistry steps
Because global (i.e., open ! closed transition) and local conformational changes in Pols are a prerequisite for D(R)NA synthesis, the prechemistry steps of the nucleotide addition cycle have been hypothesized to contribute to catalysis (i.e., k cat or k pol ) and consequently, to the fidelity of Pols, regardless of whether they are rate determining. This is in line with the induced-fit mechanism, wherein the binding of the correct substrate brings the enzyme to the proper orientation for the catalytic reaction. 152 This hypothesis has been investigated through comparative computational studies of the conformational transition mechanism in matched and mismatched complexes. For instance, transition path sampling calculations of DNA Pol β revealed that the active site residues around the Mg 2+ ions do not rearrange to a catalytically competent configuration in the mismatched G:A complex during the closing of the N-subdomain, unlike in the matched G:C complex. [153][154][155] Furthermore, free energy calculations indicated that the closed conformation is only a metastable state for the mismatched complex. A more recent MD study on DNA Pol I suggested that the binding of the incorrect dNTP (dTTP opposite G) disrupts the interaction networks between the 3 0 ! 5 0 exonuclease domain and palm and thumb subdomains, causing the conformational equilibrium to shift toward the open conformation. 156 Because of the differences between the prechemistry steps of matched and mismatched complexes revealed by these computational studies, as well as experiments, 157,158 it has been proposed that prechemistry steps can act as kinetic "checkpoints" for the selection of the correct (d)NTP. 159 Along the same lines, Johnson and his collaborators 160,161 proposed a three-step mechanism for (d)NTP incorporation where E and F are the open and closed forms of the Pol, respectively, D is D(R)NA, and N is the incoming nucleotide. k cat /K M can then be derived as Subsequently, their MD simulations of HIV RT using the Milestoning approach showed that the closed conformation is more thermodynamically favorable than the open conformation for the matched A:T complex, but not for the mismatched A:A complex. 162 Moreover, the reacting groups were properly aligned for the chemical reaction in the matched complex, but disordered in the mismatched complex. From these results, it was inferred that for correct dNTP incorporation, k 3 k −2 and k cat /K M ≈ K 1 k 2 (where . This implies that substrate affinity (K 1 ) and the rate of the open ! closed transition (k 2 ) determine the specificity of HIV RT for the correct dNTP. The concept of kinetic checkpoints has also been invoked to explain the fidelity of RNA Pols. Of the five kinetic checkpoints proposed by Feig et al. for multi-subunit RNA Pols, two occur prior to the chemical step: (1) rotation of the NTP as it moves from the entry site to the insertion site and (2) steric interaction between the closed trigger loop and NTP. 163 It was subsequently demonstrated that a kinetic model including these checkpoints could reproduce the experimental misincorporation rate. 164 On the other hand, for the viral single-subunit T7 RNA Pol, Yu et al. proposed three prechemistry kinetic checkpoints: (1) NTP dissociation from the preinsertion site, (2) NTP binding at the insertion site, and (3) NTP dissociation from the insertion site (i.e., the reverse of 2). 165,166 Furthermore, they calculated a "selection" free energy for each of these checkpoints by umbrella sampling simulations. The total selection free energy was then used to derive the theoretical error rate of T7 RNA Pol using the chemical master equation. 167,168 It was found that the total selection free energy from the prechemistry kinetic checkpoints is sufficient to reproduce the experimental error rate for dNTP incorporation but not that for incorrect NTP incorporation. 166 From this, it was inferred that the chemistry step must also contribute to the total selection free energy through a higher activation barrier in order to discriminate against the incorrect NTP. Additionally, these studies and other free energy studies of prechemistry complexes of RNA Pols 169,170 consistently demonstrated that dNTP or the incorrect NTP tend to dissociate from the preinsertion and/or insertion site, hinder the closing of the active site, and/or cause structural distortion.
Similarly to Yu et al., Salahub et al. devised a stochastic kinetic scheme for the entire nucleotide addition cycle to determine the origin of the sugar selectivity (i.e., NTP vs. dNTP) of RNA Pol II. 171 Unlike the studies above, the chemical step was found to have a more dominant contribution, that is, the experimental substrate selectivity of RNA Pol II could only be reproduced by assigning a different chemical reaction rate for each nucleotide. This conclusion was later corroborated by Roßbach and Ochsenfeld, who applied QM/MM calculations and the nudged elastic band method to show that ATP incorporation by RNA Pol II has a lower reaction barrier than dATP incorporation. 56 The lower reaction barrier was due to an active site Arg (R446) that bridges the RNA terminus with ATP (via H-bonds) but not with dATP.
On the other hand, Warshel et al. posited that unless they are rate limiting (i.e., have the highest free energy barrier), these so-called checkpoints represented by prechemistry steps do not contribute to fidelity. 172 In other words, unlike the case in the kinetic models proposed for RNA Pols, 165,166,171 fidelity is not considered as the cumulative effect of all kinetic checkpoints during the catalytic cycle. According to this hypothesis, if the chemistry step is the ratelimiting step (as experimentally shown in the case of, for example, DNA Pol β 173-175 and DNA Pol η 176 ), only the chemical barrier, and not the barriers of the prechemistry steps, contribute to catalysis and fidelity. However, Warshel et al. clarified that prechemistry conformational changes do determine the final active site preorganization, as reflected in the ground-state Michaelis complex. 172 In this way, prechemistry conformational changes by themselves, and not their associated barriers, have an indirect impact on the chemical barrier and catalytic rate. For example, for DNA Pol β, structural data showed that the fully folded mismatched G:dATP/DNA/Pol ternary complex has a closed conformation but the primer 3 0 -OH is neither coordinated to M A nor properly aligned with the P α atom of dATP for nucleophilic attack, unlike the matched G:dCTP/DNA/Pol complex. 84 Two independent computational studies, employing ONIOM QM/MM 177 and FEP/EVB, 178 indicated that bringing the mismatched complex from this ground state to a reactant state entails an additional energy cost. The resulting reactant state and subsequent transition state for the nucleophilic attack on the incorrect dNTP (dATP) were shown to lie at higher energies than the corresponding states in the matched complex. These studies somehow support the point of Warshel et al. 172 that active site preorganization, irrespective of the preceding conformational change, contributes to the fidelity of a Pol if the chemical reaction is the rate-determining step. Both the matched and mismatched complexes of DNA Pol β were shown to react in the closed conformation; however, the optimal active site preorganization of the former facilitated transition state stabilization, leading to a lower chemical barrier, whereas the poor active site preorganization of the latter diminished transition state stabilization, leading to a higher chemical barrier. This overall point on the mechanistic origin of fidelity in Pols remains however quite intriguing, with mechanistic aspects related to correct ligand selection and its subsequent efficient catalytic processing that will certainly benefit from additional careful computational studies.

| Contribution of (d)NTP binding affinity to polymerase fidelity
The relative binding energies of the correct and incorrect (d)NTPs (ΔΔG bind ) is related to and therefore, also contributes to Pol fidelity (see Section 2.1). Binding free energy studies of dNTP/DNA/Pol ternary complexes have been mainly reported by Warshel et al., who employed the linear response approximation (LRA) method. As an approximation, only the potential energy due to the nucleobase moiety was monitored, and all possible combinations of the four bases (limited to the neutral anti-anti configuration) were considered. 180,181 The binding free energy contribution to fidelity was evaluated for DNA Pol β and T7 DNA Pol. The calculations showed that ΔΔG bind is higher for T7 DNA Pol, consistent with the higher fidelity of this DNA Pol. The higher ΔΔG bind for T7 DNA Pol was attributed to the larger displacement of the incorrect dNTP toward the major groove compared with the case in DNA Pol β. Warshel et al. subsequently modified their approach to include the contribution of the triphosphate group of the incoming dNTP and calculated the binding energies at the transition state (ΔG TS bind ) instead of the ground state. 182 A high correlation (R = 0.97) was obtained between the calculated ΔG TS bind and ΔG bind derived from presteady-state kinetic data. They also compared the contributions of both the ΔG bind and chemical barrier to the fidelities of T7 DNA Pol, 179 DNA Pol β, 178 and DNA Pol η 132 and showed that ΔG bind has a slightly larger contribution in all cases. Ucisik and Hammes-Schiffer also reported binding free energy calculations using the more computationally expensive thermodynamic integration and reproduced the specificity of DNA Pol η for dATP versus dGTP opposite template T. 183 On the other hand, to rationalize the sugar selectivity of RB69 DNA Pol and T7 RNA Pol, Yoon and Warshel evaluated the change in ΔG TS bind upon mutating CTP to dCTP. 184 The FEP calculations suggested that the discrimination against CTP by RB69 DNA Pol is due to steric interaction (van der Waals contribution) between the 2 0 -OH group and an active site Tyr (Y416), while the discrimination against dCTP by T7 RNA Pol is due to electrostatic effects. The results of these computational studies therefore suggest that the binding affinity for the incoming (d)NTP also contributes to Pol fidelity.

| Role of conserved residues in correct (d)NTP selection
Residues that are potentially critical to the pre-chemistry conformational change, chemical reaction, and fidelity have been identified qualitatively by comparing their interactions during MD simulations of matched/mismatched complexes [185][186][187] or wild-type/mutant Pols. 185,188,189 For a more quantitative approach, energy decomposition analysis has been employed. 111,114,190,191 For example, Pedersen et al. used ONIOM QM/MM calculations to determine residue interactions in the active sites of DNA Pol β 111 and DNA Pol λ 114 that stabilize the transition state and lower the chemical barrier for correct dNTP incorporation. Graham, Syeda, and Cisneros performed energy decomposition analysis, along with electrostatic free energy response and noncovalent interaction analyses, using only energies obtained from MD simulations. 190 Interactions were calculated at a purported fidelity-checking site (two bases downstream from the active site) of DNA Pol I to which the enzyme transiently translocates after dNTP incorporation. These calculations allowed the identification of residues that are potentially involved in fidelity checking at the postinsertion site. This would be based on the change in the interaction energy of such residues with a mispaired base in the DNA substrate. On the other hand, Warshel et al. employed the PDLD/S-LRA method to reproduce the effect of mutations on the experimental activation energy of DNA Pol β and compare the contributions of ionized residues near the active site to the ΔG TS bind of matched and mismatched complexes. 192 However, to assess the contribution of residues far from the active site, they used a different approach, the hybrid FEP/linear interaction energy (LIE) approximation method. 193 The results were then rationalized by comparing residue interactions in the binary and transition state complexes of wild-type and mutant Pol β.
Unsurprisingly, most of the important residues identified by these methods are conserved residues. Importantly, using bioinformatics analysis of a large set of DNA/Pol structures, we found a conserved positively charged residue (Arg or Lys) located in the fingers subdomain of DNA Pols that always interacts with the incoming dNTP (Figure 7). 194 MD and metadynamics simulations of DNA Pol η indicated that π-stacking and H-bond interactions of the conserved residue, R61, with the base and triphosphate groups of the incoming dNTP, respectively, facilitate canonical WC base pairing, which is lower in energy than the alternative Hoogsteen base pairing. However, for the R61A mutant, the Hoogsteen base pairing was found to be slightly lower in energy, which is consistent with the observed Hoogsteen base pairing in dNTP/DNA/Pol crystal structures where the conserved Lys or Arg is missing, mutated, or displaced. We also identified through bioinformatics analysis two conserved Lys or Arg residues in the second coordination shell of the twometal-ion center of Pols ( Figure 6). 137 MD simulations of DNA Pol η indicated a possible role in active site stabilization on the basis of the observed partial unfolding of the DNA substrate and consequent distortion of the Michaelis-Menten complex upon mutation of these residues. Mutagenesis 195 and computational 196 studies on another nucleotide-processing metalloenzyme, all-α dimeric deoxyuridine triphosphate nucleotidohydrolase, suggested that such residues contribute to specificity by stabilizing the correct substrate.

| COMPUTATIONAL STUDIES OF POLYMERASE APPLICATIONS
Computational methods can be utilized not only to elucidate the Pol catalytic mechanism, but also to discover inhibitors against Pols and guide the protein engineering of Pols for various applications. This will reduce the need for more costly traditional experimental approaches such as inhibitor screening assays and directed evolution. Here, we discuss a few examples of common and emerging applications of computational methods in these research areas.

| Therapeutics: Polymerases as targets for disease treatment and prevention
One of the well-known Pol drug targets is HIV RT, which catalyzes the reverse transcription of the viral single-stranded (+) RNA genome of the HIV into double-stranded DNA. 16 Computational approaches, including docking, de novo design, and FEP, have been successfully applied in the discovery of nonnucleoside inhibitors with picomolar and low-nanomolar activities against wild-type and mutant HIV RT. 197 Currently, DNA Pols involved in translesion synthesis (e.g., Rev1, DNA Pol η, DNA Pol ι, DNA Pol κ, and DNA Pol ξ) have gained attention as drug targets because of their involvement in the onset and development of cancer. 17,18 Additionally, these DNA Pols can decrease the efficacy of chemotherapy by bypassing the nucleotide adducts formed by anticancer agents. One of the reported drug candidates is an indole thiobarbituric acid derivative that inhibits human DNA Pol η activity. 198 Kinetic experiments and chemical footprinting assays, with the aid of docking calculations, indicated that the inhibitor likely binds to a pocket between the fingers and little finger subdomains of the dNTP/DNA/Pol ternary complex, thereby preventing the proper positioning of the template DNA strand. Another timely target is the RNA-dependent RNA Pol of SARS-CoV-2, which causes COVID-19. 24 Because of the urgency to develop therapeutics, one of the approaches that has been adopted is the drug repurposing of known RNA-dependent RNA Pol inhibitors such as remdesivir. Several MD studies have already been performed to understand the mechanism of inhibition of remdesivir, which will hopefully aid the discovery of new and more active inhibitors. [199][200][201][202] 5.2 | Biotechnology: DNA Polymerases for PCR amplification of damaged DNA A major drawback of DNA Pols used in PCR amplification, such as Taq DNA Pol I, is their inability to amplify damaged DNA samples for medical and forensic analysis. 203 Thus, an attractive solution to this problem is to use thermostable Y-family DNA Pols, which specialize in lesion bypass. However, the poor processivity and activity of these DNA Pols hinder their application. Nevertheless, several studies have demonstrated that these limitations can be circumvented by strengthening DNA binding through mutations and/or fusion with a nonspecific DNA-binding protein such as Sulfolobus solfataricus nonspecific DNA-binding protein 7d (Sso7d). [204][205][206][207][208] In this respect, computational approaches can be used to rationally design DNA Pol variants with improved processivity and catalytic activity. This has been demonstrated for the Sso7d-Dbh (Din B homologue from S. acidocaldarius) fusion protein. 204 Potential mutation sites (nonconserved residues) around the DNA substrate in Dbh were identified by bioinformatics analysis, and the ΔG bind of DNA to the selected mutants were calculated using the molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) method. The designed variants were experimentally confirmed to have lower K d (i.e., higher affinity), higher processivity, and higher activity for dCTP incorporation opposite both G and 8-oxoG than wild-type Sso7-Dbh. 204 The same computational strategy was also applied to improve the processivity and catalytic efficiency (dCTP incorporation opposite 8-oxoG) of Dpo4 from S. solfataricus. 205

| Synthetic biology: DNA Polymerases for incorporation of unnatural base pairs
The expansion of the genetic alphabet, through the creation of UBPs from modified nucleotides, enables the storage of additional information in DNA. 209,210 Furthermore, the increased chemical and structural diversity of the modified DNA and RNA would broaden their molecular biological and biotechnological applications to, for example, production of aptamers that bind to proteins and cells, incorporation of unnatural amino acids into proteins, and generation of semisynthetic organisms. However, the major challenge is to identify or develop DNA Pols that can incorporate UBPs with high catalytic efficiency and fidelity and maintain high extension efficiency after the incorporation of the UBP. In this respect, computational approaches can complement structural studies in understanding the mechanisms of UBP insertion and postinsertion elongation by DNA Pols. For instance, MD simulations have been used to investigate protein dynamics in a variant of the Klenow fragment of Taq DNA Pol I (KlenTaq), which incorporates the "hachimoji" P: Z base pair (P: 2-amino-imidazo[1,2-a]-1,3,5-triazin-4(8H)one; Z: 6-amino-5-nitro-2(1H)-pyridone) with high catalytic efficiency. 210 The calculations demonstrated that the mutations (located far from the active site) increase the enzyme's flexibility, allowing it to interact with the P:Z-containing DNA as optimally as with a purely WC DNA. MD simulations have also aided in understanding the incorporation of P-alkyl phosphonate nucleic acids, which have a noncanonical uncharged backbone, by B-family DNA Pols. Particularly, the selective incorporation of the (S) p -diastereoisomer of Palkyl phosphonate nucleic acids over the (R) p -diastereoisomer was shown to be due to the formation of a stable Michaelis-Menten complex with a catalytically ready configuration. 211 On the other hand, QM/MM energy minimization has been used to understand how the protein environment of wild-type KlenTaq modulates the conformation of a 1,4-diethynylbenzene-modified nucleotide once it is positioned at the 3 0 -primer terminus. 212 Density functional theory calculations also indicated a low energy barrier for the rotation around the diethynyl axis of the modified nucleotide, which explained its twisted conformation when it is at positions 2, 4, and 6 upstream from the 3 0 -primer terminus. 212

| CONCLUSION
Computational studies have built upon the wealth of existing structural and kinetic data on Pols to significantly advance our understanding of the Pol catalytic mechanism. Importantly, computational approaches have provided a means to study several key aspects of Pol function. We have reviewed numerous representative computational investigations on Pol function, including mechanistic events that are difficult to address at the molecular level with experimental means. Such events range from physical steps, such as conformational change and translocation, to chemical steps, such as proton transfer and reaction with transient ions. Another relevant aspect is misincorporation in Pols, which is challenging to examine in atomic detail through experiments owing to the inherent instability of mismatched complexes. In this case, molecular simulations can help in the direct comparison of the catalytic cycles of matched and mismatched complexes and the formulation of mechanistic hypotheses regarding the origin of Pol fidelity. Yet, there remains a lack of consensus on the primer 3 0 -OH deprotonation mechanism and role of the third metal ion in DNA Pols. Thus, further computational and experimental studies are required to clarify these aspects of the catalytic cycle, particularly on whether they are generally applicable to all Pols or specific to only the Pol under investigation. To resolve these issues in future computational work, we consider it critical to address: (1) extensive configurational sampling, (2) consistency in theoretical method and active site models employed (e.g., either with or without the third metal ion in DNA Pols) when evaluating all possible mechanisms, and (3) consideration of the pK a changes of the residues involved due to the microenvironment. 74,113,132 In addition to the investigation of the fundamental mechanism for Pol function, we also touched upon computational investigations in drug discovery targeting Pols; docking to predict binding poses, FEP to calculate binding affinity, and QM/MM and MD simulations 213,214 to study the mechanism of inhibition and perform dynamic docking can indeed find good use also in the context of Pol drug discovery efforts. The use of newer methods such as τ-random acceleration MD 215 and smoothed potential MD 216 to study inhibitor binding kinetics (e.g., drug residence time) has gained traction in recent years. Machine learning, which is commonly used for de novo molecular design and quantitative structure-activity relationship modeling, can also be utilized for other stages of drug discovery and development such as biomarker discovery. 217 In the case of Pols, we note that inhibitors can be designed to bind to either the active site and so terminate nucleic acid synthesis or bind to an allosteric site and hinder the formation of a reactive complex (e.g., prevent enzyme closing). 16,17,23 In this context, drug discovery targeting Pols faces two main challenges, namely, inhibitor selectivity due to the structural similarities among Pols and drug resistance due to mutations in Pols, specifically and often in the case of viral RNA Pols. Optimistically, we believe these drug discovery challenges will be overcome also with the aid of the variety of computational approaches discussed above and, in this regard, we look forward to future efforts and findings achieved via computational modeling integrated with experiments.
We have also highlighted promising applications of computations in the protein engineering of Pols for biotechnology and synthetic biology. Computational methods that can be used to predict the effect of mutation on catalytic efficiency, fidelity, and processivity include free energy calculations 218,219 and machine learning. 220 The knowledge of the catalytic mechanism and determinants of the nucleotide specificity of Pols gained through computational studies can aid in the rational design of Pols. In this regard, one property that has not yet been sufficiently explored by computational studies is Pol processivity. This is particularly challenging and important because PCR and synthetic applications require highly processive DNA Pol variants that can efficiently extend DNA after the incorporation of modified or synthetic nucleotides.
In conclusion, computational studies not only provide testable hypotheses and models of the Pol catalytic cycle and chemical properties that can be subsequently validated by experimental methods such as mutagenesis, but also generate predictions that can greatly expedite the discovery of Pol-targeting drugs and protein engineering of Pols. Computational studies are certainly expected to continue playing a significant role in Pol research and we are eager to witness the computer-aided advances of this fascinating research field.