The catalytic mechanism of cyclic GMP‐AMP synthase (cGAS) and implications for innate immunity and inhibition

Abstract
Cyclic GMP-AMP synthase (cGAS) is activated by ds-DNA binding to produce the secondary messenger 2′,3′-cGAMP. cGAS is an important control point in the innate immune response; dysregulation of the cGAS pathway is linked to autoimmune diseases, while targeted stimulation may be of benefit in immuno-oncology. We report here structures of cGAS with dinucleotides and small molecule inhibitors, and kinetic studies of the cGAS mechanism. Our structural work supports the current understanding of how ds-DNA activates cGAS, suggesting a site for small molecule binders that may cause cGAS activation at physiological ATP concentrations, and an apparent hotspot for inhibitor binding. Mechanistic studies of cGAS provide the first kinetic constants for 2′,3′-cGAMP formation and, interestingly, describe a catalytic mechanism where 2′,3′-cGAMP may be a minor product of cGAS compared with linear nucleotides.


Significance: We describe a novel SPR-based enzymatic assay and demonstrate its utility on the cGAS/STING system. We expect this assay to be of broad utility because it is continuous, is compatible with native substrates, and does not require coupling reactions or fluorescently labeled reagents. Using both our SPR assay and conventional enzymatic assays, we report the first kinetic parameters for the cGAS enzymatic mechanism, which suggest that its canonical product, 2′,3′-cGAMP, is actually the minor product of this reaction. We also present several new structures of cGAS with dinucleotides or small molecule inhibitors from a fragment screen. The results of our structural analysis support an existing model for the structural basis of DNA-induced cGAS activation that may be inducible by a small molecule binder. In combination with our activator data, we present the results from a fragment-binding study that suggests the presence of an inhibitor binding hotspot for cGAS.

Additional Supporting Information may be found in the online version of this article.

Introduction
Cyclic GMP-AMP synthase (cGAS) is a sensor of cytosolic double-stranded DNA (ds-DNA). ds-DNA in the cytosol may arise from infection or mitochondrial damage. Metazoans have developed the ability to sense ds-DNA in the cytosol as a trigger for the innate immune response. [1][2][3] Although interferon and cytokine signaling are warranted to combat infectious ds-DNA contaminants, recent studies have found that self-ds-DNA may occur in persisting autoimmune disorders such as systemic lupus erythematosus, suggesting inappropriate ds-DNA sensing may be a contributor to autoimmune disease. [4][5][6][7] cGAS is activated by ds-DNA binding to catalyze the cyclization of ATP and GTP to form a cyclic dinucleotide with mixed 2′,5′- and 3′,5′-phosphodiester linkages (2′,3′-cGAMP), which in turn activates stimulator of type 1 interferon genes (STING). [8][9][10] Activated STING causes the activation of TBK1, which phosphorylates IRF3, allowing it to translocate to the nucleus where it triggers interferon-inducible gene activation and interferon production. [8][11][12][13] Interestingly, intentional activation of STING through intratumoral injections of STING agonists has demonstrated antitumor properties and immunological memory, suggesting that while cGAS inhibition may benefit autoimmune patients, cGAS activation may be of therapeutic benefit in oncology. [14][15][16] The relationship of cGAS and STING is both old (as much as 500 million years of co-evolution) [17] and interesting in that cGAS is a low-activity enzyme while STING is a particularly avid binder of the cGAS product (Kd ~4 nM). [18] It has also been noticed by multiple investigators that cGAS produces linear homo- and hetero-dinucleotides, including the unusual 2′,5′-phosphodiester linked products GMP-2′-GTP and AMP-2′-GTP. [19][20] GMP-2′-GTP is thought to be the product of a side reaction, while AMP-2′-GTP is presumed to be the catalytic intermediate required for 2′,3′-cGAMP production; if true, this is a striking phenomenon, as catalytic intermediates are seldom abundantly produced under physiological conditions, yet that appears to be the case for AMP-2′-GTP.
OAS1, a paralog of cGAS, produces 2′,5′-oligoadenylate as a secondary messenger during ds-RNA-induced innate immunity [21]; similarly, the 2′,5′-phosphodiester link in GMP-2′-GTP and AMP-2′-GTP may distinguish them for as yet unrecognized roles. To better understand the role of cGAS and STING, we undertook a study of the cGAS enzymatic mechanism and its production of linear and cyclic dinucleotides. We report here a novel SPR-based kinetic analysis of cGAS, which in combination with HPLC, MS, and NMR assays suggests the majority of the cGAS enzymatic process is a futile cycle in terms of STING activation. We also present multiple structures of cGAS bound to dinucleotides and small molecule inhibitors that allow us to expand on the existing theory for cGAS activation and propose targeted sites for inhibitor or activator binding.

Results
Interaction with Asp 227 causes catalytic acid alignment
ds-DNA binding causes two major changes to the apo cGAS secondary structure. The first is that residues Gly 207-Val 218 (Homo sapiens numbering for changes seen in Mus musculus (PDB 4O6A) [22] and Sus scrofa (PDB 4KB6) [23]) change from disordered to a regular secondary structure (β-strand between Gly 207-Asn 210, α-helix between Gly 212-Val 218), and the second is a ~1 Å shift of the β-sheets containing the catalytic acids (Glu 225, Asp 227, and Asp 319) towards the active site (Fig. 1). In the absence of ds-DNA, human cGAS can adopt a cyclic dinucleotide-dependent structure similar to the second of these structural changes, where the catalytic acid-containing β-sheets have moved towards the active site (see PDB 4O67 and 4O69) [22] while residues Gly 207-Val 218 remain disordered. Since only the shift in the β-sheets has occurred in this dinucleotide-dependent structural change, we shall distinguish this conformation from the fully active form, referring to it as "β-pseudo-active" for the changes in the β-sheets. To study the β-pseudo-active form we obtained structures of an N-terminal truncation of cGAS starting at residue 161 (cGAS 161) [22] in complex with five cyclic dinucleotides (2′,2′-cGAMP, 2′,3′-cGAMP, 3′,3′-cGAMP, 3′,3′-cdIMP and 3′,3′-cdUMP), and the linear 2′,5′-GpAp dinucleotide (Supporting Information Fig. S1 and Supporting Information Tables S1 and S2). In four of these structures, cGAS assumes the β-pseudo-active conformation (Table I).
There is always an interaction between the catalytic acid Asp 227 and the dinucleotide when the β-pseudo-active conformation occurs. In most cases Asp 227 forms a hydrogen bond with the amino group of the guanine base, with an approach between 2.5 and 3.3 Å (Fig. 1). The exception is 3′,3′-cdIMP, which does not have an amino group on its base but does interact with Asp 227 through the 2′-OH guanine ribose (3.2 Å).
Most of these dinucleotides have affinities weaker than 500 µM (the top concentration tested). 2′,3′-cGAMP and 3′,3′-cGAMP are more tightly bound than the other dinucleotides, and we were able to detect a modest (~2-fold) increase in affinity for both between apo cGAS 161 and ds-DNA-bound cGAS (Table I, Supporting Information Fig. S2). The change in affinity is consistent with the preordering of Asp 227 by ds-DNA.
The substrate orientation is thermodynamically preferred for 2′,3′-cGAMP
Structures of ds-DNA bound to cGAS show that ds-DNA packs against residues Gly 207-Val 218, inducing their active conformation. [22][23] In general, Gly 207-Val 218 do not adopt a regular secondary structure without ds-DNA, though strong electron density can be seen for these residues with some ligands. In our 2′,3′-cGAMP structure, the electron density is sufficient to model Gly 207-Val 218. Comparison to the existing human cGAS 2′,3′-cGAMP-bound structure (PDB 4O67) shows good agreement; the only exception is the modeling of residues Ile 220 and Ser 221. In our structure we observe a close contact between Lys 219 and Ala 222 that could be modeled as a continuation of the main chain into a β-turn, which is the case for 4O67; however, the electron density of side chains in our structure makes it clear there is actually a break in the main chain and not a β-turn. 4O67 is the only structure of cGAS from any organism that models a β-turn at Ile 220 and Ser 221 (Supporting Information Fig. S3).
Our structure, 4O67, and a M. musculus structure with both ds-DNA and 2′,3′-cGAMP (PDB 4K9B) [19] all model 2′,3′-cGAMP with the adenine base above Tyr 436 and the guanine base near Leu 209. This is the same position seen for the adenine and guanine bases in the structure of substrate-bound cGAS (PDB 4KB6); we therefore refer to this base orientation for 2′,3′-cGAMP as the "substrate orientation".
Tyr 436 and Arg 376 form a binding site for aromatic rings
A screen of the Pfizer fragment chemical library discovered several binders of cGAS. Fragment screens are run to find low-affinity chemical leads, which are subsequently developed into more potent drug-like inhibitors. All the fragments described here bind at the active site and are weak inhibitors. The development of one of our fragment hits from a weak (~200 µM) to a potent (~200 nM) inhibitor is described elsewhere. [25] We detail here three fragment hits for their insights into the cGAS activation mechanism and direct interested readers to our other publication for the development of a fragment hit to a potent inhibitor of cGAS. [25] Each of the fragment binders shown here (F1-F3) is small (~150-250 Da) with weak affinity (~100-300 µM) (Supporting Information Table S3 and Supporting Information Fig. S2). The Fo-Fc maps for F2 and F3 are not sufficient to unambiguously define atomic positions, which seems to be related to multiple modes of binding and internal pseudo-symmetry. Indeed, the dinucleotide structures, which have specific interactions that define their positions, show far less ambiguity in unbiased Fo-Fc maps, despite these dinucleotides being generally weaker binders than the fragments and at comparable resolutions (Fig. 1 and Supporting Information Fig. S1).
Each compound binds to the same site formed between the side chains of Tyr 436 and Arg 376 (Fig. 1). This site is occupied by the adenine base in structures of ATP, 2′,2′-cGAMP, 2′,3′-cGAMP, or 3′,3′-cGAMP, and binding is composed primarily of London dispersion interactions between the ring system of these binders and the phenyl ring of Tyr 436. These interactions suggest this site could accommodate most 2- or 3-ring aromatic systems, which is consistent with this site needing to bind both the adenine and guanine bases during catalysis. Additionally, we observe that in all our cyclic dinucleotide structures, this site had better defined electron density than the other nucleobase site, even for identical ring systems like 3′,3′-cdIMP and 3′,3′-cdUMP. The binding site formed by Tyr 436 and Arg 376 is therefore a site of binding with broad specificity for aromatic rings, be they nucleobases or small molecule fragments.
cGAS can form an α-helix at Gly 212-Val 218 in the absence of ds-DNA
Compounds F1-F3 bind distant (~10 Å) from Asp 227 and do not elicit the β-pseudo-active conformation. For compounds F1 and F2, one chain of the asymmetric unit (chain A) has strong electron density for Gly 207-Val 218, which adopts a short β-strand at Gly 207-Asn 210 and an α-helix at Gly 212-Val 218, while the other chain in the asymmetric unit does not. Compound F3 has the same conformation but weaker density for these residues. Interestingly, while 3′,3′-cdUMP does not cause a β-pseudo-active conformation, one molecule of the asymmetric unit (chain B) adopts the same conformation for Gly 207-Val 218 seen for compounds F1-F3. Thr 211 of the short β-strand is important for coordinating the nucleobase and determining the specificity of the 2′- or 3′-OH bond formation [20], while Ser 213 of the α-helix may bind phosphates of the linear intermediate (see PDB 4K99 and 4K9A). [19] These structural changes are therefore critical for cGAS activity.
The formation of the short β-strand and α-helix occurs in the ds-DNA-bound structures of cGAS, yet there are no molecular contacts to explain their presence in these structures. Thus, these data seem to describe a naturally occurring propensity for this conformation, which is enriched in one of the two chains in the asymmetric unit of these crystals. These structures demonstrate the ability of cGAS to form an active-like conformation for Gly 207-Val 218 while the catalytic acid-containing β-sheets remain in the inactive pose. This is the opposite of the effect seen for the β-pseudo-active conformation. We call this second pseudo-active conformation "α-pseudo-active" in reference to the α-helix at residues Gly 212-Val 218. That distinct α- and β-pseudo-active states exist suggests these two conformations are either independent or mutually exclusive, requiring DNA binding to coordinate both conformational transitions (Fig. 2).

Table I footnotes: (a) Data are the average and standard deviation of two or more SPR experiments. (b) Data were not determined (ND).

An SPR-based enzymatic assay to determine kinetic constants
The Biacore T200 SPR microfluidics were designed with a serpentine flow; samples injected must pass along all four flow cells during data collection. We reasoned that if we immobilized apo cGAS, ds-DNA bound cGAS, and STING 155-341 in series, we could inject ATP and GTP, detect binding to apo cGAS, form 2′,3′-cGAMP with ds-DNA bound cGAS, and finally detect 2′,3′-cGAMP using STING 155-341 as a sensor [Fig. 3(a)].
KM values reflect the steady-state equilibrium between free enzyme and all other enzyme species. In theory, KM values for DNA-bound cGAS 161 could be determined by directly monitoring the SPR response of this channel as a function of ATP and GTP concentration. However, the ds-DNA bound cGAS 161 channel had a near-zero resonance unit (RU) response at all concentrations of ATP and GTP, which made a direct KM calculation from the cGAS channel impossible. Monitoring the STING 155-341 response showed 2′,3′-cGAMP was being produced by ds-DNA bound cGAS; thus we used the STING response to determine KM in the absence of a direct cGAS response. Though STING was necessary for monitoring substrate turnover in this system, we suspect direct analysis of the change in RU of the enzyme channel should work for other systems.
Although binding is generally seen as a positive increase in SPR response, it has been observed that for some proteins a negative signal can occur, which is attributed to conformational changes associated with binding. [26] The ds-DNA bound cGAS response appears to report the summation of positive (mass accumulation) and negative (conformational change) responses occurring during catalysis. Since the balance of mass accumulation and conformational change should differ between enzymes, we suspect this zero-sum phenomenon will not occur for other enzymes.
We have determined there are criteria that must be met for SPR to be used for activity assays (see "Discussion" section). One of the most important is that the sensor protein must bind and give a measurable response for the desired analyte while being insensitive to other chemical matter present in the assay samples (see below). STING binds 2′,3′-cGAMP with nM affinity (4 nM reported Kd) [18] and gave a clear positive response when titrated with commercial 2′,3′-cGAMP standards. No response was observed when ATP or GTP was injected in the absence of the other nucleotide, indicating STING 155-341 does not bind ATP or GTP at the concentrations used here.
AMP-2′-GTP has been previously observed in cGAS reactions and was an important clue to seminal studies on the cGAS enzymatic mechanism [Fig. 4(f)] [19], where it was described as a pathway intermediate. To determine if STING 155-341 binds the intermediate, we used full length cGAS to prepare four reaction mixtures with variable concentrations of 2′,3′-cGAMP and AMP-2′-GTP. Samples were tested in SPR and the 2′,3′-cGAMP concentration was determined from the STING response compared with 2′,3′-cGAMP standards. The total SPR response on the STING channel was in close agreement with the expected result for total 2′,3′-cGAMP as measured by NMR, with no additional response observed from AMP-2′-GTP (Table II), even when the intermediate was present at 10-fold excess over 2′,3′-cGAMP. The simplest explanation for these data is that AMP-2′-GTP does not bind to STING 155-341 at the top concentrations tested here (~50 µM). We therefore concluded that STING can be used as a selective sensor for 2′,3′-cGAMP without interference from ATP, GTP or AMP-2′-GTP.
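As an illustration of reading an analyte concentration off a sensor response by comparison with standards, the following is a minimal sketch that assumes simple piecewise-linear interpolation between standards. The function name and the standard-curve values are hypothetical and are not taken from the measurements reported here; the calibration actually used may differ.

```python
def concentration_from_standards(response_ru, standards):
    """Interpolate an analyte concentration from a sensor response.

    standards: list of (concentration, response) pairs from known samples.
    Assumes the response rises monotonically with concentration.
    """
    points = sorted(standards)
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= response_ru <= r_hi:
            if r_hi == r_lo:
                return c_lo
            fraction = (response_ru - r_lo) / (r_hi - r_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("response lies outside the range of the standards")


# Hypothetical standard curve: (concentration in µM, response in RU).
standards = [(0.0, 0.0), (10.0, 5.0), (20.0, 8.0)]
estimate = concentration_from_standards(6.5, standards)
```

Interpolation avoids assuming a particular binding model, at the cost of requiring standards that bracket every unknown.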
KM values for 2′,3′-cGAMP production were determined using SPR as well as HPLC, MS, and NMR assays. Values compiled in Table III show good agreement between techniques, and between both full length cGAS and cGAS 161.
Using the SPR method, the apparent ATP KM value was determined using a range of GTP concentrations. Using an extra sum-of-squares F-test, a GTP concentration-independent KM,ATP value was supported over a GTP-dependent KM value (P = 0.18), suggesting a lack of cooperativity between ATP and GTP binding. Similarly, ATP binding to apo cGAS 161 showed no dependence on the concentration of GTP. Similar Kd and KM values for ATP (235 ± 97 and 190 ± 20 µM) are consistent with a subsequent step occurring at a rate slower than the rate of substrate association/dissociation (e.g. slow conformational changes or chemistry). We observed no RU signal for GTP binding to apo cGAS 161 (up to 1 mM GTP), including conditions where ATP was pre-bound to cGAS 161 (up to 2 mM ATP). The simplest explanation of these data is that apo cGAS 161 does not bind GTP tightly; however, the absolute RU change for ATP was muted compared with nonsubstrate analytes [Supporting Information Fig. S2(E)]. We therefore cannot rule out that GTP binding results in a conformational change and a net-zero RU signal.
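For readers unfamiliar with the extra sum-of-squares F-test, the statistic compares a simpler nested model (here, one shared KM) against a more complex one (here, a GTP-dependent KM). The sketch below computes only the F statistic; the function name and the example numbers are hypothetical and are not the values from our fits.

```python
def extra_sum_of_squares_f(ss_simple, df_simple, ss_complex, df_complex):
    """F statistic for comparing two nested least-squares fits.

    ss_*: residual sum of squares of each fit.
    df_*: residual degrees of freedom (data points minus fitted parameters).
    The simpler model has fewer parameters, so df_simple > df_complex.
    """
    if df_simple <= df_complex:
        raise ValueError("the simpler model must have more degrees of freedom")
    numerator = (ss_simple - ss_complex) / (df_simple - df_complex)
    denominator = ss_complex / df_complex
    return numerator / denominator


# Hypothetical residuals: shared-KM fit versus GTP-dependent-KM fit.
f_stat = extra_sum_of_squares_f(12.0, 10, 10.0, 8)
```

A P value would then be obtained from the upper tail of the F distribution with (df_simple - df_complex, df_complex) degrees of freedom; a large P value, such as the one reported above, favors the simpler model.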

Substrate inhibition occurs from competitive side reactions
ATP titrations show a clear substrate inhibition pattern for both cGAS 161 in SPR, and full length cGAS in HPLC and MS assays. Using SPR, the apparent KIS value increased with increasing GTP concentrations. However, no substrate inhibition was seen at up to 2 mM ATP when the ratio of ATP and GTP was constant, suggesting substrate competition occurs between ATP and GTP.
The apparent substrate inhibition is most simply explained by the two nucleotides competing with each other to form AMP-3′-ATP or GMP-2′-GTP instead of 2′,3′-cGAMP. In agreement with this, cGAS has been observed to produce both AMP-3′-ATP and GMP-2′-GTP (Supporting Information Fig. S4). [19][20] We found GTP behaved as a competitive inhibitor of AMP-3′-ATP formation using cGAS 161 or full length cGAS in HPLC titrations. This substrate inhibition is the result of a random-ordered, bi-substrate reaction, where one reactant can compete with the second for binding. This also explains the GTP-concentration dependence of the ATP KIS values. As mentioned earlier, when ATP and GTP are kept at a constant molar ratio, no substrate inhibition was observed because the ratio of 2′,3′-cGAMP to homo-dinucleotides would also remain constant. The apparent KM for AMP-3′-ATP formation in the absence of GTP was 3700 ± 1200 µM.
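The qualitative behavior described above can be reproduced with a deliberately simplified two-site sketch: ATP must occupy one site and GTP the other for productive turnover, and ATP can also compete for the GTP site. This is an illustrative assumption, not the fitted kinetic model, and all constants below are hypothetical.

```python
def relative_rate(atp, gtp, k_atp=200.0, k_gtp=100.0, k_i=1000.0):
    """Relative rate of hetero-dinucleotide formation (arbitrary units).

    atp, gtp: concentrations in µM.
    k_atp, k_gtp: hypothetical half-saturation constants for each site.
    k_i: hypothetical constant for ATP competing at the GTP site.
    """
    site_1 = atp / (k_atp + atp)
    # ATP raises the apparent half-saturation constant for GTP.
    site_2 = gtp / (k_gtp * (1.0 + atp / k_i) + gtp)
    return site_1 * site_2


atp_series = [100.0, 300.0, 1000.0, 3000.0, 10000.0]
# GTP held constant: the rate rises, then falls (apparent substrate inhibition).
fixed_gtp = [relative_rate(a, 100.0) for a in atp_series]
# GTP kept at a constant ratio to ATP: the rate rises without a downturn.
fixed_ratio = [relative_rate(a, 0.25 * a) for a in atp_series]
```

In this sketch the downturn disappears at a fixed ATP:GTP ratio because the competing term and the productive term grow together, which mirrors the argument made in the text.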
Linear homo- and hetero-dinucleotides are the major initial products of cGAS
Though we first identified the formation of AMP-3′-ATP and GMP-2′-GTP in HPLC and MS by running control reactions with single nucleotide substrates, we could also observe both products in MS experiments under physiologically relevant concentrations of ATP and GTP (2 mM and 0.5 mM) [27] with cGAS 161 or full length cGAS. Exact dinucleotide concentrations were not possible to determine lacking MS ionization controls, but peak intensities suggest the homo-linear products are ~10% of the total reaction (Supporting Information Fig. S4). Similarly, the , and at physiological substrate concentrations, [27] comprised ~20% of the product compared with 2′,3′-cGAMP for full length cGAS. The GMP-2′-GTP peak was only observed in HPLC in the absence of ATP, suggesting this product is not as readily formed compared with AMP-3′-ATP. In addition to the homo-linear products, we observed the production of linear AMP-2′-GTP using continuous reaction monitoring of full length cGAS in NMR and reaction-arrested MS and HPLC. In our studies, AMP-2′-GTP is produced in significant abundance initially, while 2′,3′-cGAMP is not [Fig. 4(a-c)]; this result is independent of the cGAS form used in the study (i.e. cGAS 161 or full length). These data strongly suggest a mechanism where the majority of substrate flows through AMP-2′-GTP, which is released to solution and then must rebind in competition with ATP and GTP to form 2′,3′-cGAMP. Consistent with this mechanism, there is a lag in the initial rate of 2′,3′-cGAMP formation, but no lag in the initial rate of AMP-2′-GTP formation at saturating ATP and GTP concentrations (Fig. 4).
cGAS can produce 2′,3′-cGAMP without releasing AMP-2′-GTP to solution
In a closed system (such as an NMR tube), AMP-2′-GTP released from the enzyme will accumulate until it reaches a concentration sufficient to compete with ATP and GTP for rebinding to free enzyme. Thus, end point activity assays may miss the difference in AMP-2′-GTP and 2′,3′-cGAMP concentrations at early time points, while continuous methods like NMR will not. In contrast to a closed tube, the continuous flow through the SPR system limits the concentration of AMP-2′-GTP that can accumulate according to the rate of production and the rate of flow through the cell. Thus, as ATP and GTP concentrations increase past their KM in SPR, they will competitively block AMP-2′-GTP rebinding to cGAS, decreasing 2′,3′-cGAMP production. In effect this would look like substrate inhibition. To separate substrate inhibition of 2′,3′-cGAMP production due to ATP and GTP competing with AMP-2′-GTP from inhibition by side reactions (e.g. AMP-3′-ATP and GMP-2′-GTP formation), we used a fixed ratio of ATP to GTP with final concentrations up to 25-fold their KM and did not observe inhibition of 2′,3′-cGAMP production [Fig. 4(e)]. These data are consistent with a catalytic mechanism where AMP-2′-GTP is not subject to ATP or GTP competition; this is most easily explained by a mechanism where AMP-2′-GTP is not released but instead reorients on the enzyme.
These SPR data appear to be in conflict with the NMR data, which demonstrate a large accumulation of AMP-2′-GTP in solution. Reconciliation of the NMR and SPR data results in a mixed mechanism, where the majority of AMP-2′-GTP dissociates and can rebind after enough has accumulated to compete, but where a minor portion reorients directly on-enzyme to form 2′,3′-cGAMP in a process that is not competitive with ATP or GTP [Fig. 4(f), red arrows].
NMR experiments using full length cGAS and physiological concentrations of ATP and GTP (2 and 0.5 mM, 10-fold their KM) [27] show ~250 µM AMP-2′-GTP accumulates at steady-state [Fig. 4(a)]. These data suggest either slow chemistry occurs for AMP-2′-GTP loss relative to formation, or the KM of AMP-2′-GTP is around 25 µM. Consistent with a weak KM, we did not observe binding for AMP-2′-GTP at a top concentration of 50 µM using apo cGAS 161 in SPR.
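The arithmetic behind the ~25 µM estimate can be made explicit. The sketch below assumes, as one reading of the estimate, that the intermediate accumulates to the same fold-excess over its KM as the substrates; the function name is hypothetical.

```python
def km_from_fold_excess(steady_state_conc_um, fold_over_km):
    """Estimate a KM (µM) from a steady-state concentration.

    Assumes the species sits at the same fold-excess over its KM as the
    substrates do over theirs, which is an assumption of this sketch.
    """
    return steady_state_conc_um / fold_over_km


# ~250 µM AMP-2'-GTP at steady state; substrates at ~10-fold their KM.
km_estimate_um = km_from_fold_excess(250.0, 10.0)
```

If the formation and depletion steps differ in catalytic efficiency, this proportionality would not hold and the estimate would shift accordingly.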
Despite the apparent weak affinity of AMP-2′-GTP, and its competition with ATP and GTP, this may not have a significant biological effect, since STING is a tight binder of 2′,3′-cGAMP (4 nM Kd) [18] and requires only a small amount to cause signaling. The SPR results demonstrate that a mechanism exists where a fraction of AMP-2′-GTP is not subject to ATP or GTP competition, such as through reorientation on the enzyme. Based on the full length NMR steady-state concentration of AMP-2′-GTP, a rough approximation suggests that if the fraction of AMP-2′-GTP that stays on-enzyme is >0.1%, the on-enzyme path should produce the requisite levels of 2′,3′-cGAMP needed for STING sensing before enough AMP-2′-GTP accumulates to compete with ATP and GTP for enzyme rebinding. Thus AMP-2′-GTP appears to be the major initial product of cGAS, and its release is a futile cycle in terms of 2′,3′-cGAMP production and STING activation.

Discussion
During catalysis cGAS must accommodate a swap of adenine and guanine nucleobases in its active site. This is enabled by Tyr 436 and Arg 376, which create a binding site for disparate aromatic rings. Since this site is essential for binding of both substrates and intermediate, small molecules binding this site cause enzyme inhibition (Fig. 1). [25][28]

When analyzing the inactive-to-active transition of cGAS, we see there are α- and β-pseudo-active conformations that mimic the true ds-DNA-dependent active state. The α-pseudo-active state occurs without clear ligand provocation, suggesting it may be a regularly occurring state in the absence of ds-DNA, whereas the β-pseudo-active state is observed when Asp 227 is engaged.
In the absence of ds-DNA, we only observe α- and β-pseudo-active states separately, suggesting these states are either independent or mutually exclusive. Our structural analysis supports a model where α- and β-pseudo-active states are mutually exclusive, which is consistent with a model for activation proposed by Civril et al. [23] According to this model, when ds-DNA binds, it breaks the long N-terminal helix of cGAS into two daughter helices (α1 and α2) at Ser 175 (Fig. 2). This break is observed in ds-DNA-bound structures where the positive dipole of the α2 helix interacts with the DNA backbone [19][23], but also occurs in the cyclic dinucleotide-induced β-pseudo-active states. The potential importance of the long N-terminal helix formed the basis for a Leu174Asn mutant by Civril et al. (in recognition of its importance they refer to this helix as the "spine" of cGAS), where they hypothesized the helical break allows Leu 174 to stabilize the active form through the formation of an α-helix at Gly 212-Val 218. Strikingly, they showed Leu174Asn is able to bind DNA but no longer produces 2′,3′-cGAMP; these results are consistent with the idea that α- and β-pseudo-active conformations are mutually exclusive. Similarly, others have found amino acid substitutions along this helix (e.g. Lys173Ala/Arg176Ala in H. sapiens or Arg158Ala in M. musculus) greatly reduce catalytic activity. [23][24]

In the α-pseudo-active structures, the long N-terminal helix is intact and the single-turn α-helix is formed at Gly 212-Val 218. When ds-DNA breaks the long N-terminal helix, it also positions the C-terminal end of the daughter helix α2 towards ds-DNA, aligning Gly 207-Asn 210 to form a short β-strand with Val 228-Lys 231.
The importance of the short β-strand for cGAS activity has been described by others, who liken it to the "activation loop" of kinases. [22] Though ds-DNA binding may also help form the α-helix at Gly 212-Val 218 through packing, we consistently observe that the short β-strand at Gly 207-Asn 210 occurs with the formation of the α-helix at Gly 212-Val 218, suggesting these structures are linked. In the β-pseudo-active structures, daughter helix α2 is pulled away from the ds-DNA-binding site, suggesting the ds-DNA interaction is needed to position the daughter helix α2 and facilitate the formation of the β-strand at Gly 207-Asn 210. Thus, α- and β-pseudo-active states seem mutually exclusive, with the long N-terminal helix acting like a spring to pull the α- and β-pseudo-active states apart, an effect that is removed when ds-DNA binding breaks this helix (Fig. 2).
The α- and β-pseudo-active states observed here are therefore consistent with existing data and support the Civril et al. model for ds-DNA activation.
Interestingly, the 4KB6 structure shows Asp 227 (Asp 202 in S. scrofa numbering) interacting with Mg2+ and the α-phosphate oxygens of ATP. Since the physiological concentration of ATP is ~10-fold its Kd or KM, these data suggest cGAS may be in the β-pseudo-active state in cells, with the long N-terminal helix broken. If so, the element missing for cGAS activation would be the induction and alignment of the short β-strand at Gly 207-Asn 210 and the α-helix at Gly 212-Val 218. We therefore propose efforts to stimulate the innate immune response through cGAS should focus on the discovery of binders that facilitate the formation of the short β-strand at Gly 207-Asn 210 and the α-helix at Gly 212-Val 218. Since we have always observed the short β-strand at Gly 207-Asn 210 and the α-helix at Gly 212-Val 218 to occur simultaneously in our α-pseudo-active states, it may be possible to induce both these structures by stabilizing the short β-strand at Gly 207-Asn 210 in the presence of high concentrations of ATP. We therefore envision a small molecule binder that could mimic the phosphate backbone interactions of ds-DNA in aligning the end of the daughter helix α2 after the N-terminal helix is broken; this should induce a short β-strand at Gly 207-Asn 210, thus stimulating cGAS activation at high ATP concentrations.
In addition to structural studies, we have engaged in an analysis of the catalytic mechanism of cGAS. These studies include a novel SPR-based enzymatic assay that should be applicable to other systems (Fig. 3). In these experiments, ds-DNA bound cGAS did not show a direct RU effect during catalysis, necessitating the use of STING 155-341 as a sensor protein for 2′,3′-cGAMP. Although it is possible to use other binding sensors, such as antibodies tailored to detect specific analytes, a prior SPR-based catalytic assay showed a non-zero RU effect for an enzyme undergoing catalysis [29]; it is therefore reasonable to speculate that a direct use of SPR for enzymology studies should be possible for other systems.
When compared with the NMR and HPLC assays used here, SPR has the advantage of being a continuous and relatively quick data collection method. Its disadvantages are that it cannot separate side reactions from the main reaction without associated sensor proteins (e.g. STING 155-341 here), and it is hard to quantitate turnover rates. Furthermore, since SPR systems have a continuous flow, their use for enzymology must meet certain constraints: (1) as KM is a steady-state measurement, the analyte produced must reach steady-state response (RU plateau) within the experiment; (2) the same flow rate must be used for all injections; and (3) the reaction cannot proceed exclusively through intermediates with weak affinities that will be lost to the flow.

cGAS produces AMP-2′-GTP as an intermediate to 2′,3′-cGAMP. Full length cGAS experiments using NMR and HPLC clearly demonstrate that AMP-2′-GTP is a significant initial product of cGAS in a closed reaction vessel. These data are consistent with prior studies which also show linear dinucleotides are produced by cGAS [19][20]; however, these data are distinct in that they are from continuous assays capable of probing early time points to distinguish the relative rates of formation of linear versus cyclic dinucleotide. Furthermore, using SPR we demonstrate 2′,3′-cGAMP can be produced through a process that is not competitive with ATP and GTP [Fig. 4(e)]. The simplest explanation of the NMR, HPLC and SPR data is that while the majority of AMP-2′-GTP product is released, a portion can instead reorient in the cGAS active site to produce 2′,3′-cGAMP in a mechanism that is not competitive with ATP or GTP.
NMR and HPLC experiments using full length cGAS demonstrate that AMP-2′-GTP or linear homonucleotides are the major initial products at physiological concentrations of ATP and GTP [Fig. 4(d)]. To our knowledge no one has yet demonstrated the presence of AMP-2′-GTP in cells, though many have now demonstrated its presence in vitro. Our protein was produced without any post-translational modifications; it is possible that such modifications, or multiple localization of cGAS upon long segments of ds-DNA in cells, 30 could lead to a lower fraction of linear nucleotides produced. Indeed, our SPR data could be thought of as a mimic for the ds-DNA localized condition. However, lacking cell data, we can still say these HPLC and MS data show linear homodinucleotides are readily produced at high (> 1 mM) concentrations of a single nucleotide and do not elicit an SPR binding response for STING155-341, suggesting STING155-341 does not bind AMP-3′-ATP or GMP-2′-GTP. Furthermore, neither apo cGAS161 nor STING155-341 binds AMP-2′-GTP at up to 50 µM. Although weak AMP-2′-GTP binding may be dismissed as an artifact of cGAS truncation or the SPR system, these data are borne out by an apparent weak affinity for ds-DNA-bound full length cGAS in solution NMR experiments. That AMP-2′-GTP has relatively weak affinity for cGAS is supported by its steady-state concentration of 250 µM at physiologically relevant (~10-fold their KM) ATP and GTP concentrations [Fig. 4(a)]. 27 If the catalytic efficiencies (kcat/KM) of AMP-2′-GTP formation and depletion are equivalent, the intermediate would have a KM around 25 µM.
The apparent reconciliation of the weak affinity of AMP-2′-GTP with the importance of producing 2′,3′-cGAMP for immune signaling is that STING is a strong binder of 2′,3′-cGAMP; thus only a small amount of 2′,3′-cGAMP is needed to activate STING. SPR shows that a portion of the cGAS161 reaction occurs without competition with ATP and GTP. If the portion of 2′,3′-cGAMP produced through a noncompetitive mechanism is greater than 0.1% of the cellular concentration of ATP (~2 mM), this minor portion will produce the requisite concentration of 2′,3′-cGAMP needed for STING signaling before the AMP-2′-GTP released from cGAS can accumulate sufficiently to compete with ATP and GTP for enzyme rebinding.
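As a quick check of the threshold quoted above, 0.1% of a roughly 2 mM ATP pool corresponds to about 2 µM 2′,3′-cGAMP. The short sketch below only restates that arithmetic; the variable names are illustrative, and the fraction is treated as a simple proportion of the ATP concentration, which is an assumption about how the threshold is meant to be read.

```python
# Back-of-envelope conversion of the 0.1% threshold into a concentration.
# Both input values are taken from the text; the names are hypothetical.
atp_cellular_mM = 2.0        # approximate cellular ATP concentration
fraction_to_cgamp = 0.001    # 0.1% of the ATP pool

# Convert mM to uM (x1000), then take the stated fraction.
cgamp_uM = atp_cellular_mM * 1000.0 * fraction_to_cgamp
print(f"{cgamp_uM:.1f} uM 2',3'-cGAMP")
```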
Given their unusual 2′,5′-phosphodiester bond, which would distinguish them from normal RNA-like oligos, there is a possibility that AMP-2′-GTP and GMP-2′-GTP have a non-STING binding partner. However, until such a partner is identified, it would seem that the large amount of linear homo-nucleotides and AMP-2′-GTP released compared with 2′,3′-cGAMP is part of a futile cycle for cGAS. In conclusion, given the low basal activity of cGAS, 18,19,31 its apparent bias to not bind GTP, and therefore not produce the unusual 2′,5′-phosphodiester bond, and that much of its activity seems to be lost in apparent futile cycles involving homo- or hetero-dinucleotide release, this enzyme seems to contain several levels of control to limit 2′,3′-cGAMP production in the absence of stimulatory ds-DNA.

Protein expression
The genes for full length H. sapiens cGAS, N-terminal truncated cGAS beginning at residue 161 (cGAS161), and H. sapiens STING residues 155 through 341 (STING155-341) were ordered from GeneWiz (South Plainfield, NJ). Genes were cloned into pET28 containing an N-terminal SUMO-HIS6 tag, a ULP1 cleavage site, a BIRA recognition sequence, and a TEV cleavage site.
Escherichia coli BL21(DE3) cells transformed with the above constructs were grown in LB medium (Invitrogen) at 37 °C to an OD600 of 0.8 before inducing protein expression with 0.1 mM isopropyl-1-thio-β-D-galactopyranoside at 15 °C for 16-20 h. Harvested cells were suspended in 20 mM HEPES pH 7.5, 1 M NaCl, 10% (v/v) glycerol, 30 mM imidazole and 1 mM TCEP, and gently lysed using a Branson Ultrasonic Disintegrator (VWR Scientific Products, Chicago, IL) with seven rounds of 10 s sonication at a 10% duty cycle, separated by 50 s rest periods.
The soluble fraction was separated by centrifugation (30,000 RCF, 1 h), applied to a HisTrap FF column (GE Healthcare), washed with 10 column volumes of buffer containing 20 mM HEPES pH 7.5, 250 mM NaCl, 10% glycerol, 1 mM TCEP and 30 mM imidazole, then eluted in the same buffer but with 300 mM imidazole and 300 mM NaCl. The protein was concentrated using a 10 kDa MWCO Amicon spin column (Millipore) and buffer exchanged into 50 mM HEPES pH 7.5, 250 mM NaCl, 10% (v/v) glycerol and 1 mM TCEP. For SPR studies, cGAS161 or STING155-341 was treated with ULP1 and BirA ligase (Avidity) to generate N-terminal biotin-tagged protein; otherwise, samples were incubated for 16-20 h with TEV protease (Life Technologies) to liberate untagged protein. Protease-treated samples were passed through a HisTrap FF column to remove tags and residual tagged protein.
Full length cGAS or cGAS 161 was applied to a Heparin FF column in 20 mM HEPES pH 7.5, 250 mM NaCl, 10% glycerol, and 1 mM TCEP, then eluted with a gradient of 0.25-1 M NaCl. All proteins were subjected to a final purification step using a HiLoad Superdex75 column (GE Healthcare) in 20 mM HEPES pH 7.5, 150 mM KCl and 1 mM TCEP. Protein purity was verified by SDS-PAGE and ESI-TOF mass spectrometry.

Crystallization and structure determination
Dinucleotides were purchased from InvivoGen. Compounds F1, F2 and F3 were discovered as the result of an NMR-based fragment screen (see Hall et al. for a general description of the library and methods 32). Crystals of cGAS161 were grown using conditions similar to a previous report. 31 Protein was concentrated to 6 mg/mL, and then mixed at a 2:1 ratio with PEG 3350 (18-20% v/v), 0.2 M ammonium citrate pH 7 in a sitting drop well at 4 °C. Rod-shaped crystals were observed within 2 days, and grew to their final size within 5-7 days. Cryoprotectant was made using mother liquor at a final concentration of 23% PEG 3350. Compounds were dissolved into cryoprotectant at 50 mM, and soaks were performed at 4 °C for 5-10 min. Crystals were flash frozen in liquid nitrogen, and data were collected at the Argonne National Lab (IMCA) beamline. Data were scaled and merged using AIMLESS. 33 Initial phases were obtained by molecular replacement (MR) using PDB 4LEV in PHASER. 34 Refinement was performed using BUSTER-TNT and Phenix Refine. Omit maps were calculated using Phenix Refine with simulated annealing.

HPLC assays
GTP titration experiments were performed using 500 nM full length cGAS and 1 µM interferon stimulatory ds-DNA (ISD) (Integrated DNA Technologies) in 10 mM HEPES pH 7.5, 140 mM NaCl, 5 mM MgCl2 and 0.01% Tween-20 at 37 °C. ATP was held at 1.1 mM, while GTP was titrated down from 1.1 mM to 80 µM using a 60% dilution series. Reactions were quenched with 50 mM EDTA and separated on a Zorbax SB-C8 column (5 µm, 4.6 × 150 mm) using a methanol-phosphate gradient (buffer A: 20 mM potassium phosphate, pH 6.0; buffer B: equal volumes methanol and buffer A) at 35 °C. Peaks were identified using ATP, GTP, and 2′,3′-cGAMP chemical standards. 2′,3′-cGAMP peak areas were converted to molar concentrations using a standard curve. Formation of this product was fit to a time-dependent approach to steady state using equation (1), where V1 is the initial rate and was constrained to zero, kobs is the rate constant for the approach to the steady-state rate (V2), and t is time. Steady-state values were analyzed for substrate-dependent inhibition using equation (2), where Vobs is the observed reaction velocity at substrate concentration S, Vmax is the theoretical maximal rate, and KM and KIS are the apparent Michaelis constant and the apparent inhibition constant. While titrating GTP, a second product peak was identified with an inverse dependence on the GTP concentration. The maximal rate of production was seen in the absence of GTP, suggesting the peak was the linear AMP-3′-ATP product reported by Gao et al. 19 (Supporting Information Fig. S4). The GTP dependence of the rate of AMP-3′-ATP formation was fit to a standard half-maximal inhibitory concentration (IC50) model, described in equation (3), where V0 is the observed rate in the absence of GTP. Additionally, cGAS was titrated with 0.3-3 mM ATP in the absence of GTP. The rate of AMP-3′-ATP formation showed a hyperbolic concentration dependence, and was fit to the Michaelis-Menten equation.
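The three fitting models are described in the text only through their parameters. The sketch below assumes the standard textbook form for each: a hysteretic progress curve for equation (1), a substrate-inhibition rate law for equation (2), and a simple hyperbolic IC50 curve for equation (3). The function names and the exact functional forms are assumptions made for illustration and are not taken from the original analysis.

```python
import math

def approach_to_steady_state(t, v1, v2, k_obs):
    """Assumed form of equation (1): product formed at time t.

    P(t) = v2*t + (v1 - v2) * (1 - exp(-k_obs*t)) / k_obs
    v1 is the initial rate, v2 the steady-state rate.
    """
    return v2 * t + (v1 - v2) * (1.0 - math.exp(-k_obs * t)) / k_obs

def substrate_inhibition(s, v_max, k_m, k_is):
    """Assumed form of equation (2): V_obs = v_max*s / (k_m + s*(1 + s/k_is))."""
    return v_max * s / (k_m + s * (1.0 + s / k_is))

def ic50_inhibition(gtp, v0, ic50):
    """Assumed form of equation (3): V = v0 / (1 + gtp/ic50)."""
    return v0 / (1.0 + gtp / ic50)

# Limiting behaviour with hypothetical parameter values:
# with v1 = 0 the progress curve starts at zero product,
p_at_zero = approach_to_steady_state(0.0, 0.0, 1.0, 0.1)
# an infinite K_IS recovers the Michaelis-Menten rate,
mm_limit = substrate_inhibition(50.0, 10.0, 25.0, math.inf)
# and the rate is halved when [GTP] equals the IC50.
half_rate = ic50_inhibition(200.0, 8.0, 200.0)
```

With these forms, the substrate-inhibition model collapses to the Michaelis-Menten equation whenever the substrate concentration is far below KIS, which is how a single expression can serve for both the inhibited and uninhibited fits.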

NMR assays
Samples for continuous monitoring of cGAS reactions were prepared with 0.5 µM cGAS, a top concentration of 2 mM ATP with 2-fold dilutions over three points, and a fixed concentration of 500 µM GTP. Reactions were performed in SPR running buffer (see below) at 23 °C. Reactions were monitored through 1D spectra collected at 8 min intervals. Peak identities were determined from comparison to ATP, GTP and 2′,3′-cGAMP standards; the assignment of additional peaks as the linear AMP-2′-GTP intermediate was made after mass spectral analysis revealed a mass of 853 Da (predicted 853 Da) in addition to the substrates and product. Compound concentrations were determined through peak integration using ATP as an internal standard. Steady-state rates of 2′,3′-cGAMP formation were fit using equation (1). Initial rates of intermediate formation were determined by fitting the intermediate concentration to equation (1) with an unconstrained V1 value and a negative V2 value.

MS assays
cGAS (100 nM) was incubated for 30 min at 37 °C with 100 nM ISD in 10 mM HEPES pH 7.5, 140 mM NaCl, 5 mM MgCl2 and 0.01% Tween-20, and varied substrate concentrations. The substrate dependence was assessed by titrating ATP or GTP over a range of 5 µM to 1.3 mM, keeping the invariant substrate at 0.3 mM (GTP) or 1 mM (ATP). Additional samples were prepared using 2 mM ATP or GTP in the absence of the second nucleotide, or 0.5 mM GTP, 2 mM ATP. All samples were quenched with 50 mM EDTA prior to analysis. Quenched samples were diluted in H2O, vortexed, and centrifuged at 3000 RCF. Soluble analytes were separated on a Hypercarb column (5 µm, 2 × 30 mm) using an acetate-acetonitrile/acetone gradient (Buffer A: 20 mM ammonium acetate, pH 10; Buffer B: 45% acetonitrile, 45% acetone, 10 mM ammonium acetate, pH 10, 0.1% formic acid) at 60 °C. Analytes were identified by mass spectrometry using a 1290 Agilent UHPLC in conjunction with a Sciex 5500 triple-quadrupole mass spectrometer in MRM mode. Data were processed using Sciex's MultiQuant 3.0. Reaction products were identified by mass and MRM transitions for each nucleotide: AMP-3′-ATP at 837 m/z (predicted and observed), GMP-2′-GTP at 869 m/z (predicted and observed), AMP-2′-GTP at 853 m/z (predicted and observed) and 2′,3′-cGAMP at 675 m/z (predicted and observed). ATP, GTP, and 2′,3′-cGAMP quantities were calculated from standard curves of peak intensity. Data were analyzed using equation (2).
Sample-dependent responses on Channels 2-4 were subtracted from Channel 1 to account for interactions with neutravidin, and data were further corrected by subtracting a zero-concentration blank from all compounds to account for mismatches between the sample buffer and the running buffer. Dissociation equilibrium constants (Kd) were determined using a binary association model (T200 BIA Eval software), where RC is the response at compound concentration C, Rmax is the maximum response of the fit, and R0 is a global data offset from zero (Table I).
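The binding model is described here only through its parameters. A minimal sketch follows, assuming the usual 1:1 steady-state isotherm with an offset, R_C = R_max * C / (K_d + C) + R_0; this functional form and the function name are assumptions for illustration, and the numbers in the example are hypothetical.

```python
def binding_response(c, r_max, k_d, r_0):
    """Assumed 1:1 steady-state isotherm: response at compound concentration c.

    r_max is the saturating response, k_d the dissociation constant
    (same units as c), and r_0 a global offset from zero.
    """
    return r_max * c / (k_d + c) + r_0

# At c equal to K_d the specific signal is half of R_max (hypothetical values).
response_at_kd = binding_response(5.0, 100.0, 5.0, 2.0)
```

Under this form, the concentration giving half of the specific response above the offset is the Kd, which is the usual way such a fit is sanity-checked against the raw titration.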

SPR activity assays
The ATP-concentration dependence of cGAS activity was assessed by injecting ATP (2.0 mM to 2.0 µM, 2-fold dilutions) in the presence of 1.0 mM to 1.0 µM GTP (2-fold dilutions). ATP and GTP titrations were also performed in the absence of the other nucleotide. 2′,3′-cGAMP (20 µM to 20 nM, 2-fold dilutions) was injected before and after ATP and GTP samples to establish the 2′,3′-cGAMP-dependent change on the STING channel. SPR instrumentation and sensor chip setup were as described earlier. Samples were injected for 900 s, and dissociation was measured for 300 s at 5 µL/min, 4 °C. The STING response (Channel 4) values showed an initial negative deflection, especially at higher nucleotide concentrations; therefore, blank-subtracted values were normalized to have zero response at the signal minimum (~100 s). Normalized data were fit to a single exponential association from 600 to 900 s, and the extrapolated plateau RU values were then replotted as a function of the ATP concentration. Datasets corresponding to different GTP concentrations were simultaneously fit to equation (2) using GraphPad Prism with a shared KM,ATP value. In turn, the resulting Vmax values were fit as a function of the GTP concentration using the Michaelis-Menten equation (equation 2 when [S]/KIS ≪ 1).
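The plateau extrapolation described above can be illustrated without a curve-fitting package. The sketch below does not reproduce the least-squares fit used for the actual analysis; it swaps in a three-point (Aitken delta-squared) extrapolation, which recovers the plateau exactly for noise-free single-exponential data sampled at equal intervals. The function name and all numbers are hypothetical.

```python
import math

def extrapolate_plateau(y_start, y_mid, y_end):
    """Plateau P of y(t) = P - A*exp(-k*t) from three equally spaced readings."""
    denom = y_start + y_end - 2.0 * y_mid
    if denom == 0.0:
        raise ValueError("readings are collinear; no exponential curvature to extrapolate")
    return (y_start * y_end - y_mid ** 2) / denom

# Synthetic, noise-free responses at 600, 750 and 900 s (hypothetical values).
true_plateau, amplitude, rate = 40.0, 35.0, 0.003
readings = [true_plateau - amplitude * math.exp(-rate * t) for t in (600.0, 750.0, 900.0)]
estimate = extrapolate_plateau(*readings)
```

Real sensorgrams are noisy, so a least-squares fit over the whole 600 to 900 s window is the more robust choice; the three-point form is shown only to make clear what quantity the fit extrapolates.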
To further probe the cGAS substrate inhibition observed above, SPR injections were made using a 2:1 molar ratio of ATP to GTP over an ATP range of 5 mM to 90 µM (11 points, 1.5-fold dilutions). Data collection and analysis were as described earlier.
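The stated concentration range follows from the dilution scheme: ten successive 1.5-fold dilutions from 5 mM give a lowest ATP concentration of about 87 µM, quoted above as 90 µM. A short sketch of the series, with a hypothetical helper name, and with GTP taken as half the ATP concentration per the 2:1 ratio:

```python
def dilution_series(top, fold, n_points):
    """Concentrations for a serial dilution starting at `top`, dividing by `fold` each step."""
    return [top / fold ** i for i in range(n_points)]

# ATP series in uM: 11 points, 1.5-fold dilutions from 5 mM.
atp_uM = dilution_series(5000.0, 1.5, 11)
# GTP at a 2:1 molar ratio of ATP to GTP.
gtp_uM = [a / 2.0 for a in atp_uM]

for a, g in zip(atp_uM, gtp_uM):
    print(f"ATP {a:8.1f} uM   GTP {g:8.1f} uM")
```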
To ensure that the STING155-341 response was specific to 2′,3′-cGAMP and not influenced by any of the linear products present in the enzymatic samples (e.g. the cGAS intermediate), reactions were prepared in the buffer used for SPR with 2 µM cGAS, 3 µM ISD, and a 2:1 molar ratio of GTP to ATP at 0.25, 0.5, 1.0, and 4 mM ATP. Reactions were quenched with 85 mM EDTA after 3 h at 23 °C and analyzed by NMR to determine the concentrations of 2′,3′-cGAMP and the linear 2′,5′-GMP-3′-AMP dinucleotide. The samples were then diluted 4-fold into SPR running buffer to minimize noise associated with running buffer and sample buffer mismatch and injected for 60 s, and dissociation was measured for 60 s at 60 µL/min, 4 °C.