Community analysis of large‐scale molecular dynamics simulations elucidated dynamics‐driven allostery in tyrosine kinase 2

TYK2 is a nonreceptor tyrosine kinase, member of the Janus kinases (JAK), with a central role in several diseases, including cancer. The JAKs' catalytic domains (KD) are highly conserved, yet the isolated TYK2‐KD exhibits unique specificities. In a previous work, using molecular dynamics (MD) simulations of a catalytically impaired TYK2‐KD variant (P1104A) we found that this amino acid change of its JAK‐characteristic insert (αFG), acts at the dynamics level. Given that structural dynamics is key to the allosteric activation of protein kinases, in this study we applied a long‐scale MD simulation and investigated an active TYK2‐KD form in the presence of adenosine 5′‐triphosphate and one magnesium ion that represents a dynamic and crucial step of the catalytic cycle, in other protein kinases. Community analysis of the MD trajectory shed light, for the first time, on the dynamic profile and dynamics‐driven allosteric communications within the TYK2‐KD during activation and revealed that αFG and amino acids P1104, P1105, and I1112 in particular, hold a pivotal role and act synergistically with a dynamically coupled communication network of amino acids serving intra‐KD signaling for allosteric regulation of TYK2 activity. Corroborating our findings, most of the identified amino acids are associated with cancer‐related missense/splice‐site mutations of the Tyk2 gene. We propose that the conformational dynamics at this step of the catalytic cycle, coordinated by αFG, underlie TYK2‐unique substrate recognition and account for its distinct specificity. In total, this work adds to knowledge towards an in‐depth understanding of TYK2 activation and may be valuable towards a rational design of allosteric TYK2‐specific inhibitors.


| INTRODUCTION
Eukaryotic protein kinases (EPKs) are highly dynamic, ATP (adenosine 5 0 -triphosphate)-dependent enzymes that catalyze phosphorylation of their downstream protein-substrates, thus acting as "switches" of signaling pathways with pivotal role in regulating key processes in human cells.Through their catalytic cycle, they have to transition from the inactive to active state and vice-versa and their conformational plasticity and concerted motions within their catalytic domains are essential for regulating kinase activity. 1,2The catalytic domains of EPKs share high sequence and fold similarities 3,4 therefore identification of residues that determine distinct specificities is challenging.
Tyrosine kinase 2 (TYK2) is a nonreceptor tyrosine kinase that belongs to the four-membered, Janus family of kinases (JAKs; the others being JAK1-3) and in an interplay with JAK1 and JAK2, is involved in cell signaling in response to various immunoregulatory cytokines as part of the JAK-STAT signaling pathway depending on the cytokine and receptor complex. 5,6][9][10][11][12][13] It has also been reported that the TYK2-STAT3 signaling pathway, in addition to cancer, is implicated in Alzheimer's disease, 14 whereas tyrosine phosphorylation at a different site uniquely by the isolated catalytic domain of TYK2, has been reported to negatively control STAT3 activity. 15Unraveling the atomic details of TYK2 kinase activity and the determinants of its specificity, is therefore, of major importance.
Structurally, TYK2 follows a JAK-characteristic architecture; namely, it is divided into four domains: the FERM and SH2 domains that are responsible for receptor binding, 16 a pseudo-kinase domain (pKD) that regulates its kinase activity 17,18 and the catalytic domain (KD).
The TYK2-KD is responsible for its tyrosine kinase activity and adopts the EPK characteristic bilobate fold (see e.g., in 19 and Figure 1).In brief, it is structurally divided into two lobes connected with a linker (hinge region); namely, a small N-terminal lobe consisting of a 5-stranded β-domain and the regulatory C-helix that is key to the ON-OFF switch of kinase activity, and a large mainly α-helical C-terminal lobe (Figure 1).The ATP molecule and proteinsubstrates must bind in the cleft between the two lobes.ATP, in particular, must bind deep in the cleft in such a way that its adenine-ring is buried in a conserved, transiently formed dynamic hydrophobic pocket. 2 The N-lobe provides most of the residues that are involved in correct, catalytically competent positioning of ATP, including a conserved lysine located at the β3 strand (K930 in TYK2), which serves to coordinate the ATP αand β-phosphates and is stabilized by a saltbridge with a conserved glutamate residue of the regulatory C-helix (KE salt-bridge; K930-E947 in TYK2) in active kinases as well as a glycine-rich loop (G-loop) that contributes to aligning the ATPphosphates towards the substrate. 4The C-lobe on the other hand, encompasses elements that are characteristic of EPKs 3 ; namely, the activation loop and a co-evolved helical subdomain that comprises F I G U R E 1 A known crystal structure of the kinase domain of the TYK2 protein (ADP.2Mg2+ -bound form; PDB entry:4gvj 39 ) used as template in this study (left panel) and details of the corresponding amino acid sequence (right panel), with secondary structure elements as retrieved from the rcsb database (https://www.rcsb.org)shown above the sequence.Important regions and motifs that are discussed in the text are indicated.The additional FG-helix, characteristic of the JAK family of kinases, is colored in magenta.three α-helices (αG-αH-αI; GHI-subdomain).Tyrosine phosphorylation of its activation loop (AL) at Y1054-Y1055, is required for TYK2 activation. 20The C-lobe also provides several regions and motifs both conserved and specific that are involved in substrate-binding and catalysis, such as the HRD-and DFG-motifs, 3 the APE-motif that connects AL with the GHI-subdomain 21 and the substrate-binding, P + 1 loop (Figure 1).The 1064 PVFWYA 1069 region of the TYK2 P + 1 loop, in particular, is characteristic of tyrosine-specific kinases 4 (PTKs).Both the N-and C-lobes provide residues serving noncovalent communication between the lobes, through the formation of transiently assembled spatial motifs in the dynamic inner hydrophobic core of active EPKs; namely the regulatory R-spine (assembled upon AL phosphorylation) and the catalytic C-spine (transiently assembled upon ATPbinding) as well as additional bridging residues that link the spines and combined, serve dynamics-driven allosteric kinase activation. 2,22,23e TYK2-KD C-lobe has an addition of an α-helix near the substrate-binding site, the FG-helix (Figure 1), which is characteristic of JAKs and plays a central role in their (auto)activation. 24However, the atomic details of this, remain elusive.[27][28][29][30] In a previous work, and as a first step towards the elucidation of the atomic-details of altered TYK2 functionality, we applied all-atom molecular dynamics (MD) simulations and studied the dynamics of the functionally impaired P1104A-KD in comparison with that of the wild type, in their apo-forms. 31[34][35] Analysis of the MD trajectories indicated that this particular αFG amino acid change quenches the dynamics of the resulting TYK2-KD and promotes closed, collapsed, and inaccessible to ATP conformations. 31Given that structural dynamics underlies dynamics-driven allosteric activation of protein kinases 2,23 and that movements within the TYK2-KD are restricted by its other sub-domains in the context of the full-length protein in its autoinhibited state, 17,18 these results prompted us to investigate the dynamics of the TYK2-KD during the activation process, that remain unexplored until today.
To this end, in the present study, we applied a microsecond-scale MD simulation (3 μs) and investigated the dynamics of the ALphosphorylated (at Y1054-Y1055) wild-type TYK2-KD in the presence of ATP and one magnesium ion (ATP.1Mg2+ ) that represents the activation process and corresponds to a crucial and highly dynamic pre phosphoryl-transfer step of the catalytic cycle, in other protein kinases. 32,35,36alysis of the MD-trajectory, including community network analysis, shed light, for the first time, on the dynamic profile and dynamics-driven allosteric communications within the TYK2-KD in response to AL phosphorylation and ATP.1Mg 2+ -binding, and revealed that the FG-helix and amino acids P1104, P1105, and I1112 in particular, hold a pivotal role and are part of a dynamically coupled allosteric communication network of amino acids across the lobes, serving dynamic bridging of the spines and intra-KD allosteric signaling for dynamics-driven allosteric regulation of TYK2 activity.The MD results suggest that this is most-probably facilitated by the dynamic nature of several areas of the TYK2-KD, including the β3-αC and αH-αI connecting regions, in a TYK2-unique manner.Corroborating the functional significance of the identified allosteric network of side chains, most of the key residues (more than 40%) identified in this study, are associated with cancer-related Tyk2 missense or splice-site mutations, as revealed by screening of related databases.We propose that this pre-phosphoryl-transfer step of the catalytic cycle is crucial for TYK2 activation and that the conformational dynamics at this step, coordinated by αFG, is essential for substrate recognition and thus, accounts for its distinct specificity (e.g., for TYK2-specific sites of STAT3).In total, this work adds to knowledge towards an in-depth understanding of TYK2 activation and provides information that may be valuable toward a structure-based design of allosteric TYK2-specific inhibitors.

| Molecular dynamics simulations
All-atom molecular dynamics (MD) simulations were performed using the GROMACS software package (v.5.1.4). 37,38Due to the lack of known crystal structures of the simulated TYK2-KD form, 3D modeling of the initial conformation was necessary, prior to the MD simulation.The known crystal structure of an ADP.2Mg 2+ -bound form of a TYK2-KD (rcsb PDB:4gvj 39 ; and Figure 1) was used as template, for this purpose.Due to amino acid changes in 4gvj, in silico mutagenesis using Swiss-PdbViewer 40 was applied to obtain the wild-type TYK2-KD form.The coordinates for the ADP molecule and of one magnesium ion were kept, whereas missing coordinates for the ATP γ-phosphate group and for the phosphate groups of the phosphorylated AL tyrosine residues, Y1054p and Y1055p, were transferred from the PDB:1atp and PDB:6dbk 41 crystal structures, respectively.
The simulations were carried out in explicit water using periodic dodecahedron boxes of TIP3P-water molecules 42 extending 10 Å from protein atoms, to solvate the protein system.Periodic boundaries were applied to minimize edge effects.The solvated system was subsequently neutralized with counter-ions, optimized by steepest descent energy minimization, and equilibrated by restrained MD simulations for 1 ns, in two steps: namely, the protein/ATP.1Mg2+ atoms were harmonically restrained to allow the solvent to equilibrate for 500 ps in the NVT ensemble at constant temperature, followed by 500 ps in the NPT ensemble at a pressure of 1 bar.The optimization phase was followed by the 3 μs-long unrestrained MD simulation in the NPT ensemble (productive run).All simulations were performed at a constant temperature of 300 K with separate coupling of protein and nonprotein atoms.In order to better prepare the system, a much shorter ($40 ns) preparatory MD run (and its corresponding equilibration steps) was carried out at 150 K, prior to the 300 K simulation.
The AMBER99SB-ILDN 43 forcefield as implemented in GROMACS was used in the simulations, as we have previously shown that it produces more reliable MD results. 44The AMBER parameters for phosphotyrosines and for ATP with a net charge of À4e, were adapted from the Bryce R, AMBER Parameter Database (http://pharmacy.man.ac.uk/amber).The PME 45 method for the treatment of the long-range electrostatic interactions, a cut-off radius of 10 Å for all nonbonded interactions, and a 2 fs time step for integration of the potential function, were used in all simulations.
The 3 μs-long MD simulation was carried out in the National HPC facility, ARIS, supported by a computational time grant from the National Infrastructures for Research and Technology S.A. (GRNET; under project IDs: KIN_IMMUNMD-II and KIN_IMMUNMD-III).

| Analysis of the MD-trajectory
GROMACS 38 was also used for various preliminary analyses of the MD-trajectory based on several metrics, such as RMSD and RMSF calculations, analysis of geometric parameters (e.g., angles, dihedrals, and distances), and conformation clustering (as in 31 ).In brief, convergence of the MD-trajectory was assessed by monitoring the root-mean-square deviation (RMSD) of several atoms from the initial positions, whereas conformational flexibility was evaluated by means of root-mean-square fluctuation (RMSF) calculations.The RMSF values are a measure of atomic fluctuations; that is, atomic displacements relative to their mean positions along the MD-trajectory.Conformation clustering was applied to obtain representative MDsnapshots at various time frames of the simulation.
Additional analyses of the MD-trajectory related to identifying concerted atomic motions included:

| Principal component analysis
Principal component (PC) analysis was applied to eliminate the noise and to identify the dominant modes of atomic movements during the MD simulation (as in 31 ).The PC-analysis was performed in the range of 2-3 μs of the MD-trajectory and the movements of Cα atoms and side chains (a representative atom) were treated separately.Components PC1-PC100 obtained from the independent PC-analyses were saved as new trajectories (PCA-trajectories) and were used in the subsequent analyses of the MD trajectory.Additional PC-analysis was also applied for various other time-ranges of the MD trajectory, for comparison.

| Cross-correlation analysis
Cross-correlation analysis of the MD trajectory was used in order to examine the degree of correlation of the atomic motions within the simulated TYK2-KD.This type of analysis is based on the calculation of cross-correlation matrices the elements of which, namely, the cross-correlation coefficients (C ij ) between each pair of atoms i,j (see eq. Corr i,j in 31 ), take values ranging from À1 to +1 and can identify atomic movements in a collective manner (semi-rigid bodies) either in the same or in opposite directions (positive or negative values, respectively).C ij values close to ±1 indicate highly correlated atomic displacements, whereas values close to 0 are indicative of uncorrelated motions.The Bio3D software 46,47 and the PCA-trajectories obtained, as described above, were used for this analysis.

| Community map analysis and nodebetweenness analysis
Community map analysis was performed also using the Bio3D software 46,47 and the separate, Cα and side chain, PCA-trajectories.Contact maps produced using an atom-atom distance cut-off of ≤10 Å for at least 75% of the analyzed simulation time on correlated residues (jC ij j > 0.5) obtained by the cross-correlation analysis, and the Girvan-Newman algorithm 48 as implemented in Bio3D, were used for this purpose.The Girvan-Newman method is a graph-based network approach that is based on the edge-betweenness centrality measure (details in 48 ).In brief, the edge-betweenness centrality of an individual residue is defined as the number of the shortest paths connecting other residue pairs that pass through it along the MD trajectory, thus providing an estimate of the influence of this residue on communication (modularity).Communities of residues are characterized by high modularity values; that is, residues in the same community share dense connections, whereas residues of different communities have sparse or no connections at all.The results are then plotted on a 2D network-graph with nodes representing communities and edges representing the strength of allosteric coupling between communities.
Two independent community analyses, based on only positive (C ij > +0.5) or both positive and negative cross-correlation (jC ij j > 0.5) matrix calculations, were carried out for each one of the Cα and side chain PCA-trajectories.
Model illustrations were made using the PyMOL Molecular Graphics System. 49

| Multiple sequence alignment
ClustalW 50 was used for multiple alignments of JAK-KD amino acid sequences, retrieved from the Uniprot-database. 51

| Cancer mutations search
Cancer-associated mutations were obtained by screening the cBio-Portal for Cancer Genomics. 52

| RESULTS AND DISCUSSION
As already mentioned, our previous MD work on the apo form of the TYK2-KD in comparison with its functionally impaired P1104A variant, indicated that this αFG amino acid alteration has long-range LESGIDOU and VLASSI effects and acts at the dynamics rather than at the structural level of the TYK2-KD. 31It has long been recognized that dynamics and concerted motions within the KD domains of EPKs during their catalytic cycle (e.g., in response to ATP-binding) play a pivotal role in their allosteric activation. 2,23In addition, AL phosphorylation is necessary for kinase activation and has been reported to also result in increased KD dynamics in other kinases. 53nce the dynamics and allosteric communications within the TYK2-KD in response to ATP-binding or AL phosphorylation remain unexplored until today, in this study, we applied a much longer, microsecond-scale (3 μs) all-atom MD simulation and studied the TYK2-KD in the presence of ATP and one magnesium ion (ATP.1Mg2+ ).Magnesium ions are essential for kinase activity; namely for transferring the ATP γ-phosphate (phosphoryl-transfer) to proteinsubstrates.Binding of ATP and one Mg 2+ ion in particular, has been reported to represent a crucial and highly dynamic step of the catalytic cycle of other protein kinases. 32,35,36In addition, the ALphosphorylated form (Y1054p-Y1055p) was used to simulate a fully activated TYK2-KD and to investigate the effect of AL phosphorylation on its dynamics.
Due to the lack of crystal structures of this TYK2-KD form, a 3Dmodel of the starting conformation was produced before the MD simulation, based on a known crystal structure of its ADP.2Mg 2+ bound form (Figure 1), as described in Section 2.
A first evaluation of the MD trajectory was carried out by monitoring the root-mean-square deviation (RMSD; see Section 2) of the protein-Cα, ATP-adenine, and magnesium atoms, relative to their initial positions along the 3 μs MD-trajectory (Figure S1).As reflected by relatively small RMSD values (≤2 Å), a convergence of the MDtrajectory albeit dynamic as expected for an activated protein kinase, was reached after 2 μs of the simulation (Figure S1; see also below).
3.1 | Proper ATP positioning in the inner hydrophobic core but transiently formed K930(β3)-E947(αC) salt-bridge, were observed in the simulated TYK2-KD To investigate the details of the ATP-binding site, several related geometric parameters were monitored along the MD-trajectory.As already mentioned, kinase activity requires correct positioning of the ATP molecule deep into the ATP-binding groove, via a hydrogen bond of the ATP-adenine with a conserved protein anchoring site; namely, the backbone oxygen of a conserved glutamate residue at the hinge region (E979 in TYK2 3,4 ).Monitoring of this distance (d in Figure S2) showed that the ATP molecule retained its proper anchoring to the protein (in hydrogen-bonding distance; d $ 2.5 Å) throughout the 3 μs of the MD-trajectory (Figure S2B).On the other hand, a "breathing" of the entrance of the ATP pocket was observed, as reflected by a two-peak distribution of its angle, θ (Figure S2), indicative of a switching between "open" and "closed" conformations of the G-loop relative to the catalytic loop.As already mentioned, relative motions of the N-lobe with respect to the C-lobe are essential for the catalytic activity of protein kinases 2,54 and their restriction, for example, by pKD in the monomeric autoinhibited state, has been proposed to block TYK2 activation. 17,18Indeed, a rigid ATP-binding pocket and a collapsed G-loop, were suggested by our previous MD work on the catalytically impaired P1104A variant. 31cordingly, the side chain of the conserved ATP-binding β3 lysine residue, K930, switched between conformations either in close proximity to the ATP α/β-phosphates or in hydrogen-bonding distances to its αC-glutamate partner in forming the K930-E947 saltbridge along the MD-trajectory (Figure 2).Temporary disruption of the KE salt-bridge has also been observed in the monomeric form of AL-phosphorylated dimeric receptor-PTKs; for example, in the case of EGFR. 53The conformational swapping of the K930 side chain observed here, was more clear in the range of 2-3 μs of the simulation (Figure 2) and this time-range was selected for subsequent analyses of the MD-trajectory (see also below).
3.2 | Disruption of the C-lobe, APE-R salt-bridge, and increased dynamics of the phosphorylated activation loop and of the aH-aI connecting region, were detected in the simulated TYK2-KD Another hallmark of active EPKs has been proposed to be a conserved salt-bridge in the C-lobe between the APE-glutamate (E1071) and a conserved arginine residue (R1159) located at the loop connecting the αΗ and αΙ helices.This salt-bridge mediates communication of the activation loop with the C-lobe GHI-subdomain with an important role in the catalytic activity of other protein kinases. 21,55terestingly, monitoring of the E1071-R1159 (APE-R) distance along the MD-trajectory, revealed that this salt-bridge is mainly disrupted in the TYK2-KD simulated here (Figure 2), reflecting a highly dynamic character of the TYK2 AL and αΗ-αΙ loops in response to AL phosphorylation and/or ATP.1Mg 2+ -binding.
To further investigate the details of conformational flexibility within the simulated TYK2-KD, RMSF calculations were carried out (see Section 2).As shown in Figure 3, increased backbone dynamics (as reflected by higher RMSF values) were indeed detected for AL and the αΗ-αΙ connecting fragment as well as for several other, mainly N-lobe regions.More specifically, almost all the N-lobe loops including the β3-αC, β2-β3, and β4-β5 connecting fragments (and the aforementioned G-loop), exhibited large backbone fluctuations in the simulated TYK2-KD (Figure 3).In addition to AL and αΗ-αI regions, and albeit the C-lobe overall appears to be more rigid compared to the N-lobe, a third C-lobe area emerged from the RMSF analysis as also highly dynamic; namely, the region connecting the αFG and αG helices (Figure 3), which is related to substrate-binding in JAKs (based on substrate-mimicking binding of suppressors of cytokine signaling proteins; SOCs 56 ).This finding is in line with the notion that increased flexibility of substrate-binding regions is characteristic of PTKs. 57ken together our MD results so far, reveal that ATP.1Mg 2+binding and AL phosphorylation increase the TYK2-KD dynamics, in line with the established notion that elevated dynamics and concerted side chain movements are essential to coordinating activation of protein kinases. 2,53

| Cross-correlation analysis of the MDtrajectory revealed a lack of correlated atomic movements between the activation loop and the GHIsubdomain
To identify intra-KD correlated movements in response to ATP.1Mg 2+ -binding and AL phosphorylation, a cross-correlation analysis of the MD trajectory was first applied (see Section 2).Side-chain atoms were taken into account in this analysis as separate entities, due to their key role in mediating allosteric communications, necessary for kinase activity. 2An illustration of highly correlated atomic As shown in Figure 4, there is a disruption of communication between the GHI-subdomain (αΗ-αΙ loop) and AL (APE region), as reflected by the absence of correlation (lack of connecting lines) between the APE-R side chains, E1071 and R1159 (see also Section 3.2) as well as of a related network of hydrophobic interactions expected in the region 21 involving the side chains of A1156, R1159, A1081, and W1085 (Figures 4 and S4).This is most probably due to the replacement of a central to stabilizing these interactions, hydrophobic side chain 21 S4).Interestingly, Ala-1156 is TYK2-unique, whereas position 1156 preserves its hydrophobic character in the other JAK members (i.e., it is either a Pro or Val; Figure S5), suggesting that the disruption of the interaction network at the APE-R region observed here, is TYK2-specific and crucial for its specific activity.Corroborating this idea, valine substitutions for A1156 or A1081, predicted by our analysis to restore the hydrophobic interaction network in the region, are detected in several adenocarcinomas. 52ken together our MD results so far, suggest a unique dynamic profile for the TYK2-KD in response to ATP.1Μg 2+ -binding and AL phosphorylation.

| Community map analysis of the MDtrajectory revealed the dynamic profile of the TYK2-KD in response to ATP.1Mg 2+ -binding and AL phosphorylation
To investigate this issue, we carried out a community map analysis of the MD-trajectory.This type of analysis combined with microsecondscale MD simulations has been widely used to analyze allosteric communications and to obtain the dynamic profiles of various proteins including protein kinases, under various conditions. 32,34,35One of these studies; namely, a community analysis work of MD trajectories of ATP/Mg 2+ -bound PKA forms, divided the catalytic domain of this Ser/Thr-specific kinase into dynamic communities (i.e., transient clusters of residues that move as semi-rigid bodies), each with identifiable function, and suggested that this dynamic architecture is a common feature of EPKs. 32Variations of dynamic profiles depending, for example, on the functional state of the kinase or characteristic of each kinase family, are expected.Community analysis of the MD-trajectory of this study is therefore expected to shed light on allosteric communications within the TYK2-KD in response to ATP.Mg 2+ -binding and AL phosphorylation and unravel the details of its characteristic dynamic profile during activation.
To this end, cross-correlation matrices were combined with community network analysis based on the Girvan-Newman algorithm as implemented in Bio3D 46,47 and as described in Section 2. The Girvan-Newman 48 method offers an effective way of describing allosteric interactions as it provides a quantitative estimate of allosteric coupling based on the edge-betweenness measure from graph-theory (see Section 2).Main-chain and side-chain atomic motions as well as positive only and positive/negative (All) cross-correlations, were treated separately.Hereafter, we will mainly focus on communities identified by the positive-correlation-based analysis, unless otherwise noted as they provide information of strong long-range collective movements in the same direction (see also Figures 4 and S3).Communities were defined by the Cα-based community analysis (MC) using a distance cutoff of 10 Å for 75% of the simulation time and C ij values ≥0.5.The same criteria were applied for the sidechain-based (SC) community analysis that provided a more detailed information on allosteric communications; namely, it provided information on community bridging residues (a residue is defined as "bridging" if its Cα and sidechain atoms belong to different communities) that play a crucial role in dynamics-driven allosteric intra-domain communications. 2,32Nomenclature and coloring of the identified MCcommunities were according to their structurally and/or functionally equivalent PKA communities, 32 whereas extra MC-communities were named based on TYK2 specific secondary structure elements they are built around.Labelling of the identified SC-communities was according to the MC-community with better overlap.

| Dynamic profile of the simulated TYK2-KD
The community network map analysis of the MD-trajectory, detected 14 highly interconnected communities of dynamic amino acids with positively correlated atomic motions within the simulated TYK2-KD, which are depicted using three types of illustration in Figure 5A (a bar plot of community bringing residues is shown, in Figure S6).The N-lobe is partitioned into five communities (Com-N, Com-A, Com-A1, Com-A2, and Com-B), whereas the remaining nine communities are formed by C-lobe regions, including the regulatory C-helix (Figure 5A).Among the identified communities, eight are structurally equivalent to major EPK communities 32 (Com-A to Com-H), whereas six extra TYK2 communities emerged from our analysis that seem to be TYK2/JAK specific (Com-A1, Com-A2, Com-N, Com-FG, Com-I, and Com-P + 1).The identified TYK2-KD communities are described in detail, below.
F I G U R E 4 Cross-correlation analysis of the MD trajectory (2-3 μs) focused on highly fluctuating regions of the C-lobe (see Figure 2).Residues with highly positively correlated (correlation coefficient ≥ 0.8) motions of Cα (denoted MC) and side chain (denoted SC) atoms are interconnected with red lines on the corresponding representative snapshot of the MD simulation.Amino acids discussed in the text are labeled.

N-lobe communities
Com-A (in red) is centered around the five-stranded β-sheet (Figure 5A-a,b) and is the main N-lobe and the third largest MCcommunity identified in the TYK2-KD simulated here (Figure S7).It comprises residues that are involved in the correct positioning of ATP, such as hydrophobic residues lining the upper surface of the ATPadenine-binding pocket (e.g., C-spine residues, V911 and A928; Figure 5B), the conserved ATP-binding residues, E979 and K930 as  22 and to residues serving dynamic bridging of the spines in active EPKs 2 along with bars colored according to their corresponding communities identified in this study (positive only: C ij > 0.5 and positive/ negative: jC ij j > 0.5, correlations-based community map analyses).Additional amino acids with long hydrophobic side chains that emerged from this study as potential contributors to dynamically supporting the TYK2 R-and C-spines in response to ATP.1Mg 2+ -binding and ALphosphorylation, are also listed along with their corresponding communities.
well as the entire G-loop.It thus, corresponds to the ATP-binding community that is involved in the dynamic assembly of the N-lobe β-structure and in the correct positioning of the ATP-adenine ring, which is a prerequisite for committing the kinase to catalysis.Com-A forming residues also include the R-spine residue, Y962, and hydrophobic residues L976 and M978 (Figure 5B/"R spine shell"), which correspond to amino acids of the inner hydrophobic core that dynamically bridge the spines of activated protein kinases. 2,22The side chain of the "gatekeeper" methionine residue, M978, in particular, blocks access to a hydrophobic pocket adjacent to the ATP-binding site and is mainly a threonine in other PTKs, with an important role in regulating their auto-activation. 58Interestingly, our community analysis identified M978 as a "bridging" residue with the regulatory community, Com-C (described below), strongly suggesting a similar role of its side chain in the regulation of TYK2 (and JAKs) kinase activity.In support of this idea, residue M978 together with Y962 have been reported to act as a regulatory switch in TYK2 19 and are conserved in JAKs (Figure S5).
Com-B (in orange) is a small community that is built around the flexible β 3 -αC linker including the N-terminal portion of the C-helix (Figure 5A-a,b).It serves dynamic allosteric coupling between the ATP-adenosine-binding community, Com-A, and Com-F that corresponds to the activation community (two edges of the orange sphere in Figure 5A-c), suggesting an important role of the Com-B region in coupling ATP-binding with TYK2 activation.Corroborating this idea, the TYK2 β 3 -αC region is a cancer hotspot.For example, Com-B residues of dynamic nature, such as G943 and its adjacent S942, are unique to TYK2 among the JAKs (Figure S5) and Tyk2 point mutations affecting their dynamics and thus, the dynamics of the C-helix; namely, S942L and G943C, have been identified in mixed and lung cancers, respectively. 52Combined these observations, suggest a key role for the dynamics of the β 3 -αC loop in αC-IN positioning and TYK2 activation.Notably, an important role of disorder-to-order transitions of this region in activation has been reported for the monomeric form of other dimeric PTKs; for example, the receptor tyrosine kinase, EGFR. 53m-A1 (in salmon) is a TYK2/JAK-specific community and encompasses the β2-β3 bulge, situated on top of the N-lobe (Figure 5A-a,b), reflecting the high conformational flexibility of this region in the TYK2-KD simulated here (Figure 3).Notably, the Com-A1 region is constrained by pKD in the autoinhibited form of JAKs, thus preventing relative movements of the N-and C-lobes necessary for kinase activity, 17 whereas it is released in mutation-activated JAK forms. 18Taken together, these observations in conjunction with the important role of β3 in ATP-binding (bears the KE-lysine, Κ930 and the C-spine residue, A928; Figure 1), suggest an also important contribution of the dynamics of the β2-β3 bulge in the dynamic assembly of the N-lobe β-structure upon ATP-binding, in TYK2/JAKs.Supporting this idea, Tyk2 missense mutations that result in altering residues of dynamic nature in this region, have been also identified in cancer patients (e.g., G922D/S, T923I 52 ).
Com-A2 (in dark red) is TYK2/JAK-specific and comprises the β4-β5 connecting loop, which is placed on top of the Com-A β-structure (Figure 5A-b) and is also highly dynamic in the TYK2-KD simulated here (Figure 3).Com-A2 is a small TYK2 community, which is, however, allosterically coupled with Com-A during the analyzed time-frame of the simulation (Figure 5A-c), implying an equally important role for the dynamics of the β4-β5 connecting loop in correct ATP-positioning.
Com-N (in dark gray) is a TYK2/JAK-specific community and includes the N-terminus (aa: P889-L897) of the TYK2-KD simulated here (Figure 5A-a,b), which actually corresponds to the C-terminal end of the pKD-KD connecting region of JAKs. 17Com-N is a solely MCmembered community (Figure S7), which is however dynamically coupled with two other N-lobe communities, ÀA and -A1 (Figure 5Ac), also implying a role of Com-N forming residues in the dynamic assembly of the N-lobe β-structure upon ATP-binding and its correct, catalytically competent positioning.Indeed, this TYK2 region is also a cancer hotspot. 52e regulatory, Com-C Com-C (in yellow) corresponds to the regulatory community, which is involved in the dynamic assembly of the R-spine upon AL phosphorylation, 32 and in the TYK2-KD simulated here, it comprises most of the regulatory C-helix (Figure 5A-a; MC) including the R-spine residue, L951, and the glutamate residue, E947 of the KE salt-bridge.
The SC-membered Com-C (Figure 5A; SC), on the other hand, extends beyond αC and emerged as the dominant (Figure S7) and most interconnecting SC-community in the simulated TYK2-KD (Figures 5A-c; SC and S6, bar at "C").Indeed, and in line with its role in the dynamic R-spine assembly, the TYK2 Com-C SC integrates almost all the R-spine residues (yellow side chains in Figures 5B/"R spine" and S8) and forms an extended positively correlated side chain communication network that dynamically connects αC with the assembled phosphorylated AL and its neighboring C-terminal and N-terminal halves of helices αΕ and αF (yellow in Figure 5A-a,b; SC).
The highly correlated Com-C side chain motions detected here are reminiscent of synchronous side chain movements observed in the inner hydrophobic core of PKA in response to ATP-binding and are proposed to serve dynamics-driven allosteric activation of EPKs. 2 Indeed, in addition to R-spine forming residues, the side chains of the gatekeeper, M978 (see also Com-A description above), and of other hydrophobic residues of the TYK2 inner hydrophobic core, predicted to serve dynamic bridging of the spines in activated EPKs, are part of the identified TYK2 Com-C SC (yellow SC bars in Figure 5B/"Bridging the Spines").Validating these findings, replacement, for example of I950 or of the gatekeeper, M978 by a valine (shorter) residue, results in uterine or colorectal cancer, respectively. 52Based on this notion, additional Com-C SC members with long hydrophobic side chains, emerged from this study as potential contributors to the dynamic assembly of the R-spine and to dynamically bridging the spines of the TYK2-KD during the activation process (Figure 5B/"Hydrophobic Support to the R-spine").
In total, Com-C integrates transiently key residues for intra-KD signal transduction upon ATP/Mg binding and AL phosphorylation and emerged as a key player in allosteric regulation of TYK2 activity.

C-lobe communities
Com-D (in dark green) corresponds to the catalytic community and its structurally equivalent in PKA, for example, is involved in ATP-binding and proper positioning of the γ-phosphate and magnesium ions for phosphoryl-transfer to the substrate as well as in the assembly of the C-spine. 32In the TYK2-KD simulated here, however, Com-D emerged as a solely MC-community (Figure S7), most probably due to the absence of a protein-substrate and/or of a second magnesium ion in the simulation.Supporting a catalytic role, however, the TYK2 Com-D comprises the main-chains of most of the catalytic residues and their neighboring amino acids involved in substrate recognition and the assembly of the catalytic spine, such as the D-helix, the catalytic loop ( 1025 AARNV 1029 ) and its neighboring C-terminal portion of the αFhelix (green in Figure 5A-a,b; MC).In total, Com-D provides the backbone atoms of four out of six C-spine residues of the C-lobe (Figure 5B/"C-spine"), and is allosterically coupled with both N-lobe and C-lobe communities with the strongest couplings (thick edges in Figure 5A-c) being with Com-A and Com-FG (described below).
In addition, when negatively correlated motions are also taken into account, several Com-D side chains are merged with Com-FG (magenta Figure 5B; SC) and these observations are in agreement with literature data showing that the D-helix and its neighboring substratebinding regions are more dynamic in Tyr-specific kinases. 57m-E (in blue) is the second largest community identified in the TYK2-KD simulated here (Figure S7) and comprises most of the E-helix, the β7-β8 region that is directly linked to the catalyticand MG-loops, and the αC-β4 linker (Figure 5A-a,b; MC).Supporting this finding, the two latter regions have been reported to move as a rigid block serving as pivot point for C-helix relative movements in protein kinases. 57Indeed, Com-E provides the main-chains of several spine-bridging residues as well as the side chains of almost all the C-lobe C-spine-forming residues (blue bars in Figure 5B).It also comprises the side chains of several residues that contribute to sensing ATP-binding as well as residues in the vicinity of the substrate-binding region, such as the D-helix and the C-terminal half of the αF-helix (blue in Figure 5A-a Com-F (in brown) corresponds to the activation community in other EPKs 32 and is the largest community of positively correlated backbone atoms identified in the simulated TYK2-KD of this study (Figure S7).Com-F MC is built around the activation loop, including the β6β9and β10β11-sheets (bearing Y1054p-Y1055p) that are hallmarks of assembled activation loops (extended, AL-OUT) in active PTKs, 4 strongly supporting its role as the TYK2 activation community (brown in Figure 5A-a,b; MC).It also comprises the half of the αF-helix that provides a scaffold for the R-spine assembly (via D1083), as well as the C-terminal end of the αΕ-helix preceding β6 (brown in Indeed, it is noteworthy that the TYK2 Com-F MC comprises the mainchains of two residues of the C-helix; namely, W944 and K945 (brown stripe in Figure 5A-a; MC), reflecting a strong allosteric connection between AL and the C-helix in the TYK2-KD simulated here.
Since interactions between the C-helix and AL are characteristic of active kinases, 57 this observation infers an important role of these amino acids in allosteric TYK2 activation.
The SC-membered Com-F, on the other hand, is the third largest SC-community (Figure S7) and mostly a "bridging" community (Figure S6; bar at "F"), reflecting the highly fluctuating phosphorylated AL in the TYK2-KD simulated here (see also Figure 3).Interestingly, the TYK2 Com-F SC comprises the side chains of the entire FG-helix (Figure 5A-a; SC) and therefore, it can also be regarded as the TYK2 Com-FG SC .Notably, Com-F SC is also strongly allosterically coupled with the R-spine (C SC ) and C-spine (E SC ) communities (Figure 5A-c), and this observation combined with the Com-F MC results, suggests a central role of αFG side chains in dynamic bridging the spines and in dynamics-driven allosteric activation of TYK2.
Com-FG (in magenta) was named after the FG-helix as it includes the main-chain atoms of the entire αFG (Figure 5A-a,b; MC) and is therefore, the major JAK-specific community.Interestingly, Com-FG extends to the 1064 PVFWY 1068 portion of the P + 1 loop that determines Tyr-specificity for protein substrates and to its adjacent APEhelix (Figure 5A-a; MC).Proline P1064 in particular, together with tryptophan W1067, is conserved in all PTKs and provides the interaction platform for the p-site tyrosine of protein substrates during catalysis, with also important role in auto-activation. 4rthermore, Com-FG is also a solely main-chain community, which is allosterically coupled with the catalytic community, Com-D, and the activation community, Com-F MC with the strongest coupling being with Com-D (Figure 5A-c; MC), supporting a central role of the FG-helix, in dynamically coupling substrate specificity with catalysis.
Taken together these findings, strongly suggest that the functional role of the TYK2 Com-FG, is related to determining TYK2/ JAKs-specificity for protein substrates and that the αFG and APE helices hold a central role in this, on top of the 1064 PVFW 1067 region of the P + 1 loop.
Com-G (in pink) comprises the backbone atoms of the G-helix and of its preceding loop that connects it with the FG-helix (Figure 5A-a,b; MC).This region contributes to substrate binding as well as to binding regulatory proteins in JAKs, for example, the SOCs proteins that inhibit JAK/STAT signaling. 56Com-G is a dynamic, also solely MC-community, which is however allosterically coupled with two other protein-protein interaction communities; namely, Com-FG and Com-H (Figure 5A-c; MC).These observations reflect a dynamic character of the G-helix and especially of the loop that connects it with αFG (see also Figure 3) and further support the notion of higher flexibility of substrate-binding regions in PTKs. 57e GHI community Com-H PKA , which is proposed to serve as docking site for regulatory protein-protein interactions in EPKs, 32 is split into two discrete communities in the TYK2-KD simulated here; namely, Com-H and Com-I.
Com-H (in purple) is the largest community of the TYK2 GHI helical subdomain (Figure S7) and encompasses the αΗ helix as well as its flanking linkers with the αG (αG-αH) and αΙ (αΗ-αΙ) helices (Figure 5A-a,b; MC).Interestingly, this region has been suggested to act as scaffold for the trans-activation of homo-or heterodimeric protein kinases in an asymmetric-dimer mode. 59Given the homo-/heteromeric mode of action of JAKs, this concept points to the TYK2 Com-H as a potential dimerization and/or scaffolding community.Notably, however, the side chains of the αΗ-αI linker split off from Com-H and are merged with the neighboring community Com-I, reflecting the dynamic character of this region in the simulated TYK2-KD (see also Sections 3.2 and 3.3; and Figures 4   and 5).Combined these results imply that the dynamics of the αΗ-αΙ linker are most likely crucial for preventing unspecific scaffolding.In total, the community analysis suggested that the TYK2-KD simulated here is in a dynamically committed state for substrate recognition/binding and catalysis; and pointed to several amino acids as potential mediators of dynamic-driven allosteric communications within the TYK2-KD at this step of the catalytic cycle.

| Identification of amino acids serving dynamics-driven allosteric communications within the simulated TYK2-KD
To identify amino acids that contribute the most to the TYK2-KD allosteric community network identified in this study (Figure 5A-c), side chain betweenness centrality values, as obtained by Bio3D (see Section 2), were extracted and plotted along the amino acid sequence.
As shown in Figure 6, several amino acids across the TYK2-KD primary structure are characterized by higher centrality values of their side chains, suggesting a greater role in mediating allosteric signal propagation within the catalytic domain and in dynamically coupling the spines as a result of ATP.1Mg 2+ -binding and AL phosphorylation.Indeed, among these amino acids are C-spine (V911, A928, and T1090) and R-spine (H1021, D1083) residues, conserved hydrophobic amino acids dynamically bridging the spines in activated kinases (L976, M978) as well as catalytically important residues (N1028) and substrate binding amino acids, including the PTK-characteristic 1064 PVFW 1067 portion of the P + 1 loop (Figure 6).Notably, most of these residues (more than 40%) are associated with cancer-related TYK2 amino acid substitutions, further supporting the functional relevance of our findings.Interestingly, among these are three αFG side chains, including P1104 (Figure 6), supporting a pivotal role of the FG-helix in dynamics-driven allosteric feed-back to the active site at this step of the catalytic cycle.
To elaborate on this in the light of the community map analysis results, amino acids with higher side chain centrality values (Bc sc > average Bc sc = 356; labeled in Figure 6) were categorized according to their corresponding communities and along with the predicted functional community roles (Section 3.4.1),are shown in Table 1.
As shown in Table 1 and illustrated in Figure 7, the higher centrality side chains are organized into community groups with distinct identifiable functional contributions to the allosteric network of amino acids and structurally, form interconnected dynamic motifs that wrap around the spines, thus contributing to their dynamic assembly and coupling in response to ATP.Mg 2+ -binding and AL phosphorylation (compare Figures 7 and S8).We then sought to clarify the functional significance of these amino acids in connection with sequence conservation and involvement in cancer (collectively shown in Table 2).

| Intra-KD allosteric signaling related to ATPadenine positioning (Com-A SC )
A functional group of side chains strongly contributing to the allosteric communication network of amino acids identified in this study, is Com-A SC (Table 1/row-A; red in Figure 7A); that is, it is related to the correct positioning of the ATP-adenine ring.Indeed, it provides side chains of several conserved and well-characterized ATP-binding related amino acids, as already mentioned, such as V911 and A928 (C-spine-forming), L976 (spines-bridging), E979 (ATP-adenine-binding) as well as conserved glycine residues, G904 and G909 of the G-loop (Table 1 and Figure 7).As expected, and further validating the functional relevance of these results, this functional group is a cancer hotspot (Table 2; "ATP-adenine-positioning Block").
Notably, an additional, unique to TYK2 side chain emerged from our analysis as a potential contributor to mediating intra-KD allosteric communications upon ATP-binding; namely, S912, which is a glutamate in the other JAKs (Table 2).This residue follows amino acid V911 and projects upwards the N-lobe β-structure (Figure 7B, right), indicating an important role in controlling its dynamic assembly and correct shaping of the ATP-binding pocket.Indeed, the observed fluctuations of the N-lobe β-structure would interfere with pKD in the autoinhibited full-length TYK2 (Figure 7B).In addition, and interestingly enough, the N-lobe β-structure has been recently reported to be involved in transient interactions with pKD during transactivation of JAKs, 60 further supporting its role in assisting dynamically coupled intra-KD allosteric communications during the activation process, predicted by our analysis.Further corroborating our findings, substitutions of S912 affecting the dynamics (S912I) or electrostatic interactions (S912R) in the region, along with a splice-site mutation of residue G970 of the β4-β5 loop (Com-A2; Figure 7B, right), are found in several cancers (Table 2).

| Intra-KD allosteric signaling related to kinase activity regulation; dynamic assembly of the R-spine/ activation (Com-C SC )
The predominant functional group of strongly signaling side chains (bold in Table 1; Figure 7A), is related to the dynamic assembly of the R-spine (C-row; yellow-surfaces), with the sub-group connected to activation being the most populated (Table 2; "Activation block").
Indeed, the latter sub-group includes the R-spine residue, D1083, and its adjacent hydrophobic residue, V1084 of αF as well as the hydrophobic amino acids I1020, Α1045, and V1048 (Figure 7A) of the β6β9-sheet, the assembly of which is a hallmark of activated (phosphorylated AL-OUT) kinases.
Notably, the strongly signaling Com-C SC members also include the PTK-specific tryptophan residue, W1067 of the P + 1 loop as well as a distal phenylalanine located in αΙ, F1162 (Figure 7A), indicating a key role of these side chains in allosteric signal transmission to the active site during TYK2 activation.Supporting this idea, a splice-site mutation of W1067 is a cancer hotspot (Table 2).Given that, in addition to W1067, the phenylalanine at position 1162 is also conserved in PTKs (data not shown), these results imply that the dynamics-driven feedback of these side chains to the active site observed here may also hold true for other JAKs/PTKs.Interestingly, the C-spine residue T1090 also emerged as a significant contributor to allosteric signaling for the dynamic assembly of the R-spine (bold in Table 1; Figure 7A).This finding combined with the observation that T1090 shares the same functional group with the conserved Mg-binding asparagine, N1028 (Table 2; "Catalytic block"), strongly suggests a pivotal role of T1090 in allosteric regulation of catalytic activity in response to ATP/Mg-binding.Corroborating this hypothesis, T1090 replacement by an isoleucine, which according to our analysis, is predicted to stabilize the C-spine (and/or the R-spine) and to result in an unregulated TYK2 catalytic activity, is found in glioblastoma (Table 2).It is also noteworthy that T1090 is conserved only in TYK2 and its substrate, JAK1, whereas it preserves a more hydrophobic character in the other two JAK-members (Table 2) and in PTKs (data not shown), implying that the regulatory F I G U R E 6 Side chain node-betweenness centrality plot.Edge-betweenness centrality values (in 2-3 μs) for representative side chain atoms (Bc SC ), as obtained by Bio3D, are plotted along the TYK2-KD primary structure.Labels indicate amino acids with higher Bc SC values (> average Bc SC = 356).Secondary structure elements and functionally important regions and motifs discussed in the text, are boxed and labeled (as in Figure 1).Asterisks and x's indicate TYK2 amino acids mutated in cancer (missense or splice-site Tyk2 mutations, respectively).Highlighted are amino acids that correspond to the top 5% of Bc SC values.
T A B L E 1 TYK2 amino acids mediating dynamically coupled allosteric communications within the simulated TYK2-KD (Bc sc > average Bc sc = 356; labeled in Figure 6), categorized according to their corresponding communities identified in this study (as in Figure 5A).role of T1090 inferred for TYK2, may also apply to JAK1 and is therefore, important to coordinating cytokine signaling.

MC
The Com-C SC functional group provides additional side chains (yellow in Figure 7B) that further contribute to the allosteric communication network.These include the hydrophobic side chains of residues, M978 (gatekeeper; see also Section 3.4: Com-A and Com-C), W944 (αC), M1011 and A1016 (αE), L1024 (catalytic loop), L1044 (DFG + 1), A1047 (β9), C1072 (APE+1), and A1081 (αF) as well as several polar amino acids, including the R-spine, HRD-histidine, Interestingly, many of the aforementioned amino acids, especially those of the activation block, which is a cancer hotspot (Table 2; "Activation block"), are also either TYK2-unique or shared between TYK2 and its counterparts in cytokine signaling, JAK1 and JAK2.For example, besides T1090, amino acids A1047 (of the activation loop) and C1072 (of the APE-helix) are also shared only by TYK2 and JAK1 among the JAKs, whereas the αC residue, W944 is TYK2-unique (Table 2).W944 in particular, as already mentioned, is predicted to contribute to allosteric TYK2 activation by dynamically linking αC with the activation loop (see Section 3.4.1;Com-F) and indeed, it is also mutated in cancer (Table 2).Another TYK2-unique amino acid of the activation block is A1016, which is a serine/threonine in the other JAKs and interestingly, it is replaced by threonine or valine in several cancers (Table 2), further supporting its predicted role in dynamics- Community-based illustration of the allosteric communication network of amino acids identified in the simulated TYK2-KD.Side chains characterized by higher Bc SC values (as in Table 1); namely, (A) significantly higher Bc SC values (>1000; bold in Table 1); that is, with a pivotal role in mediating intra-KD allosteric signaling or (B) Bc SC > 356 (labeled in Figure 6), are mapped on the representative MD-snapshot and depicted as surfaces colored according to their corresponding SC community (as in Figure 5A).The main chain of the simulated TYK2-KD is depicted as cartoon, colored in gray (in panel A), or according to the corresponding MC communities (in panel B).The ATP molecule and magnesium ion are shown in sticks and as a green-sphere, respectively.Schematic illustrations in panel B, indicate the position of other TYK2 sub-domains (pKD 17 ; FERM-SH2 16 ) in the autoinhibited full-length monomeric TYK2 (based on 18 ).Encircled are amino acids in the substrate binding region that are central to the identified communication network.Asterisks and x's as in Figure 6.
T A B L E 2 Collective listing of the allosteric communication network of TYK2 amino acids identified in this study (as in Table 1): Predicted functional roles; amino acid sequence conservation in the JAK family (see also Figure S5); Involvement in cancer.

G909D; G909S
Renal clear cell carcinoma; Uterine mixed endometrial N/A: senses ATP binding (hinge region) -F: Activation community (assembly of αC with the activation center: β6β9and β10β11-sheets by Tyr-phosphorylated AL) -C: R-spine assembly Activation Block driven allosteric TYK2 activation.Finally, residue A1081, which in the TYK2-KD simulated here, was found to lose contact with A1156 at the αΗ-αΙ loop (Section 3.3 and Figure S4), also emerged as part of the activation block (Table 2), implying an important role of this amino acid in dynamics-driven allosteric TYK2 activation.Indeed, replacement of this side chain by the longer hydrophobic side chain of valine, as identified in colorectal cancer (Table 2), is therefore expected to stabilize the network and to result in an unregulated activation.These results combined with the observation that A1081 is shared between TYK2 and its substrates JAK1 and JAK2 (Table 2), support the notion that the dynamics of its αΗ-αΙ connecting region not only hold a crucial role in dynamics-driven allosteric activation of TYK2 but it most probably dictates recognition of its counterparts in cytokine signaling, implying a role in coordinating specific cytokine responses.
Taken together these findings indicate a TYK2-unique amino acid composition of intra-KD allosteric signaling related to its activation in response to ATP.1Μg 2+ -binding and/or AL-phosphorylation.
In total, the Com-C SC correlated motions observed here, exemplify the contribution of AL phosphorylation to the dynamic assembly of both the R-spine and the catalytic site, and suggest that AL-phosphorylated TYK2 is dynamically committed to substrate recognition/binding and catalysis in response to ATP.1Μg 2+ -binding.This in turn suggests that AL phosphorylation and/or ATP.1Μg 2+ -binding may bypass cytokine-dependent hetero-dimerization and may account for its distinct activity.
3.5.3| Intra-KD allosteric signaling related to the dynamic assembly of the catalytic-spine/catalysis (Com-E SC ) The second most contributing signaling group of side chains is that related to the dynamic assembly of the catalytic, C-spine (Table 1/Erow; Figure 7), with the most prominent contributions being provided by two functional amino acid blocks; namely, the C-spine and the C-spine-stabilizing blocks (Table 2).
The functional relevance of the C-spine block, in particular, is corroborated by the observation that it is entirely a cancer hotspot with amino acids A1025-A1026 of the catalytic loop, being the strongest contributors to allosteric intra-KD signaling (bold in Tables 1 and -H; PP interactions; substrate docking/ dimerization/ scaffolding (putative) For involvement of P1104A in additional types of cancer see, for example, reviews in Leitner et al. 10 and Borcherding et al. 13 LESGIDOU and VLASSI Table 2).This group is conserved in the JAKs, with the notable exception of amino acid at position 996, which is a glycine uniquely in TYK2 (Table 2).G996 is located at the C-terminal-end of the αD-αE connecting loop, which is shorter in TYK2 compared to the other JAKs (Figure S5), implying an important role in the dynamics of this TYK2 region in catalytic activity.Moreover, 50% of cancer-related mutations associated with this group result in longer hydrophobic side chains (A1026V, L990M, T096M; Table 2) that according to our analysis, are predicted to stabilize the C-spine assembly, further supporting the importance of this amino acid block in dynamics-driven allosteric control of catalytic activity.
The C-spine-stabilizing block of dynamically coupled amino acids (Table 2; "C-spine-stabilizing block"), on the other hand, includes residues of the N-terminal half of the αE helix and of the β7-strand preceding the Mg-binding loop that also map in the interface with other TYK2-domains in the autoinhibited form (blue surfaces in Figure 7B, right).Among these amino acids are V1037 and other hydrophobic side chains identified here as contributors to the dynamic assembly of the C-spine (Figure 5B; "Hydrophobic support to the C-spine") as well as the conserved K1038, which signals through its interaction with the ATP-adenine-bound, E979 (Figure 7B, right).Corroborating the functional significance of this group, an isoleucine substitution for V1037, expected to stabilize the C-spine and to disrupt the dynamic bridging of the spines, leads to glioblastoma (Table 2).Notably, several amino acids of this block are also either TYK2-unique (L1036, A998, L983) or shared with JAK1 (A1004, Q999; Table 2).Interestingly, residue Q999 is an arginine in JAK3 and an arginine substitution for this TYK2 glutamine has been identified in colorectal cancer patients (Table 2).Taken together, the results of this group point to a unique interface of the TYK2-KD with its other sub-domains and imply an important contribution of the dynamics of this region to allosteric signaling for substrate recognition and catalysis.
3.5.4| Intra-KD allosteric signaling related to substrate binding (Com-F SC ) Role of aFG in allosteric signal transmission; TYK2-specificity As already mentioned, three side chains of the TYK2 FG-helix; namely, P1104, P1105, and I1112, emerged from our analysis as strong mediators of allosteric signal propagation to the active site (bold in Table 1, Figure 7A).
As shown in Table 1 (F-row), these side chains are part of the second most contributing functional group of dynamically signaling side chains (brown in Figure 7A), and of the TYK2-specificity-determining block, in particular (Com-FG MC /F SC cell in Tables 1 and 2).The functional role of the latter derived from the observation that the conservation pattern of this amino acid block is JAK-specific but in a TYK2-unique combination; namely, the strongly signaling triad of V1065, P1105, and I1112 side chains of this group, is TYK2-unique (Table 2; "TYK2-specificity determining block").Corroborating these findings, side chain substitutions affecting the specificity-signaling of P1105 and I1112, along with the P1104 to Ala change, are associated with cancer (Table 2).Further supporting the role of this amino acid group as specificity-determinant, P1104A was found to exhibit altered specificity in IFN-stimulated cells, 27 and an Ala substitution for an isoleucine equivalent to I1112 (I1065 JAK2 ), has been reported to prevent (auto)phosphorylation of JAK2. 24Notably, residues A1069-P1070, are also part of the TYK2-specificity-determining block of side chains identified in this study (Table 2; "TYK2-specificity determining block").
This indicates an important role of the APE-motif in determining TYK2 specificity, and this may be facilitated by the disruption of the APE-R salt-bridge observed in this study (Section 3.2).Supporting this idea, a Gly substitution for A1069, expected to disrupt TYK2-specificity signaling, is associated with lung adenocarcinoma (Table 2).
Furthermore, P1104 and its adjacent P1105, are part of a central hub (encircled in Figure 7B) of the identified allosteric communication network of amino acids that serves allosteric coupling of substratebinding with the catalytic machinery.More specifically, and as shown in Figure 8A, the P1104 side chain ring packs against the W1067 indole-ring, which in turn is hydrogen-bonded to the conserved αF glutamate, E1093, whereas P1105 packs in a ring-stacking interaction with the F1066 phenol ring; and all these dynamically coupled interactions were preserved during the analyzed time-frame of the MD simulation, as indicated by narrow distributions and small values (<5.5 Å) of the corresponding side chain distances (Figure S9).The side chain of E1093 in turn is anchored to the catalytic loop through a hydrogen-bond with the main-chain amide-group of A1026 (Figure 8A) and this interaction also persisted (distance <4 Å; Figure S9).Interestingly, the PTK conserved proline residue, P1064 is transiently involved in Van der Waals interactions with the W1067 (as reflected by the two-peak distribution of their side chain distance; Figure S9), and this way, it also dynamically feeds-back to the catalytic machinery.
Taken together, these findings support a pivotal role of the FGhelix in determining specificity and in allosteric intra-KD signal propagation for coupling substrate specificity/binding with catalysis.Notably, and on top of the regulatory side chains, M978 and Y962, amino acids P889 and T890 are also conserved in JAKs (Figure S5) but not in other PTKs (data not shown), suggesting that the dynamically coupled capping network identified here and involving residues, P889-T890-Y962-M978-K930, may be JAK-specific.
Validating the functional significance of this finding, and in addition to M978 and Y962, P889, and T890 are also mutated in cancer. 52Moreover, and further supporting these results, this capping would be served by pKD in its activating position, as observed in a mutationactivated full-length JAK1 protein (based on 18 ).Notably, in other PTKs, a structurally equivalent capping serving allosteric communication between ATP-and substrate-binding sites is also controlled by domains outside their KDs (e.g., SH2/SH3) and is mediated by a tryptophan residue located in their nonKD-KD linkers (e.g., W260 SRC , W395 BTK , W238 LCK ) 33,34 near the P889-T890 TYK2 site.
It is also worth noting that the ensemble of dynamic conformations observed here involves the side chain of the adjacent to A1026 in the catalytic loop, arginine R1027, which also switches between conformations either transiently packed against W1067 (cation-π interaction) or at a hydrogen-bonding distance with the ATP γ-phosphate, while remaining hydrogen-bonded to the HRDaspartate, D1023, in the simulated TYK2-KD (Figures 8B and S9).This R1027-mediated interaction network helps neutralizing the buildup of negative charge on the γ-phosphate and D1023, thus most probably preventing ATP hydrolysis in the absence of substrate.Indeed, R1027 does not contribute to allosteric signaling to the active site in the TYK2-KD simulated here, as reflected by the zero Bc SC value of its side chain (Figure 6).Supporting these findings, replacement of R1027 by a histidine, results in tumor-associated TYK2 activation. 61mbined these observations suggest that the dynamic conformational ensemble we observed in this study reflects the ensemble of states existing prior to phosphoryl-transfer and serving dynamicsdriven regulation of TYK2 catalytic activity prior to substrate binding and securing dynamic allosteric coupling of ATP-binding and activation with substrate recognition/binding and catalysis.

| Comparison with APO-form simulations
The above results prompted us to examine allosteric signaling in the APO (unphosphorylated) forms of the TYK2 and P110A KDs that we have simulated in a previous work. 31Side chain node-betweenness centrality values were calculated from the corresponding MD trajectories following the same procedure as for the wtATP.(Figure 6) were subsequently mapped on the corresponding representative MD structures and depicted using cartoon-putty illustrations, for comparison (Figure 9C,D).
As shown in Figure 9, several side chains across the primary sequence of the wtTYK2-KD APO-form also exhibit significantly high Bc SC values, thus serving allosteric communication between the lobes (Figure 9A,C, left panel).Notably, among these signaling side chains, several have been also identified as part of dynamically coupled allosteric networks in other active kinases (footnoted in Table S1).These include the functionally important amino acids, K930 33 and D1041, 33 the C-spine residue, V911, 2,22,33 the R-spine regulator, D1083 22 and the spines-bridging residue, V1037 2 (Figure 9C, left panel).Interestingly, the latter three amino acids are shared with the ATP.1Mgbound wtTYK2-KD form simulated in this study (gray shaded in Table S1).An additional amino acid with a high Βc sc value, S1086 (Figure 9A), also identified as part of the "Activation block" by our analysis of the ATP.1Mg 2+ bound form (Table 2) has been also found to exhibit high betweenness centrality values in simulations of another active tyrosine kinase (LcK). 34Further supporting its significance (on top of others discussed above; Section 3.5.2),this serine residue is conserved in the PTK family (data not shown) and its replacement by a hydrophobic residue (S1086I), predicted by our community analysis to promote unspecific activation, is associated with glioblastoma (Table 2).Taken together, these observations add to indications that the APO wtTYK2-KD simulated previously 31 is indeed active despite the lack of AL-phosphorylation and this may account for the basal activity exhibited by wtTYK2. 27 the contrary, a lack of strong allosteric signaling is observed in the P1104-KD APO form, as indicated by relatively small Bc SC values compared with both the wild-type simulations (Figure 9) and in line with our previous analysis. 31Indeed, only few residues exhibit strong Bc SC values in the variant, and these cluster at the C-terminal end of the F-helix, near the substrate-binding region, including amino acids V1089 and T1096, which are conserved in JAKs (Figure S5) and the C-spine residue, T1090 (Figure 9C,right panel).T1090, in particular, is predicted by our analysis to hold a pivotal role in allosteric regulation of catalytic activity (Table 2; "Catalytic block") and therefore, strong signaling of its side chain, as observed in the APO P1104A-KD case, is expected to promote unspecific C-spine and/or R-spine stabilization, similar to the T1090I cancer-associated mutation (Table 2), as already discussed above (Section 3.5.2).
Interestingly, and in contrast to the wtATP.1 Mg simulation, the strongly signaling side chains of both the wtTYK2 and P1104A APO simulations do not include residues involved in substrate specificity/binding, such as W1067 (P + 1 loop) and the FG-helix, including the mutated P1104 (Figure 9C,D and Table S1).This observation suggests that allosteric coupling mediated by the FGhelix and especially by residues, P1104, P1105, and I1112 along with W1067, is associated with ATP.1Mg 2+ binding and/or ALphosphorylation during the pre-phosphoryl-transfer step of the catalytic cycle studied here.
Notably, Bc SC calculations in various time-ranges of the wtATP.1Mg2+ simulation revealed that, although W1067 was found in all the analyzed time-frames, the entire αFG (and especially I1112) mediated strong allosteric signaling only in the last 1 μs of the simulation used in this study (2-3 μs; Figure S10A and Table S1).This observation combined with the better sampling of this time-frame of the simulation (Figure S10B) and the fact that it captures the transition of the K930 side chain from ATP-to αC-bound (KE salt-bridged) conformations (Figure 2), further supports the key role of αFG (together with W1067) in coupling ATP-binding and activation (AL-phosphorylation) with substrate specificity and binding.

| CONCLUSIONS
In conclusion, this work highlights the importance of conformational dynamics of the TYK2-KD in allosteric regulation of TYK2 kinase activity and is in perfect agreement with the notion that ATP-binding and conformational flexibility of the catalytic domains of JAKs are critical for their (both cytokine-dependent or mutation-driven) activation. 27,60 summary, this study by combining a long, microsecond-scale MD simulation with community network analysis, shed light on the dynamic profile of the TYK2-KD in response to ATP.1Mg 2+ -binding and AL phosphorylation (activation process) and identified a dynamically coupled allosteric communication network of amino acids across the lobes, serving intra-KD signaling for dynamics-driven allosteric regulation of TYK2 activity, prior to substrate binding.Corroborating the functional significance of the identified allosteric network of side chains, more than 40% of the key residues identified in this study, are associated with cancer-related Tyk2 missense or splice-site mutations, as revealed by screening of related databases.
In particular, our community analysis supports a pivotal role of amino acids P1104, P1105, and I1112 of the FG-helix in the communication network and strongly suggests that the JAK-specific helical insert is not simply involved in substrate binding but it is actively implicated in determining TYK2 specificity and in dynamics-driven allosteric feedback to the active site for coupling substrate specificity/ binding with catalysis, thus substantiating its reported role in JAK activation. 24 addition, the community-network analysis of this study was able to provide prediction on the functional consequences of cancerassociated TYK2 amino acid substitutions in line with the notion that consideration of protein structural dynamics and identification of mediators of allosteric signaling, in particular, is advantageous to the identification of disease-causing mutations. 62 propose that the pre-phosphoryl-transfer step of the catalytic cycle studied here, is crucial for TYK2 activation (as reported for other protein kinases 32,[35][36] ) and that the conformational dynamics at this step, coordinated by αFG, is crucial to serving molecular recognition.
This, combined with the TYK2-unique amino acid composition of several regions, including the β3-αC and αΗ-αI connecting regions, may account for its distinct specificities reported in the literature (e.g., for TYK2-specific sites of STAT3 9,15 and/or for noncanonical, cytokineindependent TYK2 functions 61 ).
In total, this work adds to knowledge toward an in-depth understanding of TYK2 activation and specificity determinants and we believe that the dynamic profile provided here, combined with the TYK2-unique amino acid composition of the allosteric communication network identified in this study, may be valuable towards a structurebased design of allosteric TYK2-specific inhibitors.
motions in the time-frame of 2-3 μs of the simulation, focused on the aforementioned dynamic C-lobe regions, is shown in Figure 4. Due to the lack of strong negative correlations (Figure S3), only highly positively-correlated atomic movements (C ij ≥ +0.8) are displayed.
(Leu277 PKA ) by an alanine in TYK2 (A1156).Indeed, the predicted APE-R shielding side chain contact, A1156-A1081, is disrupted and replaced by an unforeseen interaction of F I G U R E 2 Monitoring of distances and hallmarks of active EPKs.Monitoring of conserved salt bridges (KE and APE-R) and of distances of the KE-lysine, K930 with the ATP α-(left) and β-phosphates (right), along the MD trajectory.F I G U R E 3 Backbone rootmean-square-fluctuation (RMSF) analysis of the MD trajectory in the time frame of 2-3 μs.Important secondary structure elements and motifs discussed in the text are boxed and labeled.residues A1156 and R1159, in the simulated TYK2-KD (Figures 4 and

F
I G U R E 5 Community network analysis of the MD trajectory unravels the dynamic profile of the TYK2-KD in response to ATP.1Μg 2+binding and AL-phosphorylation.(A) Communities obtained by positive correlations-based community map analyses of the MD trajectory (2-3 μs), are mapped on the representative MD snapshot and depicted as (a) cylinder-cartoon and (b) surface representations.(c) 2D-graph illustration of the allosteric community network.The communities (nodes) are depicted as spheres (with radii proportional to community size) and the allosteric coupling between communities is represented by lines (edges) with a width corresponding to the degree of correlation.Red highlighted spheres indicate communities containing the FG-helix.Color-index and nomenclature of communities are indicated below the panel.(B) TYK2 amino acids corresponding to R spine-, C spine-, R shell-forming residues Figure 5A-c) and emerged as the third most bridging TYK2 SCcommunity (Figure S6; bar at "E").These observations further support the role of the TYK2 Com-E as pivot community to serving allosteric communication during relative N-/C-lobe movements.Combined these findings, indicate that Com-E represents the C-spine assembly stabilizing community.Based on this idea, several additional Com-E SC hydrophobic residues emerged from our analysis as potential contributors to the dynamic assembly of the C-spine in TYK2 (Figure 5B/"Hydrophobic Support to the C-spine").

Figure
Figure 5A-a,b; MC).Interestingly, Com-F MC is allosterically coupled with almost all the C-lobe communities as well as with the N-lobe Com-B and the regulatory Com-C, communities (Figure 5A-c; MC).
) comprises the main-chains of almost the entire αI (Figure 5A-a,b; MC) and is dynamically coupled with the C-lobe communities, Com-F, Com-E, and Com-H (Figure 5A-c; MC), implying involvement of the TYK2 αΙ-helix in allosteric TYK2 activation.In line with this hypothesis, in the autoinhibited form, the αΙ helix, along with its spatial neighboring helix, aE (see Figure 5A-b; MC/back), is constrained by other TYK2 sub-domains (based on 18 ).More importantly, the side chains of the Com-I SC residue members of the αH-αΙ connecting region, bridge it with Com-H (Figure S6; bar at "I") and form a dynamic surface underneath the R-spine community, Com-C SC (dark blue and yellow-surfaces, respectively in Figure 5A-b; SC/back).These results, combined with the observation that Com-I SC serves allosteric coupling of Com-C and Com-H communities (Figure 5A-c; SC), suggest a key role of the dynamics of the Com-I side chains, and especially of the αΗ-αΙ linker, in allosteric regulation of TYK2 activity.Com-P + 1 (in black in Figure 5A; MC) encompasses the portion of the P + 1 loop (aa: R1058-S1063) that is involved in binding protein-substrate positions ≥P + 1 (as deduced by the substratemimicking binding of SOCs 56 ).It is also a solely MC-community but allosterically coupled with Com-F MC (Figure 5A-c), suggesting an important role of the P + 1 loop dynamics in coupling activation with substrate binding.

+
1σ; = 1000), suggesting a pivotal role of their side chains in allosteric signal propagation within the TYK2-KD during the activation process.Potential functional roles predicted by the community analysis (details in Section 3.4.1)are also included.N/A denotes uncorrelated side chain movements.
C-spine assembly/ stabilization C-spine-stabilizing Block

Note: 1 R
-spine shell, 2 R-spine, and3 C-spine corresponding amino acids 22 ; 4,5 Dynamic residues found to move synchronously in the inner hydrophobic core of active EPKs in response to ATP or ATP/substrate-peptide binding2 ; In red: TYK2-unique amino acids.a Retrieved from the cBioPortal for Cancer Genomics52 (https://www.cbioportal.org).b Amino acid conservation throughout the PTK family.c

3. 5 . 5 |
The gatekeeper M978 and β3-lysine, K930Several additional long-range interactions, mediated by regulatory signaling side chains identified here (yellow in Figure8A), facilitate dynamics-driven communication of substrate-binding and ATPbinding sites (e.g., at K930).Part of this network is the gatekeeper, M978 (Figure8A), as already mentioned above.This is better illustrated in Figure8B, which depicts a conformational ensemble in the analyzed simulation time-period.This involves oscillation of M978 and of other side chains of the inner hydrophobic core, including R-spine-forming residues and the β3 ATP-binding lysine, K930 (of the KE salt-bridge; see also Figure2), between conformations pointing either towards the ATP molecule or the C-helix (Figure8B); actually the M978 and K930 side chain movements are negatively correlated in the TYK2-KD simulated here.This exemplifies the role of methionine M978 as the "gatekeeper" and as a regulatory switch in TYK2, as described in Section 3.4.1 (see Com-A and Com-C description).These findings are in perfect agreement with the deleterious effect of both K930 and M978 substitutions linked to catalytically impaired TYK2 variants or to cancer-associated amino acid substitutions (e.g., Lys930to Arg27 or Met978 to Phe 60 ).Moreover, the region corresponding to Com-N identified in this study (e.g., aa: P889-T890 that are located in the pKD-KD connecting linker; see Com-N description in Section 3.4.1),assists in shielding the entrance to the hydrophobic core by transiently capping the hydrophobic pocket near the ATPbinding site (Figure8A).
1Mg 2+ simulation of this study.The Bc sc values for the APO TYK2 and P1104A KDs (Figure 9A,B) along with those of the ATP.1Mg 2+ -bound wtTYK2-KD F I G U R E 8 Part of the allosteric communication network of amino acids mediating dynamic coupling between substrate-and ATP-binding sites in the simulated TYK2-KD.(A) Details of the residue interaction network, mapped on the 2-3 μs representative conformation.Main-chain (in ribbon) and side chains atoms (in sticks) are colored according to their corresponding communities (as in Figure 5A).Red-and blue-boxed labels indicate R-and C-spine forming residues, respectively.Dashed lines represent hydrogen bonding interactions.(B) Conformational ensemble of selected amino acids from panel A, revealed by overlapping of representative MD-snapshots in the time ranges of 2-2.5 μs and 2.5-3 μs of the MD simulation.F I G U R E 9 Comparison with APO-form simulations.(A, B) Bc SC plots of APO-forms of the KDs of wtTYK2 (A) and P1104A variant (B) that we have simulated in a previous work, 31 as obtained by Bio3D analysis of their corresponding MD trajectories (to be compared with Figure 6).(C, D) Putty cartoon representation (using PyMOL 49 ) of Bc SC values on representative MD structures of the APO KDs (C) and of the ATP.1Mg-bound wtTYK2-KD simulated in this study (D), shown side-by-side for comparison.Cartoon thickness and coloring (from blue to red; according to the color palette shown below) are proportional to the corresponding underlying Bc SC values (see panels A, B, and Figure 6, respectively) and thus to the degree of involvement of the corresponding side chains in allosteric signaling.Highlighted labels indicate amino acids with strong Bc SC values filtered by the top 5% threshold of Figure 6, for comparison purposes and for clarity.Red-and blue-boxed labels indicate R-and C-spine forming residues, respectively.