The essential and multifunctional TFIIH complex

Abstract
TFIIH is a 10-subunit complex that regulates RNA polymerase II (pol II) transcription but also serves other important biological roles. Although much remains unknown about TFIIH function in eukaryotic cells, much progress has been made even in just the past few years, due in part to technological advances (e.g. cryoEM and single molecule methods) and the development of chemical inhibitors of TFIIH enzymes. This review focuses on the major cellular roles for TFIIH, with an emphasis on TFIIH function as a regulator of pol II transcription. We describe the structure of TFIIH and its roles in pol II initiation, promoter-proximal pausing, elongation, and termination. We also discuss cellular roles for TFIIH beyond transcription (e.g. DNA repair, cell cycle regulation) and summarize small molecule inhibitors of TFIIH and diseases associated with defects in TFIIH structure and function.


Introduction
As expected for a large, multi-subunit complex, TFIIH serves many biological roles, ranging from DNA repair to transcription to cell cycle regulation. The 10-subunit TFIIH complex performs these roles primarily, if not exclusively, in the nucleus. However, the dissociable 3-subunit kinase module (consisting of MAT1, CCNH, and CDK7 in humans) can localize to the cytoplasm. Whereas this review will focus on TFIIH as a regulator of RNA polymerase II (pol II) transcription, we describe some of its other biological roles in later sections. We also note other excellent reviews that focus on TFIIH in transcription and/or DNA repair. [1][2][3] Given the rapid advances in our understanding of TFIIH structure and function, enabled in part by recent cryoEM data and improved chemical inhibitors, it is timely to summarize these findings in the context of established models of TFIIH function. We start with a description of the composition and structure of TFIIH, followed by sections devoted to (i) transcription regulation and other cellular roles, (ii) pathologies associated with TFIIH function, and (iii) small molecule inhibitors of TFIIH. We conclude with a section that summarizes some outstanding questions that could be addressed in future experiments.
TFIIH core subunits: XPB, XPD, p62, p52, p44, p34, p8
The XPB and XPD subunits each contain two RecA-like domains (ATPase/helicase); XPB has 3'-5' helicase activity whereas XPD has 5'-3' helicase activity. XPB also possesses a DRD-like (DNA Damage Recognition Domain) domain in addition to an N-terminal extension as depicted in Figure 1. XPB was initially characterized as a helicase, 12 but can also function as a 5'-3' DNA translocase, 13,14 at least with respect to its transcription-associated roles (see below). The helicase activity of XPB is involved in the DNA damage response 15 and works in conjunction with XPD. 5,16 In addition to its ATPase/helicase function enabled by the RecA-like domains, XPD contains an ARCH domain and an iron-sulfur cluster as shown in Figure 1. Basic roles in transcription (XPB) and DNA repair (XPD and XPB) are conserved going back to yeast. [16][17][18][19] Whereas XPB and XPD possess catalytic functions, additional subunits in the TFIIH core serve important structural and regulatory roles. The p62 subunit contains two BSD domains and a PH-like (pleckstrin homology) domain (Fig. 1) that may mediate interactions with a variety of regulatory proteins, including sequence-specific DNA-binding transcription factors. 20 The PH-like domain also interacts with TFIIE, 21,22 which likely stabilizes TFIIH assembly within the PIC. The p52 subunit contains two XPB-binding regions as well as a p8 binding region (Fig. 1) that helps stabilize the TFIIH core structure; 23,24 the Egly lab has shown that these interactions also stimulate XPB ATPase activity in both transcription and the DNA damage response. [24][25][26] Through its interactions with p52 and XPB, the p8 subunit 27,28 stimulates XPB ATPase activity. Mutations in p8 or p52 that negatively affect the XPB-p52-p8 interaction network are associated with developmental disorders in humans and in model organisms. 23,29,30 The p34 and p44 subunits are considered a structural "backbone" for TFIIH. 22,31 Both p34 and p44 contain a von Willebrand-Factor-A like domain and a Zn2+ binding region as depicted in Figure 1. Additionally, p44 contains a RING finger motif, which represents another Zn2+ binding region. The 5'-3' helicase activity of XPD is stimulated by p44, which directly interacts with XPD. 32,33 Notably, XPD mutations that are linked to the developmental disorders XP and TTD inhibit the XPD-p44 interaction and decrease XPD helicase and NER activity. 33

Table note: UniProt accession numbers are shown in parentheses. All percent identities were determined from the protein BLAST tool on the National Center for Biotechnology Information website. (Recoverable table entries: Tfb2; Ccl1; "Regulates XPB ATPase activity"; "Regulates XPD ATPase activity"; "Regulates kinase activity".)

TFIIH CAK subunits: MAT1, CDK7, CCNH
Soon after its biochemical purification, TFIIH was identified as a factor that phosphorylates the pol II CTD. 6,[9][10][11] The TFIIH kinase, CDK7 (Kin28 in Saccharomyces cerevisiae; Mcs6 in Schizosaccharomyces pombe), contains an N-terminal CCNH binding region and a C-terminal MAT1 binding region (Fig. 1). 34 Collectively, the CDK7, CCNH, and MAT1 subunits form a stable kinase module that can reversibly associate with TFIIH. 11,35 In fact, CDK7-CCNH-MAT1 are known as the CDK Activating Kinase (CAK) complex in metazoans, 10 based upon the ability of CDK7 to phosphorylate and activate cyclin-dependent kinases. This role is not well conserved in yeast; the Cak1 protein performs this function in S. cerevisiae 36-38 whereas Csk1 and the CDK7 ortholog Mcs6 both serve as CAKs in S. pombe. 39,40 CDK7 activity and substrate specificity are regulated by CCNH and MAT1. [41][42][43][44][45][46] For example, CCNH controls substrate specificity toward CDKs involved in cell cycle regulation as well as the pol II CTD, 35,47 and MAT1 helps direct CDK7-dependent phosphorylation toward DNA-binding transcription factors. 48,49 MAT1 also stabilizes CCNH and CDK7 in the CAK complex 50 and anchors it to the TFIIH core through interactions with XPD and XPB. 22,[51][52][53]

TFIIH structure
Building off pioneering work in other labs, 34,54,55 the Nogales and Cramer groups have implemented the latest technological advances in cryoEM and single particle reconstruction techniques to advance our understanding of TFIIH structure and function. 22,53 The TFIIH core is horseshoe-shaped; XPD (Rad3) and XPB (Ssl2) are at each end, connected by the other core subunits p44 (Ssl1), p34 (Tfb4), p52 (Tfb2), and p8 (Tfb5), as shown in Figure 2(A). Whereas the p62 (Tfb1) subunit could not be resolved in the human TFIIH structure, 53 the Cramer group showed that for the yeast (S. cerevisiae) complex, Tfb1 extends along the horseshoe-shaped surface, interacting with multiple core subunits. 22 The recent cryoEM structural data for yeast and human TFIIH show a conserved subunit architecture, 22,53 in agreement with past studies. 34 As expected, XPB (Ssl2) was shown to interact with p52 (Tfb2) and p8 (Tfb5) in both yeast and human TFIIH [Fig. 2(B)]. These findings were in agreement with biochemical and cellular experiments from the Egly lab that showed p52 interacts with and stimulates XPB ATPase activity. 24,56 Specifically, the XPB-p52-p8 heterotrimer results from interactions between the XPB RecA2 domain, the C-terminal domain of p52, and p8 [Fig. 2(C)]. 22,34,53 The p8 mutation L21P likely destabilizes this interaction network, and this mutation is linked to TTD. 57 The cryoEM data revealed structural details of previously characterized interactions between p52-XPB and p44-XPD and also established separate p34 structural interfaces with p52 and p44 that anchor the horseshoe-shaped TFIIH core between XPD and XPB. The cryoEM 22,53 and crosslinking-mass spectrometry data 34 established that p34 and p44 dimerize through interaction between the Von Willebrand domain (vWA) of p34 and the RING finger domain of p44 (Fig. 1). 
This p34-p44 dimer may serve to seed assembly of the remaining core subunits, suggested by its interaction network with the other core subunits XPB, XPD, p62, and p52. 22,31,34,53 Whereas structural data reveal that XPB and XPD are located at each end of the horseshoe-shaped core TFIIH complex [Fig. 2(A)], XPD and XPB also directly interact. 22,34,53 Upon comparison of the free TFIIH structure with TFIIH within a partial PIC, the Nogales group noted evidence for a structural shift that breaks XPD-XPB contacts. 53 MAT1, while not fully resolved in the cryo-EM structures, was shown via CXMS to anchor the CAK to the core through interactions between its "latch" domain with XPB and XPD. 34 This was corroborated by applying docking and homology models to the TFIIH cryoEM maps. 22,53 The XPD ARCH domain appears to interact with the MAT1 latch domain 34,58 and a C259Y mutation in the ARCH domain is linked to TTD; this mutation likely destabilizes the ARCH domain and may prevent proper CAK assembly with the TFIIH core. Although the CAK subunits MAT1 (Tfb3), CDK7 (Kin28), and CCNH (Ccl1) were largely unresolved in the cryoEM studies with human and yeast TFIIH, previous results have confirmed CDK7 interactions with MAT1 and CCNH, summarized in Figure 3. 34,51
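Because the structural literature switches between human and S. cerevisiae subunit names, it can help to keep the ortholog pairs named in this section in a small lookup table. The following is a minimal sketch in Python that uses only the name pairs given in the text above; the function names (to_human, module_of) are hypothetical helpers introduced for illustration and are not part of any published tool.

```python
# Human TFIIH subunit -> S. cerevisiae ortholog, using only the name pairs
# given in the text. Helper names below are illustrative assumptions.
HUMAN_TO_YEAST = {
    # core subunits
    "XPB": "Ssl2",
    "XPD": "Rad3",
    "p62": "Tfb1",
    "p52": "Tfb2",
    "p44": "Ssl1",
    "p34": "Tfb4",
    "p8": "Tfb5",
    # dissociable kinase module (CAK)
    "MAT1": "Tfb3",
    "CDK7": "Kin28",
    "CCNH": "Ccl1",
}

CAK_SUBUNITS = frozenset({"MAT1", "CDK7", "CCNH"})

# Reverse lookup: yeast name -> human name.
YEAST_TO_HUMAN = {yeast: human for human, yeast in HUMAN_TO_YEAST.items()}


def to_human(name: str) -> str:
    """Return the human subunit name for a human or yeast subunit name."""
    if name in HUMAN_TO_YEAST:
        return name
    return YEAST_TO_HUMAN[name]  # raises KeyError for unknown names


def module_of(name: str) -> str:
    """Classify a subunit (human or yeast name) as 'core' or 'CAK'."""
    return "CAK" if to_human(name) in CAK_SUBUNITS else "core"


if __name__ == "__main__":
    for human, yeast in HUMAN_TO_YEAST.items():
        print(f"{human:5s} {yeast:6s} {module_of(human)}")
```

For example, module_of("Tfb3") classifies the yeast MAT1 ortholog as part of the CAK, whereas module_of("Ssl2") places the yeast XPB ortholog in the core.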

TFIIH and Transcription Regulation
TFIIH is a basic component of the pol II transcription initiation machinery, commonly known as the pre-initiation complex (PIC; see Fig. 4). In humans, the PIC is about 4.5 MDa in size and consists of TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, Mediator, and pol II. In vitro, the PIC can form in an ordered pathway that is likely relevant for in vivo assembly. 59-61 TFIID, which contains the TATA-binding protein TBP, first binds the TATA box upstream of the transcription start site (TSS); this pioneering event can nucleate assembly of TFIIA and TFIIB (which bind opposite ends of TBP), followed by TFIIF and pol II. Like TFIIF, TFIIE interacts directly with pol II, 61,62 and TFIIE binding helps assemble and orient TFIIH through multiple protein-protein interfaces. 22 As shown in Figure 5, TFIIH also directly contacts downstream promoter DNA, which helps anchor it in place within the PIC. Moreover, the Nogales and Cramer labs have shown that MAT1 (Tfb3 in S. cerevisiae) contacts the pol II stalk, providing rationale for the MAT1 requirement for transcription. 18,22 Although Mediator is not required to form a DNA-bound complex containing TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, and pol II, Mediator stabilizes 61 and regulates the entire PIC assembly, including TFIIH (see below). 63 The Cramer lab determined a cryo-EM structure of yeast (S. cerevisiae) TFIIH in the context of a minimal PIC, which provided important insights about how TFIIE and Mediator work together with TFIIH during transcription initiation. For instance, the structure revealed that Ssl2 (XPB) may be oriented on downstream DNA by the TFIIE E-bridge helix; this interaction may also provide a physical basis for TFIIE-stimulated DNA opening. 22,64,65 It was also shown from the structural data that Rad3 (XPD) was located 40 Å from the promoter DNA, 22 in agreement with functional data that confirmed its helicase function was dispensable for pol II transcription. 66 The kinase module of TFIIH (i.e. CDK7, CCNH, MAT1) is highly mobile and has been difficult to resolve in structural studies of the free TFIIH complex 53,54,59 or in studies with minimal PIC assemblies. 22,59,62 Notably, the TFIIH kinase module could not be resolved (i.e. too flexible/dynamic) in PICs lacking Mediator, 59,62 whereas the TFIIH kinase module could be resolved (albeit at limited resolution) within a yeast PIC that included a core Mediator complex. 22 The TFIIH kinase module was preferentially located between the "hook" and "shoulder" of Mediator within a yeast PIC. 22 This location may be important for positioning Kin28 (CDK7) for pol II CTD phosphorylation, which is stimulated by Mediator. [67][68][69][70] However, the kinase module (a.k.a. the CAK) is likely not the only TFIIH interaction with Mediator; 71 Trypanosoma brucei contains a TFIIH-like complex that lacks CAK homologs, 72 yet a 7-subunit core TFIIH complex forms a stable interaction with a T. brucei Mediator complex. 73

Promoter opening
The TFIIH subunit XPB is arguably the most important for pol II transcription, as it contains an ATPase and translocase activity that enables ATP-dependent opening of the promoter DNA at the transcription start site. 74 This opening of the DNA template is required for transcription initiation; the single-stranded "template" DNA can then descend into the cleft and engage the pol II active site. Moreover, promoter opening appears to represent an important regulatory stage for gene induction, at least in certain cell types or contexts. 75 As shown schematically in Figure 5, XPB interacts with downstream DNA and uses its 5'-3' DNA translocase activity 13,14 to open promoter DNA, acting as a molecular wrench. 76 Because upstream DNA is fixed through TBP/TFIID binding (which also bends the DNA), XPB 5'-3' translocation along the nontemplate strand (or 3'-5' translocation on the template strand) would generate torsional stress that would be relieved by opening/melting the duplex DNA around the TSS. Thus, XPB acts to reel downstream DNA into the pol II cleft. 14 The translocation mechanism for XPB has been most thoroughly studied with yeast TFIIH (XPB ortholog Ssl2), and biochemical data suggest Ssl2 enables DNA translocation in the 5'-3' direction; 13 in this case, translocation on the non-template strand would open the promoter DNA. This XPB-dependent "reeling" of DNA into the pol II cleft also helps explain why downstream DNA is required for TFIIH-dependent stimulation of transcription in vitro. 77 The XPB-dependent reeling of DNA into the pol II cleft requires continuous ATP hydrolysis, and after short (10 bp) translocation XPB dissociates from the DNA template; 13 because TFIIH is anchored to the PIC through TFIIE and the pol II stalk, 22,53,62 it is likely that XPB can rapidly reengage downstream promoter DNA as before. 
The melted template is unstable and can rapidly reanneal; however, PIC factors TFIIB, TFIIE, and TFIIF can interact with the template and nontemplate strands to stabilize the open complex. 59,61,62 Promoter re-annealing at the TSS would re-establish the PIC in the "closed" state, requiring renewed XPB function to generate the open complex. By contrast, a stable open template within the PIC can initiate transcription, but another barrier/regulatory checkpoint is promoter escape, which is described below.
In yeast, the TATA box can be variably spaced (i.e. farther upstream than the 30 bp from the TSS that is typical for human genes whose promoters contain TATA boxes). At yeast genes with TATA boxes greater than 30 bp upstream of the TSS, TFIIH-dependent promoter opening occurs upstream of the TSS; to ensure accurate initiation, pol II scans downstream sequences until the TSS is encountered. At that point, RNA synthesis is initiated. This process of TSS scanning occurs in a TFIIH- and XPB-dependent manner but does not require RNA synthesis. 78 However, TSS scanning in yeast appears to require continuous ATP hydrolysis and likely requires continuous translocation/reeling by the XPB/Ssl2 translocase. 13,14 Using single-molecule approaches (optical tweezers), the Block and Kornberg groups showed evidence suggesting a large open bubble (average size: 85 bp with the SNR20 promoter) as an intermediate in the TSS scanning mechanism. 79 By contrast, data from the Galburt and Hahn labs supported a small (6 bp) open bubble during TSS scanning, followed by formation of a larger bubble (13 bp) upon transcription initiation. 80 These contrasting models of TSS scanning may reflect differences in the experimental design; for instance, the large open complex (e.g. 85 bp single-stranded DNA bubble) was observed upon applying larger forces (assistive or hindering) to the template DNA and the methods of PIC assembly were distinct in each case. Although XPB/Ssl2-dependent TSS scanning does not appear to be important for human pol II transcription, it remains plausible that human TFIIH adopts similar scanning mechanisms as a nucleotide excision repair (NER) factor, to help localize the complex to DNA lesions. 81 Interestingly, the Kornberg lab showed that, in a simplified transcription system, loss of the TFIIH kinase module (called TFIIK in S. cerevisiae) reduced TSS scanning and caused transcription to favor an "upstream" initiation site. 82 These experiments lacked TFIID and could not be recapitulated in cells, suggesting auxiliary factors help enforce TSS scanning in S. cerevisiae. However, the data suggest new roles for the yeast kinase module (Tfb3, Ccl1, Kin28) in TFIIH-dependent TSS scanning. This could be controlled by the Tfb3 subunit (homolog of human MAT1), which is highly flexible and tethers TFIIH to both TFIIE and the pol II enzyme; 22 however, TSS scanning could also be enabled by Kin28 kinase activity. Kin28 substrates within the S. cerevisiae PIC include the pol II CTD and Mediator; 68,83 moreover, the Hahn lab has shown that Kin28 can promote ATP-dependent (i.e. transcription-independent) dissociation of the PIC to a re-initiation-competent scaffold complex. 83 Whether such Tfb3- or Kin28-dependent mechanisms underlie the link between TFIIK and pol II TSS scanning remains to be determined.

Promoter escape and promoter-proximal pausing
After formation of the open complex, pol II can initiate transcription but must break contacts with the PIC, in a process called promoter escape. Pol II promoter escape occurs after generation of a 12-13 base transcript and requires structural reorganization of TFIIB. [84][85][86] TFIIH contributes to promoter escape as well, through mechanisms involving XPB 87 and CDK7-dependent phosphorylation of the pol II CTD. The CTD of the RPB1/POLR2A subunit of human pol II contains 52 heptad repeats (26 in S. cerevisiae, 42 in Drosophila) of the general consensus sequence Y1-S2-P3-T4-S5-P6-S7. 88 In its unphosphorylated form, the pol II CTD forms a high-affinity 89 interaction with Mediator. 68,90,91 Pol II CTD phosphorylation by the TFIIH kinase CDK7 (which targets Ser5 and Ser7 in the CTD heptad repeats) [92][93][94] releases Mediator interactions with the CTD, 95,96 which may help regulate promoter escape. 97,98 Following promoter escape, pol II typically pauses after generating a transcript of 20-80 bases in length (Fig. 6). 99 Although promoter-proximal pol II pausing is widespread in metazoans, it is not uniformly observed in yeast. 100 Paused pol II complexes are largely lacking in the yeast S. cerevisiae, whereas paused intermediates are observed in S. pombe. 101 In human cells, it has been shown that selective inhibition of the TFIIH-associated kinase, CDK7, increases pol II promoter-proximal pausing at thousands of genes under normal growth conditions. 96 The DSIF (DRB sensitivity inducing factor) complex, which consists of SPT4 and SPT5, is present throughout eukaryotic lineages and is a well-established regulator of promoter-proximal pol II pausing. 102 DSIF associates with pol II after promoter escape; in fact, DSIF interactions with pol II are occluded by PIC factors TFIIB, TFIIE, and TFIIF. 103 The Gilmour lab has shown that DSIF-pol II interaction can be stabilized by nascent RNA emerging from the pol II exit channel 104 and structural data from the Cramer lab have implicated an upstream (i.e. behind transcribing pol II) "DNA clamp" and an "RNA clamp" adjacent to the RNA exit channel as potential domains important for regulation of pol II pausing. 103 The parallels between DSIF and CDK7 in the regulation of pol II activity are striking. CDK7 (Kin28 in S. cerevisiae) phosphorylates SPT5 42,45 and this may control DSIF-dependent regulation of pol II pausing (Fig. 6). Inhibition of CDK7 also reduces DSIF recruitment to promoter-proximal regions. 105,106 Depletion of SPT5 in human cells caused defects in RNA processing (e.g. defective capping and splicing) similar to those observed upon CDK7 inhibition. 107 Moreover, Spt5 depletion in murine embryonic fibroblasts (MEFs) caused defects in pol II processivity (e.g. premature termination on long genes) that roughly approximated processivity defects observed in yeast upon inhibition of Kin28. 108,109 In S. cerevisiae, Spt5 contributes to capping enzyme recruitment 110 and inhibition of Kin28 caused pol II accumulation near the +2 nucleosome, 109 similar to degron-depleted Spt5 cells. 111 Moreover, in S. pombe, Spt5 protein levels were shown to be sensitive to the kinase activity of its CDK7 ortholog Mcs6 (increased Spt5 protein upon Mcs6 inhibition). 112 Collectively, these results suggest a conserved functional co-dependence between the TFIIH kinase and DSIF.
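The repeat arithmetic for the pol II CTD can be made concrete with a short sketch. The Python below builds an idealized CTD from the consensus heptad Y1-S2-P3-T4-S5-P6-S7 and lists the Ser5 and Ser7 positions that CDK7 targets. It assumes every repeat matches the consensus exactly, which is a simplification made here for illustration (natural CTD repeats can diverge from the consensus); the function names are hypothetical, and the repeat counts are those quoted in the text.

```python
# Idealized pol II CTD built from the consensus heptad Y1-S2-P3-T4-S5-P6-S7.
# Assumption: every repeat is a perfect consensus copy, which real CTDs are not.
HEPTAD = "YSPTSPS"

# Repeat counts as quoted in the text.
REPEATS = {"human": 52, "S. cerevisiae": 26, "Drosophila": 42}


def idealized_ctd(n_repeats: int) -> str:
    """Return an idealized CTD sequence of n_repeats consensus heptads."""
    return HEPTAD * n_repeats


def cdk7_sites(n_repeats: int) -> list[tuple[int, str, int]]:
    """List (repeat number, site label, 1-based residue index) for Ser5 and Ser7."""
    sites = []
    for repeat in range(1, n_repeats + 1):
        offset = (repeat - 1) * len(HEPTAD)
        for pos in (5, 7):
            sites.append((repeat, f"Ser{pos}", offset + pos))
    return sites


if __name__ == "__main__":
    for organism, n in REPEATS.items():
        ctd = idealized_ctd(n)
        print(f"{organism}: {len(ctd)} residues, {len(cdk7_sites(n))} Ser5/Ser7 sites")
```

Under this idealization, each heptad contributes two potential CDK7 target sites, so the number of candidate sites scales directly with the repeat count.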
Another well-established regulator of pol II promoter-proximal pausing is the four-subunit NELF complex, which is conserved in metazoans but is absent in yeast genomes. NELF may act in concert with DSIF to stabilize pol II pausing, 99,102 and DSIF and NELF recruitment appears to be regulated by CDK7 kinase activity; 93,105,106 however, the mechanistic basis for these processes remains unclear. Human CDK7 also phosphorylates and activates CDK9 [Fig. 6(A)], 105 the P-TEFb kinase that is broadly implicated in the regulation of pol II pausing and pause release. 113 Like CDK7, activated CDK9 can also phosphorylate DSIF, 114 which may facilitate pol II pause release into productive elongation. CDK7 also appears to impact promoter-proximal pausing through its phosphorylation of the pol II CTD. Whereas unphosphorylated CTD interacts with the Mediator complex, TFIIH phosphorylation triggers a new set of CTD interactors, including RNA capping enzymes. 96,112,[115][116][117][118] Pol II pausing may have evolved in part to ensure that nascent transcripts are appropriately 5'-capped prior to the transition to productive elongation. 119,120 TFIIH-dependent phosphorylation of the pol II CTD also enhances PAF1 complex association, 96 and the PAF1 complex has been shown by the Roeder and Shilatifard labs to control pol II promoter-proximal pausing in human cells. [121][122][123]

Elongation, RNA processing, and termination
The TFIIH-associated kinase CDK7 (Kin28 in S. cerevisiae) influences pol II transcription beyond the promoter-proximal region, via phosphorylation of the pol II CTD. In yeast and mammalian cells, TFIIH-phosphorylated CTD both recruits 96,106,[116][117][118][124][125][126][127] and activates 115 5'-capping enzymes for the nascent RNA transcript. Moreover, pol II CTD phosphorylation has been shown to affect pol II elongation rates, and elongation rates, in turn, have been shown to affect alternative splicing in mammalian cells. [128][129][130] As a major pol II CTD kinase, CDK7 likely impacts pol II elongation in this way. In agreement, pol II CTD phosphorylation, including at Ser5 (a primary CDK7 target), has been shown to be important for splicing regulation, 131 and pol II pausing in gene bodies correlates with splicing, providing further evidence for a transcription rate effect. 132 CDK7 activity is also linked to pol II termination at gene 3'-ends. Inhibition of CDK7 causes termination defects in yeast and human cells. 93,96,105,133 However, the termination defects appear to manifest in mechanistically distinct ways in human and yeast cells. Increased read-through transcription was observed in human cells upon CDK7 inhibition 93,96,105 whereas premature termination was seen upon inhibition of the CDK7 ortholog Kin28 in S. cerevisiae. 109,134 Because human CDK7 activates the P-TEFb kinase CDK9 (another major pol II CTD kinase), 105 inhibition of CDK7 has amplified effects on CTD phosphorylation, and this may contribute to the distinct effects of CDK7 inhibition in yeast versus human cells.
The defects in pol II elongation (S. cerevisiae) and termination (human) upon inhibition of TFIIH kinase activity may also reflect TFIIH-dependent phosphorylation of SPT5, which is a conserved regulator of pol II elongation. 135 However, because few high-confidence targets of CDK7 have been identified in yeast or mammalian cells, it remains likely that other substrates contribute to the elongation, RNA processing, and termination defects observed upon inhibition of TFIIH kinase activity.

CDK7 as an epigenetic regulator of pol II transcription
Studies in yeast and mammalian cells have revealed that phosphorylated forms of the pol II CTD preferentially interact with H3K4 and H3K36 methyltransferases. [136][137][138][139] Comparative proteomics of factors bound to TFIIH- versus P-TEFb-phosphorylated pol II CTD implicated the H3K36 methyltransferase SETD2 as binding specifically to P-TEFb-phosphorylated CTD, whereas the H3K4 methyltransferase complexes SETD1A/B interacted with phosphorylated CTD more generally (i.e. TFIIH- or P-TEFb-phosphorylated). 96 Because CDK7 activates the P-TEFb kinase CDK9 in human cells, 105 these results implicate CDK7 as an epigenetic regulator of transcription-associated chromatin marks. In agreement, targeted inhibition of CDK7 in human cells decreased H3K4me3 spreading into gene bodies and altered the distribution of H3K36me3 toward gene 3'-ends. 96 Although the term "epigenetic" has taken on various meanings, 140 a strict definition requires a heritable trait that does not result from a change in genomic DNA sequence. DNA CpG methylation is considered heritable, 141 and emerging evidence suggests histone marks can also be heritable. [142][143][144] The transcription-associated histone marks that are affected by CDK7 kinase activity, H3K4me3 and H3K36me3, are each linked to DNA methylation. H3K4me3 marks are concentrated in regions lacking DNA methylation, 145 whereas H3K36me3 levels directly correlate with DNA (CpG) methylation. 146 In this way, the TFIIH-associated kinase CDK7 may indirectly control DNA methylation patterns in mammalian cells, with potential consequences for regulation of gene expression patterns in cell progeny. Incidentally, the histone marks affected by human CDK7 activity (H3K4me3 and H3K36me3) have also been shown to influence splicing. [147][148][149]

CDK7 as a potential regulator of DNA-binding TFs
Sequence-specific DNA-binding TFs regulate all physiological processes; CDK7 is known to phosphorylate numerous DNA-binding TFs, including p53 150 and nuclear receptors. [151][152][153][154] Not all phosphorylation sites are functionally relevant, however, 155 and it remains to be determined how TFIIH-dependent TF phosphorylation directly controls TF function.

Regulators of TFIIH Function: TFIIE and Mediator
TFIIE is a well-established regulator of TFIIH; TFIIE helps recruit TFIIH to the promoter and both TFIIE and TFIIH enable promoter opening. 65,156,157 A physical basis for recruitment was observed from recent cryoEM data from the Cramer lab, which showed four separate contacts between TFIIE and TFIIH within a minimal yeast PIC. 22 An "E-bridge" helix contacts XPB (Ssl2), and p62 (Tfb1) contacts TFIIE through separate "E-floater" and "E-dock" helices. Specifically, the E-dock interacts with the p62 (Tfb1) PH-like domain, the E-floater contacts p62 (Tfb1) through its BSD1 domain, and the E-bridge helix connects the p62 (Tfb1) BSD2 domain to XPB (Ssl2) lobe 2. 22 The E-bridge interaction with p62 (Tfb1) and XPB (Ssl2) may influence conformational ratcheting, supporting a role for TFIIE-dependent stimulation of DNA opening. A TFIIE-TFIIH interaction was also observed through XPD (Rad3) in the yeast minimal PIC structure, 22 at the base of the pol II stalk, in agreement with cryoEM data with human factors. 59,62 Mediator has also been shown to regulate TFIIH function. Mediator directly interacts with TFIIH 22,71 and stimulates CDK7 kinase activity. [68][69][70] Structural data from the Kornberg and Cramer labs reveal direct binding between the yeast TFIIH CAK and the Mediator hook, knob, and shoulder domains, 22,89 implicating these interactions in CDK7 (Kin28) activation. Because Mediator binds with high affinity to the unphosphorylated pol II CTD, 89,90 Mediator may also position the CTD favorably for CDK7-dependent phosphorylation. 22,89,158 Pol II CTD phosphorylation releases its interaction with Mediator, 95,96 which may contribute to pol II promoter escape. 97,98 The Mediator-associated kinase CDK8 (Srb10 in S. cerevisiae) also appears to regulate CDK7 (Kin28) function through mechanisms that remain unclear. 83,159 Moreover, biochemical experiments from the Malik group implicate Mediator-dependent regulation of TFIIH, perhaps through activation of XPB. 160
Rimel and Taatjes

Cellular Roles for TFIIH Beyond Transcription
Subunits of TFIIH have been linked to important cellular functions that do not involve pol II transcription. Several of these functions are described in the following sections.

DNA repair
The TFIIH core subunits XPD and XPB both mediate the response to DNA damage and structural interactions between the CAK and TFIIH core subunits serve to regulate XPD and XPB function during nucleotide excision repair (NER). The helicase activity of XPD is essential for its NER function, 161 and C-terminal mutations of XPD that destabilize its interactions with p44 reduce its NER activity. 33 The TFIIH CAK subunit MAT1 negatively regulates the XPD ATPase/helicase, 58 and structural data reveal a direct interaction between MAT1 (Tfb3 in S. cerevisiae) and XPD (Rad3 in S. cerevisiae) that likely enables this regulation. 22,34,53 Removal of the CAK thus activates XPD for unwinding the DNA around the lesion site. A DNA repair factor that performs this task is XPA. 18,25 Incidentally, MAT1 and CCNH each contribute to the stability of the CAK, ensuring it remains stable (and CDK7 active as a kinase) apart from the TFIIH core. A role for XPB in nucleotide excision repair remains enigmatic, although its ATPase/helicase function appears to be important for TFIIH localization to the site of DNA damage. 81 These complementary roles for XPB and XPD facilitate recognition and enable repair via other factors (reviewed in refs. 1-3).

Additional roles for XPB and XPD: retroviruses and G-quadruplexes
The TFIIH core subunits XPB and XPD have been found to protect against retroviral insertion into the host genome. This occurs through degradation of retroviral cDNA; 162 XPB or XPD mutants deficient in NER activity had higher rates of viral integration and these effects did not appear to result from changes in transcriptional activity. 162,163 Thus, it appears that XPB and XPD act not only to detect and initiate repair of DNA damage, they also serve to safeguard the host against retroviral infection. The correlation between NER-deficient XPB or XPD and viral genome integration suggests that retroviral insertion favors sites of unrepaired DNA damage.
ChIP-Seq data in human cells have shown that XPD and XPB genomic occupancy overlaps with sequences predicted to form G-quadruplex structures, 164 which are four-stranded helical structures that can form in GC-rich genomic regions. 165 Additional biochemical experiments suggested that whereas both XPB and XPD can bind G-quadruplex DNA, only XPD is capable of unwinding these structures. These biochemical assays were completed with single-subunit XPB and XPD homologs from thermophilic microbes, however, and it remains to be seen whether similar activities are recapitulated within the context of the entire TFIIH complex. 164 These results suggest additional functions for the TFIIH XPB and XPD subunits that may expand upon and/or contribute to their established roles in pol II transcription and DNA repair.

Cell cycle regulation
In mammals, the TFIIH CAK subunits (MAT1, CDK7, CCNH) are essential for cell cycle regulation. For example, loss of MAT1 in mice caused defects in S phase entry 166 and inhibition of CDK7 functions similarly by preventing CDK1 and CDK2 activation in human cells. 167 The CAK regulates the cell cycle through T-loop phosphorylation and activation of CDK1, CDK2, CDK4, and CDK6. [168][169][170] Whereas its role in cell cycle regulation is not conserved in S. cerevisiae, the CDK7 ortholog in S. pombe, Mcs6, does control cell cycle progression, in coordination with the Lsk1 kinase. 112,171 Taken together, these findings reveal that whereas the complete TFIIH complex (i.e. core + CAK) is required for transcriptional activity, the CAK is important for cell cycle regulation.

Neurogenesis and memory formation
Recent studies in model organisms have linked the TFIIH kinase CDK7 to neurogenesis and long-term memory formation. In the developing mouse neocortex, Cdk7 expression was observed to correlate with miR-210 expression, which targets Cdk7. Consistent with a role for miR-210 in regulation of Cdk7 levels throughout neuronal development, decreased Cdk7 (or increased miR-210) promoted differentiation of neural progenitors to post-mitotic neurons. 172 Reduced or elevated Cdk7 levels also had predictable effects on cell cycle progression and proliferation of neural progenitor cells (see above). 172 These findings are in general agreement with links between Cdk7 in mouse development and stem cell maintenance. 173 A study in C. elegans has also connected Cdk7 activity to neuronal differentiation, suggesting ancient links to neurogenesis. 174 In a mouse model study of post-mitotic neurons, He et al. observed that Cdk7 expression was increased compared with developing neurons, and that Cdk7 inhibition (with THZ1) impaired long-term memory formation, whereas short-term memory was unaffected. 175 Collectively, these findings correlate CDK7 activity with neuronal development and function; however, these links likely reflect, at least in part, the key requirement for CDK7 in pol II-dependent gene expression. Memory formation requires new transcription (e.g. of immediate early genes, many of which are DNA-binding TFs), and these findings with CDK7 are reminiscent of other studies that have linked general regulators of pol II transcription to memory formation in mammals. 176

Pathologies Associated with TFIIH Function
Defects in TFIIH function are linked to developmental diseases and numerous cancers (Table II), and TFIIH is also targeted by several viral pathogens. These are summarized below.

Developmental diseases
The developmental diseases Xeroderma Pigmentosum (XP), Cockayne Syndrome (CS), and Trichothiodystrophy (TTD) are associated with mutations in XPB, XPD, and p8. These mutations negatively affect a range of TFIIH functions in transcription and/or NER. 57 However, TTD-associated mutations typically affect TFIIH transcriptional activity whereas XP and XP/CS are typically associated with NER deficiencies. Although XP, CS, and TTD have distinct symptoms, each results from recessive mutations and all share UV sensitivity. 177 As shown in Table III, XPD mutations are more numerous compared with XPB or p8 mutations identified in the clinic.
XP is an autosomal recessive disorder resulting in skin abnormalities and neurodegeneration. The UV sensitivity associated with XP greatly increases the chance of melanoma and squamous and basal cell carcinomas. 177 XP is associated with mutations in XPD and XPB, as well as other proteins that are not TFIIH subunits. As shown in Table III, the nonsense mutation in XPB (R425STOP) results in XP phenotypes along with other frameshift mutations. The R683W XPD mutation represents a common XPD site associated with XP, although more than 30 other XPD mutations have been identified in the clinic. XP patients show deficient XPD helicase activity, consistent with defects in NER. 178
CS is a recessive disease that varies in the severity of symptoms, but typically is associated with developmental and neurological disorders. The mildest form of CS, called UVSS, manifests as mild photosensitivity. More moderate forms, called CSI and CSII, exhibit a multitude of symptoms including reduced lifespan (roughly 12 years), dwarfism, retinopathy, microencephaly, and other developmental defects. The most severe form of CS, called COFS, is neonatal lethal. 177
Mutations in XPB and XPD also contribute to XP/CS, a condition with overlapping symptoms of XP and CS. 57 Symptoms include skin abnormalities associated with XP and severe neurological and developmental defects associated with CS. 57 As shown in Table III, the XPB mutations F99S (missense) and Q545STOP (nonsense) and the XPD missense mutation D681H are commonly associated with XP/CS; however, other frameshift mutations in these subunits have been documented. These XPB and XPD mutations may prevent CAK module dissociation that is required for NER or could block normal function of other DNA repair factors to cause delayed transcriptional resumption (i.e. after DNA damage) commonly seen in XP/CS patients. [179][180][181]
TTD is an autosomal recessive disorder with symptoms including brittle hair and ichthyosis. Those with TTD may also have reduced lifespan and mild to severe mental retardation. TTD is associated with mutations in XPB, XPD, or p8 (Table III), with many mutants identified in XPD, including missense and frameshift mutations. 177,182,183 TTD is associated with an overall decrease in cellular TFIIH concentration and basal transcriptional defects. 178,183

Viral pathogenesis
TFIIH is targeted by several proteins expressed by pathogenic viruses. 184 For example, the NSs protein expressed by the Rift Valley Fever Virus (RVFV) was shown by the Egly lab to bind the TFIIH subunit p44, resulting in overall loss of TFIIH function. 185 Other labs later showed that the RVFV NSs protein can similarly target and degrade TFIIH subunits such as p62. 186,187 Moreover, the HIV Tat protein binds the TFIIH CAK and activates CDK7 to promote expression of the viral genome. 188,189 TFIIH recruitment to the HIV-1 promoter was also shown to be an essential, late-stage event required for HIV emergence from a latent state. 190 These TFIIH links to viral pathogenesis are underscored by the finding that CDK7 inhibition has broad-spectrum antiviral activity, blocking replication of cytomegaloviruses, herpesviruses, and adenoviruses in human cells. 191

Cancer
Numerous cancers appear to be dependent on elevated CDK7 activity to drive their oncogenic state. 192,193 Because of its role in cell cycle regulation, CDK7 activity cannot be entirely de-coupled from its broad transcriptional effects. However, elevated expression of oncogenes is enabled by clusters of enhancers, or super-enhancers, [194][195][196] that require CDK7 activity to maintain their high-level expression. 197 Consequently, these genes are especially sensitive to CDK7 inhibition. Small cell lung cancers, triple negative breast cancer, and T-cell acute lymphoblastic leukemias are each aggressive cancers with high mortality rates; notably, each is sensitive to THZ1, a covalent inhibitor of CDK7. [198][199][200][201] As expected, CDK7 inhibition by THZ1 is broadly cytotoxic, 200 and although THZ1 inhibits other kinases, it represents a powerful means to assess potential roles for CDK7 kinase activity in cancer cell proliferation. Indeed, THZ1 has revealed numerous transcriptional dependencies in cancer cell lines that suggest novel combinatorial approaches (e.g. CDK7 inhibition + conventional therapeutic) for cancer treatment. 202,203 Furthermore, pre-clinical studies have shown that THZ1 can inhibit the emergence of multi-drug resistance in human cells and mouse models. 204 A biological rationale for these findings is that drug resistance requires adaptive transcriptional reprogramming that is blocked by persistent, low-level inhibition of CDK7. 204 These and other results provide a promising proof-of-concept that further development of CDK7 inhibitors could yield therapeutic strategies that will be effective in the clinic.

Small Molecule Inhibitors of TFIIH Function
Although other small molecule inhibitors of CDK7 have been described, 191,203,205 the most widely used has been THZ1, developed in the lab of Nathanael Gray. 200 THZ1 is an ATP analog that contains an extension with a Michael acceptor that is positioned to react with C312, which is adjacent to the CDK7 ATP binding site. As noted above, THZ1 has served as a valuable tool to probe the role of CDK7 in mammalian cells; however, although it is most potent against CDK7, it also inhibits dozens of other kinases. Consequently, experiments with THZ1 must be interpreted with caution.
Chemical genetics methods, pioneered by the Shokat lab, 206 have been successful in selective inhibition of CDK7 activity in yeast and human cells. 42,83 This strategy requires a "gatekeeper" mutation (e.g. F91G in human CDK7) in the ATP binding pocket that will accommodate a bulky ATP analog such as NM-PP1. In this way, normal kinase function is retained; however, in the presence of NM-PP1 (which cannot bind native kinases and therefore does not compete with ATP), the mutant CDK7 allele is selectively inhibited. The Ansari lab has advanced this concept to design a covalent inhibitor of yeast CDK7, Kin28. 109 Studies that have implemented these "chemical genetic" strategies have markedly advanced our understanding of the roles of the TFIIH kinase in transcription; however, because genome editing is required, these approaches cannot be easily translated to the clinic.
Triptolide is a highly selective inhibitor of XPB that covalently binds at Cys342 to inhibit its ATPase activity. 207,208 Importantly, an XPB C342T mutant was shown to be resistant to triptolide treatment, confirming XPB as the relevant cellular target. 207 Triptolide is a potent inhibitor of pol II transcription (IC50 of 12 nM at 12 hours of treatment 208); however, numerous labs have noted that pol II degradation is induced with prolonged triptolide treatment times. [209][210][211] Thus, triptolide treatment must be brief to reliably assess its effects on pol II transcription. Using a short (1 hour) triptolide treatment in human cells (HCT116), the Shilatifard lab showed that transcription of most pol II genes was inhibited, but a small subset were resistant. 209 This suggested that this subset of genes was either stably paused or may not be dependent upon XPB for activation. Similar results were seen by the Lis lab in triptolide-treated murine ES cells. 212 Using a nascent RNA technique called GRO-Seq, 213 the Lis lab noted that, as expected for an XPB inhibitor, triptolide caused genome-wide loss of new transcription and promoter reads over time.
Another small molecule that targets XPB is spironolactone, which appears to selectively degrade the XPB protein while retaining the structural integrity of the rest of the TFIIH complex. 214 This led to the provocative conclusion that, contrary to existing models, XPB function was not generally required for pol II transcription in human cells. 215 Because XPB requires continual ATP hydrolysis to maintain an open DNA template, 13,14,64 these findings suggested a unified mechanism of promoter opening from bacteria to mammalian cells that does not require ATP hydrolysis. Whereas other studies have suggested that TFIIH/XPB function may not be required at all pol II-transcribed genes, 61,209,216 the data with spironolactone were the first to suggest XPB function was not generally required for pol II transcription in human cells. 215 The spironolactone data should be interpreted with caution, however, and not simply because they contradict well-tested models of pol II open complex formation in eukaryotic cells. Spironolactone is a reactive compound and possesses functional groups (e.g. Michael acceptor) that are designated as PAINS (Pan-Assay INterference compoundS) by the medicinal chemistry community. 217 This is corroborated by its wide range of physiological effects in the clinic. Based upon structural data and mutagenesis experiments, XPB itself appears to be important for maintaining the integrity of the TFIIH complex, 22,34,53,218 and thus it seems unlikely that XPB could be selectively extracted and degraded without negatively affecting TFIIH structural integrity. It is notable, however, that evidence in yeast suggests that XPB (Ssl2) may be labile, dependent upon the Tfb6 protein, 219 which lacks a clear ortholog in human cells.

Conclusions and Outstanding Questions
Many basic cellular functions for TFIIH have been defined, and its known links to human disease are extensive and are certain to expand. Below we highlight a few, among many, interesting and outstanding questions.
How does TFIIH function within the entire PIC during pol II initiation, promoter escape, and promoter-proximal pausing? Because the TFIIH subunit XPB binds DNA downstream of the TSS, pol II must bypass XPB to transition to productive elongation. TFIID also binds DNA downstream of the TSS, 220 and functional interactions between TFIIH and TAF7 have been described. 221 65 Antisense transcription (i.e. on the non-template strand, transcribing in the opposite direction) is widespread in mammalian cells 213,223 and would promote negative supercoiling at the promoter. Potentially, this could preclude XPB action during pol II transcription initiation at some genes.
How does TFIIH function during DNA repair? The PH-like domain of p62 has been shown to interact with several DNA repair factors 224,225 in a process that is regulated in part by the chromatin remodeler CHD1. 226 How are these interactions controlled? How does TFIIH signal other repair factors to mobilize to sites of DNA damage?
The ATPase/helicase/translocase function for the TFIIH subunits XPB and XPD is implicated in a variety of cellular processes. What range of substrates do XPB or XPD act upon? Could XPB or XPD unwind RNA structures or help resolve R-loops? Novel TFIIH-associated factors continue to be identified. 227 Do these auxiliary factors alter XPB or XPD function?
What are the most functionally relevant substrates for CDK7? Whereas several key substrates have been identified that support its role in cell cycle regulation, few transcription-related targets are known beyond CDK9, SPT5, and the pol II CTD. A first step in understanding the cellular roles of any kinase is to identify its substrates. If this can be accomplished, the next challenge will be to determine the functional consequences of specific phosphorylation events.
Will improved inhibitors of TFIIH be developed? Whereas small molecule inhibitors are likely to remain widely used as molecular probes for basic science research, how will TFIIH inhibitors fare in the clinic?
Progress toward answering these and other questions will continue to rely on improved biophysical, chemical, and cellular tools (e.g. transcriptomics). Mechanistic insights will require structural and in vitro ensemble and single-molecule approaches that should be able to parse out key structural and functional intermediates. As in previous decades of TFIIH research, clinical identification and biochemical characterization of disease-associated mutants will also advance understanding and may yield new strategies for molecular therapeutics.