The human papillomavirus late life cycle and links to keratinocyte differentiation

Regulation of human papillomavirus (HPV) gene expression is tightly linked to differentiation of the keratinocytes the virus infects. HPV late gene expression is confined to the cells in the upper layers of the epithelium where the virus capsid proteins are synthesized. As these proteins are highly immunogenic, and the upper epithelium is an immune‐privileged site, this spatial restriction aids immune evasion. Many decades of work have contributed to the current understanding of how this restriction occurs at a molecular level. This review will examine what is known about late gene expression in HPV‐infected lesions and will dissect the intricacies of late gene regulation. Future directions for novel antiviral approaches will be highlighted.


| INTRODUCTION
Human papillomaviruses (HPVs) are nonenveloped DNA viruses that infect cutaneous and mucosal epithelia, causing mainly benign lesions. 1,2 Currently, there are 227 fully classified HPV genotypes divided into α-, β-, γ-, μ-, and ν-papillomaviruses. 3,4 The α-papillomaviruses infect mucosal or cutaneous epithelia, while other papillomaviruses infect cutaneous epithelia. Fourteen α-HPV genotypes (HPVs 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68), the "high risk" HPVs (HR-HPVs), are oncogenic and upon persistent infection cause cancer progression 5,6 mainly in anogenital and oropharyngeal sites. 7 The HR-HPVs HPV16 and HPV18 account for the biggest burden of disease, being responsible for up to 70% of cervical and oropharyngeal cancer cases (mainly HPV16). 7 HPV infects epithelial cells and uses the cellular DNA replication/repair and protein synthesis machinery for successful replication. 8 HPV avoids immune detection by establishing a persistent, low-level infection in basal epithelial cells, 9 and limiting productive infection to the upper layers of the epithelium where immune surveillance is restricted. 10,11 Cancer formation is an unwanted side effect of viral strategies that allow persistence and replication in the face of the host immune response against infection.

| The epithelium and the epithelial barrier
Squamous epithelia are divided into cutaneous epithelia on skin surfaces and mucosal epithelia on inner body surfaces. The epithelium consists of the basal, spinous, granular, and, for cutaneous epithelia, cornified layers (Figure 1). 17 The basal layer contains epithelial stem cells, the only epithelial cells normally capable of cell division. Basal cell division yields two daughter cells which may remain in the basal layer or may generate "transit amplifying cells" which undergo a finite number of cell divisions before switching to a differentiated phenotype to supply the cells of the spinous layer. 18 Spinous layer cells flatten their shape as they become more differentiated. The granular layer contains cells in the process of losing their nuclei and cytoplasmic organelles. 19 These cells synthesize keratins and filaggrin to form a tight fibrous network. Transglutaminase cross-links involucrin, first synthesized at an earlier differentiation stage, and small proline-rich proteins to cell membrane proteins. The epithelium is an immune-privileged site. Although T-cells, mast cells, and dendritic cells are present in the dermis, only Langerhans cells penetrate the epidermis. 20 Epithelial keratinocytes are considered sentinel cells of the immune system because they present a toll-like receptor (TLR)-activated innate immune barrier to infectious agents, including viruses. 21

| The HPV life cycle and the infected epithelium
The HPV double-stranded DNA genome is ~8 kb in length. 1 Transcription is from one strand, resulting in the expression of proteins E1, E2, E4, E6, and E7 (Figure 1), which have regulatory functions in viral replication, transcription, innate immunity to HPV infection, and inhibiting host cell differentiation. Most human papillomaviruses studied so far also express E8^E2, while the α-papillomaviruses express E5. The late region encodes the two capsid proteins L1 and L2 (Figure 1).
HPVs infect epithelia by binding first to the basement membrane, then, for α-papillomaviruses, binding and infecting basal layer cells. 22,23 β-papillomaviruses may enter stem cells in hair follicles to establish latent infection. 24 Upon entry, HPV virions undergo retrograde transport to the trans-Golgi network and then, in an innate immune avoidance strategy, enter membrane vesicles. Viral DNA then associates with chromosomes upon nuclear envelope breakdown during cell division. When the nuclear envelope reforms, viral genomes are located in PML nuclear bodies, where replication begins. 1,22 Initial gene expression in an infected basal epithelial cell involves the expression of viral early proteins E1 and E2, 25 which activate viral replication leading to amplification of the incoming genome to between 50 and 100 viral genomes. 26 E8^E2 restricts viral genome amplification to this low level by repressing E1/E2-mediated viral genome replication and transcription. 27 Basal layer daughter cells each contain an equal number of HPV genomes, which are thought to be relatively silent transcriptionally. 25 When these cells begin to differentiate into the spinous layer, they initiate full viral gene expression. First, the viral early gene products E5, 28 E6, 29 and E7 30 are synthesized to repress cellular differentiation, activate cell cycle progression, and inhibit apoptosis, which would be the normal consequence of inappropriate cellular replication activity in differentiating cells. 
8 They also repress the antiviral innate immune response. 11 E1 and E2 expression is reinitiated for viral genome replication 31,32 (Figure 1). The host cell recognizes viral replication in differentiating cells as abnormal and induces the ATR and ATM DNA damage response and repair (DDR) pathways leading to damage repair-induced viral genome amplification to many thousands of copies. 33,348][39][40][41][42] E5 is also expressed at late stages of the α-papillomavirus life cycle 43 where it reprograms differentiating cells to retain proliferation capacity to allow DNA synthesis and, at least for HPV31, may subtly support viral genome amplification and late gene expression. 44,45 E6 and E7 expression increases late in the life cycle of HPV16, 18, and 31 [46][47][48] to maintain cellular proliferation to facilitate viral genome amplification and late gene expression. 41 In the final, late phase of the HPV life cycle, the L1 major and L2 minor capsid proteins are expressed to form the virion 49,50 (Figure 1). These proteins are highly immunogenic, so delaying their expression to the upper epithelial layers allows virion formation without triggering an immune response.

Figure 1. The stratified epithelium and HPV infection. (A) The normal stratified mucosal epithelium consists of basal, spinous (lilac-colored cells), and granular layers (purple-colored cells). The dermis is shown in pale pink. Arrows to the right-hand side indicate approximate positions of expression of the named differentiation markers. (B) An HPV-infected mucosal epithelium supporting a productive viral life cycle. E1, E2, E6, E7, and E8^E2 are all expressed in basal keratinocytes. 12-14 E1^E4, L1, and L2 expression is restricted to the upper epithelial layers. 15,16 The cytopathic effect of HPV infection, enlarged nuclei, and a perinuclear halo (koilocytes) are shown in the upper layers where virions (small red circles) are produced. Triangles to the right-hand side indicate expression profiles of viral early and late proteins. Created with BioRender.com. HPV, human papillomavirus.
In this review, we provide an overview of the current understanding of late events in the virus life cycle. We will explore the molecular and cellular mechanisms controlling the expression of the capsid proteins and the formation and egress of newly formed virions. We include some information on innate immunity to HPV infection as it pertains to viral genome amplification and late gene expression but readers are referred to the following reviews for a full analysis of immunity to HPV. Although some key innate immune factors are upregulated upon HPV-associated keratinocyte differentiation, 52 the virus has evolved countermeasures to ensure that viral replication can occur successfully. For example, E6 and E7 can each suppress STAT-1 expression, even when levels increase upon differentiation of both normal and HPV-positive keratinocytes, 53 and reduced STAT-1 levels are required for viral genome amplification. Repression of TGF-β by E6 results in differentiation-specific downregulation of the keratinocyte-specific interferon, IFN-κ, to facilitate late events in the life cycle. 54,55 HPV31 E5, which is expressed late in the virus life cycle, has also been shown to downregulate IFN-κ and the ISGs it controls via repression of the JAK-STAT pathway, leading to viral genome amplification and late gene expression. Caspases 3, 7, 8, and 9 have all been shown to be slightly upregulated upon differentiation of HPV-infected cells. 56 Apoptosis is not induced by these low caspase levels but the interferon response, which would otherwise clear the infection, is repressed. Therefore, the activity of these caspases and downregulation of interferon signaling are required for late events in the HPV life cycle. 
55 The status of the differentiated host cell itself contributes to the spatial regulation of HPV late gene expression; expression of the late transcripts is repressed in less differentiated epithelial cells and induced in terminally differentiating keratinocytes. These processes are regulated at transcriptional and posttranscriptional levels and involve interplay between viral and cellular proteins. 57 The following sections explain in detail what is currently known about mechanisms controlling late gene expression.

| Transcription
In undifferentiated cells, inappropriate capsid protein expression may be inhibited by a range of transcriptional and posttranscriptional mechanisms. It is likely prevented at the transcriptional level by viral late promoter repression, although late promoter regulation requires further study. The episomal genome may be epigenetically repressed in basal keratinocytes. 46,47 What is known so far is that the HPV31 late promoter displays active histone marks in both undifferentiated and differentiated keratinocytes 46 and, for HPV16, RNA polymerase II (RNA Pol II) is already loaded onto the late promoter in undifferentiated cells. 58 Transcription elongation is inhibited because low levels of cyclin-dependent kinase 9 (CDK9) lead to hypophosphorylation of the RNA Pol II carboxyl-terminal domain (CTD), which precludes its activation. 58 Differentiation stage-specific relative levels of essential transcription-activating or repressive factors are likely to play a role in repressing the late promoter in the early stages of the life cycle. 59 BRD4 is a key E2 partner protein controlling replication, transcription, and episomal genome segregation. 60 BRD4S, a short form of BRD4, can bind and inhibit E2 activity and repress late promoter activation in undifferentiated epithelial cells, possibly by altering chromatin conformation. 61[64] Transcription elongation from the HPV early promoter is likely to interfere with RNA polymerase binding at the late promoter. 65 Therefore, active transcription of the early gene region likely represses late gene expression by steric hindrance/transcriptional interference.[68] If transcription progresses into the late region in less differentiated cells, a U-rich RNA element exists at the end of the L1 ORF spanning into the late 3′ untranslated region (3′UTR) which may repress late gene expression in undifferentiated epithelial cells. 76 For HPV16 and HPV31, the element, termed the negative or late regulatory element (NRE, LRE), binds a U1 snRNP-like complex [77][78][79] that has been shown to inhibit polyadenylation 80 (Figure 2). Such improperly processed late RNAs would not be licensed to be exported to the cytoplasm and would be degraded in the nucleus.

| Splicing
Splicing is regulated by binding of U1 snRNP to the 5′ splice site followed by location of the 3′ splice site through splicing factor U2AF and splicing factor 1 (SF1) recruiting U2 snRNP to the intron branch point. These events are followed by the formation of the entire spliceosome and subsequent steps in splicing (see Graham and formation and/or activity of the cleavage/polyadenylation complex on the early 3′UTR. Splicing repressors hnRNP L, hnRNP A1, hnRNP A2/B1, and SAM68 were shown to bind to elements in the L1 coding region to inhibit late gene expression, possibly by inhibiting binding of U2AF and formation of an early splicing complex at the 3′ splice site at the 5′ end of the L1 coding region [84][85][86][87][88] (Figure 2). Thus, repression of splicing has direct links to early polyadenylation regulation to inhibit mature late mRNA production in undifferentiated HPV-infected cells.

| RNA stability and translation
The processes of mRNA stability and translation are molecularly linked. 89 Defective mRNAs are detected on the ribosomes and targeted for degradation while the stability of viable mRNAs is positively controlled during translation. 89 The late 3′UTR LRE controls HPV16 late mRNA stability. 90,91 HPV16 LRE-containing RNAs were very unstable in a HeLa cell in vitro decay assay, 90 while in cells transfected with an expression vector containing the HPV1 L1 gene and a portion of the 3′UTR, cytoplasmic L1 mRNAs were detected but no L1 protein was synthesized. 91 These data suggest that L1 mRNAs were unstable and/or unable to be translated in undifferentiated cells.
HuR, a protein that promotes nuclear export and mRNA stability, binds an AU-rich element in the HPV1 late 3′UTR, and low cytoplasmic HuR levels correlated with inhibition of expression of a reporter gene containing the 3′UTR. 92,93 The HPV16 LRE also binds HuR. 94 In undifferentiated keratinocytes, HuR overexpression resulted in unscheduled expression of the L1 capsid protein, 94 suggesting that low HuR expression in these cells leads to unstable capsid-encoding mRNA. 66,68 HPV1 and HPV16 capsid protein expression is inhibited in undifferentiated cells via two RNA regulatory elements, one located at the 5′ end and another at the 3′ end of the HPV16 L2 ORF. The 5′ element inhibited cytoplasmic mRNA stability, while the 3′ element bound hnRNP K and poly(rC) binding proteins 1 and 2 to inhibit translation of L2 mRNA in vitro, 74,84,95 suggesting that these proteins are required for efficient translation in differentiated HPV-infected keratinocytes. A stability/translation regulatory element consisting of three UUUUU-motifs present in the HPV1 late 3′UTR bound hnRNP C to inhibit CAT reporter mRNA translation. 96,97 hnRNP C can regulate RNA stability either directly in the cytoplasm 98 or via inhibition of nuclear pre-mRNA degradation or prevention of pre-mRNA export to the cytoplasm. 99,100 The role of hnRNP C and how it controls HPV1 late gene expression remain to be investigated.
Finally, depletion of CUG binding protein 1 (CUGBP1) in HeLa cells could repress mRNA translation from a Renilla reporter gene construct containing the GU-rich 3′ portion of the HPV16 LRE, suggesting that this factor may contribute to translational repression of mRNA expression in undifferentiated keratinocytes. 80

| Activation of late gene expression in differentiated epithelial cells

| Transcription
Late promoter activation results in the expression of mRNAs encoding E1^E4, E5, L2, and L1. 1 This promoter is positively controlled by CDK8, CDK9, BRD4, 58 and by E7. 101 As noted above, transcription elongation by RNA Pol II may commence in less differentiated cells but is only fully activated in differentiated cells.
CDK8 is recruited to the Mediator complex via BRD4. This is followed by CDK9 recruitment to phosphorylate the CTD of RNA Pol II, which activates transcription elongation. 58 this results in upregulation of KLF4 and altered expression of its target genes and upregulation of viral genome replication and late gene expression. 104,105 KLF13 was shown to similarly affect the late stages of the HPV life cycle but was also found to be required for STAT5 phosphorylation and subsequent activation of the ATM DDR 106 to facilitate viral genome amplification. 107,108 HPV minichromosomes are composed of up to 32 nucleosomes. 59,109 Nucleosomal histones can be acetylated, methylated, and phosphorylated by chromatin remodeling factors, and at least for HPV16, HPV18, and HPV31, the late promoter is regulated by such changes. 46,59,110,111 The HPV31 late promoter has an open (active) chromatin conformation in both undifferentiated and differentiated keratinocytes but activated chromatin marks increase significantly in differentiated cells, allowing binding of transcriptional activator CCAAT/enhancer-binding protein (CEBP)-α to the late promoter. 469 this chromatin loop is disrupted and uncovers activity of the enhancer in the URR, 115 which is required to activate the HPV18 late promoter. 47,48 Subsequently, the late promoter acquires histone H4 acetylation associated with increased recruitment of transcriptional activators. Finally, histone variants, especially those of histones H3 and H4, can affect transcription by altering nucleosome and chromatin structure. Histone H3.3 variant was found to be enriched in HPV virions, which may suggest that this variant was recruited to viral chromatin before encapsidation to support active late gene transcription. 114 SIRT1 is a histone deacetylase that can bind and activate chromatin and is required for the formation of DNA repair complexes at double-strand breaks, and as such it activates HPV31 replication and late gene expression. 116 SETD2 is a histone H3 methyltransferase (H3K36me3) that activates transcription elongation and is required for HPV replication. It also controls alternative splicing of the late RNAs, 117 possibly through the known link between the ATM DDR and alternative splicing regulation. 118 In summary, accurate differentiation stage-specific late promoter activation is regulated by multiple processes including DNA replication, 102 chromatin remodeling/transcription, controlled by differentiation stage-specific transcription complexes, and inhibition of repressive transcription factors.

| Polyadenylation
Upon keratinocyte differentiation, there is a switch in the use of polyadenylation sites in the HPV genome. Early mRNAs continue to terminate at the pAE, but use of the late polyadenylation site (pAL), located downstream of the L1 gene in the late 3′UTR, is upregulated. 57,69 Most late mRNAs initiate at the late promoter in the E7 ORF 3,4 so RNA Pol II must ignore the pAE to terminate at the pAL. Possible mechanisms of repression of pAE in differentiated keratinocytes include downregulation of auxiliary polyadenylation factors required to enhance the pAE, as discussed above, and changes in alternative splicing to produce viral late mRNAs such as E1^E4^L1, E6E7^L1 and E1^L1 which splice out the pAE. Some late mRNAs do not splice out the pAE, but E2 can inhibit recognition of the pAE in these transcripts, resulting in readthrough of transcription into the late region. 119

| Splicing
Most HPV RNAs undergo alternative splicing. 120,121 Together with differential promoter usage, this strategy ensures that each viral ORF is present as a first ORF in an mRNA (subsequent ORFs are inefficiently translated 122 ) and may allow the virus genome to encode all its proteins efficiently. 68 Many splicing regulators have been shown to bind to HPV RNAs (Table 1). 123 A key event in late gene expression is the splicing out of the introns between the E1 and E4 ORFs and between the E4 and L1 ORFs (Figure 2). E1^E4^L1 is the major late transcript encoding L1.
Additional L1-encoding mRNAs include rare transcripts E6E7^L1, E6^E4^L1, E1^L1, and L1 initiating from the weak E4 promoter. 48,68,124 L2 proteins may be encoded by the E1^E4E5L2L1 readthrough RNA or, for HPV6, 125 HPV16, 68 and HPV18, 48 by an L2L1 RNA initiated at a weak promoter in the E5 gene region (Figure 2). Which of these mRNAs encode L1 or L2 proteins is unknown. It is possible that all can be translated to yield these proteins through leaky scanning, 126 but that translation efficiency may be low for some (Figure 3).8][129][130] Since SRSF3 is required for HPV16 late gene expression, this means that E2 indirectly controls the expression of the capsid proteins. 129 The E2 binding partner BRD4 can control alternative splicing through direct interaction with the spliceosome during RNA Pol II transcription. 131 Therefore, BRD4, or its short form BRD4S, could control late RNA splicing via E2. 61 E2 itself can bind splicing factors; therefore, E2 could regulate cellular or viral constitutive and alternative splicing. 132,133 Finally, E2 regulates transcription of a wide range of cellular genes. 134 If the protein products of such genes are involved in transcription or posttranscriptional events, or signaling that impacts these processes, E2 could be a master regulator of late gene expression.
HPV late mRNAs include unusually long terminal exons (L1 = 1.5 kb; L2 = 2.9 kb) whose splicing would be inefficient. 135 The same LRE that can inhibit late mRNA polyadenylation in undifferentiated epithelial cells may activate terminal exon splicing in differentiated epithelial cells by allowing formation of a splicing complex mimic at the 3′ end of the L1 ORF acting to define the terminal exon and link it to upstream splicing events. 77,136 The cellular DNA damage response (DDR) is key to HPV genome amplification in differentiating keratinocytes, 34 but the DDR can also control splicing. 137 DNA damage induced by the drug melphalan induced association of phosphorylated BRCA1 and BARD1 with HPV16 DNA. The data suggest that DDR inhibited the pAE while the increased association of key splicing factors U2AF65 and SF3b with the HPV genome via phosphorylated BRCA1 and/or BCLAF1 or TRAP150 could activate late pre-mRNA splicing. 71 As well as acting as a transcriptional regulator, CTCF can control alternative splicing of cellular genes [138][139][140][141] and has been shown to regulate splice site choice for HPV18 transcripts. 47 CTCF may activate late gene expression by enhancing spliceosome recognition of the alternative splice sites required for late mRNA production, perhaps through chromatin changes due to CTCF-mediated chromatin looping from the CTCF binding site in the E2 ORF to the URR and/or by slowing progression of RNA Pol II across the HPV genome. 48

Table 1. List of RNA binding proteins involved in HPV late gene expression, their known functions and the experimental systems in which they were analyzed.

| Stability and translation
The LREs located in the late 3′UTRs of the HPV1 and HPV16 RNAs bind cellular factors hnRNP C, HuR, and Poly(A) Binding Protein C (PABPC) to control mRNA stability and translation. 92,94,97,142 While HuR overexpression in undifferentiated HPV16-infected keratinocytes resulted in unscheduled capsid protein expression, depletion of HuR in differentiated keratinocytes reduced L1 protein expression. 94 HuR may positively regulate capsid protein expression in differentiated cells by allowing nuclear export, stabilizing the capsid mRNAs and enhancing translation. The splicing regulator SRSF1 has been shown to regulate mRNA stability and translation. 143 SRSF1 levels rise in differentiated HPV16 and HPV31-infected epithelia 128,129 in concert with increased levels of cytoplasmic mRNAs encoding capsid proteins. 68 In cervical keratinocytes, SRSF1 relocated to the cytoplasm due to HPV infection 130 and upon depletion of SRSF1, fewer HPV16 capsid mRNAs were located on the polysomes compared to control siRNA-treated cells (Graham S. V. and Caceres J. F., unpublished data), suggesting that SRSF1 may be required to support capsid protein stability or translation.

| Codon bias in translation of late mRNAs
Kozak rules of translation suggest that in mRNAs containing more than one translation start codon, usually only the first is chosen by the ribosomes to initiate protein synthesis. 122 The major late mRNA E1^E4^L1 contains a strong AUG at its 5′ end, 144 suggesting that translation of the L1 ORF, the second ORF in the mRNA, would be inefficient. The E1^L1 mRNA could be translated to yield L1, albeit with five amino acids from E1 at its N-terminus, while the E6^L1 mRNA could be efficiently translated since the E6 start codon is of weak consensus, leading to translation initiation at the downstream L1 ORF.
All known L2-encoding transcripts contain L2 as at least a third ORF suggesting that L2 is inefficiently translated.However, activity of the putative E5 promoter, although limited, 48 could yield sufficient L2 protein production to provide the low ratio of L2 to L1 protein subunits found in the virus capsid.
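The positional argument above (first ORF favored, downstream ORFs translated inefficiently) can be illustrated with a minimal sketch that lists ORFs in the 5′ to 3′ order of their start codons. The function name `find_orfs`, the `min_codons` parameter, and the toy sequence in the test are hypothetical illustrations, not HPV sequences, and the sketch deliberately ignores Kozak context strength, leaky scanning, and reinitiation.

```python
def find_orfs(mrna: str, min_codons: int = 2):
    """List ORFs as (start, end) index pairs, ordered by position of their AUG.

    Simplified scanning model (an assumption for illustration): every AUG is
    treated as a potential start, and the ORF runs in frame to the first stop
    codon. 'end' is the index just past the stop codon.
    """
    stops = {"UAA", "UAG", "UGA"}
    rna = mrna.upper().replace("T", "U")  # accept DNA or RNA input
    orfs = []
    for i in range(len(rna) - 2):
        if rna[i:i + 3] != "AUG":
            continue
        # Walk codon by codon, in frame, until the first stop codon.
        for j in range(i + 3, len(rna) - 2, 3):
            if rna[j:j + 3] in stops:
                if (j - i) // 3 >= min_codons:  # codons before the stop
                    orfs.append((i, j + 3))
                break
    return orfs
```

In such a listing, the position of an ORF in the returned order corresponds to the "first ORF", "second ORF", or "third ORF" status discussed in the text; a real analysis would also need to score the sequence context of each AUG.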
All viruses rely on host cell translation to complete their life cycles and many viruses manipulate translation to facilitate production of virus proteins. Infection of different tissues results in codon optimization to maximize efficiency of viral protein production. 145 HPV L1 and L2 mRNAs show a strong codon bias towards use of rare codons with a T-nucleotide at the third position 146,147 and this codon bias occurs in a keratinocyte differentiation stage-specific manner.
9][150][151] Recently, it has been shown that, in general, viral late protein translation is reduced compared to early protein translation because early proteins seem to be better adapted to the tRNA pools of their target tissues. Interestingly, at least for the few early and late proteins analyzed, HPV proteins did not follow this rule: L1 and L2 proteins were translated as well as early proteins. 145 Codon bias can influence mRNA stability as well as translation efficiency. 89 Late RNAs are very much less abundant than early mRNAs in HPV-infected cells in vitro and in vivo. 48,152 Optimized capsid protein translation from stable late mRNAs would be essential to yield sufficient pools of capsid proteins for virion production.
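As a rough illustration of how a third-position codon bias of the kind described above could be quantified, the following sketch computes the fraction of codons in a coding sequence that end in each nucleotide. The function name `third_position_fractions` and the short test sequence are hypothetical; no real L1 or L2 sequence data are used, and this simple count does not by itself say whether a codon is rare in the host.

```python
from collections import Counter


def third_position_fractions(cds: str) -> dict:
    """Fraction of codons ending in A, C, G, or T for an in-frame coding sequence.

    Assumes the sequence starts in frame; trailing bases that do not
    complete a codon are ignored. RNA input (U) is converted to DNA (T).
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop an incomplete final codon
    thirds = [seq[i + 2] for i in range(0, usable, 3)]
    total = len(thirds)
    if total == 0:
        return {}
    counts = Counter(thirds)
    return {nt: counts.get(nt, 0) / total for nt in "ACGT"}
```

Comparing the T fraction of a late ORF with that of an early ORF, or with host keratinocyte genes, would be one simple way to visualize the bias; a fuller analysis would use codon usage tables and tRNA abundance data.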

| Capsid formation and egress
There is still a lack of clarity concerning HPV capsid formation and virion egress in keratinocytes in vivo. Some HPVs such as HPV16 express L2 protein before L1, 153 but for HPV1, L1 expression precedes L2. 154 Following translation, L1 monomers assemble into pentamers in the cytoplasm and are imported into the nucleus. L2 is imported as a monomer with involvement of karyopherins and Hsc70. 155,156 Karyopherins also prevent spontaneous capsid assembly. 157 Capsids are formed of 72 L1 pentameric capsomeres and it is likely that up to the same number of L2 monomers are incorporated internally at the capsomere fivefold axes of symmetry.
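The capsid composition described above implies a simple stoichiometry, sketched here as arithmetic. The function name `capsid_stoichiometry` is hypothetical, and the L2 count is treated as an upper bound, as in the text.

```python
def capsid_stoichiometry(capsomeres: int = 72, l1_per_capsomere: int = 5,
                         max_l2_per_capsomere: int = 1) -> dict:
    """Protein subunit counts for one capsid, from the figures given in the text.

    72 pentameric capsomeres give 72 * 5 L1 monomers; at most one L2 monomer
    per capsomere gives an upper bound of 72 L2 monomers.
    """
    l1 = capsomeres * l1_per_capsomere
    l2_max = capsomeres * max_l2_per_capsomere
    return {"L1": l1, "L2_max": l2_max, "min_L1_to_L2_ratio": l1 / l2_max}
```

With the default values this gives 360 L1 monomers and at most 72 L2 monomers per capsid, that is, an L1 to L2 ratio of at least 5:1, consistent with the low ratio of L2 to L1 subunits noted in the discussion of translation.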
A link between viral DNA replication and capsid formation is essential to ensure the correct order of virion formation.[160] However, E2 also interacts with L1 at viral replication foci and this enhances transcription and replication of viral genomes. 161 Viral genome encapsidation may begin by recruitment of E2 to viral replication foci through its interaction with L2 158 because L2 null mutant HPV31 genomes displayed a 10-fold reduction in packaging viral genomes compared to wild type HPV31. 162 L1 capsomeres themselves and/or together with cellular nucleophosmin may act as histone chaperones to aid formation of viral minichromosomes.
[165] The HPV capsid may not be selective in incorporating DNA since no packaging signals have been identified and HPV pseudovirions can incorporate heterologous episomes, such as plasmids, 166 while capsids containing cellular DNA have been detected in productive infection. 167 There may be a restriction on size such that only DNA fragments ≤8 kb can be packaged, leading to the hypothesis that capsids that incorporate larger fragments of DNA are unable to form stable virions. 167 Epithelial terminal differentiation has been proposed to provide a suitable environment for capsid assembly. 168,169 Virion stability is achieved through capsomeres binding to each other via disulfide bridges 169,170 in response to the redox gradient between the suprabasal and cornified layers of the epithelium. 170 Virions are transmitted in squames released from the upper surface of the infected epithelium 168 and are extremely stable in the environment.
This ensures transmission is efficient, which is important given the relatively low (compared to other viruses) number of virus particles produced during productive infection. 171,172 However, while immature capsids are unstable, they may be just as infectious as mature capsids and it has been proposed that immature virions could be released from deeper layers of the epithelium to play a role in natural infection. 169 Spatial control of late events and links to epithelial differentiation suggest that HPV infection can alter the differentiated keratinocyte to facilitate virion formation and egress. Transglutaminase, a key protein of terminally differentiated keratinocytes, can crosslink E4 to the cornified envelope, resulting in decreased structural stability of squames. 168,173 E4 can multimerise via its C-terminus and has been shown to form amyloid-like fibers in HPV16-infected differentiated keratinocytes. E4 also interacts with intermediate filament keratins to collapse the cytokeratin network, leading to reduced thickness and increased fragility of infected squames (Figure 3). 174,175 E5 may promote vacuole formation in, and disintegration of, keratinized squames to aid the release of progeny virus particles. 176,177 More recently, transcriptomic studies have revealed that virus infection results in disruption to adherens, tight and gap cell junctions and desmosomes in differentiated keratinocytes. 178,179 Small proline-rich proteins, which act as crosslinking proteins in the cornified envelope, and changes in mucins were also significantly downregulated. 178 These changes are probably due to E6/E7-mediated decreased keratinocyte differentiation capacity. All these changes would be predicted to reduce cell-to-cell adhesion and disrupt the physical epithelial barrier 19 to allow easier egress of newly formed virus particles.

| Future directions
Several of the studies reported above were carried out in monolayer culture. Differentiation of keratinocytes by culturing in high calcium concentration or by growth in methylcellulose may not allow full viral genome amplification or capsid protein expression. Organotypic raft culture of HPV-infected or HPV genome-transfected keratinocytes is far superior for studying late events because this system allows virion formation and virus release, and it should be the method of choice for future studies of late events. 180,181 That said, these are in vitro approaches, which may not recapitulate the in vivo environment. A number of animal papillomaviruses have been used to mimic HPV infection, but there are clear differences between the different animal viral life cycles and HPV life cycles. 182,183 [...] 184-188 However, murine epithelia generally display fewer cell layers than human epithelia 188 and MmuPV1 does not express an E5 protein, which is important for late gene expression, 44,45 meaning that late events in the MmuPV1 184 life cycle may differ significantly from those of HPVs. Importantly, the tractability of a murine model offers an approach to lineage tracing of MmuPV1-infected cells, including observation of late events and spatial analysis of host-pathogen interactions, which could transform our understanding of HPV late gene expression. 189

Therapeutic strategies against HPV infection would involve inhibiting the viral life cycle and, for high-risk types, targeting persistent infection by interfering with the increased viral oncoprotein expression responsible for cancer progression. 190

A strategy to reveal HPV infection in the lower epithelial layers would involve induction of capsid protein synthesis, as this would stimulate an immune response against infection. The experimental evidence using overexpression of HuR suggests that the highly immunogenic capsid proteins can be made to be expressed in basal HPV-infected keratinocytes. 94 CUGBP1 is another potential target; therapeutic siRNAs against CUGBP1 could potentially unmask capsid protein expression in basal epithelial cells but, as CUGBP1 is part of an inhibitory protein complex, 80 siRNAs targeting multiple proteins may be necessary. There is clear evidence for changes to the viral and cellular epigenome during infection. 114 As these changes are essential for viral replication and gene expression, epigenetic therapies in development against cancers could be deployed to modulate such changes as a means to disrupt the viral life cycle.
Topical rather than systemic anti-HPV therapies are key since the target tissue is superficial. Small molecules that can travel through epithelial layers to target the production of late mRNAs could prevent virion formation and spread in the environment. This is relevant to genital warts, where multiple lesions can occur locally and spread to other individuals. Small molecule inhibitors of HuR have been identified, 191,192 which may restrict HPV16 capsid protein expression. 94 SRPIN340 and related, next-generation drugs 193 inhibit the kinase SRPK1, 194 which phosphorylates SR proteins 143 that are required for the HPV life cycle. 129,130 SRPIN340 can inhibit the expression of HPV16 late proteins E4 and L1 and reverse the effects of the infectious process on differentiation and the epithelial barrier (Faizo and Graham, in preparation). Finally, since DNA damage results in activation of HPV16 late gene expression, 71 DNA damage inhibitors, being developed as anticancer drugs, 195 could be used to inhibit HPV late mRNA splicing. Cancer progression is driven by persistent expression of viral oncoproteins in the basal layer of the infected epithelium. 190 Thus, it would be essential that intervention strategies targeting events in the upper epithelial layers do not impinge on basal layer cells.

| CONCLUSIONS
Repression of late gene expression in undifferentiated epithelial cells is a multilayered and tightly controlled process and, although less explored experimentally, this is likely also true of activation in differentiated keratinocytes. A full understanding of differentiation-specific late events in the viral life cycle could lead to the development of novel therapies against HPV infection.
Repression of late gene expression also occurs through differential use of the viral early and late polyadenylation sites (pAs). 57 In undifferentiated epithelial cells, early gene expression terminates at the early polyadenylation site, preventing read-through to the late region. 69 HPVs possess weak consensus early pAs (pAE), and at least for HPV31, there is some heterogeneity in polyadenylation site selection. The HPV16 pAE possesses an upstream regulatory element which binds polyadenylation-enhancing factors such as human Fip1, cleavage stimulation factor 64 kDa subunit (CstF-64), heterogeneous nuclear ribonucleoprotein (hnRNP) C1/C2 and polypyrimidine tract binding protein (PTB) 70,71 (Figure 2). Moreover, sequences in the HPV16 and HPV31 L2 open reading frames (ORFs) bind splicing regulatory factor hnRNP H and the 64 kDa subunit of polyadenylation factor CstF to enhance recognition of the pAE by the cleavage and polyadenylation machinery 72-75 (Figure 2). Presumably, high-level expression of polyadenylation-enhancing factors in cells synthesizing early gene transcripts will ensure spatially appropriate expression, but weak recognition of the pAE by polyadenylation factors may allow RNA polymerase read-through to the late region, explaining the observations of late RNAs in undifferentiated epithelial cells.

Faizo 81 for a full description of splicing). The binding of U1 and U2 snRNPs to splice sites is regulated positively by serine-arginine-rich (SR) proteins and negatively by hnRNP proteins. Splicing regulation contributes to the repression of late gene expression in undifferentiated keratinocytes. hnRNP L and hnRNP C1/C2 can bind upstream and downstream of the early polyadenylation site. 82 hnRNP L may antagonize hnRNP C1 activation of the 5′ splice site at the end of the E4 ORF, resulting in mRNAs which do not splice out the E5 gene region and the early region 3′UTR. 83 This would enhance the

FIGURE 2 RNA binding proteins which interact with HPV16 late RNAs. (A) Diagram of the HPV16 genome. Colored cylinders, open reading frames. Blue horizontal arrows, viral promoters. Polyadenylation sites are shown with downward black arrows. Gray lozenge, HPV16 3′UTR LRE. Proteins that bind HPV late RNAs are shown above and below the genome map. Those in red type are activators; those in blue type are inhibitors. Those in black type have not yet been sufficiently investigated. HPV, human papillomavirus; LRE, late-regulatory element; 3′UTR, 3′ untranslated region.
HPV18 late promoter mapping identified an element that binds cellular factors ARE/poly (U)-binding/degradation factor 1 (AUF1) and hnRNP A1/B2 in a differentiation-dependent manner to repress the late promoter, 102 while the LAP and LIP forms of transcription factor C/EBP-β have been shown to activate and repress gene expression from the HPV31 late promoter, respectively. 103 However, the exact role of positive and negative transcription factors in controlling the late promoter is unknown due to the location of the late promoter within the E7 ORF and cross-talk with the enhancer in the URR and the viral origin of replication. 102 Late events in the HPV life cycle, including the DDR, viral genome amplification and late gene expression, are linked. p63 transcriptionally regulates expression of cell cycle proteins such as cyclins and CDKs, and of DDR proteins such as RAD51, to activate viral genome amplification and late gene expression. Other transcription factors controlling late gene expression via viral genome amplification include Kruppel-like factors (KLF) 4 and 13. KLF4 is an essential factor for normal keratinocyte differentiation but in infected cells, the levels of KLF4 regulator miR145 are reduced. Together with changes in posttranslational modifications, Kim et al. demonstrated that the HPV16 genome was hypomethylated (open chromatin conformation). 113 Histone H3 and H4 in the HPV1 and HPV18 minichromosomes have posttranslational modifications indicative of active chromatin, 114 which E7 can regulate during keratinocyte differentiation. 47,48 Open chromatin conformation of HPV genomes in differentiated cervical keratinocytes should support vegetative viral genome amplification and late gene expression. The transcription factor CCCTC-binding factor (CTCF) binds a sequence in the HPV E2 ORF and in undifferentiated keratinocytes creates a repressive chromatin loop between the E2 gene and the URR. Upon differentiation and related changes in expression of certain transcription factors,

FIGURE 3 Major events of terminal epithelium differentiation and how HPV infection disrupts this. As epithelial cells terminally differentiate, enucleation (light blue discontinuous oval in the center of the lower cell) takes place coupled with a loss of organelles. Dark red clusters of fibers represent keratohyalin granules that mark granular layer cells. As cells flatten, an intracellular keratin network (lilac-colored fibers) is formed. E4 (green circles) can be cleaved at its N-terminus by calpain, resulting in multimerization of E4 C-termini to form amyloid-like fibers (green circle chains). E4 can also be cross-linked to the cornified envelope (light brown line around the cells) by transglutaminase (purple-colored protein). HPV (blue circles) infection has also been demonstrated to disrupt the function of gap junctions, adherens junctions, desmosomes and tight junctions (see figure caption). Created with BioRender.com. HPV, human papillomavirus.