Birth of a poly(A) tail: mechanisms and control of mRNA polyadenylation

During their synthesis in the cell nucleus, most eukaryotic mRNAs undergo a two‐step 3′‐end processing reaction in which the pre‐mRNA is cleaved and released from the transcribing RNA polymerase II and a polyadenosine (poly(A)) tail is added to the newly formed 3′‐end. These biochemical reactions might appear simple at first sight (endonucleolytic RNA cleavage and synthesis of a homopolymeric tail), but their catalysis requires a multi‐faceted enzymatic machinery, the cleavage and polyadenylation complex (CPAC), which is composed of more than 20 individual protein subunits. The activity of CPAC is further orchestrated by Poly(A) Binding Proteins (PABPs), which decorate the poly(A) tail during its synthesis and guide the mRNA through subsequent gene expression steps. Here, we review the structure, molecular mechanism, and regulation of eukaryotic mRNA 3′‐end processing machineries with a focus on the polyadenylation step. We concentrate on the CPAC and PABPs from mammals and the budding yeast, Saccharomyces cerevisiae, because these systems are the best‐characterized at present. Comparison of their functions provides valuable insights into the principles of mRNA 3′‐end processing.

3′-End processing of eukaryotic mRNAs involves recognition of a polyA signal sequence (PAS) in the nascent pre-mRNA, followed by endonucleolytic cleavage and polyadenylation of the 5′-cleavage product (Fig. 1A). These reactions are carried out by a large (~1 MDa) and highly conserved multisubunit complex, which we will refer to holistically as the cleavage and polyadenylation complex (CPAC). This assembly consists of cleavage factors (CF) together with the cleavage and polyadenylation factor (CPF) in yeast or cleavage and polyadenylation specificity factor (CPSF) in mammals. The CPAC appears to be structurally and enzymatically organized into distinct and highly interconnected modules and subcomplexes (see Box 1) that work together to coordinate 3′-end processing of mRNAs. Furthermore, the polyadenylation activity of CPAC is intimately controlled by poly(A) binding proteins (PABPs), which decorate the nascent poly(A) tail, instruct the termination of poly(A) tail elongation, and mediate post-transcriptional gene regulation.
Here we walk through the nuclear birth of the mRNA poly(A) tail, which emerges as a complex and highly regulated process. We highlight the recent progress on the structural and functional characterization of the CPAC of yeast and mammals, and trace how variations in the conserved polyadenylation machineries and their interplay with PABPs result in major differences in polyadenylation in these organisms.
Abbreviations: 3′-UTR, 3′-untranslated region; AMP, adenosine monophosphate; ATP, adenosine triphosphate; CF IIm, mammalian cleavage factor II; CF Im, mammalian cleavage factor I; CF, cleavage factor; CPAC, cleavage and polyadenylation complex; CPF, cleavage and polyadenylation factor; CPSF, cleavage and polyadenylation specificity factor; CstF, cleavage stimulation factor; mCF, mammalian cleavage factor; mPSF, mammalian polymerase specificity factor; mRNA, messenger RNA; mRNP, mRNA ribonucleoprotein particle; PABP, poly(A) binding protein; PAS, polyA signal sequence; poly(A), polyadenosine; pre-mRNA, precursor mRNA; snRNP, small nuclear ribonucleoprotein.

Recognition of the polyadenylation signal triggers RNA cleavage
The binding of CPAC to the PAS (A₁A₂U₃A₄A₅A₆) on the nascent pre-mRNA triggers the enzymatic activities of CPAC and defines the 3′-end of the mRNA. The recognition of PAS is carried out by the five-zinc finger subunit of the CPAC, Yth1/CPSF30, and the WD40 protein Pfs2/WDR33, in complex with the scaffolding three-beta propeller protein, Cft1/CPSF160. Together, CPSF160-WDR33-CPSF30 along with hFip1 constitute the mammalian polymerase specificity factor (mPSF) (Box 1 and Fig. 1B) [1,2]. In yeast, the equivalent Cft1-Pfs2-Yth1-Fip1 sub-complex additionally contains the poly(A) polymerase Pap1 as a constitutive subunit (i.e. Pap1 is always stably bound to the complex) to form the polymerase module [3]. mPSF and the polymerase module function as a rigid scaffold to coordinate 3′-end processing of mRNAs.
In yeast and humans, Yth1/CPSF30 binds the first two nucleotides (A₁A₂) of the PAS. Here, zinc finger 2 makes highly conserved base-specific contacts. In humans, CPSF30 makes additional base-specific contacts with A₄A₅ of the PAS via its zinc finger 3, and a non-conserved N-terminal region in WDR33 specifically stabilizes a Hoogsteen base pair between U₃ and A₆ [2,4-6]. This WDR33-RNA interaction may explain the higher affinity of the human CPAC for the PAS sequence compared to the yeast complex.
Pfs2/WDR33 may also contribute to RNA binding via additional charged surfaces on its flat solvent-exposed side, but whether this contributes in a sequence-specific manner is unclear [3].
Fig. 1. The eukaryotic CPAC. (A) Prior to RNA binding, CPF/CPSF is in a flexible, elongated, and inactive state. Recognition of the polyA signal sequence (A₁A₂U₃A₄A₅A₆) on the RNA by the polymerase module/mPSF (green) and binding of the cleavage factors (gray) to the upstream and downstream sequence elements in the RNA leads to a structural rearrangement that primes the complex for activation [5,7,12]. Once in the activation-competent state, the rearranged CPAC is thought to bring the nuclease module (purple) in close proximity to the cleavage site, allowing cleavage and subsequent polyadenylation. The phosphatase module (yellow) is not strictly required for 3′-end processing but may coordinate the abovementioned transitions. (B) Surface representation of a composite model of the polymerase (shades of green) and nuclease modules (shades of purple) of the yeast CPAC. The corresponding homologous subunits in mammals are also labeled. The HAT domains of the Rna14/Cstf77 dimer are also shown in gray. The polyadenylated RNA substrate bound to the Yth1/CPSF30 and Pfs2/WDR33 subunits is depicted in blue. The yeast polymerase module (PDB: 7ZGR) was aligned to the human mPSF-CstF77 complex (PDB: 6URO). Pap1 is shown as being flexibly tethered by one of the two copies of Fip1 (PDB: 3C66), whereas the second copy is contacting the HAT domain of Rna14/Cstf77. The Ysh1/CPSF73 (PDB: 6I1D) and Cft2/CPSF100 (PDB: 2I7X) subunits of the nuclease module/mCF are flexibly tethered by Cft2/CPSF100 and Mpe1/RBBP6. (C) Cartoon representation of the yeast Pap1-Fip1 interface. (D) AlphaFold prediction of the human PAP showing the corresponding region shown in (C) to depict the potential binding of the C-tip regulatory region.
PolyA signal sequence recognition then triggers activation of the CPAC endonuclease, Ysh1/CPSF73, which cleaves the pre-mRNA ~20 nucleotides downstream of the PAS. The endonuclease is part of a multisubunit subcomplex known as mammalian Cleavage Factor (mCF) and yeast nuclease module [3,7]. The composition of these sub-complexes and their requirement in CPAC cleavage activity are summarized in Box 1. It should be emphasized that the activation of the endonuclease additionally requires cleavage factors, as described later. Ysh1/CPSF73 interacts with a related pseudonuclease, Cft2/CPSF100, and the scaffolding protein Pta1/Symplekin [8,9]. Recent studies show that Mpe1/RBBP6 is universally required for correct activation of Ysh1/CPSF73, likely through a conserved mechanism of action [6,7,10,11]. Importantly, although Mpe1 shows a strong interaction with Ysh1 [12] and is a constitutive subunit of the yeast CPAC, RBBP6 has weaker affinity for CPSF73 and is therefore not a constitutive subunit of the human CPAC [7,10,13]. Recruitment of RBBP6 to the mammalian CPAC may require recognition of the PAS sequence, stable RNA binding, or conformational changes, which could signal a quality control checkpoint before committing the complex to pre-mRNA cleavage. Pta1/Symplekin is a scaffolding protein which bridges the three enzymatic activities of the CPAC (polymerase, endonuclease, and phosphatases, discussed below) and may be an important node to integrate and relay regulatory signals between the enzymatic components.
Pta1 directly interacts with the N-terminal domain of poly(A) polymerase Pap1 [14], and Pta1/Symplekin form conserved interactions with Cft2/CPSF100 and the endonuclease Ysh1/CPSF73 [8,15]. Curiously, Pta1 and Symplekin are part of distinct subcomplexes in yeast and mammals (Box 1). In mammals, Symplekin belongs to the mCF and is required for cleavage activity. In yeast, Pta1 is part of the phosphatase module, which couples pre-mRNA 3′-end processing to transcription termination by dephosphorylating the C-terminal domain of RNA polymerase II [16]. However, the phosphatase module is dispensable for cleavage and polyadenylation activities of the yeast CPAC. To date, an equivalent phosphatase module has not been identified in mammals, but the conserved human Ssu72 phosphatase does interact with Symplekin [17]. Additional subunits of the yeast phosphatase module have identifiable human homologs: Swd2/WDR82, Glc7/PP2A, and potentially Ref2/PNUTS. Recent studies have begun to elucidate a role for these proteins in 3′-end processing [13,18,19].
Box 1. Functional composition of the CPAC. CPF/CPSF and cleavage factors are highlighted by dashed lines. Sub-complexes (or modules) are indicated by darker shading. Subunits homologous in yeast and mammals are shown side by side. Subunits with enzymatic activity are marked by an asterisk. The requirement of individual yeast (y) and mammalian (m) subunits for the pre-mRNA cleavage and polyadenylation phases is shown on the right (see the color key).
Cleavage factors play essential roles in activating the endonuclease (Fig. 1A). In yeast, the multi-subunit complex CF IA (composed of Rna14, Rna15, Clp1, and Pcf11) and the single subunit CF IB/Hrp1 are required for correct activation of the CPF endonuclease, Ysh1 [12,20-22]. These factors bind sequence elements upstream and downstream of the PAS that are important for their role in regulating 3′-end processing [23-26]. CF IA and CF IB also stimulate and control polyadenylation activity of the CPAC, as discussed later. In mammals, the functionally equivalent CF IA homologs are split between two stable complexes: cleavage stimulation factor (CstF, consisting of Cstf77, Cstf64, and Cstf50 subunits) [27-29] and the mammalian cleavage factor II (CF IIm, consisting of Pcf11 and Clp1) [30,31] (Box 1). The mammalian cleavage factor I (CF Im) is not required for activation of cleavage [7,10], but does bind sequence motifs upstream of the PAS to stabilize assembly of the CPAC on its substrate and regulate cleavage site selection [32].

From cleavage to polyadenylation
Cleavage of the pre-mRNA generates a 5′-product and a 3′-product. The 5′-product of the cleaved pre-mRNA is polyadenylated by the poly(A) polymerase (Pap1 in yeast, PAP in vertebrates). The 3′-product, which is still bound to transcribing RNA polymerase II, is degraded by the torpedo exonuclease, Rat1/XRN2. The exonuclease activity of Rat1/XRN2 on the 3′-cleavage product ultimately leads to transcription termination by dislodging the RNA polymerase II from the DNA [33,34]. The 3′-cleavage product, which is downstream of the cleavage site, contains sequence elements bound by CF IA or CstF which were necessary for activating CPAC cleavage activities [12]. Thus, it is reasonable to speculate that digestion of the 3′-product by Rat1/XRN2 effectively enforces the directionality of 3′-end processing by limiting re-cleavage of the RNA substrate, thereby promoting productive polyadenylation.
Pap1/PAP is flexibly tethered to the complex through its interaction with Fip1/hFip1, which should facilitate easy access to the 5′-product following cleavage [35,36] (Fig. 1B). Whereas human PAP is weakly bound to hFip1 and is not a constitutive component of the mammalian CPAC [1,13], yeast Pap1 is a constitutive subunit of the CPAC. Indeed, a non-conserved region of yeast Fip1 appears to strongly bind a conserved cleft in Pap1 [37] (Fig. 1C), but in humans multiple low-affinity regions in hFip1 are required for its interaction with PAP [38]. The differential affinities of Pap1/PAP within CPAC complexes may underlie critical species-specific regulatory mechanisms of polyadenylation, as discussed later in this review.
The stoichiometry of Fip1-Pap1/PAP within the CPAC has been the subject of recent structural studies, which agree that two copies of the Fip1 subunit are able to interact with Yth1/CPSF30 (Fig. 1B). Specifically, crystal structures and solution nuclear magnetic resonance studies reveal an extensive high-affinity interaction of one Fip1 molecule with zinc finger 4 of Yth1/CPSF30, and another Fip1 molecule interacting with zinc finger 5, albeit with lower affinity [36,38,39]. This would, in principle, allow up to two polymerase molecules to be accommodated within the CPAC, which has been observed using native mass spectrometry analysis [3]. Further studies are required to address the functional relevance and regulatory potential of differential Fip1-PAP stoichiometries on polyadenylation and gene expression.
There may be CPAC states that favor cleavage or polyadenylation depending on the conformation of the substrate RNA, the recruitment of cleavage factors and non-constitutive subunits, and the integration of regulatory signals. hFip1, for example, has been suggested to mediate some of these transitions. Specifically, the Cstf77 homodimer within CstF binds to the N-terminal region of two copies of hFip1 (Fig. 1B). This interaction is mutually exclusive with the interaction between hFip1 and PAP [38] and may explain why PAP is not part of the pre-cleavage complex [13]. Presumably, this state of the CPAC (without PAP) is competent for (and would even promote) cleavage, but recent studies show opposing views on the requirement for PAP in the cleavage activity of the CPAC [7,10]. Clarifying the role of PAP in cleavage will require further studies. It is clear, however, that Cstf77 can prevent PAP incorporation into the CPAC, thereby inhibiting polyadenylation [38]. This may offer clues as to how the CPAC transitions from cleavage to polyadenylation. Of note, the yeast orthologue of Cstf77, Rna14, has been reported to interact with Fip1 as well [40], but the functional significance of this connection is unknown.
The transition from cleavage to polyadenylation may be regulated by Pta1 and the phosphatase activity of the CPAC, although the phosphatase module is not strictly required for mRNA 3′-end processing. Phosphorylation of Pta1 inhibits polyadenylation activity, presumably by stabilizing its binding to Pap1 [14,41]. This inhibition is relieved by the Glc7 phosphatase [41], suggesting that Glc7 is an important regulator of CPAC states, aiding in its transition from cleavage to polyadenylation-competent states. Importantly, the regulation of this transition could allow the choice between abortive or productive synthesis of mature mRNA. We can speculate that unadenylated or short-tailed mRNAs observed in certain conditions [42,43] represent a stalled state of CPAC which did not fully commit to polyadenylation following RNA cleavage. An unadenylated 5′-cleavage product is prominent in some experiments assaying coupled cleavage and polyadenylation [3,10,12], whereas in other similar experiments this intermediate does not accumulate [7,43]. This indicates that the efficiency of this transition is sensitive to experimental conditions in vitro.
Elucidating the conformational changes between the functional states of the CPAC will reveal how the RNA is delivered to the polymerase following cleavage, what role cleavage factors play in coordinating these transitions, and how the complex is ultimately recycled to act on a new RNA substrate.

Poly(A) tail elongation
The CPAC polymerase uses the cleaved pre-mRNA 3′-end as a primer to start synthesizing the poly(A) tail. Within the U-shaped Pap1/PAP, a large cleft is formed between the palm and C-terminal domains which encloses three nucleotides of RNA and harbors the enzyme active site on one side within the fingers and palm domains (Fig. 1B) [44-46]. In each nucleotide addition cycle, ATP binding into the active site induces cleft closure and catalysis of AMP incorporation into the RNA 3′-end. Upon nucleotide addition, the cleft then opens to allow the release of the products: pyrophosphate and the extended poly(A) tail [44,47]. As a result of limited binding to the RNA primer, the polymerizing activity of Pap1/PAP is distributive. In other words, the enzyme adds only one or a few nucleotides before dissociating from its RNA substrate [44,48-50].
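The difference between distributive and processive synthesis can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that after each AMP addition the polymerase dissociates from the RNA with a fixed probability `p_off`; the number of nucleotides added per binding event is then geometrically distributed with mean 1/`p_off`. The function name and the probability values are hypothetical and are not measurements from the cited studies.

```python
def mean_additions_per_binding(p_off: float) -> float:
    """Mean nucleotides added per binding event in a toy geometric model.

    Assumption (not from the review): after every AMP addition the enzyme
    dissociates with a constant probability p_off, and at least one
    nucleotide is added per binding event.
    """
    if not 0.0 < p_off <= 1.0:
        raise ValueError("p_off must be in the interval (0, 1]")
    return 1.0 / p_off


# Hypothetical dissociation probabilities, chosen only to contrast regimes.
weak_binding = mean_additions_per_binding(0.5)     # distributive-like
stable_binding = mean_additions_per_binding(0.01)  # processive-like
print(weak_binding, stable_binding)
```

In this simplified picture, any factor that stabilizes the polymerase on the RNA 3′-end lowers `p_off` and thereby lengthens the stretch synthesized per binding event.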
Because the rate of polyadenylation is limited by the stability of polymerase binding to the RNA 3′-end, the specificity and processivity of polyadenylation are controlled by factors that modulate this interaction. CPAC binding to the PAS, as detailed above, links the pre-mRNA and polymerase together. This tethering of substrate and enzyme increases the rate of poly(A) tail addition but becomes progressively weaker when the poly(A) tail grows longer [51-53]. Additional contacts to the RNA sequences upstream of the PAS may be required to enhance pre-mRNA binding to the CPAC. For example, cleavage factors IA and IB in yeast may stimulate and control poly(A) tail elongation by mediating such contacts [3,20,21,43,54]. In mammals, CF Im has been proposed to play a similar role in stimulating polyadenylation [55,56], but this activity could not be recapitulated in a reconstituted system [7]. Further research is needed to elucidate how cleavage factors modulate polyadenylation and whether their differential recruitment could provide a means for regulating poly(A) tail synthesis in a gene- or transcript isoform-specific manner.
Despite the high degree of conservation between the yeast and human CPACs, poly(A) tails are elongated in a very different manner in these organisms. The difference can be attributed, at least to some extent, to the details of how Pap1/PAP is tethered to the rest of the CPAC (discussed above) and brought into the proximity of the RNA 3′-end. In yeast, the stable Pap1-Fip1 interaction provides sufficient processivity for the fully assembled CPAC to elongate poly(A) tails rapidly to 100-200 adenosines [3,6,43] (Fig. 2A). It is also possible that an RNA-binding site within the C-terminal domain of Pap1 [45,57,58] contributes to orienting and maintaining the poly(A) tail in a conformation conducive for processive polyadenylation.
Fig. 2. (A) The budding yeast CPAC polyadenylates the pre-mRNA with high basal activity. Polyadenylation is terminated primarily by Nab2 after the synthesis of ~60 adenosines. If Nab2 is not available, then uncontrolled polyadenylation is prevented through fail-safe termination pathways (dashed arrows and coordinate bar) by Pab1 or by the intrinsic mechanism of CPAC, which restrict poly(A) tail lengths to ~90 and 100-200 adenosines, respectively. Nab2 and Pab1 are hypothesized to "measure" the poly(A) tail lengths by forming oligomeric structures that prevent Pap1 accessing the RNA 3′-end. The high basal polyadenylation activity as well as the intrinsic length control require cleavage factors CF IA and CF IB within the CPAC (model adapted from [43]). (B) Mammalian CPAC has low basal polyadenylation activity due to weak association of PAP with the CPSF and poly(A) tail. After a slow addition of 10-12 adenosines, binding of the first PABPN1 to the nascent tail stabilizes PAP's association with the poly(A) tail. Now PABPN1 and CPSF cooperatively stimulate PAP's activity, allowing rapid poly(A) tail elongation. In the course of elongation, PABPN1 oligomerizes on the growing poly(A) tail and ultimately forms a size-limited spherical structure with a ~250-adenosine tail. Although PABPN1 continues to stimulate polyadenylation beyond this point, the spherical structure prevents the CPSF from stimulating PAP's activity, which terminates the processive phase of elongation (model adapted from [52]).
On the other hand, PAP is not a stable subunit of the mammalian CPAC [1,13]. Although hFip1 links PAP to the CPAC and U-rich regions of the pre-mRNA [38,59], this connection provides only partial polyadenylation activity. Mammalian CPAC reaches full activity in the presence of the nuclear PABP, PABPN1, which binds PAP and the nascent poly(A) tail, and increases the apparent affinity of PAP for the poly(A) tail [51,52,60]. Individually, either CPSF or PABPN1 can stimulate PAP's activity by 50-fold, but together they increase the polyadenylation rate by 15 000-fold [53,61]. As a result of this cooperative action, polyadenylation proceeds in a biphasic manner (Fig. 2B): the first 10-12 adenosines, which allow simultaneous binding of PAP and PABPN1, are added slowly, whereas the subsequent elongation up to 250 adenosines proceeds rapidly, supported by sequential binding of PABPN1 molecules to the nascent poly(A) tail [52]. Notably, as opposed to the human system, in the budding and fission yeasts PABPs do not appear to be required during the elongation phase [42,43,62,63]; instead, they inhibit the activity of Pap1 and act to terminate polyadenylation in a timely manner (see the next section). Nevertheless, the association of PABPs may prevent the CPAC from re-cleaving its polyadenylated pre-mRNA substrate at the same site [64], thereby providing additional directionality to the 3′-end processing steps.
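The cooperativity implied by the 50-fold individual and 15 000-fold combined stimulation of PAP can be checked with simple arithmetic. If CPSF and PABPN1 acted independently, their individual effects might be expected to multiply; the sketch below compares that expectation with the reported combined effect. The multiplicative baseline and the function name are assumptions made for illustration only.

```python
def synergy_factor(fold_a: float, fold_b: float, fold_combined: float) -> float:
    """Ratio of the observed combined stimulation to the multiplicative
    expectation for two independently acting factors (an assumed baseline)."""
    expected_if_independent = fold_a * fold_b
    return fold_combined / expected_if_independent


cpsf_fold = 50.0          # stimulation by CPSF alone (from the text)
pabpn1_fold = 50.0        # stimulation by PABPN1 alone (from the text)
combined_fold = 15_000.0  # stimulation by both together (from the text)

print(cpsf_fold * pabpn1_fold)  # multiplicative expectation
print(synergy_factor(cpsf_fold, pabpn1_fold, combined_fold))
```

Under this assumed baseline, the combined stimulation exceeds the product of the individual effects several-fold, which is consistent with cooperative rather than independent action.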

Termination of polyadenylation and the control of poly(A) tail length
Poly(A) tails on newly synthesized mRNAs are uniform in length, implying that all mRNAs are polyadenylated in an identical manner within an organism [43,63,65]. The intriguing biochemical question of such "length control" is how the untemplated polyadenylation reaction is stopped at the right moment. The polyadenylation activity of CPAC is controlled by PABPs that associate with the nascent poly(A) tails and instruct termination of poly(A) tail synthesis. Interestingly, the lengths of poly(A) tails on yeast and mammalian mRNAs are strikingly different (60 and 250 adenosines, respectively) due to distinct termination mechanisms, which reflect the different modes of poly(A) tail elongation (discussed in the previous section) and the acquisition of non-homologous PABPs for controlling this process (Fig. 2).
In yeast, the otherwise efficient poly(A) tail elongation is inhibited by the association of Nab2, a CCCH-type zinc finger protein, resulting in 60-adenosine long poly(A) tails [43,64,66-68]. If Nab2 is not available, polyadenylation can be terminated at 90 adenosines by Pab1, a PABP containing four RRM-domains [42,43,64,66]. Importantly, Pab1 localizes mostly in the cytoplasm but shuttles to the nucleus [69]. In contrast, mammalian mRNA poly(A) lengths are controlled by PABPN1, a single RRM-domain-containing protein, which provides a means for both stimulating poly(A) tail elongation and terminating it once the poly(A) tail is 250 adenosines long [51-53,60,63]. A general feature of PABP-mediated length control is the multimerization of proteins on the nascent poly(A) tail. How these proteins "measure" the exact poly(A) tail length and terminate polyadenylation is best characterized for PABPN1. As discussed before, PABPN1 stimulates polyadenylation by increasing the apparent affinity of PAP for the poly(A) [60]. Successive association of PABPN1 molecules to the growing tail forms an oligomeric filament, in which individual molecules cover ~11 nucleotides and the binding displays moderate cooperativity. The filament collapses into a spherical particle which can grow up to 20 nm wide, accommodating ~15-20 PABPN1 molecules bound to a ~250-adenosine tail. This is a size limit that prevents more PABPN1 from associating within the particle [70-72] (Fig. 2B). Further extension of the tail beyond the length contained in the particle is no longer compatible with the interaction between PAP and the rest of the CPAC. This breaks the tripartite connection, which is required for the cooperative stimulation of polyadenylation, and thereby terminates processive synthesis [52]. Nab2 and Pab1 operate on poly(A) length control with a different mechanism, as their binding to the poly(A) tail inhibits Pap1 activity [43,64,66] (Fig. 2A).
Nab2 interacts with poly(A) RNA in an unusual way: two Nab2 molecules bring three of their seven zinc-fingers [73] together to create a binding site for a stretch of 11 adenosines [74]. However, it remains unclear how Nab2 measures the length of a full-sized 60-adenosine tail and whether such dimerization mediates termination of polyadenylation [68]. On the other hand, the 30-nucleotide footprint of Pab1 directly suggests that the association of three Pab1 molecules on the poly(A) tail [75,76] is responsible for inhibiting polymerization after the synthesis of a 90-adenosine tail [43].
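The footprint arguments above reduce to simple arithmetic, sketched below. The footprint and tail-length values are taken from the text; the helper name is hypothetical, and treating the tail as completely covered by non-overlapping footprints is a simplifying assumption.

```python
def footprints_per_tail(tail_length_nt: int, footprint_nt: int) -> float:
    """Number of protein footprints that fit on a poly(A) tail, assuming
    complete, non-overlapping coverage (a simplification)."""
    if footprint_nt <= 0:
        raise ValueError("footprint_nt must be positive")
    return tail_length_nt / footprint_nt


# Pab1: 30-nucleotide footprint on a 90-adenosine tail.
pab1_copies = footprints_per_tail(90, 30)

# PABPN1: ~11-nucleotide footprint on a ~250-adenosine tail.
pabpn1_copies = footprints_per_tail(250, 11)

print(pab1_copies, round(pabpn1_copies, 1))
```

The Pab1 case reproduces the three molecules per 90-adenosine tail mentioned above. For PABPN1, full coverage gives an upper bound that lies slightly above the ~15-20 molecules reported for the spherical particle, so the idealized coverage assumption should be read with caution.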
We speculate that the evolution of PABPN1-stimulated polyadenylation in animal cells, and presumably in plants, may have enabled the robust synthesis of longer poly(A) tails compared to those in fungi [77,78], which appear to lack such a mode of regulation [62,63,79]. In this scenario, the early termination by Nab2 or Pab1-type PABPs, still utilized by the present-day fungi, became circumvented by the continuous loading of PABPN1 molecules during elongation. That being said, animal orthologues of Nab2 (ZC3H14) and Pab1 (PABPC1-4) have been suggested to regulate nuclear polyadenylation [73,80,81]. How these, and other proteins such as nucleophosmin [82], function in relation to the PABPN1 pathway awaits further studies.
Recently, a functional characterization of the yeast CPAC reconstituted from recombinant proteins uncovered a surprising new mechanism for controlling poly(A) length which is independent of PABPs and inherent to the CPAC machinery. This "intrinsic length control" requires the cleavage factors CF IA and CF IB, and terminates polyadenylation after the addition of 100-200 adenosines [43]. The mechanism for intrinsic length control remains unclear, but could involve steric occlusion or sequestration of the poly(A) tail by the RNA-binding surfaces of CPAC, or product inhibition of Pap1 by the long poly(A) tail. Nevertheless, termination of polyadenylation by the intrinsic activity of CPAC, as well as by Pab1, provides fail-safe systems in conditions when the primary Nab2 pathway is unable to control all polyadenylation events (Fig. 2A). This may prevent excessive polyadenylation and enable recycling of CPAC, thereby increasing the robustness of gene expression.
Finally, because PABPs stay bound to the poly(A) tail after termination, length control is an integral part of the assembly of the mRNA ribonucleoprotein particle (mRNP) [83]. The nuclear availability of PABPs, which changes in different conditions [80,84-89], will impact the pathway utilized for poly(A) length control and thus the composition and stoichiometry of mRNP-bound PABPs. Therefore, termination of polyadenylation links mRNA 3′-end processing to the subsequent steps in gene expression: mRNP release from chromatin, nuclear export, remodeling after passage through the nuclear pore complex, translation [65,90-94], and mRNA degradation, as detailed later in the review.
Much of the PAP regulation in mammals is wired through its C-terminal 240 residues, which appear to be mostly unstructured. For example, U2AF65, as well as the U1 snRNP subunits U1 70K and U1A, all employ similar motifs to inhibit the activity of PAP by binding to a region that spans the last 20 C-terminal amino acids of PAP (referred to here as the C-tip) [109-111]. The C-tip seems to be important for PAP's processivity but is not present in the current PAP structures. High conservation of the C-tip sequence between human, bovine, and Xenopus PAPs [109] further suggests that the evolution of this part of the protein is limited by functional constraints.
In an attempt to understand mechanistically how such regulation might work, we examined the AlphaFold predicted structures of several vertebrate PAPs [119,120]. Curiously, in these models the C-tip folds back to interact with the C-terminal domain of PAP, and this putative site of interaction overlaps with a corresponding region in Pap1 that binds yFip1 (Fig. 1C,D). An important question is whether the binding of regulatory factors to the C-tip could tune the strength of Fip1-PAP interaction and affect the recruitment of PAP to the CPAC. In one scenario, regulator binding could displace C-tip from the surface of PAP, thereby permitting hFip1 binding to PAP in a manner observed in yeast. Here regulator binding would promote the recruitment of PAP to the CPAC, which appears at odds with the observed inhibition. In an alternative scenario, a regulator could interact with the C-tip bound to the surface of PAP, as in Fig. 1D, and as a result sterically occlude hFip1 binding. This would inhibit the recruitment of PAP to the CPAC. Testing mechanistic questions and predictions like these is now possible with the recently reconstituted recombinant polyadenylation systems [2,10,38].

Birth and demise: a brief view on poly(A) tail dynamics during mRNA lifecycle
Polyadenylation is a constitutive and essential step in gene expression because unadenylated pre-mRNAs are exported slowly and degraded by nuclear RNA surveillance factors [121,122]. The only mRNAs lacking poly(A) tails are the metazoan replication-dependent histone mRNAs, which protect their 3′-ends with a stem-loop structure and which are processed by the U7 snRNP complex [123]. In all other protein-coding transcripts, poly(A) tails protect them from nuclear degradation, stimulate translation, and play a central role in their controlled degradation in the cytoplasm [124,125].
After their initial synthesis in the nucleus, poly(A) tails undergo gradual shortening (deadenylation) catalyzed by RNA deadenylase enzymes. Complete removal of the tail releases poly(A)-bound PABPs, triggering decapping and degradation of the mRNA from both 5′- and 3′-ends [126-128]. Indeed, the varying poly(A) tail lengths observed in total mRNA ensembles reflect a mixture of different aged molecules, each deadenylated with an mRNA-, cell type- and condition-specific rate [77,129-131]. The variations in deadenylation rates are governed by, for example, the recruitment of deadenylases and their co-factors, translation efficiency, and RNA localization [125]. A textbook view is that the removal of the poly(A) tail is the rate-limiting step in mRNA degradation [126], which is supported by recent studies in mouse fibroblasts and yeast [127,130]. Assuming this model, the defined initial tail length, ensured by the nuclear length control systems, contributes to establishing precise mRNA half-lives by fixing the number of adenosines that need to be removed before degradation occurs [61]. However, this direct relationship between poly(A) tail lengths and RNA half-lives is confounded by observations that deadenylation rates vary across the length of the poly(A) tail, and short-tailed mRNAs display large variation in their degradation rates [125,130-132]. Furthermore, these rates can exist within a regime in which RNA degradation is temporally uncoupled from deadenylation [133]. Nevertheless, the corollary of the observed global poly(A) tail length control is that all mRNAs enter the deadenylation process from the same starting point.
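To illustrate why a defined starting length matters under the rate-limiting deadenylation model, the sketch below computes the time needed to shorten a tail to a threshold length at a constant rate. The constant rate, the threshold of 10 adenosines, and the rate value are hypothetical placeholders rather than measured parameters; as noted above, real deadenylation rates vary along the tail and between transcripts.

```python
def time_to_threshold(initial_len_nt: int,
                      rate_nt_per_min: float,
                      threshold_len_nt: int = 10) -> float:
    """Minutes needed to shorten a poly(A) tail to a threshold length.

    Toy model: deadenylation proceeds at a constant rate (an assumption);
    the default threshold is a placeholder, not a measured value.
    """
    if rate_nt_per_min <= 0:
        raise ValueError("rate_nt_per_min must be positive")
    adenosines_to_remove = max(initial_len_nt - threshold_len_nt, 0)
    return adenosines_to_remove / rate_nt_per_min


assumed_rate = 2.0  # hypothetical nucleotides removed per minute
for label, initial_len in (("yeast-like", 60), ("mammal-like", 250)):
    print(label, time_to_threshold(initial_len, assumed_rate))
```

With the same assumed rate, the time scales linearly with the number of adenosines to be removed, which is the sense in which a uniform initial length gives all mRNAs of an organism a common starting point.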
Importantly, certain metazoan mRNAs undergo a second round of polyadenylation in the cytoplasm. Such regulation is important, for example, for translational activation of maternal mRNAs in oocytes and early embryos. Cytoplasmic polyadenylation employs some of the same components of the CPAC which are responsible for nuclear mRNA polyadenylation [134]. In addition, a repertoire of noncanonical poly(A) polymerases and terminal uridylyltransferases can extend poly(A) tails in the nucleus or the cytoplasm, often adding residues other than adenosine in a process named "mixed tailing" in order to impact mRNA stability and translatability [129,135,136]. The recent observation of N6-methylation of adenines within the poly(A) tail adds to the list of ways to slow the activity of deadenylases and stabilize mRNAs [137].
In summary, despite their monotonous sequence, poly(A) tails are highly dynamic structures which integrate various cellular signals to modulate mRNA expression at the post-transcriptional level.

Conclusions and future directions
Cleavage and polyadenylation complex machineries of mammals and yeast display overall a high level of structural conservation and share many functional properties. However, a few structural divergences affecting interactions between essential components of CPAC reflect a requirement for specialization in gene expression and have resulted in important functional differences in how mRNA poly(A) tails are synthesized across species. In the future, the study of CPAC and PABPs will have to be extended to other eukaryotes, including plants, protists, and other fungal species. This will help to establish universal rules for 3 0 -end processing (if any) and give insights into species-specific specialization. These exceptions may, for example, inspire solutions to prevent the growth of fungal pathogens, which infect animals and plants, by targeting their fungal-specific features of poly(A) tail synthesis.
Future mechanistic investigations of RNA 3′-end processing should aim to visualize the CPAC in the pre-cleavage, post-cleavage, polyadenylating, and termination states. Understanding the transitions and dynamics between these states will be challenging, and will require a combination of structural, biochemical, and biophysical approaches. Furthermore, how PABPs interact with the poly(A) RNA and regulate the activity of CPAC will have to be incorporated into these models. Now that reconstituted systems are in place, it has become possible to address in detail how various regulators of CPAC modulate its activities. Finally, these mechanistic queries will have to be combined with systems-level approaches to investigate how the regulation of poly(A) tail synthesis is integrated within gene expression pathways. For example, do various nuclear PABPs in mammals or paralogous genes of CPAC subunits in plants [138] impose gene-, cell type-, or condition-specific regulation of poly(A) length control and mRNA expression?