Physiological and engineered tRNA aminoacylation

Aminoacyl‐tRNA synthetases form the protein family that controls the interpretation of the genetic code, with tRNA aminoacylation being the key chemical step during which an amino acid is assigned to a corresponding sequence of nucleic acids. In consequence, aminoacyl‐tRNA synthetases have been studied in their physiological context, in disease states, and as tools for synthetic biology to enable the expansion of the genetic code. Here, we review the fundamentals of aminoacyl‐tRNA synthetase biology and classification, with a focus on mammalian cytoplasmic enzymes. We compile evidence that the localization of aminoacyl‐tRNA synthetases can be critical in health and disease. In addition, we discuss evidence from synthetic biology which made use of the importance of subcellular localization for efficient manipulation of the protein synthesis machinery.

hypothesis proposed by Francis Crick in the 50s that an adaptor molecule is needed for the correct transfer of new amino acids to a nascent protein chain (Crick, 1958).
The enzymes that catalyze the "charging" of tRNAs with amino acids through an esterification (aminoacylation) are known as aminoacyl-tRNA synthetases (aaRSs) (Ibba & Soll, 2000; Rajendran et al., 2018; P. R. Schimmel & Söll, 1979). Several findings preceded the identification of aaRSs, including the necessity for ATP consumption (Hoagland, 1955), the discovery of aminoacyl adenylate intermediates (Zamecnik et al., 1958), and the identification of tRNA itself (Hoagland et al., 1958). aaRSs are named after the amino acid they charge, with one convention using the three-letter code before -RS. For example, the aaRS that aminoacylates tRNAs with cysteine, cysteinyl-tRNA synthetase, is abbreviated as CysRS. In IUPAC nomenclature, aaRSs are abbreviated by the one-letter code of the amino acid followed by the letters -ARS, with the previous example being CARS. In eukaryotic cells, which possess distinct cytoplasmic and mitochondrial aaRSs, CARS1 denotes the cytoplasmic version, while CARS2 denotes its mitochondrial counterpart. The same abbreviation is used for the corresponding human gene (CARS1). One aaRS for each of the 20 canonical amino acids is found in the mammalian cytoplasm. The exceptions are EPRS, a fusion protein in which GluRS and ProRS are connected (Ray et al., 2011), and PheRS, which is active as a dimer of heterodimers (FARSA and FARSB; Finarov et al., 2010). The EPRS fusion originated in single-celled ancestors of all animals and is hypothesized to balance the demand for both amino acids (Eswarappa et al., 2018). Three of the cytoplasmic aaRSs are also imported into mitochondria, where they charge their cognate mitochondrial tRNAs (Pang et al., 2014). The majority of all tRNAs in fast-proliferating human cells are aminoacylated, except tRNA-Ser and tRNA-Thr (Evans et al., 2017). Heterozygous loss of one aaRS copy does not lead to drastic phenotypes in mice (Groza et al., 2022; Motley et al., 2011). This suggests that aaRS enzymatic activity is usually not limiting for protein synthesis.
In addition to the 20 amino acids used in all species, two other amino acids, selenocysteine (Sec) and pyrrolysine (Pyl), are also genetically encoded and incorporated directly during protein synthesis in a particular subset of organisms (Ambrogelly et al., 2007; Polycarpo et al., 2004; Srinivasan et al., 2002; Stadtman, 1996; Zinoni et al., 1986). Selenocysteine incorporation follows an unusual and indirect mechanism (Driscoll & Copeland, 2003; Stadtman, 1996). Selenocysteine is highly reactive and therefore cells do not possess a pool of the free amino acid; in consequence, there is no aaRS for selenocysteine (Driscoll & Copeland, 2003; Stadtman, 1996). Instead, in mammalian cells, serine is charged to tRNA-Sec, converted on the tRNA in two steps, and transferred to the ribosome by a designated elongation factor (Driscoll & Copeland, 2003; Stadtman, 1996). The first step is catalyzed by an enzyme called L-seryl-tRNA(Sec) kinase (PSTK), which converts the serine that is bound to tRNA-Sec into phosphoserine (Carlson et al., 2004; Diamond et al., 1981; Mäenpää & Bernfield, 1970). In the second step, SepSecS uses selenophosphate to convert the phosphoserine into selenocysteine (Palioura et al., 2009; Yuan et al., 2006). The designated elongation factor EFSec (Dobosz-Bartoszek et al., 2016; Fagegaltier, Hubert, et al., 2000; Fagegaltier, Lescure, et al., 2000) then delivers the selenocysteine-charged tRNA-Sec specifically to SBP-bound mRNAs (SBP, SECIS-binding protein; Copeland et al., 2000; Hilal et al., 2022). SBP recognizes mRNAs with a SECIS motif (SECIS, selenocysteine insertion sequence; Hilal et al., 2022; Hubert et al., 1996), which contains at least two stem loops. This mechanism enables the precise suppression of only those stop codons that are meant to encode selenocysteine and circumvents unwanted translational readthrough. In bacteria, conversion of the serine on tRNA-Sec to selenocysteine is achieved in one step by SelA (Itoh et al., 2013), and SelB, which corresponds to EFSec, binds the SECIS motif directly (Baron et al., 1993; Driscoll & Copeland, 2003).
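The logic of SECIS-dependent recoding can be illustrated with a deliberately simplified sketch. It assumes a toy codon table and a hypothetical boolean flag, `has_secis`, that stands in for the whole SBP/EFSec delivery machinery; real recoding depends on the position and structure of the SECIS element and is not all-or-nothing.

```python
# Toy codon table (illustrative subset only); "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M", "UCU": "S", "GGU": "G",
    "UGA": "*", "UAA": "*", "UAG": "*",
}


def translate(mrna: str, has_secis: bool) -> str:
    """Translate an mRNA; read UGA as selenocysteine (U) only if a SECIS is present.

    `has_secis` is a hypothetical stand-in for SECIS recognition by SBP and
    delivery of Sec-tRNA-Sec by EFSec.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and has_secis:
            protein.append("U")  # one-letter code for selenocysteine
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # ordinary termination
        protein.append(residue)
    return "".join(protein)


message = "AUGUCUUGAGGUUAA"  # AUG UCU UGA GGU UAA
with_secis = translate(message, has_secis=True)
without_secis = translate(message, has_secis=False)
```

In this sketch the same UGA codon terminates translation on an mRNA without a SECIS motif and is read through as selenocysteine on one with it, which is the selectivity the mechanism above provides.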
In contrast, pyrrolysine is charged on tRNA by its own aaRS (Polycarpo et al., 2004) but is not found in mammals and certain bacteria, including Escherichia coli (Ambrogelly et al., 2007; L.-T. Guo et al., 2022). This makes the corresponding aaRS, PylRS, a suitable tool for synthetic biology approaches focusing on genetic code expansion (see Figure 4e and later sections).
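The two naming conventions used throughout this review (three-letter code plus -RS for the protein, such as CysRS or PylRS, and one-letter code plus -ARS with a numeric suffix for the gene-style name, such as CARS1 and CARS2) can be written down as a small helper. This is a sketch for orientation only: the function names are hypothetical, the amino acid table is a subset, and exceptions such as the fused EPRS, the two-subunit PheRS, or enzymes shared between compartments are not handled.

```python
# Three-letter and one-letter codes for a few amino acids (illustrative subset).
AMINO_ACID_CODES = {
    "cysteine": ("Cys", "C"),
    "arginine": ("Arg", "R"),
    "aspartate": ("Asp", "D"),
    "histidine": ("His", "H"),
}

# Suffix convention described in the text: 1 = cytoplasmic, 2 = mitochondrial.
COMPARTMENT_SUFFIX = {"cytoplasmic": "1", "mitochondrial": "2"}


def protein_abbreviation(amino_acid: str) -> str:
    """Three-letter convention, e.g. cysteine -> CysRS."""
    three_letter, _ = AMINO_ACID_CODES[amino_acid]
    return f"{three_letter}RS"


def gene_style_abbreviation(amino_acid: str, compartment: str = "cytoplasmic") -> str:
    """One-letter convention with compartment suffix, e.g. cysteine -> CARS1."""
    _, one_letter = AMINO_ACID_CODES[amino_acid]
    return f"{one_letter}ARS{COMPARTMENT_SUFFIX[compartment]}"


names = [
    (protein_abbreviation(aa), gene_style_abbreviation(aa))
    for aa in ("cysteine", "arginine", "aspartate")
]
```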
All aaRSs aminoacylate tRNA via a two-step mechanism (Figure 1b) (Giegé, 2006; Ibba & Soll, 2000; J. Ling et al., 2009; P. R. Schimmel & Söll, 1979). Initially, an aaRS recognizes and binds its cognate amino acid and ATP. Both molecules react at the enzyme's active site to generate an activated aminoacyl adenylate intermediate and release pyrophosphate, whose hydrolysis drives the reaction forward. While still bound to the enzyme, the amino acid becomes covalently linked to the tRNA by reacting with the tRNA's 3′ end, subsequently releasing AMP. Lastly, the aminoacyl-tRNA is freed from the aaRS, bound by the elongation factor eEF1A, and transferred to the ribosome, where it is used for protein synthesis (Giegé, 2008; Green & Noller, 1997; Kirchner & Ignatova, 2015). In certain aaRSs, the amino acid activation step is only initiated after tRNA binding (Alexander & Schimmel, 1999; Lazard et al., 2000). In addition to the specific recognition of the cognate amino acid and tRNA, the fidelity of genetic code interpretation is ensured through editing activity in certain aaRSs (Jakubowski, 2012). Well-studied examples include LeuRS, IleRS, and ValRS, which must discriminate between amino acids with similar properties. Designated editing domains in these proteins hydrolyze mischarged tRNAs, releasing the incorrect amino acid (Jakubowski, 2012).
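The two-step mechanism described above can be summarized as a reaction scheme. The notation is ours (aa for the cognate amino acid, aa-AMP for the enzyme-bound aminoacyl adenylate); the third line is the pyrophosphate hydrolysis that pulls the overall reaction forward.

```latex
\begin{align}
\text{aa} + \text{ATP} &\rightleftharpoons \text{aa-AMP} + \text{PP}_i
  && \text{(activation)} \\
\text{aa-AMP} + \text{tRNA} &\longrightarrow \text{aa-tRNA} + \text{AMP}
  && \text{(transfer to the 3$'$ end)} \\
\text{PP}_i + \text{H}_2\text{O} &\longrightarrow 2\,\text{P}_i
  && \text{(pyrophosphate hydrolysis)} \\[4pt]
\text{aa} + \text{ATP} + \text{tRNA} + \text{H}_2\text{O} &\longrightarrow \text{aa-tRNA} + \text{AMP} + 2\,\text{P}_i
  && \text{(net)}
\end{align}
```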
aaRSs are classified into Class I and Class II based on their structural and catalytic properties (Eriani et al., 1990; O'Donoghue & Luthey-Schulten, 2003) (Figure 1c,d). Class I enzymes possess a shared protein domain called the Rossmann fold (RF), which contains their active site. The Rossmann fold contains two well-conserved motifs, HIGH and KMSKS, which are connected by the short amino acid sequence connective peptide 1 (CP1; Moras, 1992). Class I aaRSs are mainly monomeric, while class II aaRSs are predominantly active as multimeric units (Perona & Gruic-Sovulj, 2014). The catalytic domain of class II aaRSs consists of an antiparallel six-stranded beta-sheet fold containing three conserved motifs (Eriani et al., 1995). These motifs mediate dimer formation, coupling, and ATP-activation, respectively (Perona & Hadd, 2012). Additionally, the two classes can be further divided into subclasses, which group enzymes that share a phylogenetic history and charge amino acids with common chemical properties (Pang et al., 2014).
Apart from tRNAs, aaRSs have also been shown to aminoacylate other RNAs and lysine side chains on proteins. Several plant virus RNAs are known substrates of a select set of aaRSs (HisRS, TyrRS, ValRS; Dreher, 2010). Interestingly, while several of these viral RNAs mimic the traditional tRNA shape, others diverge significantly (Bonilla et al., 2021; Colussi et al., 2014). These findings raise the question of what other RNAs might be modified by aaRSs if identity motifs can differ so drastically from the traditional tRNA shape. In addition to RNAs, widespread and targeted aminoacylation of lysine side chains in proteins has been shown (X.-D. He et al., 2018; Vo et al., 2018), adding aaRSs to the list of enzymes that catalyze post-translational modifications. Lysine aminoacylation has been shown to buffer misacylation events (Vo et al., 2018) and to signal metabolic changes (X.-D. He et al., 2018).
In addition to their essential housekeeping role in protein synthesis, aaRSs carry out non-translational functions (Arif et al., 2017; M. Guo & Schimmel, 2013; Hyeon et al., 2019; S. Kim et al., 2011; Kwon et al., 2019; Lo et al., 2014; Wei et al., 2019). aaRSs make the perfect blueprint for metabolic sensors: they are ubiquitously expressed, inherently able to bind RNAs, amino acids, and ATP, and they can be "repurposed" by the addition of non-catalytic domains. Since the first reports of cytokine-like activities (Wakasugi & Schimmel, 1999), investigations into the non-translational functions of aaRSs have proven to be a treasure trove, with many more functions likely to be discovered in the future.

| MULTISYNTHETASE COMPLEX
aaRSs form large protein complexes, the so-called multisynthetase complex (MSC). In contrast to other enzyme complexes, which must be fully assembled for activity, aaRSs charge tRNAs independently of complex formation. Across different species, the MSC varies in composition and complexity. Simpler, unicellular organisms form MSCs with a reduced number of aaRSs compared with mammals. In yeast, the homolog of the mammalian MSC is a three-membered complex composed of MetRS, GluRS, and the adapter protein Arc1p (Koehler et al., 2013; Simos et al., 1996). Parasites from the genus Trypanosoma form an MSC with three aaRSs and one adapter protein (Cestari et al., 2013). In contrast, animals with more specialized tissues possess a larger MSC, which suggests a persistent evolutionary pressure toward complex formation (M. Guo et al., 2010; Havrylenko & Mirande, 2015; Hyeon et al., 2019; M. H. Kim & Kim, 2020). The nematode MSC contains eight aaRSs and one adapter protein (M. H. Kim & Kim, 2020). Mutations in one of the mammalian adapter proteins that disrupt complex formation are lethal in mice (M. J. Kim et al., 2003). Plants also assemble 11 aaRSs into a complex; however, its composition differs from the mammalian MSC (McWhite et al., 2020). Recent findings also indicate that complex formation is not limited to cytoplasmic aaRSs. Human mitochondrial aaRSs cluster together, with mitochondrial AlaRS and SerRS forming a complex, and interactions were detected between other mitochondrial aaRSs (Peng et al., 2022).
In mammals, the MSC comprises nine aaRSs (ArgRS, GlnRS, MetRS, IleRS, LeuRS, AspRS, LysRS, and the GluRS and ProRS activities of the fused EPRS; Bandyopadhyay & Deutscher, 1971; M. H. Kim & Kim, 2020; Figure 2a). Three adapter proteins (AIMP1-3, also referred to as p43, p38, and p18) are critical for the integrity of the complex (Cirakoglu et al., 1985; Quevillon et al., 1997, 1999; Quevillon & Mirande, 1996; Robinson et al., 2000), with deletion of AIMP2 causing complex disassembly (J. Y. Kim et al., 2002). A recent addition to the MSC is the ThrRS-L protein (TARS3). ThrRS-L possesses an N-terminal domain resembling the ArgRS leucine zipper but retains the catalytic and editing activities of ThrRS (Chen et al., 2018). Expression levels of ThrRS-L are lower compared with other aaRSs, but ThrRS-L integration into the complex was detected when ArgRS was excluded (Cui et al., 2021). This is especially interesting as gene duplications in aaRSs are rare; one hypothesis is that reliance on only one copy keeps the selective pressure high to maintain fidelity. As such, the mere existence of ThrRS-L is notable and suggestive of a pivotal role of the subcomplex to which both ThrRS-L and ArgRS locate.
The evolutionary drive toward the formation of an MSC suggests an essential function in complex organisms. While the exact role of the MSC remains elusive, complex formation was traditionally proposed to increase protein synthesis efficiency through channeling of tRNAs to the ribosome (Kyriacou & Deutscher, 2008). However, the conservation of global protein synthesis upon exclusion of individual aaRSs from the MSC and a lack of pausing on the corresponding codons support a function that is independent of bulk translation (Cui et al., 2021; J. Y. Kim et al., 2002). It is of note that translation of a subset of genes does change upon the exclusion of individual aaRSs from the MSC, and neural developmental factors were enriched among these (Cui et al., 2021). In line with these findings, an alternative proposed function of the MSC is to sequester aaRSs into an assembly to regulate their function and availability through controlled release at defined subcellular compartments.
Post-translational modifications are known to regulate MSC assembly and the release of aaRSs. Phosphorylation of the EPRS noncatalytic linker (3xWHEP domain) causes EPRS to leave the MSC (Arif et al., 2009, 2019). Once free, EPRS forms a complex which regulates transcript-selective translation, the so-called GAIT complex, by binding to target mRNAs, ribosomal proteins, and the initiation factor eIF4G (Mukhopadhyay et al., 2009). In mast cells, the phosphorylation of LysRS underpins the dynamics between its transcriptional and translational functions (Lee et al., 2004; Yannay-Cohen et al., 2009). Upon phosphorylation of Ser207, LysRS undergoes a conformational change which disrupts its integration into the MSC and causes its release (Yannay-Cohen et al., 2009). Additionally, phosphorylation of LysRS causes its translocation to the nucleus, where LysRS regulates mast cell activation (Ofir-Birin et al., 2013).
While crystal structures for individual aaRSs and MSC subcomplexes have been elucidated, a structure of the complete MSC is still missing. A model of the human holo-complex was reconstituted based on crosslinks that were identified through mass spectrometry (Khan et al., 2020). A mirror-symmetric assembly of the MSC has been postulated (Hyeon et al., 2019; M. H. Kim & Kim, 2020). Questions such as how AspRS and ProRS (a part of EPRS) dimerize, how tRNAs access their respective cognate aaRSs, and how post-translational modifications regulate complex integrity remain to be explored. While a complete picture of the MSC from both functional and structural perspectives has yet to be drawn, recent evidence supports a role of complex formation in aaRS subcellular localization and in controlling ex-translational functions.

| LOCALIZED TRANSLATION IN MAMMALIAN CELLS
Over the last decades, we have come to realize that mRNA translation is often compartmentalized as opposed to uniformly distributed (Figure 2b). In settings where cells need targeted and efficient recruitment of proteins, mRNA localization confers spatially restricted and controlled mRNA translation. Additionally, mRNA localization is more energy efficient than protein localization, as one copy of mRNA can guide the synthesis of many copies of the same protein.
Localized mRNA translation is critical in cell polarization, which is a prerequisite for development (Das et al., 2021; Holt & Bullock, 2009; Medioni et al., 2012). While it was initially thought that mRNA localization was restricted to a minority of transcripts, it is now appreciated that most mRNA transcripts are locally translated (Martin & Ephrussi, 2009). Initial reports uncovered mRNA localization in ascidian embryos and chicken fibroblasts, where cytoskeletal mRNAs are differentially distributed (Jeffery et al., 1983). Subsequent studies in various organisms indicated that asymmetric RNA distribution is a feature conserved across species (Das et al., 2021). Several mechanisms control the spatial distribution of mRNAs in the cell, including the protection of localized mRNAs from degradation, translocation along a polarized cytoskeleton, and diffusion-coupled local entrapment (Medioni et al., 2012). In neurons, mRNAs are preferentially translated either in the axons and dendrites (collectively referred to as the neuropil) or the soma, depending on the function of their protein product (Glock et al., 2021; Holt et al., 2019).

| tRNA TRAFFICKING IN MAMMALIAN CELLS
Given the need for efficient protein synthesis at specific subcellular localizations, a steady supply of aminoacylated tRNAs is also needed at these sites in mammalian cells. Controlling the local supplies of tRNAs, therefore, allows for the spatial regulation of translation.
tRNA abundance is dependent on cell state, and hallmark studies have suggested that tRNAs in human tissues are organized into distinct sets which are used for protein synthesis during cell proliferation or differentiation, respectively (Aharon-Hefetz et al., 2020; Gingold et al., 2014; Figure 2c). Codons from each defined tRNA pool are used for key genes which define each cell state (Gingold et al., 2014). Asymmetric tRNA distribution can thereby enable high translation rates of defined mRNA transcripts even when the overall tRNA concentration is low.
The processes controlling tRNA localization are dynamic and bidirectional. Contrary to our earlier understanding of tRNA biogenesis and function, in which tRNAs are synthesized in the nucleus and exported to the cytoplasm to carry out their duty in a unidirectional fashion, studies showed that tRNA trafficking between the nucleus and the cytoplasm is reversible (Kramer & Hopper, 2013; Schwenzer et al., 2019). Several pathways move tRNAs between the nucleus and the cytoplasm, which can be grouped under primary tRNA nuclear export, tRNA nuclear re-export, and retrograde tRNA transport (Chatterjee et al., 2018). Before export, tRNAs undergo cleavage of their 5′ leader and 3′ trailer sequences, posttranscriptional addition of the 3′ CCA sequence, nucleoside modifications, and splicing of intron-containing transcripts (Chatterjee et al., 2018). tRNAs can subsequently be exported by the nuclear export receptor Exportin-t (Arts et al., 1998; Leisegang et al., 2012).
The interplay between tRNA export and tRNA import is strictly regulated in various cell stress situations. Nuclear accumulation of tRNAs is observed in response to oxidative stress through active retrograde transport of tRNAs in a reversible fashion (Schwenzer et al., 2019). The transport of tRNAs to the nucleus is linked to the integrated stress response pathway (Schwenzer et al., 2019). Moreover, the process is selective for tRNAs truncated at the 3′ end and tRNA-Sec (Schwenzer et al., 2019). tRNAs also accumulate in the nucleus upon nutrient deprivation. While oxidative stress triggers active tRNA retrograde transport, nutrient deprivation impedes tRNA re-export, with consequences for protein synthesis (Schwenzer et al., 2019).
When considering tRNA trafficking in the context of localized protein translation, the well-established separation between the nucleus and the cytoplasm does not capture the nuances of tRNA subcellular localization. In fact, tRNA distribution in the cytoplasm has only recently begun to be investigated, and the available data indicate that tRNA distribution in the cell is asymmetric and actively controlled (Dhakal et al., 2019; Koltun et al., 2020; Pilotte et al., 2018). In neurons, tRNAs were shown to aggregate into granules and are thereby sequestered to local sites of translation (Pilotte et al., 2018). These tRNA granule-like structures are heterogeneous in size and function and occurred both in the presence and absence of other translation factors (Pilotte et al., 2018). It should be noted, though, that tRNAs were externally labeled and then injected, which could influence their behavior (Pilotte et al., 2018). Another study in mouse fibroblasts under nutrient stress echoes the discrete spatial arrangement of cytoplasmic tRNAs. Here, tRNAs were found in aggregates or "hot spots" around the nucleus (Dhakal et al., 2019; Huynh et al., 2010). No single mechanism is established for tRNA localization, but it was proposed that those hot spots could arise from interactions with cellular components such as the MSC or polysomes (Dhakal et al., 2019). The recent evidence found in both neuronal and non-neuronal cells suggests that cytoplasmic tRNA localization is likely controlled and highlights a need for a better understanding of the spatial control of aaRSs themselves.
Recently, regulatory functions of tRNA and tRNA fragments have been discovered (P. Schimmel, 2018), and we have come to appreciate the unexpectedly tight regulation of tRNA availability during protein synthesis. Thus, the spatial distribution of tRNA, specifically of aminoacylated tRNA, and its cell state-specific distribution might be critical regulators of mRNA translation.

| GENETICS AND GENETIC DISEASES
aaRS mutations have been linked to a range of diseases, the most prevalent of which is Charcot-Marie-Tooth (CMT) neuropathy. CMT is the most common inherited neurological disorder, with a study in Western Norway reporting an incidence rate as high as 1 in 2500 individuals (Skre, 1974). This heterogeneous group of disorders is characterized by chronic progressive neuropathy affecting peripheral sensory and motor nerves (Bird, 1993). Its main clinical features are a loss of sensation in the lower extremities and muscle weakness. Wasting of the legs, foot deformities, and claw hands are other common symptoms in CMT patients (Pareyson & Marchesi, 2009). aaRSs form the largest gene family in which mutations have been found to be causative for CMT (Wei et al., 2019). CMT disorders caused by aaRS mutations are predominantly classified as subtypes of CMT type 2, which is characterized by damage to the axons themselves. Six aaRSs (GlyRS, TyrRS, AlaRS, HisRS, TrpRS, MetRS) are linked to CMT. Monoallelic mutations in this set of cytoplasmic aaRSs follow a dominant inheritance pattern, meaning that one copy of the mutant gene is sufficient to cause symptoms. Dominant inheritance patterns suggest either a gain of function or haploinsufficiency, with mouse studies suggesting that the latter is not sufficient to elicit CMT-like symptoms (Groza et al., 2022; Motley et al., 2011). Mutations linked to CMT can, but do not always, disrupt the catalytic activity of the aaRS (Turvey et al., 2022; Wei et al., 2019). Therefore, a gain-of-function mechanism is likely, with studies showing that mutated aaRSs aberrantly sequester other proteins or tRNAs (W. He et al., 2015; Spaulding et al., 2021; Zuko et al., 2021). For example, mutated GlyRS interacts with Neuropilin-1 (NRP1) and thereby competes with VEGFA (W. He et al., 2015). As NRP1 is a membrane protein, this necessitates localization of GlyRS at subcellular sites distinct from the cytoplasm. Interaction between an aaRS and Neuropilin receptors has also been shown for Neuropilin-2 and HisRS (Z. Xu et al., 2020). In addition to aberrant interactions with proteins, GlyRS mutations also cause tighter binding to tRNAs (Mendonsa et al., 2021; Spaulding et al., 2021; Zuko et al., 2021). This reduced the availability of tRNAs for translation and induced intracellular stress signaling (Spaulding et al., 2021; Zuko et al., 2021). Overexpression of tRNAs alleviated the CMT phenotype (Zuko et al., 2021), offering a pathway for therapeutic intervention through tRNA delivery. In both scenarios, the cellular localization of the mutated GlyRS is central to the CMT pathology. Interestingly, five of the six aaRSs that are mutated in CMT are free-standing aaRSs. This observation strengthens the hypothesis that the MSC sequesters aaRSs, as aberrant gain-of-function interactions might be less pathological if the mutant enzyme is contained within a large protein complex.
Hypomyelinating leukodystrophies (HLDs) constitute another heterogeneous group of neurological disorders that affect white matter in the central nervous system. MRI reveals a lack of myelin deposition in HLDs and forms the basis for clinical diagnosis (Fuchs et al., 2019; Wolf et al., 2021). Some clinical features of HLDs include hampered motor function, intellectual disability, and microcephaly, with symptoms varying in severity (Wolf et al., 2021). Unlike CMT, HLDs are inherited in an autosomal recessive fashion, meaning that patients predominantly carry compound heterozygous or homozygous mutations, suggesting a loss of function as the underlying cause (Fuchs et al., 2019). Genes involved in hypomyelination broadly fall under structural myelin proteins, cytoskeletal proteins, oligodendrocyte development genes, and the transcription and translation machinery (Wolf et al., 2021). Mutations within ArgRS (RARS1), EPRS (EPRS1), and AspRS (DARS1), which are all part of the MSC, have been identified as causal for hypomyelination (Wolf et al., 2021). The R339X nonsense mutation in EPRS localizes EPRS to polymeric aggregates in oligodendroglial cells as opposed to being spread through the cell bodies, dysregulating cell differentiation (Sawaguchi et al., 2021). For DARS1, C-terminal and catalytic core mutations have been reported (Fröhlich et al., 2017, 2020; Taft et al., 2013). Aside from causing a loss of catalytic activity, several mutations map to domains which mediate MSC assembly (Figure 3a). Most strikingly, the predominant HLD-causing mutation in the ArgRS gene RARS1 is a D2G mutation right after the start codon, leading to increased expression from a second start codon downstream of the leucine zipper (Mendes et al., 2019). This results in an N-terminally truncated form of ArgRS that is excluded from the MSC (Li et al., 2021). Neurodegenerative and neurodevelopmental disorders also arise from mutations in the non-enzymatic MSC components AIMP1 and AIMP2 (Boespflug-Tanguy et al., 2011; Feinstein et al., 2010; Shukla et al., 2018).
The scope of aaRSs in disease is not limited to neurological disorders but also encompasses autoimmune disorders, which are referred to as antisynthetase syndrome. Antisynthetase syndrome is not associated with causative aaRS mutations but instead arises when autoantibodies against aaRSs are formed by the patient's own immune system (Kanaji et al., 2022). Lungs and muscles are predominantly damaged despite aaRSs being expressed in all cells. Some recurrent clinical presentations are muscle inflammation, interstitial lung disease, mechanic's hands, and arthropathy (Galindo-Feria et al., 2022; Kanaji et al., 2022). Lethality most often arises when the lungs of patients are affected. Eight aaRSs are known to elicit autoantibodies, which can cause antisynthetase syndrome: HisRS, ThrRS, AlaRS, GlyRS, IleRS, AsnRS, PheRS, and TyrRS. The autoantibody against HisRS (Jo-1) is the most common one and was the first to be discovered (Galindo-Feria et al., 2022). Jo-1 specifically recognizes the WHEP domain of HisRS, a small, non-catalytic motif which is found in several aaRSs (Raben et al., 1994). The therapeutic use of the immune-modulatory function of aaRSs is currently being explored, and ongoing clinical trials are promising (aTyr press release). A recent review highlighted the interplay between tRNA and secreted aaRSs in the etiology of antisynthetase syndrome (Kanaji et al., 2022), again suggesting that altered subcellular localization is a hallmark and a determinant of aaRS-driven pathologies.
Recent clinical studies explored dietary supplementation of amino acids in patients with causative aaRS mutations and an enzymatic loss of function (Kok et al., 2021). As amino acids are a natural part of the human diet, this is an attractive option to offset reduced tRNA aminoacylation with few expected side effects. Indeed, treatment was well tolerated and alleviated symptoms in certain patients, coinciding with increased growth and developmental progress (Kok et al., 2021). In the correspondence following this report, a role of the MSC has been suggested (Shen, 2022). It should be noted, though, that in vitro studies on HisRS mutations showed that treatment with increased levels of the cognate amino acid could also cause protein aggregation (Qiu et al., 2022). Future studies into this avenue will surely follow, as this strategy addresses the underlying molecular mechanism of the disease.
It is also noteworthy for all diseases associated with aaRSs that, despite the ubiquitous expression of aaRSs and their importance in all cells and tissues, the initial damage is restricted to a specific organ and even a differentiated cell population within it. As aaRSs partake in ex-translational functions, many of which are tissue-specific, diseases could originate from a disruption of these (Turvey et al., 2022). Several disease-causing mutations in aaRSs and the MSC adapter proteins are unlikely to affect the enzymatic function of the aaRS, but affect complex formation instead (Fuchs et al., 2019; Mendes et al., 2019). Altered or unexpected cellular localization is a characteristic of aaRS-linked disease (Figure 3b). Unraveling and understanding these interconnected functionalities will shed light on the many open questions associated with this ancient protein family.

| GENETIC CODE EXPANSION THROUGH aaRS ENGINEERING
Beyond the characterization of aaRSs in their cellular context, aaRSs have also been extensively studied as tools for biotechnology. Despite the variety of functions proteins can accomplish with the canonical 20 amino acids alone, their chemical diversity is limited. Genetic code expansion can be used to add new chemical properties (Figure 4a) and refers to the incorporation of noncanonical (ncaa) or unnatural amino acids (unaa) into proteins. To this end, the cells' own translation machinery is engineered to accommodate noncanonical amino acids (L. Wang et al., 2006; C. C. Liu & Schultz, 2010).
The desire to expand the repertoire of amino acids can be traced back to the discovery of tRNA. Initial experiments showed that chemically aminoacylating tRNAs was feasible by modifying the dinucleotide pCpA with a noncanonical amino acid and then attaching it to the tRNA acceptor arm (Hecht et al., 1978). Site-specific incorporation into proteins was later achieved by using a tRNA which decodes a stop codon (Noren et al., 1989, 1990). The resulting suppressor tRNA was aminoacylated chemoenzymatically, and three distinct ncaa were successfully introduced into the active site of β-lactamase (Noren et al., 1989). These early studies on genetic code expansion enabled the study of the enzymatic activity of specific aaRS residues and an evaluation of the effects of noncanonical amino acids on protein structure (Chung et al., 1993; Mendel et al., 1992). Although promising, this chemoenzymatic approach had limitations: it was necessary to synthesize a dinucleotide precursor for every new amino acid, and protein production was limited by tRNA availability. Despite optimization (Her & Kluger, 2011), with the rise of directed evolution and more sophisticated methods to engineer enzymes, efforts predominantly shifted toward enzymatic aminoacylation in living cells.
To this end, the natural promiscuity of aaRSs toward substrates that are not found in biological systems has been exploited (Budisa, 2004). Incorporation of selenomethionine is commonly used for experimental phase determination in X-ray crystallography and does not necessitate engineering of the host protein synthesis machinery (Hendrickson et al., 1990). In mammalian systems, amino acid homologues with bio-orthogonal reactivities or properties are used to track metabolic processes in cells and living animals (Bassan et al., 2019; Dieterich et al., 2006). However, this strategy is limited to what endogenous aaRSs will tolerate, thereby excluding bulky side chains and, more generally, amino acids that differ too much from the natural 20. The global replacement of a canonical amino acid can also cause toxicity due to protein misfolding, as observed for several natural products (Akaogi et al., 2006; Song et al., 2017).
FIGURE 4 (a) Genetic code expansion to enable the incorporation of noncanonical amino acids during endogenous protein synthesis. Either the natural promiscuity of endogenous aminoacyl-tRNA synthetases (aaRS) can be exploited (middle panel) or tRNAs and aaRSs can be engineered to tolerate unnatural amino acids (right panel). (b) The concept of orthogonality: tRNAs and aaRSs are optimized to minimize cross-reactivity between the engineered aaRS/tRNA pair and the endogenous aaRS/tRNAs. (c) Lack of cross-reactivity between kingdoms for at least one aaRS/tRNA pair, which can in turn be exploited to generate orthogonal pairs. (d) Decoding of orthogonal codons through aaRS/tRNA pairs. Stop codons and quadruplet codons can be used to encode noncanonical amino acids. Rarely used sense codons can be repurposed. (e) PylRS does not occur in mammals or E. coli and does not aminoacylate E. coli or mammalian tRNAs. Thus, it can be repurposed for the incorporation of unnatural amino acids (upper panel). PylRS is especially suitable due to its naturally deep substrate pocket, which tolerates bulky and elongated amino acid side chains (lower left panel). PylRS does not identify its tRNA by the anticodon, allowing facile replacement with the desired orthogonal codon (lower right panel).
Pioneering work by Wang and Schultz described a system in which aaRSs are engineered to accept unnatural amino acids (Figure 4). An engineered tRNA and aaRS from an organism that is evolutionarily distant from the host were chosen (L. Wang et al., 2001), thereby achieving orthogonality to the endogenous system (Figure 4c). An archaeal TyrRS (Methanococcus jannaschii TyrRS) and a suppressor tRNA (meaning that it decodes a stop codon) were subjected to selection pressure following random mutagenesis (L. Wang et al., 2001). The resulting mutations in five residues close to the active site increased the substrate tolerance of M. jannaschii TyrRS toward noncanonical amino acids (L. Wang et al., 2001). With these orthogonal aaRS/tRNA pairs at hand, a noncanonical derivative of tyrosine was introduced into a protein expressed in E. coli (L. Wang et al., 2001). The same principles were applied to optimize aaRS/tRNA pairs for other noncanonical amino acids, enabling the incorporation of amino acids that differ significantly from the canonical 20 (L. Wang et al., 2001). In addition to stop codon suppression, quadruplet codons can be used to encode noncanonical amino acids, and rare sense codons have been repurposed to enable the simultaneous integration of several noncanonical amino acids into proteins (Figure 4d). It is now possible to incorporate amino acids with bio-orthogonal reactivity, bearing post-translational modifications, or containing chemical groups for direct protein labeling (Shandell et al., 2021; Tang et al., 2022; Young & Schultz, 2018).
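The logic of stop codon suppression can be illustrated with a minimal Python sketch. It is a toy model and not the method of any cited study: the function name `translate`, the placeholder letter `X` for the noncanonical amino acid, and the example mRNA are assumptions made for illustration, and suppression is treated as fully efficient, which ignores competition with release factors.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, suppressed=None):
    """Translate an mRNA in frame from its first base.

    `suppressed` maps codons (for example the amber codon UAG) to the symbol
    inserted by a hypothetical orthogonal aaRS/tRNA pair. Idealized assumption:
    a suppressed codon is always read through.
    """
    suppressed = suppressed or {}
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in suppressed:
            protein.append(suppressed[codon])  # suppressor tRNA decodes the codon
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # unsuppressed stop codon terminates translation
        protein.append(residue)
    return "".join(protein)


# Illustrative reporter: AUG GCU UAG AAA UAA
reporter = "AUGGCUUAGAAAUAA"
truncated = translate(reporter)                  # no orthogonal pair present
full_length = translate(reporter, {"UAG": "X"})  # amber codon reassigned to "X"
```

In this sketch the reporter is truncated at the amber codon unless a suppressor is supplied, in which case translation continues to the next stop codon. Because every UAG in the input would be read through, the same toy model also shows why suppression is not inherently restricted to the target mRNA.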

| GENETIC CODE EXPANSION IN MAMMALS
In 2003, genetic code expansion was achieved in eukaryotic cells (Chin et al., 2003). Building on earlier findings that E. coli TyrRS and its cognate tRNA are orthogonal to the yeast translation machinery, mutant E. coli TyrRS/tRNA pairs were engineered (Chin et al., 2003). In theory, an orthogonal aaRS/tRNA pair optimized in yeast should be functional in mammalian cells, as their translation systems share many characteristics. However, E. coli tRNAs suffer from low transcription levels in mammalian cells because these genes are not recognized by RNA polymerase III (Pol III) (Sakamoto et al., 2002). To address this issue, tRNAs from Bacillus stearothermophilus, which contain elements that are recognized by Pol III, were used instead (W. Liu et al., 2007). With this approach, six different noncanonical amino acids were incorporated (W. Liu et al., 2007). tRNAs can also be placed under the control of an H1 or U6 promoter, which enables the expression of small RNAs in mammalian cells. These promoter elements facilitate high expression levels of orthogonal suppressor tRNAs (W. Wang, Takimoto, et al., 2007) and are now the main choice for genetic code expansion in mammalian cells (Shandell et al., 2021).
A breakthrough came with the engineering of pyrrolysyl-tRNA synthetase (PylRS) and its cognate tRNA, which naturally recognizes a stop codon (Neumann et al., 2008; Figure 4e). PylRS does not interact with the anticodon region of its tRNA and is not present in mammalian cells, making it an excellent starting point for engineering aaRS/tRNA pairs for genetic code expansion (Nozawa et al., 2009; Wan et al., 2010). Additionally, PylRS accepts bulky amino acid side chains, which enables the direct integration of expanded π-systems for fluorescence or bio-orthogonal reactivity driven by ring strain. It even became possible to encode several noncanonical amino acids in one protein: an ochre and an amber stop codon were recoded simultaneously (Xiao et al., 2013).

| OVERCOMING CHALLENGES IN GENETIC CODE EXPANSION BY FORCED COLOCALIZATION
These advances in aaRS/tRNA engineering enabled the synthesis of proteins with new noncanonical amino acids, but issues remain. In theory, a tRNA that recognizes a stop codon suppresses every instance of that codon in the cell, which disrupts general protein synthesis and incorporates noncanonical amino acids where they are not intended. Suppressor tRNAs also compete with release factors, which reduces protein yield. E. coli strains lacking release factors (Johnson et al., 2011) or amber stop codons (Lajoie et al., 2013) were developed to circumvent this issue. An alternative approach is to free up sense codons by replacing rarely used codons with synonymous ones throughout the genome (codon compression). For example, the least used serine codons (out of six) have been replaced in the entire bacterial genome (Fredens et al., 2019).
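Codon compression amounts to a synonymous, in-frame substitution, which the following Python sketch illustrates on a single coding sequence. The function names `recode` and `translate_dna`, the example sequence, and the specific swaps (TCG to AGC, TCA to AGT, TAG to TAA) are illustrative assumptions rather than a description of the published genome design, which additionally had to deal with overlapping genes and regulatory elements.

```python
from itertools import product

# Standard genetic code on DNA codons, enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical compression scheme: each target codon is replaced by a synonym.
RECODING = {"TCG": "AGC", "TCA": "AGT", "TAG": "TAA"}


def codons(cds):
    """Split a coding sequence into in-frame triplets (trailing bases are dropped)."""
    return [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]


def recode(cds, scheme=RECODING):
    """Replace every in-frame occurrence of a target codon with its synonym."""
    for old, new in scheme.items():
        # Guard: a compression scheme must not change the encoded protein.
        assert CODON_TABLE[old] == CODON_TABLE[new], f"{old}->{new} is not synonymous"
    return "".join(scheme.get(c, c) for c in codons(cds))


def translate_dna(cds):
    """Translate all in-frame codons, keeping '*' so stop positions stay visible."""
    return "".join(CODON_TABLE[c] for c in codons(cds))


original = "ATGTCGTCAAGCTAG"    # ATG TCG TCA AGC TAG
compressed = recode(original)   # ATG AGC AGT AGC TAA
```

After recoding, the target codons no longer occur in frame while the encoded protein is unchanged, so in principle they become available for reassignment to a noncanonical amino acid.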
In addition to aaRSs and tRNAs, the ribosome itself can be engineered to increase its tolerance toward noncanonical components (Figure 5a,b). Ribo-X, which is highly optimized for tolerating suppressor tRNAs, carries mutations in its 16S rRNA that decrease interaction with release factor 1 (K. Wang, Neumann, et al., 2007). Moreover, Ribo-X modifications enabled selectivity toward mRNAs with a specific Shine-Dalgarno sequence, thereby creating an orthogonal ribosome/mRNA pair that cannot engage with the endogenous machinery (K. Wang, Neumann, et al., 2007). Ribo-X was further optimized to efficiently decode quadruplet codons, leading to Ribo-Q1 (Neumann et al., 2010). This quadruplet-decoding ribosome can be used alongside different orthogonal aaRS/tRNA pairs to incorporate noncanonical amino acids into a target protein (Neumann et al., 2010).
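How quadruplet decoding differs from ordinary triplet reading can be sketched in a few lines of Python. This is a deterministic toy model under stated assumptions: the function name `decode`, the quadruplet `AGGA`, the placeholder `X`, and the example mRNA are chosen for illustration, and the real competition between triplet and quadruplet decoding at the ribosome is not modeled.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def decode(mrna, quadruplets=None):
    """Read an mRNA from its first base.

    `quadruplets` maps four-base codons to the symbol inserted by a hypothetical
    quadruplet-decoding tRNA. When such a codon is found at the current position,
    four bases are consumed; otherwise the ribosome reads a normal triplet.
    """
    quadruplets = quadruplets or {}
    protein = []
    pos = 0
    while pos + 3 <= len(mrna):
        if mrna[pos:pos + 4] in quadruplets:
            protein.append(quadruplets[mrna[pos:pos + 4]])
            pos += 4  # the reading frame advances by four bases
            continue
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
        pos += 3
    return "".join(protein)


# Illustrative message: AUG AGGA GCU UAA
message = "AUGAGGAGCUUAA"
triplet_only = decode(message)                     # AGG is read as a triplet
with_quadruplet = decode(message, {"AGGA": "X"})   # AGGA is read as one codon
```

Without the quadruplet decoder, the first three bases of `AGGA` are read as an arginine codon and the downstream frame shifts, so the intended stop codon is missed. With the decoder, the frame is maintained and the placeholder residue is inserted at the designated position.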
These efforts focused on the smaller subunit of the ribosome, but the tolerance toward bulky residues is restricted by the larger subunit. However, due to the high concentration of endogenous ribosomes, the chances of two modified subunits assembling into an orthogonal ribosome are low (Orelle et al., 2015). To overcome this, the ribosomal subunits were tethered together with an RNA linker. In Ribo-T, the 23S rRNA is connected to the 16S rRNA by two poly A linkers (Orelle et al., 2015). Mutations in the 50S subunit are otherwise not well tolerated due to their toxicity, but tethering restricts them to the orthogonal ribosome (Carlson et al., 2019). Ribo-T accepts noncanonical amino acids with bulkier side chains that are normally excluded due to the size of the A-site (Carlson et al., 2019). Combining these mutations with those for Ribo-X led to the development of the orthogonal ribosome variant O-d2d8, which produced proteins in good yield and barely interacted with the native translation machinery (Schmied et al., 2018). The final design enables the existence of two completely orthogonal translation systems in the same (bacterial) cell (Aleksashin et al., 2020).
While limited to bacterial systems, these studies highlight the importance of spatial proximity for protein synthesis. Compartmentalization of translation in mammalian cells achieved similar orthogonality (reviewed in Reinkemeier & Lemke, 2021b; Figure 6a). To this end, aaRSs and target mRNAs with stop codons were concentrated in membraneless artificial organelles (Reinkemeier et al., 2019). Both were enriched and colocalized by direct or indirect fusion with disordered protein domains that induce phase separation. Two MS2 stem loops were added to the 3′-UTR of the target mRNA (Reinkemeier et al., 2019). These loops are bound by the major capsid protein (MCP), which in turn was fused to EWSR1 (Reinkemeier et al., 2019). EWSR1 is a disordered protein and undergoes phase separation (Reinkemeier et al., 2019). The orthogonal, engineered PylRS was fused to FUS, another protein that mediates phase separation (Reinkemeier et al., 2019). A truncated motor protein (kinesin) was then added to both to force trafficking of the mRNA and the aaRSs, which form an artificial, phase-separated cellular compartment through the disordered regions attached to them (Reinkemeier et al., 2019). Immunofluorescence and FISH confirmed the formation of organelle-like structures in the cytosol with high concentrations of MS2-labeled RNA and ribosomes (Reinkemeier et al., 2019). Within these organelles, a reporter protein with a noncanonical amino acid was expressed efficiently (Reinkemeier et al., 2019). Importantly, this system was selective against other genes that contained stop codons but not the MS2 recruiting sequences (Reinkemeier et al., 2019). True compartmentalization was thus achieved, enabling the selective suppression of defined stop codons in mammalian cells.
This concept was expanded to enable the construction of several artificial organelles in one cell, each able to produce protein with a different noncanonical amino acid (Figure 6b; Reinkemeier & Lemke, 2021a). PylRS and mRNA-binding MCP were again fused with FUS and EWSR1 but, in addition, peptides that target defined structures in the cell were added (Reinkemeier & Lemke, 2021a). Each artificial organelle selectively produces proteins with distinct noncanonical amino acids (Reinkemeier & Lemke, 2021a), achieving orthogonality between compartments. One stop codon could now encode different noncanonical amino acids in the same cell without interfering with the endogenous protein synthesis machinery. More recently, these artificial organelles were targeted to the microtubule cytoskeleton of mammalian cells (Reinkemeier & Lemke, 2022). This series of studies exemplifies the importance of aminoacylation localization for mRNA translation, as spatial enrichment was sufficient to create protein synthesis hubs that were orthogonal not only to the host system but also to each other.

| CONCLUSION AND OUTLOOK
Here, we provide an overview of aaRS physiology, pathology, and synthetic biology. aaRSs mediate the communication between RNA, metabolites, and proteins, and are central to one of the most basic processes in all living beings, genetic code interpretation. Despite our comparatively detailed understanding of their enzymatic mechanism, new discoveries about aaRSs are constantly made. While the subcellular localization of mRNAs and ribosomes is being studied, the spatial regulation of aaRSs has been far less explored. Examples from human biology, disease states, and synthetic biology strongly support the view that control of aaRS localization is critical. The interpretation of the genetic code is of unarguable importance for all living beings, and the added complexity in mammalian cells forced its organization and compartmentalization. With our growing appreciation of the regulation of protein synthesis beyond housekeeping, small RNA trafficking and function, and the biology underlying diseases, it becomes clear that aaRSs are both regulated and regulators. In parallel, genetic code expansion has matured, so efforts now focus on improving host fitness and the separation between endogenous and engineered components. In doing so, we learn more about the biology of aaRSs and their interplay with the rest of the protein synthesis machinery.

FIGURE 1 (a) Principle of the central dogma in biology, where information is stored as DNA, transcribed into mRNA, and translated into proteins. Lines on top of the mRNA depict triplet codons. (b) Schematic of the aminoacylation reaction catalyzed by aminoacyl-tRNA synthetases. (c) Class I versus Class II aminoacyl-tRNA synthetase fold. Red: catalytic domain. Blue: tRNA (anticodon) binding domain. (d) Classification of human cytoplasmic aminoacyl-tRNA synthetases into class I and class II.

FIGURE 2 (a) Overview of domain structures of human aminoacyl-tRNA synthetases that are part of the MSC. Catalytic, aminoacylation domain; GST, glutathione S-transferase-like motif; LZ, leucine zipper; tRNA, tRNA binding domain; WHEP domain, domain unique to aminoacyl-tRNA synthetases and found, among others, in TrpRS (W), HisRS (H), GluRS (E), and ProRS (P); NTD, N-terminal domain that aids in tRNA binding; UNE-L, domain unique to LeuRS; UNE-I, domains unique to IleRS. (b) Schematic of localized translation within neurons. (c) Overview of tRNA trafficking in mammalian cells.

FIGURE 5 (a) Engineered ribosomes for the incorporation of noncanonical amino acids. Artificial ribosomes can assemble either in cis or in trans to the endogenous ribosomes, with trans-assembly lowering the efficiency of genetic code expansion. (b) Mutations are introduced to increase the tolerance for quadruplet codons, to reduce affinity for competing release factors, and/or to tolerate bulkier side chains. Mutations are inserted into either the large or the small ribosomal subunit (or both). Tethering of the two subunits forces assembly in cis and increases orthogonality as well as tolerance toward mutations in the large subunit.

FIGURE 6 (a) Design of artificial organelles for subcellular enrichment of the genetic code expansion machinery (mRNA with orthogonal codons, orthogonal aaRS) and spatial separation from the endogenous machinery. PylRS mutants and mRNAs are enriched in phase-separated compartments by direct fusion to disordered proteins (PylRS) or by RNA-binding proteins that are fused to disordered proteins (mRNA). (b) Several of these artificial organelles can be established in the same mammalian cell by targeting of the condensate to different locations.