D-Peptide and D-Protein Technology: Recent Advances, Challenges, and Opportunities**

Total chemical protein synthesis provides access to entire D-protein enantiomers enabling unique applications in molecular biology, structural biology, and bioactive compound discovery. Key enzymes involved in the central dogma of molecular biology have been prepared in their D-enantiomeric forms facilitating the development of mirror-image life. Crystallization of a racemic mixture of L-and D-protein enantiomers provides access to high-resolution X-ray structures of polypeptides. Additionally, D-enantiomers of protein drug targets can be used in mirror-image phage display allowing discovery of non-proteolytic D-peptide ligands as lead candidates. This review discusses the unique applications of D-proteins including the synthetic challenges and opportunities.


Introduction
Proteins, like other biomolecules, are composed of chiral building blocks. [1]Ribosomes recruit L-(Levorotatory) aminoacids for catalysis, and hence recombinant proteins are largely refrained in the L-framework unless engineered ribosomes are used. [2]In contrast, incorporation of D-(Dextrorotatory) amino acids into L-polypeptides requires the use of engineered ribosomes, [2a,b] post-translational modification systems (PLPdependent enzyme) [3] or non-ribosomal peptide synthetases (NPRS). [4]At the time of writing, proteins entirely comprised of D-amino acids have yet to be found in nature and must be synthesized via chemical routes.Though being more difficult to prepare, D-amino acid proteins that fold into reciprocal chirality (Figure 1) possess extraordinary potentials in scientific research, spanning from the creation of mirror-image life, mechanistic investigations of natural proteins to the isolation of ultra-stable binders. [5]In this review, we will describe current research surrounding enantiomeric proteins including both the challenges and opportunities.

Towards 'mirror-image life'
Mirror-image life which was first proposed by Louis Pasteur in 1860 [6] refers to the creation of an artificial biosystem with all macromolecules presented in their opposite enantiomeric forms.In these self-replicating systems, L-nucleic acids serve to store genetic information creating D-protein workforce for biological function following the mirror-image central dogma. [7]ndeed.the de novo design of living entities has gained significant attention because of our fundamental interest in understanding the origin of life. [8]While preparation of an entirely self-replicating living entity in mirror image form is a major challenge, D-enantiomers of key enzymes involved in the central dogma of molecular biology have been prepared (Figure 2). [5]These enantiomeric enzymes hold potentials in research.For example, mirror-image polymerases can be used to generate L-nucleotide aptamer libraries [9] or L-genome for bioorthogonal information storage. [10]Also, an enantiomeric ribosome could facilitate access to D-proteins through mirrorimage translation.Furthermore, creation of a self-replicating mirror-image entity can offer access to D-proteins, L-sugars and/or other enantiopure pharmaceutical compounds.Recent examples of preparing D-enzymes towards mirror-image life are summarized below:

Creation of artificial L-polynucleotides by use of mirrorimage nucleic acid polymerases
Template-directed polymerizations of the enantiomeric L-DNA and -RNA were first carried out using the D-protein enantiomer of African swine fever virus polymerase X (D-ASFV pol X). [12] Composed of only 174 residues, ASFV pol X is the smallest DNA polymerase known, hence an ideal candidate for total chemical synthesis.The longest polynucleotide successfully replicated by D-ASFV pol x was 44 nucleotides in length but required fresh enzymes at each cycle due to its weak thermal stability. [12]A significant advancement was achieved by preparing the 358residue enantiomeric P2 DNA polymerase IV (Dpo4) from Saccharolobus solfataricus, which is sufficiently stable to temperature flux and could perform mirror-image polymerase chain reaction (miPCR). [9,13]The D-Dpo4 could successfully create a 120-bp L-DNA sequence encoding the E. coli 5S ribosomal RNA gene rrfB. [9]Its thermostable variant D-Dpo4-3C could assemble a full L-DNA gene encoding protein Ssoo7d. [13]Interestingly, a further-engineered variant D-Dpo4-5m-Y12S was reported that was capable of both transcription and reverse transcription, laying a strong foundation for enabling mirror-image life. [14]n order to create a lengthy enantiomeric gene with high fidelity, access to polymerase enzymes with a low error rate is essential, thus an enantiomeric derivative of Pyrococcus furiosus (Pfu) DNA polymerase has been deduced. [15]Pfu is composed of 775 amino acids reaching 90 kDa in molecular weight, rendering its chemical synthesis challenging.To circumvent this issue, a split version of Pfu was prepared consisting of N-(467residue) and C-(308-residue) fragments.Due to the significant cost of D-isoleucine and its association with the aggregation of peptide fragments, most of them were replaced by other bulky residues including valine and leucine. [10]The split polymerase D-Pfu could synthesize a 1.5-kb mirror-image gene from short, synthetic oligonucleotides.Interestingly, because of the inherent stability towards enzymatic cleavage, a trace amount of artificial L-DNA preserved in water from a local pond remained amplifiable by D-Pfu after one year, whereas D-DNA could not be amplified by L-Pfu after one day.This work is a clear leap forward in the pursuit of mirror-image biology.In addition, the miPCR platform is potentially useful in molecular discovery programs generating nuclease-resistant L-nucleotide aptamers for critical drug targets. [9]

Other life-essential, mirror-image proteins
Enantiomeric ligase is another critical enzyme that can be used to create long stretches of L-DNA. [16]Preparation of a mirrorimage ribosome is exceptionally challenging as it composes of multiple protein and nucleotide subunits. [17]Currently, three out of the ~50 enantiomeric E. coli ribosomal proteins have been reported, including L5, L18 and L25. [18]The D-ribosomal proteins were prepared with native post-translational modifications and interacted specifically with L-5S RNA to form a mirror-image ribonucleoprotein complex.On the other hand, eukaryotic ribosomes consist of 79-80 proteins and four rRNAs, [17] requiring approximately 200 non-ribosomal factors for assembly. [19]Both the ribosomal proteins and rRNA itself also bear post-translational modifications, [20] adding further challenges to their preparation.

Remarks
The genome of the laboratory strain E. coli K12 encodes for approximately 4300 proteins, [21] but only a small fraction of their enantiomeric counterparts have been reported. [22]Many of these proteins bear intrinsic synthetic challenges, because of their size, post-translational modifications and folding (see section 3).Construction of the necessary mirror-image oligonucleotides is also a major challenge.] With the advent of the high fidelity D-Pfu capable of assembling complex genes, [10] one might argue that this challenge is within reach.Given the significant efforts, it is also expected that a mirror-image ribosome from bacteria will soon be reported.In addition, mirror-image tRNAs, tRNA synthetases, and translation factors will need to be prepared to enable the translation of L-mRNA to D-polypeptides.When made available, the mirror-image translation system will be game-changing in the landscape of enantiomeric protein synthesis.Finally, pursuit of a truly self-replicating system will require an approach of devising a minimal cell and assembly of each essential component in mirror-image form.

Racemic protein crystallography
Racemic protein crystallography utilizes synthetic protein enantiomers for crystallization and is a technology particularly useful at yielding atomistic structural insights. [24]Unlike native L-proteins, racemic proteins can crystallize into achiral space groups possessing higher symmetry and order (Figure 3). [25]olving structure based on racemic proteins can be advantageous.As illustrated in the first example, the phase issue in solving the structure of rubredoxin was vastly simplified, because the space group was found to be centrosymmetric with the crystal unit cell containing a center of inversion (Figure 3B). [26]The off-diagonal phases of the X-ray diffraction data obtained from a centrosymmetric crystal cancel out, restricting the phases to 0°or 180°, as opposed to the possible 0°to 360°arising from a homochiral crystal. [24]Additionally, the centrosymmetric crystal has high dimensionality.This generally results in rapid protein crystallization and structure solving with high-resolution detail, in addition to unveiling solute and ligand interactions. [24,27]onsidering the higher order of crystal symmetry and favorable crystal growth, racemic protein crystallography has been used to resolve X-ray structures of numerous proteins up to ultra-high (sub-angstrom) resolution (Table 1).Indeed, 13 (20 % so far) of the reported racemic crystal structures were resolved in the centrosymmetric space group P1 .Many quasiracemic crystal structures, in which the enantiomers differ slightly, are reported to adopt the space group P1 due to a lack of true symmetry. [28]However, it is more appropriate to classify these crystals as pseudo-centrosymmetric or "pseudo-P1 ." [25,29]hile almost half of the racemic structures were resolved in centrosymmetric/pseudo-centrosymmetric space groups, the other half were also resolved in chiral space groups (Table 1).Notably, there are no reported instances of a single enantiomer crystallizing into a chiral space group from a racemate.Thus, crystallization of proteins from racemates may have advantages beyond that explained by the achiral space group theory (for review -see ref. [24]).
Racemic protein crystallography has been most frequently applied to study miniproteins containing fewer than 100 residues.These proteins are known to be difficult to crystallize because of their globular morphology which disfavors crystal packing.Meanwhile, their small sizes render the chemical synthesis of these proteins feasible (see section 3 below for synthetic approaches). [30]Some of the unique insights generated by racemic crystallography are listed in Table 1.

Quaternary states of protein
Oligomeric assemblies are thought to play a common role in the activity of a variety of antimicrobial peptides, particularly those acting on the bacterial membrane. [59]In an aim to elucidate its mechanism of action, the β-sheet antimicrobial peptide originated from Baboons (BTD-2) was chemically synthesized in both L-and D-forms.Interestingly, racemic protein crystallography of BTD-2 revealed a novel anti-parallel trimeric form (Figure 4B).This supramolecular discovery is fibrillike and is postulated to have critical roles in membrane disruption. [39]In another example, melittin is an α-helical antimicrobial peptide isolated from honeybee venom which is known to exert its activity by disrupting the bacterial cell membrane.The tetrameric assembly observed in the solution state is also present in the racemic X-ray crystal structure (Figure 4A). [36][b] Ion channel TM helix 24 2.00 P2 1 /c 4RWB Investigations of a heterochiral coiled coil [27b] M2-TM [c] 1.05 P1 ̄4RWC M2-TM (I39A) [c] 1.55 P1 ̄6MPL [27a] M2-TM (I42A) [c] 1.40 P1 ̄6MPM M2-TM (I42E) [c] 1. Heterochiral interactions between L-and D-protein isomers observed during structure elucidation may also lead to fruitful development in binder creation.In the studies of the transmembrane helix of the influenza M2 ion channel protein (TM-M2), [27b] a heterochiral coiled-coil association was observed between the two peptide enantiomers in the presence of detergent octyl-glucoside (DL-OG) or within the monoolein lipid cubic phase (LCP) (Figure 4D).The LCP structure shows an 11residue helical repeat (hendecad, 3,4,4 spacing) in the coiledcoil, which differs from homochiral coiled-coils that adopts a 7residue helical repeat (heptad, 3,4 spacing).The crystals grown in DL-OG do not form a hendecad repeat, as steric clashes involving Ile39 and Ile42 prevent proper 3,4,4 interaction.27a] Such heterochiral interactions can be used to design D-proteins drugs, which are generally non-proteolytic and non-immunogenic (see section 2.3).The resulting drug-target complexes can also be resolved using racemic protein crystallography to aid in rational optimization (Figure 4E).).All structures were modelled in CCP4MG [58] with data obtained from the Protein Data Bank.

Post-translationally modified proteins
While obtaining homogenous, post-translationally modified (PTM'd) proteins through recombinant methods remains a major technical challenge, [60] chemical protein synthesis offers exquisite atomistic control and thus ensures homogeneity (see also section 3).Racemic protein crystallography of PTM'd proteins was first applied to the glycosylated chemokine Ser-CCL1 protein, for which no structure was reported. [53]The protein was synthesized in the native L-form, followed by sitespecific asparagine N-glycosylation with the native biantennary D-glycan.Synthesis of the corresponding D-enantiomer without glycosylation enabled the co-crystallization of the quasi-racemic protein (Figure 5A).
Another application involves the study of branched ubiquitin chains, where the folding of the branched protein molecules could be solved through racemic protein crystallography.Similarly, D-ubiquitin was prepared in unmodified form [41] and used to facilitate crystallization of the resulting branched L-ubiquitin proteins (Figure 5B). [28]A further example involved the preparation of branched ubiquitin proteins in both enantiomeric forms for racemic protein crystallography (Figure 5C), but the iso-peptide linkages of the D-proteins contained a non-native cysteine residue scar to facilitate ligation. [42] key challenge, as presented in the former examples, is the preparation of PTM'd proteins in all D-form.Glycosylated proteins possess glycans in native D-chirality which would require complex synthesis from the corresponding L-carbohydrates for a true racemic crystal.In addition, preparation of branched ubiquitin chains requires the use of non-natural amino acids as auxiliaries for attachment of the ubiquitin, but they often suffer from poor ligation efficiency (see section 3.3.2.). [42]

Remarks
Challenges associated with D-protein synthesis, folding and PTMs hamper the application of racemic protein crystallography.The average size of a protein ranges from 283-438 residues in length [61] with many bearing PTMs.Obtaining enough D-protein for crystallization screening (generally in milligram range) remains labor-intensive and uneconomical, typically requiring multiple chemical steps and protein refolding.Nevertheless, racemic protein crystallography of miniproteins remains an excellent method for deciphering molecular interactions, particularly serendipitous intermolecular interactions that deem difficult to obtain using homochiral protein crystallography, solution state NMR and/or computational structure prediction. [27,39,44,50]Perhaps, a more promising avenue is to conduct quasi-racemic crystallography where a minimal Dprotein is used to facilitate crystallization of a larger L-protein with (pseudo-)repeated domains. [28]

Identification of drug candidates through mirror-image phage display and related screening technologies
Polypeptide binders can offer significant selectivity and potency, and hence are excellent candidates for the treatment of various diseases and human disorders.One major bottleneck is that many peptide candidates suffer from proteolytic degradation, both limiting the option of the delivery methods and eliciting unwanted immune responses caused by major histocompatibility complex (MHC) presentation by immune cells. [62]eptides comprised of D-amino acids are a viable approach as they are non-recognizable by endogenous proteases. [63]It has been suggested that D-peptide binders can be made by retroinversion (RI) which relies on flipping the entire peptide chain from the N-to C-termini to offset the flip in the side chain chirality. [64]Though some success has been seen in short binders, it was quickly discovered that this double-flip approach did not reinstate the true peptide structure and, in some cases, could drastically weaken the peptide binding. [65]Another approach was to use D-peptides to mimic the shape of the Lpeptide agonist when presented in an MHC, without reference to the L-peptide sequence, acting as a stable vaccine candidate. [66]Recently, binders composed of both D-and Lresidues have been developed through ribosomal engineering [67] or in silico protein design. [68]In order to create binders entirely composed of D-amino acids, the most routine approach is mirror-image phage display (MIPD). [69]In MIPD, D- ).All structures modelled in CCP4MG [58] with data obtained from the protein data bank.

ChemBioChem
Review doi.org/10.1002/cbic.202200537enantiomers of protein targets are synthesized and subjected to L-peptide screening (Figure 6).Due to the nature of mirrorimage symmetry, the same-sequence D-peptide will bind to the native L-protein target with equal affinity, thus yielding an inherently non-proteolytic peptide binder.MIPD has discovered D-peptide binders for a range of targets (Table 2).Key examples are summarized in Table 2.

D-Peptides as potential anticancer lead candidates
Growth factor proteins and/or their receptors are overexpressed in many types of cancer, [70] and hence development of their antagonists can hinder malignant tumor growth as a form of treatment in cancer therapy. [71]Consequently, enantiomeric segments of both the epidermal growth factor (EGF) and vascular endothelial growth factor (VEGF-A) were synthesized for MIPD. [56,72]A 12-residue linear D-peptide ligand for EGF, D-PI_4, was identified with both binding affinity and half-maximal inhibitory concentration (IC 50 ) in micromolar range. [72]In the case of VEGF-A, the mini-protein GB1 was used as a template scaffold to create a 56 residue D-mini-protein RFX001.D that has a binding affinity as low as 85 nM. [56,73]A heterochiral protein complex between a vascular endothelial growth factor (L-VEGF-A) and a D-protein antagonist was also solved using racemic protein crystallography (Figure 4E), providing a foundation for structure-based optimization of the D-protein antagonist. [56]pon optimization, the binder RFX037.D was created, increasing both binding affinity (K D = 6 nM vs 85 nM) and thermal stability (T m > 95 °C vs 33 °C). [74]Of note, RFX037.D was nonimmunogenic in mice, whereas the L-enantiomer generated a strong immune response.In an extension of the MIPD against VEGF-A, bivalent D-protein ligands were also developed using orthogonal MIPD assays with two different scaffold miniproteins (53 and 58 residues). [75]The two best scaffolds were connected via a covalent linkage to yield the bivalent D-protein RFX-V1a2a, with sub-nanomolar (K D = 0.8 nM) affinity for VEGF-A.
Other key targets for cancer treatment are immune checkpoints, [76] which are often suppressed by cancer cells to avoid recognition by the innate immune system.The immuno-

Protein
Function Length [b] Name Type Length [c] K d [μM] [d] Ref. globulin-like variable (IgV) domains are known to govern immune checkpoints, and thus enantiomeric counterparts of the IgV domains of the programmed-cell death protein ligand 1 (PD-L1, 124 residues) and the T-cell immunoglobulin and ITIM domain (TGIT, 119 residues) were synthesized for MIPD. [77]77b]

D-peptides as lead preventive therapeutic candidates
The development of potent D-peptide antagonists of the HIV-1 envelope protein gp41 was shown to prevent viral fusion and entry into cells. [78]78a] Further pharmacokinetic optimization and synthetic scale-up yielded the cholesterol-conjugated trimeric D-peptide CPT31, [79] which is currently in Phase Ia clinical trials for the treatment of HIV.
In another study, MIPD was used to create D-peptides with high affinity for the microtubule-binding protein Tau, preventing self-aggregation in the treatment of tauopathies. [80]The hexapeptides PHF6 (VQIINK) and PHF6* (VQIVYK) were found to promote Tau aggregation in tauopathies such as Alzheimer's disease, and their enantiomers have been used as a target for MIPD. [80,83]Notably, two peptide candidates, MMD3 and MMD3rev, demonstrated cell-penetrating properties, with the ability to cross the cell membrane of neurons. [83]

Remarks
Despite all the research efforts, there is no D-peptide therapeutic that has yet reached the market.To our knowledge, CPT31 is the only D-peptide candidate that has entered earlystage clinical trials, and the estimated success rate of bringing a binder from phase I to approval is 14 %. [87]Discovery of Dpeptide binders remains challenging hampering downstream clinical research and product development.The ultimate challenge of MIPD lies within the preparation of the enantiomeric protein target.Not only can size be a concern, but both the PTM and protein folding status can also pose major synthetic challenges (see section 3 for synthetic approaches).Except for the Tau targeting peptides, many targets are restricted to extracellular protein domains, as cell-penetrating properties of peptide binders are often weak.In addition, the use of MIPD to identify competitive antagonists is limited by the arbitrary selection of off-target binders.Efforts have been directed to computation-based approaches with the goal to replace the tedious synthesis with in silico studies, including the docking of mirror-image helices derived from the PDB; [88] screening of D-tri/tetra peptides against the target active site; [89] virtual affinity maturation based on existing heterochiral structures. [90]However, existing in silico methods suffer from a lack of library diversity and polypeptide binder size, limiting the best example to a binding affinity of 20 μM. [89]

The Current State of the Art in D-Polypeptide Preparation
The most common issue encountered in enantiomeric protein research surrounds their preparation.Since polypeptides entirely composed of D-amino acids cannot be made recombinantly at the time of writing, they must be prepared by chemical synthesis and the current state-of-the-art is summarized as below:

Solid phase peptide synthesis (SPPS)
Allowing stepwise addition of protected amino acids on an insoluble polymer support, solid phase peptide synthesis (SPPS) facilitates access to D-polypeptide chains. [91]Fmoc-protected amino acids have gained popularity over the past two decades as they facilitate the use of milder cleavage conditions. [92]fficient reagents that allow high conversion of amino acid coupling have been reported. [93]The systematic nature of SPPS has also led to automated systems. [94]When paired with microwave irradiation, each amino acid coupling cycle can be performed in four minutes. [95]Recently, a fully automated system for SPPS, where the assembly is complete in a flow system has been developed, yielding complete coupling cycles in less than two minutes, and a full protein up to 164 residues has been assembled. [96]However, a major drawback is its requirement for a large excess of amino acids (6-60 equivalents), a major financial burden when it comes to Dpolypeptide synthesis.This is especially the case when the target proteins contain diastereomeric D-isoleucine, which is significantly higher in cost than other building blocks.

Chemical ligation
To bring down the cost, convergent synthesis of proteins through the assembly of smaller polypeptides by chemical ligation has been achieved. [97]In general, two peptide fragments are chemically bought together by reacting two latent reaction motifs located at the termini.Various methods for chemical protein ligation have been developed, although some suffer drawbacks when preparing polypeptides in opposite chirality due to requiring modified amino acids (Table 3, Entries 1-4).On the other hand, native chemical ligation (NCL) and serinethreonine ligation (STL) employ canonical cysteine or serine/ threonine residues, though the latter is yet to be reported in Dprotein synthesis.NCL is commonly used whereby the thiol group of an N-terminal cysteine peptide and a C-terminal thioester of a reacting pair undergo trans-thioesterification,   [121b] Post-translational modifications 26 Lys ubiquitination δ-mercapto lysine for NCL followed by desulfurization N Accessibility of enantiomeric reagent.
[136a] 27 Solid-phase isopeptide bond formation N Requires assembly of large fragments by SPPS -not costeffective with D-amino acids [156]   ChemBioChem followed by an S-to-N acyl shift to yield a traceless native peptide bond (Figure 7).

Thioester preparation
Activated thioesters cannot be anchored at the C-terminus during Fmoc-SPPS due to their sensitivity to base treatment which is routinely used during deprotection.Consequently, inert precursors including the refined Dawson linker have been developed (Table 3, Entry 6).Following SPPS, the base-stable 3,4-diaminobenzoic acid (Dbz) derivative is activated and subjected to thiolysis to yield the thioester for native chemical ligation.More recently, a second generation Dawson linker recruits an N-methylated amino group minimizing unwanted acylation when an excess of glycine reagent is used (Table 3, Entry 7). [98]Assembly of peptides directly onto the linker amine is a simple and useful feature of the Dawson linker.Alter- Asn N-Glycosylation Fmoc-Asn(Glycan)À OH or Boc-Asn(Xan)À OH (and) further glycosylation on-resin or in solution.
N Accessibility of enantiomeric reagent (would also require Lsugars).
N Accessibility of enantiomeric reagent (would also require Lsugars). [158] 33 Thr O-Glycosylation Fmoc-Thr(Glycan)À OH N Accessibility of enantiomeric reagent (would also require Lsugars).natively, the C-terminal peptide hydrazide can be used to create a C-terminal thioester (Table 3, Entry 5). [99]Activated by the addition of sodium nitrite, the hydrazide can be oxidized into an acyl azide which can be subjected to thiolysis in situ. [100]ydrazine resins are prepared fresh before use, but it has recently been shown that resins can be prepared with Fmochydrazine facilitating long term storage and facile loading quantification. [101]n the convergence of multiple peptide fragments to prepare larger synthetic proteins, some central fragments will be ligated at both N-and C-termini.To prevent intramolecular side-reactions (cyclization), the peptide can be prepared as a thioester surrogate, whereby a C-terminal functional group remains inert in NCL and can be subsequently activated for the next ligation step.77b] Other methods for generating thioester surrogates, such as SEA chemistry (Table 3, Entry 8), [102] could also be used but have yet to be applied to D-protein synthesis.

N-terminal cysteine protection
In the pursuit of more complex protein targets, sequential NCL steps will require orthogonal N-terminal cysteine protecting groups for middle segments (Table 3, Entries 9-13).A common and reliable choice is the protection of cysteine in a thiazolidine ring (Thz), due to the facile conversion into cysteine, compatibility with ligation conditions and its commercial availability in both enantiomeric forms. [103]However, Thz was found to be unstable under the oxidation conditions required for peptide hydrazide oxidation. [99]Numerous efforts have been directed to the development of other orthogonal, N-terminal cysteine protecting groups to facilitate sequential peptide hydrazide ligations (Table 3, Entries 9-12).Preparation of their enantiomeric counterpart is theoretically simple.Indeed, TFA-Thz, which is stable to hydrazide oxidation, was used in the preparation of enantiomeric polymerase D-Dpo4 [9] and ubiquitin. [41]

NCL beyond cysteine
Due to the low abundance of cysteine residues in proteins (< 2 %), [104] significant efforts have been directed to link peptides with alternative amino acids. [105]Desulfurization converts the reactive cysteine thiol CH 2 SH to the CH 3 group of alanine, [106] which is significantly higher in abundance (> 8 %). [107]Techniques include hydrogenation over Pd/Al 2 O 3 or Raney nickel (Table 3, Entry 14). [106]Given the issues associated with purity, a metal-free desulfurization alternative has been developed through a free-radical mechanism (Table 3, Entry 15; Figure 8A).However, native cysteines must be protected during desulfurization reactions.Acetamidomethyl cysteine (Cys(Acm)) is commonly employed in D-protein synthesis, and is commer-cially available in both L-and D-enantiomers for Fmoc-SPPS (Figure 8B). [108]Following desulfurization, the Acm group can be removed using mercury acetate, [137] iodine, [138] or more recently, PdCl 2 . [109]hiols may also be inserted into other canonical amino acids and can be removed by desulfurization (for review, see ref. [110]).However, this approach has not been applied in Dprotein synthesis because of challenges associated with synthesizing the enantiomeric building blocks (Table 3, Entry 16).Commercially available D-penicillamine (D-Pen) may be a viable option for ligation at valine (Figure 8C), [111] but this residue is associated with slow ligation kinetics with all C-terminal thioester sites other than glycine. [112]NCL can also proceed by employing a temporarily inserted thiol auxiliary which can be removed by acid cleavage. [113]Such auxiliaries enable the generation of the lysine isopeptide bonds in the synthesis of branched ubiquitin chains (Table 3, Entry 28). [28,42]Nevertheless, this approach has not yet been applied in the synthesis of the mirror-image D-counterparts (see section 3.3.2.3). [42]

One-pot approach
During the synthesis of large protein targets, it becomes advantageous to conduct steps in a 'one-pot' fashion minimizing lengthy and yield-reducing purification maneuvers.For example, directly after a NCL reaction, the desulfurization step can be performed in one pot, followed by Cys(Acm) deblocking without intermediate purification. [109]A primary limitation of performing desulfurization immediately after NCL is that: a large excess of thiol catalyst such as 4-mercaptophenylacetic acid (MPAA) is needed to improve ligation rates but they also quench the desulfurization reaction. [114]Efforts have been directed toward finding new thiol catalysts that are compatible with desulfurization (Table 3, Entries 17-19).Methyl thioglycolate was found to not interfere with desulfurization whilst providing good catalytic properties, and it has been used in the synthesis of mirror-image ubiquitin for racemic protein crystallography studies. [41]Other efforts to reduce the number of HPLC purification steps include one-pot ligations of numerous frag- ments and performing chemical ligations on a solid support. [57,115]

Enhancing solubility of hydrophobic peptide fragments
The nature of protein has implications on the experimental design.Hydrophobic proteins such as membrane proteins can suffer from aggregation and poor solubility. [116]A solution that can address these issues is to recruit the first-generation removable backbone modification (RBM), which minimizes aggregation whilst allowing conjugation to a poly-arginine solubility tag (Table 3, Entry 22). [117]However, the RBM is installed via a removable glycine auxiliary and thus has limited scope.A second-generation RBM was designed to be installed into all other amino acids, including the challenging Val-Ile junction, making this highly attractive for the synthesis of membrane proteins (Table 3, Entry 23; Figure 9). [118]77b] In addition to RBMs, removable solubilizing tags could also be incorporated onto lysine side chains (Fmoc-Ddae-OH) or (Fmoc-Ddap-OH) using the 'helping-hand' strategies (Table 3, Entries 20, 21). [119]Whilst no D-proteins are currently reported using this method, these tags could possibly be installed onto commercially available Damino acid building blocks.The lysine tags can also be employed to install click handles for templated chemical protein ligations. [120]

Remarks
Since the advent of chemical ligation methods, protein synthesis has transformed into a rapidly evolving field of research.Particularly, native chemical ligation has facilitated the synthesis of numerous D-proteins: enantiomeric venom toxins, [30,[34][35][36]38,45,48,50,52,57] growth factors, [56,72,74,75] and enzymes, [5,7,9,[12][13][14]121] including the 90 kDa split-enzyme D-Pfu. [10] The pursuit omore complex D-proteins is perplexed by their size and post-translational modification status. Multi-segmentone-pot approaches can improve efficiency, but many recruit unique amino-acid reagents that need to be prepared in the laboratories. [122] Another isse involves the rates of ligation reactions, which can potentially be increased by adopting selenocysteine NCL, [123] and template-directed chemical ligations.[124] Easier access to protected building blocks, and their commercial availability, will greatly improve the possibilities for D-protein chemical synthesis.Perhaps, the ultimate goal for D-polypeptide production is to completely circumvent chemical synthesis, with the entire mirror-image translational machinery (ribosome, rRNA, tRNA, tRNA synthetase, etc.) served as a replacement.

Transformation from D-polypeptide to mirror-image protein
A somewhat overlooked challenge is the complexity of protein folding that researchers may need to address during synthetic protein preparation. [126]Typically, a solution of the protein in chaotropic conditions such as guanidine or urea is prepared and then diluted into a refolding buffer. [127]Many larger proteins require chaperones for efficient folding, particularly in vivo where direct control of refolding conditions is limited.121b] This observation suggested that the protein chaperone activity can be achiral and may find broad utility in the pursuit of mirror-image life systems.Two specific challenges associated with protein folding that will be discussed here include correct oxidation of disulfide bonds and installation of post-translational modifications.

Disulfide bond formation
Correct folding of the disulfide bonds is case-dependent, often requiring extensive screening and optimization for each protein.Nevertheless, preparation of enantiomeric disulfide-rich proteins for racemic crystallography has broad utility in generating new structural insights (Table 1 and Figure 10). [30,34,45,50,52]One common method for disulfide bond formation involves diluting the reduced D-polypeptide chain from chaotropic agents in the presence of reduced and oxidized thiols as redox reagents, as reported in the synthesis of D-rC5a. [35]However, misfolded Dprotein often arises as thermodynamically trapped by-products containing mismatched disulfide bonds and adducts with thiol reagents. [30,45]Oxidation by air or DMSO has been used, following careful optimization of buffer additives, reagent concentrations, and pH.However, this method often results in low yields (typically < 50 %), requires large solvent volumes, and proceeds over several days. [30,45,49]To gain additional control, orthogonal cysteine protection followed by pairwise cysteine oxidation may be used.Recently, an orthogonal cysteine protection scheme was developed, encoding a rapid system for disulfide oxidation (Table 3, Entry 24).Using palladium and UV light mediated deprotections, it was shown that up to three correctly paired disulfide bonds could be formed in less than 13 minutes. [128]The use of trityl (Trt) and acetamidomethyl (Acm) protecting groups has been reported for formation two disulfide bonds in a D-protein (Table 3, Entry 24). [52]Another protecting group that can be used is the 2-nitrobenzyl group but must be chemically synthesized (Table 3, Entry 24). [128]isulfide bond formation can also be directed without the use of chaotropic agents or orthogonal protection schemes, such as cysteine-penicillamine pairings or repeat-proline (CPPC) motifs. [129]

Post-translational modifications
To achieve a true, mirror-image protein with reciprocal stereospecific protein activity, PTMs must be incorporated into the Dpolypeptide product.Serine and tyrosine phosphorylation are common PTMs. [130]Whilst the L-serine equivalent (Table 3, Entry 37) is commercially available, the corresponding protected D-Ser building block must be accessed through chemical synthesis. [18]This has been demonstrated in the synthesis of mirror-image ribosomal proteins which are essential in the binding of L-RNA molecules. [18]Phosphorylation and sulfation of tyrosine residues have been incorporated into chemical synthesis of L-proteins [131] using similar protected building blocks (Table 3, Entry 36 & 38).It is reasonable that equivalent building blocks can be prepared in D-enantiomeric form.130c] Glycosylation is another widespread PTM found in proteins. [132]1a] Mirror-image glycoprotein would require installation of L-sugar polymers onto the D-polypeptide and has not yet been reported.However, quasi-racemic protein crystallography of synthetic glycoprotein Ser-CCL1 could be facilitated using the unglycosylated D-enantiomer. [53]In many cases, glycans can play important roles in protein or nucleotide binding, [132] and therefore achieving mirror-image protein glycosylation is a key milestone in research surrounding D-proteins, particularly in the area of MIPD.
Other promising PTMs include lysine trimethylation or acetylation, both of which have been incorporated into the chemical synthesis of L-histones (Table 3, Entries 29, 30). [133]reparation of the corresponding D-histones would require the synthesis of the lysine building blocks in D-form.A more challenging lysine PTM is site-specific ubiquitination. [134]Chemical synthesis of branched ubiquitin chains in native L-form has been reported, [135] typically employing a δ-mercapto lysine to facilitate ligation, followed by desulfurization (Table 3, Entry 26). [134,136]However, preparation of the branched ubiquitin chains for racemic protein crystallography utilized an auxiliary for NCL, following subsequent removal with TFA to generate a native glycine at the branched ligation site (Figure 11A). [28,42]uring preparation of the D-enantiomers of ubiquitinated proteins, the auxiliary was replaced with a cysteine residue to circumvent the low efficiency of the ligations, leaving a nonnative cysteine as a scar at the ligation site (Figure 11B). [42]almitoylation also serves as a promising PTM for incorporation into D-proteins and has been applied to the synthesis of cysteine palmitoylated L-proteins (Table 3, Entries 34, 35). [137]almitic acid is achiral and could be conjugated to D-cysteine  or other D-amino acid residues.Synthesis of palmitoylated Dproteins may find use in racemic protein crystallography [26] which has been applied to reveal the structure of transmembrane protein domains in racemic detergents. [27]In addition, palmitoylation is a useful modification for pharmacokinetic optimization of peptide drugs, [138] including the FDAapproved Liraglutide. [139]This modification may find use in prolonging the half-life of D-peptide drug candidates, as encountered with the conjugation of cholesterol to the promising D-peptide drug candidate, CPT31. [79]

Remarks
Folding of D-polypeptide chains into the desired protein conformation is a challenging task, and often requires optimization in a case dependent manner.Protein refolding protocols can be pre-established using the native recombinant Lcounterparts, [140] and they are often sufficient to fold the corresponding D-enantiomer.In the case where disulfide bond formation is involved, oxidants such as cystine or glutathione disulfide can be used. [52]Alternatively, orthogonal protection schemes can be implemented. [52,128]Most PTM installations can be achieved, particularly the achiral components such as phosphorylation. [18]However, chiral PTMs such as glycosylation largely remains an unresolved synthetic challenge. [53]A plausible solution would be to obtain a mirror-image enzyme (such as endo-glycoside hydrolase), [141] capable of assembling the necessary polysaccharide building block from L-sugars, much like the enantiomeric polymerase enzymes discussed earlier. [9,10,13,14]

Perspectives
Applications of synthetic D-protein enantiomers are vast but remains to be challenged by the difficulty in their preparation, particularly when their complexity increases with their size, folding and post-translational modifications.The ability to translate L-mRNA into D-proteins will be truly revolutionary.The first milestone will be the complete assembly of an enantiomeric ribosome, comprised of D-proteins and L-rRNAs.Since three out of ~50 mirror-image E. coli ribosomal proteins [18] and efficient L-nucleotide polymerases [7,9,10,[12][13][14]16] have been reported, it is anticipated that this ambition will be achieved in the future. Miror-image translation could then be achieved in vitro, [142] using D-aminoacyl-tRNA synthetases to load the L-tRNAs along with the mirror-imaged translational factors.Subsequently, all D-proteins needed in mirror-image life, racemic protein crystallography and D-targets for mirror-image phage display may be obtained by suppling the exogenous Lnucleic acids encoding the desired protein.Perhaps, preparation of entire D-polypeptides can also be achieved using "Flexizyme" technology, [143] which has been reported to incorporate D-amino acids into peptide chains without using enantiomeric translation components.[2b,67a,b] Chemical synthesis remains superior at atomistic control allowing researchers to incorporate building blocks without constraints associated with ribosome-based systems.Complex D-protein targets can be achieved via: engineering of split enzymes, [10] mutational installation of suitable ligation sites [10] and in silico design of accessible D-enzymes.[16] For protein crystallography, smaller Dproteins can be used to facilitate crystallization of complex Lproteins by quasi-racemic protein crystallography. [28] n addition, discovery of D-peptide binders could be achieved via computational-based approaches.[88,89,144] Together, while there are unsolved challenges, D-polypeptide research remains to have strong potentials that can generate explosive impacts on numerous research topics.

Figure 3 .
Figure 3. Illustrative comparison of (A) a homochiral crystal unit cell and (B) a racemic crystal unit cell containing an inversion center (e. g.P1 ).

Figure 6 .
Figure 6.Process of mirror-image phage display.Natural chirality L-proteins are represented in blue, and mirror-image synthetic D-proteins are represented in red.Purple beads represent protein immobilization.

Figure 7 .
Figure 7. Reaction scheme of native chemical ligation.

Figure 9 .
Figure 9. Second generation removable backbone modification.4-Methoxy-5-nitrosalicylaldehyde is installed onto backbone nitrogen during Fmoc-SPPS, followed by assembly of peptide main chain and desired tag sequence.Reversible acetylation of phenol group controls TFA lability for tag removal.

Figure 10 .
Figure 10.Examples of disulfide-rich D-proteins prepared by chemical synthesis, with structures resolved by racemic protein crystallography.

Figure 11 .
Figure 11.Methods for branched protein ligation used in preparation of poly-ubiquitins; (A) glycyl auxiliary mediated NCL for preparation of branched L-proteins and (B) cysteine mediated NCL for preparation of branched D-proteins.

Table 1 .
A decade of racemic protein crystallography (at the time of writing).

Table 3 .
Synthetic methods used in chemical protein synthesis, indicating use in reported D-protein synthesis and potential issues encountered with use.

Table 3 . continued
YRemoval of metal impurities can be problematic.Use of hydrogen gas.Potential side reactions with Trp and Met Quenched by thiol catalyst.Native Cys must be protected.[108] 15 Metal-free radical based VA-044 TCEP tert-butylthiol Y Quenched by thiol catalyst.Native Cys must be protected.

Table 3 . continued
N Additional steps to incorporate and remove tag.Potential issues with stability.Y Limited to Gly only.Lengthy synthesis of building block.Additional steps to incorporate and remove tag.Y Practically limited to two disulfide bonds.Accessibility of a third, orthogonally protected D-Cys building block.