A Tight Contact: The Expanding Application of Salicylaldehydes in Lysine‐Targeting Covalent Drugs

The installation of aldehydes into synthetic protein ligands is an efficient strategy to engage protein lysine residues in remarkably stable imine bonds and to increase the affinity and selectivity of the compounds for their biological targets. The high frequency of lysine residues in proteins and the reversibility of the covalent ligand‐protein bond support the application of aldehyde‐bearing ligands and hold promise for their future use as drugs. This review highlights the increasing exploitation of salicylaldehyde modules in various classes of protein binders, aimed at the reversible‐covalent engagement of lysine residues.


Introduction
Covalent drugs represent a multifaceted group of chemotherapeutics capable of forming a covalent bond with the target protein of interest (POI). This bond is formed at the binding interface by reaction of a specific electrophilic moiety in the drug structure with a nucleophilic residue on the POI. As a result, this intermolecular covalent interaction can increase the drug potency, induce long-lasting pharmacological effects and, in some cases, drive the drug selectivity for the POI over structurally related receptors. Several examples of covalent drugs have been developed over the last century, generated either serendipitously, by screening of compound libraries, or by rational design. [1] In this context, reversible covalent (RC) ligands are gaining increasing attention: in contrast to irreversible covalent binders (often referred to as "suicide inhibitors"), the formation of RC bonds proceeds under thermodynamic control and requires multiple noncovalent interactions to stabilize the ligand-protein complex, limiting off-target binding. [2]

From the energetic point of view, RC ligand binding is typically described as a two-step process (Figure 1A). First, the small molecule (SM) ligand forms a reversible complex with the target protein, stabilized by canonical non-covalent forces. This ligand-protein interaction orients the electrophilic handle in the proximity of an accessible nucleophilic residue of the protein, facilitating the covalent bond formation (rate constant k2). At this stage, the reverse bond-breaking reaction is characterized by the rate constant k−2, which differentiates RC ligands from fully irreversible binders (for which k−2 = 0). While k−2 governs the lifetime of the ligand-protein covalent bond, the overall free energy of the binding process is the result of both non-covalent and covalent states.

To date, RC ligand approaches have predominantly used electrophilic species reacting reversibly with thiol groups of cysteine (Cys) residues, [3] leading to the identification of RC inhibitors of tyrosine kinases, [4] proteases, [5] transcription factors [6] and others. [7] Very recently, Cys-selective RC moieties have also been used in proteolysis-targeting chimeras (PROTACs), [8] which testifies to the importance of extended residence times for the formation of functional ligand-protein complexes. However, Cys is one of the least common amino acids in proteins and most Cys side chains are engaged in disulfide bonds, [9] thus preventing the general use of thiol-reactive electrophiles in RC drugs. On the other hand, lysine (Lys) is highly abundant in the proteome (approximately 6 % of all amino acids in human proteins) and, together with glutamic acid (Glu), it is the most frequent residue on the outer structural layers of proteins. [10]

For these reasons, electrophiles capable of engaging Lys ε-amino groups may serve as portable handles to develop RC binders and to drug clinically relevant targets. For instance, aldehydes [11] represent ideal aminophilic units, but the exploitation of reversible imine bonds between drug and POI may be limited by i) the intrinsic nucleophilicity of the targeted amine (typically lower than that of thiols) and ii) the low imine stability in aqueous media (Figure 1B, Entry 1). Concerning the nucleophilicity, the ability of Lys to form covalent bonds is typically described in terms of pKa values: while Lys(ε-NH2) (pKa = 10.4) is mostly protonated at physiological pH, different structural or mechanical aspects in proteins or protein-protein interfaces can appreciably decrease the amine pKa (by up to 5 units), reducing the extent of protonation and thus enhancing its reactivity. [12,13] In addition to the pKa values, the individual HOMO and LUMO energies of amine and aldehyde reactants were reported to influence the imine bond formation. [14] Nevertheless, these three parameters did not always result in a correct prediction of the rate of imine formation in aqueous solution, where a subset of ortho-substituted benzaldehydes showed exceptional reactivity.
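The link between pKa and the availability of the reactive, unprotonated amine can be illustrated with the Henderson-Hasselbalch relation. The following is a minimal sketch rather than a predictive model: the function name is hypothetical, a physiological pH of 7.4 is assumed, and the intermediate pKa of 8.4 is an arbitrary illustrative value. Only the pKa of 10.4 and the maximum shift of 5 units come from the text.

```python
def neutral_amine_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of an amine present as the free base (R-NH2) at a given pH.

    Henderson-Hasselbalch: [R-NH2] / ([R-NH2] + [R-NH3+]) = 1 / (1 + 10**(pKa - pH)).
    The default pH of 7.4 is an assumption standing in for "physiological pH".
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # 10.4: unperturbed Lys side chain (value quoted in the text)
    # 8.4:  arbitrary intermediate value, for illustration only
    # 5.4:  10.4 lowered by the maximum 5-unit perturbation mentioned in the text
    for pka in (10.4, 8.4, 5.4):
        fraction = neutral_amine_fraction(pka)
        print(f"pKa {pka:4.1f}: {100 * fraction:.2f} % free amine at pH 7.4")
```

Under these assumptions, an unperturbed Lys side chain is present as the free amine to the extent of about 0.1 %, whereas a 5-unit pKa shift leaves about 99 % of the amine unprotonated. The free-amine fraction is only one of the factors discussed above and does not by itself predict the rate of imine formation.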
In 2012, Gois and coworkers described the reaction of Lys(ε-NH2) with ortho-formyl phenylboronic acid (Figure 1B, Entry 2): the resulting iminoboronates showed high stability towards hydrolysis due to a dative bond between the N atom lone pair and the B atom empty orbital. [15] Besides several applications in bioconjugates, [16] iminoboronates have yielded reversible-covalent SM inhibitors directed either to a non-catalytic Lys residue in the anti-apoptotic protein Myeloid cell leukemia 1 (Mcl-1) [17] or to the catalytic Lys in tyrosine kinase BCR-ABL. [18] Furthermore, exploiting the advances in genetically encoded platforms, [19,20] Gao and coworkers have recently incorporated ortho-carbonyl phenylboronic acids into phage-displayed combinatorial peptide libraries, leading to the identification of RC binders for various POIs including Staphylococcus aureus sortase A, the SARS-CoV-2 spike protein and the Tobacco Etch Virus (TEV) protease. [21,22] Although iminoboronates show exceptional stability in water, [23] other mildly reactive ortho-hydroxy aldehydes such as pyridoxal or salicylaldehyde (SA) derivatives have also been used as imine-stabilizing agents. Here, an intramolecular hydrogen bond between the phenolic H and the imine N atom provides an extra contribution of 3 kcal/mol to the imine stabilization (Figure 1B, Entry 3). [24] These classes of aldehydes have been used for the site-selective modification of proteins at the N terminus [25] or at Lys residues. [26] Besides these applications, SA derivatives have also been studied as RC drugs and, particularly in recent years, these reactive units are emerging as general and selective tools in drug discovery to reversibly engage Lys residues in multiple classes of protein targets.
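To put the 3 kcal/mol hydrogen-bond contribution into perspective, it can be converted into an equilibrium-constant ratio through the standard relation K = exp(ΔG/RT). The sketch below assumes a temperature of 298.15 K and that the whole contribution translates into the imine formation equilibrium, which is a simplification. The function and constant names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_fold_change(stabilization_kcal: float,
                            temperature_k: float = 298.15) -> float:
    """Factor by which an equilibrium constant grows for a given stabilization.

    Uses K_stabilized / K_reference = exp(stabilization / (R * T)).
    The default temperature (298.15 K) is an assumption, not a value from the text.
    """
    return math.exp(stabilization_kcal / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # 3 kcal/mol is the hydrogen-bond contribution quoted in the text.
    print(f"{equilibrium_fold_change(3.0):.0f}-fold")
```

Under these assumptions, 3 kcal/mol corresponds to an equilibrium shift of roughly two orders of magnitude in favor of the imine.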

Discussion
In 2011, different high-throughput screening campaigns led to the identification of salicylaldimine derivatives as inhibitors of inositol-requiring enzyme 1 (IRE1). [27,28] The latter is involved in the unfolded protein response pathway, which is a key regulator of endoplasmic reticulum stress and whose chronic activation is implicated in many pathologies. [29,30] In particular, it rapidly emerged that the inhibitory activity of the imine candidates was due to their hydrolysis to SA, [28,31] which engages a Lys(ε-NH2) group buried in the enzyme RNase domain. [32] This discovery led to the clinical evaluation of ORIN1001 (1, Figure 2) as a treatment option against solid tumors in combination with chemotherapy. [33] In another example, the SA group proved to be a key pharmacophore in small binders for mutant hemoglobin (Hb) (compounds 2-4, Figure 2). In particular, sickle cell disease (SCD) results from the mutation of N-terminal glutamate into valine, which promotes the protein precipitation and the characteristic alteration of the red blood cell shape. As extensively described elsewhere, [11] the protein precipitation is abolished by the selective engagement of the Val(α-NH2) residue with SA derivatives and, after 40 years of investigation of potential candidates, Voxelotor (4, Oxbryta™) received marketing authorization in 2019 as the first hemoglobin modifier for SCD treatment. [34] Following these examples, the application of SA derivatives in protein ligands has recently become more general, as this aldehyde proved reactive with a broad range of Lys residues in the proteome.
In 2018, Neri and coworkers proposed the general use of SA as a portable tag to modulate molecular interactions. In particular, pairing protein ligands with a SA handle, each displayed on one of two complementary locked nucleic acid (LNA) strands (Figure 3), could enhance the ligand affinity (i.e. lower KD) for the parent proteins, such as human serum albumin (HSA, see ligand 9) and interleukin 2 (IL-2, 10). By contrast, the binding of sulfonamide ligands to carbonic anhydrase II was not improved, and this was associated with the absence of Lys residues in the area surrounding the sulfonamide binding site. In addition to the experiments with LNA-displayed compounds, the group observed that the conjugation of SA with the simple benzamidine ligand (i.e. a reversible inhibitor of urokinase-type plasminogen activator, uPA) to give small molecule 5 (Figure 2) leads to a 20-fold higher enzyme inhibition compared to control compounds unable to engage the Lys(ε-NH2) group. [35]

Later on, further evidence of the broad Lys reactivity of SA derivatives was provided. In 2020, Tzalis and Ottman reported crystal soaking studies of protein 14-3-3σ, using a group of aldehydes capable of imine bond formation with a pKa-perturbed and hydrophobically buried Lys: in contrast to other aldehydes, the use of a SA derivative led to crystal cracking, which was ascribed to a pan-labelling of the majority of lysine residues in the protein. [36] In 2021, a chemical proteomic analysis reported by Abbasov and Cravatt indicated that, in cancer cell proteomes, SA derivatives are able to engage more Lys(ε-NH2) groups compared to other aminophilic compounds (including boronic acid derivatives). [37] Very recently, the SA group was also installed in kinase inhibitors, particularly selective for BCR-ABL [38] (6, reported as the first example of SA-based, reversible covalent kinase inhibitor in the literature) and Aurora A [39] (7) enzymes. The co-crystal structures of both compounds complexed with the respective targets revealed the clear formation of the covalent bond between inhibitor 6 and the BCR-ABL kinase domain (Figure 4A), as well as between 7 and the Aurora A kinase domain (Figure 4B). Additionally, the anticipated intramolecular hydrogen bond between the imine nitrogen atom and the phenolic proton was observed in both cases, together with other "canonical" non-covalent interactions with the respective proteins.

Figure 2. [...] 1), [33] and more recent kinase inhibitors 6 [38] and 7. [39] Mutant hemoglobin-interacting agents include Tucaresol (2), Valeresol (3) and Voxelotor (4), [11] with the latter approved as a drug for the treatment of sickle cell disease. The benzamidine derivative 5 was reported as a model inhibitor of urokinase-type plasminogen activator (uPA), [35] while 2-hydroxy-1-naphthaldehyde derivatives (8) have been recently described as RC inhibitors of Krev interaction trapped 1 (KRIT1) protein. [40]

Figure 3. LNA duplex bearing a SA tag on one oligonucleotide and a small ligand on a complementary strand. The chemical structures of ligands to human serum albumin (HSA, 9) and interleukin-2 (IL-2, 10) are also shown. In fluorescence anisotropy experiments, these duplexes showed lower KD values for the parent POIs compared to analogues with a non-reactive benzaldehyde unit. [35]
During the preparation of this manuscript, a group of SA-bearing RC inhibitors of the Krev Interaction Trapped 1 Protein (KRIT1) was reported. [40] The interaction of KRIT1 with the Heart of glass 1 (HEG1) protein is crucial in endothelial cell-cell junctions involved in the formation and maintenance of the heart and vessels. This protein-protein interaction was strongly inhibited by 2-hydroxy-1-naphthaldehyde derivatives 8 a-c (Figure 2), where the imine engagement of Lys720 in KRIT1 led to long and tunable residence times. Interestingly, in addition to Lys720, the KRIT1 binding site features two other lysine residues, which were described to contribute to the RC binding of 8 by assisting the ligand orientation. Conceivably, the proximity of three positively charged Lys groups may also enhance the amine nucleophilicity by pKa perturbation due to charge repulsion. [41]

Our group has recently explored the tailored SA installation into peptide ligands, particularly at the N or C termini of a model cyclic peptide bearing the Arg-Gly-Asp (RGD) motif as a well-known recognition sequence for integrin αVβ3. [42] The latter features four solvent-exposed Lys residues near the peptide binding site, which lies at the interface of the protein subunits. Competitive binding assays indicated that the naive ligand strongly inhibits the binding of vitronectin (i.e., a natural integrin ligand bearing the RGD motif) to the receptor (IC50 ≈ 6 nM). Variations of this value were not detected when the peptide was modified with a Lys-unreactive benzaldehyde at either the peptide C or N terminus (compounds 11 and 12, respectively), indicating that the introduction of rather bulky and aromatic substituents is tolerated at both positions (Figure 5A). On the other hand, the use of the Lys-engaging SA tag did alter the peptide binding affinity, which was slightly increased in the case of the C-modified ligand (compound 11-SA, IC50 ≈ 3 nM) and dramatically lowered in the N-modified analogue (compound 12-SA, IC50 > 100 nM). To rationalize this observation, covalent docking studies [43] compared the binding poses of 11-SA (Figure 5B) and 12-SA (Figure 5C) in the X-ray structure of αVβ3, [44] while forcing the covalent interaction of the SA residues with the most accessible Lys(ε-NH2) groups (i.e. β3 Lys125 for 11-SA and β3 Lys253 for 12-SA). In these covalent docking calculations, the forced bond between the SA tag of 11-SA and Lys125 did not alter the noncovalent interactions of the cyclic RGD peptide in the binding pocket, as the latter reproduced the crystallographic pose of the benchmark peptide Cilengitide in 10/10 poses. In contrast, the anchoring of 12-SA to Lys253 was found to destabilize non-covalent interactions in 5/10 poses, in which the cyclic RGD peptide loses the "canonical" Cilengitide pose. These computational analyses were in good agreement with binding studies, as the good fit of 11-SA was reflected by a ≈ 50 % lower IC50 value compared to the control compound 11. This modest increase in binding affinity may result from a combination of different factors, such as a suboptimal length/flexibility of the SA-RGD triazole tether or the low reactivity of the Lys125 ε-amine, whose ammonium ion shows the highest pKa value of the group (pKa = 11.0, calculated in silico). Concerning ligand 12-SA, our data may indicate that, upon non-covalent docking of the RGD peptide, the imine formation between the SA and Lys253 destabilizes the peptide pose. In competitive binding assays, this would facilitate the vitronectin binding to the empty αVβ3 pocket, ultimately increasing the observed IC50 values. This mechanism is further supported by the relatively low pKa (10.4) calculated for Lys253, which favors the formation of the imine bond during binding assays.
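The magnitude of these IC50 shifts can be expressed on an approximate free-energy scale. The sketch below rests on two assumptions that the binding data do not establish: that IC50 ratios approximate ratios of dissociation constants under identical assay conditions, and that the temperature is 298.15 K. The function name is hypothetical, and the result should be read as an order-of-magnitude estimate only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def apparent_ddg(ic50_modified_nm: float, ic50_control_nm: float,
                 temperature_k: float = 298.15) -> float:
    """Apparent change in binding free energy (kcal/mol) from an IC50 ratio.

    Assumes the IC50 ratio approximates the ratio of dissociation constants,
    so that ddG = R * T * ln(IC50_modified / IC50_control).
    Negative values indicate tighter apparent binding.
    """
    return R_KCAL * temperature_k * math.log(ic50_modified_nm / ic50_control_nm)


if __name__ == "__main__":
    # IC50 values (nM) quoted in the text: 11-SA about 3, controls about 6,
    # 12-SA above 100 (so its result is only a lower bound).
    print(f"11-SA vs 11: {apparent_ddg(3.0, 6.0):+.2f} kcal/mol")
    print(f"12-SA vs 12: at least {apparent_ddg(100.0, 6.0):+.2f} kcal/mol")
```

With these inputs, the gain for 11-SA is about 0.4 kcal/mol, consistent with the description of the affinity increase as modest, whereas the loss for 12-SA is at least about 1.7 kcal/mol.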
This recent work promotes the use of covalent docking as a valuable guide during the chemical design of SA-bearing ligands, whenever crystallographic data are available.

Summary and Outlook
In recent years, a growing number of medicinal chemistry reports have described the use of SA derivatives as aminophilic units to engage Lys residues in reversible-covalent bonds, thus stabilizing the ligand-protein complex. These aldehydes proved reactive towards a large number of Lys residues in the proteome and adaptable to different classes of ligands, from synthetic small molecules to peptide and oligonucleotide structures.

Mattia Mason graduated in Chemistry cum laude at Università degli Studi di Milano in 2022, with a Master's thesis on the release of small Toll-like Receptor agonists as immunostimulatory agents. He is currently pursuing his doctoral studies in Chemistry at the same institution, under the supervision of Dr. Dal Corso. In 2023, he was awarded the Marinella Ferrari Prize of the Rotary Club of Milan. His research interests include the synthesis of reversible-covalent drugs and the selective delivery of antitumor agents.

Laura Belvisi graduated in Chemistry cum laude at Università degli Studi di Milano in 1990 and received her PhD in 1994 with Prof. Carlo Scolastico. After a post-doc, she became researcher at Università degli Studi di Milano in 1998 and associate professor of organic chemistry in 2015. Between 2001 and 2011 she directed the Molecular Modeling Laboratory at the Interdepartmental Center for Biomolecular Studies and Industrial Applications of Università degli Studi di Milano. Her research interests focus on the computer-aided design and the study of glycomimetics and peptidomimetics as ligands for protein targets and modulators of sugar-protein or protein-protein interactions.

Luca Pignataro studied Industrial Chemistry at Università degli Studi di Milano, where he graduated in 2003 and received his PhD in 2006 under the supervision of Prof. Franco Cozzi. In 2007 he joined the group of Prof. David Leigh at the University of Edinburgh (UK) as a postdoc. He returned to Italy (2008) in the group of Prof. Cesare Gennari. In 2012 he became a researcher at Università degli Studi di Milano and in 2019 he was appointed associate professor. His main research interests include synthetic methodologies, supramolecular catalysis and medicinal chemistry.

Alberto Dal Corso studied Chemistry at Università degli Studi di Milano, obtaining his PhD in 2015 with Prof. Cesare Gennari. He then joined the group of Prof. Dario Neri at ETH Zürich as a postdoc. In 2018, he returned to Università degli Studi di Milano, where he is now Assistant Professor. In recent years, he has been awarded the Junior Prize "Organic Chemistry for Life Sciences 2019" and the "Primo Levi Award 2020" by the Italian Chemical Society. His research interests include the development of novel drug delivery strategies and the synthesis of ligands for clinically relevant protein targets.

Figure 1. A) Two-step mechanism of RC ligand binding. The first step involves the docking of the SM ligand, forming a reversible complex which directs the RC handle to an accessible nucleophilic residue of the protein. The second step is the formation of the RC bond (lock) that stabilizes the ligand-protein complex. B) Reactivity of different aldehydes with the primary amino group of lysine: "unmodified" benzaldehydes (Entry 1) typically result in hydrolytically unstable imines. By contrast, the use of a boronic acid or a phenolic group in the formyl ortho position enables imine stabilization either by an intramolecular dative bond between the imine N atom and the B centre (Entry 2) or by a hydrogen bond with the acidic phenolic proton (Entry 3).
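The two-step scheme of panel A can be translated into a simple equilibrium estimate. For a ligand that first binds non-covalently and then forms a reversible covalent bond, the overall dissociation constant is the non-covalent one divided by (1 + k2/k−2), and the half-life of the covalent bond towards breaking is ln 2/k−2. The sketch below is a textbook treatment of this scheme, not a method taken from the cited works: the function names and all numerical values are hypothetical and chosen for illustration only.

```python
import math


def overall_dissociation_constant(kd_noncovalent: float,
                                  k2: float, k_minus2: float) -> float:
    """Overall Kd for E + L <=> E.L <=> E-L (reversible covalent binding).

    kd_noncovalent: dissociation constant of the first, non-covalent step (M)
    k2:             rate constant of covalent bond formation (1/s)
    k_minus2:       rate constant of covalent bond breaking (1/s), must be > 0
    """
    if k_minus2 <= 0:
        raise ValueError("k_minus2 must be positive for a reversible ligand")
    return kd_noncovalent / (1.0 + k2 / k_minus2)


def covalent_bond_half_life(k_minus2: float) -> float:
    """Half-life (s) of the covalent adduct with respect to bond breaking."""
    return math.log(2.0) / k_minus2


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    kd_nc, k2, k_m2 = 1e-6, 0.1, 1e-3
    print(f"overall Kd: {overall_dissociation_constant(kd_nc, k2, k_m2):.2e} M")
    print(f"bond half-life: {covalent_bond_half_life(k_m2):.0f} s")
```

In this treatment, a fully irreversible binder (k−2 = 0) has no finite overall dissociation constant, which is why the function rejects that case.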

Figure 4. Cocrystal structures of BCR-ABL kinase with 6 (A, PDB: 7W7Y), [38] and of Aurora A kinase domain with 7 (B, PDB: 7FIC). [39] Both structures show the intermolecular imine bond (red dotted square) between SA and a proximal Lys(ε-NH2) group (Lys271 in BCR-ABL kinase and Lys162 in Aurora A kinase). Moreover, both structures highlight the stabilizing H bond between the phenolic proton and the imine N atom, as well as other non-covalent forces (dotted lines, e.g. H bonds and π-π interactions). The figures were obtained from the published PDBs and originally edited with Schrödinger Maestro graphical interface (Schrödinger Release 2021-1).

Figure 5. A) Molecular structures of RGD peptides featuring a Lys-unreactive benzaldehyde (11 and 12) or a SA tag (11-SA and 12-SA) at either the peptide C or N termini. B) Representative covalent docking pose of the SA-bearing peptide 11-SA, where the cyclic peptide connected through the SA tag to the accessible Lys125 at the β3 integrin subunit overlaps with the X-ray structure of the benchmark Cilengitide ligand (green). C) Representative covalent docking pose of the SA-bearing peptide 12-SA, where the forced interaction of the N-terminal SA with the accessible β3 Lys253 leads to a poor overlay of the cyclic peptide with Cilengitide (green). [42] The covalent docking figures were originally edited with Schrödinger Maestro graphical interface (Schrödinger Release 2021-1).