π‐Hole Interactions Involving Nitro Aromatic Ligands in Protein Structures

Abstract Studying noncanonical intermolecular interactions between a ligand and a protein constitutes an emerging research field. Identifying synthetically accessible molecular fragments that can engage in intermolecular interactions is a key objective in this area. Here, it is shown that so‐called “π‐hole interactions” are present between the nitro moiety in nitro aromatic ligands and lone pairs within protein structures (water and protein carbonyls and sulfurs). Ample structural evidence was found in a PDB analysis and computations reveal interaction energies of about −5 kcal mol−1 for ligand–protein π‐hole interactions. Several examples are highlighted for which a π‐hole interaction is implicated in the superior binding affinity or inhibition of a nitro aromatic ligand versus a similar non‐nitro analogue. The discovery that π‐hole interactions with nitro aromatics are significant within protein structures parallels the finding that halogen bonds are biologically relevant. This has implications for the interpretation of ligand–protein complexation phenomena, for example, involving the more than 50 approved drugs that contain a nitro aromatic moiety.


General computational methods for calculations in the main text:
Molecular electrostatic potential (MEP) surfaces were rendered with Spartan 2018 (version 2.0.7) from structures whose geometries were optimized with density functional theory at the B3LYP [1] /6-31+G* [2] level of theory. The MEP values indicated in Figure 1 were computed at the MP2 [3] /aug-cc-pVDZ [4] level of theory using the B3LYP/6-31+G* geometries. All binding energies reported herein were computed with the Turbomole 7.0 program, employing the spin-component scaled second-order Møller-Plesset method (SCS-MP2) [5] with the 'default 2' triple-zeta valence basis set with polarization functions (def2-TZVP), as this gives an accurate energy at reasonable computational cost and a very low basis set superposition error (BSSE). [6]

1.2. Computation of simple model adducts:
As detailed in Table S1, high-level computations were conducted to explore the binding potential of the π-hole on nitrobenzene (3) or its water adduct (4). The interacting partners chosen were: water, which is abundant in proteins; methanol and dimethyl ether as models for sp3-hybridized O-atoms such as those in serine and threonine; dimethyl thioether as a model for sulfur-containing residues like methionine; and formamide as a model for asparagine, glutamine and amides in general. For these calculations we employed the spin-component scaled second-order Møller-Plesset method (SCS-MP2) [5] with the 'default 2' triple-zeta valence basis set with polarization functions (def2-TZVP), as this gives an accurate energy at reasonable computational cost and a very low basis set superposition error (BSSE). [6] The interaction energies range from -1.7 kcal/mol for [3•••water] to -4.2 kcal/mol for [4•••formamide], and the water adduct of nitrobenzene (4) consistently gives lower-energy (more stable) complexes. In all cases involving O as the interacting atom, a van der Waals overlap of > 0.2 Å is observed between N and O. The longest N•••O distance (d) computed is 2.863 Å for [3•••formamide], which is still 0.207 Å within the sum of the van der Waals radii [7] of N and O (i.e. 1.55 + 1.52 = 3.07 Å). Only a very limited overlap of ≤ 0.04 Å is observed with dimethyl thioether (sum of the van der Waals radii of N and S: 1.55 + 1.80 = 3.35 Å).
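The van der Waals overlap quoted above is plain arithmetic: the sum of the two radii minus the computed distance. A minimal sketch is given below; the function name `vdw_overlap` and the dictionary `VDW_RADII` are illustrative names, not part of the original work, and only the radii quoted in the text are used.

```python
# Van der Waals radii (in Å) as quoted in the text.
VDW_RADII = {"N": 1.55, "O": 1.52, "S": 1.80}


def vdw_overlap(distance, atom_a="N", atom_b="O"):
    """Return (sum of vdW radii) - distance in Å; positive means overlap."""
    return VDW_RADII[atom_a] + VDW_RADII[atom_b] - distance


# Longest computed N...O distance, [3...formamide]: 2.863 Å
print(round(vdw_overlap(2.863), 3))  # 0.207
```

A positive value means the two atoms sit closer than the sum of their radii; a negative value means no overlap.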
To probe the influence of translating the water molecule in [3•••water] (-1.7 kcal/mol) away from N, several additional computations were conducted. To avoid influences on the interaction energies other than the π-hole interaction, the conformation of the water molecule was constrained so that the H2O lone pair points toward the π-hole and the H atoms are not directed toward the nitro group or the aromatic π-system. The interaction energies gathered in Table S2 confirm that a small displacement of the interacting atom from the π-hole has little influence on the interaction energy (maximum displacement 0.35 Å).
An 'atoms in molecules' analysis [8] of the complexes with dimethyl ether and formamide (see Figure S1) clearly revealed bond critical points and bond paths connecting nitrobenzene's N and the O-atom of dimethyl ether or formamide, thus confirming that these π-hole interactions are bonding in nature.
Figure S1 (caption fragment). The value of ρ(r) in a.u. is given for 3/4 in complex with dimethyl ether.

Energy decomposition and Atoms in Molecules Analyses of examples in the main text:
The energy decomposition (or fragment) analysis and the atoms in molecules analysis [8] were done with the Amsterdam Density Functional (ADF) [9] modelling suite at the B3LYP-D3/TZ2P level of theory.

Comparative computational evaluation of ligand-protein complexes in 4nvh and 4nvi:
All heavy atom coordinates were extracted from the PDB entry and the binding pocket was manually trimmed to include (simplified) residues that are close to the ligand. Water molecules shared by both binding pockets (i.e. H2O-428, 662, 664 and 723) were included. All hydrogen atoms were initially added with the automatic editor embedded in Mercury 3.10.1, [10] after which their positions were optimized with simple molecular mechanics. The resulting coordinates were used as input for calculations at the SCS-MP2/def2-SVP level of theory, whereby all atoms except the H-atoms were frozen. The resulting structure was used to compute the binding energy by also running single point calculations on the thus optimized ligand and pocket coordinates. For comparison purposes, an in silico mutant was also evaluated in which the ligand's -Br was replaced by -NO2 for 4NVI, or the ligand's -NO2 group was changed to a -Br group for 4NVH. This procedure was repeated for both structures with all water molecules omitted from the pockets. See also Figure S16-2.
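The binding energy described here is most naturally read as a supermolecular difference. The expression below is a sketch under that assumption; the text does not state the formula, and the symbols are introduced here for illustration only. All three energies are taken as single points at the H-optimized geometry of the complex.

```latex
\Delta E_{\mathrm{bind}} = E_{\mathrm{complex}} - \left( E_{\mathrm{ligand}} + E_{\mathrm{pocket}} \right)
```

With this sign convention a negative value indicates a stabilizing ligand-pocket interaction.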

Query for retrieving data on aromatic nitro ligands in the PDB:
The PDB was inspected with the online Query Sketcher of Relibase version 3.2.1 on the 2nd of March 2016. A schematic overview of the query used is shown in Figure S2. The aryl bonds of the central ArNO2 unit were set to aromatic (solid and dashed black lines) and the other three bonds were specified as 'any type' (only dashed black lines). All covalent bond distances and selected triatomic angles as well as the O1-N-X1-X2 torsion angle were collected to reconstruct the average model used for assessing directionality (see below). The interatomic distance between the interacting atom (red sphere; O in O=C or OH2 or S in SC) and the nitro's N-atom (e, highlighted in red in Figure S2) was set as ≤ 5 Å so that the data were confined within a 10 Å diameter sphere centred on N. The X6-NO2 central unit was marked as a ligand and the interacting atom(s) as part of a protein, or in the case of water the interacting O was specified as water. The number of PDB structures (NPDB) and the number of individual hits (Nhits) that resulted from these queries are collected in the first two columns of Table S3 (the other columns involve narrower datasets, explained below).
Footnotes to Table S3: a) Data within a spherical segment of a 5 Å sphere and │x│ ≤ 2 Å. b) and r ≤ 1 Å. Percentages relative to all the data with r ≤ 1 Å. c) R = any atom but mostly C (methionine).

Method to derive Cartesian coordinates of XNO2, the interacting atom and the ArNO2 model.
The interatomic distances between the interacting atom (red sphere) and X2, O1 and O2, as well as the X1-O1 distance, were also collected (set to ≤ 8 Å). The triangle formed by X1-N-O1 was chosen as the base, and the interacting atom as the tip of a tetrahedron (see Figure S2), so that the Cartesian coordinates {X,Y,Z} of all the atoms could be derived as follows: the N-atom was taken as the centre {0,0,0}, X1 as {0,c,0}, O1 as {x,y,0}, and the interacting atom as {l,m,n}. Distances a-f were measured, from which y, x, m, l and n can be derived using equations (1)-(5), respectively.
Thus, the distance between the interacting atom and the plane defined by O1-N-X1 is n, i.e. the Z-value. With this and the N···interacting atom distance (e), the parallel displacement parameter (r) could be derived according to equation (6):

$$r = \sqrt{e^{2} - n^{2}} \qquad (6)$$

With this procedure the sign of n (i.e. the Z-axis) is always positive, meaning that data in one half of the sphere were reflected to the other half of the sphere to obtain the data within a 5 Å high and 10 Å wide hemisphere.
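The coordinate derivation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' spreadsheet: it assumes the six measured distances are the six edges of the N/X1/O1/interacting-atom tetrahedron, and the function and argument names are hypothetical because the a-f labelling of Figure S2 is not reproduced in the text. The unknowns are solved in the order y, x, m, l, n, followed by r.

```python
import math


def tetrahedron_coordinates(d_n_x1, d_n_o1, d_x1_o1, d_n_p, d_x1_p, d_o1_p):
    """Place N at {0,0,0}, X1 at {0,c,0}, O1 at {x,y,0} and the
    interacting atom P at {l,m,n} from six interatomic distances (Å).

    Argument names are illustrative; O1 is assumed to lie at x > 0.
    """
    c = d_n_x1
    e = d_n_p
    # Law of cosines in the N-X1-O1 base triangle.
    y = (c**2 + d_n_o1**2 - d_x1_o1**2) / (2 * c)
    x = math.sqrt(max(d_n_o1**2 - y**2, 0.0))
    # Projection of P onto the N-X1 axis, then onto the X-axis.
    m = (c**2 + e**2 - d_x1_p**2) / (2 * c)
    l = (d_n_o1**2 + e**2 - d_o1_p**2 - 2 * y * m) / (2 * x)
    # Height above the O1-N-X1 plane; always taken as positive.
    n = math.sqrt(max(e**2 - l**2 - m**2, 0.0))
    # Parallel displacement, r = sqrt(e^2 - n^2).
    r = math.sqrt(max(e**2 - n**2, 0.0))
    return {"c": c, "x": x, "y": y, "l": l, "m": m, "n": n, "r": r}
```

Because n comes from a square root it is never negative, which matches the reflection of all data into one hemisphere described above.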
To obtain all {X,Y,Z} coordinates of the average model for the aromatic nitro central group, it was assumed that both the X6 ring and the NO2 group were planar, where the angle between these two planes is given by the O1-N-X1-X2 torsion angle. The averages of the relevant distances and angles were then used together with the laws of sines and cosines to obtain the {X,Y,Z} coordinates. The relative standard deviations of the parameters used were typically below 5%.

Methods to probe directionality with 3D, 4D density, P(r) and N(d') plots.
As the substituent on the aryl ring is left unspecified, we only scrutinized the data characterized by N···interacting atom ≤ 5 Å and │x│ ≤ 2 Å, meaning that the data are confined within half of a spherical segment centred on N with base dimensions of 4 × 10 Å and a 5 Å radius (see also the volume outlined in red in Figure S3).
Figure S3. Illustration of the aromatic nitro model and the volumes used to assess directionality. The volume outlined in red is a spherical segment centred on N with N•••interacting atom ≤ 5 Å and │x│ ≤ 2 Å.
The amount of data within these bodies is collected in the third column of Table S3 (N│x│ ≤ 2) and the raw 3D data (with a simplified, flattened ArNO2 model) are shown in Figure S4. Four dimensional (4D) density plots were generated by first binning the data (using a custom-built Excel spreadsheet, available on request) in 96 volumes {X [3 x 4 /3 Å], Y [8 x 10 /8 Å], Z [4 x 5 /4 Å]}. The percentage of the total that each volume contains was computed by dividing the number of data in a certain volume by the total amount of data. This density information was projected onto the centre of each volume using OriginPro 8. The size and colour of the spheres in the resulting plots are a visual representation of the density of data, whereby red and larger is denser, and empty and small is less dense. These plots are shown in Figure S5. The {X,Y,Z} coordinates of the model, together with the standard van der Waals radii of C (1.70 Å), N (1.55 Å) and O (1.52 Å), were used to generate a model as a single body 'part' file (.ipt) using Autodesk Inventor® Professional 2016 (by using mm instead of Å). Similarly, a half spherical segment was created (radius 5 mm, width 4 mm and length 10 mm). Half cylinders of 5 mm height and increasing radius (up to 5 mm) were also generated and trimmed so that their width (X-axis) was ≤ 4 mm. All these bodies were collected in an assembly file (.iam), properly aligned, and the half spherical segments were mirrored along the X-axis; the result is illustrated in Figure S3 with (trimmed) half cylinders of 1, 2, 3, 4 and 5 mm radius. These (trimmed) half cylinders are representative of volumes characterized by ≤ r.
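The binning step behind the 4D density plots can be illustrated with a short Python sketch. It is a stand-in for the Excel spreadsheet mentioned above, not a reproduction of it. The axis ranges (X from -2 to 2 Å, Y from -5 to 5 Å, Z from 0 to 5 Å) are assumptions inferred from the stated bin sizes, and the names `GRID`, `bin_index` and `density_percentages` are illustrative.

```python
from collections import Counter

# Assumed axis ranges (Å) and bin counts: 3 x 8 x 4 = 96 volumes.
GRID = (
    (-2.0, 2.0, 3),  # X: 3 bins of 4/3 Å
    (-5.0, 5.0, 8),  # Y: 8 bins of 10/8 Å
    (0.0, 5.0, 4),   # Z: 4 bins of 5/4 Å
)


def bin_index(value, lower, upper, n_bins):
    """Return the bin index of value, or None if it lies outside the axis."""
    if value < lower or value > upper:
        return None
    index = int((value - lower) / (upper - lower) * n_bins)
    return min(index, n_bins - 1)  # put the upper edge in the last bin


def density_percentages(points):
    """Map each occupied (ix, iy, iz) volume to its percentage of all data."""
    counts = Counter()
    for point in points:
        indices = tuple(bin_index(v, *axis) for v, axis in zip(point, GRID))
        if None not in indices:
            counts[indices] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cell: 100.0 * n / total for cell, n in counts.items()}
```

Each returned percentage would then be drawn at the centre of its volume, with sphere size and colour scaled to the value.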
Using the 'Analyse Interference' option in Autodesk Inventor® Professional 2016, the interfering volumes between the model and the (trimmed) half cylinders could be derived. The volume difference between two such interfering volumes of incremental r-values, say ra and rb, thus represents the volume that the model occupies in between two values of r, i.e. Vmodel. Similarly, the interfering volume between the half spherical segment and the (trimmed) half cylinders could be derived as a function of r, from which the volume in between two r-values as found within the half spherical segment could be derived, i.e. Vno model. The actual free volume in between two r-values that a 'host' can occupy, i.e. $V_{free}^{r}$, is thus given by Vno model - Vmodel. The total freely accessible volume, $V_{free}^{total}$, is given by the volume of the half spherical segment minus the volume of the model in the half spherical segment. The random (or volume) distribution as a function of r, i.e. $D_{chance}^{r}$, is thus given by:

$$D_{chance}^{r} = \frac{V_{free}^{r}}{V_{free}^{total}}$$

The actual distribution of the data, $D_{data}^{r}$, is given by the number of data in between two r-values, $N_{data}^{r}$, divided by the total number of data, $N_{data}^{total}$:

$$D_{data}^{r} = \frac{N_{data}^{r}}{N_{data}^{total}}$$

Thus, the chance corrected distribution of data, P(r), is given by:

$$P(r) = \frac{D_{data}^{r}}{D_{chance}^{r}}$$

For an accidental distribution, P should be unity across all r-values; a P-value greater than unity is thus evidence of positive clustering (suggesting a favourable interaction), while P-values smaller than unity reflect a depletion of data (suggesting an unfavourable interaction). These P(r) plots are shown in Figure S6. The data characterized by a parallel displacement parameter of ≤ 1 Å were further inspected to assess possible overlap of van der Waals shells. These data were plotted as percentages as a function of the van der Waals corrected N···interacting atom (O or S) distance (d') in both absolute and cumulative fashion, shown respectively as solid bars and empty circles in Figure S7 (see also Figure 2 in the main text).
Figure S7. N(d') plots where d' = van der Waals corrected N···interacting atom distance and N stands for the relative number of hits in absolute (solid bars) and cumulative (empty circles) fashion. These data were extracted from the PDB and characterized by N···interacting atom ≤ 5 Å, │x│ ≤ 2 Å and r ≤ 1 Å, i.e. the gold region in the inset figures. The interacting partner atoms are: a) C=O oxygen with 384 hits; b) H2O oxygen with 426 hits; and c) CSR sulfur (R can be any atom) with 78 hits. The amount of data involved in apparent overlap of van der Waals shells is given in blue as '% ΣvdW'.
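The chance correction can be sketched numerically as follows. The sketch assumes that P(r) is the fraction of data in an r-interval divided by the fraction of free volume in that interval, and that the listed intervals together make up the whole accessible volume. The function name and argument names are illustrative.

```python
def chance_corrected_distribution(hits_per_interval, free_volume_per_interval):
    """Return P(r) per r-interval: (fraction of data) / (fraction of free volume).

    Both sequences must list the same r-intervals in the same order; the
    totals are taken as the sums over those intervals (an assumption).
    """
    if len(hits_per_interval) != len(free_volume_per_interval):
        raise ValueError("both inputs must cover the same r-intervals")
    n_total = sum(hits_per_interval)
    v_total = sum(free_volume_per_interval)
    p_values = []
    for n_hits, v_free in zip(hits_per_interval, free_volume_per_interval):
        d_data = n_hits / n_total    # actual distribution of the data
        d_chance = v_free / v_total  # random (volume) distribution
        p_values.append(d_data / d_chance)
    return p_values
```

For example, hits of 30, 30 and 40 in intervals with free volumes of 10, 30 and 60 (arbitrary units) give P-values of roughly 3.0, 1.0 and 0.67, i.e. clustering at small r.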

General approach and overview of data.
The four dimensional (4D) density plots and the directionality analyses were less clear than hoped for, likely because the amount of data was too limited. Hence, the data were scrutinized further. First, the data were filtered for r ≤ 1.0 Å (column 'Nr≤1' in Table S3). Then the data were limited to hits with an N(NO2)•••X distance (X = water O, carbonyl O or S) ≤ the sum of the van der Waals radii of N and O/S + 0.5 Å (column 'N≤vdW+0.5' in Table S3, also highlighted in blue). These data contained a total of 170 unique hits, as detailed in Table S4.
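The two filters described above can be written as a small Python sketch. The record layout (dictionaries with keys `atom`, `d` and `r`) and the function name are hypothetical; the radii are those quoted elsewhere in this document.

```python
# Van der Waals radii (Å) as used throughout the text.
VDW_RADII = {"N": 1.55, "O": 1.52, "S": 1.80}


def select_close_contacts(hits, r_max=1.0, margin=0.5):
    """Keep hits with r <= r_max and d <= (vdW sum of N and partner) + margin.

    Each hit is a dict with hypothetical keys:
      'atom' : interacting element, 'O' or 'S'
      'd'    : N...interacting atom distance in Å
      'r'    : parallel displacement parameter in Å
    """
    selected = []
    for hit in hits:
        limit = VDW_RADII["N"] + VDW_RADII[hit["atom"]] + margin
        if hit["r"] <= r_max and hit["d"] <= limit:
            selected.append(hit)
    return selected
```

With the default arguments the distance cut-off is 3.57 Å for O and 3.85 Å for S.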
For all these hits the PDB structures and the articles in which they were published were collected and inspected manually. Instances where structural and/or functional relevance was suspected were processed further, leading to the cases highlighted in Figure S9 - Figure S36. The remaining structures were dismissed (although structural and/or functional significance cannot be ruled out).
In a typical case, a cartoon representation of the protein structure(s) was generated; if a structure contained multiple chains, these were aligned and a cartoon representation of the aligned chains was generated as well. The (aligned) binding pockets of relevant ligands and/or selected residues around the nitro aromatic were also highlighted, and geometric (distance) information is frequently given. When relevant, illustrations of comparative structures are given (sometimes not bearing a nitro aromatic ligand).
In all cases, a LigPlot+ (version 1.4) [11] plot was generated for the ligand involved in the π-hole contact, and in the case of multiple similar structures the one with the shortest contact distance was used. The chemical structure(s) of the ligand(s) is also always given. Molecular electrostatic potential maps (MEPs) were computed with DFT at the BLYP [1] /6-31G* [2] level of theory as follows (using Spartan © 2014). The atomic coordinates of the nitro-bearing ligand were first lifted from the relevant PDB structure and the structure was simplified and/or completed as necessary (e.g. by adding H-atoms). A DFT energy optimization was then performed, ensuring that the measured atomic coordinates remained at their experimentally determined positions. A water molecule was then added in a coplanar hydrogen-bonding geometry with the nitro group to mimic the ligand being bound by the protein. The energy was again minimized (conserving the coordinates of measured atoms) and the MEPs of the resulting structures are used in Figure S9 - Figure S36 with the indicated electropositive potentials of the nitro's π-hole in kcal/mol.

3.2. Examples where a π-hole interaction is highly preserved within similar structures.
Figure S9. Illustrations of possible Ar-NO2 π-hole interactions involving the 20 protein structures 3r28, 3qx4, 3r1q, 3qzi, 3qwk, 3qwj, 3r7v, 3r7e, 3r7i, 3r7u, 3qx2, 3r7y, 3r83, 3rm6, 3qzh, 3r6x, 3r71, 3rpo, 3rai and 3r73. [22] A) Cartoon representation of aligned protein structures with the ligands in capped-sticks mode; B) Capped stick representation of aligned ligands (carbon backbone in grey) with nearby water molecules and the residues in close contact with these water molecules. The small solid spheres represent average atomic coordinates ('dummy atoms') and the distances shown are measured between these average positions. One group of water molecules is consistently found near the π-hole region of the nitro aromatic with an average water O···N NO2 distance of 3.07 Å, which is 0.05 Å within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å) and thus indicative of a π-hole interaction; C) Partial binding pocket of ligand xa0-782 (see also E) as found within 3r28. The same residues and water molecules as in B are shown. The water-735 O···N NO2 distance of 2.98 Å is 0.09 Å within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å) and thus indicative of a π-hole interaction; D) LigPlus plot of ligand xa0-782 in 3r28. Polar contacts are shown as green striped lines with bonding distance.
The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; E) Schematic drawing of the chemical structures of the ligands, with ligand xa0-782 highlighted (used as example in C and D); F) The molecular electrostatic potential map of xa0-782 (atomic coordinates extracted from 3r28) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +150 (blue) to -150 kJ/mol (red) in eight bands. All distances are in Angstroms (Å). NB: These structures are part of a series of 38 crystal structures of cyclin-dependent kinase 2 in complex with small molecule inhibitors. No binding data are available, as all these structures were only published in the PDB. Of these 38 structures, 25 structures had a ligand with the general structure as shown in E, and the 20 structures used here had a very similar binding pocket. [22] See also Figure 3 (left) in the main text.
Figure S10. Illustrations of possible Ar-NO2 π-hole interactions as found within protein structures 1eei, [25] 1lt6, [50] 1llr, [45] 1rdp, [61] 1rf2, [61] 1rcv, [61] 1rd9, [61] 1jqy [35] and 1pzi. [54] A) Cartoon representation of all twelve aligned pentameric protein structures (1lt6 contains two pentamers and 1jqy contains three pentamers); B) Cartoon representation of all 60 protein chains aligned; C) Out of the 60 binding pockets (aligned in B), 45 contained a water molecule and out of those 35 had a similarly oriented ligand. In these structures, the water molecule was encapsulated by two glutamine residues (56 and 61) and the ligand (H-bonding and π-bonding interactions). These structures were overlaid, illustrating that this water binding mode is highly preserved. The distances shown are averages, calculated by creating the average dummy atoms (shown as small spheres) of the atoms involved in the interactions.
Note that the average water O···N NO2 distance of 3.184 Å is very close to the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å) and in some structures this distance is significantly below this benchmark (e.g.  (right) in the main text. NB: in the original paper it is reported that -NO2 bearing ligands are two times better type II dehydroquinase (from Mycobacterium tuberculosis) inhibitors than ligands bearing an -NH2 functionality (see ligand 41a vs 41b and ligand 44a vs 44b in Table 1). [65]
3.3. Examples of likely functional relevance of π-hole interactions with H2O.
Figure S12. Illustrations of possible significance of Ar-NO2 π-hole interactions in the functioning of protein structure 1gnr. [29] A) Cartoon representation of overlaid protein structures of oncogene product p21 H-ras in complex with GTP ligands. B-D) Illustrations of these three different GTP ligands and the adjacent magnesium binding pocket. Note that the ligands in 1gnr (B) and 1gnq (D) are isomers and that 521p (C) contains regular GTP. In structure 1gnr (B) there is a water molecule that seems encapsulated by the ligand with a very short water O···N NO2 distance of 2.450 Å, well within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å). E) LigPlus plot of ligand cag-167 in its binding pocket of 1gnr. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; F) Schematic drawing of the chemical structures of the isomeric ligands cag-167 in 1gnr and 1gnq. G) The molecular electrostatic potential map of cag-167 in 1gnr (atomic coordinates extracted from the PDB) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +150 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å). See also Figure 4 and the main text.
Figure S13.
Illustrations of possible relevance of Ar-NO2 π-hole interactions involving protein structure 1gvs. [31] A) Cartoon representation of the entire protein structure; B) Capped stick representation of the ligand tnf-500 (in grey, see also D) surrounded by the residues in the binding pocket that are H-bonded to the ligand (in green). One water molecule (H2O-2565) is close to one of the ligand's N-atoms, possibly held in place by combined π-bonding (to -NO2) and H-bonding (to the ortho C-H). C) LigPlus plot of ligand tnf-500 in its binding pocket in 1gvs. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; D) Schematic drawing of the chemical structure of ligand tnf-500. E) The molecular electrostatic potential map of tnf-500 (atomic coordinates extracted from the PDB) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +125 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å).
Figure S14. Illustrations of possible relevance of Ar-NO2 π-hole interactions involving protein structure 2y59. [41] A) Cartoon representation of the entire protein structure; B) Cartoon representation of aligned chains A-D; C) Capped stick representation of aligned binding pockets of chains A-D (4 Å residues around the ligands) and their ligands (za3-500, see also H), showing that the binding pocket is highly preserved throughout the series; D) Capped stick representation of aligned binding pockets (4 Å residues around the ligands) of chains A and D wherein the ligand is covalently bound to Serine-49 via the ligand's boronate (CBO3). The residues involved in polar contacts are shown as thicker sticks, as is tyrosine-147, which is π-π stacking with the ligand's nitroaryl moiety.
The distances shown are those found in chain D (hence the asterisk in the figure); E) Capped stick representation of aligned binding pockets (4 Å residues around the ligands) of chains B and C wherein the ligand is covalently bound to Serine-49, Serine-298 and Lysine-410 via the ligand's boronate (CBNO2). The residues involved in polar contacts are shown as thicker sticks and the distances shown are those found in chain B (hence the asterisk in the figure). Threonine-413 is H-bonded to the nitro moiety (NH•••O) and its carbonyl-O is in close proximity to the ligand's nitro π-hole region with a Thr-413 O···N NO2 distance of 3.379 Å; F) and G) LigPlus plots of ligand za3-500 in chains D and B of 2y59, respectively. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; H) Schematic drawing of the chemical structure of ligand za3-500, wherein the boronate is generalised as unbound -BH2; I) The molecular electrostatic potential map of za3-500 (atomic coordinates extracted from chain B of 2y59 and the boronate simplified as neutral -BH2) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +125 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å). See also Figure 4 and the main text.
Figure S15. Illustrations of possible relevance of Ar-NO2 π-hole interactions involving protein structure 4qo7. [78] A) Cartoon representation of aligned chains A-D; B) Cartoon representation of the entire protein structure; C) Capped stick representation of aligned binding pockets of chains A, C and D (4 Å residues around the ligands; chain B is empty) and their ligands (36v-803, see also E), showing that the binding pocket is highly preserved throughout the chains.
The residues involved in polar contacts are shown as thicker sticks and the distances shown concern chain A (hence the asterisk in the figure). Asparagine-165, shown as the thickest sticks, is in close proximity to the ligand's nitro π-hole region with an Asn-165 O···N NO2 distance of 3.376 Å; D) LigPlus plot of ligand 36v-803 in chain A of 4qo7. Polar contacts are shown as green striped lines with bonding distances. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; E) Schematic drawing of the chemical structure of ligand 36v-803; F) The molecular electrostatic potential map of 36v-803 (atomic coordinates extracted from chain A of 4qo7) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +125 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å). See also Figure 4 and the main text.

Examples of likely functional relevance of π-hole interactions with C=O.
Figure S16-1. Illustrations of possible relevance of Ar-NO2 π-hole interactions involving protein structure 4nvh compared to structures 4nvi and 4nvg. [72] A) Cartoon representation of the entire protein structures aligned with one another; B) Capped stick representation of aligned binding pockets (4 Å residues around the ligands) and their ligands (2nb-302, 2nw-302 and 2n9-302, see also G), showing that the binding pocket is highly preserved throughout the series. The methionine residue close to the nitro/bromo/ester moieties is shown as thicker sticks; C) The binding pocket of 2nb-302 in 4nvh (4 Å residues around the ligands) with a short Met-228 O···N NO2 distance of 3.084 Å (nearly within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å)); D) and E) are the same as C), but for 2nw-302 in 4nvi and 2n9-302 in 4nvg. The

Figure S17.
Illustrations of possible functionality of Ar-NO2 π-hole interactions as found within protein structures 4r6s and 4r2u. [84] A) Cartoon representation of both structures aligned (4r6s has two chains labelled A and B, while 4r2u has two chains labelled A and D); B) Cartoon representation of all four chains aligned; C) The stereoisomeric ligands (see also E) within chain B of 4r6s (thick sticks) and chain D of 4r2u (narrow sticks), which have a glutamine (286) and leucine (453) residue in close proximity to their nitro moiety. For the isomer in structure 4r6s a very close Gln-286 O···N NO2 distance of 2.806 Å is observed, which is well within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å). This is not the case in 4r2u, where the glutamine residue is actually facing away from the Ar-NO2 moiety. D) LigPlus plot of ligand 3k2-501 in its binding pocket within chain B of 4r6s. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; E) Schematic drawing of the chemical structures of ligands 3k2-501 and 3jx-501. F) The molecular electrostatic potential map of 3k2-501 (atomic coordinates extracted from the PDB) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +125 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å). See also Figure 4 and the main text.
Figure S18. Illustrations of possible relevance of Ar-NO2 π-hole interactions involving protein structures 2w7x, [38] 2yir, [42] 2ycq [39] and 2xk9 [39] (in which the ligands bear the -NO2 functionality), and 2ycr, 2ycs and 2ycf [39] (in which the ligands do not bear the -NO2 functionality).
[38], [39], [42] A) Cartoon representation of aligned structures; B) Capped stick representation of aligned binding pockets (4 Å residues around the ligands) and their ligands, showing that the binding pocket is highly preserved throughout the series. The residues involved in polar contacts are shown as thick sticks: glutamic acid 273 is involved in classical H-bonding and methionine 304 is concurrently involved in NH···O NO2 hydrogen bonding and Met-304 O···N NO2 π-hole bonding with an average distance of 2.874 Å (of four ligands), which is well within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å); C) The four individual structures and their ligands and π-hole bonding distances; D) LigPlus plot of ligand d1a-601 in its binding pocket in 2w7x. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; E) Schematic drawing of the chemical structures of the ligands bearing a nitro group (2w7x, 2yir, 2ycq, 2xk9) and those not bearing a nitro group (2ycs, 2ycr, 2ycf). F) The molecular electrostatic potential map of d1a-601 (atomic coordinates extracted from the PDB) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +125 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å). See also Figure 4 and Figure 5 (middle) and the main text.
3.6. π-hole interactions that seem at least structurally relevant with H2O.
Figure S23. Illustrations of possible Ar-NO2 π-hole interactions involving protein structure 4mik.

Example of likely functional relevance of π-hole interaction with C-S.
[57] A) Cartoon representations of protein structure 4mik with chains A and B aligned (left) and separate (right) and the ligands in space filling mode; B) Capped stick representation of the aligned binding pockets (4 Å residues around the ligand) of ligand jil-301 (shown as thickest sticks with a grey C-backbone, see also F) as found in chains A and B. The residues involved in polar contacts are shown as thicker sticks; C) Selected part from the binding pocket of jil-301 highlighting the nitro aromatic moiety (thicker grey sticks) and the residues surrounding it. This view has the same perspective as in B and the same residues as in D; D) Partial aligned binding pockets of jil-301 (same residues as in C but from a different perspective) as found in chains A and B of 4mik. The distances shown are taken from chain B (hence the asterisk in the figure). Water molecule H2O-552 has polar contacts with residues Arginine-44 and Asparagine-39, and water molecule H2O-537. Concurrently, the H2O-552 O-atom is in close proximity to the π-hole of ligand jil-301 (grey) with a water-552 O···N NO2 distance of 2.613 Å, which is 0.457 Å within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å) and thus indicative of a π-hole interaction; E) LigPlus plot of ligand jil-301 in chain B of 4mik. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; F) Schematic drawing of the chemical structure of ligand jil-301; G) The molecular electrostatic potential map of jil-301 (atomic coordinates extracted from chain B of 4mik) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +125 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å).
Figure S24. Illustrations of possible Ar-NO2 π-hole interactions involving protein structure 2wbk.
[17] A) Cartoon representation of chains A and B in protein 2wbk with ligands m2f-1869 (see also E) in space-filling mode; B) Cartoon representation of aligned chains A and B of protein 2wbk; C) Capped stick representation of the binding pockets of water molecule H2O-2567, revealing polar contacts with residues Asparagine-199 and Tyrosine-537 (green). The water O-atom is also in close proximity to the π-hole of ligand m2f-1869 (grey) with a water-2567 O···N NO2 distance of 2.702 Å (in chain A, hence the asterisk in the figure), which is 0.368 Å within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å) and thus indicative of a π-hole interaction; D) LigPlus plot of ligand m2f-1869 in chain A of 2wbk. Note that the nitro moiety possibly involved in the π-hole interaction (ortho relative to the carbohydrate part) is H-bonded to Asn-178 but, according to current conventional criteria, not bound to H2O-2567. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; E) Schematic drawing of the chemical structure of ligand m2f-1869; F) The molecular electrostatic potential map of m2f-1869 (atomic coordinates extracted from chain A of 2wbk) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +175 (blue) to -175 kJ/mol (red) in eight bands. All distances are in Angstroms (Å).

Figure S25. Illustrations of possible Ar-NO2 π-hole interactions involving protein structure 4zbb.
[18] A) Cartoon representation of protein structure 4zbb with the ligands in space filling mode; B) Cartoon representation of aligned chains A-D from protein structure 4zbb with the ligands in capped sticks mode; C) Capped stick representation of the binding pocket (4 Å residues around the ligand) of ligand gdn-301 (shown as thickest sticks with a grey C-backbone, see also E), revealing that the binding mode is well-preserved throughout chains A-D. The distances shown are taken from chain D (hence the asterisk in the figure). Water molecule 499 is located very near the π-hole interaction region of one of the ligand's nitro moieties with a water-499 O···N NO2 distance of 2.850 Å, which is 0.220 Å within the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å). Ligand gdn-301 also interacts with ligand gdn-300 (medium thickness, grey C-backbone), possibly by two π-hole interactions involving both nitro moieties of gdn-300 as electron acceptors. The electron donors are then the S atom in gdn-301 and the nitro group ortho to that S-atom. These distances are, respectively: gdn-301 S···N NO2@gdn-300 = 3.

3.7. π-hole interactions that seem at least structurally relevant with C=O.

Figure S29. Illustrations of possible Ar-NO2 π-hole interactions involving protein structures 1d0y and 1d1a. [21] A) Cartoon representation of the aligned protein structures; B) Capped stick representation of the aligned ligands (grey C-backbone) in their binding pockets (4 Å residues around the ligands). The residues involved in polar interactions are shown as thicker sticks and asparagine-127, which is near the ligands' π-hole region, is shown as the thickest sticks; C) and D) show the individual binding pockets (4 Å residues around the ligands) of 1d0y and 1d1a respectively. The Asn-127 O···N NO2 distances of 3.016 Å (in 1d0y) and 2.968 Å (in 1d1a) are below the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å) and thus indicative of a π-hole interaction; E) LigPlus plot of ligand dae-999 in 1d1a. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; F) Schematic drawing of the chemical structures of ligands dae-999 (found in 1d1a) and onp-999 (found in 1d0y); G) The molecular electrostatic potential maps of dae-999 (left, extracted from 1d1a) and onp-999 (right, extracted from 1d0y) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +150 (blue) to -150 kJ/mol (red) in eight bands for dae-999 (left, 1d1a) and likewise from +125 to -125 kJ/mol for onp-999 (right, 1d0y). The F3Be-terminating phosphate chain has been simplified to circumvent the issue of the charge of the phosphate groups (which is unknown). All distances are in Angstroms (Å). NB: Out of six ATP analogues examined, the two highlighted above (named oNPhAE (onp-999) and op-NPhAE (dae-999) in the paper) are the only two that have a similar active section and shortening velocity as ATP (see Table V in the article).
The authors attribute this to hydrogen bonding patterns, but the interaction with Asn-127 might contribute to this as well. Indeed, the authors noted that: 'there is a positive correlation between the distance of the bridging nitrogen from Asn127 and the ability to sustain tension and generate movement.' [21] NB: It is noteworthy that these essentially isostructural complexes were obtained from very different conditions; 1e36 was obtained from a solution at pH 5 and 1e37 (1 minute soaking) / 1e38 (2 minutes soaking) from a solution at pH 9 (see Table 1 in the paper).

Figure S33. Illustrations of possible Ar-NO2 π-hole interactions involving protein structure 3m3o. [53] A) Cartoon representation of the protein structure with its ligands; B) Cartoon representation of two protein structures (generated by symmetry and colour coded orange and green) that all surround the ligand npo-0242 (and attached sugar residues, see also E); C) Capped stick representation of the binding residues to ligand npo-0242 in 3m3o (4 Å residues around the ligands). The ligand's carbon backbone is represented in grey and the interacting residues are colour coded to identify the (symmetry related) protein chain they belong to (blue, orange, green). The carbonyl O-atom of Lysine-125 belonging to the chain colour coded green is in close proximity to the ligand's π-hole region with a Lys-125 O···N NO2 distance of 2.983 Å, which is 0.087 Å below the sum of the van der Waals radii of N+O (1.55 + 1.52 = 3.07 Å) and thus indicative of a π-hole interaction; D) LigPlus plot of ligand npo-242 in 3m3o. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; E) Schematic drawing of the chemical structures of ligand npo-242 (and a2g-241 and gal-240, which are covalently attached to each other); F) The molecular electrostatic potential map of npo-242 and attached carbohydrate residues (atomic coordinates extracted from 3m3o) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein. The colour code spans from +125 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å).

Figure S36. Illustrations of possible Ar-NO2 π-hole interactions involving protein structures 2zvp, [79] 2zyv [81] and 2zyw.
[81] A) Cartoon representation of the aligned protein structures; B) Cartoon representation of the individual protein structures (the asterisk denotes that the distances in C) are taken from structure 2zvp); C) Capped stick representation of aligned binding pockets in 2zvp, 2zyv and 2zyw (4 Å residues around the ligands; distances with 2zvp) and their ligands (npo, see also E), showing that the binding pocket is highly preserved throughout these three structures. One central npo ligand (npo-1202) is H-bonded to water molecules on one side (OH), while the -NO2 moiety seems sandwiched in between a water molecule ('bottom', H2O O···N NO2 = 3.570 Å) and a methionine ('top') that has an S-atom near the nitro's π-hole region (Met-243 S···N NO2 = 3.731 Å) and whose methyl might have a CH-π interaction with npo's aryl ring (Met-243 C···centroid NPO = 3.334 Å); D) LigPlus plot of ligand npo-1202 in 2zvp. Polar contacts are shown as green striped lines with bonding distance. The polar residues are shown as balls and sticks and the apolar residues as red eye-lashes; E) Schematic drawing of the chemical structure of ligand npo-1202; F) The molecular electrostatic potential map of npo-1202 (atomic coordinates extracted from 2zvp) with the nitro group hydrogen bonded to a water molecule to mimic the ligand being bound by the protein.

π-hole interactions that seem at least structurally relevant with C-S.
The colour code spans from +125 (blue) to -125 kJ/mol (red) in eight bands. All distances are in Angstroms (Å).
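
The distance criterion applied in the captions above (an O···N NO2 distance shorter than the N+O van der Waals sum of 1.55 + 1.52 = 3.07 Å) is simple arithmetic. The following minimal Python sketch reproduces it for the O···N distances quoted in the captions. The dictionary labels and the helper name `penetration` are illustrative assumptions and are not part of the original analysis.

```python
# van der Waals radii (in Å) as quoted in the figure captions
VDW_N = 1.55
VDW_O = 1.52
VDW_SUM_NO = VDW_N + VDW_O  # 3.07 Å

# O···N(NO2) distances (in Å) quoted in the captions.
# The keys are illustrative labels (PDB code plus donor), not official identifiers.
contacts = {
    "4mik H2O-552": 2.613,
    "2wbk H2O-2567": 2.702,
    "4zbb H2O-499": 2.850,
    "1d0y Asn-127": 3.016,
    "1d1a Asn-127": 2.968,
    "3m3o Lys-125": 2.983,
    "2zvp H2O": 3.570,
}


def penetration(distance, vdw_sum=VDW_SUM_NO):
    """Return how far (in Å) a contact lies within the vdW sum.

    A positive value means the contact is shorter than the vdW sum;
    a negative value means it is longer.
    """
    return round(vdw_sum - distance, 3)


for label, d in contacts.items():
    p = penetration(d)
    status = "within" if p > 0 else "beyond"
    print(f"{label}: {d:.3f} Å, {abs(p):.3f} Å {status} the N+O vdW sum")
```

With these inputs, the water contact in 2zvp (3.570 Å) is the only listed O···N distance that exceeds the 3.07 Å sum, which is consistent with the more cautious wording used for that structure.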