Identifying Aspirin Polymorphs from Combined DFT-Based Crystal Structure Prediction and Solid-State NMR.

A combined experimental and computational approach was used to distinguish between different polymorphs of the pharmaceutical drug aspirin. The method combines ab initio random structure searching (AIRSS), a DFT-based crystal structure prediction method for the high-accuracy prediction of polymorphic structures, with DFT calculations of NMR parameters and solid-state NMR experiments at natural abundance. AIRSS was used to predict the crystal structures of form-I and form-II of aspirin. The root-mean-square deviation (RMSD) between experimental and calculated ¹H chemical shifts was used to identify form-I as the polymorph present in the experimental sample; the selection was successful despite the strong similarities between the molecular environments in the crystals of the two polymorphs.


| INTRODUCTION
A significant fraction of all active pharmaceutical ingredients (APIs) are small molecules formulated as drugs in crystalline form. Many such molecules can crystallize in several different structures (polymorphic modifications) characterized by different physical and chemical properties. Of great importance is the dissolution rate of the drug, which is highly dependent on the crystal form; that is, polymorphic modifications have different properties that depend fundamentally on their corresponding crystal structures. The ability to predict the crystal structures of possible polymorphs of drugs has attracted great interdisciplinary interest as a conduit for combining different experimental and computational methods. Although straightforward methods of determining crystal structures, such as single-crystal X-ray diffraction or neutron diffraction, are preferred, they are not always applicable because they require single crystals to be grown. Thus, more recently, several experimental and theoretical methods have been used to develop new ways of determining the crystal structure of the investigated sample. [1] In this respect, computational approaches for crystal structure prediction (CSP) that make reliable predictions of potential structures have improved tremendously during the past few years, and recent advances in hardware and software have contributed significantly to the advancement of various CSP methods. [2,3] It has been shown that CSP is a powerful tool for rationalizing experimental observations. The state of the art in crystal structure prediction has been assessed by the Cambridge Crystallographic Data Centre (CCDC) through a series of blind tests, [3,4] which push the development of CSP methods toward increasingly complex systems.
Solid-state nuclear magnetic resonance (ssNMR) is one of the most important analytical tools providing atomically resolved information on the molecular and crystal structure of solid materials. Recent developments in ssNMR, including high-field NMR systems and specialized pulse sequences, have enabled multinuclear solid-state NMR spectroscopy in combination with density functional theory (DFT) calculations to be widely applied for structure determination and validation of organic molecular compounds. [5-7] Although ssNMR spectroscopy is ideally suited for in-depth, atomic-resolution analysis, some experiments on many APIs still suffer from poor sensitivity. High-field dynamic nuclear polarization (DNP) can therefore be applied to enhance the sensitivity of ssNMR experiments by orders of magnitude. [8] Modern high-field DNP experiments can provide valuable new structural information about the molecular and crystalline structure of APIs. [9]
Herein, we demonstrate the applicability of ab initio random structure searching (AIRSS), [10] a DFT-based crystal structure prediction method, in combination with solid-state NMR experiments to identify the crystal structure of organic molecular crystals. Although a combination of AIRSS and ssNMR was previously employed for organic molecular crystals [7] to validate AIRSS structures using NMR data, in this study we use AIRSS together with the chemical shift RMSD to distinguish between different polymorphs of aspirin and to identify which polymorph is present in the experimental sample. Aspirin has received considerable interest over the past few years owing to the prediction in 2004 of a new polymorphic crystalline form, [11] mostly known as aspirin-II or form-II, whose molecular environment in the crystal lattice is very similar to that of form-I; subsequent studies have tried to isolate pure form-II. [12] Our aim is to demonstrate how a high-accuracy CSP method in combination with chemical shifts extracted from solid-state NMR experiments can be used to identify the polymorph in the experimental sample despite the structural similarities of the two forms (Figure 1).

| Samples
A powdered aspirin (acetylsalicylic acid) sample, with the molecular structure depicted in Scheme 1, was purchased from Sigma-Aldrich as form-I and used as received, without any further purification or recrystallization. Aspirin has three known polymorphs. For this study, we focused on polymorphs I and II, each of which contains four symmetry-equivalent molecules per unit cell; the third polymorph has a more complex structure with Z′ = 2, which is outside the scope of this paper. Both crystal structures belong to the monoclinic space group P2₁/c, with the following unit cell dimensions: a = 11.416(5) Å, b = 6.598(2) Å, c = 11.483(5) Å, β = 95.60(3)° for form-I (CSD entry code: ACSALA07) [13] and a = 12.2696(5) Å, b = 6.5575(3) Å, c = 11.4960(4) Å, β = 68.163(2)° for form-II (CSD entry code: ACSALA17). [14] For this study, the crystal form present in the experimental sample was form-I.
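As a quick consistency check on the cell parameters quoted above, the monoclinic cell volume V = abc sin β can be evaluated for both forms. The following is a minimal sketch for illustration only: the function name and the dictionary layout are hypothetical, and the standard uncertainties of the cell parameters are dropped.

```python
import math


def monoclinic_volume(a, b, c, beta_deg):
    """Volume (in cubic angstroms) of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


# Cell parameters (a, b, c in angstroms, beta in degrees) as quoted in the text,
# with the standard uncertainties omitted.
CELLS = {
    "form-I": (11.416, 6.598, 11.483, 95.60),
    "form-II": (12.2696, 6.5575, 11.4960, 68.163),
}
Z = 4  # molecules per unit cell for both forms, as stated in the text

for name, params in CELLS.items():
    volume = monoclinic_volume(*params)
    print(f"{name}: V = {volume:.1f} A^3, V/Z = {volume / Z:.1f} A^3")
```

With these values both cells come out close to 860 Å³ (about 215 Å³ per molecule), which is in line with the strong similarity of the two packings.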

| NMR experiments
The NMR spectra were measured on a Bruker Avance HD 600 WB NMR spectrometer using 3.2-mm and 0.7-mm MAS probes. The finely powdered aspirin sample was packed in ZrO₂ rotors with an outer diameter of 3.2 mm for the ¹³C CP-MAS and ¹³C-¹³C INADEQUATE experiments and 0.7 mm for the 1D ¹H and 2D ¹H-¹³C HETCOR experiments.

FIGURE 1 The crystal structures of form-I (left) and form-II (right) of aspirin illustrating the strong similarities between the molecular environments of the two polymorphs. The cell parameters are given in Table 1.

The 1D ¹³C CPMAS spectrum was recorded at a MAS rate of 12.5 kHz with a CP contact time of 4 ms. SPINAL64 [15] decoupling was applied with a ¹H decoupling power of 100 kHz. A total of 112 scans were acquired with a repetition delay of 60 s. The ¹³C signal of adamantane at 38.48 ppm was used for referencing the ¹³C spectra.
For the CP-INADEQUATE [16] experiments, the temperature was regulated at 100 K with a MAS rate of 12 kHz, and the ¹H and ¹³C nutation frequencies were 100 and 71 kHz, respectively. The repetition delay was 6.5 s, and the evolution delays in the refocused CP-INADEQUATE [16] experiment were 4 ms. A total of 100 t₁ increments of 256 scans each were collected, leading to a total experimental time of 46 hr. SPINAL64 [15] decoupling (¹H B₁ field of 100 kHz) was applied during both the t₁ and t₂ periods.
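The quoted experimental time follows from the acquisition parameters. The sketch below is a rough estimate that assumes the repetition delay dominates each scan and ignores acquisition time and instrument overhead; the function name is hypothetical.

```python
def experiment_time_hours(n_increments, scans_per_increment, repetition_delay_s):
    """Approximate 2D experiment time: increments x scans x repetition delay, in hours."""
    return n_increments * scans_per_increment * repetition_delay_s / 3600.0


# CP-INADEQUATE parameters from the text: 100 t1 increments, 256 scans, 6.5 s delay.
inadequate_h = experiment_time_hours(100, 256, 6.5)
print(f"CP-INADEQUATE: about {inadequate_h:.1f} h")  # prints "CP-INADEQUATE: about 46.2 h"
```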
The ¹H MAS and ¹H-¹³C HETCOR spectra were recorded at a MAS rate of 111 kHz, and the temperature was set to 293 K. The CP contact time for the HETCOR spectrum was set to 0.5 ms, and 56 t₁ points were collected with 16 scans each. A recycle delay of 120 s was used for the HETCOR experiment. The ¹H and ¹³C scales were calibrated to adamantane as an external standard (1.87 ppm for the ¹H signal and 38.46 ppm for the CH₂ ¹³C signal).

| Crystal structure prediction
A total of ~1,700 candidate crystal structures were generated using the ab initio random structure searching (AIRSS) [10] approach, in which the crystal structure prediction consisted of two steps. Initially, the trial structures were generated by randomly placing the aspirin molecules in the unit cell, whose space group was set to P2₁/c, the known space group of both form-I and form-II. All the randomly generated crystal structures comprised four molecules per unit cell to match the number of molecules present in forms I and II of aspirin. The angles and cell dimensions of each structure were restricted to within ±5% of the structural parameters of aspirin-I and aspirin-II, respectively, derived from single-crystal X-ray crystallography. Additionally, minimum separations were specified on a pairwise basis (see Supplementary Information for more details). We performed two separate searches targeting the aspirin-I and aspirin-II structures, respectively. In both cases, we used as constraints the cell parameters, as described above, and treated the known aspirin molecule as a whole unit in the search, fixing its conformation to the molecular conformation taken from the structure determined by single-crystal XRD. The exact ranges of parameters used for the AIRSS search are given in Table 1. We generated about 1,200 and 500 candidate structures for aspirin-I and aspirin-II, respectively. After the structure generation step, all structures were geometry optimized using periodic density functional theory (DFT) calculations as described in the DFT calculations section below.
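The ±5% restriction on the cell parameters can be illustrated with a short sketch. This is not the AIRSS implementation; it only shows the constraint itself, assuming each parameter is drawn uniformly and independently within the allowed window (the sampling distribution, the function name, and the dictionary keys are assumptions made for illustration). Placement of the molecules and the minimum-separation checks are not included.

```python
import random


def sample_cell(reference, tolerance=0.05, rng=random):
    """Draw each cell parameter uniformly within +/- tolerance of its reference value."""
    return {
        name: rng.uniform(value * (1.0 - tolerance), value * (1.0 + tolerance))
        for name, value in reference.items()
    }


# Reference cell of form-I from the text (angstroms and degrees, uncertainties omitted).
FORM_I = {"a": 11.416, "b": 6.598, "c": 11.483, "beta": 95.60}

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
trial_cells = [sample_cell(FORM_I, rng=rng) for _ in range(5)]
for cell in trial_cells:
    print({name: round(value, 3) for name, value in cell.items()})
```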

| DFT calculations
The initial DFT calculations, carried out on the randomly generated structures during the first stage of the AIRSS search, used the CASTEP [17] software for the geometry optimization, with a coarse plane-wave cutoff energy of 400 eV and a k-point spacing [18] of 0.1 Å⁻¹.
The lowest energy structures from AIRSS were then optimized further with a more stringent set of parameters for a precise geometry optimization: a plane-wave cutoff energy of 900 eV and a k-point spacing of 0.05 Å⁻¹.
After the full geometry optimization of the minimum energy structures obtained from AIRSS, NMR chemical shielding values were calculated for these fully optimized structures. For calculating the NMR parameters, the GIPAW method [19] was used in combination with ultrasoft pseudopotentials [20] and plane-wave basis sets, which had previously been shown to be a very efficient method for calculating NMR chemical shifts of crystalline solids. [21] A plane-wave energy cutoff of 900 eV and a Monkhorst-Pack grid of k-points [18] corresponding to a maximum spacing of 0.05 Å⁻¹ were used for the NMR calculations.
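To give a feeling for what a maximum k-point spacing means in practice, the sketch below converts a spacing into a Monkhorst-Pack grid for a monoclinic cell. It rests on several assumptions that should be checked against the documentation of the DFT code in use: the spacing is taken as a reciprocal-space length in Å⁻¹ without the 2π factor, the reciprocal axis lengths are 1/(a sin β), 1/b, and 1/(c sin β), and each grid dimension is the smallest integer that keeps the spacing at or below the target. The function name is hypothetical, and the grids chosen by the actual code may differ.

```python
import math


def mp_grid_monoclinic(a, b, c, beta_deg, max_spacing):
    """Smallest Monkhorst-Pack grid whose k-point spacing does not exceed max_spacing.

    Lengths are in angstroms, beta in degrees, max_spacing in inverse angstroms
    (no 2*pi factor, which is an assumed convention).
    """
    sin_beta = math.sin(math.radians(beta_deg))
    # Reciprocal axis lengths of a monoclinic cell with unique axis b.
    reciprocal_lengths = (1.0 / (a * sin_beta), 1.0 / b, 1.0 / (c * sin_beta))
    return tuple(max(1, math.ceil(length / max_spacing)) for length in reciprocal_lengths)


# Form-I cell from the text with the coarse and fine spacings used in this work.
coarse_grid = mp_grid_monoclinic(11.416, 6.598, 11.483, 95.60, 0.1)
fine_grid = mp_grid_monoclinic(11.416, 6.598, 11.483, 95.60, 0.05)
print("coarse:", coarse_grid, "fine:", fine_grid)
```

Under these assumptions, halving the spacing roughly doubles the number of k-points along each axis, which is why the finer setting is reserved for the final optimizations and the NMR calculations.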

| RESULTS AND DISCUSSION
In order to extract the experimental chemical shifts, we performed a complete ¹³C and ¹H assignment of the aspirin resonances. Figure 2 shows the ¹³C-¹³C INADEQUATE spectrum of aspirin recorded at a MAS rate of 12 kHz and a temperature of 100 K. A lower temperature was used for these experiments to reduce the T₁ of the sample, thus reducing the experimental time required for the acquisition of the ¹³C-¹³C INADEQUATE spectra. At natural abundance (NA), only the directly bonded carbon atoms are observed, as indicated by the through-bond 2Q-1Q correlation peaks in the ¹³C-¹³C INADEQUATE spectrum. The ¹³C-¹³C INADEQUATE spectrum clearly shows all the carbon-carbon correlation peaks for aspirin at NA. The C1 carbon correlates with C2, C6, and C7, whereas C3 correlates with C2 and C4. Similarly, the C5 carbon shows correlation peaks with both the C4 and the C6 carbons. As expected, the methyl carbon C9 correlates only with C8. Thus, using the INADEQUATE spectrum, we could unambiguously assign all the ¹³C resonances in the ¹H-¹³C CPMAS spectrum of aspirin.

SCHEME 1 The molecular structure of aspirin (acetylsalicylic acid) with the labeling scheme used here

Figure 3 shows the ¹H-¹³C HETCOR spectrum of aspirin recorded at 111-kHz MAS at a temperature of 293 K. We employed a short contact time of 500 μs during the CP step to restrict the magnetization transfer to that from the directly bonded protons, so that only the protons directly bonded to the carbon atoms appear as correlation peaks in the 2D ¹H-¹³C HETCOR spectrum. The HETCOR spectrum enables us to resolve the aromatic protons H1-H4, which are heavily overlapped in the directly excited ¹H MAS spectrum even at the ultrahigh spinning rate of 111 kHz (see Figure S1).
The AIRSS runs were carried out with the space group fixed to P2₁/c, and the unit cell parameters were allowed to vary only within ±5% of the values known from X-ray diffraction in order to reduce the time for the CSP step. These constraints were applied on the hypothesis that certain structural information, such as the space group, the unit cell dimensions and volume, and the number of molecules per unit cell, could potentially be obtained from powder X-ray diffraction. Considering the complexity of organic molecular crystals, using such data as constraints would enable AIRSS, a DFT-based crystal structure prediction method, to be used for structure prediction of these complex systems. Because we restricted the cell parameters to within ±5% of the known values for the two polymorphs under analysis, we performed two separate AIRSS searches, one for aspirin-I and one for aspirin-II, generating about 1,200 and 500 structures, respectively. Figure 4 shows the AIRSS-generated structures, after a coarse DFT optimization, as a function of their lattice energy; the structures are plotted by the energy difference (ΔE) with respect to the lowest energy structure. The lowest energy structures of each structure search, for both aspirin-I and aspirin-II, were selected and further optimized with a higher cutoff energy and a finer k-space grid, as described in the experimental details. The final energy and lattice parameters after the geometry optimization are given in Table S1. The energies of the AIRSS-generated structures relative to the minimum energy structure of each search are shown as a function of the cell volume per molecule in Figure S2, as a complementary tool to cluster the predicted structures. The NMR parameters were calculated for the lowest energy AIRSS structures after a geometry optimization with a high cutoff energy and a fine k-space grid.
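The ranking by ΔE described above amounts to subtracting the lowest energy of a search from every candidate energy and sorting the result. A minimal sketch is given below; the candidate labels and energy values are placeholders invented for illustration, not results from this work, and the function name is hypothetical.

```python
def relative_energies(energies):
    """Return (label, delta_E) pairs sorted by delta_E, where delta_E = E - min(E)."""
    e_min = min(energies.values())
    pairs = ((label, energy - e_min) for label, energy in energies.items())
    return sorted(pairs, key=lambda pair: pair[1])


# Placeholder energies (eV) for three hypothetical candidates of one search.
PLACEHOLDER_ENERGIES = {"cand-A": -10.0, "cand-B": -10.5, "cand-C": -9.0}

for label, delta_e in relative_energies(PLACEHOLDER_ENERGIES):
    print(f"{label}: dE = {delta_e:.2f} eV")
```

Each search is ranked against its own minimum, matching the description of ΔE as relative to the lowest energy candidate of that search.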
The experimental ¹H chemical shifts of aspirin were then compared with the DFT-calculated ¹H chemical shifts for the AIRSS-generated structures. For this comparison, the chemical shifts of the methyl (-CH₃) and hydroxyl (-OH) protons were estimated at a temperature of 0 K by extrapolating the chemical shifts obtained from the 1D ¹H MAS spectra at 293 K, 245 K, and 100 K, following the example of Webber et al. [6] The spectra and the extrapolation of chemical shifts are shown in Figure S3 and Table S2.
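One simple way to carry out such an extrapolation is a linear least-squares fit of chemical shift against temperature, evaluated at T = 0 K. The sketch below assumes a linear temperature dependence, which is an assumption made here for illustration and not necessarily the functional form used in this work. The shift values are synthetic numbers generated from a made-up line, not measured data, and the function name is hypothetical.

```python
def extrapolate_to_zero_kelvin(temperatures_k, shifts_ppm):
    """Intercept at T = 0 K of a linear least-squares fit of shift versus temperature."""
    n = len(temperatures_k)
    if n < 2 or n != len(shifts_ppm):
        raise ValueError("need at least two (temperature, shift) pairs of equal length")
    mean_t = sum(temperatures_k) / n
    mean_s = sum(shifts_ppm) / n
    sxx = sum((t - mean_t) ** 2 for t in temperatures_k)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(temperatures_k, shifts_ppm))
    slope = sxy / sxx
    return mean_s - slope * mean_t


# Temperatures from the text; the shifts are SYNTHETIC (12.0 ppm at 0 K, -0.002 ppm/K).
TEMPERATURES_K = (293.0, 245.0, 100.0)
SYNTHETIC_SHIFTS_PPM = tuple(12.0 - 0.002 * t for t in TEMPERATURES_K)

shift_0k = extrapolate_to_zero_kelvin(TEMPERATURES_K, SYNTHETIC_SHIFTS_PPM)
print(f"extrapolated shift at 0 K: {shift_0k:.2f} ppm")
```

With only three temperature points, the uncertainty of the intercept is not negligible, so the extrapolated values are best treated as estimates.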
In addition to the low energy structures, we randomly selected 36 higher energy structures for both aspirin-I and aspirin-II across the entire energy range and added them to the comparison of experimental versus calculated chemical shifts to better illustrate the differences between the low energy and the higher energy structures. For each of these structures, we calculated the root-mean-square deviation (RMSD) between the experimental and calculated chemical shifts in order to obtain a quantitative parameter to pinpoint the correct structures. The calculated RMSD for each AIRSS-generated structure is shown in Figure 5, where the green bars indicate the RMSDs for structures resulting from the structure search of form-I and the blue bars indicate the RMSDs for structures resulting from the form-II structure search. The structures are listed in order of increasing energy of the final geometry-optimized structures. For comparison, we added the known crystal structures of form-I (ACSALA07) and form-II (ACSALA17); for each X-ray determined structure, we performed both a full geometry optimization and a geometry optimization in which the cell parameters and the heavy atoms were fixed and only the H atom positions were relaxed. Figure 5 illustrates that, although the lowest energy structures correspond to the aspirin-II polymorph, the lowest RMSD structures belong to the aspirin-I cluster, indicating that the polymorph present in our sample is aspirin form-I.
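The RMSD used as the selection criterion can be written as RMSD = sqrt((1/N) Σ (δ_exp,i − δ_calc,i)²), where the sum runs over the N assigned ¹H sites. A minimal sketch follows; the function names are hypothetical, the shift values are placeholders rather than data from this work, and the calculated shieldings are assumed to have already been converted to chemical shifts and listed in the same site order as the experimental values.

```python
import math


def chemical_shift_rmsd(experimental, calculated):
    """Root-mean-square deviation (ppm) between paired experimental and calculated shifts."""
    if not experimental or len(experimental) != len(calculated):
        raise ValueError("shift lists must be non-empty and of equal length")
    squared = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    return math.sqrt(squared / len(experimental))


def best_candidate(experimental, candidates):
    """Label of the candidate structure with the lowest RMSD to experiment."""
    return min(candidates, key=lambda label: chemical_shift_rmsd(experimental, candidates[label]))


# PLACEHOLDER shifts (ppm), invented for illustration only.
PLACEHOLDER_EXPERIMENT = [7.2, 7.4, 8.1, 2.3]
PLACEHOLDER_CANDIDATES = {
    "candidate-A": [7.3, 7.5, 8.0, 2.2],
    "candidate-B": [7.8, 6.9, 8.9, 1.5],
}

for label, shifts in PLACEHOLDER_CANDIDATES.items():
    print(f"{label}: RMSD = {chemical_shift_rmsd(PLACEHOLDER_EXPERIMENT, shifts):.2f} ppm")
print("lowest RMSD:", best_candidate(PLACEHOLDER_EXPERIMENT, PLACEHOLDER_CANDIDATES))
```

Selecting by RMSD rather than by energy is the key point here: the candidate with the lowest lattice energy need not be the one that best matches the measured shifts.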
For each structure search, AIRSS revealed several structures at minimum energy that correspond to the correct form for that structure search. This repetition of the correct structure at minimum energy, illustrated by the green and blue bars preceding the bars for the XRD structures in Figure 5, confirmed that AIRSS is able to predict the crystal structures of organic molecular crystals. Figure 6 shows the comparison of experimental and calculated chemical shifts for both the aspirin-I and aspirin-II lowest RMSD structures with their corresponding single crystal X-ray derived structures. This clearly indicates the match between the calculated chemical shifts of the single crystal X-ray derived and the AIRSS-generated structures for both aspirin-I and aspirin-II. The aspirin-I polymorph shows a better chemical shift correlation, with an RMSD of 0.65 ppm compared with 0.84 ppm for form-II, thus selecting form-I as the polymorph present in our sample.

FIGURE 4 The lowest energy structures generated by AIRSS as a function of lattice energy (ΔE) with respect to the lowest energy candidate structure of each search; green corresponds to the crystal structures generated in the aspirin-I structure search, whereas blue corresponds to the structures generated in the aspirin-II search

FIGURE 5 ¹H chemical shift RMSD plot. Green bars correspond to the AIRSS-generated aspirin-I structures, whereas blue bars correspond to the AIRSS-generated aspirin-II structures. Single crystal X-ray derived structures of both aspirin-I and aspirin-II are marked accordingly. The structures are ordered by increasing energy of the final optimized structures. The break in the x-axis separates the low energy structures from the higher energy structures. The vertical line at 0.65 ppm is for illustrative purposes only, to indicate the lowest RMSD.
Figure 7 illustrates the comparison between the X-ray determined crystal structure of form-I and the AIRSS-generated lowest RMSD structure, showing a clear match between the two structures. This confirms that the polymorph present in our sample is form-I, and it illustrates how NMR crystallography can be used to distinguish between two structures with very similar molecular environments.

FIGURE 7 Comparison of aspirin-I crystal structure from AIRSS search (red) with single crystal X-ray crystallography structure obtained from CSD (ACSALA07)

| CONCLUSIONS
In this study, we have demonstrated the applicability of AIRSS, a DFT-based crystal structure prediction method, to organic molecular crystals and have shown that, when certain experimental constraints are applied to the structure search, AIRSS can successfully predict the crystal structure of organic molecular crystals; this was illustrated here with the case study of powdered aspirin. We have demonstrated how AIRSS can be used together with NMR chemical shifts to discriminate between form-I and form-II of aspirin and to identify the molecular environment of the polymorph present in the experimental sample, despite the structural similarities of the two forms. The experimental chemical shifts of methyl and hydroxyl protons used here were extrapolated to 0 K