Chalcogen bonds: Hierarchical ab initio benchmark and density functional theory performance study

Abstract We have performed a hierarchical ab initio benchmark and DFT performance study of D2Ch•••A− chalcogen bonds (Ch = S, Se; D, A = F, Cl). The ab initio benchmark study is based on a series of ZORA‐relativistic quantum chemical methods [HF, MP2, CCSD, CCSD(T)], and all‐electron relativistically contracted variants of Karlsruhe basis sets (ZORA‐def2‐SVP, ZORA‐def2‐TZVPP, ZORA‐def2‐QZVPP) with and without diffuse functions. The highest‐level ZORA‐CCSD(T)/ma‐ZORA‐def2‐QZVPP counterpoise‐corrected complexation energies (ΔE CPC) are converged within 1.1–3.4 kcal mol−1 and 1.5–3.1 kcal mol−1 with respect to the method and basis set, respectively. Next, we used the ZORA‐CCSD(T)/ma‐ZORA‐def2‐QZVPP (ΔE CPC) as reference data for analyzing the performance of 13 different ZORA‐relativistic DFT approaches in combination with the Slater‐type QZ4P basis set. We find that the three‐best performing functionals are M06‐2X, B3LYP, and M06, with mean absolute errors (MAE) of 4.1, 4.2, and 4.3 kcal mol−1, respectively. The MAE for BLYP‐D3(BJ) and PBE amount to 8.5 and 9.3 kcal mol−1, respectively.

the σ*-type LUMO of the chalcogen molecule. 5 The debate over the origin and fundamental bonding mechanism of the ChB continues to stimulate much interest in the literature.
Density functional theory (DFT) based Kohn-Sham molecular orbital analysis has been paramount for our understanding of bonding mechanisms and the nature of chemical phenomena. 6 Selection of the appropriate density functional approximation to investigate chalcogen bonding is critical to ensure trust-worthy results, but unfortunately this is not entirely straightforward, as the question of which approximate functional works best is highly dependent on the property and system of interest.
The first purpose of this work is to provide a detailed benchmark study of high-level relativistic ab initio methods and focus on the investigation of ChB, using the D 2 Ch molecules as chalcogen-bond donors and the halides A − as chalcogen-bond acceptors (see Scheme 1). Our model complexes systematically varies the substituent (D), the chalcogen atom (Ch), the acceptor (A − ), and is the perfect archetype for strongly bound chalcogen systems studied experimentally. 2a,3c This is done by computing the D 2 Ch•••A − complexation energies ΔE for the first time in a procedure involving both a hierarchical series of ab initio methods [HF, MP2, CCSD, and CCSD (T)] 7 in combination with a hierarchical series of Gaussian-type basis sets of increasing flexibility, polarization (up to g functions), and diffuseness, thereby eclipsing the two other benchmarks based on a single-shot CCSD(T) approach. 7i,j Interestingly, the predictions of ΔE by both benchmarks for the same systems can differ by up to 10 kcal mol −1 . The basis set superposition error (BSSE) has been accounted for through the counterpoise correction (CPC) of Boys and Bernardi. 8 The second purpose of this work is to evaluate the performance of 13 different density functionals in combination with ADF's Slater-type QZ4P basis set (vide infra) for predicting the ChB energy ΔE against our best ab initio benchmark. Thus, we perform an extensive analysis to highlight the importance of diffuse and polarization functions in the basis set, the role of the BSSE, and the necessity of Coulomb correlation as well as the extent to which the approach has converged with respect to the level of correlation treatment and basis set quality. Our analyses identify the B3LYP and M06-2X functionals, along with the M06 DFT approach as appropriate and computationally efficient alternatives to T A B L E 1 Number of relativistically contracted basis functions for ZORA-def2-basis sets without (BS) and with (BS+) diffuse functions for F, S, Cl, and Se elements. 2 | THEORETICAL METHODS

| Ab initio geometries and energies
All ab initio calculations were carried out using ORCA. 9 The atomic orbitals were described by the all-electron scalar relativistically contracted variants of Gaussian-type def2-XVP(P) (X = S, TZ, QZ) basis sets with polarization functions (up to g functions) in the series BS1 to BS3 (see Table 1). 10 The series BS1+ to BS3+ result from BS1 to BS3 after adding extra s and p minimally augmented (ma) diffuse functions (see Table 1). 10c For each of the six basis sets (BS#), the equilibrium geometry was computed using coupled-cluster singles and doubles with perturbative triples, i.e., at CCSD(T)/BS#. 11 Then, for each BS# and corresponding CCSD(T)/BS# geometry, energies were evaluated along the following hierarchical series of quantum chemical methods: Hartree-Fock theory (HF/BS#), second-order Møller-Plesset perturbation theory (MP2/BS#), 12 coupled-cluster with single and double excitations (CCSD/BS#) 13 and CCSD(T)/BS#. 11 The scalar relativistic effects were accounted for using the scalar zeroth-order regular approximation (ZORA). 14 23 This is a large, uncontracted and relativistically optimized, all-electron (i.e., no frozen core approximation) basis set of Slater-type orbitals (STOs), which is of quadruple-ζ quality for all atoms and has been augmented with the following sets of polarization and diffuse functions: two 3d and two 4f on fluorine, three 3d and two 4f on sulfur and chlorine, two 4d and three 4f on selenium. The molecular density was fitted by the systematically improvable Zlm fitting scheme. Scalar relativistic effects were accounted for using the zeroth-order regular approximation (ZORA). 14 3 | RESULTS AND DISCUSSION

| Ab initio geometries
First, we examine the equilibrium geometries of D 2 Ch•••A − complexes (Ch = S, Se; D, A = F, Cl) which were fully optimized at the ZORA-CCSD(T) level along with a hierarchic series of Gaussian-type basis sets both with and without diffuse functions (see Table 1; for optimized Cartesian coordinates see Tables S10, S11 in the Supporting Information). The isolated halide and C 2v symmetric D 2 Ch neutral fragment form the stable T-shaped, chalcogen-bonded com- Figure 1). All species have been verified through a vibrational analysis to represent equilibrium structures (no imaginary frequencies). Thus, we have a set of geometries that have been optimized at the same relativistic ab initio level along with each basis set considered in this work, without any structural or symmetry constraint (for complete structural details, see Tables S2 and S3 of the Supporting   Information).
The chalcogen bond distance in the D 2 Ch•••A − complexes become longer as the chalcogen atom (Ch) varies from S to Se and as the accepting halide (A − ) varies from F − to Cl − , and shorter as the substituent D varies from F to Cl (see Figure 1). Furthermore, the Θ 1 and

| Ab initio Chalcogen bond energies
Here, we report the first systematic investigation of the complexation energies, with (ΔE CPC ) and without (ΔE) counterpoise corrections, as a function of a hierarchical series of ab initio methods and basis sets.
The results of our ab initio computations are collected in Tables 2-5 (ΔE CPC , ΔE, and BSSE; for thermodynamic values see Tables S8 and   S9 in the Supporting Information) and graphically displayed in    Figure 3(B)). This is caused by the breathing orbitals of the anionic halide fragments going from diffuse in the isolated anion to more compact upon forming the ChB complex, which leads to charge delocalization over the molecular system. 24a,25 In the absence The BSSE becomes significantly smaller with the addition of diffuse functions and decreases from 1.2-3.9 kcal mol −1 at ZORA-CCSD (T)/BS3 to 0.9-1.8 kcal mol −1 at ZORA-CCSD(T)/BS3+ (see Tables 4-6). However, the BSSE is large, in particular, for highly correlated methods and smaller basis sets without diffuse functions, that is, at the ZORA-CCSD(T)/BS1 level (see Figure 4). As a result, the ZORA-CCSD(T) ΔE CPC are better for the BS1+ to BS3+ series but become similar to the series without diffuse functions as the BSSE simultaneously decreases as the basis sets size increases. Both basis sets series, indeed, converge to a similar value independently of the number of diffuse functions, but this result is fortuitous due to the BSSE correction that damps any fluctuations along the BS1 to BS3 series. In fact, the uncorrected ZORA-CCSD(T) complexation energies ΔE converges significantly faster along the BS1+ to BS3+ series (within 0.3-1.5 kcal mol −1 ) compared to the BS1 to BS3 series (within 1.9--3.5 kcal mol −1 ) (see Tables 2 and 3). This is, again, due to the poor description of the anionic reactants by basis sets without diffuse functions. This effect is particularly apparent at HF where Coulomb correlation is absent, mainly for systems involving the compact atom F − . 24a For example, the ΔE for Cl 2 Se•••F − that is −86.7 kcal mol −1 at ZORA-HF/BS1 significantly weakens to −57.0 kcal mol −1 at ZORA-HF/BS1+, whereas, for Cl 2 Se•••Cl − , the ΔE is −38.8 kcal mol −1 at ZORA-HF/BS1 and weakens to −25.9 kcal mol −1 at ZORA-HF/BS1+ (see Table 3).
Lastly, inclusion of Coulomb correlation is critical to achieve accurate chalcogen-bond energies. At HF, the D 2 Ch•••A − complexes are weakly bound and enter into stronger chalcogen bonds as Coulomb correlation is introduced (see Figure 5). For example, from HF to CCSD(T), the ΔE CPC for F 2 S•••F − strengthens from −38.1 to −46.6 kcal mol −1 for BS3 and from −37.4 to −45.2 kcal mol −1 for BS3 + (see Table 2). We also note that the stabilization of ΔE CPC due to the increasing of basis set size is more pronounced for high correlated methods. For example, from BS1+ to BS3+, the ΔE CPC for F 2 Se•••F − slightly varies from −51.8 to −51.4 kcal mol −1 at HF level and strengthens from −51.9 to −56.4 kcal mol −1 at CCSD(T) level (see Tables 2 and 3). This is due to the well-known fact that correlated ab initio methods strongly depend on the extent of polarization functions to generate configurations through which the wavefunction can describe the correlation hole. 7c On the other hand, at the HF level without Coulomb correlation, there is much less sensitivity of ΔE CPC towards increasing the flexibility and polarization functions of the basis set. Taken altogether, our benchmark approach, based on hierarchical series, reveals that our best estimates are converged with regards to correlation and basis set within 1.1-3.4 kcal mol −1 and 1.5-3.1 kcal mol −1 , respectively, and provides the most accurate benchmark to date, surpassing the recently published benchmark based on a single-shot CCSD(T) approach. 7i In the next section, we discuss the ability of DFT to describe Coulomb correlation compared to our ZORA-CCSD(T)/BS3+ benchmark.

| Performance of density functional approximations
Finally, we have computed the complexation energies ΔE for various GGAs, meta-GGAs, hybrid, and meta-hybrid functionals in  Tables S6 and S7; for optimized Cartesian coordinates see Tables S12-S27 in the Supporting Information). In general, we find that the density functionals give longer chalcogen bonds and bigger bond angles Θ 2 (Scheme 1) compared to our best level ZORA-CCSD(T)/BS3+ geometries (see Figure 6). The best overall agreement with our best ab initio The ME is negative, and its absolute value is equal to the MAE for all density functionals, that is, the stabilization of the D 2 Ch•••A − chalcogen-bonded complexes is overestimated by all functionals in this study. Nevertheless, our best performing density functionals together with the Slater-type QZ4P basis set have the same trends in chemical stability and geometry as our ZORA-CCSD(T)/BS3+ benchmark, with relatively small deviations from the ab initio ΔE CPC . For larger chalcogen-bonded systems, the smaller Slater-type TZ2P basis set may be used, which also provides satisfactory results in comparison with our best ab initio level. For our three-best density functionals, B3LYP, M06-2X, and M06, the ΔE is ca. 2 kcal mol −1 more over-binding for TZ2P than for QZ4P (see Table 6), that is, the overestimation on the stability of chalcogen-bonded systems increases.
This results in larger errors relative to our best estimate and the B3LYP, M06-2X, and M06 density functionals in combination with TZ2P basis set turn out to have similar accuracy as the ZORA-BLYP/ QZ4P. Thus, we identify not only B3LYP and M06-2X, 7i but also M06, in combination with the all-electron QZ4P basis set, to be reasonable approaches for computing the complexation energies of chalcogen bonds without relying on expensive ab initio methods.

| CONCLUSIONS
We have computed a ZORA-CCSD(T)/BS3+ benchmark for the arche- Coulomb correlation is also crucial, and, for highly correlated methods, addition of polarization functions is necessary to accurately describe the correlation hole.