Local energy decomposition of coupled-cluster interaction energies: Interpretation, benchmarks, and comparison with symmetry-adapted perturbation theory

Local energy decomposition analysis provides a breakdown of the domain-based local pair natural orbital CCSD(T) [DLPNO-CCSD(T)] energy into additive contributions rep-resenting the interaction between pairs of user-defined fragments. Each of these fragment-pairwise components can be further decomposed into a sum of physically meaningful terms, such as electrostatics, dispersion, and exchange. In this study, the dependence of such energy terms on the basis set size, the approximations used for the two-electron integrals, and the localization scheme used for the virtual orbitals have been carefully evaluated on the interaction energies of the S66 benchmark set. A comparison with the energy components obtained at the SAPT2 + (3) δ MP2 level of Symmetry-Adapted Perturbation Theory is also provided.

of atoms. [11][12][13] In particular, the domain-based local pair natural orbital variant of the CCSD(T) method [DLPNO-CCSD(T) [13][14][15][16][17][18][19][20][21][22] ] has been shown to provide extremely accurate interaction energies, with errors that are typically below 0.2 kcal mol −1 with respect to its canonical counterpart. [22][23][24][25][26] Recently, an EDA scheme called local energy decomposition (LED) was introduced. [27] This method decomposes the accurate DLPNO-CCSD (T) interaction energy into repulsive electronic preparation, electrostatics, exchange, dispersion, and nondispersive correlation, and it is available for closed-and open-shell systems. [28,29] The LED scheme has already found widespread applications in the study of intermolecular interactions. [27][28][29][30][31][32][33][34][35][36] In addition, by exploiting the multilevel implementation of the DLPNO-CCSD(T) scheme, [37] the LED scheme was also used to develop an accurate and efficient London dispersion-corrected Hartree-Fock (HF) method, labeled HFLD. [38] In this work, we provide a numerical comparison between the LED and SAPT energy terms, and we discuss in detail the dependence of the individual energy components extracted from the LED on the localization scheme adopted, the basis set size, and the nature of the various approximations used in the calculation of the reference HF energy. In particular, the resolution of identity (RI) scheme [39,40] has been particularly successful for evaluating the Coulomb term (RIJ). [41,42] The ORCA program [43,44] adopts a modified variant [45] of the efficient RIJ algorithm implemented by Ahlrichs and co-workers. [46] The RI technique for the exchange term (RIK) was less successful. Nevertheless, there are several implementations in combination with Coulomb fitting (RIJK). [47][48][49][50] Among the more advantageous methods for the evaluation of the HF exchange term, the chain of spheres for exchange (COSX) [51,52] approach uses a seminumerical algorithm as a basis for integral decomposition.
Combined with the RIJ approach, this is often called the RIJCOSX approximation to the HF energy. A detailed comparison of these methods can be found in Refs. [53,54]. Here, the efficiency and accuracy of the RIJK and RIJCOSX in the LED context is discussed.
The paper is organized as follows. First, a description of the LED scheme and of the associated integral evaluation approximations is given (Section 2). Following the computational details (Section 3), the dependence of the LED results on the basis set (Section 4.1) and localization scheme used (Section 4.2) is investigated on the S66 set. [55] Then, the efficiency and accuracy of the integration approaches used in LED calculations are examined on the S66 set, as well as on a subset of the S12L [56] (Section 4.3) set. Finally, the decomposed LED terms were compared with those obtained from a popular variant of the SAPT method (Section 4.4). The last section of the study (Section 5) is devoted to the concluding remarks.

| Local energy decomposition
Given a molecular complex of two (or more) weakly interacting monomers, a supermolecular approach can be used to compute their interaction energy (ΔE int ). If a post-HF method such as DLPNO-CCSD(T) is used, the resulting interaction energy can be divided into the HF (ΔE HF int ) and correlation (ΔE C int ) contributions: Each of these contributions is obtained as the corresponding energy of the adduct minus the sum of those of its constituting monomers frozen in their in-adduct geometry.
Within the LED scheme, ΔE HF int can be decomposed into a series of terms by exploiting the localization of the occupied orbitals in the DLPNO-CCSD(T) method: [28,33] in which ΔE HF el− prep is the energy required to distort the electronic structure of the isolated monomers from their ground state to the one that is optimal for the interaction, and E elstat and E exch are the electrostatic and exchange interactions, respectively, between the monomers. This decomposition is achieved by assigning each orbital to the fragment in which it is dominantly localized, as detailed in Ref. [28]. As an example, here, we report the expressions of E elstat and E exch for an arbitrary pair of fragments X and Y: where Z A is the nuclear charge of atom A at position R A ; i and j label the localized occupied orbitals (the subscripts X and Y denote the fragment in which the orbital is mostly localized); and (i X i X | j Y j Y ) and (i X j y | i Y j Y ) are two-electron intergrals in Mulliken notation. Typically, the calculation of such two-electron integrals is the most computationally demanding step in LED calculations. Various integral decomposition techniques can be used to increase the efficiency of the technique, as detailed in the following section.
It is worth mentioning here that the dependence of the LED terms on the localization scheme used for the internal orbital space was already investigated in the original LED paper. [28] Although Foster-Boys (FB) [57] and Pipek-Mezey (PM) [58] localization schemes provide an analogous partitioning of HF energies, in some cases (eg, certain delocalized π systems), noticeable deviations might occur. Generally, FB localization is recommended as it typically provides more localized orbitals.
The DLPNO-CCSD(T) correlation energy can be expressed as a sum of double-excitation contributions from the localized occupied orbitals to the virtual space, which is spanned by a set of pair natural orbitals (PNOs) that is specific for each electron pair. In LED, PNOs are subjected to an additional localization procedure, such as FB and PM, and then assigned to the fragment in which they are dominantly localized.
By exploiting the localization of both occupied and virtual orbitals, [28] the correlation energy can be expressed as the sum of double-excitation contributions from pairs of occupied orbitals to the localized PNOs. This allows one to identify double-excitation types as given in Scheme 1.
By exploiting this decomposition, ΔE C int can be written as where E C −CCSD disp is the London dispersion attraction between the fragments, and ΔE C −CCSD no −disp is the nondispersive correlation, which, among other things, corrects for the permanent and induced electrostatics approximately described by the HF method. Finally, ΔE is the perturbative triple contribution to the interaction energy. Note that ΔE can be further decomposed into dispersive and nondispersive contributions. [34] This allows us to define the overall DLPNO- correlation terms. This approach is used in this study in the comparison of LED and SAPT terms (Section 4.4). A more detailed description of the terms just discussed can be found in Refs. [29,31,34].
S C H E M E 1 Double-excitation contributions to the correlation energy

| LED/RIJCOSX vs LED/RIJK
We now turn to the evaluation of the two-electron integral contributions in Equations (3) and (4). Introducing the density belonging to a given fragment X in terms of the molecular orbital (MO) coefficients c iX μ , we can write for the Coulomb part: and for the exchange part: For the Coulomb term, the most successful approximation has been RIJ, which relies on the following decomposition pattern: where V AB is the two-index Coulomb integral, and the indices A, B label auxiliary basis functions. The advantage of this decomposition is that the density can be contracted with a three-index integral to yield the vector d Y B as The most expensive steps in the above equations scale formally as O(LN 2 ), where L is the number of auxiliary basis functions. The method can be implemented efficiently to approach quadratic scaling with a small prefactor. [45] In typical calculations, Coulomb terms usually cost much less than exchange terms.
The RI for exchange approach can be combined with RIJ, but the resulting RIJK method does not reduce the formal scaling of the exchange term as it is not possible to fully contract D Y κλ with only one of the three index integrals. In this case, it is more advantageous to use the molecular orbital coefficients as the number of occupied MOs i is much smaller than the dimension of the AO basis As a result, the method scales as O(N O LN 2 ), where N O is the number of occupied MOs. As this is usually small, the prefactor is expected to be small as well, even though, formally, the scaling is still O(N 4 ).
Alternatively, RIJ may be combined with the COS approximation for the exchange, yielding the RIJCOSX approximation. The exchange term is then approximated as in which X κg is the value of the AO κ at grid point g multiplied by the square root of the grid weight. Q κg can be obtained from X κg by the overlap fitting process, [52] that is, by requiring that the product of the two matrices is just the AO overlap matrix. Finally, The formal scaling of these steps is O(MN 2 ), where M is the number of grid points. In practice, the scaling may be reduced to one between linear and quadratic using proper prescreening techniques. [51] As M is much larger than the number of basis functions, the method has a lower formal scaling than RIJK but a larger prefactor. Thus, it is expected that RIJK will perform better for smaller molecules, and RIJCOSX will take over as the system size grows. Note that the summation over grid points can be divided into many independent batches, which makes RIJCOSX an ideal approach for parallelization.

| COMPUTATIONAL DETAILS
All DLPNO-CCSD(T)/LED calculations were performed with a development version of the ORCA 4.2 program package. [43,44] The accuracy and efficiency of RIJK and RIJCOSX approximations [47][48][49][50][51][52][53] in the decomposition of HF energies are assessed in this work. The localization of the occupied orbitals was carried out using the FB scheme. [57] This method typically provides a satisfactorily local representation of the occupied orbital space, and it is thus chosen as the default localization scheme in LED applications.
DLPNO-CCSD(T) correlation energies were calculated using the default frozen core settings [59] of ORCA. TightPNO settings were used. [13,22] All the correlated electron pairs were included in the CCSD treatment (T CutPairs = 0). The perturbative triples contributions were calculated using the iterative (T 1 ) algorithm. [60] In LED calculations, FB [57] and PM [58] schemes were tested for the localization of the PNOs in the correlation energy decomposition.
The aug-cc-pVnZ (n = T and Q) correlation consistent basis sets were used in conjunction with their matching auxiliary counterparts. [61][62][63] Interaction energies were corrected for the basis set superposition error (BSSE). As for COSX integration grids, the default GRIDXS2 [52] and a larger and more accurate GRIDX7 [51] were used in this study.

| RESULTS AND DISCUSSION
In this section, the dependence of the LED results by the technical parameters of the calculation is initially evaluated (Section 4.1: basis set; For a detailed discussion of the accuracy of the DLPNO-CCSD(T) method in the calculation of relative energies, the reader is referred to Refs. [22][23][24][25][26]. The decay of the LED terms with intermolecular distance is discussed in Refs. [27][28][29][30][31]. Figure 1 shows HF and DLPNO-CCSD(T) correlation interaction energies, that is, ΔE HF int and ΔE C int , respectively, for the 66 reactions of the S66 set computed with the aug-cc-pVTZ and aug-cc-pVQZ basis sets. As expected, HF interaction energies show only negligible differences with the two basis sets, while the convergence of correlation energies with respect to the basis set size is much slower. [64] In particular, ΔE C int increases up to 0.53 kcal mol −1 by increasing the basis set size, with a mean absolute deviation (MAD) between aug-cc-pVTZ and aug-cc-pVQZ results of 0.146 kcal mol −1 .  . Note that, in our recent HFLD study, [38] it was pointed out that the small aug-cc-pVDZ basis set already provides nearly converged E C −CCSD disp values. Hence, the dominant part of the variation observed forΔE C int going from aug-cc-pVTZ to aug-cc-pVQZ arises from the ΔE C− CCSD no− disp component (MAD = 0.139 kcal mol −1 ). This behavior is somewhat consistent with the notion that short-range correlation effects are the most challenging to treat using wavefunction-based methods. [64] Finally, it is worth emphasizing that ΔE C −CCSD no −disp incorporates the effect of many different double-excitation classes, as shown in Scheme 1. A detailed analysis of the basis set dependence of these terms is given in the Figure S1.

| Accuracy and efficiency of integral approximations
In this section, the efficiency and accuracy of the RIJK and RIJCOSX approximations for the reference HF energy decomposition are discussed.
For the S66 set, the RIJK and RIJCOSX errors in the calculation of the overall ΔE HF int are shown in Figure 5. As expected, RIJK provides almost exact interaction energies when used in conjunction with very large auxiliary JK basis, as shown in Figure 5A. If the smaller aug-cc-pVTZ/JK basis is used, the error in individual reactions reaches up to 0.02 kcal mol −1 (MAD = 0.006 kcal mol −1 , see Figure 5B).
If RIJCOSX is used instead of RIJK to approximate the two-electron integrals, a MAD of only 0.001 kcal mol −1 is obtained with a conservative grid "GridX7", as shown in Figure 5C. With the default grid "GridX" (Figure 5D), the maximum error increases to 0.06 kcal mol −1 , with a MAD of 0.006 kcal mol −1 . These results confirm that these two approaches provide very robust results for relative energies, as already discussed in previous papers. [47][48][49][50][51][52][53][54]65] It is now interesting to discuss how these approximations influence the individual LED energy components that constitute the reference interaction energy ( Figure 6). All calculations were carried out at the HF/aug-cc-pVTZ level. Reference values were obtained using LED/RIJK results in conjunction with a large aug-cc-pV5Z/JK basis as this approach has close to zero MAD for the overall ΔE HF int ( Figure 5A).
F I G U R E 1 Basis set dependence of A, HF and B, DLPNO-CCSD(T) correlation interaction energies on the S66 set. The RIJK approach was used in the reference HF calculations. The diagonal line corresponds to the ideal relation In all cases, the largest error was obtained for E elstat and ΔE HF el −prep if RIJCOSX is used in combination with the default COSX grid. This error originates from the fact that same density is used to compute the Coulomb and exchange contributions, and COSX seems to have a larger indirect effect on the Coulomb component rather than a direct one on the exchange. The error on E elstat and ΔE HF el− prep amounts to 0.07 kcal mol −1 , which is of the order of the inherent accuracy of the reference interaction energies in the S66 set. [66] Therefore, it can be concluded that all these approaches provide relatively small errors in all cases.
We now discuss the efficiency of RIJK and RIJCOSX approximations in LED calculations. The average computational time required for performing an LED calculation on the DLPNO-CCSD/aug-cc-pVTZ energy is given in Figure 7A for the dimers in the S66 set by using RIJK and RIJCOSX in the HF part. In addition, the time required for the LED calculations at the HFLD/TightPNO/aug-cc-pVDZ level is given in Figure 7B on the 3a (126 atoms) and 7b (153 atoms) molecules of S12L set when using RIJK and RIJCOSX/GridX7 in the HF part. On both benchmark sets, RIJCOSX is at least twice more efficient than RIJK in the LED calculations.
F I G U R E 2 Basis set dependence of the local energy decomposition contributions to the HF interaction energy on the S66 set. A, basis set dependence of the repulsive electronic preparation ΔE HF el− prep . B, basis set dependence of the electrostatic interaction E elstat. C, basis set dependence of the exchange interaction E exch . The FB localization scheme was used for occupied orbitals. RIJK approach with the standard matching auxiliary basis sets was utilized. The diagonal line corresponds to the ideal relation

| Comparison with SAPT
Within the SAPT framework, [2] intermolecular interactions are treated as a perturbation to the energy of the isolated monomers. Many SAPT variants exist, differing for the level of theory used for description of the monomers and for how the perturbation series is truncated. [1][2][3][4][5][6] Typically, the SAPT interaction energy E SAPT is expressed as the sum of a few physically meaningful energy terms, such as exchange-repulsion E exch−rep , electrostatic interaction E elstat , induction E ind , and dispersion E disp . For example, at the SAPT2 + (3)δMP2 level, the interaction energy E SAPT2+(3)δMP2 reads: [1] F I G U R E 3 Basis set dependence of the decomposed DLPNO-CCSD correlation interaction energy terms in terms of local energy decomposition on the S66 set. A, basis set dependence of the dispersion energy E C− CCSD disp . B, basis set dependence of the nondispersive interaction energy ΔE C −CCSD no − disp . C, basis set dependence of the triples contribution to the interaction energy ΔE C− T ð Þ int . The FB localization scheme was used for both occupied orbitals and pair natural orbitals. RIJK approach with the standard matching auxiliary basis sets was utilized in the HF part. The diagonal line corresponds to the ideal relation F I G U R E 4 The effect of the localization scheme used for pair natural orbitals on the decomposed DLPNO-CCSD correlation interaction energy terms in terms of local energy decomposition on the S66 set, that is, A, E C −CCSD disp and B, ΔE C −CCSD no −disp . The FB localization scheme was used for occupied orbitals. The aug-cc-pVTZ and its standard matching auxiliary basis sets (within RIJK approach in the HF part) were utilized F I G U R E 5 Error associated to different RI approaches in HF/aug-cc-pVTZ interaction energy calculations on the S66 set. RIJK calculations were performed with the A, aug-cc-pV5Z/JK and B, aug-cc-pVTZ/JK auxiliary basis sets. RIJCOSX calculations were performed with the aug-cc-pVTZ/JK auxiliary basis set by using C, the large grid "GridX7" and D, the default grid "GridX". Reference values were obtained from HF/aug-cc-pVTZ calculations carried out without using any approximation for the two-electron integrals F I G U R E 6 Mean absolute error (MAD) in using RIJK and RIJCOSX (with the default GridX and GridX7) for the HF/aug-cc-pVTZ interaction energy ΔE HF int and its decomposed E elstat , E exch , and ΔE HF el −prep terms on the S66 set. The aug-cc-pVTZ/JK auxiliary basis sets were used in all cases. Reference values were obtained from HF/aug-cc-pVTZ calculations using the RIJK approximation in conjuction with a very large auxiliary basis set (aug-cc-pV5Z/JK) F I G U R E 7 A, The average computational time of local energy decomposition calculation of DLPNO-CCSD/aug-cc-pVTZ energy of adducts with RIJK and RIJCOSX approaches in the HF part on the S66 set that contains molecules of up to 34 atoms by using 16 cores from Intel Xeon. The RI calculations were performed with the matching auxiliary basis sets of aug-cc-pVTZ. The effect of using larger JK basis set in RIJK (ie, augcc-pV5Z/JK) and larger GridX (ie, GridX7) in RIJCOSX is also given. B, The computational time of LED calculation of HFLD/TightPNO/aug-cc-pVDZ energy on the 3a (126 atoms) and 7b (153 atoms) molecules of the S12L set with RIJK and RIJCOSX (GridX7) with the matching auxiliary basis sets using four cores from Intel Xeon F I G U R E 8 The correlation between the A, "exchange terms", B, "electrostatic terms", and C, dispersion energies in DLPNO-CCSD/ LED and SAPT2 + (3)δMP2 calculations on the S66 set. The diagonal line corresponds to the correlation relation between the corresponding local energy decomposition (LED) and SAPT2 + (3)δMP2 terms, shown in each panel. The aug-cc-pVTZ basis set was used in all cases. In LED calculations, the FB localization scheme was used for both the occupied and the virtual orbitals. The RIJK approach was used in the LED decomposition of HF energies in conjuction with the aug-cc-pVTZ/JK auxiliary basis set where the notation E (νw) is used to indicate terms of order v in the intermolecular potential and of order w with respect to intramolecular correlation, as detailed in Ref. [6]; the superscript t indicates that only a certain part of the corresponding term is actually computed; the subscript r denotes that orbital relaxation is included; the exch-ind and exch-disp abbreviations denote the exchange correction to the induction and dispersion energies, respectively; ΔE C− MP2 int is the BSSE-corrected MP2 correlation interaction energy; and δ HF and δ MP2 are correction terms and are often added to the E ind term. A more detailed discussion on the individual terms can be found in Refs. [1,6].
Establishing a one-to-one correspondence between the energy terms of SAPT and LED is not possible. However, both approaches are expected to provide similar trends in their most comparable terms. [27] Here, we establish a series of useful correlations between the energy terms found in LED and SAPT. In particular, the present LED/aug-cc-pVTZ results are compared with the previously published [67,68] SAPT2 + (3)δMP2/ aug-cc-pVTZ energy terms on the S66 set (see Figure 8 and Tables S10 and S11).
As mentioned above, the LED scheme allows us to separate the repulsive electronic preparation energy ΔE HF el −prep from the attractive Slater exchange energy E exch . The sum of these two quantities, ΔE HF el − prep + E exch , correlates well with the E exch − rep term of SAPT2 + (3)δMP2 (R 2 = 0.994), as shown in Figure 8. An equally good linear correlation was obtained by comparing the sum of the LED electrostatics and nondispersive correlation, E elstat + ΔE , with the sum of SAPT2 + (3)δMP2 electrostatics, induction, and HF and MP2 correction terms, that is, E elstat + E ind + δ HF + δ MP2 (R 2 = 0.995), as shown in Figure 8B. Note that, as described in Ref. [31], LED terms can, in principle, also be decomposed into their permanent and induced contributions. Finally, the SAPT2 + (3)δMP2 estimate of the dispersion energy is about 16% larger than that obtained from LED at the DLPNO-CCSD(T) level, with R 2 = 0.986 ( Figure 8C).

| CONCLUSIONS
On the S66 benchmark set for NCIs, we studied the dependence of the individual LED energy terms on the size of the basis set, the localization scheme used for the virtual PNO space, and the approximations used in the calculation of the reference HF energy. Typically, we found a fast basis set convergence behavior of the individual terms and a weak dependence on the localization scheme used for the PNOs. We also found that, without significant loss in accuracy, the RIJCOSX approximation generally increases the efficiency of the LED decomposition in comparison with RIJK. Finally, we have established a series of useful linear relationships between the energy terms found in LED and SAPT2 + (3)δMP2.