Empirical Double-Hybrid Density Functional Theory: A 'Third Way' In Between WFT and DFT

Double hybrid density functional theory arguably sits on the seamline between wavefunction methods and DFT: it represents a special case of Rung 5 on the"Jacobs Ladder"of John P. Perdew. For large and chemically diverse benchmarks such as GMTKN55, empirical double hybrid functionals with dispersion corrections can achieve accuracies approaching wavefunction methods at a cost not greatly dissimilar to hybrid DFT approaches, provided RI-MP2 and/or another MP2 acceleration techniques are available in the electronic structure code. Only a half-dozen or fewer empirical parameters are required. For vibrational frequencies, accuracies intermediate between CCSD and CCSD(T) can be achieved, and performance for other properties is encouraging as well. Organometallic reactions can likewise be treated well, provided static correlation is not too strong. Further prospects are discussed, including range-separated and RPA-based approaches.


Introduction
Wavefunction-based ab initio theory (WFT) and density functional theory (DFT) are the two primary approaches of electronic structure. WFT methods can approach exact solutions of the Schrödinger equation to almost arbitrary accuracy (see, e.g., Refs. [1][2][3][4] and references therein). Alas, its unfavorable CPU time scaling with system size (N 7 for CCSD(T), [5,6] steeper for more rigorous methods) limits its applicability (as of 2019) to small molecules. (At the CCSD(T) level, localized pair natural orbital approaches [7,8] represent an emerging remedy.) In contrast, density functional theory (DFT) features relatively gentle system size scaling, at the expense of introducing an unknown (and perhaps unknowable) exchange-correlation functional. The proliferation of approximate exchange-correlation functionals has led to what Perdew has termed [9] "the functional zoo". [9,10] WFT has a well-defined path for convergence to the "right answer for the right reason". While no equivalent exists for DFT, in 2001 Perdew [11] introduced the organizing principle he called "Jacob's Ladder", using a Biblical metaphor (Genesis 28: [10][11][12]. Its definition is illustrated in Figure 1, together with example functionals on each "rung".  Earth is the Hartree "Vale of Tears", with neither exchange nor correlation. The Hartree equations can be solved exactly within the finite basis given (or on the real-space grid given), [12] without resorting to any adjustable parameters.
Heaven is the elusive goal of the exact exchangecorrelation (XC) functional, and thus the exact solution of the Schrödinger equation.
The local density approximation constitutes rung one: it represents the exact solution for a uniform electron gas of a given density. The exchange energy can be determined analytically (Slater [13] ), while different parametrizations [14][15][16][17] of the LDA correlation functional represent different fits to the quantum Monte Carlo results [18] for uniform electron gases. The LDA XC functional only depends on the electronic density and not on any derivatives nor any other entities.
Rung two corresponds to GGAs or generalized gradient approximations, in which the reduced density gradient is introduced in the XC functional. Examples are the popular BP86 [19,20] and PBE [21] functionals.
Rung three consists of meta-GGAs (mGGAs), which additionally involve higher density derivatives (or the kinetic energy density, which contains similar information to the density Laplacian). TPSS [22] is a popular meta-GGA, as is the recent SCAN (strongly constrained and appropriate normed [23] ).
Rungs two and three are often collectively referred to as semi-local functionals.
Orbital-dependent DFT [24] covers rungs four and five: on rung four, only occupied orbital-dependency is introduced, while on rung five the unoccupied orbitals make their appearance. The most important rung four functionals are the hybrids, which can be further subdivided into four subclasses: • global hybrid GGAs, such as the popular B3LYP [25,26] and PBE0 [27] hybrids, as well as B97-1 [28] . (We note that Hartree-Fock theory itself is a special case, with 100% exact exchange and null correlation.) • global hybrid meta-GGAs, such as M06, [29] M06-2X, [29] and BMK [30] • range-separated hybrid GGAs, such as CAM-B3LYP [31] and ωB97X-V [32] • range-separated hybrid meta-GGAs, such as ωB97M-V [33] This leaves rung five, of which we will presently consider one subcase, the double hybrids. More general reviews have been published earlier; [34][35][36] the present review will focus primarily on our own work, as well as the broader context.

The GMTKN55 benchmark
There is no shortage of papers that advocate functional X for property Y of molecule family Z, based on data of greater or lesser quality. In order to make broader statements, however, large and chemically diverse benchmarks are in order. The two largest ones presently available are the gargantuan (4,985 unique data points) MGCDB84 (Main Group Chemistry Data Base, 84 subsets) of Mardirossian and Head-Gordon, [37] and the still very large GMTKN55 (General Main-group Thermochemistry, Kinetics, and Noncovalent interactions database, 55 subsets) of Grimme, Goerigk, and coworkers, [38] itself an expansion and update of earlier GMTKN24 [39] and GMTKN30 [40,41] databases. GMTKN55 is the benchmark we shall use and discuss throughout the present paper, even though it was developed more than a decade after the oldest DH functional discussed here.
In all, GMTKN55 consists of almost 1500 unique energy differences spread over 55 different problem sets, entailing 2,459 unique single-point energy calculations. The problem sets can be grouped into five major classes: thermochemistry, barrier heights, large molecule reactions, intermolecular interactions, and conformer energies (mostly driven by intramolecular interactions).  The original GMTKN55 paper prescribes def2-QZVP basis sets for all calculations, with (for rung 5 and WFT methods) all subvalence electrons correlated. The statistic reported is WTMAD2 (weighted mean absolute deviation, type 2), which is defined as: in which |∆E|i is the mean absolute value of all the reference energies for subset i, Ni the number of systems in the subset, and MADi represents the mean absolute difference between calculated and reference reaction energies for subset i. The normalization by |∆E|i compensates for the different energy scales of different types of reaction energies: thus, errors in noncovalent interactions or conformer energies do no "drown in the noise" of, e.g., total atomization energies and ionization potentials that are 2-3 orders of magnitude larger.
We note in passing that MAD is a more "robust statistic" [42] than the root mean square deviation (RMSD), in the statistical sense that MAD is less prone to hypersensitivity to one or a few "outlier" points than RMSD, while the latter (for the same reason) is more useful for spotting "troublemakers". For an unbiased normal distribution, 4RMSD/5MAD=(25/8π) 1/2 ≈1. [43,44] [38] unless indicated otherwise or defined in original paper. For the avoidance of doubt, all data in the table were taken either from Refs. [45,53] or calculated in this work using Q-CHEM 5.2, [64] unless indicated otherwise.
[a] Iron & Janes revised MOBH35 transition metal barriers [65] [b] Recalculated in present work [c] Ref. [53] In the present work (Table 1) and our recent papers, [45,53] we have adopted GMTKN55 with the following computational protocol modifications: • Subvalence electrons are frozen according to ORCA's "chemical cores" prescription, i.e., (n-1)spd shells of transition metals and heavy pblock elements are unfrozen • For the large-molecule, small-weight subsets C60ISO and UPU23, we employ the def2-TZVPP basis set for computational cost reasons. • For the remainder, basis sets are identical to or larger than in the original GMTKN55 paper. Specifically, we use def2-QZVPP except that: • We augment it with diffuse functions, to def2-QZVPPD, for the subsets WATER27, RG18, IL16, G21EA, and AHB21 • The SG-3 integration grid, [66] i.e., a pruned (99,590) grid roughly comparable to the UltraFine grid in the Gaussian program system, [67] is used for GGAs and hybrid GGAs, while for pure and hybrid mGGAs we employed the still larger unpruned (150, 974) grid.
Several authors, notably Gould [68] for GMTKN55 and Chan [69] for MGCDB82, have tried to put said data sets "on a diet", i.e., to reduce them to weighted averages of results for small statistically significant subsamples. All WTMAD2 statistics reported in Table 1 refer to the full GMTKN55 set, however. In addition, we report the partitioning of WTMAD2 among five subsets: basic thermochemistry, barrier heights, large-molecule reactions, intermolecular reactions, and conformer energies (mostly driven by intramolecular noncovalent interactions/ Finally, in the last column we report results for a very recent transition metal benchmark, namely MOBH35 (metal-organic barrier heights, 35 reactions) by Iron and Janes. [65] 3. Empirical double hybrids

Simple double hybrids
While semilocal correlation functionals typically cope quite well with short-range correlation effects, their semilocality makes them intrinsically less suitable for long-range correlation effects, be they dispersion interactions or static correlation. [70] In the early 1990s, Görling and Levy [71] proposed a perturbation theory expansion in a basis of Kohn-Sham orbitals. (We should stress here that, while it is analogous to the commonly used Møller-Plesset perturbation theory in a basis of HF orbitals, GLPT2 is by no means equivalent to MP2.) Conceptually, double-hybrid DFT seeks to, on the one hand, combine the strongest features of semilocal correlation and GLPT2, and on the other hand to conjoin the strongest features of semilocal and "exact" exchange through a (global or distance-dependent) linear combination of these two.
In their most general form (omitting range separation parameters for now, for the sake of clarity) double hybrid DFT functionals can be formulated as follows: (a) In step one, a set of KS orbitals are obtained by solving the KS equations for the following energy functional: E = E 45 + c 8 9 E 8,;< + (1 − c 8 9 )E 8,@<A + c B,8C 9 E C,@<A where 1 4D is the one electron energy term; E F 9 and E G,FH 9 are the percentage of HF exchange and DFT correlation used, respectively; 1 F,IJ is the HF-like exchange energy, and 1 F,KJL and 1 H,KJL are DFT exchange and correlation terms.
(b) In step two, the converged KS orbitals from step 1 are applied to evaluate the energy from the following expression: Truhlar and coworkers first coined [72] the term "doubly hybrid" for a mixture of GGA DFT correlation and pure MP2 correlation, but the term "double hybrid" as it is presently understood first was introduced by Grimme for his B2PLYP functional, which corresponded to: where he obtained cHF=0.53, c2=0.27 through minimization of the error in the G2-1 atomization energies with the def2QZVP basis set.
Yet during our own numerical explorations at the time, we found B2PLYP(D) had room for improvement for barrier heights (a problem in DFT that had preoccupied us for some time [30] ). In addition, we realized that double hybrids inherit some of the slow basis set convergence of correlated WFT methods, and hence the incomplete basis set (plus problematic accuracy of the G2 reference data, as commented on repeatedly [75] ) introduces some parametrization bias.
Hence we set about determining new DBH24 and W4-08 reference data for barrier heights and atomization energies, respectively, through the W4 high-accuracy WFT protocol. [76,77] From two-dimensional contour plots in cHF and c2 of both datasets, we learned that (a) RMSD for atomization energies has not so much a minimum as an "optimum canal" spanning roughly from (0.50,0.22) to (0.72,0.42); (b) RMSD for barrier heights has a more clearly defined minimum (0.64,0.30) at the bottom of an elliptical valley; (c) (0.65,0.36) is a good compromise between these two.
The scaling parameter s6=0.40 for the D2 disperison correction was subsequently determined from the original [78] S22 noncovalent interactions dataset in the same manner as Ref. [74] Thus we finally obtained the B2GP-PLYP-D2 (GP for "general purpose") double hybrid [79] with cHF=0.65, c2=0.36, and s6=0.40. As can be seen in Table 1, this represents a significant improvement over B2PLYP not just for barrier heights but also for basic thermochemistry and large-molecule reactions. It still outperforms the best rung 4 functional to date, ωB97M-V. Intriguingly, B2GP-PLYP is fairly insensitive to the type of dispersion correction it is paired with.
Six years later, Yu [80] proposed B2NC-PLYP with cHF=0.81, c2=0.55, and no dispersion correction, claiming it would be unneeded because of the high percentage of MP2 correlation. With D3(BJ) parameters from the ESI of the GMTKN55 paper, [38] however, WTMAD2 drops from 3.56 to 2.96 kcal/mol, the latter being the lowest value for any simple double hybrid. [48] To what degree do double hybrids (particularly DSD) offer added value over simple MP2 or spin-componentscaled MP2? [81] (For a review of spin-component-scaled perturbation theory, see Ref. [82] ) This issue was previously addressed in Ref. [40] for GMTKN30 and in Mehta et al. [83] for GMTKN55 (see especially Tables S19 and S21 in that reference). For the sake of completeness, we supply Table 2 as an MP2 counterpart of Table 1, calculated using the same basis sets.  b Iron and Janes. [65] All remaining data calculated in present work.
Clearly, as already shown in Refs. [40,83] , a double hybrid amounts to more than simply the sum of its parts.

DSD double hybrids
There is intrinsically no reason why the only semilocal XC functional that works well in a DH context should be BLYP. Yet both we ourselves and Grimme found [79,86] that, while pretty much any good exchange functional can be substituted for Becke88, [19] the only correlation functional that works well for a simple double hybrid is LYP. Now LYP is derived from the Colle-Salvetti model [87] for correlation in helium-like atoms, hence the peculiar (and unphysical) feature that the LYP correlation energy of a fully polarized uniform electron gas vanishes. [88]  From Ref. [89] , courtesy of the American Chemical Society.
In addition, Martin and coworkers found in a number of studies (e.g., [89][90][91] ) on noncovalent interactions that the same-spin correlation energy contains quite similar information to the dispersion term. In a detailed study [89] on the conformer surface of n-pentane, we have shown the similarity between the CCSD same-spin correlation energy surface and various empirical dispersion corrections (see Figure 2).
In combination, these factors led Kozuch and Martin [90,91] to propose the so-called DSD functional family (dispersion-corrected, spin-component-scaled, double hybrids).

DSD-DH:
As seen in Table 1, DSD-BLYP [90] represents no major improvement over B2GP-PLYP. The major advantage of the DSD form is that it no longer requires LYP correlation specifically-indeed, we found that even LDA correlation performs comparably to LYP, as do PBEc and PW91c, and that B95c and P86c were indeed superior to LYP. Our overall winner was the DSD-PBEP86-D3 functional, at RMSD=1.62 kcal/mol for the training set in Ref. [91] The explanation becomes apparent on inspection of the optimized c2ab and c2ss, or rather their ratio c2ss/c2ab. For DSD-LDA, we have 0.11/0.58, for DSD-BP86 0.24/0.49, for DSD-PBE 0.12/0.53, but for DSD-BLYP 0.43/0.46. In other words, the only case where the constraint c2ss=c2ab of simple DHs does not amount to a "Procrustean bed" is DSD-BLYP.
Practical implementations of DSD-PBEP86, alas, yield slightly different total energies between different codes. The reason for that lies in the P86 correlation functional, where all codes do implement the same GGA correction but different codes apply it to different LDA correlation parametrizations. For instance, Gaussian by default uses the Perdew-Zunger [15] 1981 (PZ81) parametrization (as Perdew himself did in the original P86 paper [20] ), but this has not been implemented in ORCA, [93] which instead offers either the Vosko-Wilk-Nusair [14] parametrizations (VWN3 or, preferably, VWN5) or the Perdew-Wang [16] 1992 (PW92) parametrization. DSD-PBEPBE does not have the same reproducibility issue, since all codes known to the author use PW92 local correlation with the PBEc correction.
Do mGGAs offer any advantage over GGAs for the semilocal part of a DH? The comparatively good performance of DSD-LDA suggests otherwise, and indeed Kozuch and Martin [91,92] found that DSD-TPSS offers no advantage over DSD-PBE or other DSD-GGAs. The only mGGA correlation that looked promising was B95c, [94] in which the only "meta" aspect is a factor in the same-spin correlation term that ensures one-electron systems have no spurious self-correlation.
Roch and Baldridge introduced the mSD family [95,96] of functionals (without dispersion correction) featuring just two independent parameters: the percentage of HF exchange and the percentage of SCS-MP2 correlation, while they constrain the ratio c2ab:c2ss= 6/5:1/3 as in the original SCS-MP2 paper. [81] Aside from different ratios (e.g., 1.15:0.75 as in S2-MP2 [84] ) being justifiable on theoretical grounds [82] even for HF reference orbitals, there is no intrinsic reason why either the SCS-MP2 or S2-MP2 would necessarily be valid or optimal for GLPT2 in a basis of Kohn-Sham orbitals.
As shown in Figure 2 of Ref. [83] , mSD-PBEPBE performance for GMTKN55 is not just inferior to other double hybrids, but it is indeed comparable to an mGGA.

Dispersion corrections in a DH context
Among dispersion corrections that have been used in conjunction with double hybrids are the original D2, [73] D3 with zero-damping, [60] D3 with Becke-Johnson damping, [97] D4, [98,99] and the Vydrov-Van Voorhis [100] nonlocal dispersion functional. A detailed discussion of different empirical dispersion approaches is beyond the scope of this review: the reader is referred to, e.g., Refs. [101][102][103] It suffices to say in this context that the original D2 was a damped Lennard-Jones correction with a single parameter (a prefactor s6) and fixed atomic parameters that were completely insensitive to the chemical environment. In D3, a bond order dependence is introduced, while D4 is even more responsive to the chemical environment through a dependence on partial charges. Additionally, D4 includes a 3-body Axilrod-Teller-Muto [104,105] type term.
While Refs. [91,92] do suggest that double hybrids are more forgiving of the foibles of the simple D2 method than DFT functionals on lower rungs would be, D3(BJ) do generally lead to better results. Simply using D4 as a drop-in replacement for D3BJ improves results in many cases (e.g., DSD-BLYP-D4, DSD-PBEP86-D4, and especially DSD-PBE-D4, Table 1) while the reoptimized revDSD-XC-D4 functionals almost invariably outperform their revDSD-XC-D3(BJ) cognates. [45] Conversely, one might decide to eliminate the dispersion correction altogether and instead rely on same-spin correlation for the same purpose. In the socalled DSD-XC-noD [92] or noDispSD-XC functionals, exactly this choice is made: typically, c2ss there is about 0.2-0.3 larger than in the corresponding DSD-XC-D3(BJ) functional.
We also optimized s6 and s8 coefficients (prefactors for the R -6 and R -8 terms, respectively) in the D4 dispersion correction [99] to go with spin-resolved MP2 variants. Dispersion is absolutely necessary for SOS-MP2, while for simple MP2 and S2-MP2, we find s6 coefficients for the R -6 term nearly zero combined with significant negative coefficients for the R -8 term. This indicates compensation for an overcorrection at medium-range in MP2, as previously noted in Ref. [107] A similar pattern -small s6 and large negative s8 -can be seen in Table S1 of Mehta et al. [83] for MP2-D3(BJ), as well as in Table 14 of Ref. [107] (a revision of the S66x8 noncovalent interactions benchmark [108] ). The latter reflects that full MP2, in a noncovalent interactions setting, behaves correctly at long distance but overbinds at intermediate distance.

Reparametrizing for chemical robustness: revDSD
For reasons of computational efficiency, the original parametrization set [91,92] for DSD-PBEP86 et al. had to be quite small: it consisted of just the W4-08 total atomization energies, [79] the DBH24 barrier heights, [109] the S22 noncovalent interactions set, [110,111] Truhlar's model system [112] for the Grubbs metathesis, our older set [113] of prototype oxidative additions at Pd atom, and Korth and Grimme's "mindless benchmark [114] ." Experimentation with weights for these subsets, as well as with the addition of a rare-gas dimers benchmark, revealed that sensitivity to the weights assigned to the Pd, S22, and RG subsets was greater than desirable. This prompted the question whether reparametrization to a very large and chemically diverse dataset like GMTKN55 [38] (see Section 2 above) would yield more solid parametrizations. 1 Many of the parameters in the double hybrid, and hence in the objective function, are entirely or nearly linear in the optimization sense of the word. Hence, we carried out macroiterations on the few nonlinear parameters we do have (primarily the percentage of HF exchange) and at each point evaluated all 2,459 energies broken down into components, then employed the latter in microiteration cycles where the remaining parameters were self-consistently optimized. As the cost of reevaluating all dispersion corrections is negligible compared to the CPU time for the electronic structure calculations, we attempted to also microiterate the nonlinear parameters in the dispersion correction. We however found that this leads to a numerically highly unstable optimization as the penalty surface in those parameters is quite flat. We therefore instead use fixed, reasonable values.
Across the board, we find our revised DSD functionals (denoted revDSD) to have lower WTMAD2 values than the originals, but the difference is particularly blatant for revDSD-PBEP86-D4. With just six adjustable parameters, said functional approaches the performance of ωB97M(2) to within overlapping uncertainties.
One notable change between original and revised optimized parameters is that same-spin coefficients drop substantially: this is advantageous if one desires to retain only the opposite-spin MP2-like term for reasons of computational efficiency (see below).
Finally, we had already found earlier (see Figure 1 in Ref. [92] ) that performance of a double hybrid for the W4-11 thermochemical benchmark [44] depends much more weakly on the fraction of HF-like exchange than an ordinary hybrid, and that the dependence is weaker still for DSD double hybrids. In the revDSD paper we showed that this is true for GMTKN55 as well if all remaining parameters are reoptimized self-consistently: for DSD-SCANx-D3, we see that varying cX, the fraction of HFlike exchange, over the range 0.63 to 0.74 affects WTMAD2 by less than 0.05 kcal/mol. Hence, semiarbitrary values can be chosen such as the published cX for the original DSD functional -or, failing that, 0.69 ≈ 3 −1/3 as a reasonable compromise value.

Accelerating double hybrids and elimination of same-spin correlation
An objection often raised against double hybrids is their alleged high computational cost. However, this is only really an issue with codes that do not allow DF-MP2, a.k.a. RI-MP2 [115,116] (density fitting MP2, "resolution of the identity" MP2).
RI-MP2 still asymptotically has an O(N 5 ) scaling with system size. However, in our calculations on the GMTKN55 set, we found that the RI-MP2 step reached at most 25-30% of total CPU time (for C60 isomers and larger). For small molecules, the RI-MP2 step's cost is a single-digit percentage of the total. Still, for large molecules, scaling reduction would be highly desirable.
There are a number of techniques involving localized pair natural orbitals (such as DLPNO-MP2 of Neese, Valeev, and coworkers [117,118] ) that asymptotically scale as O(N) for an approximate correlation energy.
If one eschews orbital localization, Head-Gordon and coworkers [85] showed that the Häser-Almlöf [119] Laplace transform MP2 technique can be used to evaluate just the opposite-spin MP2 term (SOS-MP2) in O(N 4 ) time.
Variations on this approach have been implemented in several electronic structure codes. [120][121][122][123] Very recently, the tensor hypercontraction approach of Song and Martinez [124,125] offers an O(N 3 ) alternative.
(Other options are atomic orbitals MP2 with distance screening as advocated by Ochsenfeld and coworkers [126] or stochastic MP2. [127] ) The first attempted double hybrid with c2ss=0 was B2-OS3LYP by Head-Gordon and coworkers, [128] followed by PWPB95 and PTPSS from the Grimme group. [40] In Ref. [92] we provided alternative fits that are samespin-free, which we denoted DOD-PBE-D3, DOD-PBEP86-D3, and the like. It stands to reason that degradation in performance compared to the corresponding DSD functionals would be smallest for semilocal functional components associated with a small optimum c2ss, such as DSD-PBE-D3. In our recent study Ref. [45] , however, we found that DSD-SCAN-D3(BJ) and the refitted revDSD-PBEP86-D3(BJ) both had very small c2ss, and that their analogues DOD-SCAN-D3(BJ) and revDOD-PBEP86-D3(BJ) yield essentially the same performance without same-spin MP2 correlation.
HEAVY28 (bond separation reactions of heavy p-block elements) subsets, owing to the small reaction energies involved.

XYG3 type functionals and xDSD
Zhang et al. [129] argued that the damped semilocal correlation used during the orbital step will adversely affect the orbitals, both in terms of shape and in terms of orbital energies. They therefore proposed the XYG3 functional, which generally corresponds to: Further additions to this family have been made and reviewed in detail in Ref. [131] In Section 5.3 of Ref. [40] , Goerigk and Grimme (GG) address the effect of full vs. partial semilocal correlation in the orbital generation step, as well as of using low vs. high percentages of HF exchange in that step. They show that in fact the orbitals used in B2PLYP are not qualitatively different from BHalf&HalfLYP orbitals, and that the primary effect of reducing the fraction of HF exchange is not the shapes of the occupied orbitals (which are relatively invariant across XC functionals [132] ) but an artificial lowering of the virtual orbital energies. As a result, MP2 denominators are smaller and the MP2 energy is artificially raised. GG attribute the relatively good performance for main-group chemistry of XYG3 and friends to orbitals generated with low percentages of HF exchange being less prone to spin contamination. 2 In a later collaboration, a "nonempirical" xDH-PBE0 was proposed. [133] In a comment [134] on said paper, we explored an XYG3-like modification of our empirical DSD functionals, e.g.: xDSD: DH XC,PT2 [E Ö , 1|E Ö 9 , c B,8C , E qÜá , E qàà |D3(â o , ä q , s 8 =a 1 =0)] We found that cx=cx' systematically yields the best results for a given cx value, and that for a given XC (e.g., PBEP86), xDSD may be marginally more accurate than DSD. We later confirmed [45] this finding for xrevDSD-PBEP86-D3(BJ) vs. revDSD-PBEP86-D3(BJ): the WTMAD2 lowering of about 0.1 kcal/mol appears to be principally due to the RSE43 radical reaction subsets.
In a later article comparing xDH-PBE0 and DSD-PBEPBE-D3(BJ) for different properties, it was argued [135] that xDH-PBE0 had an edge in terms of charge delocalization, and ascribed this to allegedly reduced self-interaction error (SIE). 3 One way to thoroughly reduce SIE would be the use of range-separated hybrids, which we will discuss presently.

Range-separated double hybrids and ωB97M(2)
In their most commonly used form, the interelectronic repulsion r12 -1 is split up by means of an error function of r12 into a short-range(SR) and a long-range ( where, 0 ≤ α + β ≤ 1 One of the earliest such functionals to gain wide acceptance was CAM-B3LYP of Handy and coworkers, [31] where α=0.19; β=0.46; ω = 0. 33 Another common such functional is LC-ωPBE where α=0 and β=1.0 There are several strategies for setting the rangeseparation parameter ω. One is to tune it for every system [136] to minimize the difference between the Koopmans' Theorem and ∆SCF values of the IP and the EA, thus ensuring the HOMO-LUMO gap approaches the "fundamental gap" IP-EA. While this strategy is very valuable for optical spectroscopy or materials properties, its application to thermochemistry (or, generally, any "mass production" computational project) is intrinsically awkward. For the CAM-QTP functionals, [52,137] Bartlett and coworkers instead carried out the tuning for a small set of reference molecules, then assumed these parameters to be transferable to all systems.
In an attempt to walk a middle course between the Perdew school of nonempirical functionals and the heavily parametrized functionals of the Truhlar group (such as M06, [138] M11, [51] and MN15 [56] ), Mardirossian and Head-Gordon developed a combinatorial optimization procedure in which they do expand exchange and correlation functionals into Becke-Handy [28,139] power series, but use a very large training set and winnow terms in the series expansions down for statistical significance. In this manner, they climbed up the Jacob's Ladder with the mGGA B97M-V, [57] then the range-separated hybrid GGA ωB97X-V [32] (α=0.167, β=1.0, ω =0.3, WTMAD2=3.96 kcal/mol) and the RSH meta-GGA ωB97M-V. [33] The latter was found (α=0.15, β=1.0, ω=0.3, WTMAD2=3.29 kcal/mol) by Goerigk et al. [48] and by ourselves [45] to be the most accurate 4 th -rung functional to date. ωB97X-V and ωB97M-V have 10 and 15 linear adjustable parameters, respectively -about one-third as many as in the M11 RSH mGGA functional of Truhlar and coworkers, [51] its 2019 revision [50] , or the MN15 empirical global hybrid mGGA.
With the recent ωB97M(2) functional, [140] the same "combinatorial optimization" machinery that led [32,33,57] to B97M, ωB97X-V, and ωB97M-V was applied to obtain a double hybrid. In Ref. [45] we found it to have the lowest WTMAD2 thus far reported, [45] 2.18 kcal/mol. However, the WTMAD2 differences with revDSD-PBEP86-D4 and especially xrevDSD-PBEP86-D4 are within the uncertainty of the reference data, while these latter functionals only have 6 adjustable parameters (just 5 in the DOD variants) rather than the 16 in ωB97M (2).

Comparison with WFT-based composite ab initio methods.
The mind wonders how WTMAD2 values of just above 2 kcal/mol compare with what can be achieved using composite wavefunction thermochemistry protocols (see, e.g., Refs. [141,142] for recent reviews), such as the Gn family, [143,144] CBS-QB3, [145,146] or the ccCA approach. [147] While evaluating the entire GMTKN55 dataset with them was deemed computationally too costly, we have calculated WTMAD2 for a subset of small and medium-sized molecules that does not include elements beyond Kr (for which the basis sets are lacking). This left 642 reaction energies: we note that all geometries were frozen at the GMTKN55 reference, and that zero-point and thermal corrections were excluded. (All such calculations were performed in this work using Gaussian 16 [67] on the Faculty of Chemistry's Linux cluster.) The WTMAD2 for full G4 theory, [148] 1.76 kcal/mol, is still markedly superior to the ωB97M(2) value for this subset, 1.95 kcal/mol, while the lower-cost variants G4MP2 [149] and G3B3 [150] clock in at 2.29 and 2.20 kcal/mol, respectively, for the same subset; CBS-QB3 achieves 2.25 kcal/mol. In all, it is clear that the best double hybrids have now entered the accuracy regime of composite WFT methods, at lower cost and with gentler CPU time scaling. (Moreover, relatively inexpensive analytical first and second derivatives are available for the double hybrids.) In principle, localized pair natural orbital coupled cluster theory [7,8] may represent a linear-scaling alternative amenable to still larger molecules. [151] It should be kept in mind, however, that aside from any empirical parameters that seek to improve accuracy/"trueness," any DLPNO-CCSD(T) or PNO-LCCSD(T) implementation is reliant on several adjustable parameters that will affect numerical precision at chemically significant levels. (See, e.g., Refs. [7,152] for detailed discussions.) While one might deem such "empiricism of precision" the lesser of two evils (since one needs no reference datasets, just recalculation at tighter thresholds to establish convergence to the canonical limit), it remains a source of uncertainty nevertheless.

Basis set convergence and double hybrids
As the total energy of a double hybrid is a superposition of a hybrid DFT-like component and an MP2-like component, it follows that (a) the basis set convergence behavior would be intermediate between the geometric behavior observed for hybrid DFT and the asymptotic L -3 behavior [153,154] (with L the highest angular momentum in the basis set) of MP2; (b) the MP2like behavior will dominate in the large basis set limit as the hybrid DFT component will have reached saturation. (Compared to straight MP2, it is obviously mitigated by a factor c2ab; E2ss converges faster, more like L -5 . [155] ) It also follows that basis set dependence of double hybrids would be most WFT-like for functionals with large fractions of PT2 correlation, and (e.g., as seen in Ref. . [40] for PWPB95) most DFT-like for functionals with small percentages of PT2 correlation. (We note in passing that Mardirossian and Head-Gordon [156] found Minnesota functionals to exhibit anomalously larger basis set sensitivity than other rung 4 functionals.) For wavefunction calculations, the Dunning correlation consistent basis sets [157,158] are something of a de facto standard. Hence, they are efficient for the MP2like component of a double hybrid. However, as the sp function contractions in them are based on HF orbitals, they are intrinsically less suited for pure or hybrid DFT or for the hybrid DFT component of a double hybrid.
Conversely, the pc-n and aug-pc-n polarization consistent basis sets of Jensen [159][160][161] are inherently well suited to DFT but perhaps less to the MP2-like component of a double hybrid.
The Karlsruhe (a.k.a. Weigend-Ahlrichs) basis sets [162] offer an interesting compromise here, as specifically the "def2" sequence was developed with both HF/DFT and correlated WFT methods in mind. Additionally, they cover the elements H-Rn (using relativistic pseudopotentials from Rb onward).
When parametrizing an empirical DFT functional, one could follow two approaches: either parametrize to the CBS (complete basis set limit) -which guarantees that results obtained with the functional will be basis set convergent -or parametrize to a basis set that is practically useful, in which case the parameters will be specific to that basis set. In DFT more generally speaking, Adamson, Gill, and Pople's EDF1 (empirical density functional one) [163] was the first to specify a particular basis set (and integration grid, SG-1). Similarly, Chai and Head-Gordon [106] propose two sets of parameters for their ωB97X-2 double hybrid: one set for extrapolation to the CBS limit and another specific to the 6-311++G(3df,3pd) "Large Pople" basis set.
Boese, Martin, and Handy [164] considered the sensitivity of parameters in empirical functionals fitted for specific basis sets, and found the parameters to be reasonably transferable as long as the basis sets were of at least triple-zeta plus polarization quality. In terms of double hybrids, the situation is murkier, but our experience suggests that similar transferability requires a basis set of at least quadruple-zeta quality.
Radom and coworkers [165] considered basis set dependence of double-hybrid parametrizations: they concluded that, while the deficiencies of triple-zeta basis set could to some degree be 'smoothed over' by ad hoc parametrizing to the basis set, this approach no longer works for still smaller basis sets. Basis set extrapolation formulas for double hybrids have been proposed by Chuang et al. [166] and by Chan and Radom. [167] Witte, Neaton, and Head-Gordon, [168] from DFT basis set convergence studies on the S22 weak interactions benchmark, [78,169] recommended to use at least a quadruple-zeta basis set for DFT parametrization. Basis set convergence for the larger S66x8 benchmark [108] has been considered in great detail in Ref. [107] , particularly Tables 6-9 there (for HF, PBE0, MP2, and DSD-PBEP86-D3, respectively). 4 From both studies emerges that if one wants to avoid counterpoise [170] calculations, then def2-QZVPD will effectively represent a basis set limit for rungs 1-4; from the latter study it became clear that for the DSD-PBEP86 even def2-QZVP is not adequate unless counterpoise corrections are applied. (See also Refs. [171,172] for detailed discussion of basis set superposition error for wavefunction methods.) In the B2GP-PLYP paper, we used extrapolation from Jensen's aug-pc2 and aug-pc3 polarization consistent [160] basis sets. In the DSD papers, [91,92] we did the same for the thermochemistry subset, but for the S22 weak interactions benchmark [78,169] were forced to fall back to a rather smaller basis set. In the revDSD paper (see below), we used def2-QZVPP except for anionic subsets, where we used def-QZVPPD.
As parametrization to very large training sets becomes unwieldy for basis set extrapolation, both the Head-Gordon group [32,33,57,69] and the present authors [45] adopted the compromise solution of using a fixed def2-QZVPPD (or similar) basis set for that purpose.
As the famous aphorism goes, "in theory there is no difference between theory and practice; in practice, there is". [181] So for the SCAN semilocal functional [23] and the GMTKN55 dataset, we found [45] the surprisingly linear (in the cx=0.50-0.80 range) empirical evolution of the coefficients shown in Figure 3. (We note in passing that nonempirical SCAN-based double hybrids [47] such as SCAN0-2 have distinctly inferior performance to these DSD-SCAN empirical double hybrids.)  Figure S1 in ESI of Ref. [45] , reused with permission, courtesy of the American Chemical Society.

Organometallic reaction barrier heights
The GMTKN55 dataset only covers main-group systems. A large segment of transition metal systems is prone to severe static correlation, which intrinsically limits the applicability of GLPT2-based double hybrids. However, there are some benchmarks available for transition metal reactions with small to moderate static correlation. In this section, we shall focus on the MOBH35 (metal-organic barrier heights of 35 reactions) sample of Iron and Janes, [182] particularly on its updated version [65] with basis set limit energetics with the new DLPNO-CCSD(T1) approach [8] which unlike the popular DLPNO-CCSD(T) [183] no longer neglects off-diagonal Fock matrix elements. The systems in MOBH35 are realistic organometallic reactions with large ligands, rather than the small model systems considered by Kozuch and Martin. [92] Rungs 1 and 2 are clearly inadequate for this problem, while the best Rung 3 performer, B97M-V, still has an MAD=2.9 kcal/mol. Among global hybrids, the nonempirical SCAN0 [184] is the best performer, followed immediately by the empirical PW6B95-D3(BJ). RSHes seem to have a much easier time with this problem, and the best RSH for GMTKN55, ωB97M-V, also is the best MOBH35 performer, at MAD=1.7 kcal/mol. The best performer is revDOD-PBEP86-D4 at 1.4 kcal/mol, a number that can be lowered to 1.2 kcal/mol by expanding the basis set to def2-QZVPP and slightly further by basis set extrapolation. We note that the revDSD reparametrizations consistently outperform the corresponding original DSD functionals for MOBH35, despite TMs not being involved at all in parametrization. This indicates that the revDSD functionals actually do capture the physics of the problem better than their DSD progenitors, and are not merely 'fits of round pegs to square holes'.
We also note that ωB97M(2) actually performs slightly worse than ωB97M-V, and that thus an empirical double hybrid does not automatically represent an improvement over the underlying fourth-rung functional. We would be remiss by not pointing out, however, that ωB97M-V appears to be an unusually versatile 'workhorse functional'. This is again confirmed in a recent computational study [185] on the reaction mechanism of Milstein's [186] concurrent hydroarylation and oxidative coupling catalyzed by Ru(II) complexes. While the primary goal of this study was assessing the performance of DLPNO-CCSD(T) [8] and PNO-LCCSD(T) [7] methods compared to full canonical CCSD(T)/def2-{T,Q}ZVPP extrapolation, performance of DFT functionals was a secondary goal. The trends seen largely follow MOBH35, with again ωB97X-V and ωB97M-V putting in 'best in class' performances. revDSD-PBEP86-D3BJ still performed slightly better, however, and definitely better than the original DSD-PBEP86-D3BJ parametrization.
Some light might be shed by considering the performance of different spin-resolved MP2 variants. Somewhat intriguingly (Table 2), not only does SCS-MP2 exhibit much better performance (MAD=2.8 kcal/mol) than standard MP2 (MAD=6.0 kcal/mol), but SOS-MP2 without any same-spin correlation clocks in lower still at 2.5 kcal/mol. Similarly, for the transition states in Ref. [185] we find the following RMSDs from CCSD(T)/def2-TZVPP (using the same basis set): MP2 9.5, S2-MP2 8.6, SCS-MP2 5.2, and SOS-MP2 5.4 kcal/mol, the latter two figures superior to CCSD. One might be forgiven for concluding that same-spin MP2 correlation is somewhat "poisonous" in systems with significant static correlation. This finding might explain why Iron and Janes found DOD-DHs to be superior over DSD-DHs for MOBH35, and may also shed some light on the somewhat disappointing performance of ωB97M(2), for which c2ab=c2ss=0.341.  Figure 3 of Ref. [45] , reused with permission, courtesy of the American Chemical Society. As analytical first and second derivatives of double hybrids are relatively inexpensive, they offer an attractive option not just for harmonic frequencies, but for anharmonic fundamentals, overtones, and combination bands obtained via second-order rovibrational perturbation theory [187] on a semidiagonal quartic force field. The latter can be obtained fairly affordably, if analytical second derivatives are available, using the Schneider-Thiel technique [188] of central numerical differentiation in normal coordinates.

Vibrational frequencies and derivative properties
In Ref. [189] , performance of many exchangecorrelation functionals (including double hybrids) was considered in depth for the HFREQ27 benchmark defined there. (It is quite possible, for diatomics and small polyatomics, to extract harmonic frequencies and anharmonic corrections separately from high-resolution spectra, and thus have reference values to 1 cm -1 or better.) For basis set limit CCSD(T), the RMSD is 4.6 cm -1 , the remaining error distributed evenly between subvalence correlation (which tends to push stretching frequencies up [190] ) and post-CCSD(T) valence correlation (which tends to lower them [190] ).
As shown in Ref. [189] , even after corrective frequency scaling MP2 has an RMSD=30 cm -1 near the basis set limit, compared to 32 cm -1 for PBE0 and 25 cm -1 for B3LYP. B2PLYP goes down to a respectable 11 cm -1 ; so can B2GP-PLYP, but only after scaling by about 0.99. In contrast, DSD-PBEP86 reaches a similar accuracy without scaling, and thus its improved performance for energetic does not come at the expense of PES curvature.
Does the improved accuracy for thermochemistry of revDSD come at the expense of harmonic frequencies? It can be seen in Figure 4 below that this is not the case, and that if anything, frequencies improve a little furtheragain, despite not being targeted during parametrization at all.
In 2013, Alipour [200] considered performance of DHs for electric field response properties, specifically, isotropic and anisotropic polarizabilities of water nanoclusters. Unlike for GMTKN55-type energetics, nonempirical functionals like PBE0-DH and PBE0-2 were found to perform much better than empirical functionals: this point merits further study.

Electronic excitation spectroscopy
Already shortly after publication of the original B2PLYP paper, [201] Grimme and Neese reported [202] its extension to excited states, combining a conventional TD-DFT framework for the hybrid-GGA part with a scaled second-order perturbation correction based on Head-Gordon and Lee's CIS(D). [203] . Two years later, Goerigk et al. [204] demonstrated promising performance of TD-B2PLYP and TD-B2GP-PLYP (both MAD=0.22 eV) for the 142 vertical singlet excitations benchmark of Thiel and coworkers. [205] Goerigk and Grimme additionally considered low-lying valence excitations in large organic dyes [206] as well as polycyclic aromatic hydrocarbons, [207] and found particularly TD-B2GP-PLYP to perform very well, although range-separated hybrids were still superior for charge transfer states. Such global double hybrids were also found [208] to be the only ones able to simulate an exciton-coupled electronic circular dichroism (ECD) spectroscopy spectrum that had been initially studied [209] in 2008 by wave function methods.
Thus far, no spin-component scaling (SCS/SOS) of nonlocal correlation was considered in TD-DHDFAs. Schwabe et al. first did so in 2017. [210] Very recently, Goerigk and coworkers [211] proposed the ωB2LYP and ωB2GP-PLYP range-separated variants of B2PLYP and B2GP-PLYP, [79] respectively, with a view to application to electronic excitation energies (such as UV/VIS spectra). All parameters were frozen at their original values, with the range-separation parameter ω newly optimized to 0.30 (ωB2PLYP) and 0.27 (ωB2GP-PLYP). These authors found that particularly the latter remedies the deficiencies of the global double hybrid B2GP-PLYP for Rydberg excitations as well as charge transfer excitations, while preserving its excellent performance for valence excitations.

Orbital-optimized double hybrids
Another approach that should be mentioned in passing is the adoption of orbital-optimized MP2 (OO-MP2) [212,213] inside a double hybrid framework. In such methods, the total MP2 energy is optimized with respect to the orbitals. This approach was considered first [212,213] for MP2, SCS-MP2, and SOS-MP2, then later for nonempirical double hybrids (e.g., Refs. [46,214] ).
Najibi and Goerigk [215] carried out a more comprehensive survey of both empirical and nonempirical double hybrids, as well as spin-resolved MP2 variants, for a subset of GMTKN55 plus a few dedicated test sets. For the singlet-triplet splitting in the polyacene series benzene through hexacene, OO-DHs often cut errors in half compared to the underlying DH. Noticeable improvements are likewise seen for the S22 noncovalent interactions [110,111] and W4-11 smallmolecule thermochemistry [44] sets, but performance for barrier heights actually gets worse. As summarized in Figure 5a of Ref. [215] , for the closed-shell tess only OO-SOS-MP2 and OO-SCS-MP2 represent improvements over SOS-MP2 and SCS-MP2, and in OO-DSD-PBEP86 and OO-PBE0-2 significant deterioration is actually seen. However, if also open-shell cases are considered, then those two latter double hybrids actually do represent improvements, while for OO-SOS-MP2 and OO-SCS-MP2, the "OO bonus" is quite marked. Further investigation may be warranted.

dRPA based double hybrids
The Achilles's Heel of MP2-like correlation contributions is the orbital energy differences in the denominator. For molecules with small band gaps (absolute near-degeneracy, a.k.a., Type A static correlation [70] ), MP2-based DHs will intrinsically struggle. Chan, Goerigk, and Radom [216] considered the use of higher-order perturbation theory and CCSD, and found that this leads to no significant improvement despite the extra computational cost.
Grimme and Steinmetz proposed [231]  The special twist here lies in the use of mGGA orbitals (cX=0), which greatly speeds up the orbital evaluation phase. Hence, with a fast dRPA code, PWPRPA calculations can be a factor of 4-5 faster than double hybrids that include exact exchange during the iterations.

Summary and Outlook
Double hybrids can, in a sense, be regarded as sitting on the seam line between WTF and DFT methods. For main-group thermochemistry, kinetics, and noncovalent interactions, they can achieve accuracies well beyond those attainable with the best rung four functionals, at a computational cost that is not much greater. They are, arguably, entering the territory of what is attainable with low-cost composite ab initio thermochemical protocols. In addition, they only require a modest number of empirical parameters. Moreover, the superior performance is also seen for types of systems and types of properties not at all considered in the parametrization: therefore, such empirical double hybrids do represent more than "fitting round pegs to square holes".
The main Achilles' Heel of the DH approach is static correlation, particularly the Type A variety (absolute neardegeneracy [70] ). This means that double hybrids, as currently constituted, are more likely to misbehave for transition metal systems. Possibly this may be mitigated by considering RPAtype methods. Coupling singlet-paired coupled cluster [232] with DFT correlation [233] may represent another path.
Range-separated exchange double hybrids could offer succor in other areas through mitigating self-interaction error and initial applications to optical spectra look very promising. In addition, range-separated correlation might offer [234] an avenue to eliminate or mitigate reliance on the dispersion correction at long range.
As a final reflection, we would argue that the choice between purely nonempirical functionals and radically pragmatic approaches with dozens of adjustable parameters like Refs. [29,30,51,56,235] is a false dichotomy. The Berkeley functionals B97M-V, [57] ωB97X-V, [32] and ωB97M-V [33] represent a fruitful middle course, achieving high accuracy and versatility with comparatively modest numbers (around a dozen) of empirical parameters. We have shown that DSDtype double hybrids can in fact achieve even better accuracy with still fewer (4-6) empirical parameters, and that they can reach accuracies comparable to composite wavefunction approaches. The latter either involve similar numbers of parameters (G3, G4) or are much costlier (ccCA, W1, W2-F12), and none of these afford similarly convenient energy derivatives. It therefore appears that empirical double hybrids bring something to the computational modeling table not currently offered by any other approach.