Abstract
 Top of page
 Abstract
 Introduction
 Atomic Natural Orbital Basis Sets
 Correlation Consistent Basis Sets
 Polarization Consistent Basis Sets
 Def2 Basis Sets
 Other Orbital Basis Sets
 Auxiliary Basis Sets
 Outlook
 Acknowledgements
 Biographical Information
The choice of basis set in quantum chemical calculations can have a huge impact on the quality of the results, especially for correlated ab initio methods. This article provides an overview of the development of Gaussian basis sets for molecular calculations, with a focus on four popular families of modern atom-centered, energy-optimized bases: atomic natural orbital, correlation consistent, polarization consistent, and def2. The terminology used for describing basis sets is briefly covered, along with an overview of the auxiliary basis sets used in a number of integral approximation techniques and an outlook on possible future directions of basis set design. © 2012 Wiley Periodicals, Inc.
Introduction
Ab initio and density functional theory (DFT) calculations typically require two main approximations: the “method” (the expansion of the many-electron wavefunction or the choice of exchange-correlation functional) and the “basis set” (the expansion of the one-electron orbitals). The choices made in these basic considerations for solving the electronic Schrödinger equation determine the overall accuracy of the results and the computational cost required to obtain them. Although knowledge of basis set development is not a prerequisite for performing quantum chemical calculations, especially as many basis sets are included in widely available program packages or basis set archives,1–3 the aim of this review is to aid the theoretical/computational chemist in making informed basis set choices for molecular calculations, focusing on insights into the design, development, and optimization of the more recent developments in the field. The author believes that this will help the reader develop a feeling for the basis set incompleteness error in a given calculation, and so obtain “the right results for the right reason.”
Detailed descriptions of the functions found within basis sets and their classifications can be found in a number of texts (e.g., Ref. 4), hence the following description is kept purposefully brief to serve as a reminder of terminology. This review is concerned only with Gaussian basis sets,5, 6 where the basis set is composed of basis functions taking the form:
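$$\chi_{\zeta,n,l,m}(r,\theta,\varphi) = N\, Y_{l,m}(\theta,\varphi)\, r^{2n-2-l}\, e^{-\zeta r^{2}} \tag{1}$$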
or an analogous representation using Cartesian, rather than polar, coordinates. For the Gaussian-type orbital (GTO) shown in Eq. (1), N is a normalization constant, ζ is known as the exponent, and Y_{l,m} are spherical harmonic functions. The indices n, m, and l determine the type of orbital (s, p, d, etc.). Basis functions of this explicit form are also known as Gaussian primitives (the Gaussian prefix shall be assumed herein) and, for reasons of balance and efficiency, a linear combination of a number of primitives is often taken to produce contracted functions:
$$\chi^{\mathrm{contracted}} = \sum_{i} n_{i}\, \chi_{i}^{\mathrm{primitive}} \tag{2}$$
where n_{i} is known as a contraction coefficient.7 The methods and philosophies of optimizing the exponents and contraction coefficients, and their subsequent collection into a single defined whole (the basis “set”) will be the focus of this article.
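The relationship between primitives and a contracted function in Eq. (2) can be illustrated with a minimal sketch for s-type functions on a single center. The exponents and coefficients below are hypothetical placeholders, not values from any published basis set, and the helper names are illustrative only. The sketch assumes normalized s primitives (the angular factor is absorbed into the normalization constant) and rescales the contraction coefficients so that the contracted function is itself normalized.

```python
import math

def primitive_norm(zeta):
    """Normalization constant of an s-type Gaussian primitive exp(-zeta * r^2)."""
    return (2.0 * zeta / math.pi) ** 0.75

def primitive_s(zeta, r):
    """Value of a normalized s-type primitive at distance r from its center."""
    return primitive_norm(zeta) * math.exp(-zeta * r * r)

def overlap_s(zeta_a, zeta_b):
    """Overlap of two normalized s primitives sharing the same center."""
    return (2.0 * math.sqrt(zeta_a * zeta_b) / (zeta_a + zeta_b)) ** 1.5

def self_overlap(exponents, coeffs):
    """Self-overlap of the contracted function sum_i c_i * g_i."""
    return sum(
        ci * cj * overlap_s(zi, zj)
        for zi, ci in zip(exponents, coeffs)
        for zj, cj in zip(exponents, coeffs)
    )

def normalize_contraction(exponents, coeffs):
    """Rescale contraction coefficients so the contracted function has unit norm."""
    scale = 1.0 / math.sqrt(self_overlap(exponents, coeffs))
    return [c * scale for c in coeffs]

def contracted_s(exponents, coeffs, r):
    """Contracted function: a fixed linear combination of primitives."""
    return sum(c * primitive_s(z, r) for z, c in zip(exponents, coeffs))

# Hypothetical three-primitive contraction (illustrative numbers only).
exponents = [5.0, 1.0, 0.2]
coeffs = normalize_contraction(exponents, [0.2, 0.5, 0.4])
value_at_origin = contracted_s(exponents, coeffs, 0.0)
```

Only the exponents and the fixed coefficients define the contracted function; during a molecular calculation the three primitives then behave as a single basis function.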
How the exponents of a basis set are optimized depends strongly on its intended usage. For example, if an investigation is focused solely on the electric properties of molecules, then using a basis set that has been optimized specifically for that purpose is likely to produce very good results.8 However, attempting to address all basis sets specialized for every different type of physicochemical property would represent a huge undertaking, hence this review will be mainly concerned with atom-centered, “energy-optimized” basis sets, where the exponents are optimized to minimize an electronic energy. Such basis sets are considered to be more general-purpose, yet are often augmented with additional functions when considering specific problems (see later). Although all exponents may be optimized using common minimization algorithms such as Broyden–Fletcher–Goldfarb–Shanno (BFGS) and simplex,9 an alternative option is to optimize a single exponent, then produce additional exponents using mathematical relationships such as “well-tempered,” “even-tempered,” and “Legendre polynomials.” Such methods are incredibly useful for large basis sets where optimization of every individual primitive is difficult, and have been discussed by Petersson et al.10 Even-tempered schemes also find use in extending basis sets with additional diffuse functions for the description of, for example, molecular anions. Such extensions often take the form:
$$\zeta_{3} = \zeta_{2} \left( \frac{\zeta_{2}}{\zeta_{1}} \right) \tag{3}$$
where ζ_{3} is the new exponent, ζ_{2} is the next most diffuse exponent, and so forth.
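As a rough illustration of such an extension, the sketch below assumes that Eq. (3) takes the usual geometric (even-tempered) form, in which the ratio between the two most diffuse existing exponents is reused to generate each new one. The function name and the example exponents are hypothetical.

```python
def extend_diffuse(exponents, n_extra=1):
    """Append n_extra diffuse exponents by geometric continuation.

    Assumes the even-tempered form zeta_new = zeta_2 * (zeta_2 / zeta_1),
    where zeta_2 is the most diffuse existing exponent and zeta_1 the
    next most diffuse one.
    """
    if len(exponents) < 2:
        raise ValueError("need at least two exponents to define a ratio")
    # Work on a copy sorted from tightest (largest) to most diffuse (smallest).
    extended = sorted(exponents, reverse=True)
    for _ in range(n_extra):
        zeta_1, zeta_2 = extended[-2], extended[-1]
        extended.append(zeta_2 * (zeta_2 / zeta_1))
    return extended

# Hypothetical s exponents; two additional diffuse functions are generated.
print(extend_diffuse([1.6, 0.4, 0.1], n_extra=2))
```

Because each new exponent reuses the previous ratio, repeated application simply continues the geometric progression toward smaller exponents.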
Quality and contraction pattern
Working within the assumed framework of atom-centered, energy-optimized GTOs, basis sets can be further classified based on the number and type of functions included. The simplest basis sets are termed “minimal basis sets,” as they possess only the functions necessary to contain the electrons within an atom. For example, a minimal basis set for a neutral Ne atom would contain two s functions and a single three-component p function. In practice, minimal basis sets are rarely used because they allow for little or no electron correlation.
One method of improving on a minimal basis set is to simply double the number of functions, leading to four s functions and two (three-component) p functions in our Ne example. In a similar fashion, one could triple, quadruple, and so forth, the number of functions, and such basis sets are known as double zeta (DZ), triple zeta (TZ), and quadruple zeta (QZ), respectively. Most commonly used basis sets use a slightly modified arrangement where only the basis functions corresponding to the valence electrons of a given element are increased in zeta, leading to “split-valence” basis sets. Such basis sets may still be referred to as, for example, DZ, but the abbreviations VDZ and DZV (valence double zeta and double zeta valence, respectively) are also in use and provide additional clarity. Beyond increasing the zeta of a given basis set, its quality may also be improved by adding higher angular momentum functions beyond those required to contain the neutral atom electrons, for example, adding d-type functions (or higher) to a basis set for Ne. Functions of this type are termed “polarization functions” or “correlation functions” and interested readers are referred to Ref. 4 for a more detailed description of the physical interpretation of the effects of adding such functions to a basis set. Suffice it to say that they are incredibly important for investigations using electron correlation.
Although the contraction of basis functions according to Eq. (2) is a simple linear combination, the design decisions undertaken when developing new basis sets have led to the emergence of two distinct methods of selecting which GTOs are contracted and the pattern for doing so. The first of these is the “segmented contraction,” where each primitive contributes to only one contraction. This can lead to the situation where several primitives are cloned to appear in a number of different contractions. The second method is known as “general contraction,”11 and in this case all primitives of the same angular momentum are present in every contracted function of the same angular momentum. A given primitive will then have a different contraction coefficient in each contracted function. For either type of contraction, it is also common to include a number of primitives in their uncontracted form (essentially a single primitive with a contraction coefficient of 1). When listing a basis set composition, both types of contraction are expressed in the style (12s6p3d2f1g) [5s4p3d2f1g], where the parentheses denote the primitives and the square brackets the contracted functions. It is important to note that this provides almost zero information about the contraction pattern, and to uncover the details one must usually refer to a list of exponents and contraction coefficients or the original publication. Although most modern quantum chemistry codes support both segmented and generally contracted basis sets, the algorithms used are usually significantly more efficient for one contraction scheme than the other. A user should take such information into account when deciding on which basis set or software package to use for a given investigation.
With the terminology of basis sets in place, the development of several families of modern basis sets can now be discussed. For earlier developments and further history of Gaussian basis sets, there are a number of well-written reviews, such as Ref. 12.
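The composition notation can be unpacked mechanically. The following sketch parses a label in the (primitives) [contracted] style and counts functions, assuming spherical harmonic components so that a shell of angular momentum l contributes 2l + 1 functions. The helper names are illustrative and not taken from any program package.

```python
import re

ANGULAR_LETTERS = "spdfghi"  # position in the string gives l = 0, 1, 2, ...

def parse_shells(text):
    """Return {l: number of shells} for a string such as '5s4p3d2f1g'."""
    shells = {}
    for count, letter in re.findall(r"(\d+)([spdfghi])", text):
        l = ANGULAR_LETTERS.index(letter)
        shells[l] = shells.get(l, 0) + int(count)
    return shells

def count_functions(shells):
    """Total number of functions, assuming 2l + 1 spherical components per shell."""
    return sum(n * (2 * l + 1) for l, n in shells.items())

def parse_composition(label):
    """Split '(primitives) [contracted]' into two shell dictionaries."""
    match = re.search(r"\(([^)]*)\)\s*\[([^\]]*)\]", label)
    if match is None:
        raise ValueError("expected a label of the form '(...) [...]'")
    return parse_shells(match.group(1)), parse_shells(match.group(2))

primitives, contracted = parse_composition("(12s6p3d2f1g) [5s4p3d2f1g]")
print(count_functions(primitives), count_functions(contracted))  # 68 55
```

As the text notes, the label says nothing about which primitives enter which contraction; the count of contracted functions is what determines the size of the matrices handled in a calculation.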
Atomic Natural Orbital Basis Sets
Almost every modern basis set draws at least some inspiration from the atomic natural orbital (ANO) basis sets of Almlöf and Taylor.13–15 One of the major goals of this seminal advance was to reduce the contraction error (defined as the energy difference between calculations with an uncontracted primitive basis and those using a contracted version) in post-Hartree–Fock (HF) calculations. The approach taken was based on the primitive sp basis sets of van Duijneveldt,16 which were subsequently generally contracted in a way that produced HF contraction errors to within “chemical accuracy” of 1 kcal mol^{−1}. Configuration interaction with singles and doubles (CISD) calculations were carried out using the uncontracted primitives to produce ANOs, with the expansion coefficients of these NOs then used as the contraction coefficients within the general contraction of the basis. Even-tempered polarization functions were added to the sp set and contracted in the same fashion, producing a family of basis sets for the first and second row atoms (and hydrogen). To address the transition metal elements, the above procedure was modified such that density matrices were obtained for several electronic states, which were then averaged to produce the natural orbitals used in the contraction.17, 18
The averaging of density matrices was also used in an alternative formulation of ANO basis sets (ANO-L) described by Roos and coworkers,19 in which the matrix and resulting orbitals are averaged over anions and cations, atoms in electric fields, and several electronic states for each atom. This produced a balanced description of hydrogen, helium, and the first row elements for properties including ionization potentials, electron affinities (EAs), and polarizabilities. The same methodology was later used to produce revised ANO basis sets for the second row elements,20 and for the first row transition metal elements.21 The compositions of the largest of the resulting bases (along with compositions of many of the other sets detailed in this review) are displayed in Table 1. Smaller ANO-type basis sets (ANO-S) were also developed by Roos and coworkers22 for the elements H–Kr, with the intention that they could be used for larger molecules.
Table 1. Compositions of selected contracted Gaussian basis sets

Basis set     | H             | B–Ne            | Al–Ar
--------------|---------------|-----------------|----------------
ANO-L         | [4s3p2d]      | [6s5p3d2f]      | [7s6p4d3f]
ANO-S         | [4s3p]        | [4s3p2d]        | [5s4p3d]
ANO-RCC       | [6s4p3d1f]    | [8s7p4d3f2g]    | [9s9p5d3f2g]
cc-pVDZ       | [2s1p]        | [3s2p1d]        | [4s3p1d]
cc-pVTZ       | [3s2p1d]      | [4s3p2d1f]      | [5s4p2d1f]
cc-pVQZ       | [4s3p2d1f]    | [5s4p3d2f1g]    | [6s5p3d2f1g]
cc-pV5Z       | [5s4p3d2f1g]  | [6s5p4d3f2g1h]  | [7s6p4d3f2g1h]
cc-pV(T+d)Z   | [3s2p1d]      | [4s3p2d1f]      | [5s4p3d1f]
aug-cc-pVTZ   | [4s3p2d]      | [5s4p3d2f]      | [6s5p3d2f]
cc-pwCVTZ     | [3s2p1d]      | [6s5p3d1f]      | [7s6p4d2f]
cc-pVTZ-F12   | [4s3p1d]      | [6s6p3d2f]      | [7s7p4d2f]
cc-pCVTZ-F12  | [4s3p1d]      | [7s7p4d2f]      | [8s8p5d3f]
pc-0          | [2s]          | [3s2p]          | [4s3p]
pc-1          | [2s1p]        | [3s2p1d]        | [4s3p1d]
pc-2          | [3s2p1d]      | [4s3p2d1f]      | [5s4p2d1f]
pc-3          | [5s4p2d1f]    | [6s5p4d2f1g]    | [6s5p4d2f1g]
pc-4          | [7s6p3d2f1g]  | [8s7p6d3f2g1h]  | [7s6p6d3f2g1h]
aug-pc-2      | [4s3p2d]      | [5s4p3d2f]      | [6s5p3d2f]
def2-SVP      | [2s1p]        | [3s2p1d]        | [4s3p1d]
def2-TZVPP    | [3s2p1d]      | [5s3p2d1f]      | [5s5p3d1f]
def2-QZVPP    | [4s3p2d1f]    | [7s4p3d2f1g]    | [9s6p4d2f1g]
def2-TZVPPD   | [3s3p1d]      | [6s3p3d1f]      | [6s5p4d1f]
6-31G         | [2s]          | [3s2p]          | [4s3p]
A third iteration of ANO basis sets (ANO-RCC) has also been published, with the design goal of ensuring basis sets of the same quality for the whole periodic table of elements. The initial publication in this direction targeted the alkali and alkaline earth metals,23 introduced scalar relativistic effects via the use of a Douglas–Kroll–Hess (DKH) Hamiltonian,24, 25 and replaced the CISD method with complete-active-space self-consistent field (CASSCF) and complete-active-space second-order perturbation theory (CASPT2).26, 27 The same design principles have since been used in the development of ANO-RCC sets for the main group elements (groups 13–18),28 the transition metal elements [including spin-orbit (SO) effects],29 and the actinides and lanthanides (again including SO effects).30, 31 The ANO-S, ANO-L, and ANO-RCC basis sets are the standard basis sets within the MOLCAS software.32
Recently, Neese and Valeev have presented a series of ANO basis sets that take many cues from the correlation consistent basis sets (see below). The initial publication covers the elements H–Ar,33 with the multireference averaged coupled-pair functional method used in the optimization.34 A number of variations in polarization functions and contractions were benchmarked in terms of HF energy, correlation energy, and total energy, before comparison with several basis sets from other families. Although the initial results with these ano-pVnZ (n = D, T, Q) bases appear promising, at the time of writing, it is unclear how well the basis sets perform for atomic/molecular properties and for relative energies.
Correlation Consistent Basis Sets
The large number of primitive functions in the original ANO basis sets, particularly with high angular momentum, and the associated computational cost of evaluating them posed a significant barrier to adoption. An alternative approach, originally developed for the first row atoms (B–Ne and H) by Dunning,35 has led to the correlation consistent (cc) family of basis sets. The sets are energy-optimized and use general contraction schemes for the functions describing the occupied orbitals (the HF wavefunction). The key advance of the cc basis sets came from inspecting the incremental change in correlation energy (at the CISD level) when additional uncontracted higher angular momentum primitives were added, with the striking results depicted in Figure 1. It can easily be seen that the polarization functions can be separated into discrete groups based on the magnitude of correlation energy they recover. For example, adding a single d function has by far the largest effect on the energy, but adding a second d function has approximately the same effect as adding a single f function, forming a second grouping. Likewise, a third grouping is formed by the third d function, second f function, and first g function. The naming of the basis set family now becomes clear: each function within a grouping adds an (approximately) consistent amount of correlation energy. Combining these polarization groupings with split-valence DZ, TZ, and QZ quality groups of s and p functions (plus a number of s and p polarization functions) naturally led to basis sets that are known as cc-pVDZ, cc-pVTZ, and cc-pVQZ (correlation consistent polarized valence n zeta).
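The way the groupings build up the contracted compositions can be sketched in a few lines. The pattern below is inferred from the B–Ne column of Table 1 (one more s and p function and one additional polarization grouping per cardinal number) and is an illustrative assumption that applies only to the first row cc-pVnZ sets; the function name is hypothetical.

```python
ANGULAR_LETTERS = "spdfghi"  # position in the string gives l = 0, 1, 2, ...

def cc_pvnz_first_row(n):
    """Contracted composition of cc-pVnZ for B-Ne, for cardinal number n = 2..6.

    Each step in n adds one s and one p function plus a new grouping of
    polarization functions: one more function for every angular momentum
    already present, and the first function of the next angular momentum.
    """
    if not 2 <= n <= 6:
        raise ValueError("pattern only sketched for n = 2 (DZ) to 6 (6Z)")
    counts = [n + 1, n] + [n + 1 - l for l in range(2, n + 1)]
    shells = "".join(f"{c}{ANGULAR_LETTERS[l]}" for l, c in enumerate(counts))
    return f"[{shells}]"

for n, name in [(2, "cc-pVDZ"), (3, "cc-pVTZ"), (4, "cc-pVQZ"), (5, "cc-pV5Z")]:
    print(name, cc_pvnz_first_row(n))
```

The printed compositions can be compared directly against the B–Ne entries for cc-pVDZ to cc-pV5Z in Table 1.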
One of the primary reasons for the cc basis set family's lasting popularity is a series of empirical observations that as the cardinal number (n in cc-pVnZ) of the basis set is increased, energies and various properties converge smoothly toward the complete basis set (CBS) limit. This provides an established route for the systematic improvement of ab initio calculations toward the exact solution of the time-independent nonrelativistic electronic Schrödinger equation. Just as one may improve a coupled cluster operator from single and double excitations to single, double, and triple excitations,36 the basis set can be improved from, for example, cc-pVTZ to cc-pVQZ. This systematic improvement with respect to cc basis sets paved the way for a large number of extrapolation formulas that can be used to estimate the CBS limit. A review of extrapolation procedures is beyond the scope of the current work, and interested readers are referred to Ref. 37 and references therein.
The same principles used in the original cc investigation have since been used to extend the prescription up to cc-pV5Z or cc-pV6Z for H–Ne (and even beyond, such as cc-pV10Z for Ne),38–41 and to produce analogous basis sets for the second row elements (Na–Ar),41–43 and third row main group elements (Ga–Kr).44 A number of years after the initial publications, it was observed that adding a single additional “tight” (large exponent) d function and reoptimizing the d primitives produced significantly better bond lengths and dissociation energies for molecules involving the second row elements. The resulting basis sets are denoted cc-pV(n + d)Z and should be used in place of the original sets for those atoms.41, 45
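To give a flavor of what such a formula looks like, the sketch below uses one widely quoted two-point form, which assumes that the correlation energy approaches the limit as E(n) = E(CBS) + A/n^3. This particular form is an assumption chosen for illustration and is not a recommendation made in this review; the energies in the example are synthetic numbers constructed to follow the assumed model.

```python
def extrapolate_two_point(e_small, e_large, n_small, n_large, power=3):
    """Estimate the CBS limit from two energies with cardinal numbers n_small < n_large.

    Assumes E(n) = E_cbs + A / n**power and eliminates the unknown A.
    """
    if n_large <= n_small:
        raise ValueError("n_large must exceed n_small")
    weight_large = n_large ** power
    weight_small = n_small ** power
    return (weight_large * e_large - weight_small * e_small) / (weight_large - weight_small)

# Synthetic correlation energies (hartree) generated from the assumed model,
# with E_cbs = -0.300 and A = 0.100; they are not results of any calculation.
e_tz = -0.300 + 0.100 / 3 ** 3
e_qz = -0.300 + 0.100 / 4 ** 3
print(extrapolate_two_point(e_tz, e_qz, 3, 4))
```

When the input energies follow the assumed model exactly, the limit is recovered; for real data the quality of the estimate depends on how well the chosen functional form describes the actual convergence.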
Extensions to the cc basis sets
In addition to the systematic improvement toward the CBS limit, cc basis sets have proven popular due to the large number of “extensions” available that adapt the original scheme to heavier elements and augment the sets with primitives for describing properties of interest. Many of these extensions were reviewed by Peterson in 2007,46 hence they will only be mentioned briefly here, along with descriptions of more recent developments.
Diffuse functions.
An accurate description of EAs and noncovalent interactions requires “diffuse” (small exponent) functions within the basis set to describe the long-range part of the wavefunction. The cc philosophy to address such situations is to augment a “parent” set with an additional diffuse function for each angular momentum symmetry occupied within that basis.47 These functions are typically optimized for atomic anions and the resulting set is prefixed aug-, for example, aug-cc-pVDZ. When cc basis sets were developed for heavier elements, it became commonplace that diffuse augmenting functions were published alongside their parent and, in situations where even more diffuse functions are required, doubly (d-aug-cc-pVnZ) and triply (t-aug-cc-pVnZ) augmented basis sets are available.38 Additional augmentation can be produced in an ad hoc fashion using the even-tempered scheme of Eq. (3).
Core-valence correlation.
By default, most ab initio programs use a frozen-core approximation for correlated methods and all of the parent cc basis sets were designed with this in mind. For high-accuracy studies, it is frequently important to correlate at least an additional shell of core electrons, which requires a number of tight functions optimized for this purpose. There are two established approaches for optimizing such exponents: in the cc-pCVnZ basis sets, the exponents are optimized based on the difference between valence-only and core and valence correlation energies,48–51 whereas in the cc-pwCVnZ bases, the core-valence contribution to the correlation energy is strongly weighted over the core-core (hence the w in the basis set abbreviation).49, 50 Typically, the cc-pwCVnZ sets produce better results for small zeta, with the results from both approaches converging rapidly for larger basis sets.
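The effect of augmentation on the composition of a basis set can be sketched as follows: one function is added for every angular momentum already present in the parent, and this is repeated for doubly or triply augmented sets. The function name is hypothetical, and only the bookkeeping of the composition is shown (no exponents are generated).

```python
import re

def augment(composition, levels=1):
    """Add `levels` diffuse functions to each angular momentum in a composition.

    levels=1 mimics the aug- prefix, levels=2 the d-aug- prefix, and so on.
    """
    def bump(match):
        return f"{int(match.group(1)) + levels}{match.group(2)}"
    return re.sub(r"(\d+)([spdfghi])", bump, composition)

print(augment("[4s3p2d1f]"))            # cc-pVTZ for B-Ne -> [5s4p3d2f]
print(augment("[4s3p2d1f]", levels=2))  # doubly augmented -> [6s5p4d3f]
```

The first result matches the aug-cc-pVTZ entry for B–Ne in Table 1.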
Scalar relativistic effects.
For the lighter elements of the periodic table scalar relativistic effects are typically small, but, in very high accuracy calculations where, for example, wavenumber accuracy is the goal, they should still be included. Perhaps the easiest way of doing so is to use a DKH Hamiltonian, and a comprehensive study has indicated that simply using the standard cc basis sets in such calculations can lead to large errors.52 The same study went on to show that recontracting the basis sets (keeping all exponents the same) in atomic DKH calculations was sufficient to produce good results for the elements H–He, B–Ne, Al–Ar, and Ga–Kr,52 with a later study developing similar sets (denoted cc-pVnZ-DK) for Li, Be, Na, and Mg.41 Alternatives to DKH calculations include fully relativistic four-component Dirac–Hartree–Fock calculations (fully relativistic herein) or those including relativistic pseudopotentials [PPs, also known as effective core potentials (ECPs)], which shall be discussed in the context of the heavier elements below.
Heavier elements.
All the cc basis sets described above share a common methodology in their design and optimization, yet challenges posed by heavier elements such as greater relativistic effects, a proliferation of common oxidation states, and a simple increase in the total number of electrons have necessitated some changes to the approach. The use of PPs remedies two of these problems as they accurately include scalar relativistic effects in a simple fashion, and they replace a number of core electrons, reducing the size of the basis set. For further details on the design and construction of PPs, the reader is referred to a recent review,53 but for present purposes it is important to note two different sizes of PP: large-core (replacing all but the valence electrons) and small-core (an extra core-valence shell of electrons remains). In the latter, the primary goal is an accurate description of relativistic and correlation effects, rather than decreasing the overall computational cost.
Some of the first cc basis sets matched to PPs were produced for the elements Ga–Kr and In–Xe, at TZ and QZ quality.54 These sets are based on the large-core Stuttgart–Dresden–Bonn PPs,55 and are denoted SDB-cc-pVnZ. Subsequently, cc basis sets from DZ to 5Z quality have been developed that are matched to the small-core Stuttgart–Köln PPs.56–58 These basis sets span the whole of the post-d main group elements, are denoted cc-pVnZ-PP, and have diffuse augmented and core-valence variants available.59 As had been suggested previously,54, 60 calibration of the core-valence effect confirmed that correlation of the (n − 1)d electrons is very important for systems containing the group 13–15 elements, with the effects becoming larger further down the groups. Correlating the (n − 1)s and p electrons is much less important, but still advisable for high-accuracy studies.59 A revised PP and accompanying cc bases have been published for I and, due to the increase in accuracy, should be used in preference to the originals.61 In terms of basis sets for fully relativistic calculations on post-d main group elements, cc sets of DZ–QZ quality are available,62–65 along with additional functions for core correlation.66 Dyall has also recently published fully relativistic sets for the 7p elements, managing to overcome the problem of very large SO coupling in the process.67
High-accuracy calculations involving transition metal elements are demanding on both method and basis set due to the likelihood of a large number of low-lying electronic states, and care must be taken to ensure a basis set provides a balanced description of all these states (as seen in the ANO basis sets where an averaged density matrix was used). The first cc basis sets for transition metals were published for Ti and Fe,68, 69 where the approach chosen was to contract the primitive functions developed by Partridge,70 before optimizing polarization functions. All-electron basis sets for all of the 3d elements were later produced by Balabanov and Peterson,71 with care taken to ensure the sets were flexible enough to describe the 4p orbitals and to average the primitive d functions over (up to) three electronic states. Although only TZ–5Z sets were described in the original publication, DZ sets developed using the same strategy are available,72 and included in packages such as MOLPRO.73, 74 The same publication also detailed cc sets for use in DKH calculations, diffuse augmented bases, and core-valence sets. The latter differ slightly from the cc-pwCVnZ bases for lighter elements as a number of the valence-only exponents were reoptimized in addition to augmenting with a number of extra functions. This change of approach was necessitated by significant overlap between valence and core-valence functions in initial test optimizations.
Relativistic effects can no longer be neglected once the 4d and 5d transition metal elements are reached, hence cc basis sets have been developed that are matched to the small-core Stuttgart–Köln relativistic PPs for those elements and Cu and Zn.75–78 As the reader may have already surmised, aug- and cc-pwCVnZ variants were introduced simultaneously with their parent sets (DZ–5Z), but these investigations also optimized all-electron DKH basis sets at the TZ level for the purposes of quantifying the error introduced by using the PP approximation. These errors were shown to be negligible, except when core-valence calculations were carried out on electronic excited states. Basis sets for fully relativistic calculations (DZ–QZ) are also available for the 4d and 5d elements.79, 80
The heavier alkali and alkaline earth elements are one of the areas of the periodic table that are not covered particularly well by the cc family of basis sets. For calcium, cc-pVnZ and cc-pCVnZ (n = D–5) sets have been published,81 and it has been established that correlation of the 3sp outer core electrons is important for a number of calcium-containing molecules.81, 82 However, fully relativistic sets are available for 4s, 5s, 6s, and 7s elements,83 along with the lanthanides and actinides.84, 85
Basis sets for explicit correlation.
Over the course of the last decade, there has been a great deal of excitement in the theoretical chemistry community surrounding explicitly correlated wavefunctions (for recent reviews, see Refs. 86–89). By incorporating functions depending explicitly on the interelectronic distance into the wavefunction, the convergence of energies and properties toward the basis set limit is greatly accelerated. In many cases, explicitly correlated methods with TZ quality basis sets produce results comparable to 5Z sets in conventional methods.90 Although using “standard” diffuse augmented correlation consistent basis sets (aug-cc-pVnZ) in explicitly correlated calculations produces good results, these methods also offer an attractive proposition from a basis set development point of view. In conventional post-HF calculations, the basis set must be able to describe the HF wavefunction, the correlation hole (associated with the Coulomb cusp), and the longer-range component of the wavefunction. The additional correlation factor in explicitly correlated methods is a two-electron function and thus is able to accurately describe the cusp region, allowing the one-particle basis set to focus on the HF and longer-range components.
The approach of Peterson and coworkers to producing basis sets for use with explicitly correlated methods mirrors the original cc work of Dunning, as correlation consistent groupings were determined by plotting the incremental contribution of individual functions to the total correlation energy at the explicitly correlated second-order Møller–Plesset perturbation theory level (MP2-F12,91, 92 where F12 denotes a nonlinear correlation factor).93 A comparison of the incremental correlation energy contributions at the MP2 and MP2-F12 levels made it obvious that higher angular momentum functions are much less important in explicitly correlated calculations, meaning only DZ–QZ sets were developed. At the time of the original investigation, open-shell explicitly correlated codes were not commonly available, hence the exponents for most elements were determined by molecular calculations that covered a number of bonding environments. The choice of HF and s and p polarization functions was based on the HF functions from the cc-pV(n + 1)Z basis sets along with diffuse augmenting functions from the corresponding aug-cc-pV(n + 1)Z set. At first glance, this leads to a surprising result: the composition of the new cc-pVTZ-F12 basis set for Ne is [6s6p3d2f], while the aug-cc-pVTZ composition is [5s4p3d2f]. However, the reasoning behind this minor increase in basis set size is very sound. Contrary to the situation in conventional post-HF calculations, for F12 methods the basis set error in the total energy is often dominated by the error in the HF energy. The increase in the number of s and p functions thus reduces the HF error with little increase in computational cost.
A comparison of the MP2-F12 correlation energies for Ne, Ar, N_{2}, and P_{2} demonstrated that the new cc-pVnZ-F12 basis sets produce results roughly equivalent to those with an aug-cc-pV(n + 1)Z basis (it is important to note that an F12 method is used in both cases), and benchmark calculations on a larger set of relative total energies indicate similar performance.93 Systematic convergence toward the CBS limit is again a feature of these basis sets and an extrapolation procedure has been reported for use in high-accuracy studies of, for example, thermochemistry and ab initio spectroscopy.94 The F12 basis sets were originally developed for the elements H, He, B–Ne, and Al–Ar and have since been extended to cover Li, Be, Na, and Mg.95 Although the cc-pVnZ-F12 basis sets outperform the equivalent aug-cc-pVnZ sets in most cases, the latter do exhibit a more rapid convergence for the interaction energies of molecules bound by noncovalent interactions.96 This is due to the presence of additional diffuse higher angular momentum functions in the aug- sets; however, for the same reasons, the cc-pVnZ-F12 sets suffer from significantly less basis set superposition error and may be preferred when counterpoise correction calculations are not carried out.96, 97
Correlation consistent basis sets for accurately describing core-core and core-valence correlation in an F12 framework have also been published for the elements Li–Ar.51, 95 Two alternatives were proposed and denoted cc-pCVnZ-F12 and aug-cc-pC_{F12}VnZ, based on adding tight functions to the cc-pVnZ-F12 and aug-cc-pVnZ basis sets, respectively. The number of additional functions added to construct the cc-pCVnZ-F12 sets is very small, especially at the QZ level, where only [1s1p1d] are added (for the first row elements), compared to [3s3p2d1f] in aug-cc-pCVnZ. A subsequent benchmarking of these new sets for the calculation of equilibrium geometries, dissociation energies, and harmonic frequencies of diatomic molecules at the coupled cluster level indicates that when using explicitly correlated methods the core-core and core-valence effects are effectively converged at the QZ level and that the cc-pCVnZ-F12 sets outperform alternative bases. For more pragmatic studies, the core-valence effect is reasonably well converged at the cc-pCVTZ-F12 level, with cc-pCVDZ-F12 producing good results for diatomic molecules containing second row elements.
Polarization Consistent Basis Sets
Most of the above review of the ANO and cc basis sets has been considered from a correlated ab initio perspective, where the scaling of computational effort with respect to basis set size is steep and convergence of the total energy is typically slow (inverse polynomial with respect to the highest angular momentum functions).98–101 However, the DFT (and HF) methods converge significantly faster (exponential with respect to basis set size),102, 103 and hence the basis set requirements are different. It also bears mentioning that DFT correlates all electrons, but most post-HF methods and the basis sets designed for them use a frozen-core approximation. Although the rapid convergence with respect to basis set has led to a proliferation of DFT calculations carried out with the older Pople-style basis sets, for example, 6-31G and extensions,104 it is still desirable to be able to systematically approach the CBS limit. Not only does this enable one to produce presumably more accurate energetics, but it also allows for an inspection of the intrinsic errors of a specific exchange-correlation functional. Basis sets that systematically approach the DFT (Kohn–Sham) and HF limits have been developed by Jensen and coworkers, and are denoted polarization consistent (pc).105, 106
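The contrast between inverse-polynomial and exponential convergence can be made concrete with a small numerical sketch. The Python below is illustrative only: it assumes an inverse-power form E(n) = E_CBS + A n^{-3} in a cardinal number n for correlated energies and a pure exponential form E(n) = E_CBS + A exp(-B n) for HF or DFT energies. Neither functional form, nor the use of a cardinal number as the convergence variable, is prescribed by the discussion above, and the function names are hypothetical.

```python
import math


def extrapolate_inverse_power(e_a, e_b, n_a, n_b, power=3.0):
    """Two-point CBS estimate assuming E(n) = E_CBS + A * n**(-power).

    e_a and e_b are energies obtained with cardinal numbers n_a and n_b.
    The default power of 3 is an assumption made for this sketch.
    """
    w_a, w_b = n_a ** power, n_b ** power
    return (w_a * e_a - w_b * e_b) / (w_a - w_b)


def extrapolate_exponential(e_1, e_2, e_3):
    """Three-point CBS estimate assuming E(n) = E_CBS + A * exp(-B * n).

    The three energies must come from consecutive cardinal numbers, so that
    successive errors shrink by a constant ratio (a geometric sequence).
    """
    second_difference = e_1 - 2.0 * e_2 + e_3
    return e_3 - (e_3 - e_2) ** 2 / second_difference


# Synthetic model energies (not results from any real calculation).
corr = {n: -1.0 + 0.5 / n ** 3 for n in (3, 4)}
scf = [-2.0 + 0.3 * math.exp(-1.5 * n) for n in (2, 3, 4)]

print(extrapolate_inverse_power(corr[3], corr[4], 3, 4))
print(extrapolate_exponential(*scf))
```

With exact model data both functions recover the limiting value built into the model; with real energies the quality of the estimate depends on how well the assumed functional form describes the basis set family in question.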
Determining basis sets for molecular calculations using the HF method leads to a problem as the atomic HF energy does not depend on polarization functions. Jensen's solution to this was to minimize the HF energy for symmetric homonuclear molecules containing the elements H, C, N, O, and F, with atomic s and p functions (s only for H), thus ensuring that only the polarization functions may be biased toward the molecules in the optimization set. In a similar fashion to the cc basis sets, plotting the energy contribution from individual polarization functions for the N_{2} molecule, on top of the (26s17p) atomic functions of Koga et al.,107 produces the pattern displayed in Figure 2. A comparison of Figures 1 and 2 shows that the composition of the first two groupings of polarization functions (those beyond s and p) is the same for both the cc and pc schemes, yet the subsequent pc groupings contain slightly more d functions than the cc groupings. For example, summing the first three pc groupings results in 4d2f1g functions, compared with 3d2f1g functions for the cc groupings. Figure 2 shows energy contributions for N_{2} with internuclear separations of both 2.068 and 2.680 a.u., demonstrating that a change in geometry produces only a minor effect on the energies and does not alter the groupings. Analogous plots for C_{2}, N_{4}, O_{2}, O_{3}, and F_{2} indicate little change in the optimal groupings for each molecule, allowing for a series of consistent groupings. The abbreviation pc-n is then used to designate the basis sets, where n indicates the level of polarization. For example, pc-0 has no polarization functions, pc-1 has one polarization function (1d), and pc-4 has polarization functions with up to four units of angular momentum beyond that needed for the atom (6d3f2g1h). Although the polarization functions are left uncontracted, a number of the s and p functions are generally contracted using the atomic HF orbital coefficients as the contraction coefficients.
Subsequently, the exponents and contraction coefficients were recalculated using the BLYP functional,108, 109 resulting in slightly more diffuse s and p functions, and slightly tighter exponents for the polarization functions.110 The composition of all sets was fixed to that of the earlier HF-based work.
Applications where diffuse functions are important, such as EAs, produced some interesting problems in the development of diffuse-augmented pc basis sets (denoted aug-pc-n).111 Much of this is related to the self-interaction error in DFT (see Refs. 112 and 113, and references therein), which is particularly problematic for anionic systems. In the process of developing these new sets, Jensen examined how the BLYP EA of a number of systems changed as basis set size was increased and concluded that the good agreement between DZ and TZ basis set results and experiment is somewhat fortuitous. Pragmatically, such levels of theory can be thought of as “parameterized models” for the calculation of EAs. Analogous to the aug-cc sets, a single additional diffuse function was added to each angular momentum symmetry occupied in a given basis set, with the s and p exponents determined by a scaled even-tempered extension [a modified version of Eq. (3)], while the diffuse polarization exponents were produced by scaling the resulting diffuse p exponents. In some cases, the use of partially augmented basis sets can be advocated, such as in the calculation of EAs or for the largest pc-4-based sets. Although the DFT and HF methods preclude the need for core-valence basis sets, the pc-n family has been augmented with tight exponents, primarily for use in the calculation of spin-spin coupling constants.114 The resulting sets are denoted pcJ-n, and have been further improved by subsequent recontraction.115
Following the same principles used in the initial investigations, pc-n and aug-pc-n basis sets have been developed for the elements Si–Cl.116 The number of polarization functions remains the same as for C–F, with small changes in the number of s and p functions. Just as for the lighter elements,105, 110, 117 it has been demonstrated that the pc-n family of basis sets for the second row elements can produce accurate atomization energies, equilibrium geometries, and harmonic vibrational frequencies, and can easily be extrapolated to the CBS limit.116 Extensions of the pc-n family to the remaining elements in the first two rows of the periodic table (plus He)118 encountered some small problems for the alkali and alkaline earth elements due to the energetic groupings of the polarization functions being different to those of the p-block elements. Although a number of other basis set families ensure that the s-block elements of a given row have the same number of functions as the p-block, the pc-n sets have a different composition so that the basis sets produce a balanced description of energetic properties across an entire row. The pc-n basis sets have also recently been extended to include the 4s (K, Ca) and 4p (Ga–Kr) elements.119
The benchmarking and calibration of the pc-n and aug-pc-n basis sets has shown that, for the HF and DFT methods, this family of basis sets typically outperforms other sets of a roughly equivalent size for atomization energies, dipole moments, ionization potentials, relative molecular energies, and equilibrium geometries. The pc-3 and pc-4 basis sets are somewhat larger than basis sets commonly used in DFT investigations, but pc-2 produces results that are reasonably well converged and the errors associated with pc-1 can be of the same order as the intrinsic error of functionals such as BLYP.117
Def2 Basis Sets
The naming of the def2 family of basis sets developed by Ahlrichs and coworkers is due to their inclusion in the TURBOMOLE program,120, 121 where a def (for default) prefix was added to some basis sets to prevent confusion with earlier, similarly named sets. These more recent developments have established a family of basis sets that have a similar accuracy for (almost) all elements in the periodic table,122, 123 which are now prefixed def2 and should be used in preference to the def sets. Although they do not systematically approach the CBS limit in the same fashion as the cc or pc-n bases, the def2 family is available in several qualities: def2-SV, def2-TZV, and def2-QZV. There are two different sets of polarization functions available, P and PP (not to be confused with PPs), where the former is intended for HF and DFT calculations, and the latter for post-HF studies. These polarization prescriptions are added as a suffix, for example, def2-QZVPP. It should be noted that P and PP are identical for p-block elements.
The exponents describing the HF wavefunction are energy-optimized at the atomic HF level, along with segmented contraction coefficients. Many of the polarization functions are taken directly from the correlation consistent basis sets for p-block elements (and H, He), with those for other elements determined by a mixture of MP2 calculations and interpolation. For the elements Rb–Rn (excluding lanthanides), small-core relativistic PPs are used.56, 58, 124–127 Compared to the cc basis sets for the heavier p-block elements, the def2-TZVP(P) and def2-QZVP(P) bases contain additional tight f functions that produce significantly better results for complexes such as At_{2}, SbF_{5}, and SbF_{3}.122 Much of the def2 family of basis sets is based on previous work by Ahlrichs and coworkers,128–132 with some redevelopment to ensure small-core ECPs are used consistently for the elements Rb–Rn and that a similar level of performance can be expected for all elements.
The def2 family has recently been extended to the lanthanides,133 based on the segmented bases of Cao and Dolg and using small-core PPs.134, 135 The majority of the primitives were energy-optimized in HF calculations on the lowest 4f^{n + 1}6s^{2} electronic state of a lanthanide, and augmented with HF energy-optimized diffuse d and p functions for a 4f^{n}5d^{1}6s^{2} configuration. Six g functions were also added by fixing their exponents to be the same as the six smallest f functions, producing a (14s13p10d8f6g) set.136 The segmented contraction coefficients were then determined from an ANO-like scheme.135, 136 The def2-TZVP(P) sets recontracted the p functions of Cao and Dolg, added additional diffuse p functions, and reduced the number of g functions. The def2-SV(P) and def2-QZVP(P) sets were then constructed by removing and adding functions, including the addition of a single h function in the latter case. Calibration of the bases on a number of lanthanide complexes indicates that the basis set errors are comparable to those calculated previously for lighter elements,133 in keeping with the overall philosophy of the def2 bases.
Two different schemes for the partial augmentation of the def2 bases with diffuse functions have been presented by Rappoport and Furche,137 and Zheng and coworkers.138 The former (denoted by a suffix D, e.g., def2-TZVPPD) takes a property-optimized approach to ensure the resulting sets are suitable for the calculation of response properties. By including only the minimal number of diffuse functions needed to obtain the desired accuracy in dipole polarizabilities, these bases avoid some of the problems associated with diffuse functions, such as ill-conditioned overlap matrices and reduced effectiveness of integral prescreening.137 The approach of Zheng et al. was to simply add additional s and p exponents equal to ζ_{1}/3, where ζ_{1} is the most diffuse (s or p) exponent in the parent def2 set. These minimally augmented (ma) sets perform very well for the calculation of EAs with DFT methods, although little improvement is observed when evaluating interaction energies for noncovalent bonds with DFT methods and the ma-def2 basis sets.138
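The minimal augmentation rule described above is simple enough to sketch in a few lines of Python. The sketch assumes that ζ_{1} is taken separately for each shell type, so that the new s exponent is one third of the smallest s exponent and the new p exponent is one third of the smallest p exponent. The function name, the dictionary layout, and the exponents are hypothetical and are not taken from any published def2 set.

```python
def minimally_augment(shells, factor=3.0):
    """Return a copy of `shells` with one extra diffuse s and p exponent.

    `shells` maps an angular momentum label ('s', 'p', 'd', ...) to a list of
    primitive exponents. For the s and p shells the added exponent is
    zeta_1 / factor, where zeta_1 is the smallest (most diffuse) exponent of
    that shell type. Other shells are copied unchanged.
    """
    augmented = {label: list(exps) for label, exps in shells.items()}
    for label in ("s", "p"):
        if augmented.get(label):
            zeta_1 = min(augmented[label])
            augmented[label].append(zeta_1 / factor)
    return augmented


# Hypothetical exponents for illustration only.
parent = {
    "s": [1200.0, 180.0, 40.0, 11.0, 3.4, 0.45, 0.13],
    "p": [35.0, 8.0, 2.3, 0.72, 0.21],
    "d": [1.1],
}
ma = minimally_augment(parent)
print(ma["s"][-1], ma["p"][-1])
```

Because only one s and one p primitive are added per atom, the growth in basis set size is small compared with a full set of diffuse functions on every angular momentum.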
Other Orbital Basis Sets
There is a plethora of Gaussian basis sets available, to the point where attempting to provide a broad coverage of all of them would be a monumental task. These bases range from those optimized for a single atom and a single purpose to much more general and broadly applicable sets. A small number of these basis sets are briefly described below, yet no conclusions on quality or purpose should be inferred from the absence or presence of any particular bases.
Pantazis and Neese have developed a series of segmented all-electron relativistic contracted (SARC) basis sets, primarily for investigating properties that depend on core electrons, such as electron paramagnetic resonance, where the use of PPs is not advisable. The small size of these bases ensures they are competitive with PP sets in terms of computational efficiency, even though they require a relativistic Hamiltonian such as DKH or the zeroth-order regular approximation (ZORA).139–141 The SARC primitives are extrapolated based on CASSCF results, before contraction based on scalar relativistic CASSCF calculations. This procedure was carried out for the 5d transition metal elements,142 the lanthanides,143 and the actinides,144 with a relativistic recontraction of the def2 basis sets performed for the elements of the first three rows of the periodic table to ensure consistency with the SARC bases.142 DFT-based ZORA and DKH results using the SARC bases are in good agreement with experimental or high-level ab initio results for ionization potentials, geometries, and bond dissociation energies.
The natural orbital-based segmented contracted Gaussian (NOSeC) basis sets consist of polarization exponents and contraction coefficients determined by minimizing deviation from accurate NOs generated by atomic CISD calculations. The functional used in this minimization is
 (4)
where ψ_{k} is the familiar linear combination of atomic orbitals (contracted basis functions), ξ_{k} are the reference NOs with occupation numbers ν_{k}, and w is a weight function. These polarization functions can be combined with an arbitrary HF basis set and have been reported for most of the periodic table,145–149 including sets for core-valence correlation,150 and relativistic effects using a DKH Hamiltonian.151–155 The NOSeC bases typically recover more than 99% of the correlation energy calculated using the parent NOs, but are significantly more computationally efficient. Where necessary, such as the first row transition metal elements, the reference NOs were averaged over different electronic configurations, and can be combined with PPs for the heavier elements. For elements with a single valence electron, such as Li, a common approach is to optimize exponents for polarization functions using the homonuclear dimer.41 However, the approach taken for the NOSeC bases uses the virtual K orbital method of Feller and Davidson to produce reference orbitals rather than CISD NOs. These orbitals are produced by diagonalizing the G operator of Eq. (5) over the virtual space:
 (5)
where K_{j} is the exchange integral operator for the occupied orbital j, α is based on the virtual orbital occupation of an important orbital, and F is the Fock operator.156 The resulting correlating orbitals can be loosely compared to a molecular situation where an additional electron is provided by an adjacent atom.149 Recently, the Sapporo-nZP (n = D, T, Q) basis sets have been developed using a similar approach of fitting to reference NOs.157, 158 Unlike the NOSeC bases, the Sapporo-nZP sets specify an underlying HF set, and core correlating functions are available for all elements (with the obvious exceptions of H and He). For the heavier elements, relativistic effects are considered through a DKH Hamiltonian.
Many of the basis sets for use with DKH Hamiltonians are simply the same exponents taken from the nonrelativistic sets recontracted with a relativistic treatment. In contrast, Hirao and coworkers have developed Gaussian basis sets specifically for use in third-order DKH calculations.159, 160 The exponents were optimized by minimizing atomic HF energies (with a DKH Hamiltonian), with functions added until the incremental energy change was less than 1 mE_{h} in each angular momentum symmetry. This approach ensures equal quality for all elements, at the expense of requiring a large number of primitives for heavier elements. To enable their use in molecular calculations, the primitives were generally contracted, and augmented with polarization functions and a number of even-tempered diffuse functions.
Auxiliary Basis Sets
The evaluation of electron repulsion integrals is often a bottleneck in both DFT and ab initio calculations. Focusing momentarily on pure DFT, that is, without exact exchange, the Coulomb (J) term can be separated from the exchange-correlation term as:
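$$J = \frac{1}{2} \iint \rho(\mathbf{r}_1) \, \frac{1}{r_{12}} \, \rho(\mathbf{r}_2) \, d\mathbf{r}_1 \, d\mathbf{r}_2 \qquad (6)$$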
where ρ(r) is the molecular electron density. By expanding these molecular electron densities in an atom-centered auxiliary basis set (ABS), it is possible to bypass the evaluation of four-center two-electron integrals and hence reduce the computational effort of pure DFT calculations by up to an order of magnitude with negligible loss of accuracy. Due to the expressions resembling a resolution of the identity (RI), this method is often referred to as RI-J. The expansion of the electron density takes the form:
$$\rho(\mathbf{r}) \approx \sum_{\alpha} c_{\alpha} \, \alpha(\mathbf{r}) \qquad (7)$$
where α represents a function in the ABS and c_{α} is an expansion coefficient.161–168 The approximation used means that the ABS in Eq. (7) depends on the chosen orbital basis set (abbreviated OBS herein to reduce confusion between auxiliary and orbital bases), and the expansion coefficients are determined by minimizing
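$$\left( \rho - \rho_{RIJ} \,\middle|\, \rho - \rho_{RIJ} \right) = \iint \left[ \rho(\mathbf{r}_1) - \rho_{RIJ}(\mathbf{r}_1) \right] \frac{1}{r_{12}} \left[ \rho(\mathbf{r}_2) - \rho_{RIJ}(\mathbf{r}_2) \right] d\mathbf{r}_1 \, d\mathbf{r}_2 \qquad (8)$$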
where ρ_{RIJ} denotes the density approximated in Eq. (7).165, 168 This leads to a set of linear equations expressed as:
$$\sum_{\beta} (\alpha|\beta) \, c_{\beta} = \sum_{\nu\mu} (\alpha|\nu\mu) \, D_{\nu\mu} \qquad (9)$$
where ν and μ are functions in the OBS and D denotes the density matrix. It can be shown that Eq. (8) is equivalent to
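$$(\nu\mu|\kappa\lambda) \approx \sum_{\alpha\beta} (\nu\mu|\alpha) \, (\alpha|\beta)^{-1} \, (\beta|\kappa\lambda) \qquad (10)$$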
which appears similar to the insertion of a RI.168 This least-squares fitting and choice of Eq. (8) to specify the metric is not the only route to approximating J,167, 169–171 but has proven to be the most popular. Two methods of determining the ABS exponents specifically matched to an OBS have thus emerged: explicit optimization of ABS exponents,168 and algorithms for the automatic generation of ABSs.172, 173 Although the latter holds the advantage that no ABS development is required to fit new OBSs, the ABSs resulting from explicit optimization typically contain fewer functions for a given level of accuracy and are thus computationally more efficient.
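The fitting procedure can be illustrated with a toy model in pure Python. The sketch below is not an implementation from any program package: it assumes unnormalized s-type Gaussians exp(−p r^{2}) that all share one centre, for which the Coulomb repulsion between two such functions has the closed form 2π^{5/2}/(pq√(p+q)), and it uses a hypothetical two-function OBS with a hypothetical density matrix. The coefficients are obtained by solving the linear equations Σ_β (α|β) c_β = Σ_{νμ} (α|νμ) D_{νμ}, and the fitted Coulomb energy is compared with the exact value.

```python
import math


def coulomb_ss(p, q):
    """Coulomb repulsion between exp(-p r^2) and exp(-q r^2) on one centre."""
    return 2.0 * math.pi ** 2.5 / (p * q * math.sqrt(p + q))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_density(obs, dens, abs_exps):
    """Return (c, J_fit) for a density fitted in the Coulomb metric.

    obs: OBS exponents; dens: density matrix; abs_exps: ABS exponents.
    The product of two s Gaussians on one centre is a single s Gaussian
    whose exponent is the sum of the two exponents.
    """
    idx = range(len(obs))
    v = [[coulomb_ss(a, b) for b in abs_exps] for a in abs_exps]
    gamma = [sum(dens[m][n] * coulomb_ss(a, obs[m] + obs[n])
                 for m in idx for n in idx) for a in abs_exps]
    c = solve(v, gamma)
    # (rho_fit | rho_fit) = c^T V c = c^T gamma once V c = gamma holds.
    j_fit = 0.5 * sum(ci * gi for ci, gi in zip(c, gamma))
    return c, j_fit


def exact_coulomb(obs, dens):
    """Coulomb energy from the four-index integrals, without fitting."""
    idx = range(len(obs))
    return 0.5 * sum(dens[m][n] * dens[k][l]
                     * coulomb_ss(obs[m] + obs[n], obs[k] + obs[l])
                     for m in idx for n in idx for k in idx for l in idx)


# Hypothetical OBS exponents and density matrix, for illustration only.
obs = [0.5, 2.0]
dens = [[1.0, 0.3], [0.3, 0.5]]
j_exact = exact_coulomb(obs, dens)
_, j_small = fit_density(obs, dens, [1.0, 4.0])       # incomplete ABS
_, j_full = fit_density(obs, dens, [1.0, 2.5, 4.0])   # spans all products
print(j_exact, j_exact - j_small, j_exact - j_full)
```

In this model the fitting error J − J_fit equals half of the residual in the Coulomb metric, so it cannot be negative, and it vanishes (to rounding) when the ABS spans every orbital product.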
The first systematic investigation into optimized ABSs was carried out by Ahlrichs and coworkers and produced bases specifically matched to the def-SVP OBS.168 The resulting ABSs were optimized such that the difference between the fitted and conventional densities (at the HF level to prevent any bias towards a particular functional) was effectively minimized, and the functions were contracted using a segmented pattern; this work was later complemented by ABSs matched to def-TZVP.174 Through fitting to def2-QZVPP OBSs and keeping the number of functions as small as possible, Weigend175 superseded this previous work by developing a single RI-J ABS (for the elements H–Rn) that is suitable for use with any of the def2 family of OBSs, showing negligible errors for atomization energies, dipole moments, geometries, and vibrational frequencies.
The reduction in the computational cost of evaluating the Coulomb integrals has promoted the wide adoption of RI-J fitting when pure DFT functionals are used. However, in hybrid DFT methods, where some amount of exact HF exchange is included, evaluating the exchange scales as ON^{2}M, where N and M are the number of functions in the OBS and ABS, respectively, and O is the number of occupied orbitals. This means that for systems with a large number of electrons, such as medium-sized transition metal complexes, the fitting has no advantage over conventional integral evaluation algorithms. Some methods that alleviate the exchange fitting problem have been proposed, and will be discussed later.
RI-like approximations have also been used to improve the efficiency of methods such as HF,176–179 MP2,180 coupled cluster with singles and doubles (CCSD),181 and the approximate coupled cluster singles-and-doubles model (CC2).182 Herein, the approximation will be referred to as density fitting (DF) for the reasons detailed in Ref. 183, although several groups and program packages still use an RI prefix. It is important to note that the density refers to products of basis functions rather than the electronic density, and the DF moniker also eases potential confusion when discussing ABSs in the context of explicitly correlated methods. In addition to the Coulomb term, applying the DF approximation to HF theory requires fitting of the exchange part of the Fock matrix,178 and, as mentioned above in the context of hybrid DFT, the efficiency of the resulting algorithm is related to the ratio between the number of functions in the basis sets and the number of occupied orbitals. This indicates that DF-HF is best applied to calculations using relatively large basis sets on small to medium-sized molecules and, in practice, is most often used in conjunction with DF-MP2. Compared to RI-J, where only the total electron density is approximated, DF-HF requires the fitting of products of individual basis functions and hence a larger number of auxiliary functions is required. The additional flexibility this affords means that ABSs optimized for exchange fitting are also appropriate for DF of the Coulomb part.184 ABSs for use in DF-HF are thus often termed “JKfit” and have been matched to def-TZVPP, cc-pVTZ, cc-pVQZ, and cc-pV5Z for the elements H, B–F, Al–Cl, and Ga–Br.178 A series of “universal” JKfit sets for the elements H–Rn have also been published and although originally optimized for the def2 family of OBSs,133, 184 they have also seen significant use when matched to the cc bases.
DF-HF can be further simplified by using the Poisson method, and even-tempered ABSs for this purpose have been published for the elements H and C–F.179
The overall goal of DF-MP2 is to ensure that the difference between the conventional MP2 correlation energy and that from DF-MP2 is as small as possible. However, minimizing such a quantity does not lead to optimal ABSs and in practice the optimization proceeds by minimizing the quantity
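$$\delta_{DF} = \frac{1}{4} \sum_{iajb} \frac{\left( \langle ab \| ij \rangle_{DF} - \langle ab \| ij \rangle \right)^{2}}{\epsilon_{a} - \epsilon_{i} + \epsilon_{b} - \epsilon_{j}} \qquad (11)$$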
where 〈ab‖ij〉 = (ai|bj) − (aj|bi), with i, j denoting occupied orbitals, a, b virtual orbitals, and ϵ_{x} the HF orbital energies.131 Unlike the energy difference, δ_{DF} will always be positive and the fact that it only depends on the ABS functions (for a given OBS) makes it desirable from an optimization viewpoint. It has been demonstrated that accurate DF at the MP2 level requires the ABS to include functions with an angular momentum of at least ℓ_{occ} + ℓ_{OBS}, where ℓ_{occ} is the highest angular momentum occupied in the HF wavefunction and ℓ_{OBS} the highest angular momentum present in the OBS. This can result in the case where ℓ_{ABS} is greater than what is supported in some quantum chemical codes, despite ℓ_{occ} being within the allowed range. Unsurprisingly, simply truncating the ABS to the maximum supported ℓ results in large fitting errors,185 but there is some evidence that this error is somewhat constant and cancels for relative energies.186
The development of ABSs for DF-MP2 was greatly streamlined by the introduction of a driver program that uses analytic gradients for the optimization of ABSs within the RI-CC2 module of the TURBOMOLE package.121, 187 This means that MP2 fitting ABSs (often referred to as “MP2fit”) have been produced that are specifically matched to correlation consistent and def2 OBSs for most of the periodic table,95, 185, 187–193 with the advent of explicitly correlated and local electron correlation methods (that use DF) providing a strong driving force for this development. The resulting ABSs typically contain between two and four times as many functions as the OBS they are matched to, and the fitting errors are two to four orders of magnitude smaller than the incompleteness errors in the OBS. To ensure a balanced description of the different oxidation states of transition metal elements, a number of ABS exponents are optimized for the cations corresponding to the highest oxidation state commonly found in complexes, along with some exponents optimized for intermediate oxidation states.187 Calibration of the resulting ABS on small to medium-sized complexes with a variety of oxidation states indicates that negligible fitting errors are produced.185, 187, 190–192
An alternative to DF using preoptimized ABSs is provided by methods based on Cholesky decomposition (CD) techniques, and although such methods have been the subject of recent reviews,194, 195 they are briefly recapped below. A symmetric positive definite matrix V can be decomposed into the product of a lower triangular matrix, L, and its adjoint:
$$\mathbf{V} = \mathbf{L} \mathbf{L}^{\dagger} \qquad (12)$$
and when CDs are applied in a quantum chemical context, the matrix V is usually taken to be that of the two-electron integrals, V_{ij,kl} = (ij|kl).196 This allows the integrals to be expressed as
$$(ij|kl) \approx \sum_{J} L_{ij}^{J} L_{kl}^{J} \qquad (13)$$
and in a similar fashion to Eq. (11), the error in the integral representation can be given as
 (14)
Combining Eqs. (13) and (14) defines a residual matrix such that
 (15)
and this suggests the accuracy of the CD approximation can be controlled as
 (16)
δ is then known as the decomposition threshold, with all integrals being reproduced to an accuracy of at least δ via a recursive procedure. Beebe and Linderberg demonstrated that CD of two-electron integrals can also be thought of as the determination of a linearly independent set of atomic orbital product functions,196 which are commonly referred to as the Cholesky basis, and it is very important to note that using a Cholesky basis as an ABS makes CD and DF formally equivalent.
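The recursive, threshold-controlled procedure can be sketched as a pivoted incomplete Cholesky decomposition. The pure Python below is a generic textbook variant written for this illustration, not the algorithm of any particular quantum chemistry package. The test matrix is a hypothetical rank-2 model rather than a matrix of real two-electron integrals, and the function names are invented here.

```python
def pivoted_cholesky(v, delta):
    """Incomplete Cholesky decomposition of a symmetric positive
    semidefinite matrix `v` (list of lists).

    Cholesky vectors are generated one at a time, always pivoting on the
    largest remaining diagonal of the residual matrix, and the procedure
    stops once that diagonal is no larger than the threshold `delta`.
    """
    n = len(v)
    diag = [v[i][i] for i in range(n)]  # diagonal of the residual matrix
    vectors = []
    while len(vectors) < n:
        p = max(range(n), key=lambda i: diag[i])
        if diag[p] <= delta:
            break
        root = diag[p] ** 0.5
        # Column p of the residual matrix, scaled by 1/sqrt of its pivot.
        col = [(v[i][p] - sum(l[i] * l[p] for l in vectors)) / root
               for i in range(n)]
        vectors.append(col)
        for i in range(n):
            diag[i] -= col[i] ** 2
    return vectors


def max_abs_error(v, vectors):
    """Largest absolute element of the residual V - sum_J L^J (L^J)^T."""
    n = len(v)
    return max(abs(v[i][j] - sum(l[i] * l[j] for l in vectors))
               for i in range(n) for j in range(n))


# Hypothetical rank-2 model matrix: V_ij = x_i x_j + 1 / (x_i x_j).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
v = [[xi * xj + 1.0 / (xi * xj) for xj in x] for xi in x]
vectors = pivoted_cholesky(v, delta=1e-8)
print(len(vectors), max_abs_error(v, vectors))
```

Because the residual matrix stays positive semidefinite, each off-diagonal residual is bounded by the geometric mean of the two corresponding diagonal residuals, so checking only the diagonal is enough to control the error in every element. The number of vectors retained plays the role of the size of the Cholesky basis.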
Early CD implementations were significantly more expensive than DF as they still required four-center two-electron integrals to be evaluated, and the Cholesky basis had a strong dependence on molecular geometry. The one-center CD (1C-CD) approach improved computational efficiency by restricting the product functions within the Cholesky basis to be located on a single center, but retained the strong dependence on geometry.173 A significant advance in the application of CD to quantum chemical methods was the introduction of atomic CD (aCD), where the Cholesky basis is produced on-the-fly from CD of the atomic two-electron integral matrix for a given OBS, along with the addition of missing angular components.173 A hierarchy of aCD ABSs was generated by varying the threshold δ = 10^{−n}, with the ABSs then denoted aCD-n*. With δ = 10^{−4},173, 197 these ABSs produce accurate results for a number of methods, including DFT (pure and hybrid), HF, and MP2, but the number of functions in the set is often too large to make aCD truly competitive with preoptimized DF sets.198
To reduce the total number of functions, the atomic compact CD (acCD) procedure performs a second CD on the aCD basis, before the remaining functions are fit to produce an accurate representation of that parent aCD set.199 For a large iron-containing complex, this led to a 70% reduction in the number of ABS primitives, but with no loss of accuracy when compared to aCD. Although it is possible to generate acCD ABSs and store them for later use in a basis set library, in practice this is not carried out as the computational cost for on-the-fly generation is negligible. In addition to removing the need for preoptimization of ABSs, general applicability to a number of quantum chemical methods is a major strength of the CD approach, with recent applications to CASSCF and CASPT2 methods.200, 201 The number of functions in an acCD ABS is still considerably larger than in preoptimized ABSs, with a study on a test set of noncovalent interactions using TZ OBSs indicating that acCD-4 is around four times larger than the equivalent MP2fit ABS, although, perhaps reassuringly given the significantly larger number of functions, smaller errors relative to conventional methods were produced.202
As mentioned above, fitting of the exchange integrals proves problematic when the number of occupied orbitals or size of the OBS becomes large, limiting the size of problem that may be tackled with DF-HF or DF of hybrid DFT methods. Two different solutions to this problem have emerged, both based on locality. The local DF-HF approach of Polly et al.179, 183 localizes orbitals and then restricts the fitting basis for any occupied orbital to a spatially close “domain,” which is asymptotically independent of system size and leads to linear scaling for evaluating the exchange matrix. The local K (or LK) method of Aquilante et al.203 also uses localized orbitals, but uses CD techniques and integral screening to achieve a similar result without variable domain definitions.
A large number of explicitly correlated methods, such as MP2-F12 and approximate CCSD(T)-F12,204–207 use DF in the construction of the Fock and exchange matrices (using JKfit sets), and in the evaluation of the remaining two-electron integrals (using MP2fit sets).208 However, the three- and four-electron integrals necessary to explicitly include interelectronic distances in the wavefunction can also be tackled in a superficially similar manner. Inserting an RI approximation allows these multielectron integrals to be simplified into products of no more than two-electron integrals,209–211 with a separate RI ABS used to allow smaller OBSs to be used.212 The complementary ABS (CABS) approach involves constructing the union of the OBS and ABS, and the reduction in RI error (due to using an incomplete RI basis) from this approach means it has become something of a standard in production-level F12 codes.213 A secondary benefit of the CABS approach is that the error in the HF energy can be significantly reduced by including single excitations into the complementary auxiliary orbital space.205
Peterson and coworkers have developed a series of ABSs specifically for use in the RI approximation in explicitly correlated methods, which are denoted OptRI. These sets are specifically matched to OBSs and are optimized using the functional
 (17)
where V and B contain the many-electron integrals approximated by the RI,91 and RI_{ref} is a very large reference set of functions.214 The resulting OptRI ABSs are compact and produce negligible RI errors compared to the basis set incompleteness error of the underlying OBS, both for correlation and relative energies. Although originally published to match the cc-pVnZ-F12 bases, the same methodology has been used to produce OptRI sets specifically matched to aug-cc-pVnZ bases,215 to cc-pCVnZ-F12 bases,51 for the light alkali and alkaline earth elements,95 and for the group 11 and 12 transition metal elements.186 It has been observed that while these OptRI basis sets are not optimal for producing the largest CABS singles energy corrections, if relative energies are considered the OptRI sets in conjunction with CABS singles corrections are sufficient to practically eliminate the HF basis set error.216