Advances in vibrational configuration interaction theory ‐ part 1: Efficient calculation of vibrational angular momentum terms

Finite basis vibrational configuration interaction theory (VCI) is a highly accurate method for the variational calculation of state energies and related properties, but suffers from fast growing computational costs in dependence of the size of the correlation space. In this series of papers, concepts and techniques will be presented, which diminish the computational demands and thus broaden the applicability of this method to larger molecules or more complex situations. This first part focuses on a highly efficient implementation of the vibrational angular momentum (VAM) terms as occurring in the Watson Hamiltonian and the prediagonalization of initial subspaces within an iterative configuration selective VCI implementation. Working equations and benchmark calculations are provided, the latter demonstrating the increased performance of the new algorithm.


| INTRODUCTION
The variational calculation of accurate vibrational spectra is a demanding task and many routes have been devised to reduce the computational effort of such calculations. [1][2][3][4][5][6][7][8][9][10][11][12][13] It is the size of the correlation space, which constitutes the primary bottleneck and which is combated by the concept of configuration selection. [2][3][4][5] Different variants of configuration selective vibrational configuration interaction theory, cs-VCI, have been developed, but common to all is a criterion, which allows to decide about the importance of an individual Hartree product. The evaluation of this criterion, which usually involves the determination of many integrals, may again be computationally demanding, in particular for molecules with more than a couple of atoms and in the case that high-order terms within the expansion of the potential energy surface, PES, have been included, for example, 4-mode coupling terms. In order to reduce the overall CPU time of a cs-VCI calculation, the different individual steps of such a calculation have to be optimized. For example, an iterative cs-VCI calculation based on a criterion related to 2nd order vibrational Møller-Plesset perturbation theory, VMP2, or 2x2 VCI matrices relies on three major steps: (1) the evaluation of the selection criterion, has been done with respect to an acceleration of the other two aspects-besides the contraction of integrals 15 -which leads to substantial savings for potential energy surfaces with high-order contributions.
In this series of articles, we present new technical aspects of cs-VCI theory, which lead to significant speed-ups and thus allow for the application of cs-VCI theory to even larger molecular systems. Those concepts, which act on the selection criterion, in principle may have impact on the final results, while all others do not.
However, in all cases the observed deviations are less than one wavenumber and are thus negligible for most applications and much lower than the intrinsic error bar of the electronic structure method used for spanning the PES. This first paper focuses on the vibrational angular momentum (VAM) terms, whose calculation within building the VCI matrix may require significant computational effort, especially for large systems. Unrolled analytical expressions are presented, which constitute the basis for a fast and efficient evaluation of these terms. For a constant μ-tensor within the Watson Hamiltonian this ends up in substantially reduced computational cost. Subsequently, an alternative set-up of the initial configuration space within the iterative configuration selection and diagonalization of the resulting subspace will be discussed. This improves the convergence behavior of the iterations and leads to an overall stabilization of the algorithm. We will discuss these two aspects independently from each other and thus two subsections being split into a theory part and subsequent benchmarks will be presented below. Both algorithms presented in the following have been implemented into the MOLPRO package of ab initio programs. 16

| VIBRATIONAL ANGULAR MOMENTUM TERMS
In the following, general aspects regarding an implementation of the VAM terms will be outlined, being necessary for an understanding of what follows. The Hamiltonian, which will be employed within this work, is the commonly used Watson Hamiltonian for non-rotating molecules J ¼ 0 ð Þ , 17 which is given as with M being the number of modes of the system. The first term describes the VAM terms, the third one denotes the kinetic energy contribution. The second term, the Watson correction term, will be added as a mass-dependent pseudopotential to the potential energy surface (PES), V, for which we use an n-mode expansion. 18,19 Mainly due to the large computational cost and due to their intricate implementation, the VAM terms are neglected in some works, 20 (1), the VAM operator π α is given as with rectilinear normal coordinates q i and the Coriolis-coupling coefficients ζ α lk fulfilling ζ α lk ¼ Àζ α kl and ζ α kk ¼ 0, α, β x,y, z f g. The ζ α lk are given as cross products of the unitary matrix of displacement vectors L transforming massweighted displacement coordinates into normal coordinates. The μ-tensor is connected with the moment of inertia tensor I of the molecule via with In analogy to the expansion of the potential energy surfaces, an n-mode expansion for the μ-tensor surface 25 will be used, which is given as The matrix element of two Hartree products (configurations) j Φ I i and j Φ J i and the first term of Equation (1) is thus given as Equation (7)  to the modes r,s, t, u, … in Equation (7), which leads to the use of conditional statements within the implementation.
In previous work 25 the calculation has been speeded up essentially by using a prescreening technique, which only includes contributing index combinations depending on the size of the prefactor μ 0 αβ ζ α rs ζ β tu , since many of these are very small or do vanish by theory. Of course, this procedure depends on a threshold, but usually one gains at least a factor of about 10 in speed. For further information about this implementation of the VAM terms, we refer to Reference [25], where also details about the calculation of the contributing integrals can be found. Instead of skipping elements by their prefactor, in our new implementation presented here we will focus on the integrals and the structure of Equation (7) in order to remove or at least reduce the dependency on the number of modes. Simplification of Equation (7) by rigorously using symmetry properties and excluding vanishing parts from the summation in advance leads to a substantially smaller computational overhead.

| COMPUTATIONAL DETAILS AND BENCHMARK SYSTEMS
In order to prove the reduction of computational effort achieved by the techniques described in the successive sections, we performed extensive benchmark calculations. All calculations presented below have been performed in a fashion close to real applications aiming at highly accurate results, that is, high quality potential energy surfaces (PES) and sufficiently large configuration spaces have been used.
Diborane (B 2 H 6 , D 2h ), fluoroethane (C 2 H 5 F, C s ) and allene (C 3 H 4 , D 2d ) have been chosen as benchmark systems as these systems display different difficulties within the calculations. For example, the CHstretchings of fluoroethane show strong resonance/multi-mode character which, inter alia, lead to large configuration spaces. This selection also takes different point groups into account, that is, Abelian versus non-Abelian point groups are considered, which behave differently with respect to the handling of a real-based configuration space (see Reference [26]). This, in addition, allows to consider different situations regarding the exploitation of symmetry properties. B 2 H 6 , for example, allows for an extensive use of symmetry since it belongs to the point group D 2h , whereas for C 2 H 5 F there is just one symmetry element, which is reflected in larger CPU times.
As mentioned above, in all calculations presented in this work, we used high quality potential energy surfaces spanned by rectilinear normal coordinates, which have been obtained from explicitly correlated coupled-cluster calculations with a basis set of triple-ζ quality, that is, CCSD(T)-F12a/cc-pVTZ-F12. The respective equilibrium geometries have been used as an expansion point for the PES, for which an nmode expansion, 18,19 being truncated after 4-mode coupling terms, has been employed. In order to restrict the computational effort, a multi-level scheme [27][28][29] has been utilized. For further details regarding the calculation of the potential energy surfaces, we refer to the original literature, that is, Reference [30] (B 2 H 6 ), Reference [31] (C 3 H 4 ) and Reference [32] (C 2 H 5 F). A highly efficient Kronecker product fitting procedure 33 has been employed in order to obtain a polynomial representation of the PES. By solving the vibrational self-consistent field (VSCF) equations in a basis of 20 distributed (mode-dependent) Gauss functions, [34][35][36] we obtain real one-mode wavefunctions (called VSCF modals in the following). These VSCF modals are used to form Hartree products (configurations) that are employed as basis functions to span the correlation space for the VCI calculations. The n i th modal for coordinate q i will be denoted φ n i i in the following, and the respective configurations by j Φ I i ¼j Q i φ ni i i. As described above the μ-tensor is expressed in an n-mode expansion as given in Equation (6). In all calculations, the VAM terms are considered (at least) until zeroth order.
For the subsequent VCI calculations, an iterative procedure has been used (Note: For additional and detailed information concerning our VCI algorithm, we refer to References [2,25,26].). We initially generate a configuration space restricted by (a) the number of modes being excited, (b) the number of excitations for a single mode, and (c) the total number of quanta within the configuration. Subsequently, the initial configuration space is reduced by scanning it via a VMP2-based configuration selection criterion 2,37 with the VCI wavefunction of state A in the ath iteration step and c a ð Þ AK are the respective coefficients. Within this configuration selection process, the correlation space is iteratively increased by employing Equation (8) until convergence of the state energy is reached. Using the algorithm described in Reference [26], the eigenstate having the largest overlap integral with the multidimensional harmonic wavefunction is chosen to be the reference state in Equation (9). In order to determine the intermediate VCI wavefunctions within these iterations, we use an iterative eigenvalue solver based on the RACE-algorithm, 14 which has been shown to outperform the commonly used Jacobi-Davidson algorithm. In order to guarantee that the state of interest is tracked properly during the iterations of configuration selection, physically meaningful startvectors are needed. 26

| CALCULATION OF VIBRATIONAL ANGULAR MOMENTUM TERMS
As mentioned in Section 2, within the calculation of VCI matrix elements the evaluation of the contribution of the VAM terms may be a computational bottleneck. Although a brute force implementation can be very much improved by using prescreening techniques, 25 the additional computational time for including VAM contributions is still significant. Therefore, we present here an even more efficient ansatz based on analytical considerations. The underlying equations are modified in a way that most of the numerous elements resulting in vanishing contributions are excluded from the onset. Additionally, reordering of the terms and rigorous use of symmetry properties of the integrals lead to very compact equations which can be implemented very efficiently. We will only discuss VAM terms of zeroth and first order, respectively, which have been found to be sufficient for most applications. 24,25 Nevertheless, the basic ideas can be easily transferred to higher order contributions.
Major simplifications of Equation (7) can be achieved by the distinction of cases regarding the involved basis functions (modals) By doing so, it becomes possible to exploit symmetry properties, to skip vanishing integrals in advance and to unite terms. Let be the set of mode labels, whose corresponding modes differ by the quantum numbers n I k and n J k of the respective modals.
In all what follows we request the modals, φ, to be orthogonal. In favor of a shorter notation, we will drop the configurations j Φ I i, j Φ J i for the set M m as long as the context is obvious. Due to the orthogonality of the modals and the separation of the respective multidimensional integrals into products of one-dimensional integrals it is possible to simplify expression (7) in dependence on m. Obviously, if m is larger than the order of the operator, the corresponding expectation value must vanish. Consequently, for a constant μ-tensor there is a maximum m of 4, while for its 1Dterms m ¼ 5 and so on.
In the following, independently of m and the order of the μ-tensor, the properties of the one-dimensional integrals shown in Equations (10) and (11) will be exploited in all cases. For the left and with φ being real valued and normalized functions.

| Vibrational angular momentum terms of zeroth order
Depending on the summation indices r, s, t, u referring to the modes, one has to distinguish 10 different cases regarding the operators (or integrals, respectively) involved. These are q r ∂ q s q t ∂ q u , q r ∂ q s q t ∂ q t , q r ∂ q r ∂ q s q s , q r ∂ q r q t ∂ q t , and q 2 r ∂ 2 q s D E . As mentioned above, we have dropped the configurations in the integrals, that is, Since ζ α ii ¼ 0, there is no contribution of integrals of types q r ∂ q r q t ∂ q u , q r ∂ q s q t ∂ q t and q r ∂ q r q t ∂ q t in Equation (7a). Giving support to a better readability of the following equations, we introduce the abbreviations where ε ijk is the Levi-Civita symbol with arbitrary mode indices i, j, k.
In the following, we present the working equations for the VAM terms of zeroth order for a molecule in arbitrary orientation. Note, that in the case that the system is oriented along its central principal axis of inertia, the summation P αβ in the following equations can be simplified to P α,α¼β , that is, only products including diagonal elements of μ 0 αβ have to be taken into account. Let Due to the orthogonality of the modals, all integrals with less than four different indices vanish and therefore do not contribute. In addition, the only integral in Equation (7a) having four different mode indices, that is, q r ∂ q s q t ∂ q u , is different from zero only in the case r, s, t, u f g M 4 .
Thus, the only combinations of indices contributing are permutations of M 4 , that is, 24 non-zero elements remain. Moreover, the integral q r ∂ q s q t ∂ q u itself is symmetric regarding commuting r and t and s and u, respectively, that is, q r ∂ q s q t ∂ q u ¼ q t ∂ q u q r ∂ q s . Since (a) μ 0 αβ is independent of the indices concerning the modes and (b) it is μ 0 αβ ¼ μ 0 βα and (c) the summation runs over all α, β, there are pairs of identical summands including the prefactor ζ α rs ζ β tu . Summarized, using the abbreviation (12), one obtains Since all the A À are independent of α and β, they need to be calculated only once per matrix element. Consequently, the computational effort for a zeroth order VAM contribution with m ¼ 4 is very low and additionally independent of the number of modes present in the system. Note, that this result refers to the evaluation of the VAM contribution for a single matrix element.
Due to the orthogonality of modals, integrals with less than three different mode indices do not contribute, that is, q r ∂ q r ∂ q s q s ¼ 0 and q 2 r ∂ 2 with four different indices. The term q r ∂ q s q t ∂ q u is symmetrical concerning an interchange of s and u and/or r and t. The integral will not vanish due to Equation (10a), as long as s, u f g i 1 , i 2 , i 3 f g . Thus, there are six possible combinations, each two of them having the same prefactor. Consequently, the contribution of the integral q r ∂ q s q t ∂ q u is given by There is also a number of integrals having three different mode indices. For each of them, only elements with r, s,t that is, all permutations of M 3 , yield non-zero terms. Additionally, by using Equation (11) and taking advantage of the antisymmetry of the ζ-constants, one obtains, after summation over all indices, identical contributions for the integrals q r ∂ q r ∂ q s q t and q r ∂ q s q s ∂ q u . Considering all aspects mentioned, a matrix element with m ¼ 3 is given as as solely the integrals q r ∂ q r ∂ q i 1 q t D E and q r ∂ q s q s ∂ q i 1 D E contribute due to Equation (10a). Considering Equations (11) and (10b) (Equation (10c), respectively) and using the symmetry of μ 0 αβ , summation over α and β yields identical terms with different sign. Additionally, the contribution of the integrals q r ∂ q r ∂ q s q s add up to zero, that is, P αβ P rs μ 0 αβ ζ α rs ζ β sr q r ∂ q r ∂ q s q s ¼ 0, since the use of Equation (11) leads to two identical parts with reversed sign canceling each other.
Finally, one ends up with Equation (22) Using Equation (10a), one obtains q r ∂ q s q t ∂ q u ¼ 0, q 2 r ∂ q s ∂ q u ¼ 0, q r ∂ q r ∂ q s q t ¼ 0, and q r ∂ q s q s ∂ q u ¼ 0. Therefore, only parts containing the integrals q r ∂ 2 q s q t D E , q r ∂ q r ∂ q s q s , and q 2 r ∂ 2 q s D E must be considered in Equation (7a). Consequently, considering ζ α il ¼ Àζ α li and Equations (10b) and (10c) one obtains Using Equation (23)

| Results
In order to determine the computational savings generated by the respectively. For C 3 H 4 , also the first overtones below 2500 cm À1 have been calculated. One set of calculations has been performed by using our former prescreening algorithm as described in Reference [25], the other one with the new ansatz described in Section 4. Timings and savings arising from the new algorithm are summarized in Table 1. Figure 1 shows the CPU times for building the VCI matrix using the equation shown above (light blue bars) with respect to our former implementation (red bars). In order to visualize the order of the effect, we also ran calculations using a brute force implementation of Equation (7) T A B L E 1 CPU times in seconds for evaluating the VCI matrix depending on the operator calculated with (i) a former implementation based on prescreening 25 (abbrv. As "Prescr.") and (ii) with the implementation based on the equations given in Section 4 ("This work"), the computational savings are given in percent ("Sav.")  25 and this work based on Equations (18), (20), (22), and (23) (0D) and the equations shown in the Appendix (1D)) regarding CPU times for evaluating the VCI matrix within the calculation depending on the operator included. The times shown refer to the calculation of the fundamentals and first overtones below 2500 cm À1 , that is, 19 states are included. Relative CPU times with respect to a calculation excluding the VAM contributions are given in blue, the second y-axis depicts the computational savings with respect to the implementation using prescreening As Figure 1 clearly shows, using Equations (18), (20), (22) On average (based on the data in Table 1), our new implementation relying on the equations presented in Section 4 is approximately 12 times faster than the former one for 0D VAM contributions, 20 times faster for also including 1D VAM contributions for the diagonal elements and 12 times faster for considering 0D and 1D VAM terms for all matrix elements (see Table 1). Compared with the computational effort for calculating the contributions of the potential energy surfaces, including the 0D VAM terms, which include the most important physically relevant contributions and are often sufficient, is no longer a computational bottleneck.
Note that, in the following all CPU times given refer to the new implementation.

| PREDIAGONALIZATION OF SUBSPACES
States located in regions of high state density often suffer from comparatively slow convergence within iterative configuration selection algorithms, for example, cs-VCI. In the following, we present a method to improve the convergence of vibrational state energies and the VCI wavefunction based on spanning meaningful subspaces of the correlation space and prediagonalize them. In our benchmark calculations, we put a special focus on fundamentals belonging to CH-stretching modes, since these often show rather slow convergence.

| Method and implementation
As described in Section 3, within our implementation of VCI theory each calculation starts by generating an initial configuration space restricted by (a) the maximum number of modes excited within the configurations (n ex,init: ), (b) the maximum excitation within one oscillator (n max,init: ) and (c) the sum of quantum numbers within one Hartree product (n sum,init: ). Simplest, the starting wavefunction (9) in the initial iteration step of the configuration selective procedure is given by the corresponding VSCF configuration. For Abelian systems and rectilinear normal coordinates, a single Hartree product is used, otherwise meaningful linear combinations representing all physically relevant information are employed, that is, for Abelian systems P n cn j Φni else, e:g:, non-Abelian systems and=or localized normal coordinates ( ð24Þ with the coefficients c n describing meaningful linear combinations of real-valued configurations covering additional quantum numbers as, for example, the l-quantum number. For further details, we refer to Reference [26]. Subsequently, Equation (8) is utilized for the configuration selection process. Note that, the whole initial correlation space is screened in this step. After building the VCI matrix in the correlation space of selected configurations, the targeted eigenvalue is determined by an iterative eigenvalue solver employing the harmonic representation of the reference wavefunction of the first iteration step (24) as startvector. Convergence is defined to be reached if the difference of the state energies of two consecutive iterations is below a given threshold. This procedure refers to our former implementation and will be used as reference regarding performance in the following.
In our new ansatz aiming at an improvement of the convergence behavior of critical states regarding the number of necessary iteration steps, we essentially modify two steps within the algorithm described above: (i) We replace the starting wavefunction (24) by a wavefunction containing static correlation effects. In order to do so, we define a meaningful subspace (abbreviated as "subsp." in what follows) of the initial correlation space, which includes configurations showing resonances with the reference state. In analogy to the notation for the initial configuration space, we denote the characteristic values describing the size of the subspace n ex,subsp: , n max,subsp: , and n sum,subsp: .
The set of configurations contained in the subspace, is denoted K tot,subsp: . Usually the most important resonances are of Fermi-or Darling-Dennison type. Thus, we choose n ex,subsp: ¼ 3 and n max,subsp: ¼ 3 with a value of 4 up to 5 for n max,subsp: depending on the size of the initial correlation space (these values refer to the reference configuration j Φ 0 i and not to the ground state configuration). Note, that the subspace is generated exactly as the initial configuration space, that is, not a subspace of modes is used, but the configurations included are chosen via the parameters n ex,subsp: , n max,subsp: , and n sum,subsp: and regardless of the state of interest all modes are treated equally. In most cases, the values mentioned cover the most contributing resonances, but they may be increased if necessary. Note that, choosing a subspace of configurations is somewhat related to the concept presented in Reference [38], but here additionally resonance effects are addressed explicitly, which leads to a more generalized ansatz.
Subsequently, we prediagonalize the defined subspace and use our state picking scheme to identify the state of interest, that is, the one with the largest overlap with the harmonic reference state. The chosen eigenstate obtained by the diagonalization of the subspace can be written as eigenvector with c AI being the respective coefficients. The state (25) is used instead of (24) within Equation (8) for constructing the correlation space. Note, that we include all the configurations from the chosen subspace in the selected configuration space regardless of their contribution according to Equation (8).
(ii) Instead of using the harmonic representation of the state (24) as starting vector for the iterative eigenvector solver used, we now employ the eigenstate (25)

| Results
In order to prove the prediagonalization of appropriate subspaces leading to faster convergence for states lying in regions of high state density, we performed benchmark calculations based on 4D PESs including 0D VAM terms for the CH and BH-stretching modes of B 2 H 6 , C 3 H 4 , and C 2 H 5 F. The initial configuration space has been restricted by n ex,init: ¼ 6, n max,init: ¼ 6, and n sum,init: ¼ 15 in all cases. For defining the subspace, we used n ex,subsp: ¼ 3, n max,subsp: ¼ 3, and n sum,subsp: ¼ 4 (B 2 H 6 and C 2 H 5 F) and n sum,subsp: ¼ 5 (C 3 H 4 ), respectively. In Figure 2, a detailed breakdown of the CPU times needed for the three most time consuming steps within our iterative VCI implementation is shown for allene, which has four CH-stretching modes: ν 1 (A 1 ), ν 5 (B 2 ) and two degenerate states ν þ1 8 (E) and ν À1 8 (E). We label the degenerate states by ν As can be seen, in all cases the number of required iterations is larger when using our former algorithm not considering subspaces.
Of course, omitting whole iteration steps leads to relatively large CPU time savings. On average, 1.3 iterations per state are saved within the calculations for allene, 1.2 for C 2 H 5 F and 0.0 for B 2 H 6 .
As the configuration selection has the largest share with respect to the total CPU time, it is also responsible for most of the CPU time savings if an iteration can be left out. Note, that we lose performance by directly including all configurations from the subspace in the current configuration space without selection via criterion (8 Note: For all systems, the fundamentals belonging to the CH-stretching and BH-stretching modes, respectively, have been calculated. We used n ex,init: ¼ 6, n max,init: ¼ 6, and n sum,init: ¼ 15 for the initial correlation space in all cases. For defining the subspace, we employed n ex,subsp:¼3 , n max,subsp:¼3 , and n sum,subsp: ¼ 4 (B 2 H 6 and C 2 H 5 F) and n sum,subsp: ¼ 5 (C 3 H 4 ), respectively. For details regarding the single states see Tables S1-S3 in the Supporting information.
iteration steps. Although we handle larger configuration spaces in comparison to the former implementation at the beginning of the calculation, the computational effort for the last iteration steps is substantially larger. This is mainly due to the rising cost of the selection process and the diagonalization, both dramatically increasing for almost converged (and therefore large) correlation spaces. On the other hand, the dimension of the correlation space is smaller by using the new algorithm for the states ν 5 and ν jþ1j 8 of allene, but larger for the two states ν 5 and ν jÀ1j 8 . It can be seen in Figure 2, that the savings for the matrix set-up and its diagonalization due to the smaller correlation spaces are not too large, since the configuration selection dominates the total CPU time in the case of allene (see above). Nevertheless, for the states ν 5 and ν jþ1j 8 the smaller dimensions lead to CPU time savings, while most of the extra time using our former algorithm is used for further optimization of the configuration space, that is, configuration selection. In contrast, in the case of B 2 H 6 (see Table S1 in the Supporting information), all of the CPU time savings generated by using the concept of subspaces results from smaller correlation spaces. Here, matrix set-up, diagonalization and configuration selection become faster due to smaller correlation spaces during the iterations. The same holds true for C 2 H 5 F, the final configuration spaces on average are smaller for all states, on average 10,500 configurations are saved (see Table S2 in the Supporting information).
In summary, as demonstrated by the benchmark calculations above, the ansatz of improving the start vector for the configuration selection scheme by considering resonances from the very beginning leads to a faster convergence of the correlation space. Additionally, the number of required iterations decreases in some cases and leads to CPU time savings. The results for the energy eigenvalues are almost identical, that is, there is no loss of accuracy here.

| SUMMARY
Two strategies have been presented for the acceleration of vibrational configuration interaction calculation, the first one being applicable to any VCI implementation, while the 2nd is specific to configuration selective variants. It was found that computational bottlenecks arising from the VAM terms can efficiently be eliminated by unrolling the equations with respect to the number of different modals within the VCI matrix elements. Simple prescreening of the VAM terms cannot compete with this implementation. This allows to consider 0D VAM terms for all VCI matrix elements plus 1D VAM terms for the diagonal terms in standard applications by default, which is of particular relevance for rovibrational calculations requiring high accuracy. The introduction of prediagonalizations in configuration selective VCI algorithms is a subtle technique, which leads to an overall stabilization of the selection process and, at the same time, leads to moderate acceleration of these calculations. Both approaches essentially have no impact on the final results, but increase the efficiency of VCI implementations and thus allow to handle larger correlation spaces or even larger molecules. In the subsequent paper within this series, cf. Reference [39], even further techniques for accelerating VCI calculations will be presented.

ACKNOWLEDGMENTS
Financial support by the Deutsche Forschungsgemeinschaft (project Ra 656/25-1) and the Studienstiftung des Deutschen Volkes is kindly acknowledged.
Open access funding enabled and organized by Projekt DEAL.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.  (8). For every state shown, the bar on the right hand side shows the CPU time of the calculation using the prediagonalization of a meaningful subspace whereas the bar on the left hand side is the reference calculation. The corresponding differences between the resulting energies are given above the bars, the dimensions of the associated correlation spaces are depicted in red. The second y-axis on the RHS shows the number of iterations required

Guntram
since due to the symmetry of the integral regarding interchanging the indices s and u, summation over all α, β, and v not coupling to other indices, only the subsetS M4 has to be taken into account.