General recurrence‐relation generation scheme for molecular integral evaluation

We develop a new scheme for evaluating different molecular integrals using Gaussian type orbitals. In this new scheme, the evaluation of integrals is performed in two steps during runtime. The first step is a top‐down procedure that maps each recurrence relation into a jagged array (array of arrays), where each element of a member array represents either the final results or some intermediate integrals that are stored in our developed data structure “coarse‐grained circular buffer”. This step is the same for all different one‐ and two‐electron operators so that the same algorithm and source codes can be used. In the second step, a bottom‐up procedure is carried out that computes all the intermediate and the final molecular integrals by backtracking elements from the last member array of each jagged array. Different source codes should in principle be used for different electron operators in the second step, but which can be generated automatically by our developed recurrence‐relation compiler. The currently proposed general recurrence‐relation generation scheme provides a new, generic and automatic programming way for various one‐ and two‐electron integrals needed in computational chemistry. Users can even introduce new electron operators and evaluate their integrals during runtime by combining the implementation of the proposed new scheme and the just‐in‐time compilation technique.

should in principle be used for different electron operators in the second step, but which can be generated automatically by our developed recurrence-relation compiler. Gaussian type orbitals for their efficiency in integral evaluation. Currently, almost all molecular integrals are calculated using recurrence relations-one starts from a simple integral that can be computed easily, and from which the final results are calculated by recursively using some mathematical formulas.
Even though one may program each integral in an efficient way using the Obara-Saika, [1,2] McMurchie-Davidson, [3] or Rys quadrature [4] schemes, a unified computational procedure for evaluating these integrals and their derivatives is nevertheless valuable, especially when exploring higher-order molecular properties with the recently proposed open-ended quasienergy derivative approach [5] where a large number of different complicated integrals are prerequisite for the calculations of molecular properties.
We have also recently proposed a procedure for evaluating oneelectron integrals and their geometrical derivatives by using a generalized one-electron operator, in which an arbitrary central-potential operator f(jr − Cj) around center C can be chosen for different oneelectron operators, [13] for instance f(jr − Cj) = jr − Cj −1 , jr − Cj −2 , and Dirac delta function δ(r − C).
The aforementioned contributions have nicely illustrated the generalization of various integrals and their derivatives, but what is missing is the generalization of programming different recurrence relations for different one-and two-electron operators. For instance, one will have different recurrence relations for the potential jr − Cj −1 and the Dirac delta function δ(r − C), and correspondingly different algorithms and source codes have to be written for these two different operators. One can program different recurrence relations manually, but such codes can be prone to error and need to be totally rewritten whenever any change is required in the corresponding recurrence relations-for instance, due to efficiency or stability reason. Manual programming therefore requires much effort for the implementation of integrals of new electron operators, and for the maintenance and the improvement of existing integral evaluation algorithms and codes.
Most modern programming languages provide recursive function that allows programmers to express operations in terms of the function itself, and therefore reduces the effort of programming recurrence relations. However, source codes using recursive functions are often inefficient for the integral evaluation, because it is not trivial to reuse computed intermediate integrals during recurrence relations. [13] Reusing intermediate results is of key importance for saving computation time in the integral evaluation, in particular for integrals with high angular momentum basis sets and/or higher order (geometrical) derivatives. Therefore, instead of using the recursive function, most efficient integral codes are currently either manually programmed or generated using the automatic programming technique. In the latter, the actual codes (mostly using C, C++, or Fortran languages) are generated from a set of codes (named as integral code generator thereafter) written at a higher abstraction level using, for instance Python language.
The automatic programming technique therefore reduces the programming effort to some extent. However, we note that most integral code generators cannot treat, for instance arbitrary angular momentum and/or arbitrary order of (geometrical) derivatives. Moreover, different integral code generators have to be written for different forms of electron operators. All these limitations again restrict the development of new molecular integrals and studies of new molecular properties, or one has to dedicate much effort on such development. Therefore, the current contribution aims to develop a new scheme to reduce the programming effort for the integral evaluation of different electron operators. We name the new scheme as "general recurrence-relation generation scheme", which divides the integral evaluation into two steps: All recurrence relations are first mapped into a series of jagged arrays in which each element of a member array repre- The remainder of this paper is organized as follows: we first present our notation conventions and theoretical background for the integral evaluation in computational chemistry. The general recurrence-relation generation scheme is described afterwards, as well as its design and implementation. Finally, we discuss the performance of the proposed scheme by using different examples and give our final concluding remarks.

| Notation conventions
Let us first define our notation conventions: A bold capital letter such as R κ denotes the position of a nucleus (or a center) κ. The vector from R λ to R κ is denoted by R κλ = R κ − R λ . The capital letters X κ , Y κ , and Z κ represent the Cartesian coordinates of a nucleus (or a center) at the position R κ , whereas R κ denotes the norm of the vector R κ . The position of an electron relative to a nucleus (or a center) at the position R κ is given by r κ = r − R κ . Small letters x κ , y κ and z κ , and r κ denote the three Cartesian coordinates of the electron relative to the position R κ , and the norm of the vector r κ , respectively. Moreover, we use the multi-index notation extensively [14] to simplify the expressions for the recurrence relations. For instance, the jKj-th order geometrical derivatives with respect to a center at the position R κ will be written as where the three-dimensional multi-index K = (K X , K Y , K Z ) T is a vector of nonnegative integers and jKj = K X + K Y + K Z is the norm (length) of the multi-index K.
For different recurrence relations, we will often use e ξ for the increment along one Cartesian direction ξ, where ξ = x, y or z.

| Integral evaluation
The integrals we consider are evaluated over either the contracted real solid-harmonic Gaussian type orbitals (GTOs) located at one center R κ with the orbital quantum number l κ or the contracted Cartesian GTOs where w SGTO iκ and w CGTO iκ are the radial contraction coefficients, and each real solid-harmonic or Cartesian GTO in the summation is called primitive GTO with a iκ being the orbital exponents. S lκ mκ r κ ð Þ is the real solid-harmonic function, which satisfies the following transformation. [15] S lκ mκ r κ ð Þexp −a iκ r 2 with S lκ mκ lκ being the transformation coefficients.
Reine et al. [16] have proven that the real solid-harmonic GTOs can also be represented by the Hermite Gaussian functions with the same transformation coefficients of Equation (4) This enables us to evaluate both the integrals and their geometrical derivatives on a common footing by using the Hermite Gaussian functions. [13,16] Take a one-electron operatorÔ r Cα f g ð Þ for example-here r Cα f g represent a set of vectors relative to centers C α (α = 1, 2, ÁÁÁ), the geometrical derivatives of its integrals over contracted GTOs χ κ (r) and χ λ (r) are denoted as where l κ and l λ are respectively the orbital quantum numbers of contracted GTOs χ κ (r) and χ λ (r), ÁÁÁ represents a set of derivatives with respect to the centers C α (α = 1, 2, ÁÁÁ). Replacing χ κ (r) and χ λ (r) with the contracted real solid-harmonic GTOs (2) or the contracted Cartesian GTOs (3), we have where we have introduced the basic integrals over primitive Hermite Gaussian functions Notice that we could further transfer l κ to L κ for each integral over primitive Cartesian GTOs in Equation (9), using the following recurrence and likewise for transferring l λ to L λ .
After performing the recurrence relation (12) for each integral over primitive Cartesian GTOs, we will also arrive at the following basic integrals over primitive Hermite Gaussian functions Therefore, the prerequisite for calculating integrals over either the contracted real solid-harmonic GTOs or the contracted Cartesian GTOs is the evaluation of integrals (10).

| Recurrence relations
The evaluation of integrals (10) however depends on the knowledge of explicit form of the operatorÔ r Cα f g ð Þ. For instance, integrals of the Cartesian multipole moment operatorÔ r Cα f g ð Þ= r m M (dipole origin M) can be evaluated using the recurrence relations. [13,16] l κ l λ ,m + e ξ ½ HGTÔ l κ + e ξ , 00 and starting from where While for the nuclear attraction potentialÔ r Cα f g ð Þ= ∂ LC C r −1 C , different recurrence relations have to be used. [13] l κ + e ξ , l λ L C ;0 00, L C + e ξ ;n 0 ½ HGTÔ and 000;n 0 ½ HGTÔ where F n0 p ij R 2 Cγ is the n 0 -th order Boys function. [15] Generally, one has to manually program different source codes or prepare different recurrence-relation code generators for the recurrence relations of different electron operators. However, as will be shown in the following section, it becomes possible to evaluate different recurrence relations of various electron operators with our developed "general recurrence-relation generation scheme".

| SCHEME, DATA STRUCTURE AND ALGORITHM
Before presenting the new scheme, we give a more formal definition of our interested recurrence relations-q-indexed recurrence relation of order (t 1 , Á Á Á, t k , Á Á Á, t q ) as. [17] i 1 , ÁÁÁ, i k + e ξ , ÁÁÁ, i q ½ where and a r1,ÁÁÁ,rq ξ are coefficients of the right-hand-side (RHS) terms, which can be constants or variables depending on the indices. We further name the index i k as "output index", indices i 1 to i k − 1 are called "inner indices", and i k + 1 to i q are "outer indices".
For instance, the recurrence relation (14) is a 3-indexed recurrence relation of order (1, 1, 2), and the output index is m, the inner indices are l κ and l λ , and there is no outer indices.
The integral evaluation using different multi-indexed recurrence relations of different orders can fall into two steps: a top-down procedure and followed by a bottom-up procedure, which will be described in the following two subsections respectively.

| Jagged Array
From the graph theory, [18] all integral terms of a recurrence relation can be readily described by the so-called "directed acyclic graph". Rák et al. have proposed a method to map an integral into a thread of a parallel architecture, where a directed graph has been used for the computation of the integral expressed as a summation. [19] Different from the invention by Rák et al., we care more about the key information that can be delivered to and guide the next step-the practical integral computations. More exactly, we are interested in finding: • all integral terms [Á Á Á] needed for the recurrence relations, and • relationships among these integral terms as described by the recurrence relations.
Therefore, instead of the directed acyclic graph, we have chosen the other simpler data structure-jagged array to represent a recurrence relation. The implementation of the jagged array-named as RecurArray-has been made in our recently developed tIntegral library, [20] and C++ programming language has been chosen for the implementation. The tIntegral library is released under the Mozilla Public License (version 2.0) and a development version (version 1.0.0) is available at https://gitlab.com/tglue-chemistry/tintegral.
The rationale behind the design of the data structure Rec-urArray can be better understood from the evolvement of recurrence relations in the top-down procedure: 1. As shown in Figure 2, several RecurArray's will be created during the top-down procedure that can be readily put into a sequence container like std::array in C++ programming language.
These jagged arrays are arranged following the top-down manner, that is, the first jagged array represents the recurrence relation for the final molecular integrals while the last jagged array for the recurrence relation starting from the integrals 0ÁÁÁ0 ½ HGTÔ O .

Each integral term of a recurrence relation is implemented by a
class RecurNode, which contains information such as orders of indices, address of integrals in the other data structure coarsegrained circular buffer (named as RecurBuffer and will be discussed later in the current section) and RHS RecurNode's.
As shown in Figure 1, integral terms of each recurrence relation can be arranged into different levels-those connected by the dashed arrows have the same order of the output index, and can therefore be put into the same member array.
Each member array can be realized by the sequence container std::vector in C++ programming language, which will gradually grow whenever new integral terms are found with the corresponding order of the output index.
3. Also as revealed in Figures 1 and 2, there are integral terms appearing in both the last member array of a jagged array and the succeeding jagged array. For instance, integral terms-from [120] to [000] of the first jagged array also appear in the second jagged array for the recurrence relation of l λ .
To efficiently handle the above "shared ownership" of integral terms, we have used the C++ smart shared pointer std:: shared_ptr to manage each integral term, which has also been used for the management of RHS RecurNode's of each integral term.

| Coarse-grained circular buffer
Now let us focus on the other important data structure-coarsegrained circular buffer, or RecurBuffer that will be used to store (intermediate) integrals physically in computer memory. The following requirements have to be considered for an efficient and nonconflicting use of the computer memory: 1. The amount of the required computer memory should be as less as possible, and whenever a RecurNode will not be involved in the 3. Integral terms appearing in both the last member array of a jagged array and the succeeding jagged array need special consideration because their integrals can be viewed as the "input" of the former jagged array (recurrence relation) and the "output" of the succeeding recurrence relation; In other words, these integrals have to stay in the computer memory during the evaluation of both recurrence relations.
As will be discussed in the next subsection, the bottom-up proce- 4. For output terms of a recurrence relation, their used segments of the computer memory need to be reserved as the input of the preceding recurrence relation.
By following the above decisions and noticing the bottom-up procedure will be performed order by order, we can design the RecurBuffer as a coarse-grained version of the known data structure "circular buffer". That means, as illustrated in Figure 3, the RecurBuffer divides a portion of the computer memory into several segments (separated by solid lines), and each of them contains the molecular integrals of RecurNode's in the same member array.
Let the maximum order of the output index i k be max order (i k ) for a recurrence relation (25). The number of needed segments for the recurrence relation can be determined by which is not greater than the number of member arrays (=max order (i k ) + 1) so that less computer memory is required.
Meanwhile, we can ensure each RecurNode and all its RHS RecurNode's will use different segments by using the computer memory in a cyclic manner, that is, only the segment used by a member array with the least order of the output index will be released or be used by another member array with greater order of the output index. As illustrated in Figure 3, the segements used by integral terms when the former will not be involved in the evaluation of the recurrence relation.
It therefore means that the RecurBuffer is well-suited as a first-in-first-out buffer and also serves well for one of our objectivesthe first computed intermediate integrals could be firstly "kicked out". A straightforward solution is to assign segments to integral terms in an opposite way for the recurrence relation (14), i.e., starting from the highest address of the RecurBuffer. As illustrated in Figure 3, RecurNode's [000] to [120] therefore occupy the computer memory with the highest address, and are also arranged from higher address to lower one. The same procedure continues so that the direction of data storage always takes the opposite of the preceding recurrence relation.
To briefly summarize, the top-down procedure therefore needs

| Triangle-based recurrence relations
All integrals in the bottom-up procedure are treated following a triangle-based scheme in our current contribution. As shown in In the tIntegral library, [20] we have considered several combinations of the triangle-component ordering and its possible one-to-one recurrence relationship(s) as shown in Table 1. Such a combination can be provided as additional information to our general recurrence-

| Converting to loops
After choosing a combination in Table 1 The restriction on the order (t 1 , Á Á Á, t q ) should not affect the use of our recurrence-relation compiler, because most one-and twoelectron integrals needed in computational-chemistry calculations can be evaluated with recurrence relations of order ≤2. [13,16] Furthermore, such a restriction can be simply extended as will be shown in the following discussion of loops of different indices.
The input of the RecurCompiler is recurrence relation(s), or more exactly the right hand side that is given in the form of (25). We have implemented an abstract symbolic class RecurSymbol and a few derived classes in the tIntegral library [20] as shown in Table 2.
These derived classes can be used to (rapidly) construct the right hand side of a recurrence relation (25).

Now let us look into the most challenging part of the
RecurCompiler-the generation of loops over XYZ components of different indices i 1 , Á Á Á, i k , Á Á Á, i q with i k the output index, and the order (t 1 , Á Á Á, t k , Á Á Á, t q ). As discussed in the triangle-based recurrence relations, the generation of such loops over XYZ components are mostly decided by the chosen triangle-component ordering and the one-toone recurrence relationship. The order (t 1 , Á Á Á, t k , Á Á Á, t q ) can also affect how the loops over XYZ components will be carried out.
Apparently, the generated loops over XYZ components should ensure: 1. Each component on the left hand side of a recurrence relation will be visited once so that its value can be computed from the recurrence relaiton; 2. Only contributing components on the right hand side will be visited, which will be used to compute the corresponding LHS component along either x, y or z direction; T A B L E 1 Triangle-component ordering and possible one-to-one recurrence relationship(s) implemented in the tIntegral library (version 1.0.0) [20] Component ordering

Contiguous components in memory
One-to-one recurrence relationship(s)
3. Noncontributing RHS components will be skipped in an appropriate manner.
Let us first consider the output index i k . Take the descending XY- For other combinations of the triangle-component ordering and the one-to-one recurrence relationship in Table 1, the generation of loops over XYZ components of the output index can be performed by following similar procedures to that of Figure 6. The restriction on the order of t k (1 ≤ t k ≤ 2) can also be removed by developing slightly different procedures for the cases of t k > 2.
The generation of loops over XYZ components of inner and outer indices requires a different consideration. During the loops, one needs to figure out: • For an inner index, which of its XYZ component(s) on the right hand side will contribute to the recurrence relation along a direction x, y or z that was set during the loops of the output index; and F I G U R E 6 Loops over XYZ components of the output index i k for the evaluation of the left hand side (i k + e ζ k ) of a recurrence relaiton from contributing components on the right hand side (i k and i k −e ζ k ) with the descending XY-major order and the 4 x YZ • For an outer index, along which direction(s) x, y and/or z, one of its XYZ components on the right hand side will contribute to the recurrence relation. Actually, the above slightly different statements for inner and outer indices require the same and important information for the generation of their loops over XYZ components-contributing RHS components along direction(s) x, y and/or z. For any inner or outer index i p , the contributing RHS components of the increment P ζ p e AE ζ p = 0 are obvious-each LHS component X l p Y m p Z n p (l + m + n = ji p j) has the contributing RHS component X l p Y m p Z n p regardless of the direction.
In Figure 7, we present contributing RHS components of inner and outer indices with the descending XY-major order and the 4 x YZ ! y Z z recurrence relationship for the increment (a) e AE ζ p = −e ζ p and (b) e AE ζ p = e ζ p . The loops over XYZ components of inner and outer indices can therefore be performed by looping the LHS components and determined contributing RHS components from given direction(s), which also holds for any other increment P We also note that there are noncontributing RHS components for the increment e AE ζ p = e ζ p that need to be skipped during loops. For example, RHS components of the last row-from Y 3 to Z 3 as shown in Figure 7(b)-will not contribute to the recurrence relation along x direction and should be skipped.
A more general form of contributing and noncontributing RHS components is given in Figure 8 for an increment P ζ p e AE ζ p τ p . Take the increment e AE ζ p = e ζ p for example, which gives τ px = 1 and τ py = τ pz = 0 along x direction, so that ji p j + 2 noncontributing components should be skipped after loops.
After converting the bottom-up procedure to different loops, the left and the only step that one needs to manually program is the evaluation of integrals 0ÁÁÁ0 ½ HGTÔ O , which is usually trivial compared with the implementation of the aforementioned different loops. Therefore, our developed automatic programming approach can work for almost all different molecular integrals in computational-chemistry calculations. Furthermore, by considering the recently developed just-in-time compilation technique [21] in computer science, users can even introduce new electron operators and evaluate their integrals during runtime. This will become quite useful for developers to quickly test their new idea in computational chemistry.

| DESIGN AND IMPLEMENTATION
The previous section has presented our two key steps for integral evaluation from an algorithmic view, in which the proposed scheme, data structures and algorithms can in principle be served to guide the practical implementation-the development of the tIntegral library. [20] More exactly, what we have presented so far can fall into either software requirements (which functionalities of the tIntegral library we expect), or software construction (coding, data structures and algorithms) in the software engineering discipline.
One important software development process between the software requirements and the software construction is software design. A considered design can help one develop modularized, reusable, maintainable and extensible software. We have therefore followed standards of the software design in the development of the tIntegral library.
In particular, we have applied well-known design patterns [22] to solve design problems we have encountered, and we have also employed unified modeling language (UML) to describe and to help us understand how our chosen design works both structurally and behaviorally.
We will in the next two subsections present our software design for integral computation and integral code generation using the tIntegral library. The object-oriented programming has been chosen as our programming paradigm and C++ programming language for the implementation.

| Integral computation
In Figure 9, we collect (important) classes of the tIntegral library (ver- Similarly, we define classes GaussianFunction (for primitive Hermite Gaussian functions) and ContractedGTO (for contracted real solid-harmonic GTOs or contracted Cartesian GTOs) derived from the base class Symbol so that they can also be used for symbolic operations in the response theory calculations.
For ordinary users, it is advisable to use the template class OneElecGTOIntegration for computing integrals of different oneelectron operators with either contracted real solid-harmonic GTOs or contracted Cartesian GTOs. The template parameter RealType will be specified during compile time that determines the type of floating point numbers.
The class OneElecGTOIntegration actually works as the skeleton of integral computations. As illustrated in Figure 10, it will construct appropriate concrete integration classes during runtime according to the types of one-electron operator and basis functions on bra and ket centers. When the member method integrate is called, the class OneElecGTOIntegration will invoke these con- Except for the class ContractedSGTOIntegration, all other concrete integration classes are automatically generated from the general recurrence-relation compiler RecurCompiler, whose design and implementation will be presented in the next subsection.
One may notice that the construction of an appropriate concrete integration requires the knowledge of either the one-electron operator class or the basis function class. A possible solution is to resort to F I G U R E 9 UML class diagram of the tIntegral library (version 1.0.0) [20] for one-electron integral computation with Gaussian type orbitals. Classes of basis functions and the abstract one-electron operator class OneElecOperator are respectively from our other developed libraries tBasisSet (https://gitlab.com/tglue-chemistry/tbasis-set) and tSymbolic (https://gitlab.com/tglue-mathematics/tsymbolic). Integration classes marked in red color are automatically generated from the general recurrence-relation compiler RecurCompiler (see Figure 11) [Color figure can be viewed at wileyonlinelibrary.com] the so-called visitor pattern. [22] However, it is not trivial to introduce any new (one-)electron operator or basis function class using the visitor pattern.
For the time being, we employ the member function type_id of the base class Symbol (see Figure 9) to create the correct concrete integration in the class OneElecGTOIntegration during runtime.
This function returns the runtime type informaiton (RTTI) of an object (a given one-electron operator or basis function).
Nevertheless, the use of RTTI is not a reasonable choice within the framework of object-oriented programming, and the code structure of the class OneElecGTOIntegration is also a bit "brutal" that has many conditional statements to check each RTTI. We have considered the other probably better solution-pattern matching [23] for the implementation of the class OneElecGTOIntegration, which may become available in the next release of the tIntegral library.
Last but not least, we would like to point out that one can also directly use, for instance the derived class CartMultMoment HGTOIntegration or NucAttractPotentialHGTOIntegration to compute integrals with primitive Hermite Gaussian functions. It can be useful for some computational chemistry programs that have their own routines for the transformation to contracted Cartesian GTOs or contracted real solid-harmonic GTOs.
F I G U R E 1 0 UML sequence diagram of the tIntegral library (version 1.0.0) [20] for computing integrals of Cartesian multipole moment operator with contracted Cartesian GTOs [Color figure can be viewed at wileyonlinelibrary.com]  Table 1, together with the schemes presented in Figures 6-8. Finally, the code to compute each individual LHS component needs to be generated within the innermost loop by following the formula of the recurrence relation. Based on the above task analysis, we can divide the general recurrence-relation compiler into the following (categories of) classes:

| Integral code generation
1. RecurTraversal is a simple class for recording which combination of the triangle-component ordering and the one-to-one recurrence relationship in Table 1 is chosen.
2. RecurExpression (marked in red color in Figure 11) contains detailed information of a recurrence relation that the RecurCompiler accepts, including the RHS and indices involved into the recurrence relation. It will also create a RecurAnalyzer object for the recurrence relation during runtime.
The RecurCompiler takes a vector of RecurExpression objects as one input argument for the integral code generation.
Currently, we manually prepare these RecurExpression objects for different one-electron operators and basis functions in a class RecurRelation (see Figure 12). However, our long-term objective is to "teach" the class RecurRelation to automatically generate the RecurExpression objects for given electron operators We have used the visitor pattern [22] for the above three classes to process different RecurSymbol derived classes that are used to construct the (RHS) of a recurrence relation. Within the visitor pattern, each RecurSymbol derived class needs to implement a dispatching operation accept that dispatches a request to the base class RecurExprVisitor as shown in Figure 11.
Meanwhile, each class derived from the RecurExprVisitor needs to implement several dispatch operations to make sure all different RecurSymbol derived classes can be visited, which is a double dispatch approach. [22] 4. RecurOutputConverter and RecurNonOutputConverter (marked in green color in Figure 11) respectively convert a recurrence relation to loops for the output index and nonoutput (inner and outer) indices. Both of them are derived from the class RecurConverter, which takes care of a few common tasks in the loop generation, for example, adding debug information.
Because the loops of each index are converted in a few steps (see Furthermore, we have employed the abstract factory pattern [22] for creating RecurOutputConverter and Rec-urNonOutputConverter objects during runtime. As illustrated in Figure

| EXAMPLES AND DISCUSSIONS
To test the performance of the tIntegral library, we have chosen our previous implementation-Gen1Int library (version 0.2.1) [24] for reference. In Table 3, we present the CPU time used by these libraries with different one-electron operators and Hermite Gaussian functions.
Different from the general recurrence-relation generation scheme proposed here, all recurrence relation codes were manually converted T A B L E 3 CPU time (millisecond) used by Gen1Int (version 0.2.1) [24] and tIntegral (version 1.0.0) [20] libraries with different one-electron operators, and Hermite Gaussian functions 2a into loops in the Gen1Int library. Instead of explicitly constructing jagged arrays for different recurrence relation terms, the loops of different indices in the Gen1Int library were prepared by specifying their maximum and minimum orders. As such, there may be extra and unnecessary integrals computed using the Gen1Int library in some cases.
Another pitfall of the Gen1Int library is that it only accepts one pair of exponents on bra and ket centers at a time, different from the tIntegral library that can take multiple exponents on both centers.
Nevertheless, we only use one pair of exponents a κ and b λ for our comparison in Table 3.
It is not surprised that the computations using the tIntegral library take more CPU time than those of the Gen1Int library, in particular in the cases of lower order Hermite Gaussian functions (till f shell) where the CPU time is dominated by the top-down procedure as illustrated for the operatorÔ = r 3 M in Table 3. However, we need to point out that the integrals with s and p shells are directly calculated in the Gen1Int library without any loop of indices, whereas the top-down procedure is always performed and the jagged arrays are always constructed in the current implementation of the tIntegral library. It can be improved by adding conditional statements in the concrete integration classes so that Hermite Gaussian functions with lower orders (such as s, p and d shells) will be computed directly.
We would also like to argue that the current implementation of the top-down procedure can be further optimized in the tIntegral library. For example, we currently compare orders of all indices when trying to find matching RHS nodes, which can be performed only for indices involved in the recurrence relation.
Furthermore, it is worth mentioning that the comparison in Table 3 is carried out with only one pair of exponents (a κ and b λ ). The percentage of the CPU time used for the top-down procedure will become less if there are multipole exponents on bra and/or ket cen-ters, which is usually the case in computational chemistry calculations.
One should also observe that, for higher order Hermite Gaussian functions, the CPU time used by the tIntegral library becomes more and more comparable with that of the Gen1Int library. Moreover, by only considering the bottom-up procedure, the tIntegral library has used less CPU time than that of the Gen1Int library in the cases of i and j shells as revealed in Table 3. It can be explained from the aforementioned fact that the loops of different indices in the Gen1Int library are carried out from a given minimum order to a maximum one, so that there may be extra and unnecessary integrals computed and more CPU time can be taken.
We have also compared the performance of the tIntegral library to the Libint library (version 2.1.0). [25] The CPU time is given in Table 4 for different one-electron operators and Cartesian Gaussian functions. We have chosen the horizontal recurrence relation for the order of the Cartesian multipole moment operator and the Obara-Saika recurrence relations for the angular momenta of Cartesian Gaussian functions [15] in the tIntegral library.
The Libint library generates integral codes for a given maximum angular momentum and (intermediate) integrals are explicitly addressed without any loop. As such, the codes are highly efficient as revealed in Table 4. But the pitfall is that the Libint library can not handle arbitrary orders of (geometrical) derivatives and angular momenta without regenerating codes. In contrast, our proposed scheme in the current paper and the tIntegral library can compute different integrals as well as their (geometrical) derivatives to arbitrary order, which is vital for (high-order) response theory calculations.
We have also measured the CPU time with more than one Gaussian function on bra and ket centers. In Table 4, we present the CPU time used for computing integrals ofÔ = 1 andÔ = r 3 M with 5 Cartesian Gaussian functions on bra and ket centers. Interestingly, the performance of the tIntegral library becomes comparable with that of the Libint library, and even better forÔ = r 3 M and Gaussian functions with T A B L E 4 CPU time (millisecond) used by Libint (version 2.1.0) [25] and tIntegral (version 1.0.0) [20] libraries with different one-electron operators, and Cartesian Gaussian functions r lκ κ exp −a κ r 2 κ À Á on bra and r lλ λ exp −b λ r 2 λ À Á angular momenta h and i. It may be due to the Libint library places the loops of Gaussian functions outside recurrence relations, while the tIntegral library puts these loops inside recurrence relations and at the deepest level-which is more efficient.
To briefly summarize, it is worthy of using the current version the tIntegral library in practical calculations, especially for the computations of different one-electron integrals. Except for further improvement of the library itself, users can consider to generate all possible jagged arrays in advance so that they can be (re)used during runtime.
As such, the total CPU time used for the top-down procedure will become trivial.

| CONCLUSIONS
A general recurrence-relation generation scheme has been proposed in the current contribution, and its implementation in the recently developed tIntegral library [20] has also been discussed. In particular, the application of software design patterns [22] has been highlighted for developing a modularized, reusable, maintainable and extensible library.
This new scheme and the tIntegral library are able to program different molecular integrals automatically and thus significantly reduces the programming and maintaining effort for the integral evaluation in computational chemistry. Our chosen software designs have also made the tIntegral library be able to work with the just-in-time compilation technique. For instance, it can be straightforward to use the tIntegral library in the recently developed interactive C++ interpreter-Cling [21] (which is built on top of the LLVM compiler infrastructure [26] ) so that one can generate source codes for any new electron operator and immediately evaluate its integrals during runtime.
Our current focus is on the further improvement and development of the currently proposed scheme and the tIntegral library. More explicitly, we have considered and begun to work on the following issues: 1. Evaluation of integrals using London atomic orbitals [27] and their derivatives with respect to the external electric and magnetic fields are important for molecular property calculations, which will require new recurrence relations and can be fitted into the current proposed scheme.
2. Our general recurrence-relation generation scheme can also be used for programming two-electron integrals, but the top-down procedure needs to be further optimized or discarded so that loops of indices will be performed for a given range of orders.
3. Although the RecurCompiler class has provided an automatic, generic and simple way to program various recurrence relations, one still needs to prepare the right hand side of a recurrence relation using the RecurSymbol derived classes, which usually requires tens of lines or a few hundred lines of coding work.
To further reduce human's coding work, we can "teach" the class RecurRelation to automatically generate the right hand side of a recurrence relation based on the tSymbolic library (https://gitlab. com/tglue-mathematics/tsymbolic).