On the many‐body nature of intramolecular forces in FFLUX and its implications

FFLUX is a biomolecular force field under construction, based on Quantum Chemical Topology (QCT) and machine learning (kriging), with a minimalistic and physically motivated design. A detailed analysis of the forces within the kriging models as treated in FFLUX is presented, taking as a test example a liquid water model. The energies of topological atoms are modeled as 3Natoms‐6 dimensional potential energy surfaces, using atomic local frames to represent the internal degrees of freedom. As a result, the forces within the kriging models in FFLUX are inherently N‐body in nature where N refers to Natoms. This provides a fuller picture that is closer to a true quantum mechanical representation of interactions between atoms. The presented computational example quantitatively showcases the non‐negligible (as much as 9%) three‐body nature of bonded forces and angular forces in a water molecule. We discuss the practical impact on the pressure calculation with N‐body forces and periodic boundary conditions (PBC) in molecular dynamics, as opposed to classical force fields with two‐body forces. The equivalence between the PBC‐related correction terms in the general virial equation is shown mathematically.


| INTRODUCTION
Correctly modeling atomic interactions in order to predict the macroscopic behavior of a molecular system is the fundamental goal of atomistic force fields. Typically, the quantum mechanical description of these interactions is substituted for fully empirical or physically motivated approximate potential functions. This substitution is what gives force fields an enormous computational advantage compared to ab initio methods. However, this reduction in computational cost generally comes at a cost to accuracy. Nevertheless, force fields are the workhorse in biomolecular simulations at thermodynamically meaningful timescales. [1] The deficiencies in accuracy of conventional pairwise additive force fields have been demonstrated several times [2][3][4][5][6] Indeed, careful analysis of the results obtained for small molecules revealed the deficiencies from the early days of force field design. [7][8][9][10] In response to these deficiencies, the re-introduction of manybody effects became a focus in both biomolecular force fields (see References 11 and 12 for a rich, critical and recent perspective) and water models, which are crucial components of biomolecular simulation. Typical improvements include the addition of multiple charge sites, inclusion of low order multipole moments and polarization, as a means of accounting for the effect of the surroundings. [1,13] For completeness, we also mention the many-body effects on dispersion. As discussed in a recent article, [14] such effects may result in variations of atomic and molecular dispersion coefficients of leading order (r −6 ) on the order of 10%.
In liquid water modeling, notable advances were made in using accurate ab initio data for the parameterization of analytical potential energy surfaces to capture the many-body effects, [15] combined with polarization at longer range. Models such as the CC-pol family, [16] WHBB [17] and the state-of-the-art model MB-pol [18] contain terms that explicitly model short-range two-and three-body interactions between molecules. Furthermore, the polarizable AMOEBA+ potential now also has [19] geometry-dependent charge flux, a feature already present in FFLUX [20] (see below) for a number of years.
Such advances in pure water modeling are in stark contrast with biomolecular simulations more generally, where the preference in practice is still given to simpler biomolecular force fields [12] and water models. [21] The focus in protein force fields is still on adjusting the existing potentials and inclusion of polarization. [1] These apparent differences are to a large extent due to higher computational cost [21] in biomolecular modeling, shifting the cost-accuracy trade-off. Moreover, compatibility of sophisticated water models with existing biomolecular force fields, or alternatively, applicability of these approaches to larger molecules remains an unsolved issue. [13,21] A realistic, computationally efficient and widely applicable force field is yet to surface and many challenges remain.
Machine learning could assist in solving the aforementioned issues, boosted by its recent commercial success in areas such as computer vision and speech recognition. If computational chemistry adopts machine learning, it will take better advantage of the highperformance hardware that is increasingly available to it. [22] Machine learning promises to be able to efficiently model multidimensional functions in many complex and useful scenarios. In the topological force field FFLUX, a machine learning technique called kriging (also known as Gaussian process regression [GPR]) [23,24] is employed to predict atomic properties as functions of atomic coordinates. FFLUX [20] is a force field that incorporates Quantum Chemical Topology (QCT) [25,26] in order to quantum mechanically define an atom in a molecule. An advantage of QCT is that atomic properties are obtained by a universal formula, that is, a volume integral of a property density over a (finite) atomic volume. Hence, an atom's internal energy, as well as its charge, dipole moment and higher multipole moment, are all obtained in a uniform and consistent way. Machine learning is used to accurately capture these atomic properties and to make predictions efficiently, using trained models, during a molecular dynamics simulation.
The goal of the current article is to analyze the intramolecular forces in the FFLUX framework and clearly highlight the prominent features of our approach to these interactions that differentiate it from conventional approaches and latest developments. Note that the term "intramolecular" should be clarified because FFLUX handles both intra-and interatomic forces within a given molecule. Secondly, it should be understood that the whole approach applies not only to molecules but also to molecular clusters although this belongs to a future project.

| Quantum topological energy partitioning
The Quantum Theory of Atoms in Molecules (QTAIM) [27] is a particular case (i.e., that of the electron density or its Laplacian) of the more general QCT approach. QTAIM partitions the electron density into unambiguously defined subspaces called topological atoms. The key concept to achieve this partitioning is the gradient vector, which naturally carves out these space-filling atoms: they do not overlap and leave no gaps between them. This space-filling property has consequences that have not yet been fully assessed (e.g., protein-ligand interaction in drug design) and also has conceptual advantages [20,28] at short-range. Topological atoms are defined as subspaces with a vanishing Laplacian of the electron density over their volume. It is easy to prove that topological atoms are also quantum atoms, which are atoms that possess a well-defined kinetic energy. [25] Topological atoms play the role of "information carriers" in FFLUX. Energies of molecular entities are calculated as the sum of energies of their constituent topological atoms, with forces arising from the interactions between all atoms in a system. A method called Interacting Quantum Atoms (IQA) [29] is a topological energy partitioning scheme based on the QTAIM framework and is thus part of QCT. IQA partitions the total energy of a system into atomic energies, that is both intra-and interatomic, without ever invoking a(n) (artificial) reference state as is done in traditional energy decomposition analyses: Equation (1) shows three levels of resolution (i.e., coarse versus fine) in terms of atomic localization and energy type. Firstly, the total energy is the sum of individual atomic energies E A IQA . This simple addition is a clean procedure as a consequence of the space-filling nature of topological atoms, which are well-defined at short-range. Secondly, these atomic energies can be decomposed into intra-atomic energies (E A intra ) and interatomic energies (V AA 0 inter ) where A 0 refers to all atoms in the system except atom A. At this resolution the partitioning is centred on any given atom A, and hence specific atom-atom interactions are not known. Of course it is possible to go beyond this resolution (i.e., towards a third level) and introduce V AB inter as explicit interatomic potential energies. IQA also allows the contributions to be broken down even further [30,31] (not shown in Equation (1)), namely into kinetic, exchange-correlation (which can be split in exchange and correlation [32][33][34][35] ) and electrostatic components. This finer partitioning provides more chemical insight but at a computational cost. However, for the purpose of constructing a force field, the partitioning into individual atomic energies E A IQA is sufficient. Indeed, any energy terms from a finer partitioning would be summed anyway by virtue of calculating the atomic energies, which is the only energy needed in geometry optimization or molecular dynamics simulations.

| Learning the energies and multipole moments with kriging
Kriging is a non-parametric machine learning technique at the heart [36] of FFLUX. Kriging applies Gaussian processes to establish a mapping between a collection of inputs (called features) and a single output. In this case, kriging links the geometry of a collection of atoms to their atomic energies and multipole moments, for previously unseen geometries. Features are defined with respect to a so-called atomic local frame (ALF). [37] As a result, features are independent of global translation and rotation of molecules, which prevents the kriging models unnecessarily learning the purely mathematical and well-known transformation between each ALF and the global frame. As such, the quantities of interest are whereŶ f * is the predicted output of the function (atomic energy or multipole moment), f * is the feature vector of interest (not present in the training set), μ is the mean output of the training samples, w j is the "weight" of a given training point j, the number of training points is N tr , the number of features is N feat , while f j k and f * k are the features of a N feat -dimensional training sample vector f j and the vector f * , respectively. In Equation (2), the kernel function is the exponential term that multiplies the weight w j . The symbols θ k and p k refer to the hyperparameters of the kernel being optimized during training. The hyperparameter θ k determines the importance of the dimension k, while p k determines the smoothness of the function variation along the given dimension. In the current version of FFLUX we set p k = 2, turning the kernel into a widely-used squared exponential kernel. [24] It should be noted that the weights w j are actually functions of the covariance matrix and consequently of θ k and p k , and are also calculated during the construction of a model. [23] This is why the term "weights" was put in quotes the first time it appeared, to take away the false impression that they were somehow adjustable.
A general framework for production of kriging models [38] consists of a number of steps, starting with the preparation of the training and test sets. The wavefunction for the training geometries are then calculated (GAUSSIAN09 [39] ), and the atomic IQA energies and multipole moments are then produced from the wavefunctions (using AIMALL [40] ). The kriging models are then trained, after which they are assessed by a geometry optimization and by so-called S-curves, which are sigmoidal curves that fully display the accuracy of a model as a cumulative distribution of the prediction errors for all training geometries.

| Training features
In its spirit, the ALF is similar to a Z-matrix, where the internal coordinate representation consists of bond lengths, bond angles and dihedral angles, thereby fully specifying the 3N-6 (or 3N-5 in the rare case of linear molecules) internal degrees of freedom. The differences lie in the details of how the axes and angles are defined.
Each atom's nucleus serves as the origin of its own ALF. The corresponding input features are used to model the energy and multipole moments of the central atom. Figure 1 illustrates the idea by depicting the ALF of a carbonyl carbon atom in N-methylacetamide.
This approach has the added benefit of ensuring atomic multipole moments are defined at the origin of the coordinate frame.
The first ALF feature, representing an internal degree of freedom, is the distance R Ωx ALF (Equation [4]) from the central atom Ω 0 to the bonded atom with the highest priority Ω x , according to Cahn−Ingold −Prelog rules, which form the basis of prioritization for all atoms. This bond also fixes the first local coordinate axis, x ALF . A third atom Ω xy (which is nitrogen in the example of Figure 1) is now assigned by the rules, in order to fix the ALF completely. The second feature is the distance R Ωxy ALF from the central atom Ω 0 to the next-highest-priority bonded atom Ω xy . The third feature is the angle χ Ω0 ALF , which is subtended by Ω xy and Ω x from Ω 0 . The three aforementioned atoms form The atomic local frame (ALF) of the carbonyl carbon atom in N-methylacetamide [Color figure can be viewed at wileyonlinelibrary.com] a xy-plane. The ALF's z-axis can then be defined as the axis through the origin and perpendicular to xy-plane, forming a right-handed coordinate system. Positions of other atoms are described by spherical polar coordinates (R Ωn ALF , θ Ωn ALF , ϕ Ωn ALF ) defined with respect to the ALF. The list of the feature definitions [38] is as follows: where R Ωn is the global position vector of atom n, and ζ Ωn are local Cartesian coordinates (x, y, z, respectively) of atom n expressed in the ALF.

| Analytic forces
The intramolecular forces are calculated as negative gradients of predictedÊ A IQA energy values with respect to displacement. As shown in previous work, [38] since the energy is parameterized by the ALF features, the practical definition is the rightmost expression in Equation (10), where F Ω i is the ith (x, y, or z) force component force on the atom Ω while α Ω i is the global ith coordinate of that atom. The quantityÊ TOT is the molecular energy, which is the sum over N atoms terms ofÊ A IQA , that is, the predicted (marked by the hat) atomic energies. The quantity f A k is the kth feature of atom A, exerting a force on atom Ω. Both partial derivatives were previously obtained analytically. [38,41] For further details on the expressions and derivations see Mills et al. [41] and Thacker et al. [38] 3 This is not the case in classical force fields where the energies are typically viewed as belonging to "interactions," for example, energies "stored" in bonds, without the possibility for a full and true atomistic view.
Let us take the water molecule as an example and write out the terms on the right-hand side of Equation (10) while omitting the hat for brevity from now on and focusing on the x-component only. We restrict ourselves to two features only, R O1H2 and R H2O1 , one of which is associated with O1 and the other with H2, respectively. As a result, only a handful of derivative terms from Equation (10) will remain: only the terms containing the aforementioned features. In other words, we isolate the partial forces, on each of the three atoms in water, purely due to changes in these two features (i.e., the O1H2 distance), resembling the bond length-related forces in classical force fields: where F O1 R 12 ,x is the x-component of force exerted on O1, but only due to a change in the features R O1H2 and R H2O1 , and consequently only due to energy changes of O1 and H2 because the energy of H3 does not depend on either of these two features. R 12 was used in the subscript of the force to refer to both features collectively. One should also notice that the partial force on H3 (F H3 R 12 ,x in Equation [18]), caused by a change in R O1H2 and R H2O1 , is zero because ∂R O1H2 ∂x H3 = ∂R H201 ∂x H3 = 0: The latter equation follows from the definitions of the features because R O1H2 = R H2O1 = |R O1 − R H2 |, that is, H3 does not participate in either of these features although formally written out in Equation (18) publications. [30,38,42,43] Also, the number of training samples that kriging requires in order to achieve a certain level of accuracy was shown to be small relative to neural networks. [44] The main consequence of having a single PES is that there are no separate, uncoupled forces: all forces are N-body in nature.
This fact is worth emphasizing, as it is an appealing feature of the This non-zero effect was already expressed in the first classical force fields [45,46] as coupling terms (e.g., between bond length and bond angle energies), partially because they were originally derived to account for vibrational frequencies. [47,48] Many popular force fields later removed interactions in order to simplify the force field so they could be used on early-generation computers, for large molecules such as proteins. It has long been known that these approximations are not the complete picture [49][50][51] and were initially made out of necessity. However, advances in computational capabilities have allowed for the re-introduction of complexity in classical force fields in the form of cross-terms(e.g., describing a coupling between bond length and bond angle energies). However, FFLUX has no need for the addition of extra terms to make up for the insufficiency of the initial description. The full complexity of the system is inherent in The forces in FFLUX can be collated with respect to given features, representing the internal degrees of freedom such as distances or angles, as shown in Equations (16)- (18). However, these forces retain an imprint of the remaining degrees of freedom, as will be explained below. Using the previous example, one can isolate the x-component of the force contribution related to R O1H2 and R H2O1 on each atom, see Equations (16)- (18). It was previously explained that F H3 R 12 ,x = 0 because H3 does not participate in either of these features. A different aspect of the same definition (i.e., R O1H2 = R H2O1 = |R O1 − R H2 |) that we emphasize here is that ∂R O1H2 ∂x O1 = − ∂R O1H2 ∂x H2 and also ∂R H2O1 ∂x O1 = − ∂R H2O1 ∂x H2 . This is mathematically trivial and can be understood by realizing that O1 and H2 are located on the opposite ends of the bond. This connection allows one to relate F O1 R 12 ,x and F H2 R 12 ,x by recognizing similar terms. In fact, F O1 R 12 ,x = −F H2 R 12 ,x or if expanded, using Equations (16) and (17), we obtain which is true at any point on the PES. The equality in Equation (20)

bodies). The Computational Example
Section below gives a quantitative example of these effects.

| COMPUTATIONAL EXAMPLE
Here we present a computational example, showcasing the N-body nature of forces in a water molecule, as discussed earlier. The internal force contributions originating from the E O1 IQA energy gradient with respect to the χ O1 angular feature and E O1 IQA or E H2 IQA energy gradients with respect to R O1H2 or R H2O1 distance features were examined at two different O1H3 distances. While the values of the features under consideration do not change, the magnitudes of the forces are shown to be dependent on the position of H3. The forces were calculated in DL_FFLUX (modified DL_POLY 4.08 [52] ) with a preliminary kriging model trained on 32 training points (molecular geometries), which was enough to produce a good quality PES. The reference energies were calculated at B3LYP/6-31+G(d,p) level.
A single water molecule positioned in the xy-plane was examined using a distorted geometry with R O1H2 = R O1H3 = 1:00 Å and χ O1 = 90 . It was compared to a water molecule with R O1H2 = 1:00Å, R O1H3 = 0:95Å, and χ O1 = 90 were compared. The particular geometry was chosen for illustrative convenience only, and also indicates the ability of FFLUX to handle geometries far from the minimum energy. The comparison involved only the atoms participating in the examined features (χ O1 and R O1H2 /R H2O1 ). The results are illustrated in Figure 3 and presented in full in Table 1.
The bond length of 1.00 Å is longer than the equilibrium distance at the level of theory of the training set (B3LYP/6-31+G(d,p)), resulting in attraction between O1 and H2 through the corresponding feature (R O1H2 /R H2O1 ). The opposite is true for the angle, which is smaller than that of the equilibrium geometry. From ∂x H3 , inversely proportional to the bond length, counteracted this effect on H3, overall increasing the force contribution on that atom by 3.5% at shorter O1H3 bond length. The three-body effect played a more noticeable role in the O1H2 bonded forces. All the differences originate from the different positions on the three-dimensional PES of the two geometries. The O1H2 bond forces overall became 3.4% more attractive due to the displacement of H3 (i.e. shorter O1H3 bond), which is non-negligible.
The QCT view allows us to look further into the atomic energy gradients. The energy gradient of H2 with respect to R H2O1 increased by 9.4% accompanied with a 0.4% decrease in the O1 gradient. We hypothesize that such a large effect on H2 atomic energy gradient is related to its low electron density (absence of the core electron density), making it more sensitive to the surrounding. [53] An identical comparison of the force magnitudes at a less constrained angle (χ O1 = 100 ) showed analogous patterns and variations in magnitudes (Table 2).
Overall, the illustrated three-body effects identified from this comparison tended to be small, which is consistent with the intuition about N-body effects being "correction terms." This observation may explain the reasonable success of force fields that isolate interactions as well as that of force fields accounting for many-body effects by the post hoc addition of approximate terms. In the latter case an approximate description of many-body effects may possibly suffice to improve results considerably. However, the rigorous theoretical grounding of the N-body effects in FFLUX puts it in better stead to expand to larger systems. It is expected that N-body effects, and therefore any errors in their description, will accumulate in more complex molecules. In fact, the latter revealed less than the careful work on small model compounds because the flaws in the force field were easier to tease out and correct in small model compound systems compared to in protein or oligopeptide simulations. This accumulation may contribute to the lack of success of classical force fields when modeling large systems such as proteins.

| IMPLICATIONS OF MANY-BODY INTERACTIONS
While coming at increased computational cost compared to simpler force fields, our approach leverages a powerful machine learning technique for effective interpolation of multidimensional functions, and applies long-range multipolar electrostatics. The MPI parallelization of the code means that the in-house DL_FFLUX is closely comparable to the original DL_POLY (4.08) code [52] in terms of performance, with any extra cost located in the prediction routines, as one would expect.
In light of the ever increasing availability of computational resources, the onus is now on more accurate, physically motivated potentials.
Water was chosen as a visually tractable and relevant example for this paper but, our approach is equally applicable to larger molecules in exactly the same way. Furthermore, the technique is not limited to single molecules but can and will be extended to molecular complexes.
While not being fundamentally different from the QCT point of view-both molecules and molecular complexes are collections of atoms-it is more challenging in practice, due to the need to sample much larger regions of input space. An important example of such a system would be the water dimer (or trimer, etc.). Applying kriging to such a system would model both intra-and intermolecular degrees of freedom, successfully capturing the intermolecular N-body effects where N is now the number of molecules.
The example of MB-pol water model, which explicitly captures two-and three-body interactions [54,55] at short-range based on F I G U R E 3 Partial forces on each of the atoms (in 10 −2 atomic force units, denoted by U; to obtain the values in atomic units one needs to multiply them by 0.01), originating from atomic energy gradients with respect to χ O1 and R O1H2 /R H2O1 at two different O1H3 separations [Color figure can be viewed at wileyonlinelibrary.com] T A B L E 1 Comparison of non-zero force contributions F Ω i (in 10 −2 atomic force units; to obtain the values in atomic units one needs to multiply them by 0.01) related to features χ O1 , R O1H2 , and R H2O1 at two different O1H3 distances and χ O1 = 90  (10)). The complete list of eight non-zero components is presented, as determined by the alignment of the molecule with respect to the axis (e.g., all z-components are 0). accurate ab initio reference data, has shown that such approaches lead to remarkable results. [18] The advantage of kriging can be seen in its generality as a mathematical technique and effective interpolation. The PES of a single water molecule was previously shown to be nearly perfectly reproduced at the theory level of the training set with 50 training samples. [43] On a more practical note, while many-body forces for the most part do not result in significant algorithmic changes for MD simulations (apart from introducing the interaction-related code), the calculation of pressure is something that needs to be implemented correctly. The well-known virial equation for pressure for pairwise additive forces (Equation (21)) clearly does not apply if distancedependent N-body interactions are present, Note also that forces dependent purely on angles and dihedrals can be shown to contribute nothing to the virial. [56] In FFLUX, where the forces are related to features representing different types of degrees of freedom, it is convenient to have a general equation that accounts for everything in the same way. Alternatively, the pairwise virial with the minimal image convention could have been applied exclusively to radial features because within a single simulation frame the forces related to radial degrees of freedom act as two-body forces. Angular features could be then treated as purely angular within each simulation frame. However, following the route of the most general virial equation (Equation (22)) is a natural way to calculate the pressure in such a situation, and this equation is thus applicable with many-body interactions where F i is the resultant force on atom i and r i is its global coordinate.
However, a fact that appears to be somewhat overlooked in the literature is that this equation does not apply in the case of periodic boundary conditions (PBC). [57,58] The difference arises from the need to correctly account for the interactions of the main cell with periodic images. Interestingly, Equation (21)  Louwerse et al. described [57] the problem with Equation (22) and expressed the required pressure correction term in the case of a cubic simulation box, resulting in where U is the potential energy per unit cell (equivalently of the main cell) and L is the cubic box length. The correction term − 1 In the above equation, H is the matrix of the cell vectors and n represents the offset vector of the periodic cell and can be any vector with integer components. As explained by Thompson et al., [58] F 0 in can T A B L E 2 Comparison of non-zero force contributions F Ω i (in 10 −2 atomic force units; to obtain the values in atomic units one needs to multiply them by 0.01) related to features χ O1 , R O1H2 , and R H2O1 at two different O1H3 distances and χ O1 = 100  (10)). The complete list of the 9 non-zero components is presented, as determined by the alignment of the molecule with respect to the axis (e.g., all z-components are 0). be thought of as a partial force on the periodic image of i in the cell n related to the main cell, as opposed to the total force on the periodic image atom i in the cell n. As indicated by the authors, Equation (24) also has the form of the equation earlier derived by Bekker et al. [59] We complement the presented ideas by providing a simple mathematical proof by explicit differentiation of ∂U ∂L À Á ri f g that the correction terms in Equations (23) and (24) are indeed identical for cubic cells, leveraging the physical meaning of ∂U ∂L À Á ri f g (see below). Application of the ideas as presented by Thompson et al. [58] resulted in a wellbehaved pressure of the yet unpublished FFLUX liquid water model with PBC.
Starting with the view presented in the paper by Louwerse et al., [57] the factor ∂U ∂L À Á ri f g in Equation (23) is the change in potential energy per unit cell (e.g., the main cell) associated with the increase of the box size without changing the absolute coordinates of atoms in the main cell {r i } (n = 0, the zero vector). Physically, such expansion/ contraction affects the potential energy of the main cell via the changes in interactions between the atoms inside the main cell and the image atoms. The insight, not explicitly presented in the papers, [57,58] is to incorporate this physical meaning (change in U due to change in r in ) mathematically into the equation: The notation of Thompson et al., [58] Equation (24), is helpful in this situation: the double sum iterates over all atoms in all cells, F 0 in is the partial force on the in-th atom associated with the main cell potential energy U, not the total force on that image atom. The vector r in can be expressed as a function of r i and L if the cell is cubic, Equation (27), and the derivative of r i with respect to L is then given by Equation (28) where I is the identity matrix. One can clearly see that the atoms in the main cell (n = 0) are unaffected. After substituting Equation (28) into Equation (25) and using the fact that V = L 3 (for a cubic cell), the equivalence of the equations of interest becomes obvious, Equation (31): The computational example showed numerically the presence of the three-body effects in water. In the particular example, the magnitudes of the energy gradients varied from 0.4 to 9.4% upon translation along internal degrees of freedom on the PES, other than the one studied. These small changes are non-negligible and we believe that these interactions are important for achieving higher accuracy and capturing finer details in molecular dynamics.
Application of these ideas might have even more significant effect on larger molecules and molecular complexes, enabling to accurately describe intra-and intermolecular interaction with the same approach.
Research on such systems is currently work in progress.
In addition, calculation of pressure is discussed, advising caution