Decoding Energy Decomposition Analysis: Machine-Learned Insights on the Impact of the Density Functional on the Bonding Analysis

The concept of chemical bonding is a crucial aspect of chemistry that aids in understanding the complexity and reactivity of molecules and materials. However, the interpretation of chemical bonds can be hindered by the choice of the theoretical approach and the specific method utilized. This study aims to investigate the effect of choosing different density functionals on the interpretation of bonding achieved through energy decomposition analysis (EDA). To achieve this goal, a data set was created, representing four bonding groups and various combinations of functionals and dispersion correction schemes. The calculations showed significant variation among the different functionals for the EDA terms, with the dispersion correction terms exhibiting the highest variability. More information was extracted by using unsupervised learning in combination with dimensionality reduction on the data set. Results indicate that, despite the differences in the EDA terms obtained from different functionals, the functional has the least significant impact, suggesting minimal influence on the bonding interpretation.


Introduction
Chemical bonding is a fundamental concept in chemistry that plays a critical role in understanding the complexity of molecules and materials.Despite its significance, there is no unique definition of "chemical bonding" or what is a chemical bond.It thus remains a constructively but often controversially discussed topic in the chemical community. 2,35][6][7][8] EDA can be thought of as dissecting a puzzle to understand how each piece contributes to the overall structure, which means it comprises a decomposition of the total energy into its contributing components.A better understanding of the different contributions to a chemical bond can provide insights into what governs stability and reactivity. 7Subsequently, this information can be used to improve the design of new molecules and materials, for instance, by manipulating the molecular structure such that specific chemical interactions can be altered in a desired way, e.g., by reducing Pauli repulsion or increasing orbital interaction. 1,9,10though density functional theory (DFT) based approaches are widely used for EDA, the most used approximations have well-known limitations.One such approximation in the Kohn-Sham approach to DFT is the choice of the density functional, which is used to approximate the exchange-correlation energy, a crucial component of the total energy in DFT.Due to the unknown exact functional form, a large range of functionals have been developed over the last decades, resulting in different quantitative results for the same system. 11,12One of the most commonly used type of density functionals in EDA studies are based on the Generalized Gradient Approximation (GGA).Particularly, the Perdew-Burke-Ernzerhof (PBE) 13 functional has often been employed.The PBE functional, combined with a method to account for dispersion interactions, has been shown to provide good results for a wide range of systems and has become a standard choice for many applications in theoretical chemistry, especially in the surface and material sciences. 14,15However, for certain systems, other functionals may provide more accurate results, and the choice of the functional should be carefully considered for each specific system being studied. 16As a result, the use of different functionals across various EDA studies could result in bonding interpretation becoming dependent on the specific choice of the functional.However, it is critical that the bonding analysis is consistent.Further, one wants to make sure that the bonding interpretation for a given system does not depend on the computational setup, most importantly the density functional, used to conduct the EDA analysis.In a previous study, we found the relative importance of dispersion attraction and covalent bonding contributions to depend on the density functional used. 17 this study, we systematically shed light on the impact of the choice of the functional on the results of chemical bonding analyses with EDA.We thus carried out EDA studies of 11 molecules that are representatives of the different types of chemical bonds in molecular systems: Covalent bonds, main group donor-acceptor bonds, transition metal donor-acceptor bonds, and hydrogen bonds.We used different density functionals, i.e., 7 GGA, 6 meta-GGA, 2 hybrid, 3 meta-hybrid, and 1 range-separated functional with various types of dispersion corrections resulting in 28 different calculations per molecule.The results are analyzed with the help of dimensionality reduction and supervised learning techniques.The different functionals are accounted for by encoding them into the molecular descriptor of the learning models.Analysis of the importance of the contributions of the descriptor provides insights into what constitutes the chemical bond and what impact the functional has on the result of the bonding analysis.The EDA procedure starts by defining molecular fragments, A and B, of a chemical system, AB, with an interaction energy, ∆Eint.This interaction energy can be obtained by subtraction of the preparation energy, ∆Eprep, from the bonding energy between the two fragments, Ebond (eq.( 1)).

Ebond = ∆Eint + ∆Eprep eq (1)
Next, ∆Eint is decomposed into dispersion, ∆Eint(disp), and electronic, ∆Eint(elec), terms (eq.( 2)).The first term, ∆Eint(disp), gives the fraction of dispersion energy contribution to the bonding which is usually taken from a semiempirical correction scheme.EDA then decomposes ∆Eint(elec) into three well-defined terms that can be interpreted in chemically meaningful ways (eq.( 3)).These terms are (1) the quasiclassical electrostatic interaction energy between the charge densities of the fragments, ∆Eelstat, (2) the exchange repulsion between the fragments due to Pauli's principle, ∆EPauli, and (3) the energy gain due to orbital mixing of the fragments, ∆Eorb.In this way, information about the importance of electrostatic, dispersion, repulsion, and orbital interactions can be obtained. 18∆Eint = ∆Eint(disp) + ∆Eint(elec) ∆Eint(elec) = ∆Eelstat + ∆EPauli +∆Eorb eq (2)   eq (3) Although EDA based on Kohn-Sham DFT is the most widely used approach for bonding analysis, other EDA implementations exist.Some examples are the block-localized wavefunction (BLW) EDA, 19 the very similar absolutely-localized molecular orbital (ALMO) EDA [20][21][22] or the generalized Kohn-Sham (GKS) EDA. 23A more comprehensive overview can be found in recent review articles. 7,9,18,24,25Although few of these approaches have been extended to wavefunction based methods, 23,26 the application of DFT approaches by far prevails.The two main reasons are (i) efficient computations are possible with DFT approaches and (ii) the ease of interpretation resulting from the mean-field approach resulting in an effective one-electron picture.

Machine learning (ML)
ML comprises a collection of algorithms that aim to map an input space  to an output target space  using statistical methods, expressed as :  → .Unlike traditional physical models, ML algorithms usually do not rely on prior physical assumptions but find functional relations by training on a set of reference data, containing both,  and , in case of supervised learning, or only  in case of unsupervised learning.Once trained, ML is powerful as it can generalize to new, unseen data, with high computational efficiency and the accuracy of the reference method. 27,28In this work, we made use of the propensity of ML to find relations between molecular inputs, functionals, and EDA terms to analyze their influence on the bonding analysis.In the following, we will lay the groundwork for the terminology and concepts used in this work and provide a brief overview of the different types of ML algorithms employed.

Molecular representation
A critical component of ML is data representation, which involves encoding molecular structure and properties into numerical descriptors that can be used as input features, , for ML algorithms.There are several types of descriptors available, including molecular fingerprints, SMILES strings, or descriptors based on 3-dimensional arrangements of atoms in a molecule. 29,30nce EDA focuses on bonds, local descriptors to represent the two atoms participating in the bond under investigation were used.Local descriptors can describe molecules of arbitrary sizes by representing atoms in their chemical and structural environment. 31In particular, the Smooth Overlap of Atomic Positions (SOAP) 32 descriptor was employed here.SOAP transforms the atomic positions into a three-dimensional grid and computes the overlap between neighboring atomic densities.This procedure gives a set of coefficients that can be used as input features for an ML model, which remain unaffected by symmetry operations performed on atoms in the environment.In addition, we tested a minimal descriptor based on distances of the bond under investigation and elemental charges of the atoms involved in the bond as the bond distance separates the different bonding groups (see Figure S1 in section S1 in the supporting information (SI)).
Besides structural representations, also molecular properties can be used as input information to an ML model. 27Therefore, the EDA terms and information on density functionals served as inputs.

Supervised learning
In this work, classification algorithms were used as supervised learning techniques, for which the model required access to the target properties, , which are EDA terms (regression) and bonding classes (classification) in this case.
For classification, we tested the different methods mentioned above that were used for regression to classify molecules into their bonding characters by using EDA terms, information on the method, and structural features as inputs, .

Unsupervised learning
To unravel relations in input data, unsupervised learning was used.In contrast to supervised learning, unsupervised learning involves learning from unlabeled data, hence the model does not know about .This type of learning can be used for tasks such as clustering and dimensionality reduction.
Here, dimensionality reduction, i.e., principal component analysis (PCA), was used in this work to reduce the dimensionality of structural descriptors (used as ) of the chemical systems under investigation as these often contain many thousands of values.PCA is a linear method that finds the most significant orthogonal directions of variation in the data and projects the data onto a lower-dimensional subspace.The new set of variables are called principal components (PCs) and capture the variance in the original data.Noteworthy, PCs do not have a physical meaning, although feature importance analysis can be used to assess the influence of each descriptor value on a given PC. 33 addition, clustering tools were tested to assess the propensity to group data based on their structures and EDA terms without having access to the predefined bonding classes.Clustering can thus be seen as the unsupervised counterpart to classification. 29 3 Models and Methods

EDA data set: Choice of molecules
To study the impact of the functional and dispersion correction on EDA results, a data set was generated that comprises 11 molecules (see Figure 2) and EDA terms obtained from 28 different methods for which the functional was varied (details on the different combinations of functionals and long-range interactions can be found in Table S1 in the SI).In total, 308 data points with each data point containing 8 entries obtained from EDA were generated.The molecules used are representatives of four characteristic bonding types, i.e., covalent (1-3), main group donor-acceptor (4-6), transition metal donor-acceptor (7-9) as well as hydrogenbonded systems (10, 11).The choice of these four classes was made to represent the most common bonding types.Covalent bonds result from both fragments contributing one electron each to the bond (shared-electron bonding).Donor-acceptor bonds result from both bonding electrons stemming from one fragment, only.The donor-acceptor bonds are further split into two groups, i.e., those of main group elements and those of transition metals.Although the bonding mechanism is the same, the elements and orbitals involved in the bonds are chemically different enough to distinguish them via EDA.Non-covalent interactions are exemplified by hydrogen-bonded systems.The molecules tested (Figure 2) are ethane (1), ethene (2), ethyne (3), ammonia borane (4), ammonia boron trichloride (5), ammonia boron trifluoride (6), imidazol-2-ylidene (nNHC) transition metal complexes of copper-, silver-, gold-chloride (7-9), as well as a water (10) and ammonia dimer (11).Subsequently, we generated a test set of additional molecules comprising butane (12), difluoroethane (13), ethanol (14), and trimethylphosphine borane (15) for which EDA terms were computed using up to 4 different functionals each, summing up to a total of 14 additional data points.These additional data were not included in any training or validation process but were used solely for testing of the ML models as a separate hold-out test set.

Density functional theory (DFT) calculations
For EDA calculations, DFT was used in combination with different functionals and long-range corrections (Table S1 for a full list).All geometry optimizations and EDA calculations were carried out using the AMS21 software package.For functionals where there was no DFT-D3 dispersion correction available in AMS21, 34 the standalone DFT-D3 package by Grimme and co-workers [35][36][37] was used.The starting geometries for the molecules 1-6 and 12-15 were obtained by a manually formed guess and subsequent pre-optimization using the UFF forcefield in AMS.The starting structures for 10 and 11 were taken from the S22+ data set and the starting structures for the molecules 7-9 were obtained from reference 38.All final geometry optimizations were carried out with tight optimization parameters, i.e., 10 -9 a.u.for SCF energy, 10 -6 a.u.for the total energy convergence, and 10 -4 a.u./Å for the gradient convergence.For the geometry optimizations, we tested the impact of the functional on EDA by optimizing all structures with PBE 14 and a few molecules (3, 4, 7, 11) with all other 28 method combinations.All calculations were performed using scalar relativistic approaches (ZORA) 39,40 and the TZ2P basis set. 41The fragments, charges and spin polarizations employed for the EDA calculations can be found in Table S2 in the SI and chosen parameters were verified by comparing them to literature (see section S2.1 in the SI). 38,42e relative orbital and electrostatic contributions, ∆Erel,orb and ∆Erel,elstat, respectively, have been calculated as a percentage of the sum of both (eq (4)), whilst the relative dispersive contribution, ∆Erel.disp,has been calculated as a percentage of the total interaction energy (eq (5)).
,orb = E orb E elstat + E orb * 100 = 100 −  , eq (6) To evaluate the differences between the used method combinations, the mean values (standard deviation () (see SI section S2.2)) as well as the total energy ranges covered were calculated for each EDA term.

ML models
In this work, supervised and unsupervised ML to analyze and fit data obtained from EDA were used.For the input to ML models, we tested different combinations of EDA terms, PCs obtained from SOAP 32 applied to the chemical bond under investigation and a description of the functional number.PCA was needed to reduce the dimension of the SOAP descriptor as it contained over 16.000 values, while the descriptor based on EDA terms or the functional only contained 1-8 values.For the structural descriptor, 3 PCs were used that describe over 90% of the variance in the data (see Figure S2 in section S2.3 of the SI).The SOAP descriptor was generated using DScribe. 43ML models were used as implemented in scikit-learn 43 using default parameters unless stated otherwise.The functional was encoded by a number between 1 and 28, which were subsequently normalized.For all features used to define the descriptor, the following normalization procedure was applied: , with  being the original feature space and  ′ the standardized feature.
Hyperparameters for all ML models used were screened using 5-fold cross-validation. 29For cross-validation, the data set was split into 75% training data, of which 90% were used for training and 10% for validation to obtain hyperparameters.25% of the data were used as a holdout test set to assess the performance of the ML models.As this 25% of data contained molecules with EDA terms obtained from different functionals, but structures already included in the training data, an additional test set with molecules never seen by the models was used to assess the performance.
For pattern recognition, clustering and classification models were tested.For clustering, kmeans, 44 DBSCANk, 45 OPTICS, 46,47 Feature Agglomeration, 48 Spectral Clustering 49 and Affinity Propagation 50 were tested with minor deviations between models.Hence, k-means clustering was applied for all subsequent analyses.For classification tasks, we tested a multilayer perceptron and support vector machine.The final study was carried out with the multilayer perceptron.To investigate the influence of the functional on the bonding interpretation, we used a descriptor that encoded the EDA terms and the functional.Feature importance analysis 51,52 was used to assess the importance of each descriptor value for the bonding recognition task.
We further investigated the ability of ML to learn EDA terms obtained from different functionals and to apply these machine-learned values as inputs for classification.Therefore, supervised regression models were trained on EDA terms using a structural descriptor.As regressors, we tested kernel ridge regression, neural networks, and Gaussian process regression.Kernel ridge regression was found to give the lowest errors; hence these models were chosen for subsequent studies.All hyperparameters screened and final hyperparameters used are summarized in the SI in section S2.4.

EDA from different method combinations
To understand the impact of using different functionals on EDA results, we followed these steps: First, we averaged the individual energy terms of each molecule separately, considering all the functionals used.Second, we compared how these average values changed among molecules within the same bonding group.Finally, we compared these trends with the largest and smallest values obtained from the different functionals/methods tested for each energy term.This helps us to see if the trends remain consistent with varying method combinations and whether EDA terms of diverse studies and methods can be used to compare bonding patterns between molecules.Additionally, we calculated the mean, maximum, and minimum energy values for each molecule and energy type, which can be found in Tables S3-S13 in section S3.1 of the SI.EDA results when using the same functionals for the geometry optimization as in the EDA calculation are discussed and shown for a few examples in section S3.2 in the SI (Tables S14-S17).Effects can be considered negligible, justifying the use of PBE-D3 for geometry optimization of all systems.When looking at the averaged energy terms of the bonding classes, one would assume that the bonding classes are well separated from each other.However, this is hardly the case and is visualized in Figure 3 using boxplots (bonding classes are indicated using the color scheme of Figure 2).In each panel of the figure, one box refers to one molecule.Each box shows where most energy terms obtained from the calculations, i.e., two quartiles, lie.The points indicate outliers.Outliers are mainly due to energy terms generated with the functionals OLYP, revPBE and M06-L but in general, energies of 16 method combinations were at least once regarded as outliers by the box-plot analysis (see Table S18 in section S3.3).
The plots can be interpreted for each EDA term individually.Take for instance the first panel that shows the distribution of ∆Eint of the different molecules.Covalently bound molecules (purple) show a considerable spread in the energy and are well separated from the rest of the classes (spreads can additionally be found in Table S19 in section S3.4 in the SI).In addition, hydrogen bonded systems (yellow) are well separated, but transition metal (green) and main group donor-acceptor bonded systems (blue) are overlapping.A similar result is obtained for ∆Eorb.With regards to ∆Eelstat and ∆EPauli covalently bound systems (purple), transition metal (green) and main group donor-acceptor (blue) bonds, are overlapping.Only hydrogen bound systems are well separated for all the four mentioned EDA terms.In case of ∆Eint(disp), all bonding classes are overlapping.The last panel, which shows results for ∆Erel,elstat, is a combination of ∆Eorb and ∆Eelstat.As can be seen, combining these two values leads to an overall better separating of all the bonding classes observed, especially for the transition metal (green) and main group donor-acceptor (blue) groups, which are overlapping with other groups everywhere else.In addition, the covalently bound (purple) and main group donor-acceptor bound systems (blue) are well separated from each other.However, the hydrogen bound systems (yellow) are overlapping with the transition metal donor-acceptor group (green).The reason that these classes are overlapping is because the ratio between ∆Eorb and ∆Eelstat is approximately 1:3 for both classes (see e.g., Tables S9 to S13), hence the relative terms fall within the same range.Nevertheless, this effect does not limit the interpretability of the results as yellow classes are separated for all other terms except for ∆Eint(disp).
In addition to the EDA conducted using average values, we also analyzed how the magnitudes of the EDA terms changed from one molecule to another (see Tables S20-S22).In summary, changes in energy from one molecule to another can be reproduced when looking only at results from methods that give minimum values or maximum values.Only in case of ∆Erel,elstat an inverse trend can be observed when comparing maximum values of H3B-NH3 and Cl3B-NH3.
An inverse trend means that instead of an energy decrease from one molecule to another, an energy increase or vice versa is observed.When comparing minimum/maximum values to each other instead of average values, the situation changes.For many molecules trends for ∆Erel,elstat, ∆EPauli, and ∆Eint change (the exact terms and molecules that are affected can be found in Table S22).In case of ∆Eorb and ∆Eelstat, trends between only 2 systems, i.e., 7 to 9, cannot be preserved.With respect to ∆Eint(disp), no trend can be preserved.The same is true for ∆Erel,disp, as shown in the last panel of Figure 3.The latter might be caused by the fact that in most cases, a large part of the dispersion interaction is not captured by the functionals, but the external dispersion correction.Hence, for M06 the additional correction term is small, leading to ∆Eint(disp) values close to zero, while the average dispersion energy for all molecules and all methods is -7.78 kJ/mol.To summarize, when considering multiple EDA terms, such as ∆Eint, ∆Eorb, and ∆Eelstat, or both ∆Eint and ∆Erel,elstat of a bond, the four bonding classes can be separated even when looking at data obtained from different functionals.In general, ∆Eint(disp) does not allow for separation of the bonding classes when comparing various methods.

Assessment of the influence of functionals via pattern recognition and feature importance analysis
As obtained from the previous analyses of EDA results, different functionals can influence the contributions of the EDA terms.Still, the previous analyses give unconcise results and do not allow to draw a clear conclusion about whether the amount of spread in the contributions leads to different bonding interpretations.Therefore, we applied pattern recognition that allows the grouping of the different molecules based on a predefined descriptor.Therefore, the descriptor encodes the different functionals in addition to properties that are important for the bond under investigation, i.e., the EDA terms.Details are discussed in the Methods section.In combination with feature importance analysis, this tool provides a way to investigate the influence of the functional on the pattern recognition into different bonding classes.
A multi-layer perceptron (MLPC) was used for the final analysis as it gave robust results for grouping the molecules with respect to their bonding types.To train the classifier, the training set was randomly split into 75% training data and 25% test data, meaning the model learned on 75% of the data how to classify molecules into four bonding groups.The trained model was then tested on the rest of the data with results shown in Figure 4.The colors in the plot refer to the different groups specified earlier, i.e., covalently bound molecules (purple), main group donor-acceptor bonds (blue), transition metal donor-acceptor groups (green), and hydrogenbonded systems (yellow).All molecules in the test set were classified correctly, which can be seen by looking at the numbers and comparing to the original groups in Figure 2. As EDA terms for each molecule were computed 28 times, i.e., with 28 different methods, we had 28 different input vectors and thus data points for each molecule.Therefore, we computed an additional test set with completely new molecules (Figure 2, 12-15).These molecules are also classified correctly and are marked with crosses in Figure 4.
As the model can successfully distinguish the different molecules based on EDA terms and functionals, we further sought to investigate the impact of each of the features on the pattern recognition task.Therefore, the permutational importance of each input feature was computed by observing how exchanging the values of a feature with random noise affects the error.If the error undergoes a significant change upon permutation, it indicates that the feature holds importance for our model.The procedure was carried out 100 times delivering a standard deviation value.The results are shown in Figure 5a.As can be seen, the largest contributions to the pattern recognition task come from ∆Erel,elstat and ∆EPauli.Interestingly, ∆Eint(disp) and ∆Erel,disp have very little influence, as can be seen from the small contributions in Figure 5a.This is encouraging as the previous discussion has shown that these terms do not allow for separation of the bonding classes.As can be further seen, the smallest contribution to the classification comes from the functional.
The influence of the functional was further analyzed by training the same classification algorithm on data of only one functional and applying it for classification of the rest of the data obtained from different functionals.Therefore, the EDA terms of the different functionals were used as an input and the functional was removed from the descriptor.This procedure should give rise to information about whether data of one functional can be used to correctly classify input data of other functionals or not.The results when encoding the functional in addition to the EDA terms led to similar trends and can be found in Figures S3 and S4.The results are illustrated in Figure 5b by using the percentage of incorrectly classified systems per trained algorithm.As can be seen, when training on EDA data obtained from most functionals, the errors for predicting molecules based on EDA terms obtained from other functionals are minor.However, there are two functionals that show over 20% error, i.e., OLYP-D3(BJ) and rPBE-D3(BJ), which have the largest error bars in Figure 5b.These results suggest that these functionals can lead to different bonding interpretations when comparing to other functionals.Among the functionals that could be used to train the MLPC model and classify almost all data of other functionals are PBE0, B3LYP, and variants thereof including different dispersion corrections, which have the smallest error bars in Figure 5b.These results are encouraging as indeed, these functionals are among the most widely used.In addition, Figure 5c shows the permutational importance of the descriptor values when training on one functional, which is shown using box plots.
As can be seen, the importance of each feature is comparable to the permutational importance when training on all functionals, as shown in Figure 5a.The main difference arises from a larger importance of ∆Eorb for the classifier trained on individual data of a single functional.These results further support that the functional has little influence on the bonding group classification.
Finally, we investigated for which molecules the bonding category was mostly predicted incorrectly by training MLPCs on data obtained from some of the worst (rPBE-D3 BJ, revPBE-D3 BJ, OLYP-D3 BJ, B97-D3 BJ) and best method combinations (PBE-D3 BJ and B3LYP-D3 BJ) individually (termed misclassification).The results are shown in Figure 5d.As can be seen, most molecules that are predicted incorrectly when using data from other method combinations as input to the MLPCs trained on one functional belong to groups of main group donor-acceptor bonds (blue).The functionals that lead to most misclassifications show up to over 80% of wrong predictions of this bonding type.A few transition metal donor-acceptor groups (green, abbreviated as TM D-A in the figure ) were not classified correctly.No trend can be found for functionals that lead to relatively few errors.Hydrogen-bonded systems (yellow) are always classified correctly.
These results indicate that the main group and transition metal donor-acceptor groups might be better combined instead of separated, which is also supported by the EDA analysis in the previous section as these groups are overlapping for most EDA terms.From a chemical perspective, this makes sense as these groups are representing the same bonding type and are only distinguished by the type of atoms involved (resulting from either having d-orbitals in the valence space or not).Therefore, we repeated the whole analysis by choosing only three groups in total instead of four (see Figures S5 and S6 in section S4.2 in the SI).While analysis of the feature importance (Figure S5) leads to similar results with respect to the dispersion energy contributions and the functional importance, the importance of ∆Erel,elstat decreased considerably when training on only 3 classes.This change can be explained by looking at ∆Erel,elstat values of each of the molecules in the different groups in the last panel of Figure 3.It can be seen that ∆Erel,elstat values of transition metal donor-acceptor molecules are approximately twice as large as the values of main group donor-acceptor molecules.When training a classification model on data of only one functional using three classes instead of four, the errors when predicting classes using data obtained from the other functionals are comparable to previous results using four classes (Figure S4 in the SI, section S4.2).As a large difference in ∆Erel,elstat between transition metal and main group donor-acceptor bonds was found, four classes were determined more adequate for the conducted analysis.

Summary and Outlook
This study investigates the impact of density functionals on the bonding interpretation through energy decomposition analysis (EDA).To address this, we generated a dataset representing four bonding groups comprising main group donor-acceptor bonds, transition metal donoracceptor bonds, covalently bonded, and hydrogen-bonded systems.28 calculations for 11 molecules using different functionals and long-range corrections were conducted.The comparison of density functional theory calculations revealed significant variation among the different functionals for various EDA terms, with the dispersion correction terms showing the most significant variation.Furthermore, the results show that the bonding classes are not always well separated from each other when looking at individual EDA terms, and there is some overlap between different bonding types, especially transition metal donor-acceptor and main group donor-acceptor bonds.However, relative energy terms that combine multiple EDA terms, help in better separating these bonding classes.
To answer the question of how much the functional choice influences the bonding analysis, a classification model was trained that takes EDA terms and the applied method as inputs to identify patterns and classify data into their respective bonding groups.This allowed examination of the importance of the features, i.e., the EDA terms and functional, for the classification task.Findings revealed that despite the considerable energy spread of the dispersion terms, they ranked second-to-last in contribution with the functional ranking last.This means the choice of the functional is not critical for the interpretation of the bonding analysis in terms of bonding classes.
Further analyses showed that data obtained using the most frequently applied functionals in EDA studies, especially PBE0, B3LYP, and their variations, are interchangeable when used as inputs for classification models, which is not the case when taking data of rPBE and OLYP functionals, for instance.These results indicate that selection of the latter functionals may not be optimal when applying EDA and comparing to studies using other methods.

Data availability:
All EDA calculations conducted in this study are uploaded to the open data repository NOMAD. 53The EDA terms are further summarized in a csv file and are part of the supplementary information.

S1 Data set characteristics
As can be seen from Figure S1, the bond distances separate the different bonding groups well.The shortest bond belongs to the group of covalently bound systems, while the longest bonds are characteristic for hydrogen bonded molecules.

Figure S 1:
Relation between the bond distance and the bonding group.

S2.1 Quantum chemical calculations
As explained in the Methods section of the main text, different methods were used to compute EDA terms.Therefore, 28 combinations of functionals and long-range corrections were applied.The scaling factors for the dispersion correction used in this work were compared to the ones given at the webpage of the university Bonn 1 ("List of Functionals and Coefficients for Zero-Damping -Chemie Universität Bonn" n.d., "List of Functionals and Coefficients for BJ-damping Chemie Universität Bonn" n.d.).For the functional revTPSS, the parameters of TPSS were used, as supported in the literature. 2 The functionals and additional parameters used are summarized in Table S1.The dispersion energy has been computed using the stand-alone version of DFT-D3 code (version 3.2 Rev 0, downloaded from Ref. 1 ).
b These functionals have been computed using the libXC library as implementen in AMS. 3 4 The fragments, charges and spin polarizations employed for the EDA calculations are summarized in Table S2.
Table S2: Fragmentation of a given molecule for the EDA calculations, charges, and spin polarization.

S2.2 Statistical analysis
To analyze the EDA results of 28 different functional and dispersion correction combinations for 11 molecules, the mean values,  , maxima, minima, and standard deviations, , were calculated for the EDA results of each molecule.The mean and standard deviation of the different EDA energy terms were calculated according to the following formulas: ∑   eq (2)

S2.3 Principal component analysis
In order to reduce the dimensionality of the SOAP descriptor, 4 containing over 1600 values per molecule, principal component analysis (PCA) was performed.The first three principal components (PCs) described 90 % of the variance of the SOAP descriptor.

S2.4 Model parameters
To choose suitable hyperparameters for the classification using a multi-layer perceptron for classification (MLPC), a grid search (regularization: 0.01-0.000001,learning rate: 0.1-0.00000001,number of iterations: 100-7500) with 5-fold cross validation was performed on the trainingset, The input, i.e., descriptor, for the ML models were a combination of ΔEint(disp), Δ , ΔEint ,Δ , Δ , Δ , , Δ , , and XC-number.All values were normalized.The output was the bonding class label, which was predefined.The initial parameter grid was adjusted in case the optimized hyperparameter was close to the specified bounds.For the MLPC the final hyperparameters used were: regularization: 0.1, learning rate: 0.00001, maximum number of iterations: 7000.
The same procedure was carried out for the kernel ridge regression model, for which we used 3 principle components of the SOAP descriptor defined for the two atoms of the bond and the XCnumber as input.The outputs were EDA terms as used for the input of classification.The following hyperparameter ranges were tested: regularization 0.1-0.000001,kernel width:0.1-0.0001,kernel: radial basis function, Laplacian, polynomial.The final hyperparameters used were a regularization rate of 0.000010, a kernel width of 0.005 in combination with a Laplacian kernel.

S3 Summary of performed EDA calculations S3.1 Statistical results
In the following Table S3 to Table S13 the mean,  ‾, standard deviation, , largest, MAX , and smallest, MIN , values, as well as the spread (difference between largest and smallest value) for the EDA calculations of all 28 method combinations are given on a per-molecule basis.
The relative dispersive and electrostatic EDA terms are calculated according to equations 4 and 5 in the main part of the paper and are given in parentheses in the tables bellow.The largest and smallest values for the relative contributions are given for the whole data set and are thus not necessarily correspond to the non-relative EDA term (i.e.,different method) given before the parentheses.

S3.2 Geometry dependence on EDA calculations
To evaluate the influence of functional choice for the geometry optimization on the EDA results, the geometry of the molecules ethyne, ammonia borane, ClCu-nNHC and the ammonia dimer have been optimized with all 28 functional dispersion correction combinations, see Table S1.The subsequent EDA has been carried out with the same functional as used in the geometry optimization.Comparing the spreads (between different functionals) of the different energy terms for the PBE-D3 optimized calculations and those optimized with the same functional as used in the EDA the influence of the geometry optimization is evaluated.For the interaction, orbital, electrostatic and relative electrostatic energy no clear trends between the spreads have been found.Only for the dispersion energy spread of the calculations where the geometry was optimized with the same functional as used in the subsequent EDA calculation, the spread was either the same or lower relative to the one for the PBE-D3 optimized EDA (see Tables S3-S13 above).

S3.5 Relative spread of EDA terms
In order to evaluate which EDA term shows the largest difference between different calculation parameters the spread for each EDA term, relative to the average value for each molecule is given below.

S3.5 Trends of EDA influenced by the choice of method
To assess whether the various methods can influence energy trends (as defined in the main part and below) between two molecules, we analyzed whether a trend changes or not when comparing two molecules in the following Tables.To obtain reference trend between two molecules we use the EDA terms for all calculation methods averaged.These trends are then compared to the trends either using only the calculation method giving the highest or lowest value for a given EDA term.
If the sign of the energy difference between two molecule changes in comparison to the energy difference seen for the average over all calculation methods it will be interpreted as a misrepresented trend (indicated as "-").If a trend is preserved, an "X" is shown in the Table .The results are summarized in the Tables S15 and S16.

S4 Machine learning based assessment of functional importance
To evaluate the EDA terms that influence the classification the most, the permutational importance for the MLPC has been calculated employing the ELI5 package. 5The permutational importance feature in ELI5 calculates the importance of each feature in an MLPC by permutation of the feature values with random noise and measuring the resulting change in the model's performance.The feature importance scores are then normalized so that they add up to 1.
The standard deviation of the feature importance scores is a measure of how much the importance score varies when the permutations are applied repeatedly.It is calculated by repeating the permutation process multiple times and calculating the standard deviation of the resulting feature importance scores.For the below calculations the number of iterations was set to 100 in order to ensure a meaningful permutational importance.
To ensure that the obtained feature importance is suitable for generalization it has been calculated for the test set (see Figure 2a of the main text).

S4.1 MLPC results for 4 bonding classes excluding the XC number from the descriptor
The feature importance has been determined for an MLPC containing labels for 4 bonding classes with (Figure 5a) and without (Figure S5) the functional as a descriptor value.To evaluate which of the 28 functional-dispersion-correction combinations might lead to wrong classifications of the bond type the MLPC has been trained on each of the combinations individually and was then tested on the remaining 27 combinations.Whilst for 4 bond classes the relative electrostatic energy seems to have the biggest impact on the classification, the Pauli energy has the biggest influence for 3 bond classes.For both classifications the XC number as well as well as the relative dispersion energy show a negligible permutational importance.
As it is visible from Figure S6, the number of misclassifications hardly changes when using 3 bonding classes instead of 4 (compare to Figure 5b in the main text).

Figure 1 .
Figure 1.Schematic representation of energy terms derived in the EDA approach used to analyse a chemical bond between two generic fragments A and B forming AB.Adapted and modified based on reference 1.

Figure 3 .
Figure 3. Boxplots of EDA energies of 28 distinct method-combinations for each molecule within the test set (see Figure 2).Molecules of the same bonding class are colored according to the legend in the bottom.For each energy term and molecule, the boxes show where two quartiles lie, while outliers are shown with dots.

Figure 4 .
Figure 4. Normalized interaction energy plotted against normalized electrostatic energy for the molecules in test set 1 (25% of original datapoints) shown as dots and in test set 2 (completely new structures) shown as crosses.The color indicates the group as obtained by the trained MLPC model.Corresponding labels are shown in the legend.Numbers correspond to molecules in Figure 2.

Figure 5 :
Figure 5: a) Permutational feature importance and standard deviation of each descriptor value for the classification task.b.) Percent of misclassifications for data when trained on data of one functional at a time.c) Feature importance for the MLPC trained on one functional at a time (functional trained on indicated by the color).d) Incorrectly predicted bond classes for a multi-layer perceptron (MLPC) separately trained on data of different functionals as indicated in the legend.Main group donor-acceptor and transition metal donor-acceptor bonds are abbreviated as MG D-A and TM D-A, respectively.

Figure S 2 :
Figure S 2:Explained variance ratio of principle components 1 to 3 for the structural descriptor SOAP.4

Figure S 3 :
Figure S 3: Feature importance for MLPC trained on 4 bonding classes, excluding the XC number from the descriptor.

Figure S 5 :
Figure S 5: Feature importance for MLPC trained on 3 bonding classes with the XC number included in the descriptor.

Figure S 6 :
Figure S 6: Number of wrong classifications, with functional number for 3 Bonding classes.

Table S1 :
Damping function and software packages used for the dispersion correction of a given functional.

Table S3 :
Averages and spread of Energy decomposition analysis of ethane, values given in kJ/mol.

Table S4 :
Averages and spread of Energy decomposition analysis of ethene, values given in kJ/mol.

Table S5 :
Averages and spread of Energy decomposition analysis of ethyne, values given in kJ/mol.

Table S6 :
Averages and spread of Energy decomposition analysis of H3B-NH3, values given in kJ/mol.

Table S7 :
Averages and spread of Energy decomposition analysis of Cl3B-NH3, values given in kJ/mol.

Table S8 :
Averages and spread of Energy decomposition analysis of F3B-NH3, values given in kJ/mol.

Table S9 :
Averages and spread of Energy decomposition analysis ofClCu-nNHC, values given  in kJ/mol.

Table S12 :
Averages and spread of Energy decomposition analysis of H2O dimer, values given in kJ/mol.

Table S13 :
Averages and spread of Energy decomposition analysis of NH3 dimer, values given in kJ/mol.

Table S14 :
Averages and spread of Energy decomposition analysis of ethyne, optimized with same functionals as used in EDA, values given in kJ/mol.

Table S15 :
Averages and spread of Energy decomposition analysis of H3B-NH3, optimized with same functional as used in EDA, values given in kJ/mol.

Table S16 :
Averages and spread of Energy decomposition analysis of ClCu-nNHC, optimized with same functional as used in EDA, values given in kJ/mol.

Table S17 :
Averages and spread of Energy decomposition analysis of NH3 dimer, optimized with same functional as used in EDA, values given in kJ/mol.

Table S19 :
Relative spread in % of different EDA terms for each molecule.Average over all molecules given in the last row.