PyFrag—Streamlining your reaction path analysis


  • Willem-Jan Van Zeist,

    1. Afdeling Theoretische Chemie, Scheikundig Laboratorium der Vrije Universiteit De Boelelaan 1083, NL-1081 HV Amsterdam, The Netherlands
  • Célia Fonseca Guerra,

    1. Afdeling Theoretische Chemie, Scheikundig Laboratorium der Vrije Universiteit De Boelelaan 1083, NL-1081 HV Amsterdam, The Netherlands
  • F. Matthias Bickelhaupt

    Corresponding author
    1. Afdeling Theoretische Chemie, Scheikundig Laboratorium der Vrije Universiteit De Boelelaan 1083, NL-1081 HV Amsterdam, The Netherlands


The PyFrag program (released as PyFrag2007.01) is a “wrap-around” for the Amsterdam Density Functional (ADF) package and facilitates the extension of the fragment analysis method implemented in ADF along an entire potential energy surface. The purpose is to make analyses of reaction paths and other (in principle also multidimensional) potential energy surfaces more transparent and user-friendly. PyFrag also automates the analysis of reaction paths in terms of the extended activation strain model of chemical reactivity. © 2007 Wiley Periodicals, Inc. J Comput Chem, 2008


Understanding reaction mechanisms usually starts with finding the stationary points associated with the reaction, which are typically reactants, transition states, products and intermediate complexes. However, for a true understanding of a reaction, it is often insufficient to look at the stationary points alone.1–4 An energy minimum or a saddle point is in general the result of the interplay between counteracting factors that reach an equilibrium at a given geometrical configuration of the atoms. Therefore, analyzing only the stationary points does not fully uncover the structural and electronic effects that are responsible for their occurrence. Instead, it is useful and often essential that the bonding in a model system is analyzed along a one- or multidimensional potential energy surface (PES) that is associated with all relevant geometrical degrees of freedom.1–3 In the case of, for example, a chemical reaction, this can be the PES along the entire intrinsic reaction coordinate (IRC). Depending on the particular problem, however, the PES associated with other, multidimensional changes in the geometrical arrangement of the atoms in a molecular species may also be examined.

The Amsterdam Density Functional (ADF) package5, 6 enables one to analyze systems in a fragment-oriented manner. Thus, the bonding in a molecular system is analyzed in terms of the interaction between two or more fragments. This means that, in the first step of analyzing any molecular system, it has to be split up into fragments. Each fragment is then computed in a single-point calculation, in the geometry it adopts in the overall system. In the final step, that is, in the actual analysis computation, the fragments are combined to yield the overall species in the above-mentioned geometry. A single-point calculation is then performed, which proceeds from the computational results of the fragment calculations and relates all properties of the overall species to these fragments. Note that these calculations are based on molecular fragments and not on the atomic fragments used in the usual ADF calculations. For example, we can then understand the stability of the species under consideration in terms of the bonding mechanism between the fragments. Note, however, that performing this type of analysis can become cumbersome for a large set of geometries, e.g., along an IRC or another one- or multidimensional PES.

PyFrag is a program designed to make the analysis of a PES with the ADF package5, 6 more user-friendly. It performs and coordinates all the above-mentioned ADF calculations for each point on the grid of geometries associated with the PES, and it extracts, integrates and postprocesses the relevant information. Thus, PyFrag de facto extends the fragment-oriented energy decomposition analysis as implemented in the ADF package from treating single points to examining entire potential energy surfaces. The program is written in the popular and highly portable Python programming language. The name PyFrag is derived from Python and Fragment analysis.

Description of the Program

PyFrag is intended as a “wrap-around” for ADF to facilitate fragment analysis calculations along a set of geometry points. It is controlled by an ADF input file augmented with extra statements understandable by PyFrag. The ADF input script is then used as a basis to construct and execute the necessary ADF calculations. The desired molecular coordinates can be read from the result files of intrinsic reaction coordinate (IRC) or (one- or multidimensional) linear transit (LT) calculations. Reading Cartesian coordinates from an xyz-file containing a series of geometrical structures is also possible. PyFrag can also generate a series of single-point calculations by introducing variable coordinates within a chosen molecular geometry. The latter facilitates a quick scan and analysis (!) of the PES along one or more coordinates.
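As an illustration of the xyz-file input route, the following is a minimal sketch of how a file holding a series of structures might be split into individual geometries. It assumes the common xyz layout (an atom count, a comment line, then one line per atom, repeated for each structure). The function name `read_xyz_frames` and the example H2 data are hypothetical and are not taken from the PyFrag source.

```python
from typing import List, Tuple

# One atom: element symbol plus Cartesian coordinates.
Atom = Tuple[str, float, float, float]


def read_xyz_frames(text: str) -> List[List[Atom]]:
    """Split the text of a multi-structure xyz file into a list of geometries.

    Assumed layout per structure: atom count, comment line, atom lines.
    """
    lines = text.splitlines()
    frames: List[List[Atom]] = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between structures
            continue
        n_atoms = int(lines[i].split()[0])
        # lines[i + 1] is the comment line and is ignored here.
        atom_lines = lines[i + 2 : i + 2 + n_atoms]
        if len(atom_lines) != n_atoms:
            raise ValueError("xyz data ends in the middle of a structure")
        frame: List[Atom] = []
        for line in atom_lines:
            symbol, x, y, z = line.split()[:4]
            frame.append((symbol, float(x), float(y), float(z)))
        frames.append(frame)
        i += 2 + n_atoms
    return frames


# Hypothetical two-point series for H2 (illustrative values only).
example = """\
2
point 1
H 0.0 0.0 0.0
H 0.0 0.0 0.74
2
point 2
H 0.0 0.0 0.0
H 0.0 0.0 1.00
"""
frames = read_xyz_frames(example)
print(len(frames), "structures read")
```

Each entry of `frames` is one geometry point, for which a set of fragment and analysis calculations would then have to be set up.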

The program has been used in our group to employ the “extended activation strain” model of chemical reactivity.1–4, 7–9 In this model, the total energy along the PES is decomposed along a reaction coordinate ζ that can be, for example, the distance between two atoms or a more complex linear combination of geometry parameters, see eq. (1):

\[ \Delta E(\zeta) = \Delta E_{\mathrm{strain}}(\zeta) + \Delta E_{\mathrm{int}}(\zeta) \tag{1} \]

The activation strain ΔEstrain(ζ) is the energy associated with deforming the fragments from their equilibrium geometry to the geometry they acquire at a particular point ζ along the PES. The energy associated with ΔEint(ζ) is the interaction energy between these deformed reactants. It may be further analyzed, at any value of ζ, in terms of physically meaningful concepts provided by the quantitative molecular orbital (MO) model contained in Kohn-Sham density functional theory (DFT).10–15 Thus, ΔEint is further decomposed into the classical electrostatic interaction ΔVelstat between the fragments, Pauli repulsion ΔEPauli (destabilizing interactions between occupied orbitals which are responsible for steric repulsion), and bonding orbital interaction ΔEoi (e.g., HOMO–LUMO interactions, or electron-pair bonding), see eq. (2)10–15:

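\[ \Delta E_{\mathrm{int}}(\zeta) = \Delta V_{\mathrm{elstat}}(\zeta) + \Delta E_{\mathrm{Pauli}}(\zeta) + \Delta E_{\mathrm{oi}}(\zeta) \tag{2} \]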

The orbital interaction energy can be further decomposed into the contributions from each irreducible representation of the interacting system using the extended transition state (ETS) scheme developed by Ziegler and Rauk.13–15 Note that our approach differs in this respect from the Morokuma scheme,16, 17 which instead attempts a decomposition of the orbital interactions into polarization and charge transfer.
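The decompositions in eqs. (1) and (2) amount to simple bookkeeping once the energies at a given point ζ are available. The sketch below shows one way this could be written down. It assumes that ΔE is taken relative to the separate fragments in their equilibrium geometries. All names (`PointEnergies`, `activation_strain`, `interaction_energy`) and all numbers are hypothetical; this is not PyFrag's actual code.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class PointEnergies:
    """Energies at one point zeta along the PES (hypothetical container)."""
    zeta: float
    e_complex: float                   # overall system at this geometry
    e_frags_deformed: Sequence[float]  # each fragment in its deformed geometry


def activation_strain(
    point: PointEnergies, e_frags_equilibrium: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (dE, dE_strain, dE_int) so that dE = dE_strain + dE_int, eq. (1).

    Assumption: dE is measured relative to the equilibrium fragments.
    """
    strain = sum(
        deformed - relaxed
        for deformed, relaxed in zip(point.e_frags_deformed, e_frags_equilibrium)
    )
    delta_e = point.e_complex - sum(e_frags_equilibrium)
    interaction = delta_e - strain
    return delta_e, strain, interaction


def interaction_energy(v_elstat: float, e_pauli: float, e_oi: float) -> float:
    """Sum of the three terms of eq. (2)."""
    return v_elstat + e_pauli + e_oi


# Illustrative numbers only (arbitrary energy units).
point = PointEnergies(zeta=1.0, e_complex=-148.0, e_frags_deformed=(-95.0, -49.0))
delta_e, strain, interaction = activation_strain(point, (-100.0, -50.0))
print(delta_e, strain, interaction)
```

In an actual analysis the interaction energy and its three components come from the fragment calculation itself, so evaluating `interaction_energy` on those components serves only as a consistency check against eq. (1).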

As might be clear, a lot of information can be extracted from the ADF fragment analysis calculations, the principal data often being the energy decomposition terms described in eq. (1) and/or eq. (2). These terms are printed by default into a text-data file. Optional data can be printed for each point along the PES, such as atomic charges,18 orbital populations, orbital energies, and orbital overlaps. All these quantities can be valuable when trying to understand the behavior of a PES. Their behavior as a function of the reaction coordinate is especially important, which makes the transparent way in which PyFrag structures these data very attractive. The data file can easily be used in various data-plotting programs.
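To make the idea of such a text-data file concrete, the sketch below writes one row per point on the PES in whitespace-separated columns and reads it back. The layout is an assumption for illustration and is not PyFrag's documented output format. The column names and values are hypothetical. The header line starts with `#` because gnuplot treats such lines as comments.

```python
from typing import List, Sequence, Tuple


def format_table(columns: Sequence[str], rows: Sequence[Sequence[float]]) -> str:
    """Format rows as fixed-width, whitespace-separated text with a '#' header."""
    header = "# " + " ".join(f"{name:>12}" for name in columns)
    body = [" ".join(f"{value:12.4f}" for value in row) for row in rows]
    return "\n".join([header] + body) + "\n"


def parse_table(text: str) -> Tuple[List[str], List[List[float]]]:
    """Read back a table produced by format_table."""
    lines = text.splitlines()
    columns = lines[0].lstrip("#").split()
    rows = [[float(v) for v in line.split()] for line in lines[1:] if line.strip()]
    return columns, rows


# Hypothetical column names and illustrative values.
columns = ["zeta", "dE", "dE_strain", "dE_int"]
rows = [
    [0.0, 0.0, 0.0, 0.0],
    [0.5, 3.75, 7.5, -3.75],
    [1.0, 10.0, 30.0, -20.0],
]
table = format_table(columns, rows)
print(table)
```

A file of this shape can be opened as delimited text in a spreadsheet program, and in gnuplot a command such as `plot "file" using 1:2` selects the first two columns.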

Figure 1 shows a schematic example of an activation strain analysis that highlights the importance of analyzing the energy of the reaction system along the entire reaction coordinate. Two reactions are considered and the strain and interaction energy are plotted along the reaction coordinate. In this example, the strain energy does not change; only the interaction energy becomes less stabilizing for the second reaction (bottom red line, label 2). This reduction in interaction energy not only raises the reaction barrier but also shifts the transition state to the right along the reaction coordinate, i.e., towards the products (upper red line, label 2). Note that if one were to focus solely on the stationary points, it would seem, as illustrated by the dashed lines in Figure 1, that the interaction energy is actually more stabilizing for the second reaction with the red curves (label 2) and that the higher barrier for the latter is entirely due to an increased strain. Clearly this is misleading, as the strain curves of the two reactions are identical and it is the red interaction-energy curve (label 2) that is weaker at any given point along the reaction coordinate. Because of effects of this type, it is mandatory to take into account the entire reaction path to compare different systems in a consistent manner.

Figure 1.

Schematic illustration of an activation strain analysis; the energy profile ΔE of two arbitrary reactions is decomposed along reaction coordinate ζ into the strain energy ΔEstrain of the increasingly deformed reactants plus the interaction energy ΔEint between these reactants. The strain curves for the two reactions are identical. The fact that the PES and the transition state (indicated with a bullet) of the reaction represented with the red curves (label 2) are higher than those of the reaction with the black curves (label 1) is, in this example, entirely due to the weaker interaction in the case of the former. Note however that a decomposition in the transition states alone would erroneously suggest the opposite (see dashed lines). [Color figure can be viewed in the online issue, which is available at]
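The effect sketched in Figure 1 can be reproduced with a toy model. The following is a minimal numerical sketch that uses arbitrary analytic forms chosen only for illustration: a strain term a·ζ², identical for both reactions, and an interaction term −b·ζ³ with a smaller b for the second reaction. These functional forms and parameter values are assumptions and are not taken from the calculations discussed in this work.

```python
def strain(zeta: float, a: float = 30.0) -> float:
    """Model strain energy; identical for both reactions."""
    return a * zeta ** 2


def interaction(zeta: float, b: float) -> float:
    """Model interaction energy; a smaller b means a weaker interaction."""
    return -b * zeta ** 3


def transition_state(b: float, n: int = 2000, zeta_max: float = 2.0):
    """Return (zeta, dE, dE_strain, dE_int) at the grid point where dE is highest."""
    best = None
    for i in range(n + 1):
        z = zeta_max * i / n
        s, w = strain(z), interaction(z, b)
        if best is None or s + w > best[1]:
            best = (z, s + w, s, w)
    return best


ts1 = transition_state(b=20.0)  # reaction 1: stronger interaction
ts2 = transition_state(b=15.0)  # reaction 2: weaker interaction

for label, ts in (("1", ts1), ("2", ts2)):
    print(f"reaction {label}: zeta_TS={ts[0]:.3f} dE={ts[1]:.2f} "
          f"strain={ts[2]:.2f} int={ts[3]:.2f}")
```

In this model the second reaction has the higher barrier and the later transition state. Comparing the two transition states alone, its interaction term is the more negative one and its strain term the larger one, although at any common value of ζ its interaction is weaker and its strain is the same.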

A current example of research within our group is the comparison of oxidative insertion with SN2 pathways for the Pd- and PdCl-mediated activation of the chloromethane C–Cl bond.1 Adding the Cl ligand (anion assistance) changes the preference of the reaction from oxidative insertion to SN2 and applying the activation strain model to these reactions gives interesting insights. It can be seen in Figure 2 that the strain curves (ΔEstrain) of the reactions practically coincide and that the differences in the reactions can be found in the metal–substrate interaction curves (ΔEint). The preference of the oxidative insertion over the SN2 pathway for bare palladium is caused by a poorer backdonation along the SN2 pathway. This effect can be traced back to a smaller increase of the overlap 〈4d|σ*C–Cl〉 along the SN2 pathway. The preference for the SN2 pathway induced by the Cl ligand can be explained by a more pronounced stabilizing effect of the Cl ligand on the interaction energy compared to the oxidative insertion pathway.

Figure 2.

Analysis of the PES ΔE(ζ) of the OxIn and SN2 pathways for oxidative addition of the chloromethane C–Cl bond to Pd and PdCl, along the reaction coordinate ζ projected onto the C–Cl bond length (see Ref.1). Dots indicate TS. Left panel: decomposition of ΔE into ΔEstrain and ΔEint for the OxIn (black lines, label 1) and the SN2 pathway (red lines, label 2) for addition to Pd. Middle panel: orbital overlap 〈4d|σ*C–Cl〉 (square root of the combined squares of overlaps of the degenerate Pd 4d orbitals with the chloromethane σ*C–Cl orbital) for the OxIn (black line, label 1) and the SN2 pathway (red line, label 2) for addition to Pd. Right panel: decomposition of ΔE into ΔEstrain and ΔEint; Black lines (label 1): addition to Pd along the SN2 pathway; red lines (label 2): addition to PdCl along the SN2 pathway; green lines (label 3): addition to PdCl along the OxIn pathway. [Color figure can be viewed in the online issue, which is available at]

Another example is the investigation of how and why SN2@P reactions of Cl⁻ + P(O)R2-Cl are affected by the variation of substituents R at the phosphorus atom. Recently, research in our group highlighted the importance of steric effects in determining whether a certain reaction proceeds via a labile transition state (TS) or a stable transition complex (TC).19 To this end, several IRC calculations were performed, both without any geometrical restrictions and with the substrate partially or completely frozen. Comparison of the potential energy surfaces resulting from the various IRC calculations reveals the repulsive effects that would arise if the substrate did not deform. Figure 3 shows the resulting activation-strain graphs and the decomposition of the interaction energy.

Figure 3.

Analysis of the PES ΔE(ζ) of the SN2 reactions of Cl⁻ + POH2Cl, along the reaction coordinate ζ projected onto the Cl–P bond length (see Ref.19). Left panel: PES ΔE(ζ). Middle panel: activation strain analysis of the PES ΔE(ζ) = ΔEstrain(ζ) (bold lines) + ΔEint(ζ) (dashed lines). Right panel: energy decomposition of the nucleophile–substrate interaction ΔEint(ζ) = ΔVelstat(ζ) (dashed lines) + ΔEPauli(ζ) (bold lines) + ΔEoi(ζ) (plain lines). Black lines (label 1): regular intrinsic reaction coordinate (IRC). Blue lines (label 2): IRC with geometry of [POH2] unit in substrate frozen to that in reactants (“R”). Red lines (label 3): IRC with geometry of entire substrate POH2Cl frozen to that in “R”. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

Besides looking at reaction paths, it may also be desirable to scan a multidimensional PES in a series of single-point calculations. A recent example is the exploration of the PES for intercalating, among others, daunomycin in between two stacked DNA base pairs.20 The aromatic compound is placed in the x,y-plane, after which PyFrag can compute and analyze the interaction with the two surrounding DNA base pairs on a grid along the x and y coordinates.
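A grid scan of this kind requires one geometry per (x, y) displacement of the intercalating fragment. The sketch below generates such a series by rigid translation in the x,y-plane. It is a minimal illustration under assumed conventions. The helper names and the two-atom placeholder fragment are hypothetical and do not reflect PyFrag's actual input syntax.

```python
from itertools import product
from typing import Iterator, List, Sequence, Tuple

Atom = Tuple[str, float, float, float]


def linspace(start: float, stop: float, n: int) -> List[float]:
    """Return n evenly spaced values from start to stop (inclusive)."""
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def translate(atoms: Sequence[Atom], dx: float, dy: float) -> List[Atom]:
    """Rigidly shift a fragment in the x,y-plane; z is left unchanged."""
    return [(s, x + dx, y + dy, z) for s, x, y, z in atoms]


def xy_grid(
    atoms: Sequence[Atom], x_values: Sequence[float], y_values: Sequence[float]
) -> Iterator[Tuple[Tuple[float, float], List[Atom]]]:
    """Yield ((dx, dy), shifted fragment) for every point of the grid."""
    for dx, dy in product(x_values, y_values):
        yield (dx, dy), translate(atoms, dx, dy)


# Placeholder planar fragment (illustrative coordinates, not daunomycin).
fragment = [("C", 0.0, 0.0, 0.0), ("C", 1.4, 0.0, 0.0)]
grid = list(xy_grid(fragment, linspace(-1.0, 1.0, 3), linspace(-1.0, 1.0, 3)))
print(len(grid), "grid geometries generated")
```

Each generated geometry would then be combined with the fixed coordinates of the surrounding fragments and analyzed as a separate single-point calculation.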

Program Availability

Python is a powerful and highly portable scripting language, and PyFrag runs readily on various operating systems (provided, of course, that a local copy of the ADF package is present). Besides Python, no extra modules or programs need to be installed to run PyFrag. The output generated by PyFrag can easily be imported into programs such as Excel. It is also possible to generate a file that can be read by gnuplot, a widely used and free data-plotting program. The program, released as PyFrag2007.01, is freely available. The distribution includes the Python source code, documentation and some example scripts and can be downloaded from∼bickel.