Automatized protocol and interface to simulate QM/MM time‐resolved transient absorption at TD‐DFT level with COBRAMM

Abstract We present a series of new implementations that we recently introduced in COBRAMM, the open‐source academic software developed in our group. The goal of these implementations is to offer an automatized workflow and interface to simulate time‐resolved transient absorption (TA) spectra of medium‐to‐big chromophore embedded in a complex environment. Therefore, the excited states absorption and the stimulated emission are simulated along nonadiabatic dynamics performed with trajectory surface hopping. The possibility of treating systems from medium to big size is given by the use of time‐dependent density functional theory (TD‐DFT) and the presence of the environment is taken into account employing a hybrid quantum mechanics/molecular mechanics (QM/MM) scheme. The full implementation includes a series of auxiliary scripts to properly setup the QM/MM system, the calculation of the wavefunction overlap along the dynamics for the propagation, the evaluation of the transition dipole moment at linear response TD‐DFT level, and scripts to setup, run and analyze the TA from an ensemble of trajectories. Altogether, we believe that our implementation will open the door to the easily simulate the time‐resolved TA of systems so far computationally inaccessible.


| INTRODUCTION
Transient absorption (TA) spectroscopy is a crucial technique to deeply understand the ultrafast dynamics of photoactive systems and can offer a temporal resolution in the order of the femtosecond (fs) time scale. [1][2][3][4][5][6][7] The basic scheme of an experiment is the following ( Figure 1): a wavepacket is initially created on an excited state potential energy surface (PES) by a pump pulse and it starts to propagate along this surface. During its evolution in time, the system can continue propagating on the current PES or relax to a lower-lying excited state through a nonradiative decay. This dynamics can be probed by a second pulse that additionally hits the excited system after a certain delay time t. If the incident probe photon covers the UV/vis spectrum as well, the wavepacket can be promoted to a higher-lying excited state, the so-called excited state absorption (ESA). Alternatively, the probe pulse can induce the relaxation to the ground state through a radiative decay, the so-called stimulated emission (SE). A pump-probe (PP) spectrum is normally recorded with a first measurement performed with the pump turned off and only the probe beam hitting the system, followed by a second measurement with the pump on. In the first measurement only ground state populations is probed, while in the second one part of the total population is on the excited state.
The difference in intensity between the ground state absorption in the two measurements gives rise to the so-called ground state bleaching, which corresponds to the depletion of the ground state population due to the pump pulse.
Despite the powerfulness of the technique, a computational support is often required to confirm and rationalize the time evolution of the system and the chemical and physical nature of the features giving rise to the signals in the TA spectra. 8 Many approaches have been reported to simulate such signals with varying compromises between accuracy and cost, documenting an immense interest of the computational photochemistry community to these kinds of simulations and the importance of developing tools in this field. [9][10][11][12][13][14][15][16][17][18][19] The most rigorous and accurate way to compute TA signal is based on quantum dynamics (QD) simulations, which allow to explicitly consider the electric field of the pulses in the simulations, to vary their strength, temporal and spectral profiles, and other parameters. 18,20 QD simulations are usually performed on a potential energy grid and the cost associated with pre-computing these grids, exponential increasing with the number of degrees of freedom (DOF), limits their applicability to two or three DOFs. 21 A way to consider all the DOFs in the simulation of the optical response of the system is to parametrize a model Hamiltonian in the harmonic approximation (i.e., multidimensional uncoupled displaced harmonic potentials). In the weak field impulsive regime and under the approximation of adiabatic dynamics, the TA signal becomes proportional to the third-order nonlinear response of the system to the field-matter interaction. 22 In this framework, exact analytical equations can be derived. Electronic population dynamics and decoherence can be introduced phenomenologically. Multiconfiguration time-dependent Hartree represents a powerful way to perform QD on model Hamiltonians with up to tens of DOFs, thereby explicitly considering electric fields and non-adiabatic effects. 23 Approaches based on semi-classical and mixed quantum-classical trajectories represent a good balance between accuracy and computational efficiency. [24][25][26][27][28] Trajectorybased methods allow to consider all DOFs without the limitation of the harmonic approximation, thus generalizing the dynamics to arbitrary systems. Non-adiabatic effects can be included at various levels of sophistication. Trajectory surface hopping (TSH) is a popular method to simulate nonadiabatic dynamics based on the propagation of independent trajectories that evolve on potential energy surfaces calculated on-the-fly that drive the nuclear motion. 29 The drawback of TSH is that the nuclei follow classical laws of motion, so that the density matrix is not defined which prevents the calculation of quantities directly related to electronic coherences. Subotnik and co-workers derived an expression for the density matrix consistent with the TSH method, which allowed them to develop a protocol for simulating TA in the impulsive limit which gives excellent results within the short time approximation. 30 The elevated quality comes at the expense of needing to launch a new propagation of the swarm of trajectories on each electronic state coupled to the photoactive states through the probe pulse at each delay time t for an interval τ. More often, a more radical approximation is made, considering only the vertical energy gap between the photoactive state and the remaining states at each delay time t, termed Condon approximation, thus assuming instantaneous dephasing. This approach has been used to successfully reproduce TA spectra using wavefunction methods for the description of the excited states. 9,10,12,31 Recently, Domcke and coworkers reported a TSH compatible protocol within the Condon approximation which accounts for the finite duration and spectral shape of the pump and probe. 9 In this work, we want to introduce a series of tools implemented in the open-source academic software developed in our group, Cobramm is Optimized in Bologna to Run Ab-initio and Molecular-Mechanics calculations (COBRAMM) 32,33 that automatize the whole F I G U R E 1 Schematic representation of a transient absorption experiment and of the signals simulated in this work: (A) a pump prob λ 1 excited the system to an excited state and the created wavepacket starts propagating along this potential energy surface (PES); (B) during a certain delay time the wavepacket can split and decay to a low-lying electronic state; (C) after that delay time a second probe pulse λ 2 can promote the system to a different excited states or induce the relaxation to the ground states and stimulate the emission.
workflow necessary to simulate a TA spectrum, from building the initial system, through simulating the non-adiabatic dynamics with TSH, until the calculation and convolution of the time-resolved TA spectrum. Although this protocol can be applied for molecules in the gas phase and to any level of theory desired by the user, we here want to emphasize two possibilities that our implementation offers to the community, namely the explicit inclusion of solvent and environmental effects through a mixed quantum mechanical/molecular mechanics (QM/MM) approach, 34 and the applicability of the workflow to medium-to-large chromophores, thanks to the present extension of the protocol to the time-dependent density functional theory (TD-DFT). 35 The present tool offers the opportunity of studying the TA of systems that would otherwise be too computationally demanding.
The article is organized as followed: first of all, the theoretical background needed in all the steps of the protocol is revised; then the details of the new implementations required in COBRAMM are listed; finally the TA spectrum of 4-(N,N-dimethylamine)-benzonitrile (DMABN) is presented as an exemplary application, as its ultrafast nonadiabatic dynamics is well-known and its TA spectrum has recently been characterized computationally with a different method, 31 and thus, DMABN represents a perfect test case for our implementations. Particular emphasis will be put on the postprocessing tools developed, which allow the convolution of the PP spectra in the energy and time domain, the application of different polarization combinations and the decomposition of the signal in single transitions contributions.

| QM/MM and DFT in COBRAMM
The QM/MM scheme 34,36 implemented in COBRAMM has been widely reported and explained in literature. 32,33,37,38 However, we will briefly summarize here the main features of the approach.
COBRAMM collects the energy of the QM (E QM ) and MM (E MM(tot) and E MM(QM) ) part through its own interfaces to Molcas 39 and OpenMolcas, 40 The evaluation of the electrostatic interaction between the QM

| Trajectory surface hopping
The great popularity of TSH 48,49 can be attributed to the approximations intrinsic to the method, which make it easily applicable to study nonadiabatic dynamics of medium-to-large size chromophores, otherwise impossible to study, with still a reliable accuracy, 50 also thanks to the validation and application of TD-DFT as ab initio method to calculate electronic energies. 51 another, is stochastic, based on a probability algorithm. In the fewest-switches-surface-hopping (FSSH) algorithm, the probability (P j ) to hop from a state j to another state is given by the electronic coefficients of the wavefunction: where the time evolution of the electronic coefficients is given by: Beside the energies of state i and j, the hopping between them is driven by the time-derivative coupling (TDC) T ji which is the dot product of the velocity and the nonadiabatic coupling (NAC) vector. The NAC can be computed at each step along the dynamics or can be approximated by computing the wavefunction overlap (WFO) of the two electronic states involved. 53 This can be done in order to speed up the calculation time, or if the NAC vector calculation is not available in a specific software or not possible within the framework of a QM method.
This is the case of TD-DFT and two methods to calculate the WFO implemented in COBRAMM will be introduced in section 3.2. The FSSH algorithm was already implemented in COBRAMM and includes the energy-based decoherence correction scheme, 54 the velocity-Verlet integration, 55 the kinetic energy rescaling to preserve the total energy after a hop, and many other simulation parameters that can be set by the user. For example, in TD-DFT trajectories, an induced hop to the ground state can be included, in order to approximately describe a relaxation to S 0, which cannot be properly describe with linear response TD-DFT within the adiabatic approximation. 47 This hop is based on energy gap between S 1 and S 0 , at the choice of the user.

| Time-resolved pump-probe spectrum
The time-resolved TA spectrum can be computed from an ensemble of trajectories independently propagated that sample the evolution of the wavepacket along the potential energy surfaces. 9 intensity proportional to f of the transition. The intensity of the ESA from the state a to an excited state e is given by: and, analogously, the SE from state a to the ground state ( gs) is given by: In principle, SE to any state below the "active" state a in which the system evolves at the time of the interaction with the probe pulse could be observed/simulated. However, these transitions fall very often below the detection limit of the experimental apparatus and it is commonly observed and simulated only the SE to the ground state.
Summing over all the trajectories we can obtain the TA for a certain time and convoluting over all the time steps t, we obtain the final TA as: Regarding the convolution with respect to an experimentally recorded spectrum, some considerations regarding our implementation need to be pointed out: (i) the Gaussian line shape used to convolute the spectrum is artificially broadened by FWHM to smooth the signal, removing the noise; (ii) three possible arrangements of the pump and probe pulses are considered. 56 The first one is at the so-called "magic angle," for which the intensity of the signal is assumed to be only dependent on the magnitude of the TDM and independent from the angle between the dipole moments and the angle between the pump and probe pulse. Other two possible arrangements considered are the parallel and the orthogonal. In these two cases, the intensity of the signal is scaled by a factor A that corresponds to in case of parallel arrangement and to in case of orthogonal arrangement, where TDM i represents the transition dipole moment vector of the transition promoted by the pump at t 0 and TDM j the one promoted by the probe at time t and θ the angle between them. That means that a parallel arrangement leads to a signal three times more intense for transition that have TDM parallel to the one generated by the pump pulse, while the one orthogonal to it will be two times more intense for orthogonal arrangement, and in general indicates a modulation of the relative intensities of the signals; (iii) an additional broadening in the time domain as is applied to account for finite pulse duration; (iv) white light is assumed for the probe pulse, which means that the pulse covers the whole spectral window of interest.

| Preparation and analysis scripts for nonadiabatic dynamics
We implemented a new series of auxiliary scrips in COBRAMM that help the user prepare the system step-by-step, and analyze the results of the simulation. These python scripts are added to the already present system-preparation tool (cobramm.prep), with the advantage of being divided according to the individual task to be performed. This allows to use all of them in serial or independently for particular tasks needed by the user.
An extremely important, yet delicate, task is the generation of meaningful initial conditions used to propagate the trajectories. 57,58 This aspect becomes even more sensitive in case of QM/MM setups. 59 In COBRAMM, we tackle this task in two ways, depending on the complexity of the system.

| Isolated chromophores in solution
This task is fully covered by new python scripts. Starting from an isolated geometry of a chromophore, that might be one or more organic molecules, this is solvated with a box of solvent molecules and the whole system is minimized, heated, and equilibrated at the MM level with a standard procedure, as described in section 4.1, using General

| Built-in calculation of wave function overlaps
As suggested by Hammes-Schiffer and Tully, the nonadiabatic coupling between two states can be reliably approximated by the overlap between the wavefunction of these states. 53 In this way, the TDC can be expressed as: In COBRAMM, the TDC is approximated to finite difference and computed from the anti-symmetrized projection, using the overlap matrix at three consecutive points We then implemented a second, faster, way of computing WFO that relies on transformational sub-matrices of occupied and virtual orbitals.
In TD-DFT within the Tamm-Dancoff approximation 64 (TDA), the wavefunction Ψ i (r) of the ith electronic state at position r can be written as: with C r ia and Φ p a being the CI coefficients of state i and are the determinants associated with each single excitation from occupied orbital a to virtual orbital p, respectively. With this definition the overlap term between states i and j at two different geometries r and r 0 becomes where Φ p a r ð Þ ¼j ψ 1 r ð Þ, …, ψ aÀ1 r ð Þ, ψ p r ð Þ,ψ aþ1 r ð Þ, …, ψ n r ð Þ⟩ ð15Þ with ψ p r ð Þ being the p-th molecular orbital (MO). The determinant overlap term B virt a denotes the projection of the occupied orbital a at geometry r on the subset of virtual orbitals at geometry r 0 .
Since the single electron excitation requires to exchange an MO from the occupied subset with an MO from the virtual subset, the CI coefficients at geometry r 0 are modified by the transformation via the expression where C i is defined as the matrix of CI coefficients of dimension n Â m. Since U occ is unitary matrix, the transpose of its inverse gives the U occ matrix itself. Thus, only the inverse of the MO overlap matrix of the virtual orbitals needs to be computed. In practice, since the occupied and virtual blocks are not complete, the two expressions (whether using S ÀT occ,occ or S occ,occ ) give slightly different results which are related to the quality of the approximation of completeness.
With the above MO transformation at geometry r 0 it is achieved that the full MO overlap matrix becomes a unitary matrix Consequently, the determinant overlap term ⟨Φ p a r ð ÞjΦ s b r 0 ð Þ⟩ survives only when bra and ket are identical determinants (i.e., ⟨Φ p a r ð ÞjΦ p a r 0 ð Þ⟩) simplifying the calculation of the wavefunction overlap The presented approach achieves accuracy comparable to the explicit computation of the overlap via Slater determinants (Equation 14), but accelerates immensely the calculation, as will be shown in section 4.2.

| Evaluation of excited states transition dipole moment
The probability and intensity of a transition between two electronic states is proportional to the magnitude of the TDM between them.
Consequently, in order to evaluate the ESA, it is necessary to compute the TDM between a currently populated excited state and the higherlying states. This can be analytically calculated exactly for wavefunction-based method. 65 Although this is not possible for the linear response formulation of TD-DFT, the most used in computational spectroscopy, several developing and applicative works have been done in this direction. [66][67][68] However, the TDMs can be obtained in an approximate way from the one-electron transition density matrix between two excited states. The transition density matrix is expressed as: where Ψ are the wavefunction of the electronic states, r the spatial and x the spin-space coordinates. In case of CIS or TD-DFT, but only if the TDA 64 is applied to exclude the de-excitation in the wavefunction expansion, the transition densities can be obtained as diagonal element of this matrix, using the Kohn-Sham molecular orbitals, as In case of a TDA-TD-DFT calculations, ω corresponds to the element of the vector X of the Casida equation. 69 Multiplying Equation (26) with the dipole moment of each state and integrating over the whole space the TDM is obtained. In our current implementation, this is done by Multiwfn, 70   proposed as a molecular model system to test nonadiabatic methods since its S 2 /S 1 decay resembles the second one-dimensional model original proposed by Tully for the same aim, and already used with this purpose. 77 For all these reasons, DMABN represents an ideal system to test the reliability and robustness of our implementation on its different steps. We will first test our TSH implementation, with the hopping driven by the computation of WFO for the S 2 /S 1 early relaxation within the first 100 fs. Second, we will test the setup to calculate the TA spectrum on the same 100 fs long trajectories, to map the very first nonadiabatic event during the dynamic of this molecule.

| Computational details
The initial DMABN geometry was solvated with a box of 24 Å of MeCN molecules. The parameters for MeCN were taken from literature, 78 while the General Amber Force Field was used to parametrize the chromophore. 60 The whole system was minimized classically for 1000 cycles, half with steepest descent and half with gradient conjugated method, then heated to 300 K for 20 ps and then pressure and volume equilibrated in for 300 ps with the temperature kept constant with a Langevin thermostat. 79 During all these steps SHAKE was activated to constrain all the bond involving hydrogen atoms, 80 allowing a time step of 2 fs, periodic boundary conditions were used and the cutoff to calculate the electrostatic interaction in the real space within the particle meshed Ewald scheme was set to 12 Å. 81

| DMABN non-adiabatic dynamics
The main feature of the early nonadiabatic dynamics of DMABN is the ultrafast relaxation from S 2 to S 1.   85 However TSH at algebraic diagrammatic construction method showed an almost complete depopulation of S 2 , without back hops to that state. 31 Along our simulation time, we can clearly see the oscillations typical of this behavior in the population profiles, until the S 2 population is 20% of the total one after 100 fs.
We tested the performance computing the time derivative coupling by approximating the NAC by WFN with the two algorithms we implemented in COBRAMM (section 3.2). We ran two trajectories starting from the same initial condition and propagate them adiabatically on S 2 surface, forbidding the hops to other states. We computed the time-derivative coupling between S 2 and S 1 along these trajectories, as show in Figure 4.
As it is visible, the two algorithms give the same result for the WFO, resulting in the same values for the TDC along the trajectories.
The main difference stands then in the time needed to compute the WFO. In our implementation, the orbital rotation scheme calculates the TDC in less of a second, for the system used and with the number of roots included, which turned out to be up to 100 times faster along the trajectories studied, suggesting this method as the indicated one, giving identical result in a more efficient way. with opposite sign intensity can be found the SE signal at higher energy. We successfully reproduced the signatures of TA spectrum, which is reported in Figure 5.
The SE signal reflects the ultrafast decay from the bright S 2 state, which is responsible of the intense spot at 5 eV, to the darker S 1 state, that shows the constant band centered at 4.5 eV.
The ESA band is composed by the intense signal centered at 1.75 eV. As for other simulations, we overestimate the initial SE energy. 72,84 Regarding the ESA, experimentally, 74 it was found to be centered at around 1.8 eV and slightly higher in the simulations with higher-accuracy methods (around 2 eV), so we reproduce satisfactorily the energy of the signal. This signal is assigned to the absorption of the local excited state S 1 to higher excited states. The relative intensity of the two signals is also in agreement with previous results, with the strongest signal given by the initial spot of the S 2 emission, and the ESA and SE bands from 10 to 100 fs less intense and constant over this time. 31,72,84 Both signals are expected to decay at longer time in the picosecond time scale and a signal associated to the emission from a second S 1 minimum is expected to rise, but we stopped our trajectories after 100 fs, since the goal was only to verify and proof the applicability of the workflow implemented and presented, which will be soon applied to simulate the ultrafast TA spectra of other molecules, whose ultrafast dynamics is less clear and might be only simulated at the TD-DFT level, due to the size of the chromo- In order to finally disentangle the role of different excited states involved in the ESA band, a decomposition of the that band is performed at the magic angle orientation (Figure 8).
We reported the single contributions of the S 1 -> S n transition and S 2 -> S n transitions. This is practically done by convoluting the F I G U R E 7 Transient absorption spectra for parallel (left), magic angle (center), and orthogonal (right) orientation of the pump and probe pulses spectrum considering only the S 1 (or S 2 ) to higher states transitions individually, for each step in which the initial state of interest is also active one. We can identify six different sub-signals (Figure 8a-f) that altogether give rise to the full band (bottom right square, Figure 8).
Signals from S 1 start to appear after 10 fs, once this state is populated ( Figure 8, top half). In detail, the S 1 to S 2 transition does not give any signal in the energy range considered (1-3 eV) due to the close vicinity in energy along the dynamics. The first sub-signal (a) is centered at F I G U R E 8 Excited state absorption band decomposition in single contributions of transition from active state S 1 (top half) and S 2 (bottom half) to single S n and full signal (bottom right square). Each box represents a single excite state absorption band and the letters indicates individual bands to whose single state absorptions contribute to 1.75 eV and is obtained from excitations from S 1 to S 3 and S 4 . These components are the relatively most intense ones among the whole set of transitions considered and their absorption represents the main contribution to the ESA signal. Transitions from S 1 give rise also to the other part of the band. Excitations to S 5 -S 7 give a weak signal centered at 2 eV (b), while the S 1 ->S 8 absorption is centered at 2.25 eV (c) and finally the upper blurred contribution of the total ESA signal is obtained from transition to S 9 and S 10 (d). Although it is rapidly depopulated, the S 2 relative population along the dynamics still produces some contribution to the total ESA band. Transitions to S 3 and S 4 do not contribute at any time along the dynamics, while a first contribution from this state is found where it is excited to S 5 À S 8 (e), which excitations contribute in minor part to the main signal at The total results show how such type of post-processing tools might turn out as a crucial tool to disentangle more complex TA signals.

| CONCLUSION AND OUTLOOK
In this work, we presented some of our most recent implementations in COBRAMM, a software developed in our group to simulate photo- where double excitations might be relevant in the TA. We aim to extend the workflow to methods like ADC(2), 86 for which, being a black box method as TD-DFT, the only work needed to include it in our workflow would be to write an interface to any QM software where this method is implemented. The same workflow, but not in an automatized way, has been already successfully applied by our group by using multireference methods. 12 It is our intention to implement an automatization of this workflow, including the choice of active space and other crucial aspects of these kind of calculations. Finally, our protocol could be in the future extended and adapted to simulate other type of TR experiments like TR photoelectron spectroscopy or TR Xray absorption spectroscopy.