#### Methods

In this perspective, we compare 2DUV spectra of a cyclic Cysteine-Phenylalanine-Tyrosine-Cysteine (CFYC) tetrapeptide (Fig. 4a) calculated with two different approaches: the EM, based on the same Frenkel exciton matrix used in the EHEF method (see eq. (1) in Ref. [36]), and the *ab initio* model, based on the sum-over-states (SOS) approach and QM/MM (CASSCF/CASPT2//Amber) calculations (named SOS//QM/MM, or simply *ab initio*, hereafter). In the EM, the single excitation energies are calculated for isolated chromophores (by means of gas-phase *ab initio* calculations) and used as parameters in the exciton Hamiltonian, whereas the couplings between two excitons (within or between chromophores) are estimated using a quasiparticle approach,[41] with mixed doubly excited states located at energies that are the sum of the corresponding single excitation energies. In the *ab initio* description of the electronic structure, the double-exciton manifold comprises different localized doubly excited states, high-lying singly excited states, and delocalized doubly excited states that are located at an energy different from the exact sum of the single excitation energies (due to the quartic coupling, δ). As the nonlinear response detected by a 2D experiment is represented by the evolution of the density matrix in Liouville space,[40] a difference in the distribution of energy levels gives rise to differences in the calculated 2D electronic spectra.
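The energy bookkeeping of the EM described above can be illustrated with a minimal Python sketch for two coupled two-level chromophores. It is an illustration under stated assumptions, not the protocol of Ref. [36]: the site energies are the approximate L_{b} values quoted elsewhere in this perspective (∼36,500 and ∼38,000 cm^{−1}), the coupling `J_PLACEHOLDER` is a hypothetical number chosen only for demonstration, and the quasiparticle treatment of Ref. [41] is not reproduced.

```python
import math

def one_exciton_energies(e1, e2, j):
    """Eigenvalues of the 2x2 one-exciton matrix [[e1, j], [j, e2]] (cm^-1)."""
    mean = 0.5 * (e1 + e2)
    half_gap = math.hypot(0.5 * (e1 - e2), j)
    return mean - half_gap, mean + half_gap

def mixed_double_exciton_energy(e1, e2):
    """EM assumption: the mixed doubly excited state sits at the sum of the
    two single excitation energies (no quartic coupling)."""
    return e1 + e2

# Approximate L_b site energies taken from the text (cm^-1).
E_TYR = 36500.0
E_PHE = 38000.0
# HYPOTHETICAL coupling, for illustration only; not a value from this work.
J_PLACEHOLDER = 100.0

lower, upper = one_exciton_energies(E_TYR, E_PHE, J_PLACEHOLDER)
double = mixed_double_exciton_energy(E_TYR, E_PHE)

print(f"one-exciton levels: {lower:.1f}, {upper:.1f} cm^-1")
print(f"mixed double-exciton level: {double:.1f} cm^-1")
```

In this two-site sketch the coupling shifts the two one-exciton levels symmetrically, so their sum, and hence the EM estimate of the mixed doubly excited state, does not depend on the coupling. Any deviation from that sum in the *ab initio* level scheme therefore has to come from the quartic coupling δ.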

Simulations of 2DUV spectra with the EM were performed following the computational protocol described previously.[36] The computational procedure for simulating 2DUV spectra from *ab initio* calculations includes the following steps: (i) configurational space sampling with classical MD simulations, (ii) selection of MD snapshots, (iii) refinement of selected geometries at the QM/MM level, (iv) calculation and collection of energies and transition dipole moments, and (v) calculation of 2DUV signals. In this perspective, we consider only one selected structure of the biological target and focus on the comparison of 2DUV spectra obtained with a Frenkel exciton method and with first-principles calculations; the inclusion of thermal fluctuations (which shape the 2D electronic peaks and are disregarded here) is straightforward and will be documented soon.

Classical MD simulation of the cyclic CFYC oligopeptide was carried out for 40 ns after pre-equilibration, using a 2 fs time step. The CFYC molecule was capped with acetyl and N-methylamine protecting groups and solvated in a box of TIP3P water molecules,[42] using cubic periodic boundary conditions as implemented in the Amber 11 tools,[43] and the standard ff10 Amber force field. The particle-mesh Ewald approach was applied to treat long-range electrostatic interactions, with a cut-off of 12 Å for nonbonding interactions. After initial relaxation, heating in six stages of 50 K using a Langevin thermostat was applied to reach a constant temperature of 300 K. Subsequently, the system was equilibrated for 5 ns at a constant pressure of 1 atm. Cluster analysis of the 40 ns trajectory indicates that the aromatic rings are oriented in a T-stacked conformation in ∼75% of the frames, with a shallow barrier allowing for the short-time population of unstacked conformations. The lowest-energy geometry, which has a T-stacked conformation (Fig. 4a), was selected as a representative structure of the equilibrium dynamics of the cyclic CFYC oligopeptide in solution. The selected geometry was refined by an optimization at the QM/MM level performed with the Cobramm package,[44] using Molpro 2010[45] for state-of-the-art QM calculations, and an electrostatic embedding scheme for describing the electrostatic QM/MM interactions. The link-atom technique[46] and redistribution of residual charges among nearest neighbors were used, with both aromatic side chains included in the QM layer and the remaining atoms treated classically. The H-atom link was located along the C_{α}-C_{β} bond axis of the aromatic side chains. The peptide and the water molecules participating in hydrogen bonds with CFYC were allowed to move during the optimization, whereas the remaining bulk waters were kept frozen.
The structure was optimized at the CASSCF[47] level with an active space of eight electrons in eight aromatic orbitals (four electrons and four orbitals per chromophore). The ANO-L(4s3p2d/2s)[48] basis set was used to account for both polarization and dispersion. To collect the information on the single- and double-excitation manifolds required for the calculation of 2DUV signals, a single-point state-average CASSCF(12,12)/ANO-L(4s3p2d/2s) calculation with 120 roots, followed by an independent single-state CASPT2 correction for the lowest 70 roots, was performed on top of the CASSCF-optimized structure (the so-called CASPT2//CASSCF approach)[49] using Molcas 7.7.[50] The numbers of roots included in the CASSCF and CASPT2 calculations were chosen to ensure that all excitations of the double-exciton manifold that lie, upon PT2 correction, in the energy range reported in the 2DUV spectra are included. An imaginary shift[51] of 0.2 was used, and the Ionization Potential Electron Affinity (IPEA) shift[52] was set to zero. Transition dipole moments were calculated at the CASSCF level.

For the generation of the 2D-NUV rephasing signal (K_{I}), heterodyne-detected by superimposing it with a local oscillator (LO) in the direction *k*_{I} = −*k*_{1} + *k*_{2} + *k*_{3} (Fig. 1), we assume four equivalent short Gaussian laser pulses with a central frequency of 38,000 cm^{−1} and a full width at half maximum of 5864 cm^{−1} (corresponding to a Fourier-limited pulse of ∼2.5 fs). Longer laser pulses (15–20 fs) can be used in realistic experiments, bearing in mind that the corresponding narrowing of the pulse bandwidth implies weakening (or loss) of some 2DUV signals. The ideal central frequency and duration of the laser pulses thus depend on the type of signals that are investigated. We focus on the NUV region between 34,000 and 44,000 cm^{−1}, which shows clear signatures of the L_{b} signals. This pulse spectrum does not cover the DUV region above 45,000 cm^{−1}, where the very intense B_{a}, B_{b}, and backbone amide absorptions are expected to obscure the L_{a} signals. A constant broadening of 200 cm^{−1} was used throughout. Calculations were performed with Spectron 2.7[21] for the nonchiral xxxx polarization configuration. The 2D signals, which are functions of the two frequencies Ω_{1} and Ω_{3}, were calculated by 2D Fourier transformation along *t*_{1} and *t*_{3}, whereas *t*_{2} was set to zero, thereby suppressing coherent excited-state dynamics.
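The relation between the pulse bandwidth and the Fourier-limited duration quoted above can be checked with a short Python sketch. It assumes that both widths are intensity FWHMs of a Gaussian pulse, for which the time-bandwidth product is 2 ln 2/π ≈ 0.441; the text does not state which convention was used, so this is an assumption, and the function names are illustrative.

```python
import math

C_CM_PER_S = 2.99792458e10                   # speed of light in cm/s
GAUSSIAN_TBP = 2.0 * math.log(2.0) / math.pi  # ~0.441 (intensity FWHMs)

def fourier_limited_duration_fs(fwhm_cm1):
    """Shortest Gaussian pulse duration (fs) for a spectral FWHM in cm^-1."""
    bandwidth_hz = fwhm_cm1 * C_CM_PER_S
    return GAUSSIAN_TBP / bandwidth_hz * 1e15

def fourier_limited_bandwidth_cm1(duration_fs):
    """Spectral FWHM (cm^-1) of a Fourier-limited Gaussian pulse of given duration."""
    return GAUSSIAN_TBP / (duration_fs * 1e-15) / C_CM_PER_S

def gaussian_spectrum(omega_cm1, center_cm1=38000.0, fwhm_cm1=5864.0):
    """Pulse spectrum relative to its peak value at frequency omega_cm1."""
    x = (omega_cm1 - center_cm1) / fwhm_cm1
    return math.exp(-4.0 * math.log(2.0) * x * x)

print(f"duration for 5864 cm^-1: {fourier_limited_duration_fs(5864.0):.2f} fs")
for tau in (15.0, 20.0):
    print(f"bandwidth for {tau:.0f} fs: {fourier_limited_bandwidth_cm1(tau):.0f} cm^-1")
for omega in (34000.0, 44000.0, 45000.0):
    print(f"relative spectrum at {omega:.0f} cm^-1: {gaussian_spectrum(omega):.3f}")
```

Under this assumption the 5864 cm^{−1} bandwidth corresponds to about 2.5 fs, consistent with the value given in the text, whereas Fourier-limited pulses of 15–20 fs have bandwidths of well under 1000 cm^{−1}. This illustrates why longer pulses cannot cover the whole 34,000–44,000 cm^{−1} window at once.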

#### 2DUV spectra of a cyclic tetrapeptide

Figure 4 compares the simulated 2D spectra in the NUV region (i.e., 34,000–44,000 cm^{−1}) of the solvated CFYC oligopeptide in a T-stacked conformation, obtained with the EM and the *ab initio* approaches. A single geometrical configuration of the cyclic tetrapeptide, which is representative of the most populated cluster, was harvested from the classical MD simulation and structurally refined at the QM/MM level (Fig. 4a). The unconstrained geometry optimization led to planarization of the two chromophores, while keeping the relative distance between them and the conformation of the peptide unchanged with respect to the initial structure.

The EM spectrum (Fig. 4c) is dominated by the diagonal L_{b} absorption of Tyrosine (Tyr) at ∼36,500 cm^{−1}, which is close to the central frequency set for the four laser pulses (i.e., 38,000 cm^{−1}). The closely lying L_{b} absorption of Phenylalanine (Phe) around 38,000 cm^{−1} has a molar absorption coefficient much smaller than that of Tyr (195 vs. 1405 M^{−1} cm^{−1}, respectively)[53] and is, hence, not clearly visible in the spectrum.

Within the *ab initio* SOS//QM/MM approach, excitation energies and transition dipole moments of the singly and doubly excited manifolds lying in a large energy window are needed to simulate the 2DUV spectrum. Figure 4b shows the level scheme obtained from CASPT2//CASSCF calculations, indicating the energy levels that are involved in bright transitions among the singly and doubly excited manifolds. The CASPT2//CASSCF calculations indicate the presence of several excited states lying in the region between 70,000 and 80,000 cm^{−1} (i.e., between 34,000 and 44,000 cm^{−1} above the L_{b} manifold) that have nonvanishing transition dipole moments from the singly excited (L_{b}) manifold (36,500–38,000 cm^{−1}).

As in the spectrum obtained at the EM level, the calculated SOS 2DUV spectrum is dominated by the L_{b} absorption of Tyr. However, the SOS spectrum is much richer than the EM one, owing to resolved bright excitations to doubly excited and high-lying singly excited states, which appear in the spectrum as off-diagonal peaks. Part of the off-diagonal contributions to the nonlinear signal arises from excitations to local doubly excited states that can be accessed from the singly excited manifold of the chromophores. In particular, an excitation out of the L_{b} state of Tyr to the localized doubly excited state (D) appears in the *ab initio* spectrum at Ω_{1} = 36,516 cm^{−1} and Ω_{3} = 34,625 cm^{−1} (peak 3). Due to its local nature, this signal appears independently of the presence of a chromophore–chromophore interaction. However, its spectral position and relative intensity depend on the immediate local environment and thus contain implicit structural information.

Excitations to local doubly excited states are usually neglected in Frenkel EMs, where only the double excitations associated with combinations of electronic states in the single-exciton manifold are considered. In particular, in the cyclic tetrapeptide, a mixed doubly excited state resulting from the combination of the L_{b} states of the two chromophores (2L_{b}) can be accessed by excitation out of the L_{b} state of either chromophore, Tyr or Phe. In the Frenkel EM, the energy of the 2L_{b} state is approximated as the exact sum of the energies of the two L_{b} states (2L_{b}^{F}, see Fig. 4b), leading to two symmetric off-diagonal peaks in the 2D map. However, in the CFYC tetrapeptide, the excitations from the L_{b} states to the 2L_{b} state have small oscillator strengths and are thus not visible in the EM spectrum (Fig. 4c). While the *ab initio* SOS//QM/MM results confirm that the 2L_{b} off-diagonal peaks are covered by the intense L_{b} bleach signal of Tyr, they also indicate the presence of a quartic coupling that places the 2L_{b} state 1359 cm^{−1} above the sum of the energies of the two L_{b} states (Fig. 4b). This outcome implies that the *ab initio* SOS//QM/MM approach allows assessment of the quartic couplings between electronic states of the single-exciton manifold and thus determines more accurately the positions of the positive signals associated with mixed doubly excited states.
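The effect of the quartic coupling on the 2L_{b} peak positions can be made explicit with a few lines of Python. The sketch assumes that the L_{b} energies of Tyr and Phe equal the Ω_{1} values of the *ab initio* peaks quoted in this section (36,516 and 37,938 cm^{−1}), and that an excited-state absorption appears at Ω_{3} equal to the energy difference between the final and the initial excited state. The resulting numbers are plain arithmetic on values from the text, not additional computed results.

```python
# L_b energies (cm^-1), assumed equal to the Omega_1 values quoted in the text.
E_LB_TYR = 36516.0
E_LB_PHE = 37938.0
# Quartic coupling shift of the 2L_b state reported in the text (cm^-1).
QUARTIC_SHIFT = 1359.0

def mixed_state_energy(e_a, e_b, shift=0.0):
    """Energy of the mixed doubly excited state; shift=0 is the Frenkel EM limit."""
    return e_a + e_b + shift

def esa_omega3(e_final, e_initial):
    """Omega_3 of an excited-state absorption from e_initial to e_final."""
    return e_final - e_initial

for label, shift in (("Frenkel EM", 0.0), ("with quartic shift", QUARTIC_SHIFT)):
    e_2lb = mixed_state_energy(E_LB_TYR, E_LB_PHE, shift)
    print(label, "2L_b energy:", e_2lb)
    print("  from Tyr L_b: Omega_1 =", E_LB_TYR, "Omega_3 =", esa_omega3(e_2lb, E_LB_TYR))
    print("  from Phe L_b: Omega_1 =", E_LB_PHE, "Omega_3 =", esa_omega3(e_2lb, E_LB_PHE))
```

In the Frenkel limit the two absorptions fall at (Ω_{1}, Ω_{3}) = (36,516, 37,938) and (37,938, 36,516) cm^{−1}, which is the symmetric pair of off-diagonal peaks mentioned above. Adding the quartic shift moves both peaks to higher Ω_{3} by the same amount and removes this symmetry.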

Other important off-diagonal contributions to the 2DUV spectrum are caused by excitations from local excited states (such as the L_{b} states) to high-lying singly excited states (S) that can have localized, delocalized (S_{L} and S_{D}, black and green lines in Fig. 4b, respectively) or charge-transfer (CT) character, depending on the nature of the molecular orbitals involved in the excitation out of the L_{b} states and the permanent dipole moment of the final excited state. For example, excitation out of the L_{b} state of Tyr to a high-lying singly excited state localized on the same chromophore gives rise to a strong positive peak at Ω_{1} = 36,516 cm^{−1} and Ω_{3} = 40,912 cm^{−1} (peak 6 in Fig. 4d), whereas the analogous transition from the L_{b} state of Phe gives rise to a strong positive signal at Ω_{1} = 37,938 cm^{−1} and Ω_{3} = 37,257 cm^{−1} (peak 9), which overlaps with the tail of the main diagonal (negative) peak of the Tyr L_{b} state. Moreover, excitations from the L_{b} states of Tyr and Phe to two delocalized high-lying singly excited states (S_{D}) give rise to two strong positive peaks at Ω_{1} = 36,516 cm^{−1} and Ω_{3} = 38,186, 38,831 cm^{−1} (peaks 4 and 5, respectively) and two strong signals at Ω_{1} = 37,938 cm^{−1} and Ω_{3} = 36,764, 37,409 cm^{−1} (peaks 8 and 10, respectively). Again, the excitations out of the L_{b} state of Phe (peaks 8 and 10) overlap with the tail of the Tyr L_{b} state absorption and therefore show up as weak positive peaks in the 2D map. The excitations out of the L_{b} states of both chromophores can also populate CT states, i.e., high-lying singly excited states with a permanent dipole moment different from that of the ground state.
In particular, transitions to the excited state characterized by transfer of charge from Tyr to Phe, namely CT(Y→F), contribute to the 2D spectrum with two positive signals at Ω_{1} = 36,516 cm^{−1} and Ω_{3} = 41,517 cm^{−1} (peak 7) and Ω_{1} = 37,938 cm^{−1} and Ω_{3} = 40,095 cm^{−1} (peak 11), corresponding to excitations from the L_{b} states of Tyr and Phe, respectively.
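The peak assignments above can be cross-checked by recovering the energy of each final state from the peak coordinates. The Python sketch below assumes that, for these excited-state absorptions at *t*_{2} = 0, Ω_{1} is the energy of the initially excited L_{b} state and Ω_{3} is the energy gap to the final state, so that the final-state energy is Ω_{1} + Ω_{3}. The peak labels and coordinates are those quoted in the text; the function and variable names are illustrative.

```python
from collections import defaultdict

# Peak label -> (Omega_1, Omega_3) in cm^-1, as quoted in the text.
PEAKS = {
    3: (36516, 34625),   # Tyr L_b -> local doubly excited state D
    4: (36516, 38186),   # Tyr L_b -> delocalized S_D
    5: (36516, 38831),   # Tyr L_b -> delocalized S_D
    6: (36516, 40912),   # Tyr L_b -> localized S_L
    7: (36516, 41517),   # Tyr L_b -> CT state
    8: (37938, 36764),   # Phe L_b -> delocalized S_D
    9: (37938, 37257),   # Phe L_b -> localized S_L
    10: (37938, 37409),  # Phe L_b -> delocalized S_D
    11: (37938, 40095),  # Phe L_b -> CT state
}

def final_state_energy(omega1, omega3):
    """Final-state energy under the assumption E_final = Omega_1 + Omega_3."""
    return omega1 + omega3

def group_by_final_state(peaks):
    """Map each final-state energy to the sorted list of peaks that reach it."""
    groups = defaultdict(list)
    for label in sorted(peaks):
        omega1, omega3 = peaks[label]
        groups[final_state_energy(omega1, omega3)].append(label)
    return dict(groups)

for energy, labels in sorted(group_by_final_state(PEAKS).items()):
    print(f"{energy} cm^-1 <- peaks {labels}")
```

With this assumption, peaks 4 and 8, peaks 5 and 10, and peaks 7 and 11 each share a final-state energy, as expected for states that are reached from both chromophores, and all final states fall in the 70,000–80,000 cm^{−1} window of the level scheme in Fig. 4b.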

The positions and intensities of the signals involving delocalized high-lying singly excited states depend explicitly on the electronic coupling between the two aromatic residues. In the T-stacked conformation of the CFYC tetrapeptide, we observed several bright excitations to such delocalized excited states (peaks 4–5, 7–8, 10), including excitation to a CT state (Fig. 4d). These signals are neglected in the Frenkel exciton approach (Fig. 4c), implying that the EM signal misses some of the structural information and part of the electronic couplings contained in the 2D map, which can instead be revealed by simulations based on first-principles calculations.