Control of Protonated Schiff Base Excited State Decay within Visual Protein Mimics: A Unified Model for Retinal Chromophores

Artificial biomimetic chromophore-protein complexes inspired by natural visual pigments can feature color tunability across the full visible spectrum. However, control of the excited state dynamics of the retinal chromophore, which is of paramount importance for technological applications, is lacking due to its complex and subtle photophysics/photochemistry. Here, ultrafast transient absorption spectroscopy and quantum mechanics/molecular mechanics simulations are combined for the study of highly tunable rhodopsin mimics, as compared to retinal chromophores in solution. Conical intersections and transient fluorescent intermediates are identified with atomistic resolution, providing unambiguous assignment of their ultrafast excited state absorption features. The results show that the electrostatic environment of the chromophore, modified by protein point mutations, affects its excited state properties, allowing control of its photophysics as powerful as that achieved by chemical modification of the chromophore. Understanding the complex nature of such fine control is fundamental for the design of bio-mimetic opto-electronic and photonic devices. A joint experimental-computational study elucidates the photophysics of retinal Schiff bases embedded in artificial proteins capable of mimicking the color tunability of natural visual pigments, as compared to solvated chromophores. Combining ultrafast transient absorption spectroscopy with quantum mechanics/molecular mechanics simulations allowed monitoring of the nonlinear optical signals of the chromophore, shedding light on the complexity behind the fine control of its excited state lifetime.


Franck-Condon region
Ground state geometries. Considering the CRABPII proteins in the absence of explicit solvent molecules (apart from the water molecules present in the crystallographic structures) and optimizing the ground state geometries at the CASSCF level leads to computed vertical S0→S1 transition energies that are largely blue-shifted with respect to the experimental absorption maxima (see Table 1 in the main text), with blue-shifts of 0.88, 0.73 and 0.22 eV for M4, M8 and M10 respectively at the SS-CASPT2 level (and of 0.63, 0.64 and 0.41 eV at MS-CASPT2), while the expected accuracy of this QM/MM approach is around 0.2 eV. Changing the optimization method from CASSCF to DFT (B3LYP) or MP2 for the ground state geometries did not correct this large blue-shift. Since the crystal structures (protonated using standard protonation states) are negatively charged, we have also considered proteins neutralized with Na+ counterions, which still show analogous large blue-shifts. As shown in Figure S1, inspection of the X-ray structure of the hCRBPII proteins indicates that the PSB molecule (and in particular the ionone ring) is more exposed to solvent water molecules in the mimics than in natural opsin proteins, which are transmembrane proteins that efficiently shield the PSB moiety from its external environment. Thus, explicit solvent molecules are required for a realistic modeling of these systems. As described in the main text, we have followed the approach previously used for CRABPII proteins, performing classical molecular dynamics (MD) simulations and selecting a limited number of frames (ten for each hCRBPII protein) that have subsequently been refined with QM/MM optimizations to obtain the final ground state geometries.
The bond lengths and the dihedral values along the PSB polyene chain, obtained after CASSCF ground state optimizations at the QM/MM level, are collected for all ten configurations of each protein. We present in this section the average values obtained. The bond lengths show very little deviation from configuration to configuration for a given protein, in contrast to the dihedral angles (Figure S2). As can be seen, the PSB in M4 has a more pronounced bond length alternation in the C12-C15 region, and it is more distorted. As expected from the minor difference in the crystal structure, the PSBs in M8 and M10 have very similar geometries. Interestingly though, the ionone ring is more twisted in M8 than in M10, with C5-C6-C7-C8 dihedral angle values of 48.1° and 17.8° respectively, even though the only difference between M8 and M10 is residue number 4 (Gln in M8, Arg in M10), located at the other side of the PSB.
Excited states wavefunctions. In this section we analyze the wavefunctions obtained after CASSCF optimizations of the (ten) different configurations in the ground state. We also present the Perturbatively Modified CASSCF (PM-CASSCF) wavefunction, arising from the diagonalization of the effective Hamiltonian in the MS-CASPT2 approach (ref. 1). All the transition energies are presented in eV, and are computed with the ANO-S basis set. For the wavefunction characters, only configurations with a weight of more than 0.09 were kept. « GS » refers to the closed shell determinant, « H » refers to the Highest Occupied Molecular Orbital, « L » refers to the Lowest Unoccupied Molecular Orbital, and the superscript « 2 » denotes a doubly excited transition. As indicated in the main text, the M8 wavefunctions feature two distinct behaviors among the ten solvent configurations selected from the MD sampling. 
Indeed, in four cases (Table S3), the S1 and S2 states are well separated in energy, with an S1 state dominated by the H → L configuration and an S2 state dominated by the (H → L)² configuration (as for M10), while in all the others, the same mixing as for M4 appears. This translates into two sets of vertical absorptions, where the 'M4-like' (namely M8'') configurations give highly blue-shifted values, while the 'M10-like' (namely M8') configurations provide a better agreement with the experimental linear absorption spectrum. Given the proximity of M8 and M10, we expect an overall similar behavior for these two systems. However, we obtained more M8'' configurations than M8', indicating that M8 is a challenging case for our (limited) sampling approach, possibly requiring quite extended sampling of solvent configurations and accurate refinement of the initial crystal structure. Obtaining a large and very accurate sampling of the conformational space of solvated hCRBPII proteins is beyond the scope of this work, where the MD sampling has been performed to alleviate the limitations of computing transition energies on top of crystal structures in gas-phase and to obtain at least a few representative protein structures in order to investigate the decay dynamics of the photogenerated S1 state in the PSB.
Permanent dipole moments. The permanent dipoles, extracted from the CASSCF wavefunctions, have been collected for all ten configurations of each protein, and are displayed in Figure S3. In M4, large variations in direction are observed for the S1 dipoles, which in some cases overlap with the S2 dipoles. The direction of the dipole moment in S0 is consistent with the localization of the charge in the ground state on the nitrogen side of the PSB molecule. Upon excitation, a charge transfer along the PSB conjugated chain is expected. However, the direction of the S1 dipole moments indicates a quite ineffective charge transfer in this protein. 
The intensity of the dipole moment also decreases sharply when going from S0 to S1, which indicates that the charge separation is low in the excited state. Finally, here again the strong influence of the HBN is clearly noticeable, given the wide variations in direction.
Figure S3. Distribution of the permanent dipole moments obtained for the three proteins, for the states S0 (blue), S1 (green) and S2 (red), computed at the CASSCF/ANO-S level.
In M8, there is also a wide variation in the directions of the S1 dipole moments. However, the S1 dipole moments are oriented more toward the ionone ring than in M4, indicating a more effective charge transfer in this protein. The increased intensity of this dipole is further evidence of this more efficient charge transfer. Finally, we can notice that the S0 dipoles are less intense than in M4, which indicates a lower localization of charges in the ground state.
Finally, in M10, the dispersion of directions is reduced, consistent with the efficient shielding of the PSB from the external environment. Also, the intensities of the S1 and S0 dipoles are in the same range (13.25 D and 12.30 D respectively), indicating a very effective charge transfer.
The evolution of the norm of the permanent dipole moment vector difference Δµ(S0→S1), i.e. µ(S0) − µ(S1), from M4 to M10, as depicted in Figure S4, further indicates that the charge transfer intensity increases going from M4 to M10. In the three proteins, the direction of the dipole moment differences shows the transfer of positive charge from the nitrogen side to the ionone side of the PSB. The values of Δµ(S0→S1) in M8' and M10 are similar.
Figure S4. Distribution of the permanent dipole moment vector difference, with Δµ(S0→S1) in green and Δµ(S0→S2) in orange, obtained for the three proteins at the CASSCF/ANO-S level.
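As an illustration of why the norm of the dipole difference carries information that the individual dipole intensities do not, the following minimal Python sketch computes the difference vector and its norm. The dipole components are hypothetical values chosen for clarity, not data from this work, and the function names are ours.

```python
import math

def dipole_difference(mu_a, mu_b):
    """Component-wise difference mu_a - mu_b of two dipole vectors (debye)."""
    return tuple(a - b for a, b in zip(mu_a, mu_b))

def norm(vector):
    """Euclidean norm of a 3-component vector."""
    return math.sqrt(sum(c * c for c in vector))

# Hypothetical dipoles (debye): same intensity, opposite direction.
mu_s0 = (3.0, 4.0, 0.0)
mu_s1 = (-3.0, -4.0, 0.0)

delta_mu = dipole_difference(mu_s0, mu_s1)

# The two states have equal dipole intensities, yet the difference vector
# is large because the dipole has reversed its direction.
print(norm(mu_s0), norm(mu_s1), norm(delta_mu))
```

In this toy case the S0 and S1 intensities are identical while the norm of the difference is twice as large, which is the kind of situation in which comparable S0 and S1 intensities are compatible with a substantial displacement of charge.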
Partial charges analysis. The Mulliken charge analysis is carried out on the CASSCF wavefunctions. For each solvent configuration, the charges are computed for all atoms while the hydrogen charges are summed up on heavy atoms. The PSB molecule is divided into two fragments by cutting it at the C11-C12 bond. The two fragments will be referred to as the lysine and ionone part, respectively. The charges of each atom in each fragment are summed up, and the average over the ten snapshots of each protein is computed. The results are presented in Figure S5. The stronger localization of the charge on the lysine side in M4 in the ground state (S0) compared to M8 and M10 is clearly observed. The lysine side indeed accounts for around 90% of the positive charge in M4, whereas this amounts to around 75% in M10. The charge transfer efficiency is computed as the amount of charge that is transferred from one side to the other upon excitation. We found that on average, 27%, 41% and 46% of the charge initially on the lysine side is transferred to the ionone side for M4, M8 and M10 respectively. For completeness, by taking into account the M8' set of M8 configurations, the amount of charge transfer from the lysine side to the ionone side rises to 47%, and it is therefore very close to M10. These results confirm our conclusion that the charge transfer is significantly reduced in M4 with respect to what happens in the two other systems. As mentioned in the main text, PSB in M4 behaves like in methanol solution, as the computed internal charge transfer is comparable to what has previously been found for solvated PSB. Figure S5. Averaged Mulliken charges summed up on the Lysine part (from N to C12) and on the Ionone part (from C11 to ionone ring).
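The fragment analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-atom charges are hypothetical (hydrogen charges already summed onto heavy atoms), the whole ionone-side fragment is lumped into a single entry, and the transferred fraction is taken as the loss of lysine-side charge on excitation divided by the lysine-side charge in S0, which is our reading of the definition in the text.

```python
from statistics import mean

# Lysine-side fragment obtained by cutting the C11-C12 bond (N to C12).
LYSINE_SIDE = ("N", "C15", "C14", "C13", "C12")

def lysine_charge(charges):
    """Sum of the heavy-atom charges on the lysine-side fragment."""
    return sum(charges[atom] for atom in LYSINE_SIDE)

def transferred_fraction(charges_s0, charges_s1):
    """Fraction of the S0 lysine-side charge that moves to the ionone side."""
    q0 = lysine_charge(charges_s0)
    q1 = lysine_charge(charges_s1)
    return (q0 - q1) / q0

# Hypothetical charges for two snapshots, each as an (S0, S1) pair.
snapshots = [
    ({"N": 0.40, "C15": 0.25, "C14": 0.05, "C13": 0.15, "C12": 0.05, "ionone_side": 0.10},
     {"N": 0.30, "C15": 0.15, "C14": 0.05, "C13": 0.08, "C12": 0.05, "ionone_side": 0.37}),
    ({"N": 0.35, "C15": 0.20, "C14": 0.05, "C13": 0.15, "C12": 0.05, "ionone_side": 0.20},
     {"N": 0.28, "C15": 0.12, "C14": 0.05, "C13": 0.10, "C12": 0.05, "ionone_side": 0.40}),
]

fractions = [transferred_fraction(s0, s1) for s0, s1 in snapshots]
average_fraction = mean(fractions)
print([round(f, 3) for f in fractions], round(average_fraction, 3))
```

Averaging the per-snapshot fractions, rather than computing one fraction from averaged charges, is a design choice of this sketch; the two procedures can differ slightly when the S0 lysine-side charge varies between snapshots.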
To investigate the distribution of the charge in the different systems in more depth, a per-atom analysis was carried out, in which we consider only the atoms in the conjugated chain of the PSB.
In this analysis, the charges carried by methyl groups are summed onto the backbone heavy atoms the methyl belongs to. For instance, all the charges in the ionone ring are added to C5. The charges obtained are then averaged for each state in each protein, considering only M8' configurations for M8. The graph obtained is presented in Figure S6. The C12-C13 and C13-C14 bonds in the S0 state of the M4 and M8 proteins appear more polarized than in the M10 protein, which notably relates to the reactivity of the C13-C14 bond in the photoisomerization pathways shown in Figures S16-S17. In M10, the positive charge is evenly spread across the C13-C8 region. The positive charge also appears strongly localized on the N-C15 bond in both M4 and M8. When exciting M4 to S1, the charges of C10 and C8 increase to positive values, while the polarization of bonds in the C13-C10 region is inverted. Overall, the polarization of C-C bonds is reduced in S1. In M8, a situation similar to M4 is observed, although a far larger positive charge transfer is evident from the charge increase in the C10-C7 region. In M10, a different picture appears. As expected from the bar diagrams in Figure S5, the positive charge is transferred toward the ionone ring upon photoexcitation.
To investigate the effect of the environment on the spectroscopic properties of the embedded PSBs in more depth, we extracted the PSB molecule as defined in the high layer from the proteins, after QM/MM geometry optimizations. Without any reoptimization, we re-computed the energies of the S0, S1 and S2 states at the SS- and MS-CASPT2 levels. By calculating the difference between the energies obtained in protein and those obtained in vacuum, we expect to gain insight into the differential effects, on the different states involved, of the set of point charges used for describing the protein. All the results in this section were computed using the 6-31G* basis set, and are reported in Table S7.

Table S7. Averaged vertical excitation energies and corresponding standard deviations computed for M4, M8 and M10 in protein and in vacuum (i.e. by removing all electrostatic contributions). All energies are in eV.
The excitation energies obtained for the PSBs extracted from the three proteins (upon QM/MM optimizations) are close to one another (1.98 eV, 2.17 eV and 2.00 eV for M4, M8 and M10 respectively at SS-CASPT2 level), indicating that the tuning of the vertical absorption observed experimentally is mainly due to the environment (the solvated protein scaffold), and that the geometry of the PSB plays a minor role. Standard deviations of the excitation energies obtained in vacuum for the ten selected configurations (for each protein) are very low, and indicate that the PSB geometry and the vertical absorptions are very stable along each trajectory. The resulting values in vacuum are in good agreement with previous computational studies in gas-phase, and the blue-shift of M8 can be attributed to the twisted conformation of the ionone ring obtained (see Figure S2). It is worth noting that for M10 the results in protein and in vacuum are quite similar, pointing out how the PSB is efficiently shielded from the external environment in the M10 protein, and it behaves very closely to a PSB molecule in gas phase.
Here, we evaluate the electrostatic effect of the protein scaffold as the energy difference $\Delta E = E_{\mathrm{QM}}(\mathrm{QM/MM}) - E_{\mathrm{QM}}(\mathrm{vacuum})$, where $E_{\mathrm{QM}}(\mathrm{QM/MM})$ is the QM energy obtained for the PSB embedded in the solvated protein scaffold, and $E_{\mathrm{QM}}(\mathrm{vacuum})$ is the QM energy of the PSB in the gas phase, considered in the exact same geometry. This difference is computed for the S0, S1 and S2 states for all the snapshots considered before. A negative difference indicates that the protein stabilizes the state under consideration. The results (computed with CASPT2 QM energies) obtained for the three proteins are presented in Figure S7. First, one can notice that the solvated protein scaffold stabilizes all the states in all proteins, due to the overall electrostatic stabilization of the charged PSB species by the environment. 
In particular, the positive charge carried by the PSB is stabilized by the overall negative charges of the M4, M8 and M10 proteins, carrying charges of -3, -5 and -4, respectively (as determined by standard protonation states), with the total systems neutralized by Na+ ions in solution. However, the overall stabilization in M4 and M8 is higher than in M10. This is most likely due to the mutation of Gln to Arg at position 4 in M10, which places another positive charge in close proximity to the protonated Schiff base and thus stabilizes the system less. However, the states are not all equally affected. In M4, the solvated protein stabilizes the covalent S0 and S2 states more than the ionic S1 state, leading to a blue shift of the vertical absorption with respect to gas-phase. The PSB positive charge is mainly located on the nitrogen in the ground state and it is expected to be stabilized by the HBN involving the nearby water molecule and the Gln4 residue in M4 and M8, thus stabilizing S0 and S2 more than in M10. In M10, in fact, all the states are similarly stabilized, and no specific electrostatic effect can be observed. This is consistent with the effective shielding from the external environment in M10. Focusing on the M8' configurations of M8 (i.e. configurations 03, 04, 06 and 09), a similar feature is observed, in which no differential effect is found for the three states.
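The link between the state-specific stabilization and the shift of the vertical absorption can be made explicit with a short Python sketch. It computes, for each state, the difference between the energy in the protein and the energy in vacuum at the same geometry, and then the resulting change in the S0→S1 excitation energy. All energies are hypothetical values in eV, chosen only to mimic a case where S0 and S2 are stabilized more than S1; the function names are ours.

```python
def stabilization(e_protein, e_vacuum):
    """Per-state energy difference (protein minus vacuum), in eV.

    A negative value means the environment stabilizes that state.
    """
    return {state: e_protein[state] - e_vacuum[state] for state in e_protein}

def excitation_shift(delta, lower="S0", upper="S1"):
    """Change of the vertical excitation energy caused by the environment.

    Positive values correspond to a blue shift with respect to vacuum.
    """
    return delta[upper] - delta[lower]

# Hypothetical energies (eV), relative to the vacuum S0 energy.
e_vacuum = {"S0": 0.0, "S1": 2.0, "S2": 3.0}
e_protein = {"S0": -4.0, "S1": -1.5, "S2": -0.9}

delta = stabilization(e_protein, e_vacuum)
shift = excitation_shift(delta)

# All states are stabilized (negative differences), but S1 less than S0,
# so the S0->S1 excitation energy increases.
print(delta, shift)
```

When all three differences are equal, as described for M10 in the text, the shift vanishes and the excitation energy in the protein matches the one in vacuum.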

Photophysical and photochemical properties
Excited state geometries. Optimizing the excited state in hCRBPII proteins is challenging, mainly due to the root flipping that can occur. This has especially been the case for all the configurations in M4, and for some configurations in M8. At the CASSCF level, the H → L root is found to be root number 3 for all configurations in M4. This was also the case for the M8'' configurations of M8. In M10 and the M8' configurations of M8, the H → L root is always root 2. The case of M4, as highlighted in the main text, is highly similar to that of PSB in solution reported previously. Indeed, the PSB optimizations always give an EBL-like geometry, characterized by even bond lengths in the conjugated chain of the PSB. However, some variability of the geometrical parameters among the various configurations is observed, as can be seen in Figure S8. Notably, most configurations have a strong elongation of the C13-C14 bond, from an average of 1.363 Å in the ground state to 1.447 Å in the excited state, although it varies from 1.437 Å to 1.459 Å. The deviations from planarity of the PSB conjugated chain show that two configurations stand out. Indeed, most structures show strong deviations around the C12-C13 and C13-C14 bonds, whereas not much change occurs in this region for configurations 09 and 10. These two have strong distortions around the C11-C12 bond instead. This change of behavior stresses once again the importance of the initial sampling. Indeed, the HBN conformation around the PSB seems to strongly influence the excited state potential energy surface, by favoring some distortions over others.
For M8, the M8'' set leads to EBL-like optimized geometries, while the optimizations of the configurations belonging to the M8' set (except configuration 06) give ABL-like geometries. It is worth noting that configuration 06 lies in between these two situations, with a BLA pattern reminiscent of the one labeled ABL* in a recent study on solvated PSB molecules. The most elongated bond in the ABL optimized configurations is the C13-C14 bond.
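For readers unfamiliar with the BLA descriptor, the sketch below shows one common way to quantify it: the average length of the formally single bonds minus the average length of the formally double bonds of the ground-state polyene chain. The bond lengths, the 0.02 Å tolerance and the three labels are assumptions made for illustration; the assignment of the EBL, ABL and ABL* patterns in this work relies on the full bond length profiles, not on this single number.

```python
from statistics import mean

# Formal ground-state bond orders along the retinal PSB chain.
DOUBLE_BONDS = ("C5-C6", "C7-C8", "C9-C10", "C11-C12", "C13-C14", "C15-N")
SINGLE_BONDS = ("C6-C7", "C8-C9", "C10-C11", "C12-C13", "C14-C15")

def bla(bond_lengths):
    """Bond length alternation (angstrom): mean single minus mean double."""
    singles = [bond_lengths[b] for b in SINGLE_BONDS]
    doubles = [bond_lengths[b] for b in DOUBLE_BONDS]
    return mean(singles) - mean(doubles)

def classify(bla_value, tolerance=0.02):
    """Coarse label from the BLA sign; the tolerance is a hypothetical choice."""
    if bla_value > tolerance:
        return "ground-state-like alternation"
    if bla_value < -tolerance:
        return "inverted alternation"
    return "even bond lengths"

def uniform_chain(single_length, double_length):
    """Build a hypothetical chain with one length per formal bond type."""
    lengths = {b: single_length for b in SINGLE_BONDS}
    lengths.update({b: double_length for b in DOUBLE_BONDS})
    return lengths

# Hypothetical geometries (angstrom).
ground = uniform_chain(1.46, 1.36)
even = uniform_chain(1.41, 1.41)
inverted = uniform_chain(1.38, 1.45)

for name, geometry in (("ground", ground), ("even", even), ("inverted", inverted)):
    print(name, round(bla(geometry), 3), classify(bla(geometry)))
```

In this picture, an EBL-like geometry corresponds to a BLA close to zero, whereas a geometry with inverted bond lengths gives a negative BLA.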
All configurations show a twisting around the C13-C14 bond, and this does not depend on the BLA pattern, although the ABL configurations have a stronger twisting. Another feature of the excited state geometries is that the ionone ring is planarized, which is in line with the double bond character acquired by the C6-C7 bond upon excitation (from 1.484 Å in the ground state to 1.425 Å in the excited state).
After optimization of the H → L root in the M10 protein, the geometries obtained are ABL, featuring inversion of the bond lengths of the conjugated chain of the PSB molecule, as obtained for the 10-methylated PSB among the solvated PSBs. The bond lengths obtained, as well as the deviations from planarity of the dihedrals, are presented in Figure S10. Differently from M4 and M8, all the bond lengths obtained for the M10 geometries are highly similar to one another. The largest elongation is obtained for the central C11-C12 bond, which goes from an average of 1.357 Å in the ground state to 1.466 Å in the excited state. Concerning the dihedral angles, although some variety is observed, all the configurations share the same global parameters. Overall, they are fairly close to the averaged ground state structure. However, the twisting around the C11-C12 bond varies significantly along the trajectory, with some configurations (like configuration 07) featuring a very strong distortion (up to 30°), while some others are nearly planar all along the chain (such as configuration 08 and configuration 04).

Theoretical spectroscopy data

Photoisomerization pathways
Figure S16. Constrained scans for the rotation around the C11-C12 bond for representative configurations of M4, M8 and M10 proteins. The PES were obtained at the CASPT2 (red) and CASSCF (gray) levels, with the 6-31G* basis set. If the dihedral angle around C11-C12 is pretwisted in one direction (e.g. counter-clockwise for M10) the opposite direction is not considered.
Figure S17. Constrained scans for the rotation around the C13-C14 bond for representative configurations of M4, M8 and M10 proteins. The PES were obtained at the CASPT2 (red) and CASSCF (gray) levels, with the 6-31G* basis set. If the dihedral angle around C11-C12 is pretwisted in one direction (e.g. counter-clockwise for M10) the opposite direction is not considered.
Figure S18. Experimental decay associated spectra (DAS) for the M4, M8 and M10 proteins and corresponding time constants. Note that the DAS do not fully capture the instantaneous signals observed for M8 and M10, which correspond to stimulated Raman scattering from the aqueous solvent.