Studies of light-sensing proteins with organic chromophores cover a wide area of modern research, including experimental and theoretical characterization of the structures and spectral features of the protein molecules.[1, 2] Modeling these properties with quantum chemistry tools, which allow one to compute equilibrium geometry parameters of the chromophore-containing domains as well as the positions and intensities of bands in excitation and emission spectra, is becoming an essential part of such studies. However, collaboration with experimentalists working in this field creates a strong demand for fast theoretical predictions for prospective variants of fluorescent or photoreceptor proteins, for example, variants attainable by mutations. Correspondingly, there is a recognized need for economical quantum chemistry methods capable of computing, with reasonable accuracy, structural parameters and energy differences between electronic states of fairly large molecular systems containing organic chromophores.
We consider here two approaches to quantum chemical modeling that can be regarded as economical: the well-known semiempirical ZINDO method and a relatively new configuration interaction type approach, the scaled opposite-spin configuration interaction with singles and perturbative doubles, SOS-CIS(D).[4, 5] We classify these approximations as economical because they do not depend critically on the demanding preliminary analyses of wavefunctions typical of applications of the advanced complete active space SCF (CASSCF)-based approximations.[6-9] Application of the ZINDO and SOS-CIS(D) techniques is illustrated here for two distinct classes of light-sensing proteins.
First, we consider the proteins of the green fluorescent protein (GFP) family, widely used as biomarkers in living cells. Their workhorses are chromophores derived from the hydroxybenzylidene-imidazolinone molecule. The detailed chemical structure and the protonation state of the chromophore and of the neighboring amino acid residues are essential features responsible for the photophysical behavior of the GFP-like fluorescent proteins, including the shifts in optical bands along the photocycles. In a series of applications,[10-14] the ZINDO method, used to calculate excitation energies at geometries of model systems optimized with density functional theory (DFT), gave remarkably good results.
Second, we consider the blue light photoreceptor proteins with the so-called blue-light using flavin adenine dinucleotide (BLUF) domain, whose function depends on subtle changes in the hydrogen bond networks near the flavin chromophore. Although the ZINDO approach is capable of reproducing well the position of the optical band in flavin, applying it to the very small shifts in the absorption bands of flavin in BLUF (less than 10 nm) yields practically indistinguishable values for the reaction intermediates. In this case, ZINDO is not sensitive enough to the changes in the chromophore-containing domain. As we demonstrate below, the SOS-CIS(D) method turned out to be more successful in this respect.
In the next section, we summarize the findings of relevant recent publications for the specified systems and report some new results to support our conclusions.
Summary of recent results
As shown in Ref., the vertical excitation energy for the gas-phase GFP chromophore in the anionic form illustrated in Scheme 1, computed with ZINDO at the geometry optimized in the B3LYP/6-31G+(d,p) approximation (the ZINDO//B3LYP/6-31G+(d,p) result), agrees well with the experimental value.
It is important to stress that the same approach, ZINDO//B3LYP/6-31G+(d,p), applied to the neutral form of the GFP chromophore allowed us to correct an ambitious experimental assignment of the absorption spectral bands; several state-of-the-art quantum chemical methods arrived at the same conclusion. Excitation energies for the GFP variants in which the tyrosine-derived chromophore is replaced by histidine-derived (Y67H) and tryptophan-derived (Y67W) moieties, which shifts the absorption bands to the blue, were correctly calculated in the same approximation. Also, in Ref. a mimic of the chromophore-containing domain of GFP was constructed following motifs of the crystal structure 1EMG from the Protein Data Bank (PDB). The coordinates of the Cα atoms of the involved residues were kept frozen as in the crystal structure, and the remaining geometry parameters were optimized in the B3LYP/6-31G+(d) approximation. The absorption wavelength computed in the ZINDO//B3LYP approximation is 502 nm (2.47 eV), which is 20 nm above the experimental value for GFP with the anionic chromophore.
The same computational strategy, namely, optimizing ground state equilibrium geometry parameters for fairly large molecular clusters in the B3LYP/6-31G(d) approximation and computing vertical excitation energies with ZINDO, proved successful in computing absorption bands for the monomeric teal fluorescent protein mTFP1 as well as for the red fluorescent proteins DsRed and mCherry.
We conclude this short survey by citing the results[13, 14] for the photoswitchable fluorescent protein asFP595. First, large molecular clusters (containing up to 300 atoms), with geometry parameters optimized in the B3LYP/6-31G(d) approximation and excitation energies computed with ZINDO, allowed us to distinguish protein conformations with the trans and cis anionic chromophore forms. Second, we could estimate emission bands by locating the minimum energy point in the excited S1 state with the CIS procedure, followed by ZINDO calculations of transition energies. The agreement of the computed band positions, in absorption for the conformation with the trans-anionic chromophore at 572 nm and in emission for the conformation with the cis-anionic chromophore at 599 nm, with the experimentally observed band maxima (568 and 595 nm, respectively) provides solid support for the tentative assignments of the spectral bands and for the involvement of trans-cis chromophore isomerization in this protein upon kindling. Third, we computed spectral bands for the structure with the protonated (neutral) form of the chromophore in the trans configuration. We provided evidence that this protein conformation accounts for the previously observed but unassigned absorption band at 445 nm.
Structure and spectra of TagRFP
Here, we report a new series of calculations for another fluorescent protein of the GFP family, namely, TagRFP. It is a monomeric red fluorescent protein characterized by high brightness, complete chromophore maturation, prolonged fluorescence lifetime, and high pH stability, which make it an excellent tag for protein localization studies and fluorescence resonance energy transfer applications.
As before,[10-14] we started from the coordinates of heavy atoms in the relevant protein crystal structure, this time the PDB entry 3M22. For the simulations, we selected a model system composed of the chromophore, the side chains of Leu13, Gln39, Ala59, Thr60, Ser61, Phe62, Ser66, Arg67, Arg92, Gln106, Tyr117, Asn143, Glu145, Ser158, Met160, Phe174, His197, Leu199, Gln213, Glu215, and 11 water molecules. After adding hydrogen atoms, the system included 326 atoms in total. The geometry was optimized in the B3LYP/6-31G(d) approximation while keeping the Cα atoms of the involved residues frozen according to the coordinate-locking scheme. Vertical excitation energies were computed at the optimized geometry by using ZINDO. The Gaussian-03 program was used in all calculations.
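The coordinate-locking step described above requires a list of the Cα atoms to be held fixed. The following Python fragment is a minimal sketch of how such a list could be assembled from fixed-column PDB records; it is not the authors' script. The function name `frozen_calpha_indices`, the helper `_atom`, and the toy records are assumptions introduced for illustration, and the way the resulting indices are passed to the optimizer depends on the quantum chemistry program used.

```python
def frozen_calpha_indices(pdb_lines, residues):
    """Return 1-based indices of C-alpha atoms that belong to `residues`.

    `residues` is a set of (residue_name, residue_number) pairs.
    Indices count ATOM/HETATM records in file order, so they should be
    computed on the final model file (after hydrogens are added).
    """
    frozen = []
    index = 0
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        index += 1
        atom_name = line[12:16].strip()   # PDB columns 13-16
        res_name = line[17:20].strip()    # PDB columns 18-20
        res_seq = int(line[22:26])        # PDB columns 23-26
        # Matching on the residue name also excludes calcium ions,
        # whose atom name is "CA" as well.
        if atom_name == "CA" and (res_name, res_seq) in residues:
            frozen.append(index)
    return frozen


def _atom(serial, name, res_name, res_seq):
    # Hypothetical helper: minimal fixed-column ATOM record, zero coordinates.
    return (f"ATOM  {serial:5d}  {name:<3s} {res_name:>3s} A{res_seq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Toy input (not taken from 3M22): two selected residues and one unselected.
toy_lines = [
    _atom(1, "N", "LEU", 13),
    _atom(2, "CA", "LEU", 13),
    _atom(3, "CB", "LEU", 13),
    _atom(4, "CA", "GLY", 14),
    _atom(5, "CA", "GLN", 39),
]
selected = {("LEU", 13), ("GLN", 39)}
toy_frozen = frozen_calpha_indices(toy_lines, selected)
```

For the real model, the `selected` set would hold the residues listed in the text, and the indices would be recomputed whenever atoms are added or removed.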
A view of the most important part of the model cluster is shown in Figure 1. We specify several intermolecular distances to demonstrate that the calculations agree well with the parent crystal structure 3M22, solved at 2.2 Å resolution, as in other applications.[10-14] We do not compare the computed intramolecular geometry parameters, since they are well predicted by DFT calculations, even better than those resolved in X-ray experiments.
We computed the position of the S0-S1 absorption band maximum at 551 nm (2.25 eV), in good agreement with the experimental value of 555 nm (2.23 eV). Such agreement is characteristic of recent applications[13, 14] of the ZINDO//DFT approach, in which fairly large molecular clusters model the chromophore-containing domains.
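Band positions in this section are quoted both in nanometers and in electron volts. The short Python sketch below uses the standard relation E = hc/λ, with hc ≈ 1239.84 eV·nm, to convert between them and to express the deviations between computed and experimental band maxima quoted in the text in both units. The function and variable names are illustrative assumptions; the wavelengths are those given above.

```python
HC_EV_NM = 1239.84193  # Planck constant times speed of light, in eV*nm


def nm_to_ev(wavelength_nm):
    """Convert a band maximum in nm to a transition energy in eV."""
    return HC_EV_NM / wavelength_nm


# (label, computed maximum in nm, experimental maximum in nm), as quoted in the text
bands = [
    ("asFP595 absorption, trans-anionic", 572.0, 568.0),
    ("asFP595 emission, cis-anionic", 599.0, 595.0),
    ("TagRFP absorption", 551.0, 555.0),
]


def deviations(band_list):
    """Return (label, computed minus experimental in nm, same in eV) per band."""
    rows = []
    for label, calc_nm, exp_nm in band_list:
        d_nm = calc_nm - exp_nm
        d_ev = nm_to_ev(calc_nm) - nm_to_ev(exp_nm)
        rows.append((label, d_nm, d_ev))
    return rows


if __name__ == "__main__":
    for label, d_nm, d_ev in deviations(bands):
        print(f"{label}: {d_nm:+.0f} nm, {d_ev:+.3f} eV")
```

Because energy is inversely proportional to wavelength, a positive deviation in nanometers corresponds to a negative deviation in electron volts; for these bands, a 4 nm difference amounts to less than 0.02 eV.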
Summary of recent results
The BLUF domain is a blue light photoreceptor that cycles between dark-adapted and light-induced functional states. The absorption spectrum of one of the proteins of this family, AppA BLUF, in the dark state consists of a broad peak at 443 nm attributed to the S0–S1 electronic transition of flavin (Scheme 2) buried in the chromophore-containing pocket, while in the light-induced transient form this band is shifted to 456 nm.
Unlike the case of GFP-type model systems, our attempts to use ZINDO to analyze subtle details of the BLUF photocycle were discouraging. Although the band positions of the flavin moiety were consistent with experimental data, the excitation energies computed with ZINDO for molecular clusters mimicking the flavin-containing pocket in BLUF at various stages of the photocycle were practically indistinguishable: the differences in wavelengths were within 2 nm. For this reason, we considered other economical quantum chemical methods and found that the SOS-CIS(D) approximation could be used to solve this problem. As shown in Refs.[19, 20], the observed small shifts in the optical spectral bands due to subtle changes in the hydrogen bond network near the chromophore in the BLUF domains could be well reproduced by using the SOS-CIS(D) approximation.[4, 5] Following the models[21, 22] for the photocycle of AppA BLUF, according to which formation of the light-induced state is accompanied by internal rotation and tautomerization of the critical residue Gln63, we showed that the computed red shift in the absorption band maxima between the dark and light-induced states was 12–16 nm, to be compared with the measured 13 nm shift. The subsequent work was devoted to computational characterization of possible reaction intermediates in the AppA photocycle, and again very good agreement between the calculated and observed shifts in optical bands was obtained. The SOS-CIS(D) calculations were performed for the molecular clusters that constituted the quantum subsystem in the quantum mechanical–molecular mechanical (QM/MM) simulations, as clarified in the following subsection.
Effect of the Q63E mutation in AppA BLUF
As described in Ref., the photoinactive mutant was generated in the BLUF domain of AppA by replacing the Gln residue at position 63 with Glu (the Q63E mutant). Steady-state absorption spectra of dark-adapted and light-induced wild-type AppA BLUF and of the Q63E mutant, recorded in phosphate buffer at pH 8, showed slight shifts in absorption band maxima between these species: from 447 to 450 nm when comparing the dark state and the Q63E mutant, and from 450 to 458 nm when comparing the Q63E mutant and the light state.
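To see how small these spectral shifts are on the energy scale, the experimental band maxima quoted in this paragraph can be converted with E = hc/λ. The Python sketch below is only an illustration of that arithmetic; the names `shift` and `maxima_nm` are assumptions, and the wavelengths are the experimental values given above.

```python
HC_EV_NM = 1239.84193  # Planck constant times speed of light, in eV*nm


def shift(lambda_from_nm, lambda_to_nm):
    """Return (shift in nm, shift in eV) between two band maxima."""
    d_nm = lambda_to_nm - lambda_from_nm
    d_ev = HC_EV_NM / lambda_to_nm - HC_EV_NM / lambda_from_nm
    return d_nm, d_ev


# Experimental absorption maxima quoted in the text, in nm
maxima_nm = {"dark": 447.0, "Q63E": 450.0, "light": 458.0}

dark_to_mutant = shift(maxima_nm["dark"], maxima_nm["Q63E"])
mutant_to_light = shift(maxima_nm["Q63E"], maxima_nm["light"])

if __name__ == "__main__":
    print(f"dark -> Q63E : {dark_to_mutant[0]:+.0f} nm, {dark_to_mutant[1]:+.3f} eV")
    print(f"Q63E -> light: {mutant_to_light[0]:+.0f} nm, {mutant_to_light[1]:+.3f} eV")
```

Both red shifts correspond to energy changes of only a few hundredths of an electron volt, which is the level of resolution a method needs in order to distinguish these species.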
To simulate the effect of the Q63E mutation, we started from the coordinates of the light-induced conformation of AppA BLUF obtained in previous works,[19, 20] manually replaced the side chain of Gln at position 63 by Glu, and optimized the geometry parameters in QM/MM calculations in the electronic embedded cluster approximation using the NWChem program. Energies and forces in the QM subsystem (the isoalloxazine ring of the chromophore and the side chains of Tyr21, His44, Asn45, Gln63, and Trp104) were calculated by DFT with the B3LYP functional and the cc-pVDZ basis set. The remaining part of the protein and of the chromophore, together with the solvent water molecules, was assigned to the MM subsystem described with the AMBER force field parameters. Figure 2 illustrates the equilibrium geometry configuration of the chromophore-containing domain. It should be stressed that the structure of the Q63E mutant is not known from experimental studies.
Vertical excitation energies were calculated for the S0-S1 electronic transition in the SOS-CIS(D) approximation[4, 5] with the cc-pVDZ basis set using the Q-Chem program package. These calculations were performed for the molecular cluster, which was constructed by using coordinates of the corresponding QM part of the QM/MM optimized structure.
We collect in Table 1 the experimental and computed excitation energies and the wavelengths of the absorption band maxima. One can see that the very delicate shifts in excitation energies (0.02–0.05 eV) or in the band maxima (3–5 nm) observed experimentally between these three species are reproduced exceptionally well in these simulations. This provides further evidence in favor of the applied SOS-CIS(D) computational strategy. Successful characterization of the spectra indicates that the computationally predicted structure of the chromophore-containing domain of the Q63E mutant (Fig. 2) is also reasonable.
Table 1. Experimental and computed vertical excitation energies and the corresponding wavelengths of the S0-S1 transitions for the dark and light states of wild-type AppA BLUF and its Q63E mutant
We demonstrate here that fairly inexpensive quantum chemistry approaches can be successfully applied to simulations of the structure and spectra of light-sensing proteins with organic chromophores. We consider two distinct classes of such proteins whose photophysical behavior differs: in the case of GFP-like fluorescent proteins, transformations of the chromophore are essential, whereas in the case of BLUF-containing proteins, light sensing is due to changes in the hydrogen bond networks near the chromophore. The use of ZINDO for calculating transition energies along the GFP photocycle turns out to be successful. In the second case, ZINDO is not sensitive enough and leads to almost indistinguishable excitation energies for different stages of the BLUF photocycle. However, the SOS-CIS(D) method, which also does not require demanding preliminary considerations of electronic structure, proves successful in this case.
To a large extent, the success of both techniques is explained by the observation that the singlet-singlet electronic excitations in these systems are of a simple HOMO–LUMO transition type. In this case, a computational strategy that applies the “black-box” methods ZINDO and SOS-CIS(D) to fairly large molecular clusters can be recommended. Prospects for quantum chemistry applications to a wider class of light-sensing biomolecules will most likely include the development of faster algorithms for the more general CASSCF-based methods and expected progress in the TD-DFT approaches.
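Whether an excitation is of a simple HOMO–LUMO type can be judged from the weight of that single configuration in a singles-type expansion of the excited state, where the weight of each configuration is its squared amplitude divided by the sum of all squared amplitudes. The Python sketch below illustrates this check. The amplitudes are hypothetical numbers chosen for illustration, not values from the calculations discussed here, and the function name is an assumption.

```python
def dominant_excitation(amplitudes):
    """Return the configuration with the largest weight and that weight.

    `amplitudes` maps (occupied_label, virtual_label) to an amplitude of a
    singles-type excited-state expansion; weights are normalized squares.
    """
    norm = sum(c * c for c in amplitudes.values())
    weights = {pair: c * c / norm for pair, c in amplitudes.items()}
    pair = max(weights, key=weights.get)
    return pair, weights[pair]


# Hypothetical amplitudes for illustration only
example = {
    ("HOMO", "LUMO"): 0.95,
    ("HOMO-1", "LUMO"): 0.20,
    ("HOMO", "LUMO+1"): -0.15,
}

pair, weight = dominant_excitation(example)

if __name__ == "__main__":
    print(f"dominant configuration: {pair[0]} -> {pair[1]}, weight {weight:.2f}")
```

A weight close to 1 for the HOMO to LUMO configuration indicates the single-configuration character referred to above; a state with several comparable weights would call for more careful treatment.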
Improvements of the computational strategy to simulate properties of photosensing proteins also include consideration of conformational sampling treated in the QM/MM approaches coupled to dynamics as, for example, in Refs.[7, 26-28]. Besides, the vibronic effects and the spin-orbit contributions are often important in calculations of spectral features of light receptors.
This work is partly supported by the Russian Foundation for Basic Research (project 12-03-00149) and the Program of Molecular and Cell Biology from the Russian Academy of Sciences. AN and MK thank the Research Computing Center of M.V. Lomonosov Moscow State University for providing computational resources. IT and JC thank the staff and administration of the Advanced Biomedical Computing Center for their support of this project. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services nor does mention of trade names, commercial products, or organization imply endorsement by the U.S. Government.
Dr. Alexander Nemukhin is Professor of Chemistry at the Moscow State University and the director of laboratory at the Institute of Biochemical Physics. His primary interests concentrate on computer modeling of molecular structure and properties. He received his degrees (Ph.D., D.Sc.) in physical chemistry from the M.V. Lomonosov Moscow State University (Russia). Dr. Nemukhin has more than 200 peer-reviewed scientific publications. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Dr. Igor Topol is Senior Scientist at the Advanced Biomedical Computing Center at the SAIC-Frederick Inc. Dr. Topol's research focuses on quantum chemical applications and molecular modeling pertaining to cancer. He received his MS degree in solid state physics from Rostov State University (Russia), Ph.D. from the Leipzig University (Germany) and D.Sc. from Moscow State University (Russia). Dr. Topol has more than 150 peer-reviewed scientific publications.
Dr. Jack Collins is the director of the Advanced Biomedical Computing Center at the Frederick campus of the National Cancer Institute. Dr. Collins' research focuses on biomedical applications pertaining to cancer. His Ph.D. is in theoretical chemistry from the University of Nebraska at Lincoln. Prior to joining NCI, Dr. Collins worked at the Molecular Research Institute as director of Computational Biology. He has more than 70 peer-reviewed scientific publications and 3 patents.
Dr. Maria Khrenova is a Researcher at the Chemistry Department of the Moscow State University. Her research focuses on computer based molecular modeling of structure and properties of biomolecular systems. She gained her Ph.D. degree in 2011 in computational and quantum chemistry from the M.V. Lomonosov Moscow State University. Dr. Khrenova has 22 peer-reviewed scientific publications.