The energetics and evolution of oxidoreductases in deep time

The core metabolic reactions of life drive electrons through a class of redox protein enzymes, the oxidoreductases. The energetics of electron flow is determined by the redox potentials of organic and inorganic cofactors as tuned by the protein environment. Understanding how protein structure affects oxidation–reduction energetics is crucial for studying metabolism, creating bioelectronic systems, and tracing the history of biological energy utilization on Earth. We constructed ProtReDox (https://protein-redox-potential.web.app), a manually curated database of experimentally determined redox potentials. With over 500 measurements, we can begin to identify how proteins modulate oxidation–reduction energetics across the tree of life. By mapping redox potentials onto networks of oxidoreductase fold evolution, we can infer the evolution of electron transfer energetics over deep time. ProtReDox is designed to include user‐contributed submissions with the intention of making it a valuable resource for researchers in this field.

In deep time, a set of enzymes evolved to facilitate electron transport: the oxidoreductases, or EC 1 proteins. Biological electronic circuits require the movement of electrons over sub-nanometer distances through an electron transfer chain that powers life [6]. Oxidoreductases organize the positions and relative energetics of chains of redox-active cofactors, assuring the rapid, directional flow of electrons [7]. The energetic tendency of a redox-active group to gain or lose electrons can be experimentally measured as the redox potential, expressed in volts (V) relative to a reference such as the standard hydrogen electrode, at a standard pH. Redox-active groups that contribute to the redox potential can be cofactors such as iron-sulfur clusters, hemes, or flavins, or amino acid residues such as cysteine, methionine, or tryptophan. The relative stability of cofactor oxidation states is largely determined by the cofactor itself [8] but is further modulated by the protein matrix. Electrostatic interactions, such as proximity of positively charged basic amino acids, can stabilize a redox cofactor in the reduced state [9,10]. The protein can also modulate oxidation-reduction energetics through hydrogen bonding [11,12], hydration [13], and dynamical features [14] of the protein-cofactor environment. Groups of oxidoreductases form metabolic pathways, powering cellular-scale circuits where the current depends on the rate of catalysis and diffusion of substrates [15]. It is critical to study how the protein environment modulates the energetics of oxidation-reduction reactions in order to understand how electron transfer is coupled to metabolism.
The connection between oxidoreductase structure and energetics is central to the deep-time evolution of metabolism [18][19][20]. Due to its fundamentally electrical nature, the evolution of metabolism, and of the associated oxidoreductases, was strongly coupled with changes in the redox state of the planet, which has become increasingly oxidized over time due to both geochemical and biological processes [2,3,21]. Modern oxidoreductases that play central roles in metabolism, such as nitrogenases, photosystem reaction centers, and respiratory complexes, are massive nanomachines, far too complex to have arisen early in evolution in their current forms [24][25][26][27][28][29][30][31]. In previous work focused on the evolution of oxidoreductases, we found that modern, large enzymes were largely derived from just a few minimal protein-cofactor building blocks [16,17,32]. In addition to identifying core cofactor-binding folds, we used a structure-derived criterion for electron transfer based on cofactor-cofactor distances [7] to map a network of electron transfer pathways between the different folds, which we refer to as the Spatial Adjacency Network (SpAN). A notable feature of the SpAN was the abundance of more reducing cofactor-binding folds in the network center and more oxidizing cofactor-binding folds at the periphery [17]. This suggests a time axis in the SpAN from the center to the periphery of the network, reflecting the adaptation of protein redox energetics to emerging electron sources and sinks made available by an oxidizing planetary environment over geologic time. Mapping quantitative estimates of protein redox energetics onto the SpAN would potentially allow us to constrain the age of various protein folds based on redox information in the geologic record [2,33,34].
Computational prediction of redox energetics from protein structures remains an ongoing challenge. Current methods span many levels of theory, from quantum-mechanical to empirical [35], along with recent advances using machine learning [36]. Site-directed mutagenesis studies on natural oxidoreductases [37][38][39] and protein engineering [40][41][42][43] have been used to test molecular hypotheses of how the protein environment tunes redox energetics. Large datasets of protein structures, including oxidoreductases, are on the horizon, with advances in functional annotation from genomic and metagenomic datasets [20,44] combined with recent advances in structure prediction [45][46][47], including bound cofactors [48]. Effective models that can predict redox energetics based on structural information will become increasingly valuable for understanding bioenergetics, the evolution of metabolism, and the engineering of bioelectronic pathways [41,49].
Motivated by the need to design and train better models, and by the goal of mapping redox energetics onto the SpAN to study oxidoreductase evolution, we developed ProtReDox, a manually curated database of protein redox potentials. We examined literature reports of oxidoreductase energetics and identified the cofactor type, redox potential, UniProt and PDB (if available) identifiers, and experimental metadata such as potentiometric measurement technique, pH, and buffer conditions. ProtReDox version one is available at https://protein-redox-potential.web.app. We apply this dataset to explore how redox energetics is modulated by cofactor type, protein environment, and experimental conditions, and finally how energetics mapped onto the SpAN inform geochemical constraints on deep-time oxidoreductase evolution.

| Data collection and curation
The dataset includes 514 redox potentials from 239 unique enzyme/cofactor pairs consisting of metal ions (Cu, Fe, Mo), flavins, hemes, and multinuclear iron-sulfur clusters. Proteins are indexed by their UniProt ID, and approximately half are associated with high-resolution structures deposited in the Protein Data Bank [50]. Redox potentials are normalized to the standard hydrogen electrode and pH-corrected to pH 7.0. Redox potentials were only included if the midpoint potential could unequivocally be assigned to a particular cofactor.

| ProtReDox database construction
Redox potentials and associated data are stored in a Google Firebase Cloud Firestore database [51]. The ProtReDox website is rendered using the Firebase Web v.9 modular JavaScript SDK in combination with React.js (v.18.2; react.dev). The website user interface comprises navigation, a logo, a searchable table of the redox dataset, and a form for submitting new redox potentials and associated information. User-contributed additions to the dataset will be marked for review and evaluated manually.
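As an illustration of the kind of record described above, the following is a minimal sketch of a single database entry. The class name, field names, and the `user_submitted` flag are hypothetical; they are chosen to mirror the metadata listed in the text (cofactor type, potential, UniProt and PDB identifiers, technique, pH, buffer) and are not the actual Firestore schema used by ProtReDox. The identifier in the example is a placeholder, not a real accession.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class RedoxEntry:
    """Hypothetical record layout; not the real ProtReDox schema."""
    uniprot_id: str
    cofactor_type: str            # e.g. "SF4", "FES", "HEM", "Cu"
    potential_mv_vs_she: float    # midpoint potential vs. standard hydrogen electrode
    ph: float                     # pH at which the potential was measured
    technique: str                # potentiometric measurement technique
    buffer: str                   # buffer conditions reported in the literature
    pdb_id: Optional[str] = None  # only about half of the entries have a structure
    user_submitted: bool = False  # assumed flag for user-contributed entries

    def needs_review(self) -> bool:
        # User-contributed additions are marked for manual review.
        return self.user_submitted


# Placeholder values for illustration only.
entry = RedoxEntry(
    uniprot_id="EXAMPLE_ID",
    cofactor_type="Cu",
    potential_mv_vs_she=300.0,
    ph=7.0,
    technique="example technique",
    buffer="example buffer",
    user_submitted=True,
)
record = asdict(entry)  # plain dict, convenient for a document store
```

Keeping `pdb_id` optional reflects the statement that only some entries have a deposited structure.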

| Feature-redox correlation analysis
To better understand the key features controlling redox potential, 486 features were calculated as previously described [52] for a set of 42 protein structures with type 1 copper sites and experimentally determined reduction potentials. These features covered 10 categories of physicochemical properties, grouped by how they were calculated: solvation, electrostatics, hydrogen bonding, van der Waals, geometry, pocket void, secondary structure of the backbone region of the protein directly interacting with the redox site, amino acid angles, pocket lining, and surface area. The property values for sites on different chains of the same protein structure were averaged. Any features for which all structures had the same value were removed, leaving 446 features. Pearson correlation coefficients between features and reduction potential were then calculated using the Python library SciPy [53]. For each structure, the reduction potential measured at the experimental pH closest to the crystallization pH was selected. When no crystallization pH was available, the reduction potential measured at the pH closest to neutral was selected. These reduction potentials were then normalized to pH 7.0 using Equation (1) for further analysis.

| Mapping redox energetics on the SpAN
The SpAN is a network representation of protein electron transfer pathways, with nodes corresponding to classes of protein microenvironments surrounding the redox cofactor (termed modules) and edges reflecting instances in the PDB where two modules are within electron transfer proximity (cofactor-cofactor distance <14 Å). The generation of this network was described in our previous work [16,17]. The 2020 version of the SpAN was used in this study.
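The network definition above can be illustrated with a minimal sketch that builds an undirected graph from module pairs and their cofactor-cofactor distances. The input format, function names, and the module labels and distances in the example are hypothetical; the actual SpAN construction is described in the cited previous work and involves steps not shown here.

```python
from collections import defaultdict, deque

ET_CUTOFF_ANGSTROM = 14.0  # electron-transfer proximity threshold from the text


def build_span(contacts, cutoff=ET_CUTOFF_ANGSTROM):
    """Build an undirected module graph.

    contacts: iterable of (module_a, module_b, distance_angstrom) tuples,
    one per observed module pair in a structure.
    """
    graph = defaultdict(set)
    for a, b, dist in contacts:
        graph[a]  # touch both keys so isolated modules still appear as nodes
        graph[b]
        if dist < cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_component(graph, start):
    """Return all modules reachable from `start` (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Made-up module labels and distances, for illustration only.
contacts = [
    ("A", "B", 9.5),
    ("B", "C", 13.9),
    ("C", "D", 14.0),  # not below the cutoff, so no edge is created
]
span = build_span(contacts)
component = connected_component(span, "A")
```

A reachability check like `connected_component` is one simple way to test whether a set of annotated modules forms a connected sub-graph.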

| Cofactor type is the primary determinant of redox energetics
Redox potentials included in ProtReDox span almost 2 V, ranging from the −675 mV 2[4Fe-4S]-binding bacterial ferredoxin of E. coli [54] to the +1301 mV chlorophyll A in PS II within T. elongatus [55]. Within this broad range, the cofactor type is the primary determinant of redox potential (Figure 1). Cofactor types are designated based primarily on the PDB-derived nomenclature. Cofactors from most reducing to most oxidizing were 4Fe-4S (SF4), 2Fe-2S (FES), flavins, mononuclear iron sites (Fe), iron-bound hemes (HEM), and copper sites (Cu). These ranges are consistent with previous analyses of protein electron carriers [8].
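A per-cofactor summary of this kind can be sketched as below. The function name and input layout are hypothetical, the toy potentials are made up rather than taken from ProtReDox, and the use of the sample standard deviation as the spread is an assumption, since the text does not state which measure of spread is reported.

```python
from statistics import mean, stdev


def summarize_by_cofactor(entries):
    """Group (cofactor_type, potential_mV) pairs and summarize each group.

    Returns rows of (cofactor, count, mean_mV, spread_mV), sorted from the
    most reducing (lowest mean) to the most oxidizing (highest mean).
    """
    groups = {}
    for cofactor, potential_mv in entries:
        groups.setdefault(cofactor, []).append(potential_mv)

    summary = []
    for cofactor, values in groups.items():
        # Sample standard deviation needs at least two values.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary.append((cofactor, len(values), mean(values), spread))
    return sorted(summary, key=lambda row: row[2])


# Illustrative values only; not measurements from the database.
toy_entries = [
    ("Cu", 300.0), ("Cu", 400.0),
    ("SF4", -400.0), ("SF4", -300.0),
    ("HEM", 0.0), ("HEM", 100.0),
]
summary = summarize_by_cofactor(toy_entries)
```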

| Molecular features that determine energetics
Protein redox potential is a complex property that is affected by features of the redox site first- and second-shell environment: solvation, hydrogen bonding, ligand interactions, metal coordination, electrostatic interactions [56,57], and the corresponding enthalpic and entropic energy terms [58]. Redox potential can be calculated directly from first-principles quantum mechanical calculations [59,60]; however, these calculations are expensive and are not practical for protein design. To better understand the protein features that determine redox potential, we calculated the correlation between 433 physicochemical features (including energy and geometry features) and reduction potentials (Figure 2) for copper proteins in ProtReDox.
The categories of features with the best correlations tended to be those related to electrostatics and solvation. These include solvation features that describe Lazaridis-Karplus solvation energies, with both isotropic and anisotropic contributions, for various distance cutoffs within 9 Å. The significant electrostatic features include calculations of the Coulombic electrostatic potential as well as features describing the theoretical titration curve of surrounding residues. In contrast, other categories of features are more statistical. For example, eight of the nine significant "amino acid angle" features are Dunbrack rotamer energies of residues within 5 Å, indicating the use of some more common and some less common rotamers. In addition to further evaluating significant features that correlate with protein redox potentials found in ProtReDox, we expect these features can be used to train models [36,52] for high-throughput redox-active protein design.
FIGURE 1 Distribution of redox potentials for the most abundant cofactor types found in ProtReDox. Cofactors are sorted according to mean redox potential and displayed vertically from most reducing to most oxidizing, along with corresponding atomic structures. SF4 (4Fe4S; 82) −329 ± 268 mV; FES (2Fe2S; 64) −220 ± 199 mV; Flavin (114) −183 ± 141 mV; Fe (10) −72 ± 173 mV; HEM (42) 43 ± 190 mV; Cu (147) 333 ± 129 mV. The count of each cofactor type is given in parentheses.

| Coupling redox energetics to pH
Comparing protein redox potentials is challenging due to the numerous experimental conditions under which redox potentials are measured [63][64]. To compare experimental redox potentials, values are normalized to a reference pH (7.0) using the Nernst equation, where 59.16 mV is the Nernst constant relating pH to redox potential.

| Redox gradients and oxidoreductase evolution
Many of the ProtReDox entries are associated with an experimentally determined three-dimensional structure deposited in the PDB. This allowed us to map the redox energetics onto the SpAN, an existing network mapping electron transfer pathways in oxidoreductases of known structure [16,17]. Nodes in the SpAN correspond to redox-cofactor-binding protein microenvironments, termed modules. Edges reflect the spatial proximity of cofactor atoms in a pair of modules in one or many structures, providing a pathway for electron transfer. Cofactor edge-to-edge distances less than 14.0 Å were considered electron-transfer competent [7]. The full SpAN contains 133 modules [17]. We identified 18 modules with specified redox energetics (Figure 4, Table 1). These modules formed a single connected sub-graph within the SpAN, with the exception of the heme-binding cytochrome-C fold module 140. Within this network, there is a clear downhill redox energetic gradient, starting from 4Fe-4S-coordinating ferredoxin folds (module 85) with an average potential of −430 mV and ending with more oxidizing hemes (modules 1737 at +168 mV; 1746 at +70 mV), the molybdenum-containing module 16 (+204 mV), and copper module 72 at +325 mV.
One can envision electrons percolating from the center of this network to the periphery, driving redox-coupled reactions along a metabolic pathway.
Multiple features of the SpAN suggest that its structure provides insight into the evolution of oxidoreductases in addition to their metabolic function [68]. In the ProtReDox-annotated sub-graph of the SpAN, flavin module 7 and 4Fe-4S module 85 are reducing, such that they are energetically matched with the early Earth redox environment. It is informative that the annotated modules form a connected sub-graph within the SpAN. Most of these modules correspond both to isolated protein electron carriers [44] and to domains within larger oxidoreductases. Assignment of potentials is easier within an isolated domain than within a larger, multi-cofactor enzyme. Small, isolated modules would be useful building blocks of larger enzymes, forming multi-domain structures through duplication and diversification. Metal utilization for central versus peripheral modules is largely consistent with metal availability through geologic history [3,69,70], with early folds incorporating iron-containing cofactors and later folds binding molybdenum and copper.
FIGURE 4 Network of redox-active modules, where the nodes are modules that share common and structural similarity [16,17,71], and the edges are connections between modules that are within the same protein and are capable of electron transfer between them. Node colors represent a gradient of average redox potentials for each module. Note: Module numbers are defined in Reference [17]. Potentials are the arithmetic mean of values in the database.

| CONCLUSIONS
Knowledge of redox energetics of oxidoreductases is critical to understanding metabolic function and evolution. ProtReDox is intended to be a valuable tool in this regard as we and others contribute to its growth. Currently, the size of ProtReDox limits the extent to which structure-based predictive models can be trained on redox energetics. However, with further experimental investigations, and as high-quality models of protein structures become readily accessible, these models should improve. This would allow a more complete mapping of redox energetics onto data structures such as the SpAN, providing a greater understanding of the evolution of redox energetics in metabolism through time.

In Equation (1), E_red is the normalized reduction potential of each protein at pH 7 and E is the reduction potential measured at the literature pH. The variable n, assumed to be one, is the electron-to-proton ratio involved in the redox reaction. For copper proteins with an azurin fold, we observed a correlation between pH and redox potential with a slope of −51 mV/pH unit (Figure 3A), near what is expected if the reactions followed Nernstian behavior (−59 mV/pH unit), assuming one electron transfer per reaction coupled with one protonation event. Normalization removes the slope of this correlation (Figure 3B). Experimental pH conditions showed the strongest positive Pearson coefficient with redox potential among the computed factors from structure described earlier. The large variance in redox energetics cannot be explained by a single molecular feature. Weaker dependencies on pH (i.e., n < 1) have been observed in systems where electron transfer is accompanied by partial protonation/deprotonation events [65]. This should be taken into consideration when interpreting pH-corrected redox potentials reported in ProtReDox.
FIGURE 2 Correlation of reduction potential and physicochemical properties: Groups of physicochemical properties of metalloenzymes that can be measured from protein structure are shown along the x-axis. Each circle is the absolute value of the Pearson correlation coefficient for reduction potentials. Colored circles represent p-value ≤0.05 for the correlation and empty circles represent p-value >0.05. Horizontal box lines, from top to bottom, represent the upper quartile, median, and lower quartile correlation values for the respective property category. The whiskers display the range of correlation values for the respective property category, except for outliers, which are greater than 150% of the interquartile range from the box.
FIGURE 3 Cu cofactor type correlation of pH and redox potential: (A) experimental results and (B) normalized using the Nernst equation (Equation (1)). Figures include linear regression with a 95% confidence interval and the equation of best fit.
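The pH normalization described in this section can be sketched as a short function. The form used here, E_red = E + n × 59.16 mV × (pH − 7.0), is an assumption: it is our reading of the definitions of E_red, E, and n together with the expected −59 mV/pH unit slope, and it is not quoted from the source. The function and parameter names are hypothetical.

```python
NERNST_MV_PER_PH = 59.16  # Nernst constant relating pH to redox potential (mV)
REFERENCE_PH = 7.0


def normalize_to_ph7(e_measured_mv, measured_ph, n=1.0):
    """Estimate the potential at pH 7 from a value measured at another pH.

    Assumed form (not taken verbatim from the source):
        E_red = E + n * 59.16 mV * (pH - 7.0)

    With n = 1 the potential is taken to fall by 59.16 mV per pH unit, so a
    value measured above pH 7 is corrected upward and a value measured
    below pH 7 is corrected downward. Values of n < 1 model a weaker
    pH dependence.
    """
    return e_measured_mv + n * NERNST_MV_PER_PH * (measured_ph - REFERENCE_PH)


# A potential of 300 mV measured at pH 8 maps to a higher value at pH 7.
corrected = normalize_to_ph7(300.0, 8.0)

# With weaker proton coupling (n = 0.5), the correction is halved.
corrected_weak = normalize_to_ph7(300.0, 5.0, n=0.5)
```

Exposing `n` as a parameter makes it easy to see how strongly a pH-corrected value depends on the assumed proton coupling, which matters for systems that deviate from Nernstian behavior.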