Reconstructing the evolutionary history of nitrogenases: Evidence for ancestral molybdenum‐cofactor utilization

Abstract The nitrogenase metalloenzyme family, essential for supplying fixed nitrogen to the biosphere, is one of life's key biogeochemical innovations. The three forms of nitrogenase differ in their metal dependence, each binding either a FeMo‐, FeV‐, or FeFe‐cofactor where the reduction of dinitrogen takes place. The history of nitrogenase metal dependence has been of particular interest due to the possible implication that ancient marine metal availabilities have significantly constrained nitrogenase evolution over geologic time. Here, we reconstructed the evolutionary history of nitrogenases, and combined phylogenetic reconstruction, ancestral sequence inference, and structural homology modeling to evaluate the potential metal dependence of ancient nitrogenases. We find that active‐site sequence features can reliably distinguish extant Mo‐nitrogenases from V‐ and Fe‐nitrogenases and that inferred ancestral sequences at the deepest nodes of the phylogeny suggest these ancient proteins most resemble modern Mo‐nitrogenases. Taxa representing early‐branching nitrogenase lineages lack one or more biosynthetic nifE and nifN genes that both contribute to the assembly of the FeMo‐cofactor in studied organisms, suggesting that early Mo‐nitrogenases may have utilized an alternate and/or simplified pathway for cofactor biosynthesis. Our results underscore the profound impacts that protein‐level innovations likely had on shaping global biogeochemical cycles throughout the Precambrian, in contrast to organism‐level innovations that characterize the Phanerozoic Eon.


| INTRODUC TI ON
All known life requires nitrogen for the synthesis of essential biomolecules, including nucleotides and amino acids. However, though the atmosphere contains nearly 80% N 2 by volume, most organisms are not able to assimilate N 2 . Select bacteria and archaea called diazotrophs accomplish biological nitrogen fixation by nitrogenase metalloenzymes (E.C. 1.18.6.1) that catalyze the reduction of N 2 to bioavailable NH 3 . Both phylogenetic and geochemical evidence suggest that nitrogenases are an ancient family of enzymes; time-calibrated phylogenies place the evolution of nitrogenases at ~1.5-2.2 billion years ago (Ga) (Boyd, Anbar, et al., 2011), and the oldest potential isotopic biosignatures of nitrogenase activity date even further back to ~3.2 Ga (Stueken, Buick, Guy, & Koehler, 2015). genes in addition to vnf and anf genes, respectively Joerger & Bishop, 1988;Kennedy & Dean, 1992). These include nifBEN for most alternative gene clusters, with the exception of certain taxa (including A. vinelandii) that possess vnfEN homologs of nifEN that likely perform a similar biosynthetic function (Boyd, Anbar, et al., 2011;Boyd & Peters, 2013;Hamilton et al., 2011).
Paleobiological interest in nitrogenases has primarily centered on the coevolution of nitrogenase metal usage and the geochemical environment, with the possible implication that marine metal availabilities have significantly constrained nitrogenase evolution over geologic time (Anbar & Knoll, 2002;Canfield, Glazer, & Falkowski, 2010;Raymond et al., 2004). Inferences of ancient nitrogenase metal usage have relied on isotopic biosignatures (Stueken et al., 2015) and metal abundances (Anbar & Knoll, 2002) evidenced by the geologic record, as well as on phylogenetic reconstructions of both catalytic and cofactor biosynthesis proteins (Boyd, Anbar, et al., 2011;Raymond et al., 2004). High marine Fe concentrations and potential Mo scarcity prior to increased atmospheric oxygenation surrounding the ~2.3-2.5 Ga Great Oxidation Event (Anbar et al., 2007;Lyons, Reinhard, & Planavsky, 2014) have led to the hypothesis that Fe-or V-nitrogenases may have been dominant in early oceans (Anbar & Knoll, 2002;Canfield et al., 2010) and possibly predate Mo-nitrogenases (Raymond et al., 2004). More recent phylogenetic reconstructions have instead suggested that the evolution of Mo-nitrogenases, dated by time-calibrated phylogenies of Nif/ Vnf/AnfDKEN sequences to ~1.5-2.2 Ga (Boyd, Anbar, et al., 2011), preceded that of V-and Fe-nitrogenases .
These phylogenetic inferences are also consistent with the observation that vnf and anf genes are only present in organisms that also harbor nif, and that V-/Fe-nitrogenase assembly relies on nif biosynthetic genes Joerger & Bishop, 1988;Kennedy & Dean, 1992). However, ~3.2-Ga isotopic signatures of biological nitrogen fixation suggest an earlier origin of nitrogenase (Stueken et al., 2015) and, even though isotopically consistent with Mo-dependent nitrogen fixation, predate age estimates of both Mo-nitrogenase (Boyd, Anbar, et al., 2011) and earliest marine Mo availability (Anbar et al., 2007;Anbar & Knoll, 2002;Lyons et al., 2014). Thus, the evolutionary trajectory of nitrogenase metal usage-and by extension the link between nitrogenase evolution and marine metal availabilities over geologic time-is not yet known.
Here, we explored the indicators of nitrogenase metal usage history by a combined method relying on ancestral sequence reconstruction, an evolutionary approach by which inferred, historical protein sequence information can be linked to functional inferences of molecular properties evidenced by computational analyses or laboratory experiments (Aadland, Pugh, & Kolaczkowski, 2019;Benner, Sassi, & Gaucher, 2007;Thornton, 2004). These paleogenetic approaches have been increasingly applied in biogeochemically relevant molecular studies to offer insights into the coevolution of life and Earth (Garcia & Kacar, 2019;Gomez-Fernandez et al., 2018;Kacar, Hanson-Smith, Adam, & Boekelheide, 2017). We reconstructed the phylogenetic history of Mo-, V-, and Fe-nitrogenases in order to resurrect ancestral nitrogenases in silico, as well as to map the taxonomic distribution of cofactor biosynthetic components considered necessary for cofactor assembly (Boyd, Anbar, et al., 2011;Curatti et al., 2007;Dos Santos et al., 2012;Hu et al., 2005;Shah et al., 1994Shah et al., , 1986Tal et al., 1991). Through this combined approach, we access ancestral enzyme sequence features suggestive of ancient Mo-dependence. We further present phylogenetic data that suggest that the ancient Mo-utilization was potentially achieved via a unique pathway for the synthesis of the FeMo-or similar cofactor that may be present in poorly characterized, basal lineages. Integration of protein evolution and paleobiology is a unique melding of disparate datasets; this approach may provide hypotheses that address interactions ranging from the external environment to the cellular environment, and from the cellular environment to that maintained around the interacting protein. The exchange of materials across these different scales necessitates constraints on the flow and availability of substrates that make such exchanges possible.
It is the specific nature of these constraints and how they may change in response to external perturbations that enable us to develop completely new experimentally testable hypotheses that connect geochemical reservoirs with biological metabolisms-those that cannot be constructed from macroevolutionary or geological frameworks alone.

| Ancestral reconstruction of nitrogenase protein sequences
An initial dataset of extant nitrogenase Nif/Vnf/AnfHDK homologs was constructed by retrieving amino acid sequences from the National Center for Biotechnology Information non-redundant protein database, accessed September 2018 (O'Leary et al., 2016). Potential homologs were identified by BLASTp (Camacho et al., 2009) (Spatzal et al., 2011), V-nitrogenase (c; PDB 5N6Y; Sippel & Einsle, 2017), and Fe-nitrogenase (d; proposed structure; Harris, Lukoyanov, et al., 2018). Residue numbering from aligned A. vinelandii NifD. Cofactor atom coloring is as follows: C, tan; Fe, rust; Mo, cyan; N, blue; O, red; S, yellow NifD: Avin_01390, NifK: Avin_01400) and an expect value cutoff of <1e−5. The dataset was then manually curated to remove partial and distantly related sequences. Additional nitrogenase sequences were manually retrieved from the Joint Genome Institute's Integrated Microbial Genomes and Microbiomes database, accessed September 2018 (Chen et al., 2019). The nitrogenase sequence dataset was finalized to include NifHDK sequences from 256 taxa, AnfHDK sequences from 14 taxa, VnfHDK sequences from 14 taxa, and outgroup light-independent protochlorophyllide oxidoreductase (Bch/ChlLNB) sequences-sharing distant homology with nitrogenases (Boyd, Anbar, et al., 2011;Hu & Ribbe, 2015;Raymond et al., 2004)-from 10 taxa (Appendix S1; additional analyses were performed with an expanded outgroup, Appendix S2). Only one Nif/ Anf/VnfHDK sequence set was retained per genus to broaden taxonomic sampling. Equal sequence sampling for Anf and Vnf was made to remove the potential for oversampling bias in ancestral sequence inference. H-, D-, and K-subunit sequences corresponding to each taxon were manually checked for synteny of their encoding genes.
AnfHDK and VnfHDK sequences were identified by the proximity of each gene locus to anfG or vnfG, which encodes the additional G-subunit present in the VeFe or FeFe protein, respectively, but not present in the MoFe protein (Eady, 1996). Finally, the presence of the cofactor biosynthetic nifBEN genes was investigated for all taxa represented in our dataset by BLASTp, as well as by manually inspecting the nif genome region.
Reconstruction of ancestral nitrogenase sequences was performed by PhyloBot (Hanson-Smith & Johnson, 2016) (www.phylo bot.com), which automates multiple sequence alignment, phylogenetic reconstruction, and ancestral sequence inference methods. The concatenated 294-sequence dataset of Nif/Anf/VnfHDK homologs (including 10 Bch/ChlLNB outgroup sequences) was aligned by MSAProbs v0.9 5r1 (Liu, Schmidt, & Maskell, 2010) and MUSCLE v3.8.31 (Edgar, 2004). Both alignment outputs were then used to perform phylogenetic reconstruction by RAxML v8.1.15 (Stamatakis, 2014) under 6 different combinations of amino acid substitution and rate heterogeneity models. Branch support was evaluated by the approximate likelihood-ratio test (aLRT) (Anisimova & Gascuel, 2006), which assesses the gain in overall likelihood against a null hypothesis of branch length = 0. Additional phylogenetic reconstructions with an expanded outgroup were performed outside of PhyloBot to resolve root positioning, but were not used in subsequent ancestral sequence inference (Appendix S2). Ancestral sequences were inferred by joint maximum likelihood using CODEML v4.2 (Yang, 2007) at all nodes within the 12 PhyloBot-constructed phylogenies (Tree-1-Tree-12), with gaps inferred by parsimony. To assess ancestral sequence robustness to phylogenetic uncertainty (Hanson-Smith, Kolaczkowski, & Thornton, 2010), ancestors inferred from the top five phylogenies ranked by log-likelihood scores were selected for further analysis (Table 1, Appendix S2). Finally, to evaluate the effects of ambiguously reconstructed sites on subsequent structural analyses, Bayesian-sampled ancestors were inferred from the maximum-likelihood site posterior probabilities calculated by CODEML (Aadland et al., 2019). One hundred random Bayesian sequences were generated for each of the five ancestral nodes of interest across the top five phylogenies. Thus, 25 maximum-likelihood and 2,500 Bayesian-sampled ancestral sequences were analyzed in total. All maximum-likelihood reconstructed trees and ancestral sequences are available for view and download at http://phylo bot. com/61328 2215/.

| Probabilistic model for metal dependence classification of nitrogenase sequences
Extant nitrogenase sequences of known metal dependence were labeled as binding either the FeMo-("Nif"), FeV-("Vnf"), or FeFecofactor ("Anf") according to their phylogenetic clustering, and were used to train a support vector classifier that identifies cofactor specificity, based on the residues in the active-site cofactorbinding pocket. The 30 active-site residues were identified as those residing within 5 Å of any atom in either the FeMo-cofactor of the A. vinelandii NifD protein (PDB 3U7Q; Spatzal et al., 2011) or the FeV-cofactor of the A. vinelandii VnfD protein (PDB 5N6Y; Sippel & Einsle, 2017) (see Section 3.3 for further discussion of selected residues). The corresponding residues in other extant homologs and inferred ancestral sequences were identified by multiple sequence alignment using MAFFT (Katoh & Standley, 2013). For each sequence in the labeled dataset and residue in the active site, the probability distribution over all 20 possible amino acids was inferred by averaging (a) the observed distribution for that sequence, which places probability 1.0 on the observed residue; (b) the probability distribution that would be expected over a short evolutionary distance (0.01 substitutions/site), starting from the observed residue and assuming the JTT substitution model (Jones, Taylor, & Thornton, 1992) all possible amino acids (Sjolander et al., 1996). The labeled dataset was randomly divided into 60% training data for training the support vector classifier and 40% out-of-sample testing. The support vector classifier used a radial basis kernel, with the kernel coefficient estimated as 1/(# of predicted features × Var(features)).
Data samples were weighted based on the frequencies of Nif, Vnf, and Anf labels in the training data: the weight assigned to samples labeled as cofactor I (c i ) = # of samples/(3 × Freq(c i )). The regularization parameter was optimized using fivefold cross-validation, and classification accuracy was assessed by out-of-sample testing. The procedure of randomly splitting data into training/testing, training the classifier, and assessing out-of-sample accuracy was replicated 10 times.
Extant sequences and maximum-likelihood ancestral-reconstructed sequences were classified as Nif, Vnf, or Anf using one-versus-rest multiclass classification, based on the observed amino acid residues in their active sites. In addition, ancestral sequences were also classified based on the inferred posterior probability distributions over all 20 possible amino acids at each position in the active site. Classification support is reported as the distance of each data sample from the support vector hyperplane defining the inferred class. The classifier script can be found at https ://github.com/kacar lab/Ancie ntNit rogenase.

| Structural homology modeling of extant and ancestral nitrogenase D-subunits
Structural homology modeling of extant and ancestral (25 maximum likelihood and 2,500 Bayesian-sampled) nitrogenase D-subunit proteins was performed by Modeller v9.2 (Sali & Blundell, 1993). Extant nitrogenase sequences, broadly sampled from the reconstructed nitrogenase phylogeny, were modeled to provide comparisons with ancestral models. D-subunit sequences from extant and ancestral nitrogenases were aligned to 38 NifD and 2 VnfD structural templates retrieved from the Protein Data Bank (Berman et al., 2000), accessed November 2018 (Appendix S3; published AnfD models not available at time of analysis).
Information from all 40 templates was used to model each structure. All models were generated by specifying the inclusion of the

| Active-site pocket volume calculation of extant and ancestral D-subunit models
Volumes of the modeled ancestral and extant D-subunit active-site cofactor pockets were calculated by POVME v2.0 (Durrant, Votapka, Sorensen, & Amaro, 2014). Spatial coordinates and the inclusion region for volume calculation were specified manually. Pocket volumes were calculated with a grid spacing of 0.5 Å and a 1.09 Å distance cutoff from any receptor atom's van der Waals radius (all POVME input parameters can be found at https ://github.com/kacar lab/ Ancie ntNit rogenase). Volume outside of the modeled convex hull of the cofactor pocket and non-contiguous volume were removed.
Statistical analysis of ancestral and extant pocket volume data was performed in R (R Core Team, 2014).

| V-and Fe-nitrogenases diversified after Monitrogenases
We reconstructed the phylogenetic history of Mo-, V-, and Fe- sess nifBEN genes considered to be minimally required for the synthesis of the FeMo-cofactor (Boyd, Anbar, et al., 2011;Curatti et al., 2007;Dos Santos et al., 2012;Hu et al., 2005;Shah et al., 1994Shah et al., , 1986Tal et al., 1991), it is likely that Mb-Mc nitrogenases are Modependent . Thus, the phylogenetic positioning of Vnf and Anf is consistent with previous suggestions that V-and Fe-nitrogenases diversified after Mo-nitrogenases .

| Basal uncharacterized nitrogenases lack associated genes for FeMo-cofactor synthesis
In addition to investigating the phylogenetic relationships between Mo-, V-, and Fe-nitrogenases, we mapped the presence of the biosynthetic nifB, nifE, and nifN genes-considered to be necessary for FeMo-cofactor assembly (Boyd, Anbar, et al., 2011;Curatti et al., 2007;Dos Santos et al., 2012;Hu et al., 2005;Shah et al., 1994Shah et al., , 1986Tal et al., 1991 Both basal Clfx and F-Mc homologs are found primarily in thermophilic taxa and lack one or more associated nifEN cofactor biosynthesis genes, as has been noted previously Dos Santos et al., 2012;Shock & Boyd, 2015). Clfx homologs, constit™uting the most basal uncharac- suggests that these strains may not be capable of synthesizing the FeMo-cofactor, given previous in vitro and in vivo experiments demonstrating the requirement of both nifEN for FeMocofactor biosynthesis (Curatti et al., 2007;Hu et al., 2005;Shah et al., 1994Shah et al., , 1986Tal et al., 1991).

| High statistical support for ancestral nitrogenase active-site residues
We inferred ancestral sequences for each of the H-, D-, and K-subunits that constitute the nitrogenase enzyme complex probability for each node also does not deviate by more than ~ 0.02 across each of the five phylogenetic topologies (Trees-1-5; Appendix S4). These observations suggest that sequence support for ancestral nitrogenases is more sensitive to ancestral node position than to topological differences between the analyzed trees.
In addition to surveying total ancestral HDK sequence support, we analyzed support for 30 active-site residues, defined as those (c) statistical support for ancestral active-site residues is higher than the mean support across entire HDK sequences (see Section 3.3).
We first assessed the sensitivity of ancestral sequence variation to phylogenetic uncertainty and ancestral statistical support.
Overall, mean identities for ancestral sequences compared across different nodes range from ~55 to 90%. Ancestral sequences inferred from the same node across alternate phylogenies (Trees-1-5) have relatively high mean identities, ranging from ~93 to 95% across the total HDK sequence and from ~96 to 100% within the active site. These high mean identities suggest that topological differences among the alternate phylogenies used for ancestral sequence inference do not contribute to a high degree of ancestral sequence variation. Identities among sequences inferred from the same node also do not appear to be correlated with statistical support. For example, though full AncA HDK sequences are reconstructed with the highest mean statistical support (~0.89-0.91), they exhibit neither the lowest nor the highest mean identities as compared to sequences inferred from other nodes (Appendix S4).
We next identified specific residues within the nitrogenase active site that are unique to particular homologs of known metal dependence in order to survey their occurrence in ancestral sequences. We found that three active-site residues are unique to Mo-nitrogenases for these uniquely conserved residues, we hypothesize that these residues may contribute to the metal specificity of the active-site environment. Surprisingly, most nitrogenase ancestors exhibit comparable numbers of residues unique to either V-/Fe-nitrogenases or Mo-nitrogenases, and thus, their occurrence does not appear informative for inferring ancestral metal dependence (e.g., ancestral AncC sequences contain two residues unique to V-/Fe-nitrogenases and two residues unique to Mo-nitrogenases). An exception is AncA sequences, which contain a preponderance of active-site residues that are unique to extant V-and Fe-nitrogenases. One of the residues unique to only V-nitrogenases, Thr-355, has recently been suggested to interact directly with a proposed FeV-cofactor carbonate ligand not present in the FeMo-cofactor (Sippel & Einsle, 2017). This carbonate ligand lies within a secondary structure loop that also contains Pro-358, unique to V-nitrogenases, and Leu-360, unique to V-and Fe-nitrogenases. These Thr-355, Pro-358, and Leu-360 residues are observed across all AncA sequences.
Classification support (reported as distance of each data sample from the support vector hyperplane defining the inferred class) for ancestral sequences is nearly identical across different ancestors inferred for the same phylogenetic node ( for further details).

| Active-site structural features are uninformative for inferring ancestral metal dependence
To investigate metal-specific features of ancestral nitrogenase structures, we generated homology models of both extant
We used phylogenetic reconstruction, ancestral sequence inference, and structural homology modeling to explore outstanding questions of early nitrogenase evolution and metal dependence.
Though the tree reconstructions presented here are largely in congruence with previous phylogenetic analyses, certain topological differences, particularly with regard to basal uncharacterized homologs lacking one or more associated nifEN genes, suggest important deviations from previous narratives of early metal dependence (Boyd, Anbar, et al., 2011;Boyd & Peters, 2013;Raymond et al., 2004). The reconstruction of ancestral nitrogenase sequences in silico provides the means to directly infer ancient metal-binding properties from molecular information.

| Inferred metal dependence of ancestral nitrogenases
The nitrogenase active site is known, in addition to the metal content of the cofactor, to contribute to the variable catalytic properties of different nitrogenase forms (Brigle et al., 1987;Christiansen et al., 2000;Fixen et al., 2016;Kim et al., 1995;Sarma et al., 2010;Yang et al., 2012). We therefore explored active-site features of ancestral nitrogenases for correlations with metal dependence.
We find that modeled active-site structural features are not informative for the inference of ancestral metal dependence. Pocket volumes of modeled extant nitrogenases do not appear to be strongly correlated with metal cofactor content. Problematically, the predicted volume ranges of oldest ancestors (as well as those of uncharacterized nitrogenases) overlap with both V-and Monitrogenases ( Figure 6). We acknowledge that these homology modeling results may not precisely reflect true biological differences (for which a comprehensive analysis is not yet possible due to the limited availability of V-and Fe-nitrogenase structures). This is illustrated by the difference between pocket volumes of published structures and those of homology models (e.g., A. vinelandii PDB 3U7Q NifD structure pocket volume ≈ 994.750 and mean A. vinelandii NifD modeled structure pocket volume ≈ 1,186.412).
However, it is not surprising that active-site pocket volume does not predict metal dependence, given the probable similar sizes and structures of the FeMo-, FeV-, and FeFe-cofactors (Eady, 1996;Krahn et al., 2002;Sippel et al., 2018;Spatzal et al., 2011). These findings contrast with a previous study by McGlynn et al. (2012) in which pocket volume comparisons were used, in part, to clas- Furthermore, AncA sequences generally exhibit greater numbers of residues unique to V-nitrogenases than those unique to Fenitrogenases, the mean identity of AncA active-site sequences is the highest for V-nitrogenases (Figure 4a), and the classifier analysis based on active-site amino acid probability distributions suggests that AncA sequences most resemble modern V-nitrogenases (Section 3.4; Appendix S6). Together, these observations suggest that AncA is specific toward the VFe-cofactor.
Ancestral and extant sequence comparisons indicate that the active sites of oldest nitrogenase ancestors (AncB-AncE) resemble those of extant Mo-nitrogenases more than V-or Fe-nitrogenases (Figure 4a; Appendix S6). This observation is particularly significant given that these same patterns are not observed across the total HDK sequence ( Figure 4b). Specifically, this discrepancy supports the notion that the nitrogenase active site has adapted to the catalytic properties of each metal cofactor over its evolutionary history  and that this adaptation has manifested in active-site sequence differences that stand apart from baseline phylogenetic distance.
Though we are not able to identify specific residues or motifs that may functionally be attributed to metal dependence (as with AncA), the resemblance of the early ancestral nitrogenase active site to that of Mo-nitrogenases is highly suggestive of an ancient role for an active-site cluster resembling the FeMo-cofactor.

| A proposed model for the evolution of nitrogenase metal dependence over geologic time
Despite the lack of information provided by structural analyses, the active-site sequence features of oldest ancestral sequences (i.e., AncB-E) support the inference that these early nitrogenases incor- resulting only from the duplication of nif structural genes; , typically lack dedicated scaffolding genes and instead co-opt nifEN; Hamilton et al., 2011;Mus et al., 2018). One may thus parsimoniously conclude that uncharacterized AncC-D ancestors similarly lacked one or more nifEN genes. A more recent, previously published topology is more similar to the tree presented here, though lacking in Clfx sequences and thus missing relevant information regarding the lack of associated biosynthetic components for early-diverged homologs (Boyd & Peters, 2013). It is possible that the larger and more broadly sampled sequence dataset used here has refined the placement of these uncharacterized clades, which is supported by our analyses with an expanded outgroup that main- We prefer the second model for several reasons: • It is evolutionarily unlikely for a portion of the FeMo-cofactor biosynthetic pathway to be lost in Clfx and F-Mc taxa. To our knowledge, no other Mo-nitrogenases are known to lack associated nifEN genes, suggesting that the canonical biosynthetic pathway for the FeMo-cofactor is selectively advantageous (anf and vnf gene clusters typically lack dedicated scaffolding genes and co-opt nifEN; Hamilton et al., 2011;Mus et al., 2018). Rather, it is more parsimonious to infer that these early-branching uncharacterized clades diverged prior to the nifDK duplication that produced nifEN, as is evidenced by the nesting of nifEN genes within nifDK clades in previous phylogenetic reconstructions (Boyd, Anbar, et al., 2011).
• It has previously been proposed that early nitrogenases may have been capable of reducing nitrogen prior to the development of the canonical FeMo-cofactor biosynthetic pathway Boyd & Peters, 2013;Mus, Colman, Peters, & Boyd, 2019;Soboh, Boyd, Zhao, Peters, & Rubio, 2010). Further, an ancient Mo-independent nitrogenase may have been capable of-albeit inefficiently-reducing nitrogen by a cofactor resembling the Fe-S-C cluster assembled by NifB, which constitutes the biosynthetic precursor to the FeMo-cofactor (Boyd & Peters, 2013;Mus et al., 2019;Soboh et al., 2010). Though our sequence analyses cannot fully assess ancestral nitrogenase dependence for a NifB-cofactor, it is likely that the NifB-cofactor resembles the structure and composition of the FeFe-cofactor, excepting homocitrate (Corbett et al., 2006;Guo et al., 2016;Harris, Lukoyanov, et al., 2018). • Mo-nitrogenases are far more efficient at reducing nitrogen than V-or Fe-nitrogenases (Eady, 1996;Harris et al., 2019;Harris, Yang, et al., 2018), and the majority of all extant nitrogenases are Mo-dependent across both anoxic and oxic environments Mus et al., 2019;Raymond et al., 2004). Even those organisms that have additional V-or Fe-nitrogenases still retain and preferentially express Mo-nitrogenases (Boyd, Anbar, et al., 2011;Dos Santos et al., 2012;Hamilton et al., 2011;Raymond et al., 2004).  (Stueken et al., 2015). However, this interpretation conflicts with age estimates of both nifEN (inferred to also reflect the age of the FeMo-cofactor) (Boyd, Anbar, et al., 2011) and of earliest marine Mo availability (Anbar et al., 2007;Lyons et al., 2014). It is possible that an alternative pathway for FeMo-of similar cofactor synthesis in early nitrogenases may provide an explanation for these sofar unresolved, early isotopic signatures.
The proposed model of an ancestral, alternative pathway for Mo-cofactor assembly, mapped to our phylogenetic reconstruction and inferred ancestors, is illustrated in Figure 7. In the first stage,

| CON CLUS ION
We reconstructed the phylogenetic history of nitrogenase proteins, as well as inferred and modeled ancestral nitrogenase sequences, in order to explore the evolutionary trajectory of nitrogenase metal dependence. We find that, whereas modeled structural features of ancestral nitrogenases do not offer conclusive indications of ancient metal usage, active-site sequence features of oldest ancestors most resemble those of extant Mo-nitrogenases. The absence of associated cofactor biosynthesis proteins that function within the canonical FeMo-cofactor biosynthetic pathway in several early-branching, uncharacterized homologs evidences a possible alternative and/or simplified pathway for Mo-cofactor assembly. We speculate that this alternative pathway may today be present in extant uncharacterized taxa, and we propose a model wherein canonical Mo-cofactor assembly evolved via the stepwise introduction of FeMo-cofactor biosynthetic components following the divergence of more basal uncharacterized lineages. V-nitrogenases subsequently diversified, followed by Fe-nitrogenases, in agreement with previous phylogenetic inferences that Mo-dependence evolved first . This model helps to reconcile phylogenetic and geobiological explanations of nitrogenase evolution (Anbar & Knoll, 2002;Boyd, Anbar, et al., 2011;. Future molecular paleobiology studies, particularly those that integrate experimental assessments of laboratory-resurrected ancestral nitrogenases, may continue to refine our understanding of nitrogenase and environmental coevolution.