Oxygen False Positives on Habitable Zone Planets Around Sun-Like Stars

Oxygen is a promising exoplanet biosignature due to the evolutionary advantage conferred by harnessing starlight for photosynthesis, and the apparent low likelihood of maintaining oxygen-rich atmospheres without life. Hypothetical scenarios have been proposed for non-biological oxygen accumulation on planets around late M-dwarfs, where the extended pre-main sequence may favor abiotic O$_2$ accumulation. In contrast, abiotic oxygen accumulation on planets around F, G, and K-type stars is seemingly less likely, provided they possess substantial non-condensable gas inventories. The comparative robustness of oxygen biosignatures around larger stars has motivated plans for next-generation telescopes capable of oxygen detection on planets around sun-like stars. However, the general tendency of terrestrial planets to develop oxygen-rich atmospheres across a broad range of initial conditions and evolutionary scenarios has not been explored. Here, we use a coupled thermal-geochemical-climate model of terrestrial planet evolution to illustrate three scenarios whereby significant abiotic oxygen can accumulate around sun-like stars, even when significant non-condensable gas inventories are present. For Earth-mass planets, we find abiotic oxygen can accumulate to modern levels if (1) the CO$_2$:H$_2$O ratio of the initial volatile inventory is high, (2) the initial water inventory exceeds ~50 Earth oceans, or (3) the initial water inventory is very low. Fortunately, these three abiotic oxygen scenarios could be distinguished from biological oxygen with observations of other atmospheric constituents or characterizing the planetary surface. This highlights the need for broadly capable next-generation telescopes that are equipped to constrain surface water inventories via time-resolved photometry and search for temporal biosignatures or disequilibrium biosignatures to assess whether oxygen is biogenic.

Planetary evolution is divided into an initial magma ocean phase, and a subsequent solid-mantle phase, as shown in Figure 1, although a planet may transition between magma ocean and solid mantle multiple times. The model is initialized with a fully molten mantle and some endowment of volatiles, radiogenic inventory, and an initial mantle oxygen fugacity (i.e., after core formation, Figure 1, left). Magma ocean solidification follows previous models such as Lebrun et al. (2013) and Schaefer et al. (2016): the magma ocean freezes from the core, upwards, as governed by the following equation: Here, V mantle is the volume of the molten mantle, ρ m is the average density of the mantle, Q radioactive is radiogenic heat production per unit mass, r p is planetary radius, H fusion is the latent heat of fusion of silicates, r s is the solidification radius, c p is the specific heat of silicates, Q core is the heatflow from the metallic core, and T p is mantle potential temperature. The heatflow from the interior, q m , is calculated using 1-D convective parameterization, with temperature-dependent magma ocean viscosity, ν(T p ), (see Figure S3 and supporting information Section A.3): Here, T surf is mean surface temperature, and C qm is a constant that depends on thermal conductivity, thermal diffusivity, critical Rayleigh number, gravity, and thermal expansivity. Supporting information Section A.1 describes this convection parameterization in more detail. Equation 1 continues to govern the thermal evolution of the mantle after the magma ocean has solidified with dr s /dt = 0.0.
During magma ocean solidification H, C, and O are partitioned between dissolved melt phases, crystalline phases, and the atmosphere by assuming chemical equilibrium (supporting information Section A.7). The rate at which the mantle freezes is controlled by outgoing longwave radiative (OLR), which is balanced by heat from interior and absorbed shortwave radiation (ASR) from the host star at every timestep (see climate model description below): During the magma ocean phase, planetary oxidation may occur from the loss of hydrogen to space (less oxygen drag): Free oxygen produced via H escape is dissolved in the melt and may be transferred to the solid mantle as the magma ocean solidifies (Schaefer et al., 2016). We parameterized atmospheric escape as either diffusion limited or XUV-limited, depending on the composition of the stratosphere and the stellar XUV flux (supporting information Section A.9). During XUV-driven escape of a steam-dominated atmosphere, the hydrodynamic escape of H may drag along O (and even CO 2 ) following Odert et al. (2018). In contrast, if the stratosphere is mostly dry, then the escape of H will be limited by the rate at which H 2 O can diffuse through the cold trap (Wordsworth & Pierrehumbert, 2013), and nothing heavier than H escapes. Standard parameterizations of solar bolometric luminosity (Baraffe et al., 1998(Baraffe et al., , 2002 and XUV luminosity evolution are adopted (Tu et al., 2015), as described in supporting information Section A.4.
A radiative-convective climate model is used to self-consistently calculate surface temperature, OLR, ASR, the water vapor profile, and surface liquid water inventory (if any) during both the magma ocean phase and subsequent temperate evolution. OLR is a function of the surface H 2 O and CO 2 inventories, and is calculated using the publicly available correlated-k radiative transfer code of Marcq et al. (2017). To obtain KRISSANSEN-TOTTON ET AL.

10.1029/2020AV000294
4 of 20 OLR in the presence of condensable water vapor, a dry adiabat to moist adiabat to isothermal atmospheric structure is assumed (Kasting, 1988). To calculate ASR across a wide range of temperatures, we adapted the albedo parameterization described in Pluriel et al. (2019). Refer to supporting information Section A.5 for full details of radiative transfer calculations along with example outputs.
When heat from accretion and short-lived radiogenics is sufficiently dissipated-the timescale for which is controlled by insolation and greenhouse warming from outgassed volatiles-a planet's surface temperature may drop below the solidus and the magma ocean phase is over. At this point, the model transitions to solid-state mantle convection and temperate geochemical cycling (Figure 1, right). The redox budget during solid-state evolution is modeled as follows: the only source of oxygen is still atmospheric escape using the same parameterization as described above. However, there are now numerous crustal sinks for oxygen including (i) subaerial and submarine outgassing of reduced species (e.g., H 2 , CO, and CH 4 ), (ii) water-rock serpentinizing reactions that generate H 2 (the "wet crustal" sink), and (iii) direct oxidation of surface crust by atmospheric oxygen (the "dry crustal" sink): The sizes of these three oxygen sinks are self-consistently calculated from the planetary interior evolution and mantle volatile content. Outgassing fluxes are calculated using the melt-gas equilibrium outgassing model of Wogan et al. (2020); outgassing fluxes depend on mantle oxygen fugacity, degassing overburden pressure, the volatile content of the mantle, specifically H 2 O and CO 2 content, and the rate at which melt (new crust) is produced (see supporting information Section A.10). The possible influence of pressure overburden on redox evolution has been discussed previously (Wordsworth et al., 2018), and is quantifiable within our outgassing model framework. Dry and wet crustal sinks for oxygen similarly depend on crustal production rates, and are described in full in supporting information Sections A.12 and A.13. Crustal production is calculated from interior heatflow, which is modulated by temperature-dependent mantle viscosity and radiogenic heat production (supporting information Section A.10). We assume plate tectonics when calculating crustal production rates to maximize crustal sinks of oxygen.
Weathering processes (Krissansen-Totton, Arney, et al., 2018) and the deep hydrological cycle (Schaefer & Sasselov, 2015) are explicitly modeled because climate and surface volatile inventories control crustal oxygen sinks and atmospheric escape processes. Our model incorporates a rudimentary carbon cycle, which is a simplified version of that described in Krissansen-Totton, Arney, et al. (2018). Carbon is added to the atmosphere via magmatic outgassing (described above), and returned via continental weathering and seafloor weathering, whose relative contributions depend on climate and the total surface water inventory. Our carbon cycle parameterization is described in supporting information Section A.11 and the deep hydrological cycle in supporting information Section A.12. Self-consistent climate modeling enables an assessment of whether abiotic oxygen can coexist with a habitable surface climate, which would be especially problematic for unambiguous biosignature gas interpretations.
The time evolution of volatile reservoirs during both the magma ocean phase and the solid state evolution is governed by the following equations (see supporting information Section A.6 for details):


Here, Λ i represents a generic volatile species (e.g., H 2 O, CO 2 , free O). The first term on the right hand side represents the transfer of volatiles from the fluid phases (magma ocean + atmosphere) to the solid mantle as the magma ocean solidifies: Λ i k is the melt-solid partition coefficient for species Λ i , and Esc . Because we are assuming a plate tectonics regime, the model does not separately track volatile reservoirs in the crust and mantle. Instead, we assume carbon, oxygen, and water added to crust is immediately subducted into the mantle; a single, wellmixed "interior" reservoir is used to represent storage of volatiles in solid silicates. The extremely broad range of crustal hydration and crustal oxidation efficiency factors sampled (see below) can accommodate differing subduction and arc volcanism efficiencies. Note that the model only tracks C, H, and O-bearing species, as well as Fe 2+ /Fe 3+ speciation in the interior. Nitrogen fluxes are not modeled, and we instead assumed a 1 bar N 2 background partial pressure in all model runs. This conservative assumption ensures that, for temperature surface conditions, there are always sufficient non-condensables to maintain a cold trap and prevent excessive water loss; any oxygen accumulation that results is due to other processes.
The model does not include any explicit photochemistry; it tracks fluxes of oxygen into/out of the combined atmosphere-ocean reservoir, and all outgassed reductants are assumed to instantaneously deplete atmospheric oxygen. This simplification is adequate for estimating oxygen accumulation because if oxygen sources exceed oxygen sinks, then oxidant build-up will occur; neither photolysis reactions nor spontaneous reactions can add net reducing power the atmosphere-ocean system. Our model cannot predict low steady state oxygen abundances in predominantly anoxic atmospheres, however; atmospheric O 2 is truncated at a lower limit of 0.1 Pa for numerical efficiency. Importantly, our modeling approach is agnostic on the plausibility of the various photochemical scenarios that have been proposed for abiotic O 2 atmosphere, such as O 2 -and CO-rich atmospheres maintained by continuous CO 2 photodissociation (Gao et al., 2015;Hu et al., 2020). Studies of these scenarios enforce global redox balance at the boundaries of the atmosphere-ocean system and determine whether appreciable atmospheric oxygen exists in the resultant photochemical steady state (e.g., Harman et al., 2015). Here, we are instead using a time-dependent model to investigate whether slight imbalances in atmosphere-ocean boundary fluxes can result in atmospheric oxygen accumulation on long timescales (cf., Luger & Barnes, 2015;Schaefer et al., 2016;Wordsworth et al., 2018).
There are many uncertain parameterizations and parameter values in our model, and so all results are presented as Monte Carlo ensembles that randomly sample a wide range of uncertain parameter values. Parameter ranges and their justifications are described in full in the supporting information (Table S1). We sampled a range of temperature-dependent mantle viscosities, efficiencies of XUV-driven escape, uncertain early sun XUV fluxes, carbon cycle feedbacks, deep hydrological cycle dependencies, and albedo parameterizations. Unknown parameters that are particularly important for oxygen false positives include the dry crustal oxidation efficiency, f dry−oxid , which is the fraction of Fe 2+ in newly produced crust that is oxidized to Fe 3+ in the presence of an oxidizing atmosphere via non-aqueous reactions. This parameter is sampled uniformly in log space from 10 −4 to 10% (Gillmann et al., 2009). Another important parameter is the XUV-driven escape efficiency,  lowXUV, which is the fraction of stellar XUV energy that drives H-escape. This is sampled uniformly from 0.01 to 0.3, and the portion of energy that goes into escape once the XUV flux exceeds what is required for O-drag is an additional free parameter.

Model Validation
To validate the model, we first show that it can successfully reproduce the atmospheric evolution of Earth and Venus. Venus results are described in detail in supporting information Section C, and here we summarize key results for Earth. Figure 2 shows Monte Carlo model outputs over a range of Earth-like volatile inventories, specifically an initial water content of 1-10 Earth oceans, an initial CO 2 content of 20-2,000 bar. Moreover, only initial CO 2 inventories less than the initial water inventory by mass are permitted, and an initial (post core-formation) mantle redox state around the Quartz-Fayalite-Magnetite (QFM) buffer is assumed. There is evidence for more reducing Hadean continental crust (Yang et al., 2014), and other terrestrial planets such as Mars likely have reducing mantles (Wadhwa, 2001). While a rapidly oxidized mantle (e.g., Zahnle et al., 2010) is assumed in all nominal calculations, the sensitivity of our results to initial mantle redox is explored in supporting information Section G.

10.1029/2020AV000294
6 of 20 Figure 2a shows the time-evolution of mantle potential temperature and surface temperature, Figure 2b shows the solidification of the magma ocean from core to surface, which takes several Myr, Figure 2c shows the evolution of atmospheric volatile inventories, Figure 2d shows the globally averaged depth of liquid water oceans. The inflection in temperature evolution and solidification radius around 10 4 years reflects the transition from a low viscosity, rapidly convecting magma ocean, to more solid-like magma mush convection (Lebrun et al., 2013). Note the transfer of water from steam atmosphere ( Figure 2c) to liquid water ocean (Figure 2d) following magma ocean solidification at around 10 6 years. Figure 2e shows the planetary energy budget, Figure 2f shows carbon outgassing and weathering fluxes, Figure 2g shows total crustal production, Figure 2h shows the evolution of (solid) mantle oxygen fugacity relative to the QFM buffer, and Figure 2i shows oxygen fluxes into/out of the atmosphere, excluding loss of oxygen to the magma ocean, which is modeled as instantaneous melt-atmosphere equilibration rather than a continuous sink flux; this temperature-dependent equilibrium partitioning controls atmospheric pO 2 for the first few million years ( Figure 2c).
Our modeled early Earth atmosphere-thermal-climate evolution is broadly consistent with semiquantitative reconstructions of Hadean atmospheric evolution (Zahnle et al., 2007(Zahnle et al., , 2010. We also find that in virtually every model run, after 4.5 Gyr of atmospheric evolution, the atmosphere is anoxic (Figure 2c). This is unsurprising. Hydrogen escape during the initial ∼ Myr magma ocean does not add free oxygen to the atmosphere but instead oxidizes the interior, as has been described previously (Hamano et al., 2013). In some cases, small amounts of abiotic oxygen are produced in the post magma-ocean steam atmosphere, but KRISSANSEN-TOTTON ET AL.  Earth's coupled redox-thermal-climate evolution (without life). The model is applied to the Earth from magma ocean to present with initial water inventories ranging from 1 to 10 Earth oceans, and initial CO 2 inventories ranging from 20 to 2,000 bar. Additionally, we only plot model runs where the initial water inventory exceeds the initial CO 2 inventory. The lines are median values and shaded regions denote 95% confidence intervals across 3,000 model runs.
In the absence of life, Earth's atmosphere after 4.5 Ga is always anoxic (c) because outgassing and crustal hydration sinks overwhelm oxygen production via photolysis and diffusion-limited hydrogen escape (i). (a, b) The magma ocean persists for a few million years, consistent with previous studies. (e) The magma ocean ends when the planet's interior cools such that heatflow from the interior drops below the runaway greenhouse limit. (d) When this occurs, liquid water oceans condense onto the surface, (f) a temperate carbon cycle commences. (c) There is sometimes a brief spike in atmospheric oxygen following magma ocean solidification due to the persistence of a steam atmosphere and hydrogen escape, (i) but this oxygen is rapidly drawn down by geological sinks. (g) Volatile cycling is controlled by the rate at which fresh crust is produced. Mantle redox evolution is plotted (h) alongside proxy estimates (O'Neill et al., 2018;Trail et al., 2011). this atmospheric oxygen is rapidly overwhelmed by outgassing and other crustal sinks. Subsequent oxygen production via diffusion-limited escape is small, and so there are no further opportunities for abiotic accumulation so long as the planet remains geologically active.
For Venus, the model can recover current atmospheric conditions assuming the initial water inventory is small, and that crustal sinks of oxygen are efficient (see supporting information Section C). Venusian histories in which the surface was never habitable and in which the surface was habitable for several billion years can both be reconciled with the current atmosphere, which is broadly consistent with previous modeling of Venus ' atmospheric evolution (Chassefière et al., 2012;Kasting & Pollack, 1983;Way et al., 2016).

Results
If Earth's initial volatile inventories are varied, then oxygen-rich atmospheres may be possible. Here, we outline three scenarios whereby an abiotic Earth could have accumulated an oxygen-rich atmosphere after 4.5 Gyr. None of these scenarios guarantee an oxygen-rich atmosphere; instead, oxygenated atmospheres are a possible outcome that is dependent on the efficiency of oxygen crustal sinks and atmospheric escape.

Scenario 1: High CO 2 :H 2 O Initial Inventory Leading to Perpetual Runaway Greenhouse
Figure 3 shows selected model outputs for planets with initial CO 2 :H 2 O volatile inventories greater than one by mass and atmospheric O 2 > 10 17 kg at present (P O2 > ∼0.02 bar). For these planets, the greenhouse warming from a dense CO 2 atmosphere ensures that the surface temperature is above the critical point of water; liquid water never condenses on the surface at 1 AU, except briefly, and in small amounts, during KRISSANSEN-TOTTON ET AL.  . Oxygen false positives from high initial CO 2 :H 2 O inventories (Scenario 1). The model is applied to the Earth from magma ocean to present with randomly sampled initial water inventories ranging from ∼0.1 to 10 Earth oceans, and initial CO 2 inventories ranging from ∼20 to 2,000 bar (implying CO 2 :H 2 O ranging from 0.01 to 100 by mass). Only model outputs with modern day atmospheric oxygen exceeding 10 17 kg (>∼0.02 bar) are plotted. Subplots are the same as in Figure 2, and shaded regions denote 95% confidence intervals. (a) High atmospheric CO 2 ensures the surface temperature always exceeds the critical point of water after the pre-main sequence, and (d) thus permanent liquid water oceans do not condense. The lack of surface water, low volatile content of the mantle, and high surface pressure increasing volatile solubility in partial melts all limits oxygen sinks. (a, i) The largest atmospheric sink is dry crustal oxidation, which diminishes with time as the interior cools. (c) Atmospheric oxygen produced via H escape may start to accumulate after several Gyr of evolution. early solar evolution (Figure 3d). These high CO 2 :H 2 O perpetual runaway atmospheres have been described previously Salvador et al., 2017). The lack of liquid surface water precludes CO 2 -drawdown via silicate weathering (Figure 3f). Reactions between supercritical water and silicates will be severely kinetically limited by sluggish solid state diffusion, and are therefore assumed to be negligible (Zolotov et al., 1997). Consequently, a dense CO 2 atmosphere and supercritical surface temperature persist indefinitely (Figure 3a), despite the planet residing in the habitable zone. Moreover, there is sufficient steam in the atmosphere to ensure diffusion-limited hydrogen escape provides an appreciable source flux of oxygen ( Figure 3i).
Oxygen accumulation also requires limited oxygen sinks, and this may occur on high CO 2 :H 2 O worlds for two reasons. First, since permanent oceans do not condense, it is difficult to sequester outgassed volatiles left over from the magma ocean in the interior; hydration reactions and carbonatization reactions do not occur without liquid surface water. Limited mantle regassing after magma ocean outgassing implies low mantle volatile content, which inhibits the capacity of outgassed reductants to draw down oxygen. Second, the high pressure from the dense CO 2 atmosphere, while not enough to significantly increase the silicate solidus, does increases the solubility of volatiles in partial melts. In combination with low mantle volatile content, this ensures limited outgassing of reducing species (Figures 3f and 3i, cf., ).
However, even with limited outgassing, new crust is still being produced that may be directly oxidized by gaseous O 2 (Figure 3g). Figure 4a shows atmospheric oxygen abundances after 4.5 Gyr as a function of dry crustal oxidation efficiency, f dry−oxid , for a large number of model runs sampling 10 20 -10 22 kg initial CO 2 and H 2 O (or ∼20-2,000 bar and 0.1-10 Earth oceans). The efficiency parameter is necessarily quite low (<0.1%) in all the model runs that produce significant oxygen. The plausibility of such inefficient crustal oxidation is explored in the discussion. Figure 4b   Gyr is plotted as a function of (a) dry crustal oxidation efficiency, and (b) the initial CO 2 :H 2 O inventory by mass. Note that Scenario 1 oxygen accumulation (high CO 2 , perpetual runaway greenhouse atmospheres) requires both an initial CO 2 :H 2 O ratio >1 (green box) and for dry crustal oxidation to be relatively inefficient, with <0.1% of Fe 2+ in newly produced crust oxidized. The observed range in carbonaceous chondrite CO 2 :H 2 O ratios (purple interval) is shown in (b) as a rough proxy for Earth's initial volatile inventory. The outliers with high oxygen (red box) are Scenario 3 (desertworld) false positives, which are examined in Section 3.3. Anoxic atmospheres truncate at ∼10 −6 bar for numerical efficiency (see supporting information); these model runs represent outcomes with essentially no atmospheric oxygen.

Scenario 2: Waterworlds
Figure 5 shows selected model outputs for planets with H 2 O volatile inventories between 10 and 230 Earth oceans. For these planets, a liquid water ocean condenses out of the atmosphere after a few million years and any oxygen left over from the post-formation steam-dominated atmosphere is typically removed by oxygen sinks (Figure 5i), just as in the nominal Earth model ( Figure 2). However, the pressure overburden from the large surface water inventory dramatically increases the solidus of the silicate mantle. This effect, which has been described previously (Kite & Ford, 2018;Noack et al., 2016), causes fresh crustal production to cease completely after a few billion years when the mantle potential temperature drops below the solidus ( Figure 5g). The cessation of crustal production suppresses all geological oxygen sinks; hydration reactions stop as there is no melt production or new crust to oxidize (Figure 5i). The source flux of oxygen is low in this scenario. An effective cold trap ensures oxygen production rates of ∼0.01 Tmol/yr via diffusion-limited escape. However, since oxygen sinks are negligible, this small source flux is sufficient to accumulate modern Earth-like oxygen abundances over several Gyr (Figure 5c). There are also a small number of model runs where oxygen persists from the magma ocean, since equilibrium oxygen fugacities are high at the elevated solidus temperature under high pressures (Figure 5c). Figure 6 shows the oxygen abundance after 4.5 Gyr for individual model runs as a function of the initial water inventory. Waterworld oxygen false positives only begin to become likely when the initial water inventory exceeds around 50 Earth oceans or 10 23 kg (for Earth-sized planets). There is significant scatter in results due to uncertainty in the temperature-dependent mantle viscosity, which controls the duration of tectonics.  . The model is applied to the Earth from magma ocean to present with randomly sampled initial water inventories ranging from 10 to 230 Earth oceans, and initial CO 2 inventories ranging from ∼20 to 6,000 bar. Only model outputs with modern day atmospheric oxygen exceeding 10 17 kg (>∼0.02 bar) are plotted. Subplots are the same as in Figure 2, and shaded regions denote 95% confidence intervals. (d) The large surface volatile inventory increases the (g) mantle solidus such that melt production and tectonics shut off shortly after formation. (i) This shuts down all oxygen sinks and (c) allows for the gradual accumulation of oxygen via diffusion-limited H escape over several Gyr.

Scenario 3: Desertworlds
The final scenario whereby abiotic oxygen could accumulate on habitable zone planets around Sun-like stars occurs for planets with extremely small initial volatile inventories (initial water inventory < ∼0.3 Earth oceans). Figure 7 shows selected model outputs representing this desertworld false positive. The required sequence of events are as follows: the low volatile inventory ensures that the magma ocean freezes quickly (typically ∼10 5 years, Figure 7b), even though the planet is still in a runaway greenhouse state due to high heatflow from the interior (Figure 7e). A steam-dominated atmosphere can therefore persist for a few million years, and oxygen may accumulate during this time because there is no surface magma ocean to dissolve the oxygen. Dry crustal oxidation will remove some oxygen during this steam atmosphere phase, but oxidation will be limited by the rate at which oxygen can diffuse into extrusive lava flows (Figures 7i and S1d). When a shallow ocean does eventually condense out as heatflow from the interior drops below the runaway greenhouse limit (Figures 7d and 7e), oxygen may persist for billions of years if oxygen sinks are small ( Figure 7c). Outgassing sinks are limited by the low volatile inventory of the planet, but inefficient dry crustal oxidation is also required post-magma ocean for the oxygen to persist for 4.5 Gyr; the efficiency parameter, f dry−oxid , must be <0.1% ( Figure S1). This habitable scenario is qualitatively different to the uninhabitable Scenario 1 false positives: in the former, atmospheric oxygen accumulates early and gradually declines due to crustal sinks, whereas in the latter, oxygen accumulation takes several Gyrs. Desertworld false positives also require efficient XUV-driven hydrodynamic escape (>10%) during the steam atmosphere phase to produce large atmospheric oxygen abundances after the magma ocean has solidified ( Figure S1).

Discussion
The modeling approach adopted in this paper has several important caveats and limitations. First, we consider the reasons why abiotic oxygen accumulation could be underestimated in our model.

Assumptions That May Underestimate Abiotic Oxygen Accumulation
As noted above, the model does not track nitrogen fluxes and instead assumes 1 bar N 2 partial pressure throughout. This limits oxygen accumulation by providing a non-condensable background gas to throttle hydrogen escape at the cold trap (Kleinböhl et al., 2018;Wordsworth & Pierrehumbert, 2014). Nitrogen atmospheric evolution for terrestrial planets is highly uncertain, and the evolution of Earth's atmospheric N 2 inventory is poorly constrained (Johnson & Goldblatt, 2018;Stüeken et al., 2016). However, for terrestrial planets that form with low nitrogen inventories, or with most of their nitrogen sequestered in the interior (Wordsworth, 2016), then oxygen accumulation on temperate planets could be a more common outcome than our modeling suggests. Our nominal outgassing model may overestimate fluxes of reduced gases per unit mass partial melt. This is because we do not account for graphite saturation and redox-dependent partitioning of carbon-bearing species between crystalline and melt phases; we instead assume a constant partition coefficient for relating solid mantle CO 2 content and total melt plus gas phase concentrations (e.g., Lebrun et al., 2013). Models of Martian outgassing (Grott et al., 2011), and more generalized terrestrial outgassing models (Ortenzi et al., 2020) both show that reducing mantles tend to outgas fewer volatiles by mass than more oxidized mantles for the same amount of crustal production. Our model does, however, account for the greater reducing power of volcanic outgassing on planets with lower mantle oxygen fugacities Wogan et al., 2020). Consequently, our conservative approach maximizes fluxes of outgassed reductants, and may KRISSANSEN-TOTTON ET AL.  Gyr is plotted as a function of the initial planetary water inventory. Waterworld oxygen false positives are unlikely unless the initial water inventory exceeds 10 23 kg (∼50 Earth oceans). There is a high probability of abiotic oxygen accumulation for water inventories exceeding 100 Earth oceans. Clustering around 0.1 bar occurs because this is the amount of oxygen accumulated after 4.5 Gyr for temperate surface conditions and Earth-like stratospheric water abundances. Warmer surface oceans (∼400-600 K) result in more stratospheric water vapor and thus enable higher oxygen levels. underestimate oxygen accumulation. Sensitivity tests which consider more reducing mantles and graphite saturated melts are described below. Finally, our model may underestimate the duration of steam atmospheres following magma ocean solidification, and therefore underestimate oxygen accumulation prior to ocean condensation. Based on the time required to precipitate an Earth ocean and the apparent absence of stable climate states at the runaway greenhouse limit ( Figure S4), it is typically assumed that the time required to transition from steam atmosphere to surface water ocean is ∼10 3 years (Abe, 1993;Zahnle et al., 2007). In our model, the atmosphere-ocean system is assumed to be in radiative equilibrium with a negligible heat capacity (Lebrun et al., 2013). However, for waterworlds with hundreds of Earth oceans, the time required for a steam atmosphere to condense is long enough for abiotic oxygen accumulation to be significant. Moreover, stable climate states may exist in between a molten surface and a temperate surface ocean, especially if cloudswhich we ignore-are included in radiative transfer calculations   Figure 6). Finally, in our model, oxygen is partitioned between the magma ocean and the atmosphere assuming chemical equilibrium. This is a reasonable assumption for high temperature/low-viscosity magma oceans with very short mixing times, but as the surface temperature approaches the solidus, the "magma ocean mush" will behave more like a solid (Lebrun et al., 2013;Salvador et al., 2017), and oxygen produced during the final stages of the magma ocean may not be efficiently sequestered in the melt.

Assumptions That May Overestimate Abiotic Oxygen Accumulation
Next, we consider model assumptions that might cause abiotic oxygen accumulation to be overestimated.  . The model is applied to the Earth from magma ocean to present with randomly sampled initial water inventories ranging from ∼0.05 to 0.35 Earth oceans, and initial CO 2 inventories ranging from ∼6 to 60 bar. Only model outputs with modern day atmospheric oxygen exceeding 10 17 kg (>∼0.02 bar) are plotted. Subplots are the same as in Figure 2, and shaded regions denote 95% confidence intervals. (b) The low initial volatile inventory ensures the magma ocean solidifies before the runaway greenhouse is over, (c) allowing for significant oxygen accumulation in the ∼Myr steam atmosphere that (e) persists until the heatflow from the interior drops below the runaway greenhouse limit. If subsequent oxygen sinks are low, then the few bar oxygen that accumulate early on may persist for billions of years. (Krissansen-Totton et al., 2016), and the reaction may occur via lightning. Indeed NO-formation via lightning has previously been established as an important mechanism for removing photochemically produced atmospheric oxygen and inhibiting oxygen false positives . However, nitrogen fixation via lighting is unlikely to prevent abiotic oxygen accumulation on waterworlds. In the absence of any other sources or sinks nitrate-formation via lightning could draw down the modern Earth's atmospheric oxygen reservoir in 20-200 Myr (Krissansen-Totton et al., 2016) and could therefore remove 1 bar of nitrogen every 0.17-1.7 Gyr. Once most atmospheric nitrogen is converted to nitrate in the ocean, oxygen may accumulate rapidly due to the lack of a non-condensible cold trap (Wordsworth & Pierrehumbert, 2014), and nitrogen will not be significantly replenished by outgassing on waterworlds due to the pressure overburden effect precluding new crustal production. Waterworlds with large initial nitrogen atmospheric inventories (10 s of bar) could avoid abiotic oxygen accumulation if oxygen drawdown via lightning exceeds oxygen production via diffusion-limited hydrogen escape. However, it is debated whether nitrate is the kinetically stable form of nitrogen on habitable worlds (Hu & Diaz, 2019;Ranjan et al., 2019;Wong et al., 2017). Efficient conversion of nitrate back to molecular nitrogen via abiotic chemodentrification might return oxygen to the atmosphere and prevent its accumulation in the ocean, regardless of the initial N 2 volatile inventory. In summary, it is unlikely that nitrate sinks will always preclude abiotic oxygen accumulation on waterworlds, but better constraints on aqueous nitrogen chemistry would improve this assessment.
Another important limitation of our model is that it ignores infrared cooling of the upper atmosphere and the throttling of hydrogen escape that results from a cool stratosphere. In our nominal model, an isothermal 210 K stratosphere is assumed. However, CO 2 -rich atmospheres may efficiently radiate in the IR, cooling the stratosphere and enhancing the water cold trap (Wordsworth & Pierrehumbert, 2013). To test the sensitivity of our results to stratospheric temperature, we repeated our calculations and introduced an additional stratospheric temperature variable, which was randomly sampled from 150 to 250 K. The full results of these calculations are shown in Figure S13. To summarize, neglecting the stratospheric radiation budget does not affect the viability of Scenarios 2 (waterworlds) or 3 (desertworlds) because, in the former, the oxygen source is diffusion-limited escape through a N 2 -dominated atmosphere at modern Earth-like rates, whereas for the later, oxygen accumulation occurs predominantly during the early, steam-dominated atmosphere, and so stratospheric temperature does not have a strong influence on escape rates (Figures S13b and S13c). However, oxygen accumulation via Scenario 1 (high CO 2 :H 2 O perpetual runaway greenhouse) is unlikely if the stratosphere is cooler than 200 K ( Figure S13a). Photochemically produced ozone or hazes may offset the cooling effect of CO 2 , and so a full assessment of Scenario 1 requires more detailed radiative-photochemical modeling.
One additional caveat is that our model assumes an anhydrous solidus. This simplification is probably reasonable for post-magma ocean mantle conditions (Kite & Ford, 2018), but it is possible to imagine a scenario whereby waterworld mantles become increasingly hydrated via subduction after the magma ocean phase, and that this hydration offsets the pressure overburden effect to maintain geologic activity, and therefore oxygen sinks, for much longer than our nominal model suggests. To test this possibility, we conducted a sensitivity test where we accounted for mantle hydration decreasing the solidus (Katz et al., 2003). The results of this sensitivity test are discussed in detail in supporting information Section D, but in summary, mantle hydration does not have a large effect on oxygen accumulation on waterworlds; it merely shifts the ocean mass threshold for oxygen accumulation.
Sensitivity tests were also conducted to assess whether the delivery of reducing material such as metallic iron and FeO via impacts (Zahnle et al., 2020), a more reducing initial mantle, or larger planet-star separations could inhibit abiotic oxygen accumulation. These sensitivity test results are described in full in supporting information Sections E, G, and H, respectively. In summary, we find that high impactor fluxes could preclude desertworld oxygen accumulation assuming all impactor material is completely oxidized (supporting information Section E). Impactors may thus prevent Scenario 3 (desertworld) false positives in some cases, although this is not guaranteed because it is possible to imagine planetary formation pathways with smaller impactor fluxes and/or where the majority of impactor material is buried or lost to space as suggested by some impact simulations (Marchi et al., 2018). Scenarios 1 (high CO 2 :H 2 O perpetual runaway greenhouse) and 2 (waterworlds) are viable under a more reducing (iron-wüstite buffer) initial mantle (supporting information Section G). This counterintuitive result occurs because, even though degassed KRISSANSEN-TOTTON ET AL.

10.1029/2020AV000294
13 of 20 volatiles are likely to be more reducing, total volatile concentrations in the melt phase are typically lower due to graphite saturation (Grott et al., 2011;Ortenzi et al., 2020). Moreover, crustal sinks are precluded by high overburden pressure, regardless of the redox state of the crustal material. Although Scenario 3 (desertworlds) is seemingly not excluded by a more reducing mantle, evaluating this would require more complete radiative transfer and photochemical modeling of CO-H 2 dominated atmospheres. Finally, when nominal calculations are repeated at 1.3 AU, both Scenario 2 and Scenario 3 oxygen false positives still occur frequently (supporting information Section H). Scenario 1 false positives do not occur at large planet-star separations because a high CO 2 :H 2 O atmosphere cannot maintain a perpetual runaway greenhouse state after magma ocean solidification.
For Scenario 1 and 3 to be viable, the dry crustal oxidation parameter, f dry−oxid , must be relatively small (<0.1%). This contrasts with Venus where f dry−oxid probably needs to exceed >0.1% to remove virtually all O 2 from the atmosphere (see Venus validation in supporting information). This parameter is challenging to definitively constrain because it represents a range of physical processes including the diffusion of oxygen into extrusive lava flows (Gillmann et al., 2009), direct oxidation of small grain erosion products (Arvidson et al., 1992), and various other gas-solid redox reactions (Zolotov, 2019). Even if the oxidation of fresh crust is typically efficient, low dry crustal oxidation efficiencies cannot be ruled out because tectonic regimes where most magmatic activity is intrusive and isolated from the atmosphere are possible. In any case, the uncertainties in crustal oxidation processes highlights need for future missions to Venus to better constrain its redox evolution. Mars' crust is more oxidized than its upper mantle, but this oxidation cannot necessarily be used to constrain dry crustal oxidation efficiency since it might be attributable to early hydrous alteration (Herd et al., 2002;Wadhwa, 2001). The presence of gray, reduced sediments mere centimeters below more oxidized Martian regolith, as revealed by Curiosity, argues against efficient post depositional gas-solid oxidation under oxic conditions (Ming et al., 2014).
Finally, we note that in some cases abiotic oxygen accumulation is contingent on highly uncertain atmospheric escape physics. This uncertainty does not affect the viability of the waterworld (Scenario 2) false positives because the required H escape flux is comparable to the modern Earth's diffusion-limited escape flux (Catling & Kasting, 2017, p. 148). Oxygen accumulation only occurs in this case because of the pressure overburden suppression of oxygen sinks. Scenario 1 (high CO 2 :H 2 O perpetual runaway greenhouse) is similarly unaffected. However, for Scenario 3 (desertworlds), oxygen accumulation only occurs because of efficient XUV-driven escape of hydrogen,   lowXUV 0.1. If H-escape is photochemically limited (e.g., Wordsworth et al., 2018) then oxygen accumulation may be limited by the photochemical dissociation of water by UV photons, and the rate and which H 2 O recombination reactions occur. However, the oxygen source fluxes required in our desertworld scenario (∼500 Tmol O 2 /yr, Figure 7i) are comparable to the water loss rates inferred for a steam-only early Earth atmosphere calculation using a photochemical model (Wordsworth et al., 2018, their Figure 9). Future work ought to couple the geochemical evolution model here to a photochemical model that includes C-bearing species to better assess the potential for oxygen accumulation on desertworlds. Finally, it is possible that non-thermal O loss (Airapetian et al., 2017) or photochemically modulated stoichiometric escape of H and O (McElroy, 1972) could lessen oxygen accumulation. Upcoming JWST observations of highly irradiated terrestrial planets may constrain escape processes and improve predictions of oxygen accumulation on more temperate planets.
Sulfur outgassing and burial may have played an important role in the oxygenation of Earth's atmosphere (Gaillard et al., 2011;Olson et al., 2019). Sulfur species are ignored in our nominal model because their bulk abundances are probably too small to qualitatively change our oxygenation scenarios (Wordsworth et al., 2018). Supporting information Section I explores the consequences of adding reduced sulfur species to our outgassing model, and confirms that, for Earth-like sulfur mantle abundances, total oxygen sinks are comparable to when sulfur is neglected. With that said, mantle sulfur abundances are contingent on formation processes (e.g., Grewal et al., 2019) and could be highly variable. Incorporating a complete sulfur cycle into a redox evolution model to investigate the sensitivity of oxygenation to initial sulfur content is an opportunity for future research. Note, however, that mantle sulfur abundances are irrelevant for waterworlds (Scenario 2) where all crustal production is suppressed.
In summary, there are several unknowns that preclude definitive predictions of how frequently the three scenarios outlined in this study might occur, but none can be ruled-out with current knowledge.

Implications for Future Observations
How might future observations discriminate between the three abiotic oxygen scenarios described above and oxygen produced by a biosphere? In principle, high CO 2 :H 2 O atmospheres should be possible to diagnose via direct imaging spectral observations because they are not habitable. A clear atmosphere is likely since the coexistence of atmospheric H 2 O, O 2 , O 3 , and abundant OH radicals may preclude the accumulation of photochemical hazes. Strong CO 2 absorption features ought to be visible, as should pressure-sensitive CIA features from the high pressures; more detailed photochemical and spectroscopic simulations will be required to determine the best false positive discriminants for these worlds.
Because waterworlds and desertworlds are habitable, they may be more challenging to discriminate from inhabited terrestrial planets. Crucially, waterworld false positives would be ruled out by a detection of subaerial land because, for Earth-like gravity, the presence of emerged continents limits the maximum ocean depth to around 10 km (Cowan & Abbot, 2014), or equivalently, a few Earth oceans by mass. This limit arises because silicates cannot support their own weight with greater topography. Consequently, the detection of an ocean-continent dichotomy using time-resolved photometric mapping (Cowan et al., 2009;Farr et al., 2018;Fujii et al., 2010;Kawahara & Fujii, 2010;Lustig-Yaeger et al., 2018) could rule out a waterworld false positive, assuming alternative explanations for dichotomies in surface maps could be excluded. This highlights the need for large aperture direct imaging mission to ensure sufficient time-resolution to map the surface over a planet's rotation. Alternatively, independent mass and radius constraints from radial velocity observations and thermal infrared direct imaging (Quanz et al., 2019), respectively, could also help rule out large (few wt.%) water inventories based on bulk density.
Desertworlds are likely the most challenging scenario to disambiguate from biological oxygen. Time-resolved photometric surface maps and/or the lack of ocean glint could help evaluate the surface water inventory and might be suggestive of a small water inventory (Lustig-Yaeger et al., 2018;Robinson et al., 2010). There are potentially other diagnostic spectral signatures of desertworlds such as spatial variation in atmospheric water vapor and photochemistry that could be tested using general circulation models and photochemical models. The presence of long-lived sulfuric acid hazes (Loftus et al., 2019) has been proposed as putting an upper bound on surface water abundances, but the desertworlds considered here likely have larger surface water inventories than this threshold.
Broadly speaking, the scenarios outlined in this study emphasize that no single observation, including oxygen detection on habitable zone planets around sun-like stars, will be uniquely diagnostic of life. It will be necessary to design future telescopes that are capable of both constraining the full planetary/stellar context and identifying multiple lines of evidence for life (Catling et al., 2018;Walker et al., 2018). For example, oxygen detection on an ostensibly habitable terrestrial planet would be persuasive if accompanied by surface biosignature detections , temporal biosignatures (Olson et al., 2018), or co-existing reducing gases in atmospheric disequilibrium (Krissansen-Totton et al., 2016). The coexistence of oxygen and methane remains an excellent biosignature and would not be expected for any of the oxygen false-positive scenarios described above. Indeed, it is difficult to produce large methane abundances in habitable planet atmospheres without life, even in anoxic atmospheres (Krissansen-Totton, Olson, et al., 2018;Wogan et al., 2020). It should also be noted that the scenarios in this study were illustrated for habitable zone planets around sun-like stars, but they may also be applicable to habitable zone planets around M-dwarfs.

Conclusions
The redox evolution of habitable zone terrestrial planets is strongly dependent on initial volatile inventories and the efficiency of crustal sinks. Uninhabited, Earth-sized planets within the habitable zone of G-type stars are very unlikely to accumulate abiotic oxygen if their initial volatile inventories are Earthlike. However, if initial volatile inventories differ dramatically from that of the Earth, then non-biological oxygen accumulation is possible, even when atmospheric noncondensable inventories are large. This may occur when either (i) the initial CO 2 :H 2 O ratio exceeds one, which suppresses oxygen sinks due to the low mantle volatile content and because surface conditions are too hot for aqueous reactions, or (ii) the initial H 2 O inventory is very large, thereby halting crustal production after a few billion years and shutting off all KRISSANSEN-TOTTON ET AL.

10.1029/2020AV000294
15 of 20 oxygen sinks, or (iii) the planet is very volatile-poor, in which case oxygen may accumulate during the steam atmosphere that persists after magma ocean solidification. Inefficient dry crustal oxidation is required for scenarios (i) and (iii) to yield large oxygen abundances, and scenario (i) is sensitive to stratospheric temperature. Fortunately, observational discriminants exist for all three of these scenarios; scenario (i) planets are uninhabitable, whereas the ability to constrain surface water inventories using time-resolved photometry would be useful for ruling out scenarios (ii) and (iii). More generally, the possible existence of these oxygen false positive scenarios highlights the need for a systems approach to biosignature assessment where biogenicity is judged not by the presence or absence of a single biosignature gas, but by multiple lines of evidence from both spectrally resolved and temporally resolved observations.

Conflict of Interest
The authors declare no conflicts of interest relevant to this study.

Data Availability Statement
The

Text A. Model description
A.1) Thermal evolution: Planetary thermal evolution is specified by energy budget and temperature-dependent viscosity. The time-evolution of mantle potential temperature, p T (K), is determined by the following equations, representing the magma ocean and solid-state convection phases, respectively: Note that we do not account for uncertainty in core heatflow since we are already sampling an order of magnitude range in radiogenic inventories (see above). Moreover, by choosing a core heatflow history at the higher end of literature estimates (Nimmo 2007;O'Rourke & Stevenson 2016), we are effectively maximizing crustal recycling and subsequent oxygen sinks. Tidal heating is ignored; if Earth-moon system tidal heating were included then the duration of the magma ocean could be extended by a few million years, potentially providing more time for oxidation via H escape (Zahnle et al. 2015). Equation (1) governs thermal evolution except in rare cases where transition to runaway greenhouse causes surface temperature to increase above mantle potential temperature, in which case a conduction regime is adopted (see below).
The heatflow from the convecting interior, m q (W/m 2 ), is parameterized as follows: Here,  =2×10 -5 K -1 is the thermal expansion coefficient for silicates, g (m/s 2 ) is surface gravity, crit Ra = 1100 is the critical Rayleigh number, k = 4.2 W/m/K is the thermal conductivity of silicates, 1 / 3  = and  =10 -6 m 2 /s is thermal diffusivity (Lebrun et al. 2013;Schaefer et al. 2016). Heatflow is dependent on the kinematic viscosity,  , which is a function of mantle potential temperature (see below). This parameterization is adopted for both solid state and magma ocean phases.
A.2) Solidus parameterization and magma ocean freezing: The solidus controls both the freezing of the magma ocean and the production of partial melt (and outgassing) during temperate geochemical evolution. The adiabatic mantle temperature profile, solidus, and liquidus are parameterized as follows.
Here, 1 T is a linear fit to the solidus for low pressure dry peridotite, and 2 T is a linear fit to the solidus for the high pressure lower mantle (Hirschmann 2000;Schaefer et al. 2016). A smooth function between them is assumed for solidus T so that an analytic derivative exists at all radii (see below). Following Schaefer et al. (2016), the liquidus is assumed to be 600 K warmer than the solidus at all pressures. We also allow for modulation of the solidus and liquidus by the pressure overburden of surface volatiles.
Here, overburden P is the pressure from all H2O (liquid and gaseous), CO2, and O2 at the surface. The pressure overburden is only accounted for after the magma ocean has solidified and after the mantle has degassed.
The time evolution of the solidification radius is determined by a similar method to Schaefer et al. (2016). The rate of change in the solidification radius can be obtained by noting that the time derivative of solidus evolution and the time derivative of the adiabatic temperature profile must be equal at the solidification radius: Note that the solidification radius must remain constant when the core-mantle boundary temperature exceeds the solidus temperature and when the surface temperature drops below the solidus to ensure the solidification radius and mantle potential temperature begin evolving together when ( ) ( ) This approach is more computationally expensive but yields an identical solidification radius evolution to the analytic expression eq. (7).

A.3) Mantle Viscosity Parameterization:
Our viscosity parameterization needs to have several properties. First, it must successfully reproduce the modern Earth's heatflow, melt production, and plate velocity. Second, it needs to transition smoothly from low viscosity magma ocean, to magma mush, to solid state convection. Our parameterization is a variation of those assumed in other magma ocean-to-solid interior evolution models (Lebrun et al. 2013;Salvador et al. 2017;Schaefer et al. 2016). However, the parameterizations in these studies needed modification because they predict a low viscosity magma-ocean or mush at the modern Earth's potential temperature (~1620 K). Noting that there is a large uncertainty in the critical melt fraction that controls the transition from solid-like to fluid-like convection (Costa et al. 2009), we adopted a parameterization that ensures this transition occurs at a temperature that exceeds the modern Earth's potential temperature. This is illustrated in Fig. S3, which compares our viscosity parameterization to others in the literature.    T T  T  T  T  T T  T T  T  T   TT   TT Here, coef V = 10 1 to 10 3 Pa s is a randomly sampled parameter that accounts for uncertainty in solid-state viscosity.

A.4) Stellar evolution
We assume solar bolometric luminosity evolution, () Lt (W), for all model runs (Baraffe et al. 1998;Baraffe et al. 2002). For the evolution of stellar XUV luminosity, we follow the empirical fit developed in Tu et al. (2015). The early sun's rotation rate, 0  , is an unknown parameter sampled uniformly from 1.8 to 45 (relative to modern) in log space. From the early sun's rotation rate, the time for the early sun to fall out of saturation, sat t (Myr), is given by: For the chosen range of rotation rates, sampled saturation times range from 6 to 226 Myrs. We assume that at saturation the sun's XUV luminosity is 10 -3.13 () Lt . To retrieve the modern XUV flux, we define the exponent, The flux received at each planet's orbital distance, planet star D − (m), is calculated using Earth-sun and Venus-sun separations. We begin our model runs at t = 10 Myrs in the stellar evolution model, but results are insensitive to the choice of zero point.

A.5) Surface Energy Budget:
At each time-step in the model, the surface temperature is solved numerically by finding the surface temperature that ensures heatflow from the interior plus absorbed shortwave radiation (ASR) is exactly balanced by outgoing longwave radiation (OLR). This assumption ignores the intrinsic heat capacity of the ocean, which is reasonable for Earth-like oceans, but for waterworlds this radiative equilibrium approach may underestimate the transition time from runaway greenhouse to surface ocean (see Discussion). Surface temperature is found by solving the following equation for  Here, m q is specified by equation (4). Absorbed shortwave radiation is a function of the planetary albedo and stellar luminosity: A . The "hot" and "cold" states do not refer to non-glaciated and glaciated states, which we do not consider in our model. Instead, they allow for a transition from low albedo cloud free runaway greenhouse atmospheres at high (>~ 1000 K) temperatures, to a range of cloudy and non-cloudy states under more temperate conditions (Pluriel et al. 2019). This distinction is important for modeling Venus, where the "cold" state albedo is ~0.7, but the albedo during the initial runaway greenhouse phase was potentially much lower. Albedos for hot and cold states are sampled uniformly form 0-0.3 and 0.25-0.35, respectively for Earth and 0-0.3 and 0.2-0.7 for Venus. The albedo of the hot state must always be equal to or less than that of the cold state.
To calculate the outgoing longwave radiation (OLR), we used the publicly available code from Marcq et al. (2017). This code uses DISORT (Stamnes et al. 1988) with four stream longwave radiative transfer. The radiative transfer model only considers opacity due to water vapor and carbon dioxide. Rock vapor opacities are ignored since the time spent at rock-vaporizing temperatures is very short and unlikely to affect long term redox evolution. Correlated k coefficients are calculated from the high resolution molecular absorption spectra computed with kspectrum (Eymet et al. 2016), H2O-H2O continuum absorption is taken from Clough et al. (2005), and CO2-CO2 continuum absorption from fits to Venus observations (Bézard et al. 2011). H2O-CO2 continuum opacity is not considered, and is likely negligible compared to H2O-H2O and CO2-CO2 continuum absorption (Ma & Tipping 1992). The runaway greenhouse limit calculated using the code of Marcq et al. (2017) closely agrees with line-by-line calculations in Goldblatt et al. (2013).
The atmosphere model calculates atmospheric structure and abundance profiles using the expressions for dry and moist adiabats in Kasting (1988), and the thermodynamic properties of water are taken from steam tables (Haar et al. 1984). Given an assumed surface temperature and total water inventory, the code calculates the atmospheric water vapor profile assuming a dry convective regime (partial pressure of water vapor less than saturation) to moist convective regime (partial pressure of water vapor equals saturation) to isothermal temperature structure, where the isotherm temperature is the planetary skin temperature. The dry convective regime may not be present if water vapor is saturated at the surface. Once the water vapor profile has been calculated, the remainder of the surface water inventory (if any) resides in a surface water ocean.
If a portion of the surface water resides in an ocean, then the partitioning of carbon between the atmosphere and ocean also determines the OLR. Thus, OLR is a function of dissolved carbonate concentrations. This is calculated as follows: Here, rather than explicitly track the ocean alkalinity budget, we follow the approach of Schwieterman et al. (2019) and assume that carbonate precipitation will ensure the longterm carbonate saturation state of the ocean,  , is constant. Values for  are sampled randomly from 1 to 10 to allow for abiotic supersaturation. Similarly, rather than explicitly track cation weathering budgets, we assumed constant dissolved calcium abundances and sample uniformly in log space from 10 -4 to 3×10 -1 mol/kg. Dissolved calcium concentrations in Earth oceans have varied from 10 -2 to 3×10 -1 mol/kg over Earth history (Halevy & Bachan 2017), but we adopt a broader range to account for different crustal compositions and ocean volumes. For example, Kite and Ford (2018) consider waterworld scenarios with essentially zero Ca 2+ up to 0.25 mol/kg based on thermodynamic models of basalt-water interaction. Explicitly tracking the ocean alkalinity budget is an opportunity for future research. The temperature dependent solubility product, sp K , is the same as in Krissansen-Totton et al. (2018). Once the total carbon reservoir, the ocean size, and the dissolved carbonate concentrations are known, the entire carbonate equilibrium system of equations can be solved to determine atmospheric pCO2 (Krissansen-Totton et al. 2018).
Rather than call the atmospheric radiative transfer code in real time, we precomputed a grid of OLR values as a function of surface temperature (250-4000 K), surface water (10 Pa -1 GPa), surface carbon dioxide (10 Pa to 0.1 GPa), and planetary effective temperature (150 -350 K). Within the grid we linearly interpolate between grid points, and on the rare occasion when the model moves beyond the grid, linear extrapolation is adopted. A 1 bar partial pressure of N2 is assumed at every grid point. Since the atmospheric model calculates atmospheric structure, stratospheric mixing ratios are obtained, which are used to determine atmospheric escape rates (see below).
Here, zero during solid-state convection and are described in detail in their corresponding sections below. Total fluid volatile masses are converted to partial pressures using atmospheric mean molecular weight (no magma ocean) or using the melt solubility relationships described below (magma ocean). Fig. S2 shows illustrative outputs from a single model run. Subplots Fig. S2a, S2b, and S2c show the evolution free oxygen, water, and carbon dioxide reservoirs, respectively, as governed by equation (16).
A.7) Magma-ocean evolution: Whilst the magma ocean exists, volatiles in fluid phases are partitioned between the melt, melt crystals, and the atmosphere. For water, this partitioning is described by the following equation (Schaefer et al. 2016 Here, we use the solubility relationship from Pan et al. (1991).
For oxygen, equilibrium partitioning is more complicated because both Fe 2+ and Fe 3+ melt phases must be included. We adopt the experimental fit in Kress and Carmichael (1991) Equation (19) is also used to calculate mantle oxygen fugacity for the purposes of outgassing calculations by substituting the solid mantle molar fractions of oxidized and reduced iron.
A.8) Transition from magma ocean to solid mantle convection: The model switches between magma ocean and solid-state convection freely, as dictated by the radiation and interior heating budget. At each time step, surface temperature is compared to the solidus. For as long as the surface temperature exceeds the solidus, the magma ocean model is adopted (equation (1)), and the solidification radius evolves with time according to equation (7). However, once the surface temperature drops below the solidus, the solidification radius is set to the planetary radius, and solid state interior evolution is dictated by equation (2). Volatiles are instantaneously exchanged between the magma ocean and the solid interior at this transition. The model assumes that when the surface freezes, any volatiles still dissolved in the magma mush remain in the interior (e.g. as basaltic glass, or gas bubbles in melt inclusions), which is reasonable given the short timescale for magma ocean solidification and the high viscosity of the late-stage magma mush. This assumption also maximizes the mantle's capacity for subsequent outgassing of reduced products that remove oxygen from the atmosphere.
During the transition from magma ocean to solid-state convection, volatile inventories undergo the following one-off adjustment:  Fig. S5 shows the fraction of total CO2 and H2O that reside in the solid mantle immediately after magma ocean solidification for outputs from Fig. 2 in the main text. These mantle fractions are determined by the partitioning of volatiles in the magma ocean as described in Supplementary Section A.7, and the instantaneous retention of leftover melt when surface temperature drops below the solidus, as described in this section. While we do not explicitly model mechanisms of volatile retention in the magma ocean, such as compaction within the moving freezing front, our spread of final mantle volatile fractions is comparable to that of more detailed models (e.g. Hier-Majumder & Hirschmann 2017).

Fig. S5
: Volatile fraction in the solid mantle immediately after magma ocean solidification for nominal model calculations (Fig. 2). The left subplot shows the carbon dioxide mantle fraction, whereas the right subplot shows the water mantle fraction.
In a few rare cases, the transition back from solid to magma ocean causes the surface temperature to exceed the mantle potential temperature. When this occurs, we modify the energy budget as follows: Here, c q (W/m 2 ) is the conduction of heat from the surface to the interior and is approximated by the diffusion equation: The time evolution of volatile reservoirs is also modified in this regime to account for the fact that the radius of solidification is moving downward towards the core (see source code for full details). This lasts until surface temperature is less than mantle potential temperature, and the model switches back to a convective mantle.
A.9) Atmospheric Escape Parameterization: Atmospheric escape controls the source flux of abiotic oxygen. Escape rates are determined by the composition and temperature of the stratosphere, and by the stellar XUV flux. For low stratospheric water abundances, escape is limited by the diffusion of water through background gases, or by the XUV flux from the star (whatever is smaller).
As the water content in the upper atmosphere increases, the escape regime transitions to XUV-limited because, for steam-dominated atmosphere, there is no cold trap limiting the supply of water to the upper atmosphere. Our approach does not rigorously capture the complexities of atmospheric escape physics (e.g. Owen 2019), but instead uses plausible parameterizations that incorporate broad parameter ranges to cover a wide range of uncertain physical processes. Moreover, our parameterization collapses to wellestablished solutions (e.g. the diffusion limit) for end-member cases.
Following Wordsworth and Pierrehumbert (2013), the diffusion of water through noncondensible background gases is given by To calculate XUV-driven hydrodynamic escape of H, and associated O and CO2 drag, we follow Odert et al. (2018) and Zahnle and Kasting (1986). The XUV-energy mass loss rate, XUV  (kg/m 2 /s) is specified by the following equation: Here, XUV F is the XUV flux (W/m 2 ) received from the star (see stellar parameterization).
The efficiency of hydrodynamic escape,  , is a function of atmospheric composition and XUV stellar flux, as described below.
In general, the XUV-driven mass flux will be partitioned between H loss, O drag and, very under high XUV fluxes, CO2 drag. The hydrogen escape flux, H  (molecules H/m 2 /s), can be obtained by analytically solving equations (4), (5), and (6) Here, i m (kg) is the mass of the i-th species, i X is the stratospheric mixing ratio of the ith species, where we conservatively assume CO2 is not dissociated to minimize the drag of carbon, T (K) is stratospheric temperature, B k is Boltzmann's constant, and ij b − is the binary diffusion coefficient of i through j. We refer the reader to the original paper for the details. Crucially, once the loss of hydrogen is known, the oxygen fractionation factor, O  , can be obtained: If 0 O   , then the hydrodynamic wind drags oxygen. The carbon dioxide fractionation factor can then be similarly calculated: It is convenient to convert these molecular escape rates to molar escape rates: These weighting functions ensure diffusion-limited H escape for low stratospheric abundances, and a smoothly transition to XUV-driven escape as the upper atmosphere becomes steam dominated. The precise transition abundance is unknown and will, in general, depend on conductive and radiative cooling of the upper atmosphere as well as downward diffusive transport. Here, it is represented by the free parameter, tra  , which ranges from 10 -2 to 10 2 and is sampled uniformly in log space.
The efficiency of hydrodynamic escape is parameterized by loosely following the approach of Wordsworth et al. (2018). If the XUV-stellar flux is insufficient to drag oxygen, then the efficiency is equal to a constant, lowXUV  , which is randomly sampled from 1% to 30%. Alternatively, if the XUV-stellar flux exceeds what is required to drag O, then some portion of the excess energy,  , goes into driving further escape, whereas the rest, 1  − , is assumed to be efficiently radiated away. The efficiency factor,  , is randomly sampled from 0-100% for complete generality. This leads to the following function for the efficiency of hydrodynamic escape: (1 ) 4 Fig. S6: Illustrative examples of the escape parameterization. Each line denotes a different calculation, where uncertain escape parameters lowXUV  and tra  have been randomly sampled. Black lines denote the cold trap diffusion limit (eq. (26)), red lines denote XUVdriven hydrodynamic escape (eq. (35)), and green lines show the weighted combination that is the escape parameterization in our model (eq. (37)). On the left-hand side, the stellar XUV flux is held constant, and escape fluxes are plotted as a function of the stratospheric H mixing ratio (for an atmosphere without CO2). On the right-hand side, the composition of the upper atmosphere is held constant, and escape fluxes are plotted as a function of the stellar XUV flux received by the planet.
A.10) Solid-state evolution: Outgassing and crustal production The outgassing model follows that described in Wogan et al. (2020) where we calculate redox-dependent speciation of volatiles between melt and gas phase. Given mantle concentrations of H2O and CO2 (by mass), corresponding melt fractions can be calculated assuming accumulated fractional melting: Here  is the average melt fraction over the portion of the mantle where melting occurs (see below). These melt fractions, along with mantle oxygen fugacity (eq. (19)) and magma chamber pressure-temperature conditions are used as inputs to the outgassing calculations. For subaerial outgassing, the outgassing pressure is the pressure overburden of the atmospheric inventory, whereas for submarine outgassing, the pressure overburden is the atmospheric + ocean inventory. All outgassing is assumed to occur at the solidus temperature. Given these inputs, the outgassing thermochemical equilibrium model ) outputs gaseous mixing ratios, i f , for outgassed CO2, H2O, H2, CO, and CH4, as well as the moles of gas per total moles of melt plus gaseous species, G  . Note that the outgassing model does not consider the evolution of melt composition and oxygen fugacity along a degassing path; instead, we assume that the melt oxygen fugacity is buffered to that of the source rock, and that outgassed volatiles are determined by the equilibrium gas phase mixing ratios at surface pressure. Moreover, the molecular oxygen fraction of the degassed mixed is conservatively assumed to be negligible; the only possible source of atmospheric oxygen in the model is H escape.
Gaseous mixing ratios can be convolved with melt production, MP , (m 3 /s, described below) to calculate outgassing fluxes, i V (mol/s), for each species: Here, m  = 15.5 mol magma/kg magma is the inverse molar mass of magma, which is assumed to be constant. The overall O2-consumption sink from outgassed volatiles is the summation of reducing species: Overall outgassing fluxes are the combination of subaerial and submarine contributions, weighted by the surface land fraction, LF (see below): (1 ) To obtain mass fluxes, these molar outgassing fluxes must be weighted by their respective molecular masses: Here, the melt fraction at any given radius is given by the following expression: Clearly these are crude approximations, but they capture the fact the planets with a few Earth oceans ought to have some subaerial land, but that for large water inventories all crust is submerged. In any case, total land fraction does not have a large impact on weathering feedbacks (Abbot et al. 2012 = 90 kJ/mol, and pH-dependence of seafloor weathering are assumed to be known constants. More sophisticated weathering parameterizations that account for kinetic dependencies, thermodynamic solute concentration limits, and a precipitationlimited runoff dependence have been proposed (Graham & Pierrehumbert 2020), but given the uncertainties in geological parameters that feed into such models, our simple CO2 and temperature dependent formulation is adequate for providing a crude thermostat.
Total weathering will be the summation of continental and seafloor weathering, weighted by the fraction of liquid water at the surface, . This ensures that weathering tends to zero as oceans evaporate. We also include a possible supply limit to weathering as an unknown variable that could potentially limit the rate at which dense pCO2 atmospheres are sequestered if the supply of erodible rock is low. This supply limit, sup lim W − = 10 5 to 10 7 (kg/s), is not coupled to crustal production since not all newly produced crust will necessarily be delivered to the surface to be weathered. The overall expression for CO2-removal via weathering is therefore: Because we are modeling a plate tectonics regime, we assume that all carbon dioxide removed by weathering is returned to the mantle. This can be seen in equation (16) (16). Note that these loss and gain quantities are not necessarily equal in because, as discussed above, H2 produced by serpentinization may be lost to space, thereby permanently removing water from the planet.
A.13) Solid-state evolution: Planetary redox budget The interior may become oxidized via outgassed of reduced species (discussed above), dry oxidation, and wet oxidation. First, we consider dry (direct) crustal oxidation, which can be represented by the following reaction: The flux of this dry crustal sink is parameterized by the following equation: ( ) Here, there is a land fraction dependence to ensure that no dry oxidation of the crust occurs when the surface is covered in water. The unknown efficiency parameter dry oxid f − (10 -4 to 10%) is discussed in the main text, and is the fraction of reduced iron in newly produced crust that is oxidized.
We also modify the melt production term to represent the fact that there is some maximum amount of surface melt accessible to oxidation via diffusion through extrusive lava flows. The diffusivity of oxygen in silicate melts is ox D ≈ 10 -7 cm 2 /s (Canil & Muehlenbachs 1990), which implies a downward diffusion depth of ~o Here, the efficiency factor, lava f , is the average fraction of the planetary surface that is continuously molten due to extrusive volcanism after the magma ocean has solidified. Because thermal diffusivity exceeds chemical diffusivity-and because mean surface temperature is below the solidus by definition-extrusive magmas will form a low permeability solid crust as they cool, precluding continuous diffusion of oxygen. Thus, even for extreme rates of resurfacing due to high heatflow, lava f is likely low. For example, on Io, where average internal heatflow is 1-3 W/m 2 (Veeder et al. 2012), only a few km 2 of the surface is estimated to be molten at any given moment (Mura et al. 2020).
We uniformly sample lava f = 10 -4 to 1 in log space for full generality. Given this upper limit on melt oxidation, the amount of oxidizable melt is given by: does not affect any of our model scenarios except for Scenario 3, where values exceeding 0.01 are required for oxygen accumulation (Fig. S1d).
The wet oxidation of the interior from hydration reactions is already the appropriately weighted serpentinization flux: Recall the term oxid fluid F − represents the total flux of free oxygen lost from the atmosphere-ocean reservoir to the interior, whereas the term oxid fluid F − represents the total flux of free oxygen gained by the interior. Oxidized crust is assumed to be mixed back into the mantle on long timescales (equation (16)) via subduction, or slab delamination. We also assume that the outgassing of reduced species must ultimately be balanced by oxidation of the crust, and so the so net oxidation of the interior equals the reduction of the fluid reservoir: The only exception to this equality is when atmospheric oxygen levels are very low, and fluxes are modified for numerical reasons (see below).

Text B. Numerical approach
All code is written in python and available open source ([DOI_TBD]). The system of differential equations was solved explicitly with either RK45 or RK23 using the solve_ivp module in scipy. The maximum timestep was shorter for the magma ocean phase (10000 years) compared to the temperate evolution phase (10 6 years).
To avoid sawtoothing and excessive computation time at low reservoir abundances, various adjustments were made to the differential equations to ensure adequate performance. First, if atmospheric CO2 dropped below 50 Pa and if weathering exceeds outgassing, then the time-derivative of surface carbon dioxide was set equal to zero. This may mean climate evolution is slightly inaccurate at low pCO2, but the effects are minor. Second, atmospheric oxygen is similarly prevented from dropping below 0.1 Pa. If atmospheric oxygen is below 0.1 Pa and if oxygen production via escape is less than oxygen consumption, then the following adjustments are made. The atmosphere is assumed to be in an anoxic steady state and so oxygen atmospheric sinks are set equal to the escape source. However, since oxygen production via escape is less than oxygen sinks (outgassing and H2, wet and dry oxidation reactions), then the interior is assumed to be oxidized by the difference as the excess reductants (H2) escape to space. This might slightly overestimate mantle oxidation because some reductants (e.g. CO) will get photochemically oxidized rather than escape, but the redox budget of the atmosphere is not directly affected, and the effects on mantle redox evolution are negligible.

Text C. Venus Model validation:
To further validate the model, we demonstrate that it can broadly reproduce the known atmospheric evolution of Venus. To model Venus, all parameters are kept the same as for Earth except for planet radius, mass, planet-star separation, and albedo parameterization (see Table S2). Fig. S7 shows all model outputs that reproduce modern Venus, which is defined to be atmospheric oxygen < 10 15 kg (~0.2 mbar), atmospheric CO2 exceeding 2×10 20 kg (40 bar), no surface water ocean, and atmospheric H2O < 2×10 16 kg (~3 mbar).

Fig. S7
: Model runs that reproduce modern Venus conditions. Note that there are two qualitatively different histories that can reproduce modern Venus (top left). Either Venus was always in a runway greenhouse phase and never condensed a surface ocean, or Venus maintained a temperate surface for several Gyr before transitioning back to runaway greenhouse as solar insolation increased.
Note that our Venus model somewhat overpredicts modern day Venusian heat flow and melt production because we assume a plate tectonics model (Nimmo & McKenzie 1998). We save a more detailed comparative study of solar system planets with stagnant lid tectonics and resurfacing events for future study.
In Fig. S8 we plot key parameter values for model runs that successfully reproduce modern Venus. Initial volatile inventories are likely small (Fig. S8a) and dry crustal oxidation must be relatively efficient (Fig. S8c).

Text D. Hydrous Mantle Sensitivity Test
Here, we test whether modifying the solidus for hydrated mantles affects oxygen accumulation in the waterworlds scenario. Following Katz et al. (2003) we modify our expressions for the solidus and liquidus as follows: oxygen accumulation becomes increasingly likely for initial water inventories exceeding 100 Earth oceans (Fig. S10).   Fig. 6 in the main text except the solidus decreases with mantle hydration. Abiotic oxygen accumulation is somewhat less frequent, and occurs at higher initial water inventories, but results are qualitatively the same.

Text E. Impact Ejecta O2-sinks
Here, we test whether the delivery of reducing materials from impactors could potentially draw down oxygen produced in desertworld scenarios. We introduce an impactor flux, imp F (kg/yr), that diminishes exponentially with time: Here, the coefficient 0 imp F is randomly sampled (in log space) from 10 11 to 10 14.5 kg/yr and the decay time, decay t (Gyr), is randomly sampled from 0.06 to 0.14 Gyr. These ranges are adopted because they approximately reproduce plausible estimates for impactor fluxes in the Hadean and early Archean, both with and without a late heavy bombardment (Kadoya et al. 2020). Additionally, we assume that impactors are 30% metallic iron by mass, and that 100% of this iron is completely oxidized to ferric iron instantaneously upon impact, depleting atmospheric oxygen. Fig. S11 shows our desertworld calculations repeated with this impactor flux. We find oxygen accumulation and retention for several Gyr is still possible, although only when the total impactor flux is low. Fig. S12 shows oxygen accumulation after 4.5 Gyr as a function of total impactor flux. Abiotic oxygen may accumulation for impactor mass fluxes < 10 20 kg.   Each dot is a model run representing an oxygen false positive. For the first scenario (Fig.  S13a), abiotic oxygen only occurs when stratospheric temperature exceeds ~200 K. This is because, at lower temperatures, the cold trap becomes more effective and H escape (and therefore O accumulation) is throttled.
For the waterworld scenario (Fig. S13b) oxygen accumulation may occur at any stratospheric temperature. However, this is more probable-and abiotic oxygen abundances are greater-at higher stratospheric temperatures. On waterworlds, cold stratospheres are not necessarily expected because an N2-dominated atmosphere with low CO2 is a probable outcome (Fig. 5), especially if continuous CO2-drawdown via weathering occurs (Nakayama et al. 2019). Moreover, modest oxygen accumulation would result in ozone formation, that would further warm the stratosphere, potentially resulting in a positive O2-accumulation feedback that is not considered here. Note that there are two subclasses of oxygen false positives in Fig. S13b, denoted by red and blue dots. The blue dots show model runs where oxygen accumulated during the initial magma ocean is completely sequestered in the mantle upon magma ocean solidification, whereas red dots denote scenarios whereby appreciable oxygen is left over after magma ocean solidification due to the high pressure-temperature conditions of the overburden suppressed solidus.
The third, desertworld scenario ( Fig. S13c) is largely insensitive to stratospheric temperature. This is because water loss and oxygen accumulation occur immediately after magma ocean solidification while the steam-dominated atmosphere persists. There is no effective cold trap in the steam-dominated atmosphere and so escape fluxes are insensitive to stratospheric temperature.

Text G. Reducing Mantle Sensitivity Test
The nominal model assumes Earth-sized planets undergo rapid core formation with mantles that quickly approach ~FMQ±2. While this is a common assumption when modeling magma ocean evolution of the early Earth (e.g. Zahnle et al. 2007;Zahnle et al. 2010) and of terrestrial exoplanets (e.g. Schaefer et al. 2016), this may not be true for all terrestrial planets. To investigate the effects of a more reduced initial mantle, sensitivity tests were performed whereby the initial magma ocean was endowed with a smaller amount of free O (0.5×10 21 to 1.5×10 21 kg), such that after 4.5 Gyrs of evolution, mantle oxygen fugacity is closer to the iron-wüstite buffer than FMQ. Additionally, following Ortenzi et al. (2020), we modified the outgassing model such that the melt-solid partitioning of carbon is controlled by graphite saturation.
To calculate melt concentrations for outgassing calculations, we take the minimum of the concentrations in equations (41) and (73) Taking the minimum ensures that graphite saturation does not overestimate dissolved carbon concentrations under oxidizing conditions and when the total carbon content in the mantle is low.
Finally, while we do not explicitly account for graphite precipitation during magma ocean solidification, we set 2 CO k =1.0 in equation (16) to allow for greater retention of carbon in the mantle. It should be emphasized that these modifications do not constitute a fully consistent model of reduced mantle planetary evolution because the radiative transfer model does not allow for CO and H2 dominated atmospheres. However, for post magma ocean evolution they are adequate approximations.
Fig. S14 is identical to Fig. 2 in the main text except for the reducing mantle initial conditions and other assumptions described above. Once again, an anoxic atmosphere is assured after 4.5 Gyrs of evolution because crustal oxygen sinks overwhelm oxygen sources. Fig. S15 is the reduced mantle equivalent of Fig. 3 in the main text showing high CO2:H2O oxygen false positives. This scenario is largely unchanged by a reducing mantle; magmatic outgassing does not occur due to the high pressures and low mantle volatile concentrations following magma ocean solidification. Gradual oxygen accumulation may occur after several Gyrs of H loss to space. Fig. S16 is the reduced mantle equivalent of Fig. 5 in the main text showing waterworld oxygen false positives. The pressure overburden of a large surface ocean once again shuts down crustal production after ~1 Gyr, thereby removing all crustal oxygen sinks and permitting atmospheric oxygen to accumulate. Fig. S17 is the reduced mantle equivalent of Fig. 7 in the main text showing desertworld oxygen false positives. Although Scenario 3 is apparently unchanged by having a lower mantle redox, a fully self-consistent model that accounted for the high CO-H2 content of the originally degassed atmosphere would likely preclude early O2 accumulation, in practice.
Fig. S14: Nominal Earth evolution with a more reduced initial mantle. This is identical to Fig. 2 in the main text except (i) the initial free oxygen of the mantle is lower (0.5-1.5×10 21 kg), (ii) graphite saturation in reduced melts is accounted for, (iii) and carbon is partitioned into the solid phase during magma ocean solidification. Fig. S15: Scenario 1 oxygen false positives with a more reduced initial mantle. This is identical to Fig. 3 in the main text except (i) the initial free oxygen of the mantle is lower (0.5-1.5×10 21 kg), (ii) graphite saturation in reduced melts is accounted for, (iii) and carbon is partitioned into the solid phase during magma ocean solidification. The terminal magma ocean becomes more oxidized than the solid mantle as H escape oxidizes the combined melt-volatile reservoir faster than solidification transfers oxidized material to the mantle. Oxygen accumulation may occur after several Gyr because outgassing of C-bearing volatiles is negligible from the graphite-saturated mantle.
Fig. S16: Scenario 2 oxygen false positives with a more reduced initial mantle. This is identical to Fig. 5 in the main text except (i) the initial free oxygen of the mantle is lower (0.5-1.5×10 21 kg), (ii) graphite saturation in reduced melts is accounted for, (iii) and carbon is partitioned into the solid phase during magma ocean solidification. Oxygen sinks are suppressed by the large pressure overburden of the surface ocean, the same as in the nominal oxidized-mantle calculations.
Fig. S17: Scenario 3 oxygen false positives with a more reduced initial mantle. This is identical to Fig. 7 in the main text except (i) the initial free oxygen of the mantle is lower (0.5-1.5×10 21 kg), (ii) graphite saturation in reduced melts is accounted for, (iii) and carbon is partitioned into the solid phase during magma ocean solidification. The terminal magma ocean becomes more oxidized than the solid mantle as H escape oxidizes the combined melt-volatile reservoir faster than solidification transfers oxidized material to the mantle. While the persistence of oxygen is permitted in these calculations, in practice, the early atmosphere is likely too reducing to permit such oxygen accumulation.

Text H. Stellar Separation Sensitivity Test
While this study is not an exhaustive exploration of the oxygen false positive parameter space, nominal model calculations were repeated at 1.3 AU to show that oxygen accumulation is not dependent on being close to the inner edge of the habitable zone. Fig. S18 shows all Scenario 2 false positives at 1.3 AU. The increased stellar separation results in lower H escape fluxes, but oxygen accumulation may still occur if crustal sinks are suppressed by pressure overburden. Similarly, Fig. S19 shows all Scenario 3 false positives at 1.3 AU. Oxygen accumulation on desertworlds can similarly occur at larger stellar separations because oxygen accumulation occurs early during the steamdominated atmosphere. Scenario 1 false positives do not occur at large stellar separations because even under a high CO2 atmosphere, the runaway greenhouse state is not maintained after magma ocean solidification.   Fig. 7 in the main text except the assumed planet-star separation is 1.3 AU. Abiotic oxygen accumulation is still permitted due to early oxygen accumulation during the steam-atmosphere phase.

Text I. The Effect of Sulfur Outgassing
Sulfur outgassing and burial may have played an important role in the oxygenation of Earth's atmosphere (e.g. Gaillard et al. 2011;Olson et al. 2019). While including a complete model of sulfur cycling is beyond the scope of this study, we present calculations showing that Earth-like sulfur mantle abundances are unlikely to qualitatively change our conclusions.
Following Gaillard and Scaillet (2014), sulfur speciation is added to our outgassing model by adding the following system of equations to those already described in Wogan et al. shows total oxygen sinks with (blue) and without (red) sulfur-bearing volatiles. While the inclusion of sulfur-bearing species may result in a slightly larger oxygen sink, for Earthlike mantle concentrations the effect of sulfur outgassing is minimal. Fig. S21: Comparison of oxygen sinks with and without sulfur-bearing outgassed volatiles. Subplot (a) shows outgassing sinks for Scenario 1 false positives from Fig. 3 (red) compared to outgassing sinks if sulfur-bearing species are included (blue) and a constant 300 ppm mantle sulfur concentration is assumed (von Gehlen 1992). Subplot (b) shows total oxygen sinks with (blue) and without (red) sulfur-bearing volatiles. Including sulfur-bearing species has a negligible effect on outgassing sinks because the pressure of the dense CO2 atmosphere inhibits exsolution of sulfur gases. * We apply the additional constraint that the initial water inventory is greater than the initial carbon dioxide inventory, in accordance with typical carbonaceous chondrite abundances.  * We apply the additional constraint that the initial water inventory is greater than the initial carbon dioxide inventory, in accordance with typical carbonaceous chondrite abundances. ** High CO2:H2O runs are a subset of this range with initial CO2/H2O>1 (by mass).