Soil Respiration Phenology Improves Modeled Phase of Terrestrial net Ecosystem Exchange in Northern Hemisphere

In the northern hemisphere, terrestrial ecosystems transition from net sources of CO2 to the atmosphere in winter to net ecosystem carbon sinks during spring. The timing (or phase) of this transition, determined by the balance between ecosystem respiration (RECO) and primary production, is key to estimating the amplitude of the terrestrial carbon sink. We diagnose an apparent phase bias in the RECO and net ecosystem exchange (NEE) seasonal cycles estimated by the terrestrial carbon flux (TCF) model framework and investigate its link to soil respiration mechanisms. Satellite observations of vegetation canopy conditions, surface meteorology, and soil moisture from the NASA SMAP Level 4 Soil Moisture product are used to model a daily carbon budget for a global network of eddy covariance flux towers. Proposed modifications to TCF include: the inhibition of foliar respiration in the light (the Kok effect); a seasonally varying litterfall phenology; an O2 diffusion limitation on heterotrophic respiration (RH); and a vertically resolved soil decomposition model. We find that RECO phase bias can result from bias in RECO magnitude and that mechanisms which reduce northern spring RECO, like substrate and O2 diffusion limitations, can mitigate the phase bias. A vertically resolved soil decomposition model mitigates this bias by temporally segmenting and lagging RH. Applying these model enhancements at continuous soil respiration (COSORE) sites verifies their improvement of RECO and NEE skill compared to in situ observations (up to ΔRMSE = −0.76 g C m−2 d−1). Ultimately, these mechanisms can improve prior estimates of NEE for atmospheric inversion studies.

1. Does a seasonally varying adjustment of RECO or its components, R H and R A , improve the fit in modeled NEE phase compared to observed NH ecosystem CO 2 fluxes from a global network of EC flux towers? This adjustment might take the form of either an explicit phenology model or a seasonally varying climatic response 2. What is the impact of alternative R H or R A models on the mean RECO phase and estimation skill, validated against flux data measured at towers and in situ chamber sites?
We identified potential improvements to the TCF model based on mechanisms hypothesized to affect the timing of RECO components that are missing or inadequately represented in the current framework. Potential improvements should be consistent with an operational, data-driven, and low-latency daily model such as SMAP L4C. We did not consider refinements to the LUE model or GPP parameters in this study because GPP in TCF is constrained by satellite-observed vegetation phenology.
Specifically, we hypothesized that one or a combination of processes might be critical to the correct timing of the RECO seasonal cycle in the NH: seasonally varying litterfall inputs to SOC, which would enhance an R H phenology; an upper limit on the response of R H to soil moisture (SM) due to limited O 2 diffusion at near-saturating SM conditions, which may occur seasonally; and the slow diffusion of heat and moisture across vertically stratified soil layers, which can result in temporally lagged R H flux. The RECO seasonal cycle could also be 10.1029/2021MS002804 3 of 19 adjusted by changes to the R A component; for example, through modeling of the inhibition of leaf R A in the light (Keenan et al., 2019;Wehr et al., 2016), also known as the Kok effect, which could reduce the high RECO bias during the NH spring (Byrne et al., 2018;Heskel et al., 2013), as the TCF framework lumps above-and below-ground R A together. Other potential modifications to the R A model not investigated here include increased construction respiration during spring leaf-out (Papale & Valentini, 2003) or increased respiration associated with the maintenance of photosynthetic rates (Migliavacca et al., 2015).
These hypotheses have support in the literature. The Kok effect is wellknown, despite uncertainty regarding the cellular mechanism(s) responsible (Heskel et al., 2013). A seasonally varying litterfall scheme is intuitive and consistent with observations of soil respiration in the NH fall season (Davidson et al., 2006) and experimental manipulations of litter inputs (Leitner et al., 2016;Nielsen et al., 2019). A looser coupling of litterfall and GPP is also consistent with the finding that peak below-and aboveground respiration are temporally separated (Davidson et al., 2006;Giasson et al., 2013). An O 2 diffusion limitation has been implemented in other terrestrial carbon flux models (Davidson et al., 2012;Sihi et al., 2018) and has some experimental support. For instance, Järveoja et al. found that the temperature sensitivity of R H in northern peatlands is enhanced in dry periods, possibly due to increased O 2 supply to heterotrophs. It has also been found to improve RECO estimation at wetland sites (Sulman et al., 2012) and where snowmelt also leads to an increase in soil water content in spring (Oikawa et al., 2014;Winnick et al., 2020). A vertically stratified soil column has been adopted in land models (dos Santos et al., 2021;Tao et al., 2017). The mechanics of heat diffusion suggest that surface layers of the soil will warm before deeper layers, inducing lagged respiration throughout the soil column. This time lag, most evident in the spring and fall, has been associated with changes in the share of ecosystem respiration from the soil (Davidson et al., 2006). Vertical variation in soil temperature, in particular, has been shown to substantially improve soil carbon stock estimates at high latitudes (Koven et al., 2017;Yi et al., 2020).

Data and Methods
We first describe the baseline TCF model, with a focus on soil respiration. Next, we discuss the modifications examined here that are hypothesized to affect the timing of the NEE and RECO seasonal cycles. Finally, we explain our evaluation approach against flux tower data and chamber data.

Baseline TCF Model
The baseline TCF model combines a light-use efficiency model for estimating GPP with an SOC decomposition and soil R H model (Kimball et al., 2009). We modified the open-source TCF source code (Endsley, 2021a) to support the respiration processes hypothesized to affect the timing of the NEE and RECO seasonal cycles. In TCF, soil decomposition is represented using a three-pool soil decomposition model with cascading litter quality and R H component fluxes from metabolic, structural, and recalcitrant soil carbon pools (Endsley et al., 2020;Jones et al., 2017). R H for the ith pool with current SOC content C i is calculated: Here, g(T S ) is an Arrhenius function of near-surface soil temperature, following the commonly used Lloyd & Taylor approach (Sierra et al., 2015). f(θ) is a linear, increasing function of surface soil wetness, θ, given by: Figure 1. NEE and RECO mean seasonal cycles, as measured by EC flux towers ("Towers") or modeled by TCF, for tower sites north of 40°N latitude. Shaded area represents one spatial standard deviation. TCF data are from the L4C Nature Run v8.3 simulation.
Both functions map these environmental constraints to an output range [0, 1]; hence, their values are dimensionless modifiers on the base rates of decomposition, or (inverse) turnover times, k i . Surface SM, converted to wetness (%), is used to model the response of R H to substrate availability; that is, liquid water in the soil pore spaces allows microbes to access organic carbon substrates, the decomposition of which produces a CO 2 flux (R H ). Increasing soil temperature also promotes soil decomposition. Daily litterfall is computed as a constant daily fraction of annual NPP and is added to two of the three SOC pools (metabolic and structural pools), differentiated by turnover time, as described by Jones et al. (2017, Equations 6-8).
The response of R H to surface SM and temperature is calibrated against a representative, global set of EC flux towers (Baldocchi, 2008), separately, for towers representing different plant functional types (PFTs), using constrained nonlinear least squares optimization. The global distribution of up to eight PFT classes are defined from the MODIS MOD12Q1 (Type 5) land-cover classification   (Reichle et al., , 2019. The TCF model assumes that the R H flux is dominated by SOC decomposition of recent litterfall. As such, in L4C, surface (0-5 cm) soil moisture and temperature data from L4SM are used to drive the TCF R H calculation (Equation 1). This depth is also consistent with the average, expected sensitivity of the SMAP radiometer and includes the most labile fraction of SOC (Endsley et al., 2020).
Both the operational SMAP L4C Version 5 and L4C NRv8.3 have the same model logic, with NEE computed as the residual difference between GPP and RECO. L4C NRv8.3 and the modified versions of TCF use the same daily surface meteorological driver data for the period 1 January 2000 through 31 December 2017. For each modification to TCF, a full, daily carbon budget was calculated at 356 EC flux tower sites from the FLUXNET La Thuille Collection (Baldocchi, 2008). The modeled fluxes are site-scale, representing a 9-km area centered on each EC tower site; model processing occurs at 1-km spatial resolution within that footprint.
The GPP model of L4C NRv8.3 is unchanged throughout this study; each experiment uses the same minimum air temperature, vapor pressure deficit (VPD), and photosynthetically active radiation (PAR) data from the Modern Era Retrospective Re-analysis (MERRA-2, Gelaro et al., 2017). Similarly, the fraction of PAR absorbed by the canopy (fPAR) is derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) MCD15A2H product (Myneni et al., 2015) and is used as a model input to compute identical GPP estimates in each experiment. The environmental responses for each PFT in L4C NRv8.3 and subsequent experiments were calibrated against observed GPP and RECO fluxes partitioned from daily NEE using the night-time method at representative tower sites (Keenan et al., 2019). In every experiment, the three SOC pools (stratified by base decomposition rates) were brought to steady-state conditions through an analytical spin-up followed by a numerical spin-up, consisting of repeated cycling of annual climatologies until the annual NEE balance is within ±1 g C m −2 d −1 .

Litterfall Phenology
The timing of litterfall allocation to SOC pools could have a profound effect on the seasonal cycle of R H . Randerson et al. (1996) tested litterfall allocation schemes based on remotely sensed leaf-area index (LAI) and selected the best-performing scheme for the CASA model. The CASA litterfall scheme, as implemented in the modified TCF framework, changes litterfall input,  , from a constant daily fraction of NPP to a moving-window function of LAI: where t is the day-of-year; δt is the time step in years (1/365); f E is the evergreen fraction, an estimate of the proportion of the canopy that is evergreen; and f L is the litterfall fraction in excess of a constant daily fraction (1/365): f L is normalized by the annual sum of L loss , the leaf-loss function. L loss is a triangular moving window centered on the current time step, amounting to the difference between lagged and leading LAI. Here, satellite-observed LAI inputs to the TCF model are obtained from the MODIS MCD15A2H product (Myneni et al., 2015). Unlike Randerson et al. (1996), we re-calculate f E each year, allowing for potential changes in the canopy species composition. We also used the full MODIS MCD15A2H record, down-scaled to daily time steps by forward-filling values, over the 2000-2017 period. The approach requires two leading values from MCD15A2H (two 8-day MCD15A2H composites), which would introduce a ca. 16-day latency. For an operational algorithm aiming for low latency, like SMAP L4C, a static 365-day LAI climatology could be used instead.

O 2 Diffusion Limitation
If O 2 diffusion becomes limiting at high SM, this could explain the apparent RECO high bias in TCF during the spring season. We verified that high SM conditions exist in the NH spring at multiple U.S. Surface Climate Observing Reference Networks (USCRN; Diamond et al., 2013) and Soil Climate Analysis Network (SCAN; Schaefer et al., 2007) in situ monitoring sites in the contiguous United States (CONUS) north of 40° latitude. An annual SM climatology, compiled for each sensor depth, based on these sites suggests that shallow soil layers, in particular, experience an increase in SM anomaly during the NH spring ( Figure S1 in Supporting Information S1).
To model an O 2 diffusion limit at high SM conditions, we adopted a Michaelis-Menten (MM) function of soil volumetric water content (Davidson et al., 2012) as an additional constraint on R H . Currently in TCF (NRv8.3), g(θ%) is a function of soil wetness (from here, given as θ% to distinguish it from volumetric water content, θ), representing substrate diffusion. In the modified TCF R H function, g(θ%) is replaced by the minimum of itself and the O 2 diffusion limit term, a function of the volumetric O 2 concentration [O 2 ], and the MM or half-saturation constant, 2 : In taking the minimum of these two constraints, we assume they are equally limiting for soil heterotrophs. We calculate the O 2 concentration based on the diffusion coefficient of O 2 in the air d gas , the air fraction of O 2 (0.209 L L −1 ), the porosity of the soil ϕ, and the volumetric soil moisture θ. In our approach, no new fit parameters are required, as the constants 2 and d gas can be identified based on the soil moisture distribution observed among sites with the same PFT. First, following Davidson et al., we assume that when soil moisture is very low (below fifth percentile), the O 2 concentration in the soil pore spaces is the same as in the air, leading to: Second, we set 2 ≡ [ 2] , calculated using this inferred value of d gas and the median soil moisture. As in NRv8.3, soil moisture and porosity are derived from the SMAP L4SM (Reichle et al., 2019) and GEOS-5 Catchment Land Model (Koster et al., 2000;Tao et al., 2017), respectively.

Vertically Resolved Soil Decomposition Model
The original TCF framework is not vertically stratified: soil decomposition and R H flux are considered to occur near the surface in a single soil layer of arbitrary thickness. The SMAP L4SM product estimates soil temperatures in seven layers with interfaces at 5, 15, 35, 75, 150, and 300 cm depth, accounting for bedrock. However, because of the particular structure of the Catchment model, L4SM only reports SM in three nested layers: the surface layer (0-5 cm), the root-zone (approximately 0-100 cm), and the soil profile (0 cm to bedrock depth). In order to obtain vertically resolved estimates of soil water content, we developed a simple physical model of soil water infiltration, diffusion, and lateral drainage (Endsley, 2021b) based on the modified Richards' equation and which is fully described in Appendix A. The corresponding, vertically stratified soil decomposition model is driven with these estimates of the soil water profile, which depend on surface infiltration estimates from L4SM, while L4C NRv8.3 is driven using L4SM surface soil moisture.
The multilayer soil profile modification to TCF includes modifications of the SOC and R H submodels. The new, vertically resolved SOC model is similar to that of Yi et al. (2020): where  ( ) represents inputs (and transfers) to SOC pool i at depth z and D(z) is the vertical diffusivity of SOC. Diffusivity is taken to be 2 × 10 −4 m 2 yr −1 , after Yi et al. (2020) for nonpermafrost soils. Each soil layer or depth, z, contains the same three SOC pools, which are the same pools in the baseline NRv8.3 and the other experiments. Litterfall input is now a function of depth: where  ( ) is the litterfall input to SOC pool i at depth z, an exponentially declining function of depth, after Koven et al. (2013), which estimated the e-folding depth, z e , to be equal to 10 cm.  , the total daily litterfall input across the soil profile, is estimated as in NRv8.3 as a constant daily fraction of NPP. f ji is the transfer function defining carbon (C) transfers from pool j to pool i.
Finally, R H is calculated similar to the baseline TCF model, NRv8.3, with environmental modifiers soil moisture and temperature, but as a composite sum of the R H in each soil layer and with an additional rate modifier, h(z), which accounts for the extinction of R H with depth due to factors other than soil moisture or temperature (Koven et al., 2013): z k , the depth at which environmentally constrained R H declines by a factor of e (due to, e.g., mineral protection, aggregation, etc.), is a free parameter that is fit in calibration against the observed RECO flux.

The Kok Effect
To simulate the inhibition of R A by light (the Kok effect), prior modeling studies have modulated maximum light-use efficiency (LUE) according to irradiance (Turner et al., 2006) or adjusted R A directly as a function of irradiance, solar elevation, and the leaf angle distribution (Wohlfahrt et al., 2005). In TCF, however, a potential inhibition of R A implicates both the calibration and forward modeling through plant carbon use efficiency (CUE), or the fraction of GPP that is not respired. During calibration, CUE is used to compute R H for fitting against EC flux tower observations. In this experiment, CUE now varies with PAR: where g(PAR) is a linear ramp function that monotonically increases with increasing PAR, identical to Equation 2, but as a function of PAR instead of soil wetness. In the experiment NRv8.3 + Kok Effect, the minimum and maximum values of Equation 2 are fit parameters. In the baseline TCF NRv8.3, CUE does not vary with PAR (i.e., g(PAR) ≡ 1).
While light inhibition of foliar respiration is also relevant for partitioning EC flux tower observations, we did not revisit the partitioning methods used to derive the FLUXNET data record (Baldocchi, 2008) as part of this study, opting to use the same EC flux tower data that have been used in prior L4C calibration and validation studies (Jones et al., 2017). Consequently, our discussion of the Kok effect, here, is limited to its relevance during TCF parameter fitting (above) and forward model operation (below).
During forward modeling, CUE is key to computing NEE as the residual between RECO and GPP or, equivalently, between R H and NPP:

Verification and Validation Against Flux Tower Observations
The mean seasonal cycle at the 356 EC tower sites was used as a within-sample check on the experimental results, as it is observed that the mean seasonal cycle of the calibrated model does not match that of the underlying calibration data for high northern latitudes ( Figure 1); that is, does the modified TCF model display better fidelity to NH seasonal cycles in the calibration data set? In addition to this model verification, we validated RECO and NEE modeling skill, in carbon terms, against the L4C Core Validation Sites (CVS; Jones et al., 2017).
We also validated the TCF mean seasonal cycles against that of the FLUXCOM up-scaled tower fluxes data set (Jung et al., 2020), which is based on the random forest method with combined remote sensing and meteorology drivers (RS + METEO). Like FLUXCOM, TCF-based models (e.g., SMAP L4C) extrapolate to the global land domain the site-level relationships between environmental drivers and carbon fluxes, based on a representative set of EC flux towers. Though not entirely independent of the FLUXNET towers used to calibrate TCF, the additional driver data sets and larger spatial extent of FLUXCOM motivate our comparison of the aggregate, mean NEE and RECO seasonal cycles. Unlike the EC tower data, FLUXCOM provides gridded data over land; we subset the data to all pixels ≥40 N latitude. As the global network of 356 EC towers used to calibrate TCF are assumed to be sufficiently representative for inferring relationships at global scale, we compared the aggregated mean seasonal cycle from FLUXCOM's larger spatial extent to that of our site-level modeled results.
Two techniques were used to quantify the effect of each TCF modification on the modeled NEE and RECO seasonal cycles. First, we applied a low-pass filter (smoother) to the mean seasonal cycle, aggregated across all towers matching each PFT or across the FLUXCOM time series. We chose a 7-day moving window for the filter based on visual inspection of the filtered results. Second, we used Fourier regression to quantify the phase shift, in days, of a harmonic function fit to the FLUXCOM time series or the complete time series of all tower sites within each PFT group. Specifically, with smoothing, we aggregated the mean seasonal cycle prior to applying the filter; with Fourier regression, the raw time series data were used to estimate model parameters. Fourier regression provides a standard error for the phase offset across PFTs; the low-pass filter provides an estimate of the location of minimum NEE or maximum RECO.

Validation Against Chamber Data
We used data from the Community Soil Respiration (COSORE) database (Bond-Lamberty et al., 2020), a collection of in situ chamber studies, to investigate the relative advantage of each TCF modification and validate the modeled RECO fluxes. As TCF does not distinguish between above-ground and below-ground respiration and calculates R A as a constant fraction of GPP, we assume that TCF RECO is roughly proportional to soil respiration (R S ) at daily time scales. R S should be the largest component of RECO and they generally show similar dynamics (Barba et al., 2018;Bond-Lamberty et al., 2018). Using the Soil Respiration Database (SRDB) version 5 (Jian et al., 2021), we extracted R H :R S ratios averaged by PFT and used these to calculate the R H fraction of COSORE-reported R S , based on matching PFT classes. Those ratios are consistent with the analysis of Bond-Lamberty et al. (Table S6 in Supporting Information). COSORE data sets that reported negative SM or SM < 0.02 m 3 m −3 were excluded from the analysis. Few COSORE sites report the depth of collar insertion; all have recorded depths ≤10 cm. We computed the median R S flux, converted from μmol CO 2 m −2 s −1 to g C m −2 s −1 using the molar mass of carbon, across ports.
After filtering COSORE data sets on these criteria, we split them into two groups. In the first group, 13 COSORE data sets (Ataka et al., 2014;Carbone et al., 2011;Chang et al., 2008;Sánchez-Cañete et al., 2016;Vargas et al., 2018) reported concurrent, daily SM, temperature, and R S flux values (Table S1 in Supporting Information S1). These in situ SM and temperature data are most appropriate for modeling at COSORE sites given the relatively coarse scale of TCF input data sets. Soil texture and porosity data were obtained for these sites from the Catchment model. We computed a 365-day GPP climatology from the SMAP L4C Version 5 data set (2015-2019) at each COSORE location. Use of a GPP climatology eliminates canopy changes, real or spurious, that may not be reflected in the respiration measurements from COSORE chamber studies due to the scale mismatch. For experiments that included a CASA litterfall phenology, the daily litterfall fraction was computed as the average across EC tower sites for each PFT class.
A key issue arises with using COSORE-reported driver data in TCF models calibrated on L4SM, as SM values generated by one model (or measured in the field) are generally not comparable with those derived from another (Koster et al., 2009). The small number of relevant COSORE data sets precludes re-calibration of TCF using COSORE observations. Instead, we applied a bias correction, using an affine statistical transformation to re-scale COSORE moisture and temperature values to match the L4SM data based on within-PFT means. The coefficients from a linear regression of rank-ordered L4SM values on rank-ordered COSORE values were applied to transform the COSORE values of sites based on their PFT (i.e., slope parameter varying with PFT). We mapped the COSORE-reported biome to MOD12Q1 PFTs, which were often identical. To obtain a continuous record of soil moisture and temperature (required for TCF model operation), a daily COSORE climatology, by PFT, was used to fill-in missing values.
The second group of COSORE data sets consists of 12 other sites Curtis et al., 2005;Detto et al., 2013;Gaumont-Guay et al., 2014;Jassal et al., 2008;Noormets et al., 2010;Ueyama et al., 2018;Zhang et al., 2018) located within 9 km of a FLUXNET tower. Although these 12 data sets did not include driver data, we compared the modeled R H flux (based on L4SM and MERRA-2 driver data) from those nearby FLUX-NET sites, for each experiment, to the (partitioned) R H flux from COSORE.

Results
Each modification to the R H and/or SOC submodels produced a meaningful improvement in the estimated RECO and NEE seasonal cycles relative to the TCF NRv8.3 model baseline with no modifications (Figure 2). The modification to the R A model, via the Kok effect, produced no discernible improvement in the seasonal cycles (Tables 1 and 2). The mean day-of-year (DOY) of the NEE minimum (RECO maximum) for the high northern latitudes (≥40°N latitude), based on filtering of tower data (Table S2 in Supporting Information S1), is estimated to be 181 (197). Depending on the method used to quantify the phase difference (Tables S2, S3 in Supporting Information S1), in NRv8.3 the NEE minimum (RECO maximum) is delayed (advanced) by 15-26 days (12-14 days). This aggregate seasonal cycle obscures underlying heterogeneity but is useful as a high-level diagnostic. Some PFTs show a stronger phase correction than others (Figure 3). Spatial variation in the timing of the NEE minimum due to PFT and climate can be observed if we apply the TCF model at global extent (Figure 4).
The Fourier regression (Table S3 in Supporting Information S1) and low-pass filter results (Table S2 in Supporting Information S1) agree broadly as to the effect of each modification on the overall fit to the seasonal cycle of the EC flux towers; that is, each intervention, other than the Kok effect, produces a meaningful model improvement. However, they disagree substantially as to the length of the time lag for all PFTs except ENF (Tables 2 and S3 in Supporting Information S1). Some of this difference can be attributed to the lack of strong periodicity in NEE for some PFTs (e.g., SHB, GRS) which can confound the Fourier regression results; conversely, PFTs with broad seasonal peaks/troughs (e.g., CCR, BCR) may confound the low-pass filter. Differences in the TCF model fit parameters (if re-calibrated) and other observations unique to each experiment are reported below.

NRv8.3 + Litterfall Phenology
A seasonally varying litterfall scheme produced the best joint improvement in the NEE and RECO seasonal cycles (Table S2 in Supporting Information S1), relative to NRv8.3, particularly for DBF (Tables 1 and 2). The NEE seasonal cycle of DBF, with the new litterfall scheme, is almost a perfect match to the tower record (despite a bias difference), including the autumn increase in CO 2 flux to the atmosphere. This autumnal increase is also shown in the modeled NEE results for the BCR PFT, but it is not apparent in the corresponding tower data. Conversely, NRv8.3 shows a spurious high NEE anomaly for BCR in spring that is eliminated by this experiment's considerable shift in the BCR RECO seasonal cycle ( Figure S3 in Supporting Information S1).

NRv8.3 + O 2 Limit
With the O 2 diffusion limit, all PFTs show reduced NEE magnitude and most show less RECO throughout most of the year, though increased RECO is observed in the fall for some PFTs, particularly DBF ( Figure 3)  RECO seasonal cycle is improved for ENF, DBF, SHB, and cropland PFTs. As with the new litterfall scheme, the addition of an O 2 diffusion limit eliminated a spurious spring NEE anomaly for BCR in NRv8.3 ( Figure S4 in Supporting Information S1). But unlike the new litterfall scheme, the O 2 diffusion limit did not introduce a spurious autumnal NEE anomaly for BCR ( Figures S3 and S4 in Supporting Information S1). In general, the resulting phase correction in RECO is not as strong as in the NRv8.3 + Litterfall Phenology experiment ( Figure S3 in Supporting Information S1). Looking at the residuals (compared to tower observations), NRv8.3 over-estimates RECO at high SM in ENF, GRS, and croplands. Adding an O 2 diffusion limit reduces that high bias; however, in the NRv8.3 + O 2 Limit experiment, GRS and DBF show a slight under-estimation of RECO at high SM.

NRv8.3 + Soil Profile
The NRv8.3 + Soil Profile experiment produced only a moderate correction to the NEE and RECO seasonal cycles. We experimented with different functional forms for the litterfall input distribution and R H extinction function, h(z; Equation 9). For the litterfall inputs, as Koven et al. also suggested, we evaluated profiles based on the root profiles of Jackson et al. (1996), a root density profile based on the Community Land Model (Lawrence et al., 2019), and the normalized, median SOC profile from SoilGrids 250m (Hengl et al., 2017; Figure S16 in Supporting Information S1). The negative exponential h(z) better matched the shape of the median SoilGrids 250m profile and, in anticipation of a high RECO bias due to high SOC storage, we reduced the SOC storage magnitude by using a 9-cm e-folding depth ( Figure S18 in Supporting Information S1), instead of 10-cm as Figure 3. Mean seasonal RECO and NEE cycles for each experiment and for the EC flux towers ("Towers") at tower sites north of 40°N latitude for the ENF and DBF PFTs. The shaded area shows one spatial standard deviation for the Towers and is clipped for DBF NEE. Plots of the mean seasonal cycles for each PFT, separately, are available in Supporting Information S1.
suggested by Koven et al. As expected, SOC storage increases with a multilayer soil model ( Figure S19 in Supporting Information S1).

Factorial Combinations
In addition to single-factor experiments, we ran experiments in which multiple factors were combined, with the exception of the Kok effect implementation, as that experiment did not show improvement in the timing of the mean seasonal cycles. For the NRv8.3 + O 2 Limit + Litterfall multifactor experiment, no re-calibration was necessary, as the NRv8.3 + O 2 Limit parameters were re-used. The NRv8.3 + Soil Profile + O 2 Limit experiment did require re-calibration. The experiment combining both an O 2 limit and litterfall phenology generally resembles an average of those single-factor experiments ( Figure S8 in Supporting Information S1). Interestingly, the NRv8.3 + O 2 Limit + Litterfall experiment yields the most substantial correction of all multifactor experiments and a substantial improvement in the RECO and NEE seasonal cycles for ENF compared to the single-factor experiments.
Where an O 2 limit and vertical soil profile were combined, the steady-state SOC storage was unreasonably high, with total-column SOC content exceeding 870 kg m −2 and surface-layer (0-5 cm) SOC density of around 120 kg m −3 . Nonexponential litterfall input functions combined with the power-law R H extinction function yielded smaller, more realistic steady-state SOC pools, but failed to improve the RECO and NEE skill. The NRv8.3 + Soil Profile + Litterfall experiment, however, improved upon the baseline and the respective single-factor experiments; notably, an autumn high bias in the NEE cycles of DBF and BCR in the NRv8.3 + Litterfall experiment was much reduced ( Figures S3, S7 in Supporting Information S1).

Validation Against Tower and Chamber Data Sets
Modeled fluxes from each single-factor experiment compared well to the observed NEE, RECO fluxes at the L4C CVS ( Figure 5). Some of these sites are located below 40°N latitude, including the southern hemisphere, and therefore indicate that none of the new respiration mechanisms, as single factors, results in degraded NEE or RECO skill relative to the baseline NRv8.3. Conversely, in the combined experiments, the combination of an O 2 diffusion limit with other changes to the R H model led to degraded NEE and RECO skill ( Figure 6).
Compared to FLUXCOM, the NRv8.3 + O 2 Limit showed the best agreement in the RECO and NEE seasonal cycles, though the NRv8.3 + Soil Profile and NRv8.3 + Soil Profile + Litterfall experiments also compare well (Tables 3  and S5 in Supporting Information S1). Both phase estimation approaches agree that the single-factor experiments (other than NRv8.3 + Kok Effect) match the FLUXCOM seasonal cycles of RECO and NEE better than the baseline NRv8.3 product; however, they disagree considerably about the multi-factor experiments and the apparent residual lead in the RECO seasonal cycle (Table 3).
The modeled results at COSORE sites, combining those with independent driver data and those that are within an EC tower footprint, indicate that every experiment, other than NRv8.3 + Kok Effect, improved upon the NRv8.3 baseline in terms of R H modeling skill (Table 4). The experiment with a vertical soil profile, with or without a litterfall phenology, produced an improvement in the R H anomaly correlation and the greatest improvement in all skill metrics. The O 2 diffusion limitation, in particular, produced a substantial improvement in R H RMSE and biased-adjusted RMSE (ubRMSE) that can be attributed primarily to the substantial reduction in high-R H residuals at high SM ( Figure S10 in Supporting Information S1). The improvement in the NRv8.3 + O 2 Limit experiment is notable at one moist, high-elevation ENF site (Chang et al., 2008); NRv8.3 and all other experiments fail to accurately simulate R H dynamics at this site (median r = 0.15; median anomaly r = 0.50) but, with the O 2 limit, TCF simulates R H with very high accuracy, including spikes in R H during dry-downs (r = 0.88; anomaly r = 0.85; Figures S20 and S21 in Supporting Information S1).

Discussion
Three different modifications to the TCF soil decomposition model resulted in substantial corrections to the modeled seasonal carbon cycles in the NH and improved overall RECO and NEE modeling skill. Of the singular modifications tested, a seasonally varying litterfall scheme resulted in the greatest, consistent improvement in the RECO and NEE phase across PFTs. As that experiment involved no model re-calibration or new parameterization, we can attribute that improvement to the relative change in SOC substrate availability for R H . In contrast, the moderate improvements in RECO and associated NEE phase under the O 2 diffusion limitation and vertical soil profile experiments seem to have resulted from an overall reduction in RECO, particularly during the NH spring (Figure 3). Seasonally varying litterfall was most effective at reducing the phase bias in DBF, while an O 2 limit was most effective in GRS; both were effective in ENF. The vertical soil profile was much less effective at reducing either RECO or NEE phase bias in most PFTs, though it did mitigate bias in croplands and improved overall modeling skill (Table 4).  It should be noted that the high NH RECO bias of TCF is a major contributing factor to the NEE phase bias; as the modeled GPP cycle is tied to satellite observations and fixed in each experiment, merely reducing the RECO magnitude would result in a phase shift of the NEE cycle. In the NH, the NEE cycle would be advanced (i.e., shifted earlier in time). We verified the role of RECO magnitude in the TCF simulations, by inflating tower RECO 25%, and then re-calculating NEE using NRv8.3 GPP. Consequently, while an NEE phase correction may result from the reduction of a bias in RECO magnitude, we can interpret a RECO phase correction as an improvement in the timing of respiration phenology. To verify the mechanisms tested here, we examined the change in the RECO residual (difference in residual between modeled and observed RECO) for each experiment compared to the baseline NRv8.3 (Figures S11-S14 in Supporting Information S1). The experiments that were successful at correcting the RECO seasonal cycle all showed substantially reduced RECO during the NH spring months (April, May, June), particularly for the DBF and cropland PFTs. With the exception of the NRv8.3 + Kok Effect experiment, which failed to mitigate RECO bias, each experiment reduced the spring RECO bias in a different way.
The O 2 diffusion limitation produced the greatest reduction in residual RECO at both low and high values of soil moisture (SM), particularly in spring ( Figure S13 in Supporting Information S1), suggesting that an optimum SM  Note. Standard deviation across sites is noted for RMSE, ubRMSE in parentheses. Significant improvements in correlation, relative to NRv8.3, are denoted: *** (p-value < 0.01), ** (p-value < 0.05), * (p-value < 0.1).  (Ryan et al., 2015;Sihi et al., 2018). An upper limit on the response of R H to soil moisture has been shown to improve modeled RH estimates (Ťupek et al., 2019) and, as our results at COSORE sites indicate, specifically improves estimates at sites that experience high soil moisture conditions and at one alpine ENF site (Table 4 and Figure S10 in Supporting Information S1). When the O 2 diffusion limit is combined with a linear or sublinear function that increases with soil moisture (i.e., representing greater substrate availability), the result is a triangular function with a fairly narrow range of optimum soil moisture, which agrees with the observation that SM is most limiting on R H when soils are relatively dry or approaching saturation (Reichstein et al., 2003). At high northern latitudes, these conditions may predominate during spring thaw (Oikawa et al., 2014;Winnick et al., 2020), which underscores the key role of SM in accurately modeling the corresponding carbon cycle transitions.
The new litterfall allocation scheme shows a similar spring reduction in the RECO residual but it is not patterned by soil moisture or temperature ( Figure S12 in Supporting Information S1). Instead, there is a temporal pattern: residual RECO is reduced in the first half of the year but is elevated during the second half, effectively reducing R H and RECO in spring just as an O 2 diffusion limitation does when SM is high. The fall RECO increase then results from a release from substrate limitation (Leitner et al., 2016;Nielsen et al., 2019). The CASA model (Randerson et al., 1996), from which our litterfall scheme is derived, displays RECO and NEE phase biases similar to TCF (Byrne et al., 2018, Figure 2). This is particularly interesting as the NRv8.3 + Litterfall Phenology experiment considerably improved the phase offset between TCF and the tower observations (Tables 1 and 2) and perhaps over-corrected when compared to FLUXCOM (Table S4 in Supporting Information S1). Randerson et al. noted the CASA litterfall scheme led to an advanced R H seasonal cycle (earlier peak), which was expected due to a build-up of fall substrate inputs and, in turn, a high substrate availability in spring (Byrne et al., 2018). However, in our experiment, the same litterfall scheme only delayed the R H cycle. This discrepancy depends on whether or not winter-time R H is sufficiently reduced, relative to litterfall inputs, so as to allow substrate pools to increase before spring. Another key difference between CASA and TCF is the much coarser spatial resolution of CASA (and coarser temporal resolution in Randerson et al., 1996).
When we look at the difference in RECO residuals from the NRv8.3 + Kok Effect experiment, stratified by PAR, the RECO residual is still high at almost all levels of PAR but especially when PAR is high, indicating that a CUE response to PAR is not having the intended effect on the seasonal cycle ( Figure S11 in Supporting Information S1). This may be due to TCF's high RECO bias in the NH (Figure 3), that is, the R A fraction increases to the extent that R H is reduced, resulting in a similar level of RECO to NRv8.3. This intrinsic high bias in RECO may be due to the night-time partitioning of EC tower fluxes (Keenan et al., 2019). Alternatively, or in addition to this problem, there may be a problem with our implementation of a PAR scalar modulating CUE at daily time scale, as R A is known to continue throughout the day and subdaily co-variation of PAR and temperature is considerable (Heskel et al., 2013;Peng et al., 2013); TCF's use of daily average meteorology that is more representative of daytime conditions may contribute to the high RECO bias (Wehr et al., 2016).
Despite its small effect on the mean seasonal cycles, the greatest improvement in both NEE and RECO modeling skill ( Figure 5) came from the incorporation of a vertical soil profile into the TCF soil decomposition model. The small correction in phase bias seems to be due to the lagged R H flux that arises from the slow diffusion of heat and, to a lesser extent, of moisture through the soil column. We verified this mechanism by plotting the standardized, modeled R H flux in each soil layer from the NRv8.3 + Soil Profile experiment, along with the (single-layer) flux from NRv8.3 ( Figure S15 in Supporting Information S1). The results indicate that, with a vertically stratified soil decomposition model, the individual-layer R H fluxes are lagged and decline in magnitude with increasing soil depth. Consequently, the whole-column, total R H flux in the vertically resolved model approaches the magnitude of the single-layer model, though the multi-layer total is slightly smaller. The result is that the NEE and RECO cycles both peak in early July (Table S2 in Supporting Information S1) but the RECO peak is broader, consistent with Yi et al. (2020). In mid-to-late summer, the RECO flux at the NH sites is substantially reduced due to SM (i.e., substrate diffusion) limits (not shown).
This lag effect and corresponding improvement in the RECO seasonal cycle could be enhanced if deeper soil layers were modeled with higher SOC storage. In the NRv8.3 + Soil Profile experiment, SOC storage diminishes to almost zero at 1.5 and 3-m depth. The exponentially declining input distribution of Koven et al. (2013) is a good match for the median, global SOC profile from SoilGrids 250m ( Figure S16 in Supporting Information S1) as well as the distribution of carbon by age (Balesdent et al., 2018); however, TCF depletes deep SOC storage during model spin-up ( Figure S18 in Supporting Information S1). This underscores that further improvements to effectively model SOC protection mechanisms are needed in order to accurately simulate R H fluxes from a multi-layer soil decomposition model. The exponential litterfall distributions that allocate very little SOC to deeper layers ( Figure S16 in Supporting Information S1) are probably more realistic than distributions based on root fractions (Shi et al., 2020). However, an exponential extinction of R H with depth may not be reasonable, as there is recent evidence that between 30% and 60% of CO 2 efflux originates below 1 m depth (Wan et al., 2018).
For simplicity, our model varies neither the turnover times nor the environmental response functions with depth. Addressing these limitations will require improved data on the vertical distribution of R H flux.
In TCF, calibrating SOC turnover is somewhat subjective, as the base decay rates are determined by comparing the inferred SOC storage from inverting the R H flux with that indicated by the International Geosphere-Biosphere Data and Information System (IGBP-DIS) soil inventory record (Global Soil Data Task Group, 2000). However, the base rates likely should be modified when soil decomposition mechanisms are changed and should probably vary with soil layer depth; doing so might result in more favorable RECO, NEE skill metrics for the multi-factor experiments ( Figure 6). Another limitation in this study is the neglect of GPP magnitude bias. Although the phase of GPP is expected to be constrained by the satellite-observed fPAR (Messerschmidt et al., 2013), a GPP magnitude bias also has the potential to introduce an NEE phase bias and requires further research along the same lines of this study.
The model enhancements produce similar phase corrections when results from different PFTs are pooled. This equifinality suggests that the modifications to TCF assessed here may not be equally relevant to all PFTs. For instance, the new litterfall scheme resulted in a better match to autumnal NEE for DBF but also created a spurious high NEE anomaly in autumn for BCR. The combination of O 2 limit and vertical soil profile also further delayed (advanced) the mean NEE (RECO) seasonal cycle for ENF. The equifinality among experiments also indicates that the NH seasonal cycle of NEE is an emergent property of terrestrial ecosystems (Birch et al., 2021) and that we are likely missing some interactions between limiting factors and driving relationships of soil decomposition, for example, microbial biomass and stabilization of SOC (Johnston & Sibly, 2018) or litter input traits (Hu et al., 2018). After all, there is some residual misfit in the modeled seasonal cycles (Tables 1 and 2) and TCF still retains a high RECO bias. In addition to the high residual RECO bias, which may be due to the partitioning of EC tower fluxes, TCF also has a relatively large NEE magnitude bias, as its summer-time GPP and NEE amplitudes are smaller than tower observations (Figure 3). Future development of TCF and similar models-given their promise for global, operational terrestrial carbon budgeting (e.g., SMAP L4C)-should focus on reducing RECO bias, starting with an assessment of different EC flux partitioning methods (Keenan et al., 2019) and SOC protection mechanisms.

Conclusions
A seasonally varying adjustment of R H model processes, including the litterfall allocation to SOC available for decomposition, resulted in major corrections to the modeled RECO and NEE seasonal cycles, as compared to EC flux tower observations, in a first-order soil decomposition model. An explicit litterfall phenology, with or without a vertically resolved SOC decomposition model, yields the best improvement in phase. The NEE phase bias in TCF for high northern latitude sites (≥40 N), was reduced from a lag of 15-26 days to between a 5-day lead or 15-day lag, depending on the experiment. Based on a comparison to the FLUXCOM seasonal cycle above 40°N latitude, the model enhancements generally eliminated the NEE phase bias, though a smaller RECO phase bias remains. Comparison to independent, in situ chamber measurements indicates the proposed mechanisms can improve RECO and NEE modeling skill.
The RECO phase bias can result from a bias in RECO magnitude, that is, from excess modeled autotrophic (R A ) or heterotrophic respiration (R H ) at key seasonal intervals. Two model enhancements, adding a limit on O 2 diffusion for soil heterotrophs or a seasonally varying litterfall inputs scheme, reduced the phase biases in RECO and NEE by reducing R H during the NH spring season. The O 2 limit restricts R H as soil moisture increases, which is common in the NH spring in many regions due to snowmelt and increased rainfall. The new litterfall scheme directly shifts the R H seasonal cycle later in time by enhancing substrate limitation in the spring. Although less effective at correcting RECO or NEE phase bias, a multi-layer soil decomposition model also reduced spring NH