Climate more important than soils for predicting forest biomass at the continental scale

Mg ha −1. This suggested that other non-climate, non-edaphic variables impose a substantial influence on forest above-ground biomass, particularly in the high biomass range. We conclude that climate is a strong predictor of above-ground biomass at broad spatial scales and across large environmental gradients, yet to predict forest above-ground biomass distribution under future climates, other non-climatic factors must also be identified.


Introduction
Biomass stored in forests is critical to the global carbon cycle. Forest ecosystems contain an estimated 861 ± 66 Gt C globally (Pan et al. 2011) and provide a vital climate regulating service (de Groot et al. 2010) by removing and sequestering carbon from the atmosphere. The live mass (biomass) in forest ecosystems is especially crucial as it annually sequesters around 2.4 ± 0.4 Gt C globally (Pan et al. 2011), offsetting 21% of the annual anthropogenic emissions of 10.7 ± 1.2 Gt C from fossil fuels, industry and land-use change (Le Quere et al. 2018). Since the global climate is predicted to change substantially over the coming decades (Collins et al. 2013), it is critical that we understand how strongly climate and other abiotic factors are associated with forest biomass at broad spatial scales. Such understanding may contribute to validating and constraining Earth System models and to predicting impacts of changing climates on forest carbon stores.
It is well established that climate strongly affects the distribution and storage of carbon in global forests (Pan et al. 2013). Temperature and precipitation are clearly linked to biomass storage (Keith et al. 2009, Pan et al. 2013) through the constraints they place on biomass production. Temperature regulates the rates of carbon dioxide assimilation in leaves (Farquhar et al. 1980) and carbon losses from respiration associated with maintaining living tissue (Larjavaara and Muller-Landau 2012). Precipitation influences water availability, which in turn affects stomatal conductance, nutrient uptake, leaf area index and thus forest productivity (Eamus 2003). Mean annual temperature (MAT) (Liu et al. 2014, Vieilledent et al. 2016, Zhang et al. 2016, Gordon et al. 2018, Ali et al. 2020) and mean annual precipitation (MAP) (Slik et al. 2010, Liu et al. 2014, Cook et al. 2015, Poorter et al. 2016, Vieilledent et al. 2016, Zhang et al. 2016, Li et al. 2019, Wang et al. 2019) are strongly associated with forest biomass (or biomass production) in diverse forests at a range of spatial scales. Yet temperature can also constrain forest biomass through seasonality (Vieilledent et al. 2016), the maximum and minimum temperatures of the warmest and coldest months (Bowman et al. 2014b) and the number of growing days (Ali et al. 2020). Further, precipitation can also constrain forest biomass through seasonality (Slik et al. 2010, Prior et al. 2011, Poorter et al. 2016), variability (Alvarez-Davila et al. 2017), and by determining the extent (Saatchi et al. 2007) and length of dry periods (Malhi et al. 2006, Saatchi et al. 2007).
Forest above-ground biomass (AGB) is also influenced by non-climatic factors such as soil properties (Paoli et al. 2008). Soil affects the distribution of trees, and influences forest processes that affect AGB distribution (Pan et al. 2013) such as growth, recruitment and mortality (Yuan et al. 2019). Soil physical properties like bulk density can influence biomass production through effects on root growth and associated nutrient uptake (Aerts and Chapin 1999), either directly by physically inhibiting elongation or indirectly by altering water and oxygen availability (Bengough 2003). The type and size of soil particles (such as clay, silt and sand) affect the soil's water holding capacity and thus strongly influence the ability of plants to extract water (Bengough 2003) and of seedlings to establish (Ford and HilleRisLambers 2019). Soil chemical properties can also impose environmental limits on biomass production by restricting nutrient availability (such as N and P), or by influencing nutrient uptake through, for example, effects of pH on the availability of Ca, Al and P (Hjelm and Rytter 2016).
Few studies have examined statistical relationships between forest AGB and both climatic and soil physico-chemical properties at broad spatial scales and in diverse forests. Such studies are important to reveal the effect of climate and soil on forest AGB independent of forest type, and for revealing ecological patterns that may be obscured at smaller spatial scales (Levin 1992). The two notable exceptions, Sankaran et al. (2005), whose continental-scale study examined African savannas, and Zhang et al. (2016), whose subcontinental study examined six forest types in southwest China, both suggest that soil imparts a small effect when compared to the substantial influence of climate. However, these studies addressed additional factors including disturbance (Sankaran et al. 2005) and stand age (Zhang et al. 2016), which might have weakened the apparent influence of soil. Soil properties have been shown to moderate the effects of climate at the local scale on both adult tree survival (Ibanez et al. 2014) and seedling establishment (Ford and HilleRisLambers 2019), and it is possible that such moderation may also occur at broader spatial scales. If soil is a driver of AGB across multiple forest types and biomes, then soil may moderate or compound effects of climate change on forest biomass. Further understanding of such feedbacks is necessary for validating and constraining broad-scale biomass models.
The Australian continent offers a unique opportunity to examine broad-scale drivers of forest biomass because it encompasses multiple biomes, several ecoregions, many forest types and broad climatic and abiotic gradients. Australia extends from 10°41′ to 43°38′S and from 113°09′ to 153°38′E, with a landmass of approximately 769 Mha of which native forest covers 16% or 123 Mha (Montreal Process Implementation Group for Australian and National Forest Inventory Steering Committee 2013). Forests are found in tropical, subtropical and temperate biomes (IPCC 2006), across six world ecoregions (Dept of Agriculture Water and the Environment 2020), and encompass a range of forest structures including eucalypt tall-closed forests, rainforests, eucalypt low, medium and tall open-forests, and eucalypt mallee woodlands (Montreal Process Implementation Group for Australian and National Forest Inventory Steering Committee 2013), of which some may be classed as savannas. The climate varies considerably across the continent, with the tropical north primarily influenced by the monsoon, the centre characterised by aridity, and a temperate climate prevailing in both the south west and east (Peel et al. 2007). Overall, the continent encompasses 12 separate Köppen-Geiger climate zones (Peel et al. 2007). In addition, Australia encompasses broad ranges in soil physico-chemical properties. For instance, sand, silt and clay fractions each cover the full range from 0 to > 90%, soil pH ranges from strongly acidic (2.1) to strongly basic (10.3), and effective cation exchange capacity ranges from 0 to 99.28 me/100 g (Viscarra-Rossel et al. 2015).
We examined the importance of climate and soil variables in explaining forest biomass across Australia using a continent-wide biomass database. By doing so, we aimed to elucidate a) whether models including climate-only, soil-only or climate plus soil explain more of the variation in AGB at broad spatial scales, and b) the broad-scale abiotic drivers of forest biomass. We hypothesised that models containing climate plus soil variables would explain more variation than models based on climate or soils alone. Consistent with global relationships, we hypothesised that MAT and MAP would be the most important climate variables to explain AGB distribution. Due to the importance of phosphorus to photosynthesis (Kirschbaum et al. 1992) and consistent with several studies that have demonstrated the importance of soil phosphorus to forest AGB (Sankaran et al. 2005, Quesada et al. 2012, Fedrigo et al. 2014, Navarrete-Segueda et al. 2018, van der Sande et al. 2018, Fricker et al. 2019, Cheng et al. 2020), we further hypothesised that phosphorus would be the most important edaphic variable.

Study area
Our study covered Australia's native forested area, with forests defined by the Montreal Process Implementation Group for Australian and National Forest Inventory Steering Committee (2013) as areas dominated by trees with mature or potentially mature stand height over 2 m and existing or potential crown cover of 20% or more. This forest definition is broad enough to include a wide variety of woody vegetation, including some that may be classified as woodlands and savannas in alternate classification systems (Dept of Agriculture Water and the Environment 2020), and is consistent with Kyoto carbon accounting methods (Commonwealth of Australia 2019). Australian forests thus encompass at least 15 forest types (Supplementary material Appendix 1 Table A1), are largely dominated by eucalypt species, and are located around the continent's perimeter with the greatest forest area situated along the east coast (Fig. 1). The forests are distributed across a wide range of climatic conditions (e.g. 4.4-29.1°C mean annual temperature range, 108-6563 mm mean annual precipitation range), and soil properties (e.g. 0.0-61.6% clay, 1.18-61.7 me/100 g effective cation exchange capacity; Table 1).

Data sources and preparation
We used publicly available continental and global-scale data products to investigate the extent to which AGB is associated with climate and soil variables in Australia's forests (Supplementary material Appendix 1 Table A2). AGB (i.e. only in live pools) data were obtained from the AusCover and TERN Biomass Plot Library (TERN AusCover 2017). This is an Australia-wide compilation of 14 453 plot-level biomass estimates from 11 241 sites (mean area 0.158 ha, range 0.005-3.2 ha) computed from tree diameter measurements made by government, university and research institutions between 1936 and 2017. Soil data for 10 variables were obtained from the National Soil Landscape Grid (Viscarra-Rossel et al. 2015). These are continent-wide grids (~100 m resolution) that were derived by spatial modelling of over 2.4 million soil measurements (mainly sampled from agricultural regions) with 32 environmental variables (Viscarra-Rossel et al. 2015). They were available in 6 depth intervals (0-5, 5-15, 15-30, 30-60, 60-100 and 100-200 cm), and we selected the topmost layer (0-5 cm) for all soil variables based on high correlations (r > 0.70) among all intervals. Climate data obtained from WorldClim 2.0 (Fick and Hijmans 2017) contained 19 pre-calculated bioclimatic variables at a spatial resolution of ~1 km².
To prepare the data for analysis we first extracted climate, soil and forest attributes for each biomass plot location in ArcMap ver. 10.4.1 (ESRI 2015). We then followed a thorough data cleaning process to remove errors and potential biases in the dataset, and to restrict our analysis to only minimally disturbed regions of mature forest (Supplementary material Appendix 1 Table A3). We did this to minimise the effects of both disturbance (Seedre et al. 2020) and stand age (Liu et al. 2014, Zhang et al. 2016, Zhu et al. 2018, Jones et al. 2019, Li et al. 2019) on forest AGB in our analyses, noting that consistent Australia-wide fire-history data are lacking, and that stand age is difficult to define for the many fire-tolerant forests that are composed of multiple age cohorts (i.e. fires are not stand replacing; Aponte et al. 2020). First, we applied a quantitative assessment that used ancillary datasets, direct communication with data custodians, and Landsat imagery from 1972 to 2016 to check for continuous forest cover (> 20%). We thus excluded plots that were significantly impacted by anthropogenic disturbance and/or by a major natural disturbance such as a cyclone or high-severity (i.e. crown-consuming) wildfire after 1972, as described in Roxburgh et al. (2019). Plots that were burned by low to moderate-severity prescribed fire or wildfire (i.e. crown cover retained or quickly recovered) during this period were thus retained. Next, we removed all measurements where forest height was < 2 m and canopy cover < 20%, and where the forest type was non-native (i.e. plantation) or unspecified using the Montreal Process Implementation Group for Australian and National Forest Inventory Steering Committee (2013) map. We then excluded measurements where AGB was less than 0.01 Mg ha −1 (thereby excluding measurements of 0 Mg ha −1) or exceeded 1504 Mg ha −1, because this represented a reasonable likely maximum AGB for Eucalyptus regnans forest, widely recognised as Australia's most carbon-dense forest type (Sillett et al. 2015, Volkova et al. 2018), acknowledging that the accuracy of one higher biomass estimate (3638 Mg ha −1, Keith et al. 2009) has been questioned (Sillett et al. 2015). For sites with multiple measurements, we removed duplicates or the lower estimate. Finally, we investigated all sites where AGB fell outside three standard deviations of the mean for the forest type and excluded those sites with no precedent in the literature (e.g. Acacia forest > 727 Mg ha −1, Callitris forest > 349 Mg ha −1).
Our process retained 21.6% of measurements (27.8% of sites), leaving 3130 measurements for inclusion in our study from the original 14 453 measurements.
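The range and duplicate screens described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' workflow (which was carried out in ArcMap and R and also relied on Landsat checks and custodian advice). The record fields `site`, `forest_type` and `agb` and the example values are hypothetical; only the 0.01 and 1504 Mg ha −1 bounds come from the text.

```python
# Bounds stated in the text (Mg per ha).
AGB_MIN = 0.01
AGB_MAX = 1504.0

def passes_screens(plot):
    """True if a plot is native, has a specified forest type and a plausible AGB."""
    if plot["forest_type"] in ("plantation", "unspecified"):
        return False
    return AGB_MIN <= plot["agb"] <= AGB_MAX

def highest_per_site(plots):
    """For sites measured more than once, keep only the highest AGB estimate."""
    best = {}
    for plot in plots:
        site = plot["site"]
        if site not in best or plot["agb"] > best[site]["agb"]:
            best[site] = plot
    return list(best.values())

# Hypothetical records for illustration only.
plots = [
    {"id": "a", "site": "s1", "forest_type": "eucalypt tall open", "agb": 412.0},
    {"id": "b", "site": "s2", "forest_type": "plantation", "agb": 150.0},
    {"id": "c", "site": "s3", "forest_type": "acacia", "agb": 0.0},
    {"id": "d", "site": "s4", "forest_type": "eucalypt tall open", "agb": 3638.0},
    {"id": "e", "site": "s1", "forest_type": "eucalypt tall open", "agb": 380.0},
]

screened = [p for p in plots if passes_screens(p)]
retained_ids = [p["id"] for p in highest_per_site(screened)]
print(retained_ids)
```

The three-standard-deviation check by forest type is left out because, as described, it involved a manual comparison against the literature rather than a fixed rule.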

Modelling relationships
The Random Forest algorithm was used to model relationships between AGB and climate and soil variables. Random Forest is a machine learning model that handles numerous variables and is robust to over-fitting (Breiman 2001), and is thus well suited to the complex data set used in this study.
Random Forest has been used to predict above-ground carbon stocks in Madagascar (Vieilledent et al. 2016) and Australia (Roxburgh et al. 2019), map carbon stocks in the Western Amazon (Mascaro et al. 2014), and predict changes to forest canopy cover with climate change in south east Australia (Williamson et al. 2014). The method outperformed simple regression analysis for predicting above-ground forest biomass at broad spatial scales (Corona-Nunez et al. 2017), and similarly has outperformed other machine learning techniques for forest biomass estimation, including boosted regression trees, support vector machine and multiple regression splines (Safari et al. 2017), and support vector regression (Liu et al. 2017). We developed models for three combinations of variables (climate-only, soil-only and climate plus soils) to test the performance of climate and soil variables in explaining variation in AGB of Australia's forests. Variables for inclusion in models were selected by first computing the correlation between all climate (19) and soil (10) variables, and then choosing only those variables that were not highly correlated (r < 0.8) with all other retained climate and soil variables and, where possible, minimising selection of those with correlations between 0.6 and 0.8 (Supplementary material Appendix 1 Fig. A1). When choosing between variables we also selected those with a likely proximal (or direct) link to forest biomass. For example, mean annual temperature (MAT, bio_1) and mean annual precipitation (MAP, bio_12) were selected due to their global associations with AGB (Liu et al. 2014). The soil physical properties bulk density (bdw_0_5) and clay (cly_0_5) were selected for their potential to affect root growth (Bengough 2003). Soil chemical properties such as total phosphorus (pto_0_5) were selected for their potential to affect plant growth through nutrient limitation (Aerts and Chapin 1999). Water availability (MAP − potential evapotranspiration) and MAT:MAP (the ratio of mean annual temperature to mean annual precipitation) were considered based on identified relationships with AGB (Brown and Lugo 1982, Alvarez-Davila et al. 2017), but were excluded due to high correlations with MAP. Eight climate variables were selected: four representing temperature (MAT, isothermality, temperature annual range and mean temperature of the driest quarter), and four representing precipitation (MAP, precipitation seasonality, precipitation of the driest quarter, and precipitation of the coldest quarter). Six soil variables were selected: three representing physical characteristics (available water capacity, bulk density and percentage clay) and three representing chemical characteristics (effective cation exchange capacity, total phosphorus and pH; Table 1). Thus, we had 8 explanatory variables for the climate-only model, 6 for the soil-only model and 14 for the climate plus soils model. Histograms of each of the selected variables for the 3130 plots are presented in Supplementary material Appendix 1 Fig. A2.
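The correlation screen can be illustrated with a short Python sketch. It assumes a simple greedy rule: candidate variables are visited in a user-supplied priority order (standing in for the authors' judgement about proximal links to biomass) and a variable is retained only if its absolute Pearson correlation with every variable already retained is below 0.8. The greedy ordering, the name `water_avail` and all values are assumptions made for illustration; the source does not describe the selection as an automated procedure.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def screen_variables(columns, priority, threshold=0.8):
    """Greedily keep variables whose |r| with all kept variables is below threshold."""
    kept = []
    for name in priority:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Synthetic values for five hypothetical plots.
columns = {
    "bio_1": [10.0, 14.0, 18.0, 22.0, 26.0],             # MAT, deg C
    "bio_12": [1200.0, 600.0, 1500.0, 400.0, 900.0],     # MAP, mm
    "water_avail": [400.0, -150.0, 650.0, -420.0, 120.0],  # tracks MAP closely
}

selected = screen_variables(columns, ["bio_1", "bio_12", "water_avail"])
print(selected)
```

In this toy case `water_avail` is dropped because it is almost collinear with `bio_12`, which mirrors the reason given in the text for excluding water availability.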
Models were developed in R ver. 3.4.3 (R Core Team) using the 'randomForest' package (Liaw and Wiener 2002) tuned to try two variables at each split (mtry = 2), produce 300 trees (ntree = 300), and with AGB log transformed to normalise the distribution. These model tuning parameters were selected because they minimised out of bag error rates (Liaw and Wiener 2002). We used k-fold cross-validation, randomly splitting the data into 10 equal-sized folds so that 90% of the data set was used for model training and 10% for model testing. We thus repeated the analysis 10 times (once for each fold) for each of the 3 variable sets. Model performance and explanatory power of each of the climate-only, soil-only and climate plus soil models was evaluated by calculating the average of the root mean squared error (RMSE), model bias and coefficient of determination (R²) across the ten folds. To test that our results were consistent across broad forest types, we repeated the analysis using plots assigned to their world ecoregion classification (Dept of Agriculture Water and the Environment 2020). For each ecoregion with sufficient plots (n ≈ 100) we randomly split the data into training and test sets using a 70:30 split and then repeated the Random Forest analysis for each of the variable sets.
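The cross-validation loop and the three summary metrics can be sketched as follows. To keep the example self-contained, a one-variable least-squares line stands in for the Random Forest learner; it is a placeholder, not the authors' model. The metric definitions are also assumptions, since the text does not give formulas: bias is taken as the mean of observed minus predicted (positive when the model under-predicts) and R² as one minus the ratio of residual to total sum of squares on the held-out fold.

```python
import random
from math import sqrt

def k_fold_indices(n, k=10, seed=1):
    """Shuffle the indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_line(x, y):
    """Placeholder learner: ordinary least-squares line (swap in a Random Forest)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return lambda xs: [intercept + slope * v for v in xs]

def metrics(obs, pred):
    """Return (RMSE, bias, R2) for one held-out fold."""
    n = len(obs)
    resid = [o - p for o, p in zip(obs, pred)]
    rmse = sqrt(sum(r * r for r in resid) / n)
    bias = sum(resid) / n
    mean_obs = sum(obs) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    r2 = 1.0 - sum(r * r for r in resid) / ss_tot
    return rmse, bias, r2

def cross_validate(x, y, k=10):
    """Train on k-1 folds, test on the remaining fold, and average the metrics."""
    results = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        pred = model([x[i] for i in test_idx])
        results.append(metrics([y[i] for i in test_idx], pred))
    return [sum(m[j] for m in results) / k for j in range(3)]

# Synthetic predictor and response for illustration only.
rng = random.Random(0)
x_demo = [rng.uniform(5.0, 30.0) for _ in range(200)]
y_demo = [50.0 + 8.0 * v + rng.gauss(0.0, 20.0) for v in x_demo]
rmse, bias, r2 = cross_validate(x_demo, y_demo)
print(f"RMSE={rmse:.1f}  bias={bias:.2f}  R2={r2:.2f}")
```

Each observation is held out exactly once, so the averaged metrics describe out-of-sample performance, which is the property the ten-fold design in the text relies on.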
The importance of each variable to the model performance was used as an indicator of that variable's influence on AGB. This was evaluated for each of the three variable sets by first calculating the increase in mean squared error (MSE) with variable removal for each of the folds, and then calculating the mean increase in MSE and confidence intervals across the ten folds. To directly compare variable importance between the models produced for each variable set, the % increase in MSE was normalised from 0 to 1.
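A minimal Python sketch of the fold-level summary and the 0 to 1 rescaling is given below. Two details are assumptions, because the text does not specify them: the 95% confidence interval is computed from a t distribution with 9 degrees of freedom (ten folds), and the normalisation is a min-max rescaling within each model. The per-fold importance values are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_9DF = 2.262  # two-sided 95% t value for 9 degrees of freedom

def mean_and_ci(values, t_crit=T_CRIT_9DF):
    """Return (mean, 95% CI half-width) of per-fold importance values."""
    half_width = t_crit * stdev(values) / sqrt(len(values))
    return mean(values), half_width

def normalise(importance):
    """Min-max rescale a {variable: importance} mapping to the range 0 to 1."""
    lo, hi = min(importance.values()), max(importance.values())
    return {name: (value - lo) / (hi - lo) for name, value in importance.items()}

# Hypothetical % increase in MSE for three variables across ten folds.
per_fold = {
    "bio_9": [31.0, 29.5, 30.2, 32.1, 30.8, 29.9, 31.4, 30.0, 30.6, 31.5],
    "bio_1": [24.1, 25.0, 23.8, 24.6, 25.3, 24.0, 24.9, 23.5, 24.4, 24.8],
    "bio_19": [12.2, 11.8, 12.9, 12.0, 11.5, 12.4, 12.6, 11.9, 12.1, 12.3],
}

summary = {name: mean_and_ci(values) for name, values in per_fold.items()}
scaled = normalise({name: m for name, (m, _) in summary.items()})
for name in per_fold:
    m, half = summary[name]
    print(f"{name}: {m:.1f} ± {half:.1f} % increase in MSE, normalised {scaled[name]:.2f}")
```

Rescaling within each model puts the most important variable at 1 and the least important at 0, which makes rankings comparable between models but removes information about absolute differences in MSE.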

Model predictions
All ten iterations of the Random Forest model for each of the variable sets were used to predict above-ground forest biomass distribution across Australia. We used the ensemble prediction to produce a map of AGB and the coefficient of variation to produce a map of prediction error. All statistical analyses were conducted in R ver. 3.4.3 (R Core Team) and spatial analyses in ArcMap ver. 10.4.1 (ESRI 2015).
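The per-pixel summary can be sketched as below, assuming that the ensemble prediction is the arithmetic mean of the fold models' predictions and that the coefficient of variation is the sample standard deviation divided by that mean, expressed as a percentage. Both definitions are assumptions made for illustration, as are the prediction values, which are taken to be already on the Mg ha −1 scale.

```python
from statistics import mean, stdev

def ensemble_summary(fold_predictions):
    """Summarise predictions from several models pixel by pixel.

    fold_predictions is a list with one entry per model; each entry is a list
    holding that model's prediction for every pixel, in the same pixel order.
    Returns a list of (ensemble mean, coefficient of variation in %) per pixel.
    """
    summary = []
    for pixel_values in zip(*fold_predictions):
        m = mean(pixel_values)
        cv_percent = 100.0 * stdev(pixel_values) / m
        summary.append((m, cv_percent))
    return summary

# Hypothetical predictions (Mg per ha) from three models for two pixels.
predictions = [
    [100.0, 50.0],
    [100.0, 60.0],
    [100.0, 70.0],
]

for pixel, (m, cv) in enumerate(ensemble_summary(predictions)):
    print(f"pixel {pixel}: mean {m:.1f} Mg/ha, CV {cv:.1f}%")
```

Because the coefficient of variation divides by the mean, pixels with low predicted biomass can show a high value even when the absolute spread among models is small.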

Above-ground biomass data
The 3130 measurement plots that remained after data cleaning were distributed across 15 forest types, three biomes and six world ecoregions (Supplementary material Appendix 1 Table A1). Plot locations were clustered in the south west corner, south east and east coast of the continent's forests, with few samples in the north eastern and northern reaches and no plots located in the inner south west (Fig. 1). The majority of sites were in the subtropical biome (2083, 66.6%), followed by the temperate (775, 24.8%) and tropical (272, 8.69%) biomes. Mean AGB of the measurement plots was 201.1 (CI 95% ± 7.5) Mg ha −1, ranging from 65.0 (± 9.7) Mg ha −1 for the tropical biome to 157.3 (± 5.7) Mg ha −1 for the subtropical and 366.7 (± 22.2) Mg ha −1 for the temperate biome. The majority of points were sampled in eucalypt forests (2477, 79.1%), of which eucalypt medium woodland (1078, 34.4%), eucalypt medium open (638, 11.1%) and eucalypt tall open (551, 19.5%) were the most highly represented forest types. The highest mean AGB was 925.5 (± 794.8) Mg ha −1 in eucalypt tall closed forest (albeit based on only two measurements) and the lowest was 61.8 (± 9.2) Mg ha −1 in eucalypt mallee woodland (Supplementary material Appendix 1 Fig. A3).

Model performance
Including soil variables did not significantly increase model fit based on R² values (Fig. 2, Supplementary material Appendix 1 Table A4). The two models that contained climate variables both resulted in higher R² values (climate-only: 0.47 ± 0.04; climate plus soil: 0.49 ± 0.04) than the soil-only model (0.42 ± 0.03). However, the only difference that was statistically significant, based on non-overlapping 95% confidence intervals, was between the soil-only model and the climate plus soil model.
The three variable sets produced models with similar error and bias. RMSE was lowest for the climate-only model (162 ± 11 Mg ha −1) followed by the climate plus soil model (163 ± 11 Mg ha −1) and was highest for the soil-only model (172 ± 10 Mg ha −1), although the 95% confidence intervals overlapped, indicating non-significant differences. Bias followed a similar pattern (climate-only: 16.4 ± 3.1; climate plus soil: 18.5 ± 3.1; soil-only: 19.7 ± 2.9), although again the differences were not statistically significant. All models under-predicted AGB, particularly at higher biomass values (Fig. 3), and this effect was more pronounced in models that contained soil variables. Maximum predicted biomass for the test data was 776 Mg ha −1 for the soil-only model and 734 Mg ha −1 for the climate plus soil model, compared to 975 Mg ha −1 for the climate-only model, indicating that soil variables restricted the predictive range. Examination of the residuals showed that overprediction was linearly constrained and scaled with AGB. Further, residuals were greatest at low latitude and high longitude in a region that corresponds with the temperate forests of south east Australia (Supplementary material Appendix 1 Fig. A4).
The R² statistic for ecoregion models was always lower than the corresponding continent-wide model, yet results were broadly consistent (Supplementary material Appendix 1 Fig. A5). Climate-only and climate plus soils models performed equally well, explaining more of the variation than soil-only models for all ecoregions. Non-overlapping 95% confidence intervals indicate these differences were statistically significant. The Temperate broadleaf and mixed forest (TM-FOR) plots produced models with the greatest explanatory power (climate-only: 0.33 ± 0.01; climate plus soil: 0.32 ± 0.01; soil-only: 0.25 ± 0.01), compared to Tropical and subtropical grasslands, savannas and shrublands (TR-SAV) with the least explanatory power (climate-only: 0.17 ± 0.01; climate plus soil: 0.15 ± 0.01; soil-only: 0.09 ± 0.01). The model RMSE scaled with biomass, i.e. was least for TR-SAV and greatest for TM-FOR. Model bias was not clearly different from the full dataset (ALL) for any of the ecoregions with the exception of the TR-SAV climate plus soils model, which had weaker bias (TR-SAV: 14.5 ± 0.67; ALL: 18.5 ± 3.1).

Variable importance
The most important variables for each of the Random Forest models were mean temperature of the driest quarter (bio_9) for the climate-only and the climate plus soil models, and bulk density (bdw_0_5) for the soil-only model (Fig. 4a, Supplementary material Appendix 1 Table A5). Partial dependence plots (Supplementary material Appendix 1 Fig. A6) and Pearson correlation analysis (Supplementary material Appendix 1 Fig. A1) indicated that the relationship between AGB and mean temperature of the driest quarter was mostly flat, although a scatter plot indicated that AGB peaked in the 10-15°C temperature range (Supplementary material Appendix 1 Fig. A7). In contrast, AGB decreased with increasing bulk density. The least important variables for each of the Random Forest models were precipitation of the coldest quarter (bio_19) for the climate-only model, and percent clay (cly_0_5) for both the soil-only and the climate plus soil model.
Three main points can be drawn from the normalised variable importance comparison (Fig. 4b). Firstly, five climate variables (MAT (bio_1), MAP (bio_12), precipitation seasonality (bio_15), precipitation of the driest quarter (bio_17) and precipitation of the coldest quarter (bio_19)) imposed a significantly different effect on the model performance of the climate-only model compared with the climate plus soil model. This is evidenced by non-overlapping 95% confidence intervals of the normalised increase in MSE and suggests that including soil variables increased the effect of these climate variables on model performance. Secondly, the three climate variables that imparted the greatest effect on the mean squared error (bio_1, bio_9, bio_15) were consistent across the climate-only and the climate plus soil models, indicating an importance to AGB that was independent of soil. Thirdly, the importance of three soil variables, bulk density (bdw_0_5), available water holding capacity (awc_0_5) and effective cation exchange capacity (ece_0_5), was significantly different between the soil-only and climate plus soil models. Including climate in the model significantly enhanced the importance of available water holding capacity, and significantly decreased the importance of bulk density and effective cation exchange capacity.

Continental predictions
The three models produced predictions of AGB distribution across Australia's forested extent (Fig. 5a-c) that were very similar, including strong correlations between continent-wide biomass means, particularly between the climate-only and climate plus soil models (Pearson r = 0.97) (Supplementary material Appendix 1 Table A6). Predicted forest AGB ranged from 18 to 1066 Mg ha −1 (mean 100 ± 0.02 Mg ha −1 95% CI) for the climate-only model, 18 to 980 Mg ha −1 (mean 102 ± 0.02 Mg ha −1 95% CI) for the soil-only model, and 20 to 950 Mg ha −1 (mean 97 ± 0.02 Mg ha −1 95% CI) for the climate plus soil model. Total continent-wide forest AGB was predicted at 9517 Mt (± 66.6 Mt) for the climate-only model, 9768 Mt (± 73.8 Mt) for the soil-only model, and 9305 Mt (± 76.4 Mt) for the climate plus soil model. All models predicted regions of higher biomass concentrated in the south west and south east corners of the country, and the greatest differences between model predictions were in the tropical north east of the continent, where the soil-only predictions indicated higher biomass than both of the models that included climate variables. Variation within model predictions, indicated by the coefficient of variation (Fig. 5d-f), was also similar among models, with the majority of the predictions falling within the range of 0.1-2%. Coefficients of variation among model folds were highest in the north and south west for all models, in regions of low biomass that are mostly covered by savanna and woodland.

Discussion
In this study, we examined the importance of climate and soil variables to explaining AGB distribution in Australian forests. Climate was the strongest predictor of AGB and soil variables did not significantly improve model predictive performance, even when the analysis was repeated by broad forest ecoregion. In contrast to global studies, neither MAT nor MAP alone was the most important climatic variable. Rather, our models showed that the average temperature during the driest quarter was most important. Further, soil phosphorus was not the most important soil variable; instead, we found that bulk density imparted the strongest influence of all the soil variables that we explored. Overall, our best model explained 49% of the variation; we therefore expect that unexamined factors, such as disturbance regimes (particularly fire regimes), species diversity, stand age-cohort distribution, and forest structure, are also important drivers of AGB in Australian forests.

Climate more important than soils
Climate was more important than soils for predicting AGB distribution across Australia. Contrary to our hypothesis, including soil variables did not improve model performance. This was an unexpected result, given the importance of soil to plant growth (Kulmatiski et al. 2008) and the demonstrated direct effect that the soil variables we considered have on AGB distribution and wood production in many forests (Slik et al. 2010, Baraloto et al. 2011, Quesada et al. 2012, Toledo et al. 2017, Navarrete-Segueda et al. 2018, van der Sande et al. 2018, Cheng et al. 2020). We propose several explanations for this result. Firstly, climate is more generally limiting to AGB than soil. It is well established that average temperature (Brown and Lugo 1982, Raich et al. 2006, Liu et al. 2014, Zhang et al. 2016) and annual precipitation (Slik et al. 2010, Stegen et al. 2011, Liu et al. 2014, Cook et al. 2015, Vieilledent et al. 2016, Zhang et al. 2016) are strongly associated with AGB distribution. Plants require temperatures that encourage growth, but discourage increased rates of transpiration (Bowman et al. 2014b, Prior and Bowman 2014) or autotrophic respiration (Medlyn et al. 2011). Plants also require access to sufficient water for maintaining physiological processes (Lawlor 1995) and avoiding tree drought mortality (van der Molen et al. 2011). When temperatures are outside the range for optimal growth (Huang et al. 2019), or access to water is limited (Eamus 2003), forest productivity is constrained. Over time, total forest AGB is therefore limited by seasonal and annual fluctuations in temperature and precipitation. Our models suggest that these factors outweigh any influence that soils impart, at least in the region and across the environmental gradients our study encompassed.
Secondly, soils may act indirectly on AGB distribution through their effect on biotic factors. Emerging evidence suggests that AGB is influenced by the action of soil on stand characteristics such as structural diversity (Ali et al. 2019b, Aponte et al. 2020), stand density (Ali et al. 2019a), species richness (Ali et al. 2019a, b), functional diversity (Aponte et al. 2020, Cheng et al. 2020) and the number of large trees (Aldana et al. 2017, Navarrete-Segueda et al. 2018, Ali et al. 2019c). This has been demonstrated at broad spatial scales for temperate (Aponte et al. 2020), tropical (Ali et al. 2019b, c), and moist temperate, semi-humid and semi-arid forests (Ali et al. 2020). Soil characteristics have the potential to directly determine the type of vegetation that can be supported (i.e. grassland versus forest), and then influence the structural and functional characteristics of that vegetation. Since we limited our analysis to forests, the effects of soil on AGB distribution in our models may have been constrained to those on forest characteristics rather than the larger influence on woody versus non-woody ecosystems.
The forest types we examined and limitations in the soil data may also have contributed to our result. Many of the studies that demonstrated significant direct soil effects were in tropical rainforests, where soil may be more influential due to a lesser influence of climatic limitations on productivity. In tropical rainforests the small annual variation in daytime temperature means forests are growing in conditions close to their growth optima for much of the year (Doughty and Goulden 2008, Tan et al. 2017) and precipitation is rarely limiting. Our study encompassed many forest types, of which only a small proportion of sites (7%) were in rainforest and a large proportion were in open forests (38%) and woodlands (41%), some of which may be alternatively described as savannas (Dept of Agriculture Water and the Environment 2020). Moreover, Australian forests primarily consist of Eucalyptus species and thus may not reflect findings in tropical forests, or be representative of evergreen broad-leaved forests globally. Finally, the soil data we used were modelled rather than measured directly at measurement sites. Since soil can change dramatically over short distances, it is likely that at some sites the soil values that we used were inaccurate. Recent analysis of the National Soil Landscape Grid in south eastern Australia showed that the model consistently underpredicted Soil Organic Carbon (SOC) concentration in forests when compared to measured values (R² = 0.49, Bennett et al. 2020b). Thus, differences between our climate-only and climate plus soils models may have become significant if measured rather than modelled soil data were available.

Mean temperature of the driest quarter most important climate variable
Mean temperature of the driest quarter was the most important climate variable in our models, ranking higher than either MAT or MAP. This may be a function of forest growth across large climatic gradients, suggesting that climatic conditions during periods of constraint are more critical to AGB distribution across broad regions and diverse forest types than the mean annual condition. Both temperature and precipitation impose limitations on growth, which is one of the key demographic processes that contribute to forest AGB (Vanderwel et al. 2016). Both species (Hinko-Najera et al. 2019) and ecosystem (Huang et al. 2019) growth rates are highest when temperatures are at an optimum, and when other factors (such as precipitation) are not limiting. One interpretation of our analysis is that during the driest period of the year, when precipitation already imposes a constraint on growth, the effect of an additional constraint (temperature) becomes more influential.
Our study is the first to indicate the importance of mean temperature of the driest quarter to forest AGB. Only one other study has analysed the relationship between these variables and, akin to our analysis, their simple correlation resulted in no significant relationship (Saatchi et al. 2007). This may be due to the study being conducted at the subcontinental scale, having low variation (~5°C) in annual temperature across the study range, and addressing tropical rainforests in the Amazon basin. Because we adopted a broad definition of forest that included forests with diverse structural and functional characteristics, our analysis indicated different drivers of biomass across Australia compared with the Amazon basin. Several studies have demonstrated the importance of mean temperature of the driest quarter to forest variables that in turn may be associated with AGB. For example, mean temperature of the driest quarter was strongly associated with the distribution of the Australian rainforest tree Nothofagus cunninghamii (Worth et al. 2015), the Chinese forest tree Liriodendron chinense (Xu et al. 2017), the distribution of Chinese forests (Dakhil et al. 2019), and the basal area and dominant height of Chinese larch plantations (Lei et al. 2016). Thus, mean temperature of the driest quarter may be affecting AGB distribution of forests across this broad region by influencing the distribution of species, forests and the size of individual trees.

Bulk density is the most important soil variable
Bulk density was the most important soil variable in AGB models, yet it was of substantially lower importance in models that also included climate variables. As a measure of soil compaction (Zhao et al. 2010), bulk density represents the strength, porosity and matric potential of soil, and thus influences root growth and plant access to water, nutrients and oxygen (Bengough 2003). In very hard soil (high bulk density) roots may be unable to penetrate the soil to access water and nutrients, whilst in very soft soil (low bulk density) poor physical contact between the soil and roots may restrict uptake (Stirzaker et al. 1996). Because plant growth rates are limited at these extremes, AGB of forests growing in soils of very high or very low bulk density may be reduced compared to those growing in soils of optimum bulk density. Our analysis indicates that the effects of bulk density are lessened in models that also include climate variables, indicating a stronger overall influence of the latter on AGB distribution across Australia. This conclusion is consistent with other broad-scale studies that have demonstrated a smaller influence of soils than climate on the distribution of woody cover in African savannas (Sankaran et al. 2005) and forest AGB distribution in south-west China (Zhang et al. 2016).
Our study is the first to statistically demonstrate the importance of bulk density to forest AGB distribution. Few studies have considered relationships between AGB and bulk density in natural forests, with two notable exceptions. Bulk density was in two of the top ten models that explained AGB distribution in Iran's moist temperate, semi-humid and semi-arid forests (Ali et al. 2020) and it was significantly correlated with biomass in the Amazon basin when combined with other variables that represented soil structure (Quesada et al. 2012). Other studies showed that high bulk density limits height growth in conifer seedlings (Zhao et al. 2010), and both leaf area and root length in barley seedlings (Stirzaker et al. 1996); however, the effects of bulk density on AGB in forests remain under-examined.

Climate variables predict pattern of above-ground biomass distribution
The overall spatial pattern of AGB distribution in Australian forests can be predicted with relatively few climate variables. Our study is the first to limit mapping of AGB to Australia's protected forests; however, the spatial distribution predicted from our models was strikingly similar to that of studies that have predicted AGB across the full extent of the continent (Supplementary material Appendix 1 Fig. A8). Similarly to Berry and Roderick (2006), Montreal Process Implementation Group for Australia and National Forest Inventory Steering Committee (2013), Commonwealth of Australia (2018) and Roxburgh et al. (2019), our maps illustrate regions of high biomass in the south-west, south-east and east coast of the continent, with lower biomass in the north and the inner-east, and negligible biomass in the centre. A side-by-side comparison reveals that the greatest difference between our estimate and previous estimates is in the tropical north, where our predictions are around 100 Mg ha⁻¹ lower. This underestimation was likely caused by the lack of plots from this region in the Biomass Plot Library source data used for training our models. Moreover, the lack of data from this region and the small number of plots in the AGB range > 500 Mg ha⁻¹ may have contributed to our models' overall underestimation of total forest biomass, at 88% (climate-only model), 91% (soil-only model) and 86% (climate plus soil model), when compared with the only known Australia-wide estimate of 5390 Mt C for combined production and non-production forests (10 780 Mt dry matter, assuming a conversion rate of 0.5, Commonwealth of Australia 2018).
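The carbon to dry matter conversion quoted for the Australia-wide estimate can be checked with a short calculation. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes nothing beyond the 0.5 conversion rate and the 5390 Mt C figure stated above.

```python
# Minimal sketch of the carbon <-> dry-matter conversion quoted in the text.
# All names here are hypothetical and are not taken from the original analysis.

CARBON_FRACTION = 0.5  # conversion rate stated in the text (Mt C per Mt dry matter)


def carbon_to_dry_matter(carbon_mt, carbon_fraction=CARBON_FRACTION):
    """Convert a carbon mass (Mt C) to dry-matter biomass (Mt)."""
    return carbon_mt / carbon_fraction


def dry_matter_to_carbon(dry_matter_mt, carbon_fraction=CARBON_FRACTION):
    """Convert dry-matter biomass (Mt) to carbon mass (Mt C)."""
    return dry_matter_mt * carbon_fraction


# Australia-wide estimate cited in the text (Commonwealth of Australia 2018).
national_carbon_mt = 5390.0
national_dry_matter_mt = carbon_to_dry_matter(national_carbon_mt)

print(national_dry_matter_mt)  # 10780.0
```

The same helper can be applied in reverse to express a modelled dry-matter total in Mt C before comparing it with a carbon-based estimate.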
Other non-climatic, non-edaphic variables will also influence AGB distribution. The climate and soil variables we selected explained ~50% of the variation, meaning factors we did not consider were equally important. Biotic factors such as species diversity and stand structural complexity (Ali et al. 2019b), structural diversity (Aponte et al. 2020), functional trait composition and diversity (Ali et al. 2017), stand age (Liu et al. 2014, Zhang et al. 2016, Zhu et al. 2018, Jones et al. 2019, Li et al. 2019), and average stem diameter (Poorter et al. 2015, Cuni-Sanchez et al. 2017) also influence carbon storage, as do disturbances such as prescribed fire (Bennett et al. 2013), wildfire (Bowman et al. 2013, Keith et al. 2014) and harvest (Norris et al. 2010). Further, climate and soil may act indirectly on AGB by affecting these biotic factors (Aldana et al. 2017, Ali et al. 2019b, c, Aponte et al. 2020). Hence, the traits of dominant species, and the structural and functional composition of the forest, together with disturbance history (i.e. the time since the last disturbance, severity of the disturbance and stand age-cohort distribution) have likely had considerable influence on AGB distribution.
Australia is a fire-prone continent (Murphy et al. 2013), so we anticipate that fire regimes will be of particular importance to AGB in Australia's forests. Fire affects the accumulation of forest AGB by consuming biomass, causing tree mortality, and affecting juvenile recruitment and the re-growth of established trees (Bennett et al. 2017). Across the continent, fire consumes on average 11% of the carbon captured via Net Primary Productivity, though primarily in litter and woody debris fuels rather than in live tree carbon (Murphy et al. 2019). Nonetheless, whilst fire is unquestionably important, unravelling its effects is problematic due both to inconsistent or incomplete fire records, and to the multi-faceted ways that fires influence forest biomass. These range from stand-replacing wildfires in highly productive forests in the south-east (Bowman et al. 2014a), to minimal or partial mortality of established resprouter trees plus new seedling recruitment in fire-tolerant multi-aged forests (Bennett et al. 2016), to maintaining the balance of grass and tree co-existence in savanna ecosystems (Beringer et al. 2015). Whilst our data-cleaning process aimed to reduce the influence of fire on our analyses, we excluded only plots with crown cover reduced to < 20% after 1972, and thus likely included many plots affected by low to moderate severity fires. As a consequence, fire effects likely contributed to the unexplained variation in our predictive models, including potentially constraining forest AGB below climatic and/or edaphic potential in some ecoregions (Supplementary material Appendix 1 Fig. A5).

Limitations of the source data
Several limitations in our source data are important to consider. Firstly, the Biomass Plot Library plot-level data are an aggregation of measurements made by government and research organisations over many years. As such, data were not collected using a stratified sampling regime or a standard measurement protocol; thus not all forest types and regions were equally represented and there were likely (undescribed) differences in sampling methodologies. In particular, the north-east tropical forests and sites of high biomass in the south-east, especially in tall closed forests, were under-represented, and sample plots ranged from 0.005 to 25 ha in size. The AGB values were also pre-calculated using the generalised allometric equation of Paul et al. (2016). While there is evidence to suggest that at the continental scale this would have minimal impact on AGB estimates (Paul et al. 2016), it is possible that species-specific equations would yield different results. Secondly, soil and climate variables were derived from modelled data and not measured directly at plots, thus values may not reflect actual soil and climate conditions at the Biomass Plot Library site locations. Any such deviations were likely greater for the soil data, as the majority of soil samples used in the soil modelling were derived from agricultural regions (Viscarra-Rossel et al. 2015). The deviation between the Soil Landscape Grid predictions and measurements of SOC concentration demonstrated by Bennett et al. (2020b) may also extend to the soil variables used in our study. Thirdly, the spatial resolution of the climate and soil data was mismatched to that of the biomass plot data and thus may have been inadequate to pick up effects of local topography on biomass distribution. Finally, the soil data we used as predictor variables in our models came from a continent-wide model that included 6 climate variables in a set of 32 predictors (Viscarra-Rossel et al. 2015). While only one of these climate predictors (MAP) was also included in our predictor set, this meant our two variable sets (climate-only and soil-only) were not wholly independent.

Conclusion
Our study has demonstrated that climate is more important than soils for explaining forest AGB distribution at broad spatial scales across the continent of Australia. Since the climate is changing to become warmer, with more heat extremes, fewer cool extremes and decreases in rainfall (Bureau of Meteorology and CSIRO 2018), we were interested in examining whether models that used climate-only, soil-only or climate plus soil variables explained more of the variation in AGB distribution at broad spatial scales and across large environmental gradients. We showed that adding soil variables did not significantly improve the performance of climate-only models. Thus, akin to other studies, our results indicated a greater influence of climate than soil variables on forest AGB distribution at the continental scale. Because our models explained only 49% of the variation, we conclude that unexamined variables were equally important to forest AGB distribution in Australia. In particular, localised effects like steep topographic gradients, and biotic factors (such as species and functional diversity, stand structural complexity and diversity, stand age-cohort distribution, and the traits of the dominant species), will warrant stronger consideration in biomass models, including better representation in biomass and ancillary data sets at broad scales.

Figure 1. The spatial extent of the study, showing the Australian forested area (green), 3130 biomass plot locations (red) and three biomes (grey) covered in this study. Biome data were sourced from the FAO GeoNetwork (Food and Agricultural Organisation of the United Nations 2001), forest data were sourced from Australia's National Forest Inventory, 2013 (ABARES 2013), and biomass plot locations were sourced from the Biomass Plot Library (TERN AusCover 2017). GDA94, Australian Albers projection.

Figure 2. Comparative model performance statistics for the Random Forest models produced from each variable set (climate-only, soil-only and climate plus soil), showing (a) R²; (b) RMSE; and (c) model bias. Error bars represent the 95% confidence interval on the mean of the model statistic that was calculated from the 10 folds for each variable set. Detailed statistics are presented in Supplementary material Appendix 1 Table A4.

Figure 3. Observed versus predicted above-ground biomass (AGB) for the composite of 10 folds of models produced using the variable sets for (a) climate-only, (b) soil-only and (c) climate plus soil.

Figure 4. Comparison of variable importance between the Random Forest models produced from the three variable sets (climate-only, soil-only and climate plus soil) for (a) absolute % increase in MSE, and (b) normalised % increase in MSE. Error bars represent 95% confidence intervals of the mean of the 10 folds used during cross validation.

Figure 5. Predicted forest above-ground biomass (AGB) across the Australian continent based on the mean of the ten folds for (a) climate-only, (b) soil-only and (c) climate plus soil models. Associated model precision is indicated by the coefficient of variation across the ten folds, for (d) climate-only, (e) soil-only and (f) climate plus soil models. The model residuals plotted against latitude and longitude are available in the supplementary material (Supplementary material Appendix 1 Fig. A4). GDA94, Australian Albers projection.

Table 1. Bioclimatic and soil variables used as potential predictor variables in models of above-ground biomass. Values are the range, mean and standard deviation for Australia's forested area. Bioclimatic variables are sourced from WorldClim 2.0 (Fick and Hijmans 2017) and soil variables from the 0-5 cm layers of the Australian Soil Landscape Grid (Viscarra Rossel et al. 2014).