Allometric equations for integrating remote sensing imagery into forest monitoring programmes

Abstract Remote sensing is revolutionizing the way we study forests, and recent technological advances mean we are now able – for the first time – to identify and measure the crown dimensions of individual trees from airborne imagery. Yet to make full use of these data for quantifying forest carbon stocks and dynamics, a new generation of allometric tools which have tree height and crown size at their centre are needed. Here, we compile a global database of 108753 trees for which stem diameter, height and crown diameter have all been measured, including 2395 trees harvested to measure aboveground biomass. Using this database, we develop general allometric models for estimating both the diameter and aboveground biomass of trees from attributes which can be remotely sensed – specifically height and crown diameter. We show that tree height and crown diameter jointly quantify the aboveground biomass of individual trees and find that a single equation predicts stem diameter from these two variables across the world's forests. These new allometric models provide an intuitive way of integrating remote sensing imagery into large‐scale forest monitoring programmes and will be of key importance for parameterizing the next generation of dynamic vegetation models.


Introduction
Forests are a key component of the terrestrial carbon cycle (Beer et al., 2010; Pan et al., 2011), making forest conservation of critical importance for mitigating climate change (Agrawal et al., 2011). Yet effectively managing forests as carbon sinks is predicated on the assumption that carbon stocks can be quantified with accuracy across extensive and often remote areas. Traditionally, forest carbon stocks have been assessed by measuring the diameter (and sometimes height) of trees in permanent field plots and then using allometric equations to estimate biomass (Malhi et al., 2006; Pan et al., 2011; Anderson-Teixeira et al., 2015). Recently, however, we have begun to see a move towards remote sensing as the primary tool for monitoring forest carbon (Saatchi et al., 2011; Baccini et al., 2012; Avitabile et al., 2016). Airborne laser scanning (ALS) is particularly promising in this regard (Asner & Mascaro, 2014), allowing the 3D structure of entire forest landscapes to be reconstructed in detail using high-frequency laser scanners mounted on airplanes or unmanned aerial vehicles. Importantly, advances in both sensor technology and computation mean we are now able, for the first time, to reliably identify and measure the crown dimensions of individual trees using ALS (Yao et al., 2012; Duncanson et al., 2014; Shendryk et al., 2016), marking a fundamental shift in the way we census forests. To facilitate this transition, we aim to develop allometric equations for estimating a tree's diameter and aboveground biomass based on attributes which can be remotely sensed, namely tree height and crown diameter, enabling airborne imagery to be fully integrated into existing carbon monitoring programmes (Fig. 1).
While ALS opens the door to rapidly and accurately measuring the height and crown dimensions of millions of trees (Duncanson et al., 2015), it also poses the challenge of how best to use these data to estimate aboveground biomass. Current allometries rely on stem diameter as a key input for estimating biomass (e.g. Chave et al., 2014). But because diameters cannot be measured directly through ALS, new approaches that have tree height and crown dimensions at their centre are needed. We see two possible solutions for integrating tree-level ALS data into biomass monitoring programmes: the first is to use tree height and crown dimensions to predict diameters, allowing biomass to be estimated using existing allometric equations (Dalponte & Coomes, 2016). The second is to develop equations that estimate biomass directly from tree height and crown size, thereby bypassing diameter altogether.

Approach 1: estimating diameter
Theory based on the mechanical and hydraulic constraints to plant growth predicts that tree height (H, in m) should scale with diameter (D, in cm) following a power-law relationship with an invariant scaling exponent of 2/3 ($H \propto D^{2/3}$; West et al., 1999). This would suggest that measuring tree height should be sufficient for estimating diameter. However, growing evidence indicates that this is unlikely to be the case (Muller-Landau et al., 2006): not only do H-D allometries vary considerably among and within species, as well as in relation to climate and stand structure (Banin et al., 2012; Lines et al., 2012; Hulshof et al., 2015; Jucker et al., 2015), but power-law relationships also fail to adequately capture the asymptotic nature of height growth (Muller-Landau et al., 2006; Banin et al., 2012; Feldpausch et al., 2012; Iida et al., 2012; Chave et al., 2014). Trees typically invest heavily in height growth when young to escape shaded understories, rapidly approaching their maximum height, but then continue to grow in diameter throughout their lives (King, 2005). This makes estimating the diameter of large trees challenging, as trees of similar height can have very different diameters, which is problematic given that large-diameter trees hold most of the biomass (Slik et al., 2013; Bastin et al., 2015). In this context, information on crown size may prove key to accurately estimating a tree's diameter. While height growth tends to slow rapidly in large trees, lateral crown expansion does not, requiring a continued investment in stem growth on the tree's part to ensure structural stability and hydraulic function (Sterck & Bongers, 2001; King & Clark, 2011; Iida et al., 2012). As a result, crown width and stem diameter tend to be strongly coupled, even in large trees (Hemery et al., 2005).

Approach 2: estimating aboveground biomass
Estimating the diameter of individual trees from remotely sensed data is an appealing prospect: not only would it provide a way to quantify biomass stocks, but would also allow other forest attributes of interest to be reconstructed with ease (e.g. stem diameter distributions). However, it also presents a challenge from the point of view of biomass estimation, as biomass allometries typically have diameter as a squared term in the equation (Zianis et al., 2005;Chave et al., 2014;Chojnacky et al., 2014), meaning that even small errors in diameter predictions can strongly influence the accuracy of biomass estimates. A better approach may therefore be to estimate a tree's aboveground biomass directly from crown architectural properties which can be measured from airborne imagery, without the need to first predict diameter. Specifically, both tree height (Hunter et al., 2013;Chave et al., 2014) and crown dimensions (Henry et al., 2010;Goodman et al., 2014;Ploton et al., 2016) are known to relate strongly to aboveground biomass, although it remains to be tested whether they can be used to accurately estimate biomass without needing to also account for stem diameter. State-of-the-art algorithms that detect and measure individual tree crowns from ALS point clouds are combined with existing field data to estimate the diameter and aboveground biomass of remotely sensed trees.
Here we compile a global data set consisting of 108753 trees for which stem diameter, height and crown diameter have all been measured, including 2395 trees which have been harvested to measure aboveground biomass. The data set is representative of the world's major tree-dominated biomes and spans a huge gradient in tree size (Fig. 2). We use these data to develop allometric equations that enable the precise and unbiased estimation of a tree's diameter and aboveground biomass based on its height and horizontal crown dimensions, and use the following questions to guide the process: (i) Can a tree's diameter be estimated accurately based on its height alone, or do we also need to account for its crown dimensions? (ii) Can a single universal equation be used to model diameter, or do different scaling relationships among forest types, biogeographic regions and tree functional types need to be accommodated? (iii) Can a tree's aboveground biomass be estimated directly from its height and crown diameter, thereby eliminating the need to first predict its diameter?
Fig. 2 Overview of the allometric database. Panel (a) shows the geographic coverage of the database in relation to the world's biomes (map adapted from Olson et al., 2001). Circle size reflects the number of trees measured at each location (on a logarithmic scale). Panel (b) highlights differences in mean annual precipitation and temperature among forest types. Climate data were obtained from the WorldClim database (Hijmans et al., 2005), which consists of gridded annual mean values covering the period between 1950 and 2000 (data available from: http://www.worldclim.org/current). In (c) violin plots show the size distribution (in terms of diameter and aboveground biomass) of trees in the database. The number of records available for each forest type is displayed on the right.

Allometric database
We compiled a global database of trees for which stem diameter (D, in cm), height (H, in m) and crown diameter (CD, in m) were all measured. Trees were selected for inclusion in the database based on the following criteria: (i) only trees with D ≥ 1 cm and H ≥ 1.3 m were considered; (ii) trees from managed plantations and agroforestry systems were excluded; (iii) trees known or presumed to be severely damaged were removed (e.g. broken stems or major branches; see Fig. S1); (iv) only trees whose geographic location was recorded were retained; and (v) from a taxonomic perspective trees had to, at a minimum, be identifiable as either angiosperms or gymnosperms (note that tree ferns and palms were excluded from the analysis). Our search yielded a total of 108753 trees which met the above requirements. For 2395 of these, total oven-dry aboveground biomass (AGB, in kg) was additionally measured by harvesting and weighing trees. The database spans a large range of tree sizes (D: 1.0-293.0 cm; H: 1.3-72.5 m; CD: 0.1-41.0 m; AGB: 0.1-76063.5 kg), captures a wide spectrum of tree forms and functional types (1492 tree species from 127 families) and covers the major forest types and climatic conditions found in the world's forests (see Fig. 2 for an overview of the database). A full list of data sources and associated measurement protocols is provided in Appendix S1 of Supporting Information. The database is publicly available through figshare (https://dx.doi.org/10.6084/m9.figshare.3413539.v1), with data from the Alberta Permanent Sample Plots (https://www.agric.gov.ab.ca/app21/forestrypage) and the International Cooperative Programme on Air Pollution Effects on Forests (http://icp-forests.net/page/data-requests) archived separately and available upon request through the above links.

Forest biome classification
Scaling relationships between D, H and CD are strongly influenced by climate (Lines et al., 2012; Hulshof et al., 2015), as well as varying among species (Poorter et al., 2006) and geographic regions (Banin et al., 2012). To capture this degree of variation, which we expect to be of key importance to accurately estimating both D and AGB, each tree in the database was assigned to one of five biome types based on its geographic location: boreal forests, temperate coniferous forests, temperate mixed forests, woodlands and savannas (which combines temperate and tropical savannas, as well as Mediterranean woodlands) or tropical and subtropical forests (biome classification follows Olson et al., 2001). In the same way, trees were also assigned to one of six biogeographic regions: Australasia, Afrotropics, Nearctic, Indo-Malaya, Neotropics or Palearctic. Transitions among forest biomes reflect strong climatic gradients (Whittaker, 1975; Stephenson, 1998; Fig. 2b), whereas biogeographic realms define regions which share a common evolutionary history (Udvardy, 1975). Olson et al.'s (2001) map of the world's terrestrial ecoregions, which defines the geographic distribution of the world's major biome and biogeographic regions, is available for download from http://www.worldwildlife.org/publications/terrestrial-ecoregions-of-the-world.
Approach 1: estimating diameter
Model development. To determine how to most accurately estimate a tree's diameter based on its crown architectural properties, we compared a set of regression models in which D was expressed as a function of either H, CD or the compound variable H × CD (which tests whether both height and crown size are needed to predict D). We chose to model the combined effect of H and CD using a compound variable (as opposed to including the two predictors separately in the model) to avoid issues with collinearity resulting from the nonindependence of H and CD (Dormann et al., 2013). Furthermore, preliminary analyses revealed that H × CD was as good a predictor of D as (if not better than) a model with H and CD as separate explanatory variables (Table S2).
Typically, allometric equations are derived by fitting a linear regression directly to raw data (which in most cases have been log-transformed). Yet this approach will tend to underestimate the slope of a bivariate line when the independent variable is measured with error (also known as regression dilution bias; Fuller, 1987; Warton et al., 2006). In the case of forest inventory data, this systematic bias is made worse by the inherently unbalanced size distribution of trees, as small stems, which vastly outnumber large ones, come to dominate the signal of the regression (Duncanson et al., 2015). As a solution to this problem, Duncanson et al. (2015) proposed fitting allometric models to binned data as opposed to raw values. Because this method reduces tree-level variation in allometric attributes to a mean value, it has the drawback of inevitably underestimating the true uncertainty of the model. However, a preliminary analysis of the data revealed it to be the only approach able to adequately capture underlying allometric scaling relationships (see Appendix S2 for a detailed discussion). As a compromise, we therefore chose to adopt Duncanson et al.'s (2015) binning method to estimate allometric relationships, but also develop a framework for robustly quantifying and propagating model uncertainty when working with binned data (see 'Model uncertainty and error propagation' section below).
We calculated the mean H, CD and H × CD for each of 50 stem diameter logarithmic bins of constant width (logarithmic binning was chosen to better capture the right-skewed distribution of D). Linear log-log models were then fit to the binned data using least-squares regression (as implemented in the R statistical software; R Core Development Team, 2013):
$$\ln(D) = \alpha + \beta \ln(H) + \varepsilon \quad (1)$$
$$\ln(D) = \alpha + \beta \ln(CD) + \varepsilon \quad (2)$$
$$\ln(D) = \alpha + \beta \ln(H \times CD) + \varepsilon \quad (3)$$
where α and β are parameters to be estimated from the data and ε is an error term [which is assumed to be normally distributed, with a mean of zero and a standard deviation σ, i.e. N(0, σ²)]. Models 1-3 can be thought of as global allometric equations, as they assume that scaling relationships between D, H and CD are invariant across forest types, biogeographic regions and tree functional groups (e.g. angiosperms and gymnosperms). To determine the extent to which regional or group-specific allometries improve the accuracy of D estimates compared to those of a global model, we used mixed-effects models to develop two further equations. First, the relationship between D and the independent variable (e.g. H × CD) was allowed to vary among forest types nested within biogeographic regions (i.e. random intercept and slope model, where forest type and biogeographic region were treated as nested random effects). In the second model, the relationship between D and the independent variable was further allowed to vary among angiosperm and gymnosperm trees (i.e. separate α and β estimates were calculated for each functional group/forest type/biogeographic region combination). Note that to fit these models, the data binning process was repeated and separate mean values of H, CD and H × CD were calculated for each combination of functional group, forest type and biogeographic region.
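The binning-and-fitting step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code (their R implementation is in Appendix S5): the function name `fit_binned_loglog` is hypothetical, the diameter assigned to each bin is assumed to be the mean D of the trees falling in it, and the least-squares fit uses the closed-form solution for a single predictor.

```python
import math
from collections import defaultdict

def fit_binned_loglog(d, x, n_bins=50):
    """Fit ln(D) = alpha + beta * ln(x) to bin means.

    d: stem diameters (cm); x: predictor such as H * CD.
    Trees are grouped into n_bins bins of constant width on ln(D).
    Assumption: each bin is represented by its mean D and mean x.
    Returns (alpha, beta, sigma), sigma being the residual SD of the binned fit.
    """
    log_d = [math.log(v) for v in d]
    lo, hi = min(log_d), max(log_d)
    if hi == lo:
        raise ValueError("diameters must span a range")
    width = (hi - lo) / n_bins
    bins = defaultdict(list)
    for ld, dv, xv in zip(log_d, d, x):
        k = min(int((ld - lo) / width), n_bins - 1)  # largest tree goes in the last bin
        bins[k].append((dv, xv))
    # One (ln mean x, ln mean D) point per non-empty bin
    pts = []
    for members in bins.values():
        mean_d = sum(m[0] for m in members) / len(members)
        mean_x = sum(m[1] for m in members) / len(members)
        pts.append((math.log(mean_x), math.log(mean_d)))
    n = len(pts)
    if n < 3:
        raise ValueError("need at least three non-empty bins")
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    beta = sxy / sxx
    alpha = my - beta * mx
    rss = sum((p[1] - (alpha + beta * p[0])) ** 2 for p in pts)
    sigma = math.sqrt(rss / (n - 2))
    return alpha, beta, sigma
```

Because every bin contributes a single point, the many small stems no longer dominate the fit. The returned `sigma` describes scatter among bin means only, so it should not be read as tree-level uncertainty.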
Generating predictions. Allometric models, such as those described above, can be used to estimate D for any tree whose H and CD are known. Using Model 3 as an example, predicted diameter values (D_pred) are obtained as follows:
$$D_{pred} = \exp[\alpha + \beta \ln(H \times CD) + \varepsilon] = \exp[\alpha + \beta \ln(H \times CD)] \times \exp(\varepsilon)$$
Assuming ε is normally distributed [i.e. N(0, σ²)], the mean of exp(ε) can be approximated by exp(σ²/2), where σ² is the mean square error of the regression (Baskerville, 1972). An unbiased estimate of D can therefore be calculated using the following equation:
$$D_{pred} = \exp[\alpha + \beta \ln(H \times CD)] \times \exp(\sigma^2/2) \quad (4)$$
Model validation. To evaluate and compare the predictive accuracy of the different D models, we: (i) divided the database into a training set (90% of the data) and a validation set (remaining 10% of the data, used exclusively to evaluate model performance). Trees assigned to the validation data set were selected following a size-stratified random sampling approach which aimed to capture the full range of D in the database; (ii) D models were fit to the training data set using the binning approach described above; (iii) fitted equations were used to predict D for all trees in the validation data set [as outlined in Eqn (4)]; and (iv) the predictive error of each model was quantified by comparing predicted and observed D values (D_pred and D_obs, respectively) of trees in the validation data set (see below for a description of the model performance metrics used). Steps (i-iv) were repeated 100 times to avoid the randomization procedure in step (i) having an undue effect on the model evaluation process. For each D model we calculated two measures of average error: the root mean square error (RMSE, in cm) and the relative systematic error (or bias, in %).
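As an illustration of the prediction and evaluation steps, the sketch below applies the back-transformation of Eqn (4) and computes the two error measures. The function names are hypothetical, and the bias is assumed here to be the mean of (predicted − observed)/observed expressed as a percentage; the text does not spell out the exact formula.

```python
import math

def predict_diameter(h, cd, alpha, beta, sigma):
    """Back-transformed diameter (cm) from height (m) and crown diameter (m).

    alpha, beta: intercept and slope of the log-log model;
    sigma: residual SD used in the exp(sigma^2 / 2) correction.
    """
    return math.exp(alpha + beta * math.log(h * cd)) * math.exp(sigma ** 2 / 2)

def rmse(pred, obs):
    """Root mean square error, in the units of the inputs."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def relative_bias(pred, obs):
    """Mean relative error in % (assumed definition of systematic bias)."""
    return 100.0 * sum((p - o) / o for p, o in zip(pred, obs)) / len(obs)
```

In a validation loop these would be applied to the held-out 10% of trees on each of the 100 repetitions, with the resulting RMSE and bias values then summarised across repetitions.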
Additionally, a third model performance statistic was used to compare the predictive accuracy of the D models across functional groups (angiosperms and gymnosperms), forest types and biogeographic regions. Following the approach of Chave et al. (2014), we calculated the tree-level coefficient of variation (CV) in D for trees of functional group i, growing in forest type j and in biogeographic region k as follows:
$$CV_{ijk} = \frac{RMSE_{ijk}}{\bar{D}_{obs,ijk}}$$
where RMSE_ijk is the RMSE of trees belonging to functional group i, growing in forest type j and in biogeographic region k, whereas the denominator corresponds to the mean observed D for this same group of trees. Standardizing the RMSE by the mean D is a necessary step to compare model errors across functional groups, forest types or biogeographic regions, as errors in D are strongly dependent on tree size (Colgan et al., 2013).
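A grouped CV of this kind is straightforward to compute once each validation tree carries a group label. The sketch below assumes records of the form (group key, observed D, predicted D), where the key could be a (functional group, forest type, region) tuple; the function name and record layout are illustrative, and the result is expressed in % to match how CV values are reported in the text.

```python
import math
from collections import defaultdict

def cv_by_group(records):
    """Return {group_key: CV in %}, where CV = RMSE / mean observed D.

    records: iterable of (group_key, d_obs, d_pred) tuples.
    """
    groups = defaultdict(list)
    for key, d_obs, d_pred in records:
        groups[key].append((d_obs, d_pred))
    result = {}
    for key, pairs in groups.items():
        n = len(pairs)
        group_rmse = math.sqrt(sum((p - o) ** 2 for o, p in pairs) / n)
        mean_obs = sum(o for o, _ in pairs) / n
        result[key] = 100.0 * group_rmse / mean_obs
    return result
```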
Model uncertainty and error propagation. As discussed previously, while data binning is well suited to estimating average allometric scaling relationships, it inevitably underestimates the true variability in these relationships among individual trees. Specifically, the data binning approach will tend to underestimate σ (the residual standard deviation), which makes quantifying and propagating uncertainty a challenge.
In a linear modelling framework
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}$$
where n is the number of observations, y_i is the ith observation of the response variable, and ŷ_i is the corresponding predicted value obtained from the model. The reason why data binning generally underestimates σ is that the difference between observed and predicted values (i.e. the residuals, y_i − ŷ_i) is calculated not for individual trees, but for mean values obtained by averaging across multiple trees. However, using an independent data set (the 10% of trees set aside for model validation), we can compare predicted and observed estimates of D generated for individual trees to get a much more realistic estimate of the true value of σ for a given model (which we refer to as σ_v), calculated as the standard deviation of ln(D_obs) − ln(D_pred) across trees in the validation data set. Using this simple approach, we were able to generate realistic estimates of the predictive uncertainty of models fit using the data binning method (see Fig. S3). To enable users to robustly propagate uncertainty when using the equations developed here, we report σ_v values for all fitted models. Furthermore, in Appendix S5 we provide R code for replicating the entire analysis.
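To make the idea concrete, the sketch below estimates a validation-based residual standard deviation from individual trees and then uses it to simulate plausible diameters for a tree of known height and crown diameter. Both function names are hypothetical. Using the sample standard deviation of the log-scale differences is an assumption, as is propagating uncertainty by drawing normal errors on the log scale; the authors' own procedure is in their Appendix S5.

```python
import math
import random
import statistics

def sigma_v(d_obs, d_pred):
    """Sample SD of ln(D_obs) - ln(D_pred) over individual validation trees."""
    diffs = [math.log(o) - math.log(p) for o, p in zip(d_obs, d_pred)]
    return statistics.stdev(diffs)

def simulate_diameter(h, cd, alpha, beta, sig_v, n_draws=1000, seed=0):
    """Draw diameters (cm) as exp(alpha + beta * ln(H * CD) + e), e ~ N(0, sig_v^2)."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    mu = alpha + beta * math.log(h * cd)
    return [math.exp(mu + rng.gauss(0.0, sig_v)) for _ in range(n_draws)]
```

Feeding each simulated diameter through a biomass equation and summarising the spread of the results is one simple way to carry tree-level uncertainty forward to plot-level carbon estimates.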
Scaling-up from diameter to aboveground biomass.
Approach 1 aims to predict D from crown attributes, with the idea that D estimates can then be fed into existing biomass equations. To quantify the extent to which replacing field-measured D values with predicted ones influences the accuracy of AGB estimates, we used Chave et al.'s (2014) general biomass equation as a baseline. In Chave et al. (2014), AGB is expressed as the following function of D, H and wood density [ρ, in g cm⁻³; which we obtained from the global wood density database of Chave et al. (2009) and Zanne et al. (2009)]:
$$AGB = 0.0673 \times (D^2 \times H \times \rho)^{0.976} \times \exp[0.357^2/2]$$
Using this equation, we estimated AGB for trees in the database with a known biomass (i.e. trees that had been destructively harvested and weighed) using both field-measured and predicted D values as inputs to the biomass model. Only trees with D ≥ 5 cm were used for this purpose (n = 1859 trees with field-measured AGB), as trees smaller than this threshold contribute negligibly to forest carbon stocks and were not used to calibrate Chave et al.'s (2014) equation. By comparing observed AGB values with those predicted using Chave et al.'s (2014) equation, we were then able to determine whether the underlying D models described previously can be used to generate accurate biomass estimates. Additionally, this also allowed us to compare the predictive accuracy of approaches 1 and 2, the latter of which aims to estimate AGB directly from H and CD (see following section).
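The biomass equation quoted above translates directly into a small function, shown here as a sketch with a hypothetical name. The coefficients are those given in the text; the result is taken to be in kg, the unit used for AGB in the database.

```python
import math

def agb_from_diameter(d_cm, h_m, wood_density):
    """AGB from D (cm), H (m) and wood density (g cm^-3),
    using the equation as written in the text."""
    return 0.0673 * (d_cm ** 2 * h_m * wood_density) ** 0.976 * math.exp(0.357 ** 2 / 2)

# The same function serves both comparisons: pass a field-measured D,
# or a D predicted from H and CD, and compare the two AGB estimates.
agb_field = agb_from_diameter(30.0, 25.0, 0.6)
agb_pred = agb_from_diameter(33.0, 25.0, 0.6)  # hypothetical 10% overestimate of D
```

Because D enters as a squared term, the 10% overestimate of D in this made-up example inflates AGB by roughly 20%, which is the sensitivity that motivates Approach 2.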

Approach 2: estimating aboveground biomass
Instead of estimating D first, a better approach to predicting the biomass of individual trees from crown architectural attributes might be to relate AGB directly to H and CD. To test this, we used data for trees with measured AGB to explore a number of alternative models relating AGB to H and/or CD. Preliminary analyses revealed the compound variable H × CD to be a far superior predictor of AGB than either H or CD alone. We therefore focus on the following log-log regression model of AGB:
$$\ln(AGB) = \alpha + \beta \ln(H \times CD) + \varepsilon \quad (5)$$
Model development and validation followed the same steps described for Approach 1. As for previous equations, the model was fit to binned mean values of H × CD (as opposed to raw data). To allow a comparison with Approach 1, only trees with D ≥ 5 cm were used to develop the model. We further tested whether modelling angiosperms (n = 1069) and gymnosperms (n = 790) separately would improve model accuracy, as these two functional groups differ strongly in crown architecture (Hulshof et al., 2015) as well as wood density (Chave et al., 2009). Given the relatively small number of trees with measured AGB values, we did not explore the extent to which the relationship between AGB and H × CD varies among forest types or biogeographic regions. The predictive accuracy of Eqn (5) was compared against that of AGB models which include D as a predictor (i.e. Approach 1) on the basis of RMSE and bias.

Approach 1: estimating diameter
Of the candidate models we tested for estimating D, ones relying on H or CD alone as predictors of D proved unsuitable. Despite exhibiting relatively low RMSE (13.7 cm), a height-only model tended to systematically overestimate D (bias = 24.7%). This occurred because D-H relationships were nonlinear on a log-log scale, as H tended to asymptote in large trees. As a result, a power-law tended to overestimate D for small and medium-sized trees, while severely underestimating that of large ones (Fig. S4). Conversely, a model with only CD as a predictor of D had higher RMSE (16.6 cm), but showed lower overall systematic bias (−4.5%). However, the average bias masks a tendency of the crown diameter-only model to overestimate D for large trees, while underpredicting the size of smaller stems (Fig. S4). In contrast to the previous two models, H × CD proved a much better predictor of D (Fig. 3). The best-fit global D model, Eqn (6), had both lower RMSE (9.7 cm) and average systematic bias (−1.2%) compared to models based on H or CD alone. Importantly, the model showed no evidence of over- or underpredicting D across a wide range of tree sizes (Fig. 3b). Using the independent validation data set, we estimated σ_v [i.e. the standard deviation of ln(D_obs) − ln(D_pred)] of the model to be 0.45.
While the global D model presented in Eqn (6) was able to produce unbiased estimates of D across a wide range of species, climate zones and tree sizes (Fig. 3), scaling relationships between D and H × CD did vary among both forest types and functional groups (Fig. 4). Incorporating these differences in the modelling process further improved the precision of D estimates (Fig. 5 and Table S2). In particular, accounting for the different scaling relationships of angiosperms and gymnosperms reduced the RMSE of the model to 8.1 cm, the average CV to 35.8% (from 43.3% in the global D model), and σ_v to 0.35 (Table S2). These gains in precision were especially evident when attempting to predict D for angiosperm trees in boreal and temperate coniferous forests, which tend to be dominated by gymnosperms (Fig. 5b). A full list of group-, forest type- and region-specific D equations is provided in Appendix S4.
Approach 2: estimating aboveground biomass
AGB was strongly related to H × CD, with a linear log-log relationship holding across more than six orders of magnitude variation in tree mass (Fig. 6). Scaling relationships between AGB and H × CD varied consistently among functional groups, with gymnosperms exhibiting higher scaling constants (α = 0.109 vs. 0.016) but smaller scaling exponents (β = 1.790 vs. 2.013) compared to angiosperm trees (Fig. 6). The best-fit AGB model which accounted for different scaling relationships among angiosperms and gymnosperms is given by Eqn (7), in which α_G and β_G are functional group-dependent parameters which represent the difference in the scaling constant α and scaling exponent β between angiosperm and gymnosperm trees. For gymnosperms, α_G = 0.093 and β_G = −0.223, whereas for angiosperms both parameters are set to zero. The estimated σ_v of the model was 0.69.
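The group-dependent parameters reported above can be turned into a worked example. The sketch below assumes the model takes a power-law form, AGB = (0.016 + α_G) × (H × CD)^(2.013 + β_G), which matches the scaling constants and exponents quoted for the two groups. The back-transformation correction factor is not recoverable from this text, so it is exposed as a `correction` argument that defaults to 1 and must be supplied by the user; the function name is hypothetical.

```python
def agb_from_crown(h_m, cd_m, gymnosperm=False, correction=1.0):
    """AGB (kg) from height (m) and crown diameter (m), assumed power-law form.

    gymnosperm: apply the group offsets alpha_G = 0.093 and beta_G = -0.223.
    correction: multiplicative back-transformation factor (assumption: not
    given in the text, so left at 1.0 unless the caller provides it).
    """
    alpha_g, beta_g = (0.093, -0.223) if gymnosperm else (0.0, 0.0)
    return (0.016 + alpha_g) * (h_m * cd_m) ** (2.013 + beta_g) * correction

# Example: a 25 m tall tree with an 8 m wide crown, for each functional group
angiosperm_agb = agb_from_crown(25.0, 8.0)
gymnosperm_agb = agb_from_crown(25.0, 8.0, gymnosperm=True)
```

With the offsets applied, the gymnosperm constant and exponent become 0.109 and 1.790, the values quoted in the text.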
Comparing approaches 1 and 2
AGB estimates obtained using Chave et al.'s (2014) biomass equation and field-measured D values as inputs showed a close agreement with observed AGB values (RMSE = 0.86 Mg; Fig. 7a), but had a tendency to overestimate AGB (bias = 27.7%). As expected, replacing field-measured D values with ones predicted using the global D model [i.e. Eqn (6), corresponding to Approach 1] increased the RMSE of the model predictions to 1.78 Mg (Fig. 7b). However, the average systematic bias in the AGB predictions was little affected (bias = 30.1%, the overestimation arising from the use of the biomass function, not the global D model). This suggests that diameter estimates obtained using the global D model can be scaled up to biomass without introducing a systematic bias. In contrast to Approach 1, using Eqn (7) to estimate AGB directly from H × CD (i.e. Approach 2) resulted in substantially lower average bias in AGB estimates, regardless of tree mass (bias = −4.3%; Fig. 7c). Furthermore, Approach 2 had the advantage of reducing the RMSE of the model predictions to 1.70 Mg.

Discussion
We developed general allometric models for estimating both the stem diameter and aboveground biomass of trees based on crown architectural properties which can be remotely sensed: tree height and crown diameter. Here, we discuss how these allometric models can be used to integrate remote sensing imagery, particularly ALS data, into forest monitoring programmes, allowing carbon stocks to be mapped with accuracy across forest landscapes and shedding light on the processes which govern the structure and dynamics of forest ecosystems.

Stem diameter allometries for remote sensing imagery
We found that estimating stem diameter required accounting for both height and crown size, the latter of which proved essential for differentiating between trees of similar height but having substantially different trunk sizes (King, 2005; King & Clark, 2011). Using a simple metric which combines these two allometric dimensions, H × CD, we were able to derive a global equation for estimating stem diameter which proved robust across a large range of tree sizes, forest types and tree species (Fig. 3).
Our results highlight how allocation to height growth and lateral crown expansion are strongly coordinated in trees (Sterck & Bongers, 2001; King, 2005; Iida et al., 2012) and illustrate how these developmental constraints can be exploited for the purposes of estimating stem diameter. While we did find that a single allometric function can be used to estimate diameter without introducing systematic bias, incorporating different scaling relationships among forest types, biogeographic regions and functional groups into the models helped improve the predictive accuracy of the allometric equations (Figs 4 and 5; Table S2). Particularly important in this respect was accounting for differences between angiosperms and gymnosperms (Fig. 5b). This is not surprising given the contrasting crown architecture of these two groups: gymnosperms generally exhibit strong apical dominance and invest heavily in height growth, whereas angiosperm trees have a greater ability to plastically adapt the shape and size of their crown to suit their competitive environment (Hulshof et al., 2015). These differences in crown architecture, coupled with clearly distinct leaf biochemical profiles, also mean that angiosperm and gymnosperm trees can be easily distinguished using a variety of remote sensing products (e.g. aerial photographs, hyperspectral sensors and ALS; Dalponte et al., 2012). Consequently, we suggest that users select group-specific diameter equations (which we provide in Appendix S4) wherever possible, as these can be employed with little or no need for additional field data. As our ability to remotely map tree species improves (e.g. through the development of spectral libraries derived from hyperspectral sensors; Asner, 2013), it is conceivable that species-specific diameter equations could also be utilized in the future. Similarly, other aspects known to influence crown architecture (e.g. tree packing density; Jucker et al., 2015) could also be incorporated to further refine the models we develop here.
Fig. 4 Relationship between stem diameter and the product of tree height and crown diameter (H × CD). Panel (a) shows the distribution, on a logarithmic scale, of the raw data (in grey) and of the mean H × CD values in each diameter size class (black circles). Panel (b) illustrates fitted relationships between diameter and H × CD for each forest type separately, while (c) reports the slopes of these relationships (± 95% confidence intervals) for angiosperms and gymnosperms separately.
The diameter allometries we develop here open the door to a more general and robust framework for monitoring forest carbon stocks using ALS. Currently, the standard approach for estimating carbon stocks from ALS data involves calculating summary statistics from ALS point clouds for a given pixel of land (e.g. top canopy height) and relating these to carbon estimates obtained from permanent field plots in a regression framework (Asner & Mascaro, 2014). Despite recent attempts to generalize this 'area-based' approach (e.g. Asner & Mascaro, 2014), most models for estimating carbon stocks from ALS summary statistics are highly site-specific and can only be applied with confidence to the particular patch of forest they were calibrated for. Working at tree-level provides an intuitive solution to the issue of developing a general approach for mapping forest carbon stocks and would allow a direct comparison to field-based aboveground carbon estimates. This 'tree-centric' approach is not without its limitations, the biggest of which is the implicit assumption that individual trees can be reliably identified and measured from ALS point clouds (something which can be challenging in dense, multilayered canopies). However, recent years have seen substantial progress in this respect, as both ALS instruments and the algorithms used to delineate trees from ALS data have improved considerably (Popescu et al., 2003; Yao et al., 2012; Duncanson et al., 2014; Paris et al., 2016; Shendryk et al., 2016). For example, Paris et al. (2016) recently developed a segmentation method which was able to correctly delineate the crowns of 97% and 77% of canopy dominant and understorey trees, respectively, as well as accurately measuring the crown dimensions of all segmented trees. Equally promising is Shendryk et al.'s (2016) algorithm which segments trees from the bottom up (mimicking the approach used to process terrestrial laser scanning data; Calders et al., 2015).
As ALS technology continues to improve, 'tree-centric' carbon monitoring programmes are becoming not only feasible, but oftentimes preferable to traditional 'area-based' approaches (Duncanson et al., 2015; Dalponte & Coomes, 2016).

[…] Panel (c) corresponds to a model in which AGB is expressed directly as a function of tree height and crown diameter (i.e. Approach 2). For panels (a-c), the dashed line corresponds to a 1 : 1 relationship, while the solid line is a regression spline fit to the data points to highlight how predictive accuracy varies with tree size. The RMSE and bias of each set of predictions is reported in the lower right-hand corner. Panel (d) shows the probability density distribution of the absolute errors (i.e. AGB_pred − AGB_obs) for each AGB function.

In addition to mapping carbon stocks, characterizing the relationships between stem diameter and crown dimensions also has important implications for advancing our understanding of forest dynamics. The most obvious application of the diameter allometries developed here is for characterizing tree size distributions from airborne imagery, something which has proved challenging using traditional 'area-based' approaches (Maltamo & Gobakken, 2014). Tree size distributions are an emergent property of forest ecosystems, arising from demographic processes and competition for space among individual trees (Enquist et al., 2009; Kohyama et al., 2015), and are of key interest for understanding forest dynamics, structure and responses to disturbance (Coomes et al., 2003; Enquist et al., 2009). Intriguingly, recent work suggests that scaling relationships between diameter and crown size govern how trees utilize canopy space and compete for light, thereby having a direct influence on tree size distributions (Taubert et al., 2015; Farrior et al., 2016). ALS data, coupled with allometric equations for converting crown dimensions to diameter distributions, would allow us to empirically test this theory across large spatial scales and diverse forest types. In a similar vein, diameter allometries provide a simple solution for integrating ALS data into individual-based models of forest dynamics (e.g. Shugart et al., 2015), allowing these models to be more easily parameterized and validated.
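
As a rough illustration of how crown dimensions might be converted into a diameter distribution, the Python sketch below applies a power-law allometry of the form D = a × (H × CD)^b to a handful of trees and counts them by diameter class. Everything specific in it is an assumption made for illustration: the coefficients `A_PLACEHOLDER` and `B_PLACEHOLDER` are arbitrary stand-ins and not the values fitted in this study, the units (H and CD in metres, D in centimetres) are assumed, and the example trees are invented.

```python
from collections import Counter

# Hypothetical placeholder coefficients for D = a * (H * CD) ** b.
# These are NOT the fitted values from this study; substitute published ones.
A_PLACEHOLDER = 0.5
B_PLACEHOLDER = 0.8


def predict_diameter(height_m, crown_diameter_m, a=A_PLACEHOLDER, b=B_PLACEHOLDER):
    """Predict stem diameter (assumed cm) from height and crown diameter (assumed m)."""
    if height_m <= 0 or crown_diameter_m <= 0:
        raise ValueError("height and crown diameter must be positive")
    return a * (height_m * crown_diameter_m) ** b


def diameter_distribution(trees, class_width_cm=10.0):
    """Count trees per diameter class, keyed by the lower bound of each class."""
    counts = Counter()
    for height_m, crown_diameter_m in trees:
        diameter_cm = predict_diameter(height_m, crown_diameter_m)
        lower_bound = int(diameter_cm // class_width_cm) * class_width_cm
        counts[lower_bound] += 1
    return dict(sorted(counts.items()))


# Invented example of ALS-segmented trees as (height in m, crown diameter in m).
segmented_trees = [(10.0, 3.0), (18.0, 5.0), (25.0, 7.0), (32.0, 10.0), (40.0, 14.0)]
distribution = diameter_distribution(segmented_trees)
print(distribution)
```

In practice, trees missed by the segmentation algorithm (typically understorey stems) would leave the smallest diameter classes under-represented, so a distribution built this way is probably best read as describing the detectable part of the canopy.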

Estimating aboveground biomass from crown dimensions
Using the subset of trees that were destructively harvested and weighed, we showed that AGB was strongly related to tree height and crown size (Fig. 6). These results lend weight to recent reports which have highlighted how accounting for crown size can substantially improve AGB estimation, especially in the case of large trees where a considerable proportion of the biomass is stored in large branches (Henry et al., 2010; Goodman et al., 2014; Ploton et al., 2016). The strong link between crown dimensions and AGB has important implications for 'tree-centric' carbon mapping approaches, as it suggests that AGB can be estimated directly from remotely sensed measurements of tree height and crown width without needing to first predict diameter (Fig. 7c). This is particularly appealing as it reduces the number of steps in the AGB estimation process (each of which carries a certain degree of error) and also eliminates the need to select an equation from the literature for scaling from diameter to AGB.
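
The direct route from height and crown diameter to AGB can be sketched as a single power-law model, AGB = a × (H × CD)^b, fitted by least squares on log-transformed values. The Python sketch below is a minimal illustration under stated assumptions rather than the fitting procedure used in this study: the function names are hypothetical, the calibration data are synthetic and noise-free (generated from made-up coefficients a = 0.02 and b = 2), and the exp(σ²/2) term is the standard correction for back-transforming from the log scale, which is assumed here to be appropriate.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by ordinary least squares on log-transformed values.

    Returns (a, b, sigma), where sigma is the residual standard deviation
    on the log scale.
    """
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    rss = sum((v - (log_a + b * u)) ** 2 for u, v in zip(log_x, log_y))
    sigma = math.sqrt(rss / (n - 2))
    return math.exp(log_a), b, sigma


def predict_agb(height_m, crown_diameter_m, a, b, sigma=0.0):
    """Predict AGB, applying exp(sigma**2 / 2) to correct back-transformation bias."""
    return a * (height_m * crown_diameter_m) ** b * math.exp(sigma ** 2 / 2)


# Synthetic, noise-free calibration data from AGB = 0.02 * (H * CD) ** 2.
# Coefficients and tree sizes are made up purely for illustration.
trees = [(5.0, 2.0), (12.0, 4.0), (20.0, 6.0), (30.0, 9.0), (40.0, 12.0)]
size = [h * cd for h, cd in trees]
agb = [0.02 * s ** 2 for s in size]

a_hat, b_hat, sigma_hat = fit_power_law(size, agb)
agb_new = predict_agb(25.0, 8.0, a_hat, b_hat, sigma_hat)
```

Because only one model is fitted, the prediction carries a single residual term. A two-step route (height and crown diameter to diameter, then diameter to AGB) would compound the errors of both models.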
Our analysis revealed clear differences in the AGB scaling relationships of angiosperms and gymnosperms (Fig. 6), presumably reflecting differences in both crown architecture and wood density between these two groups (Chave et al., 2009; Poorter et al., 2012; Hulshof et al., 2015). It may well be that AGB scaling relationships also vary systematically among forest types or biogeographic regions and that accounting for these differences could further improve the predictive accuracy of the biomass allometries presented here. Unfortunately, the relatively modest sample size of trees with measured AGB at our disposal meant we were unable to robustly test these assumptions. Despite recent efforts to compile comprehensive allometric databases (e.g. Chave et al., 2014; Falster et al., 2015), the number of trees with measured AGB remains relatively small, geographically biased and heavily skewed towards smaller stems. This is even more so when attempting to find trees that have been felled and weighed and whose crown dimensions have also been recorded. Future studies developing AGB equations should take care to also record the crown dimensions of harvested trees (e.g. Henry et al., 2010; Goodman et al., 2014; Ploton et al., 2016). In this regard, perhaps the most promising solution for bolstering existing allometric databases is terrestrial laser scanning, which captures tree architecture in exquisite detail and provides a nondestructive method for accurately estimating AGB (Calders et al., 2015). Most importantly, this would provide access to biomass data for large trees (e.g. ≥10 Mg), which tend to be disproportionately rare in allometric databases, including the one we have assembled here (only 2.4% of measured trees had a mass ≥10 Mg; see Fig. 2c).

Seeing the forest and the trees
Accurate assessments of forest carbon stocks are essential for initiatives to mitigate climate change, such as the UN's programme for reducing emissions from deforestation and forest degradation (REDD+), to be implemented successfully (Agrawal et al., 2011). Yet monitoring carbon stocks across large and sometimes remote areas of forest poses a real challenge, particularly in countries where national-scale forest inventory programmes are not in place. In this context, remote sensing technologies such as ALS promise to revolutionize the way we census forests. It is our hope that the allometric equations developed here can help us move towards a more general and robust approach for monitoring forests from the air.

Acknowledgements
[…] and Maiza Nara for providing us access to data from the Sustainable Landscapes Brazil project; Kristina Anderson-Teixeira and her co-authors for archiving allometric data from the CTFS-ForestGEO forest dynamics plot at the Smithsonian Conservation Biology Institute (Virginia, USA); and KaDonna Randolph of the USDA Forest Service for her assistance in accessing the Forest Health Monitoring (FHM) database. We thank Bruno Hérault and an anonymous reviewer for their thoughtful and constructive comments on an earlier draft of our manuscript. T.J. was funded by NERC (grant number: NE/K016377/1). This work has benefited from ANR grants to J.C. (

Statement of authorship
T.J. and D.A.C. designed the study, with the assistance of J.P.C. and J.C.; T.J. and J.P.C. compiled the database, with all authors providing data; T.J. analysed the data and wrote the first draft of the manuscript, with all authors contributing substantially to revisions.

Supporting Information
Additional Supporting Information may be found in the online version of this article:
Appendix S1. Database generation
Table S1. List of data sources included in the database
Figure S1. Height–stem diameter ratio and crown diameter–stem diameter ratio as a function of tree size for each forest type.
Appendix S2. Data binning
Figure S2. Relative error as a function of tree size for different modelling approaches used to relate stem diameter to the product of tree height and crown diameter
Figure S3. Prediction intervals for the relationship between stem diameter and the product of tree height and crown diameter.
Appendix S3. Diameter model comparison
Table S2. Comparison of the predictive accuracy of allometric models.
Figure S4. Relative error as a function of tree size for different diameter models.
Appendix S4. Region-, forest type- and group-specific diameter equations
Figure S5. Scaling relationships between height and stem diameter and crown diameter and stem diameter for angiosperm and gymnosperm trees.
Figure S6. Effects of mean annual precipitation and temperature on allometric scaling relationships.
Table S3. Regional stem diameter allometries
Table S4. Regional stem diameter allometries for angiosperms and gymnosperms
Appendix S5. R code for implementing data binning approach