Horvitz-Thompson-like estimation with distance-based detection probabilities for circular plot sampling of forests

In circular plot sampling, trees within a given distance from the sample plot location constitute a sample, which is used to infer characteristics of interest for the forest area. If the sample is collected using a technical device located at the sampling point, e.g. a terrestrial laser scanner, all trees of the sample plot cannot be observed because they hide behind each other. We propose a Horvitz-Thompson-like estimator with distance-based detection probabilities derived from stochastic geometry for estimation of population totals such as stem density and basal area in such situation. We show that our estimator is unbiased for Poisson forests and give estimates of variance and approximate confidence intervals for the estimator, unlike any previous methods. We compare the estimator to two previously published benchmark methods. The comparison is done through a simulation study where several plots are simulated either from field measured data or different marked point processes. The simulations show that the estimator produces lower or comparable error values than the other methods. In the sample plots based on the field measured data the bias is relatively small - relative mean of errors for stem density, for example, varying from 0.3 to 2.2 per cent, depending on the detection condition - and the empirical coverage probabilities of the approximate confidence intervals are either similar to the nominal levels or conservative.


Introduction
Circular plot sampling is a commonly used method in forest inventory (Gregoire and Valentine, 2007). In this form of sampling, a location is selected as a centre point of a plot and the surrounding area within a given distance is observed from that point. The objective is to gather information on forest characteristics of interest, such as stem density N (the number of tree stems per unit area) and basal area G (the sum of the the cross-sections of stems at breast height per unit area). We focus on N and G in this study. If the sampling is performed visually by a person, they can move slightly from the sampling point to observe trees that are not visible from the sampling point because of some sort of an obstacle. However, if the sampling is performed with a technical device, such as a laser scanner, the device cannot usually be moved from its original position. Hence, these sampling methods are susceptible to errors caused by incomplete detection. Lately, terrestrial laser scanning (TLS) has gained popularity as a tool for circular plot sampling and it is topical to consider the detection problems in these sampling situations.
TLS is a method of close-range remote sensing where a LiDAR scanner is operated on ground level to produce a three-dimensional point cloud of the surroundings. Typically, the scanner remains stationary on a tripod and rotates 360 degrees in the horizontal plane while the laser scans in the vertical direction in some device dependent opening angle. In forestry, the strength of TLS is that accurate measurements of the forest structure below the canopy can be made. Methods for detecting individual tree objects from the point cloud have been developed (see e.g. Strahler et al., 2008;Lovell et al., 2011;Liang et al., 2012;Raumonen et al., 2015;Ravaglia et al., 2017), meaning that tree level information for forest inventory purposes can be gathered using TLS. In this study, we focus strictly on tree stems (or more specifically, their cross-sections at a certain reference height) and ignore other parts of trees.
TLS data collection can be performed either as a single-or a multi-scan setup. The difference between these two setups is that in single-scan one forest area is scanned only once, whereas in multiscan the same area is scanned from several different locations with overlapping scanning areas to produce a more accurate point cloud, having echoes from different sides of trees and possibly from all of the trees in the area. However, scanning from several locations is more time consuming and there is the added difficulty of combining the data from different scans. From estimation point of view, several scans in nearby locations is suboptimal compared to single scans from distant locations.
We focus on the single-scan case where individual tree stems have already been detected from the point cloud, meaning that at least locations and diameters at breast height (DBH) of these detected trees are available. Naturally, the locations and DBH will have estimation errors, but we shall not consider this error source. The problem with single-scan TLS that we are focusing on is that not all trees can be detected, which can lead to underestimation of forest characteristics of interest, such as stem density or basal area. There are different possible reasons for not detecting all of the trees. For example, the tree stems closer to the scanner produce nonvisible areas behind them so that trees further away from the scanner located in the nonvisible areas can not be seen ( Figure 1). Undergrowth, low branches or other objects can block the submitted laser pulse. Small trees far away from the scanner can produce very few laser returns. We consider here only the effect of tree stems producing nonvisible areas behind them; generalization to other obstacles is trivial.
Correcting the problem of nondetection in single-scan TLS has garnered some attention. The use of gap probabilities of a Poisson forest has been proposed as a means for correction (e.g. Lovell et al., 2011). Duncanson et al. (2014) and Astrup et al. (2014) proposed models for detection probability based on traditional distance sampling. Seidel and Ammer (2014) proposed a correction factor based on the nonvisible area of the TLS plot. Olofsson and Olsson (2018) used only the area visible to the scanner as a sampling window. Kuronen et al. (2019) improved on the work of Olofsson and Olsson (2018) by modifying the visible area based on what we call a detection condition, producing a weight for every tree that depends on its DBH and the detection condition: is the tree detected only if it is fully outside of the nonvisible area, or if the centre point is visible, or if any small visible part of stem is enough for detection, or something in between, a partial visibility? The premise for the work of Kuronen et al. (2019) was that the estimator of Olofsson and Olsson (2018) produced large underand over-estimation in Poisson forests when the detection condition was either full visibility or any visibility, respectively, and deduced that this was because the area from which trees could be detected in these cases was not the same as the area visible from the scanner. However, Kuronen et al. (2019) also found that their estimator has a positive bias in the Poisson process case. Kansanen et al. (2016) proposed estimators for stem density that correct a nondetection problem in the case of individual tree detection from airborne laser scanning (ALS) data. In ALS, unlike TLS, the scanner is operated from an aircraft and the laser measurements are attained from above, mainly from the forest canopy. They assumed that a tree could be left undetected if it was covered by the larger tree crowns. The most promising of the estimators was a Horvitz-Thompson-like estimator where the detection probability for each tree was calculated based on the planar set formed as a union of the projections of the larger tree crowns on ground that was morphologically transformed based on the detection condition. This method requires the maximum radii of detected tree crowns and the locations of their centres.
Here we take a similar approach for calculating detection probabilities in the TLS case, and more generally, any circular plot sampling case with similar problems with incomplete observation. Instead of the size of the tree crowns, we assume that there is a certain hierarchy on trees being left undetected based on their distances from the scanner. The detection condition is taken into account by a tuning parameter that controls a morphological transformation of the area visible, or nonvisible, from the scanning location. The detection probabilities are used in a Horvitz-Thompson-like estimator to produce estimates of a population total of interest, such as stem density or basal area. The main difference between our estimator and the estimator of Kuronen et al. (2019) is that our detection probabilities are distance-based, whereas their construction is area-based. Hence it could be easier to combine our method with other, possibly distance sampling based, methods that take into account some other effects on tree detection from TLS data that depend on the distance from the scanner. The performance of the estimator is compared with those of Olofsson and Olsson (2018) and Kuronen et al. (2019) in circular sampling plots that are either fully simulated from point process models or based on field measurements of diameters and locations of all trees of rectangular fixed-area plots. Three detection conditions are considered: full visibility, centre point visibility, and visibility of any part of the stem.

Methodology
We model the forest as a marked point process (see e.g. Illian et al., 2008) x i are the locations of the stem centres, generated by some spatial point process, d i are the DBH of the trees, governed by some distribution, and m i is a mark of interest related to tree i. If we are interested in a stem density estimateN, then m i = 1. If we are interested in total basal areaĜ, then m i is the basal area of tree i. Other marks, such as volume or biomass, are also possible. We observe n trees from a total of N in some window of interest W (i.e., fixed-area sample plot), which, without loss of generality, is centred at the origin of the plane.
We model the tree stems as discs B(x i , d i /2) centred at x i with radius d i /2. We assume that the trees can be ordered based on their distances from the origin r i − d i /2, where r i is the distance of the stem centre from the origin. In other words, the trees are ordered based on the shortest distance to their outer bark. Henceforth we assume that the index i is ordered, i = 1 being the tree closest to the origin, i = 2 being the second closest, and so forth. We also assume that no tree disc B(x i , d i /2) covers the origin.
We assume that the ability to detect tree i is related to the trees closer to the origin than tree i. A tree forms a subset of plane A i that can not be seen from the origin as a union of the stem disc B(x i , d i /2) and the area between the two tangent lines of the disc running through the origin (see Figure 1).
We suggest the following for the detection probability p of tree i. If i = 1 the tree is detected for sure, p(r 1 ) = 1. Otherwise, where L(∂B(o, r), X) is the total length of the arcs of a circle with radius r that is inside the set X and α is a tuning parameter that controls the morphological transformations ⊖ (erosion) and ⊕ (dilation).
The reasoning behind the detection probabilities is as follows. If we assume that a point is uniformly distributed on a circle of radius r, then the probability that it is located in some certain arc of length l is l/(2πr), the proportional length of the arc. Hence, the probability for a tree to be hidden can be calculated based on the lengths of arcs generated by the invisible areas A j , j < i. The probability of being detected can then be calculated as a probability of the complement event. The parameter α controls the detection condition. If α = 0, the centre point of the tree must be visible for detection (centre point visibility detection condition, abbreviated as centre in the tables and figures). If α = −1, the tree is hidden only if its disc is fully inside the nonvisible area. In other words, any visible part produces detection (any visibility detection condition, abbreviated as any). If α = 1, the tree must be fully visible, or fully outside the union of A j , for detection (full visibility detection condition, abbreviated as full). Examples of the nondetectable areas related to these detection conditions are presented in Figure 2.
With these detection probabilities a Horvitz-Thompson-like estimator for a population total of interest τ = N i=1 m i can be formed: I i is an indicator variable that equals 1 if tree i is detected and 0 if it is not detected. Hence, in practice only the detection probabilities and marks of detected trees are needed. However, we take into account the invisible areas produced by undetected trees when we calculate the detection  probabilities. That is, although a proper detection has not occurred, for example DBH estimation is not possible, these trees increase the nonvisible area and produce a nonzero term A j to ∪A j in Equation 1. Note that this is not a concern in the case of any visibility detection condition, as the nonvisible areas created by trees that are not detected are fully contained in the nonvisible areas created by the detected trees.

Unbiasedness of the estimator for homogeneous Poisson process
As a shorthand let us write p i for the probability of detection for tree i and S i for the parts of the origin-centred circle with radius r i that belong to the set from which tree i could not be detected, generated by the trees closer to the origin. Let us write S c i for the parts of the circle that do not belong to the aforementioned set, hence, the locations from which tree i could be detected. Let us also write |S i | for the proportional length of S i , so that Here I i is an indicator function of tree i belonging to S c i . When the homogeneous Poisson process is considered in polar coordinates the angles of the points are uniformly distributed, hence the distribution of a point at a distance r i is uniform on a circle of radius r i . From this uniformity it follows that the probability of a point hitting a certain subset of the circle is proportional to the length of that subset. Hence, But now our construction for the probabilities of detection is also p i = |S c i |, and it follows that hence the estimator is unbiased.

Estimated variance of the estimator
When the assumptions related to our Horvitz-Thompson-like estimator are met -the sequential nature of detection holds, the detection condition is true, and the data is for example generated by the homogeneous Poisson process -the detection probabilities we approximate in Equation 1 are the true detection probabilities and hence our estimator coincides with the Horvitz-Thompson estimator. An unbiased estimator for the variance of the Horvitz-Thompson estimator is known (see e.g. Thompson, 2012, p. 70): where p ij is the probability to include both i and j in the sample, or in our case, to detect both trees i and j. The indexing goes over the detected trees, not the whole population. Through conditional probability we can write p ij = p j|i p i , where p j|i is the probability of detecting j when i is detected. Because of our sequential construction the probability of detecting j always takes into account the fact that i has been detected for j > i. Hence, p j|i = p j and p ij = p i p j , and furthermore The estimated variance makes it possible to produce approximate confidence intervals (Thompson, 2012, p. 70) τ ± t α 2 ,n−1 var( τ ), 6 where t α 2 ,n−1 is the critical value corresponding to the selected nominal confidence level from the t-distribution with n − 1 degrees of freedom. The t-distribution can be replaced with a standard normal distribution if the number of detected trees is large. We follow the pragmatic rule given by Thompson (2012) and use the t-distribution if the number of detected trees is less than 50. We note that a large portion of the simulated circular plots we use here for testing have less than 50 detected trees.

Comparison to the estimator of Kuronen et al.
The estimator of Kuronen et al. (2019) is of the same general form as ours (Equation 2). However, instead of weighting the detections with detection probabilities p α (r i , d i ) which we have formulated in Equation 1, the detections are weighted with weights w i that correspond to surface areas of the visible (or nonvisible) planar set that has been morphologically transformed according to the detection condition. Using the notation that we have presented the weights can be written as corresponding to the detection conditions of any, centre and full visibility. Kuronen et al. (2019) did not include the parameter α to their construction, but considered the detection conditions as totally separate cases. They did, however, consider a different kind of a weighting for the partial visibility detection condition, related to the visible proportion of the boundary of the stem disc, whereas in our construction partial visibility cases are handled by setting α to a value that differs from −1, 0 and 1 and are related to the proportion of visible stem disc radius.
Comparison of our weights in Equation 1 and the weights of Kuronen et al. (2019) in Equation 4 shows that the main difference between the two constructions is the sequential nature of our detection probabilities: in our construction only the trees that are before tree i in the ordering affect the weight of tree i, whereas in the construction of Kuronen et al. (2019) all of the trees, including tree i, have an effect. The premises of the constructions are also different. Whereas we approach the situation from a probabilistic perspective, Kuronen et al. (2019) are following the idea that a tree that is detected should belong to such a sampling window from which it can be detected, and hence, the size of the sampling window should change for every tree. Kuronen et al. (2019) was not able to show unbiasedness of their estimator -in fact, they showed that the estimator has positive bias even in the case of Poisson process data -and they did not present an estimator for the variance either.

Characterizing deviations from the Poisson assumption
As our estimator can be shown to be unbiased when the data is generated by a Poisson process, it is of interest to see how the bias of the estimator behaves when the spatial structure of the data deviates from this Poisson assumption to either clustering or regularity, and if the magnitude of this deviation has an effect on the magnitude of the bias. We use the L-function, derived from the K-function as to characterize the spatial structure of a point pattern in a plot (see e.g. Illian et al., 2008, Chapter 4.3). The K-function is related to the expected number of points of a point process that are closer than distance r to a point of the process. The L-function of the Poisson process is known: L(r) = r. For a clustered process L(r) > r, and for a regular process L(r) < r.
We estimated the L-functions for all the simulated plots and point patterns by using the R package spatstat (Baddeley et al., 2015) with isotropic edge-corrections (see e.g. Illian et al., 2008, Section 4.2.2) with r ranging from 0 to 5 metres. If the point pattern contained only one point, a case where L-function can not be estimated, we took the convention of characterizing it as coming from a Poisson process and set L(r) = r. To produce measures of deviation from the Poisson process, we used signed maximum deviation between L and the identity line. First find the distance at which maximum absolute deviation occurs, r * = arg max r∈ [0,5] |r −L(r)|.
Then, the L-based measure of deviation is given by producing positive values if the pattern shows stronger signs of regularity than clustering, and negative values if the pattern shows stronger signs of clustering than regularity. We also used this deviation measure to characterize whole data sets by first calculating the mean L-function over all n plots of the data asL and then calculating the deviation measure for the mean L-function. The mean L-function can be seen as an estimator of the expected L-function of the process that has generated the data, and as L(r) = r for Poisson process is also an expectation, this is quite natural way to characterize the deviation of a process from the Poisson process. The L-functions were estimated in R (R Core Team, 2019) with the function Lest in the R package spatstat (Baddeley et al., 2015).

Field data
We use field data measured in 2017 in Eastern Finland. It consists of 111 square 30 × 30 metre plots placed to the area of about 43,000 hectares. The plots were sampled from a systematic network using a priori information of stand maturity and dominant tree species. Tree species were determined and tree height and DBH were measured for all trees with a DBH ≥ 5 cm. Tree locations were determined using a variant of the triangulation method proposed by Korpela et al. (2007). Field measurements are documented in detail in Räty et al. (2019). The central 10×10 metre square of every plot was covered with a triangular grid where the distance between points was at least 1 metre. Every grid point was used as a centre for a circular sampling plot with radius 10 metres. If a tree stem covered the centre point, it was removed. This produced 111 × 126 = 13986 simulated plots using true field data. Statistics related to N and G in these plots are presented in

Process simulated data
Circular field plots of radius 10 metres were simulated from four different types of marked point processes: Poisson process, nonoverlapping discs process, Gibbs hard core process and Log-Gaussian Cox process (generation of point patterns from these processes is more thoroughly depicted in Section 3.2.2). Two different variants of the Gibbs hard core process and five different variants of the Log-Gaussian Cox process were simulated. In all cases, process intensities (N ) 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, and 5000 stems · ha −1 were used. These intensities were chosen to cover the variation in stem densities in natural forests and have been previously used by e.g. Olofsson and Olsson (2018). The number of plots generated was 10000 for the Poisson process, 1000 with every intensity, and 2000 per other point processes, 200 with every intensity. Special attention was given for the Poisson process because it is theoretically in line with the assumptions of our estimator and hence demonstrating the performance of the estimator in Poisson process data is of great importance. In all cases the point process was simulated in a larger simulation window to take into account possible effects at the plot boundary. The trees that did not come into contact with the plot boundary at 10 metres, and hence could not have any effect on the estimation, were removed. The number of points in the simulation 8 window was always first generated from a Poisson distribution with expected value N |W |, where |W | is the area of the simulation window. Only point patterns where the centre point of the plot was not covered by a stem disc were accepted. All simulations were done with R (R Core Team, 2019). Statistics related to the simulated N and G are presented in Table A.2 and example plots are presented in Figure A.1.

Generating tree diameters
An adjusted version of the parameter recovery method of Siipilehto and Mehtätalo (2013) was followed to produce Weibull distributions for the DBH. Mean DBH (D) and basal area (G) pairs of 6 cm and 3 m 2 · ha −1 , 12 cm and 12 m 2 · ha −1 , 15 cm and 20 m 2 · ha −1 , and 21 cm and 35 m 2 · ha −1 were taken as a starting point. Then, for shape parameter γ and scale parameter β the sum of differences between the means and quadratic means was minimized. This produced 40 different DBH distributions with expected basal areas almost exactly agreeing with the four given mean values of G and expected DBH differing from the given mean values of DBH to keep N and G compatible. In other words, for every intensity four different DBH distributions corresponding to different mean DBH and basal area were used to simulate the DBH marks for the trees. It should be noted that these intensities and distributions are parameters of the random processes and hence the simulated stem densities, mean DBH and basal areas will be different from these numbers and vary according to the nature of the process.

Generating marked point patterns
Poisson process (abbreviated as Poisson in the tables and figures) forests with complete spatial randomness of points and independent and identically distributed DBH were generated in a circular window of radius 11 metres, with the given intensities and DBH distributions. Nonoverlapping discs process (Nonoverlapping), producing point patterns where the stem discs can not overlap, was used to simulate forests with a more realistic hard core property. The first point location was generated uniformly in a circular window of radius 11 metres and mark was generated from the DBH distribution. For points 2, . . . , M a suggested point with a mark from the corresponding DBH distribution was generated uniformly in the same window. If the point was located in such a way that its stem disc did not overlap the previous discs it was accepted, otherwise it was rejected and the marked point was simulated again. At every step 2, . . . , M 10000 attempts to simulate were allowed. If the algorithm failed to simulate an acceptable point during these attempts the simulation was regarded as finished. Another type of hard core process where the points are at least 1 metre apart was simulated as a Gibbs hard core process (Gibbs 1 ). The points were first simulated in a 40 × 40 metre square window. The locations of points were simulated from a Gibbs process with known number of points (see e.g. Illian et al., 2008). Then an independent sample was taken from the DBH distribution to assign marks to the points. Similar scheme was used to simulate a Gibbs 1.5 metre hard core process (Gibbs 1.5 ).
Log-Gaussian Cox process (Cluster ) with Matern covariance function was used to produce plots with spatial clustering. The smoothness and variance parameters of the covariance function were fixed to 2 and 1, whereas the range was varied between 2, 4, 6, 8, and 10 metres, producing five different variants of a cluster process. The points were simulated in a 40 × 40 metre square window. The Gaussian field was simulated with the R package RandomFields (Schlather et al., 2019) to produce the density field over the window. The points were then generated according to that density with the spatstat function rpoint (Baddeley et al., 2015).

4 Evaluation of results
Relative root-mean-squared errors and relative means of errors whereŷ i is the estimated value, y i the true value,ȳ the mean of the true values and n the number of plots, were used to evaluate and compare the estimation errors between different estimators. The estimators that we benchmark our method against are the estimators of Olofsson and Olsson (2018) and Kuronen et al. (2019). These estimators are the same when the detection condition is centre point visibility. We also report the errors of estimation based on the detected trees with no corrections. It should be noted that both Olofsson and Olsson (2018) and Kuronen et al. (2019) assumed that the stem discs belong to the visible area, whereas we assume that they belong to the nonvisible area. To compare only the differences of distance-based and area-based construction on the estimation we wanted to keep all other effects the same and decided to modify these estimators to include the stem discs to the nonvisible area. This should not affect the results greatly; the important part is to have the detection conditions matching the nonvisible areas, in other words that the simulated trees have been correctly classified as either detected or not detected. In the result tables we refer to our estimator as "HT", the estimator of Olofsson and Olsson (2018)

Results
In the plots based on the field measured data, the HT-like estimator produces better or very similar error values in most cases (Table 1). The exception is the estimation of G when the detection condition is full visibility, where the estimator of Olofsson and Olsson (2018) produces lowest error values.  The results in the plots simulated from marked point processes are presented in Table 2 for the Poisson process data and Tables A.3, A.4, and A.5, for the others. When comparing the HT-like estimator and the estimator of Kuronen et al. (2019), the HT-like estimator produces better or very similar error values in data simulated from the Poisson process, Gibbs processes and the nonoverlapping discs model. Especially in the Poisson process case the ME% values of stem density estimation are very close to zero, which is consistent with the unbiasedness of the estimator for homogeneous Poisson process. In the clustered data, the estimator of Kuronen et al. (2019) produces lower error values than the HT-like estimator. Both of the estimators produce larger estimation errors in the clustered data than in the other types of data. Both of the estimators show overestimation in the regular data and underestimation in the clustered data. The estimator of Olofsson and Olsson (2018) does not show this behaviour, but shows overestimation with full visibility detection condition and underestimation with any visibility detection condition in all of the simulated data sets. For the Gibbs hard core data and full visibility detection condition it produces lower error values than the other two estimators, except for ME% of G. Figure 3 represents the relationship between bias and deviation of a data set from the Poisson process assumption. The ME% of the different data sets are plotted against the L-function based deviation measures calculated from the mean L-functions of the data sets. The Poisson process data, having deviation measure closest to zero, has ME% closest to zero, indicating unbiasedness. When the deviation measure grows larger, the errors start to also grow, showing that larger deviations to regularity produce larger overestimations. On the other hand, deviations to clustering produce underestimation, and the magnitude of deviation affects the magnitude of underestimation. The deviation and detection condition also have a connection, the any visibility detection condition results being least affected by the magnitude of clustering or regularity, and full visibility detection condition being affected the most. The field data based data set is somewhat regular, having L-function based deviation of roughly 0.5. It should be noted that although the data sets have "on average" a certain type of behaviour, e.g. heavily clustered or slightly regular, the plots in these data sets can have widely varying deviation measures (see Tables A.1 and A.2). However, the mean behaviour seems to have the largest effect on the bias of the estimator.
To examine the estimation of estimator variance and the goodness of the approximate confidence intervals, we calculated the 90, 95 and 99 per cent intervals for stem density and basal area in the simulated data sets with different visibility conditions and then calculated how many of the simulated stem densities and basal areas belonged to their corresponding confidence intervals. The results about stem density are presented in Table 3 and Table A.6. In the Poisson and nonoverlapping discs data sets the observed coverage probabilities of the confidence intervals are very close to the nominal confidence levels with all of the detection conditions tested. In the slightly regular field data set the percentages are mostly in line with the 99 per cent confidence level, but the 90 and 95 per cent intervals are slightly conservative. In the Gibbs hard core process data sets with strong regularity the visibility condition has a larger effect on the goodness of the interval estimates. This can be especially seen in the 90 per cent interval case, where under the full visibility condition only 87.1 per cent of the simulated stem densities belong to their corresponding intervals, whereas under the any visibility condition the value is 93.2 per cent, going from anti-conservative to conservative intervals. However, the 99 per cent intervals for Gibbs 1 metre hard core process have very similar nominal and observed coverage probabilities. For Gibbs 1.5 metre hard core process, most of the intervals are anti-conservative. In the clustered data sets, all of the intervals are anti-conservative, indicating that the variances have been underestimated in addition to the underestimation present in the estimated stem densities. The results relating to basal area estimation, presented in Table A.7, show similar effects than the stem density estimation results. The relationships of the estimated stem densities and their estimated standard errors to the simulated stem densities in the Poisson process data are presented in Figure 4. The estimated stand densities are well centred around the identity line, with more variance in the estimates as the simulated stem density increases. The estimation results get better when going from full visibility to centre point visibility, and further to any visibility detection condition, represented by the more concentrated point cloud. This effect is also shown by the incrementally smaller RMSE% values ( Table 2). The estimated standard errors increase as the simulated stand density increases, as does the variation in the estimated standard errors. The errors decrease when moving from the full visibility to any visibility detection condition.

Discussion
Our estimator has a clear connection to distance sampling (see e.g. Buckland et al., 2004), which is commonly used in estimating for example sizes of wild animal populations from a sample of observations at different distances from the observer. Most commonly a parametric model describing the probability to detect an animal at a certain distance is fitted to the sample data, and then a Horvitz-Thompson-like estimator is used, either with the fitted probabilities for each observation depending on its distance, or the mean probability for every observation. Fitting of a parametric detection probability model is also the approach of Duncanson et al. (2014) and Astrup et al. (2014) to estimating population totals from TLS data in forestry. Our construction could be seen as a nonparametric distance sampling estimator -provided that the value of the tuning parameter is known -where the detection probabilities are calculated directly from the data for each observation individually. It could also be seen as distance sampling with empirical detection probabilities under an assumed Poisson model. The strength of our construction, when compared to Kuronen et al. (2019) and other previously published methods, is the ability to prove the unbiasedness of the estimator when the data is generated by a Poisson process - Kuronen et al. (2019) showed that there is a positive bias in their estimator even in this case -and a way to calculate approximate confidence intervals. In both cases the sequential nature of the detection probabilities, i.e. only the trees closer to the plot centre affecting the detection of trees further away, is of great use. In addition to the unbiasedness in Poisson process data, Figure 3 shows that there is a natural relation between deviations from Poisson process and the bias of the estimator: regularity leads to overestimation, clustering to underestimation.
The HT-like estimator with distance-based detection probabilities produces in most simulation cases estimation errors that are lower than or comparable to the errors produced by the estimators of Olofsson and Olsson (2018) and Kuronen et al. (2019). The estimator works very well in the Poisson process data, as is to be expected. The estimator is slightly biased in the slightly regular field data. However, the bias is to a "safe" direction because the confidence intervals are conservative in this case. The results indicate that the estimator should not be used "as is" for strongly clustered or regular data. In these cases, it might be possible to tune the estimator by adjusting the parameter α. An example of unintentional tuning is for example the estimator of Olofsson and Olsson (2018) producing lower ME% of G in the field data with full visibility detection condition. This is because the correction is "wrong": when compared to the area from which trees can be detected based on the detection condition a larger area is used for weighting, pushing the estimates down. Intentional tuning in our case would entail calculating estimates with several values of parameter α in a training data set and choosing a value that minimizes some error value as the "right" detection condition for that type of data. Figure 4 further demonstrates the properties of our estimator when the data is Poisson distributed. The unbiasedness is shown by the concentration of the stem density estimates along the identity lines. The increasing simulated stem density produces increasing estimates of standard error. As the number of stems increases, so does the number of detected stems, which introduces more positive terms to the formula of the variance estimator (Equation 3), which produces a larger variance. The increase caused by a single tree is connected to its detection probability; small probability leads to a large increase.
The construction could be generalized in several ways. For example, other objects than tree stems producing nonvisible areas behind them could be added in when calculating the detection probabilities, as long as the object is detected from the data in such a way that the geometry of the nonvisible area it produces is known. The visibility condition of a tree could depend on its size and distance from the scanner. Other effects connecting distance and detection probability, such as the number of laser returns per unit area which diminishes as a function of distance, could be added to the calculation of detection probabilities, either through the parameter α or as additional weights.

Conclusions
We have presented a Horvitz-Thompson-like estimator with distance-based detection probabilities for single scan terrestrial laser scanning and evaluated its performance in a simulation study consisting of plots based on field measured data and simulated from several different point processes. The estimator produces error values comparable to or lower than the two benchmark methods in most cases. A notable exception is the clustered data. The estimator is unbiased when the data comes from a Poisson process. The variance of the estimator can be estimated and approximate confidence intervals can be formed. The detection probability construction is easily generalizable to allow objects other than tree stems to affect detection and addition of different distance-based effects on detection. The estimator has a tuning parameter which could be used to find an appropriate detection condition for a scanner in training data.
Appendix: Additional tables and figure Figure A.1: Three simulated plots from the Poisson process, Non-overlapping discs process, Gibbs hard core process, and Log-Gaussian Cox process that produces clustered patterns.