Vegetation structure from LiDAR explains the local richness of birds across Denmark

1. Classic ecological research into the determinants of biodiversity patterns emphasised the important role of three-dimensional (3D) vegetation heterogeneity. Yet, measuring vegetation structure across large areas has historically been difficult. A growing focus on large-scale research questions has caused local vegetation heterogeneity to be overlooked compared with more readily accessible habitat metrics from, for example, land cover maps.
2. Using newly available 3D vegetation data, we investigated the relative importance of habitat and vegetation heterogeneity for explaining patterns of bird species richness and composition across Denmark (42,394 km²).
3. We used standardised, repeated point counts of birds conducted by volunteers across Denmark alongside metrics of habitat availability from land-cover maps and vegetation structure from rasterised LiDAR data (10 m resolution). We used random forest models to relate species richness to environmental features and considered trait-specific responses by grouping species by nesting behaviour, habitat preference and primary lifestyle. Finally, we evaluated the role of habitat and vegetation heterogeneity metrics in explaining local bird assemblage composition.
4.
5. Our results show how LiDAR and land cover data complement one another to provide insights into different facets of biodiversity patterns and demonstrate the potential of combining remote sensing and structured citizen science programmes for biodiversity research. With the growing coverage of LiDAR surveys, we are witnessing a revolution of highly detailed 3D data that will allow us to integrate vegetation heterogeneity into studies at large spatial extents and advance our understanding of species' physical niches.

KEYWORDS
bird diversity, citizen science, habitat availability, heterogeneity, land cover, LiDAR, remote sensing, vegetation structure

| INTRODUCTION
Spatial environmental heterogeneity (sensu Stein & Kreft, 2015) is considered a primary driver of local species richness patterns as more species can coexist in heterogeneous areas, which feature greater niche space and ecological opportunities (Currie, 1991; Tews et al., 2004). Land cover heterogeneity, that is, the composition and configuration of habitat types, has been a major focus of research into this relationship (Stein et al., 2014). Yet, vegetation structure heterogeneity, which was identified early on as a strong predictor of local diversity patterns (August, 1983; MacArthur, 1964), has been comparatively overlooked, likely due to the difficulty in obtaining data at large scales (Simonson et al., 2014). A key knowledge gap, therefore, concerns the role of three-dimensional (3D) vegetation heterogeneity in driving local diversity patterns for large geographic areas and across multiple habitat types. Active remote sensing is a valuable and growing tool for studying these types of species-habitat relationships; rasterised LiDAR datasets are now available that measure fine-scale (10 m resolution) vertical structure across the extent of entire countries (Assmann et al., 2022). With accelerating biodiversity loss and changes in species communities (IPBES, 2019), understanding the relationship between species and their environment is more important than ever.
To describe species' habitat preferences, we often split landscapes into distinct habitat classes using vegetation and other environmental characteristics. Land cover maps that describe the distribution of habitat types have been used to study species' habitat preferences and the drivers of biodiversity patterns from regional to global scales (Stein et al., 2014). However, human-defined habitat classes are not always relevant to the organisms being studied (Davison et al., 2021; Fahrig et al., 2011). The sharp boundaries between neighbouring classes are also a poor representation of habitat transition zones (McGarigal & McComb, 1995). Habitat classifications may therefore be overlooking important environmental variation within classes that could help us describe species' physical niches (Rotenberry, 1981).
Looking beyond habitat classes and directly measuring the heterogeneity of vegetation structure can produce continuous environmental metrics that capture variability within habitat types. Classic ecology research emphasised the important role of vegetation structure in local patterns of biodiversity (MacArthur, 1964; McCoy & Bell, 1991; Pianka, 1966). In their pioneering study, MacArthur and MacArthur (1961) showed that heterogeneity in the vertical profile of vegetation positively correlated with bird species diversity. Since then, many studies have found positive relationships between vegetation heterogeneity and bird species diversity (e.g. Carrasco et al., 2019; Clawges et al., 2008; Flaspohler et al., 2010).
The positive effect of vegetation heterogeneity on faunal diversity is consistent with niche theory: as habitats become more structurally complex, there are more microhabitats, food resources and sites for shelter and breeding (Currie, 1991; Lawton, 1983). Nevertheless, our understanding of how vegetation structure affects species diversity has been limited by the difficulty of measuring fine-grained structure over large extents and by not considering the responses of different species groups, especially non-forest species.
The response of species to the type, heterogeneity and structure of habitats can vary greatly depending on their individual traits (Goetz et al., 2007; Weisberg et al., 2014). Considering only total species richness can overlook species-specific responses to habitat features and miss important changes in assemblage composition.
Tall canopies support more forest species, for example, yet overall species richness may decline with increasing canopy height if most species in the regional pool have different habitat requirements (for example, if they are mostly non-forest species). A solution to make useful generalisations about a taxonomic assemblage, while minimising the noise of individual species responses, is to group species into functional groups (Weisberg et al., 2014), for example, based on their habitat use, nesting behaviour or primary lifestyle. These divisions help us identify the environmental features that are involved in habitat selection and therefore better understand this process.
Remote sensing, such as LiDAR, presents an opportunity for investigating the relative roles of habitat and vegetation heterogeneity in driving diversity patterns (Simonson et al., 2014). Several studies have demonstrated the potential of LiDAR for improving predictions of species diversity and habitat use; however, they have generally been limited to forest habitats or relatively small areas (Hill et al., 2004; Lesak et al., 2011). In their review of LiDAR use in avian research, Bakx et al. (2019) found that the median spatial extent of studies was only 53 km² and that 82% included only one habitat type, generally forests. Broad-extent LiDAR data is now becoming more available, permitting investigations into the relationship between measures of 3D vegetation structure and biodiversity across large areas and varied habitat types (e.g. Moeslund et al., 2019).
In this paper, we combine remote sensing data and citizen science observations to test the relationship between bird species and environmental heterogeneity at the extent of a whole country. Our primary aim was to evaluate the roles of heterogeneity in habitats and vegetation structure for determining local bird species diversity in Denmark. We were specifically interested in answering the following research questions: (1) What is the relative importance of habitat availability and vegetation structure in explaining local species richness patterns? (2) How do different functional groups respond to these features of environmental heterogeneity? and (3) To what degree does fine-grained data on vegetation structure provide additional insights compared to traditional habitat classifications?
Overall, we expected a positive relationship between total richness and environmental heterogeneity, both in terms of habitat and vegetation structure heterogeneity, due to the predicted increase in niche availability (MacArthur & MacArthur, 1961; Tews et al., 2004). We expected functional groups to respond differently to features of the environment depending on their characteristics (Goetz et al., 2007; Weisberg et al., 2014); for example, arboreal nesting species would benefit from a well-developed canopy layer.
Lastly, we expected vegetation structure variables to capture variation within habitat classes that would help explain richness patterns (Culbert et al., 2013). Together, our research questions will address the importance of considering 3D vegetation structure in analyses of diversity patterns and demonstrate the potential of LiDAR and citizen science for research across large extents.

| Study area
Denmark (42,394 km²) has a temperate climate that is milder in the west because of its proximity to the Atlantic Ocean and is slightly drier and more continental in the east. Agricultural land makes up more than 60% of the Danish landscape (Levin, 2019), while remnant natural and semi-natural habitats persist in small patches. The present-day breeding avifauna is largely migratory with few sedentary residents (<5%) and is therefore not considered dispersal limited (Gotelli et al., 2010).

| Bird data
We used point count data from the Common Bird Monitoring programme organised by Birdlife Denmark (1975-present; for more information see Eskildsen et al., 2021). This structured citizen science database provides standardised, geolocated, annual observations at a national scale. We used volunteer observations from the summer breeding season (1 May to 15 June) in 2014, 2015 and 2016. Surveys were conducted along routes consisting of at least 10, but mostly 20, points, at which all birds seen or heard at any distance within 5 min were recorded. To avoid double counting, points are placed 300-1000 m apart (median nearest-neighbour distance = 362 m; IQR = 259-547 m). Repeat observations of routes were made at the same time of year (±7 days) and time of day (±30 min), and under good weather conditions. We retained points that were repeated in all 3 years and were from route surveys that began before midday, giving a total of 10,704 five-minute point counts from 3568 individual survey points (total observation time ~892 h; Figure S1). At each survey point, we summed observations across all survey seasons (2014-2016). We only included species recorded as breeding in any of the three Danish Breeding Bird Atlas projects (covering 1971Vikstrøm & Moshøj, 2020). We retained 176 breeding species, while 16 non-breeding species were removed (Table S1).
We separated bird species into functional groups based on shared characteristics, focussing on habitat use, nesting behaviour and primary lifestyle (Table 1). Habitat use and nesting behaviour were taken from a database version of 'The Birds of the Western Palearctic' (Cramp, 2006; Storchová & Hořák, 2018). Habitat use was considered as the environments a species occupies during the breeding season, although we excluded habitat types absent in Denmark.
We aggregated some similar habitat groups because our analyses indicated very similar responses (deciduous and coniferous forest species; shrub and woodland; freshwater and marine). Species can belong to more than one habitat use or nesting group. Primary lifestyle describes the predominant locomotory niche of a species and these groups were mutually exclusive. Primary lifestyle was taken from the AVONET trait database (Tobias et al., 2022). In total, we calculated the species richness of 19 functional groups (Table 1).
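The grouping step described above can be illustrated with a minimal Python sketch (the study itself was carried out in R). It counts the distinct species per functional group at one site and allows a species to belong to several groups, as stated for the habitat use and nesting groups. The function name `group_richness`, the species labels and the trait table are hypothetical placeholders, not the study's classification.

```python
from collections import defaultdict


def group_richness(species_at_site, groups_by_species):
    """Count distinct species per functional group at a single site.

    A species that belongs to several groups contributes to each of them,
    so the group totals can add up to more than the total richness.
    """
    members = defaultdict(set)
    for species in set(species_at_site):          # ignore repeat detections
        for group in groups_by_species.get(species, ()):
            members[group].add(species)
    return {group: len(spp) for group, spp in members.items()}


# Hypothetical trait table: illustrative labels only.
habitat_groups = {
    "species_a": {"forest"},
    "species_b": {"forest", "shrub_woodland"},
    "species_c": {"grassland"},
}

# Detections at one survey point, summed across years (repeats allowed).
observed = ["species_a", "species_b", "species_b", "species_c"]

richness = group_richness(observed, habitat_groups)
total_richness = len(set(observed))
print(richness, total_richness)
```

Because membership overlaps, the three observed species yield group counts that sum to four, which is why group richness values should not be added together to recover total richness.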
Furthermore, we calculated four global-level metrics including overall species richness and the richness of habitat, nesting and primary lifestyle functional groups (i.e. the number of unique groups at a site). We also summed species abundances across years to analyse bird assemblage composition.

TABLE 1 Species functional groups. The total number of species for each group is in parentheses. Habitat and nesting groups are not mutually exclusive but primary lifestyle groups are. Insessorial = perching birds.
Habitat use | Nesting behaviour | Primary lifestyle
Forest (

| Environmental data
We considered two groups of environmental variables, habitat availability and vegetation structure (Table 2), and included degrees latitude and longitude to account for spatial gradients of unmeasured variables. All environmental variables (n = 18) were sampled from a circular plot with a radius of 150 m at each point count location (area ≈ 7 ha). This radius was chosen to capture local effects of land cover and vegetation structure, although we also conducted a sensitivity analysis of radius choice. All data cleaning and analyses were conducted in R version 4.2.0 (R Core Team, 2022).
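As a quick check of the plot geometry, the following Python sketch converts a buffer radius into an area in hectares and into an approximate number of 10 m raster cells. The function name `plot_area_ha` is hypothetical, and the cell count ignores how cells on the plot edge are handled, which the text does not specify.

```python
import math


def plot_area_ha(radius_m):
    """Area of a circular plot in hectares (1 ha = 10,000 m^2)."""
    return math.pi * radius_m ** 2 / 10_000


focal_area = plot_area_ha(150)                      # roughly 7.07 ha
# Approximate number of 10 m x 10 m cells covering the plot (edge cells ignored).
approx_cells = math.pi * 150 ** 2 / (10 * 10)

print(round(focal_area, 2), round(approx_cells))
```

The same function can be reused for other radii, since area scales with the square of the radius.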
Land cover data was extracted from a 2015 pan-European land cover map (Pflugmacher et al., 2019). For our analysis, we merged some rarer land cover classes with closely related common classes: perennial and seasonal cropland, water and wetlands and the three types of forest (coniferous, broadleaved, mixed), resulting in six classes. The habitat availability of each point count was considered as the percent cover of each class. We calculated habitat diversity using the reciprocal Simpson index of land cover classes (Simpson, 1949). This metric represents compositional heterogeneity: it is the reciprocal of the probability that two randomly selected points fall in the same class, so it increases with the number and/or evenness of habitat types. We excluded a metric of configurational heterogeneity, landscape division (Jaeger, 2000), due to strong collinearity with habitat diversity (Pearson's r = 0.98).
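A minimal Python sketch of the two habitat metrics, assuming the land cover class of every raster cell inside a plot is available as a simple list. The function names and the example class labels are hypothetical; the reciprocal Simpson index is computed as 1 / Σ p_i², where p_i is the proportional cover of class i.

```python
from collections import Counter


def percent_cover(cell_classes):
    """Percent cover of each land cover class among the cells of one plot."""
    counts = Counter(cell_classes)
    total = sum(counts.values())
    return {cls: 100.0 * n / total for cls, n in counts.items()}


def reciprocal_simpson(cell_classes):
    """Reciprocal Simpson index, 1 / sum(p_i^2), of land cover classes."""
    cover = percent_cover(cell_classes)
    return 1.0 / sum((pc / 100.0) ** 2 for pc in cover.values())


# Hypothetical plot: half cropland, a quarter forest, a quarter grassland.
cells = ["cropland"] * 50 + ["forest"] * 25 + ["grassland"] * 25

cover = percent_cover(cells)
diversity = reciprocal_simpson(cells)
print(cover, round(diversity, 3))
```

The index equals 1 for a plot with a single class and approaches the number of classes when cover is perfectly even, which matches the statement that it increases with the number and evenness of habitat types.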
We acquired all measures of vertical structure from the EcoDes-DK15 dataset, which describes a broad range of ecological descriptors across all of Denmark (Assmann et al., 2022 (Assmann et al., 2022). However, we did not expect a strong influence on our results as only 6.5% of the total area across our sites was coniferous forest. For our analysis, we focussed on descriptors of vegetation structure and amount that were likely to be important for birds: canopy height, canopy openness and the total number of LiDAR returns from vegetation.
Using the LiDAR return counts (in various height bins), we also recorded the amount of vegetation that was close to the ground (below 1.5 m) as this is an important layer for birds and could help identify grassland or cropland areas. Furthermore, we calculated an index of foliage height diversity (MacArthur & MacArthur, 1961).
This is equivalent to the Shannon diversity index and is calculated as $-\sum_i p_i \log_e p_i$, where $p_i$ is the proportion of the total foliage, or LiDAR returns, which lies in the vertical layer $i$. We stratified the vertical vegetation into five layers (0-2, 2-5, 5-10, 10-15 and >15 m). For canopy height, canopy openness and foliage height diversity, we also calculated the standard deviation to describe the spatial variability of these features.
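The index can be sketched in a few lines of Python. The input is assumed to be the number of vegetation returns in each of the five height layers for one raster cell; the function name, the example counts and the use of the sample standard deviation for spatial variability are illustrative assumptions, not details taken from the study.

```python
import math
from statistics import stdev


def foliage_height_diversity(returns_per_layer):
    """Shannon index of LiDAR returns across vertical layers (natural log)."""
    total = sum(returns_per_layer)
    if total == 0:
        return 0.0                       # no vegetation returns at all
    proportions = [n / total for n in returns_per_layer if n > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical return counts per layer (0-2, 2-5, 5-10, 10-15, >15 m).
even_profile = [20, 20, 20, 20, 20]      # vegetation spread across all layers
ground_only = [100, 0, 0, 0, 0]          # e.g. grassland or cropland

fhd_even = foliage_height_diversity(even_profile)
fhd_ground = foliage_height_diversity(ground_only)

# Spatial variability: standard deviation of the index across cells in a plot.
fhd_cells = [fhd_even, fhd_ground, foliage_height_diversity([10, 30, 40, 15, 5])]
fhd_sd = stdev(fhd_cells)
print(round(fhd_even, 3), fhd_ground, round(fhd_sd, 3))
```

With five layers the index is bounded above by log_e 5 (about 1.61), reached when returns are spread evenly, and falls to zero when all returns come from a single layer.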

| Analysis
For our primary analyses, we used random forest models (Breiman, 2001) with species richness or functional group richness as response variables. Random forests are a nonparametric machine learning method that exploits structure in high-dimensional data by combining the predictions of numerous decision trees (Cutler et al., 2007). Random forests do not require a priori specification of a model linking predictors and responses but use an algorithmic approach to learn the form of key relationships (Oppel et al., 2009).
This method has a low cost to including many predictors, can handle nonlinear effects and is relatively insensitive to collinearity between predictors (Dormann et al., 2013). We appraised model fit using the percent of variation explained (pseudo-R²; 1 − residual sum of squares/total sum of squares) using external cross-validation (Oppel et al., 2009). We used 70% of the data to build the models and the remaining 30% (spatially explicit; Figure S2) to assess their performance and determine variable importance.
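The fit statistic can be written out as a short Python sketch. It assumes that the total sum of squares is taken around the mean of the held-out observations, which the text does not state explicitly; the function name and the example values are hypothetical.

```python
def pseudo_r2(observed, predicted):
    """1 - RSS/TSS, evaluated on data not used to fit the model."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - rss / tss


# Hypothetical held-out richness values and model predictions.
held_out_richness = [4, 7, 9, 12]
predictions = [5, 6, 10, 11]

fit = pseudo_r2(held_out_richness, predictions)
print(round(fit, 3))
```

On held-out data this quantity can be negative, which happens when the model predicts worse than simply using the mean of the observations.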
We ran random forest models for all species groups that had a minimum richness of 10 species co-occurring at least once (summed across years) and for the four global level metrics (17 models total). All random forest modelling was conducted with the randomForestSRC package in R (Ishwaran et al., 2008; Ishwaran & Kogalur, 2021 (Nicodemus et al., 2010). To avoid this issue, we identified groups of co-dependent variables (including nonlinear dependencies) using the Chi-squared (χ²) statistic and permuted these groups simultaneously (Figure S3). We standardised variable importance values relative to the total importance of each model (0-1).
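The idea of permuting co-dependent variables together and then standardising importance can be sketched in plain Python. This is not the randomForestSRC implementation: the prediction function is a toy stand-in for a fitted model, importance is measured as the increase in mean squared error (an assumption), and the variable groups are supplied by hand rather than identified with the χ² statistic. All names are hypothetical.

```python
import random


def mse(predict, rows, y):
    """Mean squared error of a prediction function on held-out rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)


def grouped_permutation_importance(predict, rows, y, groups, seed=0):
    """Permute all columns of a group with one shared row order, then
    rescale the error increases so that they sum to 1 across groups."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, y)
    raw = {}
    for name, cols in groups.items():
        order = list(range(len(rows)))
        rng.shuffle(order)                          # one permutation per group
        permuted = []
        for i, row in enumerate(rows):
            new_row = list(row)
            for c in cols:                          # same source row for every column
                new_row[c] = rows[order[i]][c]
            permuted.append(new_row)
        raw[name] = max(mse(predict, permuted, y) - baseline, 0.0)
    total = sum(raw.values())
    return {k: (v / total if total > 0 else 0.0) for k, v in raw.items()}


def toy_model(row):
    """Stand-in for a fitted model: uses columns 0 and 1, ignores column 2."""
    return 2.0 * row[0] + row[1]


data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(40)]
y = [toy_model(r) for r in rows]

importance = grouped_permutation_importance(
    toy_model, rows, y, {"vegetation_structure": [0, 1], "unused": [2]}
)
print(importance)
```

Permuting a group with a single shared row order keeps the relationships among its variables intact while breaking their link to the response, so correlated predictors are assessed jointly rather than competing with each other.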
We conducted a sensitivity analysis to assess the role of scale on model fit and the importance of habitat and vegetation structure.
We reran the random forest algorithm at two additional scales: a 50-m radius buffer and a 450-m radius buffer, approximating a 10-fold decrease and increase of our focal scale. We compared the mean model fit (pseudo-R²) between scales for all models using one-way

| RESULTS
A total of 178 bird species were recorded in the study period. The three species with the highest occupancy were Blackbird (Turdus   Among the global metrics, the richness of habitat functional groups performed best and achieved optimal fit using only habitat availability variables-as with most models in this group (Figure 1a).
For grassland species richness, the best-performing model, grassland cover appeared to be less influential than features of vegetation structure (Figures 1 and 2a). Grassland species favoured areas where vegetation was open and not vertically complex (Figure 3a).
Vegetation structure variables performed better than habitat availability for most habitat association models. The exceptions were freshwater and marine species and human settlement species, for which the presence of water/wetland and artificial land were key features, respectively (Figure S4). Forest species richness was largely explained by vegetation structure; foliage height diversity had a strong positive effect; and there was a unimodal response to the variability of canopy height (Figures 2b and 3b).
Among nesting behaviours, ground nester richness was the best-  (Figures 2c and 3c). The richness of species forming open arboreal nests was explained much better by vegetation structure than habitat availability. For primary lifestyles, aquatic species were best predicted by the presence of water; it was twice as important as the next group of four variables describing vegetation (Figure 2d).
Insessorial species favoured the presence of forest and vertical vegetation structure, while for terrestrial species, a lack of tall complex vegetation was key, especially as measured by canopy height (Figure S5). No functional groups appeared to be strongly influenced by habitat diversity (Figures S4 and S5).
Our sensitivity analysis showed no differences between mean  (Figures S6 and S7). At the focal 150 m scale, the mean fit of vegetation structure models was 0.4% higher than that of habitat availability models (50 m: 1.3% higher; 450 m: 1.6% lower). When considering the average performance across models, the full models performed best at all scales (Figure S7).
Bird assemblage composition (as summed relative abundances) correlated better with the difference between sites calculated using habitat availability metrics than with vegetation structure metrics (Table 3). The highest correlation of all subsets was for habitat and vegetation metrics combined (r_M = 0.321); meanwhile, the lowest correlation was with geographic coordinates (r_M = 0.063). A principal coordinates analysis explained 19% of the variation in assemblage composition in the first two axes. The first and most important axis correlated with a gradient of open to closed habitats, while the second axis correlated with artificial habitat area (Figure S8).
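One way to read the correlations reported above is as Mantel-type statistics, that is, the correlation between pairwise compositional differences and pairwise environmental differences among sites. That reading, along with the Bray-Curtis and Euclidean distance measures used below, is an assumption made for illustration; the site data and function names are hypothetical, and the permutation test normally used to assess significance is omitted.

```python
import math
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def euclidean(a, b):
    """Euclidean distance between two vectors of environmental metrics."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise(rows, dist):
    """Distances for every unordered pair of sites (upper triangle)."""
    return [dist(rows[i], rows[j]) for i, j in combinations(range(len(rows)), 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mantel_r(abundances, environment):
    """Correlation between compositional and environmental distances."""
    return pearson(pairwise(abundances, bray_curtis),
                   pairwise(environment, euclidean))


# Hypothetical data: four sites, three species, two environmental metrics.
abundances = [[10, 0, 2], [8, 1, 3], [0, 9, 5], [1, 7, 6]]
environment = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.2, 0.6]]

r_m = mantel_r(abundances, environment)
print(round(r_m, 3))
```

A higher value indicates that sites differing more in the chosen environmental metrics also tend to differ more in their bird assemblages.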

FIGURE 3
Independent effect of key variables in selected bird richness models (a-d). Effects were estimated using accumulated local effects plots. Vegetation structure metrics are green, habitat availability metrics are blue. Grey lines are 50 bootstrap reiterations of the data and rugs denote five percentiles of the variable's distribution (interpret areas with low coverage cautiously). The y-axis is the change in predicted species richness relative to the average prediction and varies between models (rows). Additional figures in Supporting Information.

Vegetation structure influences bird habitat selection in both open and closed environments through its effect on movement (Brokaw & Lent, 1999), foraging success (Whittingham & Markland, 2002) and predator avoidance (Gotmark et al., 1995).
We found high importance of vegetation structure relative to habitat availability for many models of local bird richness. We expected niche addition to lead to a positive linear relationship between richness and foliage height diversity, especially for forest species (Clawges et al., 2008; MacArthur & MacArthur, 1961). Indeed, we found positive effects of foliage height diversity and canopy height that appeared to exceed the influence of forest cover on forest bird richness. Other functional groups also responded distinctly to the presence or absence of tall, complex vegetation, and canopy height appears to be a key variable in this group (Figure S5).
We found that the positive relationship between canopy height and forest species richness plateaued after approximately 60% of its distribution, which may indicate that other factors such as structural complexity, tree species composition or the size of fragments constrain richness in forests. Interestingly, the relationship between foliage height diversity and forest bird richness did not plateau early, suggesting an important role of structural diversity in forests. Danish forests are highly fragmented, and we observed that the plateauing effect of canopy height occurred at a higher value when using a 50-m radius and at a lower value with a 450-m radius (not shown).
As the focal area grows, it tends to incorporate more open habitats (the dominant type) and the mean canopy height decreases. These scaling effects, along with potentially complex interactions between foliage height diversity, canopy height and canopy openness, merit further investigation when applying these metrics across broad extents and mixed habitats.
Large-grain studies have often shown positive heterogeneity-diversity relationships based on habitat availability (Atauri & De Lucio, 2001; Redlich et al., 2018). Contrary to expectations, however, our metric of habitat diversity did not play an important role in explaining richness patterns. Instead, vegetation structure and the presence of specific habitat types identified suitable areas for different species. Habitat diversity may have been a weak explanatory variable if the scale of our main analysis (7 ha) was smaller than the taxon-specific scale of this effect (Mayor et al., 2009; Tews et al., 2004). Indeed, our sensitivity analysis showed that habitat availability became more important with increasing spatial grain. Due to a trade-off between habitat heterogeneity and the area of suitable habitat per species (Allouche et al., 2012), the heterogeneity-diversity relationship often depends on spatial scale (Tamme et al., 2010) and on species' unique habitat requirements (Atauri & De Lucio, 2001). Nonetheless, we did not find strong negative effects of habitat heterogeneity that would have been predicted by microfragmentation of habitat patches in small areas (Tamme et al., 2010).
One aspect of habitat availability had a particularly strong effect on bird richness patterns: the presence of water and wetlands.
Despite its rarity, this habitat class was crucial for some functional groups, highlighting the importance of certain habitats for specialist species (Pickett & Siriwardena, 2011). For example, the richness of ground-nesting species was strongly correlated with water and wetland availability, and this group is suffering the greatest long-term declines of any in Denmark (Heldbjerg et al., 2018). Widespread subsurface draining was initiated in the 19th century in Denmark to increase the amount of land available for agriculture (Mortensen, 1987) and to convert forests to plantation forestry. Historical maps from the turn of the 19th century show that wetlands made up 20%-30% of the Danish landscape (Korsgaard, 2004). While ecosystem restoration projects are underway that aim to restore natural wetland and forest hydrology, our findings highlight the importance of their continuation and development.
Species richness was best explained when species were split into functional groups, echoing analyses restricted to forest habitats (Goetz et al., 2007). We could not predict total species richness and instead found that functional groups showed diverse responses to the environment, highlighting the need to consider species identity when studying richness patterns (Stirnemann et al., 2015; Weisberg et al., 2014). Most results followed our ecological expectations. For example, the richness of water-associated functional groups depended strongly on the availability of water and wetlands. We also identified important negative correlations, such as the avoidance of

We found that greater differences in habitat availability between sites across Denmark lead to greater differences in bird assemblages. While the same pattern was found for vegetation metrics, the correlation to bird composition was not as strong. This finding may indicate that abundance is influenced more by habitat type than by its structure or complexity. The amount of artificial land, for example, likely affects bird composition but was not well captured by our LiDAR metrics (Figure S8). Nonetheless, while we focussed on  (Gotelli et al., 2010), climatic gradients or temporal heterogeneity in resources and predation pressure (Mayor et al., 2009).
Our nationwide assessment opens promising avenues for future research and management applications. As the coverage of high-resolution LiDAR expands, it will be possible to investigate differences in the relationship between species diversity and vegetation structure at a local scale across biomes and between regions with different management histories. As demonstrated in this study, citizen science monitoring programmes can be an ideal counterpart to large-scale remote-sensing efforts. Having standardised, geolocated observations is particularly helpful, even if limited to more common species. In the future, leveraging the full temporal dimension of these data types will permit research into dynamic changes over time. Furthermore, the development of standardised structural metrics that also apply in open areas may offer insights into the role of structural features in habitats with different vertical scales (e.g. in wetlands; Koma et al., 2021). Establishing the links between vegetation structure and management, whether for production or conservation, will help guide actions to promote the features relevant to species' requirements. For Denmark, our results suggest that promoting water availability and vegetation complexity in forests, which are often intensively managed, could benefit bird diversity.
In conclusion, our study shows how national-scale LiDAR data allows us to investigate the response of bird functional groups to 3D features of the environment across large extents. We found that vegetation structure helped explain richness patterns, but that habitat availability best described bird assemblage composition. There

Rahbek was involved in conceptualisation, funding acquisition, supervision, writing-review & editing. Naia Morueta-Holme was involved in conceptualisation, funding acquisition, methodology, supervision (lead), writing-review & editing (lead).

ACKNOWLEDGEMENTS
Firstly, thanks to Birdlife Denmark and all the volunteer bird watchers who contributed to the Common Bird Monitoring programme.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.