Application of remote sensing technology to estimate productivity and assess phylogenetic heritability

PREMISE Measuring plant productivity is critical to understanding complex community interactions. Many traditional methods for estimating productivity, such as direct measurements of biomass and cover, are resource intensive, and remote sensing techniques are emerging as viable alternatives. METHODS We explore drone‐based remote sensing tools to estimate productivity in a tallgrass prairie restoration experiment and evaluate their ability to predict direct measures of productivity. We apply these various productivity measures to trace the evolution of plant productivity and the traits underlying it. RESULTS The correlation between remote sensing data and direct measurements of productivity varies depending on vegetation diversity, but the volume of vegetation estimated from drone‐based photogrammetry is among the best predictors of biomass and cover regardless of community composition. The commonly used normalized difference vegetation index (NDVI) is a less accurate predictor of biomass and cover than other equally accessible vegetation indices. We found that the traits most strongly correlated with productivity have lower phylogenetic signal, reflecting the fact that high productivity is convergent across the phylogeny of prairie species. This history of trait convergence connects phylogenetic diversity to plant community assembly and succession. DISCUSSION Our study demonstrates (1) the importance of considering phylogenetic diversity when setting management goals in a threatened North American grassland ecosystem and (2) the utility of remote sensing as a complement to ground measurements of grassland productivity for both applied and fundamental questions.

been shown to be correlated with productivity in trees (Wang et al., 2004) and biomass in prairie (Wang et al., 2016) and tundra habitats (Goswami et al., 2015). Furthermore, variation in reflectance across spectra may be phylogenetically conserved , suggesting that multispectral analyses can potentially be used to remotely estimate the community phylogenetic composition and diversity.
Remote sensing techniques can also estimate productivity by providing data to model the physical structure of vegetation. Photogrammetry uses overlapping photographs to reconstruct 3D structures, producing models of the outermost layer of vegetation visible in the photographs (Baltsavias, 1999). It can accurately capture the structure of large and complex vegetation forms, both as individuals (Scher et al., 2019) and as blocks of vegetation. Light detection and ranging (LiDAR) yields a more nuanced view of the layers of vegetation that comprise a community or individual plant (Baltsavias, 1999;Dubayah and Drake, 2000), enabling the estimation of both vegetation volume and density. While LiDAR provides more detail than photogrammetry, both techniques can accurately estimate biomass from volume over small areas (Wallace et al., 2017). Furthermore, photogrammetry requires only red-greenblue (RGB) photographs, whereas LiDAR requires more expensive equipment (Solazzo et al., 2018).
High-resolution satellite imagery remains prohibitively expensive for many ecology applications (Wang et al., 2010). By contrast, the cost of quality drone imagery is decreasing rapidly, making it an exciting tool for the investigation of individual plants and local landscapes (Cannon et al., 2018). Precision agriculture regularly uses high-resolution drone-based imagery (Zhang and Kovacs, 2012), and the decreasing costs of drone imagery are making this technology equally accessible to ecologists.
In this study, we first investigate how high-resolution (<1.5 cm) drone-based multispectral imagery and volume estimates correlate with field-based estimates of grassland productivity. We compare four productivity proxies: two measured from field observations (aboveground biomass and percent vegetation cover) and two measured via drone-based photography (individual light spectra and VIs [ Table 1] and photogrammetry-derived vegetation volume). We then use the productivity proxies to illuminate how the evolution of functional traits that influence plant productivity and competitive interactions shape the phylogenetic distribution of grassland productivity. Our study demonstrates the utility of remote sensing to address finescale ecological questions efficiently and economically. Furthermore, it advances our understanding of how the evolution of plant traits and lineage diversity shape grassland ecosystem processes.

Site
Our study system is a tallgrass prairie restoration experiment located at The Morton Arboretum in northeastern Illinois, USA. The soils (Markham series, fine, illitic, mesic, mollic oxyaquic Hapludalfs) on the site are deep and moderately well drained, forming in a thin layer of silt loam loess overlying silty clay loam till (USDA-NRCS, 2019). Our experiment manipulates phylogenetic and trait diversity in mixed species plots drawn from 127 tallgrass prairie plants, all of which essentially reached maximum productivity at the time the biomass and remote sensing data were collected (Hipp et al., 2018). The site is divided into 437 4-m 2 plots, which were partitioned based on measurements of their soil traits into two superblocks (west and east), each containing three blocks (A-C in the west superblock, D-F in the east superblock; Fig. 1). Of the 437 plots, our analysis used 326 plots of two types, monoculture and multispecies, all planted from plugs in August-September 2016, with follow-up planting of failed or initially unavailable plants in summer 2017. For each of the 127 species used, we planted one monoculture plot in each superblock, for a total of 254 monoculture plots. Each multispecies plot was planted with one of 36 combinations of 15-species mixes, and the 15-species planting order was randomized within each quarter plot. Multispecies plots were also duplicated in both superblocks for a total of 72 multispecies plots. Complete details of the experimental design are described by Hipp et al. (2018).

Soil analysis
We measured the A horizon in each plot and interpolated the gravimetric soil moisture, loss on ignition soil organic matter, pH, electrical conductivity, wet aggregate stability, and phosphorus from measurements of a subset of the plots. We used a principal component analysis to identify groups of plots that differed in these soil characteristics and thus partitioned the site into blocks. See Appendix S1 for details.

Photo capture and orthomosaic construction
A DJI Phantom 4 drone (Shenzhen DJI Sciences and Technologies Ltd., Shenzhen, China) carrying a Parrot Sequoia (Parrot Drones SAS, Paris, France) was flown over the experimental prairie on 1 August 2017, during the peak of its first full growing season following establishment. The drone captured 598 RGB photos and single-band images of the red, green, near-infrared, and red edge bands. For details, see Appendix S1.
Seven orthomosaics were produced by merging 598 photos of the site into a single image; each orthomosaic shows one of the three VIs measured (Table 1) or one of the four wavelength bands (Table 1, Appendix S2). Values of each VI range from −1 to 1, with high values representing dense vegetation and values close to and below 0 representing non-vegetative surfaces. Each VI, however, captures differences in vegetation density at varying levels; for example, NDVI saturates at high levels of chlorophyll concentration, and therefore does not capture variation in high vegetation densities. The green normalized difference vegetation index (GNDVI) uses green reflectance instead of red and has a larger dynamic range at high vegetation densities (Gitelson et al., 1996); GDVI 2 , a form of the generalized difference vegetation index (GDVI), is derived from NDVI and increases the dynamic range at low vegetation densities (Wu, 2014). The orthomosaics were exported from Agisoft Photoscan (Agisoft, St. Petersburg, Russia) in the tagged image file format (TIFF).

Biomass
Vegetative material was collected between 9 October and 20 November 2017, before the first major frost, to avoid leaf disintegration. In the multispecies plots, all vegetative material in the southwest quadrant was collected. In the monoculture plots, material was collected only from the central 0.25 m 2 of the southwest quadrant. All individuals of the planted species rooted inside the given area were clipped at ground level. The vegetation was dried in paper bags at 50°C for at least 48 h and weighed immediately after removal from the oven. We calculated the biomass of each species and the total biomass for the multispecies plots, which was converted to a dry weight of grams per square meter. Vegetation collected early in the season from some multispecies plots in blocks A, B, C, and F was not dried sufficiently before its biomass was measured. Biomass measurements in these plots were biased upward in the preliminary analyses, so multispecies plots that were affected are excluded from all biomass analyses.

Spectral imagery
All spectral band and VI orthomosaics (Table 1) were processed as rasters using the raster package (Hijmans and van Etten, 2012) in R (R Core Team, 2019). The outer 0.25 m of each plot was removed to avoid counting turf species encroaching from the paths, which were also excluded from the manual field-based productivity estimates. For each of the seven rasters, all pixels within the resulting 1.5 × 1.5-m squares were averaged. These values represent all vegetation in the plot at the time when the aerial images were taken, but because only the biomass of the planted species was measured after being collected later in the growing season, we used a correction factor to adjust the orthomosaics for a more consistent comparison. The adjustment was used to correct for weeds that were present in the drone images but whose biomass was not measured, as well as growth that occurred between the photo capture and biomass measurement. For details on the correction, see Appendix S1. This procedure produced 14 spectral values for each plot, representing the raw and corrected values from each of the seven rasters.

Estimation of vegetation volume
The vegetation volume of each plot was estimated using a vegetation height model (VHM) produced by subtracting a digital terrain model (DTM) from a digital surface model (DSM). The DSM and DTM were built in Agisoft Photoscan using all points in the dense cloud and only ground-level points, respectively. The DSM does not distinguish vegetation height from changes in ground level, so plants on higher ground appear taller, whereas the DTM interpolates the ground level below the vegetation from ground-level points. Subtracting the DTM from the DSM produced a VHM containing the height of vegetation after correcting for ground level. We then calculated the vegetation volume as the average height of the plot multiplied by the length and width of the plot. To minimize the effects of very slight variations in plot size, the resulting volume was then standardized to a 1.5 × 1.5-m plot. Standardizing in this way has the effect of transforming volume estimates into average vegetation heights, as estimated in the VHM. In either case, the estimate is fundamentally different from cover estimates (which estimate the area covered by a shadow cast by the vegetation) or the VIs (based on reflectance of discrete spectra).

Predicting productivity using models
We used a linear regression to estimate how well the remote sensing metrics predict biomass and cover using base R (R Core Team, 2019) and the lme4 package (Bates et al., 2015). We also use the MuMIn (Barton, 2009), lmer4Test (Kuznetsova et al., 2017), and qpcR (Spiess, 2018) packages to evaluate the models. Each model predicting biomass uses covariates from three groups: (1) a VI or single band, (2) total or planted cover, and (3) photogrammetry-modeled volume (hereafter, "volume"). Because the remote sensing metrics were calculated based on all the vegetation in each plot, we modeled total cover, not planted cover, using (1) a VI or single band and (2) volume. Covariates and responses were scaled to unit variance and a mean of zero prior to model fitting. Each combination of variables was run as a multiple linear regression and a mixed-effects model using block as a random effect. All regressions were fitted using all plots, only monoculture plots, and only multispecies plots. For the multispecies plots, we modeled total biomass and cover, not partitioned by species. While there is an expected collinearity between VIs, cover, and volume, all are maintained as covariates in the models to estimate partial correlation coefficients. Although a case can be made for eliminating predictors that covary to a great extent, the retention of the relatively small number of covariates in this study is key to assessing whether there is variance in the biomass left to explain after the other potential predictors have been factored out (Legendre and Legendre, 1998).
To test whether the presence of non-photosynthetic flowers that could alter VI values (Shen et al., 2010) affects the model fit, linear regressions were also conducted using only plots that did not have open flowers at the time of photo capture. We ran models on all nonflowering plots and on only nonflowering monoculture plots; there were only three nonflowering multispecies plots, so we did not run separate models on them.
We used R 2 to evaluate the model fit and Akaike's information criterion (AIC) to estimate the predictive value of the models (Sober, 2002). The AIC weight (AICw) of each model was calculated as its relative likelihood as a proportion of the summed relative likelihoods of all models evaluated (following Anderson, 2002, 2004). We then quantified the importance of each covariate by summing the AICw of all models in which the covariate was used (Burnham and Anderson, 2002) to obtain a cumulative AICw for each covariate across all models. A high cumulative AICw for a covariate indicates that the covariate is present in models that carry higher predictive weight. Unlike the AIC, the absolute value of the cumulative AICw is meaningful: a cumulative AICw near 1.0 implies that the covariate has high predictive weight relative to the population of models evaluated. In evaluating the support for individual models, we also report ΔAIC, the difference between the AIC of the best model evaluated for that test and the model being reported. While there are no hard rules about ΔAIC values, models with ΔAIC < 2 are generally considered to be practically indistinguishable in evidentiary support from the best model considered and thus to contribute significantly to our understanding of variance in the system we are studying, while those with ΔAIC > 10 have "essentially no support" (Burnham and Anderson, 2004).

Trait data
Quantitative and qualitative trait data for each species were gathered from published sources (Iverson et al., 1999;Amatangelo et al., 2014;Sonnier et al., 2014;Ash et al., 2015;Hipp et al., 2018;Clark et al., 2019) or collected as described in Appendix S1, using the species classification described by Bai et al. (2012). Nitrogen fixation and root types (adventitious, primary, bulb, corm, fibrous, rhizome, stolon, and tuber) were coded as binary traits. Quantitative traits included the genome size, petiole length, leaf length, leaf width, leaf thickness, vegetative height, seed mass, leaf dry-matter content, specific leaf area, leaf nitrogen content, leaf carbon content, leaf phosphorus content, stem dry-matter content, and circularity.

Phylogenetic and trait analyses
To assess how trait evolution shapes the phylogenetic structure of species-level productivity estimates in the monoculture plots, a phylogeny synthesized from published phylogenies and taxonomic knowledge of unsampled species (a "synthesis phylogeny" sensu Li et al., 2019;Fig. 2) was developed using Zanne et al. (2014) as a framework, followed by the pruning and splicing of taxa of interest in this study, as detailed by Hipp et al. (2018). We estimated the phylogenetic signal in species' traits, biomass, and corrected VI measures from monoculture plots (see methods above) using Blomberg's K (Blomberg et al., 2003) and Pagel's λ (Pagel, 1999) in geiger version 2.0.6.2 (Harmon et al., 2008). The correlation (Pearson's r) of plant traits with NDVI was plotted against their correlation with biomass, and the point size was scaled by the phylogenetic heritability to explore how the phylogenetic heritability of traits correlates with their influence on productivity. We then used Mantel tests of pairwise species comparisons of NDVI and phylogenetic distance, closely mirroring analyses conducted by Schweiger et al. (2018) that assessed the plant-level correlation between the species' mean spectral profiles and phylogenetic distance, to evaluate whether VIs at the plot level (from our study) show comparable phylogenetic heritability to the hyperspectral data at the plant level (from Schweiger et al., 2018).
We used non-metric multidimensional scaling (Kruskal, 1964) to visualize individual species positions in the trait and productivity space, using monoMDS in the R package vegan (Oksanen et al., 2019) on a Gower's distance matrix estimated from seven functional traits (stem dry-matter content, leaf dry-matter content, leaf circularity, vegetative height, leaf carbon content, leaf length, leaf thickness) along with biomass, NDVI, and GNDVI. The productivity http://www.wileyonlinelibrary.com/journal/AppsPlantSci © 2020 Scher et al.

FIGURE 2.
A 127-species phylogeny with corresponding biomass measurements and spectra mapped. Trait and productivity metrics are all rescaled from 0 (white) to 1 (black); raw measurements can be found in the data sets provided in GitHub (https://github.com/lanescher/prairie-remote-sensing-2020/ tree/master/DATA response and trait variables were fit to the ordination in vegan using the envfit function. The reader may note that we did not estimate partitioned biodiversity effects (sensu Loreau and Hector, 2001). The biomass sampling size in the monocultures (0.25 m 2 ) in our study did not equal the sampling size in the multispecies plots (1 m 2 ), and multispecies plot samples consequently included a greater proportion of plot edges than the monocultures. As the null expectation that diversity effects will equal zero is not met in the current sample, we leave these analyses to future studies.

Soil types
The first two axes of the soil principal component analysis explained 49.2% and 19.6% of the variance, respectively (available as Supplement 3 on GitHub and Zenodo [Scher et al., 2020], see Data Availability statement). The depth of the A horizon was responsible for 48.06% of the variance observed on the first axis and 23.88% of the second axis. Soils differ significantly among blocks, primarily distinguishing block A (μ = 38.5 ± 0.6 cm [SEM]) from blocks B-F (μ = 26.2 ± 0.2 cm; Kruskal-Wallis test, k = 172.44, P = 2.19 × 10 −35 ; Fig. 1). Block type was consequently retained as a random effect in biomass and cover regressions to account for differences in soil.

Biomass
The biomass of the monoculture plots ranged from 0 to 3977 g per 4-m 2 plot, with a mean of 399.0 g and a standard error of 29.1 g ( Table 2, Fig. 3). There was no effect of soil blocking on the biomass (Kruskal-Wallis test, k = 6.15, P = 0.292). We did not analyze these patterns in the multispecies plots, because the multispecies plots that were measured incorrectly were concentrated in a few blocks, the removal of which might introduce bias from block conditions. We do, however, show summaries of correctly measured multispecies plots in Fig. 3. All biomass measurements for the monoculture and multispecies plots are listed in Supplements 4 and 5 (available on GitHub and Zenodo, see Data Availability statement), respectively.

Volume
The mean volume across all plots was 0.96 m 3 and ranged from -0.31 m 3 to 3.55 m 3 . The 17 negative volume estimates suggest there was little vegetation within those plots and that the ground may slope. The volume was higher in multispecies plots (1.29 ± 0.08 m 3 ) than in monoculture plots (0.87 ± 0.05 m 3 ; Table 2; Kruskal-Wallis test: w = 12,705, P < 0.0001) and varied among blocks (two-sided Mann-Whitney test, k = 15.74, P = 0.008).

Evaluation of biomass models
We evaluated models predicting biomass with combinations of covariates from three groups (a VI or single band, total or planted cover, volume), for a total of 89 models with between one and three covariates each. Including block as a random effect resulted in negligible improvement in model likelihood and consequently the AIC increased (poorer model fit) for all models (available as Supplements 6-8 on GitHub and Zenodo, see Data Availability statement). We therefore only discuss models without block as a random effect. Moreover, the rank order of models based on AIC using plots with and without flowers was similar to that of models using only plots without flowers (available as Supplement 9 on GitHub and Zenodo, see Data Availability statement), so we do not discuss the results of the analyses conducted solely on plots without flowers. Models predicting the biomass of monoculture (Table 3 and Supplement 10 [available on GitHub and Zenodo, see Data Availability statement]) and multispecies plots (Table 4 and Supplement 11 [available on GitHub and Zenodo, see Data Availability statement]) separately produced very different results. Because of the difference in the number of monoculture plots (n = 254) and multispecies plots (n = 72) used here, the models evaluating all plot types together (see Supplement 12 on GitHub and Zenodo, see Data Availability statement) were dominated by the monoculture plots, and were almost identical to the models of monoculture plots alone. We therefore report the results for monoculture and multispecies plots separately. Planted cover (30/89 models; cumulative AICw = 0.999) and volume (45/89 models; cumulative AICw = 0.999) held by far the most predictive power, followed distantly by GDVI 2 (6/89 models; cumulative AICw = 0.252). The cumulative AICw of all other predictors was less than 0.104. The best model for predicting biomass in monoculture plots used GDVI 2 , volume, and planted cover as predictors, and accounts for 46% of the variation in biomass. All models incorporating planted cover and volume account for approximately 45% of the variation in biomass and have ΔAIC values <4.6. All models without both planted cover and volume have ΔAIC > 26.57 and R 2 < 0.42. The best model for predicting monoculture plots using only metrics from remote sensing imagery is similar to the best overall, but uses total cover, estimated by a visual inspection of aerial imagery, instead of planted cover, estimated by a visual inspection at ground level (R 2 = 0.38, P < 2.2 × 10 −16 ). The most important predictors of multispecies plots were volume (45/89 models, AICw = 0.999), total cover (30/89 models, AICw = 0.217), and planted cover (30/89 models, AICw = 0.215). Models incorporating volume have ΔAIC < 4.62 and account for at least 47% of the variance in biomass, whereas models that do not include volume have ΔAIC > 16.9 and account for less than 23% of the variance in biomass. The best model for predicting multispecies plots using only remotely sensed variables uses the red edge band, total cover, and volume (R 2 = 0.50).
Across all models, the residuals were nonlinear and heteroskedastic, suggesting the covariates we used did not account for all the variation in the response.

Cover models
We evaluated models predicting cover with combinations of covariates produced by selecting one variable from each of two groups (VI or single band and volume), for a total of 15 models with one or two covariates each. Models including all plots were dominated by monocultures (see Supplement 13 on GitHub and Zenodo, see Data Availability statement), and models predicting including and excluding flowers were very similar (see Supplement 14 on GitHub and Zenodo, see Data Availability statement). Similar to the models predicting biomass, including block as a random effect increased the AIC in all models (see Supplements 15-17 on GitHub and Zenodo, see Data Availability statement).
The best model for predicting cover in monoculture plots uses GDVI 2 and volume and accounts for 67% of the variance in biomass. The AIC and R 2 scores are similar for models using only a VI and models using the same VI with volume (ΔAIC < 2), except in the case of GDVI 2 , for which the model using volume is better (ΔAIC = 10.3). The ΔAIC is >105.99 for the models using single bands instead of VIs (Table 5). Volume (8/15 models) and GDVI 2 (2/15 models) are the most important covariates for predicting cover in monoculture plots (cumulative AICw = 0.999 and 0.994, respectively), with the AICw of all other covariates less than 1.5 × 10 −6 .
All multispecies-plot models (Table 6) have lower R 2 values than models predicting monoculture cover. The best model for multispecies plots uses only GNDVI and accounts for 29% of the variation in cover. Volume (8/15 models), GNDVI (2/15 models), and red edge (2/15 models) are the most important predictors of cover in multispecies plots (AICw = 0.384, 0.358, and 0.232, respectively).

Phylogenetic signal of functional traits correlated with biomass and NDVI
Traits exhibit high variation in phylogenetic heritability (see Supplement 18 on GitHub and Zenodo, see Data Availability statement), but overall the multivariate trait space is moderately predicted by phylogeny, with plants largely intermixed in trait space by family, although the Fabaceae and Poaceae stand out as distinct (Fig.  4). Stem dry-matter content and vegetative height, two of the traits most strongly correlated with productivity, show low phylogenetic heritability (stem dry-matter content: λ < 0.00001, K = 0.00263; vegetative height: λ < 0.00001, K = 0.00651) (Fig. 5, Supplement 18 [available on GitHub and Zenodo, see Data Availability statement]). Leaf thickness and circularity are also strongly correlated with biomass and have moderate estimates of phylogenetic heritability (λ = 0.43687 and 0.33924, respectively). Seed mass is weakly positively correlated with productivity and has moderate phylogenetic heritability (λ = 0.0771). The three traits with particularly strong phylogenetic heritability (genome size, λ = 0.88188; leaf nitrogen content, λ = 0.80715; leaf length, λ = 0.5112; and leaf width, λ = 0.40038) were weakly correlated with biomass and NDVI (Fig.  5). All measures of Blomberg's K were significantly less than 1, suggesting that the variance for each trait measured is relatively high within the sampled clades vs. among clades. Genome size, leaf nitrogen content, and specific leaf area had the highest measures (K = 0.02821, 0.01307, and 0.01013, respectively). The correlation between the pairwise VI dissimilarity and phylogenetic distance was not statistically significant (see Supplement 19 on GitHub and Zenodo, see Data Availability statement), but biomass was significantly correlated with phylogeny using the same Mantel tests (r = 0.4476, P < 0.001). Our pairwise species-level VIs were also not significantly correlated with the pairwise VIs described by Schweiger et al. (2018) (r = 0.0089, P = 0.468), likely because our VIs were collected at the plot level instead of the leaf level and thus less closely tied to individual species.

Predictive ability depends on vegetation diversity
Our study demonstrates that the predictive ability of all remote sensing metrics varies depending on diversity, but indicates that the  drone-based quantification of vegetation volume is a useful estimator in both monoculture and multispecies plots, explaining 34% and 47% of the variation in biomass, respectively. Of the predictors we tested for biomass, we found that volume is the most important for multispecies plots (cumulative AICw = 0.999) and second-most important (after planted cover) for monoculture plots (cumulative AICw = 0.999). Only one combination of covariates (volume and red edge) estimates biomass in multispecies plots better than volume alone (R 2 = 0.50, ΔAIC = 0.62). Conversely, volume is less effective at predicting biomass in monoculture plots than 58/89 combinations of covariates based on ΔAIC (volume and total or planted cover with or without any VI or single band; planted cover with or without any VI or single band; planted cover alone; volume and corrected or uncorrected NDVI, GNDVI, red edge, or near-infrared, or corrected green or red), which explain up to 46% of the variation in biomass (see Supplement 10 on GitHub and Zenodo, see Data Availability statement). This finding is particularly interesting because the variance in biomass among monoculture plots is higher than the variance among multispecies plots (Fig. 3). This is what we would expect, as the monoculture plots contain the extremes of growth form whereas multispecies plots are weighted averages of these extremes; consequently, it is hard to see how the variance of mixture plot variance could not be less than the variance of the monoculture plot variance. Volume may explain more of the variance in multispecies biomass than monoculture biomass if high-diversity plots have greater vertical niche partitioning and thus show a tighter correlation between volume and biomass due to a more complete filling of the volume estimated by photogrammetry. Vertical niche partitioning may result in more efficient space filling in the multispecies plots: species with different growth forms may exploit vertical layers more efficiently, mirroring canopy packing in diverse forests (Jucker et al., 2015). In monocultures, individuals of the same species are more likely to occupy the same level, leading to empty space under the highest vegetation layer. Because volume is estimated from the outermost layer of vegetation only, empty space will result in lower measured biomass but will have no effect on estimated volume. Therefore, monocultures will likely have a looser relationship between biomass and volume. Because of the limitations of our sampling that make it impossible to tease apart complementarity from selection effects in the multispecies plots (see discussion above), we cannot directly test the hypothesis here. To further investigate this relationship, LiDAR metrics that identify the density of vegetation at different heights could be useful, as would future analyses in which monoculture biomass and multispecies plot biomass were measured in the same way.
Surprisingly, although NDVI is one of the most commonly used VIs (Pettorelli et al., 2005), it was not the best VI predictor of biomass or vegetation cover in either plot type in our experimental prairie. Based on AICw, GDVI 2 was more important in predicting both biomass and cover (cumulative AICw = 0.252 and 0.999, respectively) than NDVI (cumulative AICw = 0.104 and 1.43 × 10 −6 , respectively) in monoculture plots. In multispecies plots, red edge was the most important spectral predictor of biomass (cumulative AICw = 0.160) and GNDVI was the most important spectral predictor of cover (cumulative AICw = 0.385); by contrast, NDVI was the least important spectral predictor of biomass (cumulative AICw = 0.042) and the third-most important spectral predictor of cover (cumulative AICw = 0.152). The difference in the best spectral predictors of biomass is likely related to the difference in vegetation density between monoculture and multispecies plots: GDVI 2 is more dynamic for less-dense vegetation (Wu, 2014), whereas red  edge is a good indicator of biomass at higher vegetation densities (Cao et al., 2016). Similarly, GDVI 2 and GNDVI likely predict cover in monocultures and multispecies plots best because they are more sensitive to variation at low and high vegetation densities, respectively (Gitelson et al., 1996;Wu, 2014). We expect that the patterns we describe here will be generalizable to other grassland community types that are similar in diversity, density, and stratification.

Prairie productivity is driven by phylogenetically labile traits
The traits most strongly correlated with productivity in our study are also traits with low phylogenetic heritability (Fig. 5).
Although traits evolving without constraint or selection are expected to track phylogeny (Felsenstein, 1985), natural selection dampens the effect of phylogenetic heritage on trait variation if it is uniform across taxa (e.g., in stabilizing selection) (Hansen, 1997), and even more dramatically when there is convergence between clades in ecologically significant traits (e.g., Cavender-Bares et al., 2004). Our results suggest that traits responsible for productivity-estimated in our study as biomass and VIs-are more variable within clades than between clades, either due to convergence, stabilizing selection, or pleiotropy on other selected traits. This is not true of all traits; in multivariate space, some plant families segregated based on functional traits (Fig. 4). This is expected if traits in general tend to track phylogeny; while individual traits diverge even under neutral expectations, even a wide range of traits that all fail to track phylogeny closely may track it in aggregate (Givnish and Sytsma, 1997;Cadotte et al., 2017). The mint family (Lamiaceae) and sunflower family (Asteraceae) are particular exceptions in our study as they cover a relatively broad range of trait space, suggesting that trait variance within these lineages may be driving the observed patterns. These results suggest that additional unmeasured traits associated with phylogeny may drive aboveground productivity. Alternatively, we may be observing a biodiversity effect (Duffy et al., 2017) that correlates with phylogeny, even if individual traits do not. Trait correlations with both productivity and phylogeny may be more complex than our metrics suggest; leaf dry matter content, for example, has been shown to be correlated with productivity (Smart et al., 2017), but is inconsistently correlated with productivity metrics in the current study and exhibits moderate phylogenetic conservatism (Fig. 5). Furthermore, we also assessed whether measures of productivity were correlated with traits that influence productivity, such as leaf carbon content. Although NDVI has been used as a predictive tool for measures of biomass and leaf nitrogen content (Cabrera-Bosquet et al., 2011), we did not find such a correlation with leaf carbon content, likely because the spectra involved in distinguishing the levels of carbon are different from those involved in detecting nitrogen levels.

Application of high-resolution imagery in ecological studies
Remote sensing metrics are often used to detect productivity (Wang et al., 2004), as well as species richness and diversity (Gould, 2000); however, few studies examine these three features together (Wang et al., 2016). Here, we find that these three features are interrelated: vegetation diversity affects the relationship between remote sensing metrics and productivity. Specifically, these results suggest that when using metrics derived from imagery to estimate productivity, diversity needs to be considered. We focus on only two levels of diversity (monoculture and multispecies), but future studies using more levels could examine this relationship in more detail. Our finding that volume is more useful than NDVI in estimating biomass has two important implications: (1) a traditional camera can capture adequate RGB images for volume reconstruction, reducing costs compared to techniques requiring a multispectral sensor; and (2) LiDAR approaches may prove even more accurate, given the informative vegetation structure data. The VIs, particularly NDVI, have been the focus of most previous biomass estimations (Das and Singh, 2012;Goswami et al., 2015), but our results support recent research suggesting that high-resolution volume data are also a good estimate of biomass (Wallace et al., 2017).
We demonstrate the use of high-resolution remote sensing imagery to address an ecological question. We find that volume is a good predictor of biomass regardless of diversity. Because structural metrics such as volume are the least expensive remote sensing proxy to obtain, this technique could be used much more widely than those requiring a multispectral sensor. Finally, we find that the relationship between remote sensing proxies of productivity depends on the diversity of the vegetation. The variation in this relationship is likely due to differences in vegetation structure and warrants further investigation.

ACKNOWLEDGMENTS
The authors thank the dedicated Morton Arboretum weekly prairie volunteers (K. Thomas, R. Czuprynski, P. Tota, J. Knight, and J. FIGURE 5. Correlation of biomass and NDVI with individual traits. Size of dots represent phylogenetic heritability, as estimated using Pagel's λ. The traits with greatest heritability (lower left; i.e., leaf nitrogen [N] content and genome size) are weakly correlated with biomass and NDVI, whereas traits with a high correlation have low heritability (upper right; i.e., stem dry-matter content, vegetative height).