Where Are Global Vegetation Greening and Browning Trends Significant?

Global greening trends have been widely reported based on long‐term remote sensing data of terrestrial ecosystems. Typically, a hypothesis test is performed for each grid cell; this leads to multiple hypothesis testing and false positive trend detection. We reanalyze global greening and account for this issue with a novel statistical method that allows robust inference on greening regions. Based on leaf area index (LAI) data, our methods reduce the detected greening from 35.2% to 15.3% of the terrestrial land surface; this reduction is most notable in nonwoody vegetation. Our results confirm several greening regions (China, India, Europe, Sahel, North America, Brazil, and Siberia), that are also supported by independent data products. We also report evidence for an increasing seasonal amplitude in LAI north of 35°N. Considering the widespread use of spatially replicated trend tests in global change research, we recommend adopting the proposed multiple testing procedure to control false positive outcomes.

potentially leading to inferred trends that are false positives: when performing hundreds of thousands of tests, which is the case of gridded datasets derived from satellite products -we expect % of all grid cells to be significant just by chance (false positives). Due to spatial autocorrelation, these false positives may lead to the interpretation of spurious spatial patterns. In the case of global greening and browning, previous studies have reported (at   0.10 ) LAI trends in ∼40% of the terrestrial land surface (∼35% greening and ∼5% browning). Therefore, we expect a fourth (0.10/0.40) of these trends to be false positives just due to the statistical testing mechanism.
The issue of multiple hypothesis testing has been raised in the environmental sciences, along with some common corrections for multiple hypothesis testing (Livezey & Chen, 1983;Ventura et al., 2004;Wilks, 2006Wilks, , 2016, but these methods have not been widely adopted. We focus on a permutation method that allows us to make inference on regions of significant greening or browning, as shown recently by Cortés et al. (2020). This novel statistical method shall allow us to define clear focal areas of greening that can then be used in subsequent analysis.
In this work, we reanalyze global greening trends based on the Boston University Advanced Very-High-Resolution Radiometer (BU AVHRR) LAI data set and validate the results with four other data products: NOAA CDR, LTDR, BU moderate resolution imaging spectroradiometer (MODIS C6), and SPOT. We perform a Mann-Kendall (MK) trend test, which is a common significance test for the slope estimated by the Theil-Sen method. The latter is a nonparametric estimator of a linear trend, which tells us how the average LAI changes with time, but is robust against individual outliers. We correct all analyses for the multiplicity of hypothesis tests using a permutation method based on clustering. However, we also expect that the quantiles of LAI may behave differently than the mean over time. For example, an increase in the seasonal amplitude of LAI would make the 0.1 and 0.9 quantiles behave differently than the mean. To test this hypothesis, we also aggregate our yearly data into the 0.1, 0.25, 0.5, 0.75, and 0.9 quantiles, and perform trend tests on these quantiles.
The study is structured as follows. The five data products used in this study are further explained in Section 2. In Section 3, we detail our methodology. Our results and discussion are presented in Section 4, and we present our conclusions in Section 5.

Data
We analyze five data products that all aim at improving the estimates of Leaf Area Index (LAI) around the globe. These datasets are derived from different satellite products and use state of the art methodologies to ensure the quality and consistency of their estimates. Our results are based on the BU AVHRR LAI data, and validated with LAI data obtained from the National Oceanic and Atmospheric Administration's Climate Data Record (NOAA CDR), the Land long Term Data Record (LTDR), BU MODIS C6, and SPOT/PROBA-V. Further details can be found in Table 1

Methods
Before applying the trend test, we aggregate the data into yearly values using several descriptive statistics: the mean, the 0.1, 0.25, 0.5, 0.75, and 0.9 quantiles, and the interdecile range (defined as the difference between the 0.90 and 0.10 quantiles). The ability to detect trends is not affected by this averaging, and the trend estimates are comparable to methods which use the unaveraged time series (Forkel et al., 2013). We use all available LAI values in the given year, as this is better suited for a global study and avoids issues such as defining a growing season, dealing with multiple growing seasons, and dealing with growing seasons that span two calendar years (C. Chen, Park, et al., 2019). We test for a trend with the nonparametric MK trend test (Kendall, 1975;Mann, 1945). The MK trend test is commonly used as a test of significance for the Theil-Sen estimator, which is a nonparametric method to estimate a monotonic trend from a time series.
Temporal autocorrelation is accounted for with an AR(1)correction. Not accounting for temporal autocorrelation can lead to increased false positive rates (von Storch, 1999). We follow the procedure outlined in von Storch (1999): for each grid cell, we calculate the temporal autocorrelation at lag-1, MK trend test is then performed on this new time series. This procedure is applied to all of the data in this manuscript.
To correct for multiple hypothesis testing, we apply a permutation method based on clustering. The permutation test establishes a threshold for overall significance based on the number of contiguous (first-order queen neighbors) significant grid cells. The motivation behind this test is that the clusters formed by false positives are smaller than those formed where a true increase or decrease in vegetation exists. We perform 3,000 permutations to determine the threshold for significance. At each permutation, all grid cells are permuted jointly, that is, using the same set of permuted time indices. This preserves the spatial correlation of the data. The MK trend test is performed on this permuted data, and the largest cluster of significant grid cells is recorded. The     1 th quantile of the 3,000 recorded largest clusters is the threshold for overall significance. In our original data, for each significant region we count the number of grid cells, and if it exceeds the threshold established by the permutation method, we declare this region significant. This method has been proven to control the probability of having false positives in the results, and has higher statistical power than comparable methods, such as Bonferroni and related methods (Cortés et al., 2020).
For a consistent comparison of greening trends in different LAI products, we also analyze the time period from 2000 to 2018 for all data products.

Results and Discussion
We find consistent greening trends of the yearly LAI mean in the whole world: Asia (35% of overall global greening), Europe (32.5%), North America (13.3%), Africa (12.6%), South America (6%), and Australia (0.5%). These trends are detected mostly in the northern hemisphere ( Figure 1). These greening patterns are consistent with those detected in other recent studies such as C. Chen, Park, et al. (2019); we detect the same greening patterns and, additionally, several more. Our results validate the main greening hotspots, but also suggest that other regions which have been discussed in other studies are not large enough to be statistically significant when analyzing the whole terrestrial land surface -for example, our study does not detect the greening and browning of Alaska (Berner et al., 2020;Forkel et al., 2013;Verbyla, 2008), the greening western Sahel zone (Dardel et al., 2014), and most of the greening of Australia (Ukkola et al., 2016). Our method assumes that false positive regions are smaller than true signal regions. It is reasonable to expect regions of greening to be much larger, due to the large scale global greening that has been reported. This is the case in our study, as evidenced by the detected greening across all continents. Nevertheless, there are still regions, as discussed previously, that are too small to be detected by our method when analyzing the whole world. Therefore, analyzing a smaller study region may reveal previously undetected trends.
We observe pockets of nonsignificant grid cells among the significant clusters. Because there is no wall-towall ground truth validation, noise in the trend can cause some small pockets. Larger pockets can be attrib-uted to different biome types, caused by (sharp) elevation change in mountainous regions and different land use (e.g., urban vs. cropland).
We look only at monotonic trends for the whole period. Thus it is possible that there are strong piecewise trends (e.g., first greening, then browning) which lead to a nontrend overall. This has been observed by de Jong et al. (2012), who reported that 15% of the terrestrial land surface exhibits both greening and browning.
The statistical testing of such trend changes, while taking into account the multiple hypothesis testing, will be an interesting follow-up study.
Not adjusting for multiple hypothesis testing can overestimate the global greening ( Figure 2 (26.7% reduced to 5.2%) we observe a great reduction in detected greening. We categorize areas as uncertain greening where there are grid cells whose trend signal is not strong enough to be detected by our method, but strong enough to be detected when performing no correction.
The discrepancy in the amount of overestimation by continent can be better understood by looking at the climate zones and land use type. This reveals that the uncertain greening occurs twice as much in nonwoody compared to woody vegetation (Figure 2c), while there is very little uncertain greening in crop land. This is in line with previous studies that have observed prominent greening in cropland (de Jong et al., 2011;Piao et al., 2020). Piao et al. (2020) also observe strong greening in afforested regions and biomes with low human footprint, which coincides with the bulk of the confirmed greening that we detect in temperate and continental climate zones. The overestimation is particularly large in nonwoody vegetation within tropical and dry climate zones (Figure 2c). In tropical regions, large uncertainty is caused due to the saturation of LAI in dense vegetation, in addition to cloud and aerosol contaminants, while uncertainties in semi-arid climate and sparse vegetation can be due to the sensitivity of the vegetation indices to change in soil background (Piao et al., 2020).
Using the mean or the median to aggregate yearly values can impact the results. Both mean and median detect similar greening regions, but browning is only detected when using the median (Figure 3). While the mean detects greening in ∼16.6 million km 2 (273,247 grid cells) of the land surface and no browning, the median detects greening in ∼15.8 million km 2 (258,385 grid cells) and browning in 1.1 million km 2 (26,241 grid cells) of the land surface. Two browning patterns arise when using the yearly median: one in Siberia and one in Canada. Previous studies have indicated a reversal or stalled greening in high latitudes (Gonsamo et al., 2019), and it is argued that increased plant growth in spring and earlier start of the growing season leads to decreased summer growth and decreased peak season maximum plant growth. When correcting for multiple testing, the summarizing statistic should not be overlooked when analyzing global greening and browning trends. Previous studies commonly use the average, so a reanalysis using the median can potentially reveal new browning patterns and reduce greening regions. the mean and median, the lower quantiles (0.10, 0.25) reveal browning trends in specific regional clusters around the globe (Figure 3). When we look at the distribution of Z-scores obtained from the LAI trends throughout the quantiles, there is a shift from most grid cells indicating browning at the 0.10 quantile to most grid cells indicating greening at the 0.90 quantile ( Figure S1). The bulk of the distribution shifts from browning to greening, which hints toward an increase in the seasonal amplitude around the globe. We further investigate this by analyzing the interdecile range, which shows the regions where we detect several regions of significant increase in the seasonal amplitude of LAI ( Figure S1). An increase in seasonal amplitude has been previously observed and could be due to trends in deforestation, fertilization and climate change effects, and/or changes in management (Kraemer et al., 2020).
The growing season in the higher latitudes has been expanding (Wolfe et al., 2005;H. Zeng et al., 2011;Ziska et al., 2011). This can cause the browning trends at the 0.10, 0.25, and 0.50 quantiles in the northern latitudes, since more low LAI values are detected each year. Previously these values were missing (marked as NA in the datasets), and they did not contribute toward the yearly aggregate. We would expect the trends CORTÉS ET AL.
10.1029/2020GL091496 6 of 9 at the 0.10 quantile to be the most affected, yet most of these are validated by being detected at the 0.25 quantile. Unlike the median, the mean does not detect browning when analyzing yearly trends. This can be explained by the increase in peak summer vegetation which, as opposed to the longer growing season, is driving the greening trends in Eurasian boreal forests (Gao et al., 2020). The mean is more affected than the median by these summer peaks and therefore does not detect browning.
A large increase in the seasonal amplitude of CO 2 (at least 50%) has also been observed at all latitudes north of 45°N, and to a lesser extent (∼26%) between 35°N and 45°N (Graven et al., 2013). It is argued that this increase can be caused by the ecosystems growing and shrinking more than before. These ecological changes are corroborated by the increasing and decreasing vegetation trends found previously in the northern hemisphere (Myneni et al., 1997), which we confirm with our rigorous significance testing. Furthermore, the regions where we detect an increase in the seasonal amplitude of LAI are also concentrated north of 35°N.
When considering a comparable time span (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018), there is no longer such consistency ( Figure S2). Few regions are detected by more than one data product, e.g., greening in southeast China is only detected by BU AVHRR LAI and BU MODIS C6, and greening in northern Canada is only detected by BU MODIS C6 and SPOT/PROBA V. In contrast to all other datasets, the LTDR detects browning in the Amazon forest, central Africa, Madagascar, and India. These results can be caused by several effects. For example, when shortening the time span, there may no longer be enough evidence to detect many greening trends. To explore this hypothesis, we test all possible 19 year periods in the BU AVHRR LAI data, that is, we analyzed the trends from 1981 to 2000, 1982 to 2001, …, 2000 to2018. Only for 4 (out of 20) of these time periods we found few additional greening trends in the Sahel region ( Figure S3). We find this still inconclusive -it is possible that the trends are missing due to a combination of (a) not enough data points and (b) a change in trend within the time series. Related to (b), abrupt and gradual changes in trend have been observed in large parts of the world by de Jone et al. (2012). A change in trend can also explain the browning found in the LTDR (e.g., greening from 1981 to 2000 can mask the browning from 2000 to 2018). The ability to detect a yearly trend is affected by this shortening.
All datasets that go back to 1981, BU AVHRR LAI, NOAA CDR, and LTDR, agree on the greening regions found in Europe, China, India, North America, Brazil, the Sahel, and Siberia. Thus we have more confidence in these observed greening patterns. Investigating these regions further with trend-attribution methods is an opportunity for further research.
In this work, we have introduced a new step, correcting for multiple hypothesis testing, when analyzing global LAI trends. However, the principles are very generic and relevant to most of the spatiotemporal dynamics encoded today in the growing global multivariate Earth system data cubes . For all these data streams, the presented methodology offers a quantitative way to automatically detect regions of statistically significant trends in either direction while controlling the probability of detecting false positives. By controlling the probability of detecting false positives, we make the detected trends more robust.

Conclusions
Many studies confirm that the Earth is greening but can differ in their explanation. We argue that a key step for analyzing the greening trends is missing -in our study we have introduced and applied a novel method that is often overlooked in a statistical analysis: a multiple hypothesis testing correction. In this work we apply a permutation-based correction to LAI trends and analyze yearly trends of the mean and the 0.10, 0.25, 0.50, 0.75, and 0.90 quantiles.
We detect greening in 15% of the terrestrial land surface, as opposed to 35% when applying no multiple testing correction. Still, we detect several hotspots that now stand out more prominently and consistently across the five data products that were analyzed. This overestimation happens twice as much in nonwoody vegetation than woody vegetation, and particularly in tropical and dry climate zones. Greening detected in crop land is the most reliable. Browning is only detected when aggregating the yearly data using the median instead of the mean. Additionally, we observe an increase in the seasonal amplitude of LAI around the globe.