Associations between socio‐environmental factors and landscape‐scale biodiversity recovery in naturally regenerating tropical and subtropical forests

Natural regeneration is key for large‐scale forest restoration, yet it may lead to different biodiversity outcomes depending on socio‐environmental context. We combined the results of a global meta‐analysis to quantify how biodiversity recovery in naturally regenerating forests deviates from biodiversity values in reference old‐growth forests, with structural equation modeling, to identify direct and indirect associations between socioeconomic, biophysical and ecological factors and deviation in biodiversity recovery at a landscape scale. Low deviation within a landscape means higher chances of multiple sites in naturally regenerating forests successfully recovering biodiversity compared to reference forests. Deviation in biodiversity recovery was directly negatively associated with the percentage of cropland, forest cover, and positively associated with the percentage of urban areas in the surrounding landscape. These three factors mediated the indirect associations with rural population size, recent gross deforestation, time since natural regeneration started, mean annual temperature, mean annual water deficit, road density, land opportunity cost, percentage cover of strictly protected forest areas, and human population variation in the surrounding landscape. We suggest that natural forest restoration should be prioritized in landscapes with both low socioeconomic pressures on land use conversion to pasturelands and urban areas, and high percentage of forest cover.


INTRODUCTION
Tropical and subtropical forests are threatened globally by land use change, with over 100 Mha of forest lost between 1980 and 2012 (Seymour & Harris, 2019). Both the science and practice of forest restoration have been boosted by current international agreements and national regulations as an attempt to reverse these losses and promote biodiversity conservation, climate change mitigation, and enhancement of multiple ecosystem services essential for human well-being (Brancalion et al., 2019;Strassburg et al., 2019). For example, the Bonn Challenge aims to restore 350 Mha of degraded and deforested lands by 2030 (Bonn Challenge, 2018). Achieving such an ambitious restoration goal during the UN Decade on Ecosystem Restoration (2021-2030) will be very expensive if cost-effective restoration approaches, such as natural forest regeneration, are not fully considered and implemented (Crouzeilles et al., 2017). The process of natural forest regeneration may not always reach similar biodiversity outcomes (e.g., species richness, abundance, diversity and assemblage similarity) compared to reference old-growth forests in the same land-scape (e.g., Crouzeilles et al., 2017). Biodiversity recovery can be slow or incomplete due to associations among biophysical, ecological, cultural, socioeconomic and political factors across space and time (hereafter collectively termed socio-environmental factors; Arroyo-Rodríguez et al., 2017;Chazdon & Guariguata, 2016).
Biophysical and ecological factors associated with biodiversity outcomes in naturally regenerating forests have been studied in different regions and at different spatial scales (Crouzeilles et al., 2017;Rozendaal et al., 2019). Yet the potential direct and indirect associations of socioeconomic factors and their interactions with biophysical and ecological factors in determining the trajectory of natural forest regeneration at the landscape scale remains a critical knowledge gap in the context of restoration policy and practice (Chazdon et al., 2020).
To address this knowledge gap, we asked: How are socio-environmental factors associated with deviation in biodiversity recovery in naturally regenerating forests at the landscape scale across the tropics and subtropics? To address this question, we first conducted a global meta-analysis to quantify how biodiversity recovery in naturally regenerating forests deviates from biodiversity values in reference old-growth forests within a landscape. Second, we used structural equation modeling to identify the direct and indirect associations between socio-environmental factors and deviation in biodiversity recovery. High deviation-low chances of multiple sites in naturally regenerating forests successfully recovering biodiversity compared to reference forests-is the result of the occurrence of both early and late successional and non-native species, local extinction of late successional species, and seed dispersal limitation (e.g., Crouzeilles & Curran, 2016;Crouzeilles et al., 2019). We hypothesized that socioeconomic and political factors that increase pressures on land use and create biophysical and ecological conditions with variable localized effects on forest regrowth would lead to high deviation in biodiversity recovery in naturally regenerating forest landscapes.

Data compilation
We used the forest restoration database constructed by Crouzeilles, Ferreira, and Curran (2016a), which includes data on biodiversity and vegetation structure for both reference and restored (naturally regenerating and actively restored forests) or degraded (e.g., agriculture and pasturelands) ecosystems within the same study landscape. The database served to compare reference old-growth forests and naturally regenerating tropical and subtropical forests according to different ecological metrics (species abundance, richness, diversity and/or similarity) and different taxa (plants, mammals, birds, herpetofauna, and/or invertebrates). In our analyses, we included only ecological metrics of species abundance, richness, and/or diversity since assemblage similarity can takes longer to recover and our study landscapes are generally young (Crouzeilles et al., 2016b;Rozendaal et al., 2019). Each primary study in the database contained information on the geographic coordinates of the area under investigation (hereafter study landscape) and old-growth forest used as a benchmark reference in the same landscape. The median distance between sampling sites within our study landscapes was 5 km, so our landscapes (the unit of analysis) were ∼8,000 ha in size (Crouzeilles & Curran, 2016). In total, we considered 34 study landscapes and 360 comparisons between naturally regenerating forests (between 2 and > 100 years old) and reference forests for different taxonomic groups, from 1984 to 2006 (Appendix).
The potential factors associated with deviation in biodiversity recovery (see next sub-section) were grouped into six categories: (i) climate (mean precipitation in the dri-est quarter, mean annual temperature, and mean water deficit); (ii) economic, infrastructure and political (percentage of commodity-driven deforestation; i.e., agricultural commodities production was expanded at the expense of forest cover), percentage of forestry (i.e., commercial tree plantations), sum of land opportunity cost, percentage of strictly protected forest areas, and mean road density; (iii) land use cover and change (percentage of: cropland (which includes shifting agriculture fields, annual and perennial crops and agroforestry systems), forest cover at any successional stage, recent gross deforestation, net primary productivity, pasture, shifting agriculture, and urban area); (iv) Societal issues (fire frequency, human development index, human population variation, rural population density, and rural poverty); (v) soil and topography (elevation, slope, soil bulk density, soil cation exchange capacity, soil pH, and soil sand content), (vi) time (mean time since natural regeneration started). Categories (i), (iii--percentage of forest cover), (v), and (vi) were classified as biophysical and ecological factors, while the other categories were categorized as socioeconomic factors. Detailed information is available in the Supporting Information, Table 1.
We extracted socio-environmental factors from available geospatial databases (except for time since natural regeneration started, which was extracted from Crouzeilles et al., 2016a). We estimated values of all geospatial factors based on the geographic coordinates reported for each study landscape, at six different spatial scales that represent different surrounding landscape sizes (buffer sizes of 5, 10, 25, 50, 75, and 100 km radius; Supporting Information). We conducted all spatial analyses in Google Earth Engine (Gorelick et al., 2017).

Meta-analysis
We conducted a global meta-analysis to quantify mean and deviation in biodiversity recovery in naturally regenerating forests at a landscape scale. To do so, we calculated a standardized mean effect size (response ratio-RR; Hedges, Gurevitch, & Curtis, 1999) for each quantified comparison of biodiversity between naturally regenerating forests and reference old-growth forests within a study landscape (Crouzeilles et al., 2016b(Crouzeilles et al., , 2017. Many RRs can be measured within a single study landscape due to studies accessing different taxonomic groups through different ecological metrics under different periods. Thus, mean biodiversity recovery at a landscape scale can vary from negative to positive, but values around zero are the desirable benchmark as they indicate that naturally regenerating forests are similar to the reference forests in terms of the levels of biodiversity supported (Crouzeilles et al., 2016b(Crouzeilles et al., , 2017. Deviation in biodiversity recovery at a landscape scale was calculated as the deviation of RR values in relation to a RR value equal to zero, which means that levels of biodiversity are similar between naturally regenerating forests and reference old-growth forests within each study landscape (i.e., "restoration success" as defined by Crouzeilles et al., 2019). Thus, deviation in biodiversity recovery in naturally regenerating forests at the landscape scale can be calculated as: where i and j are values of RR and N is the number of RR values, of sites within the same landscape. Thus, deviation in biodiversity recovery varies from zero (low variation) to positive infinite (high variation). Low deviation means that most RRs within a study landscape correspond to a high chance of successfully recovering biodiversity to the level of reference forests at the landscape scale.
Although individual stands of naturally regenerated forests can be similar to reference old-growth forests in terms of the levels of biodiversity supported (RR value equal to zero), a high level of deviation in biodiversity recovery within the same landscape may exist (Crouzeilles & Curran, 2016;Crouzeilles et al., 2019). For example, a study landscape with a large range of positive and negative RR values, but with same mean values, leads to a higher deviation in biodiversity recovery at a landscape scale. Therefore, we argue that deviation in biodiversity recovery is more meaningful information for assessing landscapescale biodiversity recovery in our analysis.
Deviation in biodiversity recovery at a landscape scale could also arise from the use of different ecological metrics and various taxonomic groups included in our analyses. We therefore removed RR outliers to avoid extreme and inconsistent data. We also tested whether the RRs were influenced by taxonomic group or a given ecological metric using ANOVA. RR values were not affected by taxonomic group (p = 0.48) or ecological metric (p = 0.61), and we therefore did not include these as factors in further analyses. Finally, deviation in biodiversity recovery and all potential socio-environmental factors were standardized using a z-transformation (mean of 0 and standard deviation of 1) to allow for unbiased comparisons of effect sizes across variables (Legendre & Legendre, 1998).

Scale of effect
We used an AIC-based model selection procedure to identify the surrounding landscape size that best explained deviation in biodiversity recovery in naturally regenerating forests at a landscape scale Jackson & Fahrig, 2012, 2015. We then completed all subsequent analyses using each potential factor at its strongest scale of effect (Supporting Information, Table 1).

Piecewise structural equation modeling
We conceived and then tested a new conceptual model of direct and indirect associations of socio-environmental factors and deviation in biodiversity recovery in naturally regenerating forests. A given potential explanatory factor was hypothesized to be associated with deviation in biodiversity recovery (i) directly-when there are no mediating factors, or (ii) indirectly-when its association is mediated by another factor. Our causal hypotheses were based on previous qualitative and quantitative analyses of the factors affecting the occurrence and persistence of naturally regenerating forests (e.g., Crk, Uriarte, Corsi, & Flynn, 2009;Jakovac, Dutrieux, Siti, Peña-Claros, & Bongers, 2017; Levy-Tacher, Ramírez-Marcial, Navarrete-Gutiérrez, & Rodríguez-Sánchez, 2019; Reid, Fagan, & Zahawi, 2018), which we also expected to be associated with deviation in biodiversity recovery (Supporting Information, Table 2).
We conducted two stepwise generalized linear multiple regressions to identify both the strongest model(s) directly and indirectly associated with deviation in biodiversity recovery. Prior to each of these two analyses, we removed potential explanatory factors that could overfit such a model by eliminating, one-by-one, the potential explanatory factors with the highest variation inflation factor (VIF) and of least ecological importance until the model contained only potential factors with VIF <5.
In the first analysis, we hypothesized and tested the direct associations of potential explanatory factors with deviation in biodiversity recovery (dependent variable). In the second analysis, we tested the associations of potential explanatory factors with significant direct factors (dependent variables) that may mediate indirect associations of non-significant direct factors. We did not consider a model selection approach because: (i) the number of alternative models would be much larger than our sample size, and (ii) indirect associations can be detected only using structural equation models.
We then used piecewise structural equation modeling (SEM; Shipley, 2000) to develop a global model by listing all strongest model(s) with factors directly and indirectly associated with deviation in biodiversity recovery. We tested the significance of the global model using Fisher's C statistic (Shipley, 2013). This highlights the fit of the global model, and was calculated from the p-values of all tests of F I G U R E 1 Location of the 32 study landscapes used in our analysis. Colors differentiate the standardized values of deviation in biodiversity recovery within naturally regenerating tropical and subtropical forests direct separation as follows: where p i are p-values of all independence tests. That is, the approach tested whether the arrows that do not exist in the "global model" (alternative paths) are significant, and K is the total number of independence tests. We used the coefficients of each path to estimate total direct and indirect effects of socio-environmental factors on deviation in biodiversity recovery (Grace, 2006). To estimate the total indirect effects, we summed all indirect pathway coefficients associated with direct pathway coefficients. To estimate the total global effect, we multiplied the direct pathway coefficients (direct effects) by the indirect pathway coefficient (indirect effects), and summed-the total direct and indirect effects (Grace, 2006).

RESULTS
Our analysis included information from 32 study landscapes, 350 RRs (Figure 1), and naturally regenerating forests with average of 53 years old (from 2 to > 100 years old). The biodiversity recovery data included comparisons for birds (24%), invertebrates (29%), plants (24.5%), mammals (10.5%), and herpetofauna (12%). Data on species richness (46%) and abundance (36%) were more frequent than data for species diversity (18%). Deviation in biodiversity recovery (where high deviation means lower chances of multiple RRs corresponding to successful biodiversity recovery compared to reference forests) was significantly and directly associated with three factors. It was negatively associated with both the percentage of cropland (estimate = −0.51, SD = 0.21, strongest landscape size = 5 km) and the percentage of forest cover (estimate = −0.48, SD = 0.20, landscape size = 25 km). It also was positively associated with the percentage of urban area (estimate = 0.46, SD = 0.16, landscape size = 50 km) (R 2 = 0.45, F = 3.18, p = 0.012). Our study landscapes have, on average, 58% of forest cover (varied between 15 and 95%), 19% of cropland (0 to 75%), and < 1% of urban areas (0 to < 1%). We then fitted these three direct factors as dependent variables that potentially mediated indirect associations of non-significant direct factors.
The results of the SEM indicated that the global model (i.e., including both direct and indirect paths) was a good representation of the socio-environmental factors associated with deviation in biodiversity recovery, both directly and indirectly (Fisher's C = 49,21, p = 0.958, Figure 2; Supporting Information, Table 3). Deviation in biodiversity recovery was indirectly associated in three ways ( Figure 2, Supporting Information, Table 3). First, percentage of cropland was positively associated with rural population density (estimate = 0.51, SD = 0.14, strongest landscape size = 50 km radius), percentage of recent gross deforestation (estimate = 0.47, SD = 0.14, landscape size = 100 km radius), and human population variation (estimate = 0.32, SD = 0.14, landscape size = 25 km radius) in the surrounding landscape. Second, percentage of forest cover was positively associated with time since natural regeneration started (estimate = 0.58, SD = 0.19, landscape size = NA) and mean annual temperature (estimate = 0.54, SD = 0.19, landscape size = 5 km radius), and negatively associated with mean annual water deficit (estimate = −0.66, SD = 0,14, landscape size = 5 km radius), road density (estimate = −0.44, SD = 0.16, landscape size = 50 km radius), land opportunity cost (estimate = −0.51, SD = 0.16, landscape size = 100 km radius), and percentage cover of strictly protected forest areas F I G U R E 2 Direct and indirect associations between socio-environmental factors and deviation in biodiversityrecovery. Colored panels indicate different pathways (blue-percentage of forest cover (Forest); purple-percentage of urban areas (Urban); pink-percentage of cropland (Cropland), with solid (direct paths) and dotted lines (indirect paths). Blue lines represent positive associations and red lines, negative associations. Def = gross deforestation rate, Rpop = rural population density, Pop = human population variation, Oppt = land opportunity cost, Road = road density, Time = time since natural regeneration started, WD = water deficit, Temp = mean annual temperature, SPA = strictly protected forest areas. Estimates and standard deviation (in parentheses) are given for each path. R 2 values showed in dependent variables symbols represent the fraction of deviation in biodiversity recovery explained by four multiple generalized linear regressions included in the Structural Equation Model (estimate = −0.51, SD = 0.18, landscape size = 100 km radius) in the surrounding landscape. Third, percentage of urban area was positively associated with human population variation (estimate = 0.24, SD = 0.10, landscape size = 25 km radius) and road density (estimate = 0.80, SD = 0.10, landscape size = 50 km radius) in the surrounding landscape.
The total magnitude of direct and indirect effects in the global model was estimated to be −1.39 and 1.07 (mean = −0.12 and 0.05, SD = 0.36 and 0.48), respectively (Supporting Information, Table 4). The total magnitude of effects in the global model (sum of all direct and indirect effects) was estimated to be -0.32 (mean = −0.04, SD = 0.43; Supporting Information, Table 4). Our global model was shaped by four generalized linear regressions that explained biodiversity recovery directly (R 2 = 0.67) and indirectly, via percentage of forest cover (R 2 = 0.67), percentage of cropland (R 2 = 0.66), and percentage of urban area (R 2 = 0.72).

DISCUSSION
Our findings supported the development of a new conceptual model in which deviation in biodiversity recovery in naturally regenerating forests at the landscape scale was directly associated with surrounding land use and forest cover, which also mediated several indirect socio-environmental associations (Figure 2). Landscapes with higher percentages of forest cover and cropland showed the highest levels of recovery of biodiversity in relation to reference forests. It is important to emphasize that cropland areas are not significantly correlated with commodity-driven deforestation area, which is a permanent agriculture class (R 2 = 0.32, p = 0.08; landscape size = 5 km radius). This means that croplands in our study landscapes are mostly represented mostly by non-mechanized farmlands, i.e., they are considered to be a relatively more "environmentally-friendly" matrix, which includes shifting cultivation, perennial crops and agroforestry systems. In contrast, landscapes with higher percentages of urban areas and pasturelands showed lower levels of biodiversity recovery. Land use and forest cover at a landscape scale were primarily shaped by economic opportunities, social needs and ecological/biophysical conditions. We suggest that natural forest restoration should be prioritized in landscapes with both low socioeconomic pressures on land use conversion to pasturelands and urban areas, and high percentage of forest cover. We report the limitations of this study in the Supporting Information.

Economic opportunities and societal needs
Deviation in biodiversity recovery was higher in landscapes with higher: (i) deforestation rates, (ii) rural and/or urban population densities, (iii) road densities, and (iv) land opportunity costs. These landscapes are characterized by substantial socioeconomic pressures to convert tropical secondary forests and old-growth forests to agriculture or grazing lands (Curtis, Slay, Harris, Tyukavina, & Hansen, 2018). As a consequence, such landscapes are usually characterized by a history of transformation, degradation and fragmentation. They are also characterized by long distances to seed sources from neighboring vegetation and reduced recovery capacity of natural regeneration (Arroyo-Rodríguez et al., 2017).
Conversely, deviation in biodiversity recovery was lower when the percentage of cropland was higher than the percentage of pastureland and urban areas in the surrounding landscape. In our study landscapes, croplands are more of an "environmentally friendly" matrix, which facilitates species movements, dispersal, recolonization and supplementation at a landscape scale (Chazdon et al., 2009;Phalan, Onial, Balmford, & Green, 2011). The opposite holds for urban areas, which can create an impermeable matrix for native forest species and have a strong negative effect on biodiversity (Yu, Xun, Shi, Shao, & Liu, 2012). Moreover, our study landscapes are surrounded, within a 25 and 5 km radius, by large amounts of forest cover (mean = 58.1 ± 18.5 and 61.0 ± 24.7, respectively). The implication is that when the amount of forest cover in a landscape is high (which is not necessarily correlated with percentage of strictly protected forest areas; Supporting Information), the complementary land use in such landscapes are usually "environmentally friendly" croplands rather than pasturelands and urban areas. Our study did not address biodiversity recovery in landscapes with very low levels of forest cover (in our 32 study landscapes, 0 has < 10% and up to 5 have between 10% and 30% of forest cover for buffer of 5 or 25 km of radius), where natu-ral regeneration is less likely to occur .

Favorable ecological and biophysical conditions
Deviation in biodiversity recovery was lower in landscapes with high annual temperatures (e.g., Crouzeilles et al., 2017), reduced soil water deficit (Rozendaal et al., 2019), and long periods of time elapsed since natural regeneration started. Higher temperatures and water availability have been associated with faster and more diverse patterns of recovery (Rozendaal et al., 2019). Time since natural regeneration started and climatic conditions shape forest cover in a landscape, which acts as seed sources for recolonization of native plant species and provides habitat for animals (Chazdon, 2003;Helmer, Brandeis, Lugo, & Kennaway, 2008), providing higher chances of biodiversity recovery in multiple sites within a landscape. Therefore, surrounding landscapes with favorable ecological and biophysical conditions are crucial for increasing the chances of biodiversity recovery.

CONCLUSIONS
Our analysis revealed that a low deviation in biodiversity recovery (which means a higher chance of naturally regenerating forests to successfully recover biodiversity compared to reference forests) occurs in landscapes with a higher percentage of cropland and forest cover, and in landscapes with a lower percentage of urban areas. These direct factors also mediate indirect associations. In summary, economic opportunities, social needs and ecological and biophysical conditions are primarily shaping forest cover at different successional stages and land use decisions at a landscape scale, which is directly associated with deviation in biodiversity recovery. Therefore, public policies and conservation strategies should foster a land sharing approach in landscapes with low socioeconomic pressure for forest conversion and with favorable ecological and biophysical conditions for increasing biodiversity recovery during natural forest regeneration.