Environmental heterogeneity dynamics drive plant diversity on oceanic islands

The General Dynamic Model (GDM) links island biogeographical processes to island geological history. A key premise of the GDM implies that environmental factors shaping the ecology and evolution of biota on oceanic islands follow a hump‐shaped trend over the island's life span and drive dynamics in carrying capacity, species diversity and endemism. An important component of the GDM is environmental heterogeneity (EH), but its effects on insular diversity remain poorly understood. Here, we first quantified EH, tested whether EH follows the expected hump‐shaped trend along island ontogeny and evaluated how EH relates to plant diversity.

Oceanic islands are highly dynamic systems with heterogeneous environments, unique biota and outstanding levels of endemism (Kier et al., 2009;Weigelt, Jetz, & Kreft, 2013). Oceanic islands are typically characterized by a rapid volcanic growth and a relatively short life span, which can range from days (Sabrina, Azores in 1811) to tens of millions of years (Fuerteventura 20 Ma) (Fernández-Palacios, Otto, Thebaud, & Price, 2014). Some oceanic islands reach remarkable elevations, such as the Mauna Kea (4,207 m a.s.l.) on the Island of Hawai'i or Mount Teide (3,718 m a.s.l.) on Tenerife, and these impressive elevational gradients cause marked differences in temperature, orographic precipitation regimes and rain shadow effects over short geographical distances.
Over geological time scales, island surfaces change from high and smooth volcanos, through highly rugged terrain when erosion shapes mountain ridges, to flat island remnants (Paulay, 1994;Price & Clague, 2002). Such changes are caused by the interplay between volcanic activity, erosion, landslides and subsidence (Badgley et al., 2017;Carracedo et al., 2011). Certain oceanic islands have a complex geological history with repeated episodes of volcanism and mega-landslides (Gillespie & Roderick, 2014;Neall & Trewick, 2008). Hence, oceanic islands may exhibit high EH in terms of topography, climate (e.g. orographic precipitation regimes) and soil conditions (Seijmonsbergen, Guldenaar, & Rijsdijk, 2018;Whittaker et al., 2007). However, little is known about how ecologically relevant components of EH change over the life span of oceanic islands and how these changes affect the biogeographical processes generating and maintaining insular diversity.
Investigating how the dynamic nature of EH through the ontogeny of islands (i.e. island development through its geological life span from island emergence, island building, advanced island age to island submergence) affects colonization, speciation and extinction rates and emergent patterns of insular species diversity is at the forefront of modern island biogeography Whittaker, Fernández-Palacios, Matthews, Borregaard, & Triantis, 2017). The General Dynamic Model (GDM) links the geological dynamics of oceanic islands to biogeographical rates and diversity patterns (Whittaker, Triantis, & Ladle, 2008). One of the three premises of the GDM states that island elevational range, topographic complexity (both belonging to the topographic component of EH) and island area change in a predictable manner over time and peak at intermediate island age, causing a hump-shaped pattern in island carrying capacity and species richness (Lim & Marshall, 2017;Valente, Etienne, & Phillimore, 2014;Whittaker et al., 2008). The peaks in carrying capacity and species richness are assumed to occur between the time when an island first reaches its maximum elevation with steep climatic gradients, and the time when it reaches maximum topographic complexity after having experienced erosion, that is, a rugged and dissected landscape with a large number of different habitats. The hump-shaped relationship between species diversity, island area and age has been summarized in the GDM and was originally mathematically expressed as: Biodiversity ~ ln(Area) + Time + Time 2 , called the 'ATT 2 model' (Whittaker et al., 2008).
Despite these clear theoretical underpinnings, the complex geological histories and often idiosyncratic trajectories of individual oceanic islands (Ali, 2017) may limit the applicability of the GDM Keppel et al., 2016). Most empirical tests of the GDM have focused on the relationship of island area and age with species richness Lenzner, Weigelt, Kreft, Beierkuhnlein, & Steinbauer, 2017;Steinbauer, Dolos, Field, Reineking, & Beierkuhnlein, 2013). The few studies that included EH found species diversity best explained when both area and EH were included in the models (Keppel et al., 2016). However, the validity of the assumption of a hump-shaped relationship between EH and island age -to the best of our knowledge -has never been tested before.
Also, the effect of EH on insular plant diversity (i.e. inclusion of EH in the ATT 2 model) and species diversity remains poorly understood  Triantis, Mylonas, Lika, Natural, & Museum, 2003). The apparent research gap of EH research on islands is partly due to EH being a multifaceted concept and difficult to quantify in an ecological meaningful way, that is, to capture all attributes of heterogeneity in an environment that may drive species diversity.
The quantification of EH is complicated by at least two main challenges. First, EH comprises a potentially large number of different components (e.g. related to precipitation, topography and soil types). Second, different quantification methods may capture different aspects of EH . In addition, the spatial scale at which EH is calculated may affect the results of EH quantification (Jackson & Fahrig, 2015) and interactions with island area can potentially modify the effect of EH on diversity (Allouche, Kalyuzhny, Moreno-Rueda, Pizarro, & Kadmon, 2012;Hortal et al., 2013). For instance, the area-heterogeneity trade-off hypothesis predicts species diversity to decrease at high levels of EH, because the effective area of individual habitats is reduced (Allouche et al., 2012). While theoretical arguments opposing the area-heterogeneity trade-off hypothesis have been raised (Hortal et al., 2013), it remains largely unknown if there is an interactive (positive or negative) effect of island area and EH on diversity on islands (but see Hortal et al., 2009) and how this relates to the GDM. Therefore, we set three aims for the present study. First, we aim to calculate and compare various alternative, ecologically meaningful EH metrics across a large number of oceanic islands worldwide by focusing on two main abiotic EH components, namely climatic and topographic heterogeneity. Second, we test the GDM premise that EH exhibits a hump-shaped relationship with island age. Finally, we evaluate the effect of EH on plant diversity of oceanic islands and examine the EH effect on predictor variables of the GDM (island area and time) and the potential interaction between EH and island area on diversity, by including different variants of EH in the ATT 2 model.

| MATERIAL S AND ME THODS
We studied spatial environmental heterogeneity (EH), its relationship with island age and its effect on species diversity of vascular plants for 135 oceanic islands of volcanic origin (Figure 1), which belong to 41 archipelagos worldwide ( Figure S1.1). We restricted the analysis to islands >2 km 2 , as EH could not be calculated in a meaningful way for smaller islands, given the spatial grain of the climatic variables (1 km). We worked with two aspects of plant diversity, (a) species richness of native species and (b) single-island endemics.
The latter reflects evolutionary processes on islands, such as in situ speciation (Weigelt, Steinbauer, Cabral, & Kreft, 2016). We obtained the information about plant diversity, as well as island characteristics (island age and area), from the Global Inventory of Floras and Traits (GIFT). The GIFT database provides information on distributions and floristic status (native, endemic, alien) of plant species based on a wide range of regional floristic databases, floras and checklists (Weigelt, König, & Kreft, 2020). The full list of original literature resources used to obtain species diversity information is available in Appendix 1.

| Environmental heterogeneity components
To assess the climatic component of EH, we used mean annual precipitation (PREC; mm/year) and mean annual temperature (TEMP; °C). Both variables influence the water and energy available to plants and are strong determinants of plant diversity (Kreft & Jetz, 2007). The information of PREC and TEMP were derived from the Climatologies at High resolution for the Earth's Land Surface Areas (CHELSA) dataset at a spatial grain of c. 1 km (Karger et al., 2017). To evaluate the topographic component of EH, we used F I G U R E 1 Workflow of quantifying environmental heterogeneity (EH) for 135 oceanic islands worldwide, displayed as black dots in the world map. A total of 20 EH metrics were calculated for each island to capture four EH components: elevation (ELEV; m a.s.l.), mean annual precipitation (PREC; mm/year) and mean annual temperature (TEMP; °C), and heat load index (HLI). Two types of metrics were investigated: (i) whole-island metrics that summarized EH per island by its range (rg) or standard deviation (sd) of the four EH components and (ii) moving-window metrics that first calculated statistics for a focal cell within the specified window size (9 km 2 ), resulting in heterogeneity rasters for each EH component, and then summarized EH for each island using the mean value of the heterogeneity rasters, that is, mean of dissection (dis), standard deviation (msd) and roughness (rou). Tenerife island (2034 km 2 ) is shown as an example displaying the different EH components, heterogeneity rasters and the 9 km 2 window size [Colour figure can be viewed at wileyonlinelibrary.com]  (Jarvis, Reuter, Nelson, & Guevara, 2008). From the ELEV data, we derived the heat load index (HLI), using the formula: HLI = −1.467 + 1.582 * cos(L) * cos(S) − 1.5 * cos(Fα) * sin(S) * sin(L) − 0.262 * sin(L) * sin(S) + 0.607 * sin(Fα) * sin(S),

Moving-window metrics
where L = latitude, S = slope and Fα = folded aspect (see equation 1 in McCune, 2007;McCune & Keon, 2002). HLI assesses topographyassociated thermal gradients by including the slope aspect. Slope aspect strongly influences local temperature, that is, equatorial facing slopes are warmer than polar facing slopes. Thus, slope aspect influences the thermal condition in a habitat (McCune, 2007). HLI has already been related to distribution patterns of plant diversity (He et al., 2017).

| Quantification of environmental heterogeneity across oceanic islands
To quantify the two main EH components, climatic and topographic heterogeneity, we calculated two types of metrics ( Figure 1). First, we used whole-island metrics describing the range and the spatial variability of the EH components over the whole island. These metrics can describe island carrying capacity of an entire oceanic island . To this end, we summarized EH per island by the range (rg) and the standard deviation (sd) of the environmental variables ELEV, HLI, PREC and TEMP. Second, we used moving-window metrics to calculate the local environmental turnover of either climate or topography within a defined area (Amatulli et al., 2018). This group of EH metrics can provide information about climatic and topographic complexity, and potential topography-associated dispersal barriers within islands.
We applied a moving-window approach (Hagen-Zanker, 2016) that calculates statistics for a focal cell within a specified window size (here 9 km 2 comprised 3 × 3 and 12 × 12 pixels for CHELSA and SRTM data, respectively). Within the 9 km 2 window, we calculated three statistics (1) dissection = (z − z min )/(z max − z min ), where z max = maximum, z min = minimum and z = focal cell value within the window, (2) standard deviation and (3) roughness (i.e. the largest inter-cell difference of a focal cell and its surrounding cells) (Amatulli et al., 2018;Riley, DeGloria, & Elliot, 1999). This produced three new raster layers for each EH component (hereafter 'heterogeneity rasters') with identical spatial extent and grain as the input and each new cell describing the heterogeneity within the window. We then summarized EH for each island by calculating the mean value of the heterogeneity rasters and termed them 'dis', 'msd' and 'rou', respectively ( Figure 1). We named the EH metrics by referring first to the environmental variable abbreviation in uppercase, followed by the calculation metric abbreviation in lower case, for example, ELEVrou for the mean roughness in elevation per island and PRECrg for the range in precipitation per island.
Window size and spatial grain can influence EH quantification.
We therefore tested three alternative window sizes (3, 25 and 49 km 2 ) on three different spatial grains (250 m, 500 m and 1 km, the last two grains were aggregated from initial ELEV at 250 m) using the ELEV data only ( Figure S2.6). Following the same procedure as described above for moving-window metrics, we obtained EH values per island and compared values across islands using correlation analysis. We calculated the EH metrics using R version 3.5.2 (R Development Core Team, 2018) using the extract function from the package raster (Hijmans et al., 2018). For computing the HLI and the heterogeneity rasters dissection, roughness and standard deviation, we used the Spatial Analyst extension and the Geomorphometry & Gradient Metrics toolbox (Evans, Oakleaf, & Cushman, 2014) in ESRI ArcGIS version 10.4.

| Statistical analysis
We used Pearson's correlation coefficients to relate EH metrics to each other, and to assess similarities among EH components and the two types of metrics. To test the GDM premise of a hump-shaped trend in EH over island age (see relationships between island age and individual EH metrics in Figure S3.7), we replaced 'Biodiversity' with a respective EH metric as a response variable in a modified GDM formula that uses a log-transformation of Time (Steinbauer et al., 2013): where 'ln' is the natural logarithm (hereafter 'EH ~ ATT 2 ' model).
We fitted the EH ~ ATT 2 formula using linear mixed-effect models (LMM) that account for the variation across archipelagos as random intercept because EH and species diversity of individual islands depend on archipelago characteristics (Borregaard, Matthews, Whittaker, & Field, 2016;Bunnefeld & Phillimore, 2012). All EH metrics were scaled to zero mean and unit variance to facilitate comparisons among different EH measures. We then produced model predictions to assess the trend of EH with island age, by keeping island area and archipelago constant (median island area and one selected archipelago, Hawaii). We verified if the log-transformation of island age produced statistically more robust models by fitting the EH ~ ATT 2 formula without log-transforming island age and using Akaike's information criterion (AIC), and how island area influenced the EH metrics, as area may interfere with the identification of EH per se (Stein et al., 2014), by plotting coefficient estimates for the EH ~ ATT 2 models.
We evaluated the effect of EH on the diversity of vascular plants, by including each EH metric separately as a predictor variable in the modified GDM formula (Steinbauer et al., 2013): where we replaced Biodiversity by (a) number of native species and (b) number of single-island endemic species of vascular plants.
We fitted the EHATT 2 models and for comparison also the ATT 2 , which did not include EH, using generalized linear mixed-effect models (GLMM, with Poisson distribution error). To identify the differential effect of the investigated predictor variables, that is, Area, Time, and each EH metric (we assessed the effect of log-transforming the metrics, see  (Nakagawa & Schielzeth, 2013) for EH ~ ATT 2 , as well as for EHATT 2 and ATT 2 models (Tables S3.1 and S3.3). Second, using model diagnostics (QQ plot and residual versus predicted values), we determined whether there were significant degrees of overdispersion for GLMM (i.e. EHATT 2 and ATT 2 ). Overdispersion is common in models for count data and can be caused by aggregation among observations , that is, islands. It may cause Type I errors (false positives), as standard errors are underestimated. To fix this, we used the observation-level random effect approach (i.e. the identity of islands was set as random intercept), which gives more accurate estimates of standard errors (Harrison, 2014(Harrison, , 2015. R 2 values and overdispersion tests were computed in MuMIn (Bartoń, 2018) and DHARMa (Hartig, 2019), respectively. All statistical analyses were done using R 3.5.2 (R Development Core Team, 2018). The LMM and GLMM were fitted using the package lme4, and model coefficient estimates plots were produced using the package dotwhisker (Solt & Hu, 2015).

| Assessment of environmental heterogeneity metrics
Our comparison of different EH metrics revealed strong similarities among topographic and climatic heterogeneity, namely between ELEV and TEMP heterogeneity (Pearson's correlation coefficients between 0.5-1 and average 0.72, see Figure S2.  Figure S2.6), whereas the larger window 49 km 2 showed a slight difference in EH values ( Figure S2.6). In addition, we found that the heterogeneity rasters based on 9 km 2 window size clearly identified landscape features, such as ravines and mountainous areas, while the 49 km 2 window generally led to more diffuse patterns (see example in Figure S2.4).

| Trends of environmental heterogeneity over island age
We found hump-shaped relationships between EH and island age for 16 out of the 20 EH metrics (Figure 2). Those 16 EH metrics showed a similar pattern over time, that is, EH rapidly increased and the log-transformation of island age in the EH ~ ATT 2 formula was always more strongly supported than models with untransformed data (Table S3.1).

| Environmental heterogeneity as a predictor of plant diversity in the ATT 2 model
The majority of EH metrics had a positive effect on insular plant diversity (Figure 3), and they had an even stronger effect on singleisland endemic species (compare x-axis in Figure 3). For the number of native species, PREC and TEMP heterogeneity had the strongest effect (Figure 3a), particularly precipitation and temperature range (i.e. whole-island metrics PRECrg and TEMPrg). It was followed by the positive effect of climatic complexity in terms of precipitation (i.e. moving-window metrics PRECmsd and PRECrou) (Figure 3a).
The number of single-island endemics species was most strongly affected by TEMP and ELEV heterogeneity (Figure 3b), specifically the range in temperature and elevation (TEMPrg and ELEVrg). Climatic (in terms of temperature) and topographic complexity (i.e. movingwindow metrics TEMProu, ELEVmsd, HLImsd) also had a positive effect on single-island endemic species but PREC heterogeneity did not affect single-island endemics (Figure 3b). Moving-window metrics that used dissection neither affected native nor single-island endemic species (Figure 3).
In all EHATT 2 and ATT 2 models, island area had the strongest effect (Figure 3), but its effect particularly decreased when whole-island metrics that measured ranges (TEMPrg, ELEVrg and PRECrg) were included in the models (see models coefficients and error bars of the ATT 2 model highlighted in black in Figure 3). Island age had a weak effect in all EHATT 2 and ATT 2 models (see Time and Time 2 coefficients estimates and error bars in Figure 3). However, the effect of both terms for age changed after including whole-island metrics measuring ranges (again TEMPrg, ELEVrg, PRECrg), that is, the effect of the linear term (Time) increased and the quadratic term (Time 2 ) decreased. The decrease in the effect of the quadratic term caused an asymptotic relationship of species richness and endemism with time ( Figure S5.10). Furthermore, models without the quadratic term of age (i.e. EHAT models) had lower AIC values than the EHATT 2 models (Table S5.4) and therefore a stronger support. Also, the majority of the EHATT 2 models received stronger statistical support than the ATT 2 model (Table S3.3), namely 15 out of 20 EHATT 2 models for predicting number of native species and 11 out of 20 EHATT 2 models for predicting number of single-island endemics (Table S3.3).
Finally, there was only limited support for an interaction between island area and EH ( Figure S5.11). For models predicting the number of native species, the positive interactions for one whole-island metric (PRECrg) and two moving-window metrics (PRECmsd and PRECrou) with island area (Figure S5.11a) received statistical support. For models predicting the number single-island endemic F I G U R E 3 Effects of environmental heterogeneity (EH) components and metrics, island area and age on species richness and endemism of vascular plants, in the framework of the general dynamic model (GDM). Coefficient estimates (dots) and 95% confidence intervals (bars) for (a) number of native species and (b) number of single-island endemic species, from models fitted with EHATT 2 and ATT 2 to compare the effect of including EH. The coloured dots and bars correspond to a particular model depending on the EH metric included. Coefficients and error bars of the ATT 2 models are highlighted in black. Vertical dashed lines mark zero effects and covariates are not considered significant if the error bar intersects with the zero-line. Coefficient estimates were automatically scaled for direct comparison by two times their standard deviation [Colour figure can be viewed at wileyonlinelibrary.com] PREC heterogeneity TEMP heterogeneity ELEV heterogeneity

HLI heterogeneity
Effect on single-island endemics

| D ISCUSS I ON
Our study aimed to identify ecologically meaningful measures of environmental heterogeneity (EH) on oceanic islands by assessing climatic and topographic components of heterogeneity related to precipitation (PREC), temperature (TEMP), elevation (ELEV) and heat load (HLI), and to evaluate the performance of whole-island as well as of moving-window metrics (Figure 1). We then tested a key premise of the general dynamic model of island biogeography (GDM; Whittaker et al., 2008) and showed that EH indeed follows the expected hump-shaped relationship with island age (Figure 2).
We found strong evidence for an important role of EH as a driver of plant species richness and endemism of oceanic islands (Table S3.3).
Metrics reflecting climatic heterogeneity (i.e. PREC and TEMP heterogeneity) are particularly relevant for species richness and temperature and topographic complexity (i.e. ELEV and TEMP and HLI heterogeneity) are particularly relevant for endemic species ( Figure 3). Together, our results contribute to a better understanding of the role of EH for insular diversity patterns.

| Capturing the heterogeneity of insular environments
Our assessment of different alternative ways to quantify EH of oceanic islands revealed that certain environmental components strongly co-vary, as seen in the strong positive correlation of elevation and temperature-related heterogeneity metrics. This is due to the strong dependency of temperature gradients on the topography (Dobrowski, Abatzoglou, Greenberg, & Schladow, 2009). In contrast, precipitation heterogeneity was less strongly associated with metrics of other EH components ( Figure S2. in temperature, as well as dramatic precipitation gradients created by the high elevation of the island and trade winds (Giambelluca et al., 2013). Likewise, the Island of Hawai'i is characterized by a comparatively low climatic and topographic complexity because the surface of this young island is relatively smooth compared to older, more eroded islands. These results indicate that whole-island metrics successfully describe total energy (TEMPrg, HLIrg), water supply (PRECrg) and available space for species (ELEVrg), all major elements of island carrying capacity (Hui, 2006). Moving-window metrics, in contrast, are more suitable for describing the climatic and topographic complexity of islands (Cramer & Verboom, 2017), when using roughness or standard deviation because they capture local changes in temperature and precipitation regimes and terrain complexity, for example, ridges and valleys (Bonetti, Hooshyar, Camporeale, & Porporato, 2020), found in landscapes such as the Anaga mountains on Tenerife (Figure 1), Moka in Mauritius and Koke'e in Kauai (see island maps in Figure S2.4).
Moving-window metrics of dissection computed here as the mean value of the heterogeneity rasters of dissection per island did not always reliably inform about how dissected an island landscape is. Our results showed that (mean) dissection values varied independently of island age ( Figure S3.

7). For instance, Christmas
Island (20 Ma) and Genovesa Island (0.3 Ma) had the highest (mean) dissection values for elevation among all islands studied. The first is an old, highly eroded island with steep escarpments around its boundaries, the second is a young volcano. Both island landscapes are mostly smooth with little topographic complexity (i.e. no valleys and ridges), but rather have a continuously descending landform (e.g. cone-shape). In both cases, the dissection formula led to high values for landscape incisions (i.e. descending areas within the 9 km 2 window size) or its analogue for climate (see Tenerife heterogeneity rasters in Figure 1). Thus, islands with a relatively flat or cone-shaped landscapes can have high average dissection values despite having limited EH.
The spatial grains used here (250 m for ELEV and HLI and 1 km for TEMP and PREC) produced comparable estimates of EH. Our test using different window sizes confirmed that within-island EH, caused by island ravines, ridges and valleys, is captured well at an intermediate scale (i.e. 9 km 2 ) ( Figure S2.4). At a larger scale (i.e. 49 km 2 ), such landscape features were averaged out and disappeared, potentially leading to an underestimation of EH (see 9 km 2 versus 49 km 2 window sizes in Figure S2.6). This is relevant for plant diversity because at intermediate scales geology and soil conditions, in addition to topography and climate, create a matrix of habitats that can host distinct plant communities (Crawley, 2001;Miguet, Jackson, Jackson, Martin, & Fahrig, 2016).

| The trajectory of environmental heterogeneity over oceanic island ontogeny
With few exceptions, the EH metrics investigated here showed a hump-shaped relationship with island age (Figure 2), with peaks early in the island ontogeny (higher model support with log-transformed island age, see Table S3.1). This result lends strong support to one of the fundamental premises of the GDM (Whittaker et al., 2008), that is, island elevational range and topographic complexity have a humpshaped pattern over time. In contrast to the graphical model representation in Whittaker et al. (2008), we found that EH peaks before the 'middle-age' of an island because most of the volume and elevation of volcanic hotspot islands usually forms within the first million years (Troll & Carracedo, 2016). Shortly after island emergence, the onset of erosion, occurrence of mega-landslides and collapses of calderas contribute to a complex island topography (Carracedo, 1994).
With the peak in elevation, maximum climatic ranges are also reached and within-island climatic complexity increases. Two young islands from our analysis exemplify this: La Palma (2,380 m a. s. l. and ca.
1.7 Ma) and Tahiti (1,670 m a. s. l. and ca. 1.3 Ma) have large climatic and elevational gradients and a complex climate and topography (e.g. both received high values for moving-window metrics of roughness), as they had already undergone geomorphological processes that considerably modified their surface configuration (Ferrier, Huppert, & Perron, 2013).
The slow decline of EH over time ( Figure 2) indicates that it may take several million years for an island to be eroded. For instance, the island of Lanzarote is 16 Ma old, its highest point is 650 m a.s.l., and it still holds considerable EH. The main factors responsible for the decline of island's EH are long-term erosion through rainfall (Ferrier et al., 2011), subsidence in certain archipelagos, for example, Galapagos Islands (Ali & Aitchison, 2014) and coastal erosion (Ramalho et al., 2013). Also, the trade-off between losing elevational range (Price & Clague, 2002) and increasing topographic complexity slows the decline of island EH. The slow decline in EH has relevant biogeographical implications. Extinction rates should rise only slowly while speciation rates might be maintained at higher levels, which would lead to a slower decline in species richness than previously thought (Whittaker et al., 2008). This has been shown for birds, insects and plants in the Hawaiian islands, where evolutionary decline is much slower compared to evolutionary radiation at the beginning of island building (Lim & Marshall, 2017).
The few exceptional cases where EH metrics (i.e. dissection metrics in Figure 2) did not show a humped-shaped trend over time, were caused by the unclear relationship between the metrics and island age (i.e. both young and old islands had intermediate to high dissection values, see Figure S3.7). Lastly, we note that although island EH is generally related to island area and certain EH metrics are more affected by area per se ( Figure S3.8), the EH trends we find here are not due to the variation of island area, as area entered the analysis as a covariate.

| Environmental heterogeneity as a determinant of insular species richness and endemism
Insular vascular plant diversity was strongly affected by EH, and most notably, native and endemic species richness were differentially affected by EH. Specifically, we found that climatic heterogeneity (PREC and TEMP heterogeneity) was the most important predictor of native species richness (Figure 3a), whereas temperature and elevational heterogeneity were most important for the number of single-island endemic species (Figure 3b). Climatic heterogeneity influences species richness on oceanic islands by increasing the number of climatic niches, where plant colonizers can establish and a large number of species can coexist (Stein et al., 2014) and persist if climatic fluctuations occur (Keppel et al., 2018). Elevational heterogeneity, on the other hand, is known for its key role in promoting species diversification (Rahbek et al., 2019). Steep elevation gradients, which directly relate to changes in temperature, create selection pressures that can lead to new species adaptations (Badgley et al., 2017). Furthermore, a complex topography implies geographical barriers that isolate species populations (Irl et al., 2015), disrupt their gene flow and eventually lead to within-island diversification (Kisel & Barraclough, 2010

| Effect of including environmental heterogeneity in the ATT 2 model
Including EH in the ATT 2 model improved the statistical power and also modified the effect of island area and age on species richness and endemism of vascular plants. The decreased effect of island area after including EH, particularly with whole-island metrics (i.e. PRECrg, TEMPrg and ELEVsd), provides evidence that both, island area per se and EH, need to be considered in models of species richness and endemism (Triantis, Guilhaumon, & Whittaker, 2012).
Including EH had opposite effects on the two terms for island age.
It increased the effect of the linear term (Time in Figure 3) and decreased the effect of the quadratic term (Time 2 in Figure 3), leading to an overall asymptotically increasing relationship between species richness and endemism with time ( Figure S5.10) and not to the hump-shaped relationship predicted by the GDM (Whittaker et al., 2008). This change highlights three key phenomena occurring during island ontogeny. First, more and more species colonize and eventually diversify with time (Heaney, 2000). Second, colonization and speciation slow down when many species are already present . Third, and importantly, the negative effect of time on island carrying capacity, and hence island diversity, can be captured directly by the effects of decreasing island area and EH.
The lack of significant interactions for most models suggests that overall island area and EH affect species richness and endemism largely additively. However, the positive and significant interactions ( Figure S5.11) we found between island area and EH (only for climatic heterogeneity) indicate that the effect of EH depends on island size and lend limited support to the area-heterogeneity trade-off hypothesis (Allouche et al., 2012). Our finding shows that on large islands climatic heterogeneity has a positive and strong effect on species diversity, while on small islands climatic heterogeneity has a weaker to even negative effect on species diversity. Therefore, small islands even if exhibiting high levels of climatic heterogeneity, will not have large numbers of species, particularly not endemic ones (compare x-axis Figure S5.11) because the effective area per habitat required for species to persist or even speciate is limited.
There are several limitations of our study. First, using a spacefor-time substitution for understanding biodiversity patterns that change over time only allows for limited inference (Pickett, 1989).
Second, the challenges imposed by the complexity and often idiosyncratic development of volcanic island ontogenies hamper the search for generality (Ali, 2017;Borregaard et al., 2017). Finally, there are further potential EH components crucial for plant diversity, such as heterogeneity in soil conditions (Crews et al., 1995)

DATA AVA I L A B I L I T Y S TAT E M E N T
Data analysed and produced in this study, that is, number of native and single-island endemic species, island age, area, archipelago information and EH metrics, as well as environmental components (elevation, precipitation, temperature and heat load index), heterogeneity rasters and the code for analyses in R can be download from Dryad

A PPE N D I X 1
The following literature correspond to the regional floras, checklists and