Fine‐scale substrate heterogeneity in green roof plant communities: The constraint of size

Abstract The heterogeneity–diversity relationship (HDR) is commonly shown to be positive, in accordance with classic niche processes. However, recent soil‐based studies have often found neutral and even negative HDRs. Suggested reasons for this discrepancy include a lack of resemblance between manipulated substrates and natural settings, treated areas that are not large enough to contain species' root spans, and limited‐sized plots that may not sustain focal species’ populations over time. Vegetated green roofs are a growing phenomenon in many cities and could be an ideal testing ground for this problem. Recent studies have focused on the ability of these roofs to sustain stable and diverse plant communities, and substrate heterogeneity, which would increase the number of niches on the roof, has been proposed as a method to attain this goal. We constructed green roof experimental modules (4 m²) in which we manipulated the heterogeneity of mineral and organic substrate components in different subplots (0.25 m²) within each module while maintaining the total sum of mineral and organic components. A local annual plant community was seeded in the modules and monitored over three growing seasons. We found that plant diversity and biomass were not affected by experimentally created substrate heterogeneity. In addition, we found that different treatments, as well as specific subplot substrates, had an effect on plant community assemblages during the first year but not during the second and third years. Substrate heterogeneity levels were mostly unchanged over time. The inability to retain plant community composition over the years despite the maintenance of substrate differences supports the hypothesis that maintenance of diversity is constrained at these spatial scales by unfavorable dispersal and increased stochastic events, as opposed to the predictions of classic niche processes.


KEYWORDS
neutral theory, niche theory, plant community assemblage, plant-soil interactions

| INTRODUCTION
One of the longest standing challenges in the field of ecology is explaining the mechanisms that sustain species richness over time and space. Spatial heterogeneity of resources and environmental conditions was suggested to increase niches which would, in turn, support the maintenance of a variety of species (Chesson, 2000; that the maintenance of a diverse plant community is a direct result of fine-scale heterogeneity where different plant species are supported by different patches (Whittaker, 1965).
Accumulating evidence for contradicting hypotheses resulted in the publication of the neutral theory (Hubbell, 2001) that successfully predicted observed patterns while completely ignoring resource heterogeneity. Although these contradicting theories were generally reconciled into a "niche-neutral continuum" (Leibold & McPeek, 2006; Matthews & Whittaker, 2014), the underlying insight was that the seemingly obvious heterogeneity-diversity relationship (i.e., HDR) was no longer indisputable.
This shaking of niche theory may have given rise to several studies that challenged the generality of a positive HDR, especially for soil heterogeneity, and even suggested negative HDRs (Gazol et al., 2013; Lundholm, 2009; Tamme, Hiiesalu, Laanisto, Szava-Kovats, & Pärtel, 2010). Experimental studies that put this theory to the test only rarely found a positive HDR for soil heterogeneity (Williams & Houseman, 2014). A large-scale meta-analysis (Stein, Gerstner, & Kreft, 2014) showed a significantly positive HDR effect across taxa, biomes, and spatial scales, which could have potentially refuted the negative HDR studies. However, the meta-analysis only included large-scale (>10 km²) observational studies, while the contradictory results were obtained in experimentally manipulated fine-scale studies.
Some have tried linking this discrepancy to the effect of patch size. A meta-analysis of soil manipulation studies (Tamme et al., 2010) claimed that the negative HDR in experimental studies was driven by fine-scale patch size, with fine-scaled heterogeneity supporting lower diversity. This is also supported by the strong positive effect of patch size found in the meta-analysis of observational soil studies (Tamme et al., 2010) and in HDR studies in general (Stein et al., 2014). Since experimental studies are inherently limited in their physical dimensions, the manipulated patch size is limited as well, which may explain the scarcity of positive HDR effects.
Three potential hypotheses have been suggested or tested in the attempt to reconcile negative HDRs in experimentally manipulated studies with the generally perceived positive trend: a lack of realism embedded in the method of man-made heterogeneity, a patch size effect on individuals, and a patch size effect on populations.
Hypothesis 1: Realism in the method of creation of heterogeneity.
It has been claimed that a lack of realism is inherent in most methods of heterogeneity manipulation, especially with nutrient manipulations (Williams & Houseman, 2014). The manipulated substrates may not mimic natural soil, and nutrients that are artificially added may disturb plant-soil microbe interactions or, in certain cases where highly mobile forms of nitrogen are used, give preference to nitrophilic species that are able to capitalize on the resources more easily, which masks the heterogeneity effect.
Hypothesis 2: Patch size has an effect on individuals.
Treated patch size within experimental modules has been targeted for some time as a potential challenge in studies of this kind; treated areas that are smaller than the root span of certain species are functionally invisible to those species (Hutchings, John, & Wijesinghe, 2003).
However, when all species have similar root spans that are larger than treated patches, the heterogeneity effect is predicted to be neutral.
When some species' root spans are smaller and some are larger than patch size, species with larger root spans have a foraging advantage over species with smaller root spans and increase their fitness, which could potentially reduce diversity (Rajaniemi, 2011; Tamme, Gazol, Price, Hiiesalu, & Pärtel, 2016).
Hypothesis 3: Patch size has an effect on populations.
Theoretical models designed to improve our understanding of community dynamics within heterogeneous surroundings found support for the negative HDR (Kadmon & Allouche, 2007; Palmer, 1992; Smith & Lundholm, 2012). This is explained by the increased stochasticity caused by habitat heterogeneity, which affects plant populations.
Reducing the absolute patch area results in smaller populations in each of the patches, which in turn increases the chance of stochastic events occurring within them. An important role was also assigned to dispersal mechanisms: smaller patches would increase the percentage of propagules dispersed from the patches into unsuitable habitats, because the patch perimeter would be closer to the plant, and would also reduce the incoming propagules from the regional species pool (Kadmon & Allouche, 2007). At reduced patch sizes, increased heterogeneity has a better chance of causing a negative HDR.
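The geometric side of this argument can be illustrated with a minimal sketch. It uses the perimeter-to-area ratio of a square patch as a crude proxy for how exposed a patch is to its surroundings; this proxy is an assumption made here for illustration and is not the model of Kadmon and Allouche (2007). The 0.5 m side corresponds to the subplots used in this study, and the 2 m side assumes that the 4 m² module is square.

```python
def perimeter_to_area_ratio(side_m):
    """Perimeter-to-area ratio (per metre) of a square patch with the given side."""
    perimeter = 4 * side_m
    area = side_m ** 2
    return perimeter / area


# Illustrative patch sizes: 0.5 m (subplot), 1 m, and 2 m (module, assumed square).
for side in (0.5, 1.0, 2.0):
    ratio = perimeter_to_area_ratio(side)
    print(f"side = {side} m, perimeter/area = {ratio} per m")
```

For a square the ratio equals 4 divided by the side length, so halving the side doubles the amount of edge per unit of patch area.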
In this experiment, we wish to put two of these hypotheses (2 and 3) to the test. Constructing large experimental modules (=units) with subplots (=patches) large enough to sustain distinct plant populations and communities, manipulating mineral substrate components, and observing and sampling over several years will allow us to examine the first and third hypotheses more closely. Examining HDR and comparing community compositions between treated modules, and between subplots within modules, could potentially shed light on the processes taking place. While substrate heterogeneity was predicted to increase plant diversity, we did not expect it to increase plant biomass.
The increasingly common green roof studies may serve as an ideal testing ground for questions of this type. Green roofs are a widespread urban phenomenon where a vegetative layer is placed on roofs. The majority of green roofs are lightweight and often planted with a small array of plant species that entail minimal maintenance (Oberndorfer et al., 2007). While green roofs were originally designed to mitigate stormwater runoff and enhance buildings' thermal insulation, their potential ecological benefits, such as increasing biodiversity, have been receiving more focus in recent years (Blaustein, Kadas, & Gurevitch, 2016; Lundholm & Peck, 2008; Sutton & Lambrinos, 2015). The steady increase in urbanization, alongside the popularity of green roofs, suggests a potential key role for green roofs in increasing urban biodiversity if designed correctly (Blaustein et al., 2016). Being man-made, green roofs offer a high level of experimental control and can provide ideal testing grounds for general ecological theory (Vasl & Heim, 2016). The results of these studies would not only improve theoretical insights but also give green roof designers verified practical tools to implement in their planning and enhance green roof biodiversity. Since green roofs are carefully designed and generally costly, simple manipulations that would stabilize and enhance a diverse plant community (for example, substrate heterogeneity) could prove a highly beneficial and cost-effective method to increase diversity on green roofs.
Green roof studies have previously targeted the enhancement of species diversity via heterogeneity. Previous studies have manipulated different substrate features (Lundholm, 2009) as well as the mixing of annuals with perennials (Vasl, Shalom, Kadas, & Blaustein, 2017), creating heterogeneous surface features such as logs and pebbles (Walker & Lundholm, 2017) and substrate depth (Heim & Lundholm, 2014).
We established green roof modules and manipulated heterogeneity of a set amount of different substrate components with relatively large subplot size. We predicted that the different substrate niches would support different plant communities which would lead to higher levels of total plant diversity in the more heterogeneous modules. In an attempt to avoid effects caused by specific kinds of heterogeneity (partially mentioned in hypothesis 1), we tested both the commonly manipulated organic components as well as nonorganic components that are commonly used in the green roof industry that have very different features (e.g., weight and water content).
We emphasize that the treatment performed in this study was only the level and type of inner distribution of the total substrate components while total substrate components were kept similar.
The goal of this experiment was not to discern the effect that each of the specific treated substrate compositions has on the plant community but instead to isolate the role of substrate heterogeneity on plant diversity. However, following the results of plant communities in control and treated plots, we did analyze plant species distribution within the plot to better our understanding of the processes that took place throughout the experiment.

| Experimental design
The experiment included 24 experimental modules that were placed on three school roofs (eight modules per roof) in the city of Haifa, Israel, and monitored for three consecutive growing seasons. Haifa has a typical dry Mediterranean climate with short rainy winters and long, hot, and dry summers. Precipitation events mainly take place between late October and early April. The three schools were "Dinur," "Ben-Gurion," and "Matos" (Table 1) was placed on the inner side of the drainage unit to filter runoff water and prevent clogging of drainage. Modules were placed on a synthetic foam sheet (GalFoam -GA400, "Palziv") to insulate the modules from the roofs and to protect the modules and the roofs from physical damage.
Substrate for all modules was composed of 10% peat, 10% compost, 10% tuff (local volcanic ash, 0-8 mm), and 70% processed perlite (imported amorphous volcanic glass, 0.6 mm, produced by "Agrical").
Note: Temperature and precipitation were collected for the second and third years.
Mineral and organic heterogeneity (i.e., "M+O-HET"): both mineral and organic components were heterogeneous in their dispersion. In order to retain the tuff:perlite and low:high organic matter ratios, the total sum of tuff in this treatment was slightly higher (96.19 L per module) and perlite was slightly lower (479.81 L per module) than other treatments (Table 2). All treatment compositions were achieved by mixing the individual components for a constant period of time in a clean portable electric cement mixer.
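The component totals implied by this recipe can be checked with a short calculation. The sketch below assumes that the percentages are by volume (the text does not say so explicitly) and uses the 720 L of substrate per module given in the Table 2 note. Under that assumption, the standard recipe contains 72 L of tuff and 504 L of perlite, and the M+O-HET values quoted above sum to the same combined mineral volume.

```python
import math

TOTAL_SUBSTRATE_L = 720  # per module, from the Table 2 note

# Recipe percentages from the text; treating them as volume fractions is an assumption.
RECIPE_PERCENT = {"peat": 10, "compost": 10, "tuff": 10, "perlite": 70}

volumes_l = {name: TOTAL_SUBSTRATE_L * pct / 100 for name, pct in RECIPE_PERCENT.items()}

# Combined mineral volume under the standard recipe.
standard_mineral_l = volumes_l["tuff"] + volumes_l["perlite"]

# Tuff and perlite volumes reported for the M+O-HET treatment.
m_o_het_mineral_l = 96.19 + 479.81

print(volumes_l)
print(standard_mineral_l, round(m_o_het_mineral_l, 2))
print(math.isclose(standard_mineral_l, m_o_het_mineral_l))
```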
All modules were subdivided, and four subplots (each subplot: 500 × 500 mm) with plastic frames were positioned in module corners, 250 mm from the module border (Table 2).
Seeds of 19 species of local annuals from different families, including grasses and nitrogen fixers, were collected throughout 2013, and seeds of Agrostemma githago (a locally protected species) were purchased from a local wild flower nursery ("Seeds from Zion") (Table 3). Each of the modules was seeded with a total of 4,000 seeds (200 seeds from each of the 20 species). Seeds were mixed in a bucket with 1 L of sand and evenly distributed over the entire experimental module.
Modules were then covered with a 20 mm layer of medium-sized (6-20 mm) gravel to avoid wind erosion of perlite-based substrates and seed scattering before the first rains of the first season.

| Point-intercept measures
Beginning in February 2014, a nondestructive biomass measure was performed once a month throughout the growing seasons using the point-intercept method (Jonasson, 1988). One hundred metal skewers (diameter of 2.5 mm) were uniformly placed (83.3 mm apart) in each of the modules. The number and identity of green plant organs that touched each skewer were documented. The yearly sum of touches was used as a biomass proxy, and the identity was used to estimate species distributions within the module.
While different growth forms have been shown to have different biomass:intercept ratios, use of this method for repeated monitoring within given experimental units containing several growth forms has been shown to be effective (Bråthen & Hagberg, 2004).
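The yearly biomass proxy described above amounts to summing touches over the monthly surveys. A minimal sketch follows; the record layout (skewer id, species, touches) and the species labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def yearly_biomass_proxy(monthly_surveys):
    """Sum point-intercept touches over all monthly surveys of one growing season.

    Each survey is a list of (skewer_id, species, touches) records.
    Returns the total number of touches and the per-species totals.
    """
    per_species = Counter()
    for survey in monthly_surveys:
        for _skewer_id, species, touches in survey:
            per_species[species] += touches
    return sum(per_species.values()), dict(per_species)


# Two hypothetical monthly surveys of one module.
surveys = [
    [(1, "sp_A", 3), (2, "sp_B", 1)],
    [(1, "sp_A", 2)],
]
total, by_species = yearly_biomass_proxy(surveys)
print(total, by_species)
```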

| Individual count
At the end of the growing period of each of the species, all dead plants were counted. These data were used to calculate total module yearly Shannon-Wiener diversity index (H′).
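For reference, the Shannon-Wiener index is H′ = −Σ p_i ln p_i, where p_i is the proportion of individuals belonging to species i. The sketch below computes it from per-species counts; the use of the natural logarithm is an assumption, since the base is not stated in the text, and the counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener diversity H' from per-species individual counts (natural log)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one individual is required")
    # Species with zero individuals contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical module with four equally abundant species: H' equals ln(4).
print(shannon_index([25, 25, 25, 25]))
```

With a single species the index is zero, and for a fixed number of species it is largest when all species are equally abundant.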

| Subplot level analysis
Point-intercept data were tracked on the subplot level so that the total sum of intercepts counted in the treated subplots (the two diagonal paired subplots-total of 18 skewers) as well as the respective "control" subplots that contained similar substrate to that in the matrix could be attained for each of the modules.
A sum of yearly species identity for each of the potential treated and "control" subplots was calculated. The seven different subplots
Note: All modules contained a total of 720 L of substrate, but components were dispersed differently within the different treatments. All treated subplots (one treatment for M-HET and O-HET and two for M+O-HET) were separated into two 0.5 × 0.5 m subplots that were placed 250 mm from the module edges.

| Substrate change monitoring
Core samples (50 ml
Measurements were taken on either side (distance of 250 mm) of the initial subplot border with the module matrix. We used only the January measurements that represent the peak of the rainy season.
Note: Annuals are from 10 different families including grasses (Poaceae) and nitrogen fixing legumes (Papilionaceae). All seeds were collected from wild populations except for the locally protected Agrostemma githago whose seeds were purchased.

| Statistical analysis
the 3 years of the experiment. Parametric assumptions including homogeneity of variance (Levene's test) and normal distribution (Shapiro-Wilk test) of residuals were tested.
Data were transformed (specific transformations are reported at each relevant test) when parametric assumptions were not met.
Greenhouse-Geisser corrections for degrees of freedom were used when sphericity assumptions were not met.
Community dissimilarity between modules and between subplots was calculated using the Bray-Curtis index. The data were visualized in nonmetric multidimensional scaling (NMDS) plots, using the metaMDS function in the vegan package of R (Oksanen et al., 2013).
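The Bray-Curtis dissimilarity between two communities is the sum of absolute abundance differences divided by the total abundance of both communities. The analysis itself was run in R; the following is only a plain Python sketch of the index, with hypothetical abundance vectors, and the choice to return 0 for two empty samples is an assumption made here.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(u) + sum(v)
    if denominator == 0:
        return 0.0  # assumption: two empty samples are treated as identical
    return numerator / denominator


# Hypothetical abundances of three species in two modules.
module_a = [10, 0, 5]
module_b = [5, 5, 5]
print(bray_curtis(module_a, module_b))
```

The index ranges from 0 (identical abundances) to 1 (no shared species).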

A nonparametric multivariate analysis of variance (PERMANOVA) on Bray-Curtis dissimilarities with 999 permutations was performed on whole module species abundance data and subplot community point-intercept data for each of the years using the "adonis" function of the "vegan" package in R, with block, treatment, and their interaction as predictors. Since PERMANOVA tests do not have post hoc procedures, when treatment was statistically significant, we performed pairwise t tests on each of the combinations to establish which were different. Critical p-values were corrected following the "Benjamini-Hochberg" false discovery correction (Benjamini & Hochberg, 1995).
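To make the two procedures concrete, the sketch below implements a simplified one-way PERMANOVA (pseudo-F with a label-permutation p-value) and the Benjamini-Hochberg adjustment in plain Python. It is not the vegan implementation: it ignores the block term and the interaction used in the analysis above, and it returns adjusted p-values rather than corrected critical values, which leads to the same decisions. The distance matrix and group labels are hypothetical.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """One-way PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: the same quantity computed inside each group.
    ss_within = 0.0
    for label in labels:
        idx = [i for i in range(n) if groups[i] == label]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, n_perm=999, seed=0):
    """Observed pseudo-F and its permutation p-value (labels shuffled n_perm times)."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: four samples, two groups, distances between points 0, 1, 10, 11.
dist = [
    [0, 1, 10, 11],
    [1, 0, 9, 10],
    [10, 9, 0, 1],
    [11, 10, 1, 0],
]
groups = ["A", "A", "B", "B"]
f_obs, p_value = permanova(dist, groups)
print(f_obs, p_value)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A pairwise comparison is declared significant when its adjusted p-value falls below the chosen false discovery rate.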

| Total module results
Biomass proxy (point intercept) did not change with substrate heterogeneity, but increased from the first to the second year and decreased in the third year (Figure 1a) (repeated measures one-way ANOVA, p < .001; Table 4).
Shannon-Wiener diversity index (H′) did not change with substrate heterogeneity either, but decreased over the 3 years of the experiment (Figure 1b) (repeated measures one-way ANOVA (x²-transformed), p < .001; Table 4).
Plant community similarities displayed in nonmetric multidimensional scaling (NMDS) in Figure 2 depict the small effect of substrate heterogeneity, as opposed to the change and divergence over time and the strong effect of school identity.
Bray-Curtis distances of whole module communities for each of the years showed a significant treatment effect only on the first year (PERMANOVA, p = .01; Table 5) while school block effects were significant on years 1 and 3 (p < .001 and p < .01, respectively; Table 5).
Pairwise comparisons performed on the first-year results found that only treatments M-HET and O-HET had a significant treatment effect between them (Pseudo-F(1) = 3.03, p = .004). year (PERMANOVA, p < .001; Table 6). School (=block) effects were significant on all 3 years (p = .02, p = .02, and p < .001 respectively; Table 6). Pairwise comparisons performed on the first year's results (Table 7)

| Substrate differences over time
Differences between core sample weights for each of the treatments (treated subplot as well as the substrate near it) (Figure 5a) found a treatment effect (repeated measures one-way ANOVA, p < .001; A treatment effect (p < .001; Table 8) was found for differences in January moisture measurements (repeated measures one-way ANOVA arcsin-square root-transformed) (Figure 5b). Post hoc tests showed that the moisture differences for the two tuff subplots were significantly drier while other subplots were not.
Differences between percent organic matter (arcsin-square root-transformed) of treatments HOM and O-HET showed a statistically significant treatment effect (repeated measures one-way ANOVA, p = .001), with larger differences in O-HET subplots, while year and the year*treatment interaction were not significant (Table 8, Figure 6).
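The arcsine square-root transformation mentioned above maps a proportion p in [0, 1] to asin(√p), which spreads out values near 0 and 1. A minimal sketch follows; the conversion from percent to proportion and the example value are assumptions for illustration.

```python
import math


def arcsine_sqrt(proportion):
    """Arcsine square-root transform of a proportion in [0, 1], in radians."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))


def arcsine_sqrt_percent(percent):
    """Same transform for a value given in percent (0 to 100)."""
    return arcsine_sqrt(percent / 100)


# Hypothetical organic matter content of 25%.
print(arcsine_sqrt_percent(25))
```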

| DISCUSSION
In our experiment, we did not find a positive effect of substrate heterogeneity on plant diversity throughout the 3 years of the experiment. The inner module documentation of plant specimen locations suggested that plants were locally affected by substrate treatments but only on the first year. Finally, substrate yearly changes were documented and suggest that differences between substrate treatments were maintained over the years.
As portrayed above, while positive HDR is a generally accepted phenomenon with strong theoretical backing, soil HDR (especially in studies comparing similar sized units) is often not positive (Stein & Kreft, 2015).
Since experimental soil studies are typically limited in size, the size of the modules used in the experiments was targeted as the potential cause to this discrepancy (Walker & Lundholm, 2017). The even smaller patches (i.e., subplot) within the experimental modules may not be large enough to sustain individuals of a different species.
Presumably, if only soil experimental studies were larger in size, the patches within the experimental modules could sustain individuals from different species and the studies would show a positive HDR in accordance with general HDR findings.
A meta-analysis performed entirely on soil manipulation studies (Tamme et al., 2010)
FIGURE 5 Differences in substrate core sample weight and January substrate moisture between treated subplots and their surroundings over the 3 years of the experiment. Differences in weight (a) and January moisture (b) between treated subplots and their surrounding substrate. Post hoc tests showed that for weight and January moisture measures, M-HET (tuff) and M+O-HET (tuff) differences were significantly different from the other subplot differences
Our study had large enough subplots to sustain several individuals of a certain species, involved both mineral and organic substrate manipulations and lasted more than one growing season. However, we did not find a positive HDR or a positive effect on biomass in experimental modules. Interestingly, another large-scaled ground-level 15-year experimental soil study (Baer, Blair, & Collins, 2015) that was published after Tamme et al. (2010) did not find a positive HDR either.
We believe that the theoretical considerations regarding the effect of subplot size on local populations' persistence within the subplots (Kadmon & Allouche, 2007) support a good understanding of this system. Our findings showed that community composition in treated modules of O-HET and M-HET as well as the community compositions inside the treated subplots in these treatments did differ during the first year. Substrate mineral and organic differences were maintained throughout the duration of the experiment while module and subplot communities no longer responded to these differences after the first year. These findings allow us to point toward a potential effect on the community level that was not previously explicitly examined in experimental studies.
In response to the three potential hypotheses presented in the introduction:
1. Realism in the method of creation of heterogeneity
On green roofs, as opposed to ground-level experiments, substrates that are used are intrinsically human-made; therefore, these manipulations are representative of the dynamics that are commonly predicted on green roofs. In addition, the organic manipulations that were especially targeted as nonrealistic showed similar behavior to the mineral treatment.

2. Subplot size effect on individuals
The initial first-year response of community composition and subplot biomass response over all 3 years imply that individuals in subplots were affected by the subplots' unique substrate compositions and that subplot size was sufficient for the maintenance of individuals from different species.

3. Subplot size effect on populations
Finally, the change in the response of community composition over time may imply that population and community dynamics play a role in structuring the communities in these experimental modules. Subplot communities in the second and third seasons may have been altered by a "mass effect" (Shmida & Ellner, 1984) of propagules from their surroundings while losing many of the propagules of their locally "adapted" community to the unfavorable surroundings.
The lack of significance between treatment M+O-HET matrix and subplots may result from the reduction in the area surrounding the subplots. The excessive fragmentation into many units may have prevented the establishment of any community that would be subplot-specific (Kadmon & Allouche, 2007). This may also imply that positive HDR is limited by abiotic heterogeneity.
Our findings also suggest a relatively strong effect of block location within the city on plant community development. This could result from different microclimates in the different parts of the city (mainly wind exposure and specific rain events at the end of the season that could affect plant development) as well as the specific site characteristics (height of roof and distance to potential pollinating insect communities) as was displayed in previous studies (Braaker, Ghazoul, Obrist, & Moretti, 2014). These differences in plant communities in identically designed modules should be considered in the future design of diverse green roofs within the city. Our study suggests that the specific location of the roof could eventually support a different plant community structure and so increase the total green roof beta diversity.
Finally, in order to properly examine the viability of distinct plant populations within the different niches, we suggest the construction of experimental designs that would manipulate both module and subplot sizes. This experimental design would include subplots that could potentially support a distinct population over time. This would allow the point to be addressed appropriately. Alternatively, seed dispersal between subplots and the matrix could be directly manipulated to test for the same effect. These studies would preferably also include heterogeneity in non-nutrient substrate components which is very rarely manipulated.

ACKNOWLEDGMENTS
This study was kindly supported by the Kadas family and Haifa University.
FIGURE 6 Differences in substrate core sample percent organic matter between homogeneous and low organic subplots and their surroundings over the 3 years of the experiment. The difference in percent organic matter between homogeneous (HOM) and low organic (O-HET) subplots and their surroundings. Core samples were taken at the end of each growing season. Dotted line marks the difference in the initial levels of the low organic substrate and its surroundings