Phylogenetic diversity and community assembly in a naturally fragmented system

Abstract We sought to assess effects of fragmentation and quantify the contribution of ecological processes to community assembly by measuring species richness, phylogenetic, and phenotypic diversity of species found in local and regional plant communities. Specifically, our fragmented system is Craters of the Moon National Monument and Preserve, Idaho, USA. CRMO is characterized by vegetated islands, kipukas, that are isolated in a matrix of lava. We used floristic surveys of vascular plants in 19 kipukas to create a local species list to compare traditional dispersion metrics, mean pairwise distance, and mean nearest taxon distance (MPD and MNTD), to a regional species list with phenotypic and phylogenetic data. We combined phylogenetic and functional trait data in a novel machine‐learning model selection approach, Community Assembly Model Inference (CAMI), to infer probability associated with different models of community assembly given the data. Finally, we used linear regression to explore whether the geography of kipukas explained estimated support for community assembly models. Using traditional metrics of MPD and MNTD neutral processes received the most support when comparing kipuka species to regional species. Individually no kipukas showed significant support for overdispersion. Rather, five kipukas showed significant support for phylogenetic clustering using MPD and two kipukas using MNTD. Using CAMI, we inferred neutral and filtering models structured the kipuka plant community for our trait of interest. Finally, we found as species richness in kipukas increases, model support for competition decreases and lower elevation kipukas show more support for habitat filtering models. While traditional phylogenetic community approaches suggest neutral assembly dynamics, recently developed approaches utilizing machine learning and model choice revealed joint influences of assembly processes to form the kipuka plant communities. Understanding ecological processes at play in naturally fragmented systems will aid in guiding our understanding of how fragmentation impacts future changes in landscapes.


| INTRODUC TI ON
With the continued anthropogenic alteration of natural landscapes, there is a persistent and pressing need to investigate the consequences of habitat fragmentation and how these consequences affect biodiversity in ecological communities. Specifically, there is a need to understand the effects of fragmentation on phylogenetic and functional trait diversity (Debinski & Holt, 2000;Ewers & Didham, 2006) as they have the power to elucidate past ecological processes that have impacted the community (Cavender-Bares et al., 2009). Understanding the processes involved in community formation can provide insight into what ecological pressures are influencing community assembly and ultimately the biodiversity we observe (Faith, 1992). By studying recently formed, naturally fragmented landscapes, we can explore the ecological processes that are involved in the early construction of species assemblages, the coexistence of species, and importantly the maintenance of diversity. Thus, if we understand the natural ecological processes at play in response to fragmented landscapes, we can use this information to guide our understanding of how future ecosystems may respond to fragmentation, either natural or human-caused. Additionally, we can explore the impact of fragmentation on phylogenetic and phenotypic diversity.
Previous work has characterized species richness and phylogenetic diversity in fragmented systems, and sometimes both components are explored (Helm et al., 2006;Santos et al., 2010). In these and other studies, however, the fragmentation process is often implemented experimentally or due to human impacts on a system (Arroyo-Rodríguez et al., 2012;Laurance et al., 2010).
Furthermore, functional trait diversity of fragmented systems is rarely explored alongside phylogenetic information (but see Ribeiro et al., 2017), even though the traits important for existing in a community and local environment can be very telling of the processes that led to the assembly of the current community (de Bello et al., 2009;Kraft et al., 2007;McGill et al., 2006;Weiher & Keddy, 1999). Research has thus far focused on frequency of traits, for example, relative abundance of reproductive strategy and how overall functional diversity is reduced with fragmentation (Girão et al., 2007), rather than the impact of the functional trait variation present. Exploring the effect(s) of fragmentation on phylogenetic and functional trait diversity in a naturally fragmented system will help establish what ecological pressures fragmentation evokes, for example possible increased competition, and how biodiversity is impacted by fragmentation.
The phylogenetic diversity of a community captures information about the amount of evolutionary history shared among the species within a community, which is oftentimes used as a proxy for functional trait differences among species within that particular community (Webb et al., 2002). Phylogenies overall are assumed to reflect morphological, ecological, genetic, and physiological differences that have accumulated between lineages (Gerhold et al., 2015). Phylogenies are thus useful in understanding processes that have influenced, and may continue to influence, multiple aspects of diversity within a community (Brooks & McLennan, 1991;Owen et al., 2019;Tucker et al., 2017;Webb, 2000). For example, community phylogenetic approaches have been used to understand ecological processes important for the assembly of alpine plant communities (Marx et al., 2017), as plants in alpine environments are exposed to harsh conditions requiring a suite of functional traits that may be best represented using a phylogeny.
In the field of community ecology, dispersion metrics calculated from phylogenetic distances between species are often used to infer local ecological processes that have contributed to community structure (Kembel et al., 2010;Kraft et al., 2007;Webb et al., 2002Webb et al., , 2008. The nonneutral processes inferred are generally habitat filtering (Bazzaz, 1991), which is inferred when species within a community are phylogenetically closely related, and competitive exclusion (MacArthur & Levins, 1967), which is inferred from a community of species encompassing high phylogenetic variation (Webb et al., 2008). The justification for these inferences relies on the assumption that most functional traits, especially those important in surviving habitat conditions or local competition for resources, are conserved so that closely related species tend to share similar functional traits. Thus, if many species require similar functional traits to survive in an environment, we then expect these species to be more closely related to one another than by chance, that is, would observe low phylogenetic dispersion. Likewise, if species are competing for a similar niche space, species with traits that are dissimilar are those that exist in the community because they have not outcompeted one another, resulting in species that are not as closely related, and subsequently large phylogenetic dispersion is observed.
In addition to phylogenetic information, morphological, physiological, behavioral, or ecological traits can also be incorporated directly to understand community assembly processes (Cornwell et al., 2006;Kraft et al., 2007Kraft et al., , 2015. Traits are often assumed to correlate with phylogenetic information, but this is not always the case (Mazel et al., 2018) and thus sometimes using the traits themselves, rather than the phylogeny as a proxy, can provide a more accurate depiction of community assembly processes (de Bello et al., 2009;Kraft et al., 2007). Specifically, functional trait diversity can, perhaps more directly, provide information about how competition between members in a community might promote or hinder their coexistence (MacArthur & Levins, 1967;McGill et al., 2006;Weiher & Keddy, 1999). Thus, incorporating both phylogenetic and functional trait diversity within a single community can help infer the processes that have led to the assembly of that community, and ultimately what contributes to the maintenance or loss of biodiversity (Cadotte et al., 2013;Webb, 2000).
Utilizing both traits and phylogenies presents challenges, as incorporating both traditional metrics in community ecology is not straightforward. Additionally, the use of phylogenetic dispersion metrics to infer processes of community assembly has presented its own concerns. One of which is the assumption that functional traits important for assembly are conserved across, or correlated with, the phylogeny as this does not always hold (Cavender-Bares et al., 2009;Mayfield & Levine, 2010) However, limiting analyses to functional traits does not necessarily solve the problem because then the phylogenetic information is not incorporated, meaning information inherent in evolutionary relationships is not accounted for. Additionally, a conclusion based on hypothesis testing and interpretation of processes when a significance threshold is passed is arguably problematic for biological inferences when we know the inference itself exists on a continuum, rather than on a binary threshold (i.e., yes/no). Therefore, we also use an alternative approach, Community Assembly Model Inference (CAMI; Ruffley et al., 2019).
This approach attempts to address the aforementioned problems by inferring a model of community assembly using both phylogenetic information and information on a single continuous trait. The advantages of this approach include the avoidance of assumptions as to how traits evolved along a phylogeny and the uncertainty in the community assembly inferences to be quantified, avoiding a significance threshold for inference. In utilizing this new approach, along with traditional dispersion metric approaches, we seek to learn more about the ecological processes at play in naturally fragmented systems by incorporating phylogenetic information and functional traits together.
We ultimately combine the phylogenetic and functional trait data for use in CAMI, a novel machine-learning model selection approach (Ruffley et al., 2019), to infer the probability associated with different models of community assembly given the data.
With CAMI, we also go one step further than testing for nonneutrality by quantifying the strength of proposed nonneutral models associated with inferred processes of community assembly. Finally, with the probabilities associated with the predicted models and their relationship to island meta-data, such as area and proximity to the outer edge of lava flow, we are able to further quantify the effect of fragmentation on assembly processes.
With this information we can ask whether these methods, hypothesis testing with dispersion metrics, and CAMI, are corroborative of each other and whether simultaneously considering phylogenetic and trait information changes the inferences made by dispersion metrics that consider the two methods alone. This work investigates phylogenetic and functional diversity within a naturally fragmented system and ultimately, we assess the effects of fragmentation on kipuka plant communities at Craters of the Moon National Monument and Preserve by (1) measuring species richness, phylogenetic, and phenotypic diversity of species found in the kipuka community and those found in the greater shrub-steppe region, and (2) quantifying the contribution of different ecological processes to the assembly of communities in the fragmented landscape with both phylogenetic and ecological information. Given the harsh landscape and the isolation of the kipukas, we predict that the assembly of the plant communities in kipukas will be shaped by nonneutral processes, predominantly by environmental conditions and less so by competitive interactions due to the combination of climatic extremes in the availability of water, temperature variation, and high wind experienced in the region.

| Study system
Craters of the Moon National Monument and Preserve (CRMO) located in south central Idaho, USA is a naturally fragmented system ideal to explore these questions because the lava-flow islands of vegetation within the preserve have been formed relatively recently, within the last 15,000 years. The islands are young in age, and there are many of them, thus offering many replicates to detect the impacts of natural fragmentation. Additionally, within CRMO, the plants that exist in the lava-flow islands experience harsh environmental conditions that have further shaped the assembly of species within the communities. Between 15,000 years ago (kya) and as recently as 2 kya, the eruptive periods at CRMO have resulted in 60 overlapping flows that encompass nearly 1900 km 2 (Kuntz et al., 1982;National Park Service, 2011). After each eruption, islands of vegetation surrounded by lava flows were formed. These vegetation-filled lava-flow islands are known as kipukas, a Hawaiian term used for an area of older land that is completely surrounded by an area of younger lava flows (Vandergast & Gillespie, 2004). There are over 500 kipukas at CRMO creating a vegetated archipelago of islands within an "ocean" of basaltic lava.
The size of the kipukas ranges from substantially less than one km 2 up to a privately owned kipuka that is over 341 km 2 (National Park Service, 2018). The plant communities at CRMO differ depending on successional stage and location, for example whether on lava flows, in cinder areas, or within kipukas.
Plant communities in kipukas are dominated by shrubs like sagebrush (Artemisia tridentata) and perennial bunchgrasses such as Idaho Fescue (Festuca idahoensis) (Link et al., 2006). Shrubs and perennial bunchgrasses dominate the shrub steppe ecoregion, which covers about 6,450,000 km 2 of western North America (Daubenmire, 1970;Link et al., 2006;Rickard & Vaughan, 1988). Typical of the semi-arid shrub steppe ecosystem, the dry climate of CRMO is characterized by a combination of high temperature, low precipitation, and strong winds. Air temperatures approach 30°C in summer months and the surface of the lava can reach 77°C, whereas in the winter the air temperature can get as low as −17°C (NPS Contributors, 1991; Western Regional Climate Center, n.d.). Average annual precipitation ranges throughout the monument from southern portions to northern portions accumulating 38-51 cm, respectively, and most of the precipitation comes in the form of snow (NPS Contributors, 1991). Strong daily afternoon winds are between 24 and 48 km/h (National Park Service, 2016). Individually, these harsh conditions, and the combination of them, limit the possible plant diversity that could persist at CRMO to those species that can deal with these physiological stresses.
For this study, we used data from floristic surveys of vascular plants in 19 kipukas at CRMO. We used the collections, along with a Flora of the shrub-steppe ecoregion, to describe the phylogenetic diversity in and around the naturally fragmented landscape. We used an existing phylogeny of Spermatophyta (Smith & Brown, 2018) to construct a community-wide phylogeny of the species in the kipukas and around the region. With these regional and kipuka phylogenies, we used traditional dispersion metrics and hypothesis testing (Webb et al., 2002(Webb et al., , 2008 to infer processes of community assembly in the kipukas. As we are interested in the effects of fragmentation on plant communities, we focused on plants collected within kipukas and not on the lava fields. Plant traits are important for resource acquisition, seed dispersal, reproductive systems, and might be specific adaptations to low water availability. Adaptations include for example, modifications to increase photosynthesis efficiency (e.g., relative abundance of CAM, C3, and C4 species) (Cavagnaro, 1988), a reduction in size of stomata (Sundberg, 1985), and an overall decrease in height to minimize conduit diameter for water transport as a wider diameter makes the species more vulnerable to conduction-blocking embolisms from drought or cold (Olson et al., 2018). We chose the functional trait of maximum vegetative height to generate phenotypic dispersion metrics as height is a proxy for resource allocation and competitive ability in plants (Cornwell et al., 2014;Weiher & Keddy, 1999;Westoby, 1998).
Additionally, it is consistently noted in species descriptions and as such, the amount of missing data would be minimal (Cornwell et al., 2014).

| Sampling
We obtained a permit for specimen collection from the National Park Service. Floristic surveys were conducted in 27 kipukas at CRMO during May-July 2016 and May 2017 ( Figure 1). Kipukas were accessed by foot and surveys targeted smaller kipukas that were generally less than 0.02 km 2 in size where we were confident that the habitat could be thoroughly inventoried by two people in the field by searching within the lava boundary of the kipuka.
For each species encountered in a given kipuka, we collected two or three representatives in florescence. Collected plants were pressed, brought back to the University of Idaho for identification, and are stored in the Stillinger Herbarium and publicly available online (www.pnwhe rbaria.org). The surveys resulted in a total of 66 species, which we use here as the kipuka community species list, and thus used in the kipuka community phylogeny. Nineteen of the 27 kipukas contained nine or more species and were used for subsequent analysis and categorized as northern, central, or southern kipukas (as indicated in Figure 1). We chose that cutoff to keep as many kipukas as possible in our dataset to maximize statistical power while balancing the fact that communities with less F I G U R E 1 Map of Craters of the Moon National Monument and Preserve, Idaho, USA. Colored outline of map inlays corresponds to organizational scheme of northern, central, and southern regions (yellow, blue, and gray, respectively). The 19 locations of kipukas with vascular plants surveyed are referenced with a letter. Photo at top right is of kipuka "A" than 10 species tend to have error rates in model identification of over 30% (see Ruffley et al., 2019). Identifying and using a comprehensive regional pool is important as this determines the species located within the region that could disperse into the communities of interest. Plant species located up to 17 km away have been demonstrated to have a role in the colonization process after the large-scale destruction of an ecosystem has occurred (Kirmer et al., 2008). For this study, a regional species pool was compiled by using the kipuka community list and adding the 621 other species listed on existing checklists for vascular plants at CRMO (Popovich, 2006) and the shrub-steppe ecoregion (Link et al., 2006), resulting in a regional pool, and the regional community phylogeny, consisting of 687 species (Appendix S1). Thus, the kipuka phylogeny is a subset of the regional phylogeny.

| Community phylogenetics
We constructed two community phylogenies: one from the species list stemming from all of the kipukas sampled, and one for the regional species pool. This was accomplished by using the drop.tip and keep.tip functions in the R package "ape," "phytools," and also the grepl function (Paradis et al., 2004;Revell, 2012). The complete regional species pool included all vascular plant species documented within CRMO and the shrub-steppe ecoregion, as these species are potentially able to colonize the kipukas and thereby play an important role in the colonization process of the kipuka community (Kirmer et al., 2008). We chose to prune from an existing seed plant megaphylogeny (Smith & Brown, 2018) to create a single kipuka phylogeny, as opposed to creating individual community phylogenies for F I G U R E 2 Local community phylogeny of species found in the kipukas sampled at Craters of the Moon National Monument and Preserve, Idaho, USA. Colors shading taxon names correspond to Family listed at right each kipuka, as the approach we chose has been shown to result in a more consistent estimate of evolutionary relationships and distances between taxa (Erickson et al., 2014). We constructed the regional phylogeny in a similar way, by dropping species not included in the regional checklists off of an the seed plant megaphylogeny (Smith & Brown, 2018). This subsampling of the megaphylogeny has the advantage of having no impact on the branch lengths already estimated and recent studies suggest these are reliable trees for community phylogenetic inference (Li et al., 2019). The megaphylogeny we used, which consists of 79,881 vascular plant species with molecular data available from GenBank, is the largest dated phylogeny currently available for seed plants and has broad taxon sampling (Jantzen et al., 2019).
If a species was present in the community but absent in the megaphylogeny, a "replacement" species that is a close relative in the same genus with a similar ecological distribution present in the megaphylogeny was retained in the phylogeny (Qian & Jin, 2016). We acknowledge that using replacement species could impact our calculation for community dispersion, though this is unlikely to be significant as a majority of the species relationships are rather distant (Jantzen et al., 2019). Species present in the kipuka and regional communities but for which the genus was not represented in the megaphylogeny and/or no suitable replacement was available (e.g., only one species was present in the megaphylogeny and there were multiple species in the regional species list) were not included in the community phylogenies. The resulting two community phylogenetic trees, after dropping species not present in the checklists and adding replacements, contained 65 and 641 species for the kipukas and regional pool of CRMO, respectively (Figures 2 and 3).
F I G U R E 3 Regional community phylogeny of species found in the shrub-steppe ecosystem. The bars surrounding the phylogeny loosely indicate Family grouping

| Functional trait
Maximum vegetative height data for all species in the kipuka and regional communities were gathered using a combination of herbarium records, species descriptions, and Floras (Hitchcock & Cronquist, 2018). Maximum vegetative height values were logtransformed because the data were strongly right skewed. Though it made the data more normal, log transformation was performed primarily for ease of biological interpretation of maximum vegetative height. Notably, a very small number of tree species in the kipukas have very large maximum height values compared to the rest of the species in the kipukas thereby inflating the impact the maximum vegetative height of these species has on the analyses of ecological process. Transforming the data allows us to consider the differences in height at a small scale as equally important as the large differences in height presented by the species of trees within the kipukas.

| Community dispersion metrics
We measured the amount of phylogenetic dispersion among species in the kipuka community and tested for significance of the difference between the observed patterns and neutrality by calculating the standardized effect sizes (SES) of two different dispersion metrics (Webb, 2000;Webb et al., 2002) using the R package "picante" (Kembel et al., 2010). First, we calculated mean pairwise distance (MPD) between all species in the kipuka community phylogeny. We also calculated the mean nearest taxon distance (MNTD) as the mean distance separating each species in a community from its closest relative, this metric captures how clumped the species in the community are on the phylogenetic tree and the prevalence of short-branched clusters of species separated by longer branches. We then compared the observed values to the null expectations of these metrics that were produced by generating 1000 replicate metrics. Each of these replicates was made from shuffling the species present in the regional community randomly, resampling the same number of species, and then recalculating the metrics.
If the observed values for MPD or MNTD are significantly under-dispersed or clustered, the test statistic fell in the lower 2.5% of the values obtained in the null distribution (p-value < .025).
A community assembly process of habitat filtering is inferred in this case because the species in the local community are more closely related than is expected by chance (Gotelli & Colwell, 2001;Kembel et al., 2010;Webb, 2000;Webb et al., 2002Webb et al., , 2008. Alternatively, if the observed metrics are significantly overdispersed, meaning the test statistic fell within the upper 97.5% of the null distribution (p-value > .0975). In this case, a community assembly process of competitive exclusion is inferred because the species in the local community are more distantly related than you would expect by chance. As these tests are done separately, if neither metric fell in either tail of the null distribution, a neutral process of community assembly was inferred. Though if one metric, either MPD or MNTD was found to be significant and the other not significant, we still considered the significant result.
Mean pairwise distance and MNTD can be calculated using phylogenetic branch lengths, the number of nodal distances, or phenotypic/functional trait differences (Gotelli & Colwell, 2001;Kembel et al., 2010;Webb, 2000;Webb et al., 2002Webb et al., , 2008. Thus, we measured the phenotypic dispersion the same way we calculated the phylogenetic dispersion metrics. We calculated each metric, MPD and MNTD, then performed 1000 random shuffles of the regional and local communities to get the null distribution and to see if the observed metrics fell within either tail. We first ran a comparison between the kipuka and regional communities and then also looked at the kipukas separately by further pruning the phylogeny to represent only species present in a given kipuka.
We then repeated this process for each of the remaining kipukas individually.

| CAMI
To integrate phenotypic and phylogenetic data while inferring community assembly processes, we used a novel simulation software and inference procedure for community assembly models implemented in the R package "CAMI" (Ruffley et al., 2019). This approach works by first simulating many datasets of phylogenetic and phenotypic data under various community assembly processes such as habitat filtering, competitive exclusion, and neutrality. We then use a set of sum- To establish what model under which to simulate data, we first determined the model of trait evolution that best fits the regional phylogeny and regional trait information prior to simulation of phylogenetic and phenotypic data in CAMI. We fit the empirical data to two models of trait evolution, Brownian Motion (BM; Felsenstein, 1985) and Orstein Uhlenbeck (OU; Butler & King, 2004;Hansen, 1997) using the fitcontinuous() function in the R package 'Geiger' . BM models mimic the process of evolutionary drift over macroevolutionary time, t, with a single parameter, σ 2 , that controls the rate of phenotypic change through time such that the expected distribution of trait values should be normal with the variance σ 2 t. OU does the same, only it includes a selective regime in which traits are "pulled" toward a phenotypic optimum at a rate of α. Using AIC, the best fitting model was found to be OU, which meant both parameters σ 2 and α needed to be estimated. To fit an OU model, we maximized the likelihood of the parameters of the OU model given the kipuka data. However, OU model parameters are notoriously hard to estimate as σ 2 and α are confounded and data can be fit using various combinations of these parameters where the likelihood always gets better with an increasing α and smaller σ 2 , though increasing α values become more and more unrealistic the larger they get . Therefore, we fit several OU models to the empirical data, varying the bounds of α from 0.01 to 1, to determine at what values of σ 2 and α the likelihood stopped getting dramatically better. This was at an estimated σ 2 of 0.92 with a corresponding estimate of α at 0.2; we used these estimates to simulate the trait data in CAMI (Appendix S2).
We simulated 10,000 community assembly datasets for each assembly model, for competitive exclusion, habitat filtering, and neutral, all under an OU model of trait evolution with the above estimated parameters. The other parameters such as the strength of filtering/competition t and the phylogenetic parameters, the speciation rate λ and the extinction rate μ, were drawn from their default uniform prior distributions as implemented in CAMI. The resulting simulated data, along with the empirical data, were summarized into 30 different summary statistics (Appendix S3) to be used for model selection in RF and parameter estimation in ABC. For community assembly model selection, we constructed a classification forest consisting of 1000 decision trees using the 30,000 simulated datasets and the 30 summary statistics. RF works by using many decision trees to partition out the variation in the summary statistics and uses these differences to distinguish between the three community assembly models. As the decision trees are being constructed, they are also simultaneously being validated by a portion of the data that is withheld from the construction. This enables the calculation of the out-of-bag (OOB) error rate, or the proportion of misclassified simulations. This OOB error rate details how accurate the classifier is overall and also for each model, as some models are easier to distinguish than others. The resulting classification forest was then used to determine which model of community assembly structured the kipuka plant communities at CRMO. Here, we inferred the probability of each community assembly model for each of the 19 kipukas surveyed.
We performed parameter estimation using ABC following Ruffley et al. (2019). For ABC, we scaled the summary statistics by their standard deviation and then used the top 10 informative summary statistics from the RF classifier to estimate the posterior probability of t, the strength of habitat filtering (Appendix S4). We only considered the simulations under the community assembly model that best fit the data given the RF model selection, (i.e., the habitat filtering simulations). From those, we accepted 100 simulations from the posterior distribution for the parameter t. We used these estimates to generate 95% high density confidence intervals (Kruschke, 2011).

| Factors influencing community assembly
To understand whether the model probabilities were explained by the fragmented nature of the kipukas, we constructed linear regression models using the lm() function in R 3.6.1 and tested whether any significant relationship existed. Specifically, we tested for whether any of the following independent variables; species richness, area of kipuka, distance to the edge of lava flow (isolation), and kipuka elevation, explained the variation in support for community assembly models associated with the 19 kipukas in our study (dependent variable). One may expect the combination of isolation and area, or isolation and elevation to better capture "fragmentation" than just one of the variables alone. Thus, we also tested whether the interaction between any of these variables resulted in a significant relationship with model support. This analysis aimed to understand whether these metrics of fragmentation explained variation in the ecological processes inferred from the phylogenetic and functional trait data.

| Kipuka community diversity and biogeographical attributes
The 66 plant species collected in the 19 kipukas sampled at CRMO represent 24 families and 51 genera. Species richness ranged from nine to 20 species per kipuka. The phylogenies created using an existing seed plant megaphylogeny consisted of 65 and 641 species in the kipuka and in the regional community phylogenies, respectively.
Mean maximum vegetative height was 126 cm for the regional community and 77 cm for the kipuka community (Table 1). There was no missing data for the height data for species used in the analysis.
The mean area of kipukas sampled was 13,670 m 2 , mean kipuka isolation, that is the distance from the edge of a kipuka to the outer lava flow, was 348.5 m, and mean kipuka elevation was 1574 m.

| Community dispersion metrics
The observed values of MPD and MNTD for the kipuka community as a whole (all 66 species observed in kipukas) suggest that neutral processes are dominant as neither dispersion metric was significantly under-or over-dispersed (Appendix S5). Although not significant, TA B L E 1 Summary of vegetative height data for the regional community (n = 641 species) and the kipuka community (n = 65 species) and the biogeographical factors of kipukas that were included in the analyses for neutral processes (406 out of 1000). The SES are calculated by standardizing the raw phenotypic and phylogenetic dispersion metrics relative to the total variation observed. The empirically calculated SES is then considered the test statistic when compared to a null distribution of SES and the p-value is where that test statistic falls within the null distribution. In Figure 4, the p-values reported in each cell are as follows, for example, kipuka A received a SES rank of 10 out of 1000 randomizations for phylogenetic data using the MPD metric and has a p-value of .01 listed and thus significant support for clustering. In sum, neither process of habitat filtering nor competitive exclusion were inferred with these traditional phylogenetic dispersion metrics.
When considered individually across kipukas, none of the 19 kipukas showed significant support for over-dispersion with either the MPD or MNTD metric using phylogenetic data ( Figure 4).
Using the MPD metric based on phylogenetic data, five kipukas showed significant support for phylogenetic clustering and with the MNTD metric based on phylogenetic data, two kipukas showed significant support for phylogenetic clustering. Two kipukas (G and P) showed significant support for clustering with each metric, MPD, and MNTD. Thus, we found more individual kipukas at CRMO to be phylogenetically clustered than over-dispersed.
Among the remaining 14 kipukas that did not significantly support either clustering or over-dispersion using the MPD metric and phylogenetic data, eleven trended toward phylogenetic clustering (p = .5-.25), and only one (kipuka S) trended toward over-dispersion (p = .75-.95). Using the MNTD metric and phylogenetic data, nine kipukas trended toward clustering. None of the kipukas had significant support for over-dispersion based on the phylogenetic MNTD metric.
In regard to phenotypic dispersion based on maximum vegetative height and the MPD metric, no kipuka showed significant support for either clustering or over-dispersion using MPD. Six kipukas had ranks above 50 but less than 250, indicating a trend toward phenotypic clustering. Only two kipukas tended toward over-dispersion indicating possible competition (kipukas I and G had ranks above 750 but below 950). Nine kipukas trended toward clustering.

| Selection of community assembly model
In general, most kipukas had very similar summary statistics, many with an expected amount of deviation given the varying species' pools across kipukas (Appendix S6 and available at https://github. com/ruf fl eymr/Peter son_Data/blob/maste r/KipSu mmar y Stat s. csv). Notably, the variance of vegetative height among kipuka species was almost always, except in four kipukas, smaller than that of the regional species pool trait variance. This is somewhat indicative F I G U R E 4 Heatmap of p-values for the 19 kipukas sampled at Craters of the Moon National Monument and Preserve, Idaho, USA for each phylogenetic and phenotypic diversity metric. The header of each column is the test that the p-value in the cells refers to (mean pairwise distance, MPD and mean nearest taxon distance, MNTD). Colored squares at the left of the heatmap denote the kipuka letter and region (northern, central, and southern) as indicated in Figure 1. Darker gray colors represent lower p-values and lighter gray colors represent higher p-values. The standardized p-value is noted in each cell. Additionally, a black circle within an individual cell represents a p-value of less than .025 indicating significant support for phylogenetic or phenotypic clustering. A p-value of more than .975 would indicate significant support for over-dispersion

| Factors influencing community assembly
Of all the linear regression models tested to evaluate the effect of kipuka properties on the support for community assembly models, few resulted in significant relationships (α = 0.05). The only models with significant prediction ability were species richness predicting model support for competition, as well as species richness predicting model support for the neutral model ( Figure 6, Appendix S8).
Specifically, as species richness for a kipuka increased, the model support for competition decreases (p-value .018) and the model support for the neutral model of assembly increased (p-value .019).
Likewise, elevation was nearly a significant predictor of the model support for habitat filtering (p-value .052), where low elevation kipukas showed higher support for the habitat filtering model ( Figure 6).
All other models, including those with interaction terms and multiple predictors did not increase the predictability of any of the response variables.

| DISCUSS ION
While traditional phylogenetic community approaches based on trait and phylogenetic dispersion suggest neutral assembly dynamics, overall, we do find some support for phylogenetic clustering, and ultimately habitat filtering. Importantly, we find that recently devel-

| Phylogenetic and functional trait diversity
Using traditional dispersion metrics alone, such as MPD and MNTD, and hypothesis testing, our analyses mainly support the role of neutral processes forming the community as very few kipukas resulted in significantly over or under-dispersed phylogenetic or functional trait metrics. Under a neutral model of assembly, all species present in a regional community pool have an equal probability of colonizing and persisting in that local community (Hubbell, 2001;Rosindell et al., 2012). This neutrality implies that species differences (e.g.,  (Woodcock et al., 2007), farmland birds that exist in a fragmented agricultural landscape (Henckel et al., 2019), and cichlids in Lake Tanganyika (Janzen et al., 2017).
Given that no phylogenetic signal, or Blomberg's K in this case, for our trait of interest was estimated to be of 0.27 across all kipukas, the approaches above were not completely reliable. This is because the use of phylogenetic and functional trait dispersion metrics for community assembly relies on high phylogenetic signal in the trait(s) of interest. Rather an approach that does not assume phylogenetic signal in traits, such as CAMI, is justifiable to use (Cavender-Bares et al., 2009;Kraft et al., 2007). In CAMI, in all models of community assembly the species in the regional pool have an equal probability of colonizing a community thus, support for neutral and filtering suggests that the trait of maximum vegetative height reflects a barrier for some species inhabiting the kipukas. Perhaps the true functional trait barrier is the height of the plants, or perhaps it is related to the shared resource allocation that the plant trait height is a proxy for.
Either way, there is evidence that there is an environmental limitation or barrier to some species existing in the kipukas.
Support for multiple process of community assembly could mean processes of community assembly are operating at different scales. For example, previous work has found multiple mechanisms of community assembly operating in early plant communities (Marteinsdóttir et al., 2018). Assembly from the regional pool to local communities was mostly neutral, and within communities, nonrandom assembly occurred related to various traits important in a plant species ability to disperse, establish, and persist in a local community. Additionally, others have found that different community assembly processes operate at different life stages of plants (Hu et al., 2016). It is important to note that all environments, or each individual kipuka in this case, may not select for the same variant in traits (Lowe & McPeek, 2014). The kipuka community as a whole is then comprised of a set of species that are expressing different traits based on selective pressures at different scales (e.g., spatial, temporal, and phenological) (Hu et al., 2016;Lowe & McPeek, 2014;Marteinsdóttir et al., 2018). Support for both neutral and filtering processes operating in the assembly of the kipuka communities at CRMO may highlight processes impacting at different scales, different life stages, and the differences in selective pressures between kipukas. We may be observing and measuring the initial impacts of fragmentation on the kipuka communities and the long-term effects of these processes over a macroevolutionary timescale might not yet be realized.
Various traits in plants are important for resource acquisition, seed dispersal, and specific adaptations to the stress of low water availability exist. One of these, a reduction in overall plant height to minimize the diameter of vascular tissue in order to decrease occurrence of embolisms (Olson et al., 2018) would be particularly beneficial in habitats that experience temperature and precipitation extremes, such as at CRMO. We chose the single trait of plant height because of its impact on overall water movement in a plant, as susceptibility to stress due to low water availability and cold would impact a plants ability to persist at CRMO. Water stress in plants has been shown to be an important primary filter in restricting which species present in a regional pool were available to establish via community assembly (Luzuriaga et al., 2012). Future studies including several ecologically relevant traits could reveal a more complete picture of the role of phenotypic variation across species in F I G U R E 6 Significant linear regression model results (α = 0.05). Top panel (a) includes significant (**) results for the model support (dependent variable) and factor of kipuka (independent variable). Split panels demonstrate (b) nearly significant (*) relationship negative relationship between elevation and model support for filtering and (c) significant (**) positive relationship between species richness and model support for neutrality. The rest of the linear regression model results can be found in Appendix S8 constraining or promoting the assembly of fragmented communities.
Although one quantitative trait can be used at a time in CAMI, multiple analyses could be done to compare across traits.
Qualitative traits, for example seed dispersal mechanisms may vary between plants found within the local kipuka community and those in the regional community (Lowe & McPeek, 2014). Perhaps gravity seed dispersal is more prevalent for the kipuka species than for the regional species, however this was outside of the scope of this study.
In our efforts to measure the strength of filtering through the t parameter, we find that we do not have much confidence to estimate this parameter with our current techniques and data. The data are limited by small communities, and we know small communities lead to a lack of power in estimating this parameter (Ruffley et al., 2019).
However, we also know that these data support both filtering and neutral models of assembly, which could also be why estimating a parameter only from the filtering model is unsuccessful. at multiple scales from cellular to the entire organism, but root or stem failure is still possible (Gardiner et al., 2016). Increased susceptibility and negative impacts of wind damage has been found to be exacerbated when surrounding areas lack vegetation (e.g., denuded) such as those of the lava matrix at CRMO (Laurance & Curran, 2008).
Thus, plants with a maximum vegetative height shorter than the lava boundary would be able to withstand the strong winds experienced at CRMO better as they are partially protected within the "bowl" shape.

| A fragmented landscape
Within the fragmented landscape of kipukas at CRMO, the trait of maximum vegetative height may be particularly influential in the ability of a species to establish and thrive in the kipukas as height may be especially costly in this environment due to environmental stressors caused by fluctuations of temperature and precipitation that occur. How wind acts as a selective force for plants is of interest in other fragmented landscapes as well, as abiotic factors greatly influence the successful establishment and persistence of a species within a community. The fact that lower elevation kipukas show more support for habitat filtering models compared to the kipuka community as a whole is interesting and could be due to a finer scale filtering pressure along an elevational gradient, in addition to the already mentioned environmental stresses operating on the community as whole.
The impacts of fragmentation can be hard to measure at the phylogenetic scale which broadly characterizes diversity at a macroevolutionary scale. One way to obtain a finer perspective of local diversity within and between kipukas at CRMO for future work could be to incorporate genetic sequencing of individuals from each species collected. Producing species-specific population genetic data would then allow for quantification of diversity within species and comparisons among species. This proposed population genetic approach would allow us to quantify contemporary migration (i.e., dispersal) occurring within the local community between kipukas.
Although outside of the scope of this study, leaf tissue samples were obtained (and stored in silica) from each individual species collected and these could be used in the future in such a proposed population genetic study.
The fragmented landscape due to the lava matrix in which the archipelago of kipukas are situated makes CRMO a particularly useful system in which to ask questions related to functional trait diversity and phylogenetic diversity. Although this system is naturally fragmented, the intervening matrix in many ways is similar to anthropogenic alterations of landscape occurring elsewhere (e.g., asphalt, concrete). By understanding the ecological processes at play in natural fragmented systems and traits that may impact community assembly we can then use this information to guide our conservation and restoration efforts in future fragmented ecosystems.

| CON CLUS ION
With the continued alteration to natural landscapes, there is a persistent and pressing need to investigate the consequences of habitat fragmentation and how these consequences may impact phylogenetic and functional trait diversity. The incorporation of both phylogenetic and functional trait diversity within a single community can help infer the processes that have led to the assembly and formation of that community, and ultimately what contributes to the maintenance or loss of biodiversity. Using a new approach that infers a model of community assembly using both phylogenetic and trait information, along with measuring the strength of the inferred ecological process, we find that for the kipuka plant community at CRMO dual processes of neutrality and filtering based on maximum vegetative height have contributed to community formation. Additionally, we find there is evidence that environmental pressures are indeed prohibiting some species from inhabiting some or all of the kipukas, and these pressures may be more severe at lower elevations. When data for more than one trait are available, multiple CAMI analyses could be performed to compare the role of different traits and their impact on community formation. This type of comparative trait-based analysis could help to predict how community assembly might respond to changes such as fragmentation.

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
All input data and scripts for each analysis, along with the output data, can be found in https://github.com/ruffl eymr/Peter son_Data and data is located in a permanent Dryad repository https://doi.