A unified model of species abundance, genetic diversity, and functional diversity reveals the mechanisms structuring ecological communities

Abstract Biodiversity accumulates hierarchically by means of ecological and evolutionary processes and feedbacks. Within ecological communities drift, dispersal, speciation, and selection operate simultaneously to shape patterns of biodiversity. Reconciling the relative importance of these is hindered by current models and inference methods, which tend to focus on a subset of processes and their resulting predictions. Here we introduce massive ecoevolutionary synthesis simulations (MESS), a unified mechanistic model of community assembly, rooted in classic island biogeography theory, which makes temporally explicit joint predictions across three biodiversity data axes: (i) species richness and abundances, (ii) population genetic diversities, and (iii) trait variation in a phylogenetic context. Using simulations we demonstrate that each data axis captures information at different timescales, and that integrating these axes enables discriminating among previously unidentifiable community assembly models. MESS is unique in generating predictions of community‐scale genetic diversity, and in characterizing joint patterns of genetic diversity, abundance, and trait values. MESS unlocks the full potential for investigation of biodiversity processes using multidimensional community data including a genetic component, such as might be produced by contemporary eDNA or metabarcoding studies. We combine MESS with supervised machine learning to fit the parameters of the model to real data and infer processes underlying how biodiversity accumulates, using communities of tropical trees, arthropods, and gastropods as case studies that span a range of data availability scenarios, and spatial and taxonomic scales.


| INTRODUC TI ON
Biodiversity is structured hierarchically across spatial, temporal, and taxonomic scales (Leibold & Chase, 2017). Fluctuations of species abundances within communities operate on ecological timescales, on the scale of handfuls or tens of generations. Population genetic variation, by contrast, accumulates and degrades over timescales of tens to tens of thousands of generations (Leffler et al., 2012), while phylogenetic and functional diversity accumulate even more slowly, on the order of thousands to millions of generations (Uyeda et al., 2011). Over time, various fields have emerged to investigate processes within individual levels of organization (macroecology, comparative population genetics, macroevolution), but only recently have inroads been made to combine theory across multiple scales of space and time into a general unified model (Thompson et al., 2020;Vellend, 2010Vellend, , 2016. Complicating matters, there is little consensus over whether, and to what degree, ecological interactions contribute to the structuring of ecological communities (Harmon & Harrison, 2015;Rabosky & Hurlbert, 2015). Likewise, the relative contributions of colonization and in situ speciation to the composition of community structure remains an open question (Patino et al., 2017).
Discovering universal rules that structure ecological communities is a challenging task given the difficulty of disentangling the relative influence of faster ecological mechanisms from slower evolutionary processes (Ricklefs, 2004), yet a unification of theory across multiple scales will provide significant insight into the formation of biodiversity (McGill et al., 2019). Ecological models of community biodiversity inspired by the Equilibrium Theory of Island Biogeography (MacArthur & Wilson, 1967) and the Neutral Theory of Biodiversity and Biogeography (Hubbell, 2001) have focused on predicting only a single biodiversity metric, the shape of the local species abundance distribution (SAD). As central as the SAD is to macroecology and community ecology, it is often insufficient to distinguish among different models of community assembly, particularly at equilibrium (Chave et al., 2002;Haegeman & Etienne, 2011;McGill et al., 2007). Recently, DNA sequence data sampled at the community-scale has offered a powerful new approach for studying community dynamics at the genetic level (Baselga et al., 2013(Baselga et al., , 2015Dapporto et al., 2019;Múrria et al., 2017;Papadopoulou et al., 2009). While empirical investigation of community intraspecific genetic diversity has flourished, modelling efforts have remained constrained, with current theory either lacking an explicit population genetic foundation (Vellend, 2005), considering genetic variation only of a focal species (Laroche et al., 2015), or modelling but not fully exploring genetic variation at the community scale (Aguilée et al., 2018;Manceau et al., 2015). Demonstrating the power of unified modeling, a great deal of work has been done to incorporate phylogenetic information with abundance data to make inferences about community assembly processes (Jabot & Chave, 2009;Webb et al., 2002). While such approaches make useful predictions, they are predicated on assumptions of equilibrium within the local community and also assume that the phylogeny is a reliable proxy for functional trait diversity, an assumption violated by traits that are not phylogenetically conserved (Cavender-Bares et al., 2009).
Likewise, there have been other successful efforts to unify theory across timescales with mechanistic ecoevolutionary models of assembly. Cabral et al. (2019) unify population-level and evolutionary timescales to investigate the dynamic relationship between community age, competition, and local richness.  developed a trait-based, spatially explicit ecoevolutionary model to make inferences about prey and predator niche widths with potentially diverse data types. Incorporating temporal dynamics can help to distinguish among ecological processes (Chisholm & O'Dwyer, 2014;Jabot et al., 2018;Ricklefs, 2006), yet current theory fails to generalize across levels of biological organization. Adding more axes of data to process-based models without increasing model complexity at the same rate is therefore a necessary advance to break this many-to-one mapping of hypotheses to observation (Leibold & Chase, 2017;McGill et al., 2007).
The massive multidimensional data sets that continue to emerge from high-throughput biodiversity investigations applying community-wide surveying techniques such as eDNA (Deiner et al., 2017), metabarcoding (Andújar et al.,2018;Dopheide et al.,2019), and remote-sensing technologies that can directly infer trait data (Cavender-Bares et al., 2017), are therefore timely. However, the challenges associated with moving beyond descriptive approaches of interpretation and inference have limited broader understanding of processes generating biodiversity patterns (but see Bohan et al., 2017;Derocles et al., 2018). Historically there have been two general approaches to investigate the evolutionary and assembly processes underlying the patterns we observe in nature: (1) "process-first" approaches that use first principles to derive generative mechanisms to make predictions of a single data type under the assumptions of an idealized community (Gavrilets & Vose, 2005;Marquet et al., 2014;Rosindell et al., 2012); and (2) "pattern-first" approaches that reveal aggregate differences in macroecological patterns from real world systems across a range of spatial and temporal scales (Craven et al., 2019;Keil & Chase, 2019;Ricklefs & Bermingham, 2001;Rominger et al., 2016;Wagner et al., 2014). Recent advances in simulationbased inference under increasingly complex models provide a third option of unifying multiple processes and multiple data types across different scales Pontarp, Bunnefeld, et al., 2019). A unified model of community assembly, which accounts for the fundamental processes underlying biodiversity across spatial and temporal scales (Vellend, 2010), could be used to make predictions about multiple axes of biodiversity data that include species richness and abundances, distributions of species genetic diversities, and trait variation. Several studies have recently shown that such complex biological models and resultant high-dimensional data can be tractable within a machine learning framework (Schrider & Kern, 2018), providing a robust inference procedure for simulation-based interrogation of empirical data.
Here we introduce the massive ecoevolutionary synthesis simulations (MESS) model, building upon classic community ecology theory (Hubbell, 2001;Leibold & Chase, 2017;MacArthur & Wilson, 1967;Vellend, 2016) to produce a mechanistic model of local community assembly for making joint predictions of observed multidimensional biodiversity data such as that currently emerging from high-throughput metabarcoding studies (Taberlet et al., 2012).
MESS integrates ecological models of community biodiversity, comparative population genetics, and trait evolution, with an explicit focus on incorporating microevolution and ecological interaction processes, which are often underrepresented in mechanistic models (Leidinger & Cabral, 2017). MESS can simulate community assembly across a continuum of scenarios from evolved to assembled, and from purely neutral to niche-structured by either competition or environmental filtering. These simulations generate predictions for locally sampled distributions of abundance, genetic variation, and trait values which are summarized using a novel combination of statistics that capture the variation within and among these biodiversity data axes. We combine summary statistics from numerous simulations with supervised machine learning methods to test an array of competing models and to estimate model parameters relevant to understand complex histories of community assembly and evolution. We perform extensive simulation-based cross-validation analyses to explore precision and accuracy of model inference. Finally, we apply the model to four high-throughput biodiversity data sets representing different taxonomic and spatial scales: two arthropod communities with varying dispersal capacity from Mascarene islands of different ages (Emerson et al.,2017;Kitson et al., 2018); plot level sampling of Australian tropical forest trees (Rossetto et al., 2015); and archipelago-scale sampling of Galapagos Islands gastropods (Kraemer et al., 2019;Triantis et al., 2016).

| Metacommunity composition
The MESS model comprises three components summarised in  Table 1 for model parameter details). The metacommunity is modelled as a regional pool which is very large and fixed with respect to the timescale of the assembly process in the local community. It consists of a global phylogeny relating all species, along with species abundances, and trait values evolved along the phylogeny. The global phylogeny is produced by simulating a homogeneous, time-constant diversification process, in which lineages give rise to new lineages or die with fixed speciation ( ) and extinction ( ⋅ ) rates, until the desired number of species (S M ) is reached (treesim v2.4;Stadler, 2019). Next, we simulate a Brownian motion model of trait evolution on the phylogeny with a root value of 0 and a rate of σ 2 M (ape v5.3; Paradis et al., 2004). Continuous traits evolve following a Brownian process of random drift in the metacommunity, rather than an Ornstein-Uhlenbeck process, which is stochastic with a central tendency (Butler & King, 2004), because we assume species in the metacommunity are not exposed to constraints imposed by the local environmental conditions. Likewise with this model, we make no assumption about the degree of phylogenetic conservatism for each trait simulated. While multiple traits evolving under varying degrees of phylogenetic conservatism may provide more nuanced biological insight, for reasons of computational tractability we consider individual trait evolution as a reasonable first approximation. Additionally, we do not model intraspecific trait variation, on the assumption that trait values represent the mean phenotype of each species. Finally, the species abundances are sampled from a log-series distribution parameterized by the total number of species (S M ) and the total metacommunity size (J M ).

| Local community dynamics
The foundations of the community dynamics underlying MESS are based on the joint neutral model of abundance and genetic diversity described in Overcast, et al. (2019). The individual based community assembly model broadly follows that used in Rosindell and Harmon (2013), which is inspired by the ecological neutral theory of Hubbell respect to the timescale of assembly in the local community. As with the typical spatially implicit neutral model (Hubbell, 2001), the local community diversity approaches a dynamic equilibrium state from its initial conditions such that ultimately local extinction due to ecological drift is counterbalanced by new species arriving through colonisation.
Departing from the previous model, MESS allows relaxation of the assumption of ecological neutrality, generating individual fitness differences which account for biotic and/or abiotic interactions. MESS local community dynamics can range from fully neutral (species traits have no effect), to various degrees of non-neutrality determined by the magnitude that species traits influence individual death probability ( ) through competition or environmental filtering. Following Ruffley et al. (2019), we based our environmental filtering and competition models on a functional relationship common in coevolutionary models which relates trait-based interactions with the probability of persistence in a community, scaled by the ecological strength of the interaction (s E ; Lande, 1976;Nuismer & Harmon, 2015). The s E parameter determines either the strength of species-species competitive interactions or species-environment filtering interactions depending on whether a competition or filtering model is specified. MESS does not simultaneously model competition and filtering, though this will be a potential future development. Calculated death rates per species are normalised to provide a vector of death probabilities that weight the random sampling of which individual will die in each time step according to a multinomial distribution (see Appendix S1:Methods).
As a first approximation, within the local community we implement a point mutation speciation process (Hubbell, 2001), although other modes could be incorporated in future versions of the model (Haegeman & Etienne, 2017;Rosindell et al., 2010). Speciation is implemented phenomenologically and takes place with probability upon each birth event. Upon each speciation event, the new individual is assigned a unique species identity, and its prior species identity is recorded as the parental species for purposes of building the local phylogeny. The descendant species receives a trait value sampled from a normal distribution centered on the parent species' trait value and with variance equal to σ 2 M /( + ⋅ ), which is the expected variance of trait differences between parent and offspring species in the metacommunity. As each simulation proceeds, trait values continue to evolve in a punctuated fashion at each speciation event, and branch lengths within local radiating lineages are updated to reflect the accumulated time since speciation. F I G U R E 1 Conceptual diagram illustrating the machine learning inference procedure and the three primary components of MESS simulations. (a) The MESS machine learning inference procedure proceeds broadly in three steps. First, community-scale data is obtained for one or more axes of biodiversity data including abundances, trait values, and genetic sequence data, and community summary statistics are calculated. Next, prior ranges on model parameters are selected (depicted are migration rate (m), speciation rate (ν), and equilibrium (Λ)), numerous simulations are performed to match the sampling of the observed data using parameters sampled from these prior ranges (dashed box; see exploded view of simulations in (b), and the identical suite of summary statistics are calculated. Finally, a machine learning framework is trained using the simulated data, learning the mapping between summary statistics and simulation parameters. The trained machine learning framework is then used to estimate model parameters using the observed community summary statistics. (b) MESS simulations are composed of three hierarchically linked components. The metacommunity component (red) encompasses a global phylogenetic history of all species, along with species abundances and trait values evolved along the phylogeny. The local community component (black) involves a forward-time process during which a local community assembles by individual birth/death, immigration (dispersal from the metacommunity), and local speciation or extinction. The population genetic component (blue) approximates per species genetic polymorphism from coalescent simulations that are parameterized from the abundance histories and colonization times generated by the forward-time local community component. Processes which operate within and between each hierarchical level are indicated within each subpanel (see Figure 2 for further details on model parameters) Following Overcast, et al. (2019), the forward-time histories of colonization and abundance changes through time per species are rescaled to parameterize divergence time and effective population size in backward-time coalescent models with immigration for each species (Kelleher et al., 2016) to generate sampled local nucleotide diversities (π; Nei & Li, 1979). For reasons of computational efficiency, and to achieve a realistic scale in terms of numbers of individual organisms, we use a scaling parameter ( ) to specify the number of individuals per deme, thus the total number of organisms in the local community is given by J ⋅ . This notion of demes, or 'cohorts', groups of individuals that perform the same actions at the same time, is conceptually similar to that of Harfoot et al. (2014). We use the forward-time frequency of colonization events (scaled to number of colonizations per generation) for each species to parameterize the migration probability in the coalescent of colonization/divergence with ongoing immigration. The per site per generation mutation rate is and we use the harmonic mean of the forward-time population size history of each species to approximate each corresponding effective population size (Karlin, 1968;Pollak, 1983). The time of initial colonization of each species is the divergence time from the source population in the metacommunity within which the final coalescent events take place (going back in time). We scale forward-time Moran time steps by a factor of 2∕J to convert to backward-time Wright-Fisher units of nonoverlapping generations.

| Population genetics component
Finally, given an observed data set, coalescent simulations match the observed sample sizes of each species for which DNA sequence data F I G U R E 2 Flow diagram illustrating MESS model processes and parameters. A flow diagram illustrating all MESS model processes and the parameters that govern their behaviour. Each box illustrates a subcomponent of the model (colored to correspond with subcomponents illustrated in Figure 1b), and indicates the parameter(s) which determine the behaviour of each subcomponent. Diversification and trait evolution processes in the metacommunity (red) are determined by speciation (λ) and extinction (ε) rates, the total number of species in the metacommunity (S M ), and the rate of trait evolution (σ 2 M ), which follows a Brownian process. Abundances in the metacommunity are sampled from a log series distribution such that the total number of individuals is equal to J M . The local community (black) is initialized with a fixed number of individuals (J) and proceeds by a stepwise birth/death/immigration/speciation process, which (in the neutral case) is governed by the immigration rate (m) and the local speciation probability (ν), and which proceeds for a fixed amount of time per simulation determined by the Λ parameter. For non-neutral local community dynamics, unequal death probabilities (i.e. fitness differences) are determined by species trait values, the strength of ecological interactions (s E ) and the local trait optimum (z E ; in the case of environmental filtering). Finally, the population genetics component (blue) generates predictions of genetic variation per species based on standard population genetic parameters which are either fixed for all species per simulation (sequence length (L), mutation rate (μ), and number of individuals per deme (α)) or which are dynamically recorded per species per simulation (divergence time (τ) and effective population size [N e ]). Arrows between subcomponents indicate information flow through the simulations

| Summary statistics
We specify a hierarchical structure of summary statistics for each target data axis: species abundances, population genetic variation, and trait values. First, several relevant summary statistics are calculated per species, for each of the data axes. Next, each species-level statistic is aggregated and community-scale summary statistics are calculated per axis of data, capturing information about the distribution of the statistic across the community. We include as summaries the first four moments of each community-wide distribution, as well as pairwise Spearman rank correlations among all data axes.
For correlations involving the trait axis, we consider the absolute value of the difference between the species trait and the local trait mean as the trait variable. We also calculate the differences between regional and local values of trait mean and standard deviation (Δ trait and Δ trait respectively). Additionally, we utilize a framework of generalized Hill numbers as community-scale summary statistics, to quantify the shape of each distribution (Chao et al., 2014). In order to distinguish between these diversity metrics when calculated on distributions of different data axes we will refer to the Hill number of order q for abundance data as q D, for genetic data as q GD, and for trait (functional) data as q FD (see Appendix S1:Methods for further details). For simplicity, throughout the manuscript we will refer to Hill numbers calculated on distributions of each data axis as abundance, π, and trait Hill numbers.
As an example of the hierarchical nature of our summary statistics, consider genetic variation per species within a local community.
The average number of pairwise differences among sampled gene copies (π; Nei & Li, 1979) is calculated to summarize the genetic diversity of each species. As a per species metric π is well suited for characterizing genetic diversity of molecular data as it is able to capture most of the true population genetic diversity with only 5-10 individuals (Tajima, 1983). The per species π values are accumulated to compose the community genetic diversity distribution, and the first four orders of q GD of this distribution are calculated, summarizing the partitioning of genetic variation at the community scale. A similar hierarchical decomposition of abundance and trait diversity can be obtained. Importantly, with respect to the question of bias induced in summary statistics by unsampled taxa, within the local community it is reasonable to assume that unsampled taxa will be at very low abundance (Preston, 1948). In this case the failure to sample them will have essentially no impact on the abundance, π, and trait Hill numbers, and will induce relatively minor bias in the first four moments, though investigating the nature of this bias is beyond the scope of this manuscript. The complete MESS model predictions are compared with empirical data via summary statistics and machine learning inference methods enabling selection between local community models as well as estimation of parameters relevant to the community assembly process.

| Model behaviour
We simulated communities under a range of parameter values to understand how different model processes affect the distributions of community-scale data, and whether the summary statistics capture information to discriminate among various alternative models.
Given that the MESS model is dynamic in time, we controlled for this by running each simulation to the same fixed point in the assembly process. We quantified this point as the proportional approach to equilibrium (Λ) and fixed this parameter at 0.75. This value is measured as the fraction of information about the initial state of the local community which is no longer present in the current state (see Overcast, et al., 2019 for a full treatment of this parameter).
We allowed to take one of three values corresponding to no-, low-and high-speciation (0, 5 × 10 −4 , and 5 × 10 −3 respectively) and generated 10,000 simulations for each assembly model (see Table   S1 for simulation parameters). We also investigated how summary statistics of different assembly model types vary through time. To this end, we generated 10,000 simulations for each assembly model while allowing to vary as above, sampling communities at different stages of the assembly process (Λ ~ U[0,1]; see Table S2 for simulation parameters).

| Machine learning inference and crossvalidation
The MESS package includes an automated multistage machine learning (ML) inference procedure (Figure 1a). First, MESS model param- Next, the best community assembly model class is selected as that with the highest predicted probability, and a parameter estimation step is performed. Simulations are filtered to retain only those which belong to the best model, and an ML regressor is trained on this subset of simulations. A second round of feature selection and ML model hyper-parameter tuning is performed prior to ML regressor model training. Following this, summary statistics from empirical data are used to estimate MESS parameters of interest. We quantify uncertainty on parameter estimates as prediction intervals (PIs) using a quantile regression approach (Meinshausen, 2006). At this stage we are careful to evaluate parameter estimate uncertainty in light of the fact that uncertainty on model selection has not been propagated forward, which is an avenue for further development.
Finally, to evaluate model adequacy we implement posterior predictive simulations (PPS) to assess goodness of fit of the model to the observed data (Gelman, 2003). Additionally, after both classification and regression training steps, feature importances can be extracted to evaluate the proportion of information with respect to a given target variable that is contained within each retained summary statistic. The MESS ML classification and regression procedures can be performed with a number of ensemble learning strategies including random forest (Breiman, 2001), gradient boosting (Friedman, 2001),  Table S3 for simulation parameters). To quantify the accuracy and bias of MESS parameter estimation utilizing an ML ensemble method regression framework, we generated 10,000 community simulations per assembly model class while varying several parameters of interest ( , J, s E , m, v, and Λ) using log-uniform or uniform prior distributions (see Table S4 for parameters). ML estimator performance was then investigated using a K-fold CV procedure whereby simulations were split into training and testing sets, with the model being iteratively trained on each K-fold and performance being evaluated as minimized CV prediction error on the held out training set. Classifier model adequacy was quantified by the percent error rate of misclassification, and regression model accuracy was quantified by the explained variance and R 2 (coefficient of determination) regression scores.

| Empirical examples
As case studies, we selected four systems that occupy different spatial scales and probably occupy different locations on the continua of dispersal, speciation, ecological drift and non-neutrality. Each system has some combination of community-scale data available for two of the three axes which can be considered by the model. In this way we hope to demonstrate the power of MESS across taxonomic and spatial scales, using data availability scenarios that might be encountered by empirical biologists in the present or very near future. Next, we investigated the temporal dynamics of MESS community histories (Figure 4). Again, species richness in neutral models tended to exceed that of the non-neutral models throughout the entire community assembly process. In general, a low rate of local speciation produced a slight increase in richness and Hill numbers for neutral simulations, whereas a high rate produced dramatic increases in these metrics for all simulation scenarios. Between non-neutral models, richness and Hill numbers for competition were, on average, always greater than those of filtering models across all time points, with differences increasing with increasing speciation rate (v). For neutral models, q D tended to slowly increase monotonically through time, whereas q GD initially increased quickly with community-scale genetic diversity accumulating more slowly in later stages of assembly. Increasing v increased the average maximum q GD for non-neutral models, but in these simulations this maximum value tended to saturate very early, with little

| Parameter estimation ML cross-validation
Cross-validation explained variance and R 2 regression scores for model parameter ( , J, s E , m, v, and Λ) estimation were broadly congruent and positive in almost all cases, indicating that the simulated and estimated parameter values were correlated (in some cases highly so). For neutral simulations Λ had the highest R 2 (0.963) and ecological strength (s E ) the lowest (-0.037), with most parameters having moderate R 2 values (e.g. α = 0.567; m = 0.685; Figure 6).
The small R 2 for s E is expected given that neutral simulations should have no information about strength of environmental interactions.
Estimates of small to moderate values of m and v were accurate, but larger values tended to be underestimated. ML parameter estimation for simulations of filtering and competition models obtained improved accuracy to estimate s E (R 2 = 0.146 and R 2 = 0.287, respectively); however, R 2 values for other parameters were somewhat reduced with respect to the neutral simulations ( Figures S1 & S2). Both

F I G U R E 4
Community summary statistics through time for neutral and non-neutral models. This plot depicts the temporal change in select summary statistics for the three focal community assembly models at three different speciation rates: No, Low, and High corresponding to = 0, 0.0005, 0.005, respectively. The x-axis indicates community age measured as progress of the community toward equilibrium (Λ). Community assembly models depicted are neutral (orange), filtering (aqua), and competition (dark blue). Each subpanel shows the resultant summary statistic for 1000 simulations equally spaced through time for each model class. Simulated values are depicted as points, and a least squares polynomial is fit to better illustrate the trajectory. The far left column of panels illustrate species richness on the y-axes (S). The y-axes of the remaining columns illustrate the Hill number of order 1 (effective number of species) for abundance, genetic diversity, and trait values, respectively non-neutral models produced diffuse estimates of (R 2 = 0.205 and R 2 = 0.258) and J (R 2 = 0.398 and R 2 = 0.448). The most significant difference between the non-neutral models concerned estimates of Λ. Under competition scenarios, Λ estimates were precise but upwardly biased between Λ = 0 and 0.5, with increasing variance between Λ = 0.75 and 1. Under filtering scenarios, Λ estimates were only accurate for values close to Λ = 0.5, with decreasing accuracy as Λ moved away from this value in either direction.

| Empirical examples
The ML classification procedure identified the neutral model as the most probable for all three Mascarene arthropod communities  Figure S3).

| DISCUSS ION
Community ecology has been perceived as a "mess" (Lawton, 1999) because of the endless proliferation of processes proposed for shaping biodiversity. To remedy this, Vellend (2010) proposed a conceptual framework to unify the study of community assembly dynamics composed of four fundamental processes: dispersal, stochastic drift, selection (e.g., deterministic competition/filtering), and speciation.
These different processes operate on different timescales and contribute information to different biodiversity data axes (i.e., species richness, abundances, trait distributions). To more fully characterize the interaction among these processes it is therefore necessary to develop process based joint models of these data axes. Overall, we find that any two of the three data axes are sufficient to accurately identify the relative strength of neutral versus non-neutral processes in local community assembly, and that including trait information allows discrimination between which of the non-neutral processes are more important in driving the local patterns of biodiversity ( Figure 5). This latter finding suggests that niche-structured abundances and genetic diversity distributions are broadly similar between environmental filtering and competition models, and that the variance in local traits is necessary to distinguish between them. These results should be robust to values of s E that generate moderate to strong non-neutrality (i.e. s E ≥ 1), with a corresponding increase in misclassification rate as s E approaches 0.
More generally, using any two data axes always resulted in improved classification accuracy when compared to using a single axis alone.
Furthermore, our results highlight the flexibility of MESS to mask F I G U R E 7 MESS empirical analysis. Empirical classification and parameter estimation of five local communities including snails, tropical trees, and island arthropods. (a) depicts machine learning classification probabilities for each empirical community for three focal community assembly models. The proportion of colour within each bar represents the proportional predicted model class for neutrality (orange), environmental filtering (aqua), and competition (dark blue). (b) depicts pairwise estimates of five different model parameters under the best classified model for each local community data set. The value along each parameter axis is indicated by the position of the representative icon. Parameters depicted include number of individuals per deme (α), ecological strength (s E ), migration rate (m), local speciation probability (ν), and fraction of equilibrium (Λ) unobserved summary statistics such that inference can be made from a wide variety of high-throughput biodiversity surveys across different spatial scales and data availabilities. This will enable practicing community ecologists to perform inference with whatever biodiversity data is in hand.  (Parent & Crespi, 2006). Finally, because the Australian tree communities are plot-level samples from smaller scales representing semi-isolated habitat patches and not true insular systems we expect their parameter estimates to deviate from those of true island assemblages. This is in agreement with the finding that these tree communities are all far from equilibrium (Rossetto et al., 2015). Specifically, our approach estimates that the system is characterized by moderate m, and high v and s E estimates which indicate that local turnover, in the context of a selective environment, is important and ongoing.

| Future perspectives
As a first approximation of the feedbacks between processes operating at different timescales MESS makes several simplifying assumptions which can be treated as targets for future model improvement.
Non-neutral dynamics could constrain trait evolution as a function of resource availability or density-dependence (Múrria et al., 2018), allow for filtering and competition processes within the same model, and/or allow for mutualistic rather than simply competitive interactions. Additionally, directly modelling multivariate trait evolution may increase statistical power of inference (Zheng et al., 2009) while bypassing the biases associated with reducing the dimensionality of multivariate data into one trait dimension (e.g., with PCA; Uyeda et al.,2015). Modelling more realistic metacommunity processes and patterns, and including more sophisticated measures of diversity such as temporal correlations and environmental matching would allow for expanding beyond the simple local/regional dichotomy.
One caveat is that MESS assumes all species (or operational taxonomic units) have been well identified and do not deviate from panmictic population structure, as this will distort model selection and parameter estimation during inference. For example, cryptic population structure will reduce S and inflate metrics of genetic diversity, which could bias MESS inference to prefer non-neutral models, within which these features are common hallmarks. Another special consideration is the variance in the rate at which Λ changes with respect to time as measured in generations. Specifically, the neutral approach to equilibrium is much slower (with respect to numbers of generations) than either of the non-neutral models, potentially confounding comparisons between models at fixed values of Λ. This also highlights the need for a more robust measure of equilibrium, which can account for processes across timescales. From a practical perspective, the limitations of current MESS ML inference (i.e., point estimates of model parameters and uncertainty estimated using quantile regression) may be overcome by implementing a machine learning procedure which would allow for full posterior inference (e.g., Bayesian additive regression trees; Chipman et al., 2010).
Another approximation is the use of the rescaled Wright-Fisher coalescent process to generate the community-wide population genetic predictions of the forward-time Moran birth/death process.
Yet future advances could make use of the powerful new treesequence recording (Haller & Messer, 2019;Kelleher et al., 2018) to more accurately and flexibly match the full demographic and abundance history of each species with its respective underlying population genetic history. Although here we modelled a single locus per species to match the barcode and metabarcode data that are emerging from high-throughput ecological sampling efforts, implementing tree-sequence recording methods could also allow for flexible downstream options to incorporate spatial information associated with genetic georeference databases (Lawrence et al., 2019).

| Conclusions
With our approach we were able to identify whether real communities were near equilibrium or not, and the ecoevolutionary processes underlying those dynamics. For example, despite the near-equilibrium state of both spider and beetle communities on islands, we discovered that the approach to these equilibria were different, with spider communities assembling largely by immigration, compared to the more prominent role of speciation in weevil communities. This confirms suspected, but as of yet untested, hypotheses from other island arthropod systems (Rominger et al., 2016) that can only now be evaluated. We were also able to pinpoint the mechanistic causes (turnover and environmental filtering) of nonequilibrium in the tree communities. Finally, our analysis of Galapagos snails highlight areas for future improvement in modeling more fine scale environmental heterogeneity and its impact on filtering and speciation.
The MESS model unifies the study of biodiversity by linking ecological and evolutionary theory across three disparate timescales within an individual-based, mechanistic framework. The model generates explicit temporal predictions of community-scale data across these three diversity axes (species richness and abundance, population genetic diversity, and trait variation), spanning equilibrium and nonequilibrium conditions, and allowing for stochasticity along a continuum of scenarios ranging from pure ecological neutrality, to strong ecological interactions and/or environmental filtering.
To complement the MESS model simulations, our implementation includes an extensive suite of ML tools for performing model selection and parameter estimation from observed data, and plotting routines for visualizing and evaluating results. This unified mechanistic model provides a general framework for hypothesis testing and biodiversity data synthesis, enabling the generation of multidimensional forecasts and test parameterized hypotheses about the historical and future processes driving biodiversity patterns from small-scale intensively sampled plots, to islands sensu lato, to regional and subcontinental scales.

ACK N OWLED G EM ENTS
This manuscript is a product of the working group sEcoEvo -