Disentangling genetic structure for genetic monitoring of complex populations

Abstract Genetic monitoring estimates temporal changes in population parameters from molecular marker information. Most populations are complex in structure and change through time by expanding or contracting their geographic range, becoming fragmented or coalescing, or increasing or decreasing density. Traditional approaches to genetic monitoring rely on quantifying temporal shifts of specific population metrics—heterozygosity, numbers of alleles, effective population size—or measures of geographic differentiation such as FST. However, the accuracy and precision of the results can be heavily influenced by the type of genetic marker used and how closely they adhere to analytical assumptions. Care must be taken to ensure that inferences reflect actual population processes rather than changing molecular techniques or incorrect assumptions of an underlying model of population structure. In many species of conservation concern, true population structure is unknown, or structure might shift over time. In these cases, metrics based on inappropriate assumptions of population structure may not provide quality information regarding the monitored population. Thus, we need an inference model that decouples the complex elements that define population structure from estimation of population parameters of interest and reveals, rather than assumes, fine details of population structure. Encompassing a broad range of possible population structures would enable comparable inferences across biological systems, even in the face of range expansion or contraction, fragmentation, or changes in density. Currently, the best candidate is the spatial Λ‐Fleming‐Viot (SLFV) model, a spatially explicit individually based coalescent model that allows independent inference of two of the most important elements of population structure: local population density and local dispersal. 
We support increased use of the SLFV model for genetic monitoring by highlighting its benefits over traditional approaches. We also discuss necessary future directions for model development to support large genomic datasets informing real‐world management and conservation issues.

KEYWORDS
Λ-coalescent, density, dispersal, genetic monitoring, isolation by distance, multiple merger coalescent, population structure, spatial Λ-Fleming-Viot model

… the development of statistical procedures to uncover the demographic or selection history of a set of populations that best explains the observed genetic structure is certainly one of the most interesting challenges of population genetics.

| TRADITIONAL GENETIC MONITORING
Genetic monitoring is concerned with estimating temporal changes in population demographic processes such as abundance, vital rates, and rates of exchange using information obtained from molecular markers (Schwartz, Luikart, & Waples, 2007).
However, because studies can span long time frames and also incorporate results of other studies, care must be taken to ensure that inferences reflect actual population processes rather than changing molecular techniques (Allendorf, 2017; Charlesworth & Charlesworth, 2017) or incorrect model assumptions (Morin et al., 2010; Peery et al., 2012; Samarasin, Shuter, Wright, & Rodd, 2017).
Moreover, populations tend to be complex in structure and change through time by expanding or contracting their geographic range, becoming fragmented or coalescing, or increasing or decreasing density (Hey & Machado, 2003). Indeed, all of these can be occurring simultaneously in different parts of a single species' geographic range, and are more likely occurring in species of conservation concern (Whitlock & McCauley, 1999). While these changes are often in and of themselves important to conservation and basic population genetics, they can also cause challenges in the interpretation of analyses that are often overlooked.
The predominant traditional approach to genetic monitoring quantifies patterns of variation or differentiation using measures such as heterozygosity, nucleotide diversity, numbers of alleles and percentage of polymorphic loci, and estimates of effective population size, Ne (Aravanopoulos, 2011; Excoffier, 2007; Schwartz et al., 2007; Tallmon et al., 2010). The underlying assumption is that temporal changes in these quantities are related to demographic parameters of conservation concern (Hoffmann & Willi, 2008; Pertoldi, Bijlsma, & Loeschcke, 2007; Schwartz et al., 2007). However, these relationships can be affected by changes in population processes (Schwartz et al., 2007) and by the number and type of genetic markers used and how closely they adhere to the analytical assumptions (Narum et al., 2008; Smith & Seeb, 2008; Smith et al., 2007). Consequently, metric-based approaches to genetic monitoring or to quantifying population structure can be misleading when the necessary a priori assumptions are incorrect.
As an example, one of the most commonly used measures of differentiation is FST, which was originally defined by Wright (1965) as the correlation of two alleles randomly sampled from a single subpopulation relative to the correlation of two alleles randomly sampled from the population as a whole. Under some conditions, FST is also related to the inverse of the migration rate:
$$F_{ST} \approx \frac{1}{4 N_e m + 1},$$
where $N_e m$ is the effective number of reproducing migrants per generation (Wright, 1931). This relationship has led to widespread use of FST as an indirect measure of gene flow (Slatkin, 1985). It holds, however, only under an idealized island model of population structure that is at equilibrium with respect to migration and genetic drift (Wright, 1931). While this model has proven to be a useful simplification, it is widely recognized that in most empirical populations these assumptions are practically never satisfied (Waples, 1998; Whitlock & McCauley, 1999). In fact, populations of conservation concern are very likely to demonstrate deviations from ideal conditions. These populations often change in size rapidly and are not in equilibrium (Whitlock & McCauley, 1999) […] (Neigel, 2002).
Strand, Milligan, and Pruitt (1996) […] (Charpentier et al., 2012; Musiani et al., 2007), it can be difficult to assess how well putative stratifications reflect real populations. However, even when such datasets exist, population stratification defined by genetic data often differs from stratification defined by, for example, morphology or behavior, because they are influenced differently by demography and selection (Ortego, García-Navas, Noguerales, & Cordero, 2015; Serrouya et al., 2012). In the absence of independent sources of data, populations are usually defined either based on how samples have been collected or as perceived centers of density within the species' distribution, both of which can be biased by collection methods and might not reflect actual distribution or mating patterns.
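To make the sensitivity of FST-based migration estimates concrete, the following minimal Python sketch evaluates the standard island-model equilibrium approximation, FST ≈ 1/(4·Ne·m + 1), and its inverse. The function names are hypothetical and chosen only for illustration; the sketch assumes the idealized island model and is not an estimator recommended for real data.

```python
def fst_from_nem(nem: float) -> float:
    """Island-model equilibrium approximation: FST = 1 / (4 * Ne*m + 1)."""
    if nem < 0:
        raise ValueError("Ne*m must be non-negative")
    return 1.0 / (4.0 * nem + 1.0)


def nem_from_fst(fst: float) -> float:
    """Invert the approximation: Ne*m = (1 / FST - 1) / 4."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    # Illustrative FST values only; they are not taken from any dataset.
    for fst in (0.20, 0.05, 0.02, 0.01):
        print(f"FST = {fst:.2f} -> implied Ne*m = {nem_from_fst(fst):.2f}")
```

Because the relationship is hyperbolic, halving FST from 0.02 to 0.01 roughly doubles the implied number of migrants, so small errors in low FST values translate into large errors in inferred gene flow even when the island-model assumptions hold.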
Thus, most uses and interpretations of gene flow from estimates of FST are accompanied by implicit acceptance of a particular model of population structure, and their relevance depends crucially on the appropriateness of the model used to relate the pattern-based quantities to underlying biological processes of interest. Further, models of population structure and models of population size change can make identical predictions for observable genetic quantities, and therefore, these processes cannot be distinguished without considering the full distribution of genetic variation (Mazet, Rodríguez, & Chikhi, 2015; Mazet, Rodríguez, Grusea, Boitard, & Chikhi, 2016). In the context of genetic monitoring, differentiating these is of crucial importance, so confounding them as a consequence of a priori assumptions is a serious issue. The inherent complexity of populations therefore poses a nontrivial problem for the prospect of discovering population structure, and presents significant challenges to the development of a coherent means of monitoring populations using genetic information gathered over any reasonably large spatiotemporal extent (Crandall, Bininda-Emonds, Mace, & Wayne, 2000; Excoffier, 2007; Segelbacher et al., 2010). Nevertheless, this is a problem that must be addressed. What follows is our view of the path forward.

| THEORY AND REALITY IN POPULATION GENETICS
The rich theoretical foundation of population genetics has inspired numerous models to describe how genetic characteristics vary over space and time. This creates a challenge for discovering population structure or guiding genetic monitoring, because choices among models must be made a priori and available models might not correspond to biological reality. The range of patterns of structure in natural populations can be viewed as a triangular space described by patchiness and individual dispersal distance (Figure 1). If both patchiness and dispersal are low, individuals are relatively uniformly distributed. As patchiness increases, individuals become more clumped into discrete populations. As dispersal increases, all cases converge to a single panmictic population. In reality, groups of individuals within a metapopulation can exist at multiple locations in this space.
Certainly for the discovery of population structure and often for the purposes of genetic monitoring, we are interested in where in this space a set of individuals lies, whether the location is shifting over time, and if so, the rate of change. To maximize analytical tractability, however, traditional population genetics models typically make simplifying assumptions about life histories and demographic and evolutionary processes. This limits their applicability by interpreting the study system with respect to a small subset of the parameter space.
In the most widely adopted paradigm, individuals are assumed to assort themselves into semi-discrete subpopulations, within which matings occur at random. The two most commonly used models of this class are Wright's island model, introduced in Wright (1931) but not named until Wright (1943), and the stepping-stone model (Kimura & Weiss, 1964; Weiss & Kimura, 1965). These models limit themselves to the right border of the spatial structure triangle (Figure 1).
Here, subpopulations are convenient, and often necessary, units for subsequent analyses of genetic diversity within (heterozygosity, allelic and nucleotide diversity) and among (FST and related measures) groups of individuals. The primary parameters governing these models are the effective size of each subpopulation (Ne) and the rate of migration among subpopulations (in the island model, m is the single migration rate among all subpopulations; in the stepping-stone model, m_j is the migration rate among subpopulations separated by j steps and m_∞ is the rate of long-range migration, equivalent to m in the island model). Spatial heterogeneity is captured mainly through analysis of pairwise combinations of connected, discrete populations (Rousset, 1997; Slatkin, 1993), or by the estimation of migration matrices (Beerli & Felsenstein, 2001).
In contrast, the most widely adopted alternative paradigm is Wright's IBD model (Wright, 1943, 1946), which focuses on individuals assumed to be distributed continuously and uniformly across space. These models limit themselves to the left border of the spatial structure triangle (Figure 1). Here the primary parameters governing the models are local density (d) and the variance of parent-offspring dispersal distance (σ²). Together these define the concept of neighborhood size as the geographic area within which most matings take place. Spatial heterogeneity is generally not considered in these models.
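As a small worked illustration of how density and dispersal combine, Wright's two-dimensional neighborhood size is commonly written as Nb = 4πdσ², the number of individuals within a circle of radius 2σ. The Python sketch below simply evaluates that expression; the function name and the example values are hypothetical, and σ² is assumed here to be the axial (one-dimensional) variance of parent-offspring dispersal distance.

```python
import math


def neighborhood_size(density: float, sigma2: float) -> float:
    """Wright's two-dimensional neighborhood size, Nb = 4 * pi * d * sigma^2."""
    if density < 0 or sigma2 < 0:
        raise ValueError("density and sigma2 must be non-negative")
    return 4.0 * math.pi * density * sigma2


if __name__ == "__main__":
    # Hypothetical values: 2 individuals per km^2, dispersal variance 0.5 km^2.
    print(f"Nb = {neighborhood_size(2.0, 0.5):.1f} individuals")
```

Because only the product dσ² enters this expression, very different combinations of density and dispersal give the same neighborhood size, which is one reason separate estimates of the two quantities are valuable for monitoring.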
Some attempts to bridge these two paradigms have been made, but they are limited to identifying special cases that can transform one into the other. Stepping-stone models, for example, converge […] (Kimura & Weiss, 1964; Weiss & Kimura, 1965). Conversely, as the number of subpopulations increases and effective size of each becomes arbitrarily small, the stepping-stone model approaches the IBD model. Kimura and Weiss (1964) suggested that their stepping-stone model could be analyzed in terms of IBD by replacing m_1 with σ² and by substituting the effective density d(Ne/N) for Ne.
FIGURE 1 The parameter space for complex populations. Populations with complex spatial structure are located within a parameter space defined by dimensions corresponding to the degrees of patchiness and connectivity. For simplicity, an additional dimension corresponding to the local population density is not shown. Increasing connectivity for any population structure converges to the same outcome, that is, panmixia, so the feasible parameter space is shown as triangular.
Importantly, neither dominant paradigm penetrates the interior of the spatial structure parameter space (Figure 1), which creates problems when models based on those paradigms are used to discover population structure or are applied to genetic monitoring.
Although some real-world species fall neatly into one or the other of these paradigms, many others exist somewhere in the interior space of the triangle. In some species, individuals are neither randomly distributed across the landscape nor neatly clumped into semi-discrete subpopulations, while for others individuals are arrayed in different spatial patterns in different areas and/or at different times. And for many other species, connectivity depends strongly on features of the habitat (which might change at different spatiotemporal scales) rather than being a simple function of distance as implied by the IBD model.

| INDIVIDUALLY BASED LANDSCAPE GENETICS MODELS
In general, the area within the spatial structure triangle (Figure 1) can be considered the domain of landscape genetics, which integrates population genetics, landscape ecology, and spatial statistics to identify landscape and environmental factors that affect genetic and genomic variation (Milligan, 2017; Segelbacher et al., 2010). Landscape genetics, a term coined in 2003 (Manel, Schwartz, Luikart, & Taberlet, 2003) to describe increasingly spatially explicit advances in population genetics (Dyer, 2015a), has had a strong focus on the flow of genetic information across the landscape and hence population structure. Further, it is well recognized that model output and inference in landscape genetics are heavily influenced by, and dependent on, the scale and resolution (i.e., how finely ecological differences are measured) of the ecological processes (e.g., dispersal and demography) that influence gene flow and population structure (Cushman & Landguth, 2010; Galpern & Manseau, 2013; Hand, Cushman, Landguth, & Lucotch, 2014; Wasserman, Cushman, Schwartz, & Wallin, 2010).
Most landscape genetic studies rely strongly on the dichotomy of individual versus population-based models for inference (Dyer, 2015a; Storfer, Murphy, Spear, Holderegger, & Waits, 2010). The approach of using pattern-based measures such as FST and correlating them with spatial and/or environmental factors has long dominated landscape genetics. These approaches require a priori stratification of samples into putative populations.
Newer methods such as population graphs (Dyer, 2007, 2015b; Dyer & Nason, 2004; Murphy, Dyer, & Cushman, 2016) have been largely applied in population-based frameworks, often where sampling locations, not genetically discrete populations, define the vertices of the graph. Individual-based analyses in landscape genetics can help overcome problems with predefining populations, and many landscape genetic statistics can be adapted to individual-based measures of genetic differentiation. However, individual-based studies often yield thousands of pairwise values, making it difficult to make biologically relevant inferences of genetic structure (Kierepka & Latch, 2015). Furthermore, popular tests of association between matrices of pairwise distances, for example, Mantel tests, suffer from statistical errors (Graves, Beier, & Royle, 2012; Kierepka & Latch, 2015) and are susceptible to sampling biases (Kierepka & Latch, 2015; Oyler-McCance, Fedy, & Landguth, 2013; Schwartz & McKelvey, 2009). Thus, despite its promise, much of the core of landscape genetics must be improved before it is ready to tackle the challenges of long-term genetic monitoring and discovery of population structure.
Improvement of landscape genetics models for genetic monitoring might start from either of two points. The first is the family of spatially explicit, individually based ancestry clustering models, which includes geneland (Guillot, Estoup, Mortier, & Cosson, 2005), TESS (Chen, Durand, Forbes, & François, 2007), BAPS (Corander & Marttinen, 2006), and POPS (Jay, Durand, François, & Blum, 2015), many of which are derived from the nonspatial structure model (Falush, Stephens, & Pritchard, 2003; Pritchard, Stephens, & Donnelly, 2000). All of these models interpret the observed multilocus genotypes as samples from putative populations, which are inferred during the modeling process. As a consequence, they are limited to the right border of the spatial parameter space (Figure 1).
In addition, a range of covariates are often included. For example, structure (Pritchard et al., 2000) allows prior distributions to be influenced by the sampled spatial location of each individual, while geneland (Guillot et al., 2005), TESS (Chen et al., 2007), spatial BAPS (Corander, Sirén, & Arjas, 2008), and POPS (Jay et al., 2015) explicitly include the sampled spatial location of each individual in the model. In addition, POPS (Jay et al., 2015) explicitly includes environmental as well as spatial information. However, none of these models explicitly includes gene flow, despite it being one of the most important genetic mechanisms influencing variability and local adaptation (Holderegger & Wagner, 2008). Thus, despite their promise, these models also need improvement if they are to be used to handle the complexities of long-term genetic monitoring. Specific areas of improvement include the addition of more biologically relevant mechanisms such as gene flow in ways that acknowledge the spatial heterogeneity required for genetic monitoring and discovery of population structure (Milligan, 2017).
The second family contains the individually based explicitly genealogical models of ancestry, which are based upon the coalescent (Kingman, 1982). This includes a large set of models that infer, generally from DNA sequence data, such quantities as effective population size and growth rate, gene flow, and population divergence (Kuhner, 2008). Unlike most of the models in the first category, these are not truly spatially explicit; at best individuals are gathered into predefined populations for analysis using a structured coalescent (Hudson, 1990;Notohara, 1990). Furthermore, many of the parameters inferred in these models are averages across the entire sample. Thus, for example, spatially dependent density or gene flow cannot be ascertained, both of which are important for long-term genetic monitoring or for discovery of population structure. As a result, while offering much promise, this set is likewise not immediately suitable.
The main approaches to population and landscape genetics provide strong foundations for genetic monitoring. However, they generally require making a priori assumptions about quantities that are the subject of inference and the models exhibit many problems when applied to the challenge of genetic monitoring (Table 1).
Consequently, a new look at genetic monitoring and discovery of population structure is required.

| MODELS FOR GENETIC MONITORING AND DISCOVERY OF POPULATION STRUCTURE
A more general approach to population genetic analysis must place the focal system within the spatial structure triangle (Figure 1) as a natural outcome of the analysis, not start with a priori assumptions about its location within the parameter space. Additionally, the model would directly quantify the full distribution of actual population or evolutionary processes of interest as well as possible, decoupling these parameters from the elements that define population structure (Excoffier, 2007). In particular, this model would:
• Encompass a broad range of possible population structures, so that inferences made would be comparable across different geographic scales and types of biological systems,
• Utilize spatial information,
• Simultaneously quantify processes influencing population structure and connectivity, and assess changes in both over time,
• Allow for spatial heterogeneity in model parameters,
• Directly estimate parameters of interest and their uncertainty, while not being confounded by range expansion or contraction, fragmentation, or changes in density, and
• Be compatible with multiple types of genetic data, allowing it to be informed by legacy microsatellite or potentially allozyme data sets, next-generation sequencing data, or data generated by future technologies.
The basic observations for a general analysis with this hypothetical model would be multilocus genotypes, multilocus sequences, or full genome sequences of individuals, their geographic locations, and information on covariates that might influence local density, movement, and selection. The model should serve as a bridge between the two main paradigms of individual neighborhood and island/stepping-stone models (i.e., the left and right borders of the spatial structure triangle (Figure 1)), and encompass these models as boundary conditions.
Preliminary analyses using the model might indicate that a given system fits comfortably onto either border, justifying the use of one or the other set of standard analytical regimes. However, most empirical cases are more likely to lie in the interior, so the model could also give an indication of the appropriateness of measures deriving from one or the other of the main paradigms.

| SPATIAL Λ-FLEMING-VIOT MODEL
Currently, the only model with immediate potential to address most of the requirements for long-term genetic monitoring is the spatial Λ-Fleming-Viot (SLFV) model (Barton, Etheridge, & Véber, 2013; Guindon, Guo, & Welch, 2016; Joseph, Hickerson, & Alvarado-Serrano, 2016; Kelleher, Barton, & Etheridge, 2013). The SLFV is a spatially explicit extension of the Λ-Fleming-Viot model, which is itself an extension of the Fleming-Viot model (Fleming & Viot, 1979). Equivalently, it is a spatially explicit version of the Λ-coalescent, which is an extension of Kingman's coalescent (Kingman, 1982; Tellier & Lemaire, 2014). Specifically, coalescence in the SLFV model is not limited to two lineages, and individuals can be distributed arbitrarily across space, avoiding the restriction in classical island and stepping-stone models of discrete population boundaries. As a result, the SLFV model permits the simultaneous, yet independent, estimation of local population density and local dispersal rates, two key parameters of population processes integral to genetic monitoring studies.

TABLE 1 Current problems in the implementation of genetic monitoring models and important qualities of a genetic monitoring model
Primary problem | Examples of potential consequences | Important qualities of a genetic monitoring model
Current metrics heavily influenced by scale and vary greatly depending on the scale used | Multi-scale studies show that landscape effects are evident at one scale and absent at another (Balkenhol et al., 2014; Millete & Keyghobadi, 2015) | Scale-independent quantification of local population structure and connectivity; spatial heterogeneity in model parameters
Many genetic metric models require assignment of individuals to predetermined groups | Potential for erroneous groups from clustering algorithms (Frantz, Cellina, Krier, Schley, & Burke, 2009; Latch, Dharmarajan, Glaubitz, & Rhodes, 2006; Schwartz & McKelvey, 2009) | No a priori grouping
Genetic metrics are often divorced from the underlying genetic process, leading to poor estimation of the process itself | Inaccurate estimates of migration rates, especially at low values of FST (Allendorf, Luikart, & Aitken, 2013); violation of assumptions can greatly impact estimates of effective population size | Directly incorporate known population genetics mechanisms
Genetic metrics can be sensitive to the marker type used and could therefore change temporally based solely on the methodology | Different spatial genetic structures between marker types (Bradbury et al., 2015); limited applicability across studies for wide-ranging species (de Groot et al., 2016) | Technology independent
The mathematical background for the SLFV model was introduced in Etheridge (2008) […] Kelleher, Etheridge, and Barton (2014) and Kelleher, Etheridge, and McVean (2016). In what follows, we first introduce this simple model informally and then present the steps in a more mathematically rigorous form to illustrate explicitly how the restrictive assumptions can be relaxed to obtain a model with the desired characteristics outlined in the previous section.
In its simplest form, the SLFV model constructs coalescent genealogies of subgroups of haploid individuals through iterations of reproduction and movement events backwards in time (Figure 2).
The sequence begins with a set of individuals, arbitrarily distributed across a continuous landscape (Figure 2a), each carrying their empirical genotypic data (although they can also optionally be associated with other data such as sex, demographic or reproductive state). In the first step, a neighborhood center (x) and radius (r) are randomly selected (Figure 2b). All coalescent events will be limited to individuals within this neighborhood. A new location within the neighborhood is randomly selected for the ancestor (a) and its genotype is selected from the distribution in the neighborhood associated with that location (Figure 2c). Existing individuals within the neighborhood are then randomly selected to be descendants of the new ancestor. Finally, as for the Moran (1958) model, the descendants are removed, having been replaced by the ancestor (Figure 2d), and a new iteration begins, with iterations continuing until only a single ancestor remains.
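The iteration just described can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not any published implementation: the landscape is a unit square, event radii are drawn uniformly between `r_min` and `r_max`, each lineage inside a neighborhood becomes a descendant with an assumed probability `u`, ancestor locations are clamped to the square, and only ancestry and location are tracked (genotypes are ignored). The function name and all parameters are hypothetical.

```python
import math
import random


def slfv_genealogy(locations, r_min=0.1, r_max=0.3, u=0.8,
                   seed=0, max_events=1_000_000):
    """Toy backward-in-time SLFV on the unit square for haploid samples.

    locations: list of (x, y) sampling positions.
    u: assumed probability that a lineage inside the neighborhood is chosen
       as a descendant of the event's ancestor (illustrative parameter).
    Returns (merges, lineages): coalescence records and remaining lineages.
    """
    rng = random.Random(seed)
    lineages = dict(enumerate(locations))   # lineage id -> (x, y)
    next_id = len(lineages)
    merges = []                             # (event number, ancestor id, descendant ids)

    def clamp(v):
        # Simplification: keep ancestors inside the unit square.
        return min(1.0, max(0.0, v))

    for event in range(1, max_events + 1):
        if len(lineages) == 1:              # a single common ancestor remains
            break
        # (b) random neighborhood: center x and radius r
        cx, cy = rng.random(), rng.random()
        r = rng.uniform(r_min, r_max)
        inside = [k for k, (x, y) in lineages.items()
                  if math.hypot(x - cx, y - cy) <= r]
        # randomly select descendants among lineages in the neighborhood
        descendants = [k for k in inside if rng.random() < u]
        if not descendants:
            continue                        # the event touches no sampled lineage
        # (c) ancestor placed uniformly at random within the neighborhood
        rho = r * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        ancestor = (clamp(cx + rho * math.cos(theta)),
                    clamp(cy + rho * math.sin(theta)))
        # (d) descendants are removed and replaced by the ancestor
        for k in descendants:
            del lineages[k]
        lineages[next_id] = ancestor
        if len(descendants) > 1:            # two or more lineages coalesce
            merges.append((event, next_id, descendants))
        next_id += 1
    return merges, lineages


if __name__ == "__main__":
    sample = [(0.1, 0.1), (0.15, 0.2), (0.8, 0.9), (0.5, 0.5), (0.9, 0.1)]
    merges, remaining = slfv_genealogy(sample, seed=42)
    for event, ancestor_id, descendants in merges:
        print(f"event {event}: lineages {descendants} coalesce into {ancestor_id}")
```

An event that selects a single descendant acts as pure movement, whereas one that selects several produces a (possibly multiple) merger, which is how the sketch separates dispersal from coalescence.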
As outlined below, the individuals need not be haploid. Sexual reproduction can be accommodated by selecting more than a single ancestor. Note that small-scale, for example, single generation, reproduction events will necessarily involve two ancestors, but large-scale events, that is, those with long intervals or covering large areas, can involve more than two because multiple generations might have intervened (Kelleher et al., 2013).
FIGURE 2 Illustration of one iteration of the SLFV model. (a) Initial condition involving individuals at their empirical sampling locations with two haplotypes (white and gray), (b) placement of a random neighborhood (circle) defined by its center (x) and radius (r), (c) random placement of a putative ancestor (square) and coalescence of ancestry of randomly selected descendants, and (d) distribution of remaining individuals after removal of the descendants.
The steps in this process can be formalized to illustrate the generalizations that are possible. For clarity of exposition we will consider the single locus model, because it captures the spatially explicit nature that is crucial for genetic monitoring; multilocus extensions are straightforward (Kelleher et al., 2013, 2014, 2016). […] This will yield a set C′ containing zero or more individuals, randomly selected according to the spatial distribution associated with the event and their state. In the case of no mutation, all individuals in C′ will have the same state, but this restriction is not necessary. Depending on the number of individuals in C′, this event either has no effect or involves a mixture of reproduction and movement.
(a) If C′ is empty, no individuals are affected by the event and C is unchanged. Construct a new event.
(b) If C′ contains at least one individual, the event is potentially a mixture of reproduction and movement (and possibly mutation). Sample a set of individuals, which will replace those in C′, from the distribution R(x|C′). Some or all of these individuals may be ancestors of (some of) those in C′; the remainder are individuals in C′ that have simply moved. Thus, the distribution R(x|C′) determines the mixture of reproduction and movement that occurs in the event. For sexual reproduction, R(x|C′) can generate locations for more than one ancestor, and even for more than two in the case of large-scale events. In this case, ancestry must be distributed across the selected individuals; Kelleher et al. (2016) compares the efficiency of alternative algorithms for accomplishing this. In the simplest cases, R(x|C′) is uniform across the d-sphere defined by E(x) (Kelleher et al., 2013) or may only depend on the distance between individuals (Guindon et al., 2016). However, more complex distributions […]
[…] Bayesian computation (ABC) pipeline based upon the selectively neutral, spatially homogeneous SLFV model (Kelleher et al., 2013, 2014). The pipeline was used to validate the estimation of neighborhood size from simulated data and subsequently to estimate both neighborhood size and dispersal radius from empirical data on Berkheya cuneata (Asteraceae) from South Africa. In their model, dispersal radius R was the maximum distance individuals could disperse, and neighborhood size was the number of individuals within the area of an event of radius R. For validation, 100,000 datasets were generated for eight individuals sampled at 10 unlinked loci. Each dataset was composed of the genealogy generated by the SLFV model and 1 kb sequences simulated along each genealogy. Data generation took 2 days on a 12-core computer.
Subsequently, the posterior distribution of neighborhood size was calculated using ABC based upon 100 replicate leave-one-out cross-validations; regression of the estimated neighborhood size on the actual neighborhood size had R² = 0.87.
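For readers unfamiliar with ABC, the general logic of simulating datasets under draws from a prior and retaining those that resemble the observed data can be sketched as follows. This is a minimal rejection-ABC illustration with a toy stand-in simulator (the mean of exponential draws), not the SLFV-based pipeline described above; the function names, the prior bounds, and the tolerance `eps` are all hypothetical.

```python
import random
import statistics


def toy_simulator(theta, n, rng):
    """Hypothetical stand-in for a genealogy simulator: returns one summary
    statistic (the mean of n exponential draws whose expectation is theta)."""
    return statistics.fmean(rng.expovariate(1.0 / theta) for _ in range(n))


def abc_rejection(s_obs, prior_low, prior_high, n, draws, eps, seed=0):
    """Rejection ABC: keep prior draws whose simulated summary lies within
    eps of the observed summary; kept draws approximate the posterior."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(draws):
        theta = rng.uniform(prior_low, prior_high)   # sample from the prior
        s_sim = toy_simulator(theta, n, rng)         # simulate and summarize
        if abs(s_sim - s_obs) <= eps:                # compare with observation
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    truth = 5.0                                      # value used to fake "observed" data
    s_obs = toy_simulator(truth, 500, random.Random(1))
    post = abc_rejection(s_obs, 1.0, 10.0, n=500, draws=2000, eps=0.25)
    print(f"accepted {len(post)} draws; posterior mean {statistics.fmean(post):.2f}")
```

In a real analysis the simulator would generate genealogies and sequences under the SLFV model, which is where most of the computational cost reported above arises.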
The empirical analysis of Berkheya cuneata used a total of 33 individuals with known locations and sequence data at one nuclear and two plastid loci (Joseph et al., 2016). […]
This study illustrates several important points regarding practical use of the SLFV model. First, the two most biologically important parameters, neighborhood size and dispersal distance, are identifiable; that is, they can be estimated separately using the SLFV model. Second, it is possible to obtain useful estimates even from relatively small datasets composed of no more than dozens of individuals or handfuls of loci. Third, there is room for improved computational efficiency to accommodate larger datasets. Finally, adding spatial heterogeneity in the form of known resistance surfaces or the like, as is often done in landscape genetics (McRae, 2006; Spear, Cushman, & McRae, 2016), will increase realism without adding parameters; inferring properties of resistance surfaces adds no more parameters than the equivalent multivariate regression or similar landscape genetic analysis would. Thus, while the existing pipeline (Kelleher et al., 2013, 2014) does not accommodate that flexibility, a spatially heterogeneous SLFV model is both feasible and likely to be computationally tractable.
A second example using the selectively neutral, spatially homogeneous SLFV model reinforces these points and illustrates ad[…] (Kimura, 1980) to generate nucleotide sequences given the genealogies. Effective population density (d) and dispersal intensity (σ²) (Wright, 1946) were estimated using the SLFV model based upon a sample of 50 individuals sampled at either two or ten different sites. Additionally, parameter estimates were obtained using the structured coalescent (Hudson, 1990; Notohara, 1990) under the assumption of either two or ten discrete populations. Estimates from the structured coalescent were upwardly biased to a large degree, though much less so for ten than for two populations.
Estimates from the SLFV model were much better, although the precision declined with larger values of dispersal intensity. These computations took 100 hr to complete on a computer with 2.7-2.8 GHz CPUs.
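For orientation, density and dispersal intensity combine into Wright's neighborhood size. The expression below is the standard form for a two-dimensional habitat; the symbol N_b and the numerical values are introduced here purely for illustration and are not taken from the study.

```latex
% Wright's neighborhood size in two dimensions
N_b = 4\pi\sigma^{2} d

% Illustrative values (assumed): d = 2 individuals per unit area, \sigma^{2} = 0.5
N_b = 4\pi \times 0.5 \times 2 = 4\pi \approx 12.6
```

Halving d while doubling σ² leaves N_b unchanged, so a method that recovers only the product cannot distinguish a decline in density from a change in dispersal. Estimating d and σ² separately removes that ambiguity.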
The empirical analysis of influenza (Guindon et al., 2016)  This study reinforces the point that neighborhood size and dispersal rates can be estimated separately using the SLFV model.
Distinguishing between them is important, especially in the case of genetic monitoring where either or both might shift (as they did with influenza) through time. Detecting those shifts may in fact be a major reason for undertaking a monitoring program. It also reinforces the point that useful estimates can be obtained for typical samples using a reasonable amount of computation. Thus, the SLFV model can be developed into a practical approach to genetic monitoring. It may also serve the task much better than other methods, such as those based upon F ST or the structured coalescent, that impose a priori assumptions upon the spatial structure of the populations under study.
Although analyses using the SLFV model to date (Guindon et al., 2016; Joseph et al., 2016) have assumed spatial homogeneity in both neighborhood size and dispersal, there is no inherent reason not to allow spatial heterogeneity, just as it is routinely included in landscape genetics analysis (Balkenhol et al., 2016). For example, given information on the spatial layout of distinct habitat types, one could estimate different densities or dispersal rates for each habitat. In turn, those parameters could be the focus of genetic monitoring to detect changes in habitat-specific density or dispersal, information that would be of great value to a monitoring program. It would also reveal valuable information on the basic biology of the species under study. Importantly, differences among habitats (or other spatially defined factors) would emerge naturally from the analysis if they exist rather than be imposed at the outset by selection of the analysis framework. Of course, as with landscape genetics models, SLFV models with too many parameters will be impossible to estimate.
How many and which parameters can be estimated remains an open question, and software implementations of more complex, and possibly biologically realistic, models are required to investigate this.

| POTENTIAL SHORTCOMINGS OF CURRENT IMPLEMENTATIONS OF THE SLFV MODEL
Current implementations of the SLFV model (Guindon et al., 2016; Kelleher et al., 2013, 2016)  it is likely that the same will be true for the SLFV model. A feature of the SLFV model as currently implemented is that no distinction, other than location, is made among individuals with respect to their likelihood of birth; in the backward-in-time version of the model described above, the probability distribution E(x) that selects individuals influenced by an event depends only on location.
Greater biological realism could be incorporated into the model by allowing E(x) to depend on, for example, the demographic state of individuals or their genotype. These states need not even be static; they could be projected through time from one event to the next much as phylogenetic analysis projects state change along lineages.
Further, these projections could incorporate structured population models (Caswell, 2000) in a natural way.
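To make the role of E(x) concrete, the following is a minimal sketch of a single backward-in-time event in a disc-based SLFV model on the unit square, with the selection probability passed in as a function so that it can depend on location alone or also on an individual's state. It is not taken from any existing implementation: the function names, the fixed event radius, the stage labels, and the choice to leave the parent's state unassigned are all assumptions made for illustration.

```python
import numpy as np

def disc_event_prob(location, center, radius, state=None, impact=0.5):
    """Location-only E(x): constant impact inside the event disc, zero outside.
    The state argument is accepted but ignored."""
    inside = np.linalg.norm(np.asarray(location) - center) <= radius
    return impact if inside else 0.0

def stage_event_prob(location, center, radius, state=None):
    """Hypothetical state-dependent E(x): lineages labeled 'adult' are more
    likely to descend from the event's parent than other lineages."""
    impact = 0.8 if state == "adult" else 0.2
    return disc_event_prob(location, center, radius, impact=impact)

def slfv_backward_event(locations, states, rng, radius=0.1,
                        event_prob=disc_event_prob):
    """Apply one backward-in-time event to the sampled lineages.

    Each lineage is affected with probability event_prob(...). All affected
    lineages coalesce into one parent placed uniformly within the disc.
    Returns (new_locations, new_states, number_of_affected_lineages).
    """
    locations = np.asarray(locations, dtype=float)
    center = rng.uniform(0.0, 1.0, size=2)       # event center in unit square
    probs = np.array([event_prob(x, center, radius, state=s)
                      for x, s in zip(locations, states)])
    hit = rng.random(len(locations)) < probs
    if not hit.any():
        return locations, list(states), 0
    # Uniform point in the disc (may fall just outside the unit square;
    # boundary handling is omitted in this sketch).
    angle = rng.uniform(0.0, 2.0 * np.pi)
    dist = radius * np.sqrt(rng.random())
    parent = center + dist * np.array([np.cos(angle), np.sin(angle)])
    survivors = [s for s, h in zip(states, hit) if not h]
    new_locations = np.vstack([locations[~hit], parent])
    # The parent's state is left as None; a fuller model would project it.
    return new_locations, survivors + [None], int(hit.sum())

rng = np.random.default_rng(42)
locs = rng.uniform(size=(20, 2))
stages = ["adult" if i % 2 == 0 else "juvenile" for i in range(20)]
locs, stages, n_hit = slfv_backward_event(locs, stages, rng, radius=0.3,
                                          event_prob=stage_event_prob)
print(len(locs), "lineages remain;", n_hit, "were affected")
```

Swapping `disc_event_prob` for `stage_event_prob` changes only the selection step, which is the sense in which state dependence could be layered onto the existing event structure. How states should be propagated to parents between events is left open here, as it is in the text.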

| A LONG-TERM GENETIC MONITORING STRATEGY
What would a long-term genetic monitoring strategy based upon spatially explicit coalescent models, such as the spatial Λ-Fleming-Viot model, look like? From the data acquisition viewpoint, such a monitoring strategy would largely resemble any other. Geo-referenced samples of individuals would be distributed across the species range, and sampling would be repeated to create a time series. Environmental and landscape data would be obtained as well to provide information on potential covariates. As with all similar studies, the goal of sampling is to ensure that each individual is equally likely to be sampled, that individuals are sampled independently, and that the environmental and landscape covariates are spatially representative.
From the data analysis viewpoint, however, such a monitoring strategy would look quite different from common practice. First, different types of genetic data, for example, DNA sequences and multilocus genotypes, would be analyzed simultaneously in the same model. In principle, this has long been possible for coalescent-based methods (Beerli & Palczewski, 2010; Bouckaert et al., 2014; Drummond & Rambaut, 2007); however, in practice different types of data, for example, single nucleotide polymorphisms (SNPs) and microsatellites, are analyzed separately. For genetic monitoring, the focus is on basic properties of the populations, for example, spatially dependent density and dispersal, not on data type-specific estimates (Milligan, Leebens-Mack, & Strand, 1994). Joint analysis of the data is likely to be better than independent analyses of partitions, in much the same way that joint analysis of gene trees leads to better inference of species trees in phylogenetics (Liu, Xi, Wu, Davis, & Edwards, 2015).
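One way to write the joint analysis, as a sketch under stated assumptions: let θ denote the shared population parameters (for example, density and dispersal), let D₁ and D₂ be two data partitions such as SNPs and microsatellites, and let φ₁ and φ₂ be marker-specific nuisance parameters such as mutation models. These symbols are introduced here for illustration, and the factorization assumes the partitions are conditionally independent given the parameters (for example, unlinked loci).

```latex
p(\theta, \phi_1, \phi_2 \mid D_1, D_2) \;\propto\;
  p(D_1 \mid \theta, \phi_1)\, p(D_2 \mid \theta, \phi_2)\,
  p(\theta)\, p(\phi_1)\, p(\phi_2)
```

Both partitions inform the same θ, whereas separate analyses yield two posteriors for θ that must then be reconciled after the fact.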
Second, increasing emphasis would be placed on the posterior distributions of parameters, as opposed to their point estimates.
Much as Guindon et al. (2016) were able to recognize similarities and differences among distributions inferred for a sequence of influenza outbreaks, genetic monitoring must recognize similarities and differences in parameters across spatial and temporal dimensions. This can only be done accurately if information on the full distributions is available.
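A comparison of this kind might be sketched as follows, assuming posterior draws of one parameter (for example, local density) are available for two sampling periods and that the two posteriors are independent. The function name and the synthetic draws are hypothetical; no real monitoring data are represented.

```python
import numpy as np

def compare_posteriors(samples_t1, samples_t2, cred=0.95, seed=0):
    """Summarize the shift in a parameter between two sampling periods
    using posterior draws rather than point estimates.

    Assumes the two posteriors are independent, so draws are paired at random.
    """
    rng = np.random.default_rng(seed)
    n = min(len(samples_t1), len(samples_t2))
    a = rng.choice(np.asarray(samples_t1), size=n, replace=False)
    b = rng.choice(np.asarray(samples_t2), size=n, replace=False)
    diff = b - a                                  # draws of the temporal change
    tail = (1.0 - cred) / 2.0
    lo, hi = np.quantile(diff, [tail, 1.0 - tail])
    return {
        "prob_increase": float(np.mean(diff > 0)),  # posterior P(later > earlier)
        "median_shift": float(np.median(diff)),
        "interval": (float(lo), float(hi)),         # credible interval of the change
    }

# Synthetic posterior draws for illustration only.
rng = np.random.default_rng(1)
density_period1 = rng.normal(100.0, 10.0, size=5000)
density_period2 = rng.normal(60.0, 10.0, size=5000)
summary = compare_posteriors(density_period1, density_period2)
print(summary)
```

Because the summary is built from the full set of draws, two periods with similar point estimates but very different posterior spread would yield a wide interval for the change, which a comparison of point estimates alone would hide.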
Third, the same model would be used for temporal comparisons to identify biological, not methodological, shifts. Not only would this make comparisons more meaningful, it would also enable direct and quantitative analysis of changes. The current practice of using different data and models over time, coupled with ad hoc interpretations of the differences, does not lend itself to reliable monitoring protocols.
Finally, the nature of the models used must of course be improved so that they will handle these demands. They must cover a full range of data types and include a full range of biological mechanisms to achieve this. Consequently, advances in genetic monitoring depend crucially on advances in the models and analyses that are possible. The rapid technological advances in data acquisition, for example, the increasing accessibility of genome-scale data, make it easy to forget that the data are meaningless without suitable analyses. For long-term genetic monitoring, those analyses must yield comparable information, and they must do so in the face of both dynamically changing populations and changing types of data.

| CONCLUSIONS
In conservation biology, there has been a movement toward better utilizing genomic data and information about adaptive genetic markers to improve our understanding of evolutionary processes, rates of dispersal, local adaptation, genotype-by-environment interactions, and other important factors influencing population structure at multiple scales (Allendorf, Hohenlohe, & Luikart, 2010; Garner et al., 2016). By enabling process-based, rather than pattern-based, approaches, models such as the spatial Λ-Fleming-Viot model will allow the quantitative, spatiotemporal comparisons required for rigorous and informative genetic monitoring and for discovering the structure of natural populations. They will also allow adaptive incorporation of additional monitoring effort to efficiently reduce uncertainties and iteratively improve inferences about temporal changes in monitored systems. Finally, they will allow integration of new samples, including historical ones from archival collections, into a monitoring effort, thereby greatly expanding the time scale over which monitoring can meaningfully occur. As a consequence of the parallel development of these models and genetics technology, genetic monitoring stands poised to provide a rich source of information for more effectively guiding real-time management decisions, monitoring the impact of human activities including changes in policy, and informing us about fundamental biological processes such as responses to global climate change.

ACKNOWLEDGEMENTS
This work was assisted through participation in the Next Generation Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. BKH was partially supported by funds from NSF (award #DOB-1639014) and NASA (award #NNX14AB84G). We thank two anonymous reviewers for comments that greatly improved our writing.

DATA ARCHIVING STATEMENT
There are no data associated with this article.

CONFLICT OF INTEREST
None declared.