Spatially explicit ecological modeling improves empirical characterization of plant pathogen dispersal

Abstract Dispersal is a key ecological process, but it remains difficult to measure. By recording numbers of dispersed individuals at different distances from the source, one acquires a dispersal gradient. Dispersal gradients contain information on dispersal, but they are influenced by the spatial extent of the source. How can we separate the two contributions to extract knowledge about dispersal? One could use a small, point‐like source for which a dispersal gradient represents a dispersal kernel, which quantifies the probability of an individual dispersal event from a source to a destination. However, the validity of this approximation cannot be established before conducting measurements. This represents a key challenge hindering progress in the characterization of dispersal. To overcome it, we formulated a theory that incorporates the spatial extent of sources to estimate dispersal kernels from dispersal gradients. Using this theory, we re‐analyzed published dispersal gradients for three major plant pathogens. We demonstrated that the three pathogens disperse over substantially shorter distances compared to conventional estimates. This method will allow researchers to re‐analyze a vast number of existing dispersal gradients to improve our knowledge about dispersal. The improved knowledge has the potential to advance our understanding of species' range expansions and shifts, and to inform management of weeds and diseases in crops.

However, there are still far fewer datasets describing plant dispersal than plant demography because dispersal remains difficult to measure (Bullock et al., 2017). Here, we identified and resolved one of the key challenges that hinders progress in empirical characterization of dispersal: We incorporated the spatial extent of dispersal sources in the analysis of dispersal measurements.
One approach to measure dispersal is to use spatially localized sources of dispersing individuals and record dispersal gradients produced from them. These gradients do contain relevant information about dispersal, but they are also influenced by the spatial extent of the source (Cousens & Rawlinson, 2001; Ferrandino, 1996; Zadoks & Schein, 1979).
How can we evaluate this influence to extract more general knowledge about dispersal from specific dispersal gradients? This can be achieved using a mathematical description of dispersal with the help of dispersal kernels. A dispersal kernel quantifies the probability of an individual dispersal event from a source point to a destination point. Technically, a dispersal kernel is a probability density function that depends on the location of the destination point ("dispersal location kernel", Nathan et al., 2012).
To characterize dispersal, we need to estimate dispersal kernels based on observed dispersal gradients. An observed dispersal gradient from a point source would correspond to the dispersal kernel.
However, sources usually need to have a certain area to yield a sufficient number of dispersing propagules to be observed. How small must a source be to be considered a point? In an influential book on plant disease epidemics, Zadoks and Schein (1979) formulated a rule of thumb stating that a point source should have "a diameter smaller than 1% of the gradient length". This rule of thumb is misleading: the size of the source should be compared with the characteristic distance of dispersal rather than the gradient length. Yet the characteristic dispersal distance is not known in advance of conducting measurements. Therefore, we cannot establish sound criteria for the validity of the point source approximation in advance of conducting measurements.
Due to the lack of clear criteria, "point" sources of various sizes appear in the literature: an adult tree (lichen Lobaria pulmonaria, Werth et al., 2006), circles of 1.6 m diameter (thistles Carduus nutans, Carduus acanthoides, Skarpaas & Shea, 2007), circles of 0.5 m diameter (garlic mustard Alliaria petiolata, Loebach & Anderson, 2018), 4 m² squares (wine raspberry Rubus phoenicolasius, Japanese barberry Berberis thunbergii, multiflora rose Rosa multiflora, and Japanese stiltgrass Microstegium vimineum, Emsweller et al., 2018), and even entire agricultural fields (oilseed rape Brassica napus, Devaux et al., 2007). These studies reported valuable dispersal gradients, but using these dispersal gradients as proxies for dispersal kernels may be unjustified. Spatially explicit modeling has been suggested (Greene & Calogeropoulos, 2002) to address this problem and was used in some modeling studies (Clark et al., 1999; Shaw et al., 2006), but it was not widely adopted in the literature on experimental dispersal measurements.
In this study, we devised a systematic approach to estimate dispersal kernels from dispersal gradients without using the point source approximation. For this purpose, we combined theory, analysis of empirical data and numerical simulations. We first formulated a theory that incorporates dispersal from a spatially extended source by treating each point within the source area as an independent point source (the spatially explicit approach). We highlighted how mathematical properties of widely used kernel functions (exponential, Gaussian and power-law, Nathan et al., 2012) can inform experimental design. Then, we re-analyzed published empirical datasets on three major plant pathogens with contrasting spatial scales of dispersal and conducted comprehensive numerical simulations. In this way, we demonstrated how this approach allows researchers to estimate dispersal kernels more accurately than using the point source approximation.

| Theory
The probability of dispersal from a source point $p_s = (x_s, y_s)$ to a destination point $p_d = (x_d, y_d)$ is given by the dispersal location kernel (hereafter "dispersal kernel"). It is typically a monotonically decreasing function of the distance between the points.
To estimate a dispersal kernel using a dispersal gradient produced by an area source, we consider the cumulative effect of all point-to-point dispersal events from the source to the destination. This is achieved by taking an integral over the individual points comprising the source to calculate their combined contribution to the dispersed population at a certain destination point (as in Shaw et al., 2006, Equation (4.6)). Similarly, the integral over all points of the destination area gives the total number of individuals that moved there from the source (as in Rimbaud et al., 2018, Equation (16)):
$$N_1 = \int_D \int_S n_0(p_s)\, \kappa(p_s, p_d)\, \mathrm{d}p_s\, \mathrm{d}p_d, \tag{1}$$
where $S = \{p_s\}$ is the source area, $D = \{p_d\}$ is the destination area, $n_0(p_s)$ is the density of individuals within $S$ before dispersal, and $\kappa(p_s, p_d)$ is the dispersal kernel (key variables and parameters are listed in Table 1). Equation (1) provides a valid description of the dispersal process when the overall population size is sufficiently large so that stochastic fluctuations in the numbers of dispersed individuals can be neglected. When the populations before dispersal ($n_0(p_s)$) and after dispersal ($N_1$) are measured, the only unknown in Equation (1) remains the dispersal kernel. Equation (1) offers a way to estimate dispersal kernel parameters that takes into account the spatial extent of both the source and the destination.
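As an illustration, the double integral in Equation (1) can be approximated numerically. The following minimal Python sketch uses a midpoint rule over rectangular source and destination areas. The two-dimensional exponential kernel, the uniform source density, the grid resolution, and all function names and numerical values are assumptions chosen for this example; they are not taken from the analyses reported in this study.

```python
import math

def exp_kernel_2d(r, alpha):
    """Two-dimensional exponential kernel, normalized to integrate to 1 over the plane."""
    return math.exp(-r / alpha) / (2.0 * math.pi * alpha ** 2)

def midpoints(lo, hi, n):
    """Centers of n equal cells spanning the interval [lo, hi]."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)]

def dispersed_count(source, dest, n0, alpha, n=10):
    """Midpoint-rule approximation of Equation (1).

    source, dest: rectangles given as (x_min, x_max, y_min, y_max).
    n0: density of individuals in the source, assumed uniform here.
    """
    sx0, sx1, sy0, sy1 = source
    dx0, dx1, dy0, dy1 = dest
    cell_s = (sx1 - sx0) * (sy1 - sy0) / n ** 2   # area of one source cell
    cell_d = (dx1 - dx0) * (dy1 - dy0) / n ** 2   # area of one destination cell
    total = 0.0
    for xs in midpoints(sx0, sx1, n):
        for ys in midpoints(sy0, sy1, n):
            for xd in midpoints(dx0, dx1, n):
                for yd in midpoints(dy0, dy1, n):
                    r = math.hypot(xd - xs, yd - ys)
                    total += exp_kernel_2d(r, alpha)
    return n0 * total * cell_s * cell_d

# Hypothetical design: 1 m x 1 m source and 0.1 m deep destination strips.
source = (-0.5, 0.5, -0.5, 0.5)
strips = [(x, x + 0.1, -0.5, 0.5) for x in (1.0, 2.0, 3.0)]
gradient = [dispersed_count(source, d, n0=100.0, alpha=0.3) for d in strips]
```

The list `gradient` then holds the expected numbers of individuals in three strips at increasing distances from the source. A finer grid (larger `n`) or an adaptive quadrature routine would be preferable when the kernel varies strongly across the source.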
A simpler, more common but often inaccurate approach is to fit a function of one spatial coordinate $x$ to dispersal gradient data. For example, the function
$$N_1 = C e^{-x/\alpha} \tag{2}$$
can be fitted to dispersal gradient data to estimate the scale parameter $\alpha$ of the exponential kernel (for example, Saint-Jean et al., 2004). This approach works for any kernel function if both the source and the destination can be considered as points.
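For concreteness, one simple way to fit the one-dimensional function in Equation (2) is ordinary least squares on log-transformed counts. The sketch below is a minimal example under that assumption. The function name, the synthetic data, and the choice of a log-linear fit are illustrative and do not reproduce the procedure of any cited study.

```python
import math

def fit_exponential_gradient(distances, counts):
    """Least-squares fit of log(N1) = log(C) - x / alpha; returns (C, alpha)."""
    pairs = [(x, math.log(n)) for x, n in zip(distances, counts) if n > 0]
    m = len(pairs)
    mean_x = sum(x for x, _ in pairs) / m
    mean_y = sum(y for _, y in pairs) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -1.0 / slope

# Noise-free synthetic gradient from a true point source (hypothetical values).
distances = [10.0, 20.0, 40.0, 80.0]
counts = [50.0 * math.exp(-x / 25.0) for x in distances]
C_hat, alpha_hat = fit_exponential_gradient(distances, counts)

# Distance over which the fitted gradient drops by half.
half_distance = alpha_hat * math.log(2.0)
```

With data generated by a true point source, the fit returns the scale parameter used to generate them. The same fit applied to a gradient from an extended source returns a value that need not equal the kernel's scale parameter.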
However, when the source or the destination is extended in space, the above approach may lead to inaccurate estimates of kernel parameters. In particular, extended sources modify dispersal gradients compared to point sources. Figure 1 illustrates such modifications for exponential, Gaussian and power-law kernels (defined in Box 1). Compare, for example, the gradients produced by the point source (source 1 in Figure 1a) and the area source (source 4 in Figure 1a). Extension of the source leads to a "flattening" of the gradient for the exponential and the power-law kernels, but for the Gaussian kernel, it leads to a "steepening" of the gradient (cf. the dashed green curve with the dashed blue curve in Figure 1b,d). Some studies postulated that gradients produced by spatially extended sources are "flatter" than gradients resulting from more localized sources (Cousens & Rawlinson, 2001; Ferrandino, 1996; Greene & Calogeropoulos, 2002; Zadoks & Schein, 1979). Here, we demonstrated that whether the extension of the source leads to a "flattening" or to a "steepening" of the gradient depends on the underlying kernel function. Thus, using a dispersal gradient from an extended source as a proxy for a dispersal kernel can lead to either an overestimation or an underestimation of the associated kernel parameters.
Only in special cases does the shape of the dispersal gradient match the shape of the dispersal kernel even when the source is extended, which simplifies the analysis. (i) If the source is extended only in the direction of the measured gradient (along the x-axis) and dispersal is governed by the exponential kernel, using Equation (2) will give a correct estimate of the scale parameter, because exponential kernels are "memoryless" (Box 1). This is visible in Figure 1b where source 1 and source 2 produce the same gradients. However, this does not work for Gaussian and power-law kernels (Figure 1c,d).
(ii) If the source is extended along the y-axis, perpendicular to the direction of the measured gradient, a similar simplification is possible for the Gaussian kernel (Figure 1c, sources 1 and 3). This is due to separability (Box 1) of the kernel, whereby each point source within the line source 3 in Figure 1a produces the same gradient along the x-axis. This holds for any separable kernel, but does not hold for non-separable kernels such as exponential or power-law kernels (Figure 1b,d). Analogous simplifications can be made when considering spatially extended destinations.
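The two special cases above can be checked numerically. The sketch below sums kernel contributions from discretized line sources and compares the resulting gradients with the gradient from a single point source: if the ratio of the two is the same at every distance, the gradient shapes are identical. The kernel scale, source lengths, distances and all names are assumptions chosen for illustration.

```python
import math

def exponential(r):
    """Unnormalized exponential kernel with scale parameter 2 (arbitrary units)."""
    return math.exp(-r / 2.0)

def gaussian(r):
    """Unnormalized Gaussian kernel with scale parameter 2 (arbitrary units)."""
    return math.exp(-r ** 2 / (2.0 * 2.0 ** 2))

def gradient(kernel, source_points, x_values):
    """Summed kernel contributions of all source points at destinations (x, 0)."""
    return [sum(kernel(math.hypot(x - xs, ys)) for xs, ys in source_points)
            for x in x_values]

def shape_spread(kernel, source_points, x_values):
    """Relative spread of the ratio to the point-source gradient (0 = same shape)."""
    extended = gradient(kernel, source_points, x_values)
    point = gradient(kernel, [(0.0, 0.0)], x_values)
    ratios = [e / p for e, p in zip(extended, point)]
    return max(ratios) / min(ratios) - 1.0

x_values = [5.0, 6.0, 8.0, 12.0]                      # destinations beyond the source
along_x = [(-2.0 + 0.1 * i, 0.0) for i in range(41)]  # line source along the gradient
along_y = [(0.0, -2.0 + 0.1 * i) for i in range(41)]  # line source across the gradient

spread_exp_x = shape_spread(exponential, along_x, x_values)   # memoryless case (i)
spread_gauss_y = shape_spread(gaussian, along_y, x_values)    # separable case (ii)
spread_exp_y = shape_spread(exponential, along_y, x_values)   # neither property applies
```

In this setting the first two spreads vanish up to rounding error, whereas the third does not, in line with the statements in (i) and (ii).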
Insights presented above inform design and analysis of dispersal experiments. Gaussian and exponential kernels have been used in a number of studies to describe dispersal across a range of taxonomic groups (Table 15.1 in Nathan et al., 2012). When dispersal is governed by a memoryless (exponential) or a separable (e.g., Gaussian) kernel, appropriate line sources should be used to boost the power of the source, while maintaining the validity of the point source approximation to simplify the analysis. However, in most cases dispersal is better described by kernels that are neither memoryless nor separable (Nathan et al., 2012), such as the power-law kernel in Equation (5). In these cases, or when the kernel function is not known before conducting measurements, dispersal gradients should be analyzed using a spatially explicit approach based on Equation (1), as we demonstrate next.

| Experimental design and data analysis
We re-analyzed published empirical data on dispersal gradients using the spatially explicit method that incorporates the spatial extent of the source and compared the outcomes with those based on the conventional point source approximation. We considered three datasets collected in field experiments investigating dispersal of major pathogens of crop plants: (i) the fungus Zymoseptoria tritici that causes septoria tritici blotch in wheat (Karisto et al., 2022); (ii) the fungus Puccinia striiformis that causes stripe (yellow) rust in wheat (Cowger et al., 2005; Sackett & Mundt, 2005); and (iii) the oomycete Phytophthora infestans that causes late blight in potatoes (Gregory, 1968). Asexual spores of P. striiformis (urediniospores) and propagules of P. infestans (sporangia) are mainly wind-dispersed. Spatial scales of the experiments varied from 100 cm to 100 m. Design of experimental plots and measurements is shown in Figure 2.
In each experiment, pathogen spores were inoculated across inoculation areas within experimental plots to create area sources of dispersing populations (orange areas in Figure 2). Then, disease gradients (disease intensity versus distance from the source) were recorded outside the inoculation areas across rectangular areas situated at increasing distances from the source (we call these areas "measurement lines"; light brown rectangles in Figure 2). These disease gradients are called primary gradients, because they resulted from a single cycle of pathogen reproduction (based on latent periods and timing of infections). The cycle includes both spore dispersal and infection success, hence, the measured gradients reflected effective dispersal gradients of the pathogen population (analogous to the combination of seed dispersal and establishment, Klein et al., 2013).
In the analysis, we incorporated the spatial extent of the source areas in two dimensions, but considered the measurement lines as thin lines perpendicular to the dispersal direction, since their length along the dispersal direction was short (dark brown lines in Figure 2).
For each dataset, we chose an appropriate dispersal kernel function based on the original study, to allow for comparison with the results of the original analysis. Then we derived specific expressions for dispersal gradients, firstly, using the point source approximation (i.e., assuming a point-like source and destinations in the middle of the inoculation area or measurement lines; "1D" in Figure 2) and, secondly, based on the spatially explicit Equation (1). Based on these expressions, we estimated dispersal kernel parameters.
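The contrast between the two estimation routes can be sketched on synthetic data. In the minimal example below, a noise-free gradient is generated from a square source with a two-dimensional exponential kernel and is then fitted, first with the point-source expression and second with the spatially explicit model, using a coarse grid search on log-transformed values. The geometry, the parameter values, the point-like destinations, the grid search and all names are assumptions for illustration only; they do not reproduce the fitting procedure used in this study.

```python
import math

def area_gradient(alpha, distances, half_size=20.0, n=12):
    """Relative intensity at points (x, 0) from a square source centered at the origin.

    The source is split into n x n cells whose centers act as point sources with a
    two-dimensional exponential kernel. The normalization factor is omitted because
    it is absorbed by the fitted prefactor.
    """
    step = 2.0 * half_size / n
    centers = [-half_size + (i + 0.5) * step for i in range(n)]
    return [sum(math.exp(-math.hypot(x - xs, ys) / alpha)
                for xs in centers for ys in centers)
            for x in distances]

def point_gradient(alpha, distances):
    """Point source approximation: exponential decay with distance from the center."""
    return [math.exp(-x / alpha) for x in distances]

def log_misfit(observed, model):
    """Sum of squared log-residuals after fitting a free multiplicative prefactor."""
    d = [math.log(o) - math.log(m) for o, m in zip(observed, model)]
    mean_d = sum(d) / len(d)
    return sum((v - mean_d) ** 2 for v in d)

def fit_alpha(observed, distances, model, candidates):
    """Coarse grid search: the candidate scale parameter with the smallest misfit."""
    return min(candidates, key=lambda a: log_misfit(observed, model(a, distances)))

alpha_true = 10.0                                  # hypothetical, arbitrary units
distances = [30.0, 40.0, 50.0, 70.0, 90.0]         # measured from the source center
observed = area_gradient(alpha_true, distances)    # noise-free synthetic gradient

candidates = [i / 20.0 for i in range(100, 401)]   # 5.00, 5.05, ..., 20.00
alpha_1d = fit_alpha(observed, distances, point_gradient, candidates)
alpha_2d = fit_alpha(observed, distances, area_gradient, candidates)
```

In this noise-free setting the spatially explicit fit recovers the scale parameter used to generate the data, while the point-source fit returns a larger value, because the extended source flattens the gradient of an exponential kernel. A continuous optimizer would normally replace the grid search.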

| Septoria tritici blotch
We analyzed a subset of data collected in a larger experiment (Karisto et al., 2022) that characterized dispersal of Z. tritici.
Using the point source approximation, we computed the disease intensity after dispersal at a distance $r = x$ from the source with the exponential kernel (Equation (3), $k = 1$), which gives Equation (6). Here, $I_0$ is the disease intensity at the source before dispersal, and the transmission parameter comprises the probability of dispersal and the infection efficiency of fungal spores.
Next, we relaxed the point source approximation and used the spatially explicit approach. We computed the expected disease intensity after dispersal at a destination point by substituting the exponential kernel with $k = 2$ into Equation (1). Specifying the integrals in Equation (1) gives the average intensity across the measurement line at a distance $x_d$ (Equation (7)). The integrations incorporate the contribution of each source point $(x_s, y_s)$ to disease intensity at the destination point $(x_d, y_d)$. In the integrals in Equation (7), we set $x_s = 0$ at the center of the inoculation area and $y_s = 0$, $y_d = 0$ at the edge of the plot.
We fitted the one-dimensional model Equation (6) and the two-dimensional model Equation (7) to observed dispersal gradients to estimate the scale parameter $\alpha$.

BOX 1 Dispersal kernels and their special properties.

Exponential kernel is defined as
$$\kappa_e(r) = C_{k,e}\, e^{-r/\alpha}, \tag{3}$$
where $\alpha$ is the scale parameter, $k \in \{1, 2\}$ is the number of dimensions, $r = r(p_s, p_d)$ is the Euclidean distance from the source point, and $C_{k,e}$ is the normalization factor: $C_{1,e} = 1/(2\alpha)$, $C_{2,e} = 1/(2\pi\alpha^2)$. The mean dispersal distance for $k = 2$ is $\bar{r}_e = 2\alpha$.

Gaussian kernel is defined as
$$\kappa_g(r) = C_{k,g}\, e^{-r^2/(2\sigma^2)}, \tag{4}$$
with scale parameter $\sigma$ and normalization factor $C_{k,g}$.

Power-law kernel is defined here by Equation (5), with a shape parameter and a scale parameter.

Memorylessness. Exponential kernels are memoryless: when we set any point in the distribution as a starting point, the tail of the distribution will have the same shape as the entire distribution, i.e., the "past" does not affect the "future" probabilities: $\kappa_e(x_1 + r)/\kappa_e(x_1) = \kappa_e(x_2 + r)/\kappa_e(x_2)$ for any starting points $x_1$, $x_2$ (see also Ahmad & Alwasel, 1999). Thanks to this property, exponential kernels are unambiguously characterized by the half-distance $\alpha \ln(2)$. For any $r$-value in Equation (3), moving $\alpha \ln(2)$ units further to $r + \alpha \ln(2)$ will decrease the density by half.

Separability. A function is called separable when it can be expressed as a product of other functions that depend on only one independent variable each: the variables can be separated from each other, e.g., $f(x, y) = f_x(x) f_y(y)$. Separable functions are often considered in connection with separable differential equations (Ahmad & Ambrosetti, 2015). When a dispersal kernel is separable, the shape of the kernel along the x-axis does not depend on the y-coordinate, i.e., dispersal probabilities in x- and y-dimensions are independent random variables.

| Stripe rust
Asexual spores of P. striiformis (urediniospores) were inoculated onto 1.53 m × 1.53 m squares within 6.1 m-wide plots that were at least 100 m long in the downwind direction (Cowger et al., 2005). Disease severity in the measurement lines was measured visually as the percentage of leaf area covered by lesions ("disease severity" is a specific form of the more general term "disease intensity", Madden et al., 2007).
We used the modified power-law kernel as defined by Equation (10) of Mikaberidze et al. (2016), because it describes disease gradients of stripe rust better than exponential or Gaussian kernels:
$$\kappa_{p1}(r) = C_{k,p1} \left(\lambda^2 + r^2\right)^{-\beta/2}. \tag{8}$$
Here, $r$ is the distance between the source point and the destination point; $C_{k,p1}$ is the normalization factor, and $k = 1, 2$ is the number of dimensions. At $k = 1$, $C_{1,p1} = \Gamma(\beta/2) / \left[\sqrt{\pi}\, \lambda^{1-\beta}\, \Gamma((\beta - 1)/2)\right]$, where $\Gamma$ is the gamma function; and at $k = 2$, $C_{2,p1} = (\beta - 2) / \left(2\pi \lambda^{2-\beta}\right)$.
The kernel Equation (8) has the same basic properties as the modified power-law kernel in Equation (5) (fat-tailed, power-law).
This form was used by Mikaberidze et al. (2016) to analyze the same data and we decided to use the same form to enable an easier comparison. Similarly to Equation (5), $\beta$ is the shape parameter and $\lambda$ is the scale parameter (set to λ = 0.762 m as in Mikaberidze et al., 2016).
Using Equation (8), we derived the one-dimensional model Equation (9) and the two-dimensional model Equation (10) for the disease gradient. We performed a natural logarithmic transformation of observed disease gradients to avoid a disproportionate emphasis on the few large values at the beginning of the gradient, and excluded zeros from the log-transformed data. Accordingly, we log-transformed both the one-dimensional model Equation (9) and the two-dimensional model Equation (10). We then fitted both functions to log-transformed disease gradients to estimate the shape parameter $\beta$.

| Potato late blight
We analyzed a subset of data on dispersal of P. infestans (Gregory, 1968, Table III, unsprayed experiment). The power-law function used to describe disease gradients by Gregory (1968), Equation (11), is not a kernel. Nevertheless, we used it in the analysis as if it were a kernel so that the results are comparable with the estimates obtained by Gregory (1968).
Under the point source approximation, we used Equation (11) as the one-dimensional model (Equation (12)); in the spatially explicit approach, we used it in place of the kernel, which gives the two-dimensional model (Equation (13)). Because this function is not a normalized kernel, the prefactor $I_0$ in Equation (12) and Equation (13) has no biological relevance.
We performed the natural logarithmic transformation of observed disease gradients and excluded zeros from the transformed data. Accordingly, we log-transformed both the one-dimensional Equation (12) and the two-dimensional Equation (13). Then, we fitted the two functions to the log-transformed disease gradients to estimate the shape parameter.
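The log transformation with exclusion of zeros can be sketched for the point-source case as follows. The example assumes that the one-dimensional gradient has the simple power-law form I(x) = I0 · x^(−β), which is an assumption made here for illustration; the exact form of Equation (11) is not reproduced. Function names and data values are hypothetical.

```python
import math

def fit_log_log(distances, intensities):
    """Least-squares fit of log(I) = log(I0) - beta * log(x); zeros are skipped.

    Returns (beta, I0). Zero intensities cannot be log-transformed, so they are
    excluded before fitting.
    """
    pts = [(math.log(x), math.log(v))
           for x, v in zip(distances, intensities) if v > 0]
    m = len(pts)
    mean_u = sum(u for u, _ in pts) / m
    mean_w = sum(w for _, w in pts) / m
    slope = (sum((u - mean_u) * (w - mean_w) for u, w in pts)
             / sum((u - mean_u) ** 2 for u, _ in pts))
    return -slope, math.exp(mean_w - slope * mean_u)

# Synthetic gradient (hypothetical values); no disease was seen at the last distance.
distances = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
intensities = [40.0 * x ** -2.5 for x in distances[:-1]] + [0.0]
beta_hat, i0_hat = fit_log_log(distances, intensities)
```

Fitting on the log scale gives each distance a comparable weight, so the few large values near the source do not dominate the estimate. Excluding zeros discards information about the far tail, which is a limitation of this simple approach.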

| Data analysis
The fitting was implemented in Python 3.7 using packages including numpy.
We used the estimates of the kernel parameters for septoria tritici blotch and stripe rust to quantify the characteristic scales of dispersal by computing medians ($r_{50}$) and 90th percentiles ($r_{90}$) of dispersal distance kernels (Nathan et al., 2012). We computed the two percentiles numerically by solving the equation
$$2\pi \int_0^{r_L} r\, \kappa_i(r)\, \mathrm{d}r = 0.01\, L$$
with respect to $r_L$ at $L = 50, 90$. Here, $\kappa_i(r)$ is the dispersal kernel function, where $i = e, p1$; $e$ stands for the exponential kernel (Equation (3) at $k = 2$) and $p1$ stands for the modified power-law kernel (Equation (8) at $k = 2$).
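For the two-dimensional exponential kernel, the integral in the percentile equation has a closed form, so the percentiles can be found with a simple root search. The sketch below uses bisection. The scale parameter value, the search interval, the tolerance and the function names are assumptions chosen for illustration.

```python
import math

def exp_distance_cdf(r, alpha):
    """Fraction of dispersal events within distance r (2D exponential kernel).

    Closed form of 2*pi * integral from 0 to r of s * kappa_e(s) ds,
    with kappa_e(s) = exp(-s / alpha) / (2 * pi * alpha**2).
    """
    return 1.0 - math.exp(-r / alpha) * (1.0 + r / alpha)

def percentile_distance(cdf, level, r_max=1e6, tol=1e-10):
    """Solve cdf(r) = level / 100 by bisection on the interval [0, r_max]."""
    target = level / 100.0
    lo, hi = 0.0, r_max
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = 10.0  # hypothetical scale parameter, arbitrary units
r50 = percentile_distance(lambda r: exp_distance_cdf(r, alpha), 50)
r90 = percentile_distance(lambda r: exp_distance_cdf(r, alpha), 90)
```

For this kernel, the median and the 90th percentile are proportional to the scale parameter (approximately 1.68 and 3.89 times its value, respectively). For kernels without a closed-form integral, `cdf` would be replaced by a numerical quadrature.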

| Results of data analysis
In all three cases, the spatially explicit estimation (2D-estimation) resulted in steeper dispersal kernels and shorter dispersal distances compared to the point source approximation (1D-estimation), because the estimated kernel parameters differed between 2D- and 1D-estimation (Table 2, Appendix S1: Figure S1.1). The scale parameter estimate for septoria tritici blotch was lower by about 12% in 2D-estimation compared to 1D-estimation; the shape parameter estimate for stripe rust was higher by about 10%; and the shape parameter estimate for potato late blight was higher by more than 30%.
For septoria tritici blotch and stripe rust, we further investigated how the differences in kernel parameter estimates affect the characteristic spatial scales of dispersal, quantified by the 50th and 90th percentiles of dispersal distance kernels, $r_{50}$ and $r_{90}$. For septoria tritici blotch, a moderate reduction in the scale parameter estimate in 2D-estimation compared to 1D-estimation translated into a similarly moderate decrease in $r_{50}$ and $r_{90}$ (Table 2). In contrast, for stripe rust, a modest increase in the shape parameter estimate in 2D- versus 1D-estimation translated into a dramatic decrease in the spatial scales of dispersal (Table 2).
In particular, for the Madras dataset, 2D-estimation resulted in a nearly two-fold reduction of $r_{50}$ and an almost seven-fold reduction of $r_{90}$. We could not conduct this analysis with the power-law function defined by Equation (11), because it cannot be normalized. However, based on the substantial difference in the shape parameter estimates between 2D- and 1D-estimation, we expect a comparably strong reduction in estimated characteristic spatial scales of dispersal for potato late blight too. Thus, the three plant pathogens disperse over substantially shorter distances according to the more realistic 2D-estimation compared to conventional 1D-estimates.

| Results of numerical simulations
Are the 2D-estimates acquired above more accurate (i.e., closer to the true values) than 1D-estimates? This is plausible, because the 2D-estimation describes dispersal from spatially extended sources more realistically. However, we cannot answer this question definitively based on the analysis of experimental data alone, because we do not know the true values of dispersal kernel parameters. Here, we addressed this question via numerical simulations. We first simulated the dispersal process according to exponential, Gaussian and power-law kernels with pre-defined parameters. Then, we used both methods to estimate the kernel parameters and compared the two methods in terms of their estimation accuracy across a range of biologically plausible scenarios. Here, we summarize the key outcomes of these simulations, but describe them in more detail in Appendices A, B and C.
We started by conducting idealized simulations (Appendix S2): we assumed that sampling locations were points without spatial extent and that measured values accurately reflected true values.
Here, the 2D-estimation provided perfectly accurate estimates, while 1D-estimation exhibited substantial errors. We analyzed how the errors in 1D-estimates depend on the parameters of kernel functions and source sizes. We found that the errors become smaller for organisms with longer mean dispersal distances and when using smaller source sizes.
Next, we wanted to understand the origin of errors in 1D-

TABLE 2 Comparison of kernel parameter estimates and associated percentiles of dispersal distance kernels between one- and two-dimensional models (1D and 2D, respectively). The 1D-estimates here correspond to the estimates presented in the earlier publications. The shape parameter appears in the exponent of power-law kernels, but the scale parameter enters the denominator of the exponent in the exponential kernel. Hence, the parameter difference has the opposite effect on characteristic dispersal distances in septoria tritici blotch compared to the two other systems.

| DISCUSSION
We devised a theoretical framework to estimate dispersal kernels from empirical dispersal gradients by incorporating the spatial extent of dispersal sources. We re-analyzed existing dispersal gradients of three plant pathogens and evaluated the approach in numerical simulations. Based on these two lines of evidence, we conclude that the three organisms disperse on average over substantially shorter distances compared to estimates from conventional modeling.

Similar spatially explicit approaches have been used in modeling studies to investigate dispersal in plants (Clark et al., 1999; Shaw et al., 2006) and plant pathogens (Rimbaud et al., 2018). However, such approaches are not adopted in the literature on empirical characterization of dispersal (e.g., not used in Werth et al., 2006; Skarpaas & Shea, 2007; Loebach & Anderson, 2018; Emsweller et al., 2018; Devaux et al., 2007). Also, Bullock et al. (2017) excluded dispersal gradients produced by line and area sources from their analysis, because these gradients could not be compared to dispersal gradients from point sources. However, dispersal kernels estimated using the spatially explicit approach presented here enable such comparisons, because the estimates are independent of specific experimental design. Hence, adoption of this methodology would provide a unifying framework to extract biological knowledge from experimental data on dispersal.
We demonstrated how to use this theory to extract more knowledge from existing dispersal datasets. The improved estimates can potentially enhance our understanding of ecological dynamics. For example, the yellow rust pathogen P. striiformis has recently expanded its geographic range by adapting to higher temperatures (Milus et al., 2009). Here, we acquired more accurate estimates of P. striiformis dispersal kernels using the spatially explicit approach, whereby the characteristic dispersal distance $r_{90}$ is about seven times shorter compared to conventional estimates (Table 2). Thus, our results (combined with knowledge about other relevant biophysical processes) could enable a more accurate prediction of further range expansion of P. striiformis populations, which is likely to be slower than expected based on conventional estimates.
Similarly, using this method, a large proportion of other published dispersal gradients (e.g., Devaux et al., 2007; Emsweller et al., 2018; Loebach & Anderson, 2018; Skarpaas & Shea, 2007; Werth et al., 2006) can be re-analyzed to improve our knowledge about spatial scales of dispersal. This could improve our capacity to predict shifts and expansions of species' geographic ranges, and sizes and compositions of plant communities. We demonstrated that depending on several factors (e.g., the functional form of the kernel and the spatial configuration of source/measurement locations), the improved estimates based on the spatially explicit approach can result in either shorter or longer dispersal distances compared to conventional estimates. Accordingly, an improved prediction of the rate of range expansions or shifts, and of the sizes or compositions of ecological communities, can go in either direction, which highlights the importance of acquiring more accurate estimates of dispersal.
We assumed isotropic dispersal in data analysis and simulations. However, anisotropic dispersal is common in nature (Soubeyrand et al., 2007) and the model can be extended to incorporate it. In this extended model, the probability of dispersal from a source point to a destination point will depend not only on the distance between the points, as in our case, but also on the direction from the source to the destination. Parameters of anisotropic dispersal kernels can then be estimated from measurements of dispersal gradients with the spatially explicit consideration of the source. Empirical data we analyzed characterized populations of passively dispersing plant pathogens. The methodology is applicable to plant and plant pathogen systems that have passive dispersal, but may not be applicable to characterize active dispersal, e.g., of vector-borne plant viruses.
Based on the outcomes of our idealized simulations, it is tempting to propose simple rules of thumb about when the point source approximation provides reasonably accurate estimates of dispersal kernel parameters. This appears to be the case, for example, when the sources are sufficiently small and the spatial scale of dispersal is sufficiently large (Figure S2.2). However, in these idealized simulations, we neglected the spatial extent of measurement areas and limitations in the amount of sampling within these areas. When we considered these features in more realistic simulations, the outcomes revealed a non-trivial influence of several factors (such as the functional form of the kernel, the spatial configuration of the source and the measurement locations as well as sample sizes) on the estimation accuracy. As a result, we are not able to provide simple rules of thumb regarding the validity of the point source approximation.
Instead, based on our results we suggest the following best practices to design future dispersal experiments. First, a proposed experiment should be simulated numerically over a range of plausible parameter values to decide whether the point source approximation is valid or the spatially explicit modeling should be used in the analysis. Second, aspects of experimental design can be optimized by doing further simulations in order to minimize costs while maximizing the estimation accuracy. These aspects include the size of the source, the spatial configuration of measurement areas (such as their sizes, shapes, and measurement distances), and sample sizes.
In conclusion, we demonstrated how spatially explicit modeling can improve the analysis of existing dispersal data and optimize design of future dispersal experiments.

ACKNOWLEDGMENTS
PK and AM gratefully acknowledge financial support from the Swiss National Science Foundation through the Ambizione grant PZ00P3_161453.

CONFLICT OF INTEREST STATEMENT
The authors declare that they have no competing interests.