dynamicSDM: An R package for species geographical distribution and abundance modelling at high spatiotemporal resolution

Species distribution models (SDM) are widely applied to understand changing species geographical distribution and abundance patterns. However, existing SDM tools are inherently static and inadequate for modelling species distributions that are driven by dynamic environmental conditions. dynamicSDM provides novel tools that explicitly consider the temporal dimension at key SDM stages, including functions for: (a) Cleaning and filtering species occurrence records by spatial and temporal qualities; (b) Generating pseudo‐absence records through space and time; (c) Extracting spatiotemporally buffered explanatory variables; (d) Fitting SDMs whilst accounting for temporal biases and autocorrelation and (e) Projecting intra‐ and inter‐ annual geographical distributions and abundances at high spatiotemporal resolution. Package functions have been designed to be: flexible for targeting specific study species; compatible with other SDM tools; and, by utilising Google Earth Engine and Google Drive, to have low computing power and storage needs. We illustrate dynamicSDM functions with an example of a nomadic bird in southern Africa, the red‐billed quelea Quelea quelea. As dynamicSDM functions are flexible and easily applied, we suggest that these tools could be readily applied to other taxa and systems globally.

extrapolated field data (Baker et al., 2017), which usually represent annual or longer-term averages, and sometimes are poorly temporally matched with species occurrence records. From static SDMs, the drivers of species occurrence can be inferred, and it is assumed that by applying these relationships over space and time, patterns of distribution suitability can be projected, and some have even used this information to project distribution changes (Elith et al., 2010). Therefore, SDMs have become fundamental tools for, amongst other topics, biogeographic, evolutionary and applied-ecological research to inform biodiversity and species management (Zimmermann et al., 2010).
However, these data have been shown to be inadequate for simulating rapidly changing species distributions that are driven by short-term (e.g. annual and even sub-annual) ecoclimatic conditions (Bateman et al., 2012;Fernandez et al., 2017). Moreover, recent studies have demonstrated an improvement in SDM accuracy and precision by incorporating temporally dynamic explanatory variables when modelling distributions of mobile species (Abrahms et al., 2019;Reside et al., 2010) and in landscapes with high inter-annual environmental variability (Andrew & Fox, 2020;Fernandez et al., 2017).
Yet, to date, examples of dynamic SDMs remain scarce. Static or long-term average explanatory variables continue to be employed, even when modelling distributions of species that are highly mobile and responsive to short-term conditions, including birds (Williams et al., 2017), mammals (Wieringa et al., 2021) and insects (Kimathi et al., 2020). This is despite the availability of high spatiotemporal resolution, remote-sensed datasets for numerous environmental variables, from which dynamic explanatory variables can be derived. Utilisation of such datasets is rare probably due to a combination of the more ready-availability of static datasets, the associated analytical packages available, and the perceived added computational overheads of dynamic modelling with high resolution datasets.
Moreover, these barriers are likely to be exacerbated by existing SDM tools that are not optimised for incorporating temporally dynamic explanatory variables. Many SDM functions in R packages lack functionality for explicit consideration of the temporal dimension at key modelling stages. For instance, when modelling with temporally dynamic explanatory variables, temporal biases in occurrence datasets could over-or under-represent conditions at a given time and impact SDM performance. However, existing R package functions typically only account for spatial biases, offering tools to spatially thin records (Aiello-Lammens et al., 2015) or generate spatial buffered background points (Thuiller et al., 2016). Without appropriate tools, our ability to generate spatiotemporally dynamic SDMs is limited. The Spatiotemporal Observation Annotation Tool's rstoat package (https://mol.org) can extract spatiotemporally buffered data for species occurrence records, but lacks the functions for developing and projecting dynamic SDMs with these data. Here, we present dynamicsdm, an R package that includes user-friendly and flexible functions for extracting and incorporating dynamic explanatory variables into species distribution models and projecting distribution patterns at high spatiotemporal resolution.

| PACK AG E OVERVIE W
The main features of dynamicSdm functions are: • Dynamism: Fill gaps in existing static SDM package tools to account for both spatial and temporal dimensions at key modelling stages.
• Flexibility: Function arguments are highly flexible to target methods to the species and environment of interest.
• Computer-friendly: Explanatory variable extraction functions utilise Google Earth Engine and Google Drive to minimise computing power and storage demands.
• Compatibility: Function inputs and outputs can be used interchangeably with other SDM packages.
The dynamicsdm package workflow is presented in Figure 1 and the included functions are detailed in the following section. Table 1 outlines the novel functionality of dynamicsdm functions for generating high spatiotemporal resolution SDMs compared to existing SDM R packages.

| Clean and filter species occurrence data
We provide three functions that encompass the spatial and temporal dimension when checking that species occurrence records match the study's scope and quality requirements. First, spatiotemp_check checks the formatting, completeness and validity of record coordinates and dates, with optional use of package coordinatecleaner for additional spatial checks (Zizka et al., 2019). Second, spatiotemp_ extent excludes records outside a given spatial and temporal extent, typically dictated by the study's scope or the coverage of environmental datasets. Third, spatiotemp_resolution filters records by a specified spatial and temporal resolution (e.g. dates must be given to daily resolution). As dynamicsdm functions require occurrence data in a standardised format, convert_gbif transforms records from the Global Biodiversity Information Facility (GBIF, https://www.gbif.org/) into this format.

| Assess and account for spatial and temporal biases
spatiotemp_bias assesses spatial and temporal biases in occurrence records, which are prevalent due to various factors including temporal variation in sampling effort or species detectability. For temporal biases, spatiotemp_bias returns a histogram plot of record frequency over time, and performs a chi-squared test for significant difference between the observed temporal distribution of records compared to random. For spatial biases, spatiotemp_bias returns a scatter plot of record co-ordinates for visual assessment of clustering and performs a t-test for significant difference between observed nearest neighbour distance and that of random points simulated at the same density. Users can limit spatiotemp_bias to a specific area. This may improve the reliability of bias assessments for range-shifting species where uneven record distribution at range peripheries may be underpinned by ecological process and not sampling bias.
Multiple correction methods exist for spatial bias, including spatial thinning of records (Aiello-Lammens et al., 2015) or weighting by sampling effort (Stolar & Nielsen, 2015). We adapted these methods for temporal biases. spatiotemp_thin temporally thins occurrence records by removing records within a temporal distance of each other. Temporal distance between records can be measured by two methods: absolute number of days, or days apart within the annual cycle. For instance, temporally thinning two records with dates "2010-01-01" and "2015-01-04" by 10 days through the "absolute" method would retain both records, but the "annual cycle" method would remove one. To prevent spatially distant but temporally close records from being excluded, only records within a set grid cell size are temporally thinned. Then, spatiotemp_thin thins remaining records by a spatial distance using spthin package functions (Aiello-Lammens et al., 2015). Alternatively, we include the spatiotemp_weights function that calculates total sampling events (input data) across a spatial and temporal buffer from each occurrence record.

| Generate pseudo-absences through space and time
Paucity of species absence records may necessitate the generation of pseudo-absence records or "background" points, which are inferred absences based upon known presence sites. We provide spatiotemp_pseudoabs that generates pseudo-absence co-ordinates and dates, either randomly within a spatiotemporal extent or buffer from occurrence records. Following pseudo-absence spatial buffering theory (VanDerWal et al., 2009), temporal buffering could reveal more fine-scale temporal drivers because presence-absence comparisons are at short time scales. Moreover, buffered pseudoabsences may be more suitable because randomly generated pseudo-absences have been shown to inflate SDM performance metrics (Acevedo et al., 2012).

| Extract spatiotemporally dynamic explanatory variables
Within dynamicSdm there are two functions for extracting dynamic explanatory variables using Google Earth Engine (GEE). GEE is a cloud platform developed for remotely processing remote sensing datasets ( Figure 2). These functions require the installation of the GEE API R package rgee (Aybar et al., 2020) and an associated Google Drive account. By utilising GEE and Google Drive, we minimise computing power and storage demands.
extract_dynamic_coords extracts temporally dynamic explanatory variables. Users must input species occurrence records and F I G U R E 1 Overview of dynamicsdm package functions across species distribution modelling (SDM) stages. (1) Response data; functions for filtering species occurrence records by spatiotemporal quality, extent and resolution, generating pseudo-absence dates and co-ordinates, and exploring spatiotemporal biases in response data.
(2) Explanatory data; functions for extracting spatiotemporally buffered explanatory variables using Google Earth Engine. (3) Model relationships; functions for fitting SDMs whilst accounting for spatial and temporal autocorrelation. (4) Dynamic projections; functions for generating high resolution projection covariates and projecting dynamic species distribution and abundance patterns.
TA B L E 1 Limitations of existing species distribution modelling R package functions for generating spatiotemporally dynamic SDMs and dynamicSdm functions to overcome these.  (Hijmans et al., 2015) to standardise the calculation and improve projection raster generation time. If the traditional approach of a circular buffer from record co-ordinates is used, then generating high resolution projection rasters can be computationally heavy. The "moving window" matrix size will vary with radial distance of interest and the resolution of environmental data, so we include get_moving_ window that calculates an optimal matrix by balancing these factors.

| Model fitting
To model the relationships between species occurrence and dynamic conditions, we include brt_fit for fitting Boosted Regression Tree models using the gbm.fit algorithm from gbm R package (Greenwell et al., 2019). brt_fit takes optional arguments, including for fitting jack-knife models to spatiotemporal blocks and weighting records by spatiotemporal sampling effort. We include brt_fit only for completeness as an SDM package. We suggest dynamicSdm could be easily integrated with alternative modelling approaches because pre-modelling functions generate a simple response and explanatory dataset, and post-modelling functions accept various model types.

| Extract dynamic projection covariates
The dynamic_proj_dates function generates dates at given intervals within an extent. Then extract_dynamic_raster and extract_buffered_ raster functions iterate through each date and extract variable rasters at a given extent and resolution. To minimise storage demands, these rasters are stored on Google Drive. dynamic_proj_covariates stacks rasters for each date and exports either data frames or raster stacks.

| Project and visualise spatiotemporal patterns
The dynamic_proj function projects species distribution and abundance models onto each covariate data frame or stack and exports projection rasters. To visualise spatiotemporal patterns, we include dynamic_proj_GIF that combines projections into an animated GIF.

| E X AMPLE-THE RED -B ILLED Q UELE A QUELE A QUELE A
We illustrate the application of dynamicSdm to model the highly variable intra-and inter-annual distribution and abundance of the granivorous and nomadic weaver bird, the red-billed quelea Quelea quelea; a major pest of small grain cereals in sub-Saharan Africa.

| Response data
Species occurrence and abundance data were collated from GBIF (Gbif Occurrence Download, 2021) and pest control organisations (Table S1). Occurrence data were filtered to exclude records containing anomalous or missing values using spatiotemp_check. Using spatiotemp_resolution, records not given to a spatial resolution of four decimal places or a temporal resolution that was >1-day were excluded due to being of an inadequate resolution to study intraannual, local-scale distribution patterns. Then spatiotemp_extent filtered occurrence data to 2001-2017, the temporal extent of explanatory datasets, and the spatial extent of southern Africa to focus solely on the subspecies Q. quelea lathamii (Figure 3). Paucity of absence records necessitated pseudo-absence generation using spatiotemp_pseudoabs, which randomly generated pseudo-absence co-ordinates within a 250-500 km spatial buffer and dates within a 6-12 week buffer. Buffer sizes were informed by quelea movement capabilities (Elliott, 1990) and typical rates of change in their habitat (Cheke et al., 2007). Spatial and temporal biases in occurrence records were detected by spatiotemp_bias. Therefore, spatiotemp_weights

| Explanatory data
For each record, explanatory variables were extracted using ex-tract_dynamic_coord and extract_buffered_coord, which represented short-term (8-week) and longer-term (52-week) weather conditions, F I G U R E 3 Spatiotemporal distribution of red-billed quelea Quelea quelea occurrence records a) before filtering (N = 342,434) and b) after filtering (N = 66,740) using dynamicSdm response data functions. Purple points represent spatiotemporally buffered pseudo-absence records (N = 66,740) generated with function spatiotemp_pseudoabs. and the availability of resources governing quelea distribution and abundance (Ward, 1971; Table 3). Spatial buffer size was generated by get_moving_window using quelea's 100 km dispersal radius to access resources (Elliott, 1990

| Dynamic projections
dynamic_proj_dates generated dates at monthly intervals for a 5 year period (2013)(2014)(2015)(2016)(2017). Rasters for each variable were extracted to Google Drive using extract_dynamic_raster and extract_buffered_raster TA B L E 3 Dynamic explanatory variables in red-billed quelea Quelea quelea species distribution and abundance models. Accessible area is a "moving window" matrix matching quelea's 100 km dispersal radius generated by dynamicSdm function get_moving_window. and then combined into data frames by dynamic_proj_covariates.
Using dynamic_proj a binary distribution was projected onto covariates and then projected abundance was stacked onto positive occurrence cells (Figure 4, Supplementary Materials 4). The maximum projected abundance across life-cycle stages was taken to produce a single projection for each month. dynamic_proj_GIF generated a GIF to visualise the intra-and inter-annual distribution suitability and abundance patterns of quelea ( Supplementary Figures S1 and S2).
For comparison, we also fitted and projected models using long-term average ecoclimatic variables (Supplementary Materials 5).

| CON CLUS ION
Overall, dynamicSdm provides users with flexible and easily applied tools to model the dynamic distributions and abundances of species worldwide and advance SDM-based research.

CO N FLI C T O F I NTE R E S T S TATE M E NT
The authors declare no conflict of interests.

PE E R R E V I E W
The peer review history for this article is available at https:// w w w.w e b o f s c i e n c e . c o m /a p i /g a t e w a y/ w o s /p e e r-r e v i e w/10.1111/2041-210X.14101.

DATA AVA I L A B I L I T Y S TAT E M E N T
Data and code from this manuscript's case studies are archived through Zenodo at https://doi.org/10.5281/zenodo.7670393 (Dobson et al., 2023). the colour gradient represents projected quelea abundance (number of individuals, log 10 scale) in these cells.