Target‐group backgrounds prove effective at correcting sampling bias in Maxent models

Accounting for sampling bias is the greatest challenge facing presence‐only and presence‐background species distribution models; no matter what type of model is chosen, using biased data will mask the true relationship between occurrences and environmental predictors. To address this issue, we review four established bias correction techniques, using empirical occurrences with known sampling effort, and virtual species with known distributions.


| INTRODUCTION
Understanding macro-geographic processes is becoming ever more important in the challenges facing biodiversity conservation. The threats of climate change, species invasions and continued habitat degradation mean we need accurate methods for understanding distributions and their responses to changing scenarios (Jeschke & Strayer, 2008; Pimentel et al., 2001). Species distribution modelling (SDM) aims to relate the occurrence records of a species to abiotic variables such as climate and land use (Phillips, 2004). By effectively describing the environmental niche that a species occupies, models can be used to predict geographic distributions (Soberón & Peterson, 2005).
Many SDMs used today utilize presence-only data, most notably using Maximum Entropy statistical models (Phillips, 2004; Phillips et al., 2006). Because it is simple to use and performs well with small sample sizes, Maxent has driven the use of SDMs in conservation. An important assumption of modelling with presence-only data is that the modelled region has been sampled randomly, with environmental conditions represented in terms of their availability (Phillips et al., 2009); due to the haphazard sampling of museum records and citizen science databases, this is rarely the case (Newbold, 2010; Ponder et al., 2001). Often the locations that are sampled the most are in the most accessible areas, in protected areas, or where the greatest number of species can be observed (Kadmon et al., 2004; Reddy & Dávalos, 2003). This in turn can cause distribution models to resemble sampling effort more closely than the true distribution, complicating interpretation (Elith et al., 2011; Soberón & Nakamura, 2009). This phenomenon is typically referred to as sampling bias.
While sampling bias may be a result of geographic features, models are typically fitted in environmental space, and so sampling bias becomes a problem if the highly sampled regions correlate with specific climatic conditions (Phillips et al., 2009). This may be true in many instances, such as the proximity of road networks to ridge lines, or the association between cities and rivers. In these cases, the realized niche of a species is often disguised by the climatic variables of over-represented areas. Despite this, sampling bias is still often unaccounted for in distribution modelling (Yackulic et al., 2013).
There are two main methods commonly used to correct for sampling bias. The first is to manipulate the selection of background points used to determine environmental variation so that these share the same bias; this in effect increases the contribution of environmental variation from highly sampled areas (Ponder et al., 2001).
The second option is to remove presences from highly clustered areas, known as spatial filtering (Boria et al., 2014;Veloz, 2009).
Both methods have been shown to reduce the effect of sampling bias. We chose to focus on background manipulation rather than spatial filtering for two reasons. First, occurrences may be clustered for ecological reasons such as population structure (Dormann et al., 2007), and so removing records from highly clustered areas may disguise the true patterns; second, presence records may often be sparse, and thus removing records can reduce the sample size unacceptably.
Previous evaluations of methods to account for sampling bias either have used simulated datasets with virtual species (e.g. Barbet-Massin et al., 2012; Moua et al., 2020; Ranc et al., 2017; Stolar & Nielsen, 2015), estimations or simulations of sampling bias (e.g. Fourcade et al., 2014; Kramer-Schadt et al., 2013), or presence-only methods using presence-absence data (Fithian et al., 2015; Syfert et al., 2013). By having complete knowledge of the underlying population distribution, evaluating correction methods can be done confidently. However, artificially generated sampling bias can be simplistic and is often represented as gradients of sampling intensity across modelled regions. This does not reflect the complex patterns of bias observed in real life.
In contrast to the above studies, we have a dataset of field-collected data where the sampling bias is known. We compare bias correction methods here using the presence-only data of the Hoverfly Recording Scheme (Ball & Morris, 2012), which has a complete record of sampling effort for each 1-km square of the UK. The data come from a long-standing National Recording Scheme, with records contributed by skilled professionals, amateurs and increasingly by citizen scientists; the data therefore capture the true complexity of bias in sampling, affected by all manner of factors, such as accessibility and the site preferences of surveyors.
We used both empirical and virtual species to test three methods of bias correction via background manipulation: the use of records of similar taxa (the 'target-group' method: Phillips et al., 2009;Ponder et al., 2001), restricting the available background to a specified distance from records (Anderson & Raza, 2010;Phillips, 2008) and the use of potential covariates of sampling effort, such as human population density or the positions of roads (Dubos et al., 2021;El-Gabbas & Dormann, 2018;Monsarrat et al., 2019). We used the explicit knowledge of sampling effort to factor out bias in real distributions as a standard for comparison, and to create biased and unbiased occurrences for virtual species. To the best of our knowledge, this is the first attempt to evaluate bias correction techniques using real data with explicit knowledge of sampling effort and therefore presents a unique opportunity to test them against the complexity of real-life sampling bias.

| Empirical data
Our data came from the Hoverfly Recording Scheme (HRS) (Ball & Morris, 2012), which now contains more than 1 million records. The scheme aims to collate information on the distribution and ecology of hoverflies (Diptera, Syrphidae) throughout the UK. We chose to model 58 individual species with over 1000 records, spanning 1983-2002. The use of higher numbers of records leads to more accurate models (Kadmon et al., 2003;Wisz et al., 2008), and by using multiple species, we aimed to reduce the effect of species-specific relationships with sampling bias (Fourcade et al., 2014;Stolar & Nielsen, 2015;Warton et al., 2013). The exact number of occurrences for each species is presented in Table 1.
Climatic variables for modelling were sourced from the WorldClim database (Version 2.1) (Fick & Hijmans, 2017); all 19 bioclim variables were initially considered, alongside elevation. Land use data were also incorporated, using the Land Cover Map 2000 (1-km dominant target class), as produced by the Centre for Ecology and Hydrology (Fuller et al., 2002). Target classes for each 1-km raster cell show 25 land use types representing different habitats throughout the UK. We assessed predictor variables for collinearity using the R package virtualspecies (Leroy et al., 2015), selecting nine independent variables with a Spearman's rank correlation less than 0.7 (Appendix S1: Figure S1.1) (Braunisch et al., 2013). Information on the environmental predictors is available in Appendix S1: Table S1.1. All processing of environmental variables and occurrences was coded in R version 4.0.5 (R Core Team, 2021) and projected in the OSGB 1936 coordinate system.
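The collinearity screen can be illustrated with a minimal Python sketch. It is a simple greedy filter and not the procedure implemented in virtualspecies; the variable names and values are toy placeholders, and only the 0.7 Spearman threshold is taken from the text.

```python
import math

def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def drop_collinear(variables, threshold=0.7):
    """Keep a variable only if |rho| < threshold against every variable kept so far."""
    kept = []
    for name, values in variables.items():
        if all(abs(spearman(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy cell values for three hypothetical predictors.
toy = {
    "bio1": [1, 2, 3, 4, 5],
    "elevation": [10, 8, 6, 4, 1],   # perfectly rank-correlated with bio1
    "bio12": [3, 1, 4, 1, 5],
}
selected = drop_collinear(toy)
```

Because the filter is greedy, the result depends on the order in which variables are offered; a clustering-based approach avoids that dependence at the cost of more code.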

| Maxent settings
All species distribution modelling was performed using Maxent (version 3.4.1) in the R package dismo. Because we modelled multiple virtual and empirical species across different methods, we chose to use Maxent's default regularization and feature settings to maintain consistency. Tuning individual models can improve their predictive ability (Merow et al., 2013; Radosavljevic & Anderson, 2014), including their response to sampling bias (Anderson & Gonzalez, 2011), which may in our case mask the effect of each bias correction method. However, for practical application we recommend that Maxent users should always carefully consider model settings and visually assess models for biological realism (Fourcade et al., 2018).

| Maxent background
Maxent uniformly samples 10,000 background points to estimate environmental variation across the area of interest, which is used to determine the habitat preference of species. When sampling is concentrated in small areas, most of the environmental variation will come from unsurveyed regions, contributing to the effect of sampling bias (Barve et al., 2011). Because many inaccessible regions have extreme environmental values, the consequence is to skew environmental gradients, leading to overestimation where presences occur (Lobo et al., 2010). To prevent this, background point selection can be influenced using a probability surface that mimics sampling bias, so that highly sampled areas receive a greater number of background points.
For all correction techniques, we sampled 10,000 background points without replacement, using the function randomPoints from the R package dismo. For models with no bias correction, we used a uniform background (Figure 1b), which is the default setting for Maxent. To effectively counteract bias, we created a background based on the sampling effort of the Hoverfly Recording Scheme, defined as the number of visits to each 1-km square.
To create a background, these data were first converted into a continuous surface (Elith et al., 2010). This was achieved using 2D kernel density estimation in the R package ks (Duong, 2020), with cells weighted by the number of visits over the 20-year period. Resulting rasters were projected to the same resolution, area and coordinate system as environmental variables (Figure 1a).
Background points for each species were then sampled using the probability surface and were supplied to Maxent for use when modelling.
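As a rough illustration of drawing background points in proportion to a probability surface, the Python sketch below samples distinct cells without replacement. It is not the dismo implementation: the function name, the toy weights and the sampling scheme (Efraimidis-Spirakis keys) are assumptions chosen for brevity.

```python
import math
import random

def sample_background(weights, n_points, seed=0):
    """Draw n_points distinct cell indices with probability proportional to weight.

    Each cell with positive weight w receives the key log(u) / w, where u is
    uniform on (0, 1]; the cells with the largest keys are selected. Cells with
    zero weight (e.g. never visited) are never selected.
    """
    rng = random.Random(seed)
    keyed = []
    for idx, w in enumerate(weights):
        if w > 0:
            u = 1.0 - rng.random()  # in (0, 1], so log(u) is defined
            keyed.append((math.log(u) / w, idx))
    if n_points > len(keyed):
        raise ValueError("more points requested than cells with positive weight")
    keyed.sort(reverse=True)
    return [idx for _, idx in keyed[:n_points]]

# Toy surface of six cells; cell 3 has by far the highest sampling effort.
effort_surface = [0, 1, 1, 8, 0, 2]
background_cells = sample_background(effort_surface, 3, seed=1)
```

With a uniform surface (all weights equal) the same function reduces to simple random sampling of cells, which corresponds to the no-correction case.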

| Target-group background
The target-group approach to sampling bias is the most prevalent in the literature and relies on using the collective records of similar taxa to estimate sampling effort for the focal species (Phillips et al., 2009;Ponder et al., 2001). The reasoning is that surveyors will display the same bias when sampling similar species, and so a larger dataset will contain information on general sampling effort for that taxon. It is important to carefully decide the scope of the target-group, because of changes in locations and seasons, the varying detectability among species, and because observers need to use the same methodology (Phillips et al., 2009;Ponder et al., 2001;Yackulic et al., 2013).
Because of this, our aim was to test if target-group methods accurately reflect the distribution of sampling effort, when considering varying detectability and underlying patterns of species richness.
To generate a target-group background, we used the collective records of all the species of hoverfly within the HRS, following established practice (see Elith et al., 2010; Phillips et al., 2009). First, we removed duplicate records for each species, so that each cell value represented the total number of species sampled. We then followed the same processing steps as when generating the sample effort background, using a 2D kernel density estimation to convert single points into a continuous probability surface that could be used to weight background point selection (Figure 1c).
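The de-duplication step can be sketched in Python as follows. The record format (species name paired with a grid-cell identifier) and the second species label are hypothetical; the kernel density smoothing that follows in the workflow is not shown.

```python
from collections import defaultdict

def species_richness_per_cell(records):
    """Count distinct species recorded in each cell.

    records: iterable of (species, cell_id) pairs. Repeat records of the same
    species in the same cell are counted once.
    """
    seen = defaultdict(set)
    for species, cell in records:
        seen[cell].add(species)
    return {cell: len(species_set) for cell, species_set in seen.items()}

# Toy records: cell ids are (column, row) pairs, an assumed convention.
toy_records = [
    ("Baccha elongata", (0, 0)),
    ("Baccha elongata", (0, 0)),   # duplicate, ignored
    ("species B", (0, 0)),
    ("species B", (1, 0)),
]
richness = species_richness_per_cell(toy_records)
```

The resulting per-cell counts are the weights that would then be smoothed into a continuous surface.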

| Radius-restricted background
An alternative method of bias correction is to restrict the area that Maxent uses to determine environmental variation, normally to a pre-specified radius from each record (Anderson & Raza, 2010; Phillips, 2008). This means that the model does not draw background points from further away, focussing instead on available habitat close to sampled locations. This has been shown to improve model predictions (Acevedo et al., 2012; Anderson & Raza, 2010; Barve et al., 2011), but care must be taken to avoid a resulting lack of model accuracy when too few background points are used (Thuiller et al., 2004; VanDerWal et al., 2009).
Similar to choosing minimum distances for spatial filtering, the choice of radius size can be subjective (Aiello-Lammens et al., 2015), which can pose a significant dilemma to conservation practitioners.
We selected 10 km as a radius with the potential to correct for sampling bias without excessively reducing the available background, but acknowledge that our choice was also subjective. It may have been possible to assess multiple distances, and select the best for each species, but this information is typically unavailable to practitioners, and we wanted our assessment to accurately reflect real-world scenarios. Within the buffer region, each grid cell was equally likely to be selected as a background point. The background was created using the package raster (Hijmans, 2020) (Figure 1d).
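A minimal Python sketch of the buffer logic is given below. It assumes planar coordinates in metres (as in a projected grid) and a brute-force distance check; it is not the raster-based workflow, and the function name and toy coordinates are illustrative only.

```python
import math

def cells_within_radius(cell_centres, records, radius_m=10_000.0):
    """Return the cell centres lying within radius_m of at least one record.

    cell_centres, records: lists of (x, y) in metres. Every returned cell is
    then equally eligible as a background point.
    """
    kept = []
    for cx, cy in cell_centres:
        if any(math.hypot(cx - rx, cy - ry) <= radius_m for rx, ry in records):
            kept.append((cx, cy))
    return kept

# Toy example: one record at the origin, three candidate 1-km cell centres.
centres = [(500.0, 500.0), (9500.0, 500.0), (20500.0, 500.0)]
eligible = cells_within_radius(centres, [(0.0, 0.0)])
```

The check is quadratic in the number of cells and records, which is acceptable for a sketch but would need a spatial index at national extent.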

| Covariates of sampling effort
In the case when a focal species lacks appropriate target-group data or has too few presence points for background restriction to be viable, another option is the use of potential covariates of sample effort. Maps of human population density and road networks can provide an alternative means of estimating sampling effort, and can be incorporated into SDMs (Guerra et al., 2013;Kadmon et al., 2004;Monsarrat et al., 2019). While accessibility maps may work well in some scenarios (Dubos et al., 2021), there is currently no review of their effectiveness against other bias correction methods.
Human population density was sourced from the Socioeconomic
Travel time from the major cities was used as an alternative proxy for sampling effort. This is based on similar assumptions of sampling bias being shaped by population density, but also including accessibility via road and river networks. A gridded surface of accessibility to cities with more than 50,000 people in the year 2000 was created by the European Commission (Nelson, 2008), defining accessibility as travel time to a location using land or water. As with population density, using travel time in its raw format led to similar overcompensation, and hence, values were log-transformed, multiplied by −1 (to make remote locations less likely to be sampled) and shifted to positive values. The resulting spatial pattern of accessibility is shown in Figure 1f.
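The travel-time transformation can be sketched in Python as below. The text gives only the order of operations (logarithm, sign flip, shift to positive values); the natural logarithm, the offset of 1 added before taking logs, and the small constant used in the shift are all assumptions made so that the example runs.

```python
import math

def accessibility_weights(travel_times, offset=1.0, eps=1e-9):
    """Turn travel times into positive sampling weights.

    Steps: log-transform (offset avoids log(0), an assumption), multiply by -1
    so that remote cells score lower, then shift so every weight is positive.
    """
    neg_log = [-math.log(t + offset) for t in travel_times]
    shift = -min(neg_log) + eps
    return [value + shift for value in neg_log]

# Toy travel times in minutes: a city cell, a suburban cell, a remote cell.
weights = accessibility_weights([0.0, 60.0, 600.0])
```

The weights decrease monotonically with travel time, so the most remote cell receives the lowest (but still positive) chance of contributing a background point.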

| Virtual species
To validate the results of bias correction on empirical species, we also modelled virtual species where the true distribution is known.
We first created 50 random species using the generateRandomSp function from the virtualspecies R package (Leroy et al., 2015), which used the same predictor variables to create unique habitat suitability maps (Appendix S1: Table S1.1). The habitat suitability of each random species was converted to a probability of occurrence and binary presence-absence raster using the logistic conversion method of virtualspecies. We generated threshold values by allowing beta to vary randomly between 0.3 and 0.7, while keeping alpha constant at −0.1 and setting niche breadth to 'wide'. This generated a random species prevalence for each virtual species, so bias corrections were tested over a wide range of possible scenarios. The exact beta and prevalence values for each virtual species are available with generated distributions in Appendix S3.
To create a set of biased and unbiased occurrences, we sampled each virtual species using two sampling regimes. For our unbiased dataset, we simply sampled each presence-absence raster randomly, taking 1000 presence points. To create a biased dataset, we used the sample effort from the Hoverfly Recording Scheme (Figure 1a) to weight our sampling regime. This allowed us to test models with a realistic sampling bias pattern. To test bias correction, we used the same backgrounds generated with empirical species for target-group, population density and travel time (Figure 1). For distance-restricted backgrounds, we created a unique background for each virtual species, using the same 10-km radius.
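The logistic conversion can be illustrated with a short Python sketch. The functional form, 1 / (1 + exp((x − beta) / alpha)), is our understanding of the conversion used by virtualspecies and should be treated as an assumption; the function names and toy suitability values are hypothetical.

```python
import math
import random

def logistic_probability(suitability, beta, alpha=-0.1):
    """Probability of occurrence for a suitability value in [0, 1].

    With a negative alpha the curve increases with suitability and passes
    through 0.5 at suitability == beta.
    """
    return 1.0 / (1.0 + math.exp((suitability - beta) / alpha))

def to_presence_absence(suitabilities, beta, alpha=-0.1, seed=0):
    """Draw a 0/1 presence-absence value per cell from its occurrence probability."""
    rng = random.Random(seed)
    return [
        1 if rng.random() < logistic_probability(s, beta, alpha) else 0
        for s in suitabilities
    ]

# Toy suitability values for five cells, with beta drawn from the stated range.
beta = random.Random(42).uniform(0.3, 0.7)
presence_absence = to_presence_absence([0.05, 0.3, 0.5, 0.7, 0.95], beta)
```

Lower values of beta shift the curve left, so more cells are occupied and prevalence rises, which is how varying beta yields a range of prevalences.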

| Statistical analysis
As with colour patterns, comparing maps is a difficult challenge (Fourcade et al., 2014;Warren et al., 2008); to test for bias removal, most analyses compare the ability to predict occurrences, a method that is counter-intuitive when the occurrences contain the sampling bias that we hope to remove. When using discrimination metrics such as area under the receiver-operating curve (AUC), the Kappa statistic and the true skill statistic, biased models can even perform better than their bias-corrected counterparts (Fourcade et al., 2014;Leroy et al., 2018;Veloz, 2009).
For this reason, we chose to rely on similarity metrics between reference distributions and different bias corrections for both empirical and virtual species. For empirical data, these were species-specific distributions created using known sampling effort to factor out sampling bias (Figure 2a). Because each sample effort corrected distribution is only a best estimate at correcting bias, we repeated the same comparisons for virtual data, by testing the similarity between each correction and a model built using unbiased occurrences (Figure 3a). This meant we could confidently assess over- and underprediction rates. Before assessing corrections, we validated the effectiveness of background manipulation to remove bias, using virtual data. We found that sample effort corrected models performed as well as unbiased models (Appendix S2), and therefore, we were confident that our sample effort distributions were an acceptable standard for comparison.
Distributions were compared by looking at three indices (niche overlap, centroid shift, and range size changes). Schoener's D (Distance) (Schoener, 1968) was chosen as a measure of niche overlap because it has been suggested as the best measure of similarity between niches in reviews (Rödder & Engler, 2011;Warren et al., 2008) and has been an effective metric in studies of sampling bias (Fourcade et al., 2014;Ranc et al., 2017;Stolar & Nielsen, 2015).
Centroid shift was computed as the Euclidean distance from the suitability-weighted centroid of the sample effort corrected distributions. Centroids were calculated using the R package spatialEco (Evans, 2020), and Schoener's Distance using the R package ENMTools (Warren et al., 2019).
Range size changes were calculated using the R package biomod2
severely under-predicted suitability compared to the reference, thresholds would be typically around 0.8-0.9 to minimize losses, even if the two maps were vastly different in reality.
After generating comparison metrics, results were grouped by the method of bias correction, and by species, to account for species-specific effects. A Friedman test was performed to test for significant differences among correction methods, and post-hoc Wilcoxon signed-ranks tests were performed on pairwise comparisons applying a Bonferroni adjustment to p-values to account for the multiple tests. We understand that using traditional statistical tests can be inappropriate for virtual species (Meynard et al., 2019); however, we decided to report test statistics alongside empirical species and focus on clear differences in results rather than specific p-values.
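Both metrics are simple to state, and the Python sketch below shows one way to compute them on flattened suitability grids. It uses the standard definition of Schoener's D, 1 − 0.5 Σ|p − q| on suitabilities normalised to sum to one, and a suitability-weighted mean of cell coordinates for the centroid; it is not the code of the packages named above, and all names and toy values are illustrative.

```python
import math

def schoeners_d(a, b):
    """Schoener's D between two suitability vectors over the same cells (0 to 1)."""
    sum_a, sum_b = sum(a), sum(b)
    return 1.0 - 0.5 * sum(abs(x / sum_a - y / sum_b) for x, y in zip(a, b))

def weighted_centroid(coords, weights):
    """Suitability-weighted mean of (x, y) cell coordinates."""
    total = sum(weights)
    x = sum(w * cx for (cx, _), w in zip(coords, weights)) / total
    y = sum(w * cy for (_, cy), w in zip(coords, weights)) / total
    return x, y

def centroid_shift(coords, reference, corrected):
    """Euclidean distance between the centroids of two suitability maps."""
    x0, y0 = weighted_centroid(coords, reference)
    x1, y1 = weighted_centroid(coords, corrected)
    return math.hypot(x1 - x0, y1 - y0)

# Toy example: two cells 10 km apart along the x axis.
coords = [(0.0, 0.0), (10.0, 0.0)]
reference = [1.0, 1.0]
corrected = [1.0, 3.0]
overlap = schoeners_d(reference, corrected)        # 1 - 0.5 * (0.25 + 0.25)
shift = centroid_shift(coords, reference, corrected)
```

A D of 1 indicates identical maps and 0 indicates no overlap, so higher overlap and smaller centroid shift both indicate closer agreement with the reference.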
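The Bonferroni step can be sketched in Python as follows. The method labels are taken from the text, but the set of pairs actually tested and the p-values shown are placeholders, not results.

```python
from itertools import combinations

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

# Labels for the approaches compared; the exact set of pairs is an assumption.
methods = [
    "no correction",
    "target-group",
    "radius-restricted",
    "population density",
    "travel time",
]
pairs = list(combinations(methods, 2))

# Placeholder p-values for three pairwise tests, for illustration only.
adjusted = bonferroni([0.01, 0.04, 0.5])
```

The adjustment is conservative: with many pairwise comparisons, only clear differences remain significant, which suits an emphasis on clear differences rather than specific p-values.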
In addition, we visually inspected the predicted distributions of each species, using expert knowledge to critically assess hoverfly distributions.

| RESULTS
While we used all species for our data analysis, for demonstrative purposes we have shown the first species, Baccha elongata, and the first random species, which should provide a typical representation of how sampling bias changed predicted distributions. The distributions of each empirical and virtual species are available in Appendices S4-S5.

| The effect of sampling bias
Without applying any bias correction, the cells with high habitat suitability are densely clustered in central and southern England for both empirical and virtual species (Figure 2b; Figure 3b). This is highest in and around London, but other cities in central England are also hotspots, closely reflecting the distribution of sampling effort (Figure 1b). In contrast, Scotland has very low levels of predicted habitat suitability across nearly the whole region. This is particularly apparent for virtual species, as unbiased models predict high suitability in northern regions (Figure 3a), but predictions without correction are only concentrated in the south (Figure 3b).
When true sampling effort was used to sample background points, there was a clear and strong effect on the predicted habitat suitability (Figure 2a); distributions no longer had a strong bias for population centres and were instead distributed more evenly. The greatest change was around London and regions directly south, which were predicted to have more isolated areas of highly suitable habitat, rather than one continuous block. Habitat suitability showed moderate increases in Scotland, but these were generally quite small, with the one exception of the western coastline, which showed a much higher predicted suitability in line with previous knowledge of hoverfly habitat preference.

| Similarity metrics
As calculated using Schoener's D, there was a significant difference between distributions created using each bias correction technique when compared against the reference distribution for both empirical and virtual species (Table 2).
There was a significant difference in the magnitude of the centroid shift of each bias correction technique, for both empirical and virtual species (Table 2). Pairwise tests showed there were significant differences between no correction and all other bias correction methods (Figure 4b; Figure 5b). Target-group backgrounds generated the smallest centroid shifts, and no correction generated the largest shifts, with the same pattern for both empirical and virtual species.
There was a significant difference between range gains and losses for both empirical and virtual species (Table 2).

| The effect of sampling bias
There was a clear and substantial effect of sampling bias in generating models (Figure 2; Figure 3). Without correcting for sampling bias, highly suitable areas were concentrated in central England, where the most sampling took place, while much of the rest of Great Britain was considered poor habitat. The same pattern of high suitability was prevalent in virtual species, despite the fact that virtual species had randomly generated habitat suitability that should differ from empirical data, demonstrating that the same sampling bias is present in both datasets.
When true sampling effort was used as part of the modelling process, distributions were typically more evenly spread, with higher habitat suitability in the south and coastal regions. This reflects a realistic distribution of hoverflies in the UK, which are more abundant at lower latitudes and altitudes. Strong biases towards central England were down-weighted, leading to a smaller potential area of occupation in this region. Some areas of Scotland showed high habitat suitability, particularly the western coastline, which matches previous knowledge that hoverflies favour habitats with higher moisture.
Interestingly, for virtual species, the true probability of occurrence is still visible in the biased predictions (Figure 3b; Appendix S5), denoted by slightly lighter areas in the northernmost regions where there should be high predicted occurrence. However, the high suitability in the south caused by sampling bias completely overshadows this pattern, creating distributions that appear vastly different. In general, we were surprised by the stark difference between biased and unbiased models. It may be that the hoverfly recording scheme is particularly biased in sampling; however, conservation practitioners need to be careful that their distributions do not typically map sampling effort as ours did.
When assessing maps using similarity metrics, no-correction maps consistently performed poorly for both empirical and virtual species. Moreover, all metrics displayed a very large range of values,  bias. This suggests that in most scenarios any attempt at bias correction will bring the result closer to the true estimate of sampling effort, and hence a more accurate representation of the true distribution (Phillips et al., 2009; Ranc et al., 2017; Stolar & Nielsen, 2015).

| Target-group correction
Target-group backgrounds aim to use the occurrences of similar taxa to estimate sampling effort and were a highly successful method of counteracting sampling bias. They consistently performed well in terms of all similarity metrics, whether using empirical or virtual data. Furthermore, when visually inspecting maps, target-group corrections were the only method that produced distributions that appeared similar to sample effort corrections in empirical data, and unbiased models for virtual species. Given that visual inspection is often the primary assessment by conservation practitioners, this was perhaps the most important result.
Concerns that differences in detectability and species richness may lead to a dissociation between sampling effort and target-group occurrence records were unfounded. This demonstrates that when sampling effort is unknown, using presence records of similar taxa can produce results almost as good as the use of true sampling effort, and this supports previous reviews of bias correction, which favour target group as an effective method of bias correction (Phillips et al., 2009;Ranc et al., 2017;Syfert et al., 2013).
The only negative of using target-group as a bias correction is that in many species, there were signs of overcompensation in the most under-sampled areas of the distribution, echoing past studies that observe increases in false positives (Syfert et al., 2013). This

| Background restriction
Restricting background points to within a 10-km radius of occurrences produced distributions that were moderately successful at counteracting sampling bias (Figure 2d; Figure 3d; Figure 4). When considering niche overlap, distributions performed significantly better than unbiased models; however, maps often appeared similar to no-correction models during visual inspection. This is supported when looking at range loss metrics, which were similar to no-correction models, especially for empirical species.
While there have been many reviews on the effectiveness of restricting background areas in terms of increasing model accuracy (Acevedo et al., 2012;Anderson & Raza, 2010;Barve et al., 2011), there has been limited focus when correcting sampling bias is the main objective. To the best of our knowledge, there has only been one review by Fourcade et al. (2014). While our models were somewhat successful at correcting bias, they found that restricting the background area in geographical space led to some of the worst performing models. This disparity is also apparent in our results, with 10-km buffers generating the greatest variation in bias correction performance. The main reason for this is that restricting background points seemed to perform gradually worse with denser occurrences. This is because limiting the background more severely runs the risk of reducing model accuracy as species appear to utilize all available environment (Thuiller et al., 2004;VanDerWal et al., 2009) and so should only be done when the occurrence records are extensive enough.
It is also worth noting that the variation in performance between species was partially a result of the subjective choice of radius size. This may be the greatest barrier to using distance-restricted backgrounds, as without explicit knowledge of sampling effort it is not possible to easily assess different radius sizes. A similar method for restricting the background was recently proposed by Vollering et al. (2019), using the spatial autocorrelation of environmental variables to determine radius size. A key difference was that they account for denser occurrences by summing overlapping radii, which they refer to as 'background thickening'.
When assessing the method, they found thickening performed better than target-group methods, but importantly they used individual target-group occurrences rather than a continuous surface of species richness, so a direct comparison of the two methods is still needed to draw firm conclusions.

| Population density and travel time
Using a potential covariate of sampling effort is an appealing concept to simplify distribution modelling, especially when records of similar taxa are lacking, or there are too few points for restricting background points. A recent paper by Monsarrat et al. (2019) showed that accessibility maps were able to predict sampling-effort biases in historical data, and as our results showed an improvement over no correction, this also suggests that accessibility is a factor in sampling bias.
However, while human population density and travel time produced models that were significantly better than no correction, the visual differences were often barely noticeable (Figures 2e-f, 3e-f).
The inclusion of access in the form of travel time slightly improved models, but not to a significant degree, suggesting that a large proportion of accessibility is already accounted for in the distribution of samplers. There was also a large degree of variation between species, similar to previous reviews of accessibility (Dubos et al., 2021). Therefore, we are hesitant to recommend using covariates of sampling effort as a bias correction method when alternatives are available.

| CONCLUSION
Choosing a bias correction method is a difficult choice, but one fundamental to the accuracy of species distribution modelling (Merow et al., 2013;Newbold, 2010). While there are many options available for modelling species, if the data have an inherent bias any distributions generated will always be an inaccurate representation. This becomes an issue of great importance when species distribution modelling is frequently used to plan and manage conservation projects around the world (Loiselle et al., 2003).
We found that using target-group sampling to generate backgrounds was the most effective method of estimating bias when the sampling effort is unknown. In this scenario, every care should be taken to make sure that the target-group taxa demonstrate the same sampling bias, and that sampling methods and detectability are consistent among species. This assumption was easily met when using target-group data from the same recording scheme and so may be particularly useful to other large citizen science projects.
Furthermore, as target-group data are also determined by species richness, it is possible that sampling bias is being counteracted by a bias for species richness, which may not be appropriate for all taxa or situations (El-Gabbas & Dormann, 2018;Warton et al., 2013). Users should therefore always provide strong evidence of any assumptions made when justifying which taxa to include for background samples (Merow et al., 2013).
Throughout this study, the issue of overcompensation and subjectivity became increasingly apparent. This was most obvious when deciding radius size for distance-restricted backgrounds, but also became apparent when creating backgrounds using travel time; initial backgrounds caused such overcompensation that only the most inaccessible areas were considered suitable habitats. This effect was also apparent to a lesser extent when looking at distributions generated by the target-group approach. It is almost certain that some areas received inaccurately inflated habitat suitability scores as a result of reducing biased regions. Modifying population density and travel time backgrounds to reduce their overcompensation produced more realistic maps but introduced a subjective process that may lead users to create distributions that they feel are correct, rather than accurately predicting the underlying pattern of suitability. Recent work has shown that overprediction can be addressed by incorporating dispersal constraints into models (Mendes et al., 2020), which should be used in concert with bias corrections to provide the most accurate distributions.
Advances in technology such as remote sensing and machine learning mean we are supplied with increasing amounts of high-quality data, ideal for modelling species distributions. At the same time, climate change and other human stressors are dramatically increasing our need for sophisticated methods of conservation. As a result, it is becoming more and more important to base our decisions on accurate estimations of species distributions, and paramount that we address the issue of sampling bias. Our results show that sampling bias can have a serious effect on predicted distributions. As we, and others, have demonstrated, sampling bias correction can greatly increase the accuracy of species distribution models, and should be used wherever possible to generate effective tools to aid in conservation.

ACKNOWLEDGEMENTS
We would like to thank the volunteers of the Hoverfly Recording Scheme for supplying us with such useful data for modelling.
We would also like to thank Alicja Witwicka for her very helpful comments.

CONFLICT OF INTEREST
The authors declare that there is no conflict of interest.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/ddi.13442.