Population variability in species can be deduced from opportunistic citizen science records: a case study using British butterflies

Abundance data are the foundation for many ecological and conservation projects, but are only available for a few taxonomic groups. In contrast, distribution records (georeferenced presence records) are more widely available. Here we examine whether year‐to‐year changes in numbers of distribution records, collated over a large spatial scale, can provide a measure of species' population variability, and hence act as a metric of abundance changes. We used 33 British butterfly species to test this possibility, using distribution and abundance data (transect counts) from 1976 to 2012. Comparing across species, we found a strong correlation between mean year‐to‐year changes in total number of distribution records and mean year‐to‐year changes in abundance (N = 33 species; r2 = 0.66). This suggests that annual distribution data can be used to identify species with low versus high population variability. For individual species, there was considerable variation in the strength of relationships between year‐to‐year changes in total number of distribution records and abundance. Between‐year changes in abundance can be identified from distribution records most accurately for species whose populations are most variable (i.e. have high annual variation in numbers of records). We conclude that year‐to‐year changes in distribution records can indicate overall population variability within a taxon, and are a reasonable proxy for year‐to‐year changes in abundance for some types of species. This finding opens up more opportunities to inform ecological and conservation studies about population variability, based on the wealth of citizen science distribution records that are available for other taxa.


Introduction
The long-term monitoring of population dynamics is an important aspect of ecology, and allows examination of factors driving species' abundance trends, such as the effects of weather, habitat (Lemoine et al., 2007), disease (Daszak et al., 2003) and human impacts (Lewis & Vandewoude, 2015). Monitoring abundance trends of species thus helps to identify species at risk, develop conservation strategies to halt population declines (Brown et al., 1995), and identify increasing populations of pests to implement control strategies (Petrovskii et al., 2014). Measuring population variability is essential to explore the influence of environmental factors, such as climatic cycles or food availability, on population dynamics (van Schaik & van Noordwijk, 1985; Lynam et al., 2004). In addition, population variability may be an important determinant of the likelihood that populations will survive in habitat fragments, and variability may indicate the sensitivity of populations to climatic fluctuations (Pimm et al., 1998; Vucetich et al., 2000; Oliver et al., 2012). Nevertheless, collecting abundance data may be time-consuming and expensive, and thus many taxonomic groups lack information on abundance trends and population dynamics. In contrast, many more species have large datasets of distribution records (i.e. unique records of the presence of species at a given location and date). Such data are available for a wide range of taxonomic groups, tend to cover wide areas, span many years, and are often collected as part of 'citizen science' projects (Devictor et al., 2010; Pocock et al., 2015).
It is well-known that there is a positive relationship between species' abundances and distributions (Brown, 1984; Gaston et al., 2000) and very abundant species tend to have larger ranges (Holt et al., 1997). Abundance-distribution relationships are general patterns in ecology, but there are many forms of the relationship (Gaston, 1996), and these relationships are not necessarily linear (Hartley, 1998). In spite of this complexity, strong relationships have been found between distribution and abundance, which are evident over time, large spatial scales and different taxonomic groups (Zuckerberg et al., 2009; Roney et al., 2015). These relationships allow occupancy changes (changes in the likelihood of a species' presence) to be used to estimate population trends (Tempel & Gutiérrez, 2013), broad biodiversity changes to be assessed across multiple taxonomic groups (Oliver et al., 2015a), and long-term trends in the frequency of species' occurrences to be modelled (Pearce & Boyce, 2006). These long-term occurrence trends have been shown to be reasonable proxies for abundance trends for both birds (Kamp et al., 2016) and butterflies (Warren et al., 2001; Oliver et al., 2015a). However, there is little information on the capacity of distribution data to describe other aspects of population dynamics, such as population variability, which is an important factor affecting extinction risk (Inchausti & Halley, 2003; Mace et al., 2008).
A challenge for ecologists is deriving an accurate measure of population variability when standardised abundance estimates are lacking. The positive associations between distribution size and abundance suggest that distribution records could potentially be used in analyses inferring species' population dynamics, by acting as proxies for abundance data. If there are strong and predictable relationships between year-to-year changes in abundance and year-to-year changes in distribution records, then distribution records could provide a useful metric for ecologists to study the factors affecting population variability in a much wider range of taxa than is currently possible.
In this study, we examine the relationships between abundance and distribution to assess whether year-to-year changes in the number of distribution records are strongly related to year-to-year changes in abundance. We study British butterflies because there are long-term and fine-scale data on both distribution and abundance, allowing robust testing of these relationships. We predict that year-to-year changes in abundance will be strongly positively related to year-to-year changes in distribution records, because increasing numbers of individuals would be expected to result in an increased likelihood of a species being recorded. In addition, as a population increases in size, density-dependent dispersal would be expected to result in individuals moving away from areas of high population density, thereby increasing the number of sites where species can be observed (Gaston et al., 2000).
Within this broad topic, we examine three issues. The first is whether it is possible to identify species with higher or lower population variability using distribution data (a between-species comparison). We do this by calculating average between-year changes in the numbers of distribution records over time, and comparing these estimates with measures of variability that are based on fixed-transect population count data. Secondly, we assess whether distribution records can be used as proxies for inter-annual changes in abundance in each species separately (a within-species analysis). Finally, we identify the characteristics of species for which distribution data provide a proxy for abundance, concentrating on three attributes that can be deduced from the distribution records themselves (i.e. not requiring additional ecological or population dynamic data, which are lacking for many taxa). We selected these metrics because they are likely to be linked to our statistical capacity to detect year-to-year variation in abundance from distribution records: (i) the total number of distribution records for a species, (ii) how aggregated these records are in space (using a metric of 'fractal dimension' of distribution records) and (iii) the average size of the year-to-year changes in distribution records (i.e. how much annual variation there is in distribution records for a species). We refer to these metrics as 'biogeographical attributes', but recognise that they are also influenced by variation in recording intensity across species and over time. We also examine the effect of the spatial scale of the study area on the relationship between year-to-year changes in distribution records and year-to-year changes in abundance, by comparing data analysed at national (UK study area, 302,800 km²) and regional (county study area, 440 km²) levels, given that population fluctuations may be synchronous in their dynamics at one spatial scale but not others (Sutcliffe et al., 1996).

Study species
We studied 33 species of British butterfly (see Table 1), including northern and southern species, and resident and migrant species, over the period 1976 to 2012. This study period was selected to maximise the geographical coverage of data, the length of the time-series of data analysed and the number of species analysed. We excluded species without 37 years of abundance and distribution data. Species that were subject to targeted, intensive surveying effort during certain years of the study period were also excluded (Hesperia comma, Thomas & Jones, 1993; Boloria euphrosyne, Brereton, 1998; and Satyrium w-album, Thomas, 2010), because large differences in the level of recording effort between years could bias results.

Distribution records
We computed year-to-year changes in distribution records based on data collected by volunteers for the Butterflies for the New Millennium (BNM) recording scheme, surveying sites in the study area (see below) on an opportunistic basis using unstructured sampling (Fox et al., 2015). A distribution record is an observation (recorded presence) of an individual species at a location on a particular date. Recording efforts are generally unstructured (there are no fixed or assigned times, places or methods for recording) and opportunistic, with little to no guidance given to recorders as to how, when and where to record, meaning that recording is influenced heavily by recorder behaviour (Boakes et al., 2010; Isaac & Pocock, 2015). Recorder behaviour can vary due to encouragement to record in under-represented regions for the purposes of atlas creation or other targeted survey efforts. Despite this encouragement, spatial and temporal variation in opportunistic recording effort remains high. Due to increased recruitment of recorders over time, numbers of distribution records have increased (see Figure S2), which is why we de-trended the data prior to analysis.
Table 1. Latin names with an asterisk (*) indicate migratory species. Presented are the Pearson's r² values of the relationship between year-to-year log10 change in abundance and year-to-year log10 change in total number of distribution records. We checked r s values and found them to all be positive, indicating that the direction of the relationships was always positive. Biogeographical attribute values are also included for each species: total number of distribution records (ΣD), mean absolute year-to-year change in log10 distribution records, and fractal dimension (Fractal D).
The spatial and temporal resolution of BNM distribution records varies; we excluded records with spatial resolution coarser than a 10 km × 10 km grid square or with date ranges spanning more than 1 year. The study area was the UK, Isle of Man and Channel Islands (3028 hectads in total). We analysed a total of 5,873,182 distribution records from 1976 to 2012, after all filtering processes (see below). The majority of distribution records are independent of abundance data (UK Butterfly Monitoring Scheme (UKBMS) transect), but the distribution dataset did contain some records sourced from transects. Therefore, distribution records were excluded if they occurred within the 1 km grid cell that contained a UKBMS transect (based on the centroid of the digitised transect route). This led to 1604 1 km cells being excluded, accounting for approximately 5.3% of the UK land area and 26.2% (2,089,886) of records.
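The filtering rules described above can be sketched in Python. This is a minimal illustration and not the authors' pipeline: the `Record` fields, the function names, and the choice to assign each record to a 1 km cell by its reference coordinate are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical layout for one distribution record."""
    species: str
    year: int
    easting_m: int          # grid easting of the record's reference point (m)
    northing_m: int         # grid northing of the record's reference point (m)
    resolution_m: int       # side length of the grid square the record was reported at
    date_span_years: float  # length of the date range attached to the record


def one_km_cell(easting_m, northing_m):
    """Index of the 1 km grid cell containing a coordinate (assumed convention)."""
    return (easting_m // 1000, northing_m // 1000)


def filter_records(records, transect_cells,
                   max_resolution_m=10_000, max_date_span_years=1.0):
    """Keep records that pass the three exclusion rules stated in the text."""
    kept = []
    for rec in records:
        if rec.resolution_m > max_resolution_m:
            continue  # coarser than a 10 km x 10 km square
        if rec.date_span_years > max_date_span_years:
            continue  # date range spans more than one year
        if one_km_cell(rec.easting_m, rec.northing_m) in transect_cells:
            continue  # shares a 1 km cell with a monitoring transect
        kept.append(rec)
    return kept
```

Here `transect_cells` would be the set of 1 km cell indices containing a transect centroid; how coarse-resolution records were matched to 1 km cells in the original analysis is not stated, so that step is only a placeholder.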
Year-to-year changes in log10 distribution records were calculated for each study species over the 37-year study period by subtracting the log10-transformed number of distribution records in year t-1 from the log10-transformed number of records in year t.
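As a sketch of this calculation, the function below takes annual record counts for one species and returns the change in log10 counts between consecutive years. The function name and the dictionary layout are assumptions for illustration; years without an immediately preceding year are simply skipped.

```python
import math


def yearly_log10_changes(counts_by_year):
    """Return {year t: log10(count in t) - log10(count in t-1)}.

    counts_by_year maps a year to a positive number of records.
    """
    changes = {}
    years = sorted(counts_by_year)
    for prev, curr in zip(years, years[1:]):
        if curr - prev != 1:
            continue  # only compare consecutive years
        changes[curr] = (math.log10(counts_by_year[curr])
                         - math.log10(counts_by_year[prev]))
    return changes


# Toy counts (not real data): a tenfold rise followed by a tenfold fall.
example = yearly_log10_changes({1976: 100, 1977: 1000, 1978: 100})
```

Because the change is a difference of logarithms, a tenfold increase and a tenfold decrease have the same magnitude (1 log10 unit) with opposite signs, regardless of how many records the species has in total.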

Abundance data
We analysed abundance data from the UKBMS national collated index (www.ukbms.org). The UKBMS calculates their index from counts from weekly transect walks along fixed routes undertaken during the recording period (April-September) every year since 1976 (see http://www.ukbms.org/Methods.aspx for full details). Counts are taken from sites in Great Britain and Northern Ireland (1854 transect sites in total). Counts for missing weeks are estimated by the UKBMS by considering the area of a GAM curve fitted to observed weekly count data throughout the year. The UKBMS national collated index from 1976 to 2012 is created using a log-linear model, with a transect site and year effect (Brereton et al., 2011), as shown below:
$$\log c_{ij} = x_i + y_j$$
where $c_{ij}$ is the expected count for site i in year j, and where $x_i$ and $y_j$ give the means for the ith site and the jth year. The index is then scaled to a mean of 2, for the purposes of comparing abundance trends across species. This produces a log10-transformed abundance index, which we used in our calculation of population variability. We computed year-to-year changes in log10 abundance by subtracting the abundance index value (log10-transformed) for year t-1 from the value for year t.

Accounting for phylogeny
Our butterfly species share evolutionary lineages, and this must be taken into account when analysing species together in models. All multispecies analyses conducted in this study accounted for the non-independence of species using phylogenetically informed linear models with estimated Pagel's λ, using the pgls function of the caper package in R (Pagel, 1999; Orme et al., 2013), and the butterfly phylogeny of Brooks et al. (2016). These models are interpreted by p values indicating the difference between the phylogenetic correlation λ value (estimated using maximum likelihood) and the upper and lower bounds: 1 (indicating phylogenetic dependence) and 0 (indicating phylogenetic independence). In all our analyses, the phylogenetic correlation was not significantly different from the lower bound, indicating phylogenetic independence, and so we conclude that phylogeny did not influence our linear models.

Examining relationships between abundance and distribution records
First, we explored whether yearly changes in log10 distribution records (as above) were correlated with yearly changes in log10 abundance (as above), in a multispecies analysis with a control for phylogenetic non-independence (see section above). We computed overall mean change values for both variables for each species over the 37-year study period. In both cases (distribution-record and abundance changes), we calculated the average absolute magnitude of the year-to-year changes, rather than directional changes (positive or negative). We analysed year-to-year changes rather than absolute numbers each year to de-trend the data, and to remove any temporal trends in recording effort. This analysis tests whether species with high population variability (on transects) also have high variability in terms of numbers of distribution records.
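The reason for averaging absolute rather than directional changes can be illustrated with a short sketch. The function names and the toy values below are assumptions for illustration only.

```python
from statistics import mean


def mean_absolute_change(changes):
    """Average magnitude of year-to-year changes, ignoring their sign."""
    return mean(abs(c) for c in changes)


def mean_directional_change(changes):
    """Average signed change; this reflects net trend, not variability."""
    return mean(changes)


# Toy series of log10 changes for a species that fluctuates with no net trend.
toy_changes = [0.5, -0.5, 0.25, -0.25]
variability = mean_absolute_change(toy_changes)    # 0.375
net_trend = mean_directional_change(toy_changes)   # 0.0
```

A strongly fluctuating species with no long-term trend has a directional mean near zero, so only the absolute mean captures its variability. Computing this value per species for both distribution records and abundance gives the two variables compared across species.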
Second, we examined each species separately. We calculated the strength of the relationships between year-to-year changes in log10 distribution records and changes in log10 abundance using r² values from least squares regressions. This relationship is hereafter termed the inter-annual distribution-abundance relationship and, for each study species, it reflects the extent to which yearly changes in log10 numbers of distribution records can be used to predict population size changes (from transect data).
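A per-species r² of this kind can be obtained from an ordinary least squares fit, sketched below in plain Python. The function name and the argument order (distribution changes as predictor, abundance changes as response) are assumptions that follow the wording above; no phylogenetic correction is involved at this step.

```python
def least_squares_r2(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, r2), where r2 = 1 - SS_residual / SS_total.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return slope, intercept, 1.0 - ss_res / syy
```

For a single predictor, this r² equals the square of Pearson's correlation coefficient, so the sign of the relationship has to be checked separately.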
Third, we examined the influence of three independent biogeographical attributes on these inter-annual distribution-abundance relationships to identify species for which distribution records were adequate proxies for population change. These attributes were: total number of distribution records; fractal dimension of a species' range; and overall variability in distribution records. We computed the total number of distribution records collected at any spatial resolution (10 m to 10 km grid) for a species during the study period. Fractal dimension is a metric of how 'well-filled' a species' range is, based on the proportion of 10 km grid cells with records within each occupied 100 km grid cell (Wilson et al., 2004). For each species, we calculated the total area of all occupied 10 km and 100 km grid cells, and regressed these values against the length of the grid cells (10 km and 100 km respectively; all values log10-transformed). The slope of the regression gives a measure (fractal dimension) of how 'well-filled' a species' range is at 10 km scale, where a slope of 0 indicates a completely filled range, and a slope of 2 indicates a minimally filled range (see Figure S1 for two exemplar species: Thymelicus sylvestris, with the most well-filled range, and Hipparchia semele, with the most minimally filled range). For overall variability in distribution records we used the mean year-to-year change in log10 distribution records over the study period. A phylogenetic multivariate regression was then fitted with the three biogeographical attributes as explanatory variables and the r² value of each species' inter-annual distribution-abundance relationship as the response variable. We fitted a fourth term to the model, the quadratic term of mean year-to-year change in log10 distribution records, to account for its apparent non-linear relationship with goodness-of-fit (r²) values when relationships were visually inspected by plotting the data.
We tested a full model, then removed non-significant terms using a stepwise deletion approach.
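The fractal dimension metric described above can be sketched as follows. With only two grid sizes, the regression slope reduces to the difference in log10 occupied area divided by the difference in log10 cell length. The function name and the representation of 10 km cells as integer (column, row) indices are assumptions for this example.

```python
import math


def fractal_dimension(occupied_10km_cells):
    """Slope of log10(occupied area) against log10(cell length) at 10 and 100 km.

    occupied_10km_cells: iterable of (col, row) integer indices of 10 km cells
    with at least one record. Each 100 km cell holds a 10 x 10 block of these.
    """
    cells_10 = set(occupied_10km_cells)
    if not cells_10:
        raise ValueError("no occupied cells")
    cells_100 = {(col // 10, row // 10) for col, row in cells_10}
    area_10 = len(cells_10) * 10 ** 2      # km^2 occupied at 10 km resolution
    area_100 = len(cells_100) * 100 ** 2   # km^2 occupied at 100 km resolution
    return ((math.log10(area_100) - math.log10(area_10))
            / (math.log10(100) - math.log10(10)))


# Every 10 km cell in one 100 km square occupied: completely filled range.
filled = fractal_dimension((c, r) for c in range(10) for r in range(10))
# A single occupied 10 km cell in a 100 km square: minimally filled range.
sparse = fractal_dimension([(3, 7)])
```

The two toy cases reproduce the stated bounds: a completely filled range gives a slope of 0 and a minimally filled range gives a slope of 2.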
We selected these three attributes to test because autoecological information may be limited for other taxonomic groups, but these attributes can be easily calculated from distribution datasets. Because butterflies, however, do have detailed autoecological information, we tested the influence of dispersal ability on the inter-annual distribution-abundance relationship in PGLS models, using two metrics: dispersal rankings based on expert opinion (Cowley et al., 2001) and a mobility score calculated from indices of ecological information (Dennis et al., 2004). We found no significant relationship between dispersal ability and the strength of the inter-annual distribution-abundance relationship (see Table S2).

Comparison of national and regional inter-annual distribution-abundance relationships
To investigate whether the goodness of fit of the inter-annual distribution-abundance relationships varied with spatial scale, we repeated our analysis of this relationship at a regional level, for the county of Dorset. We compared r² values from national and regional inter-annual distribution-abundance relationships for a subset of 23 butterfly species for the period 1983-2009 (maximum time period containing abundance data for species in Dorset). Dorset was selected because of its extensive history of surveying butterflies (Robertson et al., 1988; Thomas et al., 2001).

Relationship between variability in abundance and distribution records across species
Across the 33 study species, there was a strong positive relationship between the mean year-to-year changes in log10 distribution records and mean year-to-year changes in log10 abundance (Fig. 1a, PGLS, r² = 0.95, F1,31 = 623.8, P < 0.001), even when two outlier species were removed (Fig. 1b, PGLS, r² = 0.66, F1,29 = 55.35, P < 0.001). Thus, species that show high variability in abundance also have high variability in distribution records, and there was little evidence for any phylogenetic signal (i.e. results were not significantly different between models based on estimated λ, and where λ was set to 0).

Measuring inter-annual distribution-abundance relationships within species
Across our 33 study species, the relationships between year-to-year changes in log10 distribution records and year-to-year changes in log10 abundance produced an overall mean r² value of 0.36, indicating that year-to-year changes in distribution records of UK butterflies provide a moderate proxy for year-to-year abundance changes. Eight butterfly species had r² > 0.5, showing that distribution records were particularly informative in approximately 25% of study species. Nevertheless, there was considerable variation among species, with r² values varying between 0.03 and 0.92 (Table 1). Figure 2 highlights two exemplar species: one where the relationship was strong (Holly Blue, Celastrina argiolus, r² = 0.85) and one where the relationship was very weak (Marbled White, Melanargia galathea, r² = 0.16).

Influence of biogeographical attributes
The r² value for each species' inter-annual distribution-abundance relationship (i.e. relationships between year-to-year changes in log10 distribution records and year-to-year changes in log10 abundance; as in Fig. 2) was then analysed in relation to the biogeographical attributes of each species, which are provided in Table 1. We tested all these variables in a full model (PGLS, r² = 0.64, F4,28 = 12.58, AIC = −30.43, P < 0.001; Table 2a). Only mean absolute year-to-year changes in distribution records and its quadratic term significantly influenced inter-annual distribution-abundance relationships: total number of distribution records and fractal dimension did not, and were consequently dropped during model simplification. The best and most parsimonious model (PGLS, r² = 0.63, F2,30 = 26.02, AIC = −33.70, P < 0.001; Table 2b) revealed that the strength of the relationship (r² value) increased with overall variability in distribution records (Fig. 3). Thus, the results show that species with greater fluctuations in distribution records over time had stronger inter-annual distribution-abundance relationships (although the effect of variability in records was non-linear and asymptoted at roughly 0.8; Fig. 3). Two species (Celastrina argiolus and Vanessa cardui) potentially had strong effects on our analyses (Fig. 3c), but excluding these two species did not alter our conclusions (Table S1).

Comparison of national and regional inter-annual distribution-abundance relationships
The strengths of inter-annual distribution-abundance relationships computed for species at a regional level (Dorset) were strongly positively correlated with those computed at the national level (PGLS, r² = 0.53, F1,21 = 23.25, P < 0.001; Fig. 4). Therefore, we conclude that any differences in population synchrony between national and regional scales had little influence on the strength of inter-annual distribution-abundance relationships for butterfly species.

Discussion
We found that citizen-collected distribution data can be used to extract information about population variability, in the absence of bespoke abundance monitoring programmes. In particular, mean year-to-year changes in distribution records were positively related to mean year-to-year changes in abundance (with outlier species removed, r² = 0.66; Fig. 1). Thus, we were able to identify species with low and high between-year population variability quite accurately, using distribution data. This result supports the ability of unstructured citizen science data to reflect population-dynamic patterns found in long-term abundance data, and hence citizen science data may be useful in multispecies studies for which it is necessary to have an overall measure of population variability (Gandiwa et al., 2016) where abundance data are lacking.
Table 2a shows the first, full model with the following explanatory variables: mean absolute year-to-year change in distribution records, total number of species records and fractal dimension. The model summary statistics were: r² = 0.64, F4,28 = 12.58, AIC = −30.43, P < 0.001. Table 2b shows the best model with only one explanatory variable: mean absolute year-to-year change in distribution records. Model summary statistics: r² = 0.63, F2,30 = 26.02, AIC = −33.70, P < 0.001. In both models, the quadratic term of the mean absolute year-to-year change in distribution records was included to account for the non-linear nature of the relationship, and model results with estimated λ were not significantly different from a model with λ set to 0 (Fig. 3).
The ability to recognise species with the highest levels of population variability may help identify species that are at greatest risk of stochastic extinction following habitat fragmentation (Pimm et al., 1998; Vucetich et al., 2000; Oliver et al., 2012), and the most variable species may potentially be the most responsive to yearly variation in climatic conditions (Maclean et al., 2008; Howard et al., 2015) and to parasitoids or other natural enemies. The findings from these analyses imply that information from citizen science data can provide useful input to landscape-scale conservation planning and to climate-change risk assessments. When we considered each species in turn, there was considerable variation in the strength of relationships between year-to-year changes in distribution records and abundance among our study species, although these associations were always positive, averaging an r² of 0.36 across all species (Table 1). These relationships suggest that there is also some potential to use the distribution records of individual species to infer their population dynamics in greater detail (rather than as one metric for overall variability in the time-series). However, this is only feasible for some species: only eight out of 33 species had 'strong' relationships (r² > 0.5) between year-to-year abundance and distribution changes. Thus it should not be presumed that distribution records can be used as a substitute for population data in the assessment of inter-annual change for all species.

Inferring abundance change from distribution data
Many species are declining or facing range retractions (Hayhow et al., 2016), and it is important to monitor their population trends. Species with highly variable population dynamics tend to be at high risk of extinction (Pimm et al., 1998; Vucetich et al., 2000; Oliver et al., 2012) and thus our measure of variability in distribution records has ecological value, with the potential to assist conservation assessments by helping to identify species at risk of extinction or habitats in need of management (Meyer et al., 2015; Sánchez-Hernández et al., 2015). Our multispecies analysis (Fig. 1) indicates that it is possible to derive robust estimates of population variability using distribution data alone.
Despite the promising results, there are two caveats that we should highlight. The first is that we examined only one taxonomic group with a high level of recording effort by citizen scientists. We also included only species with data in every year of the study period, excluding rare or less well-studied species. Other distribution datasets with lower recording effort may not be so informative. Kamp et al. (2016) found that reducing the number of distribution records resulted in poorer abundance trend estimates for Danish birds. Even without reducing the sample size, population trends were misclassified for 50% of the species they considered. Thus, using distribution data to infer population changes may require quite mature citizen science schemes, with substantial numbers of distribution records. Given that butterflies are a data-rich taxonomic group in the UK, it remains unknown whether other groups will have sufficient data to replicate these results. Datasets which may have sufficient data for this method are butterflies in other countries, or other taxa in the UK for which standardised abundance monitoring schemes are lacking, e.g. dragonflies.
Our second caveat is that more detailed population-dynamic interpretations of distribution data only seem possible for some species. Our finding that citizen science distribution data explain an average of only 36% of the year-to-year variation in abundance suggests that such data are unlikely to be sufficient to build meaningful models for examining the sensitivity of populations to environmental drivers, such as specific climate variables. For example, Malinowska et al. (2014) were unable to detect impacts of extreme weather events on populations of ectothermic species from distribution records, despite evidence of these impacts from population data (e.g. Oliver et al., 2015b). In addition, while we have removed species which have unusually high levels of recording effort due to species-specific surveys, not all species are necessarily reliably monitored by UKBMS, which could result in poor year-to-year distribution-abundance relationships. For example, the purple hairstreak butterfly (Favonius quercus) occurs in tree canopies, and is therefore difficult to monitor from ground-based surveys. Other species may suffer from limited recording for other reasons, such as occurring in restricted locations or not being identified correctly due to confusion with other morphologically similar species.

Biogeographical attributes
Despite the above caveats, we conclude that year-to-year changes in distribution records represented an adequate proxy for abundance change in species with large fluctuations in their occurrence from year to year (Fig. 3, Table 1). Species with large year-to-year fluctuations in their occurrences, such as migrants, may offer the greatest statistical power to deduce population changes from distribution data. Even though two migrant species and the holly blue butterfly Celastrina argiolus demonstrate the strongest inter-annual distribution-abundance relationships, the mean year-to-year change in distribution records was also an important variable in predicting the strength of the year-to-year distribution-abundance relationship for other species. Therefore, mean year-to-year change in distribution records may help to identify non-butterfly species where citizen science distribution data could be used as a 'replacement' for direct population data. We found that total numbers of records and fractal dimension did not significantly influence the inter-annual distribution-abundance relationship. The most parsimonious explanation for this is that these variables are not important, and that our hypotheses (that our statistical capacity to detect year-to-year variation in abundance from distribution records was linked to the total number of distribution records and to fractal dimension) were wrong. We had predicted that a large total number of records would mean greater statistical power to find the inter-annual distribution-abundance relationship. The lack of a significant relationship between the inter-annual distribution-abundance relationship and total number of distribution records could be because patterns of year-to-year change in distribution records can be similar to those in abundance even when numbers of observations are low.
Recorder behaviour may have biased our results, as recorders may not record widespread common species on an ad hoc basis, instead favouring notable records (e.g. rare species). This contrasts with the abundance data, which were collected following a structured survey design in which all species seen are recorded. This could lead to a mismatch in abundance and distribution patterns even for inter-annual changes, as recording effort varies temporally. Finally, the lowest total number of distribution records in our study was quite high (see Table 1), so low sample size was not a concern here. Nevertheless, the issue may be important for other, more poorly recorded taxonomic groups.
Fractal dimension of species' distribution also did not impact the inter-annual distribution-abundance relationship. This might be because even if a range is fragmented, distribution recorders and transect volunteers still find and document species in those locations. In addition, if a species is known to be fragmented (which usually indicates rarity or being at risk of extinction), there may be a recording bias towards it (Isaac & Pocock, 2015), which results in good information for that species. Therefore, species with a high fractal dimension may still have a positive inter-annual distribution-abundance relationship. It should be noted, however, that species which are very poorly studied, and therefore likely rare and in fragmented habitats, were not included because of the selection criteria. The study species also had ranges which were relatively well-filled, with fractal dimension scores ranging from 0.257 to 0.716 (maximum possible value is 2). It is possible that fractal dimension is an important factor for highly fragmented species, and there may have been insufficient variation in this attribute for it to be important to the inter-annual distribution-abundance relationship. Similarly, we found no relationship between the inter-annual distribution-abundance relationship and dispersal for butterflies (Table S2). If these variables lack significant explanatory power even for a well-studied taxon, then this suggests that they will have limited use for identifying species in other taxa for which our method may be appropriate.
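As an illustration only, the two quantities discussed in this section (the mean absolute year-to-year change in distribution records, and the goodness of fit between year-to-year changes in records and in abundance) could be computed along the following lines. This is a minimal sketch and not the analysis used in the study: it assumes that year-to-year change is measured as a log ratio between consecutive years and that goodness of fit is the squared Pearson correlation. The function names and the example numbers are hypothetical.

```python
import math


def yearly_changes(series):
    """Year-to-year change as a log ratio, ln(x[t+1] / x[t]).

    Assumption: the log-ratio form is chosen for illustration; the
    study may define year-to-year change differently.
    """
    return [math.log(later / earlier) for earlier, later in zip(series, series[1:])]


def mean_abs_change(series):
    """Mean absolute year-to-year change (a simple variability measure)."""
    changes = yearly_changes(series)
    return sum(abs(c) for c in changes) / len(changes)


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def interannual_fit(records, abundance):
    """r2 between year-to-year changes in records and in abundance."""
    return r_squared(yearly_changes(records), yearly_changes(abundance))


# Hypothetical annual totals for one species (not data from the study).
records = [100, 200, 100, 400]   # distribution records per year
abundance = [10, 20, 10, 40]     # abundance index per year

variability = mean_abs_change(records)
fit = interannual_fit(records, abundance)
```

In this toy example the two series change in exact proportion, so the fit is perfect. With real data, a species with a small mean absolute change in records would be expected to give a weaker fit, in line with the pattern reported above.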

Population synchrony and inter-annual distribution-abundance relationships
The success of year-to-year changes in distribution records mirroring abundance changes in migratory species suggests that population synchrony over large areas may play a role, and so we examined the impact of scale on the inter-annual distribution-abundance relationship by comparing national and county-level analyses. Weak relationships at the national level may occur if species' population dynamics are asynchronous, such that abundances and distributions may be closely linked locally, but a 'good year' in one region might occur when it is a 'bad year' in another region, obscuring any overall pattern at a national scale. Yet, when we repeated our national-scale analysis for a much smaller region (the county of Dorset), the results were similar: goodness of fit scores across species for the inter-annual distribution-abundance relationship for Dorset were correlated with those for the same species at the national level (Fig. 4). The majority of species had lower r2 values for the regional analyses, probably due to reduced data quantity. The spatial scales at which abundance and distribution changes are linked deserve more attention, but our preliminary conclusion is that reducing the extent of the study region considered does not improve the inter-annual distribution-abundance relationship.

Conclusions
Our key finding that (mean year-to-year changes in) citizen-collected distribution data can provide useful information on population variability suggests that it may be possible to expand our methods to other taxonomic groups, or to populations of butterflies in countries that do not have standardised population monitoring schemes. Such measures of variability can inform habitat, landscape and regional conservation decision-making. The use of distribution data for more detailed analyses of inter-annual population change is only likely to be possible, however, for species that have highly variable numbers of records between years. For these species, it may be possible to analyse year-to-year population changes across much longer time periods than covered by transect data and hence identify how populations are influenced by the effects of specific weather variables, density dependence, and any other process that operates at a large geographical and temporal scale. Further investigation is required, however, into the feasibility of extending these methods to other taxonomic groups.

Author contributions
SCM, GDP, THO, CDT and JKH designed the study. SCM carried out all analyses and calculations, with the exception of national UKBMS indices. THO provided code and data for regional UKBMS indices, and RF provided species information for exclusion. SCM initiated the writing of the manuscript, and all authors commented on and helped write the paper.

Data accessibility
The data for the butterfly species that were used in this study are archived by the National Biodiversity Network Gateway https://data.nbn.org.uk/ (BNM presence records) and the UKBMS http://www.ukbms.org (UKBMS transect count data and TRMOBs index for 1976-2012).

Supporting Information
Additional Supporting Information may be found in the online version of this article under the DOI reference: doi: 10.1111/icad.12242:
Figure S1. The ranges of (a) the small skipper butterfly, Thymelicus sylvestris, a species with a well-filled range (fractal dimension: 0.257), and (b) the Grayling butterfly, Hipparchia semele, a species with the most minimally filled range of the 33 species studied (fractal dimension: 0.716).
Figure S2. The annual total number of distribution records for all 33 study butterfly species across the study period 1972-2012.
Table S1. Results of the best model explaining the inter-annual distribution-abundance relationship with two outlying species, Celastrina argiolus and Vanessa cardui, removed from the analysis. One biogeographical attribute was included as an explanatory variable: mean absolute year-to-year change in distribution records. PGLS model results: r2 = 0.43, F1,28 = 20.86, AIC = −31.11, P = <0.001.
Table S2. Results of two models examining the relationship between dispersal ability and the inter-annual distribution-abundance relationship. The explanatory variable in Model 1 is a dispersal ranking from Cowley et al. (2001), PGLS model results: r2 = 0.13, F1,26 = 3.776, AIC = −17.52, P = 0.063; and the explanatory variable in Model 2 is a dispersal score from Dennis et al. (2004), PGLS model results: r2 = 0.08, F1,26 = 2.119, AIC = −15.78, P = <0.157.