Unlocking the potential of historical abundance datasets to study biomass change in flying insects

Abstract Trends in insect abundance are well established in some datasets, but far less is known about how abundance measures translate into biomass trends. Moths (Lepidoptera) provide particularly good opportunities to study trends and drivers of biomass change at large spatial and temporal scales, given the existence of long‐term abundance datasets. However, these analyses require data on the body masses of moths, and such data do not currently exist. To address this data gap, we collected empirical data in 2018 on the forewing length and dry mass of field‐sampled moths, and used these to train and test a statistical model that predicts the body mass of moth species from their forewing lengths (with refined parameters for Crambidae, Erebidae, Geometridae and Noctuidae). Modeled biomass was positively correlated, with high explanatory power, with measured biomass of moth species (R² = 0.886 ± 0.0006, across 10,000 bootstrapped replicates) and of mixed‐species samples of moths (R² = 0.873 ± 0.0003), showing that it is possible to predict biomass to an informative level of accuracy, and prediction error was smaller with larger sample sizes. Our model allows biomass to be estimated for historical moth abundance datasets, and so our approach will create opportunities to investigate trends and drivers of insect biomass change over long timescales and broad geographic regions.

Similarly, the body size of individual species can play a substantial role in structuring networks of interspecific interactions (Woodward et al., 2005). All of these factors make moths a valuable taxon in which to study long-term biomass change at the community level, but biomass data are currently lacking for these analyses.
Existing long-term moth population and distribution datasets are potentially a very valuable resource for understanding biomass changes, but these datasets record abundance, not measurements of body mass or size, and in most cases do not retain specimens (preventing biomass information from being obtained retrospectively).
To address questions of biomass change using these abundance datasets requires reliable body mass data for all species, but such empirical data are currently available for only a limited set of species (García-Barros, 2015). An alternative approach is to use empirical data from a subset of species to model the expected body mass of all species, using some other, more readily available, trait. Such models have previously been formulated to predict the body mass of moths and other invertebrates from their body length (Höfer & Ott, 2009; Sabo, Bastow, & Power, 2002; Sage, 1982; Sample, Cooper, Greer, & Whitmore, 1993) and variants thereof (García-Barros, 2015), chosen because it is easily measurable from museum specimens (García-Barros, 2015). However, for moths, body length data are not widely available and in any case may be influenced to a greater degree by contraction in dried specimens than other traits (García-Barros, 2015). The only morphological trait for which existing data on many species are readily available is forewing length: for example, an expected range of forewing lengths is included for all British species of macro-moths, and most British species of micromoths, in standard field guides (Sterling & Parsons, 2012; Waring & Townsend, 2017), and it may therefore be possible to predict body mass based on forewing length (Miller, 1997). The existence of substantial interfamilial variation in body plan (e.g., between Saturniidae and Sphingidae; Janzen, 1984) may provide opportunities to use taxonomy to fine-tune models, but no previous model has included any refinement based on taxonomic relationships between moths.
In this study, we develop a statistical model to estimate the body mass of individual moths from their forewing length and hence quantify the biomass of samples of moths for which only species abundances have been recorded. We have four aims: (a) collection of empirical data (during 2018 on the University of York campus, UK) to establish the relationship between forewing length and body mass in moths; (b) construction of a predictive model for estimating body mass from species identity and associated forewing length; (c) testing the accuracy of this model's predictions and how accuracy changes with increasing moth abundance; and (d) using existing data on forewing lengths to predict the body masses of all British macro-moths, thus providing a resource to users of moth population data and to comparative biologists.

| Field sampling, identification, and measurement of moths
We sampled moths at three sites (Appendix S1.1-2) on the University of York campus (northern England, UK; 53°56′41″N 1°2′2″W), between June 11 and July 20, 2018 (Appendix S2.1). Moths were collected using Heath-style moth traps (Heath, 1965), each operating a 15 W actinic fluorescent tube and powered by a 12 V battery (Anglian Lepidopterist Supplies). Moths were euthanized and returned to the laboratory for identification and measurement. Moths were identified to species level where possible using standard field guides (Sterling & Parsons, 2012; Waring & Townsend, 2017).
Where species-level identification would have required dissection of the genitalia, identification was made to aggregate level (e.g., Common Rustic agg. Mesapamea secalis/didyma). After identification, moths were allowed to air-dry at room temperature for a minimum of 1 week, which was sufficient for the dry body mass of even the largest individuals to stabilize (Appendix S1.3, Appendix S2.2).
After drying, we measured the forewing length and dry mass of each moth. Forewing length was measured from wing base to wing-tip, using calipers and a ruler, to the nearest 1 mm. Dry mass was measured using an A&D HR-202 balance (A&D Instruments Ltd.), to the nearest 0.01 mg. Measurements were precise to within ±6% of the true value (Appendix S2.2).

| Modeling forewing length-body mass relationship from empirical data
To investigate the relationship between forewing length (mm) and body mass (mg) in moths, we constructed generalized linear mixed-effects models (GLMMs) using our 2018 field data, with species as a random effect, and body mass explained by the interaction between forewing length and taxonomic family. We selected between three candidate model structures by comparison of Bayesian Information Criterion (BIC) scores: (i) linear predictor (i.e., ln(body mass) ~ wing length × family); (ii) nonlinear predictor (i.e., ln(body mass) ~ ln(wing length) × family); and (iii) segmented predictor (as for model i, but permitting the slope of the model to change once as forewing length increases). Finally, we tested the significance of independent variables, including the interaction between wing length and family, using Likelihood Ratio Tests.
To reduce the risk of our predictive model overfitting for families represented by only a few species in our dataset (and therefore to allow accurate predictions of body mass to be made), we refitted this model with a simplified family variable, in which seven families represented by fewer than five species in our dataset were grouped together as "other" (effectively reducing the family variable from 11 categories to 5). The four retained families (each with ≥5 species sampled) were Crambidae, Erebidae, Geometridae, and Noctuidae, allowing the predictive model's parameters to be refined for these families, while also making overall predictions for all other families.
We fitted a GLMM to the dataset as above, using this reduced version of the family variable, and extracted all fitted parameters from the GLMM to form the predictive model. We did not include information on whether individuals were male or female, even though male and female moths can differ substantially in size in some species, because this information is not recorded in historical abundance datasets. Our model therefore used overall slope and intercept to predict body mass from forewing length for all moths, with a refined prediction for moths from the most speciose (and therefore data rich) four families in our dataset.
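The way the fitted parameters combine into a prediction can be sketched in a few lines of Python. This is only an illustration of the additive structure described here (an overall slope and intercept, plus family-specific adjustments for the four retained families). The numeric coefficients below are invented placeholders, not the fitted estimates from Table 1, and the function and dictionary names are hypothetical.

```python
import math

# Placeholder coefficients for illustration only; substitute the fitted
# estimates from Table 1 before using this for real predictions.
OTHER_FAMILIES = {"slope": 2.5, "intercept": -4.0}

# Family-specific adjustments are ADDED to the "other families" values.
# The signs (shallower slope, larger intercept) mirror the pattern reported
# in the text, but the magnitudes are invented.
FAMILY_ADJUSTMENTS = {
    "Crambidae": {"slope": -0.2, "intercept": 0.5},
    "Erebidae": {"slope": -0.2, "intercept": 0.5},
    "Geometridae": {"slope": -0.2, "intercept": 0.5},
    "Noctuidae": {"slope": -0.2, "intercept": 0.5},
}


def predict_body_mass(forewing_length_mm, family):
    """Return predicted dry body mass (mg) from forewing length (mm)."""
    if forewing_length_mm <= 0:
        raise ValueError("forewing length must be positive")
    # Families without their own adjustment fall back to "other families".
    adjustment = FAMILY_ADJUSTMENTS.get(family, {"slope": 0.0, "intercept": 0.0})
    slope = OTHER_FAMILIES["slope"] + adjustment["slope"]
    intercept = OTHER_FAMILIES["intercept"] + adjustment["intercept"]
    # The model is linear on the log-log scale, so back-transform at the end.
    ln_mass = math.log(forewing_length_mm) * slope + intercept
    return math.exp(ln_mass)
```

Any family outside the four retained ones receives a zero adjustment, which corresponds to the "other" grouping in the simplified family variable.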

| Testing model accuracy
To test the accuracy of this general predictive modeling approach when making predictions based on forewing length data from field guides, we estimated the body mass of each of the 94 moth species in our dataset from its expected forewing length (obtained by taking the midpoint between minimum and maximum forewing lengths given by field guides for micromoths (Sterling & Parsons, 2012) and macro-moths (Waring & Townsend, 2017); archived at Zenodo, https://doi.org/10.5281/zenodo.3786303), and used these estimates to calculate the estimated biomass of each mixed-species sample of moths (where one sample = all moths that were captured at the same site on the same day, across multiple traps; n = 44 samples). We compared these estimates of biomass with the empirically measured biomass of the moths in question. We conducted this testing at both species and sample levels, because rare species from rare families are likely to have the least accurate predictions from our model, but may also have the least impact on the accuracy of sample-level predictions.
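As a rough sketch of the bookkeeping involved, the two steps (taking the midpoint of a published forewing-length range, then summing per-species predicted masses weighted by abundance) might look as follows in Python. The species labels, size ranges, predicted masses, and counts are all hypothetical values chosen for illustration, and the function names are not taken from the archived scripts.

```python
def expected_forewing_length(min_mm, max_mm):
    """Midpoint of the forewing-length range given by a field guide (mm)."""
    return (min_mm + max_mm) / 2.0


def sample_biomass(counts, mass_by_species):
    """Estimated biomass (mg) of one sample: sum of count x predicted mass."""
    return sum(n * mass_by_species[species] for species, n in counts.items())


# Hypothetical field-guide ranges (mm) for two unnamed species.
field_guide_ranges = {"species_a": (14.0, 18.0), "species_b": (9.0, 11.0)}
midpoints = {
    species: expected_forewing_length(low, high)
    for species, (low, high) in field_guide_ranges.items()
}

# Hypothetical predicted body masses (mg); in practice these would come from
# applying the fitted model to the midpoints above.
predicted_mass_mg = {"species_a": 40.0, "species_b": 8.0}

# One sample = all moths caught at one site on one day (counts per species).
one_sample = {"species_a": 3, "species_b": 12}
estimated_mg = sample_biomass(one_sample, predicted_mass_mg)
```

Because only species identities and counts are needed at this stage, the same calculation can be applied to abundance records for which no specimens were retained.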
We first compared measured and estimated biomass for the full set of 600 moths. At both species and sample level, we tested the relationship between measured and predicted biomass, using model II regressions with a Major Axis approach because neither biomass variable was dependent upon the other (Legendre & Legendre, 2012). Whether relationships differed significantly from random was tested using one-tailed permutation tests (with 100 permutations), and relationships were also compared to the desired y = x (i.e., estimated = measured) relationship by calculation of 95% confidence intervals around the estimated slope. The strength of the relationships between measured and estimated biomass at species and sample level was determined by model R² values.
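For readers unfamiliar with model II regression, a minimal Python sketch of a Major Axis fit and a one-tailed permutation test is given below. It uses the standard closed-form Major Axis slope derived from the sample variances and covariance. The permutation test here shuffles one variable and compares Pearson correlations, which is an assumption made for simplicity and may differ in detail from the procedure used in the study's own scripts. All function names are hypothetical.

```python
import math
import random


def _moments(x, y):
    """Means, sample variances and sample covariance of two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return mean_x, mean_y, var_x, var_y, cov_xy


def major_axis(x, y):
    """Return (slope, intercept) of the Major Axis (model II) regression."""
    mean_x, mean_y, var_x, var_y, cov_xy = _moments(x, y)
    if cov_xy == 0:
        raise ValueError("Major Axis slope is undefined when covariance is zero")
    diff = var_y - var_x
    slope = (diff + math.sqrt(diff ** 2 + 4 * cov_xy ** 2)) / (2 * cov_xy)
    return slope, mean_y - slope * mean_x


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    _, _, var_x, var_y, cov_xy = _moments(x, y)
    return cov_xy / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_perm=100, seed=0):
    """One-tailed p-value: share of shuffles with r at least as large."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson_r(x, shuffled) >= observed:
            hits += 1
    # Counting the observed arrangement keeps the p-value above zero.
    return (hits + 1) / (n_perm + 1)
```

With 100 permutations the smallest attainable p-value under this scheme is 1/101, which is close to the .010 resolution implied by a 100-permutation test.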
However, because in this case comparisons were not independent of the predictive model (i.e., model accuracy was tested with the same data that had been used to fit the model), we also used a resampling approach to further test the accuracy of our general predictive modeling approach. We split our full dataset 10,000 times into training and testing subsets. In each replicate, we randomly selected 480 individual moths (80% of the 600 total individuals) without replacement to form a training subset, with the remaining 120 individuals forming an independent testing subset. We trained a model with the same structure as the full predictive model (above) on the training dataset, and from its parameters, extracted estimates of species- and sample-level biomass as above for the 120 moths included in the testing dataset. We tested the relationship between measured and predicted biomass for each replicate as above. Across the results of all 10,000 replicates (and at both species and sample levels), we then calculated the proportion of replicates for which measured and estimated biomass were significantly correlated, the mean and standard error of model R² values, and the proportion of replicates for which the modeled relationship was significantly different from y = x.
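The repeated 80/20 splitting step can be illustrated with a short Python sketch. It covers only the partitioning of individuals into training and testing subsets without replacement. Refitting the GLMM and scoring each replicate are left as comments because they depend on the modeling software used. The function names and the fixed seed are assumptions for illustration.

```python
import random


def train_test_split_indices(n_total, n_train, rng):
    """Pick n_train indices without replacement; the rest form the test set."""
    train = rng.sample(range(n_total), n_train)
    train_set = set(train)
    test = [i for i in range(n_total) if i not in train_set]
    return sorted(train), test


def repeated_splits(n_total=600, n_train=480, n_replicates=10_000, seed=0):
    """Yield (train, test) index lists for each resampling replicate."""
    rng = random.Random(seed)
    for _ in range(n_replicates):
        yield train_test_split_indices(n_total, n_train, rng)


# Usage sketch with a small number of replicates.
for train_idx, test_idx in repeated_splits(n_replicates=3):
    # 1. Refit the model on the individuals in train_idx.
    # 2. Predict biomass for the individuals in test_idx.
    # 3. Record significance, R-squared and the slope confidence interval.
    pass
```

Sampling without replacement guarantees that no individual appears in both subsets of a replicate, so each test is independent of the data used for fitting.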
Finally, we used a resampling approach to assess the influence of moth abundance (i.e., sample size) on prediction error. We randomly sampled sets of individuals (with replacement, from the full set of 600 measured individuals) at sample sizes between 10 and 1,000 in steps of 10, taking 1,000 replicates at each sample size for a total of 100,000 replicates. For each replicate sample, we calculated the measured biomass and the estimated biomass (based on the parameters of the final predictive model). We then calculated the prediction error for each sample as a proportion of the true biomass, normalizing by subtracting the known prediction error of 3.40% in the full dataset (i.e., the total predicted biomass of all 600 moths was 3.40% lower than their total measured biomass), such that:

$$\text{prediction error} = 100 \times \frac{\text{predicted biomass} - \text{measured biomass}}{\text{measured biomass}} - 3.4$$

Grouping sample sizes into windows of 100, we calculated the mean, standard error, and range of prediction errors observed across all replicates in each window.
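A compact Python sketch of this resampling procedure is shown below. It follows the prediction error expression as written in the text, with the baseline exposed as a parameter (`baseline_pct`) so that the normalization can be adjusted. The function names, the small replicate counts, and the toy input values are assumptions for illustration only.

```python
import random
import statistics


def prediction_error(predicted, measured, baseline_pct=3.4):
    """Percentage prediction error, normalized by a baseline percentage."""
    return 100.0 * (predicted - measured) / measured - baseline_pct


def resampled_errors(measured, predicted, sample_size, n_replicates, rng):
    """Draw individuals with replacement and return one error per replicate."""
    indices = range(len(measured))
    errors = []
    for _ in range(n_replicates):
        draw = rng.choices(indices, k=sample_size)
        measured_total = sum(measured[i] for i in draw)
        predicted_total = sum(predicted[i] for i in draw)
        errors.append(prediction_error(predicted_total, measured_total))
    return errors


def summarize(errors):
    """Mean, standard error and range of a list of prediction errors."""
    mean = statistics.mean(errors)
    standard_error = statistics.stdev(errors) / len(errors) ** 0.5
    return mean, standard_error, (min(errors), max(errors))


# Toy per-individual masses (mg); real inputs would be the 600 measured
# moths and their model-predicted masses.
measured_mg = [12.0, 30.0, 55.0, 8.0, 140.0]
predicted_mg = [14.0, 27.0, 50.0, 9.0, 150.0]
rng = random.Random(42)
summary_by_size = {
    size: summarize(resampled_errors(measured_mg, predicted_mg, size, 50, rng))
    for size in (10, 20, 30)
}
```

Pooling the per-size results into windows of 100 sample sizes, as described above, would then amount to concatenating the error lists for the sample sizes in each window before calling `summarize`.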

| Field sampling, identification, and measurement of moths
We sampled 614 individual moths, of which 13 could not be confidently identified beyond family level (2 individuals from Crambidae, 1 from Pterophoridae, and 10 from Tortricidae). One micromoth (Narycia duplicella [Goeze, 1783], Psychidae) could not be detected by our balance (and therefore weighed less than 0.005 mg). These 14 individual moths were excluded from further analyses. The remaining dataset contained exactly 600 individual moths, representing 94 species from 11 families (6.6% of all species, or 13.7% of macro-moth species, ever recorded in the region (i.e., compared with the UK Lepidoptera recording area of Vice-county 61 (southeast Yorkshire), which includes the University of York); Appendix S1.4). Among these moths, forewing lengths ranged from 7 mm (individuals of Eudonia pallida (Crambidae) and Agapeta hamana (Tortricidae)) to 40 mm (an individual of Laothoe populi (Sphingidae)) and dry body masses ranged from 1.1 mg (an individual of Eupithecia tenuiata (Geometridae)) to 753.2 mg (an individual of Smerinthus ocellata (Sphingidae)).

| Modeling forewing length-body mass relationship from empirical data
From the three candidate model structures described above, we selected the nonlinear predictor (model ii) as the best-fitting model (BIC: 360.7, compared to 431.1 and 494.3 for models i and iii, respectively). The natural logarithms of body mass and forewing length were significantly related to each other at both species and individual levels (Figure 1), with variation among the 11 families in the slope and intercept of this relationship (individual level: χ² = 35.9, df = 10, p < .001; marginal R² = 0.819) revealing that interfamilial variation in body plan significantly influences the scaling of forewing length to body mass.
The significance of the model (and almost all of its explained variance) was retained when fitting the simplified model (in which seven families represented by <5 species were grouped as "other"; χ² = 30.7, df = 4, p < .001; marginal R² = 0.812), resulting in a set of parameters from which body mass could be predicted based on forewing length (Table 1). All four families retained as independent levels (Crambidae, Erebidae, Geometridae, and Noctuidae) had larger intercepts and shallower slopes than the overall prediction across the other families (Table 1). Thus, we conclude that the nonlinear model with simplified family variable has the greatest potential for estimating body mass.

| Testing model accuracy
We then used our best-fitting model to estimate body masses for all 94 species as described above, and compared measured and estimated biomass for the full sample of 600 individual moths.
We found that our estimates of biomass significantly correlated with measured biomass at both species and sample levels (Figure 2), even though body mass varied widely both within and between species (within-species SD of body mass = 34.6 mg, between-species SD of body mass = 74.7 mg). At sample level, the relationship between estimated and measured biomass was not significantly different from a 1:1 relationship (Table 2), with 91.5% of variation explained. At species level, estimated biomass explained 91.1% of variation. The relationship was less steep than the expected 1:1 relationship (Table 2) with all moths included; however, the 1:1 relationship was recovered when we excluded the 34 smallest species from models (i.e., only included species weighing >15 mg, n = 60 species). These results indicate that our predictive model may slightly overestimate the body mass of very small species of moths, but that this does not substantially bias estimates of sample-level biomass.
To test whether this general predictive modeling approach can accurately estimate biomass beyond the sampled individuals and species, we split our data 10,000 times into random training (480 individuals in each case) and testing (120 individuals) subsets. We refitted our final model to the training subset in each case and predicted the body masses of individuals in each testing subset. We found again that our estimates of biomass significantly correlated with measured biomass at both species and sample levels in 100% of replicates (Table 3). At sample level, estimated biomass explained on average 88.4% (±SE 0.07) of variation in measured biomass, and was not significantly different from a 1:1 relationship in 75.6% of cases.

FIGURE 1 Relationship between forewing length (mm) and dry mass (mg). In panel (a), the mean forewing length and dry mass of each species sampled in the study are shown on logarithmic axes, with error bars showing standard errors and family indicated by the combination of point color and shape. In panel (b), the forewing length and dry mass of every individual moth sampled in the study is shown on logarithmic axes, with the four most speciose families in our sample (Crambidae, Erebidae, Geometridae, and Noctuidae) indicated as above by point color and shape

TABLE 1 Parameters of the predictive model, extracted by fitting a GLMM with the fixed-effects structure: ln(body mass) ~ ln(forewing length) × family, to data from 600 individual moths

[Table body not recovered. Surviving fragments: column header "Intercept estimate (SE)"; row "Overall model", 94 species (600 individuals).]

Note: The number of measured individuals and species on which each parameter estimate was based is given. Overall model parameters are given, including the χ² and p-values of a likelihood ratio test of the model's overall significance. Family-specific slope and intercept values are refinements to be added to the parameters for "other families" (rather than taken in isolation). To predict body mass of a moth from its forewing length, these parameters should be applied to the following formula: ln(body mass) = (ln(forewing length) × ("other families" slope + family slope adjustment)) + ("other families" intercept + family intercept adjustment).

FIGURE 2 Accuracy of predicted biomass of moth species and samples of moths compared to the true, measured biomass. (a) Predicted dry mass of species (mg) is plotted against mean measured dry mass (mg); the 1:1 relationship is plotted as a blue line, and points are colored by the number of individual moths from which the measured mean was calculated. (b) The absolute difference between mean measured dry mass and predicted dry mass of each moth species is plotted against the number of individuals from which the measured mean was calculated; a horizontal line is plotted at y = 0. (c) Predicted dry mass of samples (mg) is plotted against measured dry mass (mg); the 1:1 relationship is plotted as a blue line, and points are colored by the number of individual moths contained in the sample. (d) The absolute difference between measured and predicted dry mass of each sample of moths is plotted against measured dry mass (mg); a horizontal line is plotted at y = 0

| DISCUSSION
Findings from our analyses show a strong relationship between forewing length and body mass in moths, which enables prediction (to an informative level of accuracy) of the biomass of samples of moths when such data are not available (e.g., because historical specimens have not been kept). Generating biomass data using this approach will provide an additional tool for the ongoing investigation of the nature and consequences of changes in insect populations (Didham, Basset, et al., 2020; Hallmann et al., 2017; Macgregor, Williams, et al., 2019) using long-term recording datasets. It may also permit the inclusion of estimates of moth body mass in comparative studies and trait-based analyses, despite the general lack of empirical data of this nature (García-Barros, 2015).
In particular, these data will facilitate studies of the relationships between biomass, abundance, and community composition

| Evaluation of the predictive model's current and future utility
Overall, the estimates of body mass calculated using the predictive model's parameters performed relatively well during testing, with ~90% of variation in measured biomass explained by predicted biomass at both species and sample levels, and prediction error decreasing as sample size increased (Appendix S1.5). Therefore, using

TABLE 2 Details of statistical models testing the relationships between measured biomass and estimated biomass at species and sample level for the final model
Note: Relationships were tested using a model II regression, and significance was determined by a one-tailed permutation test with 100 permutations. The R² of each model is also given, alongside the estimated intercept and slope of each model, with associated 95% confidence intervals.

TABLE 3 Details of bootstrap testing (over 10,000 replicates) of statistical models testing the relationships between measured biomass and estimated biomass at species and sample level
Note: Relationships were tested using a model II regression, and significance was determined by a one-tailed permutation test with 100 permutations. The R² of each model was also taken, alongside the 95% confidence intervals for the estimated slope. Here, the number of replicates (/10,000) for which measured and estimated biomass were significantly related is given, as well as the number of replicates for which the 95% confidence intervals for the estimated slope did not contain 1 (i.e., y ≠ x). The mean model R² (and standard error) across all 10,000 replicates is also given. For tests of the slope's relationship to 1, all models were retested with only species weighing >15 mg retained in the testing dataset.
collections provide opportunities for targeted sampling of particular species (accounting for the mass of entomological pins when taking such measurements; Gilbert, 2011). Including data for a wider range of body sizes, and from rarely trapped families (e.g., Sphingidae) or those which have few (e.g., Saturniidae) or no (e.g., Hedylidae) species extant in Britain, would allow model accuracy to be increased by refining parameter estimates for additional families, at subfamily level (to better account for within-family variation in body plan), or potentially through a phylogenetic imputation approach (Penone et al., 2014). Nevertheless, estimates of British moth biomass made using our approach (Macgregor, Williams, et al., 2019) revealed that the three macro-moth families for which we made refined predictions (Erebidae, Geometridae and Noctuidae) account for 93.3% of total biomass, so improving prediction accuracy for other families (which make up only a small proportion of each sample) may only improve sample-level accuracy of the overall model by a correspondingly small amount.
One source of potential error when using published forewing lengths to estimate biomass is that 19% of individuals in our 2018 dataset had a measured forewing length which was outside the expected range given by field guides. In 92% of such cases, the moth was smaller than expected, suggesting a systematic explanation; for example, that forewings shrank slightly during the air-drying process, or that published size ranges are based on measurements of historical specimens but contemporary individuals are now smaller (e.g., due to climate change; Gardner, Peters, Kearney, Joseph, & Heinsohn, 2011). Nevertheless, there was a strong overall correlation (R² = 0.942) between the mean forewing length at species level derived from our 2018 empirical measurements and the midpoint of the range of forewing lengths for each species, taken from the published field guides (Appendix S1.6). This suggests sufficient accuracy in our approach, particularly considering that our largest measured species had a forewing length 571% of that of our smallest species. Similarly, the approaches we took to measuring forewing lengths (i.e., with analogue calipers and a ruler, to the nearest 1 mm) and dry body masses (i.e., after 1 week of air-drying) mean that our dataset may not be fully comparable to datasets collected under other conditions or using other approaches (e.g., using digital calipers with higher resolution to measure forewing length, or measuring dry body mass after oven-drying). However, since all air-drying took place, and all measurements were taken, by the same person in the same laboratory over the same 6-week period (and air-drying for 1 week was shown to be sufficient for the mass of even the largest moths to stabilize: Appendix S1.3), these measurements are adequate to accurately establish the relative relationships between species for both forewing length and dry body mass.
Therefore, our models can also be safely used to estimate relative change in moth biomass over time, or in space, assuming only that the average body mass of each individual species does not substantially change over the same scales.
An additional source of possible error in our models is sexual dimorphism in moths. Some moth species, including some sampled in our study (e.g., Drinker Euthrix potatoria; Lasiocampidae), exhibit substantial sexual dimorphism in wing length (Waring & Townsend, 2017) and in body mass (Allen, Zwaan, & Brakefield, 2011).
However, we did not quantify or adjust for sexual dimorphism in this study because long-term recording schemes rarely include information on sex of individual moths, even for dimorphic species, although the majority of such records are likely to be males (Altermatt, Baumeyer, & Ebert, 2009
Our approach may also be of use for conducting trait-based analyses of moths (e.g., van  there is interfamilial variation in this relationship (Figure 1), which can be incorporated by using our approach. However, an appropriate level of caution is advised before applying our specific estimates of body mass to systems where the moth fauna is markedly different in size, or otherwise distinct, from that used to construct our model (i.e., chiefly night-flying UK macro-moths). For example, studies incorporating records of primarily day-flying families (e.g., Sesiidae), micromoths, or large tropical species should consider carefully whether it would be more appropriate to generate a new, regionally and taxonomically specific predictive model by using this approach.

| CONCLUSIONS
We have developed a predictive model to estimate the dry body mass of moths based on their forewing length, using it to generate body masses for all British species of macro-moth. The predictions of sample biomass made by our model correlated strongly with measured biomass of the same samples (R² = 0.915), indicating that this approach provides a robust way to estimate the biomass of samples of moths identified to species level. Our approach unlocks new opportunities to study trends in moth biomass over time and over large geographic regions.

ACKNOWLEDGMENTS
This study was funded by the Department of Biology, University of York, through an internal grant awarded to C.J.M. and R.S.K. We are grateful to collaborators and staff who have contributed data from the Rothamsted Insect Survey, a BBSRC-supported National Capability. C.D.T., J.K.H., and C.J.M. were additionally supported by the Natural Environment Research Council (grant no. NE/N015797/1).

CONFLICT OF INTEREST
None declared.

OPEN RESEARCH BADGES
This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at https://doi.org/10.5281/zenodo.3786303.

DATA AVAILABILITY STATEMENT
All R scripts and data used in the analysis are archived online at