Applying generalized allometric regressions to predict live body mass of tropical and temperate arthropods

Abstract The ecological implications of body size extend from the biology of individual organisms to ecosystem‐level processes. Measuring body mass for high numbers of invertebrates can be logistically challenging, making length–mass regressions useful for predicting body mass with minimal effort. However, standardized sets of scaling relationships covering a large range in body length, taxonomic groups, and multiple geographical regions are scarce. We collected 6,212 arthropods from 19 higher‐level taxa in both temperate and tropical locations to compile a comprehensive set of linear models relating live body mass to a range of predictor variables. We measured live weight (hereafter, body mass), body length and width of each individual and conducted linear regressions to predict body mass using body length, body width, taxonomic group, and geographic region. Additionally, we quantified prediction discrepancy when using parameters from arthropods of a different geographic region. Incorporating body width into taxon‐ and region‐specific length–mass regressions yielded the highest prediction accuracy for body mass. Using regression parameters from a different geographic region increased prediction discrepancy, causing over‐ or underestimation of body mass depending on geographical origin and whether body width was included. We present a comprehensive range of parameters for predicting arthropod body mass and provide guidance for selecting optimal scaling relationships. Given the importance of body mass for functional invertebrate ecology and the paucity of adequate regressions to predict arthropod body mass from different geographical regions, our study provides a long‐needed resource for quantifying live body mass in invertebrate ecology research.

production, trophic link structure, and species interaction strengths are also related to the body size of constituent individuals and populations (Belgrano, Allen, Enquist, & Gillooly, 2002; Boudreau, Dickie, & Kerr, 1991; Brose, Williams, & Martinez, 2006; Kalinkat et al., 2013; Rall et al., 2012; Riede et al., 2011). As a result, arthropod body size has substantial impacts on the contribution of individuals and communities to ecosystem processes such as decomposition, pollination or pest control, making it a powerful predictor of ecosystem performance.
Most biological rates scale with body size following a power-law relationship (Peters, 1983; White et al., 2007), which has important implications for individual and community ecology. In the early 1930s, Kleiber (1932) proposed an allometric scaling relationship of metabolism with body mass following a ¾ power-law function, though this has been extensively debated (see Brown, Gillooly, Allen, Savage, & West, 2004; Ehnes, Rall, & Brose, 2011; Kolokotrones, Savage, Deeds, & Fontana, 2010). This power-law scaling means that smaller animals have a lower per capita metabolic rate than larger ones, though their mass-specific metabolic rate is higher, yielding distinct patterns of energy demand in populations and communities depending on the relationship between body size and total biomass (Reichle, 1968). Additionally, home and foraging ranges of animals increase with body size, which has been demonstrated for a wide range of organisms, from small invertebrates to large mammals (Greenleaf, Williams, Winfree, & Kremen, 2007; Jetz, Carbone, Fulford, & Brown, 2004; Lindstedt, Miller, & Buskirk, 1986; Swihart, Slade, & Bergstrom, 1988). Due to the allometric scaling of a broad range of physiological and ecological properties, researchers can use general scaling relationships to predict ecological properties from measured values of organism body size (Savage, Deeds, & Fontana, 2008).
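The contrast between per capita and mass-specific metabolic rate under a ¾ power law can be illustrated with a minimal sketch. The normalization constant `b0`, the arbitrary rate units, and the two example body masses are assumptions chosen for illustration only; they are not values from this study.

```python
def metabolic_rate(mass, b0=1.0, exponent=0.75):
    """Per capita metabolic rate B = b0 * M**exponent (arbitrary units).

    b0 is a hypothetical normalization constant; the exponent of 0.75
    follows the 3/4 power law discussed in the text.
    """
    return b0 * mass ** exponent


def mass_specific_rate(mass, b0=1.0, exponent=0.75):
    """Metabolic rate per unit body mass, B / M = b0 * M**(exponent - 1)."""
    return metabolic_rate(mass, b0, exponent) / mass


# Two hypothetical body masses (mg), chosen only for illustration.
for mass in (1.0, 16.0):
    print(mass, metabolic_rate(mass), mass_specific_rate(mass))
```

With any exponent below 1, the larger animal has the higher per capita rate while the smaller animal has the higher rate per unit mass, which is the pattern described above.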
While body size is highly useful as a predictive trait for many ecosystem processes, measurement of individual arthropod body masses from community samples is particularly challenging due to their small body size and typically high abundance. As a consequence, researchers might measure only a few individuals of each species and apply an average of these values to all individuals of that species. This practice disregards intraspecific variation that occurs among sampling sites, especially when the sites are distributed along ecological gradients that affect body size (Violle et al., 2012).
In order to solve this issue and adequately account for intraspecific variation, the measurement of arthropod body size would have to be simple enough to allow for the processing of high individual numbers. However, in extensive field sampling campaigns, collecting individual body mass data across all samples is often infeasible due to the logistic difficulties of weighing large numbers of individual organisms. Additionally, many ecological disciplines typically require data on live rather than dry body mass to relate body size to a range of ecological attributes. For example, studies investigating arthropod metabolism (Ehnes et al., 2011; Meehan, 2006), interaction strengths and the dimensionality of consumer search space (Pawar, Dell, & Savage, 2012; Vucic-Pestic, Rall, Kalinkat, & Brose, 2010), movement (Hirt, Lauermann, Brose, Noldus, & Dell, 2017) and size-abundance relationships (Chown & Steenkamp, 1996; Gouws, Gaston, & Chown, 2011) typically rely on live body mass of organisms. However, dry body mass estimates are more frequently available for arthropods because of the difficulty of accurately measuring their live body mass. This limitation calls for the provision of practical and accurate tools to acquire individual-level, live arthropod body mass data in order to assess population and community responses in arthropod size structure and investigate corresponding ecosystem processes. Different approaches to indirectly assess body mass have been proposed in the literature. Among others, these include quantitative magnetic resonance (O'Regan, Guglielmo, & Taylor, 2012), clay-modeling, image analysis or geometric approximation (Llopis-Belenguer, Blasco-Costa, & Balbuena, 2018). While these are powerful methods for low sample sizes, they are infeasible for obtaining individual body masses from high-abundance samples containing organisms from many taxonomic groups. In such cases, length-mass regressions provide an optimal alternative.
Length-mass regressions have proven to be a powerful tool to predict body mass based on body length measurements (Benke, Huryn, Smock, & Wallace, 1999; Gruner, 2003; Johnston & Cunjak, 1999; Rogers, Buschbom, & Watson, 1977; Schoener, 1980; Wardhaugh, 2013), which are, in some cases, easier to obtain than direct measurements of body mass. For living or particularly small organisms, direct measurement of body mass can be difficult and time-consuming. The length-mass regression approach relies on regression parameters estimated for length-mass relationships. However, finding suitable regression parameters for a given taxon from a specific geographic region is often not possible. This limitation can be problematic because scaling relationships (and thus their regression parameters) are likely to vary substantially among taxonomic groups and geographic regions, an aspect that has been shown to be especially distinct between tropical and temperate regions (Gruner, 2003; Schoener, 1980; Wardhaugh, 2013). Thus, using length-mass regression parameters from a different geographical region is likely to increase the discrepancy in predictions of body mass. Finally, datasets of length-mass regressions available in the literature are often based on dry body mass measurements. Therefore, researchers requiring live body mass estimates are typically constrained to using rough conversion factors (Peters, 1983) or more elaborate dry mass-fresh mass regressions (e.g., Mercer, Gabriel, Barendse, Marshall, & Chown, 2001), which add further discrepancy to body mass predictions due to the very same sources of variation in length-mass scaling relationships (geographic origin, taxon-specificity, etc.).
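As an illustration of how published regression parameters are typically applied, the following sketch predicts body mass from body length using a log10-log10 relationship. The intercept and slope are hypothetical placeholders rather than parameters estimated in this study; in practice they would be taken from a table of taxon- and region-specific parameters.

```python
import math


def predict_live_mass(length_mm, intercept, slope):
    """Back-transform log10(M) = intercept + slope * log10(L) to mass in mg."""
    return 10 ** (intercept + slope * math.log10(length_mm))


# Hypothetical regression parameters, for illustration only.
INTERCEPT, SLOPE = -1.0, 2.5

for length_mm in (2.0, 5.0, 10.0):
    print(length_mm, round(predict_live_mass(length_mm, INTERCEPT, SLOPE), 3))
```

Because the prediction is back-transformed from the log scale, the estimated mass is always positive, whatever the parameter values.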
Considering the broad application of live body size data in ecological research, there are surprisingly few studies that provide length-body mass regression parameters for terrestrial arthropods, and most studies are restricted to one of either temperate or tropical animals, or to only a few taxonomic groups (Benke et al., 1999; Burgherr & Meyer, 1997; Gruner, 2003; Mercer et al., 2001; Schoener, 1980; Wardhaugh, 2013).
In this paper, we provide an unprecedented dataset of length-mass scaling relationships based on measurements of live body mass and body length of 6,212 terrestrial arthropods from both tropical and temperate geographical regions. We performed length-mass regressions for arthropods, including various combinations of body width, taxonomic group and geographic origin as additional covariables, and compared the accuracy in predicting body mass among these various models. We hypothesized that prediction accuracy improves with an increasing number of additional predictors (e.g., including body width, taxonomic group, and geographic region), as opposed to using body length as the sole predictor of body mass.
Additionally, we expected a higher prediction accuracy when using regression parameters taken from the same geographic region, as opposed to using regression parameters of arthropods from a different geographic region (hereafter, geographically disjunct regression parameters). Our study thus provides a generalized resource for predicting live body mass across an unprecedented range of terrestrial arthropod groups (including 19 orders of Arachnida, Myriapoda, Crustacea, and Insecta), as well as guidance for deciding which scaling relationships to use for predicting arthropod body mass depending on the dataset at hand.

Study sites and sampling techniques
To account for different scaling relationships in temperate versus tropical geographical regions, we chose two sampling locations: one temperate location in Germany and one tropical location in Indonesia. Temperate sites were located near Göttingen, Germany (51°32′02″N, 09°56′08″E) at an altitude of around 150 m asl, with a mean annual air temperature of 7.4°C, mean annual precipitation of 700 mm (Heinrichs, Winterhoff, & Schmidt, 2014) and a vegetation growth period from May to September. Tropical sites were located near Jambi City in Sumatra, Indonesia (1°35′24″S 103°36′36″E), at an altitude around 20 m asl. Jambi City has a mean annual air temperature of 25°C and a mean annual precipitation of 2,100 to 2,800 mm (Ishizuka, Tsuruta, & Murdiyarso, 2002). The sampling sites in both regions included wayside vegetation, open grassland areas, and forest strips. Sampling sites were chosen due to their proximity to the laboratory in both regions to ensure a fast and simple workflow, since animals had to be kept alive after collection and living animals could not be stored for more than 8 hr to avoid increased body mass loss.
The temperate organisms were collected in June, July, and August 2014 and the tropical organisms were collected in October, November, and December 2014. Three standard sampling techniques were used in order to cover a broad variety of arthropod taxa and to achieve a sufficient overlap of taxonomic groups from both sampling regions. For active and fast-moving ground animals, as well as nocturnal species, live pitfall traps (diameter of 11 cm and height of 12 cm) were used within forest and grassland sites. Pitfall traps were closed with a funnel-shaped lid to prevent animals from escaping. Pitfall traps were buried so the opening of the pitfall was flush with the surface of the ground. They were installed in the morning and animals were collected after 24 hr to avoid loss of individuals due to predation, drowning, or desiccation. Sweep nets were used in open grassland and wayside vegetation plots to collect animals from within low vegetation, shrubs and small trees to sample stationary, as well as fast-moving and flying animals. At the forest sites, less mobile animals from within the litter layer were collected via leaf litter sieving. Material from the loose leaf litter (F-Layer) on top of the humus layer was collected and sieved through a coarse-meshed grid (2 × 2 cm). Animals that fell through the mesh were hand-collected from a collecting tray and stored in individual vials for further processing.

Morphological measurements and data collection
Arthropods were stored in a refrigerator at 10°C for a maximum of 8 hr after collection to slow down their metabolism and reduce body mass loss. As the goal of our study was to provide length-live body

Statistical analysis
All statistical analyses were performed using R Version 3.4.0 (R Core Team, 2015). All larvae and taxa without width measurements were excluded from the main analysis. We present length-mass regressions for these excluded taxonomic groups, along with a range of behavioral, morphological, or taxonomic groups of specific taxa in the Supporting Information (Table S1). Specifically, subgroup regressions are presented for web building and hunting spiders (Araneae), Brachycera, and Nematocera (Diptera), Staphylinidae, beetle larvae and all other beetles aside from larvae and Staphylinidae (Coleoptera), Heteroptera, and all Hemiptera without Heteroptera, larvae of Lepidoptera, Glomerida (only length measurement), Julida (only length measurement) and five morphological subgroups of Hymenoptera.
We log10-transformed body mass, body length, and body width to ensure normality of the data and to prevent negative model predictions.
Using generalized linear models, we tested the relationship between body mass and length (L), including width (W) and two other covariables. As there was no full-factorial design for the two factorial independent variables taxonomic group (T) and geographic region (R) (because not all taxa were found in both regions), we created a factorial variable combining these two predictors ("TaxReg"). Note that this implies an interaction between taxonomic group and geographical region that we cannot resolve as long as we use the full dataset. Our most complex model included body length and width (as additive terms, i.e., a multiple regression), the factorial variable "TaxReg" and the interactions between each of the two continuous variables and the combined factorial variable (model LWTR). All other models were constructed by reducing the complexity of this overall model by removing independent variables and providing all combinations of these under the above-described constraints for interactions. Some of the models include only taxonomic group (T), only region (R), or none of the factorial variables (see Table 1 for the eight models tested and Supporting Information Methods S1 for a worked example of body mass predictions using each model type). Model fits were compared using the Bayesian information criterion (BIC) and prediction errors obtained through leave-one-out cross-validation (LOOCV) from the R package "boot" (Canty & Ripley, 2017). R² values were calculated using the "rsq"-function from the R package "rsq".
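The model comparison described above can be sketched in a few lines. The example below is not the R workflow used in this study; it is a minimal, standard-library Python illustration that fits a length-only model and a length-plus-width model on the log10 scale by ordinary least squares and compares them by BIC and leave-one-out cross-validation. The specimen data, the "true" coefficients used to generate them, and all function names are hypothetical. The BIC is computed up to terms shared by all models, and the cross-validation error is taken as the mean squared prediction error on the log10 scale.

```python
import math


def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    return sum(c * v for c, v in zip(coef, row))


def bic(rows, y):
    """Gaussian BIC, n*ln(RSS/n) + k*ln(n), omitting terms common to all models."""
    coef = fit_ols(rows, y)
    n = len(y)
    rss = sum((yi - predict(coef, r)) ** 2 for r, yi in zip(rows, y))
    return n * math.log(rss / n) + len(coef) * math.log(n)


def loocv_mse(rows, y):
    """Leave-one-out cross-validation: refit without specimen i, then predict it."""
    errors = []
    for i in range(len(y)):
        coef = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        errors.append((y[i] - predict(coef, rows[i])) ** 2)
    return sum(errors) / len(errors)


# Hypothetical specimens: length (mm), width (mm), and a small residual term.
lengths = [2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 12.0]
widths = [1.0, 0.8, 2.0, 1.2, 3.0, 2.0, 5.0, 2.5]
noise = [0.02, -0.03, 0.01, -0.01, 0.03, -0.02, 0.01, -0.01]
log_mass = [-0.5 + 1.0 * math.log10(l) + 1.5 * math.log10(w) + e
            for l, w, e in zip(lengths, widths, noise)]

rows_l = [[1.0, math.log10(l)] for l in lengths]  # length-only model
rows_lw = [[1.0, math.log10(l), math.log10(w)]    # length + width model
           for l, w in zip(lengths, widths)]

for name, rows in (("L", rows_l), ("LW", rows_lw)):
    print(name, round(bic(rows, log_mass), 2), round(loocv_mse(rows, log_mass), 4))
```

In this constructed example, width carries information that length does not, so the length-plus-width model has both the lower BIC and the lower cross-validation error. Models with a taxon or region factor would add indicator columns, and their interactions with length and width, to the same design rows.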
We hypothesized that using regression parameters from different geographic regions likely increases discrepancy in predictions of arthropod body mass. In order to assess this prediction discrepancy, we quantified the proportional difference between predicted and observed body mass using geographically nondisjunct and geographically disjunct regression parameters (i.e., where regression parameters obtained from one geographic region are used to predict body mass of arthropods in a different geographic region) for the two all-taxa models (models LWR and LR). Specifically, we calculated body mass prediction discrepancy of regression parameters as the log response ratio

$$\Delta = \log_{10}\left(\frac{m_{\mathrm{pred}}}{m_{\mathrm{obs}}}\right)$$

where $\Delta$ is the prediction discrepancy, $m_{\mathrm{pred}}$ is the predicted body mass using length-mass regressions and $m_{\mathrm{obs}}$ is observed body mass (obtained by weighing organisms). We then assessed how prediction accuracy varied across the range of body length to ascertain if there might be systematic error in body mass predictions depending on arthropod body size. We applied geographically disjunct and nondisjunct regressions separately to all temperate and all tropical body lengths. Subsequently, we divided predicted by observed body mass values and calculated the decadic logarithm of this ratio. Zero discrepancy thus means that predicted and observed body masses are identical. Positive discrepancy means that predicted body masses are higher than observed body masses and negative discrepancy means that predicted body masses are lower than observed body masses. Given that we use all temperate and tropical body lengths for obtaining the model regressions in the first place and for testing them here, the calculated discrepancy patterns will be symmetrical between temperate and tropical data. For further detail on the calculation and interpretation of prediction accuracy, please refer to Supporting Information Figure S1.

RESULTS
The most complex model (LWTR) best explained variation in body mass according to BIC selection, R² values and cross-validation prediction errors (Table 1). We found consistently positive slopes of body mass in relation to body length across our combined factorial variable "TaxReg" including taxonomic groups and regions (Table 3, Figure 1). Thus, the slope of the length-mass relationship varied with body width, taxonomic group and geographic region (e.g., the slope of the length-mass relationship differed between spiders and beetles as well as between temperate and tropical spiders).
The eight different models explained between 81.8% (model L, least complex model) and 97.2% (model LWTR, most complex model) of the total variance in body mass (Table 1). According to BIC, R² and the cross-validation comparisons, the four models that included body width as a covariate explained more variation in body mass than models that only included body length as a predictor (Table 1).
Finally, to test whether the application of geographically disjunct regression parameters increases discrepancy in body mass predictions, we calculated body mass using geographically disjunct and geographically nondisjunct regression parameters and quantified the difference to observed body mass. We found that the application of geographically disjunct parameters for whole-fauna regressions led to increased prediction discrepancy of body mass when compared to using nondisjunct regression parameters (Figure 2).

TABLE 3 Regression parameters for the eight linear models for live body mass prediction as a function of body length (L, in mm), maximum body width (W, in mm), taxonomic group (T), and geographic region (R, temperate and tropical). The asterisks indicate significance levels of the regression parameters (*** indicates p-value <0.001; ** indicates p-value <0.01; * indicates p-value <0.05).

SOHLSTRÖM ET AL.

Hirt, Jetz, et al., 2017; White et al., 2007). In order to make realistic predictions of these measures, it is essential to have reliable body mass data of target organisms. In our dataset consisting of 6,212 organisms spanning 19 taxa from both tropical and temperate geographic regions, we found an overall positive power-law relationship between body mass and body length across taxonomic groups and the tropical and temperate geographic regions. The only exception to this universal trend was for temperate Neuroptera, which showed a negative relationship between body mass and body length in models that also included body width. A decoupling of length and width through a combination of morphologically distinct individuals and low replication in this group likely caused an average increase in body length without a proportional average increase in body width, resulting in long but thin organisms.

Generally, adding body width as an additional morphological predictor strongly improved body mass prediction accuracy. This increase in model performance is probably due to certain groups where the body length-to-width ratio is considerably different from the average of all taxonomic groups (e.g., Staphylinid beetles have a higher body length-to-width ratio than other beetle families). Thus, using body length as the only predictor of body mass at the order level is almost certainly insufficient to capture the morphological variation present within taxonomic groups. Therefore, we expected that the incorporation of body width as an additional predictor in our models should increase the accuracy of body mass predictions at the order level. Consistent with our expectations, we found that including body width in the estimation of body mass resulted in a strong improvement of prediction accuracy in comparison to using body length alone as a predictor of body mass. Moreover, incorporating only body width as an additional predictor yielded higher prediction accuracy than incorporating taxonomic group and geographic region into the models.

FIGURE 1 Length-mass regressions of the best fit model, which included body length, maximum body width, taxonomy, and geographic region (LWTR) to predict body mass for the ten most abundant arthropod groups from the temperate (blue) and tropical (red) study areas. The y-axis displays partial residuals and, therefore, shows the effect of body length after correcting for the other variables. Axes: log10 body length (mm) on the x-axis; log10 body mass (mg), partial residuals, on the y-axis.
Body mass is related to the volume of an organism, which can be described by length, width and height. Hence, adding height to predict body mass could lead to more accurate body mass estimations than using only body length and width. Measuring another morphological trait of an organism, however, increases the time needed for processing samples, presenting a trade-off between maximizing prediction accuracy and minimizing time spent measuring traits. As more than 97% of the variance in body mass was described by length, width, taxonomic group, and geographic region, the benefit of adding body height would be unlikely to outweigh the added workload. Indeed, previous studies have shown that including body shape (i.e., body length and width) instead of taxonomy leads to more accurate body mass estimates at the order level, but not at higher taxonomic resolution (Gruner, 2003; Wardhaugh, 2013). Our results strongly support the finding that the accuracy in predicting body mass improves when further volume-related morphological traits are added to body length in scaling relationships at the order level.
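The volume argument can be made explicit with a short, idealized derivation. It assumes a constant tissue density $\rho$, a constant shape factor $k$, and a body height $H$ proportional to width $W$ (with proportionality constant $c$); none of these quantities were measured in this study, and real arthropods will deviate from them.

```latex
\begin{align}
  M &= \rho V \approx \rho\, k\, L\, W\, H
      && \text{(mass as density times approximate volume)} \\
  H &= c\, W
      && \text{(assumed: height proportional to width)} \\
  \log_{10} M &\approx \log_{10}(\rho k c) + \log_{10} L + 2 \log_{10} W
      && \text{(linear on the log scale)}
\end{align}
```

Under these assumptions, body mass is linear in log10 length and log10 width, which is the additive form used in the models that include width. In that idealized case, width would already stand in for height, and fitted coefficients that depart from 1 and 2 would reflect departures from the assumed geometry.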
Besides body width, taxonomic group and geographic origin of the arthropods also influenced the relationship between body length and body mass. This interaction is likely because variation in arthropod body size is influenced by a range of other factors such as evolutionary history and environmental variation (Chown & Gaston, 2010). For example, Bergmann's rule proposes that body size increases with latitude, though the opposite has been observed for arthropods (Mousseau, 1997). In general, these concepts suggest that the body size of arthropods depends strongly on their geographic origin, particularly with respect to latitude. Therefore, we expected that the application of geographically disjunct regression parameters from tropical and temperate regions could lead to significant prediction discrepancy in arthropod body mass. If researchers are unable to use regression parameters from data collected in a similar geographic region to their study site (due to a lack of available scaling relationships), this could have important consequences for the body mass-related results drawn from their studies.

FIGURE 2 Prediction discrepancy (log response ratio of predicted vs. observed body mass values) for temperate (blue datapoints, panels a and c) and tropical (red datapoints, panels b and d) arthropod body mass obtained by using geographically disjunct (light-blue crosses and light-red crosses) and nondisjunct (dark-blue points and dark-red points) regression parameters for the LR (a and b) and LWR (c and d) models. LR = length × region and LWR = (length + width) × region models. The lines show the linear model of the log response ratio of predicted and observed body mass values against body length, using geographically disjunct (dashed lines) and geographically nondisjunct (solid lines) regression parameters. For further explanation of the presented patterns, please refer to Supporting Information Figure S1. Axes: log10 body length (mm) on the x-axis; prediction discrepancy, log10(pred. BM/obs. BM), on the y-axis.
Consistent with our expectations, we found that the use of geographically disjunct length-mass regression parameters led to inaccurate body mass predictions, with average prediction discrepancies ranging from 8% to 23% depending on the model used. The patterns presented in Figure 2 are caused by the underlying differences between the temperate and tropical length-mass relationships of our study (Supporting Information Figure S1). For a given body length, temperate arthropods had on average higher body mass than tropical arthropods in our dataset. The difference between the relationships increased with body length (see Supporting Information Figure S1 for further explanation). Consequently, when only body length was used as a morphological predictor, body mass prediction discrepancy of geographically disjunct regressions increased with increasing body length of arthropods.
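The way a difference in slopes translates into a length-dependent discrepancy can be sketched as follows. The two parameter pairs are hypothetical and were chosen only so that the "temperate" line predicts a higher mass at a given length and diverges from the "tropical" line as length increases, as described above. The observed mass is idealized as lying exactly on the tropical line.

```python
import math


def predict_mass(length_mm, intercept, slope):
    """Back-transformed log10-log10 length-mass prediction (mg)."""
    return 10 ** (intercept + slope * math.log10(length_mm))


def prediction_discrepancy(m_pred, m_obs):
    """Log response ratio log10(m_pred / m_obs); zero means a perfect match."""
    return math.log10(m_pred / m_obs)


def percent_deviation(delta):
    """Convert a log10 response ratio into a percentage over- or underestimate."""
    return (10 ** delta - 1) * 100


# Hypothetical (intercept, slope) pairs, for illustration only.
TEMPERATE = (-1.0, 2.6)
TROPICAL = (-1.0, 2.5)


def disjunct_discrepancy(length_mm):
    """Discrepancy when temperate parameters are applied to a tropical specimen."""
    m_obs = predict_mass(length_mm, *TROPICAL)    # idealized observed mass
    m_pred = predict_mass(length_mm, *TEMPERATE)  # geographically disjunct prediction
    return prediction_discrepancy(m_pred, m_obs)


for length_mm in (2.0, 10.0, 30.0):
    delta = disjunct_discrepancy(length_mm)
    print(length_mm, round(delta, 3), round(percent_deviation(delta), 1))
```

In this sketch the discrepancy is positive (overestimation) and grows with body length. Swapping the two parameter sets reverses the sign, which mirrors the symmetry between temperate and tropical data noted in the methods.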
This has important consequences for the quality of body mass data, as our results suggest that body mass of longer arthropods will be more severely over- or underestimated than that of shorter arthropods. Therefore, our results highlight a potential systematic bias of decreasing prediction accuracy with increasing body length when applying regression parameters from different geographical regions. Ultimately, studies investigating body size responses to environmental conditions and the resulting impacts on ecosystem functioning rely on accurate calculations of body mass. Therefore, it is essential for such studies to use length-mass regression parameters that are obtained from similar geographic origins as the organisms for which body mass is being predicted.
In addition to the potential prediction discrepancy caused by using geographically disjunct regression parameters, using length-mass regressions obtained from organisms collected in a different season might reduce the accuracy of body mass estimations. It has been demonstrated that body size can vary across seasons as a consequence of temperature variation (Horne, Hirst, & Atkinson, 2017).
Hence, our temperate length-mass regressions will likely be most accurate when used for organisms collected during the main vegetation growth period. Furthermore, some animals collected from pitfall traps may have been captured directly after the traps were set and could, therefore, have either starved for up to 32 hr or larger predators could potentially have fed on smaller organisms and temporarily increased their body mass. However, only 421 organisms were captured using pitfall traps, while the majority of arthropods (5,700 organisms) were captured using litter sieving and sweep nets.
Moreover, given that studies typically require body mass data for temperate arthropods during their active period and that the only exception to the general positive power-law relationships was found for a low-replication taxon in one geographical region, our dataset of length-mass regressions still provides a useful and robust tool to estimate arthropod body mass.
Our study provides a highly comprehensive set of regression parameters for predicting live body mass of terrestrial arthropods. This set of regression parameters is useful for researchers wishing to quantify body mass of arthropods across a range of underlying morphological traits, taxonomic identities, and different geographical regions. By incorporating all combinations of geographic region, taxonomic group and body width in our allometric models, our results allow investigators to choose length-mass regression parameters for predicting body mass across a broad variety of arthropod datasets. Additionally, we provide an explicit estimation of the prediction discrepancy caused by using geographically disjunct regression parameters, to assist in deciding which regression parameters will be the most appropriate for predicting arthropod body mass for a given dataset. In summary, our results will aid future studies in accurately assessing body mass of arthropods, thus increasing our ability to further explore the ecological implications of body size.

ACKNOWLEDGMENTS
This study was financed by the Deutsche Forschungsgemeinschaft

CONFLICT OF INTERESTS
The authors declare no competing interests.

AUTHOR CONTRIBUTIONS
LM, ADB, UB, and MJ conceived and designed the study, EHS, LM and MJ carried out the field and laboratory work, EHS and BCR analyzed the data, and all authors interpreted the results. EHS and LM wrote the first draft and all authors contributed substantially to the writing.