E. Garcia-Barros (firstname.lastname@example.org) and H. Romo Benito, Dept of Biology, Univ. Autónoma de Madrid, ES-28049 Madrid, Spain.
The geographic range of a species is influenced by past phylogenetic and biogeographic patterns. However, other historical interactions, including the interplay between life history and geography, are also likely involved. Therefore, the range size of a species can be explained on the basis of niche-breadth or dispersal-related hypotheses, and previous work on European butterflies suggests that both, under the respective guises of ecological specialisation and colonising ability, may apply. In the present study, data from 205 species of butterflies from the Iberian Peninsula were processed through multiple regression analyses to test for correlations between geographic range size, life history traits and geographic features of the species distribution types. In addition, the percentage of variance explained by the subsets of variables analyzed in the study, with and without control for phylogenetic effects, was tested. Despite a complex pattern of bivariate correlations, we found that larval polyphagy was the single best correlate of range size, followed by dispersal. Models that combined both life history traits and geographic characteristics performed better than models generated independently. The combined variables explained at least 39% of the variance. Bivariate correlations between range size and body size, migratory habits or egg size primarily reflected taxonomic patterning and reciprocal correlations with larval diet breadth and adult phenology. Therefore, aspects of niche breadth (i.e. potential larval diet breadth) emerged as the most influential determinants of range size. However, the relationships between these types of ecological traits and biogeographic history must still be considered when associations between life history and range size are of interest.
Large-scale inter-specific analyses of present species ranges are often hampered by three factors: 1) phylogenetic patterning; 2) geographic turnover in species life histories; and 3) the biogeographic history underlying present ranges. Taxonomic variability and phylogenetic relationships across a species range are routinely addressed (Carrascal et al. 2008, Gove et al. 2009, Calosi et al. 2010); however, geographic turnover and biogeographic history remain largely unresolved. First, species life histories exhibit geographic variability across a species range. Consequently, analyses of “mean” trait values measured across geographic gradients may suggest a misleading niche breadth-based explanation (Gaston et al. 2007). A provisional solution is to focus on intermediate scale patterns. Second, evidence for hidden historical causes in observable ranges can be tested by comparing apparent relationships between range size and ecological factors with simple biogeographic features of the species range, such as the chorotype and range position (as shown by recent work on plant distributions; Weiser et al. 2007, Gove et al. 2009).
Butterflies are highly sensitive to environmental changes, particularly changes impacting vegetation, because butterfly larvae are specialised herbivores. Several Lepidopteran life history traits display notable interspecific variation, as well as intraspecific differences along geographic gradients (Nylin 2009). Studies assessing butterfly range size and ecology across species have been reported in temperate areas, most notably the British Isles (Hodgson 1993, Quinn et al. 1998, Dennis et al. 2000, 2004, 2005, Cowley et al. 2001a, b), among others (Hughes 2000, Komonen et al. 2004). Dennis et al. (2004) provide evidence that range size is correlated with population density (depending on the geographic scale; Hughes 2000), development time, number of broods per year, adult mobility, and resource use specificity. Furthermore, host plant type is correlated with phenology (Cizek et al. 2006), and adult mobility with larval polyphagy, resource availability and range position (Komonen et al. 2004). However, Dennis et al. (2005) point out that range size and larval host range are associated because of a reciprocal dependence on other life history and resource variables. Therefore, the available evidence suggests a complex pattern of interrelated factors, and those compatible with a dispersal- or niche-related explanation of range size are most relevant.
The role of history in the evolution of the ecological links within the land biotas remains untested, and evidence suggests that the geographic structure of the West-Palaearctic butterfly fauna has been strongly modelled by postglacial events (Dennis et al. 1991, 1998, Schmitt 2007). The present study served to determine how the integration of data on species ranges and ecology might modify former explanations of range size in these insects (Dennis et al. 2004, 2005). In addition, we asked to what degree the correlations among variables were supported by data derived from faunal regions of different climate and physiography. To answer these questions, we tested the relationships between life histories and geographic ranges of Iberian butterflies. Obscure historical patterns, which might be identifiable on a topological basis, were addressed by analyzing relationships between range size and variables that describe species range (with and without controlling for phylogenetic relatedness). We subsequently compared the explanatory power of each of the two subsets of variables (life history and geography), and tested for a relationship between subsets in terms of their shared variance.
Species and range sizes
The data matrix comprised 205 butterfly species (superfamilies Hesperioidea and Papilionoidea) from the Iberian Peninsula (Iberia, the continental territories of Portugal and Spain). Non-native, long-distance migrant species (such as monarch butterflies, Danaus spp.), and species with missing life history data were excluded. Each species range (AREA, the dependent variable) was measured as the number of 10 km square cells (UTM military grid) occupied (data from García-Barros et al. 2004, updated for this study by the authors). Area coverage at this grid size does not adequately represent the region, and sampling was concentrated on the most diverse areas. Consequently, species occupancies may be underestimated for widespread and overestimated for rare species (Romo and García-Barros 2005, Romo et al. 2006).
Life history variables
Variable selection and measurements were chosen based on the available study material and literature. Care was taken to obtain only information specific to the study area (sources in García-Barros et al. 2004, with information from the French Pyrenees from Lafranchis 2000 incorporated when required), with the exception of relative egg size (as detailed below). Life history variables were divided into three subsets: adult features (size, migratory habits and fecundity); phenology; and larval features (larval host specificity, gregariousness and ant relationships). Multistate categorical factors were recoded as dummy binary variables to facilitate the identification of the precise nature of any significant effects. The variables and abbreviations used throughout the remaining text are as follows.
AWL: adult size (forewing length in mm, log-transformed), estimated as the median of male and female mean wing lengths from sample cabinet specimens (n>20 000: García-Barros unpubl.). When <10 individuals of each sex were available, measurements from the literature were obtained and included in the data set (Manley and Allcard 1970, Fernández-Rubio 1991, Maravalhas 2003). ASXD: sexual dimorphism in adult size range, residuals from a linear regression of female size onto male size, both log-transformed (female AWL=−0.022+1.031×male AWL; R=0.988; p<0.00001). MIGR: migratory habits, where the species were classified as “sedentary” or “migratory” (including eumigrants, intra-regional migrants and long-distance dispersers; Templado 1976, Eitschberger et al. 1991). EGGS: relative egg size, estimated as the residuals from a regression of log-transformed egg volume (from García-Barros 2000b) on AWL (R=0.628, p<0.00001 using raw data; and R=0.428, p<0.00001 using independent contrasts, see below). Estimated egg size does not always include Iberian samples (consequently, a combination with local adult sizes resulted in some level of measurement error), and was available from only 154 species. Therefore, it was necessary to duplicate some analyses, as detailed below.
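As an illustration of how ASXD can be read, the following minimal sketch computes the residual of female size from the regression line reported above. It assumes base-10 logarithms and uses the published intercept and slope as fixed constants; the function name `asxd` and the example wing lengths are hypothetical.

```python
import math

# Coefficients of the regression of log female size on log male size,
# as reported in the text (female AWL = -0.022 + 1.031 x male AWL).
INTERCEPT = -0.022
SLOPE = 1.031


def asxd(female_wing_mm: float, male_wing_mm: float) -> float:
    """Residual of log10 female wing length from the reported regression line.

    Assumption: both sizes are log10-transformed before the residual is taken.
    """
    log_female = math.log10(female_wing_mm)
    log_male = math.log10(male_wing_mm)
    predicted_female = INTERCEPT + SLOPE * log_male
    return log_female - predicted_female


# Hypothetical species with 20 mm females and 18 mm males.
example_residual = asxd(20.0, 18.0)
```

Positive values indicate females larger than expected from male size, and negative values the opposite.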
VOLT: voltinism pattern (one life-cycle per year, or more than one in at least part of the territory). NMON: number of months in which the adults were recorded (95% confidence limits). MMON: mean month of adult occurrence (with months scored as 1–12). SDMN: standard deviation of MMON. The last three variables (NMON, MMON and SDMN) were calculated from pooled historical records with reliable dating (data from the authors' database, details in García-Barros et al. 2004). Four binary variables were used to code for the overwintering stage: OWSE (egg), OWSL (larva), OWSP (pupa) and OWSA (adult).
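The phenology descriptors and the dummy coding of the overwintering stage can be sketched as follows. This is an assumption-laden illustration: SDMN is taken here as the population standard deviation of the record months, NMON is omitted because its confidence-limit rule is not specified in the text, and the function names and stage labels are hypothetical.

```python
from statistics import mean, pstdev

# Mapping from overwintering stage to the binary variable named in the text.
STAGE_CODES = {"egg": "OWSE", "larva": "OWSL", "pupa": "OWSP", "adult": "OWSA"}


def phenology_summary(record_months):
    """Return MMON and SDMN from a list of record months scored 1-12.

    Assumption: SDMN is the population standard deviation of the months.
    """
    return {"MMON": mean(record_months), "SDMN": pstdev(record_months)}


def overwintering_dummies(stages):
    """Recode a set of overwintering stages as four 0/1 dummy variables."""
    return {code: int(stage in stages) for stage, code in STAGE_CODES.items()}


# Hypothetical species recorded in June, July and August, overwintering as a larva.
summary = phenology_summary([6, 7, 8])
dummies = overwintering_dummies({"larva"})
```

Coding each stage separately, rather than as one multistate factor, lets a species that overwinters in more than one stage score 1 on several variables.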
Larval, and larval-host features
LPDB (larval diet breadth) was measured as the number of plant families (F), genera (G) and species (S) used as larval hosts using the formula LPDB = (S × G × F^a)^(1/3), where a = (G + F)/(2S). The index assumes higher values for any number of hosts when the hosts represent a greater number of genera and families (i.e. increased LPDB); LPDB was better correlated to range size (R=0.62) than either the number of host species, genera or families (respectively R=0.589, R=0.567 and R=0.316; p<0.001). LHPM: coarse taxonomy of the larval hosts (monocotyledons or dicotyledons). Plant structure was broadly described by two binary variables, LHTW (woody plants) and LHTH (herbaceous plants), and plant parts eaten were described as LHPL (leaves), and LHPF (flowers, fruits or buds). Larval gregariousness was measured by coding the egg-laying mode of each species (EGGB: eggs laid singly, or arranged in egg clusters), since the habit of laying the eggs in clusters is almost universally associated with gregarious larval habits.
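A minimal sketch of the diet breadth index follows. It assumes the formula is read as the cube root of S × G × F raised to the power a, with a = (G + F)/(2S); the function name `lpdb` and the host counts in the example are hypothetical.

```python
def lpdb(species: int, genera: int, families: int) -> float:
    """Larval diet breadth from counts of host species, genera and families.

    Assumption: LPDB = (S * G * F**a) ** (1/3), with a = (G + F) / (2 * S).
    """
    if not (species >= genera >= families >= 1):
        raise ValueError("expected species >= genera >= families >= 1")
    exponent_a = (genera + families) / (2 * species)
    return (species * genera * families ** exponent_a) ** (1 / 3)


# Eight host species in one genus versus eight host species
# spread over four genera and two families (hypothetical counts).
narrow = lpdb(8, 1, 1)
broad = lpdb(8, 4, 2)
```

For the same number of host species, the index increases when the hosts are spread over more genera and families, which is the behaviour described in the text.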
The larvae of some butterflies (namely, the Lycaenidae) may show symbiotic or parasitic relationships with specific ants, whose presence may condition the success of the butterfly population. The strength of myrmecophily was described by three binary factors (adapted from Fiedler 1991): weak (including facultative myrmecophily, MIRW), strong (MIRS) and obligate (MIRO). The three variables were also combined as a single semi-quantitative variable (MIRX, with values from 0= no relationship with ants to 4=obligate ant-dependence), as well as an all-or-nothing binary variable (MIRM, where “1” means any level of association and “0” no association).
Geographic variables (species range attributes)
The X and Y UTM coordinates of each species' data subset were used to estimate LATM (mean Y value, a surrogate of mean latitude) and DISP (dispersion of the cells occupied, calculated as the geometric average of the variances of X and Y). ALTM (mean altitude, in m a.s.l.) was derived from the original database records. ENDM: degree of endemism, quantified as the percentage of 1°×30″ cells occupied by each species in Iberia over the number of European species (from Kudrna 2002). Five variables, REG (A, B, C, D, E), were adopted to qualify the species according to their presence/absence in each of five Iberian regions (A to E, Fig. 1, formerly defined by Romo 2008).
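The two coordinate-based descriptors can be sketched as follows. The sketch assumes population variances and takes the geometric average of two variances as the square root of their product; the function name `range_geometry` and the cell coordinates are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def range_geometry(cells):
    """Return (LATM, DISP) for a list of (x, y) coordinates of occupied cells.

    LATM is the mean Y value. DISP is the geometric mean of the variances
    of X and Y (assumption: population variances are used).
    """
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    latm = mean(ys)
    disp = sqrt(pvariance(xs) * pvariance(ys))
    return latm, disp


# Four occupied cells at the corners of a square (hypothetical coordinates).
latm, disp = range_geometry([(0, 0), (2, 0), (0, 2), (2, 2)])
```

A geometric rather than arithmetic average means that a range spread along only one axis scores low on DISP.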
The raw data values for AREA, SDMN and ALTM were square root transformed, MMON cubed, and AWL, EGGS and LPDB were log10-transformed to approach normality. Two series of computations were conducted, using species means and independent contrasts. Contrast-based regressions were forced through the origin (Felsenstein 1985).
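The transformations and the through-origin fit used for contrasts can be sketched as below. EGGS is left untransformed in this sketch on the assumption that, being a regression residual, it can take negative values; the function names are hypothetical, and the slope is the ordinary least-squares estimate for a line with no intercept.

```python
import math


def transform(variable: str, value: float) -> float:
    """Apply the transformation stated in the text for a given variable."""
    if variable in {"AREA", "SDMN", "ALTM"}:
        return math.sqrt(value)
    if variable == "MMON":
        return value ** 3
    if variable in {"AWL", "LPDB"}:
        return math.log10(value)
    return value  # other variables are passed through unchanged


def slope_through_origin(xs, ys):
    """Least-squares slope of y on x for a regression forced through the origin."""
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0:
        raise ValueError("all x values are zero")
    return sum(x * y for x, y in zip(xs, ys)) / sum_xx


# Hypothetical standardised contrasts for a predictor and for AREA.
slope = slope_through_origin([1.0, -2.0, 3.0], [2.0, -4.0, 6.0])
```

Dropping the intercept is appropriate for contrasts because the sign of each contrast is arbitrary, so the fitted line has to pass through zero.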
The bivariate relationships between AREA and the independent variables were determined. Subsequently, regression models were selected using a manual stepwise procedure (GLM module of Statistica: StatSoft 2004), assuming a Poisson distribution of the residuals and a logarithmic link function. The variables were entered in descending order according to their bivariate relationship with AREA (Table 1), and a backward selection was fit at each step to discard redundant terms. Wald's statistic was applied to measure the contribution of each variable, and model strength with the standard deviation (whole model R2 values were added from GRM fitting). First, one model was fit using all variables (life-history and geographic). Two additional models were subsequently calculated with the two sets of variables independently. Finally, we applied variance partitioning (Legendre and Legendre 1998, Diniz-Filho and Bini 2008) to measure the amount of variance explained by life history, geographic features, or shared by both sets of variables (the variables included in this specific analysis were those that exhibited significant effects in any of the three formerly described models, i.e. “all”, “life history”, and “geography”).
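The partition of variance between two sets of predictors can be sketched from three coefficients of determination. This is a generic illustration of the approach rather than the authors' exact computation; the function name and the R2 values in the example are hypothetical and are not results from this study.

```python
def partition_variance(r2_full: float, r2_life: float, r2_geo: float) -> dict:
    """Split explained variance into unique, shared and residual fractions.

    r2_full: R2 of the model with both variable sets.
    r2_life: R2 of the model with life history variables only.
    r2_geo:  R2 of the model with geographic variables only.
    """
    return {
        "life_history_only": r2_full - r2_geo,
        "geography_only": r2_full - r2_life,
        "shared": r2_life + r2_geo - r2_full,
        "residual": 1.0 - r2_full,
    }


# Hypothetical R2 values chosen only to illustrate the arithmetic.
fractions = partition_variance(r2_full=0.6, r2_life=0.4, r2_geo=0.5)
```

The four fractions sum to one by construction; the shared fraction can be negative when the two sets of variables suppress each other.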
Table 1. Bivariate correlations (R) between occupancy (AREA) and the independent variables, based on the raw species data (columns a, b) and on the standardised independent contrasts (columns c, d), and from the full data set (a, c) or the subset of species with known egg sizes (b, d). Sample sizes are a=205, b=155, c=146, d=119. ns=p>0.05, *=p<0.05, **=p<0.01, ***=p<0.001, ****=p<0.0001.
Pairwise correlations are shown in Table 1. Egg size (EGGS) was significantly correlated with AREA. However, one additional stepwise regression restricted to data from species with known EGGS values did not select egg size (using either the log-transformed data or the independent contrasts). The variables selected without controlling for phylogeny included NMON, SDMN, REGB, REGA, REGC, REGE and LPDB (deviance/DF=0.873; log-likelihood=−232.728; R2=0.837; p<0.0001), and variables from the set of contrasts included MIGR, LPDB, ALTM, LATM, REGA and REGD (deviance/DF=1.701; log-likelihood=−74.425; R2=0.597; p<0.0001). Consequently, we excluded EGGS from subsequent analyses, since retaining it would have resulted in a somewhat lower number of species and contrasts analyzed.
LATM displaced dispersion (DISP) (with which it was correlated; R=−0.427, p<0.0001; R=0.393, p<0.0001, based on the contrasts) in the analysis, probably due to heteroscedasticity in the X and Y values. Therefore, to gain insights into the correlation in dispersion patterns, we tested the relationship between DISP and AREA with a polynomial function (species means) and a linear function (contrasts) (Fig. 2). We subsequently analysed the residuals following the methods described above.
Pairwise analyses found 24 variables significantly correlated with AREA (Table 1) across species means, and 14 across contrasts (Table 1, Fig. 3 and 4). The selected variables are detailed in Table 2 and 3. A maximum of six “life history” variables were selected when this subset of variables was analysed. However, in the combined analyses (life history plus geography), the geographical variables tended to displace the life history traits from the models, with the exception of LPDB (Fig. 3) and, depending on the analysis, SDMN.
Table 2. Explanation of AREA based on species means and three subsets of independent variables: life history plus geography (A), life-history (B) and geography (C). The figures shown are estimated coefficients and signs (Coef.), Wald statistics and significance levels (P) of the variables selected in each model. Whole model results are shown at the lower part of the table. **=p<0.01, ***=p<0.001, ****=p<0.0001.
Table 3. Explanation of AREA based on the independent contrasts and variable selection. Variable subsets: life history plus geography (A), life history (B) and geography (C). The data shown are estimated coefficients and signs (Coef.), Wald statistics and significance levels (P) of the variables selected in each model. Whole model fitting results are shown at the lower part of the table. **=p<0.01, ***=p<0.001, ****=p<0.0001.
Seven of the geographical variables exhibited significant effects when species means were analysed (ENDM, ALTM, LATM, REGA, REGB, REGC and REGE; Table 2). However, significant effects were not evident when contrasts were analysed, with the exception of ENDM and/or REGD, which remained significant (Table 3, Fig. 4).
Variance partitioning (Table 4) indicated that irrespective of the source of data (species mean values or contrasts), a large part of the explanation of the variance is shared by life history plus geography.
Table 4. Variance partitions among the life history and geographic variables, based on the species values and on the standardised independent contrasts. The subset of variables for each column consisted of those with significant effects in any of the models described in Table 2 (raw data) and Table 3 (contrasts). The figures represent the proportion of variance explained (over 1.000).
Finally, dispersion (with fixed effects for AREA) was positively correlated with geography (REGD and ENDM) and negatively with larval host type (LHTH i.e. “not feeding on herbaceous plants”), and with “overwintering as a pupa” (OWSP) (Table 5, 6).
Table 5. Bivariate correlations (R) between the geographical dispersion of the species data points not explained by occupancy (residuals of the regression of DISP on AREA) and the potentially explanatory variables, derived from raw data (n=205) and independent contrasts (n=149). ns=p>0.05, *=p<0.05, **=p<0.01, ***=p<0.001, ****=p<0.0001.
Table 6. Stepwise selection results summarising the relationships between range dispersion (with fixed AREA effects, residuals from the function in Fig. 2) and the remaining variables, based on species raw values and on independent contrasts. The values shown are estimated coefficients and signs (Coef.), Wald statistics, and significance (P) of selected variables. Whole model fitting statistics are indicated at the lower part of the table. *=p<0.05, **=p<0.01, ***=p<0.001, ****=p<0.0001.
The following summarizes the results of our study: 1) significant pairwise correlations between range size and several life history variables were observed; 2) these correlations generally tend to lose strength after controlling for phylogeny; 3) the variables retained by stepwise regression selection were associated with larval host diet breadth, adult phenology, and geographic features of species distribution; and 4) an integral component of the explanation provided by the models was of a mixed (life history and geographic) nature.
Both niche theory and dispersal-related expectations are supported by the bivariate correlations across species means. Comparatively, widespread species are larger, more sexually dimorphic, and exhibit increased fecundity and dispersal ability. Species with more scattered/patchy distributions tend to show longer periods of adult occurrence, their larvae typically feed on dicotyledonous woody plants, and they demonstrate some degree of ant-dependence as well as a restricted Iberian range with low mean latitude and altitude.
The generally (though not universally, Table 1) lower correlations from independent contrasts do not simply reflect the reduced degrees of freedom (as argued by Dennis et al. 2004), but also the different weight of taxonomic patterning across variables (Brooks and McLennan 2002, Diniz-Filho and Torres 2002). Our comparative results were partly hindered by poor phylogenetic resolution and unknown branch lengths, two reasons why (despite the robustness of the contrasts method under such conditions; Martins and Garland 1991, Garland et al. 1992) we concentrated on identifying the sign and strength of the relationships rather than their shape (Quader et al. 2004). Therefore, within these limitations, we confidently concluded that the bivariate correlations between range size and relative egg size, dispersive adults, long period of adult occurrence, larval diet breadth and some geographic variables were not taxonomically driven artefacts.
Among the life history variables, multiple selection on independent contrasts supports a causal interpretation of range size in terms of niche breadth (potential or realised larval polyphagy may assist the offspring to thrive in more diverse habitats) and, at least partially, dispersion in the adult “time window”. However, the second variable may actually represent a geographic turnover of species phenologies, as the Iberian lands cover a remarkable array of climate and vegetation types. Phenological variation along a geographic gradient should be higher among populations than within them; therefore, the observed correlation may not denote an ability of widespread species to occupy varied habitats, but the degree to which widespread species phenologies are tuned to local conditions. Depending on the methods and variables used to study British butterflies, this explanation is supported by some reports but not others (Hodgson 1993, Dennis et al. 2004, 2005). Correlations between range size and body size or dispersal ability (formerly documented from various taxa, Gaston and Blackburn 1994, Brown 1995, Purvis et al. 2001 and Diniz-Filho et al. 2005, but cf. Hillebrand et al. 2001, Wilkinson 2001, Fernandez and Vrba 2005 or Rundle et al. 2007) are not supported by the most parsimonious interpretation of our results. The relationship between Iberian butterfly range and adult size is taxonomic, range size is weakly related to migratory status, and both relationships break down under multivariate selection protocols. However, we cannot strictly discard alternative explanations of dispersal type, namely because butterfly wing length (which we tested) might not be the best surrogate for dispersal ability (as shown for damselflies by Rundle et al. 2007). Furthermore, functional and genetic links among species life history attributes (Stearns 1977, Roff 2002) generally result in complex patterns of cross correlations (see butterfly examples in García-Barros 2000a or Cizek et al. 2006). If distribution patterns are to be evaluated in terms of the realized niche (Austin and Smith 1989, Kockemann et al. 2009) and niche is assessed based on environmental variables (Thuiller et al. 2003), “complex” variables (describing organism-habitat interactions) should perform better in multivariate tests than “proximate” variables (describing features of the organism, or of the habitat, Dennis et al. 2004, 2005). Therefore, the distinction between “causal” and “most parsimonious” solutions is of interest. However, a more exhaustive analysis of the life histories of Iberian butterflies at local or regional levels and comparable studies across different regions is warranted to provide resolution at this scale.
Results of the study suggested that the explanation for range size could be improved by including additional variables to represent processes not explicitly tested. For example, some geographic variables exhibited a high weight and proportion of variance shared with life history data. Range size might be the result of interactions between species life history patterns and geographic history. Comparable conclusions have been drawn from recent work on plant biogeography (Svenning and Skov 2005, Weiser et al. 2007, Gove et al. 2009). Resolving these processes for further analyses is not a simple task, as it requires a comparative approach to the life histories of each of the taxa, done within a phylogeographic framework. The geographic distribution of species is a product of speciation, extinction and the temporal dynamics of its range (Gaston 1998, 2003). Consequently, the range of a species should be examined and explained in terms of the species features, its interactions with the environment (Kean and Barlow 2004) and the historical factors, which likely set limits on other interactions. However, little attention has been drawn to the relationships between different macroecological patterns (Blackburn and Gaston 2001).
Two incidental findings of our work relate to endemism and Rapoport's pattern. The significance of endemism has a practical application in Iberia: Iberian endemics tend to be rare in Iberia (for comparable results from other regions and taxa see Gregory and Blackburn 1998 and Carrascal et al. 2008). Based on our knowledge of butterfly species and their distributions in the study area, most of the butterflies with small ranges in this region are restricted to relatively high elevations on the main mountain chains, which cover a small portion of the peninsula. These species may be particularly sensitive to, e.g., global warming and other environmental impacts. For the same reason, the concentration of mountain ranges in the northern half of the peninsula explains the negative pairwise correlations between range size and both latitude and altitude, incongruent with Rapoport's pattern (where ranges should be wider at higher latitudes or altitudes; Rapoport 1975, Lomolino et al. 2006).
In summary, our results identified niche breadth (via larval polyphagy) as primarily correlated with range size, together with interactions between non-explicit historical causes (represented by chorotypes) and life histories. A more thorough “dissection” of the biological correlations of range size, as well as an integrated multi-scale protocol, are required before more specific explanations for range size may be achieved.
We thank I. Echavarren for assistance during the early stages of the study, and C. Stefanescu and M. L. Munguira for contributing to the revision of life history data. This study was partly funded by project GL2006-10196 (M. E. C.).