Ecological networks, nestedness and sampling effort

Authors


*Author to whom correspondence should be addressed: Anders Nielsen. Tel. +47 64 96 57 67. Fax +47 64 96 58 01. E-mail anders.nielsen@umb.no.

Summary

  1. Ecological networks have been shown to display a nested structure. To be nested, a network must consist of a core group of generalists all interacting with each other, and of extreme specialists interacting only with generalist species.
  2. Studies on ecological networks are especially prone to sampling effects, as they involve entire species assemblages. However, we know of no study addressing to what extent nestedness depends on sampling effort, despite the numerous studies discussing the ecological and evolutionary implications of nested networks.
  3. Here we manipulate sampling effort in time and space and show that nestedness is less sensitive to sampling effort than the number of species and links within the network.
  4. That a structural property of an ecological network appears less prone to sampling bias is encouraging for other studies of ecological networks, because it indicates that the sensitivity of ecological network properties to sampling effort might be smaller than previously expected.

Introduction

Mutualistic networks involving flowering plants and their mutualistic pollinator or seed disperser animals can be visualized by a matrix where each column represents an animal species and each row a plant species. The cells within the matrix are used to indicate whether there is an interaction between the plant and the animal species that intersect in the matrix cell (Bascompte et al. 2003) (Fig. 1). Recent work has described the structure of both plant-pollinator and other types of mutualistic networks (Bascompte et al. 2003, 2006; Jordano et al. 2003; Vázquez 2005; Vázquez et al. 2005; Fontaine et al. 2006; Ollerton et al. 2007). Specifically, mutualistic networks are highly nested, i.e. there is a core group of generalists all interacting with each other, with extreme specialists interacting only with generalist species (Bascompte et al. 2003) (Fig. 1). Numerous studies have revealed this nested structure and have discussed its evolutionary and ecological reasons and consequences (Bascompte et al. 2003; Dupont et al. 2003; Ollerton et al. 2003; Memmott et al. 2004; Jordano et al. 2006; Lewinsohn et al. 2006; Santamaria & Rodríguez-Gironés 2007; Stang et al. 2007). Envisioning plant–pollinator interactions as either highly specialized or diffuse does not produce highly nested mutualistic networks. The core of generalist species may drive the evolution of the whole community, while specialist species are involved in highly asymmetric interactions with the generalists. One-to-one specialist coevolution is very rare, as seen in the lack of compartments within the networks (Bascompte et al. 2003). Nestedness has also been shown to increase network robustness, as nested networks appear less prone to the detrimental effects of habitat loss (Fortuna & Bascompte 2006) and species extinctions (Memmott et al. 2004).

Figure 1.

A matrix showing the interactions between plants (rows) and their pollinators (columns). Black squares represent an interaction occurring between the plant and animal species intersecting in the matrix cell. We used ANINHADO to pack the matrix (rearranging rows and columns) to achieve maximum nestedness. The line represents the isocline of perfect nestedness given the matrix size and fill. The core of generalists (both plants and pollinators) occurs in the top left corner of the matrix. Specialist pollinators tend to interact with plants above the isocline of perfect nestedness (the plant generalists). Specialist plants tend to interact with pollinators to the left of the isocline of perfect nestedness (the pollinator generalists). The matrix is created from the Siljan data, and is based on the network created by pooling four plots of clear cut forest (see Fig. 3 and Table 1).

More recently, nestedness has also been found in other types of networks, including the defensive interactions between ants and extrafloral nectary-bearing plants (Guimarães et al. 2006), the network of carcass visits by scavenger animals (Selva & Fortuna 2007), the interactions between sea anemones and their associated fish species (Ollerton et al. 2007), and the interactions between trees and tree-living lichens (A. Nielsen & M. H. Lie, unpublished data).

Despite the importance of nestedness for network robustness and coevolution (Thompson 2005), no study has addressed how sensitive nestedness is to sampling effort. This is despite the fact that apparent properties of multispecies systems have been shown repeatedly to be prone to the effects of sampling effort (Paine 1988; Cohen et al. 1993; Goldwasser & Roughgarden 1997; Martinez et al. 1999; Winemiller et al. 2001; Bersier et al. 2002; Banasek-Richter et al. 2004). Numerous studies have addressed the problem of sampling effort (e.g. Colwell & Coddington 1994; Mao & Colwell 2005), including its relation to plant-animal mutualistic network properties (e.g. Ollerton & Cranmer 2002; Herrera 2005). A common observation is that the number of species and links within a network tends to increase with sampling effort. However, with regard to nestedness, there is currently much debate as to what extent this nested pattern may be affected by sampling. Vázquez & Aizen (2006) indicated that relative specialists are less abundant than generalists, inducing a potential sampling bias towards the core of generalist species and thus underestimating the presence of the more specialized species. Ollerton & Cranmer (2002) suggested that variation in sampling effort might underlie the impression that tropical plants have more specialized pollination interactions than temperate plants. As nestedness involves specialist-generalist linkages, a sparsely sampled network might be expected to include relatively few specialist links, and hence appear less nested than a well-sampled network. Bascompte et al. (2003) showed that there is a positive relationship between nestedness and network size and fill, and that there seems to be a minimum number of species needed for the nested structure to appear significant. 
In contrast, Rodríguez-Gironés & Santamaría (2006) found that the nestedness of randomly generated matrices decreased within the span of network size and fill observed in our study (less than 50 pollinator species (columns) and less than 30% fill).

While rarefaction curves are primarily used to illustrate how the number of sampled species increases with the number of individuals or area sampled, they can also be used to relate other ecological measures to sampling effort. If the relationship between sampling effort and, for example, species richness reveals a steadily increasing curve, we may assume our sampling effort has been too small, as more extensive sampling would reveal more species. On the other hand, if the value appears stable, or levels off and reaches an asymptote, one can say that there has been sufficient sampling, as more sampling would not reveal more species. Here we use rarefaction curves to address whether nestedness is sensitive to sampling effort compared with number of species and links within the network. As the significance level of a nestedness value is related to both the size and fill of the matrix, we also present relative nestedness (nestedness corrected for matrix size and fill) as well as the significance level for the absolute nestedness value not being obtained from a randomly assembled matrix. In addition we show the relationship between sampling effort and network connectance (fill) as connectance is important for determining the significance of the nestedness value (Rodríguez-Gironés & Santamaría 2006). Based on previous studies we expect nestedness to increase with sampling effort towards an asymptote. Our main question is whether this asymptote is reached before the entire network (all species and links) is known. We do this by relating sampling effort, in time and space, to absolute and relative nestedness, significance level of absolute nestedness, as well as number of species and links within the network.

Methods

We obtained spatially and temporally structured data on the plant-pollinator (flower visitor, strictly speaking) network in a boreal forest landscape near Siljan in south-east Norway. As the study area is heavily managed (Nielsen et al. 2007), we established twelve 20 × 20 m study sites evenly distributed among forest stands of three forest maturity categories (clear cuts, young forest, and old growth forest). The plant-pollinator network data were collected throughout the summer of 2004 (3 June to 24 August). To minimize the number of zero observations, sampling was performed in fair weather only. Local weather conditions made it impossible to distribute the sampling days uniformly throughout the season, so in some weeks each study site was sampled on 5 days, while in most weeks each site was sampled between two and four times. We sampled each study site an average of 31.33 times (minimum = 29, maximum = 34). We sampled our study sites by randomly walking around within them for 30-min periods, catching all insects seen within flowers. All insects caught were brought back to the laboratory for identification, and the plant species on which each insect was caught was recorded.

To manipulate the temporal sampling effort, we created plant-pollinator network matrices based on data obtained from an increasing number of sampling days. We created plant-pollinator networks for each level of sampling effort by including data from 4, 6, 8, ... up to 28 randomly selected sampling days (28 days were chosen as the maximum sampling effort to ease comparisons among sites, as the lowest temporal sampling effort was 29 days for a particular site (Table 1)). We did this for each study site separately. Fifty replicate matrices were created for each level of sampling effort. For each of the resulting networks, number of species and links were counted and absolute nestedness and connectance were calculated. We also ran null model analysis for each of the nestedness values and calculated relative nestedness and the significance level (P-value). We then plotted all our network measures against the number of sampling days the network was based on.
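The temporal resampling described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the record layout (day, plant, insect) and the function names are assumptions introduced here, and nestedness itself is not computed because that step was done in ANINHADO.

```python
import random

def build_network(records):
    """Return the set of (plant, insect) links present in the records."""
    return {(plant, insect) for _day, plant, insect in records}

def network_summary(links):
    """Count species and links and compute connectance (matrix fill)."""
    plants = {p for p, _a in links}
    animals = {a for _p, a in links}
    size = len(plants) * len(animals)
    return {
        "plants": len(plants),
        "animals": len(animals),
        "links": len(links),
        "connectance": len(links) / size if size else 0.0,
    }

def subsample_days(records, n_days, n_replicates=50, seed=0):
    """Summarize replicate networks built from n_days randomly chosen days.

    records: hypothetical list of (day, plant, insect) tuples for one site.
    """
    rng = random.Random(seed)
    all_days = sorted({day for day, _p, _a in records})
    summaries = []
    for _ in range(n_replicates):
        chosen = set(rng.sample(all_days, n_days))  # days drawn without replacement
        subset = [r for r in records if r[0] in chosen]
        summaries.append(network_summary(build_network(subset)))
    return summaries
```

Calling `subsample_days(records, n)` for n = 4, 6, 8, ... up to 28 and averaging each measure over the 50 replicates would give the points of a rarefaction-style curve for one study site.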

Table 1. Descriptive measures of the matrices obtained for each of the 12 study sites and the networks created for all four sites in each forest community category. All data are based on calculations on networks obtained for the entire season. Days are total number of days of sampling conducted in the particular study site or for the four study sites comprising each forest community. Visits are the total number of insects recorded in a particular study site. Size equals plants × animals. N equals absolute nestedness and significance levels are obtained using null model analysis in ANINHADO (see text for details).

Site    Days   Visits   Size (P × A)   Links   Fill (%)   N      Significance
C1      30     269      7 × 48         72      21.4       0.81   P = 0.001
C2      31     136      8 × 27         44      20.0       0.66   P > 0.1
C3      34     217      7 × 35         47      19.2       0.72   P > 0.1
C4      33     172      9 × 33         60      20.2       0.73   P = 0.028
Y1      30     248      8 × 38         63      20.7       0.70   P > 0.1
Y2      31     172      8 × 41         59      18.0       0.80   P = 0.09
Y3      33     192      8 × 42         62      18.5       0.81   P = 0.008
Y4      31     161      5 × 39         53      27.2       0.50   P > 0.1
O1      32     266      8 × 40         61      19.1       0.80   P = 0.005
O2      29     49       3 × 18         19      35.2       0.29   P > 0.1
O3      31     127      6 × 31         43      23.1       0.81   P = 0.028
O4      31     75       3 × 15         17      38         0.35   P > 0.1
Clear   128    794      12 × 77        154     16.7       0.83   P < 0.001
Young   124    773      11 × 86        165     17.4       0.78   P < 0.001
Old     123    517      10 × 70        113     16         0.80   P < 0.001
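As a worked check on how the Fill column relates to the other columns, the short sketch below assumes that fill (connectance) is the number of links divided by matrix size (plants × animals), expressed as a percentage. The function name is introduced here for illustration only.

```python
def fill_percent(n_plants, n_animals, n_links):
    """Matrix fill (connectance) as a percentage of all possible links."""
    return 100.0 * n_links / (n_plants * n_animals)

# Site C1 in Table 1: 7 plants, 48 animals, 72 links
c1_fill = fill_percent(7, 48, 72)
# Site O2 in Table 1: 3 plants, 18 animals, 19 links
o2_fill = fill_percent(3, 18, 19)
```

Under this assumption, both values round to the figures given in Table 1 for these two sites.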

For the spatial component of our sampling effort we used the networks based on sampling throughout the entire season. To alter the spatial scale we created networks based on data from single study sites, all combinations of two and three sites, as well as the network based on data from all four sites within each forest maturity category. We did not combine networks from sites of different forest maturity because the spatial heterogeneity in our system might make us actually sample more than one network (Paine 1988).
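The spatial pooling can be illustrated in the same spirit. The sketch below assumes each site's full-season network is stored as a set of (plant, insect) links keyed by a site label; the data layout and function name are hypothetical.

```python
from itertools import combinations

def pooled_networks(site_links, k):
    """Pool the link sets for every combination of k sites.

    site_links: hypothetical dict mapping site label -> set of (plant, insect) links,
    restricted to sites of a single forest maturity category.
    """
    pooled = {}
    for combo in combinations(sorted(site_links), k):
        pooled[combo] = set().union(*(site_links[s] for s in combo))
    return pooled
```

With four sites per forest maturity category, this yields four single-site networks, six two-site networks, four three-site networks and one network pooling all four sites.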

The goal of our sampling protocol was to obtain the best possible estimate of the ‘real’ plant-pollinator network within the study area, given a certain amount of sampling effort applied. To achieve this, we manipulated the temporal and spatial scale independently of the number and abundance of both plant and pollinator species.

To calculate absolute nestedness, relative nestedness and significance levels for the nestedness values, we used the ANINHADO software designed by Guimarães & Guimarães (2006), which is based on the algorithms from the Nestedness Temperature Calculator (Atmar & Patterson 1993). The nestedness calculator has been criticized for overestimating the nestedness significance, as it uses a null model based on equal probability for all interactions to be realized (Cook & Quinn 1998; Fischer & Lindenmayer 2002). ANINHADO is, in this sense, a better software package as it lets the user choose among four different null models. Here we use a null model based on probabilities in each cell being the arithmetic mean of the connection probabilities of the focal plant and animal species (null model 2 in Bascompte et al. 2003 and null model Ce in ANINHADO). This assumes that the probability of an interaction is proportional to the generalization level of both species. This null model has been found to perform better when compared with other models in the sense of having the smallest type-I errors (Rodríguez-Gironés & Santamaría 2006). ANINHADO gives the nestedness in temperature or degrees (T), as an analogy to physical disorder. As we emphasize nestedness or order, instead of disorder, we present nestedness, N, as N = (100 − T)/100, with values ranging from 0 to 1 (maximum nestedness). Whether an absolute nestedness value significantly departs from what could be expected of a random matrix depends on the matrix size and fill. To allow cross network comparisons we therefore calculated relative nestedness, a measure that corrects for variation in species richness and number of links. Relative nestedness is defined as N* = (N − NR)/NR, where N is the nestedness of the actual matrix and NR is the average nestedness of random replicates generated from null model analysis with ANINHADO (Bascompte et al. 2003). 
Details of the matrices obtained from all 12 study sites and all three forest maturities are given in Table 1.
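To make the null model and the temperature conversion concrete, the following sketch generates one random matrix in which each cell is filled with a probability equal to the mean of its row and column connection probabilities. It is an illustration written for this text, not ANINHADO's code. We assume here that a species' connection probability is the fraction of possible partners it interacts with, and the function names are our own.

```python
import random

def temperature_to_nestedness(temperature):
    """Convert nestedness temperature T (0-100) to N = (100 - T) / 100."""
    return (100.0 - temperature) / 100.0

def null_model_ce(matrix, rng):
    """Draw one random binary matrix of the same shape as `matrix`.

    Cell (i, j) is filled with probability (p_row_i + p_col_j) / 2, where
    p_row_i and p_col_j are the observed fill of plant row i and animal column j.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    row_p = [sum(row) / n_cols for row in matrix]
    col_p = [sum(matrix[i][j] for i in range(n_rows)) / n_rows
             for j in range(n_cols)]
    return [
        [1 if rng.random() < (row_p[i] + col_p[j]) / 2 else 0
         for j in range(n_cols)]
        for i in range(n_rows)
    ]
```

Passing a seeded `random.Random` instance as `rng` makes the replicates reproducible. The nestedness temperature of each replicate would still have to be computed with a temperature calculator, which this sketch does not attempt.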

The goal of this study is to illustrate whether our nestedness values are representative for the ‘real’ network under study. However, the question of whether our networks are significantly nested compared with random networks with the same number of species and links does become important as the relationship between absolute nestedness and significance level is dependent on the size and fill of the matrix. We therefore report both absolute and relative nestedness as well as the significance level, and how they relate to sampling effort in time and space.
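Given an observed nestedness value and the nestedness values of null-model replicates, relative nestedness and a significance level could be derived as sketched below. The P-value definition used here (the fraction of replicates at least as nested as the observed matrix) is an assumption made for illustration; ANINHADO's exact computation may differ.

```python
def relative_nestedness(n_observed, null_values):
    """N* = (N - NR) / NR, with NR the mean nestedness of the null replicates."""
    n_random = sum(null_values) / len(null_values)
    return (n_observed - n_random) / n_random

def p_value(n_observed, null_values):
    """Fraction of null replicates at least as nested as the observed matrix."""
    at_least = sum(1 for v in null_values if v >= n_observed)
    return at_least / len(null_values)
```

A negative relative nestedness then indicates an observed matrix that is less nested than the average null replicate.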

Results

As seen in Fig. 2, both number of species and links increase with temporal sampling effort. Though the increase does not necessarily follow a linear model, none of the study sites appear to have reached saturation levels for these variables within the extent of our sampling effort. In contrast, the relationship between absolute nestedness and sampling effort seems much more stable. For most of the study sites the average absolute nestedness value does not change markedly with sampling effort; however, variation around the mean does decrease. For the study sites with the two smallest networks (O2 and O4) absolute nestedness decreases with sampling effort, but seems to level off within our sampling scale (Fig. 2). As a result of its relationship with size and fill of the matrix, relative nestedness appears to be not as stable as the absolute value. For the study sites exhibiting significant nestedness at higher levels of sampling effort, relative nestedness does increase with sampling effort but reaches an asymptotic level well within the extent of our sampling effort. For the two smallest networks (O2 and O4) relative nestedness seems quite stable or even decreasing with sampling effort.

Figure 2.

The relationship between number of sampling events (temporal sampling effort) and number of species, number of links, absolute nestedness (N), relative nestedness, connectance and P-value for the statistical significance of the absolute nestedness value for the 12 study sites. Data are presented for four replicate study sites in each forest maturity category, and the labels C, Y and O represent study sites in clear cuts, young forest and old growth forests, respectively. Each point represents the average value for the measure, obtained from 50 random combinations of 4, 6, 8, ... up to 28 sampling days. Error bars are standard deviation.

We found similar results for the relationship between network properties and spatial sampling effort. Both the number of species and links increase with increasing spatial sampling effort. Nestedness, on the other hand, seems much more stable. For the clear cut and young forest study sites, both absolute and relative nestedness appear quite stable across the extent of our sampling effort. For the old growth forest, while average absolute and relative nestedness increase when going from one to two study sites, they seem to stabilize as three and four study sites are included. Figure 3 shows the relationship between spatial sampling effort and the different network properties.

Figure 3.

The relationship between number of study sites (spatial sampling effort) and number of species, number of links, absolute nestedness (N), relative nestedness, connectance and P-value for the statistical significance of the nestedness value for the three communities (clear cuts, young forest and old growth forests). Each point represents the average value for the measure, obtained from single sites (1), and combinations of two (2), three (3) and all four sites (4). Error bars are standard deviation.

To control for the variation in network size and fill we calculated relative nestedness and significance levels for our absolute nestedness values, using null model analysis. Relative nestedness showed two distinct patterns of relationships with sampling effort, depending on whether the networks appear to be statistically significantly nested at higher levels of sampling effort (Table 1 and Fig. 2). For the networks not statistically significantly nested, the relative nestedness value seems not to change much throughout the extent of our sampling effort (constantly below zero). The P-value of these networks stayed high and even increased (O2 and O4) with sampling effort. The two non-significantly nested clear cut networks (C2 and C3) had an increase in relative nestedness and decrease in P-value. Indeed, we might suspect the nestedness value of these networks to become statistically significant if we had applied a higher temporal sampling effort. The relative nestedness of the networks that were statistically significantly nested increased and stabilized above zero, at high levels of sampling effort. The P-values of these networks decreased with sampling effort. These patterns were also apparent for the increase in spatial sampling effort, where all three communities were highly significantly nested at higher levels of sampling effort.

Discussion

As observed in both Figs 2 and 3, our sampling effort regarding species and links appears to be insufficient. Our graphs indicate that if we had sampled more days or more study sites, we would have found more species and detected more links in our networks. On the other hand, connectance decreased with sampling effort towards an asymptotic value and actually stabilized within the extent of our sampling effort. The main focus of this study, namely the relationship between nestedness and sampling effort, showed a contrasting pattern. For all but the two smallest networks (O2 and O4) the absolute nestedness value appeared stable throughout the entire range of our temporal sampling effort. The networks that were statistically significantly nested at high sampling efforts have slightly increasing absolute nestedness values and stabilize at higher values than those that were not significantly nested. This is in accordance with our expectations based on previous studies (Bascompte et al. 2003). Sampling more days or more plots would probably not significantly affect the estimate of this network pattern. The absolute value of nestedness is, however, not meaningful unless considered in relation to matrix size and fill. Rodríguez-Gironés & Santamaría (2006) showed that the nestedness temperature of randomly assembled matrices increased with network size and attained its maximum value for intermediate fills. This shows that smaller networks need a lower temperature (higher nestedness value) to be significantly nested. The sizes of our networks in relation to temporal sampling effort are all in the range where the temperature of randomly assembled matrices changes rapidly (< 50 columns in the matrix). Our matrix fill decreases with sampling effort, towards a relatively stable value in the range 0.2–0.3. This is within the range where the nestedness temperature of randomly assembled matrices attains its maximum value (lowest nestedness value). 
The two smallest networks (O2 and O4) seem to have a slightly higher fill, close to 0.4, a value where the chance that a random matrix will be significantly nested is smallest (Rodríguez-Gironés & Santamaría 2006). The significance of the absolute nestedness values for our networks should therefore be most sensitive to the size of our networks, i.e. the number of species comprising them.

We acknowledge that our increase in both temporal and spatial sampling effort is not based on independent samples. Our spatial scale is based on combinations of four independent study sites and our temporal scale is based on between 29 and 34 temporally autocorrelated days of sampling and, as such, our results must be interpreted with caution. Nevertheless, we feel confident that absolute nestedness is less sensitive to sampling effort in both space and time than the number of species and number of links in our networks. Previous studies on food webs have shown that number of links is more sensitive to sampling effort than number of species (Goldwasser & Roughgarden 1997). Both species and links increase steadily within the extent of our sampling effort while absolute nestedness appears much more stable. We take this as an indication of absolute nestedness being less sensitive to sampling biases. The exact amount of time and area sampled before an asymptote is reached is of course dependent on site-specific properties. However, the point is that the absolute nestedness value stabilizes at a lower level of sampling effort, in both time and space, than number of species and number of links, and that this pattern appears to prevail across study sites. This study is based on data sampled from 12 study sites situated in forest stands of contrasting maturity. The sites are indeed very different regarding forest structure (tree size and age distribution), and there are great differences in species composition, abundance and diversity among the sites situated in forests of contrasting maturity (A. Nielsen, unpublished data). We believe that the heterogeneity of the study system increases the potential applicability of our findings. With respect to the temporal variation in sampling effort, the absolute nestedness value, as well as the P-value obtained at high sampling effort, varied among the study sites. 
That the absolute nestedness seems not to vary with sampling effort, whether an increase in sampling effort reveals that the nested pattern is significant or not, also strengthens our confidence that absolute nestedness values are less prone to arbitrary effects of sampling effort.

The number of documented ecological networks displaying a nested structure is increasing (Ollerton et al. 2003; Lazaro et al. 2005; Vázquez et al. 2005; Fontaine et al. 2006; Guimarães et al. 2006; Selva & Fortuna 2007). The nested structure of these networks has been discussed and interpreted without addressing the extent to which the absolute nestedness values are sensitive to sampling effort. Here, we address for the first time the relationship between sampling effort and absolute nestedness. We conclude that absolute nestedness is a relatively robust measure of network structure, less prone to arbitrary effects of insufficient sampling effort than other network descriptors such as number of species and links.

One potential explanation for the robustness of the nestedness estimates reported here is related to the assembly type of a nested matrix. As noted above, nestedness implies a core of generalist plant and animal species interacting among themselves. Despite imperfect sampling, these species and links will probably soon be recorded. The density of links in the core is very high, so even if some are missed the structure will remain similar. Rarer, generalist species may not be recorded, but other, less generalist species will be attached to the tails of this generalist core. Essentially, nestedness implies a network assembly similar to Chinese Boxes in the sense that the pollinators that one plant species interacts with are included in the larger group of pollinators that a more generalist plant species interacts with, which, in turn, are contained in the larger group of pollinators that the most generalist plant species interacts with. There are sets of species within larger sets. This structure is quite robust in the face of sampling effort because all layers are built around a central core containing the most generalist, abundant species. One could assume that the first species (and layers) going unnoticed because of imperfect sampling would be the external ones (containing more specialist species). However, one could still see the global structure of boxes within larger boxes. If one layer (or box) is removed, the internal layers still have the same relationship.
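The boxes-within-boxes argument can be illustrated with a toy example. The sketch below uses hypothetical species labels and a deliberately strict definition (perfect nestedness, where each plant's pollinator set is contained in that of the next more generalist plant). It is meant only to show why losing the outer, specialist layer leaves the remaining structure nested, not to reproduce the temperature-based measure used in this study.

```python
def is_perfectly_nested(partners):
    """True if every plant's pollinator set is a subset of the next larger set.

    partners: hypothetical dict mapping plant -> set of pollinator labels.
    """
    ordered = sorted(partners.values(), key=len, reverse=True)
    return all(smaller <= larger
               for larger, smaller in zip(ordered, ordered[1:]))

def drop_pollinators(partners, missing):
    """Remove unsampled pollinators, dropping plants left without partners."""
    return {plant: visitors - missing
            for plant, visitors in partners.items()
            if visitors - missing}

# Toy network: "a" is the most generalist pollinator, "d" the most specialist.
full = {
    "P1": {"a", "b", "c", "d"},
    "P2": {"a", "b", "c"},
    "P3": {"a", "b"},
    "P4": {"a"},
}
# Imperfect sampling misses the two most specialist pollinators.
thinned = drop_pollinators(full, {"c", "d"})
```

In this toy case both `full` and `thinned` satisfy the subset condition, whereas a network with a plant visited only by a pollinator outside the core would not.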

Acknowledgements

The Research Council of Norway (project 154442/720) supported this study (A.N.).

This work was funded by the European Heads of Research Councils, the European Science Foundation, and the EC Sixth Framework Programme through a EURYI (European Young Investigator) award (to J.B.), and by the Spanish Ministry of Education and Science (Grant REN2003-04774 to J.B.).

We thank Martin Haugmo and Sverre Lundemo for field assistance and Frizöe Skoger for letting us use their forestland for our research. We also thank Jeff Ollerton, Mary Price and an anonymous reviewer for useful comments on previous versions of the manuscript.

Tore Randulf Nielsen (Syrphidae), Adrian Pont (Muscidae), Brad Sinclair (Empididae), Frode Ødegaard (Coleoptera) and Sigmund Hoggvar (Hemiptera) kindly identified insects to species level. Fred Midtgaard determined the less abundant groups of pollinators to appropriate taxonomic level.
