Influence of taxonomic resolution on mutualistic network properties

Abstract Ecologists are increasingly interested in plant–pollinator networks that synthesize in a single object the species and the interactions linking them within their ecological context. Numerous indices have been developed to describe the structural properties and resilience of these networks, but currently, these indices are calculated for a network resolved to the species level, thus preventing the full exploitation of numerous datasets with a lower taxonomic resolution. Here, we used datasets from the literature to study whether taxonomic resolution has an impact on the properties of plant–pollinator networks. For a set of 41 plant–pollinator networks from the literature, we calculated nine network index values at three different taxonomic resolutions: species, genus, and family. We used nine common indices assessing the structural properties or resilience of networks: nestedness (estimated using the nestedness index based on overlap and decreasing fill [NODF], weighted NODF, discrepancy [BR], and spectral radius [SR]), connectance, modularity, robustness to species loss, motif frequencies, and normalized degree. We observed that modifying the taxonomic resolution of these networks significantly changes the absolute values of the indices that describe their properties, except for the spectral radius and robustness. After the standardization of indices measuring nestedness with the Z-score, three indices (NODF, BR, and SR for binary matrices) are not significantly different at different taxonomic resolutions. Finally, the relative values of all indices are strongly conserved at different taxonomic resolutions. We conclude that it is possible to meaningfully estimate the properties of plant–pollinator interaction networks with a taxonomic resolution lower than the species level. We would advise using either the SR or robustness on untransformed data, or the NODF, discrepancy, or SR (for weighted networks only) on Z-scores. Additionally, connectance and modularity can be compared between low taxonomic resolution networks using the rank instead of the absolute values.


| INTRODUCTION
The study of species interactions has always been central in ecology.
Such interactions have historically been examined by focusing on two interacting species, but in recent years, the marked increase in the amount of biological information available and the development of novel approaches and tools have placed a new focus on the study of interaction networks (Proulx, Promislow, & Phillips, 2005). Ecological networks may provide important insights that cannot be gained when species are studied in isolation. They currently play a central role in key aspects of ecological theory such as the long-standing question of the relationship between complexity and stability in ecosystems (Montoya, Pimm, & Solé, 2006; Thébault & Fontaine, 2010) or the interplay between interspecific competition and ecological niche (Bastolla et al., 2009). Ecological networks are also powerful tools for applied ecology, as they can be used to monitor the impact of biological perturbations on an ecosystem or the efficiency of restoration programs (Kaiser-Bunbury & Blüthgen, 2015; Kaiser-Bunbury et al., 2017; Memmott, 2009).
Most studies on ecological networks have focused on three main categories of networks defined according to the type of species and their interactions: food webs, host-parasitoid interaction networks, and more recently, mutualistic interaction networks (Ings et al., 2009).
In this paper, we concentrate on mutualistic networks linking plants and pollinators, which have attracted particular attention in recent years. Pollinators perform an essential ecological function, pollination, which is threatened in many parts of the world by the sharp decline in pollinator populations owing to the many threats that they face (Goulson, Nicholls, Botías, & Rotheray, 2015). Such a decline in pollinator populations may harm both wild biodiversity and agricultural productivity (Garibaldi et al., 2013).
The use of a network makes it possible to synthesize in a single object the species and the interactions linking them, which together constitute the community of species (Delmas et al., 2019). It thus becomes possible to use the many methods developed to study ecological networks to describe their structure and properties using different indices (Lau, Borrett, Baiser, Gotelli, & Ellison, 2017). One structural characteristic that has received particular attention in the study of plant-pollinator networks is nestedness (Bascompte, Jordano, Melián, & Olesen, 2003; Table 1). A nested network is characterized by the extent to which interactions of less-connected species form subsets of the interactions of more-connected species. Other frequently examined structural characteristics of mutualistic networks are connectance, the proportion of realized interactions among all possible ones, and modularity, that is, the extent to which interactions between pollinators and plants are organized into delimited modules, as well as motifs, which are subnetworks representing the interactions between a given number of taxa (Milo et al., 2002). These properties have been associated with the ecosystem's resilience to perturbations (Soares, Ferreira, & Lopes, 2017). It has, for example, been shown that high levels of connectance, modularity, and nestedness promote both the structural and dynamic stability of mutualistic interaction networks (Vanbergen, Woodcock, Heard, & Chapman, 2017).
A large number of datasets on plants and their pollinators have been collected to date. However, given the large number of pollinator species potentially present in a community, as well as the relative difficulty in identifying some of these pollinators at the species level, a significant portion of the collected datasets has a taxonomic resolution lower than the species level. For a given research effort, there is therefore a trade-off between the quantity of possible identifications and the taxonomic accuracy of these identifications, which makes it difficult to produce large or numerous sets of data identified down to the species level. An extreme case in this regard is that of the datasets provided by citizen science programs for pollinators (Toomey & Domroese, 2013) such as the Spipoll program in France (Deguines, Julliard, Flores, & Fontaine, 2012), which generally allow very large datasets to be collected, although their taxonomic accuracy does not generally extend to the species level (Dickinson, Zuckerberg, & Bonter, 2010; Kremen, Ullman, & Thorp, 2011).
Currently, network analyses are performed on networks with varying levels of taxonomic precision, which makes comparisons between studies, or even between sites of the same study, potentially invalid, because we do not know how taxonomic resolution influences the indices of those networks, nor how they should be interpreted. It would nevertheless be valuable to apply network analyses to such datasets in order to fully exploit the information that they contain and to allow comparisons with other studies. Here, we sought to establish whether taxonomic resolution has an influence on the architecture and properties of a mutualistic network estimated using several indices. We used a set of 41 plant-pollinator networks from the literature and compared their index values at three different taxonomic resolutions: species, genus, and family. We show that for a given network, changing the taxonomic resolution usually significantly changes the value of most indices. We also show that after standardization (with the Z-score, using null models) of the indices measuring nestedness, three of these indices no longer differ significantly across taxonomic resolutions. We also used another normalization of one nestedness index (NODF), called NODFc, and show that this measure is robust to a lower taxonomic resolution (Song, Rohr, & Saavedra, 2017). Additionally, we show that among the set of 41 networks, the relative value of a given network for a given index is well conserved across different taxonomic resolutions, particularly between the species and genus levels.

| Overview
We used plant-pollinator networks from the literature (Vázquez, Goldberg, & Naik, 2003) determined to the level of species. For each species-level network, we deduced the equivalent network at the genus and family levels. We then calculated several indices commonly used to estimate mutualistic network properties for each of these networks and then compared their values across taxonomic resolutions. Data manipulation and analysis were conducted with the R language (R version 3.2.3, 2015-12-10). The script used for those results is accessible here: https://gitlab.com/EstelleRenaud/taxonomic_influence_network_properties

| Network indices
We selected frequently used indices that describe various properties of interaction networks, namely nestedness, connectance, modularity, motifs, and robustness. Given the particular importance of generalist pollinator species in maintaining plant-pollinator networks (Martín González, Dalsgaard, & Olesen, 2010), we also added one index calculated at the species level, that is, the normalized degree.
The characteristics of these indices are summarized in Table 1. Nestedness was estimated using four indices: NODF, weighted NODF (WNODF), discrepancy (BR), and spectral radius (SR). Discrepancy (Brualdi & Sanderson, 1999) is the minimal number of differences between the real network and a perfectly nested matrix with the same size, number of links, and column (or row) sums. The SR of a network is the largest eigenvalue of its adjacency matrix (Staniczenko et al., 2013).
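To make the SR definition concrete, here is a minimal Python sketch (the study itself used R). It assumes the network is supplied as a plants × pollinators incidence matrix B. For a bipartite network, the largest eigenvalue of the full adjacency matrix [[0, B], [Bᵀ, 0]] equals the largest singular value of B, that is, the square root of the dominant eigenvalue of B·Bᵀ. The function name and the power-iteration approach are illustrative choices, not the authors' implementation.

```python
import math

def spectral_radius(b, iterations=1000, tol=1e-12):
    """Largest eigenvalue of the bipartite adjacency matrix [[0, B], [B^T, 0]].

    Computed as the square root of the dominant eigenvalue of B * B^T,
    which is found by power iteration (hypothetical helper, for illustration).
    """
    n_rows, n_cols = len(b), len(b[0])
    # Gram matrix G = B * B^T (symmetric, non-negative entries)
    g = [[sum(b[i][k] * b[j][k] for k in range(n_cols)) for j in range(n_rows)]
         for i in range(n_rows)]
    v = [1.0] * n_rows          # positive start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(g[i][j] * v[j] for j in range(n_rows)) for i in range(n_rows)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0          # network without any link
        v = [x / norm for x in w]
        # Rayleigh quotient of the normalized vector
        gv = [sum(g[i][j] * v[j] for j in range(n_rows)) for i in range(n_rows)]
        new_eigenvalue = sum(v[i] * gv[i] for i in range(n_rows))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        eigenvalue = new_eigenvalue
        if converged:
            break
    return math.sqrt(eigenvalue)

# Toy 2 x 2 network in which every plant is visited by every pollinator
print(spectral_radius([[1, 1], [1, 1]]))
```

The same function accepts weighted (frequency) matrices, which is how a quantitative SR could be obtained under the same assumption.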
We used five additional indices. Four of them (connectance, robustness, motifs, and normalized degree) are calculated on the presence/absence matrices, whereas modularity is calculated on frequency matrices. Network connectance was calculated as the number of links divided by the number of cells in the matrix.
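The connectance calculation described above can be sketched in a few lines of Python (the study itself used R). The sketch also includes normalized degree, for which the text gives no formula; the definition used here (a taxon's number of partners divided by the number of possible partners) is the usual one and should be read as an assumption. Function names and the toy matrix are hypothetical.

```python
def connectance(matrix):
    """Number of realized links divided by the number of cells in the matrix."""
    n_cells = len(matrix) * len(matrix[0])
    n_links = sum(1 for row in matrix for value in row if value > 0)
    return n_links / n_cells

def normalized_degrees(matrix):
    """Degree of each row taxon divided by the number of possible partners.

    Assumed definition: possible partners = number of columns.
    """
    n_cols = len(matrix[0])
    return [sum(1 for value in row if value > 0) / n_cols for row in matrix]

# Toy network: 3 plants (rows) x 4 pollinators (columns)
web = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(connectance(web))          # 7 links out of 12 cells
print(normalized_degrees(web))   # one value per plant
```

Both functions treat any positive entry as a link, so a frequency matrix is implicitly converted to presence/absence.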
Network modularity was measured according to the Beckett algorithm DIRTLPAwb+ (Beckett, 2016), which aims to estimate the modularity of the network using three steps. The first uses label propagation to obtain a locally maximized modularity (bottom-up); the second agglomerates the modules found in the first step if it allows for an increased modularity; the third repeats these steps until modularity can no longer be increased. DIRTLPAwb+ then randomizes the initial labeling of nodes multiple times and returns the result with the greatest modularity score. Modularity itself was then calculated as the modularity M proposed by Newman (Newman, 2006).

Null models lie along a spectrum, with one extreme being the fixed-fixed model, which is susceptible to type II errors (Gotelli, 2000), and the other the equiprobable-equiprobable model, which is susceptible to type I errors (Wright, Patterson, Mikkelson, Cutler, & Atmar, 1997). The null model used here has statistically determined elements following the degree distribution of the initial matrix as $p_{ij} = \frac{1}{2}\left(\frac{d_j}{r} + \frac{k_i}{c}\right)$, where $p_{ij}$ is the probability of assigning a 1 to the cell in the $i$th row and $j$th column, $d_j$ is the column degree of the $j$th column, $k_i$ is the row degree of the $i$th row, and $r$ and $c$ are the respective numbers of rows and columns.

For the weighted indices (WNODF, SR), we used two kinds of null models, as no null model has yet been established as better suited to WNODF or SR. The first set of null matrices was obtained from 500 iterations of the row and column total average model (introduced in the Falcon software), which averages two matrices: a matrix created by conserving the row totals and redistributing a random portion of each total to each element of the corresponding row, and a matrix following the same principle with the column totals. The second kind is Patefield's historical r2dtable model, implemented with the nullmodel function (option "r2dtable") of the bipartite R package (Dormann et al., 2008). We also generated 500 matrices under that model.
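The binary null model and the Z-score standardization can be illustrated with the following Python sketch (the study itself used R). It computes the cell probabilities p_ij = 1/2 (d_j/r + k_i/c) from the observed degrees, fills each cell of a null matrix independently with that probability, and standardizes an observed index value as (observed − mean of null values) / standard deviation of null values. The Z-score formula is the standard one rather than a formula given in the text, and the use of the sample standard deviation, the function names, and the `index` callback are assumptions made for illustration.

```python
import random
import statistics

def null_probabilities(matrix):
    """Cell probabilities p_ij = 0.5 * (d_j / r + k_i / c) from observed degrees."""
    r, c = len(matrix), len(matrix[0])
    binary = [[1 if value > 0 else 0 for value in row] for row in matrix]
    k = [sum(row) for row in binary]                        # row degrees k_i
    d = [sum(row[j] for row in binary) for j in range(c)]   # column degrees d_j
    return [[0.5 * (d[j] / r + k[i] / c) for j in range(c)] for i in range(r)]

def sample_null_matrix(probabilities, rng):
    """Draw one binary null matrix, filling each cell independently."""
    return [[1 if rng.random() < p else 0 for p in row] for row in probabilities]

def z_score(observed, null_values):
    """Standardize an observed value against its null distribution (sample SD assumed)."""
    return (observed - statistics.mean(null_values)) / statistics.stdev(null_values)

def null_z_score(matrix, index, n_null=500, seed=0):
    """Z-score of index(matrix) against n_null matrices drawn from the null model."""
    rng = random.Random(seed)
    probabilities = null_probabilities(matrix)
    null_values = [index(sample_null_matrix(probabilities, rng)) for _ in range(n_null)]
    return z_score(index(matrix), null_values)

# Probabilities for a small, perfectly nested toy matrix
print(null_probabilities([[1, 1], [1, 0]]))
```

Any function that maps a binary matrix to a number (for example a nestedness index) can be passed as `index`. Null matrices drawn this way can contain empty rows or columns; whether such matrices should be discarded is left open here.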

| Pollination networks
We extracted all plant-pollinator interaction networks from the Interaction Web Database (Vázquez et al., 2003). All networks were issued from previously published data (Table S1). We only kept matrices for which taxa determination was possible using the taxize package; that is, valid taxonomic names resolved at the genus or species level. In some cases, we replaced old taxonomic names by a current valid synonym. We also only kept matrices that dealt with several families, which left us with a dataset of 41 matrices, 10 of which were binary (presence/absence) matrices. The remaining 31 were weighted according to the frequency of the visitation or a proxy of that frequency. The number of taxa in the matrices varied from seven to 135 for plants, and 12 to 144 for pollinators.
We then used the taxize package (Chamberlain & Szöcs, 2013; version 0.9.0) in R to retrieve, from the taxonomic names supplied by the authors, the affiliation of each taxon at higher ranks. Only the identifications at the species, genus, family, and order ranks were retained, as these were the ranks most often known for all observations. The GBIF database (GBIF, 2018) was used as a reference.
We transformed each of the 41 previously described matrices into interaction matrices determined at the species level by keeping only the observations (within each network) for which both the plant and pollinator were determined to the species level. From these species-level matrices, we deduced the genus-level and then the family-level matrices.
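The step from a species-level matrix to a genus-level (or family-level) matrix can be sketched as follows in Python (the study itself used R). The sketch assumes that rows and columns belonging to the same higher taxon are merged by summing their interaction weights; the text does not state the exact aggregation rule, so this is an assumption, as are the function name and the toy data.

```python
from collections import defaultdict

def aggregate(matrix, row_taxa, col_taxa):
    """Collapse rows and columns that share a higher taxon, summing their weights.

    row_taxa[i] and col_taxa[j] give the higher taxon (e.g. genus) of row i
    and column j of the species-level matrix.
    """
    row_labels = sorted(set(row_taxa))
    col_labels = sorted(set(col_taxa))
    totals = defaultdict(float)
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            totals[(row_taxa[i], col_taxa[j])] += value
    aggregated = [[totals[(r, c)] for c in col_labels] for r in row_labels]
    return aggregated, row_labels, col_labels

# Made-up species-level visitation counts: 3 plant species x 3 pollinator species
species_web = [
    [1, 0, 2],
    [0, 3, 0],
    [4, 0, 0],
]
plant_genera = ["Salvia", "Salvia", "Rosa"]
pollinator_genera = ["Bombus", "Bombus", "Apis"]

genus_web, plants, pollinators = aggregate(species_web, plant_genera, pollinator_genera)
print(plants, pollinators)
print(genus_web)
```

Applying the same function again with family labels for the genus-level rows and columns would yield the family-level matrix. For a presence/absence matrix, the summed cells can be converted back to 0/1 afterwards.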

| Statistical tests
To examine the influence of the taxonomic level on the structure of a given matrix, we compared the values of the indices for Species-level matrices and Genus-level matrices, Genus-level matrices and Family-level matrices, and Species-level matrices and Family-level matrices, using a one-way analysis of variance. Post hoc tests were performed with a Bonferroni correction, using the built-in pairwise.t.test R function, with the "paired" option. We also performed the same analyses after standardizing nestedness values using Z-scores.
To investigate whether an index was useful for comparing different observed matrices, we performed a nonparametric correlation test (cor.test in R) to calculate both the value and significance of Spearman's rho for a given index in Species-level matrices, Genus-level matrices, and Family-level matrices. This allowed us to test whether the relative ranks of this index's values were significantly correlated between one taxonomic level and another. We used the Z-score (with two kinds of null models for the weighted indices) to take into account the difference in the matrix fills and sizes caused by the change in taxonomic resolutions, as well as another normalization by the maximal NODF (noted as NODFc).
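The rank-correlation step can be illustrated with a small Python sketch (the study itself used cor.test in R). Spearman's rho is computed here as the Pearson correlation of the ranks, with tied values receiving their average rank. The sketch returns only the coefficient, not its significance, and the function names and index values are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1   # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Made-up index values for four networks at two taxonomic resolutions
species_level = [0.10, 0.30, 0.20, 0.50]
genus_level = [0.20, 0.60, 0.50, 0.90]
print(spearman_rho(species_level, genus_level))
```

A rho close to 1 means that networks keep the same ordering for that index when the resolution changes, which is the property tested in this section.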

| RESULTS
We compared nestedness Z-scores and NODFc values from one taxonomic level to another using a one-way analysis of variance test. After this standardization, only the WNODF showed a significant effect of taxonomic level on its Z-score value (Figure 2 and Table 3).

TABLE 3 Results of one-way analysis of variance comparing Z-score values (as well as NODF normalized according to Song et al. (2017)) at three different taxonomic resolutions

We also showed that the normalization method proposed by Song et al. (2017) offers an NODF index robust to taxonomic resolution, which is in line with their own conclusions that NODFc is independent of the number of rows, columns, and links of the network, making it particularly relevant for comparing networks across studies or spatial gradients. Our results are also in agreement with an as yet unpublished study by Hemprich-Bennett, Oliveira, Comber, Rossiter, and Clare (2018), who also found that absolute measures of most of the metrics they tested (which include NODF, robustness, and connectance, but not SR or BR) vary according to the taxonomic level of the networks (both observed networks and networks deduced from metabarcoding data).
The main objective of our study was to determine whether it is possible to meaningfully estimate indices describing the characteristics of plant-pollinator interaction networks with a taxonomic resolution lower than the species. Our results suggest that it is indeed the case. To estimate nestedness, our results suggest that only the absolute values of SR indices are minimally impacted by changes in taxonomic resolution and should therefore probably be preferred when the objective is to compare nestedness levels for networks with a lower resolution than the species. Alternatively, it is possible to use NODF and BR after standardization using the Z-score. For the other properties of the networks that we examined, namely connectance, modularity, normalized degree, and robustness to species loss, the absolute values of the indices cannot be directly compared at resolutions lower than the species level, but it is still possible to rank networks according to their values for these indices, because such ranks are well preserved when the level of taxonomic resolution changes. Motif frequencies do not present a unique pattern of sensitivity to taxonomic resolution. Indeed, some motif frequencies are significantly influenced by taxonomic resolution, while others are not. However, they all show a good preservation of the ranks between species and genus networks. We would advise not to use motif frequencies at a family level, though, as the correlation between ranks gets rather low (sometimes as low as 0.2).
Note, however, that although a taxonomic resolution lower than the species level seems sufficient to characterize the properties of plant-pollinator interaction networks, it may make these properties more difficult to interpret. One of the main objectives of the measurement of network properties is to make or test inferences about their underlying mechanisms. For example, Junker et al. (2013) showed that sets of plant traits such as phenology, floral reflectance, and morphology can predict plant-pollinator interactions and thus network structure. Similarly, Klumpers, Stang, and Klinkhamer (2019) showed that size matching between the pollinator proboscis length and the nectar tube depth is important in shaping plant-pollinator interactions. Such conclusions would be more difficult to reach when working above the species level. In the future, working at these levels would require careful consideration: Can functional traits be extended to the whole genus in that particular case? If this is not possible, then working at these levels could deprive us of a significant explanatory variable. This means that while genus- and family-level networks are usable and interpretable, they still entail a loss of information for future studies. For this reason, future studies need to weigh the gain in network explicitness against the loss of information before choosing to work at the genus or family level.
Our results support the relevance of citizen science for ecological research. The major strengths of citizen science programs lie in their ability to conduct studies at large geographic scales and on private properties, which are usually impossible to perform with traditional field research (Dickinson et al., 2010), although often at the price of a lower taxonomic precision. Here, we showed that datasets with a taxonomic resolution lower than the species level can be used to estimate the properties of networks assembled at the same resolution, even if it is lower than the species. However, plant-pollinator interaction data produced by citizen science are probably characterized by relatively low sampling completeness, because detecting all the species interactions is extremely labor-intensive (Chacoff et al., 2012), which can have an effect on the estimated properties of the networks. For the indices that we studied, Rivera-Hutinel, Bustamante, Marín, and Medel (2012) showed that nestedness, modularity, and robustness to species loss are little affected by sampling completeness, whereas connectance is very sensitive to low sampling. In conclusion, sets of plant-pollinator networks produced by citizen science, frequently characterized by low taxonomic resolution and low sampling efforts, are probably best analyzed by calculating their nestedness with SR (or NODF and BR after standardization using the Z-score) and their robustness to species loss, and then ranking them according to their modularity.

FIGURE 3 Correlation strength for all taxonomic levels and all indices. All correlations are significant. BR, discrepancy; NODF, nestedness index based on overlap and decreasing fill; SR_Bin, spectral radius calculated on binary (absence/presence) matrices; SR_Qua, spectral radius calculated on weighted (abundance) matrices; VS, versus; WNODF, NODF calculated on weighted matrices
Our work confirms that we can use protocols with only genus- or family-level data and still use network-level analyses of plant-pollinator interactions. An interesting complement would be to study the same question for other kinds of mutualistic networks such as ant-plant networks or even for other kinds of interaction networks such as food webs.

ACKNOWLEDGMENTS
We thank the ENS de Lyon, who funded E.R. during the project.

AUTHORS' CONTRIBUTIONS
CBG and EB conceived the ideas and designed methodology. ER collected the data. ER, CBG, and EB analyzed the data. ER led the writing of the manuscript. All authors contributed critically to the drafts and gave final approval for publication.

DATA AVAILABILITY STATEMENT
The networks used in this analysis are available at the Interaction Web Database.

Abbreviations: BR, discrepancy; ND, normalized degree compiled for pollinator ("high") and plant ("low") taxa, for each bound of the quartile values (1-5); NODF, nestedness index based on overlap and decreasing fill; SR_Bin, spectral radius calculated on binary (absence/presence) matrices; SR_Qua, spectral radius calculated on weighted (abundance) matrices; WNODF, NODF calculated on weighted matrices.