Overcoming the pitfalls of merging dietary metabarcoding into ecological networks

The construction of increasingly detailed species interaction networks is extending the potential applications of network ecology, providing an opportunity to understand complex eco‐evolutionary interactions, ecosystem service provision and the impacts of environmental change on ecosystem functioning. Dietary metabarcoding is a rapidly growing tool increasingly used to construct ecological networks of trophic interactions, enabling the determination of individual animal diets including difficult‐to‐distinguish prey taxa and even for species where traditional dietary analyses are unsuitable (e.g. fluid feeders and small invertebrates). Several challenges, however, surround the use of dietary metabarcoding, especially when metabarcoding‐based interactions are merged with observation‐based species interaction data. We describe the difficulties surrounding the quantification of species interactions, sampling perspective discrepancy (i.e. zoocentric vs. phytocentric sampling), experimental biases, reference database omissions and assumptions regarding direct and indirect consumption events. These problems are not, however, insurmountable. Effective experimental design and data curation with appropriate attention paid to these problems renders the incorporation of dietary metabarcoding into ecological network analysis a powerful tool for the construction of highly resolved networks. Throughout, we discuss how these problems should be addressed when merging data to construct ecological networks.


| THE VALUE OF DIETARY DNA METABARCODING FOR NETWORK ECOLOGY
Trophic interactions are fundamental to many ecosystem processes (Thébault & Loreau, 2005), and understanding the diet of animals is critical in determining prey choice, assessing ecological responses to environmental change, evaluating ecosystem health and ultimately designing conservation strategies (Murray et al., 2011; Peterson et al., 2016; Piñol et al., 2014). Such interactions are also an integral component of network ecology, which endeavours to characterise and assess the interactions between organisms in complex ecological systems at a range of scales (Bascompte, 2007). Increasingly, network ecology has transitioned towards the integration of different types of networks and the construction of complex multilayer (also referred to as multipartite or multiplex) networks, where different layers can represent ecological networks in different habitats, networks with different types of interaction (e.g. mutualistic and antagonistic), interactions of a different nature (e.g. trophic and social) and/or different time points sampled for the same network (Evans et al., 2013; Hutchinson et al., 2019; Pilosof et al., 2017; Pocock et al., 2012). A popular use of multilayer networks in ecology to date has been to combine and simultaneously analyse different types of species interaction data (i.e. trophic and non-trophic interactions; Pilosof et al., 2017) at different levels of biological organisation (i.e. bipartite plant-herbivore and herbivore-parasitoid networks integrated together to form tripartite plant-herbivore-parasitoid networks; Miller et al., 2021) to provide both conceptual and applied advances (e.g. identifying the optimal plant communities in field margins that provide multiple ecosystem services; Windsor et al., 2021). These approaches involve the merging of species interactions, often derived from different data sources, each possibly specific to the focal taxon or interaction type (Fontaine et al., 2011).
The merged network approach is a powerful means of compiling large-scale interaction networks, but it is potentially confounded by compiling data that may be of conflicting types, sources and units (Fontaine et al., 2011; O'Connell et al., 2021; Quintero et al., 2021).
Empirical network ecology has traditionally relied on the observation of interactions in the field or the laboratory (e.g. for host-parasitoid associations), but many consumers are difficult to observe in this manner given: (a) their ecology (e.g. nocturnal and fossorial species), (b) the difficulty in identifying the taxon consumed during these events (e.g. minute taxa with difficult-to-distinguish morphologies) and/or (c) the introduction of observer bias based on species traits like the size, activity level and colour of the animals observed (Birkhofer et al., 2017; Gibson et al., 2011; Symondson, 2002). Visual analysis of faeces or gut contents using microscopy can also generate post hoc dietary data, but this has traditionally been constrained by the often inaccurate and laborious process of identifying ingested taxa from remaining hard parts, predominantly from vertebrates (Birkhofer et al., 2017; Jeanniard-du-Dot et al., 2017; Pompanon et al., 2012; Symondson, 2002).
The detection of latent DNA in the guts or faeces of consumers provides a suitable alternative which can also facilitate the detection of small, soft-bodied or cryptic species which might be overlooked during hard-part analysis (Symondson, 2002). This facilitates often non-invasive detection of a greater dietary diversity than traditional techniques such as hard-part analysis, even in vertebrate consumers (Jeanniard-du-Dot et al., 2017). Even greater, however, is the advance in access to invertebrate dietary information through DNA-based methods, since most traditional methods are not applicable to the minute gut contents or faeces of invertebrates, especially fluid feeders (Lafage et al., 2019; Pompanon et al., 2012; Symondson, 2002). Since the advent of high-throughput sequencing, 'DNA metabarcoding', the parallel identification of many species using short DNA amplicons, has become an increasingly common and accurate method for the identification of species consumed by a given animal (Clare, 2014; Pompanon et al., 2012). Metabarcoding has been used to study the diet of vertebrates [e.g. bats (Hemprich-Bennett et al., 2021), pigs (Robeson et al., 2018), penguins (Cavallo et al., 2018)] and invertebrates [e.g. beetles (Ammann et al., 2020), spiders (Lafage et al., 2019), dragonflies (Kaunisto et al., 2017)], including carnivores (Birkhofer et al., 2017; Deagle et al., 2009; Galan et al., 2018), herbivores (Kartzinel et al., 2015; Soininen et al., 2009), omnivores (Barba et al., 2014; Robeson et al., 2018), sanguivores (Schnell et al., 2012) and coprophagous species (Drinkwater et al., 2021), but rarely in an ecological network context.

| EXAMPLES OF DNA METABARCODING IN NETWORKS
Metabarcoding data have been successfully integrated into ecological networks, but largely through the application of metabarcoding to bulk samples for community-level data (Derocles, Bohan, et al., 2018; Evans et al., 2016; Petsopoulos et al., 2021) or the elucidation of host-parasitoid or plant-pollinator interactions (Vere et al., 2017). While the latter two are conceptually similar to dietary analysis, some of the problems that dietary samples present are distinct (e.g. the degradation of consumed taxon DNA by abrasive digestion processes). Importantly, these problems are also novel in network ecology.
Dietary metabarcoding is increasingly being applied to network ecology in a limited number of recent examples (Clare et al., 2018; Hemprich-Bennett et al., 2021; Mata et al., 2021), but these are often isolated predator-prey interactions. However, dietary metabarcoding does theoretically facilitate the merging of complex multilayer networks including many types of interactions across multiple trophic levels, including interactions within trophic levels (i.e. intra-guild predation; Hambäck et al., 2021; Parimuchová et al., 2021; Saqib et al., 2021). The evolutionary data inherent to the output of metabarcoding also facilitates the incorporation of phylogenetic data into networks for enhanced evolutionary context and an improved understanding of how eco-evolutionary processes affect important ecosystem functions such as pollination and parasitism (Derocles, Lunt, et al., 2018; Handley et al., 2011; Kitson et al., 2018; Melián et al., 2018; Segar et al., 2020). This does, however, require sufficient phylogenetic information from the metabarcoding data output, which is unlikely to be possible from many shorter amplicons commonplace in dietary metabarcoding. Clare et al. (2018) constructed the first 'network of networks' solely using dietary metabarcoding, which included bat-plant, bat-arthropod and parasite-bat interactions. The study marked a significant step towards molecular-derived ecological networks given that these interactions were based entirely on molecular data; this also circumvented many of the problems inherent to networks constructed by merging data types/sources discussed herein. Hemprich-Bennett et al. (2021) similarly used molecular methods to elucidate bat-arthropod interactions, in this case finding reduced prey richness in logged forests compared to old-growth forests.
This study too applied network principles to dietary metabarcoding data, constructing networks entirely based on these data and using techniques from graph theory to understand the structure of the subsequent networks. Mata et al. (2021) highlighted the conservation biocontrol potential of bats in a Mediterranean landscape by reference to network structure and interaction frequencies. The study concluded that network ecology and metabarcoding provided a synergistic framework in which both individuals and communities can be assessed effectively in a biocontrol context. A further study used dietary metabarcoding to investigate spider-arthropod networks to highlight how spider prey preferences changed following harvest in a cereal crop. This study did make use of field survey data, but only to contextualise the molecular data by comparing prey abundance in the field to interaction frequency to assess density dependence of interactions (i.e. this did not entail merging of trophic levels). While all of these studies highlight significant advances in the integration of molecular data into ecological networks, they did not merge these data with networks constructed using other methods.
Given the potential power of this combination of molecular and traditional methods in network construction, it is inevitable that such studies will become commonplace.
The technical limitations and problems arising from dietary metabarcoding are plentiful, from PCR primer bias, through sensitivity to contamination, to the inability to ascertain the ecological context of interactions; these have been reviewed extensively (Lamb et al., 2019; Pompanon et al., 2012; Taberlet et al., 2018), but not in the context of their integration into merged networks, which presents many unique and insidious challenges. The problems inherent to the resultant data, and those novel problems presented by the merging of metabarcoding data with other data sources, are critical considerations that must be contemplated by new adopters of metabarcoding and experienced researchers alike. To construct highly resolved networks, however, we must increasingly rely on a combination of construction methods (Evans & Kitson, 2020; Quintero et al., 2021; Wirta et al., 2014), particularly to integrate otherwise under-represented interactions such as those of invertebrate consumers. If these potential pitfalls can be overcome, dietary metabarcoding presents a powerful toolset for network ecologists, particularly when combined with traditional methods.
Here, we highlight several of the key concepts that network ecologists should be aware of before integrating dietary metabarcoding data into merged ecological networks, and molecular ecologists should be aware of before providing dietary metabarcoding data to network ecologists. These include the problems inherent to quantifying metabarcoding data, differences in sampling perspective, innate technical biases, variable resolution of identification and the inability to differentiate between modes of feeding (e.g. intentional, accidental and secondary consumption).

| QUANTITATIVE ISSUES
The viability of quantifying the contributions of multiple taxa to the diet of a consumer using metabarcoding data has long been debated, with many views and considerations emerging (Deagle et al., 2013; Murray et al., 2011; Piñol et al., 2018; Thomas et al., 2016). It is unquestionable, however, that dietary metabarcoding provides a novel and unique set of problems for quantification.
Quantification of PCR-based metabarcoding data is difficult, if not impossible, mostly due to issues such as PCR primer bias and random sampling during sequencing (Leray & Knowlton, 2017; Murray et al., 2011). These issues are particularly prevalent in dietary metabarcoding, where biases are further compounded by differential degradation rates of tissues in the guts of consumers (Murray et al., 2011), variable and dynamic consumer metabolism (Greenstone et al., 2007), DNA density differences between consumed tissues (Murray et al., 2011; Veltri et al., 1990) and the interacting effects of time since ingestion and the volume of tissue ingested (Egeter et al., 2015).
There are a number of ways to address these problems without relying on raw read count data for quantification. First, the half-life of DNA in the gut can be estimated empirically (Greenstone et al., 2007) to approximate the length of time for which a given length of DNA can be detected. Based on these data, semi-quantitative predation rates can then be calculated (Egeter et al., 2015; Uiterwaal & DeLong, 2020). However, this is complicated as many species have highly variable metabolic rates (Greenstone et al., 2014; Sheppard et al., 2005). Second, other correction factors can be determined to allow for amplification bias (Thomas et al., 2016), but these too can be laborious to determine and often require new data for each study or taxon. Such methods are further affected by the developmental stage of the prey and the DNA density of the predated tissue, which are impossible to ascertain through typical metabarcoding data (Murray et al., 2011).
When merging data from different methodological sources, quantification issues become even more insidious. Simplifying an individual's interactions to binary presence/absence data, commonplace to dietary metabarcoding studies, neglects repeated consumption events and may thus inaccurately represent the true frequencies of interaction events, and therefore network weighting (Clare, 2014). To overcome this, many dietary studies convert individual diets into frequency-of-occurrence across groups, effectively creating quantitative group data from binary individual data. This does, however, preclude analyses that might only be made possible in a network context through dietary metabarcoding, such as analyses of choice in the field through estimation of encounter rate (Vaughan et al., 2018). As can be seen from the study of other bipartite networks, quantitative networks (i.e. those that incorporate counts of consumed species or proportions of diet, or use frequency-of-occurrence) provide a disproportionate amount of extra information on the structure and the function of the network (Evans et al., 2013; Pocock et al., 2012). If one attempts to quantify individual-level consumption events from metabarcoding data (as is often attempted; Deagle et al., 2019; Deagle et al., 2013; Piñol et al., 2018; Thomas et al., 2016), the assumptions inherent to this (e.g. read count equates to biomass which further equates to the number of consumption events) are easily violated and may result in grossly inaccurate interaction weightings within ecological networks (Figure 1). All of these quantitative biases and limitations can be factored into interpretations arising from studies solely concerned with dietary metabarcoding, but such problems are easily overlooked when merging these data with those generated by traditional methods.
Aggregating data within groups (e.g. species) to obtain frequency-of-occurrence per group can partly overcome the lack of quantification, but precludes the analysis of taxa commonly eaten together, requires a greater sampling effort and may neglect intraspecific variation. In addition, while frequency-of-occurrence data facilitate a form of quantitative analysis from metabarcoding data, a further problem arises in that their representation of network weighting is greatly affected by the generalism of the individual consumers. If one consumer eats four individuals of the same species, this will result in an interaction weighting contribution of one, whereas a conspecific consumer that eats one each of four species will result in an interaction weighting contribution of four, despite eating the same number of individuals (Table 1). This value is representative of the degree (i.e. the number of nodes the individual is interacting with), but not the actual frequency of interaction. While this is compounded by the quantitative issues discussed above, it will skew even binary representation of the data.
Since it is inductive to assume that two consumers of the same species are likely to eat approximately the same amount (life stage, sex and other modifiers aside), it is then reasonable to normalise these binary values as proportions of an individual's diet prior to aggregating into frequency-of-occurrence (i.e. each dietary item is an even proportion of 1, so if five prey were consumed, each would be represented by a weighting of 0.2; Table 1). This form of 'interaction normalisation' has been discussed and used in other studies, largely in marine systems (Deagle et al., 2019; Merrick et al., 1997; Olesiuk et al., 1990), but not in a network context. Deagle et al. (2019) proposed that this weighted per cent of occurrence method could facilitate a biologically meaningful method for normalisation of frequency-of-occurrence data from metabarcoding. This would result in the two hypothetical consumers above contributing equally to the total interaction weighting of their group when aggregated. Importantly, interaction normalisation could skew representation if treating all consumed taxa equally since many will in fact not be consumed in equal proportions. Relative read abundances could be integrated into these calculations to adjust the ratios to overcome this problem but, as discussed above, their accuracy is questionable. Ultimately, interaction normalisation by representing each detected interaction as an equal proportion within an individual will produce a much more accurate representation of network weightings, and much more comparable to networks constructed from observational data. Networks constructed using these principles are therefore likely to represent more accurately the structure and dynamics of interactions in a complex real-world ecological community.
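The contrast between conventional frequency-of-occurrence and interaction normalisation can be illustrated with a minimal Python sketch. The function name `occurrence_weights`, the taxa and the two consumers are hypothetical and chosen only to mirror the example in the text; this is not the implementation used in any cited study.

```python
from collections import defaultdict


def occurrence_weights(diets, normalise=False):
    """Aggregate binary per-individual detections into group-level link weights.

    diets: list of sets, one per individual consumer, holding detected prey taxa.
    normalise=False: conventional frequency of occurrence (each detection counts 1).
    normalise=True: each detection counts 1 / (taxa detected in that individual),
    so every individual contributes a total weight of 1 (an assumed equal split).
    """
    weights = defaultdict(float)
    for detected in diets:
        if not detected:
            continue  # an empty diet contributes no interactions
        share = 1.0 / len(detected) if normalise else 1.0
        for taxon in detected:
            weights[taxon] += share
    return dict(weights)


# Hypothetical conspecific consumers: one ate only thrips (however many
# individuals, metabarcoding records a single detection), one ate four taxa.
specialist = {"thrips"}
generalist = {"thrips", "fly", "aphid", "parasitoid"}

conventional = occurrence_weights([specialist, generalist])
normalised = occurrence_weights([specialist, generalist], normalise=True)

print(sorted(conventional.items()))
print(sorted(normalised.items()))
```

Under the conventional scheme the generalist contributes four units of weight and the specialist one; after normalisation each individual contributes exactly one unit, split evenly among its detected taxa. The even split is itself an assumption and inherits the caveat noted above that taxa are rarely consumed in equal proportions.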

FIGURE 1
The differential effect on network weighting generated by representing metabarcoding as either binary or quantitative for individual consumers. Binary representation of metabarcoding data will neglect differences in interaction frequencies (b), whereas quantified representation of metabarcoding data may misrepresent the interaction frequencies due to the inherent technical biases (c). In this example, such misrepresentation may occur if the PCR primers used exhibit bias towards flies and against parasitoids and thrips, resulting in a complete alteration in network weightings. Equally, the biomass of minute thrips will naturally result in fewer copy numbers of thrips DNA present in the predator's guts, whereas relatively large flies may be better represented with even one consumption event.

| SAMPLING PERSPECTIVE DISCREPANCY AND SAMPLING COMPLETENESS
It is commonplace in network ecological analyses to ascertain the completeness of the networks by assessing the number of pairwise interactions observed (Chacoff et al., 2011; Jordano, 2016; Macgregor et al., 2017; Traveset et al., 2015). This is conceptually distinct from that performed in dietary metabarcoding studies, in which completeness is often assessed based on the number of taxa detected in the diet of the focal organism(s), treating each individual's diet as a distinct community. This is essentially the distinction between individual-based and sample-based rarefaction to assess richness: the former pertains to the diversity of individual records (e.g. interactions recorded across a study) and the latter to the diversity within samples (e.g. the prey recorded in the diet of an animal; Gotelli & Colwell, 2001). While these could appear similar upon initial reflection, the discrepancy results in a differential sampling requirement to satisfy sampling completeness. This distinction arises from the disparity in sampling focus: traditional broadly focused field-based empirical network ecology often observes pairwise interactions between two diverse assemblages (e.g. pollinators and flowers) while dietary metabarcoding mostly focuses on interactions between a single taxon and its resources (e.g. a single spider species feeding on its prey in a given environment). In the former, interactions are spatially constrained (i.e. to the species found in the surveyed location) and often phytocentric, while in the latter, interactions are only taxonomically constrained (i.e. the spider can interact with species beyond the sampling area, but only those accessible to that single spider species) and often zoocentric (Evans & Kitson, 2020; Quintero et al., 2021).
The resultant disparity in sampling will naturally cause differences in the stringency of sampling completeness assessment, resulting in uneven representation of perceived interaction diversity (Jordano, 2016; Quintero et al., 2021). This is particularly problematic when these two network layers (i.e. plant-herbivore interactions via transect surveys and predator-prey interactions via dietary metabarcoding) are merged, possibly resulting in temporally and spatially distinct assemblages of herbivores between the two networks (e.g. metabarcoding will represent nocturnal prey and those beyond the sampled area; Figure 2).
To use both methods to characterise a single network layer (i.e. both phyto- and zoocentric) would, however, provide the greatest resolution and diversity of interactions through the synergy of their distinct biases (Evans & Kitson, 2020; Wirta et al., 2014).
Ultimately, most estimates of sampling completeness in network and trophic ecology poorly represent the specialisation of those species studied (Macgregor et al., 2017). For typical network interaction data, Macgregor et al. (2017) recommend calculating the weighted mean (weighted by the estimated interaction richness per species) of sampling completeness for all species observed at the focal level (the level directly constrained by sampling); in the case of dietary metabarcoding, this would be the consumer, and for transect surveys of plant-herbivore interactions, the plants. The ability to detect multiple interactions from one consumer using dietary metabarcoding, however, facilitates the use of individual data, rather than the species-level data aggregation necessary to avoid false presumption of specialisation using observation data. Importantly, the typical asymptotic means of determining sampling completeness can poorly represent rare species, unlike equivalent Hill number-based approaches which account for sample size and coverage in rarefaction and extrapolation of diversity (Hsieh et al., 2016; Roswell et al., 2021). Such approaches can be applied to network (Ohlmann et al., 2019) and dietary metabarcoding data (Alberdi et al., 2020), even taxonomy-free and presence-absence DNA metabarcoding data (Mächler et al., 2021).
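The two quantities discussed above, a richness-weighted mean of sampling completeness and Hill numbers, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure of the cited studies: the bias-corrected Chao1 estimator is swapped in here as the richness estimator for simplicity, and the function names are hypothetical.

```python
import math


def hill_number(counts, q):
    """Hill number of order q from interaction (or detection) counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit as q approaches 1: exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed estimator for this sketch)."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))


def weighted_completeness(counts_by_species):
    """Mean per-species completeness, weighted by estimated interaction richness.

    counts_by_species maps each focal-level species (e.g. each consumer) to the
    counts of its recorded interactions with each resource.
    """
    numerator = 0.0
    denominator = 0.0
    for counts in counts_by_species.values():
        estimated = chao1(counts)
        if estimated == 0:
            continue  # species with no recorded interactions carry no weight
        observed = sum(1 for c in counts if c > 0)
        numerator += (observed / estimated) * estimated
        denominator += estimated
    return numerator / denominator


# Hypothetical consumers and interaction counts.
example = {"spider_a": [10, 5, 1, 1, 1], "spider_b": [4, 4, 2]}
completeness = weighted_completeness(example)
```

Because each species' completeness is weighted by its own estimated richness, the weighted mean reduces to total observed richness divided by total estimated richness, which is why poorly sampled generalists pull the value down more than well-sampled specialists.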
TABLE 1 Frequency-of-occurrence network weightings as generated conventionally and by first normalising the binary data

Differences in sampling completeness as a consequence of sampling perspective disparities between compartments of a merged or multilayer network pose a distinct problem. Greater or lesser sampled interaction types can bias subsequent multilayer network-level analyses. This is a wider problem for multilayer network analyses, yet it is particularly important when considering how we might go about constructing these networks using molecular methods in the future. Sampling completeness influences network metrics, topology and interpretation (Macgregor et al., 2017) and suboptimal sampling completeness will inevitably reflect insufficiencies of a given sampling strategy to capture an adequate representation of ecological diversity in the time/effort applied. The taxa or interactions omitted will depend on the mechanism of collection. For example, methods measuring activity-density (e.g. observation-based flower visitation surveys) will likely omit the least active species, or species that occur at different times of day to those sampled, whereas dietary metabarcoding is more likely to omit less frequently consumed taxa (rarer taxa or those towards which the consumer exhibits less preference).
Ultimately, these two data sources may result in networks that adequately represent the species present but may fail to join nodes between layers among which legitimate ecological interactions occur.
Such biases must be fully considered when assessing network topologies based on the merging of data derived from dietary metabarcoding and more traditional sources.
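One simple diagnostic for the mismatch described above is to list the nodes that fail to join between layers once the two data sources are merged. The following Python sketch assumes each layer is an edge list of (resource, consumer) or (predator, prey) pairs; the function name `layer_mismatch` and all taxa are hypothetical.

```python
def layer_mismatch(plant_herbivore, predator_prey):
    """Compare the herbivore nodes shared by two layers of a merged network.

    plant_herbivore: iterable of (plant, herbivore) links, e.g. from transects.
    predator_prey: iterable of (predator, prey) links, e.g. from metabarcoding.
    Returns prey lacking any basal link, herbivores lacking any predator link,
    and the taxa that join the two layers.
    """
    herbivores = {herbivore for _plant, herbivore in plant_herbivore}
    prey = {item for _predator, item in predator_prey}
    return {
        "prey_without_basal_link": sorted(prey - herbivores),
        "herbivores_without_predator": sorted(herbivores - prey),
        "shared": sorted(prey & herbivores),
    }


# Hypothetical layers: a daytime flower transect and one spider species' diet.
transect_links = [("daisy", "hoverfly"), ("daisy", "thrips"), ("clover", "bee")]
diet_links = [("spider", "thrips"), ("spider", "moth"), ("spider", "aphid")]

report = layer_mismatch(transect_links, diet_links)
```

Prey flagged as lacking a basal link are not necessarily errors: they may be nocturnal taxa, taxa from beyond the surveyed area or other predators, so the output is a prompt for further sampling or curation rather than a filter.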

| METABARCODING BIASES
Sampling discrepancies do not stop in the field. Detectability biases, issues that result in the non-detection of specific components of ecological networks, are a critical consideration in any network (Quintero et al., 2021). The metabarcoding process involves many key biases which directly influence the detection of taxa in the diets of consumers. Given the reliance of metabarcoding on amplification of DNA, the selection of appropriate PCR primers is possibly the most critical step (Piñol et al., 2018). Primers must be designed to amplify the DNA of all target species simultaneously from mixed communities and, to do so, an appropriate marker must be selected (Deagle et al., 2014; Elbrecht & Leese, 2016a, 2016b). Metabarcoding primers for dietary analysis must amplify the DNA of a full range of the consumed species of interest, ideally without strongly amplifying the DNA of the consumer. Given the degraded quality of consumed DNA and the intact state of consumer DNA, amplification of the predator is much more efficient and is likely to comprise a large contingent of the PCR product (Paula et al., 2015; Waldner et al., 2013). To circumvent this issue, blocking probes were developed, which prevent amplification of the DNA of specific taxa (Vestheim & Jarman, 2008), but these introduce biases of their own and can have unpredictable non-target effects (Murray et al., 2011; Piñol et al., 2015). Blocking primers could skew the representation of certain consumed taxa, ultimately excluding important interactions from the network entirely. Primers can otherwise be designed carefully with a comprehensive reference database to amplify only target species, or amplification of the consumer's DNA must be accepted and bioinformatically filtered out (Piñol et al., 2014).
Each of these solutions (blocking probes, exclusive amplification, loss of sequencing depth to consumers) imposes sometimes quite severe taxonomic biases, greatly affecting the detection of interactions and thus perceived network structure and ecological function.

FIGURE 2
Integrating single-species dietary metabarcoding predator-prey data into plant-herbivore networks constructed from direct observation may disproportionately introduce nodes that are not linked to basal resources and exclude important predators. By surveying a transect of one or few flowering plants, the herbivore network diversity is low given the spatio-temporal (and thus likely taxonomic) restrictions to plant diversity, whereas dietary metabarcoding of predators will reveal a broad range of interactions for that one predator taxon beyond the spatial and temporal constraints of the sampling. The single-species focus of many dietary metabarcoding studies would also neglect many key predators that may be commonly interacting with the surveyed herbivores. Network completeness could be improved by sampling both layers of the network using both techniques (i.e. incorporation of herbivory data from dietary metabarcoding of herbivores and inclusion of a greater range of predators by dietary analysis).

Taxonomic biases inherent to the metabarcoding process are differentially impactful depending on the diet of the consumer. Specialist consumers, for example, will naturally present a restricted range of potential consumed taxa, thus the exclusion of dietary taxa is probabilistically less likely, particularly if multiple species are rarely consumed together. On the other hand, generalist consumers will present a highly heterogeneous diet with many taxa consumed together, many of which may not even be predictable based on limited observation data, thus taxonomic biases may be more likely to omit rarer taxa or those taxa against which PCR bias is particularly strong. Importantly, the omission of cannibalism from metabarcoding data, the detection of which cannot be

| TRUE VERSUS FALSE POSITIVES
While false negatives are a crucial concern in metabarcoding, some consideration must also be given to the potential impact of false positives on network construction. Given its sensitivity to minute quantities of DNA, metabarcoding is highly prone to the detection of contaminants (e.g. environmental contaminants, cross-contamination; Alberdi et al., 2018; Jusino et al., 2019). The PCR and high-throughput sequencing process can also lead to the production of errors (e.g. sequencing errors and chimeras). In both cases, taxa which were not actually consumed will appear in the diet of the consumer, sometimes representing taxa absent from the study site, system or even continent.
There are many aspects of best practice that can mitigate these issues. Appropriately sterile approaches to fieldwork that limit cross-contamination, for example sterilisation of any tools and individual collection of samples, can reduce contamination prior to DNA extraction (Athey et al., 2017; King et al., 2012). Stringent use of both negative and positive controls throughout the experimental process, and implementation of additional safeguards like spatial separation of pre- and post-PCR samples and oil sealing of reactions (e.g. Kitson et al., 2019), are essential to ensure confidence in the study outcomes (Taberlet et al., 2018). A key aspect of this is the choice of bioinformatics process, which can profoundly affect the data output and thus any ecological inferences subsequently drawn (Clare et al., 2016). Following or even during bioinformatics, appropriate measures must be taken to remove false positive artefacts in the data output. These measures, termed minimum sequence copy thresholds, have traditionally involved the removal of read counts below an arbitrary value, but percentage sample read count thresholds and use of experimental controls to determine threshold values provide a more even solution which can more accurately remove false positives while maximising retention of target data. Instances in which these principles are not adhered to may introduce temporally, spatially or ecologically infeasible interactions and taxa into the networks.
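The two threshold types described above, a percentage of each sample's reads and a value derived from experimental controls, can be combined in a short Python sketch. The function name `filter_detections`, the default proportion of 1% and the data are hypothetical assumptions for illustration; appropriate threshold values depend on the study and its controls.

```python
def filter_detections(samples, negative_controls, min_proportion=0.01):
    """Remove likely false-positive detections from per-sample read counts.

    samples: dict mapping sample ID to a dict of taxon -> read count.
    negative_controls: list of dicts of taxon -> read count from blanks.
    min_proportion: assumed minimum share of a sample's raw reads that a
    taxon must reach to be retained (hypothetical default).
    """
    # Highest read count observed for each taxon in any negative control.
    control_max = {}
    for control in negative_controls:
        for taxon, reads in control.items():
            control_max[taxon] = max(control_max.get(taxon, 0), reads)

    filtered = {}
    for sample_id, counts in samples.items():
        total = sum(counts.values())  # proportion is taken from raw totals
        kept = {}
        for taxon, reads in counts.items():
            if reads <= control_max.get(taxon, 0):
                continue  # no more reads than seen in a blank
            if total == 0 or reads / total < min_proportion:
                continue  # below the percentage-of-sample threshold
            kept[taxon] = reads
        filtered[sample_id] = kept
    return filtered


# Hypothetical gut-content sample and one extraction blank.
samples = {"spider_01": {"fly": 900, "thrips": 95, "moth": 5}}
blanks = [{"thrips": 100}]

cleaned = filter_detections(samples, blanks)
```

In this example the moth detection falls below 1% of the sample's reads and the thrips detection does not exceed the blank, so only the fly interaction would enter the network. A stricter or more lenient threshold trades false positives against the loss of genuinely rare interactions.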

| RESOLUTION AND REFERENCE DATABASE ISSUES
A significant problem in metabarcoding and indeed most taxonomically inclined DNA-based studies is the completeness of reference databases, which are necessary in linking sequence data to a taxonomic identity. Taxonomic resolution inherently affects network structure and even small changes can profoundly alter network metrics, ultimately reducing inter-network comparability (Hemprich-Bennett et al., 2020). Metabarcoding amplicons are often shorter than those typically used for standard barcoding to account for degradation of DNA in environmental samples (Paula et al., 2015; Symondson, 2002; Zaidi et al., 1999). For example, the COI primers typically used for animal barcoding produce a 658 bp amplicon (Folmer et al., 1994), whereas metabarcoding primers are typically designed to amplify 100-350 bp amplicons given the limit imposed by Illumina technology, but also to facilitate the detection of degraded DNA, including semi-digested DNA in the guts and faeces of consumers (Elbrecht & Leese, 2016a; Leray et al., 2013; Vamos et al., 2017; Zeale et al., 2011). As such, the taxonomic resolution can be drastically reduced in metabarcoding studies given the smaller amount of data on which to base taxonomic assignments. This is, of course, similarly true of morphological identification of specimens in or from the field, which can include cryptic species which are best differentiated with DNA data, but the incompleteness of reference databases can be particularly debilitating for taxa or geographical regions containing poorly described species pools. Taxonomy-free approaches can be employed by clustering sequence data based on genetic similarity (i.e. operational taxonomic units), or representing unique error-corrected sequences separately (i.e. amplicon sequence variants or zero-radius operational taxonomic units), but this can impose its own taxonomic biases (Clare et al., 2016) and might increase disparity with network layers determined by other means.
A key factor in determining taxonomic resolution is the choice of marker gene. Depending on the study system (e.g. terrestrial invertebrates vs. marine vertebrates), some genes, for example the 16S gene, may provide greater taxonomic resolution at the expense of a less populated reference database. This, alongside the aforementioned PCR biases, renders primer choice one of the most critical steps in any metabarcoding workflow (Piñol et al., 2018).
Assembling comprehensive reference databases, particularly where fauna and flora are poorly characterised or hyper-diverse, can be laborious and expensive (Gonzalez et al., 2009), resulting in poorer characterisation of consumed taxa in the diets of local species (Quéméré et al., 2013). This, much like the taxonomic biases presented by PCR, can involve poorer resolution or omission of large compartments of interaction networks, specifically those concerning disproportionately unstudied taxa lacking publicly available barcode data. Many of those species for which barcode data are lacking are those for which we have the poorest ecological understanding, such as cryptic invertebrates or undiscovered species.
This creates a self-fulfilling prophecy of scientific neglect as we rely ever more on metabarcoding-based approaches, sustaining a relatively poor understanding of the ecologies and interactions of these taxa. Observation-based knowledge of the diet of the focal organisms, or at least predicted diets based on related taxa, may allow relevant reference database checks to be carried out in advance of the study, but such information may itself be based on similarly biased means. Network ecology can be a powerful tool for elucidating otherwise unobserved interactions; a comprehensive reference database is thus the ideal precursor to any such study, even if not all taxa are resolved to the species level. Ultimately, networks can even be constructed and analysed irrespective of Linnaean taxonomy by treating operational taxonomic units/amplicon sequence variants as distinct ecological units; this can, however, neglect functional information typically derived from taxonomic assignment.
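A reference database check of the kind described above could be sketched as follows. This is an assumption-laden illustration: the taxon lists are hypothetical, the function name `reference_coverage` is invented for this example, and in practice the set of barcoded taxa would be exported from whichever reference database and marker the study uses.

```python
def reference_coverage(expected_taxa, reference_taxa):
    """Return the proportion of expected taxa with a reference sequence,
    and the sorted list of taxa that are missing from the reference set."""
    expected = set(expected_taxa)
    if not expected:
        return 0.0, []
    missing = sorted(expected - set(reference_taxa))
    return 1 - len(missing) / len(expected), missing


# Hypothetical expected diet, e.g. from observations or related consumers.
expected_diet = [
    "Aphis fabae",
    "Sitobion avenae",
    "Tenuiphantes tenuis",
    "Bembidion lampros",
]

# Hypothetical set of taxa with barcodes for the chosen marker.
barcoded = {"Aphis fabae", "Sitobion avenae", "Bembidion lampros"}

coverage, missing = reference_coverage(expected_diet, barcoded)
```

Taxa returned in `missing` are candidates for targeted barcoding before the study begins. As noted above, the expected diet list may itself be biased, so full coverage of that list does not guarantee that the reference database is complete.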

| THE ASSUMPTION OF CONSUMPTION
The mechanism of interaction, how and why organisms interact with one another, is an ecologically crucial consideration in network anal-

| SECONDARY CONSUMPTION
Erroneous interaction data may also arise from secondary predation, in which taxa detected in a consumer were in fact consumed by its prey and carried in the prey's gut, thus representing three trophic levels as two (Pompanon et al., 2012; Silva et al., 2019). As with accidental consumption, this will be indistinguishable from a genuine consumption event through the lens of metabarcoding data.
By misrepresenting the number of trophic levels, direct interactions may be assumed which could be ecologically infeasible. If, for example, a hypothetical siphon-feeding herbivorous insect that had recently fed on cereal stems was predated by a spider, and the spider was screened for consumed plant DNA using metabarcoding, the observer might incorrectly assume that the spider had fed directly on the cereal plant, possibly resulting in its incorrect classification as a crop pest (Figure 5). Such misinterpreted interactions could lead to incorrect ecological conclusions and improperly constructed networks. The problems of accidental consumption and secondary predation are particularly profound in the case of omnivores, which is discussed by Tercel et al. (2021).
FIGURE 4 Accidental consumption of parasitoid DNA by a predator consuming its prey can be represented as an intentional interaction. While the ecological consequences are undeniable, the intention of the interaction is of critical importance depending on the application of the data

The ambiguous nature of these interactions does not pose a problem specific to the merging of different data sources, but is a consideration that novel adopters of metabarcoding focused on network compilation may not be aware of, resulting in introduction of erroneous interactions. By introducing such data into merged networks, the network topology may be skewed by ecologically meaningless interactions, or conflicts may arise through contrasting results from different data sources. Such instances may, however, be detected in metabarcoding data through co-occurrence analysis, facilitating their contextualisation and possible removal, though these are not a panacea and must be interpreted with caution (Blanchet et al., 2020; Tercel et al., 2021).
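One very simple form of co-occurrence screening can be sketched as below. This is a minimal illustration rather than a published method: the sample identifiers and detections are hypothetical, the function name `flag_possible_secondary` is invented here, and the list of herbivores is assumed to come from prior knowledge of the system. Plant detections in a predator are flagged whenever a known herbivore was detected in the same sample.

```python
def flag_possible_secondary(samples, plants, herbivores):
    """Flag plant detections that co-occur with a herbivore in one sample.

    samples: dict mapping sample ID to the set of taxa detected in it.
    Returns a list of (sample_id, plant, co_detected_herbivores) tuples.
    """
    flagged = []
    for sample_id, detected in sorted(samples.items()):
        co_herbivores = sorted(detected & herbivores)
        if not co_herbivores:
            continue  # no herbivore present, nothing to flag in this sample
        for plant in sorted(detected & plants):
            flagged.append((sample_id, plant, co_herbivores))
    return flagged


# Hypothetical detections from three spider gut samples.
samples = {
    "spider_01": {"barley", "aphid"},
    "spider_02": {"barley"},
    "spider_03": {"aphid", "fly"},
}

flagged = flag_possible_secondary(
    samples, plants={"barley"}, herbivores={"aphid"}
)
```

A flag marks a detection for closer inspection and is not evidence of secondary consumption in itself. Unflagged detections such as the barley in `spider_02` may still be secondary if the herbivore DNA had degraded or failed to amplify, which is one reason such screening must be interpreted with caution.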

| HOW DIETARY METABARCODING CAN BE USED TO CONSTRUCT MULTILAYER NETWORKS
Metabarcoding is a useful tool in the construction of multilayer networks, and with cautious and reasoned use may drive forward this field of empirical research. Quintero et al. (2021) present a valuable discourse on the merging of plant-frugivore interaction network data in which two methods, grand total standardisation and min-max scaling, are suggested for merged data similar to that generated by metabarcoding and observations. These approaches are suitable for the merging of interaction data in which the sampling effort cannot be easily compared or corrected for. Many of the issues inherent to dietary metabarcoding (e.g. PCR bias, quantification, taxonomic resolution) are, however, unavoidable. It is thus important to consider fully how these biases may present in the data of such merged networks.
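The two scaling approaches named above can be sketched under a common reading of their names, which is an assumption here and may differ in detail from the formulation given by Quintero et al. (2021). Grand total standardisation is taken to mean dividing each link weight by the sum of all weights in that data set, and min-max scaling to mean rescaling weights to the range 0 to 1. The link weights and function names are hypothetical.

```python
def grand_total_standardise(weights):
    """Divide each link weight by the sum of all weights in the data set."""
    total = sum(weights.values())
    return {link: weight / total for link, weight in weights.items()}


def min_max_scale(weights):
    """Rescale link weights to [0, 1] as (w - min) / (max - min)."""
    lowest, highest = min(weights.values()), max(weights.values())
    if highest == lowest:
        # Assumption: if all weights are equal, treat every link as 1.0.
        return {link: 1.0 for link in weights}
    span = highest - lowest
    return {link: (weight - lowest) / span for link, weight in weights.items()}


# Hypothetical observation-based visit counts.
observed = {
    ("bird_A", "plant_X"): 30,
    ("bird_A", "plant_Y"): 10,
    ("bird_B", "plant_X"): 60,
}

# Hypothetical metabarcoding detections (number of positive samples).
metabarcoded = {
    ("bird_A", "plant_X"): 4,
    ("bird_B", "plant_Z"): 1,
}

observed_gts = grand_total_standardise(observed)
metabarcoded_gts = grand_total_standardise(metabarcoded)
observed_mm = min_max_scale(observed)
```

Both transformations place the two data sets on a unitless scale so that they can be compared or combined without a shared measure of sampling effort. Under this reading, min-max scaling assigns the weakest link in each data set a weight of zero, which may matter if zero is otherwise used to denote an absent interaction.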
The impact of bias on every stage of the metabarcoding process, from sampling, through extraction, PCR and sequencing, to bioinformatics, data processing and statistics, must be considered prior to each study to minimise these effects where possible. Particular scrutiny must be applied to the study outcomes and to how the data may have been affected by the constraints of the methodology. This review has highlighted these problems and provides an overview of some of the solutions that might be applied to mitigate them (  (Paula et al., 2016). As sequencing throughput increases, such techniques are likely to become commonplace in network ecology, and those that employ them must be aware of the possible problems inherent to their merging with other data types.
Biases exerted by molecular techniques for dietary analysis and network construction are plentiful (e.g. PCR and primer biases as well as sequencing biases) but also distinct from those encountered using more traditional approaches (e.g. observer bias, temporal bias and weather effects). Combining data will yield a greater depth of understanding of complex ecological networks, and the merging of dietary metabarcoding data into observation-based networks provides unprecedented access to interaction data. Data merging improves overall network completeness and facilitates the inclusion of taxa otherwise methodologically precluded. The fundamental mismatch between these data types and their associated biases cannot, however, be ignored. Awareness of these critical considerations from the very beginning of experimental design will mitigate many of these issues through well-designed experiments constructed around the given research question. Conscientious data curation is also paramount in unravelling the complex interactions that can be unveiled using these approaches in synergy.
Molecular dietary analysis offers the potential for transparent and repeatable network construction on potentially massive scales so long as those data are accessible, comprehensible and accurately generated. Even 'failed' experiments can valuably add to the discourse surrounding the frontiers of metabarcoding and network ecology, and their reporting should be encouraged and made commonplace in such publications (e.g. see supplementary materials of Kitson et al., 2019). Existing sequencing-based data repositories such as NCBI facilitate this to some degree, but the combined use of molecular and observational data may require new standards of data handling and management for standardisation in this rapidly emerging cross-disciplinary subfield.

FIGURE 5 Secondary consumption can be indifferentiable from direct consumption through the lens of metabarcoding. If a hypothetical spider was to predate an aphid which had recently fed on a barley plant, the barley DNA in the aphid's gut will be assimilated into the spider's gut. This might be treated as a direct interaction between the spider and the barley despite the low ecological likelihood of a spider feeding on a plant, resulting in misrepresentation of the spider's ecology and ecosystem service provision

ACKNOWLEDGEMENTS
We thank two anonymous reviewers for their constructive input.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

AUTHORS' CONTRIBUTIONS
J.P.C. conceptualised the review with substantial input from F.M.W. and M.P.T.G.T. J.P.C. led the writing and all authors contributed toward the writing, concepts and synthesis. J.P.C. designed and produced the illustrations.

PEER REVIEW
The peer review history for this article is available at https://publo ns.

DATA AVAILABILITY STATEMENT
There are no data associated with this manuscript.

Prior knowledge of each animal's behaviour and ecology, feeding trials including the interacting species or co-occurrence analysis to ascertain any common links that might indicate secondary consumption