How to make methodological decisions when inferring social networks

Abstract Social network analyses allow studying the processes underlying the associations between individuals and the consequences of those associations. Constructing and analyzing social networks can be challenging, especially when designing new studies, as researchers are confronted with decisions about how to collect data and construct networks, and the answers are not always straightforward. The current lack of guidance on building a social network for a new study system might lead researchers to try several different methods and risk generating false results arising from multiple hypothesis testing. Here, we suggest an approach for making decisions when starting social network research in a new study system that avoids the pitfall of multiple hypothesis testing. We argue that the best edge definition for a network is a decision that can be made using a priori knowledge about the species and that is independent from the hypotheses that the network will ultimately be used to evaluate. We illustrate this approach with a study conducted on a colonial cooperatively breeding bird, the sociable weaver. We first identified two ways of collecting data using different numbers of feeders and three ways to define associations among birds. We then evaluated which combination of data collection and association definition maximized (a) the assortment of individuals into previously known “breeding groups” (birds that contribute toward the same nest and maintain cohesion when foraging) and (b) socially differentiated relationships (more strong and weak relationships than expected by chance). This evaluation of different methods based on a priori knowledge of the study species can be implemented in a diverse array of study systems and makes the case for using existing, biologically meaningful knowledge about a system to help navigate the myriad of methodological decisions about data collection and network inference.


| INTRODUCTION
Social network analysis (SNA) has gained popularity in behavioral ecology as a tool to study the processes underlying the associations between individuals and the consequences of those associations (Cantor et al., 2019). It allows biologists to characterize not only the social environment experienced by a single individual in the population, but also the broader social characteristics of a population (Newman, 2010). However, while the methods involved in analyzing a network are reasonably well-explained (e.g., Whitehead, 2008), there are many decisions involved with the design of data collection and creating the network itself (Farine & Whitehead, 2015).
Decisions about the design of a study can have consequences on the inferred network structure (James, Croft, & Krause, 2009). How can we know that our design decisions produce a suitable network for the species and the type of hypotheses we are studying? There is generally little discussion of the considerations made when designing a network-based study, with most published papers presenting their design as a "fait accompli." When analyzing a social network, the key decision that needs to be made is how to define the relationships (edges) connecting the individuals (nodes). This definition can include two main components.
The first is the set of considerations relating to how data are collected (e.g., direct observations vs. video recordings), and the second is the decisions that relate to how observations are turned into edge weights (e.g., rate of interactions vs. time spent together). In most systems, the scope of decisions about data collection appears constrained by methodological limitations, but often there are choices that reflect some trade-offs. For example, is it better to collect fewer data across more individuals at once or to collect more detailed data on fewer individuals? These decisions in turn have consequences for hypothesis testing. Davis, Crofoot, and Farine (2018) provided a useful general discussion on the impact of these trade-offs. However, there is no general guidance on how to quantify the relative value of different approaches when faced with designing methods for real data collection.
Once data are collected, the second set of considerations that arise reflect decisions about how to calculate the strength of the relationships among individuals. While one aspect determining the accuracy of a network is to ensure that sufficient data are collected (see Farine & Strandburg-Peshkin, 2015), how data are used to generate quantitative measures of connection strength (edge weights) can also have a large impact on the resulting network. For example, different association indices (Cairns & Schwager, 1987; Hoppitt & Farine, 2018) or different types of data resolution (e.g., the number of grooming bouts versus the amount of time spent grooming) can be used to estimate the strength of a given relationship.
The lack of guidance on how to evaluate different approaches to data collection and network inference might lead researchers to try several different methods and to select the one that best correlates with the predictions of the study (e.g., a positive relationship between a given network metric and survival). Such a correlation could give a false impression that the method chosen produces a network that is successfully capturing the species' or population's social structure. At worst, this approach could constitute a multiple hypothesis testing scenario, elevating rates of type I errors because the design decisions are made based on producing a result. This risk is elevated when combined with opportunities to calculate multiple network metrics (e.g., degree and betweenness). For example, a researcher might be interested in understanding whether specific individual attributes, such as personality, correlate with one or multiple network centrality metrics (e.g., Aplin et al., 2013; Boogert, Farine, & Spencer, 2014; Chock, Wey, Ebensperger, & Hayes, 2017; Johnson et al., 2017; Moyers, Adelman, Farine, Moore, & Hawley, 2018; Wilson, Krause, Dingemanse, & Krause, 2013). In the absence of significant results, it could be tempting to change a posteriori the methods by which the network is generated from the data, such as changing the time window or the proximity criterion used to consider that two individuals are associated. While such an example is extreme, there is an important challenge arising from not knowing whether failing to reject a given null hypothesis is a consequence of the expected pattern not being present or the researchers' failure to correctly construct the network. We therefore need an approach that avoids creating circularity, that is, using the same data tested in different ways to corroborate a given hypothesis, as well as using the significant result to corroborate the quality of the information contained in the network.
This problem is exacerbated by the lack of information, in most published studies, about how design decisions were made, that is, whether they were made arbitrarily (or based on a published study), based on pilot studies, or explored in the way described above (but see Boogert, Farine, et al., 2014; Castles et al., 2014; Mourier, Bass, Guttridge, Day, & Brown, 2017, for some exceptions).
Two complementary approaches can help with making decisions about the design of a network study. The first is to collect pilot data, testing different data collection setups (e.g., varying the number of simultaneous observers collecting data). Unfortunately, this is often not possible, not done, or not reported. The second is to run exploratory a priori analyses aimed at comparing different competing networks resulting from different network generation methods and different edge definitions. For both approaches, we propose (and show) here that comparison of the different methods is made possible by testing and interpreting simple hypotheses that we generally consider a network from that study species should support, before testing the hypothesis of central interest.
Capturing structure in a given species' network that aligns with a priori knowledge on the species can be interpreted as an approximation of a hypothetical ground-truthed network (which is something that is unlikely to be available when working with nonhuman animals). For example, in a species where mother and offspring or breeding pairs create strong social bonds, we expect that the implemented method would result in a network that would be able to capture these preferred associations (i.e., estimate the edge weight within a family/breeding pair as being significantly greater than those between other sets of individuals; see Boogert, Farine, et al., 2014 and Hobson, Avery, & Wright, 2014). Such an analysis would then provide information about whether a network is capable of differentiating, and therefore capturing, one or more important aspects of the biology of the system.
In this paper, we provide an empirical example of how to make decisions about the design of a network study using exploratory a priori tests. We start by formulating simple tests of hypotheses to help guide the design of data collection and network inference. We conducted this study in a population of a colonial and cooperatively breeding bird, the sociable weaver (Philetairus socius). In this population, individuals are marked with PIT tags allowing automatic data collection at feeders containing supplemental food. We decided to collect associations in a feeding context not only because this has been shown to be important and meaningful in other bird studies, but also as a result of the general insights on the social foraging behavior of this species that have been reported in previous studies on this population (Lloyd, Altwegg, Doutrelant, & Covas, 2017; Rat, van Dijk, Covas, & Doutrelant, 2015; Silva et al., 2018). Therefore, it seems reasonable to assume that information about social relationships within a colony could be obtained from foraging associations (see Farine, 2015), if the study is well designed.
We evaluate the performance of different study designs at extracting two fundamental structural aspects of the social system in our study species (herein our test statistics). The first metric is social differentiation, which we calculate using the coefficient of variation (CV). Because sociable weaver colonies are large, we do not expect birds to have the same relationship strength with all colony members (i.e., low values of CV). Thus, an informative network should be one that features large differences in the connection strengths that individuals have in their social network (i.e., having many small and large values, rather than many intermediate values). However, solely relying on social differentiation can be misleading, as high values can be obtained as a result of nonsocial factors (e.g., low sampling or spatially distributed individuals), nor should maximizing social differentiation necessarily result in the most biologically accurate network. Thus, our second metric for testing if the edges in the foraging network reflect social bonds is one that aims to capture something more specific about sociable weaver biology: assortment by breeding group. Sociable weaver colonies contain several breeding groups composed of breeders with their helpers (usually a breeding pair plus one to four helpers; Covas, Dalecky, Caizergues, & Doutrelant, 2006). Assortment is a measure of the tendency for connections in a network to be more common among similar than among dissimilar types of nodes (Farine, 2014; Newman, 2003). Thus, assortment by breeding group is a metric that would capture the tendency of individuals from the same breeding group to be more strongly connected to one another in the network.
We expect this because while aggression between individuals at food patches is common (sociable weavers typically forage in large groups containing many colony members), aggression between members of the same breeding groups is rare (suggesting higher tolerance for other breeding group members, Rat, 2015). Thus, we expect members of the same breeding group to be disproportionately detected together, resulting in a real social network that is assorted by breeding group membership.
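As an illustration of the two test statistics described above, the following is a minimal Python sketch, not the code used in this study. It assumes a symmetric matrix of edge weights `w` and one breeding group label per bird in `groups` (both names are hypothetical). The CV is taken here as the population standard deviation of the dyadic edge weights divided by their mean, and assortment follows the mixing-matrix form of Newman's coefficient for discrete traits, weighted by edge weight.

```python
from statistics import mean, pstdev


def edge_weight_cv(w):
    """Coefficient of variation of the dyadic (upper-triangle) edge weights."""
    n = len(w)
    dyads = [w[i][j] for i in range(n) for j in range(i + 1, n)]
    return pstdev(dyads) / mean(dyads)


def weighted_assortment(w, groups):
    """Newman-style assortment for a discrete trait, weighted by edge weight.

    e[(a, b)] is the share of total edge weight linking group a to group b.
    Assumes at least two groups are connected (otherwise the denominator is 0).
    """
    n = len(w)
    total = sum(w[i][j] for i in range(n) for j in range(n) if i != j)
    labels = sorted(set(groups))
    e = {(a, b): 0.0 for a in labels for b in labels}
    for i in range(n):
        for j in range(n):
            if i != j:
                e[(groups[i], groups[j])] += w[i][j] / total
    within = sum(e[(a, a)] for a in labels)
    expected = sum(
        sum(e[(a, b)] for b in labels) * sum(e[(b, a)] for b in labels)
        for a in labels
    )
    return (within - expected) / (1 - expected)


# Hypothetical example: four birds in two breeding groups.
w = [
    [0.0, 0.8, 0.1, 0.1],
    [0.8, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.8, 0.0],
]
groups = ["g1", "g1", "g2", "g2"]
cv = edge_weight_cv(w)
r = weighted_assortment(w, groups)
```

In this toy network, strong within-group and weak between-group edges give both a high CV and a positive assortment coefficient, which is the pattern the two statistics are meant to detect.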
First, we quantify the effects of data collection decisions on the resulting values of social differentiation and assortment by breeding group. Specifically, we test how allowing different numbers of individuals to feed simultaneously impacts our two test statistics.
Data collection decisions are challenging to make when starting a new study, but are critical because they can have a major impact on the robustness of the resulting network(s) (i.e., whether the data are sufficient to reliably estimate properties of the real social structure; Davis et al., 2018). In our case, it is not clear whether sociable weavers with stronger social relationships feed more synchronously across repeated foraging visits than birds with weaker relationships, or whether the differences in behavior are better defined as the patterns of foraging within a foraging visit (i.e., with whom within the flock the individuals prefer to associate in close proximity). The former requires more widespread effort (i.e., determining only the foraging flock composition), while the latter requires more refined data to be collected within foraging flocks (i.e., more opportunities to record individuals simultaneously at the same site). These two approaches represent a clear cost trade-off, as the former can be achieved with fewer resources compared to the latter. For example, when collecting data using RFID technology, having one feeder fitted with an RFID antenna can be enough to obtain the identity of all individuals in a foraging flock (e.g., Jones, Evans, & Morand-Ferron, 2019), whereas multiple RFID antennas working simultaneously are needed to determine if two birds present in the same flock feed in close proximity. We therefore compare different setups for collecting associations that differ in the number of birds that can be detected in an automated RFID system at the same time.
Second, we focus on how to define associations from within a given dataset. Specifically, we compare three different approaches to generate quantitative measures of edge weights in the network, and test how these subsequently impact our test statistics. Two approaches are based on number of co-occurrences in "foraging events." These are akin to using the "gambit-of-the-group" approach (Franks, Ruxton, & James, 2010;Whitehead & Dufault, 1999), where all birds that are detected (i.e., observed) in a flock together are considered to be associated. However, this approach discards more detailed data that could be available about within-flock structure and instead assumes that birds with strong relationships will tend to be co-observed in the same flock more often than those with weak relationships. The third approach is a more direct measure of the proportion of time that two individuals spend in close proximity within the flocks. That is, because we collected data at multiple readers in close proximity, we could estimate how much time two individuals spent on neighboring feeders.
Our aim is to provide guidance on how to make decisions when dealing with choices in the design of data collection and/or network inference. We achieve this by drawing from an empirical example in which we use existing knowledge of our study species to guide decisions for designing a network study. In doing so, our study highlights how relatively simple approaches, using short periods of pilot data collection and evaluating network data against basic knowledge about the study species, can facilitate making methodological decisions that could have long-term impact on the success of a study. While our focus is on collecting and analyzing network data, such an approach goes beyond studies of animal social networks.

| Study scope and model species
We studied a population of sociable weavers at Benfontein Nature Reserve, situated ca. 6 km southeast of Kimberley, in the Northern Cape Province, South Africa. The sociable weaver is endemic to the semiarid savannahs of southern Africa (Maclean, 1973a) and feeds mainly on insects and seeds (Maclean, 1973c). Sociable weavers build large nests, usually on Acacia (Vachellia) trees, with several independent chambers where the birds roost throughout the year and where breeding takes place (Maclean, 1973b). This species exhibits three noticeable cooperative behaviors: building the communal nest, feeding nestlings of others, and communal nest defense from predators such as snakes (e.g., Boomslang, Dispholidus typus and Cape cobra, Naja nivea). The size of a colony can range from fewer than ten to several hundred individuals. The breeding pairs can breed either with or without helpers (30%-80% of breeding attempts have helpers; Covas, Du Plessis, & Doutrelant, 2008).
This study is part of a long-term research program which involves the annual capture of 14 colonies to maintain an individually marked population (all individuals are marked with a unique metal ring and color combination: Covas et al., 2008; Paquet, Doutrelant, Hatchwell, Spottiswoode, & Covas, 2015). At five colonies, all birds are also marked with a passive integrated transponder (PIT tag, enclosed in a plastic leg ring). These colonies ranged in size from 43 to 82 individuals (colony size estimated from the annual captures in September 2017).

| Breeding groups' identification
Breeding groups were determined using video recordings of the chambers during the reproductive season of October 2017 to January 2018.
We routinely inspected all colonies every 3 days to identify initiation of new clutches. We visited chambers in the days around the expected hatching date to determine the age of the nestlings and then recorded each breeding group for at least 2 hr when the chicks were between 8 and 20 days old. We considered an individual as part of the group if it was seen feeding the chicks at least 3 times, as occasionally some individuals try to feed but are expelled by the breeding group.

| SNA data collection
During December 2017 and April 2018, we collected two rounds of association data in a feeding context using artificial feeders at the 5 PIT-tagged colonies. For all the 5 colonies, the feeding location was 80-205 m away from the colony.
Data from three of the five colonies were collected using a setup containing 2 feeding boxes (high competition setup), each with 4 perches and 4 small standard plastic bird feeders. Each small feeder allowed for only one bird to feed at a time and was fitted with an RFID antenna (Priority1rfid, Melbourne, Australia) connected to a data logger (Figure 1a). Data from these three colonies were collected for 14 days (sampled continuously).
At two of the five colonies (of similar sizes, 43 and 44 individuals), we evaluated alternative methods for collecting feeding associations by varying the number of birds that could feed at the same time. We introduced an alternative setup comprising 4 feeding boxes instead of 2 (low competition setup; Figure 1b), allowing birds to spread out more when visiting the feeding station and, therefore, for us to collect more observations of cofeeding. Data for each setup (high and low competition) were collected within the same study period, alternating between the setups each day. This design allowed us to make direct comparisons of the two setups without a confounding factor of time period in which the data were collected, the number of days that each setup was used to collect data, or which colony data were collected from. We collected 10 days of data for each setup.

| Edge weight calculations
FIGURE 1 Setup for collecting associations. (a) A feeding box with birds feeding at the four plastic feeders and the RFID antennas. (b) The low competition setup with four feeding boxes. Photographs by Cecile Vansteenberghe

The stream of data collected in the field consisted of temporal sequences of PIT tag codes detected at each of the feeder perches.
From these data, we calculated associations in two different ways:

1. Co-occurrence method. We first used the gambit of the group, where all individuals that are observed together are considered to be equally connected to each other (i.e., a flock) and the strength of connections is estimated based on the repeated patterns of co-occurrences of individuals in the same observation. However, there are several ways a flock can be defined (see Farine & Whitehead, 2015). Here, we used an established method of inferring flocks based on the time differences between two detections. The start and end times of a "wave" of individuals considered to be forming a flock are determined by a Gaussian mixture model (GMM; using R package "asnipe" Farine, 2013; following Psorakis et al., 2015), which is an automated clustering algorithm designed to detect peaks, or clusters of detections, in the temporal profile of activities at the artificial feeders. This approach uses data from the feeding behavior of the entire set of individuals as part of determining the associations between any two individuals.
2. Time overlap method. We estimated association strengths directly from the data by calculating the total time that two individuals overlapped while feeding at the same feeding box. This approach does not use any data from other individuals when determining the associations between two individuals.
These two methods are described in more detail below. For the co-occurrence method, we used two variants (see Figure 2): one focused on the association at the broad flock level (single GMM) and the other added a second step of estimating association within each flock (double GMM). Therefore, three different network types were compared for each combination of colony and data collection setup (see Figure 3 for an illustration of the different comparisons done in this work).

| Co-occurrence networks
Single GMM (broad flock): We built networks using the rates of co-occurrence in the same so-called "foraging events," as commonly done in other studies (e.g., Aplin et al., 2013). Foraging events were defined using a single run of the GMM (single GMM network) directly on the raw daily RFID feeder data, which splits the temporal data into different foraging events based on peaks of activity on the feeding boxes for that day (following Psorakis et al., 2015). We considered each feeding box as a different location to allow us to split the flock spatially in order to achieve a greater resolution in detecting preferred associations. We inferred the association strengths (edge weights) among colony members from their copresence across all foraging events. We used the simple ratio index: the number of times that two individuals were in the same foraging event divided by the number of foraging events that contained at least one of the two individuals.
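The simple ratio index defined above can be sketched in a few lines of Python. This is an illustrative sketch only (the study itself used the R package asnipe); the function name and the representation of each foraging event as a set of bird identities are assumptions made for the example.

```python
from itertools import combinations


def simple_ratio_index(events):
    """Simple ratio index for every dyad.

    events: list of sets, each holding the IDs detected in one foraging event.
    Returns a dict mapping (id_a, id_b) to
    (events containing both) / (events containing at least one).
    """
    ids = sorted(set().union(*events))
    sri = {}
    for a, b in combinations(ids, 2):
        both = sum(1 for ev in events if a in ev and b in ev)
        either = sum(1 for ev in events if a in ev or b in ev)
        sri[(a, b)] = both / either
    return sri


# Hypothetical foraging events for three birds.
events = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"C"}]
weights = simple_ratio_index(events)
# A and B co-occur in 2 of the 3 events that contain either of them.
```

The same function applies unchanged to the feeding bouts of the double GMM approach, because only the definition of an "event" differs between the two variants.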
Double GMM (within flock): Since our study species is colonial and highly gregarious, we believed that to differentiate the relationships among colony members we would need edges based on co-occurrences at a finer scale than what has traditionally been used for other species (i.e., using the single GMM). Therefore, we used the Gaussian mixture model approach to define associations among individuals using a two-step procedure. Because the data from the feeders are quite discontinuous in this population (i.e., all individuals tend to visit foraging patches together and then all depart together in a very synchronized manner), we first detected the broader activity profile at the set of feeder boxes. We did this by grouping the individuals' detections across all feeder boxes at a location in a given day into 1-min blocks and used the GMM to extract the arrival and departure times of broad foraging events (see Figure 2a). After this first step, we used the GMM again, but this time to detect waves of activity within each of the foraging events determined by the first GMM run. In the second run, we considered each feeder box (containing 4 RFID perches each) as a different location and used detections at a 1-s resolution. Considering each feeding box as a different location allowed us to split the data on the flock spatially, while running the GMMs within each foraging event allowed us to decrease the time scale and forced the GMM to split into shorter feeding bouts (Figure 2b), thereby allowing the detection of within-flock spatial and social preferences. We inferred the association strengths among colony members from their copresence across all feeding bouts generated from the second runs of the GMM (double GMM network). As with the single GMM approach, we used the simple ratio index.

FIGURE 2 Example of applying the GMM algorithm method. (a) Sociable weaver visits to a feeding location during one morning. The top straight lines represent the foraging events resulting from the first GMM. (b) The foraging events resulting from the second GMM, discriminating between the two feeding boxes and using only visits from the first event determined by the first GMM (corresponding to the first horizontal line segment in Figure 2a)

| Time overlap networks
For the time overlap networks, we directly calculated the proportion of total feeding time during which two individuals were feeding simultaneously in the same feeding box (i.e., the time that birds spent feeding side-by-side). Here, edges were calculated by taking the sum of time that two individuals spent feeding at the same time at the feeding box divided by the sum of the total time that at least one of these two individuals was present at the feeder (which is also the simple ratio index, but more explicitly time-based rather than occurrence-based). This method aimed to define a stricter scale at which we consider that two individuals were associated and represents the degree of tolerance to feed together. This method can be more relevant for colonial and very gregarious species such as sociable weavers, since all members of the colony are often found foraging together and are already connected by colony membership, and since our interest is to find a sublevel of sociality within this colony structure.
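The time-based edge weight described above can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the study's implementation: each bird's feeding record is represented as a list of hypothetical `(box, start, end)` visits in seconds, a bird's own visits are assumed not to overlap one another, and the denominator is computed per feeding box as the time at least one of the two birds was present (total time of bird a plus total time of bird b minus their joint time).

```python
def overlap_seconds(visits_a, visits_b):
    """Total time two birds were feeding simultaneously at the same box."""
    total = 0.0
    for box_a, start_a, end_a in visits_a:
        for box_b, start_b, end_b in visits_b:
            if box_a == box_b:
                # Length of the intersection of the two intervals (0 if disjoint).
                total += max(0.0, min(end_a, end_b) - max(start_a, start_b))
    return total


def time_overlap_index(visits_a, visits_b):
    """Time-based simple ratio index: joint time / time either bird was present."""
    together = overlap_seconds(visits_a, visits_b)
    time_a = sum(end - start for _, start, end in visits_a)
    time_b = sum(end - start for _, start, end in visits_b)
    either = time_a + time_b - together
    return together / either if either > 0 else 0.0


# Hypothetical visits: (feeding box, start second, end second).
bird_a = [("box1", 0, 10), ("box2", 20, 30)]
bird_b = [("box1", 5, 15), ("box1", 25, 28)]
weight = time_overlap_index(bird_a, bird_b)
```

Note that only the two focal birds' records enter the calculation, which matches the point made earlier that this method does not use data from other individuals.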

| Hypothesis testing
We evaluated each network we produced by testing whether it was significantly different from networks generated from randomizations of our data and whether it generated patterns that reflect a biologically meaningful social aspect of this species. Specifically, we evaluated the utility of each network we generated (3 variants times 2 data collection methods) according to two test statistics: social differentiation, measured as the CV of the edge weights, and assortment by breeding group.

We tested the statistical significance of the CV and the assortment coefficients by comparing the test statistics calculated from the observed networks with the same statistics calculated from 1,000 random networks generated using permutations of the observed data (see Farine, 2017). For the co-occurrence method, we generated random networks following the method first described by Bejder, Fletcher, and Brager (1998), using the R package asnipe (Farine, 2013). Briefly, for the single GMM networks, we selected pairs of observations of individuals from different foraging events and then swapped these individuals. For the double GMM network, the approach is similar; however, pairs of observations of individuals were selected from the same foraging events (from the first run of the GMM) and at the same feeder, but from different feeding bouts.

For all the 5 colonies, we compared the CV and the assortment coefficients from the 3 different types of networks (single GMM, double GMM, and overlap of time). Additionally, for 2 of those 5 colonies, we also compared each of the network types resulting from data collected using high and low competition setups. This allowed us to test whether we could improve our networks not just in terms of edge definition but also regarding the design of data collection by changing the number of birds that can access food simultaneously.

FIGURE 3 Flow diagram illustrating the steps for the two different comparisons of the study: comparing different methods for calculating edge weights and comparing different data collection setups
As illustrated in the diagram of Figure 3, the decisions about our method for constructing a suitable network for the sociable weavers were guided by both the setup design and the edge definition.
Addressing these two questions might appear to call for a sequential scheme: first looking at feeder saturation and, after deciding whether using the 4 feeding boxes gave a significant improvement, addressing the scale problem (by comparing the different types of networks), or the other way around (first the scale and then the feeder saturation). However, we did not address this as a sequential problem, since the two types of comparisons (comparisons of scale and comparisons of feeder saturation) are not easy to disentangle.
In order to compare the high competition with the low competition setup, we need a reliable edge definition which can only be obtained by comparing the 3 types of networks. However, the best edge definition might differ when using different methods for collecting data.
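The permutation procedure used for the co-occurrence networks can be sketched as follows. This is a simplified Python illustration of a data-stream swap in the spirit of Bejder et al. (1998), not the asnipe implementation: the names, the use of the CV as the statistic, and the one-sided P value are assumptions made for the example. The `strata` argument (hypothetical) restricts swaps to events sharing a label, which is one way to mimic the double GMM restriction to the same foraging event and feeder.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def sri_cv(events, ids):
    """CV of the simple ratio index across all dyads."""
    weights = []
    for a, b in combinations(ids, 2):
        both = sum(1 for ev in events if a in ev and b in ev)
        either = sum(1 for ev in events if a in ev or b in ev)
        weights.append(both / either if either else 0.0)
    m = mean(weights)
    return pstdev(weights) / m if m > 0 else 0.0


def swap_once(events, strata, rng):
    """Attempt one swap of two birds between two events; True if performed."""
    g1, g2 = rng.sample(range(len(events)), 2)
    if strata[g1] != strata[g2]:
        return False
    only1 = sorted(events[g1] - events[g2])  # birds seen in g1 but not g2
    only2 = sorted(events[g2] - events[g1])  # birds seen in g2 but not g1
    if not only1 or not only2:
        return False
    a, b = rng.choice(only1), rng.choice(only2)
    events[g1].remove(a)
    events[g1].add(b)
    events[g2].remove(b)
    events[g2].add(a)
    return True


def permutation_test(events, strata, n_perm=1000, seed=1):
    """Return (observed CV, null CVs, one-sided P value)."""
    rng = random.Random(seed)
    ids = sorted(set().union(*events))
    observed = sri_cv(events, ids)
    working = [set(ev) for ev in events]  # copy so the input is not modified
    null, tries = [], 0
    # Swaps accumulate; the statistic is recorded after each successful swap.
    while len(null) < n_perm and tries < 100 * n_perm:
        tries += 1
        if swap_once(working, strata, rng):
            null.append(sri_cv(working, ids))
    p = (sum(s >= observed for s in null) + 1) / (len(null) + 1)
    return observed, null, p


# Hypothetical data: two pairs of birds that never mix.
events = [{"A", "B"}, {"A", "B"}, {"C", "D"}, {"C", "D"}, {"A", "B"}, {"C", "D"}]
observed, null, p = permutation_test(events, strata=[0] * len(events), n_perm=200)
```

Each swap preserves the size of every event and the number of events each bird attended, so the null networks differ from the observed one only in who was detected with whom.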

| RESULTS
We found that our methodological approach for evaluating different methods for data collection and network inference yielded informative results that could be directly applied when making decisions about study design. All of the methods we used generated networks that were significantly different from random. From an edge definition perspective, the overlap of time method consistently generated networks with higher CV (Table 1) and higher values of assortment (Table 2). While the co-occurrence methods were able to detect the predicted positive assortment by breeding group in most colonies, the overlap of time method consistently produced considerably higher assortment coefficients. The single GMM co-occurrence method was able to generate well-differentiated networks, but performed worse on assortment, with coefficients closer to zero (Table 2). These results suggest that the networks produced by the overlap of time method performed better at capturing a sublevel of sociality within the colony that we expected to be captured in a network of sociable weavers with an appropriate edge definition.
From a data collection methods perspective, using four boxes instead of two resulted in higher CVs and in higher assortment coefficients in both colonies (Tables 1 and 2). In other words, using more feeding boxes at a given site resulted in greater power to discriminate between same breeding group associations within a colony across all the three types of networks. This effect was more pronounced in the co-occurrence method than in the overlap of time method.
Together these results show that using more feeders and an edge definition based on overlap of time produced networks that are able to capture the expected assortment by breeding group and performed better than other methods. We can now use this method to construct networks to test our hypotheses of interest in future research such as testing if specific individual attributes (e.g., personality traits) influence social relationships among the individuals.

TABLE 1 Comparison between the CVs of the three different types of networks obtained using a setup with two and four feeding boxes

... the data collection design and determining how to calculate the strengths of social relationships. We have also shown that, as expected, different edge definitions and experimental designs in the same context can result in different networks: some presenting a low coefficient of variation and thus a network in which individuals are more equally connected, and others with a higher CV, and thus a network containing a higher number of both stronger and weaker relationships. Importantly, we found that the methods that appear best suited to our study system differ from those that have been widely used in studies of PIT-tagged songbird populations, highlighting the need to ensure that methods are tailored to the specific systems under investigation.
In the case of the sociable weaver, we showed that using the time that individuals spent together, rather than data on simpler co-occurrences, generates networks that best captured network features that we a priori identified as being important. For example, the assortment coefficients by breeding group were more than ten times higher in the time-based networks than in the networks generated from co-occurrences. While using a more time-resolved co-occurrence method (double GMM) resulted in a better network to capture assortment by breeding group relative to the standard GMM method, it still performed worse than a network based on the time that individuals spend in close proximity. This would be expected for a species such as the sociable weaver, in which colonies can forage in flocks always containing the same individuals. Thus, while we found that a network definition based on the overlap in time provided the networks that best captured a priori knowledge of the study species' social structure (i.e., the breeding group), it might not necessarily be the best method for all questions or study systems.
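To make the idea of a time-based edge definition concrete, the sketch below sums, for each pair of birds, the time during which both were recorded at the same feeder. The function name `overlap_seconds`, the record layout `(bird_id, feeder_id, start, end)`, and the toy visits are all hypothetical. It deliberately stops at raw shared time; how that time should be normalised into an association index is not specified here and is left as an assumption for the reader to fill in.

```python
from collections import defaultdict
from itertools import combinations


def overlap_seconds(visits):
    """Total time each pair of birds spent at the same feeder together.

    visits : iterable of (bird_id, feeder_id, start, end), times in seconds
    Returns a dict mapping a sorted (bird, bird) tuple to shared seconds.
    """
    by_feeder = defaultdict(list)
    for bird, feeder, start, end in visits:
        by_feeder[feeder].append((bird, start, end))

    totals = defaultdict(float)
    for records in by_feeder.values():
        for (b1, s1, e1), (b2, s2, e2) in combinations(records, 2):
            if b1 == b2:
                continue  # two visits by the same bird are not an association
            shared = min(e1, e2) - max(s1, s2)
            if shared > 0:
                totals[tuple(sorted((b1, b2)))] += shared
    return dict(totals)


# Toy data: A and B overlap at both feeders, A and C only briefly at f2.
visits = [
    ("A", "f1", 0, 60),
    ("B", "f1", 30, 90),
    ("C", "f1", 100, 120),
    ("A", "f2", 200, 260),
    ("B", "f2", 210, 220),
    ("C", "f2", 255, 300),
]

print(overlap_seconds(visits))
```

A co-occurrence approach would treat any two birds assigned to the same feeding event as equally associated, whereas this measure separates pairs that feed side by side for long periods from pairs that only briefly coincide.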
For example, tits (Paridae) spend the winter in flocks with highly dynamic membership that changes over the course of minutes, and pairs of blue tits (Cyanistes caeruleus) can be detected forming through their increased comembership in the same flocks (Beck, Farine, & Kempenaers, 2020). Thus, using a single GMM can extract the social signal from tit flocks because this signal is contained in broader patterns of flocking rather than fine-scale patterns of social proximity. Hence, for each study system, and for each purpose, researchers should carefully consider the best way to construct their networks, which may require experimenting with different methods while avoiding trying them on the hypotheses of interest.
We also generated new insights into how to design data collection protocols. For the sociable weaver, we found that networks generated using more sampling opportunities (in this case a higher number of feeder boxes available simultaneously) produced networks with higher assortment by breeding group. Our finding is in line with the suggestions made in a recent methodological paper that simultaneous sampling data can result in more robust networks (Davis et al., 2018). Even though our analyses are based on only two colonies, the reason for this improvement is easy to explain. Having fewer feeders available increases competition for access to feeders, which, in turn, might reduce the ability of groups of preferred associates within a colony to forage at the same time, and force them to forage with less preferred conspecifics. Alternatively, competition for access to the resource could go as far as causing only the more dominant individuals of each group to have access to the resource, meaning that we would fail to sample subordinates. In either case, having fewer feeders means that birds could not clearly express the social preferences we would expect them to show at more dispersed and more natural resources.
TABLE 2 Comparison between the assortment by breeding groups for the three different types of networks obtained using a setup with two and four feeding boxes

In social network studies, the number of individuals that can be detected at the same time (or in a given time window) is rarely considered or reported. In our study, 8 or 16 individuals could be detected simultaneously, contrasting with studies on tits and other songbirds that use feeders which typically detect one (Jones et al., 2019) or two (e.g., Beck et al., 2020) birds simultaneously. Other field studies, such as recent work on wild zebra finches (Taeniopygia guttata) (Brandl, Farine, Funghi, Schuett, & Griffith, 2019), used feeders with a restricted entrance allowing multiple flock members to enter and exit feeders together.
Reporting the proportion of birds detected feeding together could allow assessing whether restricting data collection to fewer simultaneous observations dilutes true social bonds, causing lower network resolution and potentially leading to less accurate associations, as appears to be the case in the sociable weaver. This issue becomes an important consideration for studies with limited budgets or researcher time, as field studies often face a trade-off between maximizing replication across individuals (i.e., sampling more individuals in total) and maximizing the precision of the data collected (i.e., sampling individuals simultaneously). In our study, one setup requires twice as much equipment, meaning that we could only sample at half the locations or revisit each location half as often. Simulation studies suggest that collecting more simultaneous data is generally preferable (Davis et al., 2018), because networks require many replicated observations of each possible pair of individuals in order to be robust (see Farine & Strandburg-Peshkin, 2015). Such improvement in the resulting networks might well justify the additional economic cost associated with having more feeders or having technology capable of detecting multiple individuals in close proximity.
Our study also illustrates how different data collection methods and algorithms for estimating association strengths can generate different networks (see also Castles et al., 2014). While the different networks that are collected may be correlated (see Farine, 2015), this does not mean they are all equally powerful at testing a hypothesis.
However, when testing network quality, the choice of which a priori knowledge to use is also critical. For instance, while a method that was guided using the assumption that individuals prefer to associate with other members of the same breeding group might be appropriate to study phenomena that potentially involve a social preference Previous studies used simulation-based approaches (Bonnell & Vilette, 2020; Psorakis et al., 2015) to identify the best method to discriminate patterns of social connections, or video data to confirm that the detection data match reality (Evans, Devost, Jones, & Morand-Ferron, 2018; Nomano, Browning, Nakagawa, Griffith, & Russell, 2014). Here, we demonstrated that using a priori knowledge about the study species or population can be helpful in making decisions about which network to use, which we believe is a stronger approach because collecting pilot data captures many of the nuances that come with collecting field data. Anticipating the potential limitations of the method used for data collection provides researchers with the opportunity to make the necessary adjustments before collecting the actual data, avoiding revisiting their methods and even hypotheses a posteriori. The crucial point to keep in mind, however, is that researchers should aim to make a priori decisions (even if some are inevitably arbitrary) about methods for collecting data and building networks, and ensure that these are independent of any later tests of hypotheses. Failing to do so would increase the rates of type I errors in social network studies. Researchers could also make use of preregistration services (Nosek, Ebersole, DeHaven, & Mellor, 2018) to publish the research questions, discuss different methods, and plan analyses and pilot studies before collecting the data and observing the research outcomes. This would not only greatly improve the credibility of research findings but would also provide useful information to other researchers who are planning their studies.
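The inflation of type I error that arises when one hypothesis is tested on several candidate networks can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the tests are independent and that a result is reported if any one of them is significant; networks built from the same data are correlated, so the true inflation would be smaller than this value but still above the nominal rate. The function name `familywise_error_rate` is hypothetical, and the six candidates correspond to the two data collection setups crossed with the three edge definitions compared here.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests tests,
    each run at significance level alpha.

    Assumption: the tests are independent of one another.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# Two feeder setups x three edge definitions = six candidate networks.
n_candidates = 2 * 3
rate = familywise_error_rate(0.05, n_candidates)
print(f"{rate:.3f}")
```

Under these assumptions the chance of at least one spurious significant result rises from 5% to roughly 26%, which is why the choice of method is best made against a priori knowledge rather than against the hypothesis of interest.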
We have tried to draw attention to the decisions that underlie social network analyses. Many recent papers provide guidance on how to construct networks (reviewed in Farine & Whitehead, 2015 (Altmann, 1974). As the comparison of co-occurrence versus the overlap of time done here shows, decisions on how to define the edges of the network also have to be made: are edges defined by spatial proximity more meaningful for a given species and for a specific question of interest than edges defined by other social interactions? These decisions are easier to make if we know what patterns to expect in a social network for a given study species. Basing methodological decisions on tests of a priori known biological properties of the study system, ideally while simultaneously collecting pilot data, will result in more robust network data than copying studies from other systems. This should also avoid the pitfalls of combining exploration of network inference with testing new hypotheses.
In this paper, we provide a structured approach that can be used to make design decisions in network, or other, studies. In addition, we call for researchers to provide more information about the rationale leading to their decisions. In our case, we took advantage of the information obtained as a result of a long-term project on a cooperatively breeding species, which provided information on the composition of breeding groups. In other projects, this type of information might not be available or relevant, but other types of information, such as the importance of mated pairs, which are expected to share strong social bonds (see Beck et al., 2020; Boogert, Farine, et al., 2014; Brandl et al., 2019), could be used. Further, we reiterate that our study clearly highlights the need for data collection and analysis methods to be tailored to each study system, as different approaches (all of which are valid and exist in the literature) can produce quite different outcomes. We hope that once sufficient studies report their design process, as we have here, we will be able to identify some general guidelines for animal social network data collection and analysis.

ACKNOWLEDGEMENTS
This study would not have been possible without the contribution of several people who helped with network data collection, to capture and monitor the reproduction of the birds at the study Behaviour" (ID: 422037984) and CD by the CNRS. This research was conducted in the scope of the LIA "Biodiversity and Evolution" (CNRS-CIBIO).

CONFLICT OF INTEREST
None declared.

DATA AVAILABILITY STATEMENT
Code and data for reproducing the entire contents of this article are available at Dryad Digital Repository https://doi.org/10.5061/dryad.p8cz8w9mx