Network dynamics with a nested node set: Sociability in seven villages in Senegal

We propose two complementary ways to deal with a nesting structure in the node set of a network. Such a structure may be called a multilevel network, with a node set consisting of several groups. First, within-group ties are distinguished from between-group ties by considering them as two distinct but interrelated networks. Second, effects of nodal variables are differentiated according to the levels of the nesting structure, to prevent ecological fallacies. This is elaborated in a study of two repeated observations of a sociability network in seven villages in Senegal, analyzed using the Stochastic Actor-oriented Model.

There are many approaches to statistically modeling dynamics of networks. Wide-ranging overviews are given by Goldenberg, Zheng, Fienberg, and Airoldi (2009), section 2, and Kolaczyk (2009), and a more recent survey is Fritz, Lebacher, and Kauermann (2020) in this issue. Here we give a very brief overview of statistical models for longitudinal network data in the form of repeated observations of networks on a constant node set, each network being represented as a graph or a digraph. For the repeated observations we use the term "waves," as in panel analysis. We restrict our overview to the case where the number of waves is small (two or a few more) and covariate effects are also of interest.
An important distinction here is between models assuming a continuous-time underlying process (even though there are only few waves), and those based on a discrete-time model. Using a continuous-time model for network panel data was proposed by Holland and Leinhardt (1977). This idea was taken up by Wasserman (1980). Snijders (2001) proposed the Stochastic Actor-oriented Model, a continuous-time model, of which an overview is given in Snijders (2017). Another continuous-time model is the Longitudinal Exponential Random Graph Model of Koskinen and Snijders (2013). Discrete-time models were proposed, as autoregressive longitudinal extensions of the Exponential Random Graph Model (Lusher, Koskinen, & Robins, 2013), by Robins and Pattison (2001), Hanneke, Fu, and Xing (2010), and Krivitsky and Handcock (2014). The difference between a continuous-time approach and an autoregressive discrete-time approach is elaborated by Block, Koskinen, Hollway, Steglich, and Stadtfeld (2018). Latent variable models have also been proposed, of which dynamic stochastic block models are a special case; an overview is given by Kim, Lee, Xue, and Niu (2018). A continuous-time extension of the stochastic block model was proposed by Matias, Rebafka, and Villers (2018).
This article proposes ways to deal with a nested ("grouped") structure in the node set of the network, and elaborates this using the Stochastic Actor-oriented Model. Two modeling principles are presented.
First, within-group ties are considered to be essentially different from between-group ties, and therefore represented by separate networks; this leads naturally to what might be called a multivariate or multilayer network. Multilayer networks have recently received a lot of attention; rich reviews are Boccaletti et al. (2014) and Kivelä et al. (2014). An extension of the Stochastic Actor-oriented Model to the case of multivariate networks was proposed by Snijders, Lomi, and Torló (2013).
Second, following well-known analytical principles in multilevel analysis and related to the ecological fallacy (Robinson, 1950), it is recognized that actor variables used as covariates may have different substantive meanings at different levels of analysis, and accordingly the model should contain a distinction between between-group and within-group coefficients.
These modeling principles are applied in an analysis of a network panel dataset, with two waves, of social relations in seven villages in rural Senegal.
In Section 2, we elaborate these modeling principles and the plan for the article. Section 3 introduces the case of seven villages in Senegal. The model is explained in Section 4, and Section 5 presents the results. The article ends with a discussion in Section 6.

NETWORKS WITH A NESTED NODE SET
In this section, we elaborate the two modeling principles, mentioned above, for analyzing a network with a nested node set. For the sake of concreteness, we shall already employ some of the specifics of the dataset further described in Section 3.
The data are constituted by two repeated observations of a social network with a node set containing four levels of nesting: individuals, in households, in compounds, and in villages. The compound is an extended family of several households living closely together. Of these levels, the village level is mainly important for the network structure, while the other levels are important for moderating the effects of covariates.

Within- and between-group ties
In the nesting structure defined by the villages, the groups are the villages. Social relations between inhabitants of different villages are of a different nature than those between inhabitants of the same village. They require more effort to establish and maintain, are fewer in number, and may serve different purposes. This could be represented in the model by interactions of various effects with a dyadic dummy variable differentiating between-village from within-village contacts. This would lead, however, to a very large list of interaction effects. A more transparent analysis is obtained by representing within-village and between-village relations as two distinct but interdependent networks, with a common node set and disjoint edge sets: a multivariate or multilayer network.
As our basic network variable we have one directed graph where the node set consists of all individuals. Each individual is an inhabitant of precisely one village; suppose that the node set is ordered by village. The within-village network W is defined as the directed graph restricted to the pairs of inhabitants of the same village; and the between-village network B as the directed graph restricted to the pairs of inhabitants of different villages.
The adjacency matrix of the within-village network is composed of diagonal blocks $W_h$ and structurally zero off-diagonal blocks, as in Table 1. The adjacency matrix of the between-village network is composed of off-diagonal blocks $B_{hk}$ for $h \neq k$ and structurally zero diagonal blocks, as in Table 2.
TABLE 1 Within-village block structure for network W for seven villages
$$W = \begin{pmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_7 \end{pmatrix}$$
TABLE 2 Between-village block structure for network B for seven villages
$$B = \begin{pmatrix} 0 & B_{12} & \cdots & B_{17} \\ B_{21} & 0 & \cdots & B_{27} \\ \vdots & \vdots & \ddots & \vdots \\ B_{71} & B_{72} & \cdots & 0 \end{pmatrix}$$
The sum $B + W$ of these two adjacency matrices is the adjacency matrix of the original network.
This can be treated as a multivariate network $Y = (B, W)$, when the structural zero blocks are respected and no ties are allowed in them. The dynamic model used for this multivariate network will further be explained in Section 4.
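The split into a within-village and a between-village network can be illustrated with a small sketch. The function name `split_by_group`, the toy adjacency matrix, and the village labels are hypothetical and serve only to show the construction; they are not taken from the Senegal data.

```python
def split_by_group(adj, group):
    """Split a directed adjacency matrix into within- and between-group parts.

    adj   : square list of lists with 0/1 entries (adj[i][j] = 1 for tie i -> j)
    group : list giving the group (here: village) label of each node
    """
    n = len(adj)
    within = [[0] * n for _ in range(n)]
    between = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or adj[i][j] == 0:
                continue  # self-loops are excluded; absent ties stay 0
            if group[i] == group[j]:
                within[i][j] = adj[i][j]
            else:
                between[i][j] = adj[i][j]
    return within, between


# Toy example: five individuals in two villages (hypothetical data).
village = [0, 0, 0, 1, 1]
adj = [
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 1, 0],
]
W, B = split_by_group(adj, village)
```

By construction the two parts have disjoint edge sets and add up to the original adjacency matrix, and the blocks that are zero in W and in B are exactly the structural zeros described above.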

Within-group and between-group regressions
In a social context such as a classroom, a school, a neighborhood, or a family, individual attributes will have a distinct meaning from such attributes aggregated to the social context. For example, the ability of a student and the average ability of all students in a given classroom are distinct variables with distinct consequences. Neglecting this distinction is called the ecological fallacy (Robinson, 1950), discussed at length by the sociologists Lazarsfeld and Menzel (1961). Davis, Spaeth, and Huson (1961) proposed to deal with this simply, for a given individual-level variable $X$ where individuals are nested in groups, by defining the contextual variable $\bar{X}$ as the mean of $X$ for all individuals in the given group and using $X$ and $\bar{X}$ together as explanatory variables in a regression analysis. An equivalent alternative is to use $\bar{X}$ together with the within-group deviation $X - \bar{X}$. In more modern methodological practice, it is common to do this not in an OLS regression model, but in a hierarchical linear model that contains random terms at both the individual and the group (contextual) level; see, for example, Neuhaus and Kalbfleisch (1998) and Snijders and Bosker (2012). In our case, this issue is relevant especially for the three lowest levels: individual, household, and compound. An important individual attribute is wealth. For wealth, the meaningful aggregation is not by averaging but by adding, and the aggregate then is total wealth at the household or compound level. If the wealth of individual $i$ in household $j$ in compound $k$ is denoted $X_{ijk}$, the aggregate variables are household wealth and compound wealth,
$$X_{+jk} = \sum_{i} X_{ijk}, \qquad X_{++k} = \sum_{i,j} X_{ijk}.$$
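The aggregation by adding can be sketched as follows. The records and identifiers are hypothetical, and the sketch ignores the truncation and transformation steps that a real analysis would apply to the wealth values.

```python
from collections import defaultdict

# Hypothetical records: (individual, household, compound, wealth).
records = [
    ("a", "h1", "c1", 2.0),
    ("b", "h1", "c1", 3.0),
    ("c", "h2", "c1", 5.0),
    ("d", "h3", "c2", 4.0),
]

# Total wealth per household and per compound: sums, not means.
household_wealth = defaultdict(float)
compound_wealth = defaultdict(float)
for _, household, compound, wealth in records:
    household_wealth[household] += wealth
    compound_wealth[compound] += wealth
```

Each individual then carries three wealth variables (own, household total, compound total), which can enter the model with separate coefficients.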

SEVEN SENEGALESE VILLAGES
We present the analysis of a longitudinal multilevel network in a case study of data, collected in 2010 and 2016, about sociability relations between adult inhabitants of seven villages in North-West Senegal, a rural region. See Faye (2011, 2013a, 2013b) and Brailly and Faye (2015) for further information.
The main substantive research questions are, first, how the structure of within-village relations differs from the structure of between-village relations, and how these two are connected; second, how the wealth of the villagers plays a different role for the relations dependent on whether the wealth is considered at the individual, the household, or the compound level; third, how the pattern of relations is influenced by sex, age, and ethnicity of the inhabitants. The distance between compounds is another important variable, associated with the nesting levels, and therefore important to take into account.
The seven villages are near each other and share a common water resource. They are inhabited by two ethnic groups: Wolof, who are mainly peasants; and Fulani, who are transhumant or seminomadic. Two villages are inhabited exclusively by Fulani, and two exclusively by Wolof. The three other villages are mainly Wolof, but also include some Fulani families. These ethnic groups have different languages, but most people also understand the other language.
The population structure is characterized by a majority of children, women, and elderly men. This is explained by the high rate of emigration, with many men working temporarily outside the village, often abroad. It was attempted to include all inhabitants of age 15 years and over in the survey, but this was not entirely possible because of prolonged absence of part of the population, especially younger men.

Data collection
The first wave of data was collected between November 2009 and March 2010. For practical reasons, adults (individuals over 15 years of age) who were not emigrants were interviewed. A response rate of 70% (460 responses) was achieved despite the periodic absence of many of the male inhabitants.

Network
The network survey was based on a number of name generators. For this article, a tie from the respondent to the nominee was supposed to exist if the nominee was included in the answers to at least one of the following questions.
• Loans. If you have an urgent need to buy a good or service, and you do not have the money for it, whom in the village community will you ask to lend you money?
• Advice. If you need advice on an important decision in your life, whom in the village community will you consult?
• Assistance. If you are ill, from whom in the village community will you seek help to assist and/or take care of yourself?
• Discussions. Most people discuss important topics with others. Apart from public discussions (in the village square, in the house yard, etc.), with whom in the village community have you recently discussed topics of importance to you?
• Visits. Can you name the people in the village community whom you regularly visit?
These name generators, common to many sociability surveys (Marsden, 2011), represent essential regular exchanges for members of this community. We call the relation defined in this way a sociability relation. The second wave of data collection took place in February and March 2016. Due to death, emigration, or transhumance, not everybody could be interviewed.
The data used is the network of all 406 individuals who were interviewed in both waves. Within-village outdegrees ranged from 0 to 21, and their means in the two waves were 6.0 and 4.1; between-village outdegrees ranged from 0 to 14, with means 0.8 and 0.7. The individuals lived in 135 households, and these were grouped in 54 compounds.

Covariates
Several individual-level variables were collected including sex, age, and ethnic group. Wealth was defined as the aggregate of the value of cattle, harvest, and machines. Before adding, a few outliers were truncated, and the values were log-transformed. The locations of the houses were geocoded, and from this the distances in kilometers were calculated. These also were log-transformed. Descriptive statistics are given in Table 3.

STOCHASTIC ACTOR-ORIENTED MODEL
After splitting the network into the within-village and the between-village networks, as discussed in Section 2.1, our data can be represented as two repeated observations $(W(t_1), B(t_1))$, $(W(t_2), B(t_2))$ of the within-village and between-village networks. These are modeled here by the multivariate version of the Stochastic Actor-oriented Model ("SAOM"). This model is generally explained in Snijders (2017). We give here a concise overview of the model for the case of two repeated observations ("waves"). In view of the social science applications, nodes are called actors.
We consider a multivariate network $Y(t) = (Y^{(1)}(t), \ldots, Y^{(K)}(t))$ on a common set of $n$ actors, where each $Y^{(k)}(t) = (Y^{(k)}_{ij}(t))$ is the adjacency matrix of a directed graph, and $t$ is time. Observations are available for $t = t_1, t_2$ with $t_1 < t_2$. The label $k$ indicates the network, and $i$ and $j$ are the sender and receiver of the tie; $Y^{(k)}_{ij}(t)$ is equal to 1 if the tie $i \xrightarrow{k} j$ exists, and 0 otherwise. Sender and receiver are also called "ego" and "alter," respectively. Self-loops are excluded, so that $Y^{(k)}_{ii}(t) \equiv 0$. It is assumed that there is an unobserved stochastic process $(Y(t);\ t_1 \leq t \leq t_2)$ of which $Y(t_1)$ and $Y(t_2)$ are observations, and it is assumed that $Y(t)$ develops according to a continuous-time Markov process. The SAOM is built on two fundamental simplifying restrictions and parametric assumptions for the Markov process.
The first basic simplifying restriction is that at each time point $t \in (t_1, t_2)$, if there is a change at time $t$, only one of all the tie variables $Y^{(k)}_{ij}$ can change. This simplifies the definition of the transition process for the whole system $Y(t)$ to the definition of the conditional probabilities of changes in single tie variables. This restriction was proposed by Holland and Leinhardt (1977). The second basic assumption is that tie changes are organized by dependent networks $k$ and by actors $i$, hence the name "actor-oriented model." Tie changes are modeled as choices between changes in the outgoing ties of a given actor $i$ in a given network $k$. This approach was proposed by Snijders (2001) (for the actors) and Snijders et al. (2013) (for the multivariate networks) and expresses the heuristic idea that actors have control over their outgoing ties, and tie changes reflect choices made by actors. The change probabilities will depend on the actor's neighborhood in all the networks simultaneously.
We characterize the transition distribution of the stochastic process by a simulation algorithm and by the transition intensity matrix, or Q-matrix (Norris, 1997). This is done on the basis of the rate functions $\lambda^k_i(y; \rho)$, which indicate the rate at which opportunities for change occur for actor $i$ in network $k$, and the evaluation functions $f^k_i(y; \beta)$. A higher value of $f^k_i(y; \beta)$ compared with $f^k_i(y'; \beta)$ indicates that, when actor $i$ could make a change in network $k$ to $y$ or to $y'$ (and perhaps other values), the probability of changing to $y$ is higher than that of changing to $y'$. The parameters of the model are $\rho$ and $\beta$. Heuristically, the evaluation function can be characterized as indicating the value that $i$ attaches to the combined network state $y$ in the event of contemplating a change in network $k$. It is similar to a utility function but applying to network $k$ only, and can be interpreted as the short-term balance of goals and restrictions. A further aspect of the model is that a restriction may be given for the set of tie variables that is allowed to change. For all $k$ and $i$, subsets $N(k, i)$ are defined such that $N(k, i) \setminus \{i\}$ is the set of actors $j$ for which the tie variable $Y^{(k)}_{ij}$ can change. For the convenience of notation below, always $i \in N(k, i)$.
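It will be convenient to have a notation for the network that results from a given tie change. Given a multivariate network $y$, a network index $k$, and actors $i$, $j$, we denote by $y^{(k \pm ij)}$ the multivariate network that results from $y$ by changing tie variable $y^{(k)}_{ij}$ to $1 - y^{(k)}_{ij}$. Formally, we denote $y^{(k \pm ii)} = y$. Summation over an index is denoted by replacing this index by a $+$. The simulation algorithm proceeds as follows. For convenience we set $t_1 = 0$, $t_2 = 1$. For concise notation, we drop the dependence on parameters $\rho$ and $\beta$.
1. Start with $t = 0$ and $Y(t) = y(t_1)$; the current state is denoted by $y$.
2. Generate $\Delta t$ as a random draw from the exponential distribution with parameter $\lambda^{+}_{+}(y)$.
3. If $t + \Delta t > 1$, set $Y(1) = y$ and stop.
4. Choose network $k$ with probability $\lambda^{k}_{+}(y) / \lambda^{+}_{+}(y)$, and choose actor $i$ with probability $\lambda^{k}_{i}(y) / \lambda^{k}_{+}(y)$.
5. Choose $j \in N(k, i)$ with probability
$$p^k_{ij}(y) = \frac{\exp\big(f^k_i(y^{(k \pm ij)})\big)}{\sum_{h \in N(k,i)} \exp\big(f^k_i(y^{(k \pm ih)})\big)}. \tag{4}$$
6. Set $t = t + \Delta t$ and $y = y^{(k \pm ij)}$, and return to step 2.
For interpreting the probabilities (4), it is instructive to see that they can be rewritten as functions of the change in the evaluation function,
$$p^k_{ij}(y) = \frac{\exp\big(f^k_i(y^{(k \pm ij)}) - f^k_i(y)\big)}{\sum_{h \in N(k,i)} \exp\big(f^k_i(y^{(k \pm ih)}) - f^k_i(y)\big)}.$$
This simulation model corresponds to a transition matrix, or Q-matrix, for the Markov process defined as follows. For $y' \neq y$,
$$q(y, y') = \begin{cases} \lambda^k_i(y)\, p^k_{ij}(y) & \text{if } y' = y^{(k \pm ij)} \text{ for some } k, i \text{ and } j \in N(k, i),\ j \neq i, \\ 0 & \text{otherwise.} \end{cases}$$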
In this article, the rate functions are constant, $\lambda^k_i(y; \rho) = \rho_k$. In many applications this is reasonable because individual tie changes are not observed. The evaluation functions are specified as linear combinations
$$f^k_i(y; \beta) = \sum_{m} \beta^k_m\, s^{(k)}_{mi}(y),$$
where the $s^{(k)}_{mi}(y)$, written $s^{(k)}_i(y)$ when the effect index $m$ is suppressed, are statistics depending on the neighborhood of actor $i$ in the multivariate network $y$, and on covariates, if these are included in the model. These statistics are called effects. This makes $f^k_i(y; \beta)$ the linear predictor in the generalized linear model (4). If an effect $s^{(k)}_i(y)$ depends only on $y^{(k)}$, it expresses internal dependence of network $k$; if it also depends on one or more of the other networks, it expresses cross-network dependence. Examples follow below.
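A simulation of this kind can be sketched in a few lines for the case of constant rates. This is a minimal illustration under stated assumptions, not the RSiena implementation: the evaluation function contains only an outdegree and a reciprocity term, the next tie change is drawn from a multinomial logit over the permitted alters plus the "no change" option, and all names (`evaluation`, `toggled`, `simulate`, `allowed`) and parameter values are hypothetical.

```python
import math
import random


def evaluation(y, k, i, beta):
    """Toy evaluation function for actor i in network k: outdegree and reciprocity."""
    n = len(y[k])
    outdegree = sum(y[k][i])
    reciprocity = sum(y[k][i][j] * y[k][j][i] for j in range(n))
    return beta[k][0] * outdegree + beta[k][1] * reciprocity


def toggled(y, k, i, j):
    """Copy of y with tie variable (k, i, j) toggled; j == i means 'no change'."""
    new = [[row[:] for row in net] for net in y]
    if i != j:
        new[k][i][j] = 1 - new[k][i][j]
    return new


def simulate(y, rho, beta, allowed, rng):
    """Simulate from t = 0 to t = 1 with constant rate rho[k] per actor in network k."""
    n = len(y[0])
    total_rate = n * sum(rho)
    t = 0.0
    while True:
        t += rng.expovariate(total_rate)      # waiting time until next opportunity
        if t > 1.0:
            return y
        # With constant rates, network k is drawn proportionally to rho[k]
        # and the actor is drawn uniformly.
        k = rng.choices(range(len(y)), weights=rho)[0]
        i = rng.randrange(n)
        options = [i] + [j for j in range(n) if j != i and allowed(k, i, j)]
        weights = [math.exp(evaluation(toggled(y, k, i, j), k, i, beta))
                   for j in options]
        j = rng.choices(options, weights=weights)[0]
        y = toggled(y, k, i, j)


# Toy example: four actors in two villages; network 0 plays the role of W, network 1 of B.
village = [0, 0, 1, 1]


def allowed(k, i, j):
    same_village = village[i] == village[j]
    return same_village if k == 0 else not same_village


empty = [[[0] * 4 for _ in range(4)] for _ in range(2)]
W_end, B_end = simulate(empty, rho=[4.0, 1.0], beta=[[-1.0, 1.5], [-2.0, 1.0]],
                        allowed=allowed, rng=random.Random(1))
```

Because only alters permitted by `allowed` can be chosen, the simulated within-village network never contains between-village ties and vice versa, whatever the parameter values.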

Estimation
Estimation in this model is conditional on $y(t_1)$. For this article, estimation was done by the method of moments ("MoM"); this is explained in Snijders (2001, 2017), and sketched below. Maximum likelihood estimation is also possible (Snijders, Koskinen, & Schweinberger, 2010), but would be too time consuming for this large dataset. Denote the parameter by $\theta = (\rho, \beta)$. We treat the MoM only for the case of two waves, two dependent networks, and constant rate functions $\rho_k$, observed for $t = t_1, t_2$. The MoM estimate is defined, for a suitable statistic $z(y(t_1), y(t_2))$, as the solution $\hat\theta$ of
$$E_{\hat\theta}\big\{ z(Y(t_1), Y(t_2)) \mid Y(t_1) = y(t_1) \big\} = z(y(t_1), y(t_2)). \tag{7}$$
The vector $z(y(t_1), y(t_2))$ has the same dimensionality as $\theta$; to each one-dimensional parameter in $\theta$ corresponds one one-dimensional statistic. Suitable statistics $z$ can be chosen as follows (Snijders, 2001; Snijders et al., 2013). Corresponding to $\rho_k$ is the observed number of changes in network $k$. For each effect $s^{(k)}_i(y)$ that depends only on the dependent network in question, $y^{(k)}$, the statistic is the aggregate value at $t_2$. For each effect $s^{(k)}_i(y)$ that depends on both networks, and which therefore can be written as $s^{(k)}_i(y^{(k)}, y^{(h)})$ for $h \neq k$, the statistic is the cross-lagged aggregate where the dependent network is taken at $t_2$ and the explanatory network at $t_1$. In formulas, these three types of statistics are
$$\begin{aligned} &\sum_{i,j} \big| y^{(k)}_{ij}(t_2) - y^{(k)}_{ij}(t_1) \big|, \\ &\sum_{i} s^{(k)}_i\big(y^{(k)}(t_2)\big), \\ &\sum_{i} s^{(k)}_i\big(y^{(k)}(t_2), y^{(h)}(t_1)\big). \end{aligned} \tag{8}$$
A generalized MoM estimator that also uses the contemporaneous statistics $(y^{(k)}(t_2), y^{(h)}(t_2))$ is proposed in Amati, Schönenberger, and Snijders (2019). The solution of the MoM equation can be approximated by stochastic approximation (multivariate Robbins-Monro algorithm); further references are in Snijders (2017). The covariance matrix of $\hat\theta$ can be approximated according to the method of Schweinberger and Snijders (2007). These procedures are implemented in the R package RSiena (R Core Team, 2020; Ripley, Snijders, Bóda, Vörös, & Preciado, 2020).
This includes a convergence check based on the so-called maximum convergence ratio, defined as a multivariate deviation of the Monte Carlo approximation for the left-hand side of (7) from the right-hand side. The rule of thumb is that for good convergence, this ratio should be less than 0.25.
It may be assumed that the estimator $\hat\theta$ has an approximate multivariate normal distribution with expected value very close to $\theta$, and that the estimator of the covariance matrix is consistent. No proof is available, but this is supported by intuition and by many simulation results. Accordingly, hypotheses of the type $A\theta = 0$ for an $r \times p$ matrix $A$ can be tested by referring $(A\hat\theta)'(A\hat\Sigma A')^{-1} A\hat\theta$ to a chi-squared distribution with $r$ degrees of freedom, where $\hat\Sigma$ is the estimated covariance matrix of $\hat\theta$.
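For a single linear contrast ($r = 1$) the test statistic reduces to a squared ratio of estimate to standard error, which can be computed with the standard library alone. The sketch below is restricted to that special case; the function name and the numerical values are hypothetical and are not estimates from this study.

```python
import math


def wald_test_contrast(a, theta_hat, cov):
    """Wald-type test of the hypothesis a' theta = 0 for one contrast vector a.

    Returns the chi-squared statistic (1 degree of freedom) and its p-value.
    """
    p = len(a)
    estimate = sum(a[r] * theta_hat[r] for r in range(p))
    variance = sum(a[r] * cov[r][c] * a[c] for r in range(p) for c in range(p))
    chi2 = estimate ** 2 / variance
    # Upper tail of the chi-squared distribution with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical estimates and covariance matrix for two parameters.
theta_hat = [0.8, 0.2]
cov = [[0.04, 0.01],
       [0.01, 0.09]]

# Test whether the two parameters are equal (contrast 1, -1).
chi2_equal, p_equal = wald_test_contrast([1.0, -1.0], theta_hat, cov)
# Test whether the first parameter is zero.
chi2_first, p_first = wald_test_contrast([1.0, 0.0], theta_hat, cov)
```

For $r > 1$ the same quadratic form is needed with a matrix inverse and the chi-squared distribution with $r$ degrees of freedom, which requires a numerical library.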

Specification of the nested data structure
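In our case we have $K = 2$ networks, the within-village network W and the between-village network B. These will be indicated by $k = W$ and $k = B$; these letters will also be used in various places in the text to distinguish the networks. The use of structural zeros, mentioned above, is implemented by letting $N(W, i)$ be the set of actors living in the same village as $i$, and $N(B, i)$ the set consisting of $i$ and all actors living in other villages, together with the rule that $Y^{(k)}_{ij}(t) = 0$ for all $j \notin N(k, i)$ and all $t$.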

Model specification
For the specification of the evaluation function, we follow what currently is considered good practice in applications of the SAOM (see Ripley et al., 2020). This requires effects depending on degrees, reciprocation, transitivity, and covariates. In the first instance, the following effects were included.

Single structural network effects
First we give the effects $s^{(k)}_i(y)$ depending only on network $k$, representing internal dependence of this network. For a simpler typography, we drop the superscript $(k)$. The effects aim to represent local patterns in the network neighborhood of the actors: number of ties, number of reciprocated ties, triadic patterns, and in- and out-degrees.
• The outdegree effect
$$s_i(y) = \sum_j y_{ij} \tag{11}$$
is like an intercept, balancing between the creation and termination of ties. For sparse networks, it usually is negative because many more ties could be created than could be terminated.
• The reciprocity effect $\sum_j y_{ij}\, y_{ji}$ reflects the tendency to reciprocate ties.
• Transitivity is represented by the gwesp statistic, "geometrically weighted edgewise shared partners," introduced for Exponential Random Graph Models by Snijders, Pattison, Robins, and Handcock (2006) and Handcock and Hunter (2006). It is defined by
$$s_i(y) = \sum_j y_{ij}\, e^{\alpha} \Big\{ 1 - \big(1 - e^{-\alpha}\big)^{TP_{ij}(y)} \Big\},$$
where $TP_{ij}(y) = \sum_h y_{ih}\, y_{hj}$, the number of twopaths from $i$ to $j$. With a positive coefficient $\beta$, this effect implies that the existence of twopaths, that is, indirect connections $i \to h \to j$, increases the probability of creating the tie $i \to j$. This function is a concave increasing function of $TP_{ij}(y)$, representing that the effect of the number of connecting twopaths on the log-probability of a tie between two nodes is sublinear. The degree of concavity is determined by the parameter $\alpha$, which may be fixed or estimated. Here we use $\alpha = \ln(2)$.
• The outdegree-activity effect $\sum_j y_{ij}\, y_{i+}$ or $\sum_j y_{ij} \sqrt{y_{i+}}$ represents the extent to which a currently high outdegree $y_{i+}$ of actor $i$ promotes the further creation and maintenance of outgoing ties of $i$. The two forms, with and without the square root, may be used dependent on what provides the better fit.
• The indegree-popularity effect, also in two forms, $\sum_j y_{ij}\, y_{+j}$ or $\sum_j y_{ij} \sqrt{y_{+j}}$, represents the extent to which a currently high indegree $y_{+j}$ of actor $j$ promotes the probability for other actors $i$ to send ties to $j$.
• The indegree-activity effect, in two forms again, $\sum_j y_{ij}\, y_{+i}$ or $\sum_j y_{ij} \sqrt{y_{+i}}$, represents the extent to which a currently high indegree of actor $i$ promotes the further creation and maintenance of outgoing ties of $i$.
• According to Block (2015), it is important to consider the interaction between reciprocity and transitivity. Given that transitivity is represented by the gwesp effect, this interaction is given by
$$s_i(y) = \sum_j y_{ij}\, y_{ji}\, e^{\alpha} \Big\{ 1 - \big(1 - e^{-\alpha}\big)^{TP_{ij}(y)} \Big\}.$$
The outdegree-activity and indegree-popularity effects are feedback effects, with consequences for the outdegree variance and the indegree variance, respectively.
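To make the structural effects concrete, the sketch below computes several of them for one actor in a small toy network. It assumes the linear (not square-root) versions of the degree effects and the usual gwesp weight $e^{\alpha}\{1 - (1 - e^{-\alpha})^{TP_{ij}}\}$ with $\alpha = \ln 2$; function names and the toy network are hypothetical.

```python
import math

ALPHA = math.log(2)  # gwesp parameter used in the text


def twopaths(y, i, j):
    """Number of twopaths i -> h -> j."""
    return sum(y[i][h] * y[h][j] for h in range(len(y)))


def gwesp_weight(tp, alpha=ALPHA):
    """Concave increasing weight of the number of twopaths."""
    return math.exp(alpha) * (1.0 - (1.0 - math.exp(-alpha)) ** tp)


def structural_effects(y, i):
    """Selected effect statistics s_i(y) for actor i in a single network y."""
    n = len(y)
    out_i = sum(y[i])
    in_i = sum(y[h][i] for h in range(n))
    indegree = [sum(y[h][j] for h in range(n)) for j in range(n)]
    return {
        "outdegree": out_i,
        "reciprocity": sum(y[i][j] * y[j][i] for j in range(n)),
        "gwesp": sum(y[i][j] * gwesp_weight(twopaths(y, i, j)) for j in range(n)),
        "outdegree_activity": sum(y[i][j] * out_i for j in range(n)),
        "indegree_popularity": sum(y[i][j] * indegree[j] for j in range(n)),
        "indegree_activity": sum(y[i][j] * in_i for j in range(n)),
    }


# Toy network on four actors (hypothetical).
y = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
effects_0 = structural_effects(y, 0)
```

In this toy network actor 0 has one twopath to actor 2 (via actor 1), so only the tie to actor 2 contributes to the gwesp statistic.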

Covariates
• For a dyadic covariate with values $v_{ij}$, the effect is $s_i(y) = \sum_j y_{ij}\, v_{ij}$.
Actor covariates $X$ with values $x_i$ have to be transformed to a dyadic covariate. Calling the sender of the tie "ego" and the receiver "alter," basic transformations are $v_{ij} = x_i$, corresponding to the "ego effect," and $v_{ij} = x_j$, for the "alter effect." A well-known fundamental network mechanism is homophily, the tendency of ties to be created with a higher probability between similar actors (McPherson, Smith-Lovin, & Cook, 2001). For a categorical actor variable, this is obtained from the transformation $v_{ij} = I\{x_i = x_j\}$, where $I$ is the indicator function.
For a numerical actor variable $X$ a possible transformation representing homophily is $v_{ij} = (x_i - x_j)^2$ (expecting a negative parameter $\beta$). Snijders and Lomi (2019) proposed to consider general quadratic functions of $x_i$ and $x_j$, which leads to five choices for $v_{ij}$, defined by the five transformed variables
$$x_i, \quad x_j, \quad (x_i - x_j)^2, \quad x_i^2, \quad x_j^2. \tag{12}$$
These are called, respectively, the ego, alter, difference squared, ego squared, and alter squared effects of $X$. Their linear combinations may represent, for ego $i$ and alter $j$, homophily (attraction to $j$ with ego's own value), attractiveness of alters $j$ depending on their value $x_j$, differential propensity of egos to create ties depending on their value $x_i$, attraction to $j$ with some given value, and combinations of these. This five-parameter family can be extended or reduced depending on empirical fit.
All covariates were centered in the analysis by subtracting the overall mean.
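The covariate transformations can be illustrated as follows: a numerical actor covariate is centered, the five quadratic terms (ego, alter, difference squared, ego squared, alter squared) are computed for a pair of actors, and a dyadic covariate effect is obtained by summing $v_{ij}$ over the outgoing ties of ego. Function names and data are hypothetical.

```python
def centered(values):
    """Subtract the overall mean from a list of covariate values."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def quadratic_terms(x, i, j):
    """The five dyadic transformations of actor covariate x for ego i and alter j."""
    xi, xj = x[i], x[j]
    return {
        "ego": xi,
        "alter": xj,
        "diff_squared": (xi - xj) ** 2,
        "ego_squared": xi ** 2,
        "alter_squared": xj ** 2,
    }


def dyadic_covariate_effect(y, v, i):
    """Effect statistic for a dyadic covariate: sum of v[i][j] over the ties of i."""
    return sum(y[i][j] * v[i][j] for j in range(len(y)))


# Hypothetical ages for four actors and a toy network.
age = centered([20.0, 30.0, 40.0, 50.0])
y = [
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
terms_03 = quadratic_terms(age, 0, 3)

# Difference-squared effect of age for actor 0, built as a dyadic covariate.
v = [[(age[i] - age[j]) ** 2 for j in range(4)] for i in range(4)]
diff_sq_effect_0 = dyadic_covariate_effect(y, v, 0)
```

Centering changes the values of the ego, alter, and squared terms but leaves the difference squared unchanged, since the mean cancels in $x_i - x_j$.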

Multivariate network effects
Multivariate network dependencies were discussed in Snijders et al. (2013). In our case, a pair $(i, j)$ cannot simultaneously have a tie in network W as well as B; this excludes the need for effects expressing direct tie-level dependencies. The multivariate network effects $s^{(k)}_i(y)$ considered in this article are the following. These are effects in the evaluation function $f^k_i(y; \beta)$ for network $k$, so that $k$ is the "dependent network." It is always assumed here that $h \neq k$, so that $h$ is the "explanatory network." The first four effects are mixed degree effects.
• The mixed outdegree effect, or mixed outdegree activity, represents, if its parameter is positive, that higher outdegrees in network $h$ will lead to higher outdegrees also in network $k$. Whether the square root is used will depend on what gives the better fit.
• The mixed indegree effect, or mixed indegree popularity, represents, if its parameter is positive, that higher indegrees in network $h$ will lead to higher indegrees also in network $k$.
• Analogous are the mixed indegree activity effect, representing the effect of $h$-indegrees on $k$-outdegrees;
• and the mixed outdegree popularity effect, representing the effect of $h$-outdegrees on $k$-indegrees.
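A sketch of the four mixed degree effects may help to keep them apart. It assumes the linear versions, formed by analogy with the single-network degree effects: each statistic sums, over the outgoing ties of ego in the dependent network, a degree taken from the explanatory network. These formulas, the function name, and the toy data are assumptions made for illustration.

```python
def mixed_degree_effects(y_dep, y_expl, i):
    """Linear mixed degree effects for actor i.

    y_dep  : adjacency matrix of the dependent network k
    y_expl : adjacency matrix of the explanatory network h
    """
    n = len(y_dep)
    out_expl = [sum(y_expl[a]) for a in range(n)]                       # h-outdegrees
    in_expl = [sum(y_expl[b][a] for b in range(n)) for a in range(n)]   # h-indegrees
    ties = [j for j in range(n) if y_dep[i][j] == 1]
    return {
        # own h-outdegree promotes own k-outdegree
        "mixed_outdegree_activity": sum(out_expl[i] for _ in ties),
        # alter's h-indegree promotes alter's k-indegree
        "mixed_indegree_popularity": sum(in_expl[j] for j in ties),
        # own h-indegree promotes own k-outdegree
        "mixed_indegree_activity": sum(in_expl[i] for _ in ties),
        # alter's h-outdegree promotes alter's k-indegree
        "mixed_outdegree_popularity": sum(out_expl[j] for j in ties),
    }


# Toy example with four actors: W as explanatory and B as dependent network.
W_toy = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
B_toy = [
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
mixed_0 = mixed_degree_effects(B_toy, W_toy, 0)
```

The two "activity" effects use ego's own degrees in the explanatory network, while the two "popularity" effects use the degrees of the alters to whom ego sends ties in the dependent network.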
The next four effects are different versions of mixed triadic closure, or transitivity. We give only the specification which is linear in the number of twopaths in network h; concave functions, as in the gwesp effect, might also be possible but are not applied here.

Model selection procedure
For the SAOM, there is a limit to the complexity of models for which estimation is feasible. Therefore, backward model selection is impossible. Furthermore, information-based model selection criteria are not available for estimation by the MoM. We employ a model choice procedure consisting of the following steps. The procedure has a forward orientation from step to step. Within each step the model starts with a baseline specification for this step, extending it if this is necessary for the goodness of fit, and reducing it by dropping nonsignificant effects. The extension to achieve a good fit is done in an explorative fashion.
Goodness of fit is investigated according to the method of Lospinoso and Snijders (2019), implemented in the RSiena package. This method checks whether the estimated distribution corresponds to the observed data also for other statistics than those used for estimating the parameters. Consider a vector-valued statistic $u(y)$, not directly used in the statistics (8). From the estimated model, $H$ networks $y^{\mathrm{sim}}_1, \ldots, y^{\mathrm{sim}}_H$ are simulated; the mean vector and covariance matrix of the simulated values $u(y^{\mathrm{sim}}_h)$ are denoted $\bar{u}$ and $\hat\Sigma_u$, and the Mahalanobis distance is defined by
$$d(u) = (u - \bar{u})'\, \hat\Sigma_u^{-1}\, (u - \bar{u}).$$
The fit for the observed value $u(y)$ is expressed by the "Mahalanobis p-value"
$$p = \frac{1}{H} \sum_{h=1}^{H} I\big\{ d\big(u(y^{\mathrm{sim}}_h)\big) \geq d\big(u(y)\big) \big\}.$$
The fit is considered satisfactory if this p-value is larger than 0.05; if several statistics are considered, it will be acceptable, because of multiple testing, that some of them have smaller p-values, provided these are not too small. This is applied with $H = 1{,}000$ and seven choices for the vectors $u$ depending only on the networks: the empirical distribution function of the indegrees at several values, for both networks; the same for the outdegrees; the triad census (Holland & Leinhardt, 1976; Wasserman & Faust, 1994), defined as the frequencies of subgraphs on three nodes, for both networks; and the mixed (multivariate) triad census of Hollway, Lomi, Pallotti, and Stadtfeld (2017). For effects of covariates, these were divided into several ordered categories, and the statistics in $u$ are the frequencies of ties from sender ("ego") categories to receiver ("alter") categories.
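The Mahalanobis p-value can be sketched with the standard library only. The sketch assumes that the simulated statistics have a nonsingular sample covariance matrix and that the p-value is the proportion of simulated values whose Mahalanobis distance to the simulated mean is at least as large as that of the observed value; it is an illustration of the idea, not the RSiena code, and the data are artificial.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def mahalanobis_p_value(u_obs, u_sim):
    """Proportion of simulated statistics at least as far from their mean as u_obs."""
    h, p = len(u_sim), len(u_obs)
    mean = [sum(u[c] for u in u_sim) / h for c in range(p)]
    cov = [[sum((u[r] - mean[r]) * (u[c] - mean[c]) for u in u_sim) / (h - 1)
            for c in range(p)] for r in range(p)]

    def distance(u):
        d = [u[c] - mean[c] for c in range(p)]
        return sum(di * si for di, si in zip(d, solve(cov, d)))

    d_obs = distance(u_obs)
    return sum(distance(u) >= d_obs for u in u_sim) / h


# Artificial example: 200 simulated two-dimensional statistics.
rng = random.Random(0)
u_sim = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
p_central = mahalanobis_p_value([0.0, 0.0], u_sim)    # observed value near the mean
p_extreme = mahalanobis_p_value([10.0, 10.0], u_sim)  # observed value far away
```

An observed value close to the center of the simulated distribution gives a p-value near 1, whereas an observed value far outside the simulated cloud gives a p-value near 0, indicating poor fit for that statistic.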
The model selection steps are the following.
1. Estimate a model for the simultaneous network dynamics with structural effects and effects of sex. Include the effects mentioned above. Check the goodness of fit for degree distributions and triad counts, and modify the model so as to achieve a satisfactory fit.
2. Extend this model with effects of distance, ethnicity, and age as covariates for W and B with distinct parameters. For age, use the five transformations in (12). Check the goodness of fit for age divided in six categories as mentioned above. Depending on the results, extend the model if necessary and reduce it if possible while respecting the hierarchy principle mentioned above.
3. Extend the model of the preceding step with effects of wealth at the three levels of individual, household, and compound. These effects are considered for W and B with distinct parameters. Using the transformations in (12) leads to a total of 30 parameters for wealth alone, a large number indeed. This model is estimated and then reduced by retaining only the significant wealth variables, again respecting the hierarchy principle.
4. The resulting model is the final one, and it is estimated.

RESULTS
Parameters were estimated using RSiena version 1.2-23 (Ripley et al., 2020). The results are presented following the four steps outlined above. In view of space, for the extension-and-reduction Steps 2 and 3, we do not present the full tables of estimates. When interpreting results in view of significance, we use a significance level of .05 and the tests based on normal approximations mentioned above; for single parameters, this corresponds to significance meaning that the parameter estimate is larger (in absolute value) than twice the standard error. For the goodness of fit, we tried to achieve that the Mahalanobis p-value was not less than .05; in view of multiple testing, one or very few p-values being between .02 and .05 was deemed acceptable.

Structural model
The first step was to estimate models with effects of univariate and multivariate network structure and sex. As a preliminary, the outdegree distributions for the second wave were investigated (see Figure 1). Especially the B network has a very large number of actors with outdegree equal to 0. A further exploration showed that there were gender differentials in the proportions of actors with outdegree 0. To achieve a model with a good fit, we added the following effects.
• The positive outdegree effect, defined as $s_i(y) = I\{y_{i+} \geq 1\}$, where $I\{A\}$ is the indicator function of the event $A$; this effect was also interacted with gender of ego;
• the inverse outdegree effect, defined by $s_i(y) = 1/(y_{i+} + 1)$.
The interaction gwesp × reciprocity could not be estimated for the B network, but the score-type test of Schweinberger (2012) showed that it was not significant. The failure of the estimation algorithm to converge for models including this interaction apparently was related to a lack of information in the data about its parameter value.
The estimates for the structural model are given in Table 4. This model has a satisfactory goodness of fit with respect to the seven mentioned statistics. In the interpretation, we focus on the significant parameters (estimate in absolute value larger than twice the standard error).
The first effect, outdegree defined in (11), balances the creation and deletion of ties, conditional on all other effects included in the model. Since the network is sparse, there are many more options to create than to delete ties, usually resulting in a negative parameter.
We see that there are strong tendencies to reciprocity, for both W and B. Transitivity as represented by the gwesp effect is significant only for W; the negative gwesp × reciprocity parameter indicates that reciprocity and transitivity compensate for each other, a phenomenon which is often seen, and discussed by Block (2015). Transitivity is estimated positively for B, but with a large standard error; there are not so many transitive triangles where all three actors are in different villages, a requirement for between-village transitivity. For the W network, there is evidence for cyclic closure in addition to transitive closure.
Of the degree effects, those related to indegrees all are nonsignificant and remain in the model only as control effects, to make sure that a good fit is obtained for variances of indegrees and their covariances with outdegrees. The effects related to outdegrees, however, present a complicated picture. For W, the only significant effect among these is the negative outdegree-activity effect. It implies that, in the W network, the tendency to create new ties, as opposed to deleting existing ties, is weaker for actors having higher outdegrees. This keeps the outdegree variance relatively low. For B, there are three significant outdegree-related effects: outdegree activity (the √ version), inverse outdegree, and positive outdegree. Of the joint contribution of these three effects, the coefficient of $y_{ij}$ between curly braces is decreasing in $y_{i+}$ going from 0 to 1, but increasing for $y_{i+} \geq 1$. This implies the tendency, for between-village ties, that those who currently have no outgoing ties will tend not to create new ties, but for those who do have some ties there is a positive feedback effect of their number of outgoing ties. (The positive feedback does not lead to an avalanche of ties for those having high outdegrees already because time is limited.) The cross-network degree effects are not significant for the W network. For B, the main one of these effects is mixed indegree popularity: those who currently have high indegrees in the W network will also attract more ties (i.e., get higher indegrees) in the B network.
The last four lines in Table 4 are triadic between-network effects. To make them well interpretable, it is necessary to have the mixed degree effects in the model because these represent

Effects of distance, age, and ethnicity
In the next step, the model was extended with effects of age, log distance, and ethnicity. All parameters are allowed to differ between the networks W and B. For age, all five effects in (12) were included for both networks. Fit was checked by applying the method of Lospinoso and Snijders (2019), mentioned above, to distance divided into nine categories, and to the number of ties between egos and alters each in six different age categories. The fit for distance was good. For the B network, only the effect of age difference squared was significant. The joint test for the other age effects yielded $\chi^2_4 = 2.8$, p = .60; these effects were dropped. For this network, the fit of the model was good. For the W network, however, the fit of the five-parameter model for age was not good. A plot showed that mainly the fit for the lower age categories was insufficient, and the model was improved by constructing a piecewise linear function of age with the knot at 40 years, $\min\{0, \mathrm{age}_i - 40\}$.
To obtain a more flexible dependence on age, a B-spline approximation was used, including this variable for the ego effect and its square for the alter effect, with several interactions. This led to a family of 10 functions, each a smooth and piecewise quadratic function of alter's age, and depending continuously on ego's age. With this addition, the fit for the 36 combinations of age categories of senders and receivers of the ties was good (Mahalanobis p-values .26 and .09 for the final model for the W and B networks). Parameters for age and ethnicity are interpreted in the discussion of the final model.
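The piecewise linear age term with its knot at 40 years, and the square used for the alter effect, can be written out as a minimal sketch. The function names are hypothetical, and the sketch shows only the two constructed variables, not the interactions or the resulting family of 10 functions.

```python
def age_below_knot(age: float, knot: float = 40.0) -> float:
    """min{0, age - knot}: negative below the knot, zero from the knot upward."""
    return min(0.0, age - knot)

def age_below_knot_squared(age: float, knot: float = 40.0) -> float:
    """Square of the piecewise linear term; smooth at the knot, zero above it."""
    return age_below_knot(age, knot) ** 2

# Illustrative ages only; these are not observations from the data set.
for age in (20, 30, 40, 55, 70):
    print(age, age_below_knot(age), age_below_knot_squared(age))
```

Both variables are constant for ages of 40 and above, so they add flexibility only for the lower age categories, where the fit of the five-parameter model was insufficient.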

Effects of wealth
Next, extending the model resulting from the previous step, the effects of wealth were included: for the three levels of individual, household, and compound (see (1)); for the two networks W and B; and with the five parameters (12). This gives a total of 30 parameters; the purpose of this step is to see if this number can be reduced. A goodness of fit check for the estimated model with categorized wealth variables at three levels for ego and alter showed that the fit was acceptable, so the quadratic model did not need to be extended. In view of the large number of variables, we wished to retain only the strongest effects. For the W network, the highest t-statistics (ratio of parameter estimate to standard error) were for the ego and ego squared effects at the household and the compound level; the other 11 effects jointly had a chi-squared test value of $\chi^2_{11} = 3.3$, p > .9, and were left out of the further model. For the B network, all effects with t-statistics less than 1 were dropped from the model, and in a subsequent backward selection step, further nonsignificant effects were dropped. The remaining effects were the difference squared effect at the personal level, and the ego and ego squared effects at the compound level. In the model with all 30 wealth effects included, the 12 between-village effects that were left out had a combined chi-squared test value of $\chi^2_{12} = 9.4$, p = .67. This leaves seven wealth-related effects for the final model.
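The p-values of the joint tests can be checked from the reported chi-squared values and degrees of freedom. The sketch below is a plain standard-library computation of the chi-squared upper tail probability; it is not the RSiena test procedure, and the function name is hypothetical.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper tail probability P(X > x) for a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)               # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))    # Q(1/2, h) = erfc(sqrt(h))
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q

# Reported joint tests: (chi-squared value, degrees of freedom).
for stat, df in [(2.8, 4), (3.3, 11), (9.4, 12)]:
    print(df, stat, round(chi2_sf(stat, df), 3))
```

With the rounded statistics as input, this reproduces p = .67 for the 12 between-village wealth effects and p > .9 for the 11 within-village wealth effects. For the age test it gives about .59 where .60 is reported, a difference compatible with rounding of the reported statistic.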

Final model
Table 5 gives the results of the final estimated model. There are too many parameters to discuss the interpretations of all of them; therefore, we focus only on some. The structural parameters were discussed already for the model of Table 4; the results for these parameters are roughly similar here, and we do not repeat the discussion. The Mahalanobis p-values for the fit of the seven vectors of statistics mentioned in Section 4.4 ranged from .044 to .89, indicating a good fit. The parameters in Table 5 are the coefficients of the effects in the evaluation function (6), and the variability of the effects can be widely different; therefore, the numerical values of parameters do not say anything about effect sizes.
For the covariates, in most cases we cannot interpret single parameters, but must interpret the totality of all parameters associated with a given covariate. This will be done, for each network separately, by adding the contributions to the evaluation function for this network of all effects depending on the covariate; this sum is called the selection function for the covariate. For binary covariates, the selection function can be shown in a 2 × 2 table (ego by alter); for numerical covariates the selection function can be plotted, for various values of ego's covariate, as a function of alter's covariate on the horizontal axis. This is in line with the idea of actor-oriented modeling, and with the data collection mode where respondents report their outgoing ties: the ties are regarded as choices by the respondents ("egos"), dependent on their own covariate value and on those of their potential network members.
• Distance. Log-distance has a strong effect, as expected.
• Ethnicity. The ego by alter table is presented in Table 6. The contrast between mentioning the own group compared with the other group (homophily) is larger for Wolof individuals (as senders) than for Fulani, for both networks.
• Sex. Table 7 shows a similar pattern for the effects of the sex of sender and receiver. Here we see a strong gender homophily, the contrast for males being stronger than for females, and especially strong for the B network. This table does not include the consequences of the interaction of gender with positive outdegree. From Table 5 we see that males have a larger tendency than females to have outdegrees equal to 0 in the W network, while the converse is true in the B network, controlling for all other effects in the model. This reflects the locality of the behavior of women, who are confined by social behavioral norms to the sphere of their family-in-law, mainly living in the same village.
• Age. For the B network there is homophily with respect to age, as indicated by the negative parameter for the squared age difference. For the W network 10 age-dependent effects are used, as discussed above. The parameter estimates are not interpretable in themselves and therefore are not given in the table. The selection function for the within-village network is plotted in Figure 2, separately for ego's age from 30 to 70 years. The pattern gives clear evidence of age homophily: the maximum of the selection curve is close to ego's value. This tendency is modified by age, being weak for the younger egos, and very pronounced for the older egos.
• Wealth. For this model, measures of effect size have not yet been developed. However, the effects can be compared in an elementary way, by considering the minimum and maximum over all actor pairs of the selection functions added for all wealth variables. For the W network the range is 0.4, while for the B network it is 2.5. This shows that the effect of wealth on between-village ties is considerably stronger than on within-village ties. The ways in which the wealth variables affect the two networks also are quite different. For the W network, their effect depends only on ego. Drawing the curves (not shown here for space reasons) shows that the effect of the compound's wealth is mainly increasing, while the household level shows that those living in households of relatively medium wealth within their compound mention more W ties than those of relatively low or high wealth. For the B network, there is a rather strong homophily effect for individual wealth, while those living in compounds of medium wealth mention more B ties than those living in compounds of low or high wealth.
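The selection function and its range over actor pairs can be sketched as follows. The sketch assumes that the five effects in (12) are ego, ego squared, alter, alter squared, and squared ego-alter difference; the coefficient names, covariate values, and coefficients are hypothetical and are not the estimates of Table 5.

```python
from itertools import permutations

def selection_function(v_ego: float, v_alter: float, coef: dict) -> float:
    """Contribution of one covariate to the evaluation function for a tie ego -> alter
    (assumed quadratic five-effect form)."""
    return (coef["ego"] * v_ego
            + coef["ego_sq"] * v_ego ** 2
            + coef["alter"] * v_alter
            + coef["alter_sq"] * v_alter ** 2
            + coef["diff_sq"] * (v_ego - v_alter) ** 2)

def selection_range(covariates: list) -> float:
    """Maximum minus minimum, over all ordered pairs of distinct actors, of the
    selection functions summed over the covariates given as (values, coef) pairs."""
    n = len(covariates[0][0])
    totals = [sum(selection_function(values[i], values[j], coef)
                  for values, coef in covariates)
              for i, j in permutations(range(n), 2)]
    return max(totals) - min(totals)

# Hypothetical centered wealth values for five actors and a pure homophily pattern.
wealth = [-1.0, -0.5, 0.0, 0.5, 1.0]
homophily_only = {"ego": 0.0, "ego_sq": 0.0, "alter": 0.0,
                  "alter_sq": 0.0, "diff_sq": -0.5}
print(selection_range([(wealth, homophily_only)]))
```

Passing several (values, coef) pairs, for example one per level of wealth, sums their selection functions before the range is taken, in the spirit of the elementary comparison between the W and B networks. A negative coefficient for the squared difference makes the selection function, as a function of alter's value, peak at ego's own value, which is the homophily pattern described for age and individual wealth.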

DISCUSSION
Node sets of large networks often have additional structure, which may be associated with heterogeneities in the network. This has been the topic of special issues of Social Networks (Adams, Faust, & Lovasi, 2012) and of Network Science (De Benedictis, Vitale, & Wasserman, 2015), both with much attention for spatial structure. Such structure can be manifold, which opens many directions for further modeling (Snijders, 2016). In this article we have considered a nested structure of the node set, that is, a partition into subsets that here are referred to as groups. This leads to a multilevel structure where the node set and the set of groups are distinct levels.
For such nested node sets we have presented two issues. The first issue is the distinction between within-group and between-group ties. This is addressed in this article by representing these as two distinct but interdependent networks. The second issue is the differentiation of the role of covariates according to the levels in the nesting hierarchy; this is related to the well-known ecological fallacy (Robinson, 1950; Wakefield, 2009). This is addressed here by using versions of the covariates aggregated to the higher levels of the hierarchy (Davis et al., 1961). These issues were elaborated in this article in the longitudinal analysis of a network between inhabitants of seven villages in rural Senegal (Faye, 2011) by means of a multivariate stochastic actor-oriented model.
For this case study, our analysis has shown large differences between the within-village and between-village sociability networks and large differences between the effects of wealth at the personal level, the household level, and the compound level. It is tricky to give interpretations because the "other things equal" clause is very hard to express; this clause is tacitly assumed. Let us mention just a few of the main conclusions. While there is in general homophily with respect to age, this is much stronger for the old (as senders of ties) than for the young. Wealth is a much stronger determinant of between-village ties than of within-village ties; homophily does operate for wealth, but only at the individual level and between villages.
We have shown possibilities for taking into account the nested structure of the node set. These are feasible with existing software and lead to interesting findings, although this comes at the expense of a large number of parameters and complicated analyses.

ACKNOWLEDGEMENT
The authors are grateful to Emmanuel Lazega for many discussions and to the Guest Editor and two reviewers for helpful comments.