Asymptotic theory in bipartite graph models with a growing number of parameters

Affiliation networks contain a set of actors and a set of events, where edges denote the affiliation relationships between actors and events. Here, we introduce a class of affiliation network models for modelling the degree heterogeneity, where two sets of degree parameters are used to measure the activeness of actors and the popularity of events, respectively. We develop the moment method to infer these degree parameters. We establish a unified theoretical framework in which the consistency and asymptotic normality of the moment estimator hold as the numbers of actors and events both go to infinity. We apply our results to several popular models with weighted edges, including the generalized β-model, the Poisson model and the Rayleigh model. Simulation studies and a realistic example that involves the Poisson model provide concrete evidence that supports our theoretical findings.


INTRODUCTION
The affiliation relationships between a set of actors and a set of events can be conventionally represented by a bipartite graph, where edges only exist between nodes of distinct parties of this graph, i.e., actors and events. We use "actor" as a generic term that may stand for "actress," "author," "member" and the like. Correspondingly, "event" could denote "movie," "paper," and "club," where edges denote a movie in which actresses appear, a paper on which authors collaborate and a club to which club members belong. In this article, we study weighted networks and consider various edge-wise distributions. As increasing amounts of affiliation network data are collected, it is important to understand the generative mechanisms of these networks and to explore various characteristics of the network structures in a principled way. As a result, the analysis of affiliation networks has attracted great interest in recent years; for example, see Robins & Alexander (2004), Latapy, Magnien & Del Vecchio (2008), Snijders, Lomi & Torló (2013) and Aksoy, Kolda & Pinar (2017). The latter authors propose a data generation procedure that produces new synthetic networks that match the degree distributions and the metamorphosis coefficient of the original network. Compared with the method that we describe below, their approach is model free but focuses solely on the two mentioned structural features. Their approach and ours involve different perspectives, and provide mutually complementary additions to the available toolbox.
Node degrees furnish important structural information (Albert & Barabási, 2002) and play central roles in many network models; for example, see Newman, Strogatz & Watts (2001), Chatterjee, Diaconis & Sly (2011) and Barvinok & Hartigan (2013). A frequently observed phenomenon is that many nodes have small degrees while others have large degrees, which corresponds to a condition known as degree heterogeneity. Random graph models have been proposed to model the degree heterogeneity in both undirected and directed networks, including the $p_1$-model (Holland & Leinhardt, 1981), the β-model (Chatterjee, Diaconis & Sly, 2011), the null model (Perry & Wolfe, 2012) and the maximum entropy models (Hillar & Wibisono, 2013), where each node is assigned one parameter to model the tendency of nodes to participate in network connection. Asymptotic theory for many of these models has also been derived; for example, see Chatterjee, Diaconis & Sly (2011), Hillar & Wibisono (2013), Yan & Xu (2013), Yan, Leng & Zhu (2016), Yan, Qin & Wang (2016) and Zhang et al. (2021).
Despite the significant advances in degree-based network models for undirected and directed graphs, analogous results have not been established for bipartite graphs. In this article, we introduce a class of bipartite graph models for modelling the degree heterogeneity in bipartite graphs and study the theoretical properties of this class. Our main contributions are three-fold. First, we formulate a general model framework for the structures of the exponential random graph model, driven by the degree heterogeneity in bipartite graphs. Our framework significantly extends the scope of existing works such as Yan, Leng & Zhu (2016) and Zhang et al. (2017). Second, we develop a computationally feasible moment estimator. Conveniently, our proposed estimator works for several popular models that are special cases of our general framework, such as the β-model, the Poisson model and the Rayleigh model. Third, we outline a theoretical analysis of our proposed estimator and establish its consistency and asymptotic normality under mild conditions. Additionally, these three aspects are reinforced by simulation studies and a realistic example that demonstrate the merits of our method on both synthetic and real datasets.
The rest of this article is organized as follows. In Section 2, we introduce a very general model for bipartite graphs and propose a moment equations-based estimation framework. In Section 3, we present the key asymptotic properties of our estimator. In Section 4, we apply our general results from Section 3 to several popular bipartite network models that cover a wide range of settings, including continuous and discrete edge weights, and demonstrate the utility of our unified results. Section 5 reports simulation studies and a realistic case study involving the Poisson model. Section 6 contains discussion. All proofs may be found in the accompanying Supplementary Material.

Edge weights could take discrete or continuous values. For instance, in an athlete-event network, we may use a binary weight to record the presence/absence of an athlete $i$ in event $j$. In a bus-station network, an edge weight $x_{i,j} \in \mathbb{N}_0$ counts the number of buses arriving at station $j$ in a day. In an insect-flower network, continuously weighted edges $x_{i,j} \in \mathbb{R}_+$ represent the frequencies of insects choosing flowers.
We introduce a general model framework for modelling the degree heterogeneity of bipartite graphs, which can be described as follows. Suppose that the probability density (mass) function of the edge weight $x_{i,j}$ between actor $i$ and event $j$ has the following form:

$$x_{i,j} \mid (\alpha_i, \beta_j) \;\sim\; f(\,\cdot \mid \alpha_i + \beta_j), \tag{1}$$

where $f(\cdot)$ is a probability density or mass function, $\alpha_i$ is the degree parameter of actor $i$ measuring the activity of actors and $\beta_j$ is the degree parameter of event $j$ measuring the popularity of events. We further assume all edges are independently generated. The above model can be viewed as a generalization of a class of directed and undirected degree-based network models (e.g., Holland & Leinhardt, 1981; Chatterjee, Diaconis & Sly, 2011; Yan, Leng & Zhu, 2016) to bipartite graphs. For example, a logistic $f(\cdot)$ corresponds to the bipartite version of the $p_1$-model for directed graphs in Holland & Leinhardt (1981).
We note that the value of $f(\cdot)$ in Equation (1) is invariant under the transform $(\alpha, \beta) \mapsto (\alpha - c, \beta + c)$ for a constant $c$. For model identification, without loss of generality, we constrain $\beta_n = 0$.
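The invariance and the identification constraint can be checked numerically. The sketch below is illustrative only: it assumes a Poisson-type mean $e^{\alpha_i + \beta_j}$ as one concrete choice of model, and the helper names `edge_means` and `normalise` are hypothetical.

```python
import math

def edge_means(alpha, beta):
    """Edge means exp(alpha_i + beta_j); an illustrative (assumed) choice of model."""
    return [[math.exp(a + b) for b in beta] for a in alpha]

def normalise(alpha, beta):
    """Apply (alpha, beta) -> (alpha - c, beta + c) with c = -beta_n, so that beta_n = 0."""
    c = -beta[-1]
    return [a - c for a in alpha], [b + c for b in beta]

alpha = [0.3, -0.2, 1.0]   # hypothetical actor parameters
beta = [0.5, 0.1]          # hypothetical event parameters (last one not yet zero)

alpha0, beta0 = normalise(alpha, beta)
before = edge_means(alpha, beta)
after = edge_means(alpha0, beta0)

# Largest change in any edge mean caused by the shift (should be numerically zero).
max_diff = max(abs(x - y) for r1, r2 in zip(before, after) for x, y in zip(r1, r2))
```

Because every edge distribution depends on the parameters only through $\alpha_i + \beta_j$, the shift leaves all edge means unchanged, which is why one parameter has to be pinned down.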
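To estimate the model parameters, we use a moment method instead of maximum likelihood estimation. When $f(\cdot)$ is an exponential family distribution, the two methods are equivalent. Let $\mu(\cdot)$ denote the expectation of $f(\cdot)$. By definition, we have $\mathbb{E}(x_{i,j}) = \mu(\alpha_i + \beta_j)$. Equating population and sample versions of node degrees, we have the following moment equations:

$$d_i = \sum_{j=1}^{n} \mu(\alpha_i + \beta_j), \quad i = 1, \dots, m; \qquad b_j = \sum_{i=1}^{m} \mu(\alpha_i + \beta_j), \quad j = 1, \dots, n-1. \tag{2}$$

One can easily verify that $\sum_{i=1}^{m} d_i = \sum_{j=1}^{n} b_j$; therefore, the number of effective moment equations is $m + n - 1$, and we formulate the equations for $d_1, \dots, d_m, b_1, \dots, b_{n-1}$ in Equation (2) and use them to estimate the $m + n - 1$ free model parameters. We denote our moment estimator, the solution to Equation (2), by $\hat{\theta} := (\hat{\alpha}_1, \dots, \hat{\alpha}_m, \hat{\beta}_1, \dots, \hat{\beta}_{n-1})^\top$. We can use the Newton-Raphson algorithm to solve for $\hat{\theta}$.

DOI: 10.1002/cjs.11735 The Canadian Journal of Statistics / La revue canadienne de statistique

To discuss the existence and uniqueness of $\hat{\theta}$, define

$$F_i(\theta) = \sum_{j=1}^{n} \mu(\alpha_i + \beta_j) - d_i, \quad i = 1, \dots, m; \qquad F_{m+j}(\theta) = \sum_{i=1}^{m} \mu(\alpha_i + \beta_j) - b_j, \quad j = 1, \dots, n-1, \tag{3}$$

and let $F(\theta) = (F_1(\theta), \dots, F_{m+n-1}(\theta))^\top$. Generally speaking, one cannot always expect the Jacobian matrix $F'(\theta)$ to be invertible, a property that would naturally lead to the existence and uniqueness of $\hat{\theta}$. Fortunately, in the next section, we will show that $\hat{\theta}$ exists with probability approaching one under mild conditions.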
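The moment equations and their Newton-Raphson solution can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes the Poisson-type mean $\mu(x) = e^{x}$, starts from $\theta = 0$, fixes $\beta_n = 0$, and uses a small hand-written linear solver; the names `solve_linear` and `moment_estimate` and the toy matrix `X` are hypothetical.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def moment_estimate(X, mu=math.exp, dmu=math.exp, tol=1e-10, max_iter=100):
    """Newton iterations for F(theta) = 0, with beta_n held at 0 (assumed mu = exp)."""
    m, n = len(X), len(X[0])
    d = [sum(row) for row in X]                               # actor degrees
    b = [sum(X[i][j] for i in range(m)) for j in range(n)]    # event degrees
    alpha, beta = [0.0] * m, [0.0] * n
    p = m + n - 1                                             # number of free parameters
    for _ in range(max_iter):
        F = [0.0] * p
        J = [[0.0] * p for _ in range(p)]
        for i in range(m):
            for j in range(n):
                s = alpha[i] + beta[j]
                F[i] += mu(s)
                J[i][i] += dmu(s)
                if j < n - 1:                                 # beta_n is not a free parameter
                    F[m + j] += mu(s)
                    J[m + j][m + j] += dmu(s)
                    J[i][m + j] += dmu(s)
                    J[m + j][i] += dmu(s)
        for i in range(m):
            F[i] -= d[i]
        for j in range(n - 1):
            F[m + j] -= b[j]
        if max(abs(v) for v in F) < tol:
            break
        step = solve_linear(J, F)                             # Newton step: J * step = F
        for i in range(m):
            alpha[i] -= step[i]
        for j in range(n - 1):
            beta[j] -= step[m + j]
    return alpha, beta

# Toy 4 x 3 bi-adjacency matrix of counts (hypothetical data).
X = [[2, 1, 0], [1, 3, 1], [0, 1, 2], [4, 2, 1]]
alpha_hat, beta_hat = moment_estimate(X)
```

The dense Jacobian is what drives the quadratic memory footprint of a Newton solver; for large networks one would exploit its structure rather than store it in full.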

Notation and Preliminaries
Let ) .The asymptotic behaviours of the moment estimator crucially depend on the Jacobian matrix of  ().It turns out that this Jacobian is structured, and we characterize its structure by the notion of a general matrix class as follows.For  ≥  > 0, we say that a ( +  − 1) × ( +  − 1)-dimensional matrix  = (v , ) belongs to the matrix class  , (, ) if the following conditions hold: and  , = 1 if  =  and is otherwise zero.The upper bound of the approximation error is identified in Lemma S1.1; see the Supplementary Material.

Consistency, Uniqueness and Asymptotic Normality
Suppose for some  , and  , , we have  , <  *  +  *  <  , for all , .We will first show that there is a unique solution in the neighbourhood of  * , where , and further show that this local solution is also the unique global solution.Assume that (⋅) is second-order differentiable and satisfies the following two regularity conditions: • Condition (1): When 0 <  , ≤  ≤  , , there are three positive numbers  ,,0 ,  ,,1 and  ,,2 such that where  ,,0 ,  ,,1 and  ,,2 may depend on  , and  , .
We notice that the assumption specified in Equation ( 7) guarantees that  ′ () is always positive (or always negative) in [ , ,  , ] and thus bounded away from zero.We will use this fact later.A well-known corollary of Condition (2) is the sub-exponential Bernstein's inequality (see below).A proof can be found in Corollary 2.8.3 of Vershynin (2018) (i) Then with probability at least 1 −  Q, , ,,3 ( + ) −1 , the moment estimator θ exists and is unique in the following neighbourhood: where  is some global constant and, moreover, for this θ, we also have (ii) The unique solution θ in the neighbourhood specified in Equation ( 12) is also the unique solution in Remark 1.The main technique that we used to establish the key results in Theorem 1 is the analysis of an oracle Newton iteration initiated at  * , which provably converges to θ.Here we briefly explain the roles of the assumptions in this theorem.In Condition (1), Equation ( 7) ensures that the Jacobian matrix  ′ () belongs to the matrix class  , (  ,,0 ,  ,,1 ) . 
For simplicity, in this article, we only focus on the case where $\mu'(x) > 0$ and $F'(\theta)$ itself belongs to this matrix class. The proof for the case where $\mu'(x) < 0$ and $-F'(\theta)$ belongs to the class can be established similarly. Equation (8) guarantees that the Jacobian matrix $F'(\theta)$ is Lipschitz continuous, which is key to bounding the errors of the Newton iterations and eventually the error in $\hat{\theta}$. Condition (2) implies Lemma 1 and some important concentration results dependent on node degrees in our analysis; see details in Lemmas S1.4, S1.6 and S1.8 in the Supplementary Material. The apparently mild assumptions identified in Equations (10) and (11) will be needed in some technical proof steps. In fact, our results allow the complexity of the population model to increase with the sample sizes $m, n$. In most existing models, the population distribution is fixed; in this case, the bounds on $\alpha_i^* + \beta_j^*$ and the constants appearing in Conditions (1) and (2) are all global constants, and our assumptions identified in Equations (10) and (11) hold trivially.
Remark 2. If  ,,2 ∕ ,,0 goes to zero slowly enough, there exists a small constant  > 0 such that the solution to the moment functions specified in Equation (3) exists and is unique in Ω (  * ,  ) .
Moreover, if the initial point $\theta^{(0)}$ is close enough to the true value, that is, $\theta^{(0)}$ lies in the neighbourhood $\Omega$ of $\theta^*$ from Remark 2, the iterative sequence provably converges to $\hat{\theta}$. For more details, see conclusion (iii) of Lemma S1.2 in the Supplementary Material. For simplicity, in this article, we only present theoretical guarantees for the estimator obtained by the Newton iterations initiated at $\theta^{(0)} = \theta^*$; the analysis for a general $\theta^{(0)}$ in this neighbourhood is similar but requires a much more involved formulation.
Our second main result describes the asymptotic normality of $\hat{\theta}$. It turns out that the asymptotic variance of $\hat{\theta}$ can be characterized by the covariance structure of observed node degrees. Let $g = (d_1, \dots, d_m, b_1, \dots, b_{n-1})^\top$, so that $g_i = d_i$ and $g_{m+j} = b_j$, denote the observed degree sequence.
• Condition (3) For some 0 <  ,,4 <  ,,5 , we have We have the following relationship between θ and g: Lemma 2. Under Condition (3) and then we have where  * is the particular approximation specified in Equation (5) to the inverse matrix of  * =  ′ ( * ),  * denotes the Jacobian matrix of  () at the true point  * and we borrow the   (⋅) notation from probability theory.
Notice that  * in Equation ( 16) is a nonrandom coefficient matrix, and g − (g) is asymptotically normal.Now we characterize the distribution of g via its covariance matrix.Under the assumption specified in Equation ( 14), for 1 = 0, and similarly  , ′ = 0, and further, . It is easy to check the remaining conditions to verify that indeed  ∈  , (  ,,4 ,  ,,5 ) .Define  +,+ = Var(  ), and ) .
Since  1 ∕ 1,1 , … ,   ∕ , and  1 ∕ +1,+1 , … ,   ∕ +,+ are asymptotically independent, we have the following characterization of the asymptotic behaviour of : where   is the th element of g. Also, for any fixed  ≥ 1, as  → ∞, the vector consisting of the first  elements of  * { g − (g) } is asymptotically normal with mean zero and covariance matrix given by the upper left  ×  submatrix of .
Remark 3. When the maximum likelihood equations of  coincide with the moment equations, as is the case in the maximum entropy models, Poisson model and -model, then Equation ( 14) implies Equation ( 7), and it can be shown by a first-order Taylor expansion that the Jacobian matrix of the parameter vector and the covariance matrix of the degree sequence coincide.If  , is a sub-exponential random variable with parameter  , , ,,4 → 0, as  → ∞, where  0 is some absolute constant.By the Lyapunov central limit theorem (Vershynin, 2018; page 362), we get that With the above preparations, we are now ready to present our second main result on the asymptotic normality of θ.
Theorem 2. Suppose Equations (8), ( 13)-( 15) and  2 ,,1  ,,5 where θ[1∶] and  [1∶𝑘] are the first  elements of the corresponding vector and Theorem 2, we can conveniently construct approximate marginal and joint confidence intervals for estimating  * .For example, an approximate 1 ) 1∕2 , where  1−∕2 is the 1 − ∕2-quantile of the standard normal distribution, and v, and v, are the moment estimates of v , and v , that are obtained by replacing all   with their moment estimates.Here û and v are the estimated covariance matrix Cov {g − (g)} and the estimated Jacobian matrix  ′ ( θ), respectively.

Generalized 𝛽-model
The β-model (Chatterjee, Diaconis & Sly, 2011) is an exponential random graph model with the degree sequence as the exclusively sufficient statistic. Here we generalize the β-model to bipartite graphs. For simplicity, we assume that edges belong to the sample space Λ = {0, 1, … ,  − 1}, where  ≥ 2 is a constant. Assume the edges  , are independently generated with the following probability mass function: In this example, we let  be as defined in Equation (6) and set  , = − , . By definition, we have . Now we check the conditions of Theorem 1. For Condition (1), straightforward calculation shows that ) 2 .
Note that the moment equations of the maximum entropy distributions are equal to the maximum likelihood equations; then the covariance matrix of By the central limit theorem for the bounded case (Loève, 1977; page 289), v , then Now that all the required conditions have been checked, we apply Theorem 2 and obtain the following result.
Corollary 2. If   , = ) , ∕ = (1), then for any fixed  ≥ 1, as  → ∞, the vector consisting of the first  elements of θ −  * is asymptotically multivariate normal with mean zero and covariance matrix given by the upper left  ×  submatrix of .

Poisson Model
In this example, independent weighted edges take values in Λ = ℕ 0 and each follows an edge-specific Poisson distribution Now we check the conditions of Theorem 1.For Condition (1), we have In addition, for Condition (1), we have, for  ∈ , In this example, the moment equations coincide with the maximum likelihood equations.The covariance matrix of {g − (g)} is  =  ′ () ∈  , (  ,,0 ,  ,,1 ) , and the third moment of the Poisson with parameter    +  is , the above expression goes to zero.This satisfies the conditions of the Lyapunov central limit theorem (Billingsley, 1995;page 362) ) , ∕ = (1), then for any fixed  ≥ 1, as  → ∞, the vector consisting of the first  elements of θ −  * is asymptotically multivariate normal with mean zero and covariance matrix given by the upper left  ×  submatrix of .
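To spell out why the moment equations coincide with the maximum likelihood equations in this example, the following short derivation assumes the log-link parametrization, in which $x_{i,j}$ is Poisson with mean $\lambda_{i,j} = e^{\alpha_i + \beta_j}$, so that $\mu(x) = e^{x}$. The parametrization is an assumption made for this sketch.

```latex
\begin{align*}
\ell(\alpha, \beta)
  &= \sum_{i=1}^{m} \sum_{j=1}^{n}
     \Bigl\{ x_{i,j} (\alpha_i + \beta_j) - e^{\alpha_i + \beta_j} - \log (x_{i,j}!) \Bigr\}, \\
\frac{\partial \ell}{\partial \alpha_i}
  &= \sum_{j=1}^{n} x_{i,j} - \sum_{j=1}^{n} e^{\alpha_i + \beta_j}
   = d_i - \sum_{j=1}^{n} \mu(\alpha_i + \beta_j), \\
\frac{\partial \ell}{\partial \beta_j}
  &= \sum_{i=1}^{m} x_{i,j} - \sum_{i=1}^{m} e^{\alpha_i + \beta_j}
   = b_j - \sum_{i=1}^{m} \mu(\alpha_i + \beta_j), \\
-\frac{\partial^2 \ell}{\partial \alpha_i^2}
  &= \sum_{j=1}^{n} e^{\alpha_i + \beta_j}
   = \sum_{j=1}^{n} \operatorname{Var}(x_{i,j})
   = \operatorname{Var}(d_i).
\end{align*}
```

Setting the first-order derivatives to zero recovers the degree-matching equations, and the last line illustrates, for a diagonal entry, why the Jacobian of the moment equations agrees with the covariance matrix of the degree sequence under independent Poisson edges.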
Corollary 5.If   , − , the above expression goes to zero.This satisfies the conditions of the Lyapunov central limit theorem (Billingsley, 1995;page 362).Therefore v ) is also asymptotically standard normal under the same conditions.If   , − , = Applying Theorem 2, we have the following asymptotic normality result for θ. ) , ∕ = (1), then for any fixed  ≥ 1, as  → ∞, the vector consisting of the first  elements of θ −  * is asymptotically multivariate normal with mean zero and covariance matrix given by the upper left  ×  submatrix of .

Maximum Entropy Distributions with Continuous Weights
We now consider maximum entropy distributions with continuous weights, that is, $\Lambda = \mathbb{R}_0$, where the moment equations are equal to the maximum likelihood equations. Assume that the $x_{i,j}$, $i \in [m]$, $j \in [n]$, are mutually independent exponential random variables with the density function $f(x_{i,j} = x) = (\alpha_i + \beta_j)\, e^{-(\alpha_i + \beta_j) x}$, $x > 0$.
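A quick Monte Carlo check of this model is sketched below. Under the exponential density with rate $\alpha_i + \beta_j$, the edge mean is $\mu(\alpha_i + \beta_j) = 1/(\alpha_i + \beta_j)$, so average simulated degrees should approach $\sum_j 1/(\alpha_i + \beta_j)$. The parameter values, the seed and the function names are hypothetical choices made for illustration.

```python
import random

def mu(s):
    """Mean of an exponential edge weight with rate s = alpha_i + beta_j."""
    return 1.0 / s

def simulate_degrees(alpha, beta, rng):
    """Draw one bi-adjacency matrix with exponential weights and return actor degrees."""
    return [sum(rng.expovariate(a + b) for b in beta) for a in alpha]

rng = random.Random(0)
alpha = [1.0, 2.0, 0.5]    # hypothetical actor parameters
beta = [1.0, 0.0]          # hypothetical event parameters, with beta_n = 0
reps = 20000

# Average the simulated actor degrees over many independent networks.
avg_d = [0.0] * len(alpha)
for _ in range(reps):
    for i, d in enumerate(simulate_degrees(alpha, beta, rng)):
        avg_d[i] += d / reps

# Population counterpart: the right-hand side of the moment equations.
expected_d = [sum(mu(a + b) for b in beta) for a in alpha]
```

Note that here $\mu'(x) = -1/x^2 < 0$, so this example falls under the case in which the negated Jacobian, rather than the Jacobian itself, has positive entries.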
We first apply Theorem 1 to obtain the existence and consistency of θ.In this case By direct calculations, for  ∈ , we have If  , ∕ , =  (  1∕6 ) , then the above expression goes to zero.This shows that the condition for the Lyapunov central limit theorem (Billingsley, 1995;page 362)  ) , ∕ = (1), then for any fixed  ≥ 1, as  → ∞, the vector consisting of the first  elements of θ −  * is asymptotically multivariate normal with mean zero and covariance matrix given by the upper left  ×  submatrix of .

NUMERICAL STUDIES
In this section, we report the results of numerical studies involving synthetic data generated by the Poisson model specified in Equation (19), in order to assess the performance of our moment estimator. This model is widely used to describe the likelihood of discrete events occurring in a continuous manner, such as website visits, user ratings, crime and disease incident reports. We also describe the analysis of an example from the US Law Firms and World Cities network.

Simulations
We set $\alpha_i^* = (m - i)L/(m - 1)$, $\beta_j^* = (n - j)L/(n - 1)$ and $\beta_n^* = 0$. We chose to fix the bipartite network size at $(m, n) = (100, 50), (200, 100)$ and test $L$ on the logarithmic scale, that is, $L \in \{0, \log(\log m), \log m\}$. We focussed on the asymptotic distributions of ξ, = where v, is the estimate of v , which was obtained by replacing $\theta^*$ with $\hat{\theta}$, and then verifying Corollary 4 empirically. We assessed the asymptotic normality of ξ, , η, by Q-Q plot under various $L$. We also evaluated the coverage probabilities and the lengths of the empirical 95% confidence intervals. In addition, we recorded the observed frequency that the estimate does not exist. Each simulation involved 10,000 repetitions.
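One replication of such a simulation design can be sketched as follows. The sketch assumes the parameter grid $\alpha_i^* = (m - i)L/(m - 1)$ and $\beta_j^* = (n - j)L/(n - 1)$ and a Poisson mean of $e^{\alpha_i + \beta_j}$; both are assumptions made for illustration, as are the small network size, the seed and the function names. The Poisson sampler uses Knuth's multiplication method, which is only suitable for moderate means.

```python
import math
import random

def true_parameters(m, n, L):
    """Equally spaced parameters on [0, L]; the last beta is 0 (assumed design)."""
    alpha = [(m - i) * L / (m - 1) for i in range(1, m + 1)]
    beta = [(n - j) * L / (n - 1) for j in range(1, n + 1)]
    return alpha, beta

def sample_poisson(lam, rng):
    """Knuth's method: multiply uniforms until the product drops below exp(-lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_network(alpha, beta, rng):
    """Independent Poisson edge weights with means exp(alpha_i + beta_j)."""
    return [[sample_poisson(math.exp(a + b), rng) for b in beta] for a in alpha]

rng = random.Random(2023)
m, n = 6, 4                          # deliberately tiny, for illustration only
L = math.log(math.log(m))            # one point of the assumed grid for L
alpha_star, beta_star = true_parameters(m, n, L)
X = simulate_network(alpha_star, beta_star, rng)

d = [sum(row) for row in X]                               # actor degrees
b = [sum(X[i][j] for i in range(m)) for j in range(n)]    # event degrees
```

In a full study this replication would be repeated many times, with the estimator and its standardized differences recomputed on each simulated matrix.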
We tested the bipartite network sizes $(m, n) = (100, 50)$ and $(200, 100)$, respectively, and found the Q-Q plots for $\alpha_i^* - \alpha_j^*$ and $\beta_i^* - \beta_j^*$ to be similar for each setting. Therefore, in Figure 1, we only show the Q-Q plot for $\alpha_i^* - \alpha_j^*$ with $(m, n) = (100, 50)$. The horizontal and vertical axes are the theoretical and empirical quantiles, respectively, and the red lines correspond to the reference lines $y = x$. From Figure 1, we see that under all three $\alpha_i^* - \alpha_j^*$ configurations, the empirical distribution shows no evidence of departure from normality when $L \le \log(m)$.
Table 1 summarizes the observed results for estimating $\alpha_i^* - \alpha_j^*$, including estimated coverage probabilities, the length of the empirical 95% confidence intervals and the observed frequency that such a confidence interval does not exist. Here, a larger $L$ shortens the confidence interval, and the length also decreases as the bipartite network size $(m, n)$ increases.
As we read from Table 1, the coverage frequencies are all close to the nominal level of 95% for all $L$. Furthermore, we use a Lilliefors test to verify whether our data sample is drawn from a normally distributed population; see Table 2. When $L = 0$, $p = 0.00$ for all pairs $(i, j)$, thereby implying that it is unlikely that this sample came from a normal population; when $L = \log(\log m)$ and $L = \log m$, the $p$-values for most pairs $(i, j)$ are not less than 0.05, implying that the empirical distribution may reasonably be regarded to follow the normal distribution.
In Table 3, we report the running times of our algorithm under different configurations. Our method scales comfortably to networks of roughly 500 senders and receivers, respectively. Using a Newton method, the memory cost of our approach is $O((m + n)^2)$.

We used our method to analyze data concerning the US Law Firms and World Cities network discussed in Beaverstock, Smith & Taylor (2000); the dataset may be found at https://www.lboro.ac.uk/gawc/datasets/da5_1.html. The data concern numbers of lawyers of 100 American law firms with foreign offices in 72 cities outside the United States. We preprocessed the data by removing isolated nodes and obtained a bipartite graph with 98 firms and 68 cities. We used the GLM package with a Poisson link function to estimate the Poisson model parameters. Table 4 reports the estimated $\alpha_i$ and $\beta_j$ values with standard errors and observed degrees; recall that we set $\beta_{69} = 0$. In Table 4, we observe a clear positive relationship between $\hat{\alpha}$ and $d$, and between $\hat{\beta}$ and $b$ as well. The full version of this table may be found in the Supplementary Material.
Table 5 summarizes the estimated quantiles for firm and city degrees and shows that $d$ varies markedly from 1 to 1802, whereas the range of $b$, from 1 to 761, is somewhat more confined. This is also reflected in the spread of $\hat{\alpha}$, ranging from −5.01 to 2.48, and of $\hat{\beta}$, ranging from −3.33 to 3.30, as is shown in Figure 2. The histograms of both the $\hat{\alpha}$'s and the $\hat{\beta}$'s indicate that they follow approximately normal distributions. We also give the Q-Q plots of both the $\hat{\alpha}$'s and the $\hat{\beta}$'s in Figure 3; the horizontal and vertical axes are the theoretical and empirical quantiles, respectively, and the red lines correspond to the reference lines $y = x$. From Figure 3, we see that many points lie close to the reference lines $y = x$, especially in the middle, but the points in the tails are somewhat more variable. Overall, both $\hat{\alpha}$ and $\hat{\beta}$ appear to be approximately normally distributed.
As an application of our method, we used the estimated parameters to generate 100 bootstrapped adjacency matrices. The resulting bootstrapped adjacency matrices were then used to build 95% bootstrap confidence intervals for the degree sequences, as reported in Table 6. We can see that the mean degrees of the 100 bootstrap degree sequences are close to the degrees in the observed network dataset. Moreover, these original degrees belong to the 95% bootstrap confidence intervals. We also calculated the skewness of each bootstrap degree distribution and the mean skewness of the 100 bootstrap degree distributions; see Table 7. Compared with the skewness of the original degree distributions, the skewness values for the bootstrap degree distributions seem quite similar.

Referring to the observed values reported in Table 4, we make the following observations concerning this example. First, as one naturally expects, the estimated standard errors on high-degree nodes (also known as "hub nodes" or core nodes) are significantly smaller than the corresponding estimated standard errors for low-degree nodes ("leaf nodes," or peripheral nodes), since nodes with higher popularity provide more data about their connection patterns. Second, in light of Remark 4 in Section 3.2, we can construct a marginal approximate 95% t-confidence interval for each $\alpha_i$ and $\beta_j$ parameter, using $\hat{\alpha}_i \pm 2\, \widehat{\mathrm{s.e.}}(\hat{\alpha}_i)$ and, analogously, $\hat{\beta}_j \pm 2\, \widehat{\mathrm{s.e.}}(\hat{\beta}_j)$.

Let us inspect the top-10 highest degree firms listed in Table 4. A large value of $\hat{\alpha}$ suggests that the company is highly "internationalized." Similarly, a large value of $\hat{\beta}$ suggests the city's attraction for US law firms. Tracing the top-ranked cities in Table 4, we find the top three to be London, Paris and Hong Kong, all of which ranked in the top tiers of the recent GaWC Global City Index (London is Alpha++, the other two are Alpha+). Those ranked 4-10 (Brussels, Warsaw, Tokyo, Frankfurt, Moscow and Singapore, in order) suggest that while there is a clear positive relationship between a city's global centrality and its attraction for US law firms, there may be other factors that also play a role in the choices that law firms finally make. For instance, two potentially very relevant other factors are law systems and geographical distance. The United Kingdom, Hong Kong and Singapore share the same law system (common law) as the United States, which might facilitate their connection in legal practice and business. This might also partially explain the situation of Tokyo, since the Japanese legal system is a mixture of civil and common law systems. On the other hand, we observe that continental European cities that adopt civil law are significantly elevated in their ranking in Table 4 compared with their global city indices, possibly due to their geographical and cultural closeness to the United States relative to some Asian-Pacific cities with high global centrality.

Finally, the estimated $\hat{\alpha}$ values reported in the same table may serve as a useful reminder of the possible limitations of our study. The most internationalized company (Firm 45) has almost twice as many network connections as the total connections of the remaining top 10 law firms combined. Consequently, this firm alone strongly influences the estimated values of the $\beta_j$ parameters, and one should keep this fact in mind when interpreting the $\hat{\beta}$ values.
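The parametric bootstrap used above for the degree confidence intervals can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes the Poisson model with mean $e^{\alpha_i + \beta_j}$, for which the fitted means solving the moment equations reduce to $d_i b_j / N$ with $N$ the total edge weight (a shortcut specific to this model), and it uses percentile intervals. The toy matrix, the seed and all function names are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for moderate Poisson means."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def fitted_means(X):
    """Fitted Poisson means d_i * b_j / N implied by the moment equations (log-link case)."""
    m, n = len(X), len(X[0])
    d = [sum(row) for row in X]
    b = [sum(X[i][j] for i in range(m)) for j in range(n)]
    total = sum(d)
    return [[d[i] * b[j] / total for j in range(n)] for i in range(m)]

def bootstrap_degree_ci(X, B=100, level=0.95, seed=0):
    """Return (mean, lower, upper) of bootstrapped actor degrees, one tuple per actor."""
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    mu_hat = fitted_means(X)
    draws = [[] for _ in range(m)]
    for _ in range(B):
        for i in range(m):
            draws[i].append(sum(sample_poisson(mu_hat[i][j], rng) for j in range(n)))
    lo_idx = int(math.floor((1 - level) / 2 * (B - 1)))   # lower percentile position
    hi_idx = int(math.ceil((1 + level) / 2 * (B - 1)))    # upper percentile position
    out = []
    for i in range(m):
        s = sorted(draws[i])
        out.append((sum(s) / B, s[lo_idx], s[hi_idx]))
    return out

# Toy 4 x 3 matrix of counts standing in for the firm-by-city data (hypothetical).
X = [[2, 1, 0], [1, 3, 1], [0, 1, 2], [4, 2, 1]]
intervals = bootstrap_degree_ci(X)
```

The same loop, applied to column sums, gives intervals for the event degrees; summary statistics such as skewness can be computed on each bootstrapped degree sequence in the same pass.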

DISCUSSION
In this article, we proposed a general model for degree-based modelling and analysis of bipartite graphs. Our model generalizes several popular models in the existing literature that adopt this particular perspective for bipartite graphs. In contrast to the common likelihood-based methods, we proposed a moment method for parameter estimation that is computationally efficient and enjoys nice theoretical properties under mild conditions. Our proof makes original use of the theory on Newton's iterations in this setting.
Our model applies to a rich family of network models, and as demonstrated in Section 4, it provides a general framework for systematically studying the asymptotic properties of many existing models as its special cases. The simulation studies that we conducted for the Poisson model and the practical example that we analyzed concerning US law firms and world cities amply demonstrated the effectiveness of our method in concrete problem settings.
There are several possible directions for future research. One of these would involve seeking finite-sample error bound guarantees rather than asymptotic results. The analysis itself is not conceptually difficult, but the formulation becomes much more involved; therefore, we chose not to pursue this goal in this article.
Another interesting question would be to investigate the case where $m$ and $n$ are at very different scales. Very different $m$ and $n$ would significantly complicate the approximation to $V^{-1}$, which is a key technical ingredient in our analysis, and this complication would propagate to all consequent analyses and results. While we believe such an extension is feasible, it represents a separate line of investigation and exceeds the scope of this article.
The asymptotic approximation and concentration results that our analysis relies on can in fact accommodate slight dependency across network edges. In light of recent research interest in the topic of networks with dependent edges, we are also interested in exploring the relaxation of the independent edge assumption that is almost universally assumed in the current bipartite graph literature.
Finally, a natural topic of inquiry concerns richer network features beyond degrees. In fact, degrees can be viewed as a rescaled version of the simplest network moment, namely, edges in Zhang & Xia (2022). Other motifs such as stars and cycles are also useful and very meaningful quantities to study. However, methods and theory for degree-based exponential random graph models are still in active development and not yet complete. In this article, our aim has been to provide a more comprehensive understanding of relatively simple, degree-based models as a solid step forward. Also, while including more statistics into the exponential random graph model, one must also be very careful with model identifiability, which is generally a challenging task for many network models.
BIPARTITE NETWORK MODELS AND ESTIMATION
Let $G(m, n)$ be a bipartite graph with $m$ actors and $n$ events. Define $[m] := \{1, \dots, m\}$ and $[n] := \{1, \dots, n\}$ to be the set of the actors and the set of the events, respectively. Without loss of generality, we assume $n \le m$ hereafter in order to simplify the presentation of results and proofs. In this article, we study weighted edges and let $x_{i,j} \in \Lambda \subseteq \mathbb{R}$ be the edge weight between actor $i$ and event $j$. Let $X = (x_{i,j})_{m \times n}$ be the bi-adjacency matrix of $G(m, n)$. Define $d = (d_1, \dots, d_m)^\top$ and $b = (b_1, \dots, b_n)^\top$ to be the degrees of actors and events, respectively, where $d_i = \sum_{j=1}^{n} x_{i,j}$ and $b_j = \sum_{i=1}^{m} x_{i,j}$.

TABLE 1 :
Estimated coverage probabilities (×100%) of $\alpha_i^* - \alpha_j^*$ for a pair $(i, j)$ as well as the length of confidence intervals (in square brackets), and the probabilities (×100%) that the estimate does not exist (in parentheses).

TABLE 4 :
US Law Firms and World Cities network dataset: The estimators of α and β and their standard errors (in parentheses), and selected top 10 and bottom 10 firms and cities according to the degree sequences, respectively.

TABLE 5 :
The minimum, quartiles and maximum values of degrees from 98 firms and 68 cities.

FIGURE 2 :
The histograms of the $\hat{\alpha}$'s and $\hat{\beta}$'s for the US Law Firms and World Cities data: estimates of the 98 firm parameters (left) and the 68 city parameters (right).
FIGURE 3 :
The Q-Q plots of the $\hat{\alpha}$'s and $\hat{\beta}$'s for the US Law Firms and World Cities data: estimates of the 98 firm parameters (left) and the 68 city parameters (right).

TABLE 6 :
US Law Firms and World Cities network dataset: The mean degrees of 100 bootstrap degree sequences and their 95% bootstrap confidence intervals (i.e., CI) (in square brackets), and selected top 10 and bottom 10 firms and cities in order, respectively.

TABLE 7 :
The skewness of degree distributions from 98 firms and 68 cities:   indicates the skewness of original degree distributions, β  indicates the mean skewness of 100 bootstrap degree distributions, while the skewness of the th bootstrap degree distributions is denoted by β  .