In the maximum likelihood framework, the score test of Rao (1948) is preferable to the Wald test and the likelihood ratio test in terms of computation time: it requires only the estimation of the model restricted by the null hypothesis, whereas the Wald test requires the more time-consuming estimation of the unrestricted model, and the likelihood ratio test the still more time-consuming estimation of both the restricted and the unrestricted model. In addition, if it is desired to test models with correlated statistics or to test homogeneity assumptions, root-finding algorithms for estimating the unrestricted model may need a large number of iterations and may struggle to find the root(s) of estimating function (9). In situations where computation time and computing resources are a serious concern, the score test of Rao (1948) is therefore preferable to the Wald test and the likelihood ratio test. Under regularity conditions, the asymptotic behaviour of the three tests is equivalent (see Rao, 2002; Lehmann & Romano, 2005), so in practice it is both legitimate and prudent to let practical considerations along these lines guide the choice of test.

However, given a medium-sized to large data set, maximum likelihood estimation is hardly necessary on statistical grounds and hardly attractive on computational grounds (see Section 2.2); tests in the method of moments framework are therefore preferable to tests in the maximum likelihood framework, and it is desirable to obtain a score-type test in that framework. Such a test, which is the natural counterpart of the score test of Rao (1948) when nuisance parameters are estimated by maximum likelihood estimators and of the *C*(α) test of Neyman (1959) when nuisance parameters are estimated by consistent estimators, can be obtained by replacing the score function by regular estimating functions (Godambe, 1960, 1991) along the lines of Basawa (1985, 1991).

This section proceeds as follows. In Section 3.2.1, the case is first considered where the estimating function is given by the score function (9), and the score test of Rao (1948) and the *C*(α) test of Neyman (1959) in the family of models of interest are introduced; the case is then considered where the estimating function is given by (8), and a new score-type test in the family of models of interest is introduced. Remarks and extensions are given in Section 3.2.2.

##### 3.2.1. Basic score-type test

Let **θ**_{1} be a vector of nuisance parameters and **θ**_{2} be a vector of parameters of primary interest, and write **θ**′= (**θ**′_{1}, **θ**′_{2}).

In the classical Neyman–Pearson tradition, goodness of fit can be studied by specifying hypotheses regarding the postulated family of probability distributions , for instance the null hypothesis

- (11)

tested against

- (12)

where **θ**_{2,0} is a specified value (such as **θ**_{2,0}=**0**), and **θ**_{1} is unspecified. Let **θ**′_{0}= (**θ**′_{1}, **θ**′_{2,0}) be the parameter vector under *H*_{0}: **θ**_{2}=**θ**_{2,0}. Let g_{n}= g_{n}(**z**_{n}, **θ**_{0}) be an estimating function satisfying regularity conditions (see Godambe, 1960, 1991; Basawa, 1985, 1991). Partition g′_{n}= (g′_{1n}, g′_{2n}) in accordance with **θ**′_{0}= (**θ**′_{1}, **θ**′_{2,0}).

###### 3.2.1.1. *Estimating function: score function*

*Score test without nuisance parameters*

In the absence of nuisance parameters, **θ**_{0} reduces to **θ**_{2,0}. If **θ**_{2,0} were translated by , the local change in the log-likelihood function due to the local change in **θ**_{2,0} would be given approximately by

- (13)

Under *H*_{0}: **θ**_{2}=**θ**_{2,0}, (13) has mean 0 and variance . A test of *H*_{0}: **θ**_{2}=**θ**_{2,0} could be based on the test statistic

- (14)

Test statistic (14) is based on a linear function of the score function g_{2n} and raises the question: which linear function of the score function g_{2n} is optimal in the sense that, under *H*_{1}: **θ**_{2}≠**θ**_{2,0}, test statistic (14) is as large as possible? By the Cauchy–Schwarz inequality,

- (15)

where the maximum on the right-hand side of (15) is attained at . The right-hand side of (15) is the score test statistic of Rao (1948).
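The Cauchy–Schwarz argument can be checked numerically. The Python sketch below uses an invented score vector g and covariance matrix (not quantities from the model of this paper) to verify that the ratio (a′g)²/(a′Σa) over linear functions a is maximized at a = Σ⁻¹g, where it equals the score test statistic g′Σ⁻¹g.

```python
import numpy as np

# Numeric check of the Cauchy-Schwarz bound behind the score test:
# (a' g)^2 / (a' Sigma a) is maximized at a = Sigma^{-1} g, with
# maximum g' Sigma^{-1} g (the score test statistic).
rng = np.random.default_rng(0)

# hypothetical score vector and positive definite covariance (illustrative)
g = np.array([1.2, -0.7, 0.4])
A = rng.normal(size=(3, 3))
sigma = A @ A.T + 3.0 * np.eye(3)

def ratio(a):
    return (a @ g) ** 2 / (a @ sigma @ a)

a_opt = np.linalg.solve(sigma, g)              # optimal linear function
score_stat = g @ np.linalg.solve(sigma, g)     # Rao's score statistic

# the optimum attains the bound, and no random direction exceeds it
assert np.isclose(ratio(a_opt), score_stat)
for _ in range(1000):
    a = rng.normal(size=3)
    assert ratio(a) <= score_stat + 1e-10
```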

*Score test with nuisance parameters*

In the presence of nuisance parameters, the score test statistic of Rao (1948) is given by

- (16)

where is the restricted maximum likelihood estimator of **θ**_{0} under *H*_{0}: **θ**_{2}=**θ**_{2,0}, obtained by maximizing the log-likelihood function with respect to **θ**_{0} subject to the constraint *H*_{0}: **θ**_{2}=**θ**_{2,0}.
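As a concrete illustration of a statistic of the form (16), the following sketch computes Rao's score test in a toy Gaussian model N(μ, σ²), testing *H*_{0}: σ² = 1 with μ a nuisance parameter replaced by its restricted maximum likelihood estimator (the sample mean). The model and numbers are illustrative only, not the network model of this paper; the Fisher information happens to be block-diagonal here, so no cross-correction for the nuisance parameter arises.

```python
import numpy as np
from scipy import stats

# Toy example of Rao's score test with a nuisance parameter:
# H0: sigma^2 = sigma2_0 in a N(mu, sigma^2) sample, mu unspecified.
rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=1.0, size=500)   # data generated under H0
sigma2_0 = 1.0                                 # null value of interest
n = len(x)
mu_hat = x.mean()                              # restricted MLE of nuisance mu

# score for sigma^2, evaluated at the restricted estimator (mu_hat, sigma2_0)
g2 = -n / (2 * sigma2_0) + ((x - mu_hat) ** 2).sum() / (2 * sigma2_0 ** 2)
# Fisher information for sigma^2: n / (2 sigma^4)
info22 = n / (2 * sigma2_0 ** 2)

score_stat = g2 ** 2 / info22          # asymptotically chi-squared, 1 df, under H0
p_value = stats.chi2.sf(score_stat, df=1)
```

Only the restricted model is estimated, which is the computational advantage discussed above.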

*C*(α) *test with nuisance parameters*

The *C*(α) test is designed to test hypotheses in the presence of nuisance parameters, where nuisance parameters are replaced by consistent estimators. If **θ**_{1} and **θ**_{2,0} were translated by and , respectively, the local change of the log-likelihood function due to the local changes in **θ**_{1} and **θ**_{2,0} would approximately be given by

- (17)

If **θ**_{0} were estimated by the restricted maximum likelihood estimator under *H*_{0}: **θ**_{2}=**θ**_{2,0}, then would vanish. If, however, **θ**_{0} were replaced by a consistent estimator under *H*_{0}: **θ**_{2}=**θ**_{2,0}, then would not, in general, vanish. Neyman (1959) showed that, under regularity conditions, the impact of replacing **θ**_{0} by a consistent estimator under *H*_{0}: **θ**_{2}=**θ**_{2,0} on the test can be eliminated by basing tests on

- (18)

where

- (19)

Under *H*_{0}: **θ**_{2}=**θ**_{2,0}, (18) has mean 0 and variance , where **C**_{n} is the variance–covariance matrix of e_{n}. A test of *H*_{0}: **θ**_{2}=**θ**_{2,0} could be based on the test statistic

- (20)

An argument along the lines of the score test shows that the optimal choice of is given by , giving rise to the *C*(α) test statistic of Neyman (1959):

- (21)
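The projection underlying (18) and (19) can be illustrated numerically: the effective score e = g₂ − B g₁, with B the regression coefficient of g₂ on g₁, is uncorrelated with the nuisance score g₁, which is what makes the test insensitive, to first order, to how the nuisance parameters are consistently estimated. The joint covariance matrix below is invented for illustration.

```python
import numpy as np

# Sketch of the projection behind the C(alpha) test: with partitioned
# score covariance [[S11, S12], [S21, S22]], the effective score
# e = g2 - B g1 with B = S21 S11^{-1} is uncorrelated with g1, and
# has variance C = S22 - S21 S11^{-1} S12.
rng = np.random.default_rng(2)

S = rng.normal(size=(5, 5))
sigma = S @ S.T + 5.0 * np.eye(5)     # hypothetical joint score covariance
s11, s12 = sigma[:3, :3], sigma[:3, 3:]
s21, s22 = sigma[3:, :3], sigma[3:, 3:]

B = s21 @ np.linalg.inv(s11)                       # optimal coefficient matrix
C = s22 - s21 @ np.linalg.inv(s11) @ s12           # variance of effective score

# draw scores with this covariance and check the orthogonality empirically
draws = rng.multivariate_normal(np.zeros(5), sigma, size=200_000)
g1, g2 = draws[:, :3], draws[:, 3:]
e = g2 - g1 @ B.T
cross_cov = (e.T @ g1) / len(draws)   # population value is exactly zero

assert np.max(np.abs(cross_cov)) < 0.1
```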

###### 3.2.1.2. *Estimating function: non-score function*

Suppose that

- (25)

where denotes convergence in distribution, *N*_{L} refers to the *L*-variate Gaussian distribution, and is non-singular.

The entities and can be replaced by and , respectively, without changing the asymptotic distribution of (29). The parameter vector **θ**_{0} can be replaced by a restricted moment estimator under *H*_{0}: **θ**_{2}=**θ**_{2,0} (see Section 2.2). The test statistic, obtained by replacing , , and **θ**_{0} in (29) by , , and , respectively, is given by

- (30)
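In the method of moments framework, the expectation and covariance entering (30) are typically unavailable in closed form and are estimated from simulations of the model restricted by *H*_{0}. The sketch below illustrates the resulting computation; the simulation model, function names, and numbers are stand-ins for exposition, not the network model of this paper.

```python
import numpy as np

# Sketch of a moment-based score-type test: the mean and covariance of
# the statistics under H0 are estimated by Monte Carlo simulation of the
# restricted model, and a quadratic-form statistic is formed, to be
# referred to a chi-squared distribution.
rng = np.random.default_rng(3)

def simulate_statistics(theta0, size):
    """Hypothetical stand-in for simulating the statistics u2 under H0."""
    return rng.poisson(lam=theta0, size=(size, 2)).astype(float)

theta0 = np.array([4.0, 7.0])     # restricted parameter (theta2 fixed by H0)
u2_obs = np.array([5.0, 8.0])     # observed statistics (illustrative)

sims = simulate_statistics(theta0, size=10_000)
g = sims.mean(axis=0) - u2_obs    # estimating function: expected minus observed
cov = np.cov(sims, rowvar=False)  # simulated covariance of the statistics

stat = g @ np.linalg.solve(cov, g)   # quadratic-form test statistic
```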

##### 3.2.2. Remarks and extensions

Observe that, to test restrictions on the parameter vector **θ**_{2}, **θ**_{2} need not be estimated, saving computation time and avoiding computational issues which may arise in the estimation of unrestricted models with correlated statistics or without homogeneity assumptions.

If **θ**_{2} (and therefore b_{n} and ) is a scalar, then (30) can be used both in its quadratic form, as presented above, and in its corresponding linear form,

- (31)

The linear form is convenient when one-sided one-parameter tests are desired. The minus sign in (31) facilitates interpretation in the sense that, if u_{2} denotes the statistic corresponding to the parameter **θ**_{2} and its conditional expectations given **X**(*t*_{1}) = **x**(*t*_{1}) are increasing functions of **θ**_{2}, then, by the definition of g_{n} in (8), **θ**_{2}−**θ**_{2,0} > 0 is associated with positive values of (31). By (27), the asymptotic distribution of (31) under *H*_{0}: **θ**_{2}=**θ**_{2,0} is standard Gaussian.
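A minimal sketch of the scalar case, with illustrative numbers: the signed statistic −g/sd is referred to the standard Gaussian distribution, and the minus sign makes an observed statistic larger than its expectation under *H*_{0} correspond to a positive value.

```python
import numpy as np
from scipy import stats

# Scalar linear form of the test: sign convention as in (31), so that
# observed > expected-under-H0 yields a positive statistic. All numbers
# are illustrative stand-ins for Monte Carlo output.
expected_under_h0 = 4.0   # simulated mean of the statistic under H0
observed = 5.1            # observed value of the statistic
sd_under_h0 = 0.8         # simulated standard deviation under H0

g = expected_under_h0 - observed     # scalar estimating function
z = -g / sd_under_h0                 # linear test statistic, ~ N(0, 1) under H0
p_one_sided = stats.norm.sf(z)       # one-sided test of theta2 > theta2_0
```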

Furthermore, tests with *R* > 1 degrees of freedom can be complemented with one-degree-of-freedom tests, testing the restrictions one by one; two-sided one-parameter tests can be based on the test statistic C_{n}, while one-sided one-parameter tests can be based on the test statistic D_{n}. It is convenient to compute the one-parameter test statistics from the simulations under the null hypothesis of the multi-parameter test, so that no additional, time-consuming simulations are required. If the null hypothesis of the multi-parameter test is true, these one-parameter test statistics are computed correctly. Otherwise, they are computed incorrectly, but they can still be taken as an informal indication of where the model deviates from the null hypothesis of the multi-parameter test.

Observe that test statistics C_{n} and D_{n} have an appealing interpretation in terms of goodness of fit in the classic sense, because both are based on

- (32)

where u_{2} is the vector of statistics corresponding to the parameter vector **θ**_{2}. In other words, the test statistics are based on the ‘distance’ between the expected value of the function u_{2} of the data – evaluated under *H*_{0}: **θ**_{2}=**θ**_{2,0}– and the observed value of u_{2}.

Since the one- and multi-parameter tests do not require the estimation of the unrestricted model, the tests are most useful in forward model selection and as tests of homogeneity assumptions with respect to time and nodes. To complement the tests, we derive one-step estimators that provide starting values of parameters in forward model selection. Suppose that tests indicate empirical evidence against the model restricted by *H*_{0}: **θ**_{2}=**θ**_{2,0} and it is desired to estimate the unrestricted model. If g_{n} is differentiable at , then, by definition (Magnus & Neudecker, 1988, p. 82),

- (33)

where

- (34)

Thus, in the limit, solving g_{n}= g_{n}(**z**_{n}, **θ**) =**0** is the same as solving

- (35)

suggesting the one-step estimator

- (36)

where is non-singular. The one-step estimator **θ**^{★} is an approximation of the unrestricted estimator and is useful as a starting value of the parameter vector **θ** in the root-finding algorithm used to find the root(s) of the estimating function g_{n}= g_{n}(**z**_{n}, **θ**). If either g_{n} is approximately linear as a function of **θ** or is sufficiently close to , the linear approximation of g_{n} around can be expected to result in good one-step estimators **θ**^{★} and therefore good starting values of **θ**. Otherwise, the one-step estimator **θ**^{★} is at least an improvement on .
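The one-step estimator is a single Newton-type step from the restricted estimator, using the Jacobian of the estimating function. The sketch below uses an invented two-parameter estimating function with a known root, purely to illustrate a step of the form (36).

```python
import numpy as np

# One-step estimator sketch: from the restricted estimate, take a single
# Newton-type step theta* = theta_hat - J(theta_hat)^{-1} g(theta_hat),
# where J is the Jacobian of the estimating function g.
def g(theta):
    # toy estimating function with root at theta = (1, 2) (illustrative)
    return np.array([theta[0] ** 2 - 1.0,
                     theta[0] * theta[1] - 2.0])

def jacobian(theta):
    return np.array([[2.0 * theta[0], 0.0],
                     [theta[1], theta[0]]])

theta_restricted = np.array([0.8, 1.5])   # plays the role of the restricted estimator
step = np.linalg.solve(jacobian(theta_restricted), g(theta_restricted))
theta_one_step = theta_restricted - step  # one-step estimator theta-star
```

Because g is approximately linear near the root, the one-step estimator lands much closer to the root than the restricted starting point, which is exactly why it makes a good starting value for the root-finding algorithm.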

Finally, concerning the asymptotic Gaussian distribution of the estimating function g_{n} (see (25)), note that the choice of statistics u in the estimating function g_{n} is arbitrary as long as g_{n} is increasing in **θ** and sensitive to changes in **θ**. The test statistics C_{n} and D_{n} are admissible for all choices of g_{n} for which g_{n} is asymptotically, or at least approximately, Gaussian distributed. In most applications, verifying the asymptotic distribution of g_{n} is hard. Indeed, hardly anything is known about the asymptotics of estimators and tests in the field of social networks, leaving aside simplistic models without dependence; for example, as noted in Section 3.1, the distribution of the *t*-type test statistic is unknown.