This research was supported by NSF grant DMS04-37167, the James McKeen Cattell Fund, and grants DA01070 and DA00017 from the National Institute on Drug Abuse. We would like to thank Bengt Muthén and Siek-Toon Khoo for providing the School Data and their unpublished manuscript. We also thank two reviewers who provided suggestions that helped to improve the paper.

# MULTILEVEL COVARIANCE STRUCTURE ANALYSIS BY FITTING MULTIPLE SINGLE-LEVEL MODELS

Article first published online: 18 MAY 2007

DOI: 10.1111/j.1467-9531.2007.00182.x

Additional Information

#### How to Cite

Yuan, K.-H. and Bentler, P. M. (2007), MULTILEVEL COVARIANCE STRUCTURE ANALYSIS BY FITTING MULTIPLE SINGLE-LEVEL MODELS. Sociological Methodology, 37: 53–82. doi: 10.1111/j.1467-9531.2007.00182.x

#### Publication History

- Issue published online: 18 MAY 2007

### Abstract


*Data in social and behavioral sciences are often hierarchically organized. Multilevel statistical procedures have been developed to analyze such data while taking into account the dependence of observations. When simultaneously evaluating models at all levels, a significant statistic provides no information on the level at which the model is misspecified. Model misspecification can exist at one or several levels simultaneously. When one level is misspecified, the other levels may be affected even when they are correctly specified. Motivated by these observations, we propose to separate a multilevel covariance structure into multiple single-level covariance structure models and to fit these single-level models as in conventional covariance structure analysis. A procedure for segregating the multilevel model into single-level models is developed. Five test statistics for evaluating a model at each level are provided. Standard error formulas for the separate estimators are also provided, and their efficiency is compared to simultaneous estimators. Empirical and Monte Carlo results demonstrate the advantages of the segregated procedure over the simultaneous procedure. Computer programs that will allow the developed procedure to be used in practice are also presented.*

### 1. INTRODUCTION


Data in social and behavioral sciences often exhibit hierarchical structures. For example, households are nested within neighborhoods, neighborhoods are nested within cities, and cities are further nested within countries; students are nested within classes, classes are nested within schools, and schools are further nested within school districts. Cases within a cluster are generally correlated due to their greater commonality as compared to cases in other clusters. The methodology for analyzing such data has to explicitly account for these correlations in order to get accurate results. Various procedures have been developed for modeling hierarchical data. Among these are the hierarchical linear model (HLM) and the multilevel structural equation model (SEM) (du Toit and du Toit 2007; Goldstein 2003; Lee 1990; Lee and Poon 1998; Lee and Song 2001; Liang and Bentler 2004; Longford 1993; McDonald and Goldstein 1989; Muthén 1994; Muthén and Satorra 1995; Poon and Lee 1992; Raudenbush and Bryk 2002; Yuan and Bentler 2002, 2003).

In a multilevel SEM, there is a covariance structure at each level. Assuming data are normally distributed, parameter estimates can be obtained by maximizing the corresponding likelihood function. Model fit can be evaluated by the likelihood ratio statistic. Alternative statistics have also been developed for situations where the normal distribution assumption is not tenable (see Yuan and Bentler 2003). However, due to the simultaneous evaluation of more than one model, these statistics suffer from at least the following three drawbacks: (1) With the same amount of misspecification, the power of a test statistic decreases as model size and degrees of freedom increase, so the overall statistics are less likely to be significant when simultaneously evaluating misspecified multilevel models. (2) Even when a statistic is significant, it is not clear whether the model at a single level is misspecified or whether the models at all levels are misspecified. (3) When fitting models at different levels simultaneously, misspecification at one level will affect the parameter estimates at the other levels. Thus, model diagnostics in simultaneous model evaluation can be very complicated, if not impossible.

As an example, the School Data from the Mplus Web site contain measurements of 5198 students from 235 schools obtained as part of the National Education Longitudinal Study (NELS) started in 1988.^{1} There are 21 variables in the data set, and Muthén, Khoo, and Gustafsson (1997) compared several models for these variables. Here we use only four math variables (algebra, arithmetic, geometry, probability) and four science variables (earth, chemistry, life, methods). Assuming that the math variables measure the “math ability” and the science variables measure the “science ability” of each student, a two-factor model at the student level and at the school level represents the substantive aspect of the eight variables. Fitting this two-level confirmatory factor model to the data using the normal distribution based maximum likelihood (ML) procedure,^{2} we get a likelihood ratio statistic of 77.97, corresponding to a *p*-value of 0.000 when referred to χ^{2}_{38}. The result may indicate that the model does not fit the data well. The significant lack of fit may be due to misspecification in the level-1 model, the level-2 model, or both. It may also result from nonnormality in the data. We will return to this example in Section 3, after presenting the statistical development for our alternative approach to model evaluation.

Since the total sample size in the above example is so large, a small misspecification can cause the significant lack of fit. We might think of avoiding the drawbacks of the overall test statistics by using so-called fit indices. However, in the context of multilevel SEM, it is not clear how to partition the chi-square and to minimize the unwanted effect of sample size. There is only one chi-square statistic, while there are varying sample sizes at different levels. Thus, it is not a simple matter to generalize a model fit index from the conventional SEM context to the multilevel context.

Aiming to solve the above problems in model estimation and evaluation of a multilevel model, this paper proposes to separate the multilevel model into multiple single-level models. The advantage of fitting single-level models is that we can judge whether the model at the given level is adequate or not. For example, when evaluating the first-level model, no information from the second- or third-level model is involved or needed. So techniques for model evaluation used in conventional SEM analysis can be utilized. When models at different levels share common parameters, we may also fit the separated models in steps by starting with the model at the lowest level. Once the first-level model is adequate, parameter estimates from fitting this model will be consistent. Treating these estimates as known when fitting the model at the next level will not systematically affect the model evaluation. This process continues until all submodels have been estimated and evaluated. Of course, the idea of sequential study of multilevel models is not new. For example, Muthén (1994) suggested a stepwise procedure for hierarchical model construction prior to simultaneous estimation and evaluation. For related suggestions, see Hox (2002:51–54, 232–38). These earlier proposals are directed primarily at model building and exploratory model evaluation. This paper provides a technical foundation and statistical basis for multilevel covariance structure analysis by fitting multiple single-level models.

Compared to the simultaneous estimation procedure, parameter estimates from the segregated or stepwise approach may lose efficiency as characterized by large sample theory. However, it is possible that parameter estimates in the segregated approach will be more efficient in small- to medium-sized samples, because those based on a smaller model are numerically more stable. The obvious advantage of the separate approach is that we can check the model fit at the given level using diagnostic techniques as well as model evaluation procedures available from the conventional SEM literature (Bentler 2006).

Section 2 will give details of the development, which can be regarded as an application of the general theory of estimating equations (Yuan and Jennrich 2000). We will mainly present the results; technical details for obtaining the results are either referred to the literature or provided in the appendix. We will return to the School Data example in Section 3. Simulation results contrasting model evaluation in the separate approach and the simultaneous approach will be provided in Section 4. Computer programs for model segregation and single-level analysis will be introduced in Section 5. Recommendations and a discussion will be provided at the end.

### 2. MODEL SEGREGATION AND EVALUATION


This section will provide the statistical development to separate a multilevel model into multiple one-level models. The multilevel model contains a covariance structure at each level, and several components are needed in order to fit the model at an individual level. These include a covariance matrix estimate **Σ̂**_{i} at each level *i* that is parallel to the sample covariance matrix in conventional covariance structure analysis (CSA). Because **Σ̂**_{i} cannot be regarded as the sample covariance matrix based on a normally distributed sample even when the multilevel data are normally distributed, we also need a consistent estimator of the covariance matrix of **Σ̂**_{i}. These two matrices will allow us to construct several statistics to evaluate the model at the individual level and to obtain standard errors of parameter estimates for that level. Although alternative procedures for multilevel models are available, estimation techniques are primarily developed for the normal distribution based ML procedure (Lee 1990; Liang and Bentler 2004). Our development will also be based on the ML estimation procedure, but statistics will be formulated to account for possible nonnormality of the multilevel data. Because the development and formulations are quite technical, we will first briefly describe the statistics and their relationship to those in the simultaneous evaluation of multilevel models as well as their counterparts in the conventional SEM context. Readers who are not interested in the technical details may read only the description and then move to the following sections.

In both multilevel and conventional SEM, the most widely used model evaluation procedure is to refer the likelihood ratio statistic *T*_{ML} to a chi-square distribution. Although *T*_{ML} can approach a chi-square distribution with nonnormal data when the data and model satisfy certain conditions, it does not approach a chi-square distribution in general. Normally distributed multilevel data require that observations at every level be normally distributed, which is even less realistic than the normality assumption on a single sample. When simultaneously evaluating multilevel SEM models, Yuan and Bentler (2002) proposed a rescaled statistic *T*_{RML}, whose asymptotic mean equals that of the nominal chi-square. With nonnormally distributed data, *T*_{RML} is typically more reliable than *T*_{ML}, but the asymptotic distribution of *T*_{RML} is generally unknown. Yuan and Bentler (2003) also proposed three additional statistics based on the ML estimate (MLE). These are the residual-based ADF statistic *T*_{RADF}, the corrected residual-based ADF statistic *T*_{CRADF}, and the residual-based *F*-statistic *F*_{R}. The asymptotic distributions of the three statistics are known and do not depend on the distribution form of the data. Simulation results indicate that *T*_{CRADF} and *F*_{R} also perform reasonably well at finite sample sizes. The counterpart of *T*_{ML} in conventional CSA is just the likelihood ratio statistic; the counterpart of *T*_{RML} is the rescaled statistic proposed by Satorra and Bentler (1994); the counterpart of *T*_{RADF} is the residual-based statistic proposed in Browne (1984); the counterpart of *T*_{CRADF} is the corrected statistic proposed in Yuan and Bentler (1998); the counterpart of *F*_{R} is also proposed in Yuan and Bentler (1998). In addition to developing these statistics for the level-1 and level-2 models, we also construct their counterparts when evaluating the level-2 model using a mix of two types of parameter estimates—that is, some of the parameters in evaluating the level-2 model will be fixed at the estimates obtained when fitting the level-1 model alone.

Although we aim to study general procedures for model segregation, for simplicity we explicitly consider only two-level models. The procedures described below can be easily generalized to a model with more than two levels. In the remainder of this section, we will first formulate the two-level SEM model and obtain the covariance matrix estimator as well as its asymptotic covariance matrix for each level. These will be used to develop model estimation and evaluation procedures for level 1 and level 2 in sequence. A procedure to check the agreement of the estimates for the common parameters of level 1 and level 2 will be developed next. Then, model evaluation with the mixed types of parameter estimates will be considered. Before ending this section, we will compare the efficiency of parameter estimates in the segregated procedure with that in the simultaneous procedure.

#### 2.1. *The Saturated Model and Its Parameter Estimates*

Let the *p* × 1 vectors **y**_{ij}, *i* = 1, …, *n*_{j}, be observations from cluster *j*, with *j* = 1, …, *J*. The two-level structure of **y**_{ij} can be described by

- **y**_{ij} = **μ** + **v**_{j} + **u**_{ij},   (1)

where **μ** is a mean vector, and **u**_{ij} and **v**_{j} are independent with *E*(**u**_{ij}) = *E*(**v**_{j}) = **0**, Cov(**u**_{ij}) = **Σ**_{1}, and Cov(**v**_{j}) = **Σ**_{2}. The vector **u**_{ij} contains the level-1 or within-level components; the vector **v**_{j} contains the level-2 or between-level components. The total level-1 sample size is *N* = *n*_{1} + *n*_{2} + … + *n*_{J}, while that for level 2 is just *J*. Assuming that **y**_{ij} follows a multivariate normal distribution, the estimates of **μ**, **Σ**_{1}, and **Σ**_{2} can be obtained by maximizing the log-likelihood function (see Yuan and Bentler 2002), which, up to an additive constant, is

- *l*(**μ**, **Σ**_{1}, **Σ**_{2}) = −(1/2)Σ^{J}_{j=1}[(*n*_{j} − 1){log|**Σ**_{1}| + tr(**Σ**^{−1}_{1}**S**_{j})} + log|**Σ**_{j12}| + (**ȳ**_{j} − **μ**)′**Σ**^{−1}_{j12}(**ȳ**_{j} − **μ**)],

where **ȳ**_{j} and **S**_{j} are the sample mean vector and sample covariance matrix of cluster *j*, with **Σ**_{j12} = *n*^{−1}_{j}**Σ**_{1} + **Σ**_{2}. Because (**μ**, **Σ**_{1}, **Σ**_{2}) are saturated for (1), the MLEs (**μ̂**, **Σ̂**_{1}, **Σ̂**_{2}) are still consistent for the population values (**μ**_{0}, **Σ**_{10}, **Σ**_{20}) even when the **y**_{ij} are not normally distributed. The MLEs are also asymptotically normally distributed. We need to introduce some additional notation in order to characterize their distribution. Our development for the separate analysis is based on the estimated covariance matrices **Σ̂**_{1} and **Σ̂**_{2} as well as their asymptotic covariance matrices.
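Because the cluster mean and the within-cluster covariance matrix are sufficient statistics for cluster *j* under model (1), the normal log-likelihood can be evaluated from them alone, with **Σ**_{j12} = *n*^{−1}_{j}**Σ**_{1} + **Σ**_{2} being the covariance matrix of **ȳ**_{j}. The sketch below is our own illustration, not the authors' code: it implements the likelihood up to an additive constant and checks it against the exact stacked multivariate normal density of each cluster.

```python
import numpy as np
from scipy.stats import multivariate_normal

def loglik_two_level(clusters, mu, Sigma1, Sigma2):
    """Two-level normal log-likelihood of model (1), up to an additive constant.

    clusters: list of (n_j x p) arrays of y_ij; mu: (p,); Sigma1, Sigma2: (p, p).
    Uses the cluster sufficient statistics: the within-cluster deviations carry
    Sigma1, and ybar_j ~ N(mu, Sigma_j12) with Sigma_j12 = Sigma1 / n_j + Sigma2.
    """
    _, logdet1 = np.linalg.slogdet(Sigma1)
    S1_inv = np.linalg.inv(Sigma1)
    ll = 0.0
    for y in clusters:
        n_j = y.shape[0]
        ybar = y.mean(axis=0)
        dev = y - ybar
        A = dev.T @ dev                      # equals (n_j - 1) * S_j
        Sj12 = Sigma1 / n_j + Sigma2         # Cov(ybar_j)
        _, logdet12 = np.linalg.slogdet(Sj12)
        r = ybar - mu
        ll -= 0.5 * ((n_j - 1) * logdet1 + np.trace(S1_inv @ A)
                     + logdet12 + r @ np.linalg.solve(Sj12, r))
    return ll

def loglik_stacked(clusters, mu, Sigma1, Sigma2):
    """Same likelihood via the exact (n_j * p)-dimensional density of each cluster."""
    ll = 0.0
    for y in clusters:
        n_j = y.shape[0]
        V = np.kron(np.eye(n_j), Sigma1) + np.kron(np.ones((n_j, n_j)), Sigma2)
        ll += multivariate_normal(np.tile(mu, n_j), V).logpdf(y.ravel())
    return ll

rng = np.random.default_rng(0)
p, n, J = 2, 4, 3
Sigma1 = np.array([[1.0, 0.3], [0.3, 1.0]])
Sigma2 = np.array([[0.5, 0.1], [0.1, 0.5]])
mu = np.zeros(p)
clusters = [rng.multivariate_normal(np.zeros(p), Sigma2)          # v_j
            + rng.multivariate_normal(np.zeros(p), Sigma1, size=n)  # u_ij
            for _ in range(J)]

# The two versions differ only by a constant that is free of the parameters:
d1 = (loglik_two_level(clusters, mu, Sigma1, Sigma2)
      - loglik_two_level(clusters, mu + 0.1, 2 * Sigma1, Sigma2))
d2 = (loglik_stacked(clusters, mu, Sigma1, Sigma2)
      - loglik_stacked(clusters, mu + 0.1, 2 * Sigma1, Sigma2))
assert abs(d1 - d2) < 1e-8
```

The decomposition is why the separate analysis below only ever needs cluster means and within-cluster covariance matrices, never the raw stacked data.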

For a *p* × *p* symmetric matrix **A**, let vech(**A**) be the vector formed by stacking the columns of **A**, leaving out the elements above the diagonal. We denote **σ**_{1} = vech(**Σ**_{1}), **σ**_{2} = vech(**Σ**_{2}), and **β** = (**μ**′, **σ**′_{1}, **σ**′_{2})′. Let vec(**A**) be the *p*^{2}-dimensional vector formed by stacking the columns of **A**. Then there exists a unique *p*^{2} × *p** matrix **D**_{p} such that vec(**A**) = **D**_{p}vech(**A**) and vech(**A**) = **D**^{+}_{p}vec(**A**), where *p** = *p*(*p* + 1)/2 and **D**^{+}_{p} = (**D**′_{p}**D**_{p})^{−1}**D**′_{p} is the generalized inverse of **D**_{p}. These notations were systematically introduced in Magnus and Neudecker (1999). We use a hat to denote a parameter estimate, as in **β̂**.

A function with a dot on top denotes its derivative or Jacobian matrix—for example, **σ̇**_{1}(**θ**_{1}) = ∂**σ**_{1}(**θ**_{1})/∂**θ**′_{1}. When a function is evaluated at the population value of the parameter, we often omit the argument. There are four probabilistic notations: *t*_{n} = *o*_{p}(*a*_{n}), which means that *t*_{n}/*a*_{n} approaches zero in probability as *n* approaches infinity; *t*_{n} = *O*_{p}(*a*_{n}), which means that *t*_{n}/*a*_{n} is bounded in probability; →_{P}, which denotes convergence in probability; and →_{L}, which denotes convergence in distribution. Parallel nonstochastic notations are without the subscript *p*. We will use **0** to denote a vector or matrix of zeros and **I** to denote the identity matrix; subscripts will be used to indicate the dimensions when they are not clear.
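The duplication matrix **D**_{p} is fully determined by its defining property vec(**A**) = **D**_{p}vech(**A**) for symmetric **A**. The sketch below is ours (helper names are our own); it constructs **D**_{p} for any *p* and verifies both identities, including the generalized inverse **D**^{+}_{p} = (**D**′_{p}**D**_{p})^{−1}**D**′_{p}.

```python
import numpy as np

def vech(A):
    """Stack the columns of a symmetric matrix, dropping elements above the diagonal."""
    p = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(p)])

def duplication_matrix(p):
    """The unique p^2 x p(p+1)/2 matrix D_p with vec(A) = D_p @ vech(A)."""
    pstar = p * (p + 1) // 2
    D = np.zeros((p * p, pstar))
    col = 0
    for j in range(p):            # column index of A
        for i in range(j, p):     # row index, lower triangle including diagonal
            D[j * p + i, col] = 1.0   # position of a_ij in column-major vec(A)
            D[i * p + j, col] = 1.0   # its symmetric twin a_ji (same entry if i == j)
            col += 1
    return D

p = 3
A = np.arange(1.0, 10.0).reshape(p, p)
A = (A + A.T) / 2                        # symmetrize
D = duplication_matrix(p)
Dplus = np.linalg.inv(D.T @ D) @ D.T     # generalized inverse (D'D)^{-1} D'

vecA = A.T.ravel()                       # vec(A): stack the columns of A
assert D.shape == (9, 6)
assert np.allclose(D @ vech(A), vecA)
assert np.allclose(Dplus @ vecA, vech(A))
```

Since each column of **D**_{p} touches positions of vec(**A**) that no other column touches, **D**′_{p}**D**_{p} is diagonal and trivially invertible, which is why the generalized inverse has the simple closed form used above.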

Let **A** be the expected information matrix of the log-likelihood and **B** be the covariance matrix of the corresponding score vector, both evaluated at **β**_{0}. Then the asymptotic distribution of the MLE **β̂** given in Yuan and Bentler (2002) is

- √*J*(**β̂** − **β**_{0}) →_{L} *N*(**0**, **ϒ**),   (2)

where **ϒ** = **A**^{−1}**B****A**^{−1}. A consistent estimate of **ϒ** can be obtained by replacing **A** and **B** with consistent estimates **Â** and **B̂**, respectively.

Yuan and Bentler (2006) studied the asymptotic robustness of standard errors of the simultaneous estimators. Their results imply that **σ̂**_{1} converges to **σ**_{10} at the speed of 1/√(*N* − *J*), while **σ̂**_{2} converges to **σ**_{20} at the speed of 1/√*J*. Thus we should normalize the distributions of **σ̂**_{1} and **σ̂**_{2} separately. Partition **ϒ** conformably with **β** = (**μ**′, **σ**′_{1}, **σ**′_{2})′, where **ϒ**_{μμ} corresponds to **μ̂**, **ϒ**_{11} corresponds to **σ̂**_{1}, and **ϒ**_{22} corresponds to **σ̂**_{2}. Then we have

- √(*N* − *J*)(**σ̂**_{1} − **σ**_{10}) →_{L} *N*(**0**, **Γ**_{11})   (3)

and

- √*J*(**σ̂**_{2} − **σ**_{20}) →_{L} *N*(**0**, **Γ**_{22}),   (4)

where **Γ**_{11} and **Γ**_{22} are the correspondingly rescaled versions of **ϒ**_{11} and **ϒ**_{22}, and n̄ = *N*/*J* is the average level-1 sample size. We also have the asymptotic covariance matrix between **σ̂**_{1} and **σ̂**_{2}.
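Equations (3) and (4) assign different effective sample sizes to the two levels: *N* − *J* observations inform **Σ**_{1}, but only *J* cluster means inform **Σ**_{2}. A quick simulation (ours; it uses simple moment estimators rather than the MLE, which suffices to show the consistency targets) illustrates this: the pooled within-cluster covariance converges to **Σ**_{1}, and the covariance matrix of the cluster means converges to **Σ**_{2} + **Σ**_{1}/*n*.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, J = 2, 5, 4000                 # balanced design: J clusters of size n
Sigma1 = np.array([[1.0, 0.4], [0.4, 1.0]])   # within (level-1) covariance
Sigma2 = np.array([[0.6, 0.2], [0.2, 0.6]])   # between (level-2) covariance

v = rng.multivariate_normal(np.zeros(p), Sigma2, size=J)        # v_j
u = rng.multivariate_normal(np.zeros(p), Sigma1, size=(J, n))   # u_ij
y = v[:, None, :] + u                 # y_ij = mu + v_j + u_ij, with mu = 0

ybar = y.mean(axis=1)                 # J x p matrix of cluster means
within = y - ybar[:, None, :]
# Pooled within-cluster covariance: divisor N - J, the level-1 sample size role.
S_pw = np.einsum('jna,jnb->ab', within, within) / (J * (n - 1))
# Covariance of the J cluster means: consistent for Sigma2 + Sigma1 / n.
S_b = np.cov(ybar, rowvar=False)

assert np.allclose(S_pw, Sigma1, atol=0.05)
assert np.allclose(S_b, Sigma2 + Sigma1 / n, atol=0.08)
```

With *J* fixed and *n* growing, S_pw keeps improving while S_b does not, which is exactly the rate difference between (3) and (4).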

#### 2.2. *The Level-1 Structural Model*

With the above preparation, we are ready to fit the level-1 and level-2 models separately. Suppose *Σ*_{1}(**θ**) is the level-1 structural model and *Σ*_{2}(**θ**) is the level-2 structural model. We may denote **θ** = (**γ**′_{1}, **γ**′_{c}, **γ**′_{2})′, where **Σ**_{1} = *Σ*_{1}(**γ**_{1}, **γ**_{c}) and **Σ**_{2} = *Σ*_{2}(**γ**_{2}, **γ**_{c}). This means that **γ**_{c} contains the set of common parameters, if any, that appear in both the level-1 and level-2 structures. Because **Σ̂**_{1} is consistent for **Σ**_{10}, we start by estimating **θ**_{1} = (**γ**′_{1}, **γ**′_{c})′ by minimizing the normal-theory ML discrepancy function

- *F*_{ML1}(**θ**_{1}) = tr[**Σ̂**_{1}**Σ**^{−1}_{1}(**θ**_{1})] − log|**Σ̂**_{1}**Σ**^{−1}_{1}(**θ**_{1})| − *p*.

Denote the estimator as **θ̂**_{1}. It then follows from (3) that

- √(*N* − *J*)(**θ̂**_{1} − **θ**_{10}) →_{L} *N*(**0**, **Ω**_{1}),   (5a)

where

- **Ω**_{1} = (**σ̇**′_{1}**W**_{1}**σ̇**_{1})^{−1}**σ̇**′_{1}**W**_{1}**Γ**_{11}**W**_{1}**σ̇**_{1}(**σ̇**′_{1}**W**_{1}**σ̇**_{1})^{−1},   (5b)

with **W**_{1} = 2^{−1}**D**′_{p}[**Σ**^{−1}_{1}(**θ**_{1}) ⊗ **Σ**^{−1}_{1}(**θ**_{1})]**D**_{p} and **σ̇**_{1} = ∂**σ**_{1}(**θ**_{1})/∂**θ**′_{1}, both evaluated at **θ**_{10}.

When data are balanced (*n*_{1} = *n*_{2} = … = *n*_{J}) and normally distributed, results in Yuan and Bentler (2006) imply that **Γ**_{11} = **W**^{−1}_{1}. Thus, **Ω**_{1} simplifies to (**σ̇**′_{1}**W**_{1}**σ̇**_{1})^{−1}, which is just the inverse of the normal distribution based information matrix when fitting the model *Σ*_{1}(**θ**_{1}) in conventional CSA. When data are normally distributed but unbalanced, we might assume that the level-1 sample sizes *n*_{j} are uniformly distributed on the interval [*n*_{a} + 1, *n*_{a} + *J*]. Results in Yuan and Bentler (2006) then imply that

- (6)

Thus, when either *J* approaches infinity or n̄ approaches infinity, (**σ̇**′_{1}**Ŵ**_{1}**σ̇**_{1})^{−1} still provides the correct asymptotic covariance matrix of **θ̂**_{1}. More generally, a consistent estimator of **Ω**_{1} follows when **θ**_{1} is replaced by **θ̂**_{1} and **Γ**_{11} is replaced by **Γ̂**_{11}.
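The simplification just described, in which the covariance matrix of the estimator collapses to the information inverse when **Γ**_{11} = **W**^{−1}_{1}, can be checked numerically. The sketch below is ours, with random stand-ins for the Jacobian and weight matrix, and assumes the usual sandwich form **Ω** = (**σ̇**′**W****σ̇**)^{−1}**σ̇**′**W****Γ****W****σ̇**(**σ̇**′**W****σ̇**)^{−1}.

```python
import numpy as np

rng = np.random.default_rng(2)
pstar, q = 6, 3
sdot = rng.standard_normal((pstar, q))       # stand-in for the Jacobian sigma-dot
M = rng.standard_normal((pstar, pstar))
W = M @ M.T + pstar * np.eye(pstar)          # symmetric positive definite weight

def sandwich(sdot, W, Gamma):
    """Omega = (sdot' W sdot)^{-1} sdot' W Gamma W sdot (sdot' W sdot)^{-1}."""
    bread = np.linalg.inv(sdot.T @ W @ sdot)
    return bread @ (sdot.T @ W @ Gamma @ W @ sdot) @ bread

info_inv = np.linalg.inv(sdot.T @ W @ sdot)  # normal-theory information inverse
Omega = sandwich(sdot, W, np.linalg.inv(W))  # set Gamma = W^{-1}

assert np.allclose(Omega, info_inv)
```

With any other **Γ**, the sandwich and the information inverse differ, which is why the distribution-free **Γ̂**_{11} matters for nonnormal data.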

Notice that *N* − *J* → ∞ implies n̄ → ∞ or *J* → ∞. When data are normally distributed and the level-1 sample sizes are uniformly distributed on [*n*_{a} + 1, *n*_{a} + *J*], (6) implies that the statistic

- *T*_{ML1} = (*N* − *J*)*F*_{ML1}(**θ̂**_{1})   (7)

will approach χ^{2}_{*p**−*q*_{1}} when *N* − *J* approaches infinity, where *q*_{1} is the number of parameters in **θ**_{1}. Notice that fitting **Σ̂**_{1} to *Σ*_{1}(**θ**_{1}) is just conventional CSA. Thus, many techniques for model evaluation and checking are available even when data are not normally distributed. We will adapt several. In particular, the residual-based ADF statistic (see Browne 1984) is given by

- *T*_{RADF1} = (*N* − *J*)**ê**′_{1}[**Γ̂**^{−1}_{11} − **Γ̂**^{−1}_{11}**σ̇**_{1}(**σ̇**′_{1}**Γ̂**^{−1}_{11}**σ̇**_{1})^{−1}**σ̇**′_{1}**Γ̂**^{−1}_{11}]**ê**_{1},   (8)

where **ê**_{1} = **σ̂**_{1} − **σ**_{1}(**θ̂**_{1}) and **σ̇**_{1} is evaluated at **θ̂**_{1}. Notice that the matrix **Γ̂**_{11} in (8) is obtained through (3) and (2), where **ϒ** is estimated through **Â** and **B̂**. Because both **Â** and **B̂** are averages of *J* random terms, the sampling error in **Γ̂**_{11} is of magnitude *O*_{p}(1/*J*), not *O*_{p}(1/(*N* − *J*)). Thus, the corrected residual-based ADF statistic (Yuan and Bentler 1998) is given by

- *T*_{CRADF1} = *T*_{RADF1}/(1 + *T*_{RADF1}/*J*).   (9)

Notice that both *T*_{RADF1} and *T*_{CRADF1} asymptotically approach χ^{2}_{*p**−*q*_{1}}. The residual-based *F*-statistic (Yuan and Bentler 1998) is given by

- *F*_{R1} = [*J* − (*p** − *q*_{1})]*T*_{RADF1}/[(*J* − 1)(*p** − *q*_{1})],   (10)

which is also asymptotically distribution free when referred to the *F*-distribution with degrees of freedom *p** − *q*_{1} and *J* − (*p** − *q*_{1}). Let

**Û**_{1} = **Ŵ**_{1} − **Ŵ**_{1}**σ̇**_{1}(**σ̇**′_{1}**Ŵ**_{1}**σ̇**_{1})^{−1}**σ̇**′_{1}**Ŵ**_{1}.

Then the rescaled statistic (see Satorra and Bentler 1994) is given by

- *T*_{RML1} = (*p** − *q*_{1})*T*_{ML1}/tr(**Û**_{1}**Γ̂**_{11}).   (11)

*T*_{RML1} is not asymptotically distribution free, and it approaches a distribution with mean equal to *p** − *q*_{1}. In conventional SEM, *T*_{CRADF1}, *F*_{R1}, and *T*_{RML1} have been shown to perform reasonably well (see Bentler and Yuan 1999; Yuan and Bentler 1998). More studies of these statistics in the context of multilevel SEM will be conducted in Section 4. Because fitting *Σ*_{1}(**θ**_{1}) to **Σ̂**_{1} is just conventional SEM, fit indices such as CFI and RMSEA can be computed in the usual way (e.g., see Hu and Bentler 1999), as in the NELS-based School Data example to be discussed further in Section 3.
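Assuming the Yuan–Bentler (1998) forms with *J* in the sample-size role, namely *T*_{CRADF} = *T*_{RADF}/(1 + *T*_{RADF}/*J*) and *F*_{R} = (*J* − *d*)*T*_{RADF}/[(*J* − 1)*d*] with *d* = *p** − *q*, the corrected and *F*-type statistics follow from *T*_{RADF} by simple arithmetic. Applying these forms (our computation) to the School Data residual-based statistics reported in Table 1 of Section 3 reproduces the tabled *T*_{CRADF} and *F*_{R} entries.

```python
J = 235                      # number of schools (level-2 sample size)

def corrected_adf(T_radf, J):
    """Corrected residual-based ADF statistic: T_RADF / (1 + T_RADF / J)."""
    return T_radf / (1.0 + T_radf / J)

def f_statistic(T_radf, d, J):
    """Residual-based F-statistic, referred to F(d, J - d)."""
    return (J - d) * T_radf / ((J - 1.0) * d)

# Level 1: T_RADF1 = 54.148 with d = p* - q1 = 19 degrees of freedom.
print(round(corrected_adf(54.148, J), 3))    # 44.008, matching Table 1
print(round(f_statistic(54.148, 19, J), 3))  # 2.631
# Level 2: T_RADF2 = 30.604 with d = 19.
print(round(corrected_adf(30.604, J), 3))    # 27.078
print(round(f_statistic(30.604, 19, J), 3))  # 1.487
```

The agreement with the published table is a useful sanity check that *J*, not *N* − *J*, drives the small-sample corrections, since the sampling error in **Γ̂** is of order 1/*J*.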

#### 2.3. *The Level-2 Structural Model*

We now turn to fitting the level-2 model *Σ*_{2}(**θ**_{2}) with **θ**_{2} = (**γ**′_{2}, **γ**′_{c})′, either fixing **γ**_{c} at the estimate obtained from the level-1 fit or just letting it be free. We first consider the latter, minimizing the discrepancy function

- *F*_{ML2}(**θ**_{2}) = tr[**Σ̂**_{2}**Σ**^{−1}_{2}(**θ**_{2})] − log|**Σ̂**_{2}**Σ**^{−1}_{2}(**θ**_{2})| − *p*,

and denote the estimator by **θ̂**_{2}. It follows from (4) that the asymptotic distribution of **θ̂**_{2} is characterized by

- √*J*(**θ̂**_{2} − **θ**_{20}) →_{L} *N*(**0**, **Ω**_{2}),   (12a)

where

- **Ω**_{2} = (**σ̇**′_{2}**W**_{2}**σ̇**_{2})^{−1}**σ̇**′_{2}**W**_{2}**Γ**_{22}**W**_{2}**σ̇**_{2}(**σ̇**′_{2}**W**_{2}**σ̇**_{2})^{−1},   (12b)

with **W**_{2} = 2^{−1}**D**′_{p}[**Σ**^{−1}_{2}(**θ**_{2}) ⊗ **Σ**^{−1}_{2}(**θ**_{2})]**D**_{p} and **σ̇**_{2} = ∂**σ**_{2}(**θ**_{2})/∂**θ**′_{2}.

When data are balanced (*n*_{1} = *n*_{2} = … = *n*_{J} = *n*) and normally distributed, results in Yuan and Bentler (2006) imply that **Γ**_{22} = **W**^{−1}_{2} + *O*(1/*n*). Thus, when *n* approaches infinity, **Ω**_{2} simplifies to (**σ̇**′_{2}**W**_{2}**σ̇**_{2})^{−1}, which is just the inverse of the normal distribution-based information matrix in conventional CSA. When data are normally distributed but unbalanced, assuming that the level-1 sample sizes *n*_{j} are uniformly distributed on the interval [*n*_{a} + 1, *n*_{a} + *J*], results in Yuan and Bentler (2006) imply that

- (13)

Thus (**σ̇**′_{2}**Ŵ**_{2}**σ̇**_{2})^{−1} might be used to approximate **Ω**_{2} when *J* is huge. Standard errors for **θ̂**_{2} should be evaluated according to (12b) for general normal or nonnormal data.

Similarly, fitting *Σ*_{2}(**θ**_{2}) to **Σ̂**_{2} is again an application of conventional CSA, and all the techniques for model evaluation and diagnostics in the SEM literature can be utilized. Statistics parallel to those in (7) through (11) can also be constructed for testing the model *Σ*_{2}(**θ**_{2}). These will be denoted by *T*_{ML2}, *T*_{RADF2}, *T*_{CRADF2}, *F*_{R2}, and *T*_{RML2}. When data are normally distributed and the *n*_{j} are uniformly distributed on [*n*_{a} + 1, *n*_{a} + *J*], (13) implies that

- (14)

where *q*_{2} is the number of free parameters in *Σ*_{2}(**θ**_{2}). The statistics *T*_{RADF2}, *T*_{CRADF2}, and *F*_{R2} are asymptotically distribution-free as *J* → ∞. It is also straightforward to obtain fit indices when evaluating the model *Σ*_{2}(**θ**_{2}).

#### 2.4. *Checking the Common Parameters of Level 1 and Level 2*

Due to sampling error, the estimate of **γ**_{c} obtained from fitting the level-1 model will almost surely not equal the estimate obtained from fitting the level-2 model. The two estimators may correspond to different population values as well. In such a situation, we may denote the level-1 and level-2 models by *Σ*_{1}(**γ**_{1}, **γ**_{c1}) and *Σ*_{2}(**γ**_{2}, **γ**_{c2}). However, the two estimates should be near each other when both *Σ*_{1}(**θ**_{1}) and *Σ*_{2}(**θ**_{2}) are correctly specified and when **γ**_{c10} = **γ**_{c20}. When either or both of the models are misspecified, it is very likely that the two estimates will converge to different values even when **γ**_{c10} = **γ**_{c20} for correctly specified models (Yuan, Marshall, and Bentler 2003). It is also very likely that both models *Σ*_{1}(**γ**_{1}, **γ**_{c1}) and *Σ*_{2}(**γ**_{2}, **γ**_{c2}) are correctly specified when **γ**_{c10} = **γ**_{c20} is not required. In order to evaluate this situation, we propose checking whether the two estimates **γ̂**_{c1} = **L**_{1}**θ̂**_{1} and **γ̂**_{c2} = **L**_{2}**θ̂**_{2} of **γ**_{c} agree statistically, where **L**_{1} and **L**_{2} are selection matrices such that **γ**_{c} = **L**_{1}**θ**_{1} = **L**_{2}**θ**_{2}. When **γ**_{c10} = **γ**_{c20}, Appendix A gives the outline leading to

- √*J*(**γ̂**_{c1} − **γ̂**_{c2}) →_{L} *N*(**0**, **Ω**_{c}),   (15)

where **Ω**_{c} is the asymptotic covariance matrix of the difference, which can be estimated consistently by **Ω̂**_{c}. The Wald-type statistic for testing **γ**_{c10} = **γ**_{c20} is

- *T*_{c} = *J*(**γ̂**_{c1} − **γ̂**_{c2})′**Ω̂**^{−1}_{c}(**γ̂**_{c1} − **γ̂**_{c2}),

which is asymptotically chi-square distributed with degrees of freedom equal to the number of elements in **γ**_{c}. If *T*_{c} is statistically significant while both *Σ*_{1}(**θ**_{1}) and *Σ*_{2}(**θ**_{2}) are statistically adequate, we may conclude that **γ**_{c10} ≠ **γ**_{c20}.
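Operationally, the Wald test only needs the difference of the two estimates of the common parameters and a consistent estimate of the covariance matrix of that difference; the quadratic form is then referred to a chi-square with degrees of freedom equal to the number of common parameters. A generic sketch (ours; the numerical inputs below are hypothetical placeholders, not School Data quantities):

```python
import numpy as np
from scipy.stats import chi2

def wald_test(diff, cov_diff):
    """Wald statistic d' C^{-1} d for H0: gamma_c10 = gamma_c20, with chi2 p-value.

    diff: estimated gamma_c1 - gamma_c2; cov_diff: estimated covariance of diff.
    """
    diff = np.asarray(diff, dtype=float)
    T_c = float(diff @ np.linalg.solve(cov_diff, diff))
    return T_c, chi2.sf(T_c, diff.size)

# Hypothetical inputs: two common loadings differing by (0.3, -0.2), with the
# covariance matrix of the difference estimated as 0.01 * I.
T_c, p_value = wald_test([0.3, -0.2], 0.01 * np.eye(2))
print(round(T_c, 1), round(p_value, 4))   # 13.0 0.0015
```

Writing the statistic in terms of the covariance of the difference absorbs the √*J* normalization of (15) into the covariance estimate, so no explicit scaling factor is needed.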

#### 2.5. *Fitting the Level-2 Model Using the Parameter Estimates Obtained at Level 1*

When **γ**_{c10} = **γ**_{c20} = **γ**_{c0}, the level-1 estimate **γ̂**_{c1} is consistent for **γ**_{c0}. If there is a substantive reason to believe that **γ**_{c10} = **γ**_{c20}, then we may not want to test this equality. In such a case, we may consider minimizing the objective function *F*_{ML2}(**γ**_{2}, **γ̂**_{c1}) for **γ**_{2}, holding **γ**_{c} fixed at **γ̂**_{c1}. Let the estimator be **γ̂**_{2}, and denote the resulting mixed estimator by **θ̂**_{m2} = (**γ̂**′_{2}, **γ̂**′_{c1})′. Appendix A provides the details leading to

- (16a)

where

- (16b)

with **Δ**_{2} as given in Appendix A. Notice that the dependence of the distribution of **γ̂**_{2} on **γ̂**_{c1} is reflected by the last three terms in **Δ**_{2}. When **γ**_{c} is known instead of being estimated, the last three terms in **Δ**_{2} disappear. Also notice that the dependence of **γ̂**_{2} on **γ̂**_{c1} is inversely proportional to the average level-1 sample size n̄. When n̄ is large, the influence of **γ̂**_{c1} on **γ̂**_{2} will be small.

We next turn to test statistics for evaluating the model *Σ*_{2}(**γ**_{2}, **γ**_{c}) based on the mixed parameter estimates **θ̂**_{m2}. Notice that **σ̇**_{2} is a *p** × *q*_{2} matrix. Let **σ̇**_{2c} be the *p** × (*p** − *q*_{2}) full rank matrix whose columns are orthogonal to those of **σ̇**_{2}. Let

- (17)

where the remaining quantities are evaluated at **θ̂**_{m2}. Under the assumptions **γ**_{c10} = **γ**_{c20} and *Σ*_{2}(**θ**_{0}) = **Σ**_{20}, Appendix A contains the details leading to

- (18)

We need to evaluate **σ̇**_{2c} to obtain *T*_{RADFm2} in (17). It follows from lemma 1 of Khatri (1966), however, that *T*_{RADFm2} can be rewritten in a form that does not require the calculation of **σ̇**_{2c}. Similarly, the corrected ADF statistic and *F*-statistic are given by

- *T*_{CRADFm2} = *T*_{RADFm2}/(1 + *T*_{RADFm2}/*J*)

and

- *F*_{Rm2} = [*J* − (*p** − *q*_{2})]*T*_{RADFm2}/[(*J* − 1)(*p** − *q*_{2})],

which are also asymptotically distribution-free when referred to χ^{2}_{*p**−*q*_{2}} and the *F*-distribution with degrees of freedom *p** − *q*_{2} and *J* − (*p** − *q*_{2}), respectively.

A rescaled statistic can also be constructed for the mixed estimates. Let

- (19)

Appendix A provides the details showing that

- (20)

approaches a distribution with a mean equal to *p** − *q*_{2}. Notice that the quantities in (19) and (20) are evaluated at **θ̂**_{m2}.

#### 2.6. *Efficiency Consideration*

The previous subsections provided asymptotic covariance matrices for the level-1 and level-2 parameter estimates. We may want to know the efficiency of the estimators in the separate procedure compared to those in the simultaneous procedure. Because normal-theory ML is used in estimating the parameters, we can analytically compare the efficiency of the different parameter estimates only when data are normal. For simplicity, we assume that **θ**_{1} and **θ**_{2} do not contain any overlapping parameters when comparing the efficiency of the simultaneous estimators with that of **θ̂**_{1} and **θ̂**_{2}. We will also compare the efficiency of the two estimators of the common parameter when **θ**_{1} and **θ**_{2} share **γ**_{c}. Throughout, we assume that the level-1 sample sizes *n*_{j} are uniformly distributed on the interval [*n*_{a} + 1, *n*_{a} + *J*].

For the simultaneous estimators, results in Yuan and Bentler (2006) imply that

- (21)

and

- (22)

Comparing (5) and (21), with **Γ**_{11} given by (6), the estimator **θ̂**_{1} in the separate procedure will have the same asymptotic efficiency as the simultaneous level-1 estimator when either n̄ → ∞ or *J* → ∞. Similarly, **θ̂**_{2} and the simultaneous level-2 estimator have the same asymptotic efficiency when *J* → ∞. This implies that, although the simultaneous estimators are asymptotically most efficient, the difference in efficiency between the separate estimators and the simultaneous estimators will be small when n̄ or *J* is large.

We can also compare the asymptotic efficiency of the mixed estimator of Section 2.5 with that of the estimator obtained by fitting the level-2 model alone. It follows from (12b) and (13) that the two asymptotic covariance matrices can be put in comparable forms. Then

- (23)

Notice the relation to **Δ**_{2} as given in (16b). When n̄ is large, we obtain

- (24)

Comparing (23) with (24), the mixed estimator should be more efficient when n̄ is large. This is because it relies on **γ̂**_{c1}, whose precision is governed by **Γ**_{11} and is based on data with a sample size equivalent to *N* − *J*, whereas the alternative relies on an estimate governed by **Γ**_{22}, based on a sample size equivalent to *J*.

### 3. THE SCHOOL DATA EXAMPLE


Continuing with the School Data example, five statistics for evaluating the overall model in the simultaneous procedure are given by the first row of numbers in panel (a) of Table 1; the second row shows the *p*-values obtained from referring the statistics to either χ^{2}_{38} or *F*_{38,197}. As with *T*_{ML} reported in Section 1, the other four statistics are also significant, and they do not tell us whether the significance is caused by a misspecification in the level-1 model, the level-2 model, or both. It is also not obvious how to obtain fit indices to evaluate the goodness of fit of the models in approximating the population covariance matrices, as is regularly done in conventional SEM.

(a) Simultaneous Procedure

|       | *T*_{ML} | *T*_{RML} | *T*_{RADF} | *T*_{CRADF} | *F*_{R} |
|-------|----------|-----------|------------|-------------|---------|
| *T*   | 77.968   | 80.565    | 102.672    | 71.454      | 2.275   |
| *p*   | 0.000    | 0.000     | 0.000      | 0.001       | 0.000   |

(b) Separate Procedure

| Level 1 | *T*_{ML1} | *T*_{RML1} | *T*_{RADF1} | *T*_{CRADF1} | *F*_{R1} |
|---------|-----------|------------|-------------|--------------|----------|
| *T*     | 46.206    | 45.559     | 54.148      | 44.008       | 2.631    |
| *p*     | 0.000     | 0.001      | 0.000       | 0.001        | 0.000    |
| CFI     | 0.998     | 0.996      | 0.976       | 0.857        |          |
| RMSEA   | 0.017     | 0.017      | 0.019       | 0.016        |          |

| Level 2 | *T*_{ML2} | *T*_{RML2} | *T*_{RADF2} | *T*_{CRADF2} | *F*_{R2} |
|---------|-----------|------------|-------------|--------------|----------|
| *T*     | 351.273   | 35.278     | 30.604      | 27.078       | 1.487    |
| *p*     | 0.000     | 0.013      | 0.045       | 0.103        | 0.092    |
| CFI     | 0.909     | 0.988      | 0.920       | 0.887        |          |
| RMSEA   | 0.273     | 0.060      | 0.051       | 0.043        |          |

Using the model segregation procedure developed in the previous section, test statistics and their *p*-values, as well as the fit indices CFI and RMSEA, for the level-1 and level-2 models are provided in panel (b) of Table 1, where the *p*-values are obtained by referring the statistics to either χ^{2}_{19} or *F*_{19,216}. Fit indices based on the *F*-statistic have not been defined and thus are not available. All the statistics at level 1 are also highly significant; the CFIs based on *T*_{ML1}, *T*_{RML1}, and *T*_{RADF1} are above 0.95; the RMSEAs based on all the statistics are well below the accepted threshold value of 0.05. Thus, the misspecification in the level-1 model is tiny, and the significance is mainly due to the large *N* − *J* = 4963, which plays the role of sample size in analyzing the level-1 model alone. The CFI based on *T*_{CRADF1} being below 0.90 does not imply that the model fits the data poorly. Actually, *T*_{CRADF1} corresponds to the greatest *p*-value and the smallest RMSEA. The inconsistency occurs because these statistics are not equivalent when the model is misspecified, especially at the base model (i.e., the independence model) on which the CFI is based. More discussion of the inconsistency of fit indices based on different statistics is provided by Sugawara and MacCallum (1993) and Yuan and Chan (2005). Comparing the statistics indicates that nonnormality has little effect on the level-1 model evaluation.

Panel (b) in Table 1 also contains the statistics and fit indices for evaluating the level-2 model alone. *T*_{CRADF2} corresponds to the largest *p*-value and the smallest RMSEA, while *T*_{RML2} corresponds to the largest CFI; according to the commonly used threshold values, these indices imply that the model fits the data reasonably well. Comparing the models at level 1 and level 2, *Σ*_{2}(**θ**) is most likely further from **Σ**_{20} than *Σ*_{1}(**θ**) is from **Σ**_{10}. Due to the much smaller sample size at level 2, the statistics *T*_{RML2}, *T*_{RADF2}, *T*_{CRADF2}, and *F*_{R2} are only marginally significant. There is a greater difference between *T*_{ML2} and the other statistics at level 2, implying that the nonnormality in the data has a greater effect on level-2 model evaluation. This effect is not obvious when simultaneously evaluating the two-level model.

Although the degrees of freedom of the chi-square statistics obtained from simultaneously evaluating the level-1 and level-2 models equal the sum of the level-1 and level-2 degrees of freedom, the corresponding chi-square statistics do not obey the same rule. That is, adding the level-1 and level-2 statistics does not yield the corresponding statistic for overall model evaluation.

Muthén et al. (1997) noted that the level-1 and level-2 factor loadings are not equal because those at level 1 reflect “the measurement characteristics of the achievement variables on the student level” while those at level 2 reflect “school-level selection and quality of instruction.” Actually, the test statistic *T*_{c} for the equality of the factor loadings is 124.300, which is highly significant when referred to χ^{2}_{8}. In such a situation, statistics based on a mix of the two types of parameter estimates would be misleading and thus are not calculated.

### 4. MONTE CARLO RESULTS


In this section, we contrast the separate and simultaneous model-fitting procedures by Monte Carlo simulation. We use the same model as in the School Data example, that is, a confirmatory factor model at levels 1 and 2. Three populations are specified in generating the data. The first population can be represented by (1) with

- ((25a))

where

- ((25b))

- ((25c))

**Ψ**_{1}= Cov(**e**_{1ij}) and **Ψ**_{2}= Cov(**e**_{2j}) are diagonal matrices chosen so that all the diagonal elements of **Σ**_{10} equal 2.0 and **Σ**_{20} is a correlation matrix. When fitting model (25), the free covariance parameters at each level are the eight factor loadings, the one factor correlation, and the eight error variances. The model is correctly specified in this case. The second population is constructed by adding a factor loading λ^{(1)}_{12}= 0.1 to **Λ**_{1}= (λ^{(1)}_{ij}), in addition to those in (25), when generating **u**_{ij}, while **v**_{j} is the same as specified in (25). The analysis model excludes this extra parameter, so a misspecified overall model is created. The third population is obtained by adding an additional factor loading λ^{(2)}_{12}= 0.4 to **Λ**_{2}= (λ^{(2)}_{ij}) when generating **v**_{j}, while keeping **u**_{ij} the same as specified in (25). Again, the extra parameter is ignored in the analysis model, and thus a misspecified overall model is created. The purpose is to see how these misspecifications affect the test statistics when simultaneously evaluating the two-level model as well as when evaluating the level-1 and level-2 models separately. For example, our results in the previous sections imply that misspecifications at level 2 will not systematically affect level-1 model evaluation when using the separate model-fitting procedure, whereas a misspecification at either level will affect the simultaneous model-fitting procedure. For certain misspecifications at a given level, we also expect the test statistics to have better power when evaluating that level alone. We will see whether the Monte Carlo results verify this expectation. Although the School Data example partially verifies some of the results, in the simulation we know whether the model is correct or misspecified, as well as at which level it is misspecified, which allows us to examine the performance of the various statistics.
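To make the design concrete, the following Python sketch generates data with the same two-level composition **y**_{ij}=**v**_{j}+**u**_{ij} and the same cluster sizes as the simulation (*J*= 200, *n*_{j}= 5 +*j*). The loading and covariance values below are placeholders, since the numerical entries of (25) are not reproduced here; only the structure (eight variables and two factors at each level) follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder loading matrices and covariances: the actual values in (25)
# are not reproduced here; only the 8-variable, 2-factor structure is kept.
Lam1 = np.zeros((8, 2)); Lam1[:4, 0] = 0.7; Lam1[4:, 1] = 0.7
Lam2 = np.zeros((8, 2)); Lam2[:4, 0] = 0.5; Lam2[4:, 1] = 0.5
Phi1 = np.array([[1.0, 0.3], [0.3, 1.0]])
Phi2 = np.array([[1.0, 0.3], [0.3, 1.0]])
Psi1 = 0.5 * np.eye(8)   # diagonal level-1 error covariance
Psi2 = 0.3 * np.eye(8)   # diagonal level-2 error covariance

def gen_cluster(n_j):
    """y_ij = v_j + u_ij: v_j = Lam2 f_2j + e_2j is shared by the whole
    cluster, while u_ij = Lam1 f_1ij + e_1ij varies over its members."""
    v = Lam2 @ rng.multivariate_normal(np.zeros(2), Phi2) \
        + rng.multivariate_normal(np.zeros(8), Psi2)
    u = (rng.multivariate_normal(np.zeros(2), Phi1, size=n_j) @ Lam1.T
         + rng.multivariate_normal(np.zeros(8), Psi1, size=n_j))
    return v + u  # n_j x 8 matrix of observations for one cluster

data = [gen_cluster(5 + j) for j in range(1, 201)]  # J = 200, n_j = 5 + j
N = sum(len(d) for d in data)
print(N)  # 21100
```

With these cluster sizes the total level-1 sample size matches the *N*= 21100 used in the study.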

Three distribution conditions are used. In the first condition, **u**_{ij} is generated by **f**_{1ij}∼*N*_{2}(**0**, *Φ*_{1}) and **e**_{1ij}∼*N*_{8}(**0**, **Ψ**_{1}); **v**_{j} is generated by **f**_{2j}∼*N*_{2}(**0**, *Φ*_{2}) and **e**_{2j}∼*N*_{8}(**0**, **Ψ**_{2}). So **y**_{ij} follows a multivariate normal distribution. In the second condition, **f**_{1ij}=*Φ*^{1/2}_{1}**z**_{ij2}/*r*_{ij} with **z**_{ij2}∼ lognormal(**0**, **I**_{2}),^{3} and **e**_{1ij}=*Ψ*^{1/2}_{1}**z**_{ij8}/*r*_{ij} with **z**_{ij8}∼ lognormal(**0**, **I**_{8}) and *r*_{ij}∼ (χ^{2}_{5}/3)^{1/2}; **f**_{2j}∼*N*_{2}(**0**, *Φ*_{2}) and **e**_{2j}∼*N*_{8}(**0**, **Ψ**_{2}). The random variables in **z**_{ij2}, **z**_{ij8}, and *r*_{ij} are independent. The resulting **u**_{ij} has a skewed distribution with heavy tails. Notice that **f**_{1ij} and **e**_{1ij} are not independent because they share the common *r*_{ij}. In the third condition, **f**_{1ij}∼*N*_{2}(**0**, *Φ*_{1}) and **e**_{1ij}∼*N*_{8}(**0**, **Ψ**_{1}); **f**_{2j}=*Φ*^{1/2}_{2}**z**_{j2}/*r*_{j} with **z**_{j2}∼ lognormal(**0**, **I**_{2}), and **e**_{2j}=*Ψ*^{1/2}_{2}**z**_{j8}/*r*_{j} with **z**_{j8}∼ lognormal(**0**, **I**_{8}) and *r*_{j}∼ (χ^{2}_{5}/3)^{1/2}. The random variables in **z**_{j2}, **z**_{j8}, and *r*_{j} are independent. The resulting **v**_{j} is skewed and has heavy tails. The three distribution conditions are designed to check how the different statistics are affected by nonnormal data. Specifically, we want to see whether the level-1 statistics are particularly affected by nonnormality of the level-1 random components and whether the level-2 statistics are particularly affected by nonnormality of the level-2 random components.

Combining the three model conditions with the three distribution conditions creates nine sets of conditions in the study. For each combination of conditions, we choose *J*= 200 and *n*_{j}= 5 +*j*, so that the level-1 sample sizes are uniformly distributed in [6, 205], with a total level-1 sample size *N*= 21100. For each condition, *N*_{r}= 500 replications are performed. At each replication, the statistics *T*_{ML}, *T*_{RML}, *T*_{RADF}, *T*_{CRADF}, and *F*_{R} are obtained when simultaneously evaluating *Σ*_{1}(**θ**) and *Σ*_{2}(**θ**); *T*_{ML1}, *T*_{RML1}, *T*_{RADF1}, *T*_{CRADF1}, and *F*_{R1} are calculated when fitting *Σ*_{1}(*θ*_{1}) alone; *T*_{ML2}, *T*_{RML2}, *T*_{RADF2}, *T*_{CRADF2}, and *F*_{R2} are obtained when fitting *Σ*_{2}(*θ*_{2}) alone; *T*_{MLm2}, *T*_{RMLm2}, *T*_{RADFm2}, *T*_{CRADFm2}, and *F*_{Rm2} are also obtained when fixing the factor loadings in *Σ*_{2}(*θ*_{2}) at the values obtained at level 1 by fitting *Σ*_{1}(*θ*_{1}). Setting the nominal level at 0.05, each of the statistics in the simultaneous procedure is referred to either χ^{2}_{38} or *F*_{38,162} for significance; each of the statistics in the separate procedure is referred to either χ^{2}_{19} or *F*_{19,181} for significance. The total number of significant statistics for each condition is reported in Table 2. We will first discuss these numbers and then summarize the strengths and weaknesses of the various statistics.
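The chi-square critical values used here (χ^{2}_{38} and χ^{2}_{19} at the 0.05 level) can be checked without tables. The sketch below is our own illustration using the Wilson–Hilferty cube-root normal approximation, a standard result that is accurate to roughly two decimal places at these degrees of freedom; it is not part of the authors' procedure.

```python
from statistics import NormalDist

def chi2_crit(df, alpha=0.05):
    """Upper-alpha chi-square critical value via the Wilson-Hilferty
    approximation: df * (1 - 2/(9 df) + z * sqrt(2/(9 df)))^3,
    where z is the standard normal upper-alpha quantile."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3

print(round(chi2_crit(19), 2))  # 30.13 (exact value is 30.144)
print(round(chi2_crit(38), 2))  # 53.38 (exact value is 53.384)
```

Any statistic exceeding these cutoffs is counted as significant at the nominal 0.05 level in Table 2.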

T | Normal **u**_{ij} and **v**_{j} | | | Lognormal **u**_{ij} | | | Lognormal **v**_{j} | | |
---|---|---|---|---|---|---|---|---|---|

 | Pop. 1 | Pop. 2 | Pop. 3 | Pop. 1 | Pop. 2 | Pop. 3 | Pop. 1 | Pop. 2 | Pop. 3 |

*Note*: For population 1, both the level-1 and level-2 models are correctly specified; for population 2, the level-2 model is correctly specified but the level-1 model is misspecified; for population 3, the level-1 model is correctly specified but the level-2 model is misspecified.

T_{ML} | 28 | 492 | 342 | 334 | 500 | 485 | 164 | 497 | 381 |

T_{RML} | 30 | 492 | 348 | 33 | 403 | 213 | 54 | 489 | 299 |

T_{RADF} | 161 | 498 | 411 | 175 | 461 | 424 | 163 | 498 | 302 |

T_{CRADF} | 19 | 479 | 179 | 20 | 295 | 193 | 16 | 483 | 67 |

F_{R} | 34 | 488 | 220 | 37 | 328 | 230 | 24 | 486 | 99 |

T_{ML1} | 24 | 499 | 24 | 385 | 500 | 385 | 24 | 499 | 27 |

T_{RML1} | 21 | 498 | 21 | 19 | 403 | 19 | 21 | 498 | 26 |

T_{RADF1} | 67 | 499 | 67 | 63 | 460 | 63 | 67 | 499 | 67 |

T_{CRADF1} | 29 | 494 | 29 | 24 | 407 | 24 | 30 | 494 | 30 |

F_{R1} | 35 | 494 | 35 | 30 | 422 | 30 | 35 | 494 | 35 |

T_{ML2} | 50 | 49 | 452 | 54 | 55 | 459 | 265 | 264 | 461 |

T_{RML2} | 28 | 27 | 425 | 31 | 31 | 431 | 27 | 27 | 304 |

T_{RADF2} | 60 | 59 | 392 | 62 | 62 | 403 | 53 | 53 | 227 |

T_{CRADF2} | 22 | 21 | 308 | 21 | 22 | 312 | 15 | 17 | 106 |

F_{R2} | 28 | 28 | 332 | 27 | 28 | 334 | 21 | 21 | 127 |

T_{MLm2} | 252 | 255 | 500 | 302 | 305 | 500 | 481 | 482 | 500 |

T_{RMLm2} | 15 | 20 | 490 | 18 | 20 | 487 | 96 | 99 | 268 |

T_{RADFm2} | 58 | 56 | 412 | 64 | 64 | 418 | 46 | 43 | 148 |

T_{CRADFm2} | 21 | 22 | 331 | 21 | 20 | 346 | 12 | 14 | 54 |

F_{Rm2} | 27 | 29 | 357 | 25 | 26 | 364 | 15 | 16 | 72 |

Each number under population 1 represents the type I errors of the corresponding statistic listed on the left. With 500 replications, the ideal number is 25. *T*_{RADF} cannot properly control type I errors when simultaneously evaluating *Σ*_{1}(**θ**) and *Σ*_{2}(**θ**); neither can *T*_{RADF1}, *T*_{RADF2}, and *T*_{RADFm2} when separately evaluating the models. *T*_{ML2}, and especially *T*_{MLm2}, also cannot properly control type I errors even when the data are normally distributed. When *Σ*_{1}(**θ**) is misspecified, the numbers under population 2 represent either powers or type I errors. Because the level-1 sample size is huge, all the statistics have good power when simultaneously evaluating the models. After separating the models, the statistics at level 1 have even better power. In particular, at level 2, *T*_{RML2}, *T*_{CRADF2}, *F*_{R2}, *T*_{CRADFm2}, and *F*_{Rm2} properly identify that *Σ*_{2}(**θ**) is correctly specified regardless of the distribution conditions. When *Σ*_{2}(**θ**) is misspecified under population 3, due to the relatively small level-2 sample size, the statistics *T*_{CRADF} and *F*_{R}, which can properly control type I errors, do not have good power, especially when the level-2 random components have a heavy-tailed distribution. The statistics *T*_{ML} and *T*_{RADF} reject the model more frequently than the others, but their higher rejection rates are partially due to their relatively large type I errors.

Among all the statistics, *T*_{ML}, *T*_{ML1}, *T*_{ML2}, and *T*_{MLm2} are the most sensitive to distribution violations. When separately evaluating the models, the nonnormal distribution of the level-1 random components mainly affects *T*_{ML1}; it has little effect on *T*_{ML2} or *T*_{MLm2}. Similarly, the nonnormal distribution of the level-2 random components mainly affects *T*_{ML2} and *T*_{MLm2}; it does not affect *T*_{ML1}. The statistic *T*_{RML} is more robust than *T*_{ML}. Actually, *T*_{RML} still asymptotically follows a chi-square distribution for the nonnormal distributions studied here when fitting a conventional covariance structure model. But it does not possess the ADF property, which is why *T*_{RMLm2} cannot properly control type I errors under population 1 with lognormal **v**_{j} in Table 2 (see also Yuan and Bentler 1998).
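The statement that "the ideal number is 25" has a simple sampling-error companion: under a correct nominal level, the count of rejections in 500 replications is Binomial(500, 0.05). The quick calculation below (our own supplement, not from the paper) gives a rough band for counts attributable to Monte Carlo error alone.

```python
import math

n_rep, alpha = 500, 0.05
mean = n_rep * alpha                         # expected rejections: 25
se = math.sqrt(n_rep * alpha * (1 - alpha))  # binomial standard error
lo, hi = mean - 2 * se, mean + 2 * se        # rough 95% band

print(mean, round(se, 2))          # 25.0 4.87
print(round(lo, 1), round(hi, 1))  # 15.3 34.7
```

By this yardstick, for example, the 161 rejections of *T*_{RADF} under normal data are far outside sampling error, while counts in the twenties and low thirties are consistent with a properly controlled type I error.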

### 5. IMPLEMENTATION


Because no statistical software currently provides the model segregation methodology, we created an SAS IML program to perform the procedure described in Section 2.1. The output of Multi-single.sas at http://www.nd.edu/~kyuan/multilevel gives, in sequence, the total level-1 sample size *N*, the level-2 sample size *J*, the number of dependent variables *p*, the sample-size-equivalent number as in (3), , and as in (4). At the beginning of the program (the data step), we need to input a multilevel data set by specifying its full name and its number of variables (the School Data have 21 variables). In the main program, xcl represents the cluster variable, which needs to be specified (the nineteenth variable in the School Data); ymat represents the dependent variables **y**_{ij} as in (1), which also need to be specified (the 7th to 14th variables in the School Data example). The program assumes that observations from the same cluster/school are next to each other. Otherwise, we need to first run the SAS IML program Sort.sas, also available at http://www.nd.edu/~kyuan/multilevel, which places observations from the same cluster together. Different SEM programs may have different requirements for the order of the elements in . The output is the estimate of the asymptotic covariance matrix of . If we instead need the estimate of the asymptotic covariance matrix of , we just need to remove the line “Gamma11=permu*Gamma11*permu';” from the program.
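The grouping step that Sort.sas performs can be sketched in a few lines of Python. The snippet below is only a stand-in for illustration; the column position of the cluster variable (index 18, i.e., the 19th variable, as in the School Data) and the 21-column layout are assumptions taken from the description above.

```python
# Minimal stand-in for Sort.sas: group the rows of a data set by the
# cluster variable and report N, J, and the cluster sizes n_j.
def sort_by_cluster(rows, cluster_col=18):
    """Return rows sorted by cluster id, plus a {cluster: size} map."""
    rows = sorted(rows, key=lambda r: r[cluster_col])
    sizes = {}
    for r in rows:
        sizes[r[cluster_col]] = sizes.get(r[cluster_col], 0) + 1
    return rows, sizes

# Tiny fake data set: 21 columns per row, cluster ids interleaved.
rows = [[0] * 18 + [cid] + [0, 0] for cid in [2, 1, 2, 1, 1]]
rows, sizes = sort_by_cluster(rows)
N, J = len(rows), len(sizes)
print(N, J, sizes)  # 5 2 {1: 3, 2: 2}
```

After sorting, all observations from the same cluster are adjacent, which is the layout Multi-single.sas expects.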

The output from Multi-single.sas can be used to perform the level-1 or level-2 analysis in any SEM software that allows the input of or the weight matrix . An EQS program performing the level-1 analysis for the School Data is provided in Appendix B, where the output is saved as a data file named “d:\mlevel\Gamma11.dat,” which is a 36 × 36 data matrix. The 8 × 8 matrix within the syntax is . The level-1 sample size is set at *N*−*J*. The output of EQS 6 (Bentler 2006) contains the statistics *T*_{RML1} and *T*_{RADF1}. The statistics *T*_{CRADF1} and *F*_{R1} have to be calculated separately using (9) and (10). The output of running the EQS program also contains a corrected residual-based statistic and an *F*-statistic, but these are different from (9) and (10). When evaluating the level-2 model with the sample size set at *J*, the EQS output will contain all five statistics *T*_{ML2}, *T*_{RML2}, *T*_{RADF2}, *T*_{CRADF2}, and *F*_{R2}; no extra calculation is necessary.

EQS, LISREL (Jöreskog and Sörbom, 1996: 59–62), and SAS Calis (see SAS online documentation, Table 19.1 for Proc Calis) also have the option of including an external weight matrix in their generalized least-squares approach to SEM. When such a weight matrix is supplied, they will provide the ADF statistic, not the one based on the residuals (see Yuan and Bentler 1998).

### 6. DISCUSSION


In a multilevel SEM, overall test statistics cannot indicate the level at which the model may be misspecified. We propose to fit the structural model at each level separately, and we provide the associated technical development to allow this to be done correctly. There are several advantages to such a procedure: (1) it is easier to find out which level is misspecified; (2) fit indices for conventional SEM can be easily extended to evaluate the models at the separate levels of a multilevel model; (3) model diagnostics from the conventional SEM literature can be easily applied to check the misspecification of a multilevel model; and (4) misspecification at one level will not systematically affect the evaluation of the models at the other levels. The separate procedure developed in this paper can be compared with methods for testing a contrast in the context of ANOVA or MANOVA: when an overall *F*-statistic in ANOVA is significant, we may need to identify which of several groups or which contrast caused the significance. The procedure can also be compared to the two-stage least-squares procedure for conventional covariance structure analysis developed by Bollen (1996): when one part of the model is misspecified, the other parts can still be correctly estimated or evaluated.

In order to perform the separate procedure beginning with estimating the level-1 model, *Σ*_{1}(*θ*_{1}) has to be identified. We recommend estimating *θ*_{1} first mainly because is more efficient than . It is also easier to identify the misspecification in *Σ*_{1}(*θ*_{1}) than in *Σ*_{2}(*θ*_{2}), as was shown in Section 4. When *Σ*_{2}(*θ*_{2}) is identified, we may also consider estimating *Σ*_{2}(*θ*_{2}) first. When *Σ*_{1}(*θ*_{1}) and *Σ*_{2}(*θ*_{2}) contain overlapping parameters and one of them is not identified if estimated alone, we can estimate the identified model first since inserting the obtained parameter estimates into the other model will make the otherwise unidentified model identified. It is obvious that the methodology developed in this paper can be equally applied to a model with more than two levels. The methodology can also be easily extended to a model that contains a mean structure.

In order to test the level-2 model alone, one recommendation has been to set the level-1 model as saturated (e.g., Hox 2002:240); the reverse procedure also could be done. Such an approach may be a valuable strategy for identifying the misspecified model. But there are still many unknowns about such an approach. For example, it is not clear how to incorporate the level-1 and level-2 sample sizes into a model fit index so that the effect of all sample sizes is minimized. Although the saturated model is correctly specified, misspecification of the structural model may affect the estimates of the saturated model. That is, in this approach, the estimate of the saturated model can be biased. The power of a statistic to detect the misspecification may also be compromised due to the biased estimates of a saturated model. Finally, techniques of model diagnostics developed in the conventional SEM literature may not be directly applicable to the simultaneous approach to a multilevel model. Any bias may have a confounding effect in model diagnostics.

Among the five statistics for evaluating the model either simultaneously or at an individual level, we recommend *T*_{RML}, *T*_{CRADF}, and *F*_{R}. In particular, *T*_{CRADF} and *F*_{R} are asymptotically distribution free and can be expected to perform similarly when the distribution of the data changes. One concern with inference for multilevel models is that, when the sample size at the highest level is not large and the random components at that level have heavy-tailed distributions, all the statistics that can properly control type I errors have weak power to detect a misspecification at that level. This is a general phenomenon that occurs in almost all statistical significance tests. When the underlying population has heavy tails, inference based on a normal distribution assumption, or on sample moments, is not efficient. Robust procedures have to be used to obtain more efficient parameter estimates and more powerful test statistics.

- 1
Further information about the data set is available at http://www.statmodel.com/examples/continuous.shtml and in Muthén, Khoo, and Gustafsson (1997).

- 2
Both EQS and Mplus contain the ML procedure for multilevel SEM.

- 3
A lognormal random number *z* ∼ lognormal(0, 1) is generated by *z* = {exp[*n*(0, 1)] − exp(1/2)}/[*e*(*e* − 1)]^{1/2}, where *n*(0, 1) represents a standard normal random variable.
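The standardization in footnote 3 can be verified numerically: the formula subtracts the lognormal mean exp(1/2) and divides by the lognormal standard deviation [*e*(*e* − 1)]^{1/2}, so the generated *z* should have mean 0 and variance 1. The sketch below is our own check.

```python
import math, random

random.seed(0)

# Footnote 3: z = (exp(n) - exp(1/2)) / sqrt(e(e - 1)), n ~ N(0, 1),
# standardizes a lognormal variable to mean 0 and variance 1.
scale = math.sqrt(math.e * (math.e - 1.0))
z = [(math.exp(random.gauss(0, 1)) - math.exp(0.5)) / scale
     for _ in range(200000)]

m = sum(z) / len(z)
v = sum((x - m) ** 2 for x in z) / (len(z) - 1)
print(round(m, 3), round(v, 3))
```

Because the standardized lognormal has very high kurtosis, the sample variance converges slowly; with 200,000 draws it settles near 1 but not exactly on it.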

### APPENDIX A


This appendix provides details leading to the results in (15), (16), and (18), and it shows why the statistic *T*_{RMLm2} in (20) approaches a distribution with mean equal to *p**−*q*_{2}.

*Equation (15).* Parallel to equation (A5) in Yuan and Bentler (2000), there exist

- ((A-1))

and

- ((A-2))

*Equation (16).* We will use the estimating equation approach to obtain (16). Taking the derivative of with respect to **γ**_{2}, we get the following estimating function:

Because minimizes , under standard regularity conditions (see Yuan and Jennrich 2000), it satisfies the estimating equation

Let be the partial derivative of **g** with respect to **γ**_{2}, and let be the partial derivative of **g** with respect to **γ**_{c}. The Taylor expansion of at (**γ**′_{20}, **γ**′_{c0})′ gives

- ((A-3))

where is between and **γ**_{20}, and is between and **γ**_{c0}. Notice that

- ((A-4))

- ((A-5))

It follows from (A3) to (A5) that

- ((A-6))

Using (A1) and substituting by , we can rewrite (A6) as

- ((A-7))

*Equation (18).* Using the mean value theorem, we have

- ((A-8))

It follows from (A8) that

- ((A-9))

Equations (4) and (A9) imply

- ((A-10))

The *T*_{RADFm2} in (17) is just the Wald statistic based on (A10), and thus (18) follows.

***T*_{RMLm2} in (20) Approaches a Distribution with Mean Equal to *p**−*q*_{2}.** It follows from (A1), (A7), and (A8) that

- ((A-11))

It follows from (A11) that the asymptotic covariance matrix of is given by the **Π** in (19). Using the appendix in Yuan, Marshall, and Bentler (2002), we have

- ((A-12))

It follows from (A12) that *E*(*T*_{MLm2}) ≈ tr(**W**_{2}**Π**), and thus *T*_{RMLm2} converges to a distribution with mean equal to *p**−*q*_{2}.

### APPENDIX B


```
/TITLE
 EQS 6.1: Level-1 analysis using the output from Multi-single.sas
/SPECIFICATION
 weight='d:\mlevel\Gamma11.dat'; cases=4963; variables=8;
 matrix=covariance; analysis=covariance; methods=ML, robust;
/LABELS
 V1=Math1; V2=Math2; V3=Math3; V4=Math4;
 V5=Science1; V6=Science2; V7=Science3; V8=Science4;
/EQUATIONS
 V1=*F1+E1; V2=*F1+E2; V3=*F1+E3; V4=*F1+E4;
 V5=*F2+E5; V6=*F2+E6; V7=*F2+E7; V8=*F2+E8;
/VARIANCES
 E1-E8=*; F1=1.0; F2=1.0;
/COVARIANCES
 F1,F2=*;
/MATRIX
 1.0918985 0.7849957 0.5194008 0.7682225 0.4859271 0.4638581 0.4563917 0.4560988
 0.7849957 1.0776262 0.5355189 0.8126215 0.5141238 0.4873328 0.4778855 0.4806775
 0.5194008 0.5355189 1.6983433 0.4906551 0.3301139 0.3667441 0.3083904 0.3236708
 0.7682225 0.8126215 0.4906551 3.2718055 0.5521880 0.5099236 0.4939206 0.4950812
 0.4859271 0.5141238 0.3301139 0.5521880 1.1356429 0.4425474 0.4493122 0.4481522
 0.4638581 0.4873328 0.3667441 0.5099236 0.4425474 1.1288758 0.3638105 0.3531403
 0.4563917 0.4778855 0.3083904 0.4939206 0.4493122 0.3638105 0.9949672 0.3826106
 0.4560988 0.4806775 0.3236708 0.4950812 0.4481522 0.3531403 0.3826106 2.7888061
/END
```

### REFERENCES


- Bentler, P. M. 2006. *EQS 6 Structural Equations Program Manual*. Encino, CA: Multivariate Software.
- Bentler, P. M., and K.-H. Yuan. 1999. “Structural Equation Modeling with Small Samples: Test Statistics.” *Multivariate Behavioral Research* 34:181–97.
- Bollen, K. A. 1996. “An Alternative Two-Stage Least Squares (2SLS) Estimator for Latent Variable Equations.” *Psychometrika* 61:109–21.
- Browne, M. W. 1984. “Asymptotic Distribution-Free Methods for the Analysis of Covariance Structures.” *British Journal of Mathematical and Statistical Psychology* 37:62–83.
- 2007. “Multilevel Structural Equation Modeling.” Pp. 437–480 in *Handbook of Multilevel Analysis*, edited by J. De Leeuw and E. Meijer. New York: Springer.
- Goldstein, H. 2003. *Multilevel Statistical Models*. 3d ed. London: Arnold.
- Hox, J. 2002. *Multilevel Analysis: Techniques and Applications*. Mahwah, NJ: Lawrence Erlbaum.
- Hu, L., and P. M. Bentler. 1999. “Cutoff Criteria for Fit Indexes in Covariance Structure Analysis: Conventional Criteria Versus New Alternatives.” *Structural Equation Modeling* 6:1–55.
- Jöreskog, K. G., and D. Sörbom. 1996. *LISREL 8 Users' Reference Guide*. Chicago: Scientific Software International.
- Khatri, C. G. 1966. “A Note on a MANOVA Model Applied to Problems in Growth Curves.” *Annals of the Institute of Statistical Mathematics* 18:75–86.
- Lee, S.-Y. 1990. “Multilevel Analysis of Structural Equation Models.” *Biometrika* 77:763–72.
- Lee, S.-Y., and W.-Y. Poon. 1998. “Analysis of Two-Level Structural Equation Models via EM Type Algorithms.” *Statistica Sinica* 8:749–66.
- 2001. “Hypothesis Testing and Model Comparisons in Two-Level Structural Equation Models.” *Multivariate Behavioral Research* 36:639–55.
- Liang, J., and P. M. Bentler. 2004. “An EM Algorithm for Fitting Two-Level Structural Equation Models.” *Psychometrika* 69:101–22.
- 1993. “Regression Analysis of Multilevel Data with Measurement Error.” *British Journal of Mathematical and Statistical Psychology* 46:301–11.
- Magnus, J. R., and H. Neudecker. 1999. *Matrix Differential Calculus with Applications in Statistics and Econometrics*. Rev. ed. New York: Wiley.
- McDonald, R. P., and H. Goldstein. 1989. “Balanced Versus Unbalanced Designs for Linear Structural Relations in Two-Level Data.” *British Journal of Mathematical and Statistical Psychology* 42:215–32.
- Muthén, B. 1994. “Multilevel Covariance Structure Analysis.” *Sociological Methods and Research* 22:376–98.
- Muthén, B., S.-T. Khoo, and J.-E. Gustafsson. 1997. “Multilevel Latent Variable Modeling in Multiple Populations.” Unpublished manuscript.
- Muthén, B., and A. Satorra. 1995. “Complex Sample Data in Structural Equation Modeling.” Pp. 267–316 in *Sociological Methodology*, vol. 25, edited by P. V. Marsden. Cambridge, MA: Blackwell Publishing.
- 1992. “Maximum Likelihood and Generalized Least Squares Analyses of Two-Level Structural Equation Models.” *Statistics and Probability Letters* 14:25–30.
- Raudenbush, S. W., and A. S. Bryk. 2002. *Hierarchical Linear Models*. 2d ed. Newbury Park, CA: Sage.
- Satorra, A., and P. M. Bentler. 1994. “Corrections to Test Statistics and Standard Errors in Covariance Structure Analysis.” Pp. 399–419 in *Latent Variables Analysis: Applications for Developmental Research*, edited by A. von Eye and C. C. Clogg. Thousand Oaks, CA: Sage.
- Sugawara, H. M., and R. C. MacCallum. 1993. “Effect of Estimation Method on Incremental Fit Indexes for Covariance Structure Models.” *Applied Psychological Measurement* 17:365–77.
- Yuan, K.-H., and P. M. Bentler. 1998. “Normal Theory Based Test Statistics in Structural Equation Modeling.” *British Journal of Mathematical and Statistical Psychology* 51:289–309.
- Yuan, K.-H., and P. M. Bentler. 2000. “Three Likelihood-Based Methods for Mean and Covariance Structure Analysis with Nonnormal Missing Data.” Pp. 167–202 in *Sociological Methodology*, vol. 30, edited by M. E. Sobel and M. P. Becker. Boston, MA: Blackwell Publishing.
- Yuan, K.-H., and P. M. Bentler. 2002. “On Normal Theory Based Inference for Multilevel Models with Distributional Violations.” *Psychometrika* 67:539–61.
- Yuan, K.-H., and P. M. Bentler. 2003. “Eight Test Statistics for Multilevel Structural Equation Models.” *Computational Statistics and Data Analysis* 44:89–107.
- Yuan, K.-H., and P. M. Bentler. 2006. “Asymptotic Robustness of Standard Errors in Multilevel Structural Equation Models.” *Journal of Multivariate Analysis* 97:1121–41.
- Yuan, K.-H., and W. Chan. 2005. “On Nonequivalence of Several Procedures of Structural Equation Modeling.” *Psychometrika* 70:791–98.
- Yuan, K.-H., and R. I. Jennrich. 2000. “Estimating Equations with Nuisance Parameters: Theory and Applications.” *Annals of the Institute of Statistical Mathematics* 52:343–50.
- Yuan, K.-H., L. L. Marshall, and P. M. Bentler. 2002. “A Unified Approach to Exploratory Factor Analysis with Missing Data, Nonnormal Data, and in the Presence of Outliers.” *Psychometrika* 67:95–122.
- 2003. “Assessing the Effect of Model Misspecifications on Parameter Estimates in Structural Equation Models.” Pp. 241–65 in *Sociological Methodology*, vol. 33, edited by R. M. Stolzenberg. Boston, MA: Blackwell Publishing.