Abstract
- Top of page
- Abstract
- 1. Introduction
- 2. The dichotomous factor analysis model
- 3. Method
- 4. Results
- 5. Discussion
- Acknowledgements
- References
- Supporting Information
We conducted a Monte Carlo study to investigate the performance of the polychoric instrumental variable estimator (PIV) in comparison to unweighted least squares (ULS) and diagonally weighted least squares (DWLS) in the estimation of a confirmatory factor analysis model with dichotomous indicators. The simulation involved 144 conditions (1,000 replications per condition) that were defined by a combination of (a) two types of latent factor models, (b) four sample sizes (100, 250, 500, 1,000), (c) three factor loadings (low, moderate, strong), (d) three levels of non-normality (normal, moderately non-normal, and extremely non-normal), and (e) whether the factor model was correctly specified or misspecified. The results showed that when the model was correctly specified, PIV produced estimates that were as accurate as those of ULS and DWLS. Furthermore, the simulation showed that PIV was more robust to structural misspecifications than ULS and DWLS.
1. Introduction
Confirmatory factor analysis (CFA) is a widely used statistical tool in test development that allows researchers to test hypotheses about the structure of a scale. Typically, model parameters within CFA are estimated using maximum likelihood (ML) estimation (Browne, 1984; Bollen, 1989). ML assumes that the observed variables are continuous and that they follow a multivariate normal distribution, or equivalently that their covariance matrix has a Wishart distribution. However, test developers often work with items that have a binary response format (e.g. Nestler, Back, & Egloff, 2011). In this case, the assumptions behind ML are typically not met, which leads to substantial estimation problems: for instance, both the model parameters and their standard errors are estimated inaccurately (e.g. Babakus, Ferguson, & Jöreskog, 1987; Muthén & Kaplan, 1992).
Given these findings, a number of alternative estimation methods have been developed (Christoffersson, 1975; Muthén, 1984; Jöreskog & Sörbom, 1996). All of these methods assume that a continuous latent variable underlies the observed responses to a binary item, and that the specified CFA model holds for these latent continuous variables and not the observed binary variables (see Bollen, 1989, pp. 439–445, for a general introduction). Model parameters are estimated by employing refined versions of a weighted least squares (WLS) approach (Browne, 1984); these versions draw on the tetrachoric correlations among the dichotomous items and on their asymptotic covariance matrix. Overall, robustness studies have shown that these estimation methods perform better than ML (e.g. Muthén & Kaplan, 1992; Beauducel & Herzberg, 2006). Simulations, however, have also found that they are only partially robust to misspecified factor models because they are system-wide estimators: they estimate all model parameters in one step (Bollen & Maydeu-Olivares, 2007).
The aim of the present paper is to compare the performance of a recently suggested alternative equation-by-equation estimator to the established approaches in the estimation of binary CFA models. Specifically, we compared the polychoric instrumental variable (PIV) estimator suggested by Bollen and Maydeu-Olivares (2007) to the unweighted least squares (ULS) estimator and the diagonally weighted least squares (DWLS) estimator in a Monte Carlo study. In the next section we introduce the dichotomous factor analysis model. We will then give a brief description of the standard estimation methods (e.g. ULS, DWLS) and introduce the PIV estimator.
2. The dichotomous factor analysis model
Furthermore, the latent responses y* are linear functions of m latent factors η:
$$\mathbf{y}^{*} = \boldsymbol{\Lambda}\boldsymbol{\eta} + \boldsymbol{\varepsilon} \tag{1}$$
where y* denotes the n × 1 vector of latent response variables, Λ is an n × m matrix of factor loadings, η is an m × 1 vector of factors, and ɛ denotes an n × 1 vector of error terms. By assumption, the factors and the errors are normally distributed, the expectation of both is zero, and the factors and the error terms are uncorrelated.
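As an illustration, data from equation (1) could be generated along the following lines. This is only a sketch that assumes NumPy: the function name `simulate_binary_items`, the loading values, the factor correlation, and the zero thresholds are hypothetical choices, and the dichotomization rule (a response of 1 whenever the latent response exceeds its threshold) is the usual threshold relation rather than something taken from the text above. Error variances are chosen so that every latent response has unit variance.

```python
import numpy as np

def simulate_binary_items(n_obs, loadings, factor_cov, thresholds, seed=0):
    """Draw binary responses from y* = Lambda @ eta + eps, y = 1 if y* > tau."""
    rng = np.random.default_rng(seed)
    loadings = np.asarray(loadings, dtype=float)      # n x m matrix Lambda
    factor_cov = np.asarray(factor_cov, dtype=float)  # m x m matrix Psi
    # Error variances chosen so that each latent response has unit variance.
    common_var = np.diag(loadings @ factor_cov @ loadings.T)
    error_sd = np.sqrt(1.0 - common_var)
    eta = rng.multivariate_normal(np.zeros(factor_cov.shape[0]), factor_cov, size=n_obs)
    eps = rng.normal(size=(n_obs, loadings.shape[0])) * error_sd
    y_star = eta @ loadings.T + eps
    return (y_star > np.asarray(thresholds)).astype(int)

# Hypothetical two-factor model: three indicators per factor, loadings of .70.
Lambda = np.zeros((6, 2))
Lambda[:3, 0] = 0.7
Lambda[3:, 1] = 0.7
Psi = np.array([[1.0, 0.3], [0.3, 1.0]])
y = simulate_binary_items(500, Lambda, Psi, thresholds=np.zeros(6))
print(y.shape, y.mean(axis=0).round(2))
```

With thresholds of zero, each item should be endorsed by roughly half of the simulated respondents.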
It follows, then, that the latent responses y* are normally distributed, that they have an expectation of zero, and that their covariance matrix Σ is given by:
$$\boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Psi}\boldsymbol{\Lambda}^{T} + \boldsymbol{\Theta} \tag{2}$$
where Ψ is the covariance matrix of the latent factors η and Θ is the covariance matrix of the errors ɛ. The covariance structure hypothesis thus holds for the latent responses and not for the observed binary responses. To estimate the model parameters, Σ is substituted by the matrix of tetrachoric correlations between the binary responses, P, and Θ is set to $\boldsymbol{\Theta} = \mathbf{I} - \mathrm{diag}(\boldsymbol{\Lambda}\boldsymbol{\Psi}\boldsymbol{\Lambda}^{T})$. The latter is done to identify the model but has the consequence that the error variances are no longer model parameters, and, more generally, that a correlation structure is analysed (see Bentler & Salavei, 2010, for an overview of the analysis of correlation structures; see also Muthén & Asparouhov, 2002, for an approach for analysing covariance structures when there are binary indicators). Furthermore, it can be shown that the dichotomous factor analysis model is mathematically equivalent to the normal ogive version of the two-parameter item response model and to the graded response model (see McDonald, 1999; Bollen, Bauer, Christ, & Edwards, 2010).
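The identification constraint can be made concrete with a small numerical sketch. It assumes NumPy and hypothetical loadings and factor correlation, and it takes the constraint to act on the diagonal only, that is, Θ = I − diag(ΛΨΛᵀ), so that the implied matrix has a unit diagonal and is therefore a correlation matrix. The function name `implied_correlation` is illustrative.

```python
import numpy as np

def implied_correlation(loadings, factor_cov):
    """Sigma = Lambda Psi Lambda' + Theta, with Theta = I - diag(Lambda Psi Lambda')."""
    common = loadings @ factor_cov @ loadings.T
    theta = np.eye(common.shape[0]) - np.diag(np.diag(common))
    return common + theta

# Hypothetical two-factor model with loadings of .70 and a factor correlation of .30.
Lambda = np.array([[0.7, 0.0], [0.7, 0.0], [0.7, 0.0],
                   [0.0, 0.7], [0.0, 0.7], [0.0, 0.7]])
Psi = np.array([[1.0, 0.3], [0.3, 1.0]])
Sigma = implied_correlation(Lambda, Psi)
print(Sigma.round(3))
```

In this example, indicators of the same factor correlate .70 × .70 = .49, and indicators of different factors correlate .70 × .30 × .70 = .147.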
2.1. Standard estimation and testing approach
Imagine that we have drawn a random sample of size N from the dichotomous factor model. Estimating the model parameters starts with the estimation of the thresholds, $\tau_{j1}$, and the tetrachoric correlations among the items (Olsson, 1979). Then the asymptotic covariance matrix, $\hat{\boldsymbol{\Gamma}}$, of the tetrachoric correlations is estimated. Finally, a least squares function F is minimized to estimate the model parameters (Muthén, 1978):
$$F = \left(\hat{\boldsymbol{\rho}} - \boldsymbol{\rho}(\boldsymbol{\theta})\right)^{T} \mathbf{W} \left(\hat{\boldsymbol{\rho}} - \boldsymbol{\rho}(\boldsymbol{\theta})\right) \tag{3}$$
where θ is the vector of (independent) model parameters, $\hat{\boldsymbol{\rho}}$ is the vector of estimated tetrachoric correlations, ρ(θ) denotes the restrictions imposed on the population tetrachoric correlations by the parameter vector θ, and W is a positive definite weight matrix.
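The first step, estimating thresholds and a tetrachoric correlation for one pair of items, might be sketched as follows. This is not the routine used by LISREL or Mplus; it is a plain-Python illustration with hypothetical function names (`bvn_cdf`, `tetrachoric`). Thresholds are taken from the marginal proportions, and the correlation is found by bisection so that the bivariate normal probability of a (0, 0) response matches its observed proportion. For a single 2 × 2 table with all cells non-empty, this matching solution should coincide with the two-step estimate, because the table is then reproduced exactly.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def bvn_cdf(a, b, rho, steps=200):
    """P(X <= a, Y <= b) for a standard bivariate normal with correlation rho.

    Uses Phi2(a, b; rho) = Phi(a) * Phi(b) + integral_0^rho phi2(a, b; r) dr,
    with the integral evaluated by Simpson's rule (steps must be even).
    """
    def density(r):
        q = (a * a - 2.0 * r * a * b + b * b) / (2.0 * (1.0 - r * r))
        return exp(-q) / (2.0 * pi * sqrt(1.0 - r * r))

    h = rho / steps
    total = density(0.0) + density(rho)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * density(k * h)
    return STD_NORMAL.cdf(a) * STD_NORMAL.cdf(b) + total * h / 3.0

def tetrachoric(p00, p0_x, p0_y, tol=1e-10):
    """Solve Phi2(tau_x, tau_y; rho) = p00 for rho by bisection.

    p00 is the proportion answering 0 on both items; p0_x and p0_y are the
    marginal proportions of 0 responses, which fix the two thresholds.
    """
    tau_x = STD_NORMAL.inv_cdf(p0_x)
    tau_y = STD_NORMAL.inv_cdf(p0_y)
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if bvn_cdf(tau_x, tau_y, mid) < p00:
            lo = mid  # the joint probability increases with rho
        else:
            hi = mid
    return tau_x, tau_y, (lo + hi) / 2.0

# Hypothetical proportions: both items split 50/50, one third answer 0 on both.
tau_x, tau_y, rho_hat = tetrachoric(p00=1 / 3, p0_x=0.5, p0_y=0.5)
print(round(rho_hat, 4))
```

The example values are chosen so that the answer can be checked by hand: with both thresholds at zero, the joint probability equals 1/4 + arcsin(ρ)/(2π), which is 1/3 for ρ = .50. The sketch does not handle empty cells or marginal proportions of 0 or 1.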
The standard approaches to estimating model parameters differ in their choice of the weight matrix W. In ULS (Muthén, 1978, 1984), to begin with, W is set to an identity matrix (i.e. W = I). In WLS (Browne, 1984), by contrast, W is the inverse of the estimated asymptotic covariance matrix of the tetrachoric correlations (i.e. $\mathbf{W} = \hat{\boldsymbol{\Gamma}}^{-1}$). Finally, in DWLS (Jöreskog & Sörbom, 1996), W is set to $\mathbf{W} = [\mathrm{diag}(\hat{\boldsymbol{\Gamma}})]^{-1}$. Note that the name that Flora and Curran (2004) and Muthén, du Toit, and Spisic (1997) use for DWLS is robust WLS. Also, they use a diagonal matrix V instead of W to estimate the model parameters. In contrast to W, V includes not only the asymptotic variances of the tetrachoric correlations, but also the asymptotic variances of the thresholds. However, all approaches – including ULS, WLS, DWLS, and robust WLS – use the full asymptotic covariance matrix, $\hat{\boldsymbol{\Gamma}}$, to obtain standard errors and the chi-square test statistic.
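The three weighting schemes can be compared on a toy example. The sketch below assumes NumPy; the function names (`weight_matrix`, `fit_function`), the three correlations, the model-implied values, and the asymptotic covariance matrix `acov` are all hypothetical numbers chosen for illustration, not estimates from real data.

```python
import numpy as np

def weight_matrix(acov, method):
    """Return W for ULS, DWLS, or WLS given the asymptotic covariance matrix."""
    if method == "ULS":
        return np.eye(acov.shape[0])
    if method == "DWLS":
        return np.diag(1.0 / np.diag(acov))
    if method == "WLS":
        return np.linalg.inv(acov)
    raise ValueError(f"unknown method: {method}")

def fit_function(r_hat, rho_model, W):
    """F = (r_hat - rho(theta))' W (r_hat - rho(theta))."""
    resid = r_hat - rho_model
    return float(resid @ W @ resid)

# Hypothetical values for three tetrachoric correlations.
r_hat = np.array([0.50, 0.40, 0.30])      # estimated correlations
rho_model = np.array([0.45, 0.45, 0.27])  # correlations implied by some theta
acov = np.array([[0.010, 0.002, 0.001],
                 [0.002, 0.020, 0.003],
                 [0.001, 0.003, 0.040]])
for method in ("ULS", "DWLS", "WLS"):
    print(method, round(fit_function(r_hat, rho_model, weight_matrix(acov, method)), 4))
```

In practice F would be minimized over θ with a numerical optimizer; the sketch only evaluates it at one fixed set of implied correlations to show how the choice of W rescales the same residuals.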
All three estimators can be employed to obtain model parameters in popular software packages such as LISREL (Jöreskog & Sörbom, 1996) or Mplus (Muthén & Muthén, 1998). In both programs, ULS and WLS are termed accordingly. DWLS, however, is called robust DWLS in LISREL and WLSMV in Mplus. Additionally, although both programs use the same method to estimate the tetrachoric correlation, they differ in their estimation of the asymptotic covariance matrix. Specifically, whereas LISREL employs the procedure described in Jöreskog (1994), Mplus uses the method described in Muthén (1984). The two methods differ in their treatment of the threshold parameters, but are asymptotically equivalent (Muthén & Satorra, 1995). Also, Dolan (1994) showed that they produced similar results in the case of CFA even with a sample size of N = 200.
ULS, WLS and DWLS yield consistent estimates of the model parameters that are asymptotically normal, and asymptotically correct standard errors can be computed (see Bollen & Maydeu-Olivares, 2007). Furthermore, simulation studies have found that whereas WLS performs adequately only for large sample sizes, DWLS and ULS perform well for small ones too (e.g. Dolan, 1994; Flora & Curran, 2004; Beauducel & Herzberg, 2006; Forero, Maydeu-Olivares, & Gallardo-Pujol, 2009). Therefore, both are typically used to estimate the parameters of a CFA model. However, recent simulations have found that they are only partially robust to structural misspecifications (Bollen & Maydeu-Olivares, 2007), and both lead to biased parameter estimates and standard errors when the factor loadings are small, when the latent continuous variables are skewed, and when there are only a few factor indicators (Forero et al., 2009).
2.2. Polychoric instrumental variable estimator
Given these results, and given that these conditions are typically found in applied settings, the performance of alternative estimation methods should be investigated. One of the alternatives is the polychoric instrumental variable estimator recently proposed by Bollen and Maydeu-Olivares (2007). The basic idea behind the PIV estimator is to compute the factor loadings in a first step and to estimate the variance and covariance model parameters – based on the values of the factor loadings – in a second step. Specifically, whereas the factor loadings are obtained using a non-iterative procedure employing instrumental variables (IVs), the variance–covariance parameters are estimated using an iterative procedure.
IV estimation is a special case of generalized method of moments estimation (Baum, Schaffer, & Stillman, 2003; Hall, 2005) and is typically employed in econometrics (e.g. Angrist, Imbens, & Rubin, 1996) to investigate the effect of a predictor on a criterion in a linear regression when the predictor is measured with error or is systematically related to other determinants of the criterion. In this case, the assumptions of ordinary least squares regression are not met, and the model parameters cannot be consistently estimated. A solution to this problem is to find one or more exogenous variables that affect the predictor but have no direct effect on the criterion (cf. Morgan & Winship, 2007), and to use these IVs to compute consistent estimates of the regression parameters.
Bollen (1996) extended this approach to structural equation models (SEMs) for continuous variables.¹ For a CFA model, he showed that each indicator can be written as a linear function of the indicator used to scale the latent factor, with the factor loading of the indicator as the coefficient, plus a composite disturbance term containing the error of the scaling indicator and the error of the actual indicator. Because the scaling indicator is correlated with the composite disturbance term (the disturbance contains the error of the scaling indicator), IVs are used to compute the factor loadings. Specifically, in two-stage least squares (2SLS), the scaling indicator is first regressed on the IVs. The regression coefficients obtained are then used to compute predicted values of the scaling indicator, and the actual indicator is then regressed on these predicted values. The regression coefficient of this second regression is the desired factor loading.
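The two stages can be illustrated with simulated continuous indicators. The sketch below assumes NumPy and a hypothetical one-factor model with four indicators, where `y1` is the scaling indicator (loading fixed to 1), `y2` is the indicator of interest with a true loading of .70, and `y3` and `y4` serve as instruments. All names and numerical values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
eta = rng.normal(size=n)
# Hypothetical one-factor model; y1 is the scaling indicator (loading fixed to 1).
y1 = eta + rng.normal(scale=0.8, size=n)
y2 = 0.7 * eta + rng.normal(scale=0.8, size=n)
y3 = 0.9 * eta + rng.normal(scale=0.8, size=n)
y4 = 0.6 * eta + rng.normal(scale=0.8, size=n)

def ols(X, y):
    """Least squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
# Stage 1: regress the scaling indicator on the instruments y3 and y4.
Z = np.column_stack([ones, y3, y4])
y1_hat = Z @ ols(Z, y1)
# Stage 2: regress the indicator of interest on the predicted scaling indicator.
loading_2sls = ols(np.column_stack([ones, y1_hat]), y2)[1]
# Naive OLS of y2 on y1 for comparison; it is attenuated by the error in y1.
loading_ols = ols(np.column_stack([ones, y1]), y2)[1]
print(round(loading_2sls, 3), round(loading_ols, 3))
```

Under these assumptions the 2SLS estimate should lie close to the true loading of .70, whereas the naive regression is pulled towards .70 / (1 + .64) ≈ .43 because the error in `y1` is part of the disturbance.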
The 2SLS/IV estimator yields consistent estimates of the factor loadings when the IVs meet certain requirements. The IVs must be (a) correlated with the scaling indicator, (b) unrelated to the errors in the composite disturbance term, and (c) sufficient in number so that there are at least as many IVs as there are scaling indicators (cf. Bollen, 1996; Bollen & Maydeu-Olivares, 2007).² If the CFA model is identified, the potential IVs are a subset of the other indicators, and the structure of the CFA model can be used to determine which of these indicators satisfy conditions (a) and (b). The IVs are thus model implied, and it is this feature that differentiates Bollen's 2SLS/IV approach from the typical usage of IVs in econometric contexts where the IVs are taken from outside of the model. To indicate this conceptual difference, the 2SLS/IV approach is called model-implied instrumental variables (MIIV) in more recent publications (e.g. Bollen & Bauer, 2004).
Recently, Bollen and Maydeu-Olivares (2007) generalized the MIIV estimator to categorical variables. As in the case of continuous indicators, factor loadings are first computed using IVs in a two-stage regression, whereby these computations draw on the tetrachoric correlations among the indicators. Once the factor loadings have been obtained, the variances of the factors and the covariances among them are computed. To this end, the factor loadings are entered into equation (3), and the ULS variant of F is used to estimate the remaining model parameters. Finally, Bollen and Maydeu-Olivares (2007) also provided formulae to compute standard errors for the factor loadings and for the variance and covariance model parameters.
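Because the loading step needs only correlations, it can be written in moment form. The following sketch assumes NumPy and uses the population correlation matrix of a hypothetical one-factor model in place of estimated tetrachoric correlations; the function name `miiv_loading` and the loading values are illustrative, and the formula is the generic 2SLS estimator expressed through correlations rather than a reproduction of the authors' implementation.

```python
import numpy as np

def miiv_loading(R, scaling, target, instruments):
    """2SLS coefficient of `target` on `scaling`, computed from a correlation matrix R."""
    S_zz = R[np.ix_(instruments, instruments)]  # correlations among the IVs
    s_zx = R[instruments, scaling]              # IVs with the scaling indicator
    s_zy = R[instruments, target]               # IVs with the target indicator
    proj = np.linalg.solve(S_zz, s_zx)          # S_zz^{-1} s_zx
    return float(proj @ s_zy / (proj @ s_zx))

# Hypothetical standardized loadings of a one-factor model.
lam = np.array([0.8, 0.6, 0.7, 0.5])
R = np.outer(lam, lam)
np.fill_diagonal(R, 1.0)

# Item 0 scales the factor; the remaining items act as instruments for each other.
for target, ivs in [(1, [2, 3]), (2, [1, 3]), (3, [1, 2])]:
    print(target, round(miiv_loading(R, 0, target, ivs), 4))
```

With the scaling indicator's loading fixed to 1, each coefficient equals the ratio of the target loading to the scaling indicator's standardized loading, for example .60 / .80 = .75 for the first target item.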
2.3. The current simulation
To date, only Bollen and Maydeu-Olivares (2007) have conducted a simulation study investigating the performance of PIV relative to ULS. In their study, they used a two-factor model with high factor loadings (λ = .80), and the model was either correctly or incorrectly specified. They found that PIV provided parameter estimates that were as accurate as estimates obtained using ULS when the model was correctly specified. When the CFA model was incorrectly specified (the cross-loading of an item was set to zero), PIV produced more accurate parameter estimates than ULS. The aim of the present research was to replicate and extend these findings. Specifically, we examined the performance of PIV under different settings of factor model size, factor loading magnitude (i.e. indicator reliability), sample size, and non-normality of the latent distributions (see Table 1 for an overview of design factors).³ Furthermore, we investigated the effect of a structural misspecification. Finally, we compared the performance of PIV not only to ULS but also to DWLS, as the latter is also widely used in estimating dichotomous CFA models.
Table 1. Factors investigated in the current simulation

| Factor | Levels |
|---|---|
| Model size | 2 levels: two-factor model or three-factor model |
| Factor loading | 3 levels: λ = .40, .55, .70 |
| Sample size | 4 levels: N = 100, 250, 500, 1,000 |
| Latent distributions | 3 levels: normal, moderately or extremely non-normal |
| Misspecification | 2 levels: no or yes |
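The fully crossed design in Table 1 can be enumerated in a few lines, which also confirms the 144 conditions reported in the abstract. The dictionary keys below are hypothetical labels; the levels are those listed in the table.

```python
from itertools import product

design = {
    "model_size": ["two-factor", "three-factor"],
    "loading": [0.40, 0.55, 0.70],
    "sample_size": [100, 250, 500, 1000],
    "latent_distribution": ["normal", "moderately non-normal", "extremely non-normal"],
    "misspecified": [False, True],
}
# One dictionary per cell of the fully crossed design.
conditions = [dict(zip(design, levels)) for levels in product(*design.values())]
print(len(conditions))  # 2 * 3 * 4 * 3 * 2 = 144
```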
5. Discussion
The present simulation study was conducted to investigate the performance of the polychoric instrumental variable estimator in the estimation of a dichotomous confirmatory factor analysis model. Specifically, we examined how (a) the factor model size, (b) the magnitude of the true factor loading, (c) the non-normality of the latent distributions, and (d) the size of the sample affected the performance of PIV. Also, its robustness to a model misspecification was tested. Finally, we compared the performance of PIV to two well-established system-wide estimators, namely, ULS and DWLS.
The results showed, first, that PIV (like ULS and DWLS) provided accurate parameter estimates in most conditions when the model was correctly specified. In some of these conditions, for instance, when the true factor loading was small or moderate, PIV even provided more accurate estimates than ULS and DWLS. Furthermore, we found that the magnitude of the true factor loading had an impact on PIV factor loading estimates and PIV covariance estimates. Non-normality of the latent distributions, by contrast, only affected PIV covariance estimates. Importantly, both of these effects also arose for ULS and DWLS. Thus, overall, PIV produced estimates that were as accurate as ULS and DWLS, and it even outperformed ULS and DWLS in estimation accuracy in some conditions.
Second, when the model was misspecified, the quality of the parameter estimates for the correctly specified factor loadings was the same for all three estimators. For the falsely specified factor loadings, however, PIV was more robust to the structural misspecification than ULS and DWLS. This result is to be expected given that ULS and DWLS use all the available information to estimate the model parameters, including, for instance, not only the assignment of an indicator to a latent factor but also whether the latent factors are correlated. PIV, by contrast, employs only instrumental variables (IVs) – the other indicators – to compute the factor loadings. Also, given that the magnitude of an IV estimate depends on the strength of the relation between the IV and the predictor, the bias of the falsely specified factor loading is expected to increase with the correlations between the falsely assigned indicator and the other indicators used to compute the factor loading estimate. Note that this not only explains why the bias in parameter estimates increased with the magnitude of the true factor loading, but also suggests that PIV will perform even better the closer the correlation between the latent factors is to zero. Finally, although PIV provided more accurate covariance estimates than ULS and DWLS, these were nevertheless unacceptable in almost all conditions. Again, this result is to be expected as the PIV estimator is most probably robust for the factor loadings but need not be robust for the covariance parameter (see Bollen & Maydeu-Olivares, 2007, for more information concerning the conditions on the robustness of variance–covariance parameters).
The present findings are thus consistent with earlier results showing that PIV as well as MIIV are more robust against structural misspecifications than system-wide estimators. Although these results are important, they may be of limited value in applied contexts, as it may be more crucial to locate the structural misspecification in applied work (Saris, Satorra, & van der Veld, 2009). We believe, however, that the MIIV approach is suitable even for this question. Specifically, recent evidence suggests that the over-identification tests (see footnote 2) used to test the appropriateness of IVs are a good means for diagnosing the source of a misspecification in the case of structural equation models with continuous indicators (Kirby & Bollen, 2009). The idea behind this approach is that for MIIV, the IVs are unambiguously determined by the hypothesized model structure. When an over-identification test concerning a set of IVs fails, this implies that the model structure that suggested these IVs must be misspecified. We think that it is an interesting task for future research to investigate whether this approach is also suitable for SEMs with ordinal indicators.
Third, concerning standard error bias, the present simulation found that ULS and DWLS provided more accurate estimates than PIV when the model was correctly specified. This was, however, mainly due to biased standard errors in cases of a small true factor loading. A potential explanation is that the correlations between the two indicators that served as IVs and the indicator for which the factor loading had to be estimated were too low. Weak instruments of this kind have previously been found to affect the asymptotic behaviour of IV estimators (see Baum et al., 2003; Bound, Jaeger, & Baker, 1995).
Finally, the test statistics for all three estimators performed well in the case of the misspecified factor model, although all three tended to under-reject the wrong factor model when the true factor loading was small and the size of the sample was smaller than N = 1,000.
PIV also tended to under-reject models when the model was correctly specified. Here, better results emerged for the other two estimation methods.
In short, then, the present work shows that the PIV estimator is an interesting alternative method for the estimation of the parameters of a dichotomous confirmatory factor analysis. When the factor model was correctly specified, PIV yielded results similar to ULS and DWLS in most cases, and it was more robust to structural misspecifications than ULS and DWLS. However, the present simulation is just a first step in investigating the performance of the PIV estimator. It would be a worthwhile task for future research to examine PIV in other contexts, such as full structural equation models or models with exogenous observed covariates, to conclusively answer whether it is a true alternative to the more common system-wide estimators.