Let *r*_{1} and *r*_{2} be two dependent estimates of Pearson's correlation. There is a substantial literature on testing *H*_{0} : ρ_{1} = ρ_{2}, the hypothesis that the population correlation coefficients are equal. However, it is well known that Pearson's correlation is not robust. Even a single outlier can have a substantial impact on Pearson's correlation, resulting in a misleading picture of the strength of the association among the bulk of the points. A way of mitigating this concern is to use one of the many correlation coefficients that have been proposed to guard against outliers. But apparently there are no results on how to compare dependent robust correlation coefficients when there is heteroscedasticity. Extant results suggest that a basic percentile bootstrap will perform reasonably well. This paper reports simulation results indicating the extent to which this is true when using Spearman's rho, a Winsorized correlation or a skipped correlation.
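The basic percentile bootstrap the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the overlapping case in which the two dependent correlations share a common variable *x*, resamples rows jointly to preserve the dependence, and uses Spearman's rho with no tie correction. All function names here are ours.

```python
import random

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (ties broken by order)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank + 1)
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) *
           sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def boot_diff_ci(x, y1, y2, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for rho(x, y1) - rho(x, y2).
    Rows are resampled jointly, preserving the dependence between the two
    correlations (both share x). H0: rho1 = rho2 is rejected at level alpha
    if 0 lies outside the returned interval."""
    rng = random.Random(seed)
    n = len(x)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [x[i] for i in idx]
        diffs.append(spearman(xb, [y1[i] for i in idx]) -
                     spearman(xb, [y2[i] for i in idx]))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

The same resampling scheme applies unchanged if `spearman` is swapped for a Winsorized or skipped correlation, which is what makes the percentile bootstrap attractive for this family of problems.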

Maximum likelihood estimation of the linear factor model for continuous items assumes normally distributed item scores. We consider deviations from normality by means of a skew-normally distributed factor model or a quadratic factor model. We show that the item distributions under a skew-normal factor model are equivalent to those under a quadratic model up to third-order moments. The reverse only holds if the quadratic loadings are equal to each other and within certain bounds. We illustrate that observed data which follow any skew-normal factor model can be approximated so well with the quadratic factor model that the two models are empirically indistinguishable, and that the reverse does not hold in general. The choice between the two models to account for deviations from normality is illustrated by an empirical example from clinical psychology.
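The two competing measurement models can be written out as follows; the notation here is a conventional sketch, not taken from the paper itself:

```latex
% Skew-normal factor model for item j: linear loadings, but a
% skew-normally distributed common factor
x_j = \nu_j + \lambda_j \xi + \varepsilon_j,
  \qquad \xi \sim \mathrm{SN}(0, 1, \alpha)

% Quadratic factor model: a normally distributed factor entering
% with an additional squared term
x_j = \nu_j + \lambda_j \xi + \gamma_j \xi^2 + \varepsilon_j,
  \qquad \xi \sim \mathcal{N}(0, 1)
```

In this notation the "quadratic loadings" whose equality the abstract mentions are the γ_{j}, and the third-order moment equivalence concerns the item skewness each model induces.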

Psychological tests are usually analysed with item response models. Recently, some alternative measurement models have been proposed that were derived from cognitive process models developed in experimental psychology. These models consider the responses but also the response times of the test takers. Two such models are the Q-diffusion model and the D-diffusion model. Both models can be calibrated with the diffIRT package of the R statistical environment via marginal maximum likelihood (MML) estimation. In this manuscript, an alternative approach to model calibration is proposed. The approach is based on weighted least squares estimation and parallels the standard estimation approach in structural equation modelling. Estimates are determined by minimizing the discrepancy between the observed and the implied covariance matrix. The estimator is simple to implement, consistent, and asymptotically normally distributed. Least squares estimation also provides a test of model fit by comparing the observed and implied covariance matrix. The estimator and the test of model fit are evaluated in a simulation study. Although parameter recovery is good, the estimator is less efficient than the MML estimator.
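The core of the least squares approach, minimizing the discrepancy between the observed and implied covariance matrices, can be illustrated generically. This sketch is not the diffusion-model parameterization of the paper; as a stand-in it fits a one-factor covariance structure with equal loadings by unweighted least squares, and a coarse grid search replaces a proper optimizer:

```python
from itertools import product

def implied_cov(lam, psi, p):
    """Implied covariance of a one-factor model with a common loading lam
    and common unique variance psi: lam^2 off-diagonal, lam^2 + psi on it."""
    return [[lam * lam + (psi if i == j else 0.0) for j in range(p)]
            for i in range(p)]

def uls_discrepancy(S, Sigma):
    """Unweighted least squares: sum of squared element-wise residuals
    between observed (S) and implied (Sigma) covariance matrices."""
    p = len(S)
    return sum((S[i][j] - Sigma[i][j]) ** 2
               for i in range(p) for j in range(p))

def fit_grid(S, grid):
    """Minimize the discrepancy over a coarse parameter grid
    (a stand-in for a real optimizer)."""
    best = None
    for lam, psi in product(grid, grid):
        f = uls_discrepancy(S, implied_cov(lam, psi, len(S)))
        if best is None or f < best[0]:
            best = (f, lam, psi)
    return best  # (discrepancy, lam, psi)
```

The minimized discrepancy value is also the quantity that, suitably scaled, yields the test of model fit the abstract mentions.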

Cognitive diagnosis models have received much attention because they look closely at the many particular skills examinees use to answer items; when the objective is to hasten learning, they are arguably preferable to item response models, which ordinarily involve just one or a few broadly defined skills. If these fine-grained skills can be identified, a sharpened focus on learning and remediation can be achieved. The focus here is on how to detect when learning has taken place for a particular attribute and efficiently guide a student through a sequence of items to ultimately attain mastery of all attributes while administering as few items as possible. This can be seen as a problem in sequential change-point detection, for which there is a long history and a well-developed literature. Though some *ad hoc* rules for determining learning may be used, such as stopping after *M* consecutive items have been successfully answered, more efficient methods that are optimal under various conditions are available. The CUSUM, Shiryaev–Roberts and Shiryaev procedures can dramatically reduce the time required to detect learning while maintaining rigorous Type I error control, and they are studied in this context through simulation. Future directions for modelling and detection of learning are discussed.
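A minimal version of the CUSUM procedure in this setting can be sketched as follows. This is an illustration only: it assumes Bernoulli item responses whose success probability jumps from a non-mastery value `p0` to a mastery value `p1` once learning occurs, and the particular values of `p0`, `p1` and the threshold `h` are ours, not from the paper:

```python
import math

def cusum_detect(responses, p0=0.4, p1=0.8, h=3.0):
    """One-sided Bernoulli CUSUM. Accumulates the log-likelihood ratio of
    mastery (success probability p1) against non-mastery (p0), resetting
    at zero, and signals learning once the statistic crosses h."""
    llr_success = math.log(p1 / p0)
    llr_failure = math.log((1 - p1) / (1 - p0))
    s = 0.0
    for t, correct in enumerate(responses, start=1):
        s = max(0.0, s + (llr_success if correct else llr_failure))
        if s >= h:
            return t  # item index at which learning is declared
    return None  # no detection within the administered items
```

The threshold `h` plays the role of the Type I error control: raising it lowers the false-alarm rate at the price of a longer detection delay, which is exactly the trade-off the simulation study examines.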

Researchers often want to demonstrate a lack of interaction between two categorical predictors on an outcome. To justify a lack of interaction, researchers typically accept the null hypothesis of no interaction from a conventional analysis of variance (ANOVA). This method is inappropriate as failure to reject the null hypothesis does not provide statistical evidence to support a lack of interaction. This study proposes a bootstrap-based intersection–union test for negligible interaction that provides coherent decisions between the omnibus test and *post hoc* interaction contrast tests and is robust to violations of the normality and variance homogeneity assumptions. Further, a multiple comparison strategy for testing interaction contrasts following a non-significant omnibus test is proposed. Our simulation study compared the Type I error control, omnibus power and per-contrast power of the proposed approach to the non-centrality-based negligible interaction test of Cheng and Shao (2007, *Statistica Sinica*, **17**, 1441). For 2 × 2 designs, the empirical Type I error rates of the Cheng and Shao test were very close to the nominal α level when the normality and variance homogeneity assumptions were satisfied; however, only our proposed bootstrapping approach was satisfactory under non-normality and/or variance heterogeneity. In general *a* × *b* designs, although the omnibus Cheng and Shao test, as expected, is the most powerful, it is not robust to assumption violation and results in incoherent omnibus and interaction contrast decisions that are not possible with the intersection–union approach.
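The logic of a bootstrap negligible-interaction test can be sketched for the simplest 2 × 2 case, where there is a single interaction contrast. This is our illustration, not the paper's procedure: the interaction is declared negligible only if the whole percentile bootstrap confidence interval of the contrast falls inside a user-chosen equivalence band (−δ, δ); in an *a* × *b* design the intersection–union principle requires every contrast to pass this check.

```python
import random

def interaction_contrast(cells):
    """Interaction contrast for a 2x2 design:
    (m11 - m12) - (m21 - m22), where cells[i][j] holds the scores
    observed in cell (i, j)."""
    m = [[sum(c) / len(c) for c in row] for row in cells]
    return (m[0][0] - m[0][1]) - (m[1][0] - m[1][1])

def negligible_interaction(cells, delta, n_boot=2000, alpha=0.05, seed=1):
    """Percentile-bootstrap equivalence test: negligible interaction is
    declared if the (1 - 2*alpha) bootstrap CI of the contrast lies
    entirely inside (-delta, delta). delta is a user-chosen bound."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        resampled = [[[rng.choice(c) for _ in c] for c in row]
                     for row in cells]
        stats.append(interaction_contrast(resampled))
    stats.sort()
    lo = stats[int(alpha * n_boot)]
    hi = stats[int((1 - alpha) * n_boot) - 1]
    return -delta < lo and hi < delta
```

Resampling within cells, rather than relying on a pooled error term, is what buys robustness to variance heterogeneity in this style of test.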

Snijders (2001, *Psychometrika*, **66**, 331) suggested a statistical adjustment to obtain the asymptotically correct standardized versions of a specific class of person-fit statistics. His adjustment has been used to obtain the asymptotically correct standardized versions of several person-fit statistics including the *l*_{z} statistic (Drasgow *et al*., 1985, *Br. J. Math. Stat. Psychol*., **38**, 67), the infit and outfit statistics (e.g., Wright & Masters, 1982, *Rating scale analysis*, Chicago, IL: Mesa Press), and the standardized extended caution indices (Tatsuoka, 1984, *Psychometrika*, **49**, 95). Snijders (2001), van Krimpen-Stoop and Meijer (1999, *Appl. Psychol. Meas*., **23**, 327), Magis *et al*. (2012, *J. Educ. Behav. Stat*., **37**, 57), Magis *et al*. (2014, *J. Appl. Meas*., **15**, 82), and Sinharay (2015b, *Psychometrika*, doi:10.1007/s11336-015-9465-x, 2016b, *Corrections of standardized extended caution indices*, Unpublished manuscript) have used the maximum likelihood estimate, the weighted likelihood estimate, and the posterior mode of the examinee ability with the adjustment of Snijders (2001). This paper broadens the applicability of the adjustment of Snijders (2001) by showing how other ability estimates such as the expected a posteriori estimate, the biweight estimate (Mislevy & Bock, 1982, *Educ. Psychol. Meas*., **42**, 725), and the Huber estimate (Schuster & Yuan, 2011, *J. Educ. Behav. Stat*., **36**, 720) can be used with the adjustment. A simulation study is performed to examine the Type I error rate and power of two asymptotically correct standardized person-fit statistics with several ability estimates. A real data illustration follows.
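For orientation, the classic *l*_{z} statistic that this class of adjustments targets can be computed as below. This sketch shows only the *unadjusted* statistic under a 2PL model; the Snijders (2001) correction of the mean and variance for the fact that θ is estimated is deliberately omitted, and the parameterization is a standard one assumed by us:

```python
import math

def p_2pl(theta, a, b):
    """2PL response probability (standard parameterization, assumed)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def lz(responses, theta, items):
    """Classic (unadjusted) standardized person-fit statistic l_z:
    the standardized log-likelihood of the response pattern.
    items is a list of (a, b) pairs; theta is the ability estimate."""
    l0 = e = v = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_2pl(theta, a, b)
        l0 += u * math.log(p) + (1 - u) * math.log(1 - p)   # observed
        e += p * math.log(p) + (1 - p) * math.log(1 - p)    # expectation
        v += p * (1 - p) * math.log(p / (1 - p)) ** 2       # variance
    return (l0 - e) / math.sqrt(v)
```

Large negative values of *l*_{z} flag aberrant response patterns; the adjustment discussed in the abstract makes the null distribution of such statistics asymptotically standard normal when θ is replaced by an estimate.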

The maximum cardinality subset selection problem requires finding the largest possible subset from a set of objects, such that one or more conditions are satisfied. An important extension of this problem is to extract multiple subsets, where the addition of one more object to a larger subset would always be preferred to increases in the size of one or more smaller subsets. We refer to this as the multiple subset maximum cardinality selection problem (MSMCSP). A recently published branch-and-bound algorithm solves the MSMCSP as a partitioning problem. Unfortunately, the computational requirement associated with the algorithm is often enormous, thus rendering the method infeasible from a practical standpoint. In this paper, we present an alternative approach that successively solves a series of binary integer linear programs to obtain a globally optimal solution to the MSMCSP. Computational comparisons of the methods using published similarity data for 45 food items reveal that the proposed sequential method is computationally far more efficient than the branch-and-bound approach.
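The sequential strategy can be illustrated on a toy instance. This is not the paper's binary integer linear programming formulation: as a stand-in for each ILP solve, the sketch below finds the largest subset whose pairwise similarities all reach a threshold τ by exhaustive enumeration, which is exact but only practical for small item sets, then removes that subset and repeats:

```python
from itertools import combinations

def max_card_subset(items, sim, tau):
    """Largest subset of items whose pairwise similarities all reach tau.
    Exhaustive search stands in for one binary ILP solve; search from the
    largest size downward so the first feasible subset found is maximal."""
    for size in range(len(items), 0, -1):
        for cand in combinations(items, size):
            if all(sim[a][b] >= tau for a, b in combinations(cand, 2)):
                return list(cand)
    return []

def sequential_subsets(items, sim, tau):
    """Extract subsets one at a time, largest first, mirroring the
    successive-solve strategy described in the abstract."""
    remaining, subsets = list(items), []
    while remaining:
        s = max_card_subset(remaining, sim, tau)
        if not s:
            break
        subsets.append(s)
        remaining = [i for i in remaining if i not in s]
    return subsets
```

Extracting the largest feasible subset first respects the MSMCSP preference order: one more object in a larger subset is always worth more than growth of any smaller subset.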
