Applications of the Analytically Derived Asymptotic Standard Errors of Item Response Theory Item Parameter Estimates
Abstract
The analytically derived asymptotic standard errors (SEs) of maximum likelihood (ML) item estimates can be approximated by a mathematical function that does not require examinees’ responses to test items, whereas the empirically determined SEs of marginal maximum likelihood estimation (MMLE)/Bayesian item estimates are obtained by repeatedly estimating the same set of items from simulated (or resampled) test data. The empirical method yields increasingly stable and accurate SE estimates as the number of replications increases, but it requires cumbersome and time‐consuming calculations. This study examined whether the analytical method can adequately replace the empirical method in predicting the SEs of item parameter estimates by comparing the results produced by both approaches. The results indicated that the SEs yielded by the two approaches were, in most cases, very similar, especially when they were applied to a generalized partial credit model. This finding encourages test practitioners and researchers to apply the analytically derived asymptotic SEs of item estimates in item‐linking studies, as well as in quantifying the SEs of equated scores for the item response theory (IRT) true‐score method. Three‐dimensional graphical presentations of the analytical SEs of item estimates, as a bivariate function of item difficulty and item discrimination, were also provided for a better understanding of several frequently used IRT models.
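To illustrate how asymptotic SEs can be computed from item parameters alone, without any response data, the following is a minimal sketch for a two‐parameter logistic (2PL) item. It is not the article's own derivation: it assumes abilities are treated as known and drawn from a standard normal distribution, uses a logistic metric with no 1.7 scaling constant, and approximates the expected Fisher information by Gauss-Hermite quadrature. The function name `analytic_se_2pl` and its arguments are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def analytic_se_2pl(a, b, n_examinees, n_nodes=61):
    """Asymptotic SEs of (a, b) for one 2PL item, assuming theta ~ N(0, 1).

    Assumptions (not taken from the article): abilities are known,
    P(theta) = 1 / (1 + exp(-a * (theta - b))), no 1.7 scaling constant.
    """
    # Quadrature nodes/weights for the weight function exp(-x^2 / 2);
    # normalising the weights turns them into N(0, 1) probabilities.
    nodes, weights = hermegauss(n_nodes)
    weights = weights / weights.sum()

    p = 1.0 / (1.0 + np.exp(-a * (nodes - b)))
    pq = p * (1.0 - p)

    # Gradient of the logit a * (theta - b) with respect to (a, b).
    grad = np.vstack([nodes - b, np.full_like(nodes, -a)])

    # Expected Fisher information for n_examinees independent responses.
    info = n_examinees * (grad * (weights * pq)) @ grad.T

    # Asymptotic covariance is the inverse information; SEs are root diagonals.
    cov = np.linalg.inv(info)
    return np.sqrt(np.diag(cov))


if __name__ == "__main__":
    se_a, se_b = analytic_se_2pl(a=1.0, b=0.0, n_examinees=1000)
    print(f"SE(a) = {se_a:.4f}, SE(b) = {se_b:.4f}")
```

Evaluating such a function over a grid of `a` and `b` values gives the kind of SE surface that the three‐dimensional plots describe, under the assumptions stated above.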