Volume 41, Issue 2

Applications of the Analytically Derived Asymptotic Standard Errors of Item Response Theory Item Parameter Estimates

Yuan H. Li

YUAN H. LI is a statistical specialist at Prince George's County Public Schools, Department of Testing, Research and Evaluation, Room 202 E, Upper Marlboro, MD 20772; jeffli@pgcps.org. His primary research interests include IRT‐related studies (e.g., scaling, test equating, and computerized adaptive testing), as well as the use of a zero‐one linear programming approach to create matched samples as control groups for quasi‐experimental designs.

Robert W. Lissitz

ROBERT W. LISSITZ is a professor at the University of Maryland, Department of Measurement, Statistics, & Evaluation, Benjamin Building, Room 1229, College Park, MD 20742‐1115; RLissitz@umd.edu.

First published: 15 June 2006
Citations: 11

Abstract

The analytically derived asymptotic standard errors (SEs) of maximum likelihood (ML) item estimates can be approximated by a mathematical function without examinees' responses to test items, whereas the empirically determined SEs of marginal maximum likelihood estimation (MMLE)/Bayesian item estimates can be obtained when the same set of items is repeatedly estimated from simulated (or resampled) test data. The latter method yields rather stable and accurate SE estimates as the number of replications increases, but requires cumbersome and time‐consuming calculations. Instead of using the empirically determined method, the adequacy of the analytically based method in predicting the SEs of item parameter estimates was examined by comparing the results produced by the two approaches. The results indicated that the SEs yielded by the two approaches were, in most cases, very similar, especially when applied to a generalized partial credit model. This finding encourages test practitioners and researchers to apply the analytically derived asymptotic SEs of item estimates in item‐linking studies, as well as in quantifying the SEs of equated scores for the item response theory (IRT) true‐score method. A three‐dimensional graphical presentation of the analytical SEs of item estimates as a bivariate function of item difficulty and item discrimination is also provided to aid understanding of several frequently used IRT models.
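The abstract contrasts the analytical route (a closed‐form function of the item parameters and the assumed ability distribution, requiring no response data) with the empirical route (repeated estimation over simulated data sets). As an illustrative sketch of the analytical idea only, not the article's actual derivation, the asymptotic SEs for a two‐parameter logistic (2PL) item can be obtained by integrating the expected Fisher information over an assumed N(0, 1) ability distribution on a quadrature grid and inverting it; the function name, grid, and parameter values below are all assumptions for illustration.

```python
import numpy as np

def analytical_se_2pl(a, b, n_examinees, n_quad=81):
    """Asymptotic SEs of (a, b) for one 2PL item.

    The expected Fisher information is accumulated over a discrete
    N(0, 1) ability grid, scaled by sample size, and inverted; the
    square roots of the diagonal of the inverse are the SEs.
    """
    theta = np.linspace(-6.0, 6.0, n_quad)
    w = np.exp(-0.5 * theta**2)
    w /= w.sum()                          # normalized N(0, 1) weights

    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))   # P(correct | theta)
    q = 1.0 - p

    # Gradient of P(theta) with respect to (a, b); one row per grid point
    grad = np.column_stack(((theta - b) * p * q, -a * p * q))

    # Expected information: N * sum_k w_k * g_k g_k' / (P_k Q_k)
    scale = w / (p * q)
    info = n_examinees * (grad.T * scale) @ grad

    cov = np.linalg.inv(info)             # asymptotic covariance of (a, b)
    return np.sqrt(np.diag(cov))          # (SE_a, SE_b)


# SEs shrink by a factor of 1/sqrt(N) as the sample size grows
se_1000 = analytical_se_2pl(a=1.0, b=0.0, n_examinees=1000)
se_4000 = analytical_se_2pl(a=1.0, b=0.0, n_examinees=4000)
```

No examinee responses enter the computation, which is the practical appeal the abstract notes: the SEs are a deterministic function of the item parameters, the sample size, and the assumed ability distribution.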

Number of times cited according to CrossRef: 11

  • irtplay: An R Package for Online Item Calibration, Scoring, Evaluation of Model Fit, and Useful Functions for Unidimensional IRT, Applied Psychological Measurement, 10.1177/0146621620921247, (2020).
  • A full Bayesian implementation of a generalized partial credit model with an application to an international disability survey, Journal of the Royal Statistical Society: Series C (Applied Statistics), 10.1111/rssc.12385, 69, 1, (131-150), (2019).
  • Standard Errors of IRT Parameter Scale Transformation Coefficients: Comparison of Bootstrap Method, Delta Method, and Multiple Imputation Method, Journal of Educational Measurement, 10.1111/jedm.12210, 56, 2, (302-330), (2019).
  • Efficient Standard Errors in Item Response Theory Models for Short Tests, Educational and Psychological Measurement, 10.1177/0013164419882072, (2019).
  • Item Response Theory Observed-Score Kernel Equating, Psychometrika, 10.1007/s11336-016-9528-7, 82, 1, (48-66), (2016).
  • "Guessing" Parameter Estimates for Multidimensional Item Response Theory Models, Educational and Psychological Measurement, 10.1177/0013164406294778, 67, 3, (433-446), (2016).
  • Information Matrices and Standard Errors for MLEs of Item Parameters in IRT, Psychometrika, 10.1007/s11336-013-9334-4, 79, 2, (232-254), (2013).
  • Computerized Adaptive Testing: The Capitalization on Chance Problem, The Spanish journal of psychology, 10.5209/rev_SJOP.2012.v15.n1.37348, 15, 1, (424-441), (2013).
  • Uncertainties in the Item Parameter Estimates and Robust Automated Test Assembly, Applied Psychological Measurement, 10.1177/0146621612469825, 37, 2, (123-139), (2012).
  • The Item Parameter Replication Method for Detecting Differential Functioning in the Polytomous DFIT Framework, Applied Psychological Measurement, 10.1177/0146621608319514, 33, 2, (133-147), (2008).
  • A Comparison of Using the Fixed Common-Precalibrated Parameter Method and the Matched Characteristic Curve Method for Linking Multiple-Test Items, International Journal of Testing, 10.1207/s15327574ijt0403_5, 4, 3, (267-293), (2004).
