A Comparative Study of On-line Pretest Item-Calibration/Scaling Methods in Computerized Adaptive Testing
Abstract
The purpose of this study was to compare and evaluate five on-line pretest item-calibration/scaling methods in computerized adaptive testing (CAT): marginal maximum likelihood estimate with one EM cycle (OEM), marginal maximum likelihood estimate with multiple EM cycles (MEM), Stocking's Method A, Stocking's Method B, and BILOG/Prior. The five methods were evaluated in terms of item-parameter recovery, using three different sample sizes (300, 1,000, and 3,000). The MEM method appeared to be the best choice among these, because it produced the smallest parameter-estimation errors under all sample-size conditions. MEM and OEM are mathematically similar, but OEM produced larger errors, so MEM was preferable to OEM unless the time required for iterative computation is a concern. Stocking's Method B also worked very well, but it required anchor items, which would either increase test length or require larger sample sizes, depending on the test administration design. Until more appropriate ways of handling sparse data are devised, the BILOG/Prior method may not be a reasonable choice for small sample sizes. Stocking's Method A had the largest weighted total error, as well as a theoretical weakness (i.e., treating estimated ability as true ability); thus, there appeared to be little reason to use it.
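The theoretical weakness attributed to Stocking's Method A (treating estimated ability as true ability) can be illustrated with a small simulation. The sketch below is not the study's procedure. It assumes a two-parameter logistic model without a guessing parameter, simulated abilities, and ability estimates formed by adding independent normal error to the true abilities. The function names, sample size, error size, and item parameters are all hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical "true" pretest item parameters (2PL, logistic metric).
A_TRUE, B_TRUE = 1.2, 0.3


def p_2pl(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fit_item_fixed_theta(thetas, responses, n_iter=50, tol=1e-8):
    """ML estimate of (a, b) for one item with abilities held fixed.

    Uses Newton-Raphson on the slope-intercept form
    logit = c0 + c1 * theta, then converts to a = c1, b = -c0 / c1.
    """
    c0, c1 = 0.0, 1.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for t, u in zip(thetas, responses):
            p = 1.0 / (1.0 + math.exp(-(c0 + c1 * t)))
            w = p * (1.0 - p)
            g0 += u - p            # gradient w.r.t. intercept
            g1 += (u - p) * t      # gradient w.r.t. slope
            h00 += w               # entries of the information matrix
            h01 += w * t
            h11 += w * t * t
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        c0 += d0
        c1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return c1, -c0 / c1


random.seed(0)
N = 5000  # hypothetical sample size, larger than any in the study

# Simulated true abilities and responses to the pretest item.
true_thetas = [random.gauss(0.0, 1.0) for _ in range(N)]
responses = [1 if random.random() < p_2pl(t, A_TRUE, B_TRUE) else 0
             for t in true_thetas]

# Assumed error model: estimate = true ability + independent N(0, 0.5^2) noise.
noisy_thetas = [t + random.gauss(0.0, 0.5) for t in true_thetas]

a_clean, b_clean = fit_item_fixed_theta(true_thetas, responses)
a_noisy, b_noisy = fit_item_fixed_theta(noisy_thetas, responses)

print("fixed at true abilities:  a=%.3f b=%.3f" % (a_clean, b_clean))
print("fixed at noisy estimates: a=%.3f b=%.3f" % (a_noisy, b_noisy))
```

Under these assumptions, the discrimination estimated from the noisy abilities tends to be attenuated relative to the one obtained with the true abilities. How large such an effect is in an operational CAT depends on the ability estimator and the test design, neither of which this sketch models.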