2021, Vol. 44, Issue (1): 205-213.
孙小坚1,2,毛秀珍3,宋乃庆1,辛涛4
Abstract: Cognitive diagnostic assessment (CDA) has recently attracted increasing attention in educational and psychological measurement. CDA can provide detailed information about individuals' strengths and weaknesses in a specific content domain. Meanwhile, computerized adaptive testing (CAT) can measure individuals' latent skills as accurately as, or even more accurately than, traditional paper-and-pencil (P&P) tests. To take advantage of both, the two technologies have been combined to form cognitive diagnostic computerized adaptive testing (CD-CAT). The major concern of CD-CAT is measurement accuracy, so a number of item selection methods have been developed to achieve this goal. Notably, these methods mainly focus on maximizing measurement accuracy without regard for item exposure control; as a consequence, item bank usage is highly uneven. To balance measurement accuracy and item exposure control, several new item selection methods have been proposed, for instance the restrictive progressive (RP), restrictive threshold (RT), stratified multistage (SM), stratified multistage & item eligibility (SMIE), randomization halving algorithm (RHA), and stratified dynamic binary searching (SDBS) methods. However, the model used in these studies was either the deterministic input, noisy 'and' gate (DINA) model or the reduced reparameterized unified model (RRUM). Moreover, some of these methods tend to produce high item exposure rates, and some, such as the SDBS method, lack flexibility. To address these shortcomings, two new methods are proposed to balance measurement accuracy and item exposure rate. The new methods combine the RHA with the RP and RT methods and are named the halving inverse RP (HIRP) and halving inverse RT (HIRT) methods, respectively.
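For readers unfamiliar with the DINA model named above, the following is a minimal sketch of its item response function in its standard textbook form: an examinee answers correctly with probability 1 - slip when all attributes required by the item are mastered, and with probability guess otherwise. The function names and the parameter values in the usage note are illustrative assumptions, not taken from the article.

```python
from itertools import product


def dina_prob(alpha, q, slip, guess):
    """Probability of a correct response under the DINA model.

    alpha: tuple of 0/1 attribute mastery indicators for one examinee
    q:     tuple of 0/1 entries, the item's row of the Q-matrix
    slip, guess: item parameters (illustrative values, not from the article)
    """
    # eta is True only if every attribute the item requires is mastered
    eta = all(a >= k for a, k in zip(alpha, q))
    return 1.0 - slip if eta else guess


def all_patterns(n_attributes):
    """Enumerate every possible attribute mastery pattern (2**K of them)."""
    return list(product((0, 1), repeat=n_attributes))
```

With four attributes there are 16 possible mastery patterns and with six there are 64, which is the latent space an item selection method has to discriminate among. For example, `dina_prob((1, 1, 0, 0), (1, 1, 0, 0), slip=0.1, guess=0.2)` returns the non-slip probability, while an examinee missing a required attribute gets the guessing probability.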
A Monte Carlo simulation is conducted to examine the performance of HIRP, HIRT, and existing item exposure control methods. Four factors are manipulated: model type, number of attributes, test length, and item selection method. Specifically, the model type is either the DINA model or the RRUM; the number of attributes is four or six; the test length is 20 or 30 items; and 11 item selection methods (Random, PWKL, RP, RT, SM, IE, SMIE, RHA, SDBS, HIRP, and HIRT) are compared. The covariates are the size of the item bank, the item parameters of the model, the sample size, and the maximum item exposure rate. The main results show that: (1) The HIRT method yields slightly worse item exposure rates but significantly higher measurement accuracy than RP, SM, SMIE, RHA, and SDBS; meanwhile, HIRT has slightly lower measurement accuracy than PWKL, RT, and IE but a more even distribution of item exposure. (2) The measurement accuracy of HIRP is lower than that of PWKL, RP, RT, IE, and HIRT, but its item exposure rates are generally better than those of these methods; in addition, HIRP yields higher measurement accuracy and better item exposure rates than SM, SMIE, and SDBS. In conclusion, both HIRP and HIRT can, to some degree, balance the trade-off between measurement accuracy and item exposure rate.
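The factorial design described above can be sketched as follows, together with one common way of summarizing item exposure. The factor levels are those listed in the abstract. The exposure-rate function and the chi-square evenness index are generic illustrations that assume a fixed-length test; the abstract does not state which exposure indices the study actually reports, and all function names here are hypothetical.

```python
from collections import Counter
from itertools import product

# Factor levels as listed in the abstract
MODELS = ("DINA", "RRUM")
N_ATTRIBUTES = (4, 6)
TEST_LENGTHS = (20, 30)
METHODS = ("Random", "PWKL", "RP", "RT", "SM", "IE",
           "SMIE", "RHA", "SDBS", "HIRP", "HIRT")

# Fully crossed design: 2 x 2 x 2 x 11 = 88 simulation conditions
conditions = list(product(MODELS, N_ATTRIBUTES, TEST_LENGTHS, METHODS))


def exposure_rates(administered, bank_size):
    """Share of examinees who saw each item.

    administered: one list of item indices per examinee
    bank_size:    number of items in the bank
    """
    counts = Counter(item for test in administered for item in test)
    n_examinees = len(administered)
    return [counts[j] / n_examinees for j in range(bank_size)]


def chi_square_index(rates, test_length):
    """Evenness of bank usage (assumed index, not confirmed by the abstract).

    Compares each rate with the uniform rate test_length / bank_size;
    0 means perfectly even usage, larger values mean more uneven usage.
    """
    expected = test_length / len(rates)
    return sum((r - expected) ** 2 / expected for r in rates)
```

A method that always picks the most informative items concentrates exposure on a few items and drives the index up, while purely random selection keeps it near zero at the cost of accuracy. This is the trade-off the compared methods are meant to balance.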
Key words: cognitive diagnostic computerized adaptive testing, item exposure control, measurement accuracy, item exposure rate
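Abstract (Chinese-language version): Two item exposure control methods for fixed-length CD-CAT, HIRP and HIRT, are proposed. They maintain relatively high classification accuracy while keeping item exposure rates reasonable. The new methods are obtained by combining the halving method with the RP and RT methods, with appropriate adjustments. A simulation study compared them with RP, RT, SM, SMIE, RHA, and SDBS. The results show that: (1) HIRP outperforms SM, SMIE, and SDBS in both classification accuracy and item exposure rate; (2) HIRT's item exposure rate is slightly worse than that of RP, SM, SMIE, RHA, and SDBS, but its classification accuracy is higher; (3) HIRP's classification accuracy is lower than that of RT and RP, but its item exposure control is better.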
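Key words (Chinese-language version): cognitive diagnostic computerized adaptive testing, item exposure control, classification accuracy, exposure rate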
孙小坚, 毛秀珍, 宋乃庆, 辛涛. Two new item exposure control methods for fixed-length CD-CAT [J]. , 2021, 44(1): 205-213.
URL: https://jps.ecnu.edu.cn/EN/
https://jps.ecnu.edu.cn/EN/Y2021/V44/I1/205