The Choice of Examinee Parameter Estimation Methods in Cognitive Diagnostic Models*

Zhou Man, Liu Yanlou, Teng Yaru

Journal of Psychological Science, 2024, Vol. 47, Issue (1): 229-236.

DOI: 10.16719/j.cnki.1671-6981.20240127
Statistics, Measurement, and Methods


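Summary

One of the main purposes of cognitive diagnostic models (CDMs) is to obtain fine-grained diagnostic information from examinees' response data and to provide researchers with individualized guidance. Previous research compared the performance of examinee parameter estimation methods (MLE, MAP, and EAP) under the DINA model, but how to choose the most effective method in practice remains to be studied. This article examines, from a theoretical perspective, how different knowledge state distributions affect examinee parameter estimation methods, and reports a simulation study. The results show that when the attributes are correlated, EAP and MAP yield similar classification results, both better than those of MLE. Because attributes in practice generally show moderate or high correlations, we recommend choosing EAP/MAP as the examinee parameter estimation method.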

Abstract

Cognitive diagnostic models (CDMs), also referred to as diagnostic classification models (Rupp et al., 2010), are multiple discrete latent-variable models. Over the past few decades, CDMs have become popular in many fields, including psychological and educational measurement, psychiatric evaluation, and other disciplines. Arguably, one of the ultimate purposes of CDMs is to offer fine-grained, differentiated diagnostic information based on examinees' observed response data, so that teachers and clinicians can provide individualized instruction or interventions. Three examinee parameter estimation methods have been proposed to classify examinees into latent classes in CDMs: maximum likelihood estimation (MLE; Birnbaum, 1968), maximum a posteriori (MAP; Samejima, 1969), and expected a posteriori (EAP; Bock & Mislevy, 1982). Huebner and Wang (2011) investigated the performance of MLE, MAP, and EAP for classifying examinees within the DINA model framework. They found that MLE/MAP had a higher correct classification rate on all K skills. In their study, however, the item parameters and structural parameters were assumed to be known. Although that study compared the performance of MLE, MAP, and EAP, choosing the most suitable examinee parameter estimation method in CDMs remains an open problem.
In this study, we propose that the main difference between MLE, MAP, and EAP is that the last two methods take the latent knowledge state distribution into account. A simulation study was therefore conducted to investigate the impact of the latent knowledge state distribution on the classification accuracy of MLE, MAP, and EAP. Five factors were manipulated: the attribute tetrachoric correlation (0, .5, and .8), the sample size (300, 1,000, and 5,000), the number of attributes (3 and 5), the data-generating model (DINA, DINO, A-CDM, and G-DINA), and the type of Q-matrix (correctly and incorrectly specified). Four evaluation criteria were used: the pattern correct classification rate (PCCR), the attribute correct classification rate (ACCR), the classification rate for each skill (Skill k), and the average classification rate across all skills (Total). The classification results for all four criteria were averaged over the 1,000 replications. The results showed that (1) when the attribute tetrachoric correlation was zero, MLE produced the highest correct classification rate in terms of PCCR; (2) when the attribute tetrachoric correlation was moderate or high, EAP and MAP generally yielded higher classification rates than MLE; (3) the correct classification rate increased as the attribute correlation and item quality increased; (4) correct classification rates under a misspecified Q-matrix were lower than those under the true Q-matrix, and items with more attributes had lower accuracy; and (5) the DINA and DINO models yielded more accurate classification than the G-DINA and A-CDM models.
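To make the contrast between the three estimators concrete, the following is a minimal sketch for a single examinee under the DINA model. It is not the authors' code: the function names `dina_likelihood` and `classify`, the toy Q-matrix, the guessing and slipping values, and the prior are all assumptions chosen for illustration, and the 0.5 cutoff used for EAP is one common convention. MLE maximizes the likelihood alone, MAP maximizes likelihood times the knowledge state prior, and EAP thresholds the marginal posterior mastery probability of each attribute.

```python
import itertools
import numpy as np

def dina_likelihood(x, Q, guess, slip, patterns):
    """Likelihood of response vector x under every attribute pattern (DINA)."""
    # eta[c, j] is True when pattern c holds every attribute required by item j
    eta = (patterns @ Q.T) == Q.sum(axis=1)
    p_correct = np.where(eta, 1.0 - slip, guess)  # shape (patterns, items)
    return np.prod(np.where(x == 1, p_correct, 1.0 - p_correct), axis=1)

def classify(x, Q, guess, slip, prior):
    """Return the MLE, MAP, and EAP attribute classifications for one examinee."""
    K = Q.shape[1]
    patterns = np.array(list(itertools.product([0, 1], repeat=K)))
    like = dina_likelihood(x, Q, guess, slip, patterns)
    post = like * prior
    post = post / post.sum()
    mle = patterns[np.argmax(like)]             # ignores the prior
    map_ = patterns[np.argmax(post)]            # most probable whole pattern
    eap = (post @ patterns >= 0.5).astype(int)  # attribute-wise marginal posterior
    return mle, map_, eap

# Hypothetical two-item, two-attribute test; each item measures one attribute.
Q = np.eye(2, dtype=int)
guess = np.array([0.2, 0.2])
slip = np.array([0.2, 0.2])
x = np.array([1, 0])
# Assumed prior over patterns (0,0), (0,1), (1,0), (1,1): attributes positively related.
prior = np.array([0.7, 0.05, 0.05, 0.2])

mle, map_, eap = classify(x, Q, guess, slip, prior)
print("MLE:", mle, "MAP:", map_, "EAP:", eap)
```

With a uniform prior the MAP pattern coincides with the MLE pattern, so any difference between them comes from the knowledge state distribution. In this toy case the assumed prior moves both MAP and EAP away from the MLE classification.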
Overall, choosing the most appropriate knowledge state estimation method is of theoretical and practical importance. The results of this study indicate that the classification accuracy of MLE, MAP, and EAP is affected by the latent knowledge state distribution. We therefore recommend using EAP/MAP as the estimation method in practice to ensure accurate estimation.
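The pattern-level and attribute-level criteria named in the abstract can be sketched as below. This assumes the usual definitions (PCCR as the share of examinees whose entire attribute pattern is recovered, ACCR as the share classified correctly on each attribute); the article's exact formulas may differ, and the function names `pccr` and `accr` and the toy data are hypothetical.

```python
import numpy as np

def pccr(true_patterns, est_patterns):
    """Share of examinees whose whole attribute pattern is recovered."""
    return float(np.mean(np.all(true_patterns == est_patterns, axis=1)))

def accr(true_patterns, est_patterns):
    """Share of examinees classified correctly, computed per attribute."""
    return np.mean(true_patterns == est_patterns, axis=0)

# Toy data: four examinees, two attributes (rows are examinees).
true_patterns = np.array([[1, 0], [1, 1], [0, 0], [0, 1]])
est_patterns = np.array([[1, 0], [1, 0], [0, 0], [1, 1]])

print("PCCR:", pccr(true_patterns, est_patterns))  # 0.5
print("ACCR:", accr(true_patterns, est_patterns))  # [0.75 0.75]
```

In a simulation these rates would be computed per replication and then averaged, which is how the abstract describes the reported results.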

Key words

cognitive diagnostic model / knowledge state / MLE / MAP / EAP

Cite this article

Zhou Man, Liu Yanlou, Teng Yaru. The Choice of Examinee Parameter Estimation Methods in Cognitive Diagnostic Models[J]. Journal of Psychological Science. 2024, 47(1): 229-236 https://doi.org/10.16719/j.cnki.1671-6981.20240127

References

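[1] Cai, Y., Tan, H. Y., & Tu, D. B. (2015). Which test Q-matrix is more reasonable: Indices for detecting the reasonableness of test Q-matrices under the DINA model, with their comparison and application. Journal of Psychological Science, 38(5), 1239-1247.
[2] Li, L. Q., Han, X., Xin, T., & Liu, Y. L. (2019). The function and value of cognitive diagnostic assessment in personalized learning. China Examinations, 1, 40-44.
[3] Liu, Y. L., Xin, T., Li, L. Q., Tian, W., & Liu, X. X. (2016). An improved method for testing differential item functioning in cognitive diagnostic models: The Wald statistic based on the observed information matrix. Acta Psychologica Sinica, 48(5), 588-598.
[4] Liu, Y. L., Xin, T., & Tian, W. (2017). Parameter estimation in item response theory and cognitive diagnostic models: A model integration perspective. Journal of Beijing Normal University (Natural Science), 53(6), 742-748.
[5] Tu, D. B., Cai, Y., & Dai, H. Q. (2012). A Q-matrix modification method based on the DINA model. Acta Psychologica Sinica, 44(4), 558-568.
[6] Wang, D. X., Gao, X. L., Cai, Y., & Tu, D. B. (2020). Category-level Q-matrix modification for polytomously scored cognitive diagnosis: Relative fit statistics. Acta Psychologica Sinica, 52(1), 93-106.
[7] Wang, W. Y., Song, L. H., Chen, P., Ding, S. L., & Cheng, Y. (2016). Attribute-level classification consistency and classification accuracy indices for cognitive diagnostic tests. Psychological Exploration, 36(3), 264-269.
[8] Wang, X. Q., Ding, S. L., & Luo, F. (2019). The Q-matrix and its role in cognitive diagnosis. Journal of Psychological Science, 42(3), 739-746.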
[9] Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 395-479). Addison-Wesley.
[10] Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6(4), 431-444.
[11] Chen, Y. X., Liu, J. C., Xu, G. J., & Ying, Z. L. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110(510), 850-866.
[12] de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179-199.
[13] de la Torre, J., & Chiu, C. Y. (2016). A general method of empirical Q-matrix validation. Psychometrika, 81(2), 253-273.
[14] de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333-353.
[15] Feng, Y. L., Habing, B. T., & Huebner, A. (2014). Parameter estimation of the reduced RUM using the EM algorithm. Applied Psychological Measurement, 38(2), 137-150.
[16] Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191-210.
[17] Huebner, A., & Wang, C. (2011). A note on comparing examinee classification methods for cognitive diagnosis models. Educational and Psychological Measurement, 71(2), 407-419.
[18] Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258-272.
[19] Khorramdel, L., Shin, H. J., & von Davier, M. (2019). GDM software mdltm including parallel EM algorithm. In M. von Davier & Y. S. Lee (Eds.), Handbook of diagnostic classification models (pp. 603-628). Springer.
[20] Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2012). The impact of model misspecification on parameter estimation and item-fit assessment in log-linear diagnostic classification models. Journal of Educational Measurement, 49(1), 59-81.
[21] Liu, R. (2018). Misspecification of attribute structure in diagnostic measurement. Educational and Psychological Measurement, 78(4), 605-634.
[22] Liu, Y. L., Andersson, B., Xin, T., Zhang, H. Y., & Wang, L. L. (2019). Improved Wald statistics for item-level model comparison in diagnostic classification models. Applied Psychological Measurement, 43(5), 402-414.
[23] Liu, Y. L., Tian, W., & Xin, T. (2016). An application of M2 statistic to evaluate the fit of cognitive diagnostic models. Journal of Educational and Behavioral Statistics, 41(1), 3-26.
[24] Liu, Y. L., Xin, T., Andersson, B., & Tian, W. (2019). Information matrix estimation procedures for cognitive diagnostic models. British Journal of Mathematical and Statistical Psychology, 72(1), 18-37.
[25] Liu, Y. L., Xin, T., & Jiang, Y. (2021). Structural parameter standard error estimation method in diagnostic classification models: Estimation and application. Multivariate Behavioral Research. Advance online publication.
[26] Liu, Y. L., Yin, H., Xin, T., Shao, L. C., & Yuan, L. (2019). A comparison of differential item functioning detection methods in cognitive diagnostic models. Frontiers in Psychology, 10, Article 1137.
[27] Ma, W. C., & de la Torre, J. (2020). GDINA: An R package for cognitive diagnosis modeling. Journal of Statistical Software, 93(14), 1-26.
[28] Robitzsch, A., Kiefer, T., George, A. C., & Uenlue, A. (2020). CDM: Cognitive diagnosis modeling. https://CRAN.R-project.org/package=CDM
[29] Rupp, A. A., & Templin, J. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68(1), 78-96.
[30] Rupp, A. A., Templin, J., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. Guilford.
[31] Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika, 34(1), 1-97.
[32] Tatsuoka, K. K. (1990). Toward an integration of item-response theory and cognitive error diagnosis. In N. Frederiksen, R. Glaser, A. Lesgold, & M. G. Shafto (Eds.), Diagnostic monitoring of skills and knowledge acquisition (pp. 453-488). Erlbaum.
[33] Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287-305.
[34] Wang, S. Y. (2018). Two-stage maximum likelihood estimation in the misspecified restricted latent class model. British Journal of Mathematical and Statistical Psychology, 71(2), 300-333.
[35] Wang, W. Y., Song, L. H., Chen, P., Meng, Y. R., & Ding, S. L. (2015). Attribute-level and pattern-level classification consistency and accuracy indices for cognitive diagnostic assessment. Journal of Educational Measurement, 52(4), 457-476.

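Funding

*This research was supported by the National Natural Science Foundation of China (31900794), the Natural Science Foundation of Shandong Province (ZR2019BC084), the Shandong Provincial Education Science Planning Project (2020KZD009), and the Undergraduate Innovation and Entrepreneurship Training Program (202110446231X).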
