Person-Fit in Cognitive Diagnostic Assessment

Yu Xiaofeng, Tang Qian, Qin Chunying, Li Yujun

Journal of Psychological Science ›› 2024, Vol. 47 ›› Issue (3) : 744-751.

DOI: 10.16719/j.cnki.1671-6981.20240329
Psychological statistics, Psychometrics & Methods


Abstract

Cognitive diagnostic assessment (CDA) is a widely used form of educational assessment. By analyzing whether test-takers have mastered specific knowledge points or skills, it can guide further study and teaching.

In psychometrics, statistical methods for assessing the fit of an examinee's item responses to a postulated psychometric model are called person-fit statistics. Person-fit analysis helps verify diagnostic results and is mainly used to distinguish aberrant examinees from normal ones. Aberrant response patterns include "sleeping" behavior, fatigue, cheating, creative responding, random guessing, and cheating with randomness, all of which can bias the estimate of an examinee's ability. Person-fit analysis helps researchers identify aberrant response patterns more accurately, so that aberrant examinees can be removed and the validity of the test improved. Most previous person-fit research was carried out under the item response theory (IRT) framework, and only a few papers have addressed person fit under the cognitive diagnosis model (CDM) framework. This study attempts to fill that gap by proposing a new person-fit index, R.

To verify the validity of the new index, the study examines the Type I error rate and statistical power of the R index under different test lengths (numbers of items), levels of item discrimination, and types of examinee misfit, and compares it with the existing indices RCI and lz. The Type I error rate was defined as the proportion of response patterns flagged as aberrant by a person-fit statistic out of 1,000 normal response patterns generated from the DINA model. The following factors were held constant: the number of examinees was fixed at 1,000, the cognitive diagnosis model was the DINA model, the number of attributes was 6, and the Q-matrix was fixed. Finally, to illustrate the practical value of the person-fit index, the R index was applied to the empirical fraction subtraction data.

The results show that the Type I error rate of the R index is reasonable and stable at .05. As for statistical power, the power of every index increases with item discrimination for all types of aberrant examinees, and in most conditions power also increases with the number of items. Across types of aberrant examinees, the R index performs best for random guessing and cheating with randomness, whereas the lz index performs better for fatigue, sleeping, and creative responding. In the empirical data study, 4.29% of examinees were flagged as aberrant.

In summary, the power of the R index improves as item discrimination and the number of items increase, and the R index is the most robust of the indices compared when item discrimination is low. The R index has high power for aberrant behaviors such as creative responding, random guessing, and cheating with randomness.
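The Type I error definition in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the authors' procedure: the abstract does not define the R index, so the example uses the standardized log-likelihood statistic lz instead. It also makes several assumptions that are not in the source: a randomly generated Q-matrix, guessing and slipping parameters drawn from 0.05 to 0.25, true (rather than estimated) attribute profiles and item parameters, a default of 30 items, and a one-sided normal cutoff of -1.645. All function and parameter names are hypothetical.

```python
import math
import random


def dina_prob(alpha, q_row, guess, slip):
    """P(X = 1 | alpha) for one item under the DINA model."""
    # eta is True only if every attribute the item requires is mastered
    eta = all(a >= q for a, q in zip(alpha, q_row))
    return 1.0 - slip if eta else guess


def lz_statistic(responses, probs):
    """Standardized log-likelihood person-fit statistic lz for one examinee."""
    l0 = expected = variance = 0.0
    for x, p in zip(responses, probs):
        l0 += x * math.log(p) + (1 - x) * math.log(1 - p)
        expected += p * math.log(p) + (1 - p) * math.log(1 - p)
        variance += p * (1 - p) * math.log(p / (1 - p)) ** 2
    return (l0 - expected) / math.sqrt(variance)


def simulate_type1_error(n_persons=1000, n_items=30, n_attrs=6,
                         critical=-1.645, seed=0):
    """Share of model-consistent (normal) examinees flagged as aberrant."""
    rng = random.Random(seed)

    # Assumed Q-matrix: each item requires 1 to 3 randomly chosen attributes
    q_matrix = []
    for _ in range(n_items):
        required = set(rng.sample(range(n_attrs), rng.randint(1, 3)))
        q_matrix.append([1 if k in required else 0 for k in range(n_attrs)])

    # Assumed item parameters (not taken from the study)
    guess = [rng.uniform(0.05, 0.25) for _ in range(n_items)]
    slip = [rng.uniform(0.05, 0.25) for _ in range(n_items)]

    flagged = 0
    for _ in range(n_persons):
        alpha = [rng.randint(0, 1) for _ in range(n_attrs)]
        probs = [dina_prob(alpha, q_matrix[j], guess[j], slip[j])
                 for j in range(n_items)]
        # Normal responding: each answer is drawn from the DINA probability
        responses = [1 if rng.random() < p else 0 for p in probs]
        # Large negative lz values indicate misfit
        if lz_statistic(responses, probs) < critical:
            flagged += 1
    return flagged / n_persons


if __name__ == "__main__":
    print(simulate_type1_error())
```

Because every simulated examinee responds according to the model, any flag is a false alarm, so the returned proportion is an empirical Type I error rate at the chosen cutoff. Power for a given aberrance type could be estimated the same way by altering how `responses` is generated, for example by replacing part of a response vector with random guesses.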

Key words

cognitive diagnosis / person fit / DINA model / aberrant response

Cite this article

Yu Xiaofeng, Tang Qian, Qin Chunying, Li Yujun. Person-Fit in Cognitive Diagnostic Assessment[J]. Journal of Psychological Science. 2024, 47(3): 744-751 https://doi.org/10.16719/j.cnki.1671-6981.20240329
