Nonparametric Diagnostic Classification for Polytomous Attributes: A Comparison of 18 Distance Discriminant Methods

Xu Huiying; Chen Qipeng; Liu Yaohui; Zhan Peida

doi:10.16719/j.cnki.1671-6981.20230627

Journal of Psychological Science, 2023, Vol. 46, Issue 6: 1486-1494.

Abstract

In recent decades, there has been increasing interest in cognitive diagnostic assessment (CDA), which identifies whether respondents have mastered the specific fine-grained attributes required to solve problems in educational and psychological assessments. Researchers in cognitive diagnosis have proposed a variety of methods for classifying respondents into classes according to their attribute patterns. Existing methods fall into two categories: parametric and non-parametric. Parametric diagnostic methods are based on psychometric models: for a given test situation, a cognitive diagnosis model (CDM), such as the DINA model and its generalizations, describes the theoretical relationship between the observed response pattern (ORP) and the latent attribute vector. In contrast, distance discrimination methods, a family of non-parametric diagnostic methods, assign each respondent directly to a latent class by minimizing the distance between the ORP and the ideal response pattern (IRP). The most important feature of non-parametric diagnostic methods is that they do not involve any CDM and can be computed at arbitrary sample sizes. Examples include the Hamming distance discriminant and the Manhattan distance discriminant.
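
The distance discrimination idea described above can be illustrated with a minimal sketch for binary attributes. It assumes a conjunctive (DINA-type) rule for building ideal response patterns; the Q-matrix and the helper names `ideal_response`, `hamming`, and `classify` are hypothetical and are not taken from the study.

```python
from itertools import product

# Hypothetical Q-matrix: rows are items, columns are attributes (1 = required).
Q = [(1, 0), (0, 1), (1, 1), (1, 0), (0, 1)]


def ideal_response(alpha, q_matrix):
    """Assumed conjunctive rule: an item is answered correctly only if
    every attribute it requires is mastered."""
    return tuple(
        int(all(a >= q for a, q in zip(alpha, row))) for row in q_matrix
    )


def hamming(u, v):
    """Number of positions at which two response patterns differ."""
    return sum(x != y for x, y in zip(u, v))


def classify(orp, q_matrix):
    """Return the attribute pattern whose IRP is closest to the ORP.
    Ties go to the first candidate in enumeration order."""
    n_attr = len(q_matrix[0])
    candidates = product((0, 1), repeat=n_attr)
    return min(
        candidates,
        key=lambda alpha: hamming(orp, ideal_response(alpha, q_matrix)),
    )


# A respondent who masters both attributes but slips on item 3.
print(classify((1, 1, 0, 1, 1), Q))  # (1, 1)
```

No item parameters are estimated at any point, which is why the approach can be applied to a single respondent as easily as to a large sample.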

However, most current methods, both parametric and non-parametric, assume that attributes are binary variables (e.g., "0" for "non-mastery" and "1" for "mastery"). This "black-and-white" classification is too coarse and may not meet the needs of refined measurement in practice. With the increasing demand for refined diagnosis, several CDMs with polytomous attributes have been proposed in recent years. Non-parametric diagnostic methods, however, have not yet been extended to polytomous attributes, which leaves a gap: researchers and practitioners have no evidence on how these methods perform in diagnostic assessments with polytomous attributes.

To investigate the performance of non-parametric diagnostic methods in diagnostic assessments with polytomous attributes, two simulation studies compared the diagnostic classification accuracy of 18 non-parametric distance discrimination methods with one another and with a CDM with polytomous attributes. The test conditions were defined by five independent variables: sample size, item quality, test length, number of attribute levels, and number of attributes. In simulation study 1, three independent variables were manipulated: sample size (N = 30, 50, and 100), test length (I = 25 and 50), and item quality (IQ = high, i.e., the mean value of guessing and slipping is around .1; and low, i.e., the mean value of guessing and slipping is around .2). All 18 non-parametric methods were implemented using Python's SciPy package; the CDM with polytomous attributes was implemented using a full Bayesian MCMC algorithm. Classification accuracy was evaluated with the weighted and exact attribute-pattern correct classification rates. In simulation study 2, two independent variables were manipulated: the number of attributes (K = 3, 5, and 7) and the number of attribute levels (L_k = 3 and 5). The sample size was fixed at 100, the test length at 50, and the item quality at high. All other conditions were consistent with simulation study 1.
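
A minimal sketch of how such a comparison might look with SciPy's `cdist`, for two attributes with three levels each. The Q-matrix, the helper names, and the conjunctive rule (an item's ideal response is 1 only when every attribute reaches the level the item requires) are assumptions made for illustration; the abstract does not state how the study constructed ideal response patterns. Only the exact pattern rate is computed, because the abstract does not define the weighted rate.

```python
from itertools import product

import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical Q-matrix: Q[i, k] is the level of attribute k that item i requires.
Q = np.array([[1, 0], [2, 0], [0, 1], [0, 2], [1, 1], [2, 2]])
N_LEVELS = 3  # attribute levels 0, 1, 2

# All candidate attribute patterns (3 ** 2 = 9 of them).
patterns = np.array(list(product(range(N_LEVELS), repeat=Q.shape[1])))


def ideal_responses(alpha):
    """Assumed conjunctive rule: response is 1 only if every attribute
    reaches the level the item requires. alpha has shape (n, K)."""
    return np.all(alpha[:, None, :] >= Q[None, :, :], axis=2).astype(float)


irp = ideal_responses(patterns)  # shape (9 patterns, 6 items)


def classify(orp, metric):
    """Assign each ORP to the pattern with the nearest IRP (first on ties)."""
    distances = cdist(np.asarray(orp, dtype=float), irp, metric=metric)
    return patterns[distances.argmin(axis=1)]


def exact_pattern_rate(estimated, truth):
    """Share of respondents whose whole attribute pattern is recovered."""
    return float(np.mean(np.all(estimated == truth, axis=1)))


# Three illustrative respondents; the third slips on item 5.
truth = np.array([[0, 2], [1, 1], [2, 2]])
orp = ideal_responses(truth)
orp[2, 4] = 0.0

for metric in ("cityblock", "euclidean", "hamming", "canberra"):
    estimated = classify(orp, metric)
    print(metric, estimated.tolist(), exact_pattern_rate(estimated, truth))
```

On dichotomously scored responses like these, the four metrics shown rank candidate patterns identically, because each is a monotone function of the number of mismatched items.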

The results indicated that: (1) the effect of sample size on the classification accuracy of all non-parametric methods was small; (2) the classification accuracy of non-parametric methods increased with item quality and test length, but decreased as the number of attributes and the number of attribute levels increased; (3) in both simulation studies, the performance of the 18 non-parametric distance discrimination methods was robust across all test conditions, with eight methods performing relatively better: the Canberra, Manhattan, Euclidean, Seuclidean (standardized Euclidean), Sqeuclidean (squared Euclidean), Minkowski, Hamming, and Sokal-Michener dissimilarity distance discrimination methods; and (4) in the empirical study, the classifications produced by most of the non-parametric distance discrimination methods agreed well with those of the RPa-DINA model.

In conclusion, this study is the first attempt to explore the performance of non-parametric diagnostic methods in diagnostic assessments with polytomous attributes. It expands the application of non-parametric diagnostic methods and enriches the data analysis methods available for polytomous attributes.

Key words

cognitive diagnosis / polytomous attributes / non-parametric cognitive diagnosis / distance discrimination / Hamming distance / Euclidean distance / Manhattan distance

Cite this article

Xu Huiying, Chen Qipeng, Liu Yaohui, Zhan Peida. Nonparametric Diagnostic Classification for Polytomous Attributes: A Comparison of 18 Distance Discriminant Methods[J]. Journal of Psychological Science. 2023, 46(6): 1486-1494 https://doi.org/10.16719/j.cnki.1671-6981.20230627
