Comparison of Bifactor CAT Item Selection Criteria for Polytomous Items

MAO Xiu-Zhen; LIU Huan; TANG Qian

Journal of Psychological Science, 2019, Vol. 42, Issue (1): 187-193.

Abstract

The bifactor model assumes that a test measures one general factor and multiple group factors. Numerous analyses of psychological trait measures, educational surveys, medical surveys, and diagnostic tests have shown that the bifactor model represents the construct structure of such tests, surveys, and scales well, and that it fits the data better than competing models (e.g., unidimensional, higher-order, and correlated-factor models). When the abilities are assumed to be orthogonal, the bifactor dimension reduction method reduces the multidimensional integration to multiple two-dimensional integrations, which greatly simplifies parameter estimation (Gibbons & Hedeker, 1992; Gibbons et al., 2007). Bifactor computerized adaptive testing (CAT) has proved to be a practical approach that can substantially reduce respondent burden while increasing testing efficiency (Gibbons et al., 2007). However, the number of dimensions in multidimensional CAT is often an obstacle to applying many well-known item selection methods, especially for polytomous items. This study therefore focuses on the form of the information matrix for polytomous items and on how the dimension reduction method can simplify the computation of item selection methods. First, the Fisher information for the bifactor graded response model was derived. Then, the dimension reduction method was applied to the computation of four item selection methods: the posterior weighted Fisher D-optimality (PDO) method, the posterior weighted Kullback-Leibler information (PKL) method, the continuous entropy method (CEM), and the mutual information (MI) method. Finally, these methods were compared on simulated data under three bifactor pattern designs, with the original D-optimality (DO) method as the baseline.
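As an illustration of the first step, the sketch below computes the item Fisher information matrix for a graded response item with a bifactor loading pattern. It is not taken from the paper: it assumes the common slope-intercept logistic parameterization, in which the cumulative probability of responding in category k or higher is logistic(a'θ + d_k), and the names `grm_category_probs` and `grm_fisher_information` are hypothetical. Under that assumption the information matrix is a a' multiplied by a scalar that depends only on the category probabilities.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(a, d, theta):
    """Cumulative and category probabilities for one graded response item.

    a     : slopes, one per dimension (general factor first); in a bifactor
            item only the general slope and one group slope are nonzero
    d     : strictly decreasing intercepts d_1 > ... > d_{K-1} (assumed form)
    theta : ability vector with the same length as a
    """
    z = sum(ai * ti for ai, ti in zip(a, theta))
    # Boundary values 1 and 0 close the cumulative curve at both ends.
    cum = [1.0] + [logistic(z + dk) for dk in d] + [0.0]
    probs = [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]
    return cum, probs


def grm_fisher_information(a, d, theta):
    """Fisher information matrix a a' * sum_k (w_k - w_{k+1})^2 / P_k,
    where w_k = P*_k (1 - P*_k) and P*_k is the cumulative probability."""
    cum, probs = grm_category_probs(a, d, theta)
    w = [c * (1.0 - c) for c in cum]
    scalar = sum((w[k] - w[k + 1]) ** 2 / probs[k] for k in range(len(probs)))
    return [[scalar * ai * aj for aj in a] for ai in a]


# Hypothetical item: general factor plus the second of two group factors.
info = grm_fisher_information([1.2, 0.0, 0.8], [1.0, 0.0, -1.0], [0.5, -0.2, 1.0])
```

With a single intercept (two response categories) the scalar reduces to P(1 - P), the two-parameter logistic information. Because a bifactor item loads on only two dimensions, every row and column of `info` for the other group factors is zero.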
We conducted a Monte Carlo simulation, with the CAT code written in MATLAB (R2010a), and evaluated the item selection methods in terms of the correlation between true and estimated abilities, root mean squared error, absolute deviation, and Euclidean distance. The results showed that: (1) the information of the bifactor graded response model is easy to obtain and is a generalization of the information of the two-parameter logistic model; (2) for every item selection method, the high bifactor pattern yielded the highest correlation and the lowest root mean squared error and absolute deviation; (3) under the same simulation condition, the mutual information (MI) method produced the highest average correlation between true and estimated abilities and the lowest root mean squared error, absolute deviation, and Euclidean distance of all the methods, whereas the posterior Kullback-Leibler (PKL) method performed the worst on these indices; (4) the PDO, CEM, and DO methods produced very similar results under a fixed test condition; (5) the Euclidean distances of the methods, tracked from the beginning to the end of the test, differed markedly once the test length exceeded 20 items. In conclusion, the derivation showed that the dimension reduction method can readily be used to simplify the computation of the PDO, PKL, CEM, and MI item selection methods: it reduces the multidimensional integration contained in each method to multiple two-dimensional integrations. The simulation results further showed that when the difference between the discrimination parameters of the group factors and those of the general factor is smaller, the estimates of the group factors become more accurate, and vice versa for the estimates of the general factor. Under the same test condition, the MI method gave the best test precision, the PKL method the worst, and the other three methods performed similarly.
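The four evaluation indices named above can be sketched as follows. The abstract does not give their exact formulas, so the definitions here are assumptions: correlation, root mean squared error, and absolute deviation are computed per dimension across simulees, and Euclidean distance is averaged over simulees across all dimensions. The function names are hypothetical.

```python
import math


def pearson_correlation(true_vals, est_vals):
    """Pearson correlation between true and estimated abilities on one dimension."""
    n = len(true_vals)
    mean_t = sum(true_vals) / n
    mean_e = sum(est_vals) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_vals, est_vals))
    var_t = sum((t - mean_t) ** 2 for t in true_vals)
    var_e = sum((e - mean_e) ** 2 for e in est_vals)
    return cov / math.sqrt(var_t * var_e)


def root_mean_squared_error(true_vals, est_vals):
    """Square root of the mean squared estimation error on one dimension."""
    n = len(true_vals)
    return math.sqrt(sum((e - t) ** 2 for t, e in zip(true_vals, est_vals)) / n)


def mean_absolute_deviation(true_vals, est_vals):
    """Mean absolute estimation error on one dimension."""
    n = len(true_vals)
    return sum(abs(e - t) for t, e in zip(true_vals, est_vals)) / n


def mean_euclidean_distance(true_vectors, est_vectors):
    """Average distance between each simulee's true and estimated ability vectors."""
    distances = [
        math.sqrt(sum((e - t) ** 2 for t, e in zip(tv, ev)))
        for tv, ev in zip(true_vectors, est_vectors)
    ]
    return sum(distances) / len(distances)
```

Tracking `mean_euclidean_distance` after each administered item would give the kind of trajectory described in result (5), under the assumed definition.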
Several issues, such as exposure rate control, content constraints, and item selection for mixed-format tests, deserve further study.