Psychological Science ›› 2017, Vol. 40 ›› Issue (1): 216-223.

Using Testlet Response Models to Relieve the Perniciousness of Local Item Dependence in Multistage Testing

  

  • Received: 2016-02-29 Revised: 2016-07-13 Online: 2017-01-20 Published: 2017-01-20

Chinese title (translated): Using Testlet Response Models to Mitigate the Harm of Local Item Dependence in Multistage Testing

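Zhan Peida (詹沛达)1, Gao Chunlei (高椿雷)2, Bian Yufang (边玉芳)3, Luo Zhaosheng (罗照盛)4

  1. Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University
  2. Jiangxi Normal University
  3. State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University
  4. School of Psychology, Jiangxi Normal University
  • Corresponding author: Bian Yufang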

Abstract: Multistage testing (MST) is a form of testing in which sets of items are administered adaptively and scored as a unit. MST retains most of the advantages of adaptive testing, such as more efficient and precise measurement across the proficiency scale and time savings, without many of the disadvantages of item-level adaptive testing (i.e., computerized adaptive testing, CAT). For instance, CAT is not easily applied to certain item formats (e.g., essays); CAT requires complex item selection algorithms to satisfy content specifications and control item exposure rates; and CAT does not typically allow test takers to review items. Recently, MST has become increasingly popular and has been adopted by several important large-scale testing programs. Because the modules used in an MST can be designed and assembled before test administration and are presented to the test taker as a unit, test developers can bring their knowledge and skill to bear on test development, so that content balance, the quality of the test structure, and test administration can be better controlled. In addition, MST allows test takers to review their item responses within each module. Local independence is a basic assumption of most psychometric models. Unfortunately, this assumption is easily violated in educational tests. Yen (1984, 1993) lists a number of factors leading to local item dependence (LID): test speededness, fatigue or practice, item or response format, passage dependence, and scoring rubrics or raters. Because modules in MST can be treated as a series of mini tests, the assumption of local independence may also be violated by the factors mentioned above or by others.
Many studies of traditional linear tests (e.g., paper-and-pencil tests) have shown that when standard IRT models (e.g., the Rasch, 2PL, and 3PL models) are fitted to the data, LID results in overestimation of the precision of the test as a whole, spuriously high reliability coefficients, and biased parameter estimates (e.g., Wainer, Bradlow, & Wang, 2007; Wang & Wilson, 2005b). MST typically uses standard IRT models to estimate test takers' abilities, so LID may also affect MST results. Two simulation studies were carried out to explore the perniciousness of LID in MST. For simplicity, in study 1 we used the Rasch testlet model (Wang & Wilson, 2005b) to generate response data and fitted the standard Rasch model (Rasch, 1980) to the data, as is usual practice. The results indicated that LID noticeably decreases the precision of the estimated ability parameters in MST but does not bias them. By contrast, in study 2 we fitted the Rasch testlet model to the data in an attempt to reduce the influence of LID. The results showed that the Rasch testlet model removes part of the perniciousness of LID, but it is less effective than in traditional linear tests.
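
The data-generating step of study 1 can be illustrated with a minimal sketch. It assumes the usual form of the Rasch testlet model, logit P(X_ni = 1) = theta_n - b_i + gamma_{n,d(i)}, where gamma_{n,d(i)} is a person-specific random effect for the testlet d(i) containing item i, drawn from a normal distribution with mean 0. The function name, argument names, and all parameter values are hypothetical and chosen only for illustration; this is not the authors' simulation code, and it omits the MST routing step entirely.

```python
import math
import random


def simulate_rasch_testlet(thetas, difficulties, testlet_of_item, testlet_sd, rng):
    """Simulate 0/1 responses under a Rasch testlet model (illustrative only).

    thetas          : person abilities (theta_n)
    difficulties    : item difficulties (b_i)
    testlet_of_item : testlet index d(i) for each item
    testlet_sd      : standard deviation of the testlet effect, one per testlet
    rng             : a random.Random instance, so runs are reproducible
    """
    data = []
    for theta in thetas:
        # One random testlet effect per person and testlet; sd = 0 means no LID.
        gammas = [rng.gauss(0.0, sd) for sd in testlet_sd]
        row = []
        for b, d in zip(difficulties, testlet_of_item):
            p = 1.0 / (1.0 + math.exp(-(theta - b + gammas[d])))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data


# Hypothetical example: 3 persons, 4 items grouped into 2 testlets.
responses = simulate_rasch_testlet(
    thetas=[-1.0, 0.0, 1.0],
    difficulties=[-0.5, 0.0, 0.5, 1.0],
    testlet_of_item=[0, 0, 1, 1],
    testlet_sd=[1.0, 0.5],
    rng=random.Random(2017),
)
```

Setting every entry of `testlet_sd` to 0 reduces the generator to the standard Rasch model, which is the model fitted to the data in study 1.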

Key words: multistage testing, local dependence, local independence, testlet, item response theory, computerized adaptive testing

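Abstract (translated from the Chinese): "Misfortune is what fortune leans on; fortune is where misfortune lies hidden." Multistage testing (MST) retains the advantages of adaptive testing while allowing test developers to construct each module and panel under specified constraints. However, if latent factors overlooked during test construction introduce local item dependence (LID) among items, the results of MST can be harmed. To investigate the harm that LID does to MST, this study first introduces MST, its administration procedure, LID, and related concepts in detail. It then examines the question through a simulation study, whose results show that LID reduces the precision of examinees' ability estimates while leaving them unbiased, and that this harm is general rather than specific to any particular routing rule. Next, to explore how the harm can be removed reasonably and effectively, a testlet response model is used as the analysis model during MST administration; the results show that this approach removes part of the harm, but its effect is limited. On the one hand, this indicates that the harm LID does to the precision of ability estimates in MST deserves attention; on the other hand, it indicates that methods for eliminating LID-induced harm in MST merit further study.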

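Key words (translated from the Chinese): multistage testing, local dependence, local independence, testlet, item response theory, computerized adaptive testing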