Using Testlet Response Models to Relieve the Perniciousness of Local Item Dependence in Multistage Testing

Journal of Psychological Science, 2017, Vol. 40, Issue (1): 216-223.

Abstract

Multistage testing (MST) refers to tests in which sets of items are administered adaptively and scored as a unit. MST retains most of the advantages of adaptive testing, namely more efficient and precise measurement across the proficiency scale and savings in testing time, while avoiding many of the disadvantages of item-level adaptive testing (i.e., computerized adaptive testing, CAT). For instance, CAT is not easily applied to certain item formats (e.g., essays); it requires complex item selection algorithms to satisfy content specifications and control item exposure rates; and it does not typically allow test takers to review items. Recently, MST has become increasingly popular and has been adopted by several important large-scale testing programs. Because the modules used in an MST can be designed and assembled before test administration and are presented to the test taker as a unit, they allow the knowledge and skill of test developers to play a role in test development, so that content balance, the quality of the test structure, and test administration can be better controlled. In addition, MST allows test takers to review their item responses within each module. Local independence is a basic assumption of most psychometric models. Unfortunately, this assumption is easily violated in educational tests. Yen (1984, 1993) lists a number of factors leading to local item dependence (LID): test speededness, fatigue or practice, item or response format, passage dependence, and scoring rubrics or raters. Because the modules in an MST can be treated as a series of mini tests, the assumption of local independence may also be violated by the factors mentioned above or by others.
Many studies of traditional linear tests (e.g., paper-and-pencil tests) have shown that when standard IRT models (e.g., the Rasch, 2PL, and 3PL models) are fitted to the data, LID leads to overestimation of the precision of the test as a whole, spuriously high reliability coefficients, and biased parameter estimates (e.g., Wainer, Bradlow, & Wang, 2007; Wang & Wilson, 2005b). MST typically uses standard IRT models to estimate test takers’ abilities, so it can be inferred that LID may also affect the results of MST. Two simulation studies were carried out to explore the perniciousness of LID in MST. For simplicity, in Study 1 we used the Rasch testlet model (Wang & Wilson, 2005b) to generate response data and fitted the standard Rasch model (Rasch, 1980) to the data, as is usually done. The results indicated that LID noticeably decreases the precision of the estimated ability parameters in MST but does not bias them. By contrast, in Study 2 we fitted the Rasch testlet model to the data in an attempt to reduce the influence of LID. The results showed that the Rasch testlet model can remove part of the perniciousness of LID, although its effect is not as good as in traditional linear tests.
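
The data-generating step of Study 1 can be illustrated with a minimal sketch. It assumes the usual form of the Rasch testlet model, in which the logit of a correct response is the ability minus the item difficulty plus a person-specific testlet effect; the function name `simulate_rasch_testlet`, its arguments, and the normal distribution of the testlet effects are illustrative assumptions, not the study's actual simulation code or design.

```python
import numpy as np


def simulate_rasch_testlet(theta, b, testlet_of_item, testlet_sd, rng):
    """Draw 0/1 responses under an assumed Rasch testlet model.

    logit P(X_ni = 1) = theta_n - b_i + gamma_{n, d(i)}

    theta: abilities, one per person.
    b: difficulties, one per item.
    testlet_of_item: index d(i) of the testlet (module) each item belongs to.
    testlet_sd: standard deviation of the testlet effect, one per testlet
        (0 gives the standard Rasch model, i.e. no LID).
    """
    theta = np.asarray(theta, dtype=float)
    b = np.asarray(b, dtype=float)
    testlet_of_item = np.asarray(testlet_of_item)
    testlet_sd = np.asarray(testlet_sd, dtype=float)

    # One random effect per person and testlet (assumed normal, mean 0).
    z = rng.normal(0.0, 1.0, size=(theta.shape[0], testlet_sd.shape[0]))
    gamma = z * testlet_sd

    # Persons in rows, items in columns.
    logit = theta[:, None] - b[None, :] + gamma[:, testlet_of_item]
    prob = 1.0 / (1.0 + np.exp(-logit))
    responses = (rng.random(prob.shape) < prob).astype(int)
    return responses, prob


# Hypothetical example: 500 test takers, 2 modules of 5 items each.
rng = np.random.default_rng(0)
theta = rng.normal(size=500)
b = np.linspace(-1.0, 1.0, 10)
responses, _ = simulate_rasch_testlet(theta, b, np.repeat([0, 1], 5), [1.0, 1.0], rng)
```

Fitting the standard Rasch model to `responses` ignores the `gamma` term, which is the mismatch Study 1 examines. Larger values in `testlet_sd` correspond to stronger LID.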

Key words

multistage testing / local dependence / local independence / testlet / item response theory / computerized adaptive testing

Cite this article

Using Testlet Response Models to Relieve the Perniciousness of Local Item Dependence in Multistage Testing[J]. Journal of Psychological Science. 2017, 40(1): 216-223