Psychological Science ›› 2014, Vol. 37 ›› Issue (3): 742-747.

Analysis of Estimating Variance Components for Sparse Data of Test score in Generalizability Theory

  

  • Received:2012-06-12 Revised:2013-05-31 Online:2014-05-20 Published:2014-05-20

Generalizability Theory Analysis of Missing Data in Examination Scoring (考试评分缺失数据的概化理论分析)

黎光明1,刘晓瑜2,谭小兰3,周梦培4,张敏强3   

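  1. Guangzhou University
    2. School of Education Science, South China Normal University
    3. South China Normal University
    4. Center for Studies of Psychological Application, South China Normal University
  • Corresponding author: 张敏强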

Abstract: Missing data are common in psychological surveys and experiments. In performance assessment, for example, a given group of raters rates only a given group of examinees, so the resulting data form a sparse matrix. Researchers are therefore concerned with how to make good use of the observed data. Brennan (2001) provided estimation formulas for the p×i design with sparse data. In practice, however, more than one factor usually affects the scores; the rater facet in particular cannot be ignored in performance assessment. The aim of this article is to find a way to estimate the variance components of sparse data quickly and effectively. In China, many studies have analyzed only complete data, which has two drawbacks. First, when missing data are encountered, researchers usually delete incomplete records or impute values before analysis, but these approaches reduce the amount of data that can be used in the analysis of performance assessments. Second, the estimates differ depending on the imputation method used. Building on the p×i sparse-data formulas of Brennan (2001), this article provides estimation formulas for the p×i×r design with sparse data. Matlab 7.0 was used to simulate data of the kind usually encountered in examinations, and generalizability theory (GT) was then used to estimate the variance components. Two conditions were simulated: a small sample of 200 students and a large sample of 10,000 students. The p×i×r sparse-data formulas were then applied to estimate the variance components in order to test the validity of the formulas. The results showed that the formulas provide good estimates of the variance components, with estimates close to the values set in the simulation. Accuracy was highest for the item and rater components and low for the student-by-item interaction, whose maximum bias could reach 1.5. The number of items had the greatest effect on estimation: even a small increase in the number of items raised accuracy considerably, and the formulas gave good estimates when the number of items was moderate. The formulas could be used with either small or large data sets, and both yielded little bias. In performance assessment, the number of items can be increased to improve the accuracy of the variance component estimates. If the number of items cannot be increased, the number of raters can be increased instead, which also improves accuracy. The number of raters need not be large; bias is already small when the number of raters reaches 5.

Key words: Test score, Sparse data, Generalizability theory, Two-facet random crossed design (p×i×r), Variance component estimation

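Chinese abstract: Missing data are common in examination scoring, and how to make effective use of the available data for statistical analysis is a key problem. In examination scoring, the influence of items and raters on test scores cannot be ignored. Based on the principles of generalizability theory and following examination scoring rules, variance component estimation formulas were derived for the two-facet crossed design (p×i×r) with missing data, and Matlab 7.0 was used to simulate multiple sets of missing data to verify the validity of the formulas. The results showed that: (1) the derived formulas are fairly reliable, with relatively small bias in the estimated variance components for missing data; even when the missing rate exceeds 50%, the formulas still estimate the variance components fairly accurately; (2) the number of items has the greatest influence on the estimation of variance components for missing data in generalizability theory, followed by the number of raters; when the numbers of items and raters are 6 and 5, respectively, the estimates tend to become stable; (3) the number of students has little influence on the estimates of the variance components; for both small-scale and large-scale examinations, the estimated variance components for missing data differ little.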

Chinese key words: Examination scoring, Missing data, Generalizability theory, Two-facet crossed design (p×i×r), Variance component estimation
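
For readers unfamiliar with the p×i×r design named in the abstract, the following is a minimal sketch of how the seven variance components are usually estimated in the complete-data case, using the standard random-effects ANOVA (expected mean squares) estimators. It is not the sparse-data formulas derived in the article, which the abstract does not reproduce. The function name `variance_components_pir` and the nested-list input layout are assumptions made for illustration only.

```python
from itertools import product


def variance_components_pir(x):
    """Estimate variance components for a COMPLETE p x i x r random design.

    x[p][i][r] is the score of person p on item i from rater r.
    Assumes no missing cells and at least two levels of every facet.
    """
    n_p, n_i, n_r = len(x), len(x[0]), len(x[0][0])
    cells = list(product(range(n_p), range(n_i), range(n_r)))

    def t_term(axes):
        # Group cells by the indices in `axes`, then sum
        # (group size) * (group mean)^2 over all groups.
        sums = {}
        for c in cells:
            key = tuple(c[a] for a in axes)
            sums[key] = sums.get(key, 0.0) + x[c[0]][c[1]][c[2]]
        size = len(cells) / len(sums)
        return sum(size * (s / size) ** 2 for s in sums.values())

    t = {"mu": t_term(()), "p": t_term((0,)), "i": t_term((1,)),
         "r": t_term((2,)), "pi": t_term((0, 1)), "pr": t_term((0, 2)),
         "ir": t_term((1, 2)), "pir": t_term((0, 1, 2))}

    dp, di, dr = n_p - 1, n_i - 1, n_r - 1  # degrees of freedom per facet
    # Mean squares: sums of squares (from the T terms) over degrees of freedom.
    ms = {
        "p": (t["p"] - t["mu"]) / dp,
        "i": (t["i"] - t["mu"]) / di,
        "r": (t["r"] - t["mu"]) / dr,
        "pi": (t["pi"] - t["p"] - t["i"] + t["mu"]) / (dp * di),
        "pr": (t["pr"] - t["p"] - t["r"] + t["mu"]) / (dp * dr),
        "ir": (t["ir"] - t["i"] - t["r"] + t["mu"]) / (di * dr),
        "pir": (t["pir"] - t["pi"] - t["pr"] - t["ir"]
                + t["p"] + t["i"] + t["r"] - t["mu"]) / (dp * di * dr),
    }
    # Solve the expected-mean-square equations for each component.
    return {
        "p": (ms["p"] - ms["pi"] - ms["pr"] + ms["pir"]) / (n_i * n_r),
        "i": (ms["i"] - ms["pi"] - ms["ir"] + ms["pir"]) / (n_p * n_r),
        "r": (ms["r"] - ms["pr"] - ms["ir"] + ms["pir"]) / (n_p * n_i),
        "pi": (ms["pi"] - ms["pir"]) / n_r,
        "pr": (ms["pr"] - ms["pir"]) / n_i,
        "ir": (ms["ir"] - ms["pir"]) / n_p,
        "pir": ms["pir"],  # residual: interaction confounded with error
    }


# Illustrative data (made up): purely additive person, item and rater effects.
persons, items, raters = [1, 2, 3, 4], [0, 1, 5], [0, 2]
scores = [[[a + b + c for c in raters] for b in items] for a in persons]
print(variance_components_pir(scores))
```

With purely additive scores like these, all interaction components come out as zero up to rounding, which makes the example a convenient sanity check. With sparse data the group sizes are no longer equal, which is why adjusted formulas such as those of Brennan (2001) and the article are needed.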