Journal of Psychological Science ›› 2022, Vol. 45 ›› Issue (6): 1475-1482.


Controlling for Clustering in Single Level Study: Design-Based Methods

  

  • Received:2020-03-07 Revised:2021-02-22 Online:2022-11-20 Published:2022-12-11
  • Contact: Zhong-Lin WEN

Controlling Multi-Level Error in Single-Level Research: Design-Based Methods

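Yang WANG (王阳)¹, Zhong-Lin WEN (温忠麟)², Yuan-Shu FU (付媛姝)²

  1. Guangdong University of Finance
  2. South China Normal University
  • Corresponding author: Zhong-Lin WEN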

Abstract: In social science research, single-level studies often obtain their samples through cluster sampling or multi-stage sampling, so the resulting data have a multi-level structure. Researchers therefore have to control for higher-level errors in their single-level studies. The hierarchical linear model (HLM) has limitations in dealing with this issue. First, HLM's distinctive advantage, its focus on random effects and cluster-specific inferences, is of no use in single-level research. Second, the disadvantages of HLM are amplified in single-level research: (1) HLM's assumptions about random effects are harder to satisfy and to test, and violating them may bias parameter estimates; (2) HLM is more likely to run into convergence problems; (3) for single-level studies, HLM is complex in theory, modeling, software operation, and interpretation of results. HLM is therefore difficult to apply widely in single-level studies with multi-level error. Design-based methods (DBM), including cluster-robust standard errors (CRSE), generalized estimating equations (GEE), and the fixed effects model (FEM), represent a category of logical and valid procedures for analyzing multi-level data. By correcting the standard errors of fixed effects, DBM circumvents the need to partition residuals and variables into different levels while still estimating parameters accurately. DBM can thus handle multi-level data within a single-level framework, which makes it very friendly to single-level researchers. In contrast to HLM, DBM is more parsimonious in modeling, simpler to operate, more efficient to run, and more robust in estimation for single-level research. Therefore, at least for single-level research with multi-level error, DBM is an ideal alternative to HLM.
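To make the idea of correcting standard errors concrete, the following is a minimal sketch of cluster-robust standard errors for a single-predictor regression, written in plain Python. It is not the authors' procedure: the simulated data, the parameter values, and the function names `simulate` and `ols_with_crse` are all assumptions made for illustration, and the sandwich estimator shown is the basic (CR0) form without small-sample adjustments.

```python
import random
from collections import defaultdict
from math import sqrt


def simulate(n_clusters=50, cluster_size=20, slope=0.5, seed=1):
    """Simulate single-level data that carry a cluster-level error.

    All values here are hypothetical and chosen only for illustration.
    Returns a list of (cluster_id, x, y) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for g in range(n_clusters):
        x_between = rng.gauss(0, 1)  # cluster-level part of the predictor
        u = rng.gauss(0, 1)          # cluster-level error shared by all members
        for _ in range(cluster_size):
            x = x_between + rng.gauss(0, 1)
            y = slope * x + u + rng.gauss(0, 1)
            rows.append((g, x, y))
    return rows


def ols_with_crse(rows):
    """Fit y = a + b*x by OLS; return (b, conventional SE, cluster-robust SE)."""
    n = len(rows)
    x_bar = sum(x for _, x, _ in rows) / n
    y_bar = sum(y for _, _, y in rows) / n
    sxx = sum((x - x_bar) ** 2 for _, x, _ in rows)
    sxy = sum((x - x_bar) * (y - y_bar) for _, x, y in rows)
    b = sxy / sxx
    a = y_bar - b * x_bar

    # Centered predictor and residual for every observation.
    resid = [(g, x - x_bar, y - a - b * x) for g, x, y in rows]

    # Conventional SE: treats all residuals as independent.
    sigma2 = sum(e * e for _, _, e in resid) / (n - 2)
    naive_se = sqrt(sigma2 / sxx)

    # Cluster-robust (CR0 sandwich) SE: sum the score x_c * e within each
    # cluster first, so that within-cluster correlation is retained.
    score = defaultdict(float)
    for g, xc, e in resid:
        score[g] += xc * e
    robust_se = sqrt(sum(s * s for s in score.values())) / sxx
    return b, naive_se, robust_se


if __name__ == "__main__":
    b, naive_se, robust_se = ols_with_crse(simulate())
    print(f"slope={b:.3f}  conventional SE={naive_se:.3f}  cluster-robust SE={robust_se:.3f}")
```

The slope estimate is the ordinary single-level one; only its standard error changes. In data simulated this way, where both the predictor and the error have a cluster-level component, the conventional standard error is expected to be too small.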
After a detailed introduction to DBM and its advantages, a simulated data set was used to demonstrate the effectiveness of DBM in controlling for multi-level error in single-level mediation studies (i.e., the 1-1-1 mediation model). The results showed that although both HLM and DBM accurately estimated the within-cluster component of the mediating effect, HLM underestimated the standard errors of the mediating effect and of each mediating path coefficient. In addition, all of the DBMs, especially FEM, are simpler to operate than HLM. FEM can be run in SPSS, and it requires neither centering the level-1 variables nor controlling for between-cluster variables. Moreover, through the popular SPSS mediation analysis macro PROCESS, FEM supports both the causal steps approach and the product-of-coefficients approach with bootstrap confidence intervals for various complex mediation models. Finally, the following suggestions were given to help practitioners select appropriate methods for accommodating clustering in single-level research. (1) DBM, especially FEM, is suggested for controlling multi-level error in single-level studies. (2) If researchers are interested in between-cluster fixed effects, CRSE and GEE are recommended. (3) When researchers have sufficient background knowledge of HLM and need to focus on random effects, they should deliberately collect multi-level data and, in particular, ensure that the level-2 sample size is sufficient. (4) Cluster identification information should be retained when collecting data, so that multi-level error can still be controlled if the data turn out to have more levels than expected.
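As a rough illustration of the fixed effects approach in a 1-1-1 mediation model, the sketch below estimates the paths a (X to M) and b (M to Y, controlling for X) after subtracting each cluster's mean from every variable, a transformation that gives the same slope estimates as entering one dummy variable per cluster. This is not the article's SPSS or PROCESS workflow: the simulated data, the true path values, and the function names are assumptions for illustration, and only point estimates are computed (no standard errors or bootstrap intervals).

```python
import random
from collections import defaultdict


def simulate_mediation(n_clusters=50, cluster_size=20,
                       a=0.5, b=0.5, c_prime=0.2, seed=2):
    """Simulate X -> M -> Y at level 1 with cluster-level errors (hypothetical values).

    Returns a list of (cluster_id, x, m, y) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for g in range(n_clusters):
        x_between = rng.gauss(0, 1)  # cluster-level part of X
        u_m = rng.gauss(0, 1)        # cluster-level error in M
        u_y = rng.gauss(0, 1)        # cluster-level error in Y
        for _ in range(cluster_size):
            x = x_between + rng.gauss(0, 1)
            m = a * x + u_m + rng.gauss(0, 1)
            y = c_prime * x + b * m + u_y + rng.gauss(0, 1)
            rows.append((g, x, m, y))
    return rows


def demean_within_clusters(rows):
    """Subtract each cluster's mean from x, m, and y (the within transformation)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for g, x, m, y in rows:
        s = sums[g]
        s[0] += x
        s[1] += m
        s[2] += y
        s[3] += 1
    demeaned = []
    for g, x, m, y in rows:
        sx, sm, sy, k = sums[g]
        demeaned.append((x - sx / k, m - sm / k, y - sy / k))
    return demeaned


def fem_mediation(rows):
    """Return (a_hat, b_hat, c_prime_hat, indirect effect) from within-cluster data."""
    d = demean_within_clusters(rows)
    sxx = sum(x * x for x, _, _ in d)
    smm = sum(m * m for _, m, _ in d)
    sxm = sum(x * m for x, m, _ in d)
    sxy = sum(x * y for x, _, y in d)
    smy = sum(m * y for _, m, y in d)

    # Path a: regress M on X.
    a_hat = sxm / sxx

    # Paths c' and b: regress Y on X and M (two-predictor normal equations).
    det = sxx * smm - sxm ** 2
    c_prime_hat = (smm * sxy - sxm * smy) / det
    b_hat = (sxx * smy - sxm * sxy) / det
    return a_hat, b_hat, c_prime_hat, a_hat * b_hat


if __name__ == "__main__":
    a_hat, b_hat, c_hat, ab = fem_mediation(simulate_mediation())
    print(f"a={a_hat:.3f}  b={b_hat:.3f}  c'={c_hat:.3f}  indirect a*b={ab:.3f}")
```

Because the cluster means are removed, any cluster-level error in M or Y drops out of the estimates, which is why between-cluster variables do not need to be controlled separately in this setup.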

Key words: single-level research, clustered data, hierarchical linear model, design-based methods

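Abstract (Chinese version): Because of sampling design, multi-level data structures exist not only in multi-level research but also widely in single-level research, so multi-level error must be controlled in single-level analyses. In this setting the hierarchical linear model cannot bring its advantages into play and instead causes trouble because of its complexity. Design-based methods are comparatively simpler, more efficient, and more robust, and they are better suited to single-level research with multi-level error. After a detailed introduction to design-based methods and their advantages, a data example is used to show how well design-based methods control multi-level error in single-level research, and recommendations on method selection are offered to applied researchers.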

Key words (Chinese version): single-level research, multi-level data, hierarchical linear model, design-based methods