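Interpersonal emotional competence is an important emotional ability that individuals display in social interaction, but existing assessments often lack realistic scenarios and accurate scoring. This study used video materials based on real-life situations to develop an open-ended situational judgment test of interpersonal emotional competence for college students. A total of 293 valid responses were collected and scored through manual coding. The analyses showed good overall internal consistency reliability and structural validity. To overcome the high time cost and large scoring discrepancies of manual coding, the study used the BERT language model to score the test's subjective items automatically. The classification metrics of accuracy, precision, recall, and F1 were all good, and human and machine scores showed significant moderate-to-high correlations (r = .39~.74, p < .001). This indicates that a BERT-based model can score participants' responses relatively accurately and reliably, and that automated scoring can support the wider use of open-ended situational judgment tests.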
Abstract
Interpersonal emotional competence, situated within the ability-based framework of emotional intelligence, has significant implications for college students' mental health, interpersonal relationships, and academic achievement. However, few theoretical studies have systematically explored and evaluated individuals' interpersonal emotional abilities, and existing tests rely primarily on self-report questionnaires and objective items, which limits situational authenticity and scoring accuracy. In response, and drawing on previous theory, this study proposes three dimensions of interpersonal emotional ability: perceiving others' emotions, understanding others' emotions, and regulating others' emotions. We developed a novel open-ended Interpersonal Emotional Competence Situational Judgment Test for college students, using video materials to enhance ecological validity. To overcome the high time cost and large scoring discrepancies that manual coding of open-ended answers entails, the study also explores automatic scoring of the test.
A total of 293 valid questionnaires were collected through offline group testing. Eight trained professionals manually coded the free-text answers that participants gave to the subjective questions. The coders worked in pairs, and the inter-rater reliability coefficient for each pair was above .9. Participants' coded answers were then converted to scores according to the scoring criteria.
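The abstract does not say which inter-rater coefficient was used. As one possible illustration only, the sketch below computes Cohen's kappa for a single pair of coders. The choice of kappa, the function name `cohens_kappa`, and the example codes are all assumptions, not details from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same responses.

    Assumption: codes are categorical labels (for example 0, 1, 2).
    The study's actual coefficient is not specified in the abstract.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of responses")
    n = len(rater_a)
    # Proportion of responses on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal code frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes given by one pair of coders to ten responses.
coder_1 = [0, 1, 2, 2, 1, 0, 1, 2, 0, 1]
coder_2 = [0, 1, 2, 2, 1, 0, 1, 1, 0, 1]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for agreement expected by chance, which matters when some codes are much more frequent than others.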
Data analysis showed that the internal consistency reliability of the overall test was .646 and that of the three dimensions ranged from .512 to .576, indicating an acceptable level of reliability. Confirmatory factor analysis (CFA) of structural validity yielded χ² = 72.612, df = 74, p > .05, CFI = 1.000, TLI = 1.005, RMSEA = .000, SRMR = .043, indicating good model fit. This suggests that the test comprises three dimensions that all belong to a single ability factor. In addition, the omega coefficient for this structural equation model was .92, indicating high internal consistency and commonality among the latent variables, as well as a good ability to explain the observed variables. In terms of criterion-related validity, the understanding others' emotions dimension of the test showed a low correlation (r = .118, p < .05) with the self-emotion management dimension of the Emotional Intelligence Scale (EIS).
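For readers unfamiliar with internal consistency coefficients, the following minimal sketch computes Cronbach's α from a participants-by-items score table. It assumes the reported consistency reliability is of this kind, which the abstract does not state explicitly for the manual scores. The function name `cronbach_alpha` and the score matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of item scores.

    scores: list of rows, one row per participant, one column per item.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two participants")
    # Variance of each item across participants.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of the participants' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: five participants, three items.
example_scores = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 3],
    [0, 1, 1],
    [2, 2, 3],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))
```

The coefficient rises when items covary strongly relative to their individual variances, so a short scale with heterogeneous items can yield a modest value.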
In the automatic scoring process, responses to the 9 subjective questions were randomly split into training and testing sets at a 4:1 ratio. Before implementation, TextCNN, LSTM, LSTM+Attention, Transformer, and BERT models were compared, and BERT was selected for practical application and reporting on the basis of its average performance across the questions. Three models in the BERT family (BERT-Base-Chinese, RoBERTa-zh, and BERT-Chinese-wwm) were also compared to assess consistency among BERT models. All three performed well in terms of accuracy, precision, recall, and F1 score. The BERT-Base-Chinese model showed the best human-machine scoring correlation, with overall human-machine scoring results showing significant moderate-to-high correlations (r = .393~.735, p < .001), indicating that the BERT model can score participants' responses relatively accurately and reliably. The overall α coefficient of the automatically scored test was .78, with α coefficients of .63 and .64 for emotional understanding and emotional regulation, respectively. These reliability coefficients were slightly better than those of the manual scoring. Confirmatory factor analysis (n = 60) on these two dimensions yielded χ² = 4.388, RMSEA = .096, CFI = .866, TLI = .814, SRMR = .078, indicating that the structural validity of automatic scoring, although slightly inferior to that of manual scoring, was still acceptable.
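The evaluation steps described above can be sketched without the model itself: a random 4:1 split, classification metrics on the held-out set, and a Pearson correlation between human and machine scores. This is an illustration under stated assumptions, not the study's code. Macro averaging of precision, recall, and F1 is an assumption (the abstract does not give the averaging scheme), and all function names and example scores are hypothetical.

```python
import math
import random


def split_train_test(items, seed=0):
    """Randomly split items into training and testing sets at a 4:1 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical response IDs, split 4:1.
train_ids, test_ids = split_train_test(range(10))

# Hypothetical human and machine scores for one question's held-out responses.
human = [0, 1, 2, 1, 0, 2, 1, 2, 0, 1]
machine = [0, 1, 2, 1, 1, 2, 1, 1, 0, 1]
metrics = classification_metrics(human, machine)
r = pearson_r(human, machine)
print(metrics, round(r, 3))
```

In a real pipeline, `machine` would hold the scores predicted by the fine-tuned classifier for the testing set, and the metrics would be computed per question.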
Overall, this study developed a video-based, open-ended situational judgment test of interpersonal emotional intelligence. The quality of the test is acceptable, and the study addresses, both theoretically and empirically, the lack of attention to the concept of interpersonal emotional intelligence in previous research. The video-based, open-ended situational judgment format supports the authenticity of the test scenarios and the accuracy of scoring. Furthermore, the consistency and effectiveness of automatic scoring were validated against manual scoring, laying the groundwork for its potential widespread application.
Key words
interpersonal emotional intelligence /
video materials /
open-ended items /
situational judgment test /
automatic scoring
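Funding
*This research was supported by the Humanities and Social Sciences Research Planning Fund of the Ministry of Education (Grant No. 22YJAZH077).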