Who is More Trustworthy, AI or Humans? Modeling Human-AI Interactions in Repeated Trust Games with Reinforcement Learning*

Tan Haotian1, Li Zeqing2, Wu Zhen**1,3

Journal of Psychological Science ›› 2025, Vol. 48 ›› Issue (4): 920-932. DOI: 10.16719/j.cnki.1671-6981.20250413
Computational Modeling and Artificial Intelligence


Abstract

With the rapid development of artificial intelligence (AI), how to establish human-AI trust has become a pressing question. Previous studies have used questionnaires or one-shot interaction tasks to identify factors that influence human-AI trust, but the dynamic behavioral patterns and underlying mechanisms of how human-AI trust develops remain unclear. Using a repeated trust game task and reinforcement learning computational modeling, this study found that adult participants' trust in AI did not differ significantly from their trust in humans at the start of the task; over repeated games, however, participants invested at higher levels when facing AI, were more willing to keep investing after a failed investment, and showed lower negative-feedback learning rates toward AI. These results indicate that individuals are less sensitive to non-cooperative behavior from AI and that the process of building human-AI trust is more resilient. The study provides behavioral and computational evidence for understanding how human-AI trust is established and offers a theoretical reference for the interaction design of AI.

Extended Abstract

With the explosive advancement of artificial intelligence (AI), human society is entering a new era of “Human-AI Interaction.” The chatbots we interact with, the algorithmic recommendations embedded in software, and the autonomous driving systems in vehicles all rely on AI algorithms. Moreover, AI has become increasingly integrated into critical fields such as education, healthcare, and finance. Undeniably, it has become an indispensable component of modern life. In this context, understanding trust in Human-AI Interaction not only contributes to the development of prosocial AI but also holds significant value for promoting human-AI collaboration and advancing social harmony.
As one of the core mechanisms of human social interaction, trust is defined as the willingness to hold positive expectations of another party's behavior and to accept vulnerability to potential risk when one cannot control that party's actions. AI is reshaping how people interact with their environments, and trust has extended from human-human relationships to human-AI interactions. Existing research has primarily used self-report questionnaires or one-shot interactions to identify key factors influencing human trust in AI. However, these “snapshot” studies provide limited insight into the dynamic development of trust: as in human-human interactions, trust typically develops over repeated encounters, and the mechanisms underlying trust formation in human-AI interactions remain unclear.
To bridge these gaps, the present study compared human-AI trust and human-human trust using a repeated trust game paradigm, and used reinforcement learning modeling to explore the computational mechanisms underlying the development of trust in human-AI interactions. A total of 148 participants completed the experiment. During the task, participants played 20 rounds of the trust game with each of six trustees, either human or AI. Each trustee's reciprocal behavior was governed by a preset return probability (25%, 50%, or 75%) that was randomly assigned by the program. Trustees were distinguished by different images and labels, their order was counterbalanced within participants using a Latin square design, and, to enhance ecological validity, the return probabilities were not disclosed to participants.
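
To make the design concrete, the following Python sketch simulates one block of the session under the structure just described. It is a minimal illustration, not the authors' task code: the payoff values, the function names, and the always-invest example policy are assumptions, while the trustee types, the three return probabilities, and the 20 rounds per trustee come from the text above; counterbalancing and trustee images are omitted for brevity.

```python
import random

# Minimal sketch of the task structure described above (not the authors' code).
# Assumed payoffs: keeping the endowment yields 1 unit; a reciprocated
# investment yields 2 units; an unreciprocated investment yields 0.
TRUSTEES = [
    {"type": agent, "return_prob": p}
    for agent in ("human", "AI")
    for p in (0.25, 0.50, 0.75)
]

def play_block(trustee, decide_to_invest, n_rounds=20, seed=0):
    """Play one 20-round block; `decide_to_invest` is any choice policy."""
    rng = random.Random(seed)
    outcomes = []
    for t in range(n_rounds):
        invest = decide_to_invest(trustee["type"], t, outcomes)
        if not invest:
            outcomes.append(("keep", 1.0))          # keep the endowment
        elif rng.random() < trustee["return_prob"]:
            outcomes.append(("reciprocated", 2.0))  # trustee returned the investment
        else:
            outcomes.append(("not_returned", 0.0))  # trustee kept the investment
    return outcomes

# Example: an always-invest policy against the first trustee.
history = play_block(TRUSTEES[0], lambda agent, t, past: True)
```
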
The results showed that, at the beginning of the trust game, participants' investment levels did not differ between human and AI trustees in any condition. However, over repeated interactions, participants displayed higher investment probabilities when interacting with AI than with humans. Moreover, participants were less likely to stop investing after a loss when interacting with AI, indicating fewer loss-shift behaviors. After successful investments, by contrast, participants' behavior did not differ across trustee types.
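
A loss-shift rate of the kind reported here can be scored directly from the trial sequence. The helper below shows one common way to compute it; the name `loss_shift_rate` and the exact scoring rule are illustrative assumptions rather than the authors' definition.

```python
def loss_shift_rate(trials):
    """Proportion of invested-and-lost rounds that are followed by no investment.

    `trials` is a chronological list of (invested: bool, reciprocated: bool)
    pairs for one participant-trustee block. Fewer shifts after losses means
    the participant keeps investing despite non-reciprocation.
    """
    losses = shifts = 0
    for (invested, reciprocated), (next_invested, _) in zip(trials, trials[1:]):
        if invested and not reciprocated:   # an investment that was not returned
            losses += 1
            if not next_invested:           # participant stopped investing next round
                shifts += 1
    return shifts / losses if losses else float("nan")

# Example: invested and lost on round 1, then stopped investing -> rate of 1.0.
print(loss_shift_rate([(True, False), (False, False), (True, True)]))
```
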
A hyperbolic model revealed an interaction between trustee type and return probability. While return probabilities did not significantly affect the temporal trajectory of investment behavior toward AI, they did influence behavior in human interactions. Specifically, lower return probabilities led to a steeper decline in investment rates toward human trustees. Moreover, participants exhibited a higher asymptotic level of investment when interacting with AI, regardless of return probability, suggesting that if the trust game were to continue indefinitely, the probability of investing in AI would remain higher.
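
The abstract does not give the exact specification of the hyperbolic model, but a standard hyperbolic-decay form with an initial level, an asymptote, and a decay rate is consistent with the quantities discussed (steepness of decline and asymptotic investment level). Under that assumption, the fitted investment probability on round t could be written as:

```latex
% Assumed hyperbolic-decay form (not necessarily the authors' exact model):
% y_0 = initial investment probability, y_inf = asymptotic level, k = decay rate.
\[
  \hat{p}(t) \;=\; y_{\infty} \;+\; \frac{y_{0} - y_{\infty}}{1 + k\,t},
  \qquad t = 0, 1, \dots, 19 .
\]
```

On this reading, the reported interaction corresponds to larger decay rates k for human trustees at low return probabilities and a larger asymptote y_inf for AI trustees regardless of return probability.
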
Further, reinforcement learning model comparison indicated that a four-parameter model incorporating trustee type provided the best fit. Under this model, participants exhibited lower negative-feedback learning rates and higher temperature parameters when interacting with AI, suggesting reduced sensitivity to negative outcomes and more exploratory behavior. Positive-feedback learning rates and initial expected utility did not differ between human and AI trustees.
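
The four parameters named here (positive- and negative-feedback learning rates, a temperature, and an initial expected utility) correspond to a standard Rescorla-Wagner learner with valence-dependent learning rates and a softmax choice rule. The sketch below is a generic implementation of that family of models, not the authors' exact likelihood: coding the keep option as a fixed value of zero, and the function and variable names, are assumptions of this illustration.

```python
import math

def simulate_learner(outcomes, alpha_pos, alpha_neg, tau, q0):
    """Generic four-parameter learner: valence-dependent learning rates,
    softmax temperature tau, and initial expected utility q0 for investing.

    `outcomes` is the sequence of payoffs received from investing on each
    round. Returns the model's round-by-round probability of investing
    (the value of keeping the endowment is fixed at 0 here, an assumption).
    """
    q = q0
    invest_probs = []
    for r in outcomes:
        # Softmax over {invest, keep}: higher tau pushes choices toward 50/50.
        p_invest = 1.0 / (1.0 + math.exp(-q / tau))
        invest_probs.append(p_invest)
        delta = r - q                                   # prediction error
        alpha = alpha_pos if delta >= 0 else alpha_neg  # valence-dependent rate
        q += alpha * delta
    return invest_probs
```

In this formulation, a lower alpha_neg for AI trustees means that a non-reciprocated investment pulls the expected utility of investing down less, which formalizes the reduced sensitivity to negative outcomes reported above.
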
Taken together, these findings suggest that, although there is no initial trust bias toward humans or AI, participants gradually develop greater trust in AI over repeated interactions. Moreover, participants exhibit greater tolerance for non-cooperative behaviors displayed by AI. Trust in human-AI interactions appears to be more resilient. These results provide the first evidence of differences in the trust formation processes between human-human and human-AI interactions and offer a foundation for understanding the cognitive adaptation processes elicited by AI in social interactions.

Key words

human-AI interaction / repeated trust game / reinforcement learning

Cite this article

Tan Haotian, Li Zeqing, Wu Zhen. Who is More Trustworthy, AI or Humans? Modeling Human-AI Interactions in Repeated Trust Games with Reinforcement Learning[J]. Journal of Psychological Science, 2025, 48(4): 920-932. https://doi.org/10.16719/j.cnki.1671-6981.20250413


Funding

*This research was supported by the National Natural Science Foundation of China (32271110, 62441614) and the Tsinghua University Initiative Scientific Research Program (20235080047).
