Journal of Psychological Science, 2025, Vol. 48, Issue 4: 876-891. DOI: 10.16719/j.cnki.1671-6981.20250410
Computational Modeling and Artificial Intelligence

The Intervention Effect of Accuracy Prompt on Misinformation Sharing: An Experiment Based on GPT*

Jiang Haoyang1, Peng Xiaogang2, Dong Yihan2, Zhu Xiaolong2, Peng Xiaozhe**1

Abstract

Large language models (LLMs) are now widely used for content generation and moderation, yet their ability to discern misinformation is not always reliable, and they frequently exhibit "hallucinations" of their own. Exploring whether and how accuracy prompts can effectively improve LLMs' information discernment provides a basis for understanding LLMs' human-like behaviors and their underlying mechanisms, and also bears on how to improve, in the simplest and most effective way, models' real-world content moderation and presentation. This study systematically examined whether accuracy prompts can effectively enhance GPT's ability to discern misinformation. Three studies, using classical news materials, high-reasoning-load TruthfulQA materials, and new materials the models had not been trained on, consistently found that accuracy prompts significantly improved LLMs' sharing discernment. The results suggest that the prompting strategy may shift the LLM's attention to the veracity dimension and trigger the model's internal cognitive mechanisms to switch to a more deliberative mode, thereby improving performance. These findings provide a theoretical and empirical basis for integrating psychological intervention methods with the cognitive mechanisms of LLMs.

Large language models (LLMs) are increasingly used for content generation and verification. However, their capability to accurately discern misinformation remains imperfect, and they frequently produce confident yet incorrect outputs, known as "hallucinations". Investigating whether and how accuracy prompts (brief reminders that direct attention toward accuracy) can effectively enhance LLMs' misinformation discernment is therefore crucial. Such inquiry not only advances our understanding of LLMs' human-like cognitive processes and underlying mechanisms but also provides practical guidance for implementing simple yet effective interventions to improve content moderation in real-world applications.
In this study, we systematically examined whether accuracy prompts could enhance misinformation discernment abilities of GPT models. Three sequential studies were conducted, each targeting a distinct cognitive and material context. In Study 1, we employed classical news headline materials previously used in human misinformation intervention studies to test the robustness of accuracy prompts. The results indicated that both GPT-3.5 and GPT-4o models demonstrated significantly improved sharing discernment after receiving accuracy prompts, characterized by reduced intentions to share false news and increased intentions to share true news. Notably, GPT-4o showed stronger improvements compared to GPT-3.5, suggesting that advanced LLMs may be better able to use such prompts to realign their internal cognitive focus toward accuracy considerations.
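To make the manipulation concrete, the sketch below contrasts a control request with an accuracy-prompt request when eliciting a sharing-intention rating from a GPT model. It is a minimal illustration rather than the authors' protocol: the prompt wording, the 1-6 rating scale, and the use of the OpenAI Python client are assumptions made for this example.

```python
# Minimal sketch of an accuracy-prompt manipulation for a GPT model.
# Assumptions (not taken from the paper): the prompt wording, the 1-6 rating
# scale, and the OpenAI Python client with OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

ACCURACY_PROMPT = (
    "Before answering, consider how important it is to you that the "
    "content you share is accurate."
)  # hypothetical wording; the paper's exact prompt is not reproduced here

def sharing_intention(headline: str, model: str = "gpt-4o",
                      accuracy_prompt: bool = False) -> str:
    """Ask the model to rate how likely it would be to share a headline (1-6)."""
    question = (
        f'Headline: "{headline}"\n'
        "How likely would you be to share this headline on social media? "
        "Answer with a single number from 1 (not at all likely) to 6 "
        "(extremely likely)."
    )
    content = f"{ACCURACY_PROMPT}\n\n{question}" if accuracy_prompt else question
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": content}],
    )
    return response.choices[0].message.content

# The same headline rated under the control and the accuracy-prompt condition.
headline = "Example headline used for illustration only"
print(sharing_intention(headline))
print(sharing_intention(headline, accuracy_prompt=True))
```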
Study 2 further tested the robustness and generalizability of accuracy prompts using TruthfulQA, a dataset specifically designed to probe reasoning and common misconceptions. These materials required the models to engage in deeper reasoning and draw on cross-domain knowledge. Consistent with Study 1, accuracy prompts robustly improved the GPT models' sharing discernment even in this cognitively demanding context, suggesting that the effect of accuracy prompts generalizes across different types of information and varying cognitive demands.
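For orientation, TruthfulQA (Lin et al., 2021) is a public benchmark of questions built around common misconceptions. The snippet below shows one way such items could be loaded; it assumes the Hugging Face datasets mirror of the benchmark, since the paper does not specify how its TruthfulQA materials were retrieved or which subset was used.

```python
# Minimal sketch of retrieving TruthfulQA items, assuming the Hugging Face
# `datasets` mirror of the benchmark ("truthful_qa", "generation" config),
# which exposes a single "validation" split. The paper does not state how
# its TruthfulQA materials were obtained or filtered.
from datasets import load_dataset

truthful_qa = load_dataset("truthful_qa", "generation")["validation"]
print(len(truthful_qa))  # number of question items
print(truthful_qa[0])    # one item: question, reference answers, category
```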
To further clarify whether the observed improvements resulted from genuine reasoning or simple retrieval of training data, Study 3 utilized recently emerging news materials published after GPT models' training cutoff date. Thus, the models could not rely on previously learned information. The results showed that accuracy prompts continued to significantly improve sharing discernment in GPT-4o, whereas GPT-3.5 showed limited improvement. These findings indicate that accuracy prompts effectively activate deeper cognitive processes, such as increased attention allocation towards assessing veracity and analytical reasoning, in advanced LLMs, thereby enhancing their capacity to evaluate novel misinformation.
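Across all three studies the dependent measure is sharing discernment, which in this literature is conventionally the difference between sharing intentions for true and for false items, computed per condition. The sketch below illustrates that computation; the 1-6 scale and the toy ratings are illustrative assumptions, not data from the paper.

```python
# Sharing discernment = mean sharing intention for true items minus mean
# sharing intention for false items, computed separately for each condition.
# The 1-6 scale and the toy ratings below are illustrative, not the paper's data.
from statistics import mean

def sharing_discernment(ratings: list[tuple[bool, float]]) -> float:
    """ratings: (is_true, sharing_intention) pairs for one condition."""
    true_ratings = [r for is_true, r in ratings if is_true]
    false_ratings = [r for is_true, r in ratings if not is_true]
    return mean(true_ratings) - mean(false_ratings)

control = [(True, 4.1), (True, 3.8), (False, 3.9), (False, 3.6)]
accuracy_prompt = [(True, 4.3), (True, 4.0), (False, 2.7), (False, 2.4)]

# A larger value under the accuracy prompt indicates improved discernment.
print(sharing_discernment(control))          # ~0.20
print(sharing_discernment(accuracy_prompt))  # ~1.60
```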
Collectively, these three studies provide robust empirical evidence that simple accuracy prompts effectively enhance misinformation discernment capacities in GPT models by shifting their internal attentional focus toward assessing informational accuracy and triggering deeper analytical processes. Crucially, the observed effectiveness across classical, high-reasoning, and novel materials underscores the robustness and practical applicability of accuracy prompts as cognitive interventions within LLMs.
This research contributes theoretically and practically to the integration of psychological intervention strategies with artificial intelligence cognitive mechanisms. Specifically, it offers foundational insights for implementing psychologically informed interventions ("psychology for AI") that not only clarify cognitive analogies between human cognition and LLMs but also guide practical methodologies for enhancing LLMs' information discernment capabilities, ultimately benefiting real-world misinformation management and digital content verification.

Keywords

large language model / misinformation / accuracy prompt / human-like behaviors

Cite this article

Jiang Haoyang, Peng Xiaogang, Dong Yihan, Zhu Xiaolong, Peng Xiaozhe. (2025). The Intervention Effect of Accuracy Prompt on Misinformation Sharing: An Experiment Based on GPT. Journal of Psychological Science, 48(4), 876-891. https://doi.org/10.16719/j.cnki.1671-6981.20250410

References

[1] 曹呈旭, 七十三, 金童林, 曾小叶, 安叶青, 卜塔娜. (2024). A hierarchical model of misinformation identification based on signal detection theory. Advances in Psychological Science, 32(7), 1209-1220.
[2] 陈婉婷, 张逸飞, 何清华. (2023). Accuracy prompts reduce the willingness to share misinformation. Studies of Psychology and Behavior, 21(6), 751-759.
[3] 焦丽颖, 李昌锦, 陈圳, 许恒彬, 许燕. (2025). When AI "has" a personality: The influence of good and evil personality roles on moral judgment in large language models. Acta Psychologica Sinica, 57(6), 929-549.
[4] 李玉楚, 张思琦, 丁格一, 牛佳雯, 饶俪琳. (2025). The underlying mechanisms of believing and sharing misinformation: A Tinbergen-based theoretical framework. Journal of Psychological Science, 48(2), 447-458.
[5] Abdurahman S., Atari M., Karimi-Malekabadi F., Xue M. J., Trager J., Park P. S., Golazizian P., Omrani A., & Dehghani M. (2024). Perils and opportunities in using large language models in psychological research. PNAS Nexus, 3(7), 245.
[6] Aher G. V., Arriaga R. I., & Kalai A. T. (2023). Using large language models to simulate multiple humans and replicate human subject studies. International Conference on Machine Learning, PMLR.
[7] Aïmeur E., Amri S., & Brassard G. (2023). Fake news, disinformation and misinformation in social media: A review. Social Network Analysis and Mining, 13(1), 30.
[8] Arechar A. A., Allen J., Berinsky A. J., Cole R., Epstein Z., Garimella K., Gully A., Lu J. G., Ross R. M., Stagnaro M. N., Zhang Y., Pennycook G., & Rand D. G. (2023). Understanding and combatting misinformation across 16 countries on six continents. Nature Human Behaviour, 7(9), 1502-1513.
[9] Argyle L. P., Busby E. C., Fulda N., Gubler J. R., Rytting C., & Wingate D. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3), 337-351.
[10] Augenstein I., Baldwin T., Cha M., Chakraborty T., Ciampaglia G. L., Corney D., DiResta R., Ferrara E., Hale S., Halevy A., Hovy E., Ji H., Menczer F., Miguez R., Nakov P., Scheufele D., Sharma S., & Zagni G. (2024). Factuality challenges in the era of large language models and opportunities for fact-checking. Nature Machine Intelligence, 6(8), 852-863.
[11] Bago B., Rand D. G., & Pennycook G. (2020). Fake news, fast and slow: Deliberation reduces belief in false (but not true) news headlines. Journal of Experimental Psychology: General, 149, 1608-1613.
[12] Bail, C. A. (2024). Can generative AI improve social science? Proceedings of the National Academy of Sciences, 121(21), e2314021121.
[13] Batailler C., Brannon S. M., Teas P. E., & Gawronski B. (2022). A signal detection approach to understanding the identification of fake news. Perspectives on Psychological Science, 17(1), 78-98.
[14] Bates D., Mächler M., Bolker B., & Walker S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48.
[15] Binz, M., & Schulz, E. (2023). Using cognitive psychology to understand GPT-3. Proceedings of the National Academy of Sciences, 120(6), e2218523120.
[16] Chen Y., Liu T. X., Shan Y., & Zhong S. (2023). The emergence of economic rationality of GPT. Proceedings of the National Academy of Sciences, 120(51), e2316205120.
[17] Crockett, M., & Messeri, L. (2023). Should large language models replace human participants? PsyArXiv.
[18] Dillion D., Tandon N., Gu Y., & Gray K. (2023). Can AI language models replace human participants? Trends in Cognitive Sciences, 27(7), 597-600.
[19] Ecker U., Roozenbeek J., Van Der Linden S., Tay L. Q., Cook J., Oreskes N., & Lewandowsky S. (2024). Misinformation poses a bigger threat to democracy than you might think. Nature, 630(8015), 29-32.
[20] Epstein Z., Sirlin N., Arechar A., Pennycook G., & Rand D. (2023). The social media context interferes with truth discernment. Science Advances, 9(9), eabo6169.
[21] Evans, J. S. B., & Stanovich, K. E. (2013). Dual-process theories of higher cognition: Advancing the debate. Perspectives on Psychological Science, 8(3), 223-241.
[22] Fazio L. K., Brashier N. M., Payne B. K., & Marsh E. J. (2015). Knowledge does not protect against illusory truth. Journal of Experimental Psychology: General, 144(5), 993-1002
[23] Grossmann I., Feinberg M., Parker D. C., Christakis N. A., Tetlock P. E., & Cunningham W. A. (2023). AI and the transformation of social science research. Science, 380(6650), 1108-1109.
[24] Hagendorff, T. (2024). Deception abilities emerged in large language models. Proceedings of the National Academy of Sciences, 121(24), e2317967121.
[25] Hoes E., Altay S., & Bermeo J. (2023). Leveraging ChatGPT for efficient fact-checking. PsyArXiv.
[26] Horton, J. J. (2023). Large language models as simulated economic agents: What can we learn from homo silicus? National Bureau of Economic Research.
[27] Huang L., Yu W., Ma W., Zhong W., Feng Z., Wang H., Chen Q., Peng W., Feng X., Qin B., & Liu T. (2025). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 43(2), 1-55.
[28] Khandelwal A., Agarwal U., Tanmay K., & Choudhury M. (2024). Do moral judgment and reasoning capability of LLMs change with language? A study using the multilingual defining issues test. ArXiv.
[29] Kojima T., Gu S. S., Reid M., Matsuo Y., & Iwasawa Y. (2022). Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, 35, 22199-22213.
[30] Kosinski, M. (2024). Evaluating large language models in theory of mind tasks. Proceedings of the National Academy of Sciences, 121(45), e2405460121.
[31] Lazer D. M. J., Baum M. A., Benkler Y., Berinsky A. J., Greenhill K. M., Menczer F., Metzger M. J., Nyhan B., Pennycook G., Rothschild D., Schudson M., Sloman S. A., Sunstein C. R., Thorson E. A., Watts D. J., & Zittrain J. L. (2018). The science of fake news. Science, 359(6380), 1094-1096.
[32] Lehr S. A., Caliskan A., Liyanage S., & Banaji M. R. (2024). ChatGPT as research scientist: Probing GPT’s capabilities as a research librarian, research ethicist, data generator, and data predictor. Proceedings of the National Academy of Sciences, 121(35), e2404328121.
[33] Lewandowsky S., Ecker U. K., & Cook J. (2017). Beyond misinformation: Understanding and coping with the “post-truth” era. Journal of Applied Research in Memory and Cognition, 6(4), 353-369.
[34] Lee S., Peng T. Q., Goldberg M. H., Rosenthal S. A., Kotcher J. E., Maibach E. W., & Leiserowitz A. (2024). Can large language models estimate public opinion about global warming? An empirical assessment of algorithmic fidelity and bias. PLOS Climate, 3(8), e0000429.
[35] Li C., Wang J., Zhang Y., Zhu K., Hou W., Lian J., Luo F., Yang Q., & Xie X. (2023a). Large language models understand and can be enhanced by emotional stimuli. ArXiv.
[36] Li C., Wang J., Zhang Y., Zhu K., Wang X., Hou W., Lian J., Luo F., Yang Q., & Xie X. (2023b). The good, the bad, and why: Unveiling emotions in generative AI. ArXiv.
[37] Lin H., Pennycook G., & Rand D. G. (2023). Thinking more or thinking differently? Using drift-diffusion modeling to illuminate why accuracy prompts decrease misinformation sharing. Cognition, 230, 105312.
[38] Lin S., Hilton J., & Evans O. (2021). TruthfulQA: Measuring how models mimic human falsehoods. ArXiv.
[39] Lin, Z. (2024a). How to write effective prompts for large language models. Nature Human Behaviour, 8(4), 611-615.
[40] Lin, Z. (2024b). Large language models as linguistic simulators and cognitive models in human research. ArXiv.
[41] Liu Z., Yao Z., Li F., & Luo B. (2024). On the detectability of ChatGPT content: Benchmarking, methodology, and evaluation through the lens of academic writing. Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security.
[42] McLoughlin K. L., Brady W. J., Goolsbee A., Kaiser B., Klonick K., & Crockett M. J. (2024). Misinformation exploits outrage to spread online. Science, 386(6725), 991-996.
[43] Mei Q., Xie Y., Yuan W., & Jackson M. O. (2024). A Turing test of whether AI chatbots are behaviorally similar to humans. Proceedings of the National Academy of Sciences, 121(9), e2313925121.
[44] Messeri, L., & Crockett, M. J. (2024). Artificial intelligence and illusions of understanding in scientific research. Nature, 627(8002), 49-58.
[45] Nahon L. S., Ng N. L., & Gawronski B. (2024). Susceptibility to misinformation about COVID-19 vaccines: A signal detection analysis. Journal of Experimental Social Psychology, 114, 104632.
[46] Pennycook G., Berinsky A. J., Bhargava P., Lin H., Cole R., Goldberg B., Lewandowsky S., & Rand D. G. (2024). Inoculation and accuracy prompting increase accuracy discernment in combination but not alone. Nature Human Behaviour, 8(12), 2330-2341.
[47] Pennycook G., Epstein Z., Mosleh M., Arechar A. A., Eckles D., & Rand D. G. (2021). Shifting attention to accuracy can reduce misinformation online. Nature, 592(7855), 590-595.
[48] Pennycook G., McPhetres J., Zhang Y., Lu J. G., & Rand D. G. (2020). Fighting COVID-19 misinformation on social media: Experimental evidence for a scalable accuracy-nudge intervention. Psychological Science, 31(7), 767-905.
[49] Pennycook, G. & Rand, D. G. (2019). Lazy, not biased: Susceptibility to partisan fake news is better explained by lack of reasoning than by motivated reasoning. Cognition, 188(5), 39-50.
[50] Pennycook, G. & Rand, D. G. (2021). The psychology of fake news. Trends in Cognitive Sciences, 25(5),388-402.
[51] Pennycook, G., & Rand, D. G. (2022a). Accuracy prompts are a replicable and generalizable approach for reducing the spread of misinformation. Nature Communications, 13(1), 2333.
[52] Pennycook, G., & Rand, D. G. (2022b). Nudging social media toward accuracy. The Annals of the American Academy of Political and Social Science, 700(1), 152-164.
[53] Rathje S., Mirea D. M., Sucholutsky I., Marjieh R., Robertson C. E., & Van Bavel, J. J. (2024). GPT is an effective tool for multilingual psychological text analysis. Proceedings of the National Academy of Sciences, 121(34), e2308950121.
[54] Roozenbeek J., Freeman A. L. J., & van der Linden, S. (2021). How accurate are accuracy-nudge interventions? A preregistered direct replication of Pennycook et al. (2020). Psychological Science, 32(7), 1169-1178.
[55] Stanislaw, H., & Todorov, N. (1999). Calculation of signal detection theory measures. Behavior Research Methods, Instruments, and Computers, 31(1), 137-149.
[56] Strachan J. W., Albergo D., Borghini G., Pansardi O., Scaliti E., Gupta S., Saxena K., Rufo A., Panzeri S., Manzi G., Graziano M. S. A., & Becchio C. (2024). Testing theory of mind in large language models and humans. Nature Human Behaviour, 8(7), 1285-1295.
[57] Sun Y., Sheng D., Zhou Z., & Wu Y. (2024). AI hallucination: Towards a comprehensive classification of distorted information in artificial intelligence-generated content. Humanities and Social Sciences Communications, 11(1), 1-14.
[58] Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser L., & Polosukhin I. (2017). Attention is all you need. ArXiv.
[59] Webb T., Holyoak K. J., & Lu H. (2023). Emergent analogical reasoning in large language models. Nature Human Behaviour, 7(9), 1526-1541.
[60] Weng, L. (2025, May). Why we think. Lil'Log. https://lilianweng.github.io/posts/2025-05-01-thinking/
[61] Wu W., Zhao Q., Chen H., Zhou L., Lian D., & Xie H. (2025). Exploring the choice behavior of large language models. Findings of the Association for Computational Linguistics (ACL 2025).

Funding

* This research was supported by the National Natural Science Foundation of China (72371167), the Guangdong Provincial Philosophy and Social Science Planning Project of the 14th Five-Year Plan period (GD23CXL03), and the Shenzhen Science and Technology Program (KCXFZ20230731093600002).
