Do Large Language Models Grasp Poetic Prosody? A Human-Machine Comparison of Phonological Prediction in Classical Chinese Poetry*

Wei Tingxin1, Li Jiabin2, Zhao Ying3, Wu Zhou3, Chen Qingrong**3,4

Journal of Psychological Science (心理科学), 2025, Vol. 48, Issue 6: 1370-1383. DOI: 10.16719/j.cnki.1671-6981.20250607
Computational Modeling and Artificial Intelligence

Abstract

Prediction is a core mechanism of human language comprehension. Large language models have made remarkable progress in simulating human linguistic prediction, but whether they possess phonological prediction abilities, and underlying mechanisms, similar to those of humans has not been systematically investigated. Using a last-character prediction task based on classical regulated verse, this study systematically compared the predictive mechanisms of humans and large language models. The results show that humans predicted reliably even in the absence of context, significantly outperforming the models, whereas the PoemBERT model's accuracy rose markedly once punctuation was provided, surpassing human performance. These findings indicate that in the cognition and comprehension of classical poetry, humans rely on internalized prosodic knowledge and structural awareness rather than on overt formal markers; language models still operate by learning surface-level probability distributions and lack the human capacity to actively integrate rules, and therefore cannot fully simulate human prosodic cognition. At the same time, large language models can serve as cognitive-computational tools, offering a new perspective on how prosody functions in linguistic prediction.
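The last-character prediction task described above can be approximated computationally as masked-token prediction. The following Python sketch illustrates the general setup, assuming the generic bert-base-chinese checkpoint from Hugging Face as a stand-in for PoemBERT; the example quatrain and the with/without-punctuation contrast loosely mirror the input manipulation summarized above and are illustrative assumptions, not the authors' actual materials.

from transformers import pipeline

# Generic Chinese BERT used as a stand-in for the domain-adapted PoemBERT model.
fill_mask = pipeline("fill-mask", model="bert-base-chinese")

# A regulated quatrain with the final (rhyming) character masked, presented
# with and without the punctuation that exposes line boundaries.
with_punct = "白日依山尽,黄河入海流。欲穷千里目,更上一层[MASK]。"
no_punct = "白日依山尽黄河入海流欲穷千里目更上一层[MASK]"

for condition, text in [("with punctuation", with_punct), ("no punctuation", no_punct)]:
    predictions = fill_mask(text, top_k=5)
    ranked = [(p["token_str"], round(p["score"], 3)) for p in predictions]
    print(condition, ranked)

Comparing the two rankings gives a rough analogue of the punctuation effect reported for PoemBERT, though the actual study used controlled stimuli and multiple models rather than this single example.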

Abstract (English)

Prediction is a core cognitive mechanism in human language processing, essential for understanding and producing language during listening, reading, and conversation. Recent advances in large pre-trained language models (LLMs) have shown striking success in mimicking human-like predictive behavior, sparking ongoing debate over whether such models exhibit "brain-like" mechanisms of prediction. Classical Chinese poetry, with its layered constraints of semantics, structure, and prosody, offers an ideal paradigm to probe multi-level linguistic prediction, particularly in phonological domains such as tonal and rhyming structures. This study presents a last-character prediction task that incorporates tonal class, rhyme category, and semantic consistency, using regulated verse as experimental material. We systematically compare the performance of human participants with various LLMs across three input conditions to explore similarities and differences in their predictive mechanisms.
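For the tonal-class and rhyme-category dimensions mentioned above, a predicted last character can be scored against the target along both. The sketch below uses the pypinyin package and modern Mandarin readings as a rough proxy; regulated verse actually follows Middle Chinese tone and rhyme categories (e.g., the Pingshui rhyme book), so the ping/ze rule of thumb, the final-based rhyme check, and the example characters are simplifying assumptions rather than the authors' scoring scheme.

from pypinyin import pinyin, Style

def tone_class(char):
    """Approximate the level/oblique (ping/ze) class from the modern Mandarin tone."""
    syllable = pinyin(char, style=Style.TONE3)[0][0]
    tone = syllable[-1] if syllable[-1].isdigit() else "0"
    # Mandarin tones 1 and 2 mostly descend from the level (ping) class,
    # tones 3 and 4 from oblique (ze) classes; entering-tone mergers are ignored.
    return "ping" if tone in ("1", "2") else "ze"

def rhyme_final(char):
    """Use the toneless final as a crude stand-in for the rhyme category."""
    return pinyin(char, style=Style.FINALS)[0][0]

target, predicted = "楼", "台"  # e.g., gold last character vs. a model guess
print("tone class match:", tone_class(target) == tone_class(predicted))  # True (both ping)
print("rhyme match:", rhyme_final(target) == rhyme_final(predicted))     # False (ou vs. ai)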

Keywords

large language models / phonological prediction / structural awareness / classical Chinese poetry / human-machine comparison

Cite this article

Wei Tingxin, Li Jiabin, Zhao Ying, Wu Zhou, Chen Qingrong. Do Large Language Models Grasp Poetic Prosody? A Human-Machine Comparison of Phonological Prediction in Classical Chinese Poetry[J]. Journal of Psychological Science, 2025, 48(6): 1370-1383. https://doi.org/10.16719/j.cnki.1671-6981.20250607

Funding

* This research was supported by the Major Program of the National Social Science Fund of China (21&ZD288).
