Do Large Language Models Grasp Poetic Prosody? A Human-Machine Comparison of Phonological Prediction in Classical Chinese Poetry

Wei Tingxin, Li Jiabin, Zhao Ying, Wu Zhou, Chen Qingrong

Journal of Psychological Science ›› 2025, Vol. 48 ›› Issue (6): 1370-1383. DOI: 10.16719/j.cnki.1671-6981.20250607


Abstract

Prediction is a core cognitive mechanism in human language processing, essential for understanding and producing language during listening, reading, and conversation. Recent large pre-trained language models (LLMs) have shown striking success in mimicking human-like predictive behavior, sparking ongoing debate over whether such models exhibit "brain-like" prediction mechanisms. Classical Chinese poetry, with its layered constraints on semantics, structure, and prosody, offers an ideal paradigm for probing multi-level linguistic prediction, particularly in phonological domains such as tonal and rhyming structure. This study presents a last-character prediction task that incorporates tonal class, rhyme category, and semantic consistency, using regulated verse as experimental material. We systematically compare the performance of human participants with that of various LLMs across three input conditions to explore the similarities and differences in their predictive mechanisms.
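To make the paradigm concrete, the sketch below shows one way a last-character prediction query could be posed to a masked language model. It is a minimal illustration under assumed choices: the model checkpoint, the example couplet, and the scoring note are not drawn from the study, whose models and materials are not specified in this abstract.

```python
# Minimal sketch of a cloze-style last-character prediction query.
# Assumptions (not from the paper): the model checkpoint and the example
# couplet (from Du Fu's "Spring View") are illustrative only.
from transformers import pipeline

# A Chinese BERT-style masked language model; the study compares several
# LLMs, which the abstract does not name.
fill_mask = pipeline("fill-mask", model="bert-base-chinese")

# A regulated-verse couplet with its final character masked.
prompt = "国破山河在，城春草木[MASK]。"

for candidate in fill_mask(prompt, top_k=5):
    # Each predicted character could then be checked for tonal class
    # (level vs. oblique), rhyme category, and semantic fit, and compared
    # with human cloze responses.
    print(candidate["token_str"], round(candidate["score"], 4))
```

A fill-mask query is only one possible operationalization; autoregressive models would instead be given the truncated line as a prompt and evaluated on their next-token distribution over candidate characters.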

Cite this article

Wei Tingxin, Li Jiabin, Zhao Ying, Wu Zhou, Chen Qingrong. Do Large Language Models Grasp Poetic Prosody? A Human-Machine Comparison of Phonological Prediction in Classical Chinese Poetry[J]. Journal of Psychological Science, 2025, 48(6): 1370-1383. https://doi.org/10.16719/j.cnki.1671-6981.20250607
