20 July 2025, Volume 48 Issue 4

Computational modeling and artificial intelligence
The Historical Origins of Large Language Models and Psychology
Huang Linjieqiong, Zhang Wen, Chen Zhen, Li Chenxi, Li Xingshan
2025, 48(4): 773-781.  DOI: 10.16719/j.cnki.1671-6981.20250401
In recent years, large language models (LLMs) have made significant advancements. Through deep learning, LLMs have learned from vast amounts of human language data and demonstrated human-like language understanding and generation abilities. Through techniques such as supervised fine-tuning and reinforcement learning, LLMs can handle a variety of human tasks and generate text according to human intentions, marking a major breakthrough in the field of artificial intelligence (AI). This paper reviews the development of LLMs, demonstrates their historical roots in psychology, and highlights the critical role of interdisciplinary collaboration, offering insights for future research at the intersection of AI and psychology.
First, psychologists have played a foundational role in the development of artificial neural networks—the backbone of LLMs. Early researchers such as the neuropsychologist Donald Hebb and the psychologist Frank Rosenblatt focused on learning mechanisms within neural systems, thereby laying the groundwork for machine learning. Long before the deep learning era, psychologists extensively used artificial neural networks to model human cognition. Researchers such as James L. McClelland and David E. Rumelhart continuously refined network architectures to simulate language processing, fostering deep integration between psychology and artificial neural networks. These contributions provided essential theoretical and methodological foundations for the development of LLMs.
Second, the technique of word embeddings is central to enabling LLMs to understand language, and its development has benefited from interdisciplinary collaboration among psychology, linguistics, and computer science. Word embedding techniques enable abstract language to be transformed into a form that computers can understand and process. Early psychological and linguistic research introduced the concept of distributed representations of lexical semantics and developed initial quantitative methods. Psychologists later used large-scale corpora to construct high-dimensional semantic vectors, advancing semantic representation techniques. Computer scientists, building on this foundation, implemented these ideas via neural network-based embedding techniques capable of capturing contextual meaning. The evolution of lexical semantic representation methods has facilitated the development of word embedding techniques, enabling the rapid and efficient processing of massive text corpora and contributing to major breakthroughs in language-related AI.
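To make the idea of distributed lexical representations concrete, the following minimal sketch (with invented toy vectors, not values from any trained model) measures semantic relatedness as the cosine similarity between word vectors:

```python
# Toy illustration of distributed lexical representations: words as vectors,
# with semantic relatedness measured by cosine similarity. The vectors below
# are invented for illustration, not taken from any trained model.
import numpy as np

embeddings = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.78, 0.70, 0.12]),
    "apple": np.array([0.05, 0.10, 0.90]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["king"], embeddings["queen"]))  # high: related meanings
print(cosine(embeddings["king"], embeddings["apple"]))  # low: unrelated meanings
```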
Third, the algorithms of LLMs and cognitive mechanisms of human language processing share several key characteristics, mainly in terms of incremental processing, predictive ability, and dynamic attention allocation. Although the real-time processing, active prediction, and selective attention mechanisms shaped by human biological evolution differ in specifics from the autoregressive generation, masked prediction, and self-attention mechanisms used by LLMs, they exhibit a high degree of functional convergence. This convergence highlights the crucial role of language itself in the development of AI. The deep analogy between the two suggests that understanding the fundamental principles of language may be a vital pathway to achieving general intelligence. Therefore, psychological research into language processing mechanisms could provide essential theoretical foundations and practical guidance for the future development of AI.
From Human Mind to Artificial Intelligence: Advancing AI Value Alignment Through Psychological Theories
Jin Shaoxiong, Liu Chao
2025, 48(4): 782-791.  DOI: 10.16719/j.cnki.1671-6981.20250402
In recent years, the field of artificial intelligence (AI) has witnessed unprecedented growth, characterized by major advancements in cognitive intelligence, perceptual processing, and decision-making capabilities. These technological breakthroughs have driven the widespread adoption of AI systems across a wide range of sectors, including healthcare, education, finance, and transportation. As a result, AI has become instrumental in improving operational efficiency, enhancing accuracy, and fostering innovation. There is little doubt that such developments have significantly boosted human productivity and convenience.
However, the increasing sophistication and autonomy of AI technologies have also introduced a variety of societal risks and ethical concerns. Among the most pressing of these are challenges related to AI safety and the alignment of AI behavior with human values. For instance, AI systems have been found to perpetuate bias in recruitment decisions, produce offensive or harmful content during interactions with users, and even pose existential threats in high-stakes domains such as autonomous weapons. These examples reflect growing anxieties about the potential misalignment between AI behavior and the ethical principles upheld by human societies. If left unaddressed, such misalignment could lead to consequences that undermine social trust and moral norms.
In response to these challenges, the concept of AI value alignment has emerged as a central concern within the broader field of AI safety research. AI value alignment refers to the development of AI systems whose goals, behaviors, and decision-making processes are consistent with the values, preferences, and ethical standards of individuals or society as a whole. Technically, several value alignment methodologies have been proposed, including reinforcement learning from human feedback (RLHF), inverse reinforcement learning (IRL), and constitutional AI. These approaches aim to incorporate normative constraints into the training process, thereby steering AI systems toward behavior that is both desirable and predictable. While promising in many respects, such methods face significant limitations. In particular, aligned AI systems often exhibit reduced adaptability when faced with novel scenarios and suffer from poor interpretability, making it difficult to trace or understand the reasoning behind their decisions. These limitations highlight the insufficiency of a purely engineering-driven approach and suggest the necessity of incorporating broader, interdisciplinary perspectives.
One promising approach is to integrate insights from psychology, the scientific study of human behavior, cognition, and moral reasoning, into the research and development of AI value alignment. Psychological theories provide robust conceptual tools for understanding how humans construct values, make moral judgments, and resolve ethical dilemmas in complex social contexts. Rather than designing AI systems that merely replicate the surface-level patterns of human behavior, these insights can inform architectures that embody internal mechanisms analogous to those involved in human moral cognition. Thus, true value alignment requires more than behavioral mimicry; it demands a form of cognitive and ethical compatibility between artificial agents and the human mind, particularly in terms of value judgment and moral decision-making processes.
This paper explores how psychological science can contribute to advancing AI value alignment. It reviews core psychological theories concerning the formation of moral values, dual-process models of moral reasoning, and the roles of emotion and social context in ethical decision-making. Building on these foundations, we propose conceptual frameworks that include the construction of a unified moral cognitive space capable of integrating diverse human values, and the development of dual-system moral architectures that emulate the interaction between intuitive and deliberative reasoning in human moral cognition. To ground these ideas in practice, we use altruistic behavior—a central and complex moral phenomenon—as a case study, examining how its psychological underpinnings could be modeled in AI systems to promote socially aligned decision-making.
By bridging AI safety research with psychological theory, this work seeks to support the development of more interpretable, robust, and ethically aware AI systems. Such interdisciplinary integration is not only timely, but also essential to ensure that the evolution of AI technologies remains aligned with the fundamental values of human society.
Psychoinformatics: Advances and Perspectives in the Computational Cognition Era
Tong Song, Chen Hao, Ke Luoma, Ye Junkai, Peng Kaiping
2025, 48(4): 792-803.  DOI: 10.16719/j.cnki.1671-6981.20250403
As artificial intelligence (AI) progresses from perceptual to cognitive intelligence, psychoinformatics—an interdisciplinary field integrating psychology and information science—has entered a crucial phase of theoretical and methodological refinement. This paper reviews the historical background, theoretical foundations, methodological progress, and practical applications of psychoinformatics within the framework of computational cognition. We trace its development from early symbolic processing models to connectionist approaches and, more recently, to deep learning and large language models (LLMs), which have expanded psychological research’s scope and depth.
The paper first reviews the theoretical evolution of psychoinformatics, from Galton’s composite photography to the symbolic information processing models proposed by Simon and Newell, which conceptualized mental processes as rule-based symbolic operations. Connectionist models, particularly Rumelhart and McClelland’s parallel distributed processing framework, later redefined cognition as an emergent property of distributed networks, enabling more flexible modeling of psychological processes. The advent of deep learning and LLMs has shifted the field from data analysis to language-based reasoning and cognitive simulation, supporting theory-driven modeling in psychology.
The widespread use of digital technologies and the internet has enabled the collection of naturally occurring data, such as social media content and wearable device outputs, providing opportunities to study psychological phenomena in real-world contexts while raising challenges related to data quality and interpretation. Traditional machine learning models have primarily served as predictive tools to identify behavioral and cognitive patterns but often contribute little to theoretical explanation. In contrast, LLMs have shown promise in language understanding, reasoning, and generating research ideas, serving as both analytical tools and aids in theory development. Recent studies illustrate how LLMs help identify psychological concepts, suggest research directions, and illuminate cognitive processes at individual and group levels. Consequently, psychoinformatics is evolving from a purely data-driven paradigm to an integrated framework combining data and theory for explanatory and predictive psychological inquiry.
These developments signal a broader shift toward cognitive intelligence within psychoinformatics. Drawing on Newell’s time-scale framework of human action, these applications correspond to different levels of psychological functioning, from rapid interactions to long-term behavioral change. In clinical psychology, LLMs assist in the early identification of mental health risks and enable ongoing intervention through interactive systems. In educational psychology, LLM-based tutoring systems provide personalized learning, real-time motivational support, and adaptive feedback, leading to improved learning outcomes. In cross-cultural psychology, LLMs show potential in recognizing culturally specific cognitive patterns, helping researchers better understand cultural variations in thinking, emotion, and behavior, and promoting the development of more inclusive psychological theories. Finally, we outline future directions for psychoinformatics: (1) Expanding temporal and contextual models to capture both short-term psychological changes and long-term mental health patterns; (2) Enhancing human-AI collaboration in hypothesis development and theory refinement; and (3) Strengthening ethical governance by applying psychological theories and frameworks—essential for interpreting AI decisions—to guide its responsible and bounded use.
In summary, this paper suggests that psychoinformatics, guided by computational cognition, provides a useful framework for combining data-driven and theory-driven approaches. Integrating real-world data, advanced computational methods, and human-AI interaction not only increases the accuracy and practical relevance of psychological research but also opens new pathways for theoretical and applied work. Looking ahead, psychoinformatics is well-positioned to enrich the field of psychology, shaping how we understand, study, and support human action and cognition in today’s “computational cognitive” era.
Epitome: An Innovative Tool Platform Connecting AI and Psychological Research
Qu Jingjing, Zhang Weijian, Gao Xiaoxue, Wang Xiangfeng
2025, 48(4): 804-813.  DOI: 10.16719/j.cnki.1671-6981.20250404
The widespread social penetration of Large Language Models (LLMs) is reshaping human social landscapes, making the systematic study of psychological mechanisms in human-LLM co-evolution a frontier research area. This paper systematically analyzes the impact of LLM technology on psychological experiments through three distinct levels: cognitive mechanism comparison studies, human subject simulation experiments, and multi-agent human-machine interaction experiments.
Cognitive mechanism analysis: Research reveals that LLMs exhibit human-like characteristics in perceptual judgment, reasoning, and decision-making tasks, achieving or surpassing human performance in many cognitive domains. However, fundamental differences exist between LLM and human cognitive mechanisms, particularly in memory and forgetting processes, causal reasoning, and theory of mind capabilities. While LLMs demonstrate perfect short-term memory retention and lack forgetting mechanisms, humans show complex memory dynamics. These differences necessitate careful consideration in experimental design and evaluation metrics.
Human subject simulation: LLMs demonstrate remarkable ability to simulate fine-grained cognitive features, including cognitive dissonance, emotional responses, and social behaviors. However, significant limitations exist, including black-box properties, homogenization tendencies due to alignment techniques, and poor performance in simulating specific demographic characteristics. These constraints raise concerns about ecological validity when LLMs completely substitute human subjects in psychological experiments.
Multi-agent human-machine interaction: LLMs show promise as novel social entities in various experimental paradigms, from one-on-one interactions to large-scale social simulations. In dyadic experiments, LLMs can simulate emotional states and engage in empathetic interactions, though challenges remain in balancing expressiveness with naturalness. In multi-agent scenarios, LLMs participate in game-theoretic settings like prisoner's dilemmas and public goods games, revealing complex strategic capabilities but limitations in theory of mind reasoning. Large-scale social simulations using thousands of LLM agents provide unprecedented opportunities to study collective behavior and social dynamics.
Experimental framework and platforms: The paper outlines a standardized workflow for LLM-integrated psychological experiments comprising 12 core tasks across four phases: proposal, preparation, execution, and data analysis. The complexity of human-machine interaction experiments demands advanced tools and specialized platforms. The emerging Epitome experiment platform addresses these challenges through native LLM integration, visual design systems, and multi-agent simulation capabilities, though limitations exist in physiological measurement support.
Future directions: The rapid iteration of LLM technology and technical complexity of human-machine experimental deployment present ongoing challenges. Future research requires developing LLM-native experimental frameworks, modular visualization systems, and comprehensive platforms supporting diverse experimental paradigms. As AI agents become more autonomous and sophisticated, new psychological questions regarding ethics, safety, and human-machine relationships will emerge, necessitating innovative experimental approaches grounded in psychological theory.
This comprehensive review highlights both the transformative potential and inherent limitations of LLM integration in psychological research, providing essential insights for researchers navigating this rapidly evolving interdisciplinary landscape.
The Performance of Deep Convolutional Neural Networks in Face Recognition and the Comparison with the Human Visual System
Cheng Yuhui, Shen Tianyu, Lu Zitong, Yuan Xiangyong, Jiang Yi
2025, 48(4): 814-825.  DOI: 10.16719/j.cnki.1671-6981.20250405
Face recognition is a fundamental cognitive function that plays a crucial role in human social interaction, as the human brain exhibits a remarkable sensitivity to facial stimuli. For decades, psychologists, cognitive neuroscientists, and computer vision researchers have been dedicated to uncovering the behavioral and neural mechanisms underlying face processing. Existing studies have demonstrated that humans process facial information differently from other objects, supporting the existence of highly specialized mechanisms for face perception. In particular, the fusiform face area (FFA) in the human brain has been identified as a specialized region for face recognition, and numerous face-selective neurons have been observed in the temporal lobe of macaques. In recent years, Deep Convolutional Neural Networks (DCNNs) have demonstrated remarkable performance in modeling and understanding face processing, providing new computational perspectives for exploring the neural mechanisms underlying face recognition. DCNNs are a class of artificial neural networks that have achieved impressive performance in visual recognition tasks, including face recognition. These models typically begin by applying a series of convolutional and pooling operations to extract increasingly abstract features, which are then passed through one or more fully connected layers to perform classification tasks. Consequently, there has been a growing interest in investigating the applications of DCNNs in face recognition.
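As a rough illustration of the architecture just described, the sketch below (with arbitrary layer sizes and an arbitrary number of identities) stacks convolution and pooling layers for feature extraction, followed by fully connected layers for identity classification:

```python
# Minimal sketch of a DCNN face-recognition pipeline: convolution and pooling
# layers extract increasingly abstract features; fully connected layers classify
# face identity. Layer sizes and the identity count are illustrative only.
import torch
import torch.nn as nn

class TinyFaceNet(nn.Module):
    def __init__(self, n_identities=100):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 256), nn.ReLU(),
            nn.Linear(256, n_identities),       # one output unit per face identity
        )

    def forward(self, x):                       # x: (batch, 3, 64, 64) face images
        return self.classifier(self.features(x))

logits = TinyFaceNet()(torch.randn(1, 3, 64, 64))
print(logits.shape)                             # torch.Size([1, 100])
```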
First, this review examines the performance of DCNNs in identifying key facial attributes. Although most DCNNs are trained only for face identity tasks, they can still infer social information such as gender and expression. In addition, this review also discusses the similarities and differences between DCNNs and humans in well-known face processing phenomena, such as the inversion, own-race, and familiarity effects. Evidence suggests that DCNNs can produce face-specific cognitive effects similar to those observed in humans. To better understand the computational validity of DCNNs, this review compares their internal representations with the neural mechanisms involved in human face recognition. On the one hand, this paper analyzes the hierarchical processing architecture that emerges in trained DCNNs and evaluates its correspondence with the hierarchical structure of the human visual system, spanning from early visual areas (e.g., V1-V4) to higher-level face-selective regions such as the FFA. On the other hand, this review further discusses evidence for brain-like functional specialization within DCNNs, examining whether units selective to different facial attributes can be mapped onto the functionally specialized cortical areas observed in neuroimaging and electrophysiological studies.
Lastly, this paper highlights several limitations of current models and outlines promising directions for future research. First, although DCNNs excel at face recognition, they remain far less robust than humans when faced with challenges such as viewpoint shifts, image distortions, adversarial perturbations, and limited training data. Second, although DCNNs exhibit behavioral effects like those observed in humans, there are multiple possible explanations for the underlying mechanisms responsible for these phenomena. The DCNN models examined in different studies often vary in terms of architecture, task objectives, and training datasets, which may affect the comparability of their results. Third, the extent to which current models can capture essential features of the biological visual system remains unclear. Specifically, many DCNNs operate as feedforward architectures and lack critical elements such as recurrent processing, top-down feedback, and dynamic attentional modulation, all of which are fundamental characteristics of the human visual system. Fourth, current neural network models primarily focus on the perceptual stage underlying face recognition. Future research should aim to incorporate semantic-level processing to more fully capture the complexity of human face perception. Fifth, Generative Adversarial Networks (GANs), which are powerful tools for generating diverse facial stimuli, have recently attracted significant attention, enabling more controlled and flexible investigations of face perception. Integrating GANs with DCNNs has also enhanced our understanding of the mechanisms underlying facial representation, making it a promising direction for future research.
Neural Simulation-based Inference: A Neural Network and Simulation-based Inference Approach to Cognitive Modelling
Pan Wanke, Hu Chuanpeng
2025, 48(4): 826-835.  DOI: 10.16719/j.cnki.1671-6981.20250406
Cognitive computational modeling quantifies human mental processes using mathematical frameworks, thereby translating cognitive theories into testable hypotheses. Modern cognitive modeling involves four interconnected stages: defining models by formalizing symbolic theories into generative computational frameworks, collecting data through hypothesis-driven experiments, inferring parameters to quantify cognitive processes, and evaluating or comparing models. Parameter inference, a critical step that facilitates the integration of models and data, has traditionally relied on maximum likelihood estimation (MLE) and Bayesian methods like Markov Chain Monte Carlo (MCMC). These approaches depend on explicit likelihood functions, which become computationally intractable for complex models—such as those with nonlinear parameters (e.g., learning dynamics) or hierarchical/multimodal data structures.
To address these challenges, simulation-based inference (SBI) emerged, leveraging parameter-data mappings via simulations to bypass likelihood calculations. Early SBI methods, however, faced computational redundancy and scalability limitations. Recent advances in neural simulation-based inference (NSBI), or neural amortized inference (NAI), harness neural networks to pretrain parameter-data relationships, enabling rapid posterior estimation.
Despite its advantages, NSBI remains underutilized in psychology due to technical complexity. This work focuses on neural posterior estimation, one of three NSBI approaches alongside neural likelihood estimation and neural model comparison. Neural posterior estimation operates in two phases: training and inference. During the training phase, parameters are sampled from prior distributions, and synthetic data are generated using the model; a neural network is then trained to approximate the true posterior from these training pairs. In the inference stage, real data are input to the trained network to generate parameter samples. The BayesFlow framework enhances neural posterior estimation by integrating normalizing flows—flexible density estimators—and summary statistic networks, enabling variable-length data handling and unsupervised posterior approximation. Its GPU-accelerated implementation further boosts efficiency.
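The following Python sketch illustrates the two phases of neural posterior estimation in schematic form; it substitutes a simple Gaussian conditional density estimator for the normalizing flows used by frameworks such as BayesFlow, and the simulator, prior, and summary statistics are invented purely for illustration:

```python
# Schematic sketch of neural posterior estimation: (1) train on simulated
# parameter-data pairs, (2) apply the trained network to observed data.
import torch
import torch.nn as nn

def simulator(theta, n_obs=50):
    # Toy generative model: Gaussian observations centered on the parameter theta.
    return theta.unsqueeze(-1) + torch.randn(theta.shape[0], n_obs)

net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))  # outputs (mu, log_sigma)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# Phase 1 (training): sample parameters from the prior, simulate data, and train
# the network to approximate the posterior from hand-crafted summary statistics.
for _ in range(2000):
    theta = torch.randn(128)                              # prior: theta ~ N(0, 1)
    x = simulator(theta)
    summary = torch.stack([x.mean(1), x.std(1)], dim=1)   # summary statistics
    mu, log_sigma = net(summary).unbind(dim=1)
    loss = -torch.distributions.Normal(mu, log_sigma.exp()).log_prob(theta).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Phase 2 (inference): feed observed data through the trained network to obtain
# an amortized posterior without any further simulation or MCMC.
x_obs = simulator(torch.tensor([0.7]))
mu, log_sigma = net(torch.stack([x_obs.mean(1), x_obs.std(1)], dim=1)).unbind(dim=1)
print(float(mu), float(log_sigma.exp()))                  # approximate posterior mean and sd
```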
Neural posterior estimation has expanded the scope of evidence accumulation models (EAMs), one of the most widely used frameworks in cognitive modeling. First, it enables large-scale behavioral analyses, as demonstrated by von Krause et al. (2022), who applied neural posterior estimation to drift-diffusion models (DDMs) for 1.2 million implicit association test participants. By modeling condition-dependent drift rates and decision thresholds, they revealed age-related nonlinear cognitive speed changes, peaking at age 30 and declining post-60. Neural posterior estimation completed inference in 24 hours versus MCMC’s 50+ hours for a small subset, demonstrating its scalability.
Second, neural posterior estimation supports dynamic decision-making frameworks, exemplified by Schumacher et al. (2023), who combined high-level dynamics with low-level mechanisms using recurrent neural networks (RNNs). Their simultaneous estimation of hierarchical parameters achieved over 0.9 recovery correlations and superior predictive accuracy compared to static models.
Finally, neural posterior estimation facilitates neurocognitive integration, as shown by Ghaderi-Kangavari et al. (2023), who linked single-trial EEG components (e.g., CPP slope) to behavior via shared latent variables like drift rate. This approach circumvented intractable likelihoods and revealed associations between CPP slope and non-decision time.
NSBI enhances cognitive modeling by enabling efficient analysis of complex, high-dimensional datasets. Its key limitations include model validity risks (biased estimates from incorrect generative assumptions), overfitting concerns (overconfident posteriors on novel data), and upfront training costs for amortized methods. Future work should refine validity checks—such as detecting model misspecification—and develop hybrid inference techniques. NSBI’s potential extends to computational psychiatry and educational psychology, promising deeper insights into cognition across domains. By addressing complexity barriers, NSBI could democratize advanced modeling for interdisciplinary research, advancing our understanding of human cognition through scalable, data-driven frameworks.
Bayesian Observer Models in Visual Perception
Sun Qi
2025, 48(4): 836-846.  DOI: 10.16719/j.cnki.1671-6981.20250407
It is widely proposed that the perception of the visual world operates as a form of statistical inference based on uncertain evidence. In this context, researchers have developed various computational models to elucidate the process of inference. This study primarily reviews computational models grounded in the Bayesian inference framework, including:
(1) The Classical Bayesian Observer Model: This model optimally combines prior knowledge of a specific physical feature with the likelihood distribution of the sensory measurement given a particular feature value. It effectively accounts for the prior-peak-compression bias (i.e., Bayesian bias); a minimal numerical sketch of this prior-likelihood combination follows the list below.
(2) The Efficient-Coding Constrained Bayesian Observer Model: This model efficiently encodes physical features based on prior information and subsequently decodes these features using Bayes' rule. Notably, it distinguishes between external (physical) noise and internal (sensory) noise that influences stimulus certainty. The results demonstrated that this model adequately explains both Bayesian and anti-Bayesian biases.
(3) The Hierarchical Bayesian Observer Model: This model posits that the sensory system first derives conclusions based on a specific context, which are then used to constrain subsequent Bayesian inference processes, thereby establishing a hierarchical inference framework. Additionally, the context typically generates multiple hypotheses, allowing observers to optimally integrate conclusions from all hypotheses or select the conclusion with the highest probability. The former scenario leads to a full inference hierarchical Bayesian observer model, while the latter results in a conditional inference hierarchical Bayesian observer model. Recent studies have shown that the performance of these two models in explaining behavioral data is influenced by the distance between the feature value and the conditional boundary.
(4) The Holistic Bayesian Observer Model: This model suggests that context and Bayesian inferences occur in parallel, with the sensory system weighting the integration losses from both inferences to perceive the physical feature. Furthermore, this model encompasses the motor system, including motor noise when observers adjust probes to reproduce stimuli.
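The minimal numerical sketch below, referenced in model (1) above, combines a Gaussian prior with a Gaussian likelihood on a grid; all numbers are illustrative, and the posterior-mean estimate is pulled toward the prior peak, reproducing the Bayesian bias:

```python
# Grid-based classical Bayesian observer: posterior = prior x likelihood,
# estimate = posterior mean, which is compressed toward the prior peak.
import numpy as np

s = np.linspace(-10, 10, 2001)                     # hypothesized stimulus values
prior = np.exp(-0.5 * (s / 2.0) ** 2)              # prior peaked at 0, sd = 2
m = 3.0                                            # noisy sensory measurement
likelihood = np.exp(-0.5 * ((s - m) / 1.5) ** 2)   # sensory noise sd = 1.5
posterior = prior * likelihood
posterior /= np.trapz(posterior, s)

estimate = np.trapz(s * posterior, s)              # posterior-mean estimate
print(round(estimate, 2))                          # < 3.0: compressed toward the prior peak
```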
Upon reviewing these models, the current study found that each exhibits distinct advantages and limitations in elucidating perceptual biases and variance associated with different physical features. In light of these findings, this study developed a comprehensive computational model (illustrated in Figure 2 of the paper) that integrates both bottom-up and top-down perceptual processes, as well as the motor control system including motor noise and the perceptual-response mapping scaling process. This new model effectively balances the strengths and weaknesses of previous Bayesian inference models.
The comprehensive Bayesian observer model proposed in this study is constructed from a theoretical perspective, based on a systematic comparison of the aforementioned Bayesian observer models, and incorporates their respective strengths and weaknesses. However, it still lacks empirical support from behavioral and physiological evidence. Future studies will need to design rigorous behavioral and physiological experiments to provide empirical data that substantiate the validity of the model. In this regard, we propose the following questions that need to be addressed:
(1) What are the computational and physiological mechanisms underlying the integration of long-term and short-term priors?
(2) Does the perception-to-motor mapping process, which is influenced by the response range, interact with the range of the short-term prior? If so, what are the underlying computational and physiological mechanisms?
(3) How does the prior resulting from the integration of long-term and short-term priors interact with context inference?
(4) Is the context inference in hierarchical Bayesian inference also parallel to context inference in feature estimation?
Answering these questions will help validate the new proposed model and enhance researchers’ understanding of perception and decision-making processes. This, in turn, will foster advancements in theory and practical applications across psychology, neuroscience, artificial intelligence, and related fields.
Human Learning Strategies in a Volatile Feedback Environment
Zhang Ruyuan, Gao Yuyan, Fang Zeming, Zhou Qiang
2025, 48(4): 847-860.  DOI: 10.16719/j.cnki.1671-6981.20250408
A volatile feedback environment is defined as one in which the association between actions and outcomes is uncertain and constantly changing. To adapt to such environments, people generally rely on two types of learning: associative learning and volatility learning. Most research explores these strategies using the modeling approaches of either reinforcement learning (RL) or dynamic Bayesian inference (DBI). However, much of the existing research has focused on individual learning processes under the assumption that one of the two modeling approaches is correct. Without directly comparing these two approaches, it is difficult to determine which one people actually use when learning.
This study aims to investigate which learning strategy, as reflected by the modeling approaches (i.e., RL or DBI), best accounts for learning behavior in a volatile feedback environment, and to assess whether these strategies vary with differences in associative probabilities. To simulate volatile feedback environments, we employed a volatile reversal learning task programmed using jsPsych, which was completed by 36 healthy participants. In this task, the probabilistic contingencies between stimuli and response options remained constant for a period of time (i.e., the stable phase) and fluctuated rapidly during another period (i.e., the volatile phase). Participants were informed that the association probability could change over time, but not when such changes would occur. In order to accurately track and adapt to changes in the environment, individuals must engage in both association learning (i.e., forming associations between cue stimuli and responses) and volatility learning (i.e., detecting how quickly the associations change). This task design enabled a more comprehensive evaluation of how individuals learn in dynamic and uncertain environments, extending beyond the scope of classic learning paradigms. It provides a more ecologically valid measure of learning strategies in volatile feedback environments. The manipulation of association probability differences was also implemented, with each participant completing two experimental conditions (high versus low association probability difference) within a counterbalanced within-subjects design. This manipulation allowed us to explore how variations in association probability impact individual learning strategies in volatile feedback environments.
This study quantitatively analyzes and compares individual learning behavior using several computational models within the frameworks of reinforcement learning (RL) and dynamic Bayesian inference (DBI). Specifically, RL focuses on optimizing behavioral policies based on feedback, emphasizing the manner in which individuals adjust future actions by computing prediction errors through interaction with the environment. In contrast, DBI places greater emphasis on probabilistic inference for modelling uncertainty, thereby enabling individuals to adapt flexibly to novel or ambiguous situations. The Bayesian approach relies on adjusting prior and posterior beliefs to better cope with a volatile feedback environment. The computational models were implemented in Python and were applied to fit participants’ task performance.
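The following toy sketch contrasts the two modeling logics on a short binary feedback sequence; it illustrates the model families in general form, not the specific models fitted in this study:

```python
# Delta-rule reinforcement learning versus a simplified dynamic Bayesian
# (Beta-Bernoulli with forgetting) belief update on the same outcome sequence.
outcomes = [1, 1, 0, 1, 1, 1, 0, 0, 0, 0]    # reward / no-reward feedback over trials

# Reinforcement learning: value moves toward each outcome via a prediction error.
v, alpha = 0.5, 0.3
for o in outcomes:
    v += alpha * (o - v)                     # prediction-error update
print(round(v, 3))

# Dynamic Bayesian inference (simplified): update a Beta(a, b) belief about the
# reward probability, with mild forgetting so old evidence decays in a volatile world.
a, b, decay = 1.0, 1.0, 0.9
for o in outcomes:
    a, b = decay * a + o, decay * b + (1 - o)
print(round(a / (a + b), 3))                 # posterior-mean estimate of reward probability
```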
Firstly, behavioral accuracy comparisons confirmed that manipulating the difference in associative probabilities effectively distinguished participants’ performance, indicating that participants’ choices were not random. More importantly, through computational modeling, we found that, among all the models, the Hidden Markov Model (HMM) best fitted individual learning behaviors. This suggests that individuals primarily employ Bayesian learning strategies that incorporate heuristics within the task. Furthermore, we found that individuals’ learning strategies remained consistent across different levels of associative probability differences. However, as the differences in associative probabilities decreased (i.e., the task became more difficult), individuals tended to estimate a higher environmental volatility, which led to a higher learning rate (i.e., they adjusted their choices more frequently).
These findings indicate that humans combine Bayesian inference with several heuristics to learn associations in a volatile reversal learning task. Tasks with smaller differences in associative probabilities, which are more difficult, induce higher estimates of environmental volatility in humans. This study highlights the flexibility of human learning and decision-making and motivates future computational models in this line of research.
Neural Computation and Modeling of Predictive Coding in Naturalistic Speech Comprehension
Zhang Xinmiao, Zhang Dan
2025, 48(4): 861-875.  DOI: 10.16719/j.cnki.1671-6981.20250409
During naturalistic speech comprehension, the brain faces multiple sources of uncertainty, including noise in the speech signal, linguistic ambiguity, and the transient, rapidly unfolding nature of auditory information. To effectively manage these challenges, the brain does not passively receive input but actively engages in predictive processing by integrating prior knowledge with current sensory evidence in a dynamic and adaptive manner. Predictive coding theory, grounded in the Bayesian brain hypothesis, posits that higher-level brain regions generate predictions and send them to lower levels, where incoming sensory inputs are compared against these predictions. The resulting prediction errors are then transmitted to higher levels to iteratively refine internal models and optimize information processing. With advances in naturalistic paradigms and large language models (LLMs), research on predictive coding has shifted from merely establishing its existence to systematically investigating its computational and neural underpinnings in greater depth.
This paper reviews recent progress in the neural computation and modeling of predictive coding during naturalistic language comprehension, focusing on three major methodological approaches: (1) language-model-based computation of prediction; (2) prediction from an inter-brain perspective; and (3) oscillation-based modeling of prediction. First, language models have been extensively employed to extract prediction-related features such as surprisal and entropy. These features are used in analyses like temporal response function (TRF) modeling to map multilayered linguistic predictions—at phonemic, lexical, syntactic, and higher levels—onto spatiotemporal brain dynamics recorded via EEG, MEG, or fMRI. Further studies reveal a high degree of alignment between the activations of LLMs and human brain responses, particularly during continuous natural speech processing, suggesting that LLMs may serve as biologically inspired models to infer predictive mechanisms in the brain. Second, studies adopting an inter-brain perspective have explored prediction in communication through inter-subject correlation (ISC) metrics. For instance, the observation that a listener’s brain activity can precede the speaker’s by several seconds indicates the predictive nature of comprehension and its critical role in communication success. Moreover, some studies have incorporated LLM-derived prediction-related features into ISC frameworks, further extending the applicability of predictive coding theory to multi-agent interactive contexts such as dyadic conversation or group interaction. Third, to elucidate the neural mechanisms of predictive coding, researchers have developed dynamic models that integrate prediction signals with neural oscillations, which offer a mechanistic account of how predictive processes shape oscillatory dynamics during naturalistic speech comprehension.
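As a schematic example of the first approach, the sketch below extracts token-level surprisal from a causal language model via the Hugging Face transformers library; alignment of model tokens to spoken words and the subsequent TRF modeling are omitted for brevity:

```python
# Compute token-level surprisal (in bits) from GPT-2 for use as a
# prediction-related predictor in encoding analyses.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = "The children went outside to play in the"
ids = tokenizer(text, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits                          # (1, seq_len, vocab_size)

log_probs = torch.log_softmax(logits[0, :-1], dim=-1)   # predictions for tokens 2..n
targets = ids[0, 1:]
surprisal = -log_probs[torch.arange(targets.numel()), targets] / math.log(2)
for token, s in zip(tokenizer.convert_ids_to_tokens(targets.tolist()), surprisal):
    print(f"{token!r}: {s.item():.2f} bits")             # higher surprisal = less predictable
```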
Building on these methodological foundations, this paper further reviews recent progress in predictive coding during naturalistic language comprehension across multiple levels of analysis. First, at the phenomenological level, recent findings demonstrate that predictive processing is a ubiquitous and robust feature of language comprehension, consistently observed across diverse paradigms and real-life listening conditions, including noisy environments and semantically ambiguous contexts. Second, at the computational level, the brain integrates context across multiple linguistic hierarchies, ranging from phonemes and words to syntactic and discourse structures, and generates structured, temporally extended predictions that are not confined to the next word, but encompass longer-range content organized along hierarchical time scales. Third, at the neural mechanism level, cross-frequency coupling, particularly between low-frequency phase (delta/theta) and high-frequency amplitude (beta/gamma), has been identified as a key mechanism for coordinating temporal and hierarchical aspects of prediction, providing a physiological substrate for multiscale linguistic predictive coding.
Despite substantial progress in recent years, several important questions remain. For instance, how does the brain flexibly modulate the precision and temporal range of its predictions under challenging conditions, such as noisy environments or weak contextual constraints? Future research on predictive coding in naturalistic speech comprehension may benefit from integrating naturalistic experimental paradigms with causal manipulation techniques and advanced computational modeling, in order to more effectively elucidate the dynamic and mechanistic foundations of predictive processing in real-world communication.
The Intervention Effect of Accuracy Prompt on Misinformation Sharing: An Experiment Based on GPT
Jiang Haoyang, Peng Xiaogang, Dong Yihan, Zhu Xiaolong, Peng Xiaozhe
2025, 48(4): 876-891.  DOI: 10.16719/j.cnki.1671-6981.20250410
Large language models (LLMs) are increasingly used for content generation and verification. However, their capability to accurately discern misinformation remains imperfect, frequently exhibiting confident yet incorrect outputs, known as "hallucinations". Investigating whether and how accuracy prompts—brief reminders prompting attention toward accuracy—can effectively enhance LLMs' misinformation discernment abilities is crucial. Such inquiry not only advances our understanding of LLMs' human-like cognitive processes and underlying mechanisms but also provides practical guidance for implementing simple yet effective interventions to improve content moderation in real-world applications.
In this study, we systematically examined whether accuracy prompts could enhance misinformation discernment abilities of GPT models. Three sequential studies were conducted, each targeting a distinct cognitive and material context. In Study 1, we employed classical news headline materials previously used in human misinformation intervention studies to test the robustness of accuracy prompts. The results indicated that both GPT-3.5 and GPT-4o models demonstrated significantly improved sharing discernment after receiving accuracy prompts, characterized by reduced intentions to share false news and increased intentions to share true news. Notably, GPT-4o showed stronger improvements compared to GPT-3.5, suggesting that advanced LLMs may be better able to use such prompts to realign their internal cognitive focus toward accuracy considerations.
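The sketch below illustrates how such an accuracy-prompt manipulation can be implemented with the OpenAI chat API; the prompt wording, rating scale, and headline are invented for illustration and are not the study's actual materials:

```python
# Compare a model's sharing-intention rating with and without an accuracy prompt.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

headline = "Example headline: Scientists discover a new species of deep-sea fish."
accuracy_prompt = "Before answering, consider how accurate the headline is."

def sharing_intention(model, with_accuracy_prompt):
    messages = []
    if with_accuracy_prompt:
        messages.append({"role": "system", "content": accuracy_prompt})
    messages.append({
        "role": "user",
        "content": f"On a scale of 1-6, how likely would you be to share this headline "
                   f"on social media? Answer with a single number.\n\n{headline}",
    })
    reply = client.chat.completions.create(model=model, messages=messages)
    return reply.choices[0].message.content

print(sharing_intention("gpt-4o", with_accuracy_prompt=False))
print(sharing_intention("gpt-4o", with_accuracy_prompt=True))
```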
Study 2 further tested the robustness and generalizability of accuracy prompts by employing TruthfulQA, a dataset that is specifically designed to probe reasoning and common misconceptions. These materials required the model to engage in deeper reasoning processes and cross-domain knowledge. Consistent with Study 1, accuracy prompts robustly improved GPT models' sharing discernment performance even in this cognitively demanding context. This suggests that the effect of accuracy prompt can generalize across different types of information and varying cognitive demands.
To further clarify whether the observed improvements resulted from genuine reasoning or simple retrieval of training data, Study 3 utilized recently emerging news materials published after GPT models' training cutoff date. Thus, the models could not rely on previously learned information. The results showed that accuracy prompts continued to significantly improve sharing discernment in GPT-4o, whereas GPT-3.5 showed limited improvement. These findings indicate that accuracy prompts effectively activate deeper cognitive processes, such as increased attention allocation towards assessing veracity and analytical reasoning, in advanced LLMs, thereby enhancing their capacity to evaluate novel misinformation.
Collectively, these three studies provide robust empirical evidence that simple accuracy prompts effectively enhance misinformation discernment capacities in GPT models by shifting their internal attentional focus toward assessing informational accuracy and triggering deeper analytical processes. Crucially, the observed effectiveness across classical, high-reasoning, and novel materials underscores the robustness and practical applicability of accuracy prompts as cognitive interventions within LLMs.
This research contributes theoretically and practically to the integration of psychological intervention strategies with artificial intelligence cognitive mechanisms. Specifically, it offers foundational insights for implementing psychologically informed interventions ("psychology for AI") that not only clarify cognitive analogies between human cognition and LLMs but also guide practical methodologies for enhancing LLMs' information discernment capabilities, ultimately benefiting real-world misinformation management and digital content verification.
Cognitive Biases in Artificial Intelligence: Susceptibility of a Large Language Model to Framing Effect and Confirmation Bias
Li Hao, Wang You, Yang Xueling
2025, 48(4): 892-906.  DOI: 10.16719/j.cnki.1671-6981.20250411
The rapid advancement of Artificial Intelligence (AI) and Large Language Models (LLMs) has led to their increasing integration into various domains, from text generation and translation to question-answering. However, a critical question remains: do these sophisticated models, much like humans, exhibit susceptibility to cognitive biases? Understanding the presence and nature of such biases in AI is paramount for assessing their reliability, enhancing their performance, and predicting their societal impact. This research specifically investigates the susceptibility of Google’s Gemini 1.5 Pro and DeepSeek, two prominent LLMs, to framing effects and confirmation bias. The study meticulously designed a series of experimental trials, systematically manipulating information proportions and presentation orders to evaluate these biases.
In the framing effect experiment, a genetic testing decision-making scenario was constructed. The proportion of positive and negative information (e.g., 20%, 50%, or 80% positive) and their presentation order were varied. The models’ inclination towards undergoing genetic testing was recorded. For the confirmation bias experiment, two reports—one positive and one negative—about “RoboTaxi” autonomous vehicles were provided. The proportion of erroneous information within these reports (10%, 30%, and 50%) and their presentation order were systematically altered, and the models’ support for each report was assessed.
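A minimal sketch of the factorial structure of the framing-effect manipulation (with illustrative labels only) is shown below:

```python
# Cross the proportion of positive information with presentation order to build
# the condition list for the framing-effect experiment described above.
from itertools import product

positive_proportions = [0.2, 0.5, 0.8]
orders = ["positive_first", "negative_first"]

conditions = [
    {"positive_proportion": p, "order": o}
    for p, o in product(positive_proportions, orders)
]
for c in conditions:
    print(c)   # each condition would be presented to the model across separate trials
```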
The findings demonstrate that both Gemini 1.5 Pro and DeepSeek are susceptible to framing effects. In the genetic testing scenario, their decision-making was primarily influenced by the proportion of positive and negative information presented. When the proportion of positive information was higher, both models showed a greater inclination to recommend or proceed with genetic testing. Conversely, a higher proportion of negative information led to greater caution or a tendency not to recommend the testing. Importantly, the order in which this information was presented did not significantly influence their decisions in the framing effect scenarios.
Regarding confirmation bias, the two models exhibited distinct behaviors. Gemini 1.5 Pro did not show an overall preference for either positive or negative reports. However, its judgments were significantly influenced by the order of information presentation, demonstrating a “recency effect,” meaning it tended to support the report presented later. The proportion of erroneous information within the reports had no significant impact on Gemini 1.5 Pro’s decisions. In contrast, DeepSeek exhibited an overall confirmation bias, showing a clear preference for positive reports. Similar to Gemini 1.5 Pro, DeepSeek’s decisions were also significantly affected by the order of information presentation, while the proportion of misinformation had no significant effect.
These results reveal human-like cognitive vulnerabilities in advanced LLMs, highlighting critical challenges to their reliability and objectivity in decision-making processes. Gemini 1.5 Pro’s sensitivity to presentation order and DeepSeek’s general preference for positive information, coupled with its sensitivity to order, underscore the need for careful evaluation of potential cognitive biases during the development and application of AI. The study suggests that effective measures are necessary to mitigate these biases and prevent potential negative societal impacts. Future research should include a broader range of models for comparative analysis and explore more complex interactive scenarios to further understand and address these phenomena. The findings contribute significantly to understanding the limitations and capabilities of current AI systems, guiding their responsible development, and anticipating their potential societal implications.
Effectiveness of Large Language Models in Simulating Regional Psychological Structures: An Empirical Examination of Personality and Subjective Well-being
Ke Luoma, Li Zengyi, Liao Jiangqun, Tong Song, Peng Kaiping
2025, 48(4): 907-919.  DOI: 10.16719/j.cnki.1671-6981.20250412
This study aimed to investigate the capacity of a large language model (LLM), specifically DeepSeek, for simulating regional psychological characteristics based solely on demographic information. In particular, it examined whether DeepSeek can preserve culturally distinct psychological patterns without reducing them to oversimplified, flattened profiles, with a focus on personality traits and subjective well-being across different regions of China. Utilizing a sample matched to demographic features from the 2018 China Family Panel Studies (CFPS2018) (N = 2,943), the research generated artificial "virtual participants" with DeepSeek. The simulated dataset was compared to real human responses from CFPS to analyze regional differences in Big Five personality traits (openness, conscientiousness, extraversion, agreeableness, neuroticism) and subjective well-being.
Methodologically, the empirical human dataset comprised adult participants from CFPS 2018, covering seven culturally and socioeconomically distinct Chinese regions (North China, Northeast, East China, Central China, South China, Southwest, and Northwest). Each region had an equal number of males and females aged from 18 to 65. Personality was measured using a simplified 15-item Chinese Big Five inventory, while subjective happiness was assessed using a single-item self-rating scale. Correspondingly, a matched virtual dataset of equivalent size and demographic distribution was generated using DeepSeek-V3-0324, with constructed prompts designed to mirror the demographics and cultural context of the actual participants. The virtual participants responded to identical psychological assessments, ensuring comparability.
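The sketch below illustrates how a demographically matched virtual-participant prompt might be constructed; the field names, wording, and questionnaire item are invented for illustration and are not the study's actual prompts:

```python
# Build a persona prompt for one "virtual participant" from demographic fields,
# mirroring the demographic-matching logic described above.
def build_prompt(participant):
    return (
        f"You are a {participant['age']}-year-old {participant['gender']} living in "
        f"{participant['region']}, China. Answer the following questionnaire item as "
        f"this person would, with a single number from 1 (strongly disagree) to 5 "
        f"(strongly agree):\n"
        "\"I see myself as someone who is talkative.\""
    )

print(build_prompt({"age": 42, "gender": "female", "region": "Northeast China"}))
```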
Results from independent-sample t-tests indicated overall similarity, alongside significant differences between the human and AI-generated data in certain respects. Specifically, the virtual dataset closely mirrored human data in terms of personality and happiness distributions, but exhibited significant differences in several traits. Simulated participants scored significantly lower in extraversion and openness (with medium to large effect sizes) and higher in agreeableness and neuroticism compared to human data. Happiness levels in the simulated dataset were consistently lower, suggesting limitations in DeepSeek’s capacity to replicate subjective emotional experiences accurately.
Further ANOVA analyses revealed that both datasets reflected significant regional differences in personality traits and happiness. For example, in human responses, the Southwest region demonstrated significantly higher extraversion, while the Northeast region exhibited higher subjective happiness. However, DeepSeek’s simulated data diverged from these patterns, notably underestimating happiness in the Northeast and overestimating certain personality dimensions in economically prosperous East China.
Additionally, regression analyses explored the relationship between personality traits and subjective happiness within both datasets. In the human data, conscientiousness, extraversion, and openness were significant positive predictors of happiness, while neuroticism was a negative predictor. The virtual data, however, showed a different structure: openness and agreeableness positively predicted happiness, neuroticism predicted happiness negatively and significantly more strongly, extraversion negatively predicted happiness, and conscientiousness had no significant predictive effect. Principal Component Analysis (PCA) further highlighted structural differences between the human and simulated datasets, particularly reflecting an overreliance on more linguistically salient and externally expressed traits in the AI-generated responses.
These findings contribute significantly to the understanding of LLM applications in psychological research. Primarily, they demonstrate DeepSeek’s general effectiveness in simulating broad psychological distributions, while also highlighting its limitations in capturing region-specific psychological structures shaped by the interplay of economic conditions, cultural norms, and psychological dispositions—limitations likely stemming from the model’s training data, which insufficiently represents these layered contextual factors.
The practical implications of this research are substantial. The use of DeepSeek as a tool for generating "virtual participants" could significantly reduce costs and logistical burdens associated with large-scale psychological research, enabling preliminary testing and refinement of research designs prior to field deployment. However, caution is recommended due to observed biases, including exaggerated cultural stereotypes and inadequate modeling of subjective emotional states. Future model iterations and methodological advancements should address these issues by incorporating richer, more culturally grounded training data and more precise affective modeling techniques.
Despite these limitations, the research provides important methodological insights and theoretical contributions by introducing an innovative approach using LLM-generated virtual participants for psychological inquiry. It underscores the potential of DeepSeek and similar models for cost-effective large-scale research while highlighting crucial areas that require further refinement.
In conclusion, this study validates the feasibility of employing large language models such as DeepSeek for simulating regional psychological structures, but also emphasizes the necessity for continued development to address culturally grounded and psychologically meaningful variations effectively. As training data and algorithms advance, these models may help reshape methodologies within personality and cross-cultural psychological research.
Who is More Trustworthy, AI or Humans? Modeling Human-AI Interactions in Repeated Trust Games with Reinforcement Learning
Tan Haotian, Li Zeqing, Wu Zhen
2025, 48(4): 920-932.  DOI: 10.16719/j.cnki.1671-6981.20250413
With the explosive advancement of artificial intelligence (AI), human society is entering a new era of “Human-AI Interaction.” The chatbots we interact with, the algorithmic recommendations embedded in software, and the autonomous driving systems in vehicles all rely on AI algorithms. Moreover, AI has become increasingly integrated into critical fields such as education, healthcare, and finance. Undeniably, it has become an indispensable component of modern life. In this context, understanding trust in Human-AI Interaction not only contributes to the development of prosocial AI but also holds significant value for promoting human-AI collaboration and advancing social harmony.
As one of the core mechanisms of human social interaction, trust is defined as the willingness to hold positive expectations of others’ behaviors and expose oneself to potential risks when unable to control the actions of others. AI is reshaping how people interact with their environments, and trust has now extended from human-human relationships to human-AI interactions. Existing research has primarily used self-reported questionnaires or one-shot interactions to identify key factors influencing human trust in AI. However, these “snapshot” studies provide limited insight into the dynamic development of trust. As seen in human-human interactions, trust typically develops over repeated encounters. The mechanisms underlying trust formation in human-AI interactions remain unclear.
To bridge these gaps, the present study compared the similarities and differences between human-AI trust and human-human trust using a repeated trust game paradigm, and used a reinforcement learning model to explore the computational mechanisms underlying the development of trust in human-AI interactions. A total of 148 participants completed the experiment. During the task, participants played 20 rounds of the trust game with each of 6 different trustees, either human or AI. Each trustee's reciprocal behavior was governed by a predetermined return probability (25%, 50%, or 75%), randomly assigned by the program. Trustees were distinguished by different images and labels, and their order was counterbalanced within participants using a Latin square design. To enhance ecological validity, the return probabilities were not disclosed to participants.
The results showed that, at the beginning of the trust game, participants did not invest differentially in human versus AI trustees in any condition. Over repeated interactions, however, participants showed higher investment probabilities when interacting with AI than with humans. Moreover, participants were less likely to stop investing after experiencing a loss when interacting with AI, indicating fewer loss-shift behaviors. In contrast, after successful investments, participants' behavior did not differ across trustee types.
A hyperbolic model revealed an interaction between trustee type and return probability. While return probabilities did not significantly affect the temporal trajectory of investment behavior toward AI, they did influence behavior in human interactions. Specifically, lower return probabilities led to a steeper decline in investment rates toward human trustees. Moreover, participants exhibited a higher asymptotic level of investment when interacting with AI, regardless of return probability, suggesting that if the trust game were to continue indefinitely, the probability of investing in AI would remain higher.
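The trajectory analysis can be illustrated with a brief sketch under an assumed parameterization (the abstract does not give the exact functional form): investment probability decays hyperbolically from an initial level toward an asymptote, with synthetic data standing in for the observed round-by-round rates.

```python
# Sketch of a hyperbolic fit to round-by-round investment rates.
# The parameterization and the data below are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

def hyperbolic(t, p0, asymptote, k):
    """Investment probability at round t, decaying from p0 toward an asymptote."""
    return asymptote + (p0 - asymptote) / (1.0 + k * t)

rounds = np.arange(20)
rng = np.random.default_rng(0)
observed = np.clip(hyperbolic(rounds, 0.9, 0.55, 0.4) + rng.normal(0, 0.03, rounds.size), 0, 1)

params, _ = curve_fit(hyperbolic, rounds, observed, p0=[0.8, 0.5, 0.5])
print(dict(zip(["p0", "asymptote", "k"], np.round(params, 3))))
```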
Further, the results of the reinforcement learning model indicated that the four-parameter model incorporating trustee type provided the best fit. The results showed that participants exhibited lower negative feedback learning rates and higher temperature parameters when interacting with AI, suggesting reduced sensitivity to negative outcomes and increased exploratory behavior. There were no significant differences in positive feedback learning rates or initial expected utility between human and AI trustees.
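A minimal sketch of such a four-parameter model is given below. It is an illustration consistent with the description (separate learning rates for positive and negative feedback, a softmax temperature, and an initial expected utility), not the authors' exact specification; the synthetic choices and outcomes are placeholders.

```python
# Sketch of a four-parameter reinforcement-learning model of the repeated trust game.
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, choices, outcomes):
    """choices[t]: 1 = invest, 0 = keep; outcomes[t]: 1 = trustee returned, 0 = trustee kept."""
    alpha_pos, alpha_neg, temperature, q_init = params
    q_invest, q_keep = q_init, 0.0          # expected utility of investing vs. keeping
    ll = 0.0
    for choice, outcome in zip(choices, outcomes):
        p_invest = 1.0 / (1.0 + np.exp(-(q_invest - q_keep) / temperature))
        ll += np.log(p_invest if choice == 1 else 1.0 - p_invest)
        if choice == 1:                       # update only the chosen option
            delta = outcome - q_invest        # prediction error (reward coded 0/1)
            q_invest += (alpha_pos if delta >= 0 else alpha_neg) * delta
    return -ll

# Synthetic example: 20 rounds with a 75%-return trustee.
rng = np.random.default_rng(1)
choices = rng.integers(0, 2, 20)
outcomes = (rng.random(20) < 0.75).astype(int)
fit = minimize(neg_log_lik, x0=[0.3, 0.3, 0.5, 0.5], args=(choices, outcomes),
               bounds=[(0.01, 1), (0.01, 1), (0.05, 5), (0, 1)])
print(fit.x)  # alpha_pos, alpha_neg, temperature, initial expected utility
```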
Taken together, these findings suggest that, although there is no initial trust bias toward humans or AI, participants gradually develop greater trust in AI over repeated interactions. Moreover, participants exhibit greater tolerance for non-cooperative behaviors displayed by AI. Trust in human-AI interactions appears to be more resilient. These results provide the first evidence of differences in the trust formation processes between human-human and human-AI interactions and offer a foundation for understanding the cognitive adaptation processes elicited by AI in social interactions.
The Heterogeneity of the Effectiveness of Human-AI Collaborative Decision-Making: A Four-Stage Process Influence Model Focusing on Agency
Geng Xiaowei, Li Xinqi, Xu Zhiping, Xie Tian
2025, 48(4): 933-947.  DOI: 10.16719/j.cnki.1671-6981.20250414
Abstract ( 131 )   PDF (682KB) ( 97 )   peer review(388KB)  
The integration of artificial intelligence (AI) into decision-making has revolutionized fields such as healthcare, finance, and criminal justice by offering the potential for enhanced efficiency and accuracy through human-AI collaborative decision-making (HAIC-DM). However, empirical outcomes remain inconsistent. While AI augments human capabilities in complex tasks (e.g., AI-assisted medical diagnostics matching expert performance), it can also degrade performance in simpler tasks through cognitive redundancy or overreliance (e.g., automation bias in image recognition). This heterogeneity stems from unresolved tensions between technological potential and human factors. First, misaligned task allocation often undermines complementary strengths: AI excels at structured, data-driven tasks (e.g., credit scoring), but its limitations in contextual reasoning and ethical judgment necessitate human oversight, a balance frequently disrupted in practice. Second, asymmetric trust dynamics skew collaboration: opaque AI systems (e.g., "black-box" algorithms) foster overreliance or distrust, as seen in radiologists uncritically accepting erroneous high-confidence AI diagnoses. Third, bias amplification arises when algorithmic biases (e.g., racial disparities in recidivism prediction tools) intersect with human cognitive heuristics (e.g., anchoring effects), creating self-reinforcing error cycles that exacerbate inequities in judicial and hiring decisions. The urgency of reconciling AI's computational power with human agency, particularly in ethically sensitive contexts, underscores the need for systematic exploration of collaborative mechanisms and risks.
This study synthesizes 54 empirical studies from computer science, psychology, and organizational research (2018-2024). These studies were retrieved from the ACM Digital Library, the Web of Science, and the AIS eLibrary, using keywords such as "human-AI collaboration" and "decision-making." The inclusion criteria prioritized quantitative assessments of HAIC-DM performance (human-only, AI-only, and collaborative outcomes). A thematic analysis was conducted to identify recurring patterns in task characteristics (e.g., structured vs. unstructured goals), interaction designs (e.g., explanation formats), and moderators (e.g., user expertise). A four-stage Process Impact Model was developed, integrating principles from symbiosis theory and distributed cognition. Case studies (e.g., healthcare diagnostics, autonomous driving) were analyzed to validate stage-specific mechanisms, and experimental findings (e.g., trust calibration experiments) informed theoretical refinements.
The proposed model identifies four interdependent stages that govern the efficacy of HAIC-DM:
(1) Strengths/Biases Recognition: AI excels at structured tasks (e.g., fraud detection), while humans dominate ethical judgments. Biases, both algorithmic (e.g., historical data biases) and cognitive (e.g., anchoring effects), distort collaboration, as in judicial decisions where humans and AI redundantly overemphasize prior convictions.
(2) Context-Driven Task Allocation: Optimal allocation improves accuracy (e.g., AI pre-screening cancer images plus human validation boosts diagnostic accuracy by 15%), whereas misallocation (e.g., AI-led creative writing) yields superficial outputs.
(3) Trust Calibration: Example-based explanations improve the discernment of advice (+22% accuracy in income prediction), yet opaque AI systems induce overreliance, as when radiologists accept erroneous high-confidence diagnoses.
(4) Adaptive Dependency: Balanced reliance maximizes efficacy (e.g., AI risk alerts in autonomous driving combined with human ethical oversight), but over-dependence triggers cognitive offloading and skill erosion (e.g., lawyers who rely excessively on AI for contract analysis).
This study advances HAIC-DM research by framing collaboration as a co-evolutionary process. It emphasizes bidirectional adaptation between humans (critical thinking, ethical oversight) and AI (transparency, contextual learning). The Process Impact Model clarifies how dynamic interactions, from bias recognition to dependency calibration, determine efficacy. It offers actionable insights for optimizing task allocation and trust mechanisms. Future work must prioritize shared mental models to align AI’s computational logic with human intuition, particularly in high-stakes domains like healthcare and criminal justice. Institutional reforms, including ethical governance frameworks and mandatory human oversight protocols, are critical to mitigate risks like accountability erosion. Fostering synergistic interdependence, where AI augments human cognition without supplanting agency, is key to realizing the vision of "humans as ethical navigators, AI as precision enablers." This alignment ensures that collaborative intelligence enhances, rather than undermines, societal decision-making in an AI-augmented future.
From Para-social Interaction to Attachment: The Evolution of Human-AI Emotional Relationships
Wu Yan, Geng Xiaowei, Zhou Xiaolin
2025, 48(4): 948-961.  DOI: 10.16719/j.cnki.1671-6981.20250415
Abstract ( 194 )   PDF (1063KB) ( 199 )   peer review(1047KB)  
The rapid advancement of artificial intelligence (AI) technology and the widespread emergence of AI companions have transformed human-AI interaction from purely instrumental use to quasi-social engagement, potentially evolving into emotional attachment. This article systematically reviews two decades of interdisciplinary research in psychology and human-AI interaction, proposing a theoretical model to elucidate the formation of human-AI attachment. The study identifies three key findings: (1) Human-AI relationships undergo a dynamic progression from instrumental use to quasi-social interaction and, ultimately, to emotional attachment. (2) The development of AI attachment is influenced by dual pathways: individual factors (e.g., loneliness, usage motivation, emotional traits) and AI characteristics (e.g., anthropomorphism, autonomy, responsiveness). (3) This novel emotional bond raises ethical concerns, including emotional bubbles, privacy risks, and interpersonal alienation.
The article constructs a triphasic model to delineate the evolution of human-AI emotional bonds: (1) Instrumental Use, where AI serves as a functional tool with minimal emotional engagement; (2) Quasi-Social Interaction, marked by anthropomorphism and bidirectional communication, though users remain aware of AI's non-human nature; and (3) Emotional Attachment, characterized by deep dependency, where AI becomes a “significant other” and a transitional object for emotional security. This model highlights the continuum of emotional investment, from functional commands to intimate self-disclosure and separation anxiety.
The dual-path mechanism underpinning AI attachment formation integrates user-driven needs (e.g., social motivation, loneliness) and AI-driven performance (e.g., authenticity, autonomy, reactivity). AI’s “backstage” features—privacy, non-judgmental feedback, and identity fluidity—foster a “digital sanctuary” for authentic self-expression, reinforcing attachment. However, excessive reliance on AI may lead to emotional bubbles (illusory reciprocity), self-deception, and real-world social skill deterioration. Ethical dilemmas arise from AI’s hyper-personalized emotional mimicry, which risks manipulating vulnerable users and exacerbating societal isolation.
Despite its contributions, current research suffers from limitations, including cross-sectional designs, homogeneous samples (e.g., overrepresentation of young users), and a lack of neurobiological evidence. Future directions call for longitudinal studies, multimodal data, and investigations into AGI’s potential to disrupt traditional attachment paradigms through bidirectional emotional capacities. Practical implications urge developers to embed ethical safeguards (e.g., transparency in emotional algorithms), policymakers to establish risk-assessment frameworks, and users to cultivate digital literacy for healthier human-AI coexistence.
This study not only advances theoretical frameworks for digital-era attachment but also prompts philosophical reflection on the essence of intimacy, challenging conventional definitions of love and “inter-subjectivity” in an age where AI blurs the boundaries between tool and companion. Balancing technological innovation with ethical vigilance is paramount to ensuring the sustainable development of human-AI relationships.
The Abstraction and Generalization of Social Decision Information
Wang Han, Dong Yulin, Liu Ningfeng, Zhu Lusha
2025, 48(4): 962-971.  DOI: 10.16719/j.cnki.1671-6981.20250416
Abstract ( 76 )   PDF (680KB) ( 123 )   peer review(303KB)  
Recent advances in artificial intelligence (AI), cognitive psychology, and neuroscience have significantly enhanced our understanding of abstraction and generalization, that is, how agents extract key features from complex decision environments to support efficient and generalizable decision-making. While extensive research has elucidated the mechanisms of abstraction and generalization in non-social contexts, such as rule-based learning, far less is known about how these processes operate in social domains. During social interactions, agents must not only filter out irrelevant details and abstract core decision-relevant information (e.g., concepts, perspectives, strategies), but also infer whether their internal representations align with those of others. Investigating how multi-agent systems understand, utilize, and generalize relevant information in service of effective interactions is increasingly critical for understanding the mechanisms underlying general social intelligence. Building on recent findings in non-social decision-making, this paper outlines future research directions for studying abstraction and generalization in social contexts.
Specifically, in the non-social domain, studies of generalization have highlighted two core mechanisms: rule-based strategies, which involve hierarchical categorization of features, and similarity-based approaches, such as analogical reasoning via prototype matching. At the computational level, a prominent example in this area is the successor representation (SR) model, which provides a unified computational framework for AI, behavioral and neuroscience research. Developed in close relation to reinforcement learning theories, SR compresses decision states by encoding predictive relationships, thereby enabling rapid adaptation to reward changes. Inspired by SR predictions, emerging neurobiological evidence has implicated the hippocampal and prefrontal representations of abstract knowledge, potentially facilitating knowledge transfer and strategy generalization. Parallel research in AI demonstrates how deep successor reinforcement learning (DSR) can leverage SR to achieve cross-task generalization in navigation and robotic design. These findings underscore abstraction as a conserved yet effective mechanism for flexible decision-making across both biological and artificial systems.
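As a toy illustration of the SR idea described above (not a model from any cited study), the SR matrix stores discounted expected future state occupancies under a fixed policy, so state values can be recomputed immediately when rewards change:

```python
# Toy successor representation (SR) example on a 4-state chain.
import numpy as np

gamma = 0.9
# Transition matrix T[s, s'] under a fixed policy; the last state is absorbing.
T = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])
M = np.linalg.inv(np.eye(4) - gamma * T)   # SR matrix: M = (I - gamma * T)^(-1)

reward = np.array([0.0, 0.0, 0.0, 1.0])
value = M @ reward                          # state values under the current rewards
new_reward = np.array([0.0, 0.5, 0.0, 0.0])
new_value = M @ new_reward                  # rapid revaluation: M itself needs no relearning
print(value, new_value)
```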
However, social interactions introduce unique computational demands. Agents must extract and organize relevant information, while negotiating shared intentions and conventions during cooperative interactions or strategizing to outmaneuver opponents during competitive interactions. Despite progress in characterizing the neural and computational mechanisms of various social cognitive functions, fundamental questions remain. For example, how do interacting agents form aligned or misaligned abstractions? How do goals and environmental constraints shape these abstractions, and how do these, in turn, influence decision strategies? What computational and neural mechanisms give rise to the dynamic alignment or divergence of abstractions and generalization across individuals?
Three interconnected open questions critical to future research should be addressed. First, effective social cooperation typically requires abstraction hierarchies and knowledge organization shared across interacting agents. While neural coupling across cooperative individuals is well-documented, the computational role of such alignment in facilitating social decision-making and its strategy generalization remains unclear. Research in AI suggests that alignment enhances cross-task performance, but its relevance to human social cognition requires further investigation.
Second, abstractions of decision-relevant information may vary across individuals with divergent social experiences, network positions, and cultural backgrounds. Therefore, it is critical to elucidate, at a computational level, when and how individual differences give rise to misaligned internal representations, which may contribute to phenomena such as prejudice and polarization. Investigating these misaligned representations will shed light on how social identities shape perception and decision-making, ultimately informing strategies to mitigate bias and foster social cohesion.
Finally, successful social cooperation often depends on aligning initially misaligned representations across individuals. It is important to identify the mechanisms that support such interpersonal adaptation. Although prior research in AI and evolutionary biology has highlighted the benefits of alignment dynamics, the cognitive and neurocomputational processes that govern these internal changes remain to be explored.
Together, this paper proposes to bridge ideas and computational methods in social decision-making, AI, and cognitive neuroscience for developing a mechanistic understanding of abstraction and representation. Such an integrative framework holds the potential to reveal the computational principles of general social intelligence and inspire the design of socially intelligent systems.
Can the Public Accept AI-Generated Health News? Experimental Evidence from Trust Mediation and Negative Expectancy Violation Moderation
Na Yuxiang, Liu Yingxuan, Lai Kaisheng
2025, 48(4): 972-984.  DOI: 10.16719/j.cnki.1671-6981.20250417
Abstract ( 73 )   PDF (1377KB) ( 117 )   peer review(538KB)  
The potential of artificial intelligence (AI) in health journalism is becoming increasingly recognized, offering promising opportunities to enhance information dissemination. By automating routine news production, AI enables journalists to allocate more time and resources to strategic health initiatives, policy development, and in-depth investigative reporting. However, the public's trust in and acceptance of AI-generated health information remain underexplored. Prior research suggests that individuals may actively reject or avoid AI-provided information, a phenomenon known as “AI information avoidance”. This poses a significant challenge to the effectiveness of AI-driven health news: if the public distrusts or resists health information disseminated by AI systems, the potential benefits of AI in health communication could be severely undermined.
In this context, it is critical to examine the underlying psychological processes that influence the public's engagement with AI health news. Expectation violation theory is a valuable framework for understanding these dynamics because it illuminates how deviations from psychological expectations affect trust and acceptance. This study explores the psychological mechanisms underlying the public's perception of AI news in the context of expectation violation, with the aim of strengthening the role of AI as a transformative tool for realizing the ideal vision of health journalism. Specifically, it examines the effect of news agent (AI vs. human) on the public's willingness to accept information, the mediating role of perceived trust, and the moderating role of negative expectation violation, using two between-subjects experiments and a moderated mediation model based on expectation violation theory.
The results of Experiment 1 showed a significant difference in participants' willingness to accept health news written by an AI agent versus a human agent (t = -6.75, p < .001, Cohen's d = 1.49), supporting Hypothesis 1. Perceived trust mediated the relationship between news agent and willingness to accept information: agent type had a significant effect on perceived trust (β = -1.28, t = -7.41, p < .001), and perceived trust had a significant effect on willingness to accept information (β = .94, t = 23.27, p < .001). The mediation analysis showed a nonsignificant direct effect and a significant indirect effect (β = -1.20, 95% CI [-1.54, -.88]), indicating full mediation and supporting Hypothesis 2. Furthermore, negative expectation violation moderated the relationship between news agent type and perceived trust (β = -.22, t = -2.34, p = .02): negative expectation violation exacerbated the detrimental effect of an AI author on perceived trust, supporting Hypothesis 3. Experiment 2, using a more rigorous design that directly manipulated the degree of negative expectation violation, replicated the interaction between news agent type and negative expectation violation on perceived trust, confirming the results of Experiment 1.
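The statistical logic of this moderated mediation can be sketched with ordinary regressions and a percentile bootstrap. The variable names (agent, violation, trust, accept) and the synthetic data are assumptions for illustration, not the authors' analysis code:

```python
# Sketch of a moderated mediation: agent -> trust (moderated by violation) -> accept.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def indirect_effect(df: pd.DataFrame, violation_level: float) -> float:
    """Conditional indirect effect of agent (0 = human, 1 = AI) at a given violation level."""
    a_path = smf.ols("trust ~ agent * violation", data=df).fit()
    b_path = smf.ols("accept ~ trust + agent", data=df).fit()
    a = a_path.params["agent"] + a_path.params["agent:violation"] * violation_level
    return a * b_path.params["trust"]

def bootstrap_ci(df: pd.DataFrame, violation_level: float, n_boot: int = 500, seed: int = 0):
    """Percentile bootstrap confidence interval for the conditional indirect effect."""
    rng = np.random.default_rng(seed)
    draws = [indirect_effect(df.iloc[rng.integers(0, len(df), len(df))], violation_level)
             for _ in range(n_boot)]
    return np.percentile(draws, [2.5, 97.5])

# Synthetic illustration data (placeholders for the experimental measures).
rng = np.random.default_rng(0)
n = 200
agent = rng.integers(0, 2, n)
violation = rng.normal(size=n)
trust = 4 - 1.2 * agent - 0.2 * agent * violation + rng.normal(size=n)
accept = 1 + 0.9 * trust + rng.normal(size=n)
data = pd.DataFrame(dict(agent=agent, violation=violation, trust=trust, accept=accept))
print(indirect_effect(data, violation_level=1.0), bootstrap_ci(data, violation_level=1.0))
```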
Our study reveals a significant public preference for health news authored by human agents over AI agents, underscoring the critical role of the news agent in shaping information acceptance. Importantly, perceived trust emerges as a central mediating mechanism explaining this disparity in public reception. Furthermore, our findings highlight the moderating role of negative expectation violation in the relationship between news agent (human vs. AI) and perceived trust: negative expectation violation further exacerbates the negative impact of the AI news agent on perceived trust. These results emphasize the importance of strategically managing public expectations in the deployment of AI-driven health news.
At the theoretical level, this study expands the range of genres examined in AI news research by focusing on the public's trust in and willingness to accept AI-generated health news. By revealing the moderating role of negative expectation violation in the relationship between news agent type and perceived trust, it also offers an explanation for the inconsistent findings of previous studies on trust in AI news. At the practical level, the study documents the negative expectation violations that arise when the public encounters AI health news and confirms that the resulting negative feedback reduces the public's willingness to accept health information.
Enhancing Students' Metacognitive Abilities through Heuristic Questioning with Large Language Models
Wu Wen, Ren Feifei
2025, 48(4): 985-996.  DOI: 10.16719/j.cnki.1671-6981.20250418
Abstract ( 77 )   PDF (1754KB) ( 100 )   peer review(217KB)  
Metacognitive ability refers to an individual’s awareness of, reflection on, and regulation of their own cognitive processes. This ability facilitates learners’ autonomous monitoring of their learning. Heuristic questioning, as a key approach to activating and cultivating metacognition, encourages students to engage in active thinking, identify cognitive blind spots, and adjust learning strategies, thereby fostering a positive learning cycle. However, in traditional classrooms, it is challenging for teachers to provide personalized questioning support to every student. With the recent rapid advancement of large language models (LLMs), new opportunities have emerged for personalized questioning. Nevertheless, existing LLMs predominantly function as “answering machines” rather than “questioning mentors.” While they excel at answering questions, they often struggle to generate deep and thought-provoking questions, which limits the potential of intelligent heuristic questioning to promote metacognitive development.
This study proposes a heuristic questioning mechanism based on error-type analysis, aiming to shift LLMs from functioning as encyclopedias to acting as experienced questioning tutors. A cross-disciplinary question bank was developed to categorize common errors and their corresponding heuristic questions, and Retrieval-Augmented Generation (RAG) was used to enable flexible dialogue guided by preset prompts that reference this error-based knowledge base. Three questioning strategies were designed: fully open-ended, template-constrained, and semi-open (combining error guidance with generative flexibility). They were compared with a baseline model without the question bank.
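A minimal sketch of the semi-open strategy is shown below. The error bank, retrieval step, and prompt wording are hypothetical; the actual system presumably uses embedding-based retrieval over a much larger question bank and an LLM to generate the final question.

```python
# Sketch: retrieve a heuristic-question template for the diagnosed error type and
# wrap it in a prompt that asks the model to question rather than answer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

ERROR_BANK = {
    "misapplied formula": "What conditions must hold before this formula applies? Do they hold here?",
    "sign error": "Walk through each step: where could the sign have flipped, and why?",
    "overgeneralization": "Can you think of a case where your rule would fail?",
}

def retrieve_template(student_answer: str) -> str:
    """Pick the template whose error description best matches the student's answer."""
    keys = list(ERROR_BANK)
    vec = TfidfVectorizer().fit(keys + [student_answer])
    sims = cosine_similarity(vec.transform([student_answer]), vec.transform(keys))[0]
    return ERROR_BANK[keys[sims.argmax()]]

def build_prompt(student_answer: str) -> str:
    """Semi-open prompt: the template constrains content, the model keeps generative flexibility."""
    template = retrieve_template(student_answer)
    return (
        "You are a questioning tutor, not an answering machine.\n"
        f"Student answer: {student_answer}\n"
        f"Guiding template: {template}\n"
        "Ask one thought-provoking question in your own words; do not reveal the solution."
    )

print(build_prompt("I used the quadratic formula but the discriminant was negative."))
```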
To evaluate the effectiveness of these strategies, a dual evaluation framework combining human judgment and automated scoring by LLMs was established. For the subjective evaluation, volunteers rated teacher-student dialogues generated by different strategies across multiple dimensions using questionnaires. The automated evaluation used a dialogue-adapted scoring rubric constructed from established metacognitive assessment frameworks, with quantitative analysis of students’ cognitive regulation indicators performed by the large model. By comparing the distribution and trends of human and model scores, the study analyzed the guidance efficacy and task adaptability of each strategy.
The results indicated that: (1) the error-based question bank significantly enhances students’ thinking and metacognitive development; (2) among the tested strategies, the semi-open approach achieves the best overall performance by balancing content specificity, generative flexibility, and learner adaptability; and (3) multidimensional evaluation confirms the effectiveness of the proposed intelligent heuristic questioning mechanism in fostering metacognitive growth.
Empowering the Construction and Automated Measurement of Psychological Trait Dimensions with Artificial Intelligence: A Case Study of National Stereotypes
Yilin Wang, Nan Zhao, Tingshao Zhu
2025, 48(4): 997-1008.  DOI: 10.16719/j.cnki.1671-6981.20250419
Abstract ( 119 )   PDF (1751KB) ( 154 )   peer review(295KB)  
National stereotypes play a significant role in shaping intergroup attitudes, behaviors, and international relations. Accurately measuring these stereotypes is essential to understand social cognition at the individual and societal levels. However, traditional methods of assessing such stereotypes typically rely on predefined dimensions and structured questionnaires, which often limit the scope of concept identification and introduce measurement biases. To overcome these limitations, this study introduces an artificial intelligence-empowered paradigm for psychological assessment that applies large language models (LLMs) to integrate dimensional construction and automated measurement without the need for conventional scale development. This automated evaluation approach is referred to as the LLM-rating model, which enables direct, scalable, and objective evaluation of psychological indicators from open-ended textual data.
In Study 1, we utilized LLMs to extract national stereotype content from free-description responses provided by participants of different nationalities. Specifically, we recruited 191 Chinese participants (107 female; mean age = 31.28 years) and 176 American participants (85 female; mean age = 47.08 years) to describe their impressions of different foreign nations. The free-description responses were processed using text mining methods, including network analysis and topic modeling, and further analyzed with LLMs to identify the cross-cultural core dimensions of national stereotypes. This approach revealed five dimensions: cultural richness, development and progress, dominance and threat, social equality, and authoritarianism and dictatorship. These dimensions extend beyond conventional stereotype content models and offer a more comprehensive understanding of national images. By incorporating LLMs into both the extraction and categorization processes, the study reduces human subjectivity in manual coding and provides a data-driven approach to identifying stereotype structure.
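The topic-modeling step can be illustrated with a toy corpus; the actual study used much larger free-description data and combined topic modeling with network analysis and LLM-based categorization, so everything below is a simplified stand-in.

```python
# Toy illustration of surfacing candidate stereotype dimensions from free descriptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

descriptions = [
    "rich history and traditional culture",
    "advanced technology and strong economy",
    "military power seen as a threat",
    "large gap between rich and poor",
    "strict government control over speech",
]
counts = CountVectorizer(stop_words="english").fit(descriptions)
X = counts.transform(descriptions)
lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(X)

terms = counts.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-3:][::-1]]
    print(f"topic {i}: {top}")   # dimension labels are then assigned by inspection or an LLM
```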
In Study 2, we re-invited 59 of the valid American participants from Study 1 (29 female; mean age = 47.29 years) to validate the automated measurement model. Using multiple advanced LLMs, including GPT-4o, DeepSeek-R1, Llama 3.3, and Qwen-max, we developed LLM-rating models to assess the national stereotype dimensions. Each model generated stereotype ratings independently, which were then evaluated for human-model rating consistency against human expert evaluations and for temporal stability of rating results across different time points. The results demonstrated high consistency between the LLM-generated ratings and the human expert evaluations across all dimensions. The models also showed strong temporal stability, producing similar ratings for free-description texts about the same country written by the same participant at different time points. These findings suggest that LLMs could be used for large-scale, automated psychological measurement, saving human and material resources while expanding the methodological possibilities of social cognition research.
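Both validation checks, human-model consistency and temporal stability, reduce to comparing rating vectors. The arrays below are hypothetical placeholders, and the study may have used additional agreement indices beyond a simple correlation:

```python
# Sketch of the two checks: human-model rating consistency and temporal stability.
import numpy as np
from scipy.stats import pearsonr

human_ratings = np.array([4.0, 3.5, 2.0, 5.0, 3.0, 4.5])      # expert ratings on one dimension
llm_ratings_t1 = np.array([4.2, 3.4, 2.3, 4.8, 3.1, 4.4])     # model ratings, first run
llm_ratings_t2 = np.array([4.1, 3.6, 2.1, 4.9, 3.0, 4.5])     # model ratings of the same texts later

consistency, _ = pearsonr(human_ratings, llm_ratings_t1)       # human-model agreement
stability, _ = pearsonr(llm_ratings_t1, llm_ratings_t2)        # test-retest style stability
print(f"consistency r = {consistency:.2f}, stability r = {stability:.2f}")
```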
The highlight of this study lies in its establishment of a new computational framework for constructing and measuring psychological dimensions, empowered by artificial intelligence. Traditional assessment approaches typically require constructing psychological scales based on theoretical assumptions, involving substantial effort in defining concepts, generating items, and conducting validation studies. In contrast, our LLM-rating paradigm bypasses scale development by directly leveraging the natural language processing capabilities of LLMs. These models extract meaningful psychological concepts from free-text responses, construct core dimensions from the extracted concepts, and then score new texts automatically. This approach not only enhances efficiency but also ensures adaptability, as it allows national stereotype assessment to evolve dynamically with societal change on the basis of a large corpus rather than being constrained by static survey items.
In conclusion, this study introduces a computational paradigm for psychological assessment by integrating artificial intelligence and social psychological research. By leveraging LLMs throughout the entire process from dimension construction to automated measurement, our study underscores the potential of LLMs for social science research, which provides more scalable and objective approaches to measuring stereotypes and other psychological indicators. This work offers a new perspective on social cognition research and provides practical implications for interpersonal communication at the individual level and collaboration at the national level.
The Influence of Risk Perception-Driven Online Prosocial Behavior on the Development Patterns of Unconventional Emergencies
Bai Qiyu, Huang Keyi, Han Sijia, Chen Shangyi, Liu Kuo, Zhang Yue, Li Shao, Luo Siyang
2025, 48(4): 1009-1023.  DOI: 10.16719/j.cnki.1671-6981.20250420
Abstract ( 107 )   PDF (2367KB) ( 110 )   peer review(2526KB)  
As the internet becomes deeply woven into everyday practices, online space has become an important arena for individuals' prosocial behavior. Online prosocial behavior, defined as voluntary acts in digital spaces aimed at benefiting others, plays a significant role in this context. During unconventional emergencies, the internet becomes a virtual living environment that the public relies on, and online prosocial behavior becomes an important channel for integrating resources and restoring order in the real world. This study explores the mechanisms that drive individuals to engage in online prosocial behaviors when unconventional emergencies pose risks in physical space, and examines how risk-perception-driven online prosocial behaviors in turn influence the trajectory of those emergencies in the real world.
Study 1 utilized a questionnaire-based approach, gathering a substantial sample of 917 participants for analysis using SPSS software. Grounded in the Terror Management Theory (TMT), the study constructed a moderated mediation model to examine the effect of risk perception on online prosocial behavior. Specifically, it considered the mediating role of ontological security—individuals’ sense of safety and stability in their environment—and the moderating role of health self-efficacy, which reflects individuals' belief in their ability to manage their health and well-being during crises. The results indicated that individuals experiencing higher levels of risk perception were more inclined to engage in online prosocial actions, with this inclination mediated by a reduced sense of ontological security. Furthermore, health self-efficacy played a significant moderating role. That is, individuals with higher self-efficacy demonstrated a stronger tendency to translate their security-seeking behaviors into prosocial actions. These findings highlight the complex interplay between psychological factors and the motivation to engage in prosocial behavior in digital spaces.
Study 2 employed Agent-Based Modeling (ABM) to simulate the practical effects of online prosocial behavior during a public health crisis, focusing on online donations as a key manifestation of such behavior. Based on a two-space coupling model, the simulations showed that online donation behavior reduces the number of people infected at the peak of the epidemic and shortens its duration by improving the allocation of medical resources. Risk awareness in crisis situations stimulates the public's online helping behavior, and the resulting spontaneous increase in donations lowers the infection peak, shortens the time needed to reach it, and reduces the overall length of the epidemic. People donate out of a desire to improve the well-being of others, and online donation behavior enables the public to respond more effectively to the epidemic threat by directing more resources to those in need.
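The coupling idea can be conveyed with a toy compartmental simulation rather than the study's full agent-based model: perceived risk tracks prevalence, risk drives donations, and donated resources raise the recovery rate, which lowers and shortens the peak. All parameters below are illustrative assumptions.

```python
# Toy illustration (not the study's ABM) of risk-driven donations coupled to an epidemic.
def simulate(donation_gain: float, days: int = 200, n: float = 1e6):
    s, i, r = n - 10, 10.0, 0.0
    beta, base_recovery = 0.3, 0.08
    peak, peak_day = 0.0, 0
    for day in range(days):
        risk = i / n                                   # risk perception tracks prevalence
        resources = donation_gain * risk               # donations allocated to treatment
        recovery = min(base_recovery + resources, 1.0) # better-resourced care recovers faster
        new_inf = beta * s * i / n
        new_rec = recovery * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        if i > peak:
            peak, peak_day = i, day
    return round(peak), peak_day

print("no donations:", simulate(donation_gain=0.0))
print("with donations:", simulate(donation_gain=0.5))
```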
This research contributes to our understanding of the psychological and practical drivers of online prosocial behavior during emergencies. It broadens the application of Terror Management Theory by extending its relevance to online contexts and underscores the role of health self-efficacy as a critical factor in shaping prosocial responses. Additionally, the study suggests that promoting online prosocial behavior serves a dual purpose: it helps individuals regain psychological stability during crises and contributes to a more effective societal response.
In conclusion, this study highlights the dual importance of integrating digital and physical spaces in the context of crisis management. By providing insights for policymakers, it underscores the potential to leverage online engagement to enhance crisis response and community resilience. Encouraging online prosocial behaviors can create a supportive environment that empowers individuals and communities, enabling them to navigate the challenges posed by unconventional emergencies more effectively. As societies continue to rely on digital platforms, understanding and promoting these behaviors will be essential to foster collective well-being in times of crisis.