Journal of Psychological Science ›› 2023, Vol. 46 ›› Issue (5): 1057-1066. DOI: 10.16719/j.cnki.1671-6981.20230504

• General Psychology, Experimental Psychology & Ergonomics •

Encoding and Decoding Credibility in Human Vocal Cues

Hu Yanbing, Jiang Xiaoming   

  1. Institute of Linguistics, Shanghai International Studies University, Shanghai, 201620
  • Online: 2023-09-20 Published: 2023-11-07

Conveying Belief with "Belief" and Doubt with "Doubt"? Encoding and Decoding Credibility Based on Vocal Cues *

Hu Yanbing, Jiang Xiaoming**

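  1. Institute of Linguistics, Shanghai International Studies University, Shanghai, 201620
  • Corresponding author: **Jiang Xiaoming, E-mail: xiaoming.jiang@shisu.edu.cn
  • Funding:
    *This research was supported by the General Program of the National Natural Science Foundation of China (31971037), the Natural Science Foundation of Shanghai under the Shanghai "Science and Technology Innovation Action Plan" (22ZR1460200), the "Shuguang Program" of the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission (20SG31), and the Supervisor Academic Guidance Program of Shanghai International Studies University (2022113001).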

Abstract: Recognizing vocal cues of credibility plays a significant role in social interactions. Vocal credibility refers to the degree to which listeners judge the truthfulness of a message from the speaker's voice, and it affects the listener's social impressions of, and subsequent behaviours towards, the speaker. Supported by behavioral and neurophysiological evidence, the cognitive processing model of vocal expression proposes that listeners can decode speakers' credibility from various kinds of vocal information, such as speech prosody, lexical-semantic information, and accents.
In terms of speech prosody, credible voices are associated with higher Fundamental Frequency (F0) and amplitude, whereas untrustworthy ones are characterized by slower speech rates and more frequent pauses. Evidence from event-related potentials (ERPs) has shown that listeners can differentiate between credible and untrustworthy prosody as early as 200 ms and continue to compute these vocal cues in a dynamic fashion. Furthermore, fMRI studies have revealed that increased trustworthiness of vocal expressions is associated with increased activity in the left Superior Frontal Gyrus (SFG) and the left Inferior Frontal Gyrus (IFG), whereas increased untrustworthiness engenders activation in the right Superior Temporal Gyrus (STG). Moreover, functional connectivity studies have shown that strengthened connections between the left Postcentral Gyrus and the Supplementary Motor Area (SMA) are associated with listeners' decoding of vocal cues of speaker untrustworthiness.
Vocal cues and semantic information interact to form the impression of a speaker's credibility. ERP evidence showed that vocal expressions whose vocal cues were incongruent with the semantic content elicited a larger Late Positive Potential (LPP). Source localization revealed that the Middle Frontal Gyrus (MFG) and the STG contributed to this effect. Prosody-semantic congruency thus modulates the neurocognitive mechanisms underlying the decoding of speaker credibility.
Accent cues that indicate in- and out-group voices are relevant for decoding credibility. Listeners perceived in- (vs. out-) group voices as more credible. ERP results showed that for the out-group accent, the doubtful (vs. confident) voice elicited a smaller P200 response. However, for the in-group accent, the doubtful (vs. confident) voice elicited larger early responses. Moreover, the basal ganglia, left cuneus, and right fusiform gyrus were activated when listeners judged the credibility of out-group (vs. in-group) voices. More importantly, the superior parietal and middle temporal brain regions were activated when listeners perceived in-group (vs. out-group) credibility. These results suggest that there could be two pathways for decoding vocal credibility. Listeners show greater sensitivity towards in-group voices at the perceptual level, and they tend to follow a ‘direct path’ for making social inferences based on the human voice. For the out-group speaker, the social category information activated by accent information delays listeners' social reasoning about the out-group features. Listeners follow a longer or more laborious ‘indirect path’ to compute the credibility of the out-group speaker with an in-depth analysis of the vocal expression. The processing of vocal credibility involves not only the analysis of lower-level acoustic features but also higher-level social categorization.
Future research should further explore three topics related to the encoding and decoding mechanisms of speakers' credibility: (1) the developmental mechanisms underlying the decoding of credibility from the human voice; (2) the role of multimodal nonverbal cues in encoding and decoding speaker credibility in real-life social interactions; and (3) the neurocognitive deficits in decoding vocal credibility in clinical populations.

Key words: vocal expression processing model, speech communication, vocal cue, paralinguistic cue, vocal trustworthiness

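Abstract: In speech communication, expressing and judging credibility through the voice is the basis on which interpersonal trust is formed. Vocal credibility processing refers to listeners judging how credible a speaker's verbal message is on the basis of the speaker's voice quality and manner of vocalization, a process that affects the establishment of trust between listener and speaker. The cognitive model of voice processing proposes that paralinguistic cues, semantic cues, and social group cues in the voice influence the speaker's encoding of credibility and the listener's decoding of it. Speakers produce different levels of credibility by intentionally encoding prosodic paralinguistic cues, and listeners can distinguish between these levels at 200 ms. When the credibility levels of the semantic and paralinguistic cues encoded by the speaker are inconsistent, listeners need more time to resolve the conflict between cues and infer credibility. A speaker's ethnic identity unintentionally encodes accent cues, and these cues modulate two distinct neural pathways that listeners use when integrating social group cues to decode credibility: a temporo-parietal network that infers credibility from vocal cues, and a fronto-parietal network that extracts from the voice the rules encoding social groups. Future research should examine the encoding and decoding mechanisms of vocal credibility from the perspectives of the onset and development of voice processing, the coordination of multimodal speech information, and functional impairment in special populations.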

Key words: vocal expression processing model, speech communication, vocal cue, paralinguistic cue, vocal credibility