Encoding and Decoding Credibility in Human Vocal Cues

Hu Yanbing, Jiang Xiaoming

Journal of Psychological Science ›› 2023, Vol. 46 ›› Issue (5) : 1057-1066.

DOI: 10.16719/j.cnki.1671-6981.20230504
General Psychology, Experimental Psychology & Ergonomics


Abstract

Recognizing vocal cues of credibility plays a significant role in social interaction. Vocal credibility refers to the degree to which listeners judge the truthfulness of a message from the speaker's voice; it shapes listeners' social impressions of, and subsequent behavior toward, the speaker. Supported by behavioral and neurophysiological evidence, the cognitive processing model of vocal expression proposes that listeners decode a speaker's credibility from several kinds of vocal information, including speech prosody, lexical-semantic information, and accent.
In terms of speech prosody, credible voices are associated with higher fundamental frequency (F0) and amplitude, whereas untrustworthy voices are characterized by slower speech rates and more frequent pauses. Evidence from event-related potentials (ERPs) shows that listeners can tell credible and untrustworthy prosody apart as early as 200 ms and continue to process these vocal cues dynamically. Furthermore, fMRI studies have revealed that greater trustworthiness of vocal expressions is associated with increased activity in the left superior frontal gyrus (SFG) and the left inferior frontal gyrus (IFG), whereas greater untrustworthiness is associated with activation in the right superior temporal gyrus (STG). Moreover, functional connectivity studies have shown that strengthened connections between the left postcentral gyrus and the supplementary motor area (SMA) are associated with listeners' decoding of vocal cues of speaker untrustworthiness.
Vocal cues and semantic information interact to form the impression of a speaker's credibility. ERP evidence shows that vocal expressions whose vocal cues are incongruent with the semantic content elicit a larger late positive potential (LPP). Source localization indicates that the middle frontal gyrus (MFG) and the STG contribute to this effect. Prosody-semantic congruency thus modulates the neurocognitive mechanisms underlying the decoding of speaker credibility.
Accent cues that mark a voice as in-group or out-group are also relevant for decoding credibility. Listeners perceive in-group (vs. out-group) voices as more credible. ERP results showed that for the out-group accent, the doubtful (vs. confident) voice elicited a smaller P200 response, whereas for the in-group accent, the doubtful (vs. confident) voice elicited larger early responses. Moreover, the basal ganglia, left cuneus, and right fusiform gyrus were activated when listeners judged the credibility of out-group (vs. in-group) speakers. More importantly, superior parietal and middle temporal regions were activated when listeners perceived in-group (vs. out-group) credibility. These results suggest that there may be two pathways for decoding vocal credibility. Listeners show greater sensitivity to in-group voices at the perceptual level and tend to follow a 'direct path' when making social inferences from the voice. For out-group speakers, the social category information activated by the accent delays listeners' social reasoning about out-group features, and listeners follow a longer, more laborious 'indirect path' that computes the speaker's credibility through an in-depth analysis of the vocal expression. The processing of vocal credibility therefore involves not only the analysis of lower-level acoustic features but also higher-level social categorization.
Future research should further explore three topics related to the mechanisms for encoding and decoding speaker credibility: (1) the developmental mechanisms underlying the decoding of credibility from the human voice; (2) the role of multimodal nonverbal cues in encoding and decoding speaker credibility in real-life social interactions; and (3) the neurocognitive deficits in decoding vocal credibility in clinical populations.

Key words

vocal expression processing model / speech communication / vocal cue / paralinguistic cue / vocal trustworthiness

Cite this article

Hu Yanbing, Jiang Xiaoming. Encoding and Decoding Credibility in Human Vocal Cues[J]. Journal of Psychological Science. 2023, 46(5): 1057-1066. https://doi.org/10.16719/j.cnki.1671-6981.20230504
