The Role of Speaker’s Identity Information in Spoken Word Processing
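Using the long-term repetition priming paradigm, this study manipulated the consistency of the speaker’s identity between the learning and test materials to examine how speaker’s identity information affects shallower lexical access (Experiment 1) and deeper conceptual comprehension (Experiment 2). The lexical decision task in Experiment 1 showed that, for learned words, accuracy was significantly higher in the identity-consistent condition than in the inconsistent condition, indicating that speaker’s identity information and linguistic information affect lexical access in an integrated manner. The category decision task in Experiment 2 showed that, for unlearned words, accuracy was significantly higher in the identity-consistent condition than in the inconsistent condition, indicating that speaker’s identity information affects conceptual comprehension independently. Based on these results and previous theories, the study tentatively proposes a new view of spoken word processing that includes the processing of identity information, which helps explain the cognitive processing of spoken words in a more social and ecological way.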
Considering the speaker’s identity information provides a more social and ecological explanation of the cognitive processing of spoken words. However, whether and how speaker’s identity information affects spoken word processing remains controversial. The abstractionist view (including the early and developmental abstractionist views) and the episodic view hold different opinions on this issue. Moreover, previous studies have employed different experimental tasks, which have provided different evidence for these views. Based on our analysis of these studies, we propose that each existing view may be suited to explaining a different process in spoken word processing. It is therefore necessary to examine the role of speaker’s identity information in spoken word processing at different processing depths. Against this background, the present study focused on whether and how speaker’s identity information affects lexical access and conceptual comprehension in spoken word processing. Addressing these issues can help us better understand spoken word processing.
The present study conducted two behavioral experiments and adopted the classic long-term repetition priming paradigm to minimize possible interference from explicit experimental tasks. Specifically, Experiment 1 adopted a lexical decision task to examine whether and how speaker’s identity information affected lexical access in spoken word processing. Eighty-eight participants were recruited and randomly divided into two groups (speakers’ identities consistent vs. inconsistent). The experiment contained a learning phase and a test phase. In the consistent group, participants heard stimuli spoken by a male in both the learning and test phases; in the inconsistent group, participants heard stimuli spoken by a male in the learning phase and by a female in the test phase. The experimental materials consisted of 36 real words (e.g., “/yi1fu2/”, which means clothes in English) and 36 pseudowords (i.e., pronounceable but meaningless nonwords, e.g., “/ju4hong2/”). Participants judged whether each auditory word was a real word or a pseudoword. Experiment 2 adopted a category decision task to examine whether and how speaker’s identity information affected conceptual comprehension in spoken word processing. The participants and design were the same as in Experiment 1, with 36 biological words (e.g., “/xiao3cao3/”, which means grass in English) and 36 non-biological words (e.g., “/qian1bi3/”, which means pencil in English) as experimental materials. Participants judged whether each auditory word was biological or non-biological.
In Experiment 1, performance on learned words was better than on unlearned words, indicating a stable repetition effect. More importantly, in the overall analysis (including real words and pseudowords), accuracy for learned words was significantly higher in the consistent condition than in the inconsistent condition; for unlearned words, there was no significant difference between the consistent and inconsistent conditions. Further analysis revealed that the results for pseudowords matched those of the overall analysis, whereas for real words there were no significant differences in either accuracy or reaction time between the consistent and inconsistent conditions, for learned or unlearned words. In Experiment 2, response times for learned words were significantly shorter than those for unlearned words, indicating a repetition effect. However, in contrast to Experiment 1, accuracy was significantly higher in the consistent condition than in the inconsistent condition for unlearned words, while there was no such difference for learned words.
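The comparisons described above rest on accuracy in each cell of the design (group by learned vs. unlearned). The following is a minimal sketch of how such cell accuracies could be tabulated, not the authors’ analysis code. The speaker assignment follows the design stated above; the record keys (`group`, `learned`, `correct`), the function name, and the toy records are hypothetical and are not data from the study.

```python
from collections import defaultdict

# Speaker per phase for each group, as stated in the design description.
SPEAKERS = {
    "consistent": {"learning": "male", "test": "male"},
    "inconsistent": {"learning": "male", "test": "female"},
}


def accuracy_by_cell(trials):
    """Return mean accuracy for each (group, learned) cell.

    Each trial is a dict with the hypothetical keys
    'group', 'learned', and 'correct'.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for trial in trials:
        cell = (trial["group"], trial["learned"])
        counts[cell] += 1
        hits[cell] += int(trial["correct"])
    return {cell: hits[cell] / counts[cell] for cell in counts}


# Toy records for illustration only; these are NOT data from the study.
toy_trials = [
    {"group": "consistent", "learned": True, "correct": True},
    {"group": "consistent", "learned": True, "correct": True},
    {"group": "inconsistent", "learned": True, "correct": True},
    {"group": "inconsistent", "learned": True, "correct": False},
]
cells = accuracy_by_cell(toy_trials)
```

Whether a difference between cells is significant would still need an inferential test, which this sketch leaves out.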
Speaker’s identity information influences spoken word processing differently depending on the process involved. Specifically, the facilitation from speaker’s identity consistency for learned words in the lexical decision task suggests that the representation of the speaker’s identity is integrated with linguistic information and affects lexical access integrally, supporting the episodic view. In contrast, the facilitation from speaker’s identity consistency for unlearned words in the category decision task suggests that the speaker’s identity and linguistic information are represented separately and affect conceptual comprehension independently, supporting the developmental abstractionist view. Integrating the developmental abstractionist and episodic views helps us better understand spoken word processing.
spoken word processing / identity information / linguistic information / lexical access / conceptual comprehension