Psycholinguistics: CH3. Perception (Lecture Slides)

Chapter 3: Speech Perception

Overview of Questions
- Can computers perceive speech as well as humans?
- Why does an unfamiliar foreign language often sound like a continuous stream of sound, with no breaks between words?
- Does each word that we hear have a unique pattern of air pressure changes associated with it?
- Are there specific areas in the brain that are responsible for perceiving speech?

Speech perception refers to the processes by which humans interpret and understand the sounds used in language. The study of speech perception is closely linked to phonetics and phonology in linguistics, and to cognitive psychology and perception in psychology.

Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language. Speech research has applications in building computer systems that can recognize speech, as well as in improving speech recognition for hearing- and language-impaired listeners.

Speech Perception
The first step in comprehending spoken language is to identify the words being spoken. This is performed in multiple stages (a toy sketch of the pipeline follows the list):
1. Phonemes are detected (/b/, /e/, /t/, /e/, /r/, ...)
2. Phonemes are combined into syllables (/be/ /ter/)
3. Syllables are combined into words ("better")
4. Word meaning is retrieved from memory
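To make the staged architecture concrete, here is a minimal Python sketch of the four stages. The phoneme list, the one-vowel syllabification rule, and the one-entry lexicon are hypothetical toy stand-ins, not a model from the slides.

```python
# Toy sketch of the four stages: phonemes -> syllables -> word -> meaning.
# The syllabification rule and the lexicon below are hypothetical.

LEXICON = {"beter": "better: comparative of 'good'"}  # keyed by phonemic form

def combine_into_syllables(phonemes):
    # Stage 2 (toy rule): split the phoneme string after its first vowel.
    split = next(i for i, p in enumerate(phonemes) if p in "aeiou") + 1
    return ["".join(phonemes[:split]), "".join(phonemes[split:])]

def retrieve_meaning(word_form):
    # Stage 4: look the assembled form up in the mental lexicon.
    return LEXICON.get(word_form, "<unknown word>")

phonemes = ["b", "e", "t", "e", "r"]          # stage 1: detected phonemes
syllables = combine_into_syllables(phonemes)  # stage 2: ['be', 'ter']
word_form = "".join(syllables)                # stage 3: phonemic form 'beter'
print(syllables, "->", retrieve_meaning(word_form))  # stage 4
```

Real systems perform each stage probabilistically and in parallel, which is exactly what the segmentation and variability problems below make necessary.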

Spectrogram: "I owe you a yo-yo"

Speech perception: two problems
- Words are not neatly segmented (e.g., by pauses)
- Lack of phoneme invariance
  - Coarticulation: consecutive speech sounds blend into each other due to mechanical constraints on the articulators
  - Speaker differences: pitch is affected by age and sex; dialects, talking speeds, etc. also vary

The speech input consists of:
- Frequency range of 50-5600 Hz
- Critical band filters
- Dynamic range of 50 dB
- Temporal resolution of 10 ms
- Smallest detectable change in F0: 2 Hz
- Smallest detectable change in F1: 40 Hz
- Smallest detectable change in F2: 100 Hz
- Smallest detectable change in F3: 150 Hz

The Speech Stimulus
- Phoneme: the smallest unit of speech that changes the meaning of a word
- In English there are 47 phonemes: 23 major vowel sounds and 24 major consonant sounds
- The number of phonemes in other languages varies: 11 in Hawaiian, and 60 in some African dialects

Table 13.1 Major consonants and vowels of English and their phonetic symbols

The Acoustic Signal
- Produced by air that is pushed up from the lungs through the vocal cords and into the vocal tract
- Vowels are produced by vibration of the vocal cords and changes in the shape of the vocal tract
- These changes in shape cause changes in the resonant frequency and produce peaks in pressure at a number of frequencies called formants

Figure 13.1 The vocal tract includes the nasal and oral cavities and the pharynx, as well as components that move, such as the tongue, lips, and vocal cords.

The Acoustic Signal - continued
- The first formant has the lowest frequency, the second the next highest, and so on
- Sound spectrograms show the changes in frequency and intensity for speech
- Consonants are produced by a constriction of the vocal tract
- Formant transitions: rapid changes in frequency preceding or following consonants

Figure 13.3 Spectrogram of the word "had" showing the first (F1), second (F2), and third (F3) formants for the vowel /ae/. (Spectrogram courtesy of Kerry Green.)

Figure 13.4 Spectrogram of the sentence "Roy read the will," showing formants such as F1, F2, and F3, and formant transitions such as T2 and T3. (Spectrogram courtesy of Kerry Green.)
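To show how spectrograms like these are produced, here is a minimal sketch using NumPy, SciPy, and Matplotlib. The synthetic stimulus, a 120 Hz glottal pulse train passed through band-pass filters near the /ae/ formant frequencies, is an illustrative assumption, not the actual recording behind Figure 13.3.

```python
# Synthesize a vowel-like sound and display its spectrogram.
# Formant center frequencies roughly approximate /ae/; all values
# here are illustrative.
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

fs = 16000                        # sampling rate (Hz)
t = np.arange(0, 0.5, 1 / fs)     # 500 ms of signal
f0 = 120                          # fundamental frequency of voicing (Hz)

# Glottal source: an impulse every 1/f0 seconds.
source = np.zeros_like(t)
source[::fs // f0] = 1.0

# Shape the source with one band-pass resonance per formant.
vowel = np.zeros_like(t)
for fc in (660, 1720, 2410):      # rough F1, F2, F3 of /ae/
    b, a = signal.butter(2, [fc - 100, fc + 100], btype="band", fs=fs)
    vowel += signal.lfilter(b, a, source)

# Short-time Fourier analysis: energy as a function of time and frequency.
f, tt, Sxx = signal.spectrogram(vowel, fs, nperseg=512, noverlap=384)
plt.pcolormesh(tt, f, 10 * np.log10(Sxx + 1e-12), shading="auto")
plt.ylim(0, 4000)
plt.xlabel("Time (s)"); plt.ylabel("Frequency (Hz)")
plt.title("Synthetic /ae/-like vowel: horizontal energy bands are formants")
plt.show()
```

The horizontal energy bands in the plot sit at the filter center frequencies, which is what the F1-F3 labels mark in Figures 13.3 and 13.4.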

The Relationship between the Speech Stimulus and Speech Perception
- The segmentation problem: there are no physical breaks in the continuous acoustic signal. How do we segment the individual words?
- The variability problem: there is no simple correspondence between the acoustic signal and individual phonemes
  - Variability from a phoneme's context: coarticulation, the overlap between the articulation of neighboring phonemes

Figure 13.5 Spectrogram of "I owe you a yo-yo." This spectrogram does not contain pauses or breaks that correspond to the words that we hear. The absence of breaks in the acoustic signal creates the segmentation problem. (Spectrogram courtesy of David Pisoni.)

Figure 13.6 Hand-drawn spectrograms for /di/ and /du/. (From "Perception of the Speech Code," by A. M. Liberman, 1967, Psychological Review, 74, 431-461, figure 1. Copyright 1967 by the American Psychological Association. Reprinted by permission of the author.)

The Relationship between the Speech Stimulus and Speech Perception - continued
- Variability from different speakers: speakers differ in pitch, accent, speaking rate, and pronunciation
- This acoustic signal must be transformed into familiar words
- People perceive speech easily in spite of the segmentation and variability problems

Figure 13.7 (a) Spectrogram of "What are you doing?" pronounced slowly and distinctly. (b) Spectrogram of "What are you doing?" as pronounced in conversational speech. (Spectrogram courtesy of David Pisoni.)

Stimulus Dimensions of Speech Perception
- Invariant acoustic cues: features of phonemes that remain constant across contexts
- Short-term spectrograms are used to investigate invariant acoustic cues
- A sequence of short-term spectra can be combined to create a running spectral display
- From these displays, some invariant cues have been discovered

Figure 13.8 Left: a short-term spectrum of the acoustic energy in the first 26 ms of the phoneme /ga/. Right: sound spectrogram of the same phoneme, with the sound for the first 26 ms indicated in red. The peak in the short-term spectrum, marked a, corresponds to the dark band of energy marked a in the spectrogram; the minimum marked b corresponds to the light area marked b. The spectrogram shows the energy for the entire 500 ms duration of the sound, whereas the short-term spectrum shows only the first 26 ms at the beginning of this signal. (Courtesy of James Sawusch.)

Figure 13.9 Running spectral displays for /pi/ and /da/. These displays are made up of a sequence of short-term spectra, like the one in Figure 13.8, each displaced 5 ms on the time axis, so that each step along this axis shows the frequencies present in the next 5 ms. The low-frequency peak (V) in the /da/ display is a cue for voicing. (From "Time-Varying Features of Initial Stop Consonants in Auditory Running Spectra: A First Report," by D. Kewley-Port and P. A. Luce, 1984, Perception and Psychophysics, 35, 353-360, figure 1. Copyright 1984 by Psychonomic Society Publications. Reprinted by permission.)

Categorical Perception
- Occurs when a wide range of acoustic cues results in the perception of a limited number of sound categories
- An example comes from experiments on voice onset time (VOT): the time delay between when a sound starts and when voicing begins
- Stimuli: /da/ (VOT of 17 ms) and /ta/ (VOT of 91 ms)

Categorical Perception - continued
- Computers were used to create stimuli with a range of VOTs from long to short
- Listeners do not hear the incremental changes; instead they hear a sudden change from /da/ to /ta/ at the phonetic boundary
- Thus we experience perceptual constancy for the phonemes within a given range of VOT (a simulation sketch follows)
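As a hedged illustration of the identification function, the sketch below simulates listeners whose /da/-/ta/ reports follow a steep logistic centered on a 35 ms boundary; the boundary and slope are illustrative assumptions, not Eimas and Corbit's fitted values.

```python
# Simulated categorical identification along a VOT continuum.
# The 35 ms boundary and the slope are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
BOUNDARY_MS = 35.0   # hypothetical /da/-/ta/ phonetic boundary
SLOPE = 1.5          # steepness of the identification function

def p_ta(vot_ms):
    # Probability of reporting /ta/ rises logistically with VOT.
    return 1.0 / (1.0 + np.exp(-SLOPE * (vot_ms - BOUNDARY_MS)))

for vot in range(15, 96, 10):
    reports = rng.random(200) < p_ta(vot)   # 200 simulated trials
    print(f"VOT {vot:2d} ms: {reports.mean():4.0%} /ta/ responses")
```

Although VOT increases in physically equal 10 ms steps, the simulated responses jump from nearly all /da/ to nearly all /ta/ within a single step of the boundary, the abrupt pattern shown in Figure 13.11.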

Figure 13.10 Spectrograms for /da/ and /ta/. The voice onset time, the time between the beginning of the sound and the onset of voicing, is indicated at the beginning of the spectrogram for each sound. (Spectrogram courtesy of Ron Cole.)

Figure 13.11 The results of a categorical perception experiment indicate that /da/ is perceived at VOTs to the left of the phonetic boundary, and /ta/ at VOTs to the right of it. (From "Selective Adaptation of Linguistic Feature Detectors," by P. Eimas and J. D. Corbit, 1973, Cognitive Psychology, 4, 99-109, figure 2. Copyright 1973 Academic Press, Inc. Reprinted by permission.)

Figure 13.12 In the discrimination part of a categorical perception experiment, two stimuli are presented, and the listener indicates whether they are the same or different. The typical result is that two stimuli with VOTs on the same side of the phonetic boundary (solid arrows) are judged to be the same, and two stimuli on different sides of the phonetic boundary (dashed arrows) are judged to be different.

Figure 13.13 Perceptual constancy occurs when all stimuli on one side of the phonetic boundary are perceived to be in the same category, even though their VOT is changed over a substantial range. This diagram symbolizes the constancy observed in Eimas and Corbit's (1973) experiment, in which /da/ was heard on one side of the boundary and /ta/ on the other.

Speech Perception is Multimodal
- Auditory-visual speech perception: the McGurk effect
  - The visual stimulus shows a speaker saying "ga-ga"
  - The auditory stimulus is a speaker saying "ba-ba"
  - An observer who watches and listens hears "da-da," intermediate between "ga" and "ba"
  - An observer with eyes closed hears "ba"

The McGurk Effect

Figure 13.14 The McGurk effect. The woman's lips are moving as if she is saying /ga-ga/, but the actual sound being presented is /ba-ba/. The listener, however, reports hearing /da-da/. If the listener closes his eyes, so that he no longer sees the woman's lips, he hears /ba-ba/. Thus, seeing the lips moving influences what the listener hears.

Cognitive Dimensions of Speech Perception
- Top-down processing, including the knowledge a listener has about a language, affects perception of the incoming speech stimulus
- Segmentation is affected by context and meaning: "I scream you scream we all scream for ice cream"

Figure 13.15 Speech perception is the result of top-down processing (based on knowledge and meaning) and bottom-up processing (based on the acoustic signal) working together.

Meaning and Phoneme Perception
- Experiment by Turvey and Van Gelder: short words (sin, bat, leg) and short nonwords (jum, baf, teg) were presented to listeners
- The task was to press a button as quickly as possible upon hearing a target phoneme
- On average, listeners were faster with words (580 ms) than with nonwords (631 ms)

Meaning and Phoneme Perception - continued
- Experiment by Warren: listeners heard a sentence in which a phoneme was covered by a cough
- The task was to state where in the sentence the cough occurred
- Listeners could not correctly identify the position, and they also did not notice that a phoneme was missing: the phonemic restoration effect (a lexical-completion sketch follows the examples)

Phonemic restoration (* = phoneme replaced by a cough or noise; _ = phoneme replaced by silence):

  Auditory presentation    Perception
  Legislature              legislature
  Legi_lature              legi lature
  Legi*lature              legislature

  It was found that the *eel was on the axle.    -> wheel
  It was found that the *eel was on the shoe.    -> heel
  It was found that the *eel was on the orange.  -> peel
  It was found that the *eel was on the table.   -> meal

Warren, R. M. (1970). Perceptual restorations of missing speech sounds. Science, 167, 392-393.
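A minimal sketch of the lexical side of restoration: given a heard form with one masked phoneme, list the words a lexicon could restore it to, and let a later context word choose among them. The rough phonemic transcriptions, the four-word lexicon, and the keyword-based "context" are hypothetical simplifications.

```python
# Toy phonemic-restoration sketch: '*' marks a phoneme masked by noise.
# Forms are rough phonemic transcriptions (e.g. "wil" for "wheel");
# the lexicon and the context-association table are hypothetical.
import re

LEXICON = {"wil": "wheel", "hil": "heel", "pil": "peel", "mil": "meal"}
CONTEXT_HINTS = {"axle": "wheel", "shoe": "heel",
                 "orange": "peel", "table": "meal"}

def candidates(masked):
    # Bottom-up step: every lexical form the masked input is consistent with.
    pattern = re.compile("^" + masked.replace("*", ".") + "$")
    return [word for form, word in LEXICON.items() if pattern.match(form)]

def restore(masked, context_word):
    # Top-down step: context selects among the acoustically possible words.
    cands = candidates(masked)
    hinted = CONTEXT_HINTS.get(context_word)
    return hinted if hinted in cands else (cands[0] if cands else None)

print(candidates("*il"))        # ['wheel', 'heel', 'peel', 'meal']
print(restore("*il", "axle"))   # wheel
print(restore("*il", "table"))  # meal
```

As in Warren's sentences, the same acoustic fragment is restored to different words depending on a context word that can arrive only later in the sentence.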

Meaning and Word Perception
- Experiment by Miller and Isard: stimuli were three types of sentences:
  - Normal grammatical sentences
  - Anomalous sentences that were grammatical
  - Ungrammatical strings of words
- Listeners were to shadow (repeat aloud) the sentences as they heard them through headphones

Meaning and Word Perception - continued
- Results showed that listeners were:
  - 89% accurate with normal sentences
  - 79% accurate with anomalous sentences
  - 56% accurate with ungrammatical word strings
- The differences were even larger if background noise was present

Speaker Characteristics
- Indexical characteristics: characteristics of the speaker's voice, such as age, gender, emotional state, level of seriousness, etc.
- Experiment by Palmeri, Goldinger, and Pisoni: listeners were to indicate when a word was new in a sequence of words
- Results showed that they were much faster if the same speaker was used for all the words

Speech Perception and the Brain
- Broca's aphasia: individuals have damage in Broca's area (in the frontal lobe); speech is labored and stilted, with short sentences, but they understand others
- Wernicke's aphasia: individuals have damage in Wernicke's area (in the temporal lobe); they speak fluently, but the content is disorganized and not meaningful, and they also have difficulty understanding others

Figure 13.16 Broca's and Wernicke's areas, which are specialized for language production and comprehension, are located in the left hemisphere of the brain in most people.

Speech Perception and the Brain - continued
- Measurements from cat auditory nerve fibers show that the pattern of firing mirrors the energy distribution in the auditory signal
- Brain scans of humans show that there are areas of the human "what" stream that are selectively activated by the human voice

Figure 13.17 (a) Short-term spectrum for /da/. This curve indicates the energy distribution in /da/ between 20 and 40 ms after the beginning of the signal. (b) Nerve firing of a population of cat auditory nerve fibers to the same stimulus. (From "Encoding of Speech Features in the Auditory Nerve," by M. B. Sachs, E. D. Young, and M. I. Miller, 1981. In R. Carlson and B. Granstrom (Eds.), The Representation of Speech in the Peripheral Auditory System, pp. 115-130. Copyright 1981 by Elsevier Science Publishing, New York. Reprinted by permission.)

Experience-Dependent Plasticity
- Before age 1, human infants can tell the difference between the sounds that make up all languages
- The brain becomes "tuned" to respond best to the speech sounds in the environment
- Differentiation of other sounds disappears when there is no reinforcement from the environment

Motor Theory of Speech Perception
- Liberman et al. proposed that the motor mechanisms responsible for producing sounds activate mechanisms for perceiving sound
- Evidence from monkeys comes from the existence of mirror neurons
- Experiment by Watkins et al.: participants had the motor cortex for face movements stimulated by transcranial magnetic stimulation (TMS)

Motor Theory of Speech Perception - continued
- Results showed small movements of the mouth, called motor evoked potentials (MEPs)
- This response became larger when the person listened to speech or watched someone else's lip movements
- In addition, the "where" stream may work with the "what" stream for speech perception

Figure 13.18 The transcranial magnetic stimulation experiment that provides evidence for a link between speech perception and production in humans. See text for details. (Reprinted from Neuropsychologia, 41, Watkins, K. E., Strafella, A. P., & Paus, T., Seeing and hearing speech excites the motor system involved in speech production, 989-994, (2003), with permission from Elsevier.)

Models of Spoken Word Identification
- The Cohort Model (Marslen-Wilson & Welsh, 1978; revised in Marslen-Wilson, 1989)
- The TRACE Model, similar to the Interactive Activation model (McClelland & Elman, 1986)

Online word recognition: the cohort model

Recognizing Spoken Words: The Cohort Model
- All candidates are considered in parallel
- Candidates are eliminated as more evidence becomes available in the speech input
- The uniqueness point occurs when only one candidate remains (a sketch follows)

52、y one candidate remains2021/7/2251Evidence for activation of spurious wordsIf we recognize words by recognizing a cohort of possibilities, then the cohort words should excert some influenceShillcock (1990). Test for semantic priming of cohort words:“He picked up the trombone”“trom” “bone”Semantic pr

53、iming for the word “rib”2021/7/2252TRACE modelSimilar to interactive activation model but applied to speech recognition Connections between levels are bi-directional and excitatory top-down effectsConnections within levels are inhibitory producing competition between alternatives (McClelland & Elman

54、, 1986)2021/7/2253TRACE Model(McClelland & Elman, 1986)2021/7/2254Human Eye Tracking DataPick up the beakerAllopenna, Magnuson & Tanenhaus (1998)Eye tracking device to measurewhere subjects are looking2021/7/2255Human Eye Tracking DataHuman eye tracking data highly similar to TRACE predictions Allopenna, Magnuson & Tanenhaus (1998)2021/7/2256个人观点供参考,欢迎讨论
