Sensation and Perception
Chapter 12: Speech and Music
Most images © 2014 Worth Publishing. Most images from Yantis (2014)
Lecture Outline
• Speech
– Phonemes
– Production of phonemes: vocal system, filtering, formants, spectrogram, voicing, point of articulation, manner of articulation
– Perceiving phonemes: correspondence, segmentation, categorical perception, McGurk effect
– Neural basis of production and perception
• Music
– Dimensions of music: pitch, loudness / dynamics, timing / rhythm
– Melody
– Consonance and dissonance
– Neural basis of music perception
– Absolute pitch
– Amusia
Production of Phonemes
• Phonemes
Production of Phonemes
• Speech consists of multiple harmonics
• The harmonics, produced by the vibrating vocal folds, are filtered (changed) by structures in the oral and nasal cavities (the filter is indicated as the red line)
• The peaks in the filtered sound (F1, F2, F3) are called formants
Formants for Vowels
• The oral cavity has a different shape when saying “bead” vs. “bad”
• The different shape results in different filtering of the sound
• The formants, F1, F2, and F3 have different frequencies for the two vowel sounds
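The filtering idea on these slides can be sketched numerically. The block below is a toy model, not the textbook's method: the resonance-curve shape, the 50 Hz bandwidth, the 100 Hz fundamental, the equal-amplitude harmonics, and the formant values for the two vowels are all illustrative assumptions (the formant values are rough approximations and are not taken from the slides).

```python
def filter_gain(freq_hz, formants_hz, bandwidth_hz=50.0):
    """Toy vocal-tract filter: a sum of simple resonance curves, one per formant."""
    return sum(1.0 / (1.0 + ((freq_hz - f) / bandwidth_hz) ** 2) for f in formants_hz)


def formant_peaks(formants_hz, f0_hz=100, n_harmonics=40):
    """Return the harmonics that form local peaks after filtering."""
    harmonics = [f0_hz * n for n in range(1, n_harmonics + 1)]
    # Assumption: every harmonic leaves the source with the same amplitude (1.0),
    # so the filtered amplitude equals the filter gain at that harmonic.
    filtered = [filter_gain(h, formants_hz) for h in harmonics]
    return [
        harmonics[i]
        for i in range(1, len(harmonics) - 1)
        if filtered[i] > filtered[i - 1] and filtered[i] > filtered[i + 1]
    ]


# Hypothetical formant frequencies in Hz (F1, F2, F3) for the two vowels.
VOWELS = {
    "bead": (270, 2290, 3010),
    "bad": (660, 1720, 2410),
}

for word, formants in VOWELS.items():
    print(word, formant_peaks(formants))
```

The same set of harmonics goes in for both words; only the filter changes, so the peaks land on different harmonics for each vowel.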
Sound Spectrogram
Production of Consonants
• Voicing
• Point of articulation
• Manner of articulation
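Patterns of Articulation
Place of articulation, followed by manner of articulation and voicing:
• Bilabial – stops: /p/ pit (voiceless), /b/ bit (voiced); nasal: /m/ sum (voiced)
• Labiodental – fricatives: /f/ fine (voiceless), /v/ vine (voiced)
• Dental – fricatives: /θ/ thick (voiceless), /ð/ them (voiced)
• Alveolar – stops: /t/ tip (voiceless), /d/ dip (voiced); fricatives: /s/ sip (voiceless), /z/ zip (voiced); nasal: /n/ sun (voiced); approximant: /l/ let (voiced)
• Postalveolar – fricatives: /ʃ/ ship (voiceless), /ʒ/ vision (voiced); affricates: /tʃ/ chip (voiceless), /dʒ/ jet (voiced); approximant: /r/ rip (voiced)
• Palatal – approximant: /j/ yet (voiced)
• Velar – stops: /k/ kit (voiceless), /g/ get (voiced); nasal: /ŋ/ sing (voiced)
• Glottal – fricative: /h/ hit (voiceless)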
Perceiving Speech Sounds
• Perceiving speech is difficult because
– There is no one-to-one correspondence between speech sounds and phonemes
– Segmentation problem
Coarticulation
Categorical Perception of Speech Sounds
Categorical Perception of Speech Sounds
[Figure axis: values from -10 to +50 in steps of 5]
McGurk Effect
• meetmarsmarsisasupercutegirllookingforherforeverhomeshewasbroughtinasastraywiththreeothercatschinamoroccoandjupitermarsisapolydactylcatwhichmeansshehasextratoessoinsteadofhavingeighteentoeswhichmostcatshaveshehastwentyfourmostpolydactylcatshaveextratoesontheirfrontfeetbutnottheirbackfeetitisraretohaveextratoesonallfourfeetlikemarsdoes
Top-Down Processes in Speech Perception: Word Segmentation
• If there are no physical pauses between words in normal speech, how do we segment speech into individual words?
• A new word starts when the next phoneme is unlikely to be part of the current word
– Phoneme transition probabilities: within a word, some phonemes frequently follow others (/th/ is often followed by /e/), and some phonemes rarely follow others (/th/ is rarely followed by /q/)
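The transition-probability idea can be sketched as a small program. This is a minimal illustration under stated assumptions, not a model of how listeners actually segment speech: letters stand in for phonemes, the five-word lexicon is hypothetical, and the 0.1 threshold is arbitrary.

```python
from collections import Counter, defaultdict


def transition_probabilities(words):
    """Estimate P(next symbol | current symbol) from within-word pairs."""
    counts = defaultdict(Counter)
    for word in words:
        for current, following in zip(word, word[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


def segment(stream, probs, threshold=0.1):
    """Start a new word wherever the next symbol is unlikely to follow the current one."""
    words, start = [], 0
    for i in range(1, len(stream)):
        p = probs.get(stream[i - 1], {}).get(stream[i], 0.0)
        if p < threshold:  # unlikely transition: treat it as a word boundary
            words.append(stream[start:i])
            start = i
    words.append(stream[start:])
    return words


# Hypothetical lexicon; letters are stand-ins for phonemes.
LEXICON = ["the", "cat", "sat", "then", "that"]
PROBS = transition_probabilities(LEXICON)

print(segment("thecatsat", PROBS))
```

In this toy lexicon "e" is never followed by "c" and "t" is never followed by "s" inside a word, so those two transitions are where the stream is split.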
Top-Down Processes in Speech Perception: Phonemic Restoration
• When speech is interrupted by another sound, say a cough, one or more phonemes may be covered by the sound. Yet the covered phonemes are still perceived
– Phoneme transition probabilities often dictate that only a small number of phonemes are likely to have occurred during the sound and the missing phoneme is perceptually filled in.
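The filling-in step can be sketched in the same spirit. This is only an illustration of the logic on the slide: letters again stand in for phonemes, the probability table is hypothetical, and the scoring rule (multiply the probability of the candidate following the previous symbol by the probability of the next symbol following the candidate) is an assumption chosen for simplicity.

```python
def restore(before, after, probs):
    """Pick the symbol most likely to sit between `before` and `after`.

    Score for candidate x: P(x | before) * P(after | x).
    Returns None when no candidate has a non-zero score.
    """
    scores = {
        candidate: p_first * probs.get(candidate, {}).get(after, 0.0)
        for candidate, p_first in probs.get(before, {}).items()
    }
    if not scores or max(scores.values()) == 0.0:
        return None
    return max(scores, key=scores.get)


# Hypothetical transition probabilities P(next | current).
TOY_PROBS = {
    "t": {"h": 0.9, "r": 0.1},
    "h": {"e": 0.7, "a": 0.3},
    "r": {"e": 0.5, "a": 0.5},
}

# "t?e": the middle symbol is masked, for example by a cough.
print(restore("t", "e", TOY_PROBS))
```

Only two candidates can follow "t" in this table, which mirrors the point that few phonemes are plausible at the masked position.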
Speech Perception and Production
Dimensions of Music: Pitch
• Octave – a range of notes such that the fundamental frequency of the last note is twice that of the first note
• Semitone – the interval between adjacent notes when an octave is divided into 13 notes (including both end notes) separated by proportionally equivalent intervals
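The two definitions can be checked with a short calculation. The sketch assumes equal temperament, in which each semitone multiplies the frequency by the twelfth root of 2; the 262 Hz starting note is an arbitrary choice.

```python
SEMITONE_RATIO = 2 ** (1 / 12)  # assumption: equal temperament


def octave_notes(start_hz):
    """Fundamental frequencies of the 13 notes spanning one octave."""
    return [start_hz * SEMITONE_RATIO ** k for k in range(13)]


notes = octave_notes(262.0)
steps_hz = [b - a for a, b in zip(notes, notes[1:])]   # differences in Hz
ratios = [b / a for a, b in zip(notes, notes[1:])]     # proportional steps

print(round(notes[0]), round(notes[-1]))               # first and last note of the octave
print(round(steps_hz[0], 1), round(steps_hz[-1], 1))   # Hz steps grow with frequency
print(round(ratios[0], 4), round(ratios[-1], 4))       # ratios stay the same
```

Thirteen notes give twelve intervals, and twelve equal ratios multiply out to a doubling of frequency.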
Pitch Helix
• The proportional equivalence of semitones is based on two perceptual observations:
– Notes an octave apart are perceptually more similar than notes that have more similar frequencies
– The perceptual spacing of adjacent semitones is equal even though the frequency separation of adjacent semitones is not equal (they are proportionately equal)
• Tone chroma and tone height
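Tone chroma (position within the octave) and tone height (which octave) can be separated with a logarithm. This is a rough sketch, assuming equal temperament, rounding to the nearest semitone, and a reference C of 262 Hz counted as height 0; the note names and the function name are illustrative.

```python
import math

NOTE_NAMES = ["C", "D♭", "D", "E♭", "E", "F", "G♭", "G", "A♭", "A", "B♭", "B"]


def chroma_and_height(freq_hz, reference_c_hz=262.0):
    """Split a frequency into (chroma, height) relative to a reference C."""
    # Number of semitones above (or below) the reference, to the nearest semitone.
    semitones = round(12 * math.log2(freq_hz / reference_c_hz))
    # One turn of the helix is 12 semitones: the quotient is the height,
    # the remainder is the position around the turn (the chroma).
    height, chroma = divmod(semitones, 12)
    return NOTE_NAMES[chroma], height


for freq in (131.0, 262.0, 392.0, 524.0):
    print(freq, chroma_and_height(freq))
```

Frequencies an octave apart share a chroma and differ only in height, which is the similarity the helix is drawn to show.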
Dimensions of Music: Loudness and Timing
• Dynamics – the way the loudness of the notes changes as a piece of music progresses
• Rhythm – the temporal pattern of notes in music
– Fast tempo ≈ joyful mood (allegro is Italian for joyful)
– Slow tempo ≈ somber mood
Melody
• Melody – a sequence of notes arranged in a particular temporal pattern
– Relative position (number of semitones above or below the previous note) of pitches in the sequence
– Transposition – a version of a melody that starts at a different note but retains the relative position of all notes
• Six-month-old infants recognize transpositions as the same melody
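Relative position and transposition can be made concrete with a few lines of code. The sketch assumes each note is written as an integer semitone number (any consistent numbering works) and ignores rhythm; the function names and the example melody are illustrative.

```python
def intervals(melody):
    """Semitone steps from each note to the next (the relative positions)."""
    return [b - a for a, b in zip(melody, melody[1:])]


def transpose(melody, semitones):
    """Shift every note by the same number of semitones."""
    return [note + semitones for note in melody]


def is_transposition(melody_a, melody_b):
    """Two melodies match if their sequences of semitone steps are identical."""
    return len(melody_a) == len(melody_b) and intervals(melody_a) == intervals(melody_b)


# Hypothetical melody as semitone numbers.
original = [60, 60, 67, 67, 69, 69, 67]
shifted = transpose(original, 5)

print(intervals(original))
print(is_transposition(original, shifted))
```

The starting note changes but the list of intervals does not, which is what makes the shifted version recognizable as the same melody.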
Consonance and Dissonance
Harmonic frequencies (Hz) of three notes:
• C: 262, 524, 786, 1048, 1310, 1572, 1834, 2096, 2358
• G: 392, 784, 1176, 1568, 1960, 2352
• D♭: 277, 554, 831, 1108, 1385, 1662, 1939, 2216, 2493
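The harmonic frequencies listed above can be compared numerically. The sketch below assumes that the point of the comparison is how many harmonics of two notes nearly coincide; the 1% tolerance and the function names are illustrative choices, not part of the slides.

```python
def harmonics(f0_hz, count):
    """Integer multiples of the fundamental frequency."""
    return [f0_hz * n for n in range(1, count + 1)]


def near_matches(series_a, series_b, tolerance=0.01):
    """Pairs of harmonics whose frequencies differ by at most `tolerance` (relative)."""
    return [
        (a, b)
        for a in series_a
        for b in series_b
        if abs(a - b) / min(a, b) <= tolerance
    ]


C = harmonics(262, 9)
G = harmonics(392, 6)
D_FLAT = harmonics(277, 9)

print("C with G: ", near_matches(C, G))
print("C with D♭:", near_matches(C, D_FLAT))
```

With these values, C and G share several nearly identical harmonics, while C and D♭ share none within the tolerance.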
Neural Basis of Music Perception
Absolute Pitch Perception
• Absolute pitch – about 1% of the population can accurately name isolated notes
• Amusia – about 4% of the population have a profound impairment in perceiving and remembering melodies and in distinguishing melodies