28.02.2013 Views

Introduction to Acoustics


682 Part E Music, Speech, Electroacoustics


shows how accurately MFDR could be predicted from Ps and F0 for previously published data for untrained male and female singers and for professional baritone singers [16.45, 46]. Both Ps and F0 are linearly related to MFDR. However, the singers showed a much greater variation with F0 than the untrained voices. This difference reflected the fact that, unlike the untrained subjects, the singers could sing a high F0 much more softly than the untrained voices. The ability to sing high notes also softly would belong to the essential expressive skills of a singer. Recalling that an increase of Ps increases F0 by a few Hz/cm H2O, we realize that singing high tones softly requires more forceful contraction of the pitch-raising laryngeal muscles than singing such tones loudly.

16.3 The Vocal Tract Filter

The source-filter theory, schematically illustrated in Fig. 16.19, describes vocal sound production as a three-step process: (1) generation of a steady flow of air from the lungs (DC component); (2) conversion of this airflow into a pseudo-periodically pulsating transglottal airflow (DC-to-AC conversion), referred to as the voice source; and (3) response of the vocal tract to this excitation signal (modulation of AC signal) which is characterized by the frequency curve or transfer function of the vocal tract. So far the first two stages, respiration and phonation, have been considered.

[Figure 16.19. Panels, bottom to top: transglottal airflow (waveform vs. time); glottal voice source (spectrum, level vs. frequency); vocal tract frequency curve with formants (level vs. frequency); radiated spectrum (level vs. frequency). Anatomical labels: lungs, trachea, vocal folds, vocal tract, velum.]

Fig. 16.19 Schematic illustration of the generation of voice sounds. The vocal fold vibrations result in a sequence of voice pulses (bottom) corresponding to a series of harmonic overtones, the amplitudes of which decrease monotonically with frequency (second from bottom). This spectrum is filtered according to the sound transfer characteristics of the vocal tract with its peaks, the formants, and the valleys between them. In the spectrum radiated from the lip opening, the formants are depicted in terms of peaks, because the partials closest to a formant frequency reach higher amplitudes than neighboring partials
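As a rough illustration of the source-filter process described above, the following Python sketch passes a crude glottal source through a cascade of formant resonators. Everything in it is an assumption made for illustration and is not taken from the text: the function names, the sampling rate, the use of a bare pulse train as the source (its spectrum is flat, unlike the monotonically falling spectrum of a real voice source), the digital two-pole resonator used as the filter, and the formant values in the usage note. Step (1), the steady airflow from the lungs, is not modelled.

```python
import math

def glottal_source(f0, fs, duration):
    """Crude stand-in for step (2): one unit pulse per glottal cycle."""
    n = int(fs * duration)
    period = fs / f0                      # samples per glottal cycle
    source = [0.0] * n
    k = 0
    while True:
        idx = int(round(k * period))
        if idx >= n:
            break
        source[idx] = 1.0
        k += 1
    return source

def resonator(x, F, B, fs):
    """Step (3) for one formant: two-pole digital resonator.

    F is the formant frequency and B the bandwidth, both in Hz.
    The gain is normalised so that the response at 0 Hz is 1.
    """
    r = math.exp(-math.pi * B / fs)       # pole radius set by the bandwidth
    a1 = -2.0 * r * math.cos(2.0 * math.pi * F / fs)
    a2 = r * r
    gain = 1.0 + a1 + a2
    y, y1, y2 = [], 0.0, 0.0
    for xn in x:
        yn = gain * xn - a1 * y1 - a2 * y2
        y.append(yn)
        y2, y1 = y1, yn
    return y

def synthesize(f0, formants, fs=16000, duration=0.5):
    """Filter the source through each (F, B) formant in cascade."""
    signal = glottal_source(f0, fs, duration)
    for F, B in formants:
        signal = resonator(signal, F, B, fs)
    return signal
```

A call such as `synthesize(100.0, [(500.0, 80.0), (1500.0, 100.0), (2500.0, 120.0)])` returns the filtered samples as a list; the formant frequencies and bandwidths here are arbitrary illustrative values.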

In this section we will discuss the third step, viz. how the vocal tract filter, i.e. the resonance characteristics of the vocal tract, modifies, and to some extent interacts with, the glottal source and shapes the final sound output radiated from the talker’s/singer’s lips.

Resonance is a key feature of the filter response. The oral, pharyngeal and nasal cavities of the vocal tract form a system of resonators. During each glottal cycle the air enclosed by these cavities is set in motion by the glottal pulse, the main moment of excitation occurring during the closing of the vocal folds, more precisely at the time of the MFDR, the maximum flow declination rate (cf. the previous section on source).

The behavior of a vocal tract resonance, or formant, is specified both in the time and the frequency domains. For any transient excitation, the time response is an exponentially decaying cosine [16.27, p. 46]. The frequency response is a continuous amplitude-frequency spectrum with a single peak. The shape of either function is uniquely determined by two numbers (in Hz): the formant frequency F and the bandwidth B. The bandwidth quantifies the degree of damping, i.e., how fast the formant oscillation decays. Expressed as sound pressure variations, the time response is

$$p(t) = A\, e^{-\pi B t} \cos(2\pi F t) \,. \tag{16.1}$$
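The time response in (16.1), an exponentially decaying cosine with envelope A·exp(−πBt), can be evaluated directly. The short Python sketch below does so; the function names are hypothetical, and the 1/e decay time 1/(πB) is simply read off the envelope of (16.1) rather than quoted from the text.

```python
import math

def formant_time_response(t, F, B, A=1.0):
    """Sound pressure p(t) of a single formant, following Eq. (16.1).

    t is time in seconds; F (formant frequency) and B (bandwidth) are in Hz.
    """
    return A * math.exp(-math.pi * B * t) * math.cos(2.0 * math.pi * F * t)

def envelope_decay_time(B):
    """Time in seconds for the envelope A*exp(-pi*B*t) to fall to A/e."""
    return 1.0 / (math.pi * B)
```

Doubling the bandwidth halves the decay time, which is the sense in which B quantifies the damping of the formant oscillation.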

For a single formant curve, the amplitude variation as a function of frequency f is given (in dB) by

$$L(f) = 20 \log \frac{F^{2} + \left(\frac{B}{2}\right)^{2}}{\sqrt{(f - F)^{2} + \left(\frac{B}{2}\right)^{2}}\;\sqrt{(f + F)^{2} + \left(\frac{B}{2}\right)^{2}}} \,. \tag{16.2}$$
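The single-formant level curve can be checked numerically. The sketch below assumes that (16.2) has the standard form L(f) = 20 log10 of [F² + (B/2)²] divided by the product of sqrt((f − F)² + (B/2)²) and sqrt((f + F)² + (B/2)²); this reading, the base-10 logarithm (implied by the dB unit), and the function name are assumptions rather than statements from the text.

```python
import math

def formant_level_db(f, F, B):
    """Level in dB of a single formant curve at frequency f (Hz).

    Assumed form of Eq. (16.2): the curve is normalised to 0 dB at f = 0.
    F is the formant frequency and B the bandwidth, both in Hz.
    """
    half_b_sq = (B / 2.0) ** 2
    numerator = F * F + half_b_sq
    denominator = (math.sqrt((f - F) ** 2 + half_b_sq)
                   * math.sqrt((f + F) ** 2 + half_b_sq))
    return 20.0 * math.log10(numerator / denominator)
```

Under this form the level is 0 dB at f = 0 and, when B is much smaller than F, the level at f = F is close to 20 log10(F/B), so a narrower bandwidth gives a higher formant peak.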
