27.12.2012 Views

Oscillations, Waves, and Interactions - GWDG


3.4.2 Analysis of running speech


The analysis was carried out only on connected intervals classified as voiced with a minimum length of 70 ms. The intervals are weighted by length, since longer intervals are more expressive. In these intervals period markers are set (Fig. 3 (11)) and acoustic quantities are determined, for instance:

• period lengths by the waveform matching algorithm;
• jitter (3 definitions), shimmer (3 definitions), MWC, GNE.
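The text mentions three definitions each of jitter and shimmer without spelling them out. As a minimal sketch, the following Python code uses one widely used form, the mean absolute difference of consecutive values relative to their mean. This choice is an assumption and need not coincide with any of the three definitions meant here; the function names are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by the mean value.

    Assumption: this "local" perturbation measure is only one common
    definition; the source does not state which definitions it uses.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(period_lengths):
    """Cycle-to-cycle perturbation of the period lengths (dimensionless)."""
    return relative_perturbation(period_lengths)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle perturbation of the period amplitudes (dimensionless)."""
    return relative_perturbation(peak_amplitudes)


# Hypothetical period lengths in ms, as they might come from period markers.
periods_ms = [10.0, 12.0, 10.0, 12.0]
jitter = local_jitter(periods_ms)
```

The period lengths would in practice come from the period markers set within each voiced interval; here they are made-up numbers.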

From these again a GHD can be constructed. The position of voices in the GHD differs between running speech and stationary vowels, so a new calibration is required to obtain comparable representations for both cases. The definition of the axes, which is based on a principal-component analysis in a high-dimensional space, has to be carried out anew. Here, the same underlying quantities were chosen for consistency, but their weighting was different. The new GHD is called “GHDT”, “T” meaning “text”. The coordinates in the GHDT are averaged over the analyzed intervals of the text utterance, weighted by their lengths. Because of sound dependence, the variances of the measurement points in the GHD are of course larger than for stationary vowels, but the mean values retain their expressiveness. The consistency of the GHDT was checked with various normal and pathological voices and different utterances.
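The length-weighted averaging of GHDT coordinates over the voiced intervals can be sketched as follows. This is an illustration only: the tuple layout, the function name, and the generic coordinate names `x` and `y` are assumptions, and the calibration of the axes is not shown.

```python
MIN_LENGTH_S = 0.070  # minimum length of a voiced interval (70 ms, from the text)


def weighted_ghdt_coordinates(intervals):
    """Average per-interval coordinates, weighted by interval length.

    intervals: list of (duration_s, x, y) tuples, one per voiced interval.
    Intervals shorter than MIN_LENGTH_S are ignored.
    Returns the length-weighted mean (x, y).
    """
    kept = [(d, x, y) for d, x, y in intervals if d >= MIN_LENGTH_S]
    total = sum(d for d, _, _ in kept)
    if total == 0:
        raise ValueError("no voiced interval of sufficient length")
    mean_x = sum(d * x for d, x, _ in kept) / total
    mean_y = sum(d * y for d, _, y in kept) / total
    return mean_x, mean_y


# Hypothetical intervals: the first one is too short and is dropped.
example = [(0.05, 100.0, 100.0), (0.10, 1.0, 2.0), (0.30, 3.0, 4.0)]
x_mean, y_mean = weighted_ghdt_coordinates(example)
```

In this made-up example the 300 ms interval contributes three times as much to the mean as the 100 ms interval, which is the effect the weighting is meant to have.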

Besides the GHD, the automatic voiced/unvoiced classification can also be applied to other diagnostically useful quantities in order to extend their usage to running speech. This concerns, for instance, the Pitch Amplitude (PA; first maximum of the autocorrelation function of the prediction error signal) and the Spectral Flatness Ratio (SFR; logarithm of the ratio of geometric and arithmetic means of the spectral energy density of the prediction error signal).
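The two definitions above can be illustrated with a short Python sketch that takes an already computed prediction error signal as input (the linear prediction step itself is not shown). Several details are assumptions not given in the text: the autocorrelation is normalized by its zero-lag value, the "first maximum" is taken as the first local maximum after lag zero, the logarithm is base 10, and a small floor avoids taking the logarithm of zero. The function names are hypothetical.

```python
import cmath
import math


def power_spectrum(signal):
    """Squared magnitude of the DFT for bins 0..n/2 (naive O(n^2) version)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
        spectrum.append(abs(s) ** 2)
    return spectrum


def spectral_flatness_ratio(residual, floor=1e-12):
    """log10 of (geometric mean / arithmetic mean) of the power spectrum.

    The result is 0 for a flat spectrum and negative otherwise.
    Assumption: base-10 logarithm and a floor on empty bins.
    """
    p = [max(v, floor) for v in power_spectrum(residual)]
    log_geometric = sum(math.log10(v) for v in p) / len(p)
    arithmetic = sum(p) / len(p)
    return log_geometric - math.log10(arithmetic)


def pitch_amplitude(residual):
    """First local maximum of the normalized autocorrelation after lag 0.

    Returns (lag, value), or None if no local maximum is found.
    """
    n = len(residual)
    r0 = sum(x * x for x in residual)
    if r0 == 0:
        raise ValueError("signal has zero energy")
    acf = [sum(residual[t] * residual[t + k] for t in range(n - k)) / r0
           for k in range(n)]
    for k in range(1, n - 1):
        if acf[k - 1] < acf[k] >= acf[k + 1]:
            return k, acf[k]
    return None


# Hypothetical residual: an impulse train with a period of 8 samples.
impulse_train = [1.0 if t % 8 == 0 else 0.0 for t in range(64)]
lag, pa = pitch_amplitude(impulse_train)
sfr = spectral_flatness_ratio(impulse_train)
```

A strongly periodic residual gives a PA close to 1 at the pitch lag, while a noise-like residual gives a low PA and an SFR close to 0.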

Based on the acoustic quantities, group analyses of various phonation mechanisms and cancer groups (significant group separation) can be conducted. For preliminary and recent presentations of methods and results see Refs. [25–28].

So far, the task was not to recognize phonemes but only their linguistic (not actual) voicedness. Meanwhile, the perceptron method has been extended to recognition of the six stationary vowels, using 6 output cells. Training was done with 8192 vowels of at least 2 s duration from all kinds of voice quality. This can help to further automate the determination of voice quality.
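How a perceptron with six output cells assigns one of six classes can be sketched as below. This is not the network described in the text, whose input features, architecture, and training procedure are not given here; it is a generic single-layer multi-class perceptron with the classic error-driven update rule, and the one-hot toy features are purely hypothetical.

```python
def scores_for(weights, x):
    """One linear score per output cell; the last weight of each row is the bias."""
    xb = list(x) + [1.0]
    return [sum(w * v for w, v in zip(row, xb)) for row in weights]


def classify(weights, x):
    """Index of the output cell with the highest score."""
    s = scores_for(weights, x)
    return max(range(len(s)), key=s.__getitem__)


def train_perceptron(samples, labels, n_classes=6, epochs=20):
    """Multi-class perceptron rule: on an error, move the weights of the
    target cell towards the input and those of the wrongly winning cell away."""
    n_features = len(samples[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        errors = 0
        for x, target in zip(samples, labels):
            predicted = classify(weights, x)
            if predicted != target:
                errors += 1
                xb = list(x) + [1.0]
                for i, v in enumerate(xb):
                    weights[target][i] += v
                    weights[predicted][i] -= v
        if errors == 0:  # every training sample is classified correctly
            break
    return weights


# Hypothetical toy data: one feature vector per vowel class (0..5).
toy_samples = [[1.0 if i == j else 0.0 for i in range(6)] for j in range(6)]
toy_labels = list(range(6))
trained = train_perceptron(toy_samples, toy_labels)
predictions = [classify(trained, x) for x in toy_samples]
```

With real data the feature vectors would be acoustic quantities derived from the vowel recordings, and the training set would be far larger than this toy example.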

3.5 Analysis of glottal oscillation<br />

The voice pathologies are related to the functioning of the vocal folds, which form a self-oscillating nonlinear mechanical and aerodynamic system driven by the glottal air flow. In order to relate the acoustic voice characteristics to properties of the glottal oscillation, these must be recorded (if possible, automatically) and characterized by a few quantities. Here, acoustic as well as optical methods are employed. These methods have not yet been extended to running speech, but the only essential difficulty in doing so appears to be the large amount of data that then arises.
