PhD thesis - School of Informatics - University of Edinburgh
Chapter 6. Other Potential Applications
and Steeneken (2000) show with a combination of speech reception threshold (SRT)³ tests and letter guessing procedure (LGP)⁴ experiments that non-authentic, i.e. accented, pronunciation of foreign words increases intelligibility for inexperienced learners. Experienced L2 learners, on the other hand, perceive non-accented L2 speech as more intelligible. Their experiments involve native Dutch speakers listening to German and English speech produced by German and English native speakers as well as Dutch native speakers. Whether the same holds true for the perception of polyglot speech, where the foreign items are spoken in the context of the listener's own language, is still an open question.
Even if a TTS synthesis system is capable of producing authentic pronunciations of embedded inclusions, or can at least approximate them to some extent in the foreign language, such a rendering is unlikely to be accepted or preferred over one that adjusts at least some foreign speech sounds to the acoustic characteristics of the listener's own language. As Lindström and Eklund (2000) point out, too close an approximation to the foreign language may not have the desired effect on listeners, who may consider it exaggerated or inappropriate. The other extreme, producing the pronunciation of embedded foreign words merely by means of the letter-to-sound rules of the base language of the text, will result in gross mispronunciations or overly accented speech which listeners are likely to deem uneducated or may not even understand.
Besides speaker- and word-related factors such as age, experience of the foreign language, awareness of the task or frequency of the foreign word, which can heavily influence the results of a mixed-lingual speech perception experiment, a perception study of mixed-lingual TTS synthesis output also depends on the actual synthesis approach of foreign inclusions. Although the chosen approach may well be modelled on the behaviour of subjects in production experiments, the perception study subjects
³ An SRT test involves adding masking noise to test sentences. If subjects listening to the test sentence repeat all its words correctly, the noise is increased; otherwise it is decreased. The SRT value is the average speech-to-noise ratio over a set of sentences and provides a reliable measure of sentence intelligibility in noise.
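The adaptive SRT procedure described in this footnote can be sketched in a few lines of code. The function and the toy listener model below are purely illustrative assumptions for exposition, not part of the thesis or of the cited experiments:

```python
def run_srt_test(sentences, listener, start_snr=0.0, step=2.0):
    """Adaptive SRT procedure (illustrative sketch): after a fully
    correct repetition the masking noise is increased (SNR lowered by
    `step` dB); after an error the noise is decreased (SNR raised).
    The SRT is the mean SNR, in dB, over all presented sentences."""
    snr = start_snr
    presented = []
    for sentence in sentences:
        presented.append(snr)
        if listener(sentence, snr):  # True if all words repeated correctly
            snr -= step              # harder: more masking noise
        else:
            snr += step              # easier: less masking noise
    return sum(presented) / len(presented)

# Hypothetical listener that repeats correctly whenever the SNR is
# non-negative; the procedure then oscillates around its threshold.
srt = run_srt_test(["s1", "s2", "s3", "s4"], lambda s, snr: snr >= 0.0)
```

With such a deterministic listener the track converges on the listener's threshold, which is the intuition behind using the average speech-to-noise ratio as the SRT.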
⁴ LGP was devised by Shannon and Weaver (1949) as a way of estimating lower and upper bounds on the linguistic information content of a language. Subjects are asked to guess a string of text one letter at a time without receiving any prior information. After each guess, the correct letter is either revealed (single-LGP) or kept undisclosed until the guess is correct (multiple-LGP). Subjects can make use of the letters guessed up to the current point for predicting the next letter. The more letters a subject guesses correctly, the more redundant the language is to them and the more linguistically skilled they are in that language (van Rooij and Plomp, 1991). LGP scores are expressed in terms of linguistic entropy L (in bits): L = −log₂(c), where c is the fraction of correctly guessed letters.
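The entropy formula from this footnote can be illustrated with a short calculation; the function name below is an illustrative assumption, not notation from the thesis:

```python
import math

def linguistic_entropy(correct, total):
    """LGP score L = -log2(c) in bits, where c is the fraction of
    correctly guessed letters."""
    c = correct / total
    return -math.log2(c)

# A subject who guesses every letter correctly perceives no
# uncertainty: L = 0 bits.
linguistic_entropy(100, 100)
# Guessing half the letters correctly corresponds to L = 1 bit.
linguistic_entropy(50, 100)
```

Lower entropy thus corresponds to higher perceived redundancy, i.e. greater linguistic skill in the language being tested.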