Norsk i den digitale tidsalderen - Meta-Net
Norsk i den digitale tidsalderen - Meta-Net
Norsk i den digitale tidsalderen - Meta-Net
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
Speech Output Speech Synthesis<br />
Speech Input<br />
Signal Processing<br />
user interface. For static utterances where the wording<br />
does not depend on particular contexts of use or per-<br />
sonal user data, this can deliver a rich user experience.<br />
But more dynamic content in an utterance may suffer<br />
from unnatural intonation because different parts of au-<br />
dio files have simply been strung together. rough opti-<br />
misation, today’s TTS systems are getting better at pro-<br />
ducing natural-sounding dynamic utterances.<br />
Interfaces in speech interaction have been considerably<br />
standardised during the last decade in terms of their<br />
various technological components. ere has also been<br />
strong market consolidation in speech recognition and<br />
speech synthesis. e national markets in the G20 coun-<br />
tries (economically resilient countries with high popu-<br />
lations) have been dominated by just five global play-<br />
ers, with Nuance (USA) and Loquendo (Italy) being the<br />
most prominent players in Europe. In 2011, Nuance an-<br />
nounced the acquisition of Loquendo, which represents<br />
a further step in market consolidation.<br />
In the Norwegian TTS market, thirteen Norwegian<br />
voices of varying quality are available, some of which<br />
have been developed by the above mentioned European<br />
players. ree voices have been developed by the Nor-<br />
wegian company Lingit, which targets users groups with<br />
reading and writing impairments. Another voice was de-<br />
veloped by the Norwegian Library of Talking Books and<br />
Braille in cooperation with their sister library in Swe<strong>den</strong>.<br />
ere is also an active research community at the Nor-<br />
Phonetic Lookup &<br />
Intonation Planning<br />
Recognition<br />
5: Speech-based dialogue system<br />
Natural Language<br />
Understanding &<br />
Dialogue<br />
wegian University of Science and Technology in Trond-<br />
heim.<br />
Language resources for speech synthesis are<br />
abundant for English but only to a lesser extent<br />
for Norwegian.<br />
e quality of speech synthesis depends heavily on avail-<br />
able resources (in particular text corpora tagged for part<br />
of speech, tokenisers and pronunciation lexicons) and<br />
language specific research on for instance prosodic fea-<br />
tures for the language in question. Such resources are<br />
abundant for English but only to a lesser extent for Nor-<br />
wegian, even if Norwegian is especially challenging due<br />
to the wide variety of possible spelling variants and the<br />
range of dialects; moreover the tonal accents in most<br />
Norwegian dialects and the lack of a one-to-one relation<br />
between sounds and letters pose challenges.<br />
Regarding dialogue management technology and<br />
know-how, the market is rather dominated by national,<br />
smaller enterprises. MediaLT has developed a general<br />
speech recognition engine used for dialogue manage-<br />
ment for the visually impaired. Regarding speech-to-<br />
text, Max Manus has integrated and localised Philips’<br />
SpeechMagic for Norwegian hospitals. is system is<br />
relatively successful, but it is limited to a relatively closed<br />
domain (with a closed vocabulary). Recently Dragon<br />
Dictation, a voice recognition application for mobile<br />
58