
Proceedings Fonetik 2009 - Institutionen för lingvistik


Proceedings, FONETIK 2009, Dept. of Linguistics, Stockholm University

abbreviations and formats for ordinals, date expressions and suchlike. Similar modifications were carried out for text expansions of the above-mentioned classifications.

A language detector that distinguishes the target language from English was also included in the Norwegian system. This module looks up the words in all dictionaries and suggests a language tag (Norwegian or English) for each word, depending on the unambiguous language types of the surrounding words.

OOV words are automatically predicted to be proper names, simplex or compound Norwegian words, or English words. Some of the pronunciations of these words are generated by rules, but most are generated with CART trees, one for each word type.

The output from the text processor is sent to the TTS engine in SSML format.

Quality assurance

The quality assurance phase consists of two parts: the developers' own testing to catch general errors, and a listening test period in which native speakers report errors in segmentation, pronunciation and text analysis to the development team. They are also able to correct minor errors by adding or changing transcriptions or by editing simpler text processing rules. Some ten textbooks will be produced for this purpose, as well as test documents with utterances of high complexity.

Current status

The second part of the quality assurance phase, with native speakers, will start in May 2009. The system is scheduled to be put into production for Norwegian textbooks by the autumn term of 2009. Currently, the results are promising. The voice appears clear and highly intelligible, also in the generation of more complex utterances such as code switching between Norwegian and English.

References

Black A. and Taylor P. (1997). Automatically clustering similar units for unit selection in speech synthesis. Proceedings of Eurospeech 97, Rhodes, Greece.

DAISY Pipeline (2009). http://www.daisy.org/projekcts/pipeline.

Ericsson C., Klein J., Sjölander K. and Sönnebo L. (2007). Filibuster – a new Swedish text-to-speech system. Proceedings of Fonetik, TMH-QPSR 50(1), 33-36, Stockholm.

Kominek J. and Black A. (2003). CMU ARCTIC database for speech synthesis. Language Technologies Institute, Carnegie Mellon University, Pittsburgh PA. Technical Report CMU-LTI-03-177. http://festvox.org/cmu_arctic/cmu_arctic_report.pdf

Källgren G., Gustafson-Capkova S. and Hartman B. (2006). Stockholm Umeå Corpus 2.0 (SUC 2.0). Department of Linguistics, Stockholm University, Stockholm.

Sjölander K. (2003). An HMM-based system for automatic segmentation and alignment of speech. Proceedings of Fonetik 2003, 93-96, Stockholm.

Sjölander K., Sönnebo L. and Tånnander C. (2008). Recent advancements in the Filibuster text-to-speech system. SLTC 2008.

Øverland H. (2000). Transcription Conventions for Norwegian. Technical Report. Nordisk Språkteknologi AS.
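Illustrative sketch (not part of the original paper). The language detector described in the text looks each word up in all dictionaries and tags ambiguous words according to unambiguous neighbours. The paper does not say how the neighbours are weighed, so the following is only a minimal sketch under stated assumptions: the function name `tag_languages`, the toy lexicons, the context window of two words on each side, the majority vote, and the fallback to Norwegian on ties are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Set


def tag_languages(words: List[str],
                  lexicons: Dict[str, Set[str]],
                  default: str = "no",
                  window: int = 2) -> List[str]:
    """Suggest one language tag per word (hypothetical helper).

    A word listed in exactly one lexicon is unambiguous and keeps that
    language. Any other word (listed in several lexicons or in none)
    takes the majority language of the unambiguous words within
    `window` positions; ties and empty contexts fall back to `default`.
    """
    # First pass: collect the lexicons that list each word.
    candidates = [
        {lang for lang, lexicon in lexicons.items() if word.lower() in lexicon}
        for word in words
    ]
    # Unambiguous words act as anchors for their neighbours.
    anchors: List[Optional[str]] = [
        next(iter(c)) if len(c) == 1 else None for c in candidates
    ]

    tags: List[str] = []
    for i, anchor in enumerate(anchors):
        if anchor is not None:
            tags.append(anchor)
            continue
        start, end = max(0, i - window), min(len(words), i + window + 1)
        votes = Counter(a for a in anchors[start:end] if a is not None)
        if not votes:
            tags.append(default)
            continue
        best = max(votes.values())
        winners = [lang for lang, n in votes.items() if n == best]
        tags.append(winners[0] if len(winners) == 1 else default)
    return tags


# Toy lexicons (assumed data): "for" exists in both languages.
LEXICONS = {
    "no": {"sangen", "for", "deg"},
    "en": {"for", "ever", "young"},
}

print(tag_languages(["sangen", "for", "ever", "young"], LEXICONS))
```

In this toy run the ambiguous word "for" has one Norwegian and two English anchors within its window, so it is tagged as English. A production system would presumably use the full pronunciation dictionaries and may weigh context differently.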
