Statistical Language Models based on Neural Networks - Faculty of ...
Chapter 5

Wall Street Journal Experiments
Another important data set frequently used by the speech recognition community for research purposes is the Wall Street Journal speech recognition task. In the following experiments, we aim to:

• show the full potential of RNN LMs on a moderately sized task, where speech recognition errors are caused mainly by the language model (as opposed to acoustically noisy tasks, where it would be more important to work on the acoustic models)

• show the performance of RNN LMs with an increasing amount of training data

• provide a comparison with other advanced language modeling techniques in terms of word error rate

• describe experiments with the open-source speech recognition toolkit Kaldi that can be reproduced
5.1 WSJ-JHU Setup Description
The experiments in this section were performed with a data set that was kindly shared with us by researchers from Johns Hopkins University. We report results after rescoring 100-best lists from the DARPA WSJ’92 and WSJ’93 data sets; the same data sets were used by Xu [79], Filimonov [23], and in my previous work [49]. The oracle WER of the 100-best lists is 6.1% for the development set and 9.5% for the evaluation set. The training data for the language model are the same as those used by Xu [79]. The training corpus consists of 37M words from the NYT section of English Gigaword. The hyper-parameters for all RNN models
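The rescoring procedure used throughout this chapter can be sketched as follows. This is a minimal illustrative sketch, not code from the thesis: the function and field names (`rescore_nbest`, `acoustic_logp`, etc.), the LM scale factor, and the interpolation weight are all assumptions. Each hypothesis in an n-best list carries an acoustic score and language model scores; here an RNN LM is linearly interpolated with an n-gram LM in the probability domain, and the list is re-ranked by the combined score.

```python
import math

def rescore_nbest(hypotheses, lm_weight=10.0, interp=0.5):
    """Return the best hypothesis after LM rescoring (illustrative sketch).

    hypotheses: list of dicts with keys 'text', 'acoustic_logp',
                'rnn_logp', 'ngram_logp' (all natural-log probabilities).
    lm_weight:  language model scale factor (a tunable assumption).
    interp:     weight of the RNN LM in the linear interpolation.
    """
    def combined_score(h):
        # Interpolate the two LMs in the probability domain,
        # then return to the log domain and apply the LM scale.
        lm_logp = math.log(
            interp * math.exp(h["rnn_logp"])
            + (1.0 - interp) * math.exp(h["ngram_logp"])
        )
        return h["acoustic_logp"] + lm_weight * lm_logp

    return max(hypotheses, key=combined_score)

# Toy 2-best list (made-up numbers): the second hypothesis has a slightly
# better acoustic score, but both LMs strongly prefer the first one.
nbest = [
    {"text": "wall street journal", "acoustic_logp": -100.0,
     "rnn_logp": -8.0, "ngram_logp": -9.0},
    {"text": "wall street churning", "acoustic_logp": -99.0,
     "rnn_logp": -15.0, "ngram_logp": -14.0},
]
best = rescore_nbest(nbest)
```

With these toy numbers the interpolated LM term dominates, so the first hypothesis wins after rescoring despite its worse acoustic score; tuning `lm_weight` and `interp` on the development set is the usual practice.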