
Table 7.4: Accuracy of different language modeling techniques on the Microsoft Sentence Completion Challenge task. Human performance is 91% accuracy [83].

    Model        Perplexity   Accuracy [%]
    random            -           20.0
    GT3              92           36.0
    GT5              87           38.4
    KN5              84           40.0
    RNNME-50         68           43.5
    RNNME-100        65           45.1
    RNNME-200        63           47.7
    RNNME-300        60           49.3

The motivation for this task is similar to the motivation for the usual evaluation of language models by perplexity, with the difference that the sentence completion challenge focuses on the informative words that occur infrequently. Results obtained with various n-gram models and RNNME models are summarized in Table 7.4. The models were trained on about 50M tokens using a 200K-word vocabulary; a link to the training data is provided in [83].
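The evaluation itself is straightforward: each test item is a sentence with one missing word and several candidate words, and the model selects the candidate that yields the most probable complete sentence. The following is a minimal sketch of this scoring loop, not the implementation used in the experiments; sentence_log_prob is a hypothetical stand-in for whatever language model (n-gram or RNNME) is being evaluated.

```python
# Minimal sketch of sentence-completion scoring; the interfaces here are
# hypothetical, not the implementation used for Table 7.4.

def choose_completion(sentence_template, candidates, sentence_log_prob):
    """Return the candidate word whose completed sentence scores highest
    under the language model represented by sentence_log_prob(tokens)."""
    best_word, best_score = None, float("-inf")
    for word in candidates:
        # Substitute the candidate into the blank position of the template.
        tokens = [word if t == "___" else t for t in sentence_template]
        score = sentence_log_prob(tokens)
        if score > best_score:
            best_word, best_score = word, score
    return best_word

if __name__ == "__main__":
    # Toy usage: a uniform model scores every candidate equally, which
    # corresponds to the 20% "random" accuracy row in Table 7.4.
    template = "I have seen it with my own ___ .".split()
    candidates = ["eyes", "ears", "hands", "shoes", "dreams"]
    uniform_model = lambda tokens: 0.0
    print(choose_completion(template, candidates, uniform_model))
```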

RNNME language models perform much better than the usual n-gram models on this task: their ability to represent longer context patterns is clearly useful. While the n-gram models are about 20% (absolute) more accurate than the random baseline, the largest RNNME model is almost 30% more accurate. Still, the performance is far from the human performance of 91% accuracy.

We can also think of models that focus more on the task itself: the basic objective function of usual language models is to minimize the entropy of the training data, while in the case of the sentence completion challenge, we are more interested in capturing patterns between infrequent words. A simple task-specific modification is to train the models on data from which the frequent words have been discarded. This reduces the number of n-gram parameters needed to capture regularities between infrequent words, as the infrequent words end up closer to each other in the filtered data. In the following experiments, the 200 most frequent words were discarded from both the training and test data; a sketch of this filtering step is given after this paragraph.
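A minimal sketch of this preprocessing step is shown below, assuming plain tokenized text; the function names and the representation of the data as lists of word strings are illustrative assumptions, not the original tooling.

```python
from collections import Counter

def build_frequent_word_list(train_tokens, n=200):
    """Return the set of the n most frequent word types in the training data."""
    counts = Counter(train_tokens)
    return {word for word, _ in counts.most_common(n)}

def discard_frequent(tokens, frequent_words):
    """Remove every occurrence of the frequent words from a token stream."""
    return [t for t in tokens if t not in frequent_words]

# Usage sketch: the same stop-list is applied to both training and test data,
# so that the n-gram and RNNME models only ever see the infrequent words.
# train_tokens and test_tokens are assumed to be lists of word strings.
#
#   frequent = build_frequent_word_list(train_tokens, n=200)
#   filtered_train = discard_frequent(train_tokens, frequent)
#   filtered_test = discard_frequent(test_tokens, frequent)
```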

It can be observed that n-gram models trained on such modified training data give much better accuracy. However, as a lot of possibly important information is discarded, the RNNME models no longer have the possibility to significantly outperform the n-gram models.

