
Statistical Language Models based on Neural Networks - Faculty of ...


of the task, as often the easiest way to obtain good results is to choose crude but fast techniques and train models on as much data as is available. This strategy, however, does not get us any closer to solving the underlying problems; it merely postpones them for as long as possible. For many tasks today, the amount of available training data is already so large that further progress from simply adding more data is unlikely.

Another reason why advanced techniques are not used in practice is the significance of the achieved results: it is commonly known that most published papers report only negligible improvements over basic baselines. Even the best techniques rarely reduce the word error rate of speech recognition systems by more than 10% relative, a difference that is hardly observable from the user's perspective. However, even a small difference can be decisive in the long term: competitions are often won by a slight margin, and even if the improvements are small and barely noticeable, the majority of users will, over time, tend to prefer the best system.

While I see the integration of neural network language models into production systems as the next step for language modeling research, there is still much to do in basic research. Given that the history of language modeling research has often been rather chaotic, it might be fruitful to first define a roadmap. While a detailed proposal for future research is out of the scope of this work, the main points are:

• The involved models should be computationally much less restricted than the traditional ones; it should be clear that a compact solution to simple problems can exist in the model space

• Progress should be measured on increasingly complex tasks (for example, finding the most likely word in an incomplete sentence, as in [83])

• The tasks and the training data should be coherent and publicly available
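To make the second point concrete, the sentence-completion task could be sketched, for illustration, with a toy smoothed bigram model that scores candidate words for a blank. The corpus, candidate set, and add-alpha smoothing below are assumptions chosen for a minimal self-contained example; they are not the setup used in [83]:

```python
# Minimal sketch of sentence completion: pick the candidate word that a
# toy bigram model considers most likely given its left and right neighbours.
# Corpus and candidates are illustrative assumptions, not from [83].
from collections import Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def bigram_prob(prev, word, vocab_size, alpha=1.0):
    # Add-alpha smoothed estimate of P(word | prev)
    return (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab_size)

def complete(left, right, candidates):
    # Score each candidate by the two bigrams it forms with its neighbours:
    # P(candidate | left) * P(right | candidate)
    v = len(unigrams)
    scores = {c: bigram_prob(left, c, v) * bigram_prob(c, right, v)
              for c in candidates}
    return max(scores, key=scores.get)

print(complete("the", "sat", ["cat", "mat", "on"]))  # → cat
```

A neural network language model would replace `bigram_prob` with a learned conditional distribution, but the evaluation protocol (rank candidates for a blank, count correct choices) stays the same.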

While such research would not be competitive with common techniques in the short term, it is certain that progress beyond models such as finite state machines is needed. It has been popular to claim that we need orders of magnitude more powerful computers, and much more training data, to make progress towards AI; I find this doubtful. In my opinion, what needs to be addressed is the capability of machine learning techniques to efficiently discover new patterns.

