User Interface Service Software Developer's Guide
January 2008<br />
Public<br />
3 Multi-modal <strong>Interface</strong> <strong>Service</strong>s / Multi-device<br />
and Dialog Management <strong>Service</strong><br />
3.1 Component Overview<br />
3.1.1 Voice <strong>Service</strong><br />
3.1.1.1 Implicit Speech Input<br />
Provider<br />
INRIA<br />
Introduction<br />
In the first part of the project, the aim of this task was to design a generic architecture to help<br />
application developers exploit users’ implicit speech input. In the second phase of the<br />
project, the objective has been to focus on one particular type of implicit speech<br />
information and to provide it to Amigo application developers. Two kinds of implicit speech<br />
information were initially considered: automatic dialog act recognition and topic recognition.<br />
After some research work on both aspects, INRIA decided to develop automatic topic<br />
recognition in the Amigo project, while dialog act recognition will be studied mainly on<br />
paper and will not be implemented in the context of Amigo.<br />
Automatic topic recognition is a particular kind of implicit speech interaction, because it<br />
exploits the user’s speech transparently, without disturbing the user. More precisely, it is<br />
implicit because the user’s speech is not originally intended to communicate with the system,<br />
but rather to communicate with another human. Typically, the automatic topic recognition<br />
functionality might infer the current topic of discussion from two people talking face-to-face,<br />
or from two people talking on the phone.<br />
One of the main requirements of topic recognition is a low memory and computational<br />
footprint: such an implicit system is designed to run everywhere, permanently, and<br />
for many, if not all, users, which is hardly achievable when it requires a lot of resources. This<br />
is why we quickly gave up the first option, which was to connect the output of a state-of-the-art<br />
large-vocabulary automatic speech recognition system to the input of our topic<br />
recognizer. Instead, we decided to investigate and design a lightweight spoken-keyword<br />
recognition system, dedicated to working as a pre-processor to the topic<br />
recognition module. The efforts concerning the topic recognition module have thus been<br />
distributed between task 4.1 “CMS topic recognition” and subtask 4.5.1 “implicit speech input”<br />
as follows:<br />
• Task 4.1 deals with the design and development of the inference engine that<br />
recognizes the topic from a text stream or a sequence of words. It also deals with making<br />
the topic recognition fully compliant with the context management system, in particular<br />
implementing the IContextSource interface, supporting SPARQL queries and RDF<br />
descriptions, and interacting with the context ontology.<br />
• Subtask 4.5.1 deals with developing a lightweight keyword-spotting module, which can<br />
of course be used as a standalone module, but which is primarily designed to extract the<br />
most important keywords from the user’s speech, which can then be passed to the topic<br />
recognition module.<br />
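The division of labour between the two tasks can be sketched as a simple two-stage pipeline: a keyword spotter filters the word stream, and a topic recognizer scores topics from the surviving keywords. The sketch below is purely illustrative and makes heavy assumptions: the keyword lexicon, the topic names, and the function names are invented for this example, and the real Task 4.1 inference engine is not simple keyword counting.

```python
# Illustrative sketch of the two-stage pipeline: keyword spotting
# (cf. Subtask 4.5.1) feeding topic recognition (cf. Task 4.1).
# The lexicon and topic names below are hypothetical, not Amigo's.

from collections import Counter

# Assumed keyword lexicon: each topic lists words that hint at it.
TOPIC_KEYWORDS = {
    "cooking": {"recipe", "oven", "ingredients", "dinner"},
    "travel": {"flight", "hotel", "ticket", "airport"},
}

def spot_keywords(word_stream):
    """Keep only the words that belong to some topic's keyword set."""
    lexicon = set().union(*TOPIC_KEYWORDS.values())
    return [w for w in word_stream if w in lexicon]

def recognize_topic(keywords):
    """Score each topic by its number of matching keywords and return
    the best-scoring topic, or None when no keyword matched."""
    counts = Counter()
    for word in keywords:
        for topic, words in TOPIC_KEYWORDS.items():
            if word in words:
                counts[topic] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Simulated recognizer output from an overheard conversation.
words = "shall we book a flight and a hotel near the airport".split()
print(recognize_topic(spot_keywords(words)))  # travel
```

Running the spotter first keeps the expensive stage small: the topic recognizer only ever sees the handful of lexicon words, which matches the low-resource requirement stated above.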
Amigo IST-2004-004182 68/114