User Interface Service Software Developer's Guide - Hitech Projects


January 2008

Public

3 Multi-modal Interface Services / Multi-device and Dialog Management Service

3.1 Component Overview

3.1.1 Voice Service

3.1.1.1 Implicit Speech Input

Provider

INRIA

Introduction

In the first phase of the project, the aim of this task was to design a generic architecture that helps application developers exploit users' implicit speech input. In the second phase, the objective has been to focus on one particular type of implicit speech information and to provide it to Amigo application developers. Two kinds of implicit speech information were initially considered: automatic dialog act recognition and topic recognition. After research on both aspects, INRIA decided to develop automatic topic recognition within the Amigo project, while dialog act recognition is studied mainly on paper and will not be implemented in the context of Amigo.

Automatic topic recognition is a particular kind of implicit speech interaction because it exploits the user's speech transparently, without disturbing the user. More precisely, it is implicit because the user's speech is not originally intended to communicate with the system, but rather with another human. Typically, the automatic topic recognition functionality might infer the current topic of discussion from two people talking face-to-face, or from two people talking on the phone.
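The core inference, recognizing a topic from a stream of words, can be sketched as a keyword-weighted scorer. The lexicon entries and function names below are illustrative assumptions, not the actual Amigo implementation:

```python
from collections import Counter

# Hypothetical keyword-to-topic lexicon; the real module derives its
# vocabulary from training data, so these entries are illustrative only.
TOPIC_KEYWORDS = {
    "weather": {"rain", "sunny", "forecast", "temperature"},
    "cooking": {"recipe", "oven", "ingredients", "dinner"},
}

def recognize_topic(words):
    """Score each topic by how often its keywords occur in the word
    sequence; return the best-scoring topic, or None if nothing matches."""
    counts = Counter(w.lower() for w in words)
    scores = {
        topic: sum(counts[kw] for kw in keywords)
        for topic, keywords in TOPIC_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

For example, `recognize_topic("the forecast says rain".split())` yields `"weather"`, while a sequence with no in-lexicon words yields `None`.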

One of the main requirements of topic recognition is low memory and computational cost: such an implicit system is designed to run everywhere, permanently, and for many, if not all, users, which is hardly achievable when it requires a lot of resources. This is why we quickly gave up the first option, which was to connect the output of a state-of-the-art large-vocabulary automatic speech recognition system to the input of our topic recognizer. Instead, we decided to investigate and design a lightweight spoken-keyword recognition system, dedicated to work as a pre-processor for the topic recognition module. The work on topic recognition has thus been distributed between task 4.1 "CMS topic recognition" and subtask 4.5.1 "implicit speech input" as follows:

• Task 4.1 deals with the design and development of the inference engine that recognizes the topic from a text stream or a sequence of words. It also deals with making topic recognition fully compliant with the context management system, in particular by implementing the IContextSource interface, supporting SPARQL queries and RDF descriptions, and interacting with the context ontology.

• Subtask 4.5.1 deals with developing a lightweight keyword-spotting module, which can of course be used as a standalone module, but is primarily designed to extract from the user's speech the most important keywords, which are then passed to the topic recognition module.
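The division of labour above can be sketched as a two-stage pipeline: a lightweight spotter reduces the raw word stream to a small in-vocabulary set, and only those keywords reach the topic recognizer. All names and lexicon entries here are illustrative assumptions; the real spotter operates on audio, not text:

```python
# Hypothetical lexicon of in-vocabulary keywords (illustrative only).
LEXICON = {"rain", "forecast", "recipe", "oven"}

def spot_keywords(word_stream):
    """Stand-in for the subtask 4.5.1 spotter: keep only words from the
    small lexicon, discarding everything else in the stream."""
    return [w for w in word_stream if w.lower() in LEXICON]

def topic_from_keywords(keywords):
    """Toy stand-in for the task 4.1 inference engine, run over the
    spotted keywords only."""
    weather, cooking = {"rain", "forecast"}, {"recipe", "oven"}
    w = sum(1 for k in keywords if k in weather)
    c = sum(1 for k in keywords if k in cooking)
    if w == 0 and c == 0:
        return None
    return "weather" if w >= c else "cooking"
```

For example, `spot_keywords("I think the forecast says rain tomorrow".split())` returns `["forecast", "rain"]`, which the toy recognizer maps to `"weather"`. The design point is that the expensive large-vocabulary recognizer is replaced by this cheap filter, keeping the always-on pipeline within the resource budget stated above.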

Amigo IST-2004-004182 68/114
