04.02.2013 Views

MAS.632 Conversational Computer Systems - MIT OpenCourseWare



172 VOICE COMMUNICATION WITH COMPUTERS

hardware, suitable software-based recognition requiring no additional hardware is already becoming commercially available.⁵

Xspeak Functionality

Xspeak provided for the use of voice for the first two mouse functions previously mentioned: focus management and navigation among windows that may be obscured by other windows. Xspeak worked with real-estate-driven window managers supporting overlapped windows. Each window had a name; speaking its name exposed the window. Xspeak then moved the cursor into the window so the window manager assigned it the keyboard focus. Thus the user could move among applications sending keystrokes to each in turn without removing his or her hands from the keyboard. Additional commands allowed the user to raise or lower the window in which the cursor currently appeared.
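The behavior described above can be illustrated with a minimal Python sketch. It is a toy model, not Xspeak's code: the `Window`, `Desktop`, and `handle_word` names are hypothetical, and focus-follows-pointer is simulated by asking which topmost window contains the pointer.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    name: str
    x: int
    y: int
    w: int
    h: int

    def contains(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


@dataclass
class Desktop:
    stack: list = field(default_factory=list)  # ordered bottom ... top
    pointer: tuple = (0, 0)

    def focused(self):
        # Real-estate-driven focus: the topmost window under the pointer.
        for win in reversed(self.stack):
            if win.contains(*self.pointer):
                return win
        return None

    def raise_window(self, win):
        self.stack.remove(win)
        self.stack.append(win)

    def lower_window(self, win):
        self.stack.remove(win)
        self.stack.insert(0, win)


def handle_word(desktop, word):
    """Apply one recognized word; return the affected window, or None."""
    if word in ("raise", "lower"):
        win = desktop.focused()
        if win is None:
            return None
        if word == "raise":
            desktop.raise_window(win)
        else:
            desktop.lower_window(win)
        return win
    for win in desktop.stack:
        if win.name == word:
            # Expose the named window, then move the pointer to its center
            # so that it receives the keyboard focus.
            desktop.raise_window(win)
            desktop.pointer = (win.x + win.w // 2, win.y + win.h // 2)
            return win
    return None
```

In this model, speaking a window's name both exposes it and gives it the focus, while "raise" and "lower" act on whichever window currently holds the pointer; lowering the focused window can therefore hand the focus to a window that was underneath it.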

Xspeak employed a small window to provide both textual feedback to the user and recognition control capabilities (see Figure 8.3). As each word was recognized, it was displayed in this window. In ordinary use this window was ignored as the window reconfiguration provided adequate feedback, but the display panel was useful when recognition results were particularly poor and the user needed to confirm operation of the recognizer. Through the control panel the user could name a new window; this involved clicking on the window and speaking a name that was then trained into a recognition template. The control panel also provided a software "switch" to temporarily disable recognition when the user wished to speak with a visitor or on the telephone. Another button brought up a configuration window, which allowed the user to retrain the recognizer's vocabulary, configure the audio gain of the speech recognizer, and transfer the recognition templates to disk.
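The control-panel functions can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the text does not describe the recognizer's features or matching method, so the hypothetical `TemplateStore` stores one plain feature vector per name and picks the nearest template by Euclidean distance, with no rejection threshold.

```python
import json
import math


class TemplateStore:
    """Toy stand-in for a trained vocabulary; not the real recognizer."""

    def __init__(self):
        self.templates = {}  # window name -> feature vector (list of floats)
        self.enabled = True  # the software "switch"

    def train(self, name, features):
        # Naming a window: store (or retrain) the template for its name.
        self.templates[name] = list(features)

    def recognize(self, features):
        # While the switch is off, speech is ignored entirely.
        if not self.enabled or not self.templates:
            return None
        return min(
            self.templates,
            key=lambda name: math.dist(self.templates[name], features),
        )

    def dumps(self):
        # Serialized form that could be written to a file on disk.
        return json.dumps(self.templates)

    @classmethod
    def loads(cls, text):
        store = cls()
        store.templates = json.loads(text)
        return store
```

Setting `enabled` to `False` models the switch used while talking to a visitor: the templates are left untouched, so recognition resumes immediately when it is turned back on.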

Xspeak was an application, not a window manager (see Figure 8.4). It worked by capturing recognition results and sending X protocol requests to the server. As with any client application, these requests could be redirected through the window manager, and the window manager received notification of the changes in window configuration or cursor location. Because Xspeak did not replace the window manager, it could be used across a wide range of workstation window system environments.
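The division of labor between client, server, and window manager can be modeled with a short Python sketch. This is a simulation under stated assumptions, not the X protocol: `Server`, `WindowManager`, and `VoiceClient` are hypothetical classes, windows are plain names, and only stacking requests are modeled.

```python
class Server:
    """Toy display server holding the window stacking order."""

    def __init__(self, windows):
        self.stack = list(windows)  # window names, bottom ... top
        self.manager = None         # client that asked for redirection
        self.notifications = []     # changes reported to the manager

    def request(self, client, op, window):
        # Stacking requests from ordinary clients are handed to the
        # window manager, if one is running, instead of being applied.
        if self.manager is not None and client is not self.manager:
            self.manager.redirected(self, op, window)
        else:
            self.apply(op, window)

    def apply(self, op, window):
        self.stack.remove(window)
        if op == "raise":
            self.stack.append(window)
        else:
            self.stack.insert(0, window)
        if self.manager is not None:
            self.notifications.append((op, window))


class WindowManager:
    def redirected(self, server, op, window):
        # A real manager could adjust or refuse; this one simply honors it.
        server.request(self, op, window)


class VoiceClient:
    """Plays the role of the speech application: an ordinary client."""

    def __init__(self, server):
        self.server = server

    def heard(self, word):
        if word in self.server.stack:
            self.server.request(self, "raise", word)
```

Because the voice client only issues ordinary requests, the same code works whether or not a manager is attached, which mirrors the portability argument made in the text.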

Why Speech?<br />

Xspeak was motivated by the belief that window system navigation is a task suitable to voice and well matched to the capabilities of current speech input devices. Using speech recognition to replace the workstation keyboard would be very difficult; it would require large vocabulary speech recognition devices capable of dealing with the syntaxes of both English text and program text and oper-

⁵ Xspeak has recently been revived using all-software recognition. This greatly facilitates deployment of the application.
