MAS.632 Conversational Computer Systems - MIT OpenCourseWare

More documents

Recommendations

Info

DESIGN CONSIDERATIONS mleMradive oa R~ 105 less likely to be focusing on a display. The user may be staring out the window, reading a book, or walking around the office and not notice a new message on the computer screen. Voice can also be useful even when the intended recipient is in a more restrained physical environment such as sitting in an office using a workstation or in an airplane cockpit, because it provides an alternate channel of interaction when the user's hands and eyes are busy elsewhere. If appropriate and informative words are spoken such as "System shutdown in ten minutes" or "New mail from Kaya," then users would not need to shift their gaze to the screen. 4 Spoken messages offer a performance improvement in addition to providing the ability to audibly distinguish the message content. Experimental evidence (see [Allport et al. 1972, Wickens 1981, Treisman and Davies 1973]) indicates that if a user is performing multiple tasks, performance improves if those tasks can be managed over independent input/output channels, or modalities. For example, although we cannot listen to two conversations simultaneously, subjects were able to sight read piano music while repeating what they heard with no decrease in proficiency [Allport et al. 1972]. How do such experimental results apply to voice in an office environment? Voice notification to indicate the status of a task to which the user is not paying immediate attention or for which the associated window is not visible could effectively deliver a message with minimal disruption of the visual attention a user is simultaneously applying to other tasks. Speech synthesis affords systems the capability of media translation to turn text documents into voice. Although this seems to be a trivial continuation of the earlier discussion about remote access, it should not be dismissed quite so quickly. Much of the current work in multimedia computing uses different media for distinct purposes to present disjoint, although related, information. For example, a voice narration accompanying the display of a simulation program is meant to comment on the program's graphics rather than act as a substitute output medium. Speech synthesis lets text data be converted into voice for any purpose. Some blind users make remarkable use of it. Voice output is a first step towards employing multimedia technologies to make computers accessible to a variety of users under many different circumstances. An understanding of the limitations of speech as an output channel enables us to design better voice interfaces. Speech can be used effectively for computer output, but care must be taken to avoid allowing the liabilities of the medium to overcome the usefulness of the application. The next two sections of this chapter 'Displaying the text message in additionto speaking it is necessary for several of these applications. If I am out of the office when an important notice arrives, I will miss the speech output but may look at the screen when I return.
106 VOICE COMMUNICATION WITH COMPUTERS AfIktion Apprapdie.less outline a number of design considerations and discuss some interactive techniques in this light. Most of the limitations of voice output described earlier can be summarized by the broader observation that speech takes place over a periodof time. The amount of time required to listen to a voice recording, the necessity of reciting menu items in sequence, and much of the difficulty associated with the bulky nature of voice are all a direct consequence of the temporal nature of speech. Speed is thus one of the primary factors distinguishing voice from other interface media; for example, most graphics applications can be written with the assumption that display operations such as drawing lines or text are performed nearly instantaneously. Much of the time the application is idle while the user reads text, picks a menu item, etc.; this is punctuated by sporadic user input and display update. For telephone-based interfaces, the exact opposite is true; presentation accounts for most of the total time the user interacts with the application. Voice response systems must therefore strive to minimize the amount of time required to retrieve information: they must be brief, selective about content, and interruptible. Applications must plan what to say so as to avoid utterances that are irrelevant by the time they are spoken. A real-time speech output system may have to formulate conflicting goals and choose among them if there is not suffident time to say everything, which requires estimating how long it will take to speak an utterance. The awareness that speech takes time (and acceptance that time is a commodity of which we never have enough) should permeate the design process for building any speech application. This argues in favor of shortening spoken menus and precludes many applications that may be quite feasible in text such as browsing through a newspaper. Time motivates strategies to filter an application's output requiring careful judgment as to the relevance of a specific system comment. The remainder of this section explores these design considerations in more detail. Because speech is so difficult to employ effectively, discretion is essential in determining effective voice applications and in choosing the appropriate speech output device. Poor quality speech, whether synthesized or recorded, quickly becomes annoying unless it serves a clear purpose. For example, speech output has been employed effectively as a warning medium in airplane cockpits. On the contrary, few car drivers appreciate a tinny voice reminding them to close the doors oftheir automobile. The cockpit is a busier place, and the consequences of an error may be more severe. A more effective use of speech in the car may be traffic information (drivers already tune to commuter traffic reports on their radios) or as part of a navigation system (as described later in this chapter). In retrospect, another example of a misguided speech interface is the recitation of a customer's purchases at a cash register. As each item is scanned by the bar code reader, its description and price are announced. The original motivation for
Page 1 and 2:
MIT OpenCourseWare http://ocw.mit.e
Page 3 and 4:
VNR Computer Library Advanced Topic
Page 5 and 6:
To Ava andKaya for the patienceto s
Page 7 and 8:
vi VOICE COMMUNICATION WITH COMPUTE
Page 9 and 10:
viE VOICE COMMUNICATION WITH COMPUT
Page 11 and 12:
VOICE COMMUNICATION WITH COMPUTEgS
Page 14 and 15:
Speaking of Talk As we enter the ne
Page 16 and 17:
Speaking ofTalk The short-term fix
Page 18 and 19:
Preface This book began as graduate
Page 20:
RFe[e ih This is just the beginning
Page 23 and 24:
xxii VOICE OMMUNICEAIIO WITH COMPUT
Page 25 and 26:
2 VOICE COMMUNICATION WITH COMPUTER
Page 27 and 28:
4 VOICE COMMUNICATION WITH COMPUTEI
Page 29 and 30:
6 VOICE (OMMUNICAIION WITH COMPUTER
Page 31 and 32:
S YVOICE COMMUNICATION WITH COMPUTE
Page 33 and 34:
10 VOICE COMMUHICATION WITH (OMPUIE
Page 35 and 36:
12 VOICE COMMUNICATION WITH COMPUTE
Page 37 and 38:
Page 39 and 40:
Page 41 and 42:
18 YVOII (OMMUNICATION WITH (OMPUTE
Page 43 and 44:
20 VOIC[ COMMUNICATION WITH COMPUTE
Page 45 and 46:
Page 47 and 48:
Page 49 and 50:
Page 51 and 52:
28 VOICE (OMMUNICATION WITH COMPUTE
Page 53 and 54:
30 VOICE COMMUNICATION WITH (OMPUTE
Page 55 and 56:
32 VOICE (OMMUHICATIOI WITH (OMPUTE
Page 57 and 58:
Page 59 and 60:
3 Speech Coding This chapter introd
Page 61 and 62:
38 VOICE (OMMUNICATlON WITH COMPUTE
Page 63 and 64:
40 VOICE COMMURItATION WITH (OMPUTE
Page 65 and 66:
Page 67 and 68:
Page 69 and 70:
46 VOICE (OMMUNICATION WITH COMPUTE
Page 71 and 72:
48 VOICE COMMUNICATION WITi COMPUTE
Page 73 and 74:
IT -010 !Mm1-tgq Vm-0 50 VOICE COMM
Page 75 and 76:
52 VOICE (OMMUNICAlION WITH COMPUTE
Page 77 and 78: 54 VOICE COMMUNICATION WITH COMPUTE
Page 79 and 80: 56 VOICE COMMUNI(ATION WITH COMPUTE
Page 81 and 82: 58 VOICE COMMUNICATION WITH (OMPUTE
Page 83 and 84: 4 Applications and Editing of Store
Page 99 and 100: 76 VOICE COMMUNICATION WITH (OMPUTE
Page 101 and 102: 78 YOICE COMMUNICATION WITH COMPUTE
Page 105 and 106: 5 Speech Synthesis The previous two
Page 109 and 110: 6 YVOICE (OMMUNICATION WITH [OMPUTE
Page 111 and 112: n VOICE COMMUNICATION WITH COMPUTER
Page 115 and 116: 92 VOICE (OMMUNICATION WITH COMPUTE
Page 121 and 122: 98 VOICE COMMUlICATION WITH COMPUTE
Page 123 and 124: Interactive Voice Response This cha
Page 125 and 126: 102 VOICE COMMUNICATION WITH COMPUT
Page 127: 104 VOICE COMMUNICATION WITH (OMPUT
Page 133 and 134: 110 VOICE (DMMUNICATION WITH COMPUT
Page 135 and 136: 112 VOICE COMMUNICATIOH WITH COMPUT
Page 153 and 154: 13 VOICE COMMUNICAIIOR WITH COMPUTE
Page 155 and 156: 7 Speech Recognition This chapter i
Page 161 and 162: 138 VOICE (OMMUNICATION WITH COMPUT
Page 175 and 176: 152 VOICE COMMUNICATION WITH (OMPUT
Page 177 and 178: USES OF VOICE INPUT Sole IW,1 Canna
Page 179 and 180:
156 VOICE COMMUNICATION WITH COMPUT
Page 181 and 182:
In VOICE COMMUNICATION WITH COMPUTE
Page 183 and 184:
160 VOICE [OMMUHICATION WIITH OMPUT
Page 185 and 186:
Page 187 and 188:
Page 189 and 190:
Page 191 and 192:
168 VOICE COMMUNICAlION WITH COM IU
Page 193 and 194:
Page 195 and 196:
172 VOICE (OMMURICAION WITH COMPUTE
Page 197 and 198:
Page 199 and 200:
Page 201 and 202:
Page 203 and 204:
Page 205 and 206:
182 OICE (COMMUNIICAION WITH COMPUT
Page 207 and 208:
1I4 VOICE COMMUNICATION WITH COMPUT
Page 209 and 210:
Page 211 and 212:
188 VOICE COMMUNICATION WIIH (OMPUT
Page 213 and 214:
Page 215 and 216:
192 VOIC0 COMMUNICATION WITH COMPUT
Page 217 and 218:
194 VOI(E COMMUNIATIION WITH (OMPUT
Page 219 and 220:
Page 221 and 222:
198 VOICI COMMUNICATION WITH COMPUT
Page 223 and 224:
Page 225 and 226:
202 VOICE COMMUNICATIOR WIIH COMPUT
Page 227 and 228:
204 VOICE COMMUNICAIION WIlH COMPUT
Page 229 and 230:
206 VOIC([ OMMUNICATION WITH COMPUT
Page 231 and 232:
Page 233 and 234:
10 Basics of Telephones This and th
Page 235 and 236:
212 VOICE COMMUNICATION WITH (OMPUT
Page 237 and 238:
Page 239 and 240:
216 VOICE COMMUNICATION WITH COMPUI
Page 241 and 242:
Page 243 and 244:
Page 245 and 246:
Page 247 and 248:
Page 249 and 250:
226 VOICE (COMMUNICATION WITH COMPU
Page 251 and 252:
Page 253 and 254:
11 Telephones and Computers The pre
Page 255 and 256:
Page 257 and 258:
214 VOICE COMMUHiCATION WITH COMPUI
Page 259 and 260:
236 VOICE COMMUNICATIOK WITH COMPUT
Page 261 and 262:
Page 263 and 264:
Page 265 and 266:
242 VOICE COMMUHICATIOI WITH COMPUI
Page 267 and 268:
Page 269 and 270:
246 VOICE (OMMUNICAIION WITH COMPUT
Page 271 and 272:
248 VOICE (COMMUNICATION WITH COMPU
Page 273 and 274:
Page 275 and 276:
Page 277 and 278:
254 VOICE COMMUNICAIION WITH COMPUT
Page 279 and 280:
25 VOICE (OMMUNICATIOK WITH COMPUTE
Page 281 and 282:
Page 283 and 284:
Page 285 and 286:
Page 287 and 288:
Page 289 and 290:
Page 291 and 292:
12 Desktop Audio Because of limitat
Page 293 and 294:
Page 295 and 296:
2f2 VOICE COMMUIICAIION WITH COMPUT
Page 297 and 298:
Page 299 and 300:
Page 301 and 302:
Page 303 and 304:
Page 305 and 306:
Page 307 and 308:
284 VOIC[ COMMIUWlATION WITH COMPUT
Page 309 and 310:
Page 311 and 312:
211 VOICE (OMMUNICATION WITH COMPUT
Page 313 and 314:
290 VOICE COMMIUNICATIO0 WITH COMPU
Page 315 and 316:
Page 317 and 318:
294 VOICE (OMMUNICATION WITH COMPUT
Page 319 and 320:
Page 321 and 322:
298 VOICE (OMMUNICATION WITH (COMPU
Page 323 and 324:
Page 325 and 326:
Page 328 and 329:
Bibliography Ades, S. and D. C. Swi
Page 330 and 331:
Biliograply o30 Daly, N. A. and V W
Page 332 and 333:
Bib~lgraphy 309 Klatt, D. H. "Revie
Page 334 and 335:
liblgraphy 311 G.Peterson, W. Wang,
Page 336 and 337:
Bbliography 313 Spiegel, M. F "Usin
Page 338 and 339:
Index p-law, 45, 220, 224 acoustics
Page 340 and 341:
Integrated Services Digital Network
Page 342:
vocal tract, 19-24, 23 voice mail,
show all

MAS.632 Conversational Computer Systems - MIT OpenCourseWare

Create successful ePaper yourself

Delete template?

Save as template?