Automotive User Interfaces and Interactive Vehicular Applications

More documents

Recommendations

Info

structure which is improbable given its context would cause increased processing difficulty. Surprisal can be calculated at various linguistic levels, such as word n-grams, syntactic structures 2 , or semantic representations. Another factor that has been shown to affect human processing difficulty is the accessibility of previous material when integrating it with new material, as captured by dependency locality theory (DLT; Gibson, 2000). DLT measures the distance between grammatically dependent words in terms of intervening discourse-mediating items. Long distance dependencies thereby cause increased processing difficulty, in particular if people need to keep track of multiple open dependencies at the same time. Demberg and Keller (2009) have recently developed a model of sentence processing called“Prediction Theory”which combines the expectation-based with the integration-based aspect of processing difficulty in a single theory. 3.2 Extensions to existing models While computational models that assess processing difficulty related to syntax are available and have been tested (Demberg and Keller, 2008; Roark et al., 2009; Wu et al., 2010; Frank, 2010), there is significantly less work on how to integrate the difficulty measures with semantics (Mitchell et al., 2010), and, to the best of our knowledge, no psycholinguistic model that also includes the effect of discourse referents (like “even”, “although”, “therefore”). Evidence for the impact of such discourse referents on the effectiveness of human interaction with dialogue systems comes from a recent experiment by Winterboer et al. (2011), who found that the presence of discourse cues in dialogue system utterances helps users compare different options and select the optimal option (the setting was again interaction with a flight booking system). Another challenge thus lies in the development of a theory and model of discourse level processing difficulty. 4. COMPLEXITY MANAGEMENT 4.1 System design The spoken dialogue interface is guided by the linguistic complexity metric (figure 1), some possibilities for which we described in section 3; it seeks to choose responses from its language generation system based on the magnitude of the metric. In turn, it evaluates the quality of user responses, which becomes a parameter of the metric. The quality of user responses is measured based on two overarching factors: speech fluency and task performance. Similarly, driving behaviour and attentional requirements are also evaluated and factored into the complexity metric. A key insight is that only the spoken dialogue system can 2 Here is the canonical example of a difficult-to-process syntactic structure in language science, by way of illustration: “The horse raced past the barn fell.” Most native Englishspeakers find this sentence ungrammatical at first glance, because “the horse raced past the barn” is a complete sentence. However, it is also a reduced relative clause, permitting “the horse (that was) raced past the barn fell” after subsequent consideration. “Fell” is a syntactically high-surprisal item. This is an extreme “boundary” case, but linguists have identified more subtle phenomena with similar characteristics. be managed in order that it yield the appropriate amount of cognitive capacity to safe driving. Initial parameters are set by data collected through experimentation. This last idea is crucial to the operation of the system. Most current applications of language technology are guided by machine learning algorithms based on text and speech corpora. We describe the conditions under which these must be collected in the following section. 4.2 Data collection and experimentation In order to identify the situations and estimate the parameters under which linguistic complexity affects task performance, we need to construct a parallel corpus of recorded conversation and driving performance information. Then we can correlate the values of the complexity measures and see which ones are most predictive of either driving impairment or difficulty in performing the auxiliary speech task. As graphics technology has progressed, there are increasingly realistic and accessible driving simulator apparatus. It is possible to use these tools to measure performance on driving in terms of deviation from an ideal line in a course. In fact, the major challenge in data collection is not the recording of driver activity, but the preparation of the linguistic data for statistical analysis. One such effort is Fors and Villing (2011); they describe a Swedish-language inautomotive data collection exercise for human-human dialogue. Their collection and analysis of Swedish conversation found that pause duration during conversations increased for the driver during periods of apparent high cognitive load, measured by the driver’s reaction time to a buzzer. Apart from the time and cost of transcribing the speech into text (something which automatic speech recognition may partially solve particularly for more widespread languages like English), Fors and Villing note that identifying the pause boundaries is very time consuming. Recent developments in data annotation such as crowdsourcing, however, now enable us to use unskilled Internet labour to annotate a variety of subtle distinctions (Sayeed et al., 2011); it is a matter of user interface design to present the task in such a way that errors and other features can be annotated by people who have no training in the task. Once we have driving performance data and speech task performance data aligned over time for a sufficient number of subjects, it is then possible to explore the correlations between quantitative linguistic complexity and driving attention in context. This will allow us to set the parameters of the system in synthesising appropriate prompts. 5. CONCLUDING REMARKS In this paper, we have brought together a number of elements that belong to a research agenda in in-vehicle spoken dialogue systems, and we have done so in a manner that makes psychologically plausible models of sentence processing relevant to automotive and other dialogue-based control and communication systems. A key component of this is attention and the effect of linguistic cognitive load. Multimodal tasks that involve communicating with an automated system while driving need to modulate the linguistic complexity of the interface with reference to current load and the need to retain a safe level attention on driving.
It happens that there is a large literature on attention and comprehension in the synthetic speech context and in the dialogue context. Different levels of linguistic representation affect or appear to be affected by tasks that demand attention; in particular, syntactic and semantic factors appear to have an effect on human performance in tasks that involve interaction with both synthetic and natural speech. There is also a large literature on computing psycholinguistic complexity and different levels of linguistic representation; divided-attention tasks such as driver multitasking permit this work to be leveraged in an applied context. Given this scientific background, we have proposed an overall system design for in-vehicle spoken dialogue complexity management. We have also identified some of the design and data collection challenges that need to be overcome in pushing this agenda forward. As yet there is no publicly available corpus that relates transcribed speech data to user task performance and driving performance. However, the technical challenges to this, though significant, are not insurmountable. The final step is the design and construction of variable-complexity in-car dialogue systems. References Delogu, C., Conte, S., and Sementina, C. (1998). Cognitive factors in the evaluation of synthetic speech. Speech Communication, 24(2):153–168. Demberg, V. and Keller, F. (2008). Data from eye-tracking corpora as evidence for theories of syntactic processing complexity. Cognition, 109:193–210. Demberg, V. and Keller, F. (2009). A computational model of prediction in human parsing: Unifying locality and surprisal effects. In Proceedings of the 29th meeting of the Cognitive Science Society, Amsterdam. Demberg, V., Winterboer, A., and Moore, J. (2011). A strategy for information presentation in spoken dialog systems. Computational Linguistics, (37:3):489–539. Fang, R., Chai, J. Y., and Ferreira, F. (2009). Between linguistic attention and gaze fixations inmultimodal conversational interfaces. In International Conference on Multimodal Interfaces, pages 143–150. Fors, K. L. and Villing, J. (2011). Reducing cognitive load in in-vehicle dialogue system interaction. In Semdial 2011: Proceedings of the 15th workshop on the semantics and pragmatics of dialogue. Frank, A. and Jaeger, T. F. (2008). Speaking rationally: uniform information density as an optimal strategy for language production. In The 30th annual meeting of the Cognitive Science Society, pages 939—944. Frank, S. (2010). Uncertainty reduction as a measure of cognitive processing effort. In Proceedings of the 2010 Workshop on Cognitive Modeling and Computational Linguistics, pages 81–89, Uppsala, Sweden. Gibson, E. (2000). Dependency locality theory: A distancedased theory of linguistic complexity. In Marantz, A., Miyashita, Y., and O’Neil, W., editors, Image, Language, Brain: Papers from the First Mind Articulation Project Symposium, pages 95–126. MIT Press, Cambridge, MA. Grosz, B. J., Weinstein, S., and Joshi, A. K. (1995). Centering: A framework for modeling the local coherence of discourse. Computational Linguistics, 21:203–225. Hale, J. (2001). A probabilistic Earley parser as a psycholinguistic model. In Proceedings of the 2nd Conference of the North American Chapter of the Association for Computational Linguistics, volume 2, pages 159–166, Pittsburgh, PA. Hu, J., Winterboer, A., Nass, C., Moore, J., and Illowsky, R. (2007). Context & usability testing: user-modeled information presentation in easy and difficult driving conditions. In Proceedings of the SIGCHI conference on Human factors in computing systems, pages 1343–1346. ACM. Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106(3):1126–1177. Mitchell, J., Lapata, M., Demberg, V., and Keller, F. (2010). Syntactic and semantic factors in processing difficulty: An integrated measure. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 196–206, Uppsala, Sweden. Roark, B., Bachrach, A., Cardenas, C., and Pallier, C. (2009). Deriving lexical and syntactic expectation-based measures for psycholinguistic modeling via incremental top-down parsing. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 324–333, Singapore. Sayeed, A. B., Rusk, B., Petrov, M., Nguyen, H. C., Meyer, T. J., and Weinberg, A. (2011). Crowdsourcing syntactic relatedness judgements for opinion mining in the study of information technology adoption. In Proceedings of the ACL 2011 workshop on Language Technology for Cultural Heritage, Social Sciences, and the Humanities (LaTeCH). Swift, M. D., Campana, E., Allen, J. F., and Tanenhaus, M. K. (2002). Monitoring eye movements as an evaluation of synthesized speech. In Proceedings of the IEEE 2002 Workshop on Speech Synthesis. Taube-Schiff, M. and Segalowitz, N. (2005). Linguistic attention control: attention shifting governed by grammaticized elements of language. Journal of experimental psychology Learning memory and cognition, 31(3):508–519. Walker, M. A., Passonneau, R., and Boland, J. E. (2001). Quantitative and qualitative evaluation of darpa communicator spoken dialogue systems. In Proceedings of 39th Annual Meeting of the Association for Computational Linguistics, pages 515–522, Toulouse. Winterboer, A. K., Tietze, M. I., Wolters, M. K., and Moore, J. D. (2011). The user model-based summarize and refine approach improves information presentation in spoken dialog systems. Computer Speech & Language, 25(2):175 – 191. Wu, S., Bachrach, A., Cardenas, C., and Schuler, W. (2010). Complexity metrics in an incremental right-corner parser. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 1189–1198, Uppsala.
Page 1 and 2:
AutomotiveUI2011 3rd International
Page 3 and 4:
AutomotiveUI 2011 Third Internation
Page 5 and 6:
Contents Preface 1 Welcome Note ...
Page 7 and 8:
SpeeT - A Multimodal Interaction St
Page 9 and 10:
Preface - 1 -
Page 11 and 12:
Welcome Note from the Poster/Demo C
Page 13 and 14:
Part I. Poster Abstracts - 5 -
Page 15 and 16:
The use of in vehicle data recorder
Page 17 and 18:
Information Extraction from the Wor
Page 19 and 20:
IVAT (In-Vehicle Assistive Technolo
Page 21 and 22:
ENGIN (Exploring Next Generation IN
Page 23 and 24:
The HMISL language for modeling HMI
Page 25 and 26:
Designing a Spatial Haptic Cue-base
Page 27 and 28:
Investigating the Usage of Multifun
Page 29 and 30:
Tobias Islinger Regensburg Universi
Page 31 and 32:
Gas Station Creative Probing: The F
Page 33 and 34:
A New Icon Selection Test for the A
Page 35 and 36:
The Process of Creating an IVAT Plu
Page 37 and 38:
Personas for On-Board Diagnostics -
Page 39 and 40:
On Detecting Distracted Driving Usi
Page 41 and 42:
ABSTRACT Natural and Intuitive Hand
Page 43 and 44:
ABSTRACT 3D Theremin: A Novel Appro
Page 45 and 46:
Open Car User Experience Lab: A Pra
Page 47 and 48:
Situation-adaptive driver assistanc
Page 49 and 50:
Determination of Mobility Context u
Page 51 and 52:
Addressing Road Rage: The Effect of
Page 53 and 54:
ABSTRACT New Apps, New Challenges:
Page 55 and 56:
The Ethnographic (U)Turn: Local Exp
Page 57 and 58:
(C)archeology -- Car Turns Outs & A
Page 59 and 60:
An Approach to User-Focused Develop
Page 61 and 62:
Part II. Interactive Demos - 53 -
Page 63 and 64:
ABSTRACT The ROADSAFE Toolkit: Rapi
Page 65 and 66:
SpeeT: A Multimodal Interaction Sty
Page 67 and 68:
Dialogs Are Dead, Long Live Dialogs
Page 69 and 70:
Part III. Posters PDF’s - 61 -
Page 71 and 72:
THE USE OF IN VEHICLE DATA RECORDER
Page 74 and 75:
ENGIN (Exploring Next Generation IN
Page 76 and 77:
Poster PDF missing...
Page 78 and 79: Driver distraction analysis based o
Page 80 and 81: 3. 5. German Research Center for Ar
Page 83 and 84: PERSONAS FOR ON-BOARD DIAGNOSTICS U
Page 85 and 86: Natural, Intuitive Hand Gestures -
Page 87 and 88: Poster PDF missing...
Page 89 and 90: Motivation • The mobility context
Page 91 and 92: ��
Page 93 and 94: The Ethnographic (U)Turn: Local Exp
Page 95 and 96: AutomotiveUI 2011 Third Internation
Page 97 and 98: Figure 1. All Music and filtered vi
Page 99 and 100: Using Controller Widgets in 3D-GUI
Page 101 and 102: 2.1 Modeling using Conventional GUI
Page 103 and 104: 5.1 Seamlessly Integrated Design Wo
Page 105 and 106: Libraries containing multiple Contr
Page 107 and 108: Skinning as a method for differenti
Page 109 and 110: 4. STATE OF INDUSTRIAL PRACTICE Spe
Page 111 and 112: In order to keep the necessary effo
Page 115 and 116: ABSTRACT Workshop “Subliminal Per
Page 117 and 118: Message from the Workshop Organizer
Page 119 and 120: For up to date workshop information
Page 121 and 122: The effects of visual-manual tasks
Page 123 and 124: 2.1.6.3 Table Task During the table
Page 125 and 126: Moreover, participants showed signi
Page 127: eferred to in speech-synthesised in
Page 131 and 132: 3. SYSTEM EVALUATION We conducted a
Page 133 and 134: incorporate natural breakpoints wil
Page 135 and 136: duration [2]. Only a limited number
Page 137 and 138: situation creates CL, which is incr
Page 139 and 140: tracking is a viable way to measure
Page 141 and 142: esponsible, including differences i
Page 143 and 144: cognitive workload. The lateral pos
Page 145 and 146: Due to the fact that skin conductan
Page 147 and 148: However, all of the above mentioned
Page 149 and 150: We also wanted to know if there wer
Page 153 and 154: and more of a locked computer syste
Page 155 and 156: using the whole windscreen as the p
Page 157 and 158: keys to press on a visual display.
Page 159 and 160: Interaction in the Car, in Proceedi
Page 161 and 162: Defining explicit communication ges
Page 163: REMOTE TACTILE FEEDBACK The spatial
Page 166 and 167: Mobile Device Integration and Inter
Page 168 and 169: choice. For the output of multimedi
Page 172 and 173: The selection of any petrol station
Page 174 and 175: visibility of available information
Page 176 and 177: Position Paper - Integrating Mobile
Page 178 and 179:
Gartner, Smartphone sells have grow
Page 180 and 181:
Unmuting Smart Phones in Cars: gett
Page 182:
addition of new smart devices and s
show all

Automotive User Interfaces and Interactive Vehicular Applications

Create successful ePaper yourself

Delete template?

Save as template?