Dialogue Processing in a Conversational Speech ... - CiteSeerX
Dialogue Processing in a Conversational Speech ... - CiteSeerX
Dialogue Processing in a Conversational Speech ... - CiteSeerX
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
Approaches<br />
Per cent correct<br />
RandomfromGrammar 38.6%<br />
FSM Strict Context 52.4%<br />
FSM Jump<strong>in</strong>g Context 55.2%<br />
Plan-Based DP 53.8%<br />
Table 1: Approaches to <strong>Speech</strong> Act Assignment<br />
4.2. Evaluation<br />
The results <strong>in</strong> Table 1 show the the performance of the two discourse<br />
process<strong>in</strong>g approaches, namely the plan-based approach and<br />
the f<strong>in</strong>ite state mach<strong>in</strong>e approach for the task of assign<strong>in</strong>g speech<br />
acts. The FSM processor with the cumulative error reduction mechanism<br />
is marked by FSM Jump<strong>in</strong>g Context, and the FSM without<br />
jump<strong>in</strong>g is marked by FSM Strict Context. The choice of randomly<br />
select<strong>in</strong>g a speech act from the non-context-based predictions of the<br />
grammar <strong>in</strong>dicates the performance of the system when we do not<br />
use any contextual <strong>in</strong>formation.<br />
We tested the discourse processors on ten unseen dialogues, with<br />
a total of 506 utterances. Out of the 506 utterances <strong>in</strong> the test set,<br />
we considered only 211 utterances for which the grammar returns<br />
multiple possible speech acts. We measured how well the different<br />
approaches correctly disambiguate the multiple speech acts with respect<br />
to hand-coded target speech acts.<br />
Table 1 demonstrates the effect of context <strong>in</strong> spoken discourse process<strong>in</strong>g.<br />
S<strong>in</strong>ce the test was conducted on utterances with multiple<br />
possible speech acts proposed by the non-context-based grammar<br />
component, it evaluates the effectiveness of the various contextbased<br />
approaches <strong>in</strong> disambiguat<strong>in</strong>g speech acts. All of the approaches<br />
employ<strong>in</strong>g context perform better than the non-contextbased<br />
grammar predictions. The evaluation also demonstrates that it<br />
is imperative to estimate context carefully. The FSM jump<strong>in</strong>g context<br />
approach, which attempts to reduce cumulative error, gives better<br />
performance than the the FSM strict context approach. It is even<br />
better than the more knowledge-<strong>in</strong>tensive plan-based approach. We<br />
expect that performance of plan-based approach will improve when<br />
we <strong>in</strong>troduce a solution to the cumulative error problem.<br />
5. Conclusions and Future Work<br />
In this paper, we described how our system addresses problems that<br />
arise <strong>in</strong> discourse process<strong>in</strong>g of spontaneous speech. First, we described<br />
two different robust pars<strong>in</strong>g strategies — the GLR* parser<br />
and the Phoenix parser, and the procedures that both parsers use<br />
to segment the <strong>in</strong>put <strong>in</strong>to units of coherent mean<strong>in</strong>g representation<br />
that are also of an appropriate size for discourse process<strong>in</strong>g.<br />
Then we described two approaches to discourse process<strong>in</strong>g — the<br />
plan <strong>in</strong>ference approach and the f<strong>in</strong>ite state approach. In describ<strong>in</strong>g<br />
the f<strong>in</strong>ite state approach, we presented one solution to the cumulative<br />
error problem. F<strong>in</strong>ally, we described our method of late stage<br />
disambiguation where the context-based predictions from our discourse<br />
processors are comb<strong>in</strong>ed with the non-context-based predictions<br />
from the parsers. Our future efforts will concentrate on f<strong>in</strong>d<strong>in</strong>g<br />
improved methods for comb<strong>in</strong><strong>in</strong>g different knowledge sources<br />
effectively for the disambiguation task, treat<strong>in</strong>g cumulative error <strong>in</strong><br />
the plan-based discourse processor, and improv<strong>in</strong>g the effectiveness<br />
of contextual <strong>in</strong>formation <strong>in</strong> constra<strong>in</strong><strong>in</strong>g the speech translation process.<br />
Acknowledgements<br />
The work reported <strong>in</strong> this paper was funded <strong>in</strong> part by grants from<br />
ATR - Interpret<strong>in</strong>g Telecommunications Research Laboratories of<br />
Japan, the US Department of Defense, and the Verbmobil Project of<br />
the Federal Republic of Germany.<br />
6. REFERENCES<br />
1. L. Lambert. Recogniz<strong>in</strong>g Complex Discourse Acts: A Tripartite<br />
Plan-Based Model of <strong>Dialogue</strong>. PhD thesis, Department of<br />
Computer Science, University of Delaware, 1993.<br />
2. A. Lavie and M. Tomita. GLR* - An Efficient Noise Skipp<strong>in</strong>g<br />
Pars<strong>in</strong>g Algorithm for Context Free Grammars, Proceed<strong>in</strong>gs<br />
of the third International Workshop on Pars<strong>in</strong>g Technologies<br />
(IWPT-93), Tilburg, The Netherlands, August 1993.<br />
3. A. Lavie. An Integrated Heuristic Scheme for Partial Parse<br />
Evaluation, Proceed<strong>in</strong>gs of the 32nd Annual Meet<strong>in</strong>g of the<br />
ACL (ACL-94), Las Cruces, New Mexico, June 1994.<br />
4. L. Mayfield, M. Gavaldà, Y-H. Seo, B. Suhm, W. Ward, A.<br />
Waibel. Pars<strong>in</strong>g Real Input <strong>in</strong> JANUS: a Concept-Based Approach.<br />
In Proceed<strong>in</strong>gs of TMI 95.<br />
5. Y. Qu, B. Di Eugenio, A. Lavie, L. Lev<strong>in</strong> and C. P. Rosé. M<strong>in</strong>imiz<strong>in</strong>g<br />
Cumulative Error <strong>in</strong> Discourse Context, To appear<br />
<strong>in</strong> Proceed<strong>in</strong>gs of ECAI Workshop on <strong>Dialogue</strong> <strong>Process<strong>in</strong>g</strong> <strong>in</strong><br />
Spoken Language Systems, Budapest, Hungary, August 1996.<br />
6. N. Reith<strong>in</strong>ger and E. Maier. Utiliz<strong>in</strong>g statistical dialogue act<br />
process<strong>in</strong>g <strong>in</strong> Verbmobil. In Proceed<strong>in</strong>gs of the ACL, 1995.<br />
7. C. P. Rosé, B. Di Eugenio, L. Lev<strong>in</strong> and C. Van Ess-Dykema.<br />
Discourse <strong>Process<strong>in</strong>g</strong> of dialogues with multiple threads, In<br />
Proceed<strong>in</strong>gs of the ACL, 1995<br />
8. M. Tomita. An Efficient Augmented Context-free Pars<strong>in</strong>g Algorithm.<br />
Computational L<strong>in</strong>guistics, 13(1-2):31–46, 1987.<br />
9. W. Ward. Extract<strong>in</strong>g Information <strong>in</strong> Spontaneous <strong>Speech</strong>. In<br />
Proceed<strong>in</strong>gs of International Conference on Spoken Language,<br />
1994.<br />
10. M. Woszczyna, N. Aoki-Waibel, F. D. Buo, N. Coccaro,<br />
T.Horiguchi,K.andKemp,A.Lavie,A.McNair,T.Polz<strong>in</strong>,<br />
I. Rog<strong>in</strong>a, C. P. Rosé, T. Schultz, B. Suhm, M. Tomita, and<br />
A. Waibel. JANUS-93: Towards Spontaneous <strong>Speech</strong> Translation.<br />
In Proceed<strong>in</strong>gs of IEEE International Conference on<br />
Acoustics, <strong>Speech</strong> and Signal <strong>Process<strong>in</strong>g</strong> (ICASSP’94), 1994.