Dialogue Processing in a Conversational Speech ... - CiteSeerX

Approach                Per cent correct
Random from Grammar     38.6%
FSM Strict Context      52.4%
FSM Jumping Context     55.2%
Plan-Based DP           53.8%

Table 1: Approaches to Speech Act Assignment

4.2. Evaluation

The results in Table 1 show the performance of the two discourse
processing approaches, namely the plan-based approach and the finite
state machine (FSM) approach, on the task of assigning speech acts. The
FSM processor with the cumulative error reduction mechanism is marked
FSM Jumping Context, and the FSM without jumping is marked FSM Strict
Context. Randomly selecting a speech act from the non-context-based
predictions of the grammar (Random from Grammar) indicates the
performance of the system when we do not use any contextual information.

We tested the discourse processors on ten unseen dialogues, with a total
of 506 utterances. Out of the 506 utterances in the test set, we
considered only the 211 utterances for which the grammar returns
multiple possible speech acts. We measured how accurately the different
approaches disambiguate the multiple speech acts with respect to
hand-coded target speech acts.
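
The evaluation protocol described above can be sketched as follows. This is a minimal illustration, not the evaluation code used for the reported results: the `Utterance` record, the function names, and the toy speech-act labels are all assumptions introduced here.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Utterance:
    candidates: List[str]  # speech acts proposed by the grammar (hypothetical labels)
    target: str            # hand-coded target speech act


def ambiguous_only(utterances: Sequence[Utterance]) -> List[Utterance]:
    """Keep only utterances for which the grammar returns multiple speech acts."""
    return [u for u in utterances if len(u.candidates) > 1]


def percent_correct(utterances: Sequence[Utterance],
                    choose: Callable[[Utterance], str]) -> float:
    """Percentage of ambiguous utterances for which `choose` picks the target."""
    scored = ambiguous_only(utterances)
    if not scored:
        return 0.0
    hits = sum(1 for u in scored if choose(u) == u.target)
    return 100.0 * hits / len(scored)


def random_from_grammar(rng: random.Random) -> Callable[[Utterance], str]:
    """Baseline without context: pick uniformly among the grammar's candidates."""
    return lambda u: rng.choice(u.candidates)


# Toy data, invented for illustration only.
toy = [
    Utterance(["accept", "state"], "accept"),
    Utterance(["suggest"], "suggest"),          # unambiguous, so excluded from scoring
    Utterance(["reject", "state"], "state"),
]
first_candidate_score = percent_correct(toy, lambda u: u.candidates[0])
baseline_score = percent_correct(toy, random_from_grammar(random.Random(0)))
```

Any context-based processor can be plugged in as the `choose` argument, so all approaches are scored on the same ambiguous subset.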

Table 1 demonstrates the effect of context in spoken discourse
processing. Since the test was conducted on utterances with multiple
possible speech acts proposed by the non-context-based grammar
component, it evaluates the effectiveness of the various context-based
approaches in disambiguating speech acts. All of the approaches
employing context perform better than the non-context-based grammar
predictions. The evaluation also demonstrates that it is imperative to
estimate context carefully. The FSM jumping context approach, which
attempts to reduce cumulative error, performs better than the FSM strict
context approach (55.2% versus 52.4%). It even outperforms the more
knowledge-intensive plan-based approach (53.8%). We expect that the
performance of the plan-based approach will improve when we introduce a
solution to the cumulative error problem.
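
To make the contrast between strict and jumping context concrete, here is a small sketch. The mechanism is not spelled out in this section, so this is only one hypothetical reading: under strict context the processor keeps its current FSM state when the grammar's proposals conflict with it, whereas under jumping context it re-anchors its state on the grammar's proposal. The transition table, the speech-act labels, and the `assign` function are all invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical transition table: speech act -> speech acts allowed to follow it.
TRANSITIONS: Dict[str, Set[str]] = {
    "start": {"suggest"},
    "suggest": {"accept", "reject"},
    "accept": {"closing", "suggest"},
    "reject": {"suggest"},
    "closing": set(),
}


def assign(candidates_per_utterance: List[List[str]], jumping: bool) -> List[str]:
    """Assign one speech act per utterance, using the FSM state as context."""
    state = "start"
    assigned: List[str] = []
    for candidates in candidates_per_utterance:
        allowed = TRANSITIONS.get(state, set())
        licensed = [c for c in candidates if c in allowed]
        if licensed:
            # Context agrees with the grammar; ties are broken by grammar
            # order, an arbitrary choice made for this sketch.
            choice = licensed[0]
            state = choice
        else:
            # Context conflicts with every grammar proposal.
            choice = candidates[0]
            if jumping:
                state = choice  # assumed "jump": trust the grammar, reset context
            # Strict: the old state is kept and continues to constrain later turns.
        assigned.append(choice)
    return assigned


dialogue = [["accept"], ["closing", "suggest"]]
strict_result = assign(dialogue, jumping=False)   # ['accept', 'suggest']
jumping_result = assign(dialogue, jumping=True)   # ['accept', 'closing']
```

In this toy dialogue the first utterance does not fit the expected context. The strict variant keeps predicting from the stale state and carries the mismatch into the next turn, while the jumping variant recovers, which is the kind of cumulative error the paragraph above refers to.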

5. Conclusions and Future Work

In this paper, we described how our system addresses problems that arise
in discourse processing of spontaneous speech. First, we described two
different robust parsing strategies, the GLR* parser and the Phoenix
parser, and the procedures that both parsers use to segment the input
into units of coherent meaning representation that are also of an
appropriate size for discourse processing. Then we described two
approaches to discourse processing: the plan inference approach and the
finite state approach. In describing the finite state approach, we
presented one solution to the cumulative error problem. Finally, we
described our method of late stage disambiguation, where the
context-based predictions from our discourse processors are combined
with the non-context-based predictions from the parsers. Our future
efforts will concentrate on finding improved methods for combining
different knowledge sources effectively for the disambiguation task,
treating cumulative error in the plan-based discourse processor, and
improving the effectiveness of contextual information in constraining
the speech translation process.
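
As an illustration of what combining the two kinds of predictions might look like, the sketch below scores each speech act proposed by the parser with a weighted sum of a context-based score and a non-context-based score. This is an assumed scheme for illustration only, not the combination method used in the system; the function name, the score dictionaries, and the `context_weight` parameter are all hypothetical.

```python
from typing import Dict, List, Optional


def late_stage_disambiguate(
    grammar_candidates: List[str],
    context_scores: Dict[str, float],
    grammar_scores: Optional[Dict[str, float]] = None,
    context_weight: float = 0.5,
) -> str:
    """Pick one of the parser's candidate speech acts using both knowledge sources."""
    if not grammar_candidates:
        raise ValueError("the grammar proposed no speech acts")
    grammar_scores = grammar_scores or {}

    def combined(act: str) -> float:
        # Acts that a source does not mention get a score of 0 from that source.
        return (context_weight * context_scores.get(act, 0.0)
                + (1.0 - context_weight) * grammar_scores.get(act, 0.0))

    # max() returns the first maximal element, so ties keep the grammar's order.
    return max(grammar_candidates, key=combined)


# Toy scores, invented for illustration only.
choice = late_stage_disambiguate(
    ["state", "accept"],
    context_scores={"accept": 0.75, "reject": 0.25},
    grammar_scores={"state": 0.5, "accept": 0.5},
)
```

Only acts proposed by the parser can be selected, so the context acts as a ranking over the parser's candidates rather than as an independent source of new hypotheses. Setting `context_weight` to 0 reduces the choice to the non-context-based predictions alone.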

Acknowledgements

The work reported in this paper was funded in part by grants from ATR
Interpreting Telecommunications Research Laboratories of Japan, the US
Department of Defense, and the Verbmobil Project of the Federal Republic
of Germany.

