Discourse markers in dialogue: relevance-theoretic analysis and ...
Discourse markers in dialogue: relevance-theoretic analysis and corpus-based validation
Sandrine Zufferey
sandrine.zufferey@eti.unige.ch
University of Geneva, Switzerland
School of Translation and Interpretation (ETI)

THEORETICAL ANALYSIS

Relevance-theoretic view of Discourse Markers (DM)
• DMs encode procedural information [Bla02]
• DMs facilitate the inferential process
• DMs guide the hearer towards the meaning intended by the speaker

Preliminary empirical findings
• Not all the “traditional” DMs encode procedural information
  DM should not be considered as a homogeneous class
  Every DM-candidate must be studied individually

Dialogue-specific feature: frequency distribution
• High proportion of like, well, etc. with respect to written texts
• Low proportion of therefore, moreover, etc.
• e.g.: no occurrence of therefore and moreover in Switchboard (3M words)
• Difference of use between dialogues and monologues [Ste90]

DMs in coherence-based theories
• DMs indicate local coherence
  cohesive devices
• DMs are useful to detect coherence relations (reformulation, elaboration, restatement)

DMs in natural language processing
• Rhetorical parsing of discourse
• Based on coherence theories, e.g. RST [MT88]
  Parse trees anchored on DM
• Annotation of “dialogue acts”
• statement, question, back-channel, etc.
  DMs used as indicators for machine-learning systems

CORPUS-BASED ANALYSES

Goals
• Recognize occurrences of DMs: ambiguous items
• Empirical analysis: patterns of occurrence
• NLP application: useful features for detection

Data: transcriptions of dialogues
• Staff meetings: ICSI, Berkeley (~6 hrs)
• Telephone calls: Switchboard (~100 hrs)
• Subtitles of movies: many available

Occurrence statistics
• Task: manual annotation of DMs from the ICSI corpus
• RT-based criterion: items that encode procedural information
• Difficulty: linguistic items are ambiguous, sometimes a DM, sometimes not
• Influence of the data: corpus type and size, transcription conventions

Results
  Statistics of pragmatic occurrences (DM) are consistent across the two corpora (ICSI, SWB)
  Confirmation of the discourse-type specificity of DM frequency in spoken discourse: like, so, well much more frequent than nevertheless, therefore
  Influence of the annotation conventions on the number of extracted DMs

Annotation by humans
• Experiments: subjects annotate occurrences of like (± pragmatic use)
• Data: ICSI (1 hr), film (2 hrs) = 80 occ.
• Guidelines: detect pragmatic occ. (based on definition + cues + examples)
• Variables tested:
  Native vs. non-native English speakers
  Pre-planned vs. natural dialogues
  Role of prosody

Inter-annotator agreement (κ)
  PERFECT = 1 > κ > 0 = NIL
  Importance of prosody
• without prosodic clues, κ = 0.5
• with prosodic clues, κ = 0.8
  Agreement equal between native EN speakers and FR speakers with good knowledge of EN
  Better agreement for pre-planned dialogues (film)

Automatic detection of DMs
• Relevant factors: position, prosody, patterns of collocations
• Experiment: use of collocation patterns to automate the annotation of pragmatic occurrences of like
• Method: exclusion of collocations such as: something like, I like, etc. (total: 26)

Results
  On the development corpus (ICSI)
• recall = ~100%
• precision = 75%
  On a different corpus (SWB): test
• recall = ~100%
• precision = 50%
  Method is useful as a pre-processing tool to help human annotators

Conclusion
• Reliability of human annotation depends on guidelines and media
• Automatic filtering has excellent recall and encouraging precision

Further work
• Improve automatic detection using machine-learning techniques
• Investigate the procedural information contained in other DMs

Selected references
• [Bla02] Blakemore, D. Meaning and relevance: the semantics and pragmatics of discourse markers. Cambridge: CUP, 2002, 200p.
• [MT88] Mann, W., Thompson, S. Rhetorical structure theory: toward a functional theory of text organization. Text. 1988, vol. 8(3), pp. 243-281.
• [Ste90] Stenström, A.-B. Lexical items peculiar to spoken discourse. In J. Svartvik (ed.). The London-Lund Corpus of Spoken English: Description and research. Lund: LUP, 1990, pp. 137-175.

8th Conference of the IPrA, Toronto, Canada 14-18 July 2003
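
The inter-annotator agreement values reported on the poster (κ = 0.5 without prosodic clues, κ = 0.8 with them) can be illustrated with a short computation. The sketch below assumes Cohen's kappa for two annotators; the poster does not say which variant of κ was used, and the labels and toy annotations are invented for illustration only.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: the poster's kappa is treated here as Cohen's
    two-annotator coefficient; the source does not specify the variant.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented toy data: "DM" = pragmatic use of "like", "V" = any other use.
annotator_1 = ["DM", "DM", "V", "V", "DM", "V", "DM", "V", "V", "DM"]
annotator_2 = ["DM", "DM", "V", "DM", "DM", "V", "V", "V", "V", "DM"]
print(round(cohen_kappa(annotator_1, annotator_2), 2))
```

Because κ subtracts the agreement expected by chance, two annotators who agree on 8 of 10 items score well below 0.8 when both labels are equally frequent.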
8th International Pragmatics Conference: S. Zufferey

Pragmatic connectors in dialogue: relevance-theoretic analysis and corpus-based validation

The present study proposes an analysis of pragmatic connectors using relevance theory. It aims at modelling the procedural information they contain, and at showing how it can be used to improve discourse modelling for natural language processing (NLP).

Pragmatic issues are probably one of the major bottlenecks to the automatic understanding of discourse in NLP. Unfortunately, there is still a big gap between the pragmatic theories on which linguists are currently working, notably neo-Gricean ones like Horn (1984) and Levinson (1983) and post-Gricean ones like Sperber & Wilson (1986), and those used by researchers in NLP, which are almost always based on speech act theory.

For that reason, in this paper I will explore the possibility of grounding computational discourse modelling in Sperber and Wilson’s relevance theory. I will therefore briefly present what relevance theory tells us about discourse, based mostly on works by Reboul and Moeschler (1998), Jucker (1995) and Blakemore (2002).

I will then discuss the status of pragmatic connectors from the point of view of relevance theory, showing how various researchers working on different languages (French, Hebrew, Japanese and English) have proved the validity of a relevance-based approach to the study of pragmatic connectors (Moeschler 2002, Rouchota 1998). My synthesis will explain the semantic role of connectors as relevance-based constraints on the interpretation of utterances in discourse, providing a classification of their possible roles.

I will then test the validity of these theoretical results through an empirical study conducted on various corpora of texts (BNC) and dialogues (business meeting corpus of the Swiss “IM2” Project). The dialogue corpus consists of more than 100 hours of meeting recordings, manually transcribed for each speaker. About 10% of the corpus is annotated with dialogue act labels from the (extended) DAMSL/Switchboard set. The study will proceed in three steps: (1) location of occurrences of pragmatic connectors (when possible, by automated means); (2) annotation of the interpretative role of each marker; (3) comparison of the annotated (observed) role with the role predicted by theoretical analysis. One of the original points of the study is the use of dialogues between more than two persons.

I will conclude by arguing that satisfactory results obtained by the empirical study can give solid ground to motivate further research on relevance theory and discourse modelling. One of the most important issues for NLP is the analysis of the various computational formalisms that could accommodate the procedural information contained in pragmatic connectors.

Selected References
Blakemore, D. Meaning and relevance: the semantics and pragmatics of discourse markers. Cambridge: CUP, 2002, 208p.
Jucker, A. Discourse analysis and relevance. In F. Hundsnurscher and E. Weigand, eds. Future perspectives of dialogue analysis. Tübingen: Max Niemeyer Verlag, 1995, pp. 121-146.
Moeschler, J. Connecteurs, encodage conceptuel et encodage procédural. Cahiers de linguistique française. 2002, vol. 24, 22p.
Reboul, A., Moeschler, J. Pragmatique du discours. De l'interprétation de l'énoncé à l'interprétation du discours. Paris: Armand Colin, 1998, 220p.
Rouchota, V. Connectives, coherence and relevance. In V. Rouchota and A. Jucker, eds. Current issues in relevance theory. Amsterdam: John Benjamins, 1998, pp. 11-57.
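
Step (1) of the study, locating occurrences by automated means, corresponds to the collocation-based filter described on the poster for like. The following is only a minimal sketch of that idea, not the author's implementation: the exclusion list holds just the two collocations the poster names out of its 26, and the function names, the regular expressions and the toy utterances are all hypothetical.

```python
import re

# Assumption: only "something like" and "I like" are named in the source;
# the remaining 24 exclusion patterns are not given and are not guessed here.
EXCLUDED_COLLOCATIONS = [
    r"\bsomething like\b",
    r"\bI like\b",
]


def candidate_dm_likes(utterance, excluded=EXCLUDED_COLLOCATIONS):
    """Return start offsets of 'like' tokens outside excluded collocations."""
    blocked = set()
    for pattern in excluded:
        for match in re.finditer(pattern, utterance, flags=re.IGNORECASE):
            blocked.update(range(match.start(), match.end()))
    return [
        match.start()
        for match in re.finditer(r"\blike\b", utterance, flags=re.IGNORECASE)
        if match.start() not in blocked
    ]


def precision_recall(predicted, gold):
    """Precision and recall of predicted occurrences against gold annotations."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented utterances, keyed by a hypothetical utterance id.
utterances = {
    "u1": "I like that, but it was like, you know, something like ten hours",
    "u2": "they like it",
}
predicted = [(uid, pos) for uid, text in utterances.items()
             for pos in candidate_dm_likes(text)]
# Hypothetical gold annotation: only the second "like" in u1 is pragmatic.
gold = [("u1", 24)]
print(precision_recall(predicted, gold))
```

An exclusion list keeps every occurrence it has no rule against, so it tends to preserve recall while letting through non-pragmatic uses such as the verb in "they like it". That matches the pattern reported on the poster: recall near 100% on both corpora, with precision dropping from 75% to 50% on the unseen one.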