
Discourse markers in dialogue: relevance-theoretic analysis and ...


Discourse markers in dialogue: relevance-theoretic analysis and corpus-based validation

Sandrine Zufferey
sandrine.zufferey@eti.unige.ch
University of Geneva, Switzerland
School of Translation and Interpretation (ETI)

THEORETICAL ANALYSIS

Relevance-theoretic view of Discourse Markers (DM)
• DMs encode procedural information [Bla02]
• DMs facilitate the inferential process
• DMs guide the hearer towards the meaning intended by the speaker

Preliminary empirical findings
• Not all the “traditional” DMs encode procedural information
  DM should not be considered as a homogeneous class
  Every DM-candidate must be studied individually

Dialogue-specific feature: frequency distribution
• High proportion of like, well, etc. with respect to written texts
• Low proportion of therefore, moreover, etc.
• e.g.: no occurrence of therefore and moreover in Switchboard (3M words)
• Difference of use between dialogues and monologues [Ste90]

DMs in coherence-based theories
• DMs indicate local coherence
  cohesive devices
• DMs are useful to detect coherence relations (reformulation, elaboration, restatement)

DMs in natural language processing
• Rhetorical parsing of discourse
• Based on coherence theories, e.g. RST [MT88]
  Parse trees anchored on DM
• Annotation of “dialogue acts”
• statement, question, back-channel, etc.
  DMs used as indicators for machine-learning systems

CORPUS-BASED ANALYSES

Goals
• Recognize occurrences of DMs: ambiguous items
• Empirical analysis: patterns of occurrence
• NLP application: useful features for detection

Data: transcriptions of dialogues
• Staff meetings: ICSI, Berkeley (~6 hrs)
• Telephone calls: Switchboard (~100 hrs)
• Subtitles of movies: many available

Occurrence statistics
• Task: manual annotation of DMs from the ICSI corpus
• RT-based criterion: items that encode procedural information
• Difficulty: linguistic items are ambiguous, sometimes a DM, sometimes not
• Influence of the data: corpus type and size, transcription conventions

Results
Statistics of pragmatic occurrences (DM) are consistent across the two corpora (ICSI, SWB)
Confirmation of the discourse-type specificity of DM frequency in spoken discourse: like, so, well much more frequent than nevertheless, therefore
Influence of the annotation conventions on the number of extracted DMs

Annotation by humans
• Experiments: subjects annotate occurrences of like (± pragmatic use)
• Data: ICSI (1 hr), film (2 hrs) = 80 occ.
• Guidelines: detect pragmatic occ. (based on definition + cues + examples)
• Variables tested:
  Native vs. non-native English speakers
  Pre-planned vs. natural dialogues
  Role of prosody

Inter-annotator agreement (κ)
PERFECT = 1 > κ > 0 = NIL
Importance of prosody
• without prosodic clues, κ = 0.5
• with prosodic clues, κ = 0.8
Agreement equal between native EN speakers and FR speakers with good knowledge of EN
Better agreement for pre-planned dialogues (film)

Automatic detection of DMs
• Relevant factors: position, prosody, patterns of collocations
• Experiment: use of collocation patterns to automate the annotation of pragmatic occurrences of like
• Method: exclusion of collocations such as: something like, I like, etc. (total: 26)

Results
On the development corpus (ICSI)
• recall = ~100%
• precision = 75%
On a different corpus (SWB): test
• recall = ~100%
• precision = 50%
Method is useful as a pre-processing tool to help human annotators

Conclusion
• Reliability of human annotation depends on guidelines and media
• Automatic filtering has excellent recall and encouraging precision

Further work
• Improve automatic detection using machine-learning techniques
• Investigate the procedural information contained in other DMs

Selected references
• [Bla02] Blakemore, D. Meaning and relevance: the semantics and pragmatics of discourse markers. Cambridge: CUP, 2002, 200p.
• [MT88] Mann, W., Thompson, S. Rhetorical structure theory: toward a functional theory of text organization. Text. 1988, vol.8(3), pp. 243-281.
• [Ste90] Stenström, A.-B. Lexical items peculiar to spoken discourse. In J. Svartvik (ed.). The London-Lund Corpus of Spoken English: Description and research. Lund: LUP, 1990, pp. 137-175.

8th Conference of the IPrA, Toronto, Canada 14-18 July 2003
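
The inter-annotator agreement figures on the poster (κ = 0.5 without prosodic clues, κ = 0.8 with them) can be made concrete with a small computation. The following is a minimal sketch, assuming Cohen's kappa for two annotators; the poster does not say which kappa variant was used, and the function name and the ten toy labels are hypothetical, not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations of ten occurrences of "like":
# "DM" = pragmatic use, "no" = non-pragmatic use.
annotator_1 = ["DM", "DM", "no", "no", "DM", "no", "DM", "DM", "no", "no"]
annotator_2 = ["DM", "DM", "no", "DM", "DM", "no", "DM", "no", "no", "no"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))
```

In this toy case the annotators agree on 8 of 10 items while chance agreement is 0.5, so κ comes out at about 0.6, which shows how κ discounts the raw agreement rate.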


8th International Pragmatics Conference: S. Zufferey

Pragmatic connectors in dialogue: relevance-theoretic analysis and corpus-based validation

The present study proposes an analysis of pragmatic connectors using relevance theory. It aims at modelling the procedural information they contain, and at showing how it can be used to improve discourse modelling for natural language processing (NLP).

Pragmatic issues are probably one of the major bottlenecks to the automatic understanding of discourse in NLP. Unfortunately, there is still a big gap between the pragmatic theories on which linguists are currently working (notably neo-Gricean theories such as Horn (1984) and Levinson (1983), and post-Gricean theories such as Sperber & Wilson (1986)) and those used by researchers in NLP, which are almost always based on speech act theory.

For that reason, in this paper I will explore the possibility of grounding computational discourse modelling in Sperber and Wilson’s relevance theory. I will therefore briefly present what relevance theory tells us about discourse, based mostly on works by Reboul and Moeschler (1998), Jucker (1995) and Blakemore (2002).

I will then discuss the status of pragmatic connectors from the point of view of relevance theory, showing how various researchers working on different languages (French, Hebrew, Japanese and English) have proved the validity of a relevance-based approach to the study of pragmatic connectors (Moeschler 2002, Rouchota 1998). My synthesis will explain the semantic role of connectors as relevance-based constraints on the interpretation of utterances in discourse, providing a classification of their possible roles.

I will then test the validity of these theoretical results through an empirical study conducted on various corpora of texts (BNC) and dialogues (business meeting corpus of the Swiss “IM2” Project). The dialogue corpus consists of more than 100 hours of meeting recordings, manually transcribed for each speaker. About 10% of the corpus is annotated with dialogue act labels from the (extended) DAMSL/Switchboard set. The study will proceed in three steps: (1) location of occurrences of pragmatic connectors (when possible, by automated means); (2) annotation of the interpretative role of each marker; (3) comparison of the annotated (observed) role with the role predicted by theoretical analysis. One of the original points of the study is the use of dialogues between more than two persons.

I will conclude by arguing that satisfactory results obtained by the empirical study can give solid ground to motivate further research on relevance theory and discourse modelling. One of the most important issues for NLP is the analysis of the various computational formalisms that could accommodate the procedural information contained in pragmatic connectors.

Selected References

Blakemore, D. Meaning and relevance: the semantics and pragmatics of discourse markers. Cambridge: CUP, 2002, 208p.
Jucker, A. Discourse analysis and relevance. In F. Hundsnurscher and E. Weigand, eds. Future perspectives of dialogue analysis. Tübingen: Max Niemeyer Verlag, 1995, pp. 121-146.
Moeschler, J. Connecteurs, encodage conceptuel et encodage procédural. Cahiers de linguistique française. 2002, vol.24, 22p.
Reboul, A., Moeschler, J. Pragmatique du discours. De l'interprétation de l'énoncé à l'interprétation du discours. Paris: Armand Colin, 1998, 220p.
Rouchota, V. Connectives, coherence and relevance. In V. Rouchota and A. Jucker, eds. Current issues in relevance theory. Amsterdam: John Benjamins, 1998, pp. 11-57.
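
Step (1) of the study, locating occurrences by automated means, might look like the following for the marker like. This is a minimal sketch of a collocation-exclusion filter of the kind the poster describes, followed by a precision/recall check. Only "something like" and "I like" come from the source; the other two patterns, the function names and the five toy utterances are assumptions for illustration, and the full list of 26 exclusions is not reproduced here.

```python
import re

# Collocations in which "like" is NOT a discourse marker.
EXCLUDED = [
    r"\bsomething like\b",   # from the poster
    r"\bI like\b",           # from the poster
    r"\blooks? like\b",      # assumption, for illustration only
    r"\bwould like\b",       # assumption, for illustration only
]
EXCLUDED_RE = re.compile("|".join(EXCLUDED), re.IGNORECASE)
LIKE_RE = re.compile(r"\blike\b", re.IGNORECASE)


def candidate_dm_offsets(utterance):
    """Start offsets of 'like' tokens not covered by an excluded collocation."""
    blocked = [m.span() for m in EXCLUDED_RE.finditer(utterance)]
    return [m.start() for m in LIKE_RE.finditer(utterance)
            if not any(start <= m.start() < end for start, end in blocked)]


def precision_recall(predicted, gold):
    """Precision and recall of a predicted set against a gold-standard set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical utterances, each with one "like" and a manual judgement
# (True = pragmatic use, i.e. a discourse marker).
samples = [
    ("it was like really late", True),
    ("I like that idea", False),
    ("something like ten people came", False),
    ("and he was like no way", True),
    ("they sound like experts", False),  # not caught by the exclusion list
]

predicted = {i for i, (text, _) in enumerate(samples) if candidate_dm_offsets(text)}
gold = {i for i, (_, is_dm) in enumerate(samples) if is_dm}
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the filter only removes occurrences that match a known non-pragmatic collocation, it keeps every true marker (high recall) but lets through any non-pragmatic use missing from the list (lower precision). That is the same qualitative pattern as the figures reported on the poster.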
