Induced data - clic-cimec

clic.cimec.unitn.it


Generating Coherent Texts

Argument realization is a difficult subtask in the generation of coherent texts

• Extreme 1: Realize all arguments of each predicate

The Russian military was racing against time early Friday to rescue a small submarine. The Russian military called a Japanese vessel for help with the small submarine rescue.

14 June 2013



Generating Coherent Texts

Argument realization is a difficult subtask in the generation of coherent texts

• Extreme 2: Realize each entity only once

[Preceding context, which already introduced all entities]

The rescue was a race against time. Help was called in.

Goal: Find coherent texts in which the same argument is explicit vs. implicit



Agence France-Presse

The Russian military was racing against time early Friday to rescue a small submarine (…) A Japanese vessel has been called in for assistance.

Predicate-argument structures (following PropBank/NomBank)

rescue.01
• A0 ‘agent’: Russian navy
• A1 ‘theme’: small submarine
• A2 ‘source’: (unfilled)

call.03
• A0 ‘agent’: Russian navy (not locally realized)
• A1 ‘theme’: for assistance
• A2 ‘recipient’: Japanese vessel

(Bentivogli et al., 2009)



New York Times

The Russian navy worked desperately on Friday to save a small military submarine ... and called for international help.

Predicate-argument structures

save.02
• A0 ‘agent’: Russian navy
• A1 ‘theme’: small submarine
• A2 ‘source’: (unfilled)

call.03
• A0 ‘agent’: Russian navy
• A1 ‘theme’: for assistance
• A2 ‘recipient’: Japanese vessel (not realized at all)

(Bentivogli et al., 2009)

Goal: Identify implicit arguments via comparable texts



Implicit Arguments: Two Tasks

AFP

The Russian military was racing against time early Friday to rescue a small submarine. A Japanese vessel has been called ∅Arg0 in for assistance.

NYT

The Russian navy worked desperately on Friday to save a small military submarine ... and called ∅Arg2 for international help.

Identifying and linking implicit arguments

Only 18% F1-score on full texts (Laparra and Rigau, 2012)

Predicting coherent realizations in context

Not tackled in previous coherence models

No training and test data



Exploiting Comparable Texts

AFP

The Russian military was racing against time early Friday to rescue a small submarine. A Japanese vessel has been called ∅Arg0 in for assistance.

NYT

The Russian navy worked desperately on Friday to save a small military submarine ... and called ∅Arg2 for international help.

We align and compare argument structures

• … to identify and link implicit arguments in discourse

• … to determine realization factors given context

Induced data for both tasks!



Outline

Motivation

Inducing implicit arguments

Linking implicit arguments in discourse

Modeling coherent argument realization



Corpus of Comparable Texts

Goal: induce implicit arguments and discourse antecedents in context

GigaPairs corpus: >167k pairs of documents

• Pairs of newswire articles extracted from the English Gigaword Fifth Edition (Parker et al., 2011)

• Over 50 million word tokens

• Texts share a high degree of similarity but vary in length and detail

(Roth and Frank, *SEM 2012)



Induction Approach: Basis

Within documents (Bohnet, COLING 2010; Björkelund et al., COLING 2010; Martschat et al., CoNLL 2012)

• Semantic parsing

• Pronoun resolution

Across documents (Lee et al., EMNLP 2012; Roth and Frank, EMNLP 2012)

• Predicate alignment

• Entity coreference

Use of high-precision models across texts to minimize errors

[Figure: aligned predicates and their arguments in the AFP and NYT texts]

AFP: The Russian military was racing against time early Friday to rescue a small submarine. A Japanese vessel has been called in for assistance.

NYT: The Russian navy worked desperately on Friday to save a small military submarine ... and called for international help.

Argument labels recoverable from the figure: rescue (Russian military, small submarine) and call (assistance, Japanese vessel) in AFP; save (small submarine) and call (Russian navy, help) in NYT.



Induction Approach: Procedure

Goal: Induce implicit arguments and their antecedents

Compare structures of aligned predicates

1. Identify missing roles as implicit arguments

2. If possible, link implicit arguments via alignment and coreference

[Figure: aligned predicates in the AFP and NYT texts, with the implicit argument marked ∅]

AFP: The Russian military was racing against time early Friday to rescue a small submarine. A Japanese vessel has ∅ been called in for assistance.

NYT: The Russian navy worked desperately on Friday to save a small military submarine ... and called for international help.

Argument labels recoverable from the figure: rescue (Russian military, small submarine) and call (assistance, Japanese vessel) in AFP; save (small submarine) and call (Russian navy, help) in NYT.
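The two-step procedure above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the `Predicate` class, the function name, and the entity identifiers are hypothetical, and coreference is reduced to shared entity ids that stand in for coreference chains.

```python
from dataclasses import dataclass, field


@dataclass
class Predicate:
    """A predicate with its locally realized roles (role label -> entity id)."""
    lemma: str
    roles: dict = field(default_factory=dict)


def induce_implicit_arguments(pred_a, pred_b, mentions_in_a):
    """Compare two aligned predicates and return roles that are implicit in pred_a.

    Step 1: a role is implicit in pred_a if pred_b realizes it and pred_a does not.
    Step 2: it is linked to an antecedent only if the entity filling the role
    in pred_b is also mentioned somewhere in pred_a's document.
    """
    induced = []
    for role, entity in pred_b.roles.items():
        if role in pred_a.roles:
            continue  # realized locally, nothing to induce
        antecedent = entity if entity in mentions_in_a else None
        induced.append((pred_a.lemma, role, antecedent))
    return induced


# Toy version of the AFP/NYT example (hypothetical entity ids).
afp_call = Predicate("call", {"A1": "assistance", "A2": "japanese_vessel"})
nyt_call = Predicate("call", {"A0": "russian_navy", "A1": "assistance"})
afp_mentions = {"russian_navy", "small_submarine", "japanese_vessel", "assistance"}
nyt_mentions = {"russian_navy", "small_submarine", "assistance"}

afp_implicit = induce_implicit_arguments(afp_call, nyt_call, afp_mentions)
nyt_implicit = induce_implicit_arguments(nyt_call, afp_call, nyt_mentions)
print(afp_implicit)  # A0 is missing locally but has an antecedent in the AFP text
print(nyt_implicit)  # A2 is missing and has no antecedent in the NYT text
```

In this toy setting the AFP argument can be linked because the Russian navy is mentioned elsewhere in the same text, whereas the NYT argument is identified as implicit but stays unlinked.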




Induction in Practice

Extraction of 701 implicit arguments and discourse antecedents

“[T-Online i] said Thursday that … The [∅ A0] operating loss (…) widened to 189 million euros.”

“[T-Online’s A0] operating loss widened from 122 to 189 million euros in 2001.”

Use of high-precision methods results in low recall

Manual evaluation of induced data

Sample of 90 implicit arguments: 89% correct



Outline

Motivation

Inducing implicit arguments

Linking implicit arguments in discourse

Modeling coherent argument realization



SemEval 2010 Task 10

Linking events and their participants in discourse

“In a lengthy court case the defendant was tried for murder. In the end, [he A1] was cleared [∅ A2].” [A0]

NI-only task: given local semantic roles

• Classify “null instantiations”

• Link NIs to discourse antecedents

(Ruppenhofer et al., SemEval 2010)

Linking is a difficult coreference resolution task

Sparsity issue: 245 training, 259 test instances



Impact of Induced Data

(Silberer and Frank, *SEM 2012)

Induced data as additional training material

• Use a pre-existing system for argument linking

• Learn new model with more training instances

Results with and without additional data (P / R)

Training data       Best SemEval           S&F system     State of the art
                    (Chen et al., 2010)                   (Laparra and Rigau, 2012)
SE training data    0.25 / 0.01            0.06 / 0.09    0.15 / 0.25
+ our data          --                     0.21 / 0.08    --

+15 percentage points in precision (S&F system)
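As a quick check on the figures above, the following sketch recomputes the precision gain and derives an F1 value from each reported precision/recall pair. The numbers are copied from the slide; the F1 values are computed here from rounded inputs, so they are only approximations and may differ slightly from published scores.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported on the slide.
reported = {
    "Best SemEval (Chen et al., 2010)": (0.25, 0.01),
    "S&F, SemEval training data": (0.06, 0.09),
    "S&F, + induced data": (0.21, 0.08),
    "Laparra and Rigau (2012)": (0.15, 0.25),
}

precision_gain = (
    reported["S&F, + induced data"][0] - reported["S&F, SemEval training data"][0]
)
print(f"precision gain: {precision_gain * 100:.0f} percentage points")

for system, (p, r) in reported.items():
    print(f"{system}: P={p:.2f} R={r:.2f} F1~{f1(p, r):.2f}")
```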



Outline

Motivation

Inducing implicit arguments

Linking implicit arguments in discourse

Modeling coherent argument realization



Centering and Local Coherence

“The coherence of a segment is affected by … a speaker’s choices of linguistic realizations.” (Grosz et al., CL 1995)

Utterances in discourse are linked by centers

Low inference load leads to high perceived coherence

• Pronouns are resolved within the immediate context

• Full definite NPs lead to additional inferences

• Centers can also implicitly be in focus

The house appeared to have been burgled. The door was ajar.

[T-Online i] released its results for the last year. The [∅ A0] operating loss widened.


Entity-based Coherence Models

Previous work:

• Entity grid approach (and extensions)

• Discourse-new model

• Pronoun-based model

(Barzilay and Lapata, CL 2008; Charniak and Elsner, EACL 2009; Elsner and Charniak, ACL 2011)

All models look at explicit occurrences

No special treatment of non-realizations

Models do not take into account whether a non-realization is a center of an utterance
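To make the limitation concrete, here is a heavily simplified entity-grid sketch. It is an assumption-laden illustration, not the model of Barzilay and Lapata: the original grid records syntactic roles (subject, object, other, absent), whereas this version only marks explicit mentions by substring match, and the function name is hypothetical.

```python
def build_entity_grid(sentences, entities):
    """One row per sentence: 'X' if the entity is explicitly mentioned, else '-'."""
    grid = []
    for sentence in sentences:
        text = sentence.lower()
        grid.append(["X" if entity.lower() in text else "-" for entity in entities])
    return grid


entities = ["T-Online", "loss"]
implicit_version = [
    "T-Online said Thursday that ...",
    "The operating loss widened to 189 million euros.",
]
explicit_version = [
    "T-Online said Thursday that ...",
    "T-Online's operating loss widened to 189 million euros.",
]

grid_implicit = build_entity_grid(implicit_version, entities)
grid_explicit = build_entity_grid(explicit_version, entities)
for row in grid_implicit:
    print(row)
```

In the implicit version, the cell for T-Online in the second sentence is "-", exactly as it would be for an entity that plays no part in the sentence at all. The grid therefore cannot tell an implicit argument in focus from a plain absence.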



Explicit vs. Implicit Arguments

To date: no corpus with coherent contexts for both implicit and explicit arguments

Our induced data set provides such contexts:

[Figure: aligned predicates in the AFP and NYT texts, with implicit arguments marked ∅]

AFP: The Russian military was racing against time early Friday to rescue a small submarine. A Japanese vessel has ∅ been called in for assistance.

NYT: The Russian navy worked desperately on Friday to save a small military submarine ... and called for international help ∅.

Argument labels recoverable from the figure: rescue (Russian military, small submarine) and call (assistance, Japanese vessel) in AFP; save (small submarine) and call (Russian navy, help) in NYT.


Rating Local Coherence

Does (non-)realization affect coherence?

Rate coherence of argument use in full discourse context (150 document pairs)

implicit

T-Online said Thursday that … The [∅ A0] operating loss (…) widened to 189 million euros.

explicit

T-Online said Thursday that … [T-Online’s A0] operating loss (…) widened to 189 million euros.



Improved coherence observed in 29 pairs:

• 20 explicit arguments, 9 implicit arguments

• Gold annotation for evaluation

No clear preference in the remaining cases



Argument Realization Model

Predict realization using 3 groups of features

1. Context: #pronouns in sentence, discourse position, …

2. Predicate-argument structure: length, #args, …

3. Coreference: #previous mentions, distance, …

For evaluation: test against gold annotations

Given context, PAS and coreference chain, predict realization of argument

For training: only automatically induced data

• Aligned explicit arguments: +realization

• Implicit arguments: −realization
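The training setup above can be pictured as follows. This is only a sketch under stated assumptions: the feature names, the helper functions, and all feature values are hypothetical, and the slide names just a few features per group, so the actual feature set is larger.

```python
def make_instance(n_pronouns, n_args, n_previous_mentions, distance, realized):
    """Bundle a few illustrative features with the induced +/-realization label."""
    features = {
        "context:n_pronouns": n_pronouns,  # group 1: context
        "pas:n_args": n_args,  # group 2: predicate-argument structure
        "coref:n_previous_mentions": n_previous_mentions,  # group 3: coreference
        "coref:distance": distance,
    }
    return features, realized


def label_from_alignment(role, locally_realized_roles):
    """True (+realization) if the role is explicit locally, False (-realization) otherwise."""
    return role in locally_realized_roles


# Toy pair: A0 of "call" is explicit in one text and implicit in the other.
# All feature values below are made up for illustration.
explicit_label = label_from_alignment("A0", {"A0", "A1"})
implicit_label = label_from_alignment("A0", {"A1", "A2"})
training = [
    make_instance(0, 2, 1, 1, explicit_label),
    make_instance(0, 2, 1, 1, implicit_label),
]
```

Each aligned predicate pair thus yields one positive and one negative instance without manual annotation, and the labeled feature dictionaries could be passed to any standard classifier.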



Evaluation

Can previous models solve this task?

Prediction is only indirect: use pairs of texts (±realization)

Two versions of our model to predict realization

• Simplified: entity-grid like features (only coreference)

• Full: all features (PAS, context, coreference)

Model                  Accuracy
Entity grid            0.15
Pronoun-based model    0.43
Discourse-new model    0.48

prediction = argmax_text score(text)
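The indirect prediction used for the previous models can be sketched as an argmax over the two text versions. The scoring function below is a deliberately naive placeholder that penalizes repeated mentions of one entity. It stands in for a coherence model and is not one of the models evaluated here; all names are hypothetical.

```python
def predict_realization(versions, score):
    """Return the label of the text version with the highest coherence score."""
    return max(versions, key=lambda label: score(versions[label]))


def toy_score(text):
    """Placeholder scorer: fewer repetitions of the entity name score higher."""
    return -text.count("T-Online")


versions = {
    "explicit": "T-Online said Thursday that ... "
    "T-Online's operating loss widened to 189 million euros.",
    "implicit": "T-Online said Thursday that ... "
    "The operating loss widened to 189 million euros.",
}
prediction = predict_realization(versions, toy_score)
print(prediction)
```

Any model that assigns a score to a whole text can be plugged in as `score`, which is what allows models without a notion of argument realization to be compared on this task.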

Model                        Accuracy
Majority class (explicit)    0.69
Our model (simplified)       0.83
Our model (full)             0.90
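The majority-class figure is consistent with the rating study, assuming the test set consists of the 29 pairs with a clear preference (20 explicit, 9 implicit). That assumption is an inference from the numbers and is not stated on the slides.

```python
# Gold preferences from the rating study: 20 explicit, 9 implicit.
gold_counts = {"explicit": 20, "implicit": 9}

majority_class = max(gold_counts, key=gold_counts.get)
total = sum(gold_counts.values())
majority_accuracy = gold_counts[majority_class] / total

print(majority_class, round(majority_accuracy, 2))  # explicit 0.69
```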



Outline

Motivation

Inducing implicit arguments

Linking implicit arguments in discourse

Modeling coherent argument realization



Results

Motivation

Induce training data for implicit argument tasks

Inducing implicit arguments

Extracted 701 implicit arguments/antecedents (P: 89%)

Linking implicit arguments in discourse

Induced data improved linking model (P: +15% absolute)

Modeling coherent argument realization

Novel task and high-performance model (A: 90%)



Conclusions and Future Work

Automatically induced implicit arguments

Useful for a range of tasks; no annotation required

Benefits for linking model are only in precision

• Model only uses simple high-level features

• Better methods needed to fully utilize available data

Implicit arguments in coherence modeling

• Novel aspect not tackled in previous models

• Future work: combine with other aspects and use in NLG



Thank you

… for your attention!

Thanks to the Landesgraduiertenförderung Baden-Württemberg for funding within the research initiative “Coherence in language processing” at Heidelberg University.



Questions?
