Proceedings - Österreichische Gesellschaft für Artificial Intelligence


each soft restriction $soft_m$ set on a phrase $ph_i$ is associated with a number of slots $num(soft_m)$ that a candidate translation cannot claim if it does not fulfil the condition.

Whenever a phrase $PH_1$ covered by a validated candidate translation $ct$ includes a phrase $ph_1$, we consider that the translation of $ph_1$ should be included in the second n-gram $PH_2$ also covered by $ct$. We associate with such a soft restriction $soft_m$ the value $num(soft_m) = claims(ct, b_n)$. In other words, if $PH_1$ includes $ph_1$, we consider that for $claims(ct, b_n)$ slots of $ph_1$ its translation should be included in $PH_2$. For example, if “la bella casa” in Italian is validated as the translation of “das schöne Haus” in German then, for any bitext containing both, phrases included in “la bella casa” should translate to phrases included in “das schöne Haus” and vice versa.

Also, whenever a phrase $PH_1$ covered by a validated candidate translation $ct$ is included in a phrase $ph_1$ and $slots(ph_1, b_n) = slots(PH_1, b_n)$, we consider that the translation of $ph_1$ should include the other phrase $PH_2$ covered by $ct$. We associate with such a soft restriction $soft_m$ the value $num(soft_m) = claims(ct, b_n)$. In other words, if $PH_1$ is included in $ph_1$ and both phrases have the same original number of slots, then $PH_2$ should be included in the translation of $ph_1$ at least $claims(ct, b_n)$ time(s). For example, if “bella” is validated as the Italian translation of “schöne” in German, then phrases including “bella” and having the same number of slots should translate into phrases including “schöne” and vice versa.
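
The two soft restrictions described above can be sketched as a single check. This is only an illustration under stated assumptions: phrases are modelled as lists of tokens, "included" is read as contiguous n-gram inclusion, and the names `includes` and `violates_soft` as well as the `same_slots` flag are hypothetical and do not come from the paper. Only one direction is shown; the "vice versa" case would be obtained by swapping the two sides.

```python
def includes(outer, inner):
    """Return True if the token list `inner` occurs contiguously in `outer`."""
    n = len(inner)
    return any(outer[i:i + n] == inner for i in range(len(outer) - n + 1))


def violates_soft(validated, candidate, same_slots=True):
    """Check a candidate pair (ph1, ph2) against a validated pair (PH1, PH2).

    `same_slots` stands in for the condition slots(ph1, b_n) == slots(PH1, b_n),
    which would have to be computed from the bitext (assumption).
    """
    PH1, PH2 = validated
    ph1, ph2 = candidate
    # First restriction: ph1 lies inside PH1, so its translation should lie inside PH2.
    if includes(PH1, ph1) and not includes(PH2, ph2):
        return True
    # Second restriction: ph1 contains PH1 with the same number of slots,
    # so its translation should contain PH2.
    if same_slots and includes(ph1, PH1) and not includes(ph2, PH2):
        return True
    return False


house = (["la", "bella", "casa"], ["das", "schöne", "Haus"])
print(violates_soft(house, (["bella", "casa"], ["schöne", "Haus"])))
print(violates_soft(house, (["bella"], ["Haus"])))

pretty = (["bella"], ["schöne"])
print(violates_soft(pretty, (["la", "bella"], ["das", "Haus"])))
```

With the paper's example, the pair “bella casa” / “schöne Haus” respects the restriction set by “la bella casa” / “das schöne Haus”, while a pair containing “bella” on one side but not “schöne” on the other violates the restriction set by “bella” / “schöne”.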

6.5.3 Combining restrictions and updating the remaining candidate translations

Since we do not try to align phrases, combining the restrictions violated by a candidate translation must take into account that some restrictions may apply to overlapping slots. Regarding strict restrictions, we can ensure that two restrictions concern sets of slots that do not overlap even if we do not explicitly assign a given slot to a given strict restriction. For example, for a phrase $ph_i$ with $m + n$ slots in a given bitext that is covered by two validated candidate translations $ct_e$ and $ct_h$, we can tell that $m$ slots have been locked by $ct_e$ and $n$ slots by $ct_h$, and that these slots cannot be claimed by other candidate translations, without stating explicitly which slot is locked by $ct_e$ or $ct_h$.
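
The bookkeeping for strict restrictions can be illustrated with plain counts. The sketch below assumes that the number of locked slots of a phrase is the sum of the slots claimed by the validated candidate translations covering it, which is how the example above reads; the function names and the dictionary layout are hypothetical.

```python
def locked_slots(validated_claims):
    """Total slots locked on one phrase, summed over validated candidates."""
    return sum(validated_claims.values())


def free_slots(total_slots, validated_claims):
    """Slots still open to other candidates; no slot-to-candidate mapping is needed."""
    return total_slots - locked_slots(validated_claims)


m, n = 2, 1  # illustrative values, not taken from the paper
claims_by_validated = {"ct_e": m, "ct_h": n}
print(free_slots(m + n, claims_by_validated))
```

Because strict locks never share a slot, only the counts matter, and the identity of the slot held by each validated candidate can stay unspecified.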

Whenever a soft restriction is involved, simply adding the numbers of slots covered by the restrictions would be incorrect because we cannot establish whether the violated restrictions overlap on the same set of slots. For example, consider a bitext containing one occurrence of “la bella casa” in Italian and one of “das schöne Haus” in German, with only one occurrence of “bella” and “schöne” in the whole bitext, and two validated candidate translations $ct_i$ and $ct_j$ that associate “la bella” with “das schöne” and “bella casa” with “schöne Haus”. A candidate translation $ct_k$ that covers “bella” but does not associate it with “schöne” would violate both soft restrictions set by $ct_i$ and $ct_j$. Simply adding the numbers of slots covered by the soft restrictions set by $ct_i$ and $ct_j$ would prohibit $ct_k$ from claiming two slots when only one is actually available. The same reasoning can be extended to phrases having more than one slot and to the combination of soft and strict restrictions.

We thus look for the maximum number of slots that a remaining candidate translation $ct$ occurring in a bitext $b_n$ and covering two phrases $ph_i$ and $ph_j$ can claim. For each covered phrase $ph$, we compute a value $max\_soft(ph, ct, b_n)$ corresponding to the maximum of the $num(soft_m)$ values of the soft restrictions violated by $ct$ for covering $ph$ in $b_n$.
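
The difference between summing and taking the maximum can be checked on the “bella” example. The sketch assumes that each of the two violated soft restrictions carries $num(soft_m) = 1$, since each validated candidate claims a single slot in that bitext; the helper name `max_soft` mirrors the paper's notation, but its signature is hypothetical.

```python
def max_soft(violated_nums):
    """Largest num(soft_m) among violated soft restrictions (0 if none is violated)."""
    return max(violated_nums, default=0)


slots_bella = 1        # "bella" occurs once in the bitext
violated = [1, 1]      # restrictions set by ct_i and ct_j, both violated by ct_k

naive_blocked = sum(violated)   # counts the same slot twice
blocked = max_soft(violated)    # counts the shared slot once

print(naive_blocked, blocked, slots_bella - blocked)
```

Summing would block more slots than the phrase has, whereas the maximum blocks exactly the one slot that exists.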

We then compute the value $sub\_claims(ph, ct, b_n)$ corresponding to the number of slots originally available, $slots(ph, b_n)$, minus the maximum of the number of slots locked by strict restrictions, $locks(ph, b_n)$, and $max\_soft(ph, ct, b_n)$:

$$\mathit{sub\_claims}(ph, ct, b_n) = \mathit{slots}(ph, b_n) - \max\bigl(\mathit{locks}(ph, b_n),\ \mathit{max\_soft}(ph, ct, b_n)\bigr).$$

Finally, we update the $claims(ct, b_n)$ value in a manner similar to its initialization:

$$\mathit{claims}(ct, b_n) = \min\bigl(\mathit{sub\_claims}(ph_i, ct, b_n),\ \mathit{sub\_claims}(ph_j, ct, b_n)\bigr).$$
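
The update step can be written out directly from the two formulas. This is a minimal sketch, not the authors' implementation: each covered phrase is passed as a hypothetical tuple `(slots, locks, violated_soft_nums)`, and the numbers in the example are made up for illustration.

```python
def max_soft(violated_nums):
    """Largest num(soft_m) among violated soft restrictions (0 if none is violated)."""
    return max(violated_nums, default=0)


def sub_claims(slots, locks, violated_nums):
    """slots(ph, b_n) - max(locks(ph, b_n), max_soft(ph, ct, b_n))."""
    return slots - max(locks, max_soft(violated_nums))


def updated_claims(phrase_i, phrase_j):
    """claims(ct, b_n) = min over the two phrases covered by the candidate."""
    return min(sub_claims(*phrase_i), sub_claims(*phrase_j))


# ph_i: 3 slots, 1 strictly locked, two violated soft restrictions worth 2 and 1 slots.
# ph_j: 2 slots, nothing locked, no soft restriction violated.
print(updated_claims((3, 1, [2, 1]), (2, 0, [])))

# Typical case: a single slot and one violated soft restriction.
print(updated_claims((1, 0, [1]), (1, 0, [])))
```

In the first call the candidate keeps one claim, limited by $ph_i$; in the second it keeps none, which matches the single-slot situation discussed next.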

It is important to note that, in general, only one slot is available for most phrases in a bitext. Therefore, conflicting with just one restriction in a bitext, be it strict or soft, is enough for most candidate translations to lose their occurrence.


Proceedings of KONVENS 2012 (LexSem 2012 workshop), Vienna, September 21, 2012
