12.09.2013 Views

Programme booklet (pdf)

CLIN 21 – CONFERENCE PROGRAMME

Subtrees as a new type of context in Word Space Models

Abstract

Smets, Margaux and Speelman, Dirk and Geeraerts, Dirk

QLVL, K.U.Leuven

In Word Space Models (WSMs), two types of context are traditionally used:
(i) lexical co-occurrences ('bag-of-words models') and (ii) syntactic dependencies.
In general, models with the second type of context seem to perform better. However,
these models have some problems. First, a choice has to be made about which
contexts to include: only subject/verb and verb/object relations, or other
dependencies as well. Second, in contrast with bag-of-words models, the syntactic
models are supervised: they require quite large resources (a dependency parser, a
manually annotated corpus, ...), which might not be available for every language.
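
For readers unfamiliar with the first type of context, the following is a minimal sketch of a bag-of-words WSM. It is an illustration only, not the setup used in our experiments: the window size, the raw counts without weighting, and the function names `cooccurrence_vectors` and `cosine` are all assumptions made for this example.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (hypothetical), already tokenised.
sentences = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
]
vectors = cooccurrence_vectors(sentences, window=2)
similarity = cosine(vectors["cat"], vectors["dog"])
```

Here "cat" and "dog" come out as similar because they share the context words "the" and "drinks". A dependency-based model would instead use features such as "subject of drinks", which is what requires a parser.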

The contexts we propose for use in WSMs are subtrees as defined in the framework of
Data-Oriented Parsing. Subtrees can capture both bag-of-words (co-occurrence)
information and syntactic information. Moreover, they are not limited to specific
types of dependencies, but rather take entire structures into account.
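
To make the notion of a subtree concrete, here is a minimal sketch of subtree extraction under the usual Data-Oriented Parsing convention: a subtree is a connected piece of the parse tree in which every expanded node keeps all of its children, and any non-terminal child may be cut off and left as a frontier node. The tuple encoding of trees, the depth convention (depth 1 means only the root is expanded), and the names `fragments`, `internal_nodes` and `all_subtrees` are assumptions made for this example, not the implementation used in our experiments.

```python
from itertools import product


def fragments(node, max_depth):
    """Yield every subtree rooted at `node` whose depth is at most `max_depth`.

    A tree is a tuple (label, child, ...); a word is a plain string.
    A cut-off (frontier) non-terminal is encoded as the 1-tuple (label,).
    """
    label, *children = node
    options = []
    for child in children:
        if isinstance(child, str):
            options.append([child])          # words are always kept
        else:
            choices = [(child[0],)]          # option 1: cut the child off here
            if max_depth > 1:                # option 2: expand it further
                choices.extend(fragments(child, max_depth - 1))
            options.append(choices)
    for combination in product(*options):
        yield (label, *combination)


def internal_nodes(tree):
    """Yield every non-terminal node of the tree, root first."""
    yield tree
    for child in tree[1:]:
        if not isinstance(child, str):
            yield from internal_nodes(child)


def all_subtrees(tree, max_depth):
    """Yield the subtrees rooted at any non-terminal node of `tree`."""
    for node in internal_nodes(tree):
        yield from fragments(node, max_depth)


# Hypothetical parse of "she sees it".
tree = ("S", ("NP", "she"), ("VP", ("V", "sees"), ("NP", "it")))
subtrees = list(all_subtrees(tree, max_depth=3))
```

The subtree `("S", ("NP", "she"), ("VP",))` records only that "she" is the subject of some verb phrase, while the full tree also records which words co-occur in the sentence. This is how one inventory of features can carry both kinds of information.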

At first sight, it might seem that the resource problem of dependency-based WSMs
remains in this framework. After all, we first need the 'correct' tree for a sentence
before we can extract subtrees from it. However, in our experiments we show how the
entire algorithm can be made unsupervised by using an unsupervised parser as a
preprocessing step.

In the presentation, I will first discuss in detail how this new type of WSM works.
Next, I will present some initial results from experiments with parameters such as
the accuracy of the parser in the preprocessing step, the maximum subtree depth,
the minimum subtree frequency, and the restriction to subtrees with the highest
variance.
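
As an illustration of the last two parameters, the sketch below filters subtree features by a minimum frequency and then keeps those with the highest variance. It rests on assumptions not stated in this abstract: frequency is taken as the total count of a subtree over all target words, variance is the population variance of its counts across target words, and the function name `select_features`, the feature labels `t1` to `t3` and the toy counts are hypothetical.

```python
from statistics import pvariance


def select_features(vectors, min_freq=2, top_k=2):
    """Return up to `top_k` features with total count >= `min_freq`,
    ordered by decreasing variance of their counts across target words."""
    words = sorted(vectors)
    features = sorted({f for vec in vectors.values() for f in vec})
    # One column of counts per feature, with 0 where a word lacks the feature.
    columns = {f: [vectors[w].get(f, 0) for w in words] for f in features}
    frequent = [f for f in features if sum(columns[f]) >= min_freq]
    ranked = sorted(frequent, key=lambda f: pvariance(columns[f]), reverse=True)
    return ranked[:top_k]


# Hypothetical word-by-subtree counts; t1..t3 stand for subtree features.
vectors = {
    "cat": {"t1": 4, "t2": 1, "t3": 1},
    "dog": {"t1": 0, "t2": 1, "t3": 0},
    "milk": {"t1": 2, "t2": 1},
}
selected = select_features(vectors, min_freq=2, top_k=1)
```

In this toy data `t3` is dropped for being too rare, and `t2` ranks below `t1` because it occurs equally often with every word and so does not help to tell words apart.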

Corresponding author: margauxsmets@gmail.com
