Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...
7.1. Ontologising algorithms
Each algorithm can be seen as a different strategy for attaching the terms a and
b of a tb-triple {a R b} to suitable synsets Ai ∈ T and Bj ∈ T, where
Ai = {ai0, ai1, ..., ain}, Bj = {bj0, bj1, ..., bjm}, n = |Ai| and m = |Bj|. This
results in an sb-triple {Ai R Bj}. All the algorithms presented below start by getting
the candidate synsets from the thesaurus: those containing term a,
A ⊆ T : ∀(Ai ∈ A) → a ∈ Ai, and those containing term b, B ⊆ T : ∀(Bj ∈ B) → b ∈ Bj.
Also, in all of the proposed algorithms, if T does not contain a term argument of a
tb-triple (e.g. a), a new synset containing only this term is created (e.g. Sa = {a}).
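
The candidate retrieval step shared by all the algorithms can be sketched in Python as follows. This is only an illustration under stated assumptions: the thesaurus T is modelled as a list of frozensets, the function name candidate_synsets is hypothetical, and the synsets are the sample ones of figure 7.1.

```python
from typing import FrozenSet, List

Synset = FrozenSet[str]


def candidate_synsets(thesaurus: List[Synset], term: str) -> List[Synset]:
    """Return all synsets of the thesaurus that contain `term`.

    If no synset contains the term, a new singleton synset is created,
    added to the thesaurus and returned as the only candidate.
    """
    candidates = [synset for synset in thesaurus if term in synset]
    if not candidates:
        singleton = frozenset({term})
        thesaurus.append(singleton)
        candidates = [singleton]
    return candidates


# Sample candidate synsets of figure 7.1 (one letter per term).
T = [
    frozenset("acdk"), frozenset("ael"), frozenset("amn"), frozenset("acdeij"),
    frozenset("bgh"), frozenset("bfo"), frozenset("bp"),
]

A = candidate_synsets(T, "a")  # the four synsets A1..A4
B = candidate_synsets(T, "b")  # the three synsets B1..B3
Z = candidate_synsets(T, "z")  # "z" is in no synset, so {z} is created
```

Modelling synsets as frozensets keeps membership tests cheap and lets the same synset object be reused as a dictionary key in later steps, but any set-like structure would serve.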
Before presenting the algorithms, we introduce figure 7.1, which contains candidate
synsets for attaching terms a and b, as well as a made-up lexical network N.
There, the nodes, identified by letters, can be seen as lexical items (terms), while the
connections represent tb-triples of a labelled type (R1, R2 and R3). Note that N
intentionally does not contain some of the lexical items in the synsets (k to p), which
happens when they do not occur in any tb-triple. Both the synsets and the network of
figure 7.1 will be used to illustrate some of the algorithms. We intentionally
created an example where the resulting sb-triple differs depending on the algorithm
used.
Sample candidate synsets:
A1 = {a, c, d, k}          B1 = {b, g, h}
A2 = {a, e, l}             B2 = {b, f, o}
A3 = {a, m, n}             B3 = {b, p}
A4 = {a, c, d, e, i, j}
Sample network N: [network diagram not available in this text]
Figure 7.1: Candidate synsets and lexical network for the ontologising examples.
Related Proportion (RP): This algorithm is based on an assumption similar to that of
the anchor approach of Pennacchiotti and Pantel (2006). First, to attach term a, term b
is fixed. For each synset Ai ∈ A, ni is the number of terms aik ∈ Ai such that the
triple {aik R b} holds. The related proportion rp is computed as follows:
\[
rp(A_i, \{a, R, b\}) = \frac{n_i}{1 + \log_2(|A_i|)} \tag{7.1}
\]
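
To make equation 7.1 concrete, the following Python sketch computes rp for each candidate synset of term a with term b fixed. The network used here is a hypothetical set of tb-triples chosen for illustration only; it is not the network N of figure 7.1. The function name related_proportion and the representation of triples as tuples are also assumptions.

```python
import math
from typing import FrozenSet, Set, Tuple

Synset = FrozenSet[str]
Triple = Tuple[str, str, str]  # (term, relation, term)


def related_proportion(synset: Synset, relation: str, b: str,
                       network: Set[Triple]) -> float:
    """Equation 7.1: n_i / (1 + log2(|A_i|)), with term b fixed."""
    # n_i: number of terms in the synset related to b by `relation`.
    n_i = sum(1 for term in synset if (term, relation, b) in network)
    return n_i / (1 + math.log2(len(synset)))


# Candidate synsets for term a, as listed in figure 7.1.
A1 = frozenset("acdk")
A2 = frozenset("ael")
A3 = frozenset("amn")
A4 = frozenset("acdeij")

# Hypothetical tb-triples (NOT the network of figure 7.1). The triple
# being ontologised, (a, R1, b), is left out of this toy network.
network = {("c", "R1", "b"), ("d", "R1", "b"), ("e", "R1", "b")}

scores = {s: related_proportion(s, "R1", "b", network)
          for s in (A1, A2, A3, A4)}
best = max(scores, key=scores.get)
```

With these made-up triples, A1 has two related terms among four (rp = 2/3), while A4 has three related terms among six, which gives the highest score of the four candidates. The logarithm in the denominator dampens the advantage of larger synsets without removing it entirely.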