Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...
7.1. Ontologising algorithms
Adjacency matrix M, for the network in figure 7.1:
  a b c d e f g h i j l
a 1 1 0 0 0 1 1 1 0 0 0
b 1 1 1 1 1 0 0 0 0 0 1
c 0 1 1 0 0 0 1 1 0 0 0
d 0 1 0 1 0 0 0 0 0 0 0
e 0 1 0 0 1 1 0 0 0 0 1
f 1 0 0 0 1 1 0 0 0 0 1
g 1 0 1 0 0 0 1 0 1 0 0
h 1 0 1 0 0 0 0 1 1 1 0
i 0 0 0 0 0 0 1 1 1 0 0
j 0 0 0 0 0 0 0 1 0 1 0
l 0 1 0 0 1 1 0 0 0 0 1
Cosine values for each pair (the maximum is marked with *):
     B1                 B2                   B3
A1   3.15/12 ≈ 0.26     1.80/12 ≈ 0.15       1.35/8  ≈ 0.17
A2   2.44/9  ≈ 0.27     3.54/9  ≈ 0.39 *     1.59/6  ≈ 0.26
A3   1.21/9  ≈ 0.13     0.81/9  ≈ 0.09       0.37/6  ≈ 0.06
A4   5.48/18 ≈ 0.30     3.16/18 ≈ 0.18       1.96/12 ≈ 0.16
max(sim(Ai, Bj)) ≈ 0.39 → resulting sb-triple = {A2 R1 B2}
Figure 7.3: Using AC to select the suitable synsets for ontologising {a R1 b}, given
the candidate synsets and the network N in figure 7.1.
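The computation behind figure 7.3 can be sketched in Python. This is a minimal sketch, not the original implementation: it assumes that each term is represented by its row in the adjacency matrix M, and that the similarity of two synsets is the sum of the pairwise cosines divided by |Ai||Bj|, as the denominators in the table suggest. The function names and the candidate synsets in the usage lines are hypothetical, since the candidate synsets of figure 7.1 are not listed here.

```python
import math

# Rows of the adjacency matrix M, in the node order a..l (there is no node k).
NODES = "abcdefghijl"
ROWS = [
    "11000111000",  # a
    "11111000001",  # b
    "01100011000",  # c
    "01010000000",  # d
    "01001100001",  # e
    "10001100001",  # f
    "10100010100",  # g
    "10100001110",  # h
    "00000011100",  # i
    "00000001010",  # j
    "01001100001",  # l
]
M = {node: [int(ch) for ch in row] for node, row in zip(NODES, ROWS)}


def cosine(u, v):
    """Cosine of the angle between two adjacency vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def average_cosine(synset_a, synset_b):
    """Assumed AC score: sum of pairwise cosines divided by |A||B|."""
    total = sum(cosine(M[x], M[y]) for x in synset_a for y in synset_b)
    return total / (len(synset_a) * len(synset_b))


def select_pair(candidates_a, candidates_b):
    """Return (id of Ai, id of Bj, score) for the pair with the highest score."""
    return max(
        ((i, j, average_cosine(a, b))
         for i, a in candidates_a.items()
         for j, b in candidates_b.items()),
        key=lambda item: item[2],
    )


# Hypothetical candidate synsets, for illustration only.
candidates_a = {"A1": ["e"], "A2": ["j"]}
candidates_b = {"B1": ["l"], "B2": ["d"]}
print(select_pair(candidates_a, candidates_b))
```

In this toy input, e and l have identical rows in M, so the pair (A1, B1) is the one selected.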
Related Proportion + Average Cosine (RP+AC): This algorithm combines
RP and AC. If RP cannot select a suitable synset for a or b, because one or both
of the selected synsets have rp < θ, where θ is a predefined threshold, AC is used instead.
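The fallback logic can be sketched as follows. This is only an illustration under stated assumptions: the rp scores are taken as precomputed inputs, `ac_select` is a hypothetical callable that returns the pair chosen by AC, and AC is assumed to replace the whole pair when either side falls below θ (the text does not say whether only the failing side is replaced).

```python
def rp_plus_ac(rp_scores_a, rp_scores_b, ac_select, theta):
    """Pick a synset pair with RP, falling back to AC below the threshold.

    rp_scores_a / rp_scores_b: dict mapping candidate synset id -> rp score.
    ac_select: zero-argument callable returning the (A id, B id) chosen by AC.
    Returns ((A id, B id), name of the algorithm that made the choice).
    """
    best_a = max(rp_scores_a, key=rp_scores_a.get)
    best_b = max(rp_scores_b, key=rp_scores_b.get)
    # Assumption: one weak side is enough to discard the RP choice entirely.
    if rp_scores_a[best_a] < theta or rp_scores_b[best_b] < theta:
        return ac_select(), "AC"
    return (best_a, best_b), "RP"


# Hypothetical scores, for illustration only.
rp_a = {"A1": 0.5, "A2": 0.8}
rp_b = {"B1": 0.9, "B2": 0.2}
print(rp_plus_ac(rp_a, rp_b, lambda: ("A2", "B2"), theta=0.6))
print(rp_plus_ac(rp_a, rp_b, lambda: ("A2", "B2"), theta=0.85))
```

Passing AC as a callable means the more expensive cosine computation only runs when RP fails.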
Number of Triples (NT): Pairs of candidate synsets, Ai ∈ A and Bj ∈ B, are
scored according to the number of tb-triples of type R, present in N, between any
of their terms. In other words, the pair that maximises nt(Ai, Bj) is selected:
$$
nt(A_i, B_j) = \frac{\sum_{k=1}^{|A_i|} \sum_{l=1}^{|B_j|} \big[ E(a_{ik}, b_{jl}, R) \in E \big]}{\log_2(|A_i||B_j|)} \qquad (7.3)
$$
As it is easier to find tb-triples between terms in larger synsets, this expression
takes the size of the synsets into account. However, in order to minimise the negative impact of
very large synsets, a logarithm is applied to the product of the synsets' sizes.
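Expression 7.3 can be sketched in Python as below. This is a minimal sketch under assumptions: the network is represented as a set of hypothetical `(term, relation, term)` tuples, and the function names are illustrative. The expression is undefined for two singleton synsets, because log2(1) = 0; returning the raw count in that case is an assumption made here, not something stated in the text. Setting `use_log=False` uses |Ai||Bj| as the denominator instead.

```python
import math


def nt(synset_a, synset_b, triples, relation, use_log=True):
    """Score a synset pair by the tb-triples of type `relation` between their terms."""
    # Count every (term of A, relation, term of B) combination found in the network.
    count = sum(
        1
        for term_a in synset_a
        for term_b in synset_b
        if (term_a, relation, term_b) in triples
    )
    size = len(synset_a) * len(synset_b)
    denominator = math.log2(size) if use_log else size
    if denominator == 0:
        # Assumption (not from the source): two singleton synsets give
        # log2(1) = 0, so the raw count is returned instead of dividing by zero.
        return float(count)
    return count / denominator


def select_by_nt(candidates_a, candidates_b, triples, relation):
    """Return the (A id, B id) pair that maximises nt."""
    return max(
        ((i, j) for i in candidates_a for j in candidates_b),
        key=lambda p: nt(candidates_a[p[0]], candidates_b[p[1]], triples, relation),
    )


# Hypothetical network and candidate synsets, for illustration only.
triples = {("a", "R1", "b"), ("c", "R1", "b"), ("a", "R1", "d")}
candidates_a = {"A1": ["a", "c"], "A2": ["g", "i"]}
candidates_b = {"B1": ["b", "d"], "B2": ["j", "l"]}
print(nt(["a", "c"], ["b", "d"], triples, "R1"))
print(select_by_nt(candidates_a, candidates_b, triples, "R1"))
```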
The NT ontologising algorithm is illustrated in figure 7.4, where it is used to
ontologise the tb-triple {a R1 b}, given the sample candidate synsets in figure 7.1
and the sample network in the same figure.²
² For the sake of clarity, we ignored the log2 in the denominator of the nt(Ai, Bj) expression,
and considered it to be just |Ai||Bj|.