24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...


7.1. Ontologising algorithms

Adjacency matrix M, for the network in figure 7.1:

      a  b  c  d  e  f  g  h  i  j  l
  a   1  1  0  0  0  1  1  1  0  0  0
  b   1  1  1  1  1  0  0  0  0  0  1
  c   0  1  1  0  0  0  1  1  0  0  0
  d   0  1  0  1  0  0  0  0  0  0  0
  e   0  1  0  0  1  1  0  0  0  0  1
  f   1  0  0  0  1  1  0  0  0  0  1
  g   1  0  1  0  0  0  1  0  1  0  0
  h   1  0  1  0  0  0  0  1  1  1  0
  i   0  0  0  0  0  0  1  1  1  0  0
  j   0  0  0  0  0  0  0  1  0  1  0
  l   0  1  0  0  1  1  0  0  0  0  1

Cosine values for each pair:

        B1                 B2                   B3
  A1    3.15/12 ≈ 0.26     1.80/12 ≈ 0.15       1.35/8 ≈ 0.17
  A2    2.44/9 ≈ 0.27      3.54/9 ≈ 0.39 *      1.59/6 ≈ 0.26
  A3    1.21/9 ≈ 0.13      0.81/9 ≈ 0.09        0.37/6 ≈ 0.06
  A4    5.48/18 ≈ 0.30     3.16/18 ≈ 0.18       1.96/12 ≈ 0.16

max(sim(Ai, Bj)) ≈ 0.39 → resulting sb-triple = {A2 R1 B2}

Figure 7.3: Using AC to select the suitable synsets for ontologising {a R1 b}, given the candidate synsets and the network N in figure 7.1.
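The cosine computation behind figure 7.3 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: it takes the rows of M above as term vectors and averages the pairwise cosines over |Ai||Bj|, which is consistent with the denominators shown in the figure. The candidate synsets of figure 7.1 are not reproduced here, so the two synsets in the usage lines are hypothetical, and the function names are our own.

```python
import math

# Rows of the adjacency matrix M above; columns follow the same term order.
TERMS = list("abcdefghijl")
ROWS = [
    "11000111000",  # a
    "11111000001",  # b
    "01100011000",  # c
    "01010000000",  # d
    "01001100001",  # e
    "10001100001",  # f
    "10100010100",  # g
    "10100001110",  # h
    "00000011100",  # i
    "00000001010",  # j
    "01001100001",  # l
]
M = {term: [int(ch) for ch in row] for term, row in zip(TERMS, ROWS)}


def cosine(u, v):
    """Cosine of the angle between two adjacency vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def average_cosine(synset_a, synset_b):
    """Sum of pairwise term cosines divided by |Ai||Bj| (assumed averaging)."""
    total = sum(cosine(M[s], M[t]) for s in synset_a for t in synset_b)
    return total / (len(synset_a) * len(synset_b))


# Hypothetical candidate synsets, for illustration only.
A_x = ["a", "c"]
B_y = ["b", "d"]
print(round(average_cosine(A_x, B_y), 2))
```

The pair of candidate synsets with the highest average cosine would then be chosen, as in the last line of the figure.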

Related Proportion + Average Cosine (RP+AC): This algorithm combines RP and AC. If RP cannot select a suitable synset for a or b, because one or both of the selected synsets have rp < θ, where θ is a predefined threshold, AC is used instead.
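The fallback logic of RP+AC can be sketched as follows. This is only an illustration under assumptions: the function name and its arguments are hypothetical, the rp scores are taken as already computed by RP, and the sketch assumes that AC re-selects the whole pair of synsets when either rp score is below θ, which the description above does not spell out.

```python
def select_rp_ac(rp_choice, ac_choice, theta):
    """Choose a synset pair with RP, falling back to AC below the threshold.

    rp_choice: ((synset_a, rp_a), (synset_b, rp_b)), the RP selection and
               its scores (hypothetical input format).
    ac_choice: zero-argument callable returning the (synset_a, synset_b)
               pair selected by AC; only called when needed.
    theta:     the rp threshold.
    """
    (synset_a, rp_a), (synset_b, rp_b) = rp_choice
    if rp_a < theta or rp_b < theta:
        # Assumption: AC replaces the whole pair, not just the weak side.
        return ac_choice()
    return synset_a, synset_b


# Illustrative values only.
rp_result = (("A1", 0.6), ("B1", 0.7))
print(select_rp_ac(rp_result, lambda: ("A2", "B2"), theta=0.5))   # RP is kept
print(select_rp_ac(rp_result, lambda: ("A2", "B2"), theta=0.65))  # AC is used
```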

Number of Triples (NT): Pairs of candidate synsets, Ai ∈ A and Bj ∈ B, are scored according to the number of tb-triples of type R, present in N, between any of their terms. In other words, the pair that maximises nt(Ai, Bj) is selected:

$$nt(A_i, B_j) = \frac{\displaystyle\sum_{k=1}^{|A_i|} \sum_{l=1}^{|B_j|} E(a_{ik}, b_{jl}, R) \in E}{\log_2(|A_i||B_j|)} \qquad (7.3)$$

As it is easier to find tb-triples between terms in larger synsets, this expression takes the size of the synsets into account. However, in order to minimise the negative impact of very large synsets, a logarithm is applied to the product of the synsets' sizes.
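As a rough illustration of expression 7.3, the sketch below counts the tb-triples of a given type that hold between the terms of two synsets and divides by the logarithm of the product of their sizes. The edge set and synsets are hypothetical, triples are assumed to be ordered (a, R, b) tuples, and the `use_log` switch is our own addition for comparing against the plain product |Ai||Bj| as denominator. Since log2(1) = 0, the expression is undefined for two single-term synsets; how that case should be handled is not stated here, so the sketch simply raises an error.

```python
import math


def nt(synset_a, synset_b, relation, edges, use_log=True):
    """Score a synset pair by the tb-triples of type `relation` between them."""
    # Number of (a_ik, R, b_jl) triples that are present in the network.
    count = sum((a, relation, b) in edges for a in synset_a for b in synset_b)
    size = len(synset_a) * len(synset_b)
    denominator = math.log2(size) if use_log else size
    if denominator == 0:
        # log2(1) == 0: both synsets have one term; behaviour not specified.
        raise ValueError("nt is undefined when |Ai||Bj| = 1 and use_log=True")
    return count / denominator


# Hypothetical network edges and candidate synsets, for illustration only.
edges = {("a", "R1", "b"), ("a", "R1", "d"), ("c", "R1", "b")}
A_x, B_y = ["a", "c"], ["b", "d"]

print(nt(A_x, B_y, "R1", edges))                 # 3 / log2(4) = 1.5
print(nt(A_x, B_y, "R1", edges, use_log=False))  # 3 / 4 = 0.75
```

With the logarithm, a pair of large synsets is penalised far less than with the plain product, which matches the motivation given above.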

The NT ontologising algorithm is illustrated in figure 7.4, where it is used to ontologise the tb-triple {a R1 b}, given the sample candidate synsets and the sample network in figure 7.1.²

² For the sake of clarity, we ignored the log2 in the denominator of the nt(Ai, Bj) expression and considered it to be just |Ai||Bj|.
