27.03.2014 Views

SEKE 2012 Proceedings - Knowledge Systems Institute


    if (type(c) = "class") then
      add c to groups_cls
    else if (type(c) = "property") then
      add c to groups_props
    end if
  end if
end for
classify(groups_cls)
classify(groups_props)

The generic classification phase classify(group_in) is the following:

// classify(group_in)
if (size(group_in) > 1) then
  remove concept c_r from group_in
  for (c ∈ group_in) do
    if (c_r ∈ O_i and c ∈ O_j with i ≠ j) then
      if (type(c_r) = "class" and type(c) = "class") then
        index = calculateIndexSimilarityClasses(c_r, c)
      else if (type(c_r) = "property" and type(c) = "property") then
        index = calculateIndexSimilarityProperties(c_r, c)
      end if
      if (index > threshold) then
        add mapping found between c_r and c
      end if
    end if
  end for
  classify(group_in)
end if
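The classification phase above can be sketched in Python as follows. This is only an illustrative sketch: the `Concept` record, its field names and the two similarity callables are hypothetical stand-ins for the paper's concepts and for calculateIndexSimilarityClasses / calculateIndexSimilarityProperties, and the recursion of the pseudocode is written as an equivalent loop.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Concept:
    # Hypothetical record: label, kind ("class" or "property"), source ontology.
    name: str
    kind: str
    ontology: str


Similarity = Callable[[Concept, Concept], float]


def classify(group: List[Concept],
             sim_classes: Similarity,
             sim_properties: Similarity,
             threshold: float) -> List[Tuple[Concept, Concept]]:
    """Compare each concept with the remaining ones and collect mappings."""
    mappings = []
    remaining = list(group)
    # Each pass mirrors one recursive call of classify(group_in).
    while len(remaining) > 1:
        c_r = remaining.pop(0)  # remove the reference concept from the group
        for c in remaining:
            if c_r.ontology == c.ontology:
                continue  # only concepts from different ontologies are compared
            if c_r.kind == "class" and c.kind == "class":
                index = sim_classes(c_r, c)
            elif c_r.kind == "property" and c.kind == "property":
                index = sim_properties(c_r, c)
            else:
                continue  # mixed kinds are not compared
            if index > threshold:
                mappings.append((c_r, c))
    return mappings
```

With a toy similarity that returns 1.0 for equal lower-cased names, a group containing "Author" from one ontology and "author" and "Writer" from another yields a single mapping between the first two concepts.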

According to the proposed approach, the mapping process is a classification problem: it classifies the similarities among the classes, the properties and the relationships, and then creates a new ontology that is a common layer representing a shared view of the various ontologies. As noted earlier, the literature offers various approaches ([3]-[11]) for finding the semantically similar components in the ontologies. The adopted functions are the following:

Editing Distance (ED): This function is defined as follows:

$$sim_{ed}(x, y) = \max\left(0, \frac{\min(|x|, |y|) - ed(x, y)}{\min(|x|, |y|)}\right) \in [0,1]$$

It aims to measure the likeness between the words that label concepts in the ontology. In particular, it compares the syntactic structure of the words and, by means of the "ed" function, counts the characters that are in the same position in the words x and y. A value of 1 means that the two words are similar.
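A minimal sketch of this measure follows. It assumes that "ed" is the standard Levenshtein distance (the text describes it only informally) and that the similarity has the form max(0, (min(|x|, |y|) - ed(x, y)) / min(|x|, |y|)). The function names and the handling of empty strings are assumptions made for the example.

```python
def edit_distance(x: str, y: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]
        for j, cy in enumerate(y, start=1):
            cost = 0 if cx == cy else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sim_ed(x: str, y: str) -> float:
    """Edit-distance similarity in [0, 1]; 1 means identical labels."""
    shortest = min(len(x), len(y))
    if shortest == 0:
        # Assumption: the formula is undefined for empty labels.
        return 1.0 if x == y else 0.0
    return max(0.0, (shortest - edit_distance(x, y)) / shortest)
```

For instance, "kitten" and "sitting" are three edits apart and the shorter word has six characters, so the sketch returns (6 - 3) / 6 = 0.5.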

Trigram Function (TF): This function aims to measure the number of similar trigrams in the words that label the concepts in the ontologies.

$$TF(x, y) = \frac{1}{1 + |tri(x)| + |tri(y)| - 2 \cdot |tri(x) \cap tri(y)|} \in [0,1]$$

The function tri(x) gives the set of trigrams in the word x. A value of 1 means that the two words are similar.
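The trigram function can be illustrated with a short sketch. It assumes the form TF(x, y) = 1 / (1 + |tri(x)| + |tri(y)| - 2 |tri(x) ∩ tri(y)|) and takes tri(x) to be the set of three-character substrings of x; the function names are chosen for the example.

```python
def tri(word: str) -> set:
    """Set of all three-character substrings (trigrams) of the word."""
    return {word[i:i + 3] for i in range(len(word) - 2)}


def tf(x: str, y: str) -> float:
    """Trigram similarity in (0, 1]; 1 means the trigram sets coincide."""
    tx, ty = tri(x), tri(y)
    # Words shorter than three characters have no trigrams, so two such
    # words always score 1 under this formula.
    return 1.0 / (1 + len(tx) + len(ty) - 2 * len(tx & ty))
```

For "author" (4 trigrams) and "authors" (5 trigrams, 4 of them shared) the sketch gives 1 / (1 + 4 + 5 - 8) = 0.5.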

Semantic similarity index (SS): This index is defined as follows:

$$SS(w_1, w_2) = \frac{1}{sim_{jc}(w_1, w_2)} \in [0,1]$$

where

$$sim_{jc}(w_1, w_2) = 2 \log P[LSuper(c_1, c_2)] - [\log P(c_1) + \log P(c_2)]$$

This index aims to compare two words from a semantic point of view, measuring their distance in the taxonomy defined in WordNet [16]. A value of 1 means that the two words are similar.
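As a worked illustration, the sketch below evaluates the index from concept probabilities supplied by the caller. The probabilities in the usage note are invented for the example, natural logarithms are assumed, and both the treatment of a zero distance and the clamping of the result to 1 are assumptions added so that the value stays in the stated range [0,1].

```python
import math


def sim_jc(p_c1: float, p_c2: float, p_lsuper: float) -> float:
    """2*log P[LSuper(c1, c2)] - [log P(c1) + log P(c2)]."""
    return 2 * math.log(p_lsuper) - (math.log(p_c1) + math.log(p_c2))


def ss(p_c1: float, p_c2: float, p_lsuper: float) -> float:
    """Semantic similarity index SS = 1 / sim_jc, kept within [0, 1]."""
    distance = sim_jc(p_c1, p_c2, p_lsuper)
    if distance <= 0:
        # Assumption: a zero distance (same concept) counts as similarity 1.
        return 1.0
    # Assumption: clamp, since 1/distance exceeds 1 when distance < 1.
    return min(1.0, 1.0 / distance)
```

With the hypothetical values P(c1) = P(c2) = 0.01 and P[LSuper(c1, c2)] = 0.1, the distance is 2 log 10, about 4.605, and the index is about 0.217.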

Granularity (GR): This index measures the mutual position of the words representing the concepts of the ontology in the WordNet taxonomy. This index is defined as follows:

$$GR(c_1, c_2) = \frac{\min[dens(c_1) \cdot path(c_1, p),\ dens(c_2) \cdot path(c_2, p)]}{\max[dens(c_1) \cdot path(c_1, p),\ dens(c_2) \cdot path(c_2, p)]} \in [0,1]$$

where dens(c) is the function representing the density of the concept c. This function is defined as E(c)/E, where E is the ratio between the number of arcs of the concept and the number of its parents, while E(c) is the number of siblings of the concept c. The function path(c_1, p) is the shortest path from c_1 to p, the first parent that c_1 has in common with c_2.
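The ratio can be sketched as below. Computing dens and path requires the WordNet taxonomy, so this example simply takes their values as inputs; the parameter names and the handling of the case where both products are zero are assumptions.

```python
def granularity(dens_c1: float, path_c1: float,
                dens_c2: float, path_c2: float) -> float:
    """GR = min(d1*p1, d2*p2) / max(d1*p1, d2*p2), a value in [0, 1]."""
    a = dens_c1 * path_c1
    b = dens_c2 * path_c2
    largest = max(a, b)
    if largest == 0:
        # Assumption: both concepts coincide with the common parent.
        return 1.0
    return min(a, b) / largest
```

For example, with densities 0.5 and 0.25 and a path length of 2 for both concepts, the products are 1.0 and 0.5, so the index is 0.5.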

Attribute Index: This index aims to measure the number of similar attributes between two nodes. In particular, it is defined as follows:

$$sim_{att} = \frac{|X \cap Y|}{|X \cap Y| + \alpha(x, y) \cdot |X \setminus Y| + (1 - \alpha(x, y)) \cdot |Y \setminus X|} \in [0,1]$$

where $\alpha \in [0,1]$ is a parameter and X and Y are the attributes related to the two compared nodes. A value of 1 means that X and Y are similar.
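A small sketch of the attribute index follows, assuming it has the ratio form |X ∩ Y| / (|X ∩ Y| + α |X \ Y| + (1 - α) |Y \ X|) with X and Y given as sets of attribute names. Treating α as a constant and returning 0 when both sets are empty are assumptions made for the example.

```python
def sim_att(x: set, y: set, alpha: float = 0.5) -> float:
    """Attribute similarity in [0, 1]; 1 means identical attribute sets."""
    common = len(x & y)
    denominator = common + alpha * len(x - y) + (1 - alpha) * len(y - x)
    if denominator == 0:
        # Assumption: two nodes without attributes give no evidence.
        return 0.0
    return common / denominator
```

With X = {name, age}, Y = {name, email} and α = 0.5, one attribute is shared and one is exclusive to each side, so the sketch returns 1 / (1 + 0.5 + 0.5) = 0.5.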

Synonym Index (SI): This index aims to verify whether WordNet contains synonyms of the word labelling a concept in one ontology that label a concept in another ontology. This index can assume value 0 (no synonym) or 1 (synonym).

Derived Index (DE): This index aims to find in WordNet an adjective, representing a node of one ontology, that is derived from the label of a concept in the other ontology. This index can assume value 0 (not derived) or 1 (derived).

Property Similarity Index (ISP): This index has the aim to verify the equality between the nodes evaluating their
