SEKE 2012 Proceedings - Knowledge Systems Institute


III. GRAPH REPRESENTATION OF CLASS DIAGRAMS

UML class diagrams can be converted to labeled directed graphs in which the classes are represented by nodes and the relationships between the classes are represented by edges. Each edge carries a label specifying the type of relationship it represents: dependency, generalization, association, and so on. With this representation in mind, the problem of matching a query class diagram to another class diagram in the repository becomes one of graph matching. In particular, since the graphs to be compared usually have different numbers of nodes and edges, the problem is referred to as inexact graph matching [11]. Fig. 1 shows how a class diagram is converted to a directed graph. An adjacency matrix representation of the graph is shown in Table I. Rather than containing zeros and ones, the entries of the matrix show the types of class relationships represented by the edges of the graph.
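As an illustration of this representation, the following minimal sketch builds the labeled adjacency matrix for the class diagram of Fig. 1. It assumes that the row index is the source node and the column index is the target node, which is how Table I appears to read; the names `CLASSES`, `EDGES` and `build_adjacency` are hypothetical and are not part of the original work.

```python
# Node numbering follows Figure 1: node 1 is Customer, node 2 is Order, etc.
CLASSES = ["Customer", "Order", "Corporate Customer", "Personal Customer", "Item"]

# (source node, target node, relationship type), 1-based as in Table I.
# Assumption: the row of Table I is the source and the column is the target.
EDGES = [
    (2, 1, "Association"),
    (3, 1, "Generalization"),
    (4, 1, "Generalization"),
    (5, 2, "Aggregation"),
]


def build_adjacency(n, edges):
    """Return an n x n matrix of relationship labels; 'None' means no edge."""
    adj = [["None"] * n for _ in range(n)]
    for src, dst, rel in edges:
        adj[src - 1][dst - 1] = rel  # convert 1-based node ids to 0-based indices
    return adj


ADJ = build_adjacency(len(CLASSES), EDGES)

for node_id, row in enumerate(ADJ, start=1):
    print(node_id, row)
```

Storing the relationship label itself, rather than a 0/1 flag, is what later allows two edges of different types to be compared by degree instead of being treated as a plain mismatch.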

IV. SIMILARITY METRIC

We propose a similarity metric composed of two parts: name (semantic) similarity and structure (topology) similarity [4], [12]. Name similarity measures the semantic relatedness of the concepts (classes) in the class diagrams to be compared, while structure similarity measures how closely the relationships between classes match one another. In other words, name similarity determines how the corresponding nodes of two graphs are related, while structure similarity measures the level of similarity between corresponding edges [13], [9].

A. Name Similarity

One way of measuring the semantic relatedness of class names is to use a domain ontology [4]. Another possibility is to utilize a lexical database such as WordNet.

Figure 1. A class diagram and its corresponding directed graph. Nodes 1, 2, 3, 4 and 5 in the graph represent the Customer, Order, Corporate Customer, Personal Customer and Item classes, respectively.

TABLE I. ADJACENCY MATRIX

    1               2            3     4     5
1   None            None         None  None  None
2   Association     None         None  None  None
3   Generalization  None         None  None  None
4   Generalization  None         None  None  None
5   None            Aggregation  None  None  None

We propose using WordNet, as is done in [3]. The Name Similarity of two class diagrams A and B having an equal number of classes is given in (1).

$$NS(A, B) = \frac{\sum_{i=1}^{n} cns(A_i, B_i)}{n} \tag{1}$$

$A_i$ is the name of the i-th class in class diagram A, $B_i$ is the name of the i-th class in class diagram B, and n is the number of classes contained in each of the two diagrams. cns (Class Name Similarity) is a function that returns a semantic relatedness value between zero and one, where zero denotes maximal relatedness and one denotes complete unrelatedness of the concepts. The division by n in (1) ensures that the value of name similarity (NS) always lies between 0 and 1. Because NS is the mean of the cns values, lower values of NS indicate more closely related diagrams.
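The computation in (1) can be sketched as follows. This is only an illustration: the `cns` function here is a toy lookup table standing in for a WordNet-based measure, and the scores in `TOY_SCORES`, the class names, and the function names are all hypothetical. The sketch keeps the convention stated above, in which 0 means maximally related and 1 means unrelated.

```python
# Hypothetical relatedness scores standing in for a WordNet-based measure.
# Following the text, 0.0 means maximally related and 1.0 means unrelated.
TOY_SCORES = {
    frozenset(["Customer", "Client"]): 0.1,
    frozenset(["Order", "Purchase"]): 0.2,
    frozenset(["Item", "Product"]): 0.3,
}


def cns(name_a, name_b):
    """Toy class-name score in [0, 1]; identical names score 0.0."""
    if name_a.lower() == name_b.lower():
        return 0.0
    # Pairs missing from the toy table are treated as unrelated (assumption).
    return TOY_SCORES.get(frozenset([name_a, name_b]), 1.0)


def name_similarity(names_a, names_b):
    """Equation (1): the mean of cns over corresponding class names."""
    if len(names_a) != len(names_b):
        raise ValueError("both diagrams must have the same number of classes")
    if not names_a:
        raise ValueError("diagrams must contain at least one class")
    return sum(cns(a, b) for a, b in zip(names_a, names_b)) / len(names_a)


A = ["Customer", "Order", "Item", "Invoice"]
B = ["Client", "Purchase", "Product", "Teacher"]
NS = name_similarity(A, B)
print(round(NS, 3))
```

The sketch pairs the i-th class of A with the i-th class of B, exactly as (1) does, so the result depends on how the classes of the two diagrams are ordered.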

B. Structure Similarity

In order to measure the structure similarity between two adjacency matrices, we define a square matrix Diff whose entries represent the level of dissimilarity between the various types of class relationships. The (i, j)th entry of the matrix is a measure of the dissimilarity between the i-th type of relationship and the j-th type of relationship. A value of 1 indicates that the two relationships are extremely dissimilar, while 0 indicates that the relationships are the same (hence the diagonal entries of the matrix are all zeros). The entries of this matrix can be filled by gathering information from UML experts, or by applying an ontology as in [4]. Table II (adapted from [4]) shows a sample difference matrix (Diff). Since the main objective of retrieving class diagrams is to reuse them, the entries in Diff should be proportional to the amount of effort required to convert one type of relationship to another after retrieving a class diagram from the repository. The last row, labeled ‘None’, shows the level of dissimilarity between having no relationship between two classes (that is, no edge connecting the vertices) and having a relationship between the two classes.
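A Diff matrix of this kind can be held as a nested lookup keyed by relationship type. The sketch below is illustrative only: the numeric values are hypothetical placeholders and are not the entries of Table II, the matrix is assumed to be symmetric, and any pair not listed (including every pair involving 'None') is assumed to be maximally dissimilar.

```python
# Relationship types used as row and column labels; "None" means no edge.
REL_TYPES = ["Association", "Aggregation", "Generalization", "Dependency", "None"]

# Hypothetical dissimilarity values; these are NOT the values of Table II.
HYPOTHETICAL_UPPER = {
    ("Association", "Aggregation"): 0.2,
    ("Association", "Generalization"): 0.7,
    ("Association", "Dependency"): 0.5,
    ("Aggregation", "Generalization"): 0.8,
    ("Aggregation", "Dependency"): 0.6,
    ("Generalization", "Dependency"): 0.7,
}


def build_diff(rel_types, upper):
    """Build a full Diff lookup with a zero diagonal.

    Assumptions: the matrix is symmetric, and pairs that are not listed
    default to 1.0 (extremely dissimilar).
    """
    diff = {a: {b: (0.0 if a == b else 1.0) for b in rel_types} for a in rel_types}
    for (a, b), value in upper.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError("Diff entries must lie in [0, 1]")
        diff[a][b] = value
        diff[b][a] = value  # mirror the entry to keep the matrix symmetric
    return diff


DIFF = build_diff(REL_TYPES, HYPOTHETICAL_UPPER)
print(DIFF["Aggregation"]["Association"])
```

If the effort of converting one relationship type into another differs by direction, the symmetry assumption would have to be dropped and both triangles filled in separately.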

Let A and B be two directed graphs (representing class diagrams), each having n nodes. In addition, let the n x n matrices AdjA and AdjB be the adjacency matrices of A and B, respectively. The Structure Similarity (SS) between A and B is computed as shown in (2), where $n_m$ is the number of times the edges in both graphs match exactly and $n_u$ is the number of times the edges do not match.

$$SS(A, B) = \frac{\sum_{i,j} Diff(AdjA(i, j), AdjB(i, j))}{n_m + n_u} \tag{2}$$

The overall similarity metric between two class diagrams represented by graphs A and B is a weighted sum of the Name Similarity and Structure Similarity, as shown in (3). $\alpha$ is a value between 0 and 1 that determines the relative importance of SS and NS.

$$S(A, B) = \alpha \cdot SS(A, B) + (1 - \alpha) \cdot NS(A, B) \tag{3}$$

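The structure similarity in (2) and the weighted sum in (3) can be sketched together as follows. Several points are assumptions rather than statements from the text: positions where neither graph has an edge are skipped, so that the matched and unmatched counts only cover positions where at least one graph has an edge; the Diff values are hypothetical; and the weight is called `alpha` here. The two 3-node graphs and the name similarity value 0.4 are made-up inputs.

```python
NO_EDGE = "None"

# Hypothetical symmetric Diff values; these are NOT the entries of Table II.
DIFF = {
    frozenset(["Association", "Aggregation"]): 0.3,
    frozenset(["Association", "Generalization"]): 0.8,
    frozenset(["Aggregation", "Generalization"]): 0.8,
    frozenset(["Association", NO_EDGE]): 1.0,
    frozenset(["Aggregation", NO_EDGE]): 1.0,
    frozenset(["Generalization", NO_EDGE]): 1.0,
}


def diff(rel_a, rel_b):
    """Dissimilarity of two relationship labels; identical labels give 0.0."""
    if rel_a == rel_b:
        return 0.0
    return DIFF[frozenset([rel_a, rel_b])]


def structure_similarity(adj_a, adj_b):
    """Equation (2): summed Diff over all positions, divided by n_m + n_u."""
    total, n_m, n_u = 0.0, 0, 0
    for row_a, row_b in zip(adj_a, adj_b):
        for rel_a, rel_b in zip(row_a, row_b):
            if rel_a == NO_EDGE and rel_b == NO_EDGE:
                continue  # assumption: no edge in either graph is not counted
            total += diff(rel_a, rel_b)
            if rel_a == rel_b:
                n_m += 1  # edges match exactly
            else:
                n_u += 1  # edges do not match
    return total / (n_m + n_u) if (n_m + n_u) else 0.0


def overall_similarity(ss, ns, alpha):
    """Equation (3): weighted sum of structure and name similarity."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * ss + (1 - alpha) * ns


ADJ_A = [
    ["None", "None", "None"],
    ["Association", "None", "None"],
    ["Generalization", "None", "None"],
]
ADJ_B = [
    ["None", "None", "None"],
    ["Aggregation", "None", "None"],
    ["Generalization", "None", "None"],
]

SS = structure_similarity(ADJ_A, ADJ_B)
S = overall_similarity(SS, ns=0.4, alpha=0.5)
print(round(SS, 3), round(S, 3))
```

In this toy input one edge pair matches exactly and one differs in type, so the mismatch contributes its Diff value while the match contributes zero. Setting `alpha` to 1 makes the score depend only on structure, and setting it to 0 makes it depend only on names.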