
only RelFinder was really comparable, because it extracts a DBpedia subgraph rather than a single instance.

Table 5 shows the results of this comparison, which was carried out according to the following criteria:

- Experiments were performed with 4 test sets, each composed of 10 LOs selected randomly from Universia (precision and recall are reported per set; a sketch of how they can be computed follows this list).
- The same number of solutions was compared. Specifically, we limited the comparison to the number of solutions returned by RelFinder so as not to bias the results.
- A depth of two levels was considered, which is the worst-case scenario for our algorithm, as Figure 4 shows.
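A minimal sketch of how per-set precision and recall could be computed follows. The data layout and the existence of a per-LO gold standard are assumptions on our part; the paper does not describe its evaluation harness:

```python
def precision_recall(returned, relevant):
    """Per-test-set precision and recall over DBpedia resources.

    returned: set of resources produced by the algorithm for the set's LOs
    relevant: set of resources judged relevant (hypothetical gold standard)
    """
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```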

Figure 4. Precision/Recall curves for the terms extracted from 15 courses in a wide range of domains, such as geography, history, or social sciences

Table 5. Comparison of RelFinder with our filtering algorithm for 40 LOs contained in Universia

                Measure      Set 1    Set 2    Set 3    Set 4
RelFinder       Precision    0.3626   0.3692   0.1897   0.2464
                Recall       0.2654   0.3708   0.2762   0.5142
                F1-Score     0.2534   0.3700   0.2249   0.3331
                F0.5-Score   0.3379   0.3696   0.2023   0.2751
Our solution    Precision    0.3900   0.3779   0.2754   0.2800
                Recall       0.7444   0.7269   0.6695   0.6808
                F1-Score     0.5119   0.4996   0.3903   0.3968
                F0.5-Score   0.4311   0.4187   0.3122   0.3174

As shown in Table 5, our solution clearly outperforms RelFinder, obtaining better precision and recall for all the test sets. Note that these results were obtained for complete recall, that is, a recall of 1.0, since RelFinder does not rank its solutions. We also considered another typical comparison measure, the F-score, which is a measure of a test's accuracy:

$$F_\beta = (1 + \beta^2) \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\beta^2 \cdot \mathrm{precision} + \mathrm{recall}} \qquad (11)$$

As shown in (11), the F-score $F_\beta$ only considers the precision and the recall to compute the score. In our tests we used (i) the balanced version of the F-score, which sets $\beta = 1$ and corresponds to the harmonic mean of precision and recall, and (ii) an F-score with $\beta = 0.5$, which emphasizes precision over recall.
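The scores in Table 5 can be reproduced from the precision and recall rows. The following sketch (an illustration for checking, not the authors' evaluation code) verifies Set 1 of our solution:

```python
def f_beta(precision, recall, beta):
    """F-beta score as defined in (11)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Set 1 of "Our solution" in Table 5
p, r = 0.3900, 0.7444
print(round(f_beta(p, r, 1.0), 4))  # 0.5118 (Table 5: 0.5119; gap is rounding of p and r)
print(round(f_beta(p, r, 0.5), 4))  # 0.4310 (Table 5: 0.4311; same rounding effect)
```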

The results of the experiment for this metric are also shown in Table 5, and it should be remarked that our algorithm behaves better than RelFinder both when precision is emphasized (F0.5-score) and when precision and recall are balanced (F1-score).

Notice also the effect of the exploration depth on the quality of the results: the deeper the exploration, the better the precision. For instance, there is a qualitative step forward between explorations of two and three levels, as Figure 4 shows.

