IADIS International Conference WWW/Internet 2010

3.3 Evaluation

As mentioned above, the instance recommendation is mainly intended to keep an ontology up to date in a domain of a dynamic nature. However, we took a relatively static domain for this experiment in order to measure the precision and the recall accurately. We focused on a world ranking of car manufacturers (2007) and took the top 50 car manufacturers (and their spin-off brands), 76 in total, as the correct set for experiment 1, excluding Chinese and OTHER manufacturers. This is because the current Chinese manufacturers, i.e., that part of the correct set, appear to have changed since the 2007 data, and show fluctuations of description and many spin-off brands that we could not investigate. In experiment 2, 16 Japanese manufacturers and their brands in total are taken as the correct set. In the experiments, we regarded any description other than those in the correct set as incorrect. In the original use, the instance recommendation leaves the final selection of which extracted terms are added to the ontology to the user, but in the experiments the function automatically adds every extracted term except for overlaps in order to confirm the accuracy. In the following sections, we evaluate how far we can raise the precision and the recall using the NEE and the adaptation filter.

Recall = Registered correct instances / All correct instances
Precision = Registered correct instances / All registered instances (including irrelevant ones)

3.3.1 Experiment 1 (Named Entity Extraction)

In experiment 1, we evaluated how effectively the NEE increases the instances relative to the number of operations, using the 76 world car manufacturers. As the first seeds we registered {"Nissan", "Honda"} to the Car class, then took the next instance "Toyota" according to the listed order of the correct set, skipping overlaps. The NEE is then executed with the latest registered seeds {"Nissan", "Honda", "Toyota"} and returns the candidate instances Se, all of which are registered to the class except for overlaps. The experiment terminates when all the correct instances have been registered by repeating the above. The evaluation metrics are as follows.

Recall = (3 + Δ) / 76
Precision = (3 + Δ) / (3 + Δ + irrelevant instances), where Δ is the number of correct instances registered so far

The result is shown in Fig. 6. In terms of recall, since simple input means manually registering the correct instances one by one, its recall rose in proportion to the number of operations and reached 100% at the 73rd operation. The NEE, on the other hand, extracted a large number of instance candidates at an early stage, so its recall rose quickly and reached 71.1% at the second operation. After that, however, the extracted terms began to overlap with the registered ones, progress slowed, and the recall finally reached 100% at the 24th operation. Consequently, we confirmed that the NEE increases the instances effectively with a small number of operations.

In terms of precision, the NEE remained at 60.3% even at the last operation because of the irrelevant terms extracted by Named Entity Extraction, whereas simple input always kept 100%.

Figure 6. Result of experiment 1 (recall and precision versus the number of operations, for simple input and the NEE)
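To make the procedure of experiment 1 concrete, the following Python sketch (not from the paper) simulates one run of the evaluation loop. The function stub_nee is a hypothetical placeholder for the Named Entity Extraction component; the names run_experiment1 and stub_nee are ours, and the default stub returns no candidates, which reduces the run to the simple-input baseline. Recall and precision are computed per operation following the formulas above.

# Illustrative sketch of the experiment-1 loop (not the authors' code).
# "stub_nee" is a hypothetical stand-in for the Named Entity Extraction step.

def stub_nee(seeds):
    # Placeholder: a real NEE would query the Web with the current seeds and
    # return candidate instance names (possibly including irrelevant ones).
    return set()

def run_experiment1(correct_set, first_seeds=("Nissan", "Honda"), nee=stub_nee):
    correct = list(correct_set)      # e.g. the 76 world car manufacturers, in listed order
    registered = list(first_seeds)   # instances registered to the Car class so far
    history = []                     # (operation, recall, precision) after each operation

    operation = 0
    while not all(c in registered for c in correct):
        operation += 1

        # One operation: register the next not-yet-registered correct instance ...
        next_seed = next(c for c in correct if c not in registered)
        registered.append(next_seed)

        # ... then run the NEE with the latest seeds and register every extracted
        # candidate except overlaps (no manual selection in the experiment).
        for candidate in nee(registered):
            if candidate not in registered:
                registered.append(candidate)

        # Recall = registered correct / all correct instances,
        # Precision = registered correct / all registered instances.
        correct_registered = sum(1 for r in registered if r in correct)
        history.append((operation,
                        correct_registered / len(correct),
                        correct_registered / len(registered)))

    return history

With the default stub, the recorded curve corresponds to manual registration one instance at a time; plugging a real extractor in for nee would yield recall and precision curves of the kind shown in Figure 6.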