12.07.2015 Views

file - ChaSen - Nara Institute of Science and Technology

4.3.7 Problems of answer annotation

Finally, I will point out some issues concerning the classification and tagging of descriptive answers.

First, I obtained relatively good agreement for the Definition, Order of time, Process, Instance, and Comparison description types, but the actual agreement was not high enough. I expect that tagging accuracy will improve as more detailed studies are conducted on each description type. However, a certain amount of fluctuation in tagging is unavoidable as long as tagging is performed by a group of non-professional annotators of varying skill levels. It is therefore necessary to review what mechanisms of agreement are possible and where the final answer should be sought, assuming that tagging fluctuates. There have been some studies of this type, albeit few in number [129].

Second, there is the issue of data sampling. The data set collected for this study contained only a few discourse types in some domains, and there have been few surveys on such bias. However, judging from the research on question types by Tamura et al. [141], a similar tendency can be expected in other Q&A articles of the same kind. Therefore, an essential issue for future data sampling is how to prepare a sufficient amount of data and remove the dependency on the specific domains of an experiment.

There is no question about the need for precise language resources. To obtain these, tagging by linguistic and language processing specialists will continue to be required in the future. However, once reliable grammar, rules, and lexical knowledge have been described and can be used continuously without major change, it will not be necessary to use tags with great fluctuations. Tagging by non-professionals can be applied in cases where dictionary generation is costly relative to performance requirements, or where the application is personal or belongs to a small project that cannot afford the cost of creating language resources. I think that professional and non-professional methods will complement each other.
