12.07.2015 Views

file - ChaSen - 奈良先端科学技術大学院大学



Table 5.1. Result from a Search Engine.

Keyword   Gathered   Retrieved   Valid Pages
tejun     3,713      748         629
houhou    5,998      916         929

provide users with the means to navigate accurately and credibly to information on the Web, but not to give a complete relevant document set with respect to user queries. In addition, a list is a summarization made by humans, and thus it is edited to make it easy to understand. Therefore, the restriction to itemized answers does not lose its effectiveness in my study.

In the initial step of my work for this type of QA, I discuss a text categorization task that divides a set of lists into two groups: procedural and non-procedural. First, I gathered web pages from a search engine, extracted the lists that include procedural expressions and are marked up with any HTML (HyperText Markup Language) list tag, and observed their characteristics. Then I examined Support Vector Machines (SVMs) and sequential pattern mining on the set of lists, and inspected the obtained model to find features useful for extracting answers that explain a relevant procedure.

5.2 Answering procedures with lists

I can easily imagine a situation in which people ask procedural questions, for instance a user who wants to know the procedure for installing the RedHat Linux OS. When using a web search engine, the user could employ keywords related to the domain, such as “RedHat” or “install,” or synonyms of “procedure,” such as “method” or “process.” As a result, the search engine will often return results that do not include the actual procedure, for instance pages that contain only lists of hyperlinks to URLs, or simple alternatives listed in no intentional order.

This thesis addresses this issue, taking the solution to be returning the actual procedure. In the initial step of this study, I focused on the case in which the continuous answer candidate passage is in the original text and furthermore
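
The list-extraction step described before Section 5.2 (collecting the lists that are marked up with HTML list tags in the gathered pages) can be illustrated with a minimal Python sketch. This is not the implementation used in this thesis. It assumes that only <ul> and <ol> elements count as lists and that nested lists are folded into their parent item; the names ListExtractor and extract_lists are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class ListExtractor(HTMLParser):
    """Collect each top-level <ul>/<ol> as (tag, [item text, ...]).

    Assumption for this sketch: only <ul> and <ol> are treated as list tags,
    and text inside nested lists is folded into the enclosing top-level item.
    """

    LIST_TAGS = {"ul", "ol"}

    def __init__(self):
        super().__init__()
        self.lists = []    # finished lists
        self._depth = 0    # nesting depth of list tags
        self._tag = None   # tag name of the current top-level list
        self._items = []   # items of the current top-level list
        self._buf = None   # text pieces of the current <li>; None when outside

    def handle_starttag(self, tag, attrs):
        if tag in self.LIST_TAGS:
            if self._depth == 0:
                self._tag, self._items = tag, []
            self._depth += 1
        elif tag == "li" and self._depth == 1:
            self._flush_item()  # tolerate a missing </li> before the next <li>
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "li" and self._depth == 1:
            self._flush_item()
        elif tag in self.LIST_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self._flush_item()
                self.lists.append((self._tag, self._items))

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def _flush_item(self):
        if self._buf is not None:
            # Join the text pieces and normalize runs of whitespace.
            text = " ".join(" ".join(self._buf).split())
            if text:
                self._items.append(text)
            self._buf = None


def extract_lists(html):
    """Return all top-level HTML lists found in the given page source."""
    parser = ListExtractor()
    parser.feed(html)
    parser.close()
    return parser.lists


if __name__ == "__main__":
    page = "<ol><li>Download the image</li><li>Boot the installer</li></ol>"
    for tag, items in extract_lists(page):
        print(tag, items)
```

Each extracted list keeps its tag name because ordered and unordered markup may itself be a useful feature when separating procedural from non-procedural lists; whether it actually helps is a question for the classifier, not something this sketch decides.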
