file - ChaSen - 奈良先端科学技術大学院大学
Step 1 Segment a question article into sentences. Each segment is terminated with a period “.”.
Step 2 Carry out chunking by article.
Step 3 Extract question segments as chunks, identify the question types, and output them in pairs.

The chunker divides a sequence of sentences into question segments and other chunks. A chunk tag is given to each sentence. The chunk tags used are of the five types explained in Section 3.4.1, namely IOB1, IOB2, IOE1, IOE2, and IOBES, together with the IO-tag, which does not distinguish the B/E/S tags from the others. Sentences not involved in the identification of question types are given the O-tag. Sentences that constitute a question segment are given a tag that combines one of the letters I, B, E, and S with one of the letters W and D (for example, I-W and B-D) to represent the position within the chunk and the question type. Figure 3.5 shows an example of the composition of chunks using the IOB-tags. The chunker learns a chunking model from pairs of sentences and their chunk tags, such as those in Figure 3.5.

To extract question segments from a query, sentence labeling, which assigns a chunk tag to each sentence, is performed first. Subsequently, sentences labeled with the same role, such as “-D” or “-W”, are chunked by post-processing. Consequently, a question segment is extracted as a chunk, and the question type is assigned to the question segment based on the label of the chunk.

3.4.3 Conditional random field

The CRF (Conditional Random Field) is a stochastic model for sets. Two random variables that represent the properties of a set are associated with each other through a conditional probability [Lafferty 2001]. The CRF supposes a random field that has the Markov property with respect to the elements of the set to be observed. The advantages of this are as follows: (1) There is no need to assume the independence of the random variables, as must be done in the Markov model; (2) Since the model is described with conditional random variables, the model parameters can be estimated without calculating the distribution of the random variables in the condition.
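The post-processing that turns sentence labels into question segments, described before Section 3.4.3, can be sketched roughly as follows. This is only an illustrative sketch, not the implementation used in this work: it assumes IOB2-style tags (every chunk starts with a B- tag), and the function name `extract_question_segments` and its signature are hypothetical.

```python
from typing import List, Optional, Tuple


def extract_question_segments(
    sentences: List[str], tags: List[str]
) -> List[Tuple[str, str]]:
    """Group tagged sentences into (question_type, segment_text) pairs.

    Assumption: tags are IOB2-style, i.e. "O" or "<B|I>-<type>",
    where <type> is a question-type letter such as "W" or "D".
    """
    if len(sentences) != len(tags):
        raise ValueError("sentences and tags must have the same length")

    segments: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current: List[str] = []

    def flush() -> None:
        # Close the chunk being built, if any, and reset the state.
        nonlocal current_type, current
        if current:
            segments.append((current_type, " ".join(current)))
        current_type, current = None, []

    for sentence, tag in zip(sentences, tags):
        if tag == "O":
            # Sentences outside any question segment end the current chunk.
            flush()
            continue
        position, _, qtype = tag.partition("-")
        # A B- tag, or a change of question type, starts a new chunk.
        if position == "B" or qtype != current_type:
            flush()
            current_type = qtype
        current.append(sentence)

    flush()
    return segments
```

For instance, six sentences tagged O, B-W, I-W, O, B-D, B-D would yield one W-type segment made of the second and third sentences, followed by two separate D-type segments of one sentence each. Handling the other tag sets (IOB1, IOE1, IOE2, IOBES, IO) would require different boundary rules and is not covered by this sketch.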