12.07.2015 Views

file - ChaSen - 奈良先端科学技術大学院大学

Table 3.6. Summary of Experimental Settings.

Features      Set1: uni-gram of POS (all words / content words)
              Set2: uni-gram + bi-gram of POS (all words)
              Set3: n POS at the end of the sentence (n = 1-5)
              Set4: n POS at the sentence head and end (n = 1-5)
Chunk tags    IO / IOB1 / IOB2 / IOE1 / IOE2 / IOBES
Window size   one, three and five sentences

end of sentence were exploited. The number n was varied from 1 to 5. When only uni-grams were used, two feature sets were tested: one containing only content words and one containing all words.

Figure 3.6 shows the format of the feature sets in the training and test data for CRF++. The data form a matrix of sentence features: each column corresponds to one feature, and each cell holds the value of that feature for the corresponding sentence. In this experiment the feature values are binary: a cell holds either the symbol representing the feature or a symbol indicating its absence.

In Figure 3.6, $w_1, w_2, \ldots, w_m$ denote the top m words in the frequency ranking of the dataset, and $w_{1,m+1}, w_{2,m+2}, \ldots, w_{7,m+n}$ denote the n words at the end of each sentence. The symbol 'nil' indicates that the feature does not occur in the sentence.

As the figure indicates, the feature columns can be divided into several groups, and some of these groups were used in combination. The sentences used as context for the sentence targeted for chunking can be selected in the same manner. Contexts in this experiment are considered only in units of sentences, so we use the notion of a "window" over a sequence of sentences as the context for chunking. The window size was varied from the target sentence alone, through three sentences (the target plus one preceding and one following sentence), to five sentences (the target plus two preceding and two following sentences). Table 3.6 summarizes these experimental conditions.
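The feature matrix and sentence window described above can be illustrated with a minimal sketch. It is not the preprocessing used in the thesis: the function names (`top_items`, `sentence_features`, `windowed_rows`), the toy data, and the choice to pad missing neighbours with 'nil' are all assumptions made for illustration. Each sentence is taken to be a list of tokens (for example POS tags), one row is produced per sentence, and the chunk tag is placed in the last column, which is where CRF++ expects the answer tag.

```python
from collections import Counter

NIL = "nil"  # symbol marking an absent feature, as in the text


def top_items(sentences, m):
    """Return the m most frequent tokens over all sentences."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return [tok for tok, _ in counts.most_common(m)]


def sentence_features(sent, vocab, n):
    """Binary columns for the frequent tokens, then n end-of-sentence columns.

    A frequent-token column holds the token itself if it occurs in the
    sentence and NIL otherwise. The last n tokens are left-padded with NIL
    when the sentence is shorter than n.
    """
    present = set(sent)
    frequent_cols = [tok if tok in present else NIL for tok in vocab]
    tail = list(sent[-n:]) if n > 0 else []
    tail_cols = [NIL] * (n - len(tail)) + tail
    return frequent_cols + tail_cols


def windowed_rows(sentences, labels, vocab, n, window=1):
    """Build one whitespace-separated row per sentence.

    window is 1, 3 or 5: the target sentence plus (window - 1) / 2
    sentences on each side. Neighbours outside the document are filled
    with NIL (an assumption of this sketch).
    """
    half = window // 2
    width = len(vocab) + n
    per_sentence = [sentence_features(s, vocab, n) for s in sentences]
    rows = []
    for i, label in enumerate(labels):
        cols = []
        for j in range(i - half, i + half + 1):
            if 0 <= j < len(sentences):
                cols.extend(per_sentence[j])
            else:
                cols.extend([NIL] * width)
        rows.append(" ".join(cols + [label]))
    return rows


# Toy example: three sentences given as POS sequences, with made-up chunk tags.
toy_sentences = [["NOUN", "VERB"], ["NOUN", "ADJ", "NOUN"], ["VERB"]]
toy_labels = ["B", "I", "O"]
toy_vocab = top_items(toy_sentences, m=2)
for row in windowed_rows(toy_sentences, toy_labels, toy_vocab, n=2, window=3):
    print(row)
```

Changing `window` from 1 to 3 or 5 only widens each row by repeating the same column groups for the neighbouring sentences, which mirrors the way the text describes selecting context sentences "in the same manner" as feature groups.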
