CHAPTER 4. STOCHASTIC CONTEXT-FREE GRAMMARS

    --> dog (70)
ADJ --> big (75)
    --> old (72)
VI  --> ate (29)
    --> slept (28)
VT  --> heard (24)
    --> saw (19)

This is essentially the phrase structure of the target grammar, except that the recursion in noun phrases is implemented differently.[16]

On the other hand, when given the unordered samples, the algorithm produced a grammar that contained nine redundant rules. However, the same grammar as above was found by removing these rules, restarting the search, and pruning three more redundant rules.

With ordered samples, 16 of the 100 samples required creation of new rules, and 212 merging steps were necessary to produce the final grammar, taking 49 seconds total CPU time. The unordered samples, on the other hand, led to new productions for 20 samples and 1374 merges, and took 174 seconds.

For sample B, the outcome was similar. Ordering the samples by length resulted in the following grammar, an acceptable rendering of the original.

S   --> NP VP (100)
VP  --> V NP (295)
NP  --> DET N (395)
    --> NP RC (195)
RC  --> REL VP (195)
DET --> a (196)
    --> the (199)
N   --> cat (137)
    --> dog (118)
    --> mouse (140)
REL --> that (195)
V   --> heard (152)
    --> saw (143)

A comparison experiment on the unordered list of samples had to be aborted due to excessively long run time.

There are several reasons why ordering samples by length for incremental merging leads to improved speed and accuracy.

- 'Similar' samples, which eventually lead to a single generalization, are presented closer together, thereby reducing the time it takes to hypothesize and accept a generalization.

- As a result, the number of samples and corresponding states that are not yet fully merged into the model structure is minimized, reducing the search space overall.

[16] A beam search on the same data produced a grammar that was actually better than the target grammar, i.e., it was weakly equivalent, produced essentially the same phrase structure, and used shorter rules.
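The numbers in parentheses above are production usage counts. In a stochastic context-free grammar, such counts become rule probabilities by normalizing over all productions that share a left-hand side. The following is a minimal sketch of that normalization (Python and the function name rule_probabilities are illustrative assumptions; the text gives no code), using the sample B grammar as data:

from collections import defaultdict

# Production usage counts for the grammar learned from sample B,
# copied from the listing above: (left-hand side, right-hand side, count).
PRODUCTIONS = [
    ("S",   ("NP", "VP"),  100),
    ("VP",  ("V", "NP"),   295),
    ("NP",  ("DET", "N"),  395),
    ("NP",  ("NP", "RC"),  195),
    ("RC",  ("REL", "VP"), 195),
    ("DET", ("a",),        196),
    ("DET", ("the",),      199),
    ("N",   ("cat",),      137),
    ("N",   ("dog",),      118),
    ("N",   ("mouse",),    140),
    ("REL", ("that",),     195),
    ("V",   ("heard",),    152),
    ("V",   ("saw",),      143),
]

def rule_probabilities(productions):
    # Normalize counts over all productions with the same left-hand side,
    # so that the probabilities of each nonterminal's expansions sum to one.
    totals = defaultdict(int)
    for lhs, _, count in productions:
        totals[lhs] += count
    return {(lhs, rhs): count / totals[lhs]
            for lhs, rhs, count in productions}

for (lhs, rhs), p in rule_probabilities(PRODUCTIONS).items():
    print(f"{lhs} --> {' '.join(rhs)}  [{p:.3f}]")

For example, NP --> DET N receives probability 395/(395+195), roughly 0.67. The length ordering discussed above is then just a preprocessing step such as sorted(samples, key=len) before incremental merging begins.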
