12.07.2015 Views

The dissertation of Andreas Stolcke is approved: University of ...


CHAPTER 4. STOCHASTIC CONTEXT-FREE GRAMMARS

A.
    S     --> NP VP
    VP    --> VerbI
          --> VerbT NP
    NP    --> Art Noun
          --> Art AP Noun
    AP    --> Adj
          --> Adj AP
    Art   --> the
    VerbI --> ate | slept
    VerbT --> saw | heard
    Noun  --> cat | dog
    Adj   --> big | old

B.
    S     --> NP VP
    VP    --> Verb NP
    NP    --> Art Noun
          --> Art Noun RC
    RC    --> Rel VP
    Verb  --> saw | heard
    Noun  --> cat | dog | mouse
    Art   --> a | the
    Rel   --> that

Figure 4.2: Example grammars from Langley (1994).

4.5.3 Sample ordering

We conclude with the two pseudo-natural language examples from Langley (1994), where a hill-climbing learner for non-probabilistic CFGs is studied (see Section 4.4). The point here will be that the ordering of samples during incremental processing can have a considerable effect on the outcome, or on the speed of the learning process.

The grammars shown in Figure 4.2 exhibit two types of recursion. Grammar A contains an NP rule for an arbitrary number of adjectives, such as in

    the old big cat heard the big old big dog

Grammar B is similar but allows embedded relative clauses instead of adjectives. It yields sentences such as

    a mouse heard a dog that heard the mouse that heard the cat

For the purpose of our experiment, uniform probabilities were added to the original non-probabilistic CFG productions, and were used to generate 100 random samples each (sample A and B, respectively). The samples were presented in one of two conditions: random order, or in order of increasing sentence length (number of words).[15] Both samples were processed by incremental merging and chunking.

Sample A ordered by sentence length resulted in a grammar that contained three redundant rules that could be removed by reestimation. The final grammar was:

    S   --> NP VI (57)
        --> NP VT NP (43)
    NP  --> DET N (67)
        --> DET NP1 (76)
    NP1 --> ADJ N (76)
        --> ADJ NP1 (71)
    DET --> the (143)
    N   --> cat (73)

[15] The order among samples of the same length remained random.
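
The sampling setup described in Section 4.5.3 can be illustrated with a short Python sketch. It is not the code used in the dissertation. It assumes that "uniform probabilities" means each nonterminal chooses uniformly among its alternative productions, and it reads the Adj rule of Grammar A as big | old. The names GRAMMAR_A, generate, make_sample and order_by_length are hypothetical and introduced only for this illustration.

```python
import random

# Grammar A from Figure 4.2: nonterminal -> list of alternative right-hand sides.
# Assumption: the Adj rule is read as "big | old".
GRAMMAR_A = {
    "S": [["NP", "VP"]],
    "VP": [["VerbI"], ["VerbT", "NP"]],
    "NP": [["Art", "Noun"], ["Art", "AP", "Noun"]],
    "AP": [["Adj"], ["Adj", "AP"]],
    "Art": [["the"]],
    "VerbI": [["ate"], ["slept"]],
    "VerbT": [["saw"], ["heard"]],
    "Noun": [["cat"], ["dog"]],
    "Adj": [["big"], ["old"]],
}


def generate(grammar, symbol, rng):
    """Expand `symbol` top-down into a list of words."""
    if symbol not in grammar:  # anything without productions is a terminal word
        return [symbol]
    rhs = rng.choice(grammar[symbol])  # assumed: uniform over the alternatives
    words = []
    for child in rhs:
        words.extend(generate(grammar, child, rng))
    return words


def make_sample(grammar, n, rng):
    """Draw n sentences independently, starting from S."""
    return [generate(grammar, "S", rng) for _ in range(n)]


def order_by_length(sample, rng):
    """Order by number of words; sentences of equal length stay in random order."""
    shuffled = list(sample)
    rng.shuffle(shuffled)
    # sorted() is stable, so the shuffled order survives within each length class
    return sorted(shuffled, key=len)


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
sample_a = make_sample(GRAMMAR_A, 100, rng)

random_order = list(sample_a)  # condition 1: random presentation order
rng.shuffle(random_order)
length_order = order_by_length(sample_a, rng)  # condition 2: increasing length
```

Shuffling before a stable sort is one simple way to keep the order among samples of the same length random, as footnote 15 requires. Sample B could be drawn the same way from a dictionary encoding Grammar B.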
