
The dissertation of Andreas Stolcke is approved: University of ...



the general sense introduced earlier. $P(x \mid M)$ is the sum of all the joint probabilities of random walks through the model (i.e., derivations) generating $x$.

In the example HMM, the example string is generated by only a single path through the model, and hence $P(x \mid M)$ reduces to a single product of conditional probabilities. [Equation not recoverable from the extracted text.]

The conditional probabilities in the product are the individual transition and emission probabilities. The fact that each is conditional only on the current state reflects the Markovian character of the generation process.

2.3.5 Stochastic Context-free Grammars

Stochastic context-free grammars (SCFGs) are the natural extension of context-free grammars (CFGs) to the probabilistic realm. We again present a simple example and leave the formal presentation to Chapter 4.

    S --> a        0.2
      --> b        0.2
      --> ε        0.1
      --> a S a    0.4
      --> b S b    0.1

Figure 2.2: Simple stochastic context-free grammar generating palindromes

Each context-free production is annotated with the conditional probability that it is chosen among all possible productions when the left-hand side nonterminal is expanded. Figure 2.2 shows a SCFG generating all the palindromes over the alphabet {a, b}. A derivation in this case is the usual tree structure (parse tree) arising from the nonterminal expansions, and its probability is the product of the probabilities of the rules involved.
Hence, the string abbba is generated with probability $0.4 \times 0.1 \times 0.2 = 0.008$, corresponding to the only derivation of this string in the grammar: S --> a S a, then S --> b S b, then S --> b.

Notice how the notion of context-freeness has been extended to include probabilistic conditional independence of the expansion of a nonterminal from its surrounding context. This turns out to be crucial to keep the computational properties of SCFGs reasonable, but it is also the major drawback of SCFGs when using them to model, say, natural language. We will return to this problem at various points in the following chapters.
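
The derivation-probability computation for the palindrome grammar of Figure 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule table is transcribed from the figure, the empty-string production is assumed (its right-hand side is not legible in the extracted figure), and the names `RULES` and `palindrome_probability` are hypothetical, not taken from the dissertation.

```python
# Rule probabilities for the single nonterminal S, keyed by right-hand side.
# ASSUMPTION: the rule with probability 0.1 and no visible right-hand side
# is taken to be S --> epsilon (the empty string).
RULES = {
    "a": 0.2,    # S --> a
    "b": 0.2,    # S --> b
    "": 0.1,     # S --> epsilon (assumed)
    "aSa": 0.4,  # S --> a S a
    "bSb": 0.1,  # S --> b S b
}


def palindrome_probability(s: str) -> float:
    """Return the probability of the derivation of s under RULES.

    Every palindrome over {a, b} has exactly one derivation in this
    grammar, so the string probability equals the derivation probability.
    Strings that are not palindromes over {a, b} get probability 0.0.
    """
    if any(c not in "ab" for c in s) or s != s[::-1]:
        return 0.0
    p = 1.0
    # Peel off matching outer symbols: each pair is one S --> x S x step.
    while len(s) >= 2:
        p *= RULES[s[0] + "S" + s[0]]
        s = s[1:-1]
    # The remaining core is "a", "b", or "" and ends the derivation.
    return p * RULES[s]


if __name__ == "__main__":
    for w in ["abbba", "abba", "b", "ab"]:
        print(repr(w), palindrome_probability(w))
```

Because each expansion of S is scored independently of its surrounding context, the function never needs to look at anything other than the rule being applied; this is the conditional-independence assumption discussed above.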
