12.07.2015 Views

The dissertation of Andreas Stolcke is approved: University of ...



Chapter 7. N-grams from Stochastic Context-Free Grammars

[Figure 7.1: Three ways of generating a substring $w$ from a nonterminal $X$. Panels (a), (b), (c).]

Notation. Extending the notation used in previous chapters, $X \Rightarrow_R \alpha$ denotes that nonterminal $X$ generates the string $\alpha$ as a suffix, while $X \Rightarrow_L \alpha$ means that $X$ generates $\alpha$ as a prefix. $P(X \Rightarrow_R \alpha)$ and $P(X \Rightarrow_L \alpha)$ are the probabilities associated with these events.

7.3.3 Computing expectations

Our goal now is to compute the substring expectations for a given grammar. Formalisms such as SCFGs which have a recursive rule structure suggest a divide-and-conquer algorithm that follows the recursive structure of the grammar.

We generalize the problem by considering $c(w|X)$, the expected number of (possibly overlapping) occurrences of $w = w_1 \ldots w_n$ in strings generated by an arbitrary nonterminal $X$. The special case $c(w|S)$ is the solution sought, where $S$ is the start symbol of the grammar.

Now consider all possible ways that nonterminal $X$ can generate string $w = w_1 \ldots w_n$ as a substring, denoted by $X \Rightarrow \ldots w_1 \ldots w_n \ldots$, and the associated probabilities. For each production of $X$ we have to distinguish two main cases, assuming the grammar is in CNF. If the string in question is of length 1, $w = w_1$, and if $X$ happens to have a production $X \to w_1$, then that production adds exactly $P(X \to w_1)$ to the expectation $c(w|X)$.

If $X$ has nonterminal productions, say, $X \to YZ$, then $w$ might also be generated by recursive expansion of the right-hand side. Here, for each production, there are three subcases.

(a) First, $Y$ can by itself generate the complete $w$ (see Figure 7.1(a)).
(b) Likewise, $Z$ itself can generate $w$ (Figure 7.1(b)).
(c) Finally, $Y$ could generate $w_1 \ldots w_j$ as a suffix ($Y \Rightarrow_R w_1 \ldots w_j$) and $Z$, $w_{j+1} \ldots w_n$ as a prefix ($Z \Rightarrow_L w_{j+1} \ldots w_n$), thereby resulting in a single occurrence of $w$ (Figure 7.1(c)).

Each of these cases will have an expectation for generating $w_1 \ldots w_n$ as a substring, and the total expectation $c(w|X)$ will be the sum of these partial expectations. The total expectations for the first two ...
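The case analysis above can be illustrated with a minimal Python sketch. Here c(w, x) stands for the expected number of occurrences of w in strings generated by nonterminal x. Everything concrete in it is an assumption made for illustration and is not taken from the dissertation: the toy grammar, the function names, and the brute-force computation of the prefix and suffix probabilities by enumerating each nonterminal's language. The grammar is deliberately acyclic so that plain recursion terminates; for a recursive grammar the same equations depend on each other and this direct recursion would not terminate.

```python
from functools import lru_cache
from itertools import product

# Hypothetical toy SCFG in CNF. It is acyclic, so every language is finite.
# BINARY[X] lists ((Y, Z), probability); LEXICAL[X] lists (terminal, probability).
BINARY = {
    "S": [(("A", "B"), 0.6), (("B", "A"), 0.4)],
    "A": [(("B", "B"), 0.3)],
}
LEXICAL = {
    "A": [("a", 0.7)],
    "B": [("a", 0.5), ("b", 0.5)],
}


@lru_cache(maxsize=None)
def strings(x):
    """Brute force: map every string generated by x to its probability."""
    dist = {}
    for term, p in LEXICAL.get(x, []):
        dist[term] = dist.get(term, 0.0) + p
    for (y, z), p in BINARY.get(x, []):
        for (sy, py), (sz, pz) in product(strings(y).items(), strings(z).items()):
            dist[sy + sz] = dist.get(sy + sz, 0.0) + p * py * pz
    return dist


def p_suffix(x, u):
    """Probability that x generates a string ending in u."""
    return sum(p for s, p in strings(x).items() if s.endswith(u))


def p_prefix(x, u):
    """Probability that x generates a string starting with u."""
    return sum(p for s, p in strings(x).items() if s.startswith(u))


@lru_cache(maxsize=None)
def c(w, x):
    """Expected number of occurrences of w in strings generated by x."""
    total = 0.0
    if len(w) == 1:
        # Lexical case: a production x -> w contributes its probability.
        total += sum(p for term, p in LEXICAL.get(x, []) if term == w)
    for (y, z), p in BINARY.get(x, []):
        # Subcase (c): w straddles the boundary between y's and z's yields.
        straddle = sum(
            p_suffix(y, w[:j]) * p_prefix(z, w[j:]) for j in range(1, len(w))
        )
        # Subcases (a) and (b): w lies entirely inside y's or z's yield.
        total += p * (c(w, y) + c(w, z) + straddle)
    return total


def c_brute(w, x):
    """Reference value: count overlapping occurrences in every string of x."""
    def count(s):
        return sum(s.startswith(w, i) for i in range(len(s)))
    return sum(p * count(s) for s, p in strings(x).items())


if __name__ == "__main__":
    for w in ("a", "ab", "ba", "aa"):
        print(w, c(w, "S"), c_brute(w, "S"))
```

Comparing c against c_brute on a few substrings is a quick way to check that the three subcases account for every occurrence exactly once: occurrences inside the left child, inside the right child, and across the boundary at each split point j.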
