13.11.2014 Views

Introduction to Computational Linguistics



The reduction of the number of nonterminals has the same significance as in the case of finite state automata: it speeds up recognition, and this can be significant because not only the number of states is reduced but also the number of rules.

Another source of time efficiency is the number of rules that match a given right hand side. If there are several such rules, we need to add the left hand symbol for each of them.

Definition 25 A CFG is invertible if for any pair of rules $X \to \vec{\alpha}$ and $Y \to \vec{\alpha}$ we have $X = Y$.
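Definition 25 can be checked mechanically by grouping the rules by their right hand side. The following is a minimal sketch, assuming a grammar is given as a list of `(lhs, rhs)` pairs with `rhs` a tuple of symbols. That representation, the names `lhs_by_rhs` and `is_invertible`, and the toy grammars are illustrative assumptions, not part of the text.

```python
from collections import defaultdict

def lhs_by_rhs(rules):
    """Map each right hand side to the set of left hand symbols that rewrite to it."""
    index = defaultdict(set)
    for lhs, rhs in rules:
        index[tuple(rhs)].add(lhs)
    return dict(index)

def is_invertible(rules):
    """True if no two rules with distinct left hand sides share a right hand side."""
    return all(len(lhs_set) == 1 for lhs_set in lhs_by_rhs(rules).values())

# Hypothetical toy grammars: in `shared`, X and Y have the same right hand side (A, B).
shared = [("X", ("A", "B")), ("Y", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
unique = [("X", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
```

The same index also serves the efficiency point made above: looking up a right hand side yields every left hand symbol that has to be added for it.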

There is a way to convert a given grammar into invertible form. The set of nonterminals is $\wp(N)$, the powerset of the set $N$ of nonterminals of the original grammar. The rules are

(200) $S \to T_0 T_1 \cdots T_{n-1}$

where $S$ is the set of all $X$ such that there are $Y_i \in T_i$ ($i < n$) such that $X \to Y_0 Y_1 \cdots Y_{n-1} \in R$. This grammar is clearly invertible: for any given sequence $T_0 T_1 \cdots T_{n-1}$ of nonterminals the left hand side $S$ is uniquely defined. What needs to be shown is that it generates the same language (in fact, it generates the same constituent structures, though with different labels for the constituents).
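The left hand side of a rule of the form (200) can be computed directly from the original rule set. Below is a sketch, assuming the original rules are `(lhs, rhs)` pairs whose `rhs` is a tuple of nonterminals only, and that each new nonterminal is a `frozenset` of original ones. The function name `combined_lhs` and the sample rules are hypothetical. It computes S for one given sequence of sets instead of enumerating the whole powerset, which would be exponential in the number of nonterminals.

```python
def combined_lhs(rules, sets):
    """Return S = {X : X -> Y_0 ... Y_{n-1} is a rule with Y_i in T_i for all i < n}."""
    n = len(sets)
    return frozenset(
        lhs
        for lhs, rhs in rules
        if len(rhs) == n and all(y in t for y, t in zip(rhs, sets))
    )

# Hypothetical original rules over the nonterminals X, Y, A, B, C.
rules = [("X", ("A", "B")), ("Y", ("A", "B")), ("X", ("A", "C"))]

t0 = frozenset({"A"})
t1 = frozenset({"B", "C"})
s = combined_lhs(rules, [t0, t1])  # left hand side of the new rule S -> T_0 T_1
```

Because `combined_lhs` is a function of the sequence alone, two new rules with the same right hand side necessarily have the same left hand side, which is the invertibility property.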

20 Greibach Normal Form<br />

We have spoken earlier about different derivations defining the same constituent structure. Basically, if a given string contains several occurrences of nonterminals, we can choose any one of them and expand it first using a rule. This is because the applications of two rules that target the same string but different nonterminals commute:

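(201)
$$
\begin{array}{ccc}
& \cdots X \cdots Y \cdots & \\
\swarrow & & \searrow \\
\cdots \vec{\alpha} \cdots Y \cdots & & \cdots X \cdots \vec{\gamma} \cdots \\
\searrow & & \swarrow \\
& \cdots \vec{\alpha} \cdots \vec{\gamma} \cdots &
\end{array}
$$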
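The commutation in (201) can be tried out on a concrete string. The following is a small sketch, assuming strings are tuples of symbols and a rule application replaces the symbol at a given position. The helper `rewrite`, the sample string, and the two rules are made up for illustration.

```python
def rewrite(string, position, lhs, rhs):
    """Replace the symbol at `position`, which must be `lhs`, by the sequence `rhs`."""
    if string[position] != lhs:
        raise ValueError(f"expected {lhs!r} at position {position}")
    return string[:position] + tuple(rhs) + string[position + 1:]

# Hypothetical string u X v Y w with the rules X -> a b and Y -> c.
start = ("u", "X", "v", "Y", "w")
alpha, gamma = ("a", "b"), ("c",)

# Expand X first, then Y. Y has moved right by len(alpha) - 1 positions.
x_first = rewrite(rewrite(start, 1, "X", alpha), 3 + len(alpha) - 1, "Y", gamma)

# Expand Y first, then X. X stands to the left of Y, so its position is unchanged.
y_first = rewrite(rewrite(start, 3, "Y", gamma), 1, "X", alpha)
```

Both orders end in the same string; only the bookkeeping of positions differs.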

This can be exploited in many ways, for example by always choosing a particular derivation. For instance, we can agree to always expand the leftmost nonterminal, or always the rightmost nonterminal.
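The leftmost strategy can be sketched as follows. The code assumes strings are tuples of symbols and that a caller-supplied function `choose` picks the right hand side for a nonterminal. The names `leftmost_step` and `leftmost_derivation` and the toy grammar are hypothetical. With a recursive grammar and a fixed choice per nonterminal the loop would not terminate, so the example uses a non-recursive grammar.

```python
def leftmost_step(string, nonterminals, choose):
    """Expand the leftmost nonterminal of `string`; return None if there is none.

    `choose(symbol)` returns the right hand side to use for that nonterminal.
    """
    for i, symbol in enumerate(string):
        if symbol in nonterminals:
            return string[:i] + tuple(choose(symbol)) + string[i + 1:]
    return None

def leftmost_derivation(start, nonterminals, choose):
    """List every string of the derivation that always expands the leftmost nonterminal."""
    steps = [(start,)]
    while (nxt := leftmost_step(steps[-1], nonterminals, choose)) is not None:
        steps.append(nxt)
    return steps

# Hypothetical grammar with exactly one rule per nonterminal, so `choose` is a lookup.
rules = {"S": ("X", "Y"), "X": ("a",), "Y": ("b", "c")}
derivation = leftmost_derivation("S", set(rules), rules.__getitem__)
```

A rightmost variant would scan the string from the end instead; the final string is the same, in line with the commutation of rule applications.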
