13.11.2014 Views

Introduction to Computational Linguistics


19. Parsing and Recognition

Definition 24 A CFG is in standard form if all the rules have the form X → Y_1 Y_2 ⋯ Y_n, with X, Y_i ∈ N, or X → x⃗. If in addition n = 2 for all rules of the first form, the grammar is said to be in Chomsky Normal Form.
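The two conditions of Definition 24 can be checked mechanically. The following is a minimal sketch, not part of the original notes. It assumes a grammar is given as a list of (left-hand side, right-hand side) pairs, with the right-hand side a tuple of symbols, plus a set of nonterminals. The function names and this representation are illustrative assumptions, and a rule with an empty right-hand side is treated as a rule of the second form.

```python
def is_nonterminal_rule(rhs, nonterminals):
    """True if rhs is a nonempty sequence consisting of nonterminals only."""
    return len(rhs) > 0 and all(s in nonterminals for s in rhs)


def is_terminal_rule(rhs, nonterminals):
    """True if rhs is a string of terminals (assumption: may be empty)."""
    return all(s not in nonterminals for s in rhs)


def is_standard_form(rules, nonterminals):
    """Every rule is X -> Y_1 ... Y_n (nonterminals only) or X -> x (terminals only)."""
    for lhs, rhs in rules:
        if lhs not in nonterminals:
            return False
        if not (is_nonterminal_rule(rhs, nonterminals)
                or is_terminal_rule(rhs, nonterminals)):
            return False
    return True


def is_chomsky_normal_form(rules, nonterminals):
    """Standard form, and n = 2 for every rule of the first form."""
    if not is_standard_form(rules, nonterminals):
        return False
    return all(len(rhs) == 2
               for _, rhs in rules
               if is_nonterminal_rule(rhs, nonterminals))


# Hypothetical toy grammar: S -> A B, A -> a, B -> b
toy_rules = [("S", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
toy_nonterminals = {"S", "A", "B"}
print(is_standard_form(toy_rules, toy_nonterminals))
print(is_chomsky_normal_form(toy_rules, toy_nonterminals))
```

A rule that mixes terminals and nonterminals on its right-hand side fails the first check, which is the case the conversion described next is meant to remove.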

There is an easy way to convert a grammar into standard form. Just introduce a new nonterminal Y_a for each letter a ∈ A together with the rules Y_a → a. Next replace each terminal a that cooccurs with a nonterminal on the right hand side of a rule by Y_a. The new grammar generates more constituents, since letters that are introduced together with nonterminals do not form a constituent of their own in the old grammar. Such letter occurrences are called syncategorematic. Typical examples of syncategorematic occurrences of letters are brackets that are inserted in the formation of a term. Consider the following expansion of the grammar (??).

(197)   <term> → <number> | (<term>+<term>)
                          | (<term>*<term>)

Here, operator symbols as well as brackets are added syncategorematically. The procedure of elimination will yield the following grammar.

(198)   <term> → <number> | Y_( <term> Y_+ <term> Y_)
                          | Y_( <term> Y_* <term> Y_)
        Y_( → (
        Y_) → )
        Y_+ → +
        Y_* → *

However, the conversion to standard form can often be avoided. It is mainly interesting for theoretical purposes.
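The elimination procedure that turns (197) into (198) can be sketched in a few lines. This is an illustration under stated assumptions, not the notes' own implementation: rules are (left-hand side, right-hand side) pairs with tuple right-hand sides, the function name `to_standard_form` is hypothetical, and each new nonterminal is named by prefixing the letter with "Y", which assumes no existing nonterminal already has that name. Unlike the description above, the sketch introduces Y_a only for letters that actually cooccur with a nonterminal.

```python
def to_standard_form(rules, nonterminals):
    """Replace each terminal that cooccurs with a nonterminal by a new nonterminal."""
    introduced = {}  # terminal a -> name of the new nonterminal Y_a
    new_rules = []
    for lhs, rhs in rules:
        if not any(s in nonterminals for s in rhs):
            # Right-hand side is a pure terminal string: keep the rule unchanged.
            new_rules.append((lhs, tuple(rhs)))
            continue
        new_rhs = []
        for s in rhs:
            if s in nonterminals:
                new_rhs.append(s)
            else:
                # Assumption: the name "Y" + s is not already in use.
                introduced.setdefault(s, "Y" + s)
                new_rhs.append(introduced[s])
        new_rules.append((lhs, tuple(new_rhs)))
    # Add the rules Y_a -> a for every letter that was replaced.
    for letter, name in introduced.items():
        new_rules.append((name, (letter,)))
    return new_rules, set(nonterminals) | set(introduced.values())


# Grammar (197), leaving the rules for <number> aside.
GRAMMAR_197 = [
    ("term", ("number",)),
    ("term", ("(", "term", "+", "term", ")")),
    ("term", ("(", "term", "*", "term", ")")),
]
NONTERMINALS_197 = {"term", "number"}

GRAMMAR_198, NONTERMINALS_198 = to_standard_form(GRAMMAR_197, NONTERMINALS_197)
for lhs, rhs in GRAMMAR_198:
    print(lhs, "->", " ".join(rhs))
```

Run on (197), the sketch should produce the two rewritten rules for the term nonterminal together with the four rules for the brackets and the two operators, which corresponds to (198).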

Now, it may happen that a grammar uses more nonterminals than necessary. For example, the above grammar distinguishes Y_+ from Y_*, but this is not necessary. Instead the following grammar will do just as well.

(199)   <term> → <number> | Y_( <term> Y_o <term> Y_)
        Y_( → (
        Y_) → )
        Y_o → + | *
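The step from (198) to (199) amounts to renaming two nonterminals to a common name and discarding the rules that become identical. The sketch below illustrates this under the same assumed representation (rules as pairs with tuple right-hand sides). The function name `merge_nonterminals` is hypothetical. Merging is safe here because Y_+ and Y_* occur in exactly the same contexts. In general, merging arbitrary nonterminals can change the language generated.

```python
def merge_nonterminals(rules, merged, new_name):
    """Rename every nonterminal in `merged` to `new_name`, dropping duplicate rules."""
    def rename(symbol):
        return new_name if symbol in merged else symbol

    result = []
    for lhs, rhs in rules:
        rule = (rename(lhs), tuple(rename(s) for s in rhs))
        if rule not in result:  # two rules may coincide after renaming
            result.append(rule)
    return result


# Grammar (198), leaving the rules for <number> aside.
GRAMMAR_198_RULES = [
    ("term", ("number",)),
    ("term", ("Y(", "term", "Y+", "term", "Y)")),
    ("term", ("Y(", "term", "Y*", "term", "Y)")),
    ("Y(", ("(",)),
    ("Y)", (")",)),
    ("Y+", ("+",)),
    ("Y*", ("*",)),
]

GRAMMAR_199_RULES = merge_nonterminals(GRAMMAR_198_RULES, {"Y+", "Y*"}, "Yo")
for lhs, rhs in GRAMMAR_199_RULES:
    print(lhs, "->", " ".join(rhs))
```

The two bracketed rules for the term nonterminal collapse into one, while the rules for + and * are both kept under the shared nonterminal, as in (199).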
