13.11.2014 Views

Introduction to Computational Linguistics



The trees used in linguistic analysis are often ordered. The ordering is here implicitly represented in the string. Let C = 〈⃗γ₁, ⃗γ₂〉 be an occurrence of ⃗σ and D = 〈⃗η₁, ⃗η₂〉 be an occurrence of ⃗τ. We write C ⊏ D and say that C precedes D if ⃗γ₁⃗σ is a prefix of ⃗η₁ (the prefix need not be proper). If one underlines C and D, this definition amounts to saying that the line of C ends before the line of D starts.

(190) abccddx

Here C = 〈a, cddx〉 is an occurrence of bc and D = 〈abcc, x〉 is an occurrence of dd. Since abc is a prefix of abcc, we have C ⊏ D.
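The precedence test can be tried out mechanically. The following is a minimal sketch, assuming an occurrence is stored as a pair of plain Python strings (left context, right context); the names `occurrences` and `precedes` are illustrative and do not come from the text.

```python
from typing import List, Tuple

# An occurrence is a pair (left context, right context), as in the definition.
Occurrence = Tuple[str, str]


def occurrences(sub: str, s: str) -> List[Occurrence]:
    """Return every occurrence of `sub` in `s` as a (left, right) context pair."""
    result = []
    start = s.find(sub)
    while start != -1:
        result.append((s[:start], s[start + len(sub):]))
        start = s.find(sub, start + 1)  # also finds overlapping occurrences
    return result


def precedes(c: Occurrence, sigma: str, d: Occurrence) -> bool:
    """C precedes D iff (left context of C) + sigma is a prefix of the left context of D."""
    return d[0].startswith(c[0] + sigma)


# Example (190): C is the occurrence of "bc", D the occurrence of "dd".
C = occurrences("bc", "abccddx")[0]  # ("a", "cddx")
D = occurrences("dd", "abccddx")[0]  # ("abcc", "x")
print(precedes(C, "bc", D))  # True
print(precedes(D, "dd", C))  # False
```

Because `str.startswith` accepts equality as well, the sketch matches the remark that the prefix need not be proper.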

19 Parsing and Recognition

Given a grammar G and a string ⃗x, we ask the following questions:

• (Recognition:) Is ⃗x ∈ L(G)?

• (Parsing:) What derivation(s) does ⃗x have?
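To make the two questions concrete, here is a brute-force sketch. It is not an algorithm proposed in the text. It assumes a context-free grammar stored as a Python dictionary, with no rules that have an empty right-hand side and no rules that rewrite one nonterminal as a single nonterminal, so that the search terminates. The toy grammar S → S S | a and the names `GRAMMAR`, `splits`, `parses` and `recognize` are hypothetical.

```python
import itertools
from typing import Dict, Iterator, List, Tuple, Union

# Hypothetical toy grammar (not from the text): S -> S S | a.
# Keys are nonterminals; every other symbol counts as a terminal.
Grammar = Dict[str, List[Tuple[str, ...]]]
Tree = Union[str, tuple]

GRAMMAR: Grammar = {"S": [("S", "S"), ("a",)]}


def splits(x: str, k: int) -> Iterator[Tuple[str, ...]]:
    """Yield all ways to cut x into k non-empty consecutive pieces."""
    if k == 1:
        if x:
            yield (x,)
        return
    for i in range(1, len(x) - k + 2):
        for rest in splits(x[i:], k - 1):
            yield (x[:i],) + rest


def parses(grammar: Grammar, symbol: str, x: str) -> List[Tree]:
    """Parsing: return all trees with root `symbol` whose yield is x."""
    if symbol not in grammar:  # terminal symbol
        return [symbol] if symbol == x else []
    trees: List[Tree] = []
    for rhs in grammar[symbol]:
        if len(rhs) == 0 or (len(rhs) == 1 and rhs[0] in grammar):
            raise ValueError("sketch assumes no empty and no unary rules")
        for pieces in splits(x, len(rhs)):
            # Collect the subtrees for each piece, then combine all choices.
            options = [parses(grammar, s, p) for s, p in zip(rhs, pieces)]
            for combo in itertools.product(*options):
                trees.append((symbol,) + combo)
    return trees


def recognize(grammar: Grammar, start: str, x: str) -> bool:
    """Recognition: is there at least one parse of x from `start`?"""
    return bool(parses(grammar, start, x))


print(recognize(GRAMMAR, "S", "aaa"))     # True
print(recognize(GRAMMAR, "S", "ab"))      # False
print(len(parses(GRAMMAR, "S", "aaaa")))  # 5
```

In this sketch recognition is simply parsing with the trees thrown away, which is wasteful. A dedicated recognizer can answer the yes/no question without building any tree.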

Obviously, as the derivations give information about the meaning associated with an expression, the problem of recognition is generally not of interest in itself. Still, sometimes it is useful to first solve the recognition task, and then the parsing task. For if the string is not in the language, it is unnecessary to look for derivations. The parsing problem for context-free languages is actually not the one we are interested in: what we really want is only to know which constituent structures are associated with a given string. This vastly reduces the problem, but still the remaining problem may be very complex. Let us see how.

Now, in general a given string can have any number of derivations, even infinitely many. Consider by way of example the grammar

(191) A → A | a
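In grammar (191) the string a can be derived by applying A → A any number of times before finishing with A → a, so there is one derivation for every n ≥ 0. A small sketch of this enumeration follows; the function name `derivations_of_a` is illustrative, and each derivation is represented simply as the list of strings it passes through.

```python
from itertools import islice
from typing import Iterator, List


def derivations_of_a() -> Iterator[List[str]]:
    """Yield the derivations of 'a' in the grammar A -> A | a, shortest first.

    The generator never terminates on its own, reflecting the fact that
    the string has infinitely many derivations.
    """
    n = 0
    while True:
        # Apply A -> A exactly n times, then finish with A -> a.
        yield ["A"] * (n + 1) + ["a"]
        n += 1


# Print only the first three derivations.
for d in islice(derivations_of_a(), 3):
    print(" => ".join(d))
# A => a
# A => A => a
# A => A => A => a
```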

It can be shown that if the grammar has no unary rules and no rules of the form X → ε, then a given string ⃗x has only finitely many derivations, though their number can be exponential in the length of ⃗x. We shall show that it is possible to eliminate these rules (this reduction is not semantically innocent!). Given a rule X → ε and a rule that contains X on the right, say
