13.11.2014

Introduction to Computational Linguistics


20. Greibach Normal Form

Recall that a derivation defines a set of constituent occurrences, which in turn constitute the nodes of the tree. Notice that each occurrence of a nonterminal is replaced by some right-hand side of a rule during a derivation that leads to a terminal string. After it has been replaced, it is gone and can no longer figure in a derivation. Given a tree, a linearization is an ordering of the nodes which results from a valid derivation in the following way. We write x ⊳ y if the constituent of x is expanded before the constituent of y is. One can characterize exactly what it takes for such an order to be a linearization. First, it is linear. Second, if x > y (that is, x properly dominates y) then also x ⊳ y. It follows that the root is the first node in the linearization.
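The two conditions can be checked mechanically. The following is a minimal sketch, assuming that nodes are addressed by tuples of integers (the root is the empty tuple, and the i-th daughter of a node x is x + (i,)), so that x > y holds exactly when x is a proper prefix of y. The names `dominates`, `is_linearization` and `TREE` are illustrative and not part of the text.

```python
def dominates(x, y):
    """True if x properly dominates y, i.e. x is a proper prefix of y."""
    return len(x) < len(y) and y[:len(x)] == x


def is_linearization(order, nodes):
    """Check the two conditions on a candidate order (a list of nodes).

    1. The order is linear: it lists every node exactly once.
    2. If x > y (x properly dominates y), then x comes before y.
    """
    if len(order) != len(set(order)) or set(order) != set(nodes):
        return False
    position = {node: i for i, node in enumerate(order)}
    return all(
        position[x] < position[y]
        for x in nodes
        for y in nodes
        if dominates(x, y)
    )


# A small tree: the root has two daughters, the first of which has two daughters.
TREE = {(), (0,), (1,), (0, 0), (0, 1)}

print(is_linearization([(), (0,), (0, 0), (0, 1), (1,)], TREE))  # True
print(is_linearization([(), (1,), (0,), (0, 1), (0, 0)], TREE))  # True
print(is_linearization([(0,), (), (0, 0), (0, 1), (1,)], TREE))  # False
```

The third order fails because the node (0,) is placed before the root, which dominates it.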

Linearizations are closely connected with search strategies in a tree. We shall present examples. The first is a particular case of the so-called depth-first search, and the linearization shall be called leftmost linearization. It is as follows: x ⊳ y iff x > y or x ⊏ y. (Recall that x ⊏ y iff x precedes y. Trees are always considered ordered.) For every tree there is exactly one leftmost linearization. We shall denote the fact that there is a leftmost derivation of α⃗ from X by X ⊢^l_G α⃗. We can generalize the situation as follows. Let ◭ be a linear ordering uniformly defined on the leaves of local subtrees. That is to say, if B and C are isomorphic local trees (that is, if they correspond to the same rule ρ) then ◭ orders the leaves of B linearly in the same way as it orders the leaves of C (modulo the unique (!) isomorphism). In the case of the leftmost linearization the ordering is the one given by ⊏. Now a minute's reflection reveals that every linearization of the local subtrees of a tree induces a linearization of the entire tree, but not conversely (there are orderings which do not proceed in this way, as we shall see shortly). X ⊢^◭_G α⃗ denotes the fact that there is a derivation of α⃗ from X determined by ◭. Now call π a priorization for G = ⟨S, N, A, R⟩ if π defines a linearization on the local tree H_ρ, for every ρ ∈ R. Since the root is always the first element in a linearization, we only need to order the daughters of the root node, that is, the leaves. Let this ordering be ◭. We write X ⊢^π_G α⃗ if X ⊢^◭_G α⃗ for the linearization ◭ defined by π.
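To illustrate how an ordering on local trees induces a linearization of the whole tree, here is a small sketch under one natural reading of the definition: a node is expanded first, and its daughters are then processed one after the other in the order that the priorization assigns to the local rule. The tree encoding (pairs of a label and a list of daughters), the dictionary `priority` mapping a rule to a permutation of daughter positions, and the function name are all assumptions made for the example. Rules not listed in `priority` default to left-to-right order, which gives the leftmost linearization.

```python
def induced_linearization(tree, priority=None, address=()):
    """List node addresses in the order induced by a priorization.

    tree     -- a pair (label, daughters), daughters being a list of trees
    priority -- dict from a rule (lhs, tuple of rhs labels) to a tuple of
                daughter positions, giving the order in which to visit them
    address  -- tuple address of the current node (the root is ())
    """
    label, daughters = tree
    rule = (label, tuple(d[0] for d in daughters))
    default = tuple(range(len(daughters)))  # left to right, i.e. ordered by precedence
    order = (priority or {}).get(rule, default)
    result = [address]  # the root of a local tree always comes first
    for i in order:
        result.extend(induced_linearization(daughters[i], priority, address + (i,)))
    return result


# Tree for the rules S -> A B, A -> a b, B -> c.
tree = ("S", [("A", [("a", []), ("b", [])]), ("B", [("c", [])])])

# Leftmost linearization: no priorization given.
print(induced_linearization(tree))
# [(), (0,), (0, 0), (0, 1), (1,), (1, 0)]

# A priorization that expands B before A in the rule S -> A B.
right_first = {("S", ("A", "B")): (1, 0)}
print(induced_linearization(tree, right_first))
# [(), (1,), (1, 0), (0,), (0, 0), (0, 1)]
```

Both orders visit the same nodes and start with the root; they differ only in which daughter of S is expanded first.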

Proposition 26 Let π be a priorization. Then X ⊢^π_G x⃗ iff X ⊢_G x⃗.

A different strategy is the breadth-first search. This search goes through the tree in increasing depth. Let S_n be the set of all nodes x with d(x) = n. For each n, S_n shall be ordered linearly by ⊏. The breadth-first search is a linearization ∆, which is defined as follows. (a) If d(x) = d(y) then x ∆ y iff x ⊏ y, and (b) if d(x) < d(y) then x ∆ y. The difference between these search strategies, depth-first and breadth-first, can be made very clear with tree domains.
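As a rough illustration of the contrast, the sketch below encodes a tree domain as a set of integer tuples (an assumed encoding: the root is the empty tuple, the depth d(x) is the length of x, and x ⊏ y corresponds to x being smaller at the first position where the two addresses differ). Under this encoding the leftmost (depth-first) linearization coincides with the lexicographic order on addresses, while the breadth-first linearization sorts by length first. The function names are illustrative.

```python
def depth_first(domain):
    """Leftmost linearization: x before y iff x is a proper prefix of y
    or x precedes y. Python's tuple comparison is exactly this order."""
    return sorted(domain)


def breadth_first(domain):
    """Breadth-first linearization: smaller depth first, and within one
    depth level the nodes are ordered by precedence."""
    return sorted(domain, key=lambda x: (len(x), x))


# Root with two daughters; the first has two daughters, the second has one.
DOMAIN = {(), (0,), (1,), (0, 0), (0, 1), (1, 0)}

print(depth_first(DOMAIN))
# [(), (0,), (0, 0), (0, 1), (1,), (1, 0)]
print(breadth_first(DOMAIN))
# [(), (0,), (1,), (0, 0), (0, 1), (1, 0)]
```

The two orders agree on the root and its first daughter and then diverge: depth-first descends below (0,) before visiting (1,), whereas breadth-first finishes depth 1 before moving on to depth 2.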
