13.11.2014 Views

Introduction to Computational Linguistics


21. Pushdown Automata

pushdown. Once you hit a b you start popping the as from the pushdown, one for each b that you find. If the pushdown is emptied before the string is complete, then you have more bs than as. If the pushdown is not emptied but the string is complete, then you have more as than bs. So, you can tell whether you have a string of the required form if you can tell whether you have an empty storage. We assume that this is the case. In fact, what one typically does is to fill the storage, before starting, with a special symbol #, the end-of-storage marker. The storage is represented as a string over an alphabet D, the storage alphabet, which contains #. Then we need only the following operations and predicates:

➊ For each $d \in D$ we have an operation $\mathrm{push}_d : \vec{x} \mapsto \vec{x}^{\frown} d$.

➋ For each $d \in D$ we have a predicate $\mathrm{top}_d$, which is true of $\vec{x}$ iff $\vec{x} = \vec{y}^{\frown} d$ for some $\vec{y}$.

➌ We have an operation $\mathrm{pop} : \vec{x}^{\frown} d \mapsto \vec{x}$.

(If we do not have an end-of-stack marker, we also need a predicate ‘empty’, which is true of a stack $\vec{x}$ iff $\vec{x} = \varepsilon$.)
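The operations and predicates above can be sketched in Python, with the storage modelled as a string whose last character is the top. This is only an illustration: the function names, and the reading of "the required form" as a^n b^n (n ≥ 0), are assumptions suggested by the description of pushing as and popping one per b; they are not stated in the text.

```python
def push(d):
    """Return the operation push_d, which appends d to the storage."""
    return lambda x: x + d


def top(d):
    """Return the predicate top_d, true of x iff x ends in the symbol d."""
    return lambda x: x.endswith(d)


def pop(x):
    """Remove the last symbol of a non-empty storage."""
    if not x:
        raise ValueError("cannot pop from an empty storage")
    return x[:-1]


def has_required_form(word):
    """Check word = a^n b^n (assumed form), starting from the storage '#'."""
    storage = "#"                      # end-of-storage marker
    seen_b = False
    for ch in word:
        if ch == "a" and not seen_b:
            storage = push("a")(storage)
        elif ch == "b":
            seen_b = True
            if not top("a")(storage):  # storage exhausted: more bs than as
                return False
            storage = pop(storage)
        else:
            return False               # an a after a b, or a foreign symbol
    return top("#")(storage)           # only the marker is left: counts match
```

Because the marker # sits at the bottom of the storage, the final test uses the predicate top for # and no separate 'empty' predicate is needed.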

Now, notice that the control structure is a finite state automaton. It schedules the actions using the stack as a storage. This is done as follows. We have two alphabets: A, the alphabet of letters read from the tape, and I, the stack alphabet, which contains a special symbol, #. Initially, the stack contains one symbol, #. A transition instruction is a quintuple $\langle s, c, t, s', p \rangle$, where $s$ and $s'$ are states, $c$ is a character or empty (the character read from the string), $t$ is a character (read from the top of the stack), and $p$ is an instruction either to pop from the stack or to push a character (different from #) onto it. A PDA contains a set of instructions. Formally, it is defined to be a sextuple $\langle A, I, Q, i_0, F, \sigma \rangle$, where A is the input alphabet, I the stack alphabet, Q the set of states, $i_0 \in Q$ the start state, $F \subseteq Q$ the set of accepting states, and $\sigma$ a set of instructions.

If the PDA is reading a string, then it does the following. It is initialized with the stack containing # and in the initial state $i_0$. Each instruction is an option for the machine to proceed. However, it can use that option only if it is in state $s$, if the topmost stack symbol is $t$, and, if $c \neq \varepsilon$, the next character matches $c$ (and is then consumed). The next state is $s'$, and the stack is determined by $p$: if $p = \mathrm{pop}$, then the topmost symbol is popped; if it is $\mathrm{push}_a$, then $a$ is pushed onto the stack.

PDAs can be nondeterministic: for a given situation we may have several options. If given the current stack, the current state and the next character there is at most one operation that can be chosen, we call
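As a rough illustration of how such a set of instructions is used, here is a minimal Python sketch that explores every option the machine has, so it also covers the nondeterministic case. Several things in it are assumptions rather than part of the text: the acceptance condition (input fully consumed while in a state from F), the names `accepts`, `EPS`, `POP`, `push` and `ANBN`, the cap `max_configs` on the search, and the example machine for a^n b^n with n ≥ 1.

```python
from collections import deque

EPS = ""          # stands for the empty string, i.e. no character is read
POP = ("pop",)    # the pop instruction


def push(a):
    """Build the instruction push_a (a is assumed to differ from '#')."""
    return ("push", a)


def accepts(instructions, start, finals, word, max_configs=10_000):
    """Search all runs of the machine on `word`, breadth first.

    Assumption: a run accepts when the input is consumed and the current
    state is in `finals`. The search gives up (returning False) once more
    than `max_configs` configurations have been seen.
    """
    initial = (start, 0, ("#",))       # state, input position, stack
    seen = {initial}
    queue = deque([initial])
    while queue and len(seen) <= max_configs:
        state, pos, stack = queue.popleft()
        if pos == len(word) and state in finals:
            return True
        if not stack:                  # no top symbol, so no instruction applies
            continue
        for s, c, t, s_next, p in instructions:
            if s != state or t != stack[-1]:
                continue
            if c != EPS and (pos >= len(word) or word[pos] != c):
                continue
            new_pos = pos if c == EPS else pos + 1
            new_stack = stack[:-1] if p == POP else stack + (p[1],)
            config = (s_next, new_pos, new_stack)
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False


# Hypothetical instruction set for a^n b^n (n >= 1); 'x' counts the as.
ANBN = [
    ("q0", "a", "#", "q0", push("x")),
    ("q0", "a", "x", "q0", push("x")),
    ("q0", "b", "x", "q1", POP),
    ("q1", "b", "x", "q1", POP),
    ("q1", EPS, "#", "q2", POP),
]
```

With start state `"q0"` and accepting states `{"q2"}`, calling `accepts(ANBN, "q0", {"q2"}, "aabb")` follows the run push, push, pop, pop and then removes the marker # without reading a character.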
