13.11.2014 Views

Introduction to Computational Linguistics

21. Pushdown Automata

the PDA deterministic. We add here that, theoretically, the operation that reads the top of the stack removes the topmost symbol. The stack really is just a memory device. In order to look at the topmost symbol we actually need to pop it off the stack. However, if we put it back, then this is as if we had just ‘peeked’ into the top of the stack. (We shall not go into the details here, but it is possible to peek into any number of topmost symbols. The price one pays is an exponential blowup of the number of states.)
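The remark about peeking can be illustrated with a minimal sketch. The `Stack` class and the `peek` function are hypothetical helpers introduced only for this illustration; the stack deliberately offers nothing but `push` and `pop`, and looking at the top is built from those two operations.

```python
class Stack:
    """A stack that offers only push and pop (hypothetical helper)."""

    def __init__(self, symbols=()):
        # The last element of the list is the top of the stack.
        self._items = list(symbols)

    def push(self, symbol):
        self._items.append(symbol)

    def pop(self):
        return self._items.pop()


def peek(stack):
    """Read the topmost symbol by popping it and pushing it back."""
    top = stack.pop()
    stack.push(top)
    return top


stack = Stack("#aa")   # bottom marker '#', then two a's
top = peek(stack)      # the stack is unchanged after this call
```

After the call the stack holds the same symbols as before, so from the outside the two moves behave like a single read.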

We say that the PDA accepts $\vec{x}$ by state if it is in a final state at the end of $\vec{x}$. To continue the above example, we put the automaton in an accepting state if, after popping the $a$'s, the topmost symbol is #. Alternatively, we say that the PDA accepts $\vec{x}$ by stack if the stack is empty after $\vec{x}$ has been scanned. A slight modification of the machine results in a machine that accepts the language by stack. Basically, it needs to put one $a$ fewer than needed on the stack and then cancel # on the last move.
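The two acceptance modes can be compared in a small sketch. The example referred to above is not part of this excerpt, so the code assumes, purely for illustration, the language of words consisting of n a's followed by n b's (n at least 1). The function names and state names are hypothetical.

```python
def accepts_by_state(word):
    """Accept a^n b^n (n >= 1, an assumed example) by final state."""
    stack = ["#"]              # '#' marks the bottom of the stack
    state = "push"
    for symbol in word:
        if symbol == "a" and state == "push":
            stack.append("a")
        elif symbol == "b" and stack[-1] == "a":
            stack.pop()
            state = "pop"
        else:
            return False       # no applicable move: the machine blocks
    # The final state is reached only if the a's are used up and '#' is on top.
    return state == "pop" and stack[-1] == "#"


def accepts_by_stack(word):
    """Accept the same language by ending with an empty stack."""
    stack = ["#"]
    state = "start"
    for symbol in word:
        if symbol == "a" and state == "start":
            state = "push"     # the first a is not pushed: one a fewer
        elif symbol == "a" and state == "push":
            stack.append("a")
        elif symbol == "b" and state in ("push", "pop") and stack:
            stack.pop()        # pops an a, or '#' on the last move
            state = "pop"
        else:
            return False
    return not stack           # accepted exactly when the stack is empty
```

Both functions agree on every input here, although the first never removes # while the second removes it with the last b.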

It can be shown that the class of languages accepted by PDAs by state is the same as the class of languages accepted by PDAs by stack, although for a given machine the two languages may be different. We shall establish that the languages accepted by PDAs by stack are exactly the context-free languages (CFLs). There is a slight problem in that the PDAs might actually be nondeterministic. While in the case of finite state automata there was a way to turn the machine into an equivalent deterministic machine, this is not possible here. There are languages which are CF but cannot be recognized by a deterministic PDA. An example is the language of palindromes of even length: $\{\vec{x}\vec{x}^{T} : \vec{x} \in A^{*}\}$, where $\vec{x}^{T}$ is the reversal of $\vec{x}$. For example, $(abddc)^{T} = cddba$.

The obvious mechanism is this: scan the input and push it onto the stack until you are halfway through the string, and then start comparing the stack with the rest of the string. You accept the string if at the end the stack contains only #. Since the stack is popped in reverse order, you recognize exactly the palindromes. The trouble is that there is no way for the machine to know when to shift gear: it cannot tell when it is halfway through the string. Here is the dilemma. Let $\vec{x} = abc$. Then $abccba$ is a palindrome, but so are $abccbaabccba$ and $abccbaabccbaabccba$. In general, $(abccba)^{n}$ is a palindrome. If you are scanning a word like this, there is no way of knowing when you should turn and start popping symbols, because the string might be longer than you thought.
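The dilemma can be made concrete with a short sketch that tries every possible turning point instead of guessing one. The function names are hypothetical, and the code assumes that # does not occur in the input alphabet. It is an illustration of the push-then-compare mechanism described above, not a general PDA simulator.

```python
def run_accepts(word, turn):
    """Follow the one run that switches from pushing to popping after
    `turn` symbols; report whether it ends with only '#' on the stack."""
    stack = ["#"]
    for symbol in word[:turn]:
        stack.append(symbol)
    for symbol in word[turn:]:
        # The run blocks if only '#' is left or the symbols do not match.
        if len(stack) == 1 or stack[-1] != symbol:
            return False
        stack.pop()
    return stack == ["#"]


def accepting_turns(word):
    """List every turning point whose run accepts the word."""
    return [t for t in range(len(word) + 1) if run_accepts(word, t)]


print(accepting_turns("abccba"))        # [3]
print(accepting_turns("abccbaabccba"))  # [6]
```

The two words share the prefix abccba, yet the first requires turning after 3 symbols and the second after 6. A machine that has read only that prefix cannot tell which case it is in.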

It is for this reason that we need to review our notion of acceptance. First, we say that a run of the machine is a series of actions that it takes, given the input. Alternatively, the run specifies what the machine chooses each time it faces a choice. (The alternatives are simply different actions and different subsequent states.) A
