13.11.2014 Views

Introduction to Computational Linguistics


13. Complexity and Minimal Automata

It is easy to check that $[b]_M = [a]_M$. We shall see that this difference means that there is an automaton checking membership in $M$ that uses fewer states than any automaton checking membership in $L$.

Given two index sets $I$ and $J$, put $I \xrightarrow{a} J$ if and only if $J = a\backslash I$. This is well-defined. For let $I = [\vec{x}]_L$, and suppose that $\vec{x}a$ is a prefix of an accepted string. Then $[\vec{x}a]_L = \{\vec{y} : \vec{x}a\vec{y} \in L\} = \{\vec{y} : a\vec{y} \in I\} = a\backslash I$. This defines a deterministic automaton with initial state $L$. Accepting states are those sets which contain $\varepsilon$. We call this the index automaton and denote it by $I(L)$. (Often it is called the Myhill-Nerode automaton.)
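The construction can be tried out on a small case. The following is a minimal sketch, assuming the language is finite and given as a set of Python strings, so that each left quotient is itself a finite set. The names `left_quotient`, `index_automaton` and `accepts` are illustrative and do not come from the text.

```python
from collections import deque


def left_quotient(a, index_set):
    """Left quotient of a finite index set I by the letter a: {y : a y in I}."""
    return frozenset(w[1:] for w in index_set if w.startswith(a))


def index_automaton(language, alphabet):
    """Build the index automaton of a finite language by exploring quotients."""
    start = frozenset(language)          # the initial state is L itself
    states = {start}
    delta = {}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for a in alphabet:
            target = left_quotient(a, current)
            if not target:
                # xa is not a prefix of an accepted string: no transition.
                continue
            delta[(current, a)] = target
            if target not in states:
                states.add(target)
                queue.append(target)
    # Accepting states are the index sets that contain the empty string.
    accepting = {s for s in states if "" in s}
    return start, states, delta, accepting


def accepts(automaton, word):
    """Run the (partial) deterministic automaton on word."""
    start, _, delta, accepting = automaton
    state = start
    for a in word:
        state = delta.get((state, a))
        if state is None:
            return False
    return state in accepting


# Hypothetical example language, chosen only for illustration.
example = index_automaton({"ab", "ac", "bc"}, "abc")
print(len(example[1]))           # number of index sets reached
print(accepts(example, "ab"))    # membership test via the automaton
```

Because empty quotients are skipped, the sketch produces a partial automaton; adding the empty set as a sink state would make it total.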

Theorem 15 (Myhill-Nerode) $L(I(L)) = L$.

Proof. By induction on $\vec{x}$ we show that $L \xrightarrow{\vec{x}} I$ if and only if $[\vec{x}]_L = I$. If $\vec{x} = \varepsilon$, the claim reads: $L = I$ if and only if $[\varepsilon]_L = I$. But $[\varepsilon]_L = L$, so the claim holds. Next, let $\vec{x} = \vec{y}a$. By induction hypothesis, $L \xrightarrow{\vec{y}} J$ if and only if $[\vec{y}]_L = J$. Now, $J \xrightarrow{a} a\backslash J = [\vec{y}a]_L$. So, $L \xrightarrow{\vec{x}} a\backslash J = [\vec{x}]_L$, as promised.

Now, $\vec{x}$ is accepted by $I(L)$ if and only if there is a computation from $L$ to a set $[\vec{y}]_L$ containing $\varepsilon$. By the above this is equivalent to $\varepsilon \in [\vec{x}]_L$, which means $\vec{x} \in L$.

□<br />
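As an illustration, take the hypothetical language $L = \{ab, bb\}$ (an assumption made for this example, not a language from the text). The index sets and the accepting computation for $ab$ then work out as follows:

```latex
\begin{align*}
[\varepsilon]_L &= L = \{ab, bb\}, \\
[a]_L &= a\backslash L = \{b\}, \qquad [b]_L = b\backslash L = \{b\}, \\
[ab]_L &= b\backslash \{b\} = \{\varepsilon\}, \\
L &\xrightarrow{a} \{b\} \xrightarrow{b} \{\varepsilon\} \ni \varepsilon
  \quad\Longrightarrow\quad ab \in L(I(L)).
\end{align*}
```

The automaton reaches $[ab]_L$ after reading $ab$, and this set contains $\varepsilon$ exactly because $ab \in L$. Note also that $[a]_L = [b]_L$, so both letters lead from $L$ to the same state.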

Given an automaton $A$ and a state $q$, put

(129) $[q] := \{\vec{x} : \text{there is } q' \in F \text{ such that } q \xrightarrow{\vec{x}} q'\}$

It is easy to see that for every state $q$ reachable from $i_0$ there is a string $\vec{x}$ such that $[q] \subseteq [\vec{x}]_L$. Namely, let $\vec{x}$ be such that $i_0 \xrightarrow{\vec{x}} q$. Then for all $\vec{y} \in [q]$, $\vec{x}\vec{y} \in L(A) = L$, by definition of $[q]$. Hence $[q] \subseteq [\vec{x}]_L$. Conversely, for every $[\vec{x}]_L$ there must be a state $q$ such that $[q] \subseteq [\vec{x}]_L$. Again, $q$ is found as a state such that $i_0 \xrightarrow{\vec{x}} q$. Suppose now that $A$ is deterministic and total. Then for each string $\vec{x}$ there is exactly one state $q$ such that $i_0 \xrightarrow{\vec{x}} q$, and for this state $[q] \subseteq [\vec{x}]_L$. In fact $[q] = [\vec{x}]_L$. For if $\vec{y} \in [\vec{x}]_L$ then $\vec{x}\vec{y} \in L$, whence $i_0 \xrightarrow{\vec{x}\vec{y}} q' \in F$ for some $q'$. Since the automaton is deterministic, $i_0 \xrightarrow{\vec{x}} q \xrightarrow{\vec{y}} q'$, whence $\vec{y} \in [q]$.
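The identity $[q] = [\vec{x}]_L$ can be checked on a concrete automaton, at least up to a length bound. The sketch below assumes a hypothetical total deterministic automaton over $\{a, b\}$ that accepts the strings ending in `ab`. The transition table `DELTA`, the bound `max_len` and all function names are assumptions made for this illustration. Since $[q]$ is infinite in general, only strings up to `max_len` are compared.

```python
from itertools import product

# Hypothetical total DFA over {a, b} for "strings ending in ab".
ALPHABET = "ab"
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
START, FINAL = 0, {2}


def run(state, word):
    """Follow the unique computation on word starting from state."""
    for a in word:
        state = DELTA[(state, a)]
    return state


def words(max_len):
    """Enumerate all strings over ALPHABET of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            yield "".join(letters)


def state_language(q, max_len):
    """[q] as in (129), restricted to strings of length <= max_len."""
    return {y for y in words(max_len) if run(q, y) in FINAL}


def index_set(x, max_len):
    """[x]_L for L = L(A), restricted to strings of length <= max_len."""
    return {y for y in words(max_len) if run(START, x + y) in FINAL}


# For every short string x, the state reached on x has [q] = [x]_L (bounded).
print(all(state_language(run(START, x), 3) == index_set(x, 3)
          for x in words(3)))
```

In this example the three states have pairwise different sets $[q]$, so no two of them could be merged without changing the language.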

It follows now that the index automaton is the smallest deterministic and total automaton that recognizes the language. The next question is: how do we make that automaton? There are two procedures; one starts from a given automaton,
