13.11.2014 Views

Introduction to Computational Linguistics

Introduction to Computational Linguistics

Introduction to Computational Linguistics

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

14. Digression: Time Complexity 52<br />

would lead one <strong>to</strong> believe. Here is another algorithm: call a state q n–reachable if<br />

⃗x<br />

there is a word of length ≤ n such that i 0 → q; call it reachable if it is n–reachable<br />

for some n ∈ ω. L(A) is nonempty if some q ∈ F is reachable. Now, denote by<br />

R n the set of n–reachable states. R 0 := {i 0 }. In the nth step we compute R n+1 by<br />

taking all successors of points in R n . If R n+1 = R n we are done. (Checking this<br />

takes |Q| steps. Alternatively, we need <strong>to</strong> iterate the construction at most |Q| − 1<br />

times. Because if at each step we add some state, then R n grows by at least 1, so<br />

this can happen only |Q| − 1 times, for then R n has at least |Q| many elements.<br />

On the other hand, it is a subset of Q, so it can never have more elements.) Let<br />

us now see how much time we need. Each step takes R n × |A| ≤ |Q| × |A| steps.<br />

Since n ≤ |Q|, we need at most |Q| − 1 steps. So, we need c · |A||Q| 2 steps for some<br />

constant c.<br />

Consider however an algorithm that takes n 2 steps.<br />

(135)<br />

n n 2 n n 2<br />

1 1 11 121<br />

2 4 12 144<br />

3 9 13 169<br />

4 16 14 196<br />

5 25 15 225<br />

6 36 16 256<br />

7 49 17 289<br />

8 64 18 324<br />

9 81 19 361<br />

10 100 20 400<br />

For n = 10 it takes 100 steps, or a tenth of a millisecond, for n = 20 it takes 400,<br />

and for n = 30 only 900, roughly a millisecond. Only if you double the size of the<br />

input, the number of steps quadruple. Or, if you want the number of steps <strong>to</strong> grow<br />

by a fac<strong>to</strong>r 1024, you have <strong>to</strong> multiply the length of the input by 32! An input of<br />

even a thousand creates a need of only one million steps and takes a second on<br />

our machine.<br />

The algorithm described above is not optimal. There is an algorithm that is<br />

even faster: It goes as follows. We start with R −1 := ∅ and S 0 = {q 0 }. Next, S 1<br />

is the set of successors of S 0 minus R 0 , and R 0 := S 0 . In each step, we have two<br />

sets, S i and R i−1 . R i−1 is the set of i−1–reachable points, and S i is the set of points<br />

that can be reached from it it one step, but which are not in R i−1 . In the next step

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!