13.11.2014 Views

Introduction to Computational Linguistics




Consider again your computer. On an input of length 65536 (= 2^16) it takes one
second under the algorithm just described, while the naive algorithm would require
it to run for 2159 seconds, which is more than half an hour.

In practice, one does not want to spell out in painful detail how many steps
an algorithm consumes. Therefore, a simplifying notation is used. One writes that
a problem is in O(n) if there is a constant C and a number n_0 such that for every
input of length n ≥ n_0 the algorithm takes at most C · n steps to compute the
solution. (One says that the estimate holds for ‘almost all’ inputs if it holds only
from a certain point onwards.) This notation also makes sense in view of the fact
that it is not clear how much time an individual step takes, so that the time
consumption cannot really be measured in seconds (which is what is really of
interest for us). If tomorrow computers can compute twice as fast, everything runs
in less time, but the number of steps stays the same.

Notice that O(bn + a) = O(bn) = O(n). It is worth understanding why. First,
assume that n ≥ a. Then (b + 1)n = bn + n ≥ bn + a. This means that for almost all
n: (b + 1)n ≥ bn + a. Next, O((b + 1)n) = O(n), since O((b + 1)n) effectively means
that there is a constant C such that for almost all n the complexity is ≤ C(b + 1)n.
Now put D := C(b + 1). Then there is a constant (namely D) such that for almost
all n the complexity is ≤ Dn. Hence the problem is in O(n).
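The first step of this argument can be checked numerically. The following is only a sketch: the function names and the sample values a = 10 and b = 3 are made up for illustration, and a finite check up to some limit is of course no substitute for the proof.

```python
def linear_cost(n, a, b):
    """Hypothetical step count b*n + a for an input of length n."""
    return b * n + a


def bound_holds_from(n0, a, b, upto=10_000):
    """Check that b*n + a <= (b + 1)*n for every n with n0 <= n <= upto."""
    return all(linear_cost(n, a, b) <= (b + 1) * n for n in range(n0, upto + 1))


a, b = 10, 3  # illustrative constants, not taken from the text

# From the threshold n = a onwards the bound holds ...
print(bound_holds_from(a, a, b))

# ... but below the threshold it may fail, which is why the estimate is
# only claimed for 'almost all' n.
print(linear_cost(5, a, b) <= (b + 1) * 5)
```

With these values the first check succeeds and the second one fails, since 3 · 5 + 10 = 25 exceeds 4 · 5 = 20.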

Also O(cn^2 + bn + a) = O(n^2) and so on. In general, the term with the highest
exponent eventually outgrows the others by any given margin. Polynomial complexity
is therefore measured only in terms of the leading exponent. This makes
calculations much simpler.
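The quadratic case can be verified in the same way as the linear one. The following is a sketch under the added assumption that a, b and c are nonnegative constants; the threshold n ≥ 1 and the constant D are choices made here for illustration.

```latex
\begin{align*}
\text{For } n \ge 1:\quad & bn \le bn^2 \quad\text{and}\quad a \le an^2, \\
\text{hence}\quad & cn^2 + bn + a \;\le\; (a + b + c)\,n^2 \;=\; D\,n^2,
  \qquad D := a + b + c.
\end{align*}
```

So an algorithm that takes at most cn^2 + bn + a steps takes at most Dn^2 steps for almost all n, and the problem is in O(n^2).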

15 Finite State Transducers

Finite state transducers are similar to finite state automata. You can think of them
as finite state automata that leave a trace of their actions in the form of a string.
However, the more popular way is to think of them as translation devices with
finite memory. A finite state transducer is a sextuple

(144) T = ⟨A, B, Q, i_0, F, δ⟩

where A and B are alphabets, Q a finite set (the set of states), i_0 the initial state,
F the set of final states and

(145) δ ⊆ ℘(A_ε × Q × B_ε × Q)
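To make the definition concrete, here is a minimal Python sketch of such a sextuple. Several things in it are assumptions that the definition above does not settle: δ is stored as a plain set of quadruples (a, q, b, q2); each quadruple is read as "in state q, consume a, emit b, move to q2"; the empty string stands for ε; and the names `Transducer`, `translations` and `max_out_len` are invented for this example.

```python
from collections import deque
from typing import NamedTuple

EPS = ""  # the empty string plays the role of ε (an assumption of this sketch)


class Transducer(NamedTuple):
    """The sextuple <A, B, Q, i0, F, delta>, field by field."""
    A: frozenset      # input alphabet
    B: frozenset      # output alphabet
    Q: frozenset      # states
    i0: object        # initial state
    F: frozenset      # final states
    delta: frozenset  # quadruples (a, q, b, q2)


def translations(t, word, max_out_len=50):
    """Return every output t can emit while reading all of `word`.

    Assumed reading of (a, q, b, q2): in state q, consume a (nothing if
    a == EPS), emit b (nothing if b == EPS), and move to state q2.
    Outputs longer than max_out_len are cut off so the search terminates.
    """
    results = set()
    seen = set()
    queue = deque([(t.i0, 0, "")])  # (state, input position, output so far)
    while queue:
        config = queue.popleft()
        if config in seen or len(config[2]) > max_out_len:
            continue
        seen.add(config)
        state, pos, out = config
        if pos == len(word) and state in t.F:
            results.add(out)
        for a, q, b, q2 in t.delta:
            if q != state:
                continue
            if a == EPS:
                queue.append((q2, pos, out + b))
            elif pos < len(word) and word[pos] == a:
                queue.append((q2, pos + 1, out + b))
    return results


# A one-state example: rewrite a as x and b as y, and delete c.
example = Transducer(
    A=frozenset("abc"), B=frozenset("xy"), Q=frozenset({0}), i0=0,
    F=frozenset({0}),
    delta=frozenset({("a", 0, "x", 0), ("b", 0, "y", 0), ("c", 0, EPS, 0)}),
)
print(translations(example, "abba"))  # {'xyyx'}
print(translations(example, "acb"))   # {'xy'}
```

The search runs over configurations (state, input position, output so far), which matches the picture of an automaton that leaves a trace of its actions in the form of a string. A set of results is returned because nothing in the definition forces δ to be deterministic.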
