13.11.2014 Views

Introduction to Computational Linguistics



Notice however that the transducer can be used to translate any language. In fact, the image of a context free language under R_T can be shown to be context free as well.

Let us observe that the construction above can be generalized. Let A and B be alphabets, and f a function assigning each letter of A a word over B. We extend f to words over A as follows.

(157) f(ε) := ε, f(x⃗ · a) := f(x⃗) · f(a)

The translation function f can be effected by a finite state machine in the following way. The initial state is i_0. On input a the machine goes into state q_a and outputs ε. Then it returns to i_0 in one or several steps, outputting f(a). Then it is ready to take the next input. However, there are more complex functions that can be calculated with transducers.
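The extension of f to words, and the machine that computes it, can be sketched in Python. This is a minimal illustration under stated assumptions: the table `F` is a made-up letter-to-word map, and the names `extend` and `run_machine` are hypothetical, not taken from the text.

```python
# Hypothetical map f from A = {a, b, c} to words over B = {0, 1}.
F = {"a": "0", "b": "10", "c": "11"}

def extend(f, word):
    """Extend f to words: f(ε) = ε and f(x·a) = f(x)·f(a)."""
    if word == "":
        return ""
    return extend(f, word[:-1]) + f[word[-1]]

def run_machine(f, word):
    """Simulate the machine described above, letter by letter."""
    state, output = "i_0", []
    for a in word:
        state = ("q", a)      # on input a: go to q_a, output ε
        for b in f[a]:        # emit f(a) one symbol per step
            output.append(b)
        state = "i_0"         # back in i_0, ready for the next input
    return "".join(output)
```

Both functions should agree on every word over A; for example, `extend(F, "abc")` and `run_machine(F, "abc")` both give `"01011"`.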

16 Finite State Morphology

One of the most frequent applications of transducers is in morphology. Practically all morphology is finite state. This means the following. There is a finite state transducer that translates a gloss (= deep morphological analysis) into surface morphology. For example, there is a simple machine that puts English nouns into the plural. It has two states, 0 and 1; 0 is initial, 1 is final. The transitions are as follows (we only use lower case letters for ease of exposition).

(158) 〈0, a, 0, a〉, 〈0, b, 0, b〉, . . . , 〈0, z, 0, z〉, 〈0, ε, 1, s〉.

The machine takes an input and repeats it, and finally attaches s. We shall later see how we can deal with the full range of plural forms including exceptional plurals. We can also write a machine that takes a deep representation, such as car plus singular or car plus plural, and outputs car in the first case and cars in the second. For this machine, the input alphabet has two additional symbols, say R and S, and the machine works as follows.

(159) 〈0, a, 0, a〉, 〈0, b, 0, b〉, . . . , 〈0, z, 0, z〉, 〈0, R, 1, ε〉, 〈0, S, 1, s〉.

This machine accepts one R or S at the end and transforms it into ε in the case of R and into s otherwise. As explained above we can turn the machine around.
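The machines (158) and (159), and the effect of turning a machine around, can be tried out with a small simulator. The following is a sketch, not a definitive implementation: transitions are encoded as Python tuples (state, input, state, output), and the names `translate`, `invert`, `PLURAL` and `DEEP_TO_SURFACE` are assumptions made for illustration. The simulator also assumes that the machine has no cycle of ε-input transitions.

```python
from string import ascii_lowercase

EPS = ""  # the empty word ε

def copy_letters():
    """Transitions <0, c, 0, c> for every lower case letter c."""
    return [(0, c, 0, c) for c in ascii_lowercase]

# Machine (158): copy the input, then attach s.
PLURAL = copy_letters() + [(0, EPS, 1, "s")]
# Machine (159): R (singular) becomes ε, S (plural) becomes s.
DEEP_TO_SURFACE = copy_letters() + [(0, "R", 1, EPS), (0, "S", 1, "s")]

def invert(transitions):
    """Turn the machine around by swapping input and output labels."""
    return [(p, out, q, inp) for (p, inp, q, out) in transitions]

def translate(transitions, word, initial=0, final=(1,)):
    """Return the sorted outputs of all accepting runs on word.
    Assumes there is no cycle of ε-input transitions."""
    results = set()
    stack = [(initial, 0, "")]  # (state, input position, output so far)
    while stack:
        state, pos, out = stack.pop()
        if pos == len(word) and state in final:
            results.add(out)
        for (p, inp, q, o) in transitions:
            if p != state:
                continue
            if inp == EPS:
                stack.append((q, pos, out + o))           # no input consumed
            elif word.startswith(inp, pos):
                stack.append((q, pos + len(inp), out + o))
    return sorted(results)
```

With these definitions, `translate(DEEP_TO_SURFACE, "carS")` gives `["cars"]`, while the inverted machine applied to `"cars"` gives two analyses, `["carS", "carsR"]`: the final s may be read either as the plural marker or as part of the stem.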
