13.11.2014 Views

Introduction to Computational Linguistics


to actually see the set, you have to tell OCaml how to show it to you. One way
of doing that is to convert the set into a list. There is a function called elements
that converts the set into a list. Since OCaml has a predefined way of communicating
lists (which we explained above), you can now look at the elements
without trouble. However, what you are looking at are members of a list, not those
of the set from which the list was compiled. This can be a source of mistakes in
programming. Now if you type, say,

(52)

# let h = PStringSet.elements st;;

OCaml will incidentally give you the list. It is important to realize that the program
has no idea how to make itself understood to you if you ask it for the value
of an object of a newly defined type. You have to tell it how you want it to show
you.
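As a minimal sketch of this step: the definition of PStringSet is not repeated in this section, so the code below assumes it is a set of pairs of strings built with Set.Make, with the standard polymorphic comparison as its ordering. The contents of st are made up for illustration.

```ocaml
(* Assumption: the elements are pairs of strings, ordered by the
   standard polymorphic comparison. *)
module PStrings = struct
  type t = string * string
  let compare = Stdlib.compare
end

module PStringSet = Set.Make (PStrings)

(* st is an illustrative set with two members. *)
let st =
  PStringSet.add ("dog", "cat")
    (PStringSet.add ("cat", "mouse") PStringSet.empty)

(* elements converts the set into a list, sorted in increasing order
   with respect to PStrings.compare. *)
let h = PStringSet.elements st

(* h is an ordinary list of pairs, so it can be printed with List.iter. *)
let () =
  List.iter (fun (x, y) -> Printf.printf "(%s, %s)\n" x y) h
```

Note that h is a list: adding a pair to st afterwards does not change h.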

Also, once sets are defined, a comparison function is available. That is to say,
the sets are also ordered linearly by PStringSet.compare. This is useful, for it
makes it easy to define sets of sets. Notice that PStringSet.compare is a prefix
function: both arguments follow it, in the same order in which they would appear
in infix notation. It returns an integer rather than a truth value:
PStringSet.compare f g is negative if f precedes g in this ordering, zero if the
two sets are equal, and positive if f follows g. Beware of the argument order!
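The following sketch illustrates the use of compare and the construction of sets of sets. It again assumes that PStringSet is a set of pairs of strings. The names PStringSetSet, f, g and family are hypothetical and chosen only for this example.

```ocaml
(* Assumption: same element module as for PStringSet elsewhere. *)
module PStrings = struct
  type t = string * string
  let compare = Stdlib.compare
end

module PStringSet = Set.Make (PStrings)

(* PStringSet provides a type t and a function compare, which is all
   that Set.Make asks of its argument. So it can serve as the element
   module of a set of sets. *)
module PStringSetSet = Set.Make (PStringSet)

let f = PStringSet.singleton ("a", "b")
let g = PStringSet.add ("c", "d") f

(* compare returns an integer, not a boolean. *)
let c = PStringSet.compare f g

(* Adding f twice has no effect: family has two members. *)
let family =
  PStringSetSet.add f (PStringSetSet.add g (PStringSetSet.singleton f))

let () =
  Printf.printf "f and g equal: %b\n" (c = 0);
  Printf.printf "members of family: %d\n" (PStringSetSet.cardinal family)
```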

7 Hash Tables

This section explains some basics about hash tables. Suppose there is a function
that is based on a finite lookup table (so there are finitely many possible
inputs), and you want to compute this function as fast as possible. This is the
moment when you want to consider using hash tables. They are an implementation
of a fast lookup procedure. You make the hash table using magical
incantations similar to those for sets. First you need to declare from what kind
of objects to what kind of objects the function is working. Also, you sometimes
(but not always) need to issue a function that assigns every input value a
number (called key); equal inputs must receive the same number. So, you first
define a module of inputs, after which you issue Hashtbl.Make. This looks as follows.

(53)

module HashedTrans =<br />

struct
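A minimal sketch of the general pattern described in this section, separate from the HashedTrans listing: the module names HashedPair and PairTbl, the choice of string pairs as inputs, and the stored values are all assumptions made for illustration. The argument of Hashtbl.Make must supply a type t, an equality test equal, and a function hash that sends equal inputs to the same number.

```ocaml
(* Hypothetical module of inputs: pairs of strings. *)
module HashedPair = struct
  type t = string * string
  (* Two inputs count as the same key when they are structurally equal. *)
  let equal (a : t) (b : t) = a = b
  (* Hashtbl.hash is the generic hash function of the standard library. *)
  let hash (p : t) = Hashtbl.hash p
end

module PairTbl = Hashtbl.Make (HashedPair)

(* A table from pairs of strings to integers; 16 is only an initial
   size hint, the table grows as needed. *)
let table : int PairTbl.t = PairTbl.create 16

let () =
  PairTbl.add table ("a", "b") 1;
  PairTbl.add table ("b", "c") 2

(* The function given by the table: Some v for a stored input,
   None otherwise. *)
let lookup key = PairTbl.find_opt table key
```

With these definitions, lookup ("a", "b") evaluates to Some 1, and an input that was never added yields None.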
