Introduction to Computational Linguistics
to actually see the set, you have to tell OCaML how to show it to you. One way
of doing that is to convert the set into a list. There is a function called elements
that does exactly this. Since OCaML has a predefined way of communicating
lists (which we explained above), you can now look at the elements
without trouble. However, what you are looking at are members of a list, not those
of the set from which the list was compiled. This can be a source of mistakes in
programming. Now if you type, say,
(52)<br />
# let h = PStringSet.elements st;;
OCaML will incidentally give you the list. It is important to realize that the program
has no idea how to make itself understood to you if you ask it for the value
of an object of a newly defined type. You have to tell it how you want it to show
you.
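As a minimal sketch of the step just described, the following builds a small set and converts it into a list. The definition of PStringSet given here is only a stand-in: it assumes that the elements are pairs of strings and that the set module was made with Set.Make, which may differ in detail from the definition used earlier. The name st and the sample pairs are likewise made up for illustration.

```ocaml
(* Stand-in definition; the element type (pairs of strings) is an assumption. *)
module PStrings = struct
  type t = string * string
  (* Use the built-in structural comparison on pairs. *)
  let compare (x : t) (y : t) = compare x y
end

module PStringSet = Set.Make (PStrings)

(* A small set; adding the same pair twice has no effect. *)
let st =
  PStringSet.add ("a", "b")
    (PStringSet.add ("c", "d")
       (PStringSet.add ("a", "b") PStringSet.empty))

(* elements returns the members as a list, sorted by PStrings.compare. *)
let h = PStringSet.elements st

(* A list of pairs of strings is something we can print ourselves. *)
let () =
  List.iter (fun (x, y) -> Printf.printf "(%s, %s)\n" x y) h
```

Note that h is an ordinary list. Changing h afterwards does not change st.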
Also, once sets are defined, a comparison function is available. That is to say,
the sets are also ordered linearly by PStringSet.compare. This is useful, for it
makes it easy to define sets of sets. Notice that PStringSet.compare takes its
arguments to the right and returns an integer, not a truth value. So,
PStringSet.compare f g is negative if f comes before g in the ordering, zero if
the two sets are equal, and positive if f comes after g. Beware!
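The following sketch shows the comparison at work and uses it to make sets of sets. As before, the definition of PStringSet is a stand-in that assumes pairs of strings as elements. The name PStringSetSet and the sample sets f and g are hypothetical. Which of two distinct sets comes first is decided by the library, so the code only relies on the sign convention.

```ocaml
(* Stand-in definition; the element type (pairs of strings) is an assumption. *)
module PStrings = struct
  type t = string * string
  let compare (x : t) (y : t) = compare x y
end

module PStringSet = Set.Make (PStrings)

let f = PStringSet.singleton ("a", "b")
let g = PStringSet.singleton ("c", "d")

(* compare returns an integer: zero for equal sets, nonzero otherwise. *)
let same = PStringSet.compare f f
let diff = PStringSet.compare f g

(* A set module has a type t and a compare, so it can be fed to Set.Make again. *)
module PStringSetSet = Set.Make (PStringSet)

(* f is added twice but is stored only once. *)
let ss =
  PStringSetSet.add f (PStringSetSet.add g (PStringSetSet.singleton f))

let n = PStringSetSet.cardinal ss
```

Swapping the two arguments of PStringSet.compare flips the sign of the result, which is why the order of the arguments matters.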
7 Hash Tables<br />
This section explains some basics about hash tables. Suppose there is a function
which is based on a finite look up table (so, there are finitely many possible
inputs) and you want to compute this function as fast as possible. This is the
moment when you want to consider using hash tables. They are an implementation
of a fast look up procedure. You make the hash table using magical
incantations similar to those for sets. First you need to declare from what kind
of objects to what kind of objects the function is working. Also, you sometimes
(but not always) need to supply a function that assigns every input value a unique
number (called key). So, you first define a module of inputs, after which you issue
Hashtbl.Make. This looks as follows.
(53)<br />
module HashedTrans =<br />
struct