Introduction to Computational Linguistics
6. Sets and Functors
write it down twice since the set is written out linearly. In the same way, OCaML
stores sets in a particular way, here in the form of a binary branching tree. Next,
OCaML demands from you that you order the elements linearly, in advance. You
can order them in any way you please, but given two distinct elements a and b,
either a < b or b < a must hold. This is needed to access the set, to define set
union, and so on. The best way to think of a set is as a list of objects ordered
in a strictly ascending sequence. If you want to access an element, you can say:
take the least of the elements. This picks out an element. And it picks out exactly
one. The latter is important because OCaML operates deterministically. Every
operation you define must be total and deterministic.
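The picture of a set as a strictly ascending list can be sketched directly. What follows is only an illustration of that picture, not the way OCaML's own set library stores sets (which, as noted, is a binary branching tree). The names as_set and least are hypothetical helpers introduced here, and the built-in ordering on strings is assumed.

```ocaml
(* A set modelled as a strictly ascending list without duplicates.
   Illustration only: as_set and least are hypothetical helpers. *)

(* Sort the elements and drop duplicates, using the built-in ordering. *)
let as_set (xs : string list) : string list =
  List.sort_uniq compare xs

(* "Take the least of the elements": total (None on the empty set)
   and deterministic (always the head of the ascending list). *)
let least (s : string list) : string option =
  match s with
  | [] -> None
  | x :: _ -> Some x

(* The same set results however the elements were written down. *)
let s = as_set ["dog"; "cat"; "dog"; "ant"]
let first = least s
```

Because the list is strictly ascending, writing an element down twice or in a different position does not change the result, and least always picks out exactly one element whenever the set is nonempty.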
If elements must always be ordered—how can we arrange the ordering? Here<br />
is how.<br />
(45)
module OPStrings =
  struct
    type t = string * string
    let compare x y =
      if x = y then 0
      else if fst x > fst y || (fst x = fst y
        && snd x > snd y) then 1
      else -1
  end;;
(The indentation is just for aesthetic purposes and not necessary.) This is what<br />
OCaML answers:<br />
(46)
module OPStrings :
  sig
    type t = string * string
    val compare : 'a * 'b -> 'a * 'b -> int
  end
It says that there is now a module called OPStrings with the following signature:
there is a type t and a function compare, and their types inside the signature are
given. A signature, by the way, is a set of functions together with their types. The
signature is given inside sig ... end.
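To see how the comparison function in (45) behaves, here is a small sketch that repeats the module and applies compare to a few pairs. The sample pairs are made up for illustration. The last lines assume the standard library functor Set.Make, which accepts a module providing a type t and a function compare of this shape. The names PStringSet, pairs, smallest and ordered are hypothetical choices.

```ocaml
(* The module from (45), repeated so that this sketch is self-contained. *)
module OPStrings =
  struct
    type t = string * string
    let compare x y =
      if x = y then 0
      else if fst x > fst y || (fst x = fst y && snd x > snd y) then 1
      else -1
  end

(* Pairs are compared by first component, then by second component. *)
let c1 = OPStrings.compare ("a", "b") ("a", "b")  (* identical pairs *)
let c2 = OPStrings.compare ("b", "a") ("a", "z")  (* first components decide *)
let c3 = OPStrings.compare ("a", "b") ("a", "c")  (* second components decide *)

(* Assumption: a set module built with the standard functor Set.Make. *)
module PStringSet = Set.Make (OPStrings)

(* The duplicate ("a", "c") is stored only once. *)
let pairs = PStringSet.of_list [("b", "a"); ("a", "c"); ("a", "b"); ("a", "c")]

(* The least element with respect to OPStrings.compare. *)
let smallest = PStringSet.min_elt pairs

(* The elements listed in strictly ascending order. *)
let ordered = PStringSet.elements pairs
```

The function returns 0 for equal pairs, 1 if the first argument is the greater one, and -1 otherwise, which is the convention Set.Make expects of a comparison function.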
Now what is this program doing for us? It defines a module, which is a complete<br />
unit that has its own name. We have given it the name OPStrings. The