23.11.2014 Views

Data Structures and Algorithms in Java[1].pdf - Fulvio Frisone

Of course, if there are two or more keys with the same hash value, then two
different entries will be mapped to the same bucket in A. In this case, we say that a
collision has occurred. Clearly, if each bucket of A can store only a single entry,
then we cannot associate more than one entry with a single bucket, which is a
problem in the case of collisions. To be sure, there are ways of dealing with
collisions, which we will discuss later, but the best strategy is to try to avoid them
in the first place. We say that a hash function is "good" if it maps the keys in our
map so as to minimize collisions as much as possible. For practical reasons, we also
would like a hash function to be fast and easy to compute.
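To make the notion of a collision concrete, here is a minimal sketch in Java. The hash function h(k) = k mod N used here, the class name `CollisionDemo`, and the capacity N = 11 are assumptions chosen only for illustration; they are not taken from the text above.

```java
final class CollisionDemo {
    // Hypothetical hash function for illustration only: h(k) = k mod N.
    // Math.floorMod keeps the result in [0, N - 1] even for negative keys.
    static int h(int key, int capacity) {
        return Math.floorMod(key, capacity);
    }

    // Returns true when two distinct keys are mapped to the same bucket.
    static boolean collide(int k1, int k2, int capacity) {
        return k1 != k2 && h(k1, capacity) == h(k2, capacity);
    }

    public static void main(String[] args) {
        int n = 11; // assumed size of the bucket array A
        System.out.println(h(7, n));           // prints 7
        System.out.println(h(18, n));          // prints 7
        System.out.println(collide(7, 18, n)); // prints true
    }
}
```

Keys 7 and 18 differ, yet both land in bucket 7, so a bucket array that holds a single entry per bucket could not store both entries.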

Following the convention in Java, we view the evaluation of a hash function, h(k),
as consisting of two actions: mapping the key k to an integer, called the hash code,
and mapping the hash code to an integer within the range of indices ([0, N − 1]) of a
bucket array. The second mapping is called the compression function. (See Figure 9.3.)

Figure 9.3: The two parts of a hash function: a hash
code and a compression function.
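The two-part view of a hash function can be sketched in Java as follows. The class and method names (`TwoStepHash`, `hashCodeOf`, `compress`, `h`) are hypothetical, and the compression step shown here, a simple remainder modulo N, is only one possible choice assumed for this sketch.

```java
final class TwoStepHash {
    // Step 1: map the key to an integer hash code.
    // The result may be any int, including a negative one.
    static int hashCodeOf(Object key) {
        return key.hashCode();
    }

    // Step 2: compress the hash code into the index range [0, N - 1].
    // Assumed compression function: remainder modulo N. Math.floorMod
    // is used because the % operator can return a negative value.
    static int compress(int hashCode, int n) {
        return Math.floorMod(hashCode, n);
    }

    // The full hash function h(k) is the composition of the two steps.
    static int h(Object key, int n) {
        return compress(hashCodeOf(key), n);
    }

    public static void main(String[] args) {
        int n = 13; // assumed bucket array size
        System.out.println(compress(-1, n)); // prints 12
        System.out.println(h("Aa", n));      // prints 6
    }
}
```

Keeping the two steps separate means the hash code of a key does not depend on the size of the bucket array; only the compression step needs to know N.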

9.2.3 Hash Codes

The first action that a hash function performs is to take an arbitrary key k in our
map and assign it an integer value. The integer assigned to a key k is called the
hash code for k. This integer value need not be in the range [0, N − 1], and may even
be negative, but we desire that the set of hash codes assigned to our keys should
avoid collisions as much as possible. For if the hash codes of our keys cause
collisions, then there is no hope for our compression function to avoid them. In
addition, to be consistent with all of our keys, the hash code we use for a key k
should be the same as the hash code for any key that is equal to k.
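The consistency requirement in the last sentence can be illustrated with a small Java key class. `GridPoint` is a hypothetical class invented for this sketch; it overrides equals and hashCode together so that any two keys that are equal also receive the same hash code.

```java
import java.util.Objects;

// Hypothetical key type used only for illustration.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Two points are equal when both coordinates match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    // Built from exactly the fields that equals compares, so equal
    // points always produce the same hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        GridPoint a = new GridPoint(2, 3);
        GridPoint b = new GridPoint(2, 3);
        System.out.println(a.equals(b));                  // prints true
        System.out.println(a.hashCode() == b.hashCode()); // prints true
    }
}
```

The converse does not hold: distinct keys may still share a hash code. In Java, for example, the strings "Aa" and "BB" both have hash code 2112, so no compression function applied afterward can place them in different buckets.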
