23.11.2014 Views

Data Structures and Algorithms in Java[1].pdf - Fulvio Frisone




however, for if there is a repeated pattern of hash codes of the form pN + q for several different p's, then there will still be collisions.
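To make the pattern concrete, here is a minimal sketch. It assumes the compression function under discussion is the plain reduction i mod N; the class name `DivisionPatternDemo`, the method `compress`, and the values N = 11 and q = 3 are illustrative choices, not taken from the text.

```java
final class DivisionPatternDemo {
    /** Reduces a hash code to a bucket index in [0, n-1] (assumed form: i mod N). */
    static int compress(int hashCode, int n) {
        return Math.floorMod(hashCode, n);
    }

    public static void main(String[] args) {
        int n = 11; // prime bucket-array capacity (illustrative)
        int q = 3;  // offset shared by the patterned hash codes (illustrative)
        for (int p = 0; p < 5; p++) {
            int hashCode = p * n + q; // hash code of the form pN + q
            System.out.println(hashCode + " -> bucket " + compress(hashCode, n));
        }
    }
}
```

Every hash code generated in the loop leaves remainder q when divided by N, so all of them land in the same bucket even though N is prime.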

The MAD Method

A more sophisticated compression function, which helps eliminate repeated patterns in a set of integer keys, is the multiply, add, and divide (or "MAD") method. This method maps an integer i to

|ai + b| mod N,

where N is a prime number, and a > 0 (called the scaling factor) and b ≥ 0 (called the shift) are integer constants randomly chosen at the time the compression function is determined so that a mod N ≠ 0. This compression function is chosen in order to eliminate repeated patterns in the set of hash codes and get us closer to having a "good" hash function, that is, one such that the probability that any two different keys collide is 1/N. This good behavior would be the same as we would have if these keys were "thrown" into A uniformly at random.
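The formula above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name `MadCompression`, the factory method `random`, and the choice to draw a from [1, N − 1] and b from [0, N − 1] are hypothetical, and the caller is trusted to pass a prime N.

```java
import java.util.Random;

final class MadCompression {
    private final int n;  // bucket-array capacity N, assumed prime by the caller
    private final long a; // scaling factor: a > 0 and a mod N != 0
    private final long b; // shift: b >= 0

    MadCompression(int n, int a, int b) {
        if (n <= 1 || a <= 0 || b < 0 || a % n == 0) {
            throw new IllegalArgumentException("need N > 1, a > 0, b >= 0, a mod N != 0");
        }
        this.n = n;
        this.a = a;
        this.b = b;
    }

    /** Picks a and b at random when the compression function is created (assumed ranges). */
    static MadCompression random(int n, Random rng) {
        int a = 1 + rng.nextInt(n - 1); // in [1, N-1], so a mod N != 0
        int b = rng.nextInt(n);         // in [0, N-1]
        return new MadCompression(n, a, b);
    }

    /** Maps an integer hash code i to |a*i + b| mod N, a value in [0, N-1]. */
    int compress(int hashCode) {
        long scaled = a * hashCode + b; // long arithmetic avoids int overflow
        return (int) (Math.abs(scaled) % n);
    }
}
```

For example, with N = 11, a = 3, and b = 5, the hash code 7 maps to |3·7 + 5| mod 11 = 26 mod 11 = 4.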

With a compression function such as this, which spreads integers fairly evenly in the range [0, N − 1], and a hash code that transforms the keys in our map into integers, we have an effective hash function. Together, such a hash function and a bucket array define the main ingredients of the hash table implementation of the map ADT.

But before we can give the details of how to perform such operations as put, get, and remove, we must first resolve the issue of how we will be handling collisions.

9.2.5 Collision-Handling Schemes

The main idea of a hash table is to take a bucket array, A, and a hash function, h, and use them to implement a map by storing each entry (k,v) in the "bucket" A[h(k)]. This simple idea is challenged, however, when we have two distinct keys, k_1 and k_2, such that h(k_1) = h(k_2). The existence of such collisions prevents us from simply inserting a new entry (k,v) directly in the bucket A[h(k)]. They also complicate our procedure for performing the get(k), put(k, v), and remove(k) operations.
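The problem can be seen in a small sketch of a bucket array that stores one entry per bucket and ignores collisions. The class `NaiveBucketArray` and its methods are hypothetical, and the hash function used here (Java's `String.hashCode` reduced with `Math.floorMod`) is an assumption made for the demonstration. The strings "Aa" and "BB" are distinct keys with the same Java hash code, so they collide for any capacity.

```java
final class NaiveBucketArray {
    private final String[] keys;   // key stored in each bucket, or null if empty
    private final String[] values; // value stored alongside keys[i]

    NaiveBucketArray(int n) {
        keys = new String[n];
        values = new String[n];
    }

    /** h(k): hash code followed by a simple mod-N compression (assumed for this demo). */
    int index(String key) {
        return Math.floorMod(key.hashCode(), keys.length);
    }

    /** Stores (k,v) directly in A[h(k)]; returns true if a different key was displaced. */
    boolean put(String key, String value) {
        int i = index(key);
        boolean displaced = keys[i] != null && !keys[i].equals(key);
        keys[i] = key;
        values[i] = value;
        return displaced;
    }

    /** Returns the value for key, or null if the bucket holds no entry for that key. */
    String get(String key) {
        int i = index(key);
        return key.equals(keys[i]) ? values[i] : null;
    }

    public static void main(String[] args) {
        NaiveBucketArray table = new NaiveBucketArray(13);
        table.put("Aa", "first");
        boolean lost = table.put("BB", "second"); // same bucket as "Aa"
        System.out.println("entry displaced: " + lost);
        System.out.println("get(\"Aa\") = " + table.get("Aa"));
    }
}
```

After the second put, the entry for "Aa" is gone, which is why a collision-handling scheme is needed before put, get, and remove can be defined correctly.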

Separate Chaining

A simple and efficient way of dealing with collisions is to have each bucket A[i] store a small map, M_i, implemented using a list, as described in Section 9.1.1, holding entries (k, v) such that h(k) = i. That is, each separate M_i chains together the entries that hash to index i in a linked list. This collision resolution rule is known as separate chaining. Assuming that we initialize each bucket A[i] to be

