11.07.2015 Views

Data Structures and Algorithm Analysis - Computer Science at ...

Data Structures and Algorithm Analysis - Computer Science at ...

Data Structures and Algorithm Analysis - Computer Science at ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Sec. 9.4 Hashing 327One quality of a good probe sequence is th<strong>at</strong> it will cycle through all slots inthe hash table before returning to the home position. Clearly linear probing (which“skips” slots by one each time) does this. Unfortun<strong>at</strong>ely, not all values for c willmake this happen. For example, if c = 2 <strong>and</strong> the table contains an even numberof slots, then any key whose home position is in an even slot will have a probesequence th<strong>at</strong> cycles through only the even slots. Likewise, the probe sequencefor a key whose home position is in an odd slot will cycle through the odd slots.Thus, this combin<strong>at</strong>ion of table size <strong>and</strong> linear probing constant effectively dividesthe records into two sets stored in two disjoint sections of the hash table. So longas both sections of the table contain the same number of records, this is not reallyimportant. However, just from chance it is likely th<strong>at</strong> one section will become fullerthan the other, leading to more collisions <strong>and</strong> poorer performance for those records.The other section would have fewer records, <strong>and</strong> thus better performance. But theoverall system performance will be degraded, as the additional cost to the side th<strong>at</strong>is more full outweighs the improved performance of the less-full side.Constant c must be rel<strong>at</strong>ively prime to M to gener<strong>at</strong>e a linear probing sequenceth<strong>at</strong> visits all slots in the table (th<strong>at</strong> is, c <strong>and</strong> M must share no factors). For a hashtable of size M = 10, if c is any one of 1, 3, 7, or 9, then the probe sequencewill visit all slots for any key. When M = 11, any value for c between 1 <strong>and</strong> 10gener<strong>at</strong>es a probe sequence th<strong>at</strong> visits all slots for every key.Consider the situ<strong>at</strong>ion where c = 2 <strong>and</strong> we wish to insert a record with key k 1such th<strong>at</strong> h(k 1 ) = 3. The probe sequence for k 1 is 3, 5, 7, 9, <strong>and</strong> so on. If anotherkey k 2 has home position <strong>at</strong> slot 5, then its probe sequence will be 5, 7, 9, <strong>and</strong> so on.The probe sequences of k 1 <strong>and</strong> k 2 are linked together in a manner th<strong>at</strong> contributesto clustering. In other words, linear probing with a value of c > 1 does not solvethe problem of primary clustering. We would like to find a probe function th<strong>at</strong> doesnot link keys together in this way. We would prefer th<strong>at</strong> the probe sequence for k 1after the first step on the sequence should not be identical to the probe sequence ofk 2 . Instead, their probe sequences should diverge.The ideal probe function would select the next position on the probe sequence<strong>at</strong> r<strong>and</strong>om from among the unvisited slots; th<strong>at</strong> is, the probe sequence should be ar<strong>and</strong>om permut<strong>at</strong>ion of the hash table positions. Unfortun<strong>at</strong>ely, we cannot actuallyselect the next position in the probe sequence <strong>at</strong> r<strong>and</strong>om, because then we would notbe able to duplic<strong>at</strong>e this same probe sequence when searching for the key. However,we can do something similar called pseudo-r<strong>and</strong>om probing. In pseudo-r<strong>and</strong>omprobing, the ith slot in the probe sequence is (h(K) + r i ) mod M where r i is theith value in a r<strong>and</strong>om permut<strong>at</strong>ion of the numbers from 1 to M − 1. All insertion<strong>and</strong> search oper<strong>at</strong>ions use the same r<strong>and</strong>om permut<strong>at</strong>ion. The probe function isp(K, i) = Perm[i − 1],where Perm is an array of length M − 1 containing a r<strong>and</strong>om permut<strong>at</strong>ion of thevalues from 1 to M − 1.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!