12.07.2015 Views

A Practical Introduction to Data Structures and Algorithm Analysis

A Practical Introduction to Data Structures and Algorithm Analysis

A Practical Introduction to Data Structures and Algorithm Analysis

SHOW MORE
SHOW LESS
  • No tags were found...

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Sec. 9.4 Hashing 335For example, because the ASCII value for “A” is 65 <strong>and</strong> “Z” is 90, sum willalways be in the range 650 <strong>to</strong> 900 for a string of ten upper case letters. Fora hash table of size 100 or less, a good distribution results with all slots inthe table accepting either two or three of the values in the key range. For ahash table of size 1000, the distribution is terrible because only slots 650 <strong>to</strong>900 can possibly be the home slot for some key value.Example 9.8 Here is a much better hash function for strings.long sfold(String s, int M) {int intLength = s.length() / 4;long sum = 0;for (int j = 0; j < intLength; j++) {char c[] = s.substring(j * 4, (j * 4) + 4).<strong>to</strong>CharArray();long mult = 1;for (int k = 0; k < c.length; k++) {sum += c[k] * mult;mult *= 256;}}char c[] = s.substring(intLength * 4).<strong>to</strong>CharArray();long mult = 1;for (int k = 0; k < c.length; k++) {sum += c[k] * mult;mult *= 256;}}return(Math.abs(sum) % M);This function takes a string as input. It processes the string four bytesat a time, <strong>and</strong> interprets each of the four-byte chunks as a single (unsigned)long integer value. The integer values for the four-byte chunks are added<strong>to</strong>gether. In the end, the resulting sum is converted <strong>to</strong> the range 0 <strong>to</strong> M − 1using the modulus opera<strong>to</strong>r. 2For example, if the string “aaaabbbb” is passed <strong>to</strong> sfold, then the firstfour bytes (“aaaa”) will be interpreted as the integer value 1,633,771,8732 Recall from Section 2.2 that the implementation for n mod m on many C ++ <strong>and</strong> Java compilerswill yield a negative number if n is negative. Implemen<strong>to</strong>rs for hash functions need <strong>to</strong> be careful thattheir hash function does not generate a negative number. This can be avoided either by insuringthat n is positive when computing n mod m, or adding m <strong>to</strong> the result if n mod m is negative.All computation in sfold is done using unsigned long values in part <strong>to</strong> protect against taking themodulus of an negative number.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!