12.07.2015 Views

A Practical Introduction to Data Structures and Algorithm Analysis

occupied the slot but does so no longer. If a tombstone is encountered when searching through a probe sequence, the search procedure is to continue with the search. When a tombstone is encountered during insertion, that slot can be used to store the new record. However, to avoid inserting duplicate keys, it will still be necessary for the search procedure to follow the probe sequence until a truly empty position has been found, simply to verify that a duplicate is not in the table. However, the new record would actually be inserted into the slot of the first tombstone encountered.

The use of tombstones allows searches to work correctly and allows reuse of deleted slots. However, after a series of intermixed insertion and deletion operations, some slots will contain tombstones. This will tend to lengthen the average distance from a record’s home position to the record itself, beyond where it could be if the tombstones did not exist. A typical database application will first load a collection of records into the hash table and then progress to a phase of intermixed insertions and deletions. After the table is loaded with the initial collection of records, the first few deletions will lengthen the average probe sequence distance for records (it will add tombstones).
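The search, insertion, and deletion rules for tombstones described above can be sketched as follows. This is a minimal illustration, not the book's implementation: it assumes linear probing as the collision resolution policy, stores bare keys in place of full records, and uses the hypothetical names `TombstoneHashTable`, `EMPTY`, and `TOMBSTONE`.

```python
EMPTY = object()      # marks a slot that has never held a record
TOMBSTONE = object()  # marks a slot whose record was deleted


class TombstoneHashTable:
    """Closed hash table using linear probing (an assumed policy)."""

    def __init__(self, size=11):
        self.slots = [EMPTY] * size

    def _probe(self, key):
        """Yield every slot index on the probe sequence for key."""
        home = hash(key) % len(self.slots)
        for i in range(len(self.slots)):
            yield (home + i) % len(self.slots)

    def search(self, key):
        """Return the slot index that holds key, or None if it is absent."""
        for pos in self._probe(key):
            slot = self.slots[pos]
            if slot is EMPTY:
                return None      # truly empty slot: key is not in the table
            if slot is TOMBSTONE:
                continue         # tombstone: continue with the search
            if slot == key:
                return pos
        return None

    def insert(self, key):
        """Insert key. Return False for a duplicate or a full table."""
        first_tombstone = None
        for pos in self._probe(key):
            slot = self.slots[pos]
            if slot is EMPTY:
                # No duplicate exists; prefer the first tombstone seen.
                target = pos if first_tombstone is None else first_tombstone
                self.slots[target] = key
                return True
            if slot is TOMBSTONE:
                if first_tombstone is None:
                    first_tombstone = pos   # remember it, but keep checking
            elif slot == key:
                return False                # duplicate key
        if first_tombstone is not None:
            # Whole sequence scanned without a duplicate: reuse the tombstone.
            self.slots[first_tombstone] = key
            return True
        return False                        # table is full

    def delete(self, key):
        """Replace key's slot with a tombstone. Return False if key is absent."""
        pos = self.search(key)
        if pos is None:
            return False
        self.slots[pos] = TOMBSTONE
        return True


# Keys 1, 12, 23, and 34 all have home position 1 in a table of size 11.
table = TombstoneHashTable(11)
for k in (1, 12, 23):
    table.insert(k)          # the keys land in slots 1, 2, and 3
table.delete(12)             # slot 2 becomes a tombstone
found = table.search(23)     # passes the tombstone and finds slot 3
table.insert(34)             # reuses the tombstone in slot 2
```

Note that `insert` keeps probing past the first tombstone until it reaches an empty slot, so a duplicate further along the probe sequence is still detected before the tombstone slot is reused.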
Over time, the average distance will reach an equilibrium point because insertions will tend to decrease the average distance by filling in tombstone slots. For example, after initially loading records into the database, the average path distance might be 1.2 (i.e., an average of 0.2 accesses per search beyond the home position will be required). After a series of insertions and deletions, this average distance might increase to 1.6 due to tombstones. This seems like a small increase, but the average distance beyond the home position has tripled, from 0.2 to 0.6 accesses per search.

Two possible solutions to this problem are

1. Do a local reorganization upon deletion to try to shorten the average path length. For example, after deleting a key, continue to follow the probe sequence of that key and swap records further down the probe sequence into the slot of the recently deleted record (being careful not to remove a key from its probe sequence). This will not work for all collision resolution policies.
2. Periodically rehash the table by reinserting all records into a new hash table. Not only will this remove the tombstones, but it also provides an opportunity to place the most frequently accessed records into their home positions.

9.5 Further Reading

For a comparison of the efficiencies for various self-organizing techniques, see Bentley and McGeoch, “Amortized Analysis of Self-Organizing Sequential Search Heuristics” [BM85]. The text compression example of Section 9.2 comes from
