11.07.2015

Data Structures and Algorithm Analysis - Computer Science at ...




Sec. 9.7 Projects

9.2 Implement the three self-organizing list heuristics count, move-to-front, and transpose. Compare the cost for running the three heuristics on various input data. The cost metric should be the total number of comparisons required when searching the list. It is important to compare the heuristics using input data for which self-organizing lists are reasonable, that is, on frequency distributions that are uneven. One good approach is to read text files. The list should store individual words in the text file. Begin with an empty list, as was done for the text compression example of Section 9.2. Each time a word is encountered in the text file, search for it in the self-organizing list. If the word is found, reorder the list as appropriate. If the word is not in the list, add it to the end of the list and then reorder as appropriate.

9.3 Implement the text compression system described in Section 9.2.

9.4 Implement a system for managing document retrieval. Your system should have the ability to insert (abstract references to) documents into the system, associate keywords with a given document, and to search for documents with specified keywords.

9.5 Implement a database stored on disk using bucket hashing. Define records to be 128 bytes long with a 4-byte key and 120 bytes of data. The remaining 4 bytes are available for you to store necessary information to support the hash table. A bucket in the hash table will be 1024 bytes long, so each bucket has space for 8 records. The hash table should consist of 27 buckets (total space for 216 records with slots indexed by positions 0 to 215) followed by the overflow bucket at record position 216 in the file. The hash function for key value K should be K mod 213. (Note that this means the last three slots in the table will not be home positions for any record.) The collision resolution function should be linear probing with wrap-around within the bucket. For example, if a record is hashed to slot 5, the collision resolution process will attempt to insert the record into the table in the order 5, 6, 7, 0, 1, 2, 3, and finally 4. If a bucket is full, the record should be placed in the overflow section at the end of the file.

Your hash table should implement the dictionary ADT of Section 4.4. When you do your testing, assume that the system is meant to store about 100 or so records at a time.

9.6 Implement the dictionary ADT of Section 4.4 by means of a hash table with linear probing as the collision resolution policy. You might wish to begin with the code of Figure 9.9. Using empirical simulation, determine the cost of insert and delete as α grows (i.e., reconstruct the dashed lines of Figure 9.10). Then, repeat the experiment using quadratic probing and pseudorandom probing. What can you say about the relative performance of these three collision resolution policies?
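The slot arithmetic in Project 9.5 (home position, probe order within a bucket, and byte offset in the file) can be sketched as follows. This is only a minimal illustration under stated assumptions, not a solution to the project: the constant and function names (home_slot, probe_sequence, file_offset) are hypothetical, keys are assumed to be non-negative integers, and no disk I/O or overflow handling is shown.

```python
# Layout constants taken from the project statement.
RECORD_SIZE = 128        # bytes per record: 4-byte key + 120 data + 4 bookkeeping
SLOTS_PER_BUCKET = 8     # 1024-byte bucket / 128-byte record
NUM_BUCKETS = 27         # 27 * 8 = 216 slots, indexed 0..215
HASH_MODULUS = 213       # hash function is K mod 213
OVERFLOW_START = NUM_BUCKETS * SLOTS_PER_BUCKET  # record position 216


def home_slot(key):
    """Home position for a non-negative integer key (hypothetical helper)."""
    return key % HASH_MODULUS


def probe_sequence(home):
    """Slots to try, in order: linear probing that wraps within the home bucket."""
    bucket_start = (home // SLOTS_PER_BUCKET) * SLOTS_PER_BUCKET
    offset = home % SLOTS_PER_BUCKET
    return [bucket_start + (offset + i) % SLOTS_PER_BUCKET
            for i in range(SLOTS_PER_BUCKET)]


def file_offset(slot):
    """Byte offset of a record position, assuming records are stored back to back."""
    return slot * RECORD_SIZE


if __name__ == "__main__":
    # Reproduces the order given in the project text for a record hashed to slot 5.
    print(probe_sequence(5))
```

If every slot in the returned sequence is occupied, the record would go to the overflow section beginning at record position OVERFLOW_START. Slots 213 to 215 can never be produced by home_slot, but they still appear in the probe sequence of the last bucket, which is why they remain usable.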
