11.07.2015 Views

Data Structures and Algorithm Analysis - Computer Science at ...



Chap. 9 Searching

This is a potentially useful observation: for typical “real-life” distributions of record accesses, if the records were ordered by frequency, we would visit on average only 10-15% of the list when doing sequential search. This means that if we had an application that used sequential search, and we wanted to make it go a bit faster (by a constant amount), we could do so without a major rewrite to the system to implement something like a search tree. But that is only true if there is an easy way to (at least approximately) order the records by frequency.

In most applications, we have no means of knowing in advance the frequencies of access for the data records. To complicate matters further, certain records might be accessed frequently for a brief period of time, and then rarely thereafter. Thus, the probability of access for records might change over time (in most database systems, this is to be expected). Self-organizing lists seek to solve both of these problems.

Self-organizing lists modify the order of records within the list based on the actual pattern of record access. Self-organizing lists use a heuristic for deciding how to reorder the list. These heuristics are similar to the rules for managing buffer pools (see Section 8.3). In fact, a buffer pool is a form of self-organizing list. Ordering the buffer pool by expected frequency of access is a good strategy, because typically we must search the contents of the buffers to determine if the desired information is already in main memory.
When ordered by frequency of access, the buffer at the end of the list will be the one most appropriate for reuse when a new page of information must be read. Below are three traditional heuristics for managing self-organizing lists:

1. The most obvious way to keep a list ordered by frequency would be to store a count of accesses to each record and always maintain records in this order. This method will be referred to as count. Count is similar to the least frequently used buffer replacement strategy. Whenever a record is accessed, it might move toward the front of the list if its number of accesses becomes greater than that of a record preceding it. Thus, count will store the records in the order of frequency that has actually occurred so far. Besides requiring space for the access counts, count does not react well to changing frequency of access over time. Once a record has been accessed a large number of times under the frequency count system, it will remain near the front of the list regardless of further access history.
2. Bring a record to the front of the list when it is found, pushing all the other records back one position. This is analogous to the least recently used buffer replacement strategy and is called move-to-front. This heuristic is easy to implement if the records are stored using a linked list. When records are stored in an array, bringing a record forward from near the end of the array will result in a large number of records (slightly) changing position. Move-to-front’s cost is bounded in the sense that it requires at most twice the num-
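
The count and move-to-front heuristics described above can be sketched in a few lines. This is a minimal illustration, not the book's implementation: the function names `count_access` and `move_to_front_access` are hypothetical, records are represented by their keys only, and a Python list (an array) stands in for the list structure. Each function returns the number of records visited by the sequential search, or `None` when the key is absent.

```python
def count_access(records, counts, key):
    """Sequential search with the count heuristic (illustrative sketch).

    records and counts are parallel lists: counts[i] is the number of
    accesses made so far to records[i]. Both are reordered in place.
    """
    for i, rec in enumerate(records):
        if rec == key:
            counts[i] += 1
            j = i
            # Move the record forward while the record preceding it
            # has a strictly smaller access count.
            while j > 0 and counts[j - 1] < counts[j]:
                records[j - 1], records[j] = records[j], records[j - 1]
                counts[j - 1], counts[j] = counts[j], counts[j - 1]
                j -= 1
            return i + 1  # records visited by the search
    return None


def move_to_front_access(records, key):
    """Sequential search with the move-to-front heuristic (illustrative sketch)."""
    for i, rec in enumerate(records):
        if rec == key:
            # On an array-backed list this shifts the i preceding records
            # back one position; a linked list would only relink nodes.
            records.insert(0, records.pop(i))
            return i + 1  # records visited by the search
    return None
```

Because the sketch stores records in an array, `move_to_front_access` pays the shifting cost the text mentions for array storage. A linked-list version would avoid that cost but needs more code.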
