12.07.2015 Views

A Practical Introduction to Data Structures and Algorithm Analysis

364 Chap. 10 Indexing

key value for each block in that cylinder, called the cylinder index. When new records are inserted, they are placed in the correct cylinder’s overflow area (in effect, a cylinder acts as a bucket). If a cylinder’s overflow area fills completely, then a system-wide overflow area is used. Search proceeds by determining the proper cylinder from the system-wide table kept in main memory. The cylinder’s block table is brought in from disk and consulted to determine the correct block. If the record is found in that block, then the search is complete. Otherwise, the cylinder’s overflow area is searched. If that is full, and the record is not found, then the system-wide overflow is searched.

After initial construction of the database, so long as no new records are inserted or deleted, access is efficient because it requires only two disk fetches. The first disk fetch recovers the block table for the desired cylinder. The second disk fetch recovers the block that, under good conditions, contains the record. After many inserts, the overflow list becomes too long, resulting in significant search time as the cylinder overflow area fills up. Under extreme conditions, many searches might eventually lead to the system overflow area. The “solution” to this problem is to periodically reorganize the entire database. This means re-balancing the records among the cylinders, sorting the records within each cylinder, and updating both the system index table and the within-cylinder block table. Such reorganization was typical of database systems during the 1960s and would normally be done each night or weekly.

10.3 Tree-based Indexing

Linear indexing is efficient when the database is static, that is, when records are inserted and deleted rarely or never.
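The ISAM search procedure described before Section 10.3 can be sketched in a few lines. This is a minimal illustration rather than the book's code: the Cylinder class, the isam_search function, the overflow_capacity field, and the convention that each index entry holds the lowest key of its cylinder or block are all assumptions made for the example, and in-memory Python lists stand in for disk blocks.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Cylinder:
    # Hypothetical layout: block_low_keys[i] is the lowest key in blocks[i].
    block_low_keys: list
    blocks: list                                   # each block: sorted keys
    overflow: list = field(default_factory=list)   # unsorted overflow area
    overflow_capacity: int = 4                     # assumed fixed size


def isam_search(key, cyl_low_keys, cylinders, system_overflow):
    """Return a tuple describing where `key` was found, or None."""
    # In-memory system table picks the cylinder (no disk access).
    c = max(bisect_right(cyl_low_keys, key) - 1, 0)
    cyl = cylinders[c]
    # Disk fetch 1: the cylinder's block table picks the block.
    b = max(bisect_right(cyl.block_low_keys, key) - 1, 0)
    # Disk fetch 2: the chosen block itself.
    if key in cyl.blocks[b]:
        return ("block", c, b)
    # Otherwise search the cylinder's overflow area.
    if key in cyl.overflow:
        return ("cylinder overflow", c)
    # Only a full cylinder overflow can have spilled into the system area.
    if len(cyl.overflow) >= cyl.overflow_capacity and key in system_overflow:
        return ("system overflow",)
    return None


# Made-up sample data: two cylinders of two blocks each.
cylinders = [
    Cylinder([10, 30], [[10, 20], [30, 40]], overflow=[25]),
    Cylinder([50, 70], [[50, 60], [70, 80]], overflow=[55, 65],
             overflow_capacity=2),
]
cyl_low_keys = [10, 50]
system_overflow = [75]

print(isam_search(40, cyl_low_keys, cylinders, system_overflow))  # ('block', 0, 1)
print(isam_search(25, cyl_low_keys, cylinders, system_overflow))  # ('cylinder overflow', 0)
print(isam_search(75, cyl_low_keys, cylinders, system_overflow))  # ('system overflow',)
```

The first lookup is the good case that costs two fetches. The other two show how each overflow level adds a further linear scan, which is the cost that periodic reorganization is meant to remove.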
ISAM is adequate for a limited number of updates, but not for frequent changes. Because it has essentially two levels of indexing, ISAM will also break down for a truly large database where the number of cylinders is too great for the top-level index to fit in main memory.

In their most general form, database applications have the following characteristics:

1. Large sets of records are frequently updated.
2. Search is by one or a combination of several keys.
3. Key range queries or min/max queries are used.

For such databases, a better organization must be found. One approach would be to use the binary search tree (BST) to store primary and secondary key indices. BSTs can store duplicate key values, they provide efficient insertion and deletion as well as efficient search, and they can perform efficient range queries. When there
