13.07.2015 Views

13.1 through 13.5, 13.10 and 13.11


13.9 Other Primary File Organizations

[...] The records originally in bucket 0 are distributed between the two buckets based on a different hashing function h_{i+1}(K) = K mod 2M. A key property of the two hash functions h_i and h_{i+1} is that any records that hashed to bucket 0 based on h_i will hash to either bucket 0 or bucket M based on h_{i+1}; this is necessary for linear hashing to work.

As further collisions lead to overflow records, additional buckets are split in the linear order 1, 2, 3, .... If enough overflows occur, all the original file buckets 0, 1, ..., M - 1 will have been split, so the file now has 2M instead of M buckets, and all buckets use the hash function h_{i+1}. Hence, the records in overflow are eventually redistributed into regular buckets, using the function h_{i+1} via a delayed split of their buckets. There is no directory; only a value n, which is initially set to 0 and is incremented by 1 whenever a split occurs, is needed to determine which buckets have been split. To retrieve a record with hash key value K, first apply the function h_i to K; if h_i(K) < n, then apply the function h_{i+1} on K because the bucket is already split. Initially, n = 0, indicating that the function h_i applies to all buckets; n grows linearly as buckets are split.

When n = M after being incremented, this signifies that all the original buckets have been split and the hash function h_{i+1} applies to all records in the file. At this point, n is reset to 0 (zero), and any new collisions that cause overflow lead to the use of a new hashing function h_{i+2}(K) = K mod 4M. In general, a sequence of hashing functions h_{i+j}(K) = K mod (2^j M) is used, where j = 0, 1, 2, ...; a new hashing function h_{i+j+1} is needed whenever all the buckets 0, 1, ..., (2^j M) - 1 have been split and n is reset to 0.
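The splitting and retrieval rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's implementation: the class name LinearHashFile and its methods are hypothetical, keys are non-negative integers, buckets are in-memory lists, an overflow chain is modeled by letting a list grow past bfr entries, and h_i(K) is taken to be K mod M (the j = 0 case of the general formula).

```python
class LinearHashFile:
    """Toy in-memory linear hash file (hypothetical names, for illustration only)."""

    def __init__(self, M, bfr):
        self.M = M        # number of buckets at the start of the current round
        self.bfr = bfr    # records a bucket holds before it overflows
        self.n = 0        # number of buckets already split in this round
        self.buckets = [[] for _ in range(M)]

    def address(self, K):
        m = K % self.M                # h_i(K)
        if m < self.n:                # bucket m was already split,
            m = K % (2 * self.M)      # so use h_{i+1}(K) instead
        return m

    def insert(self, K):
        bucket = self.buckets[self.address(K)]
        bucket.append(K)              # entries beyond bfr stand in for the overflow chain
        if len(bucket) > self.bfr:    # any overflow triggers a split of bucket n
            self._split()

    def _split(self):
        old = self.buckets[self.n]
        self.buckets.append([])       # the new bucket has index n + M
        two_m = 2 * self.M
        self.buckets[self.n] = [k for k in old if k % two_m == self.n]
        self.buckets[self.n + self.M] = [k for k in old if k % two_m != self.n]
        self.n += 1
        if self.n == self.M:          # every original bucket has been split:
            self.M *= 2               # h_{i+1} becomes the base function
            self.n = 0                # and a new round starts

    def search(self, K):
        return K in self.buckets[self.address(K)]
```

For example, with M = 4 and bfr = 2, inserting the keys 0, 4 and 8 overflows bucket 0 and splits it: keys 0 and 8 stay in bucket 0 (both are 0 mod 8), key 4 moves to the new bucket 4, and n becomes 1. Note that the bucket that is split is always bucket n, which is not necessarily the bucket that overflowed.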
The search for a record with hash key value K is given by Algorithm 13.3.

Splitting can be controlled by monitoring the file load factor instead of by splitting whenever an overflow occurs. In general, the file load factor l can be defined as l = r/(bfr * N), where r is the current number of file records, bfr is the maximum number of records that can fit in a bucket, and N is the current number of file buckets. Buckets that have been split can also be recombined if the load factor of the file falls below a certain threshold. Blocks are combined linearly, and N is decremented appropriately. The file load can be used to trigger both splits and combinations; in this manner the file load can be kept within a desired range. Splits can be triggered when the load exceeds a certain threshold, say 0.9, and combinations can be triggered when the load falls below another threshold, say 0.7.

Algorithm 13.3. The Search Procedure for Linear Hashing
if n = 0 then m ...
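The load-factor rule from this section, l = r/(bfr * N) with a split threshold and a combination threshold, can be illustrated with a short Python sketch. The function names load_factor and reorganization_action are hypothetical, and the defaults 0.9 and 0.7 are simply the example thresholds mentioned in the text.

```python
def load_factor(r, bfr, N):
    """File load factor l = r / (bfr * N).

    r   -- current number of file records
    bfr -- maximum number of records that fit in a bucket
    N   -- current number of file buckets
    """
    return r / (bfr * N)


def reorganization_action(r, bfr, N, split_above=0.9, combine_below=0.7):
    """Decide which action the current load calls for (illustrative thresholds)."""
    l = load_factor(r, bfr, N)
    if l > split_above:
        return "split"      # file is too full: split the next bucket
    if l < combine_below:
        return "combine"    # file is too sparse: recombine the last split
    return "none"           # load is within the desired range
```

With 10 buckets of capacity 10, a file of 95 records has load 0.95 and calls for a split, a file of 60 records has load 0.6 and calls for a combination, and a file of 80 records stays as it is.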
