13.1 through 13.5, 13.10 and 13.11
13.9 Other Primary File Organizations

[...] The records originally in bucket 0 are distributed between the two buckets based on a different hashing function $h_{i+1}(K) = K \bmod 2M$. A key property of the two hash functions $h_i$ and $h_{i+1}$ is that any records that hashed to bucket 0 based on $h_i$ will hash to either bucket 0 or bucket $M$ based on $h_{i+1}$; this is necessary for linear hashing to work.

As further collisions lead to overflow records, additional buckets are split in the linear order 1, 2, 3, .... If enough overflows occur, all the original file buckets 0, 1, ..., $M - 1$ will have been split, so the file now has $2M$ instead of $M$ buckets, and all buckets use the hash function $h_{i+1}$. Hence, the records in overflow are eventually redistributed into regular buckets, using the function $h_{i+1}$ via a delayed split of their buckets. There is no directory; only a value $n$, which is initially set to 0 and is incremented by 1 whenever a split occurs, is needed to determine which buckets have been split. To retrieve a record with hash key value $K$, first apply the function $h_i$ to $K$; if $h_i(K) < n$, then apply the function $h_{i+1}$ on $K$ because the bucket is already split. Initially, $n = 0$, indicating that the function $h_i$ applies to all buckets; $n$ grows linearly as buckets are split.

When $n = M$ after being incremented, this signifies that all the original buckets have been split and the hash function $h_{i+1}$ applies to all records in the file. At this point, $n$ is reset to 0 (zero), and any new collisions that cause overflow lead to the use of a new hashing function $h_{i+2}(K) = K \bmod 4M$. In general, a sequence of hashing functions $h_{i+j}(K) = K \bmod (2^j M)$ is used, where $j = 0, 1, 2, \ldots$; a new hashing function $h_{i+j+1}$ is needed whenever all the buckets 0, 1, ..., $(2^j M) - 1$ have been split and $n$ is reset to 0.
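As an illustration of the splitting scheme described above, the following Python sketch models a linear hash file in memory. It is a simplified model, not the textbook's algorithm: the class name `LinearHashFile`, the `bucket_capacity` parameter, and the use of plain Python lists (with any records beyond the capacity standing in for an overflow chain) are assumptions made for this example, and keys are assumed to be non-negative integers. A split is triggered whenever an insertion overflows a bucket.

```python
class LinearHashFile:
    """Toy in-memory model of linear hashing (illustrative assumptions only)."""

    def __init__(self, m=4, bucket_capacity=2):
        self.m = m                       # M: initial number of buckets
        self.capacity = bucket_capacity  # records per bucket before overflow
        self.level = 0                   # j: current function is K mod (2^j * M)
        self.n = 0                       # next bucket to split
        self.buckets = [[] for _ in range(m)]

    def _address(self, key):
        size = (2 ** self.level) * self.m
        b = key % size                   # current hash function
        if b < self.n:                   # bucket b has already been split,
            b = key % (2 * size)         # so use the next hash function
        return b

    def search(self, key):
        return key in self.buckets[self._address(key)]

    def insert(self, key):
        bucket = self.buckets[self._address(key)]
        bucket.append(key)
        if len(bucket) > self.capacity:  # overflow: split bucket n, not
            self._split()                # necessarily the one that overflowed

    def _split(self):
        size = (2 ** self.level) * self.m
        old = self.buckets[self.n]
        self.buckets.append([])          # new bucket lands at index n + size
        # Redistribute bucket n's records using the next hash function.
        self.buckets[self.n] = [k for k in old if k % (2 * size) == self.n]
        self.buckets[self.n + size] = [k for k in old if k % (2 * size) != self.n]
        self.n += 1
        if self.n == size:               # every bucket of this round is split
            self.n = 0
            self.level += 1


f = LinearHashFile(m=4, bucket_capacity=2)
for key in range(20):
    f.insert(key)
print(len(f.buckets), f.level, f.n)
```

Note that the bucket being split is always bucket `n`, which may differ from the bucket that overflowed; the overflowing records stay where they are until their own bucket's turn comes, which is the delayed split mentioned in the text.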
The search for a record with hash key value $K$ is given by Algorithm 13.3.

Splitting can be controlled by monitoring the file load factor instead of by splitting whenever an overflow occurs. In general, the file load factor $l$ can be defined as $l = r/(bfr \cdot N)$, where $r$ is the current number of file records, $bfr$ is the maximum number of records that can fit in a bucket, and $N$ is the current number of file buckets. Buckets that have been split can also be recombined if the load factor of the file falls below a certain threshold. Blocks are combined linearly, and $N$ is decremented appropriately. The file load can be used to trigger both splits and combinations; in this manner the file load can be kept within a desired range. Splits can be triggered when the load exceeds a certain threshold (say, 0.9) and combinations can be triggered when the load falls below another threshold (say, 0.7).

Algorithm 13.3. The Search Procedure for Linear Hashing
if n = 0 then m ...
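The load-factor policy described above can be sketched in Python as follows. The function names `load_factor` and `next_action` are hypothetical, chosen for this example; the default thresholds 0.9 and 0.7 are the illustrative values given in the text, not fixed constants of the method.

```python
def load_factor(r, bfr, n_buckets):
    """File load factor l = r / (bfr * N)."""
    return r / (bfr * n_buckets)


def next_action(r, bfr, n_buckets, split_above=0.9, combine_below=0.7):
    """Decide whether the file should split, combine, or stay as it is."""
    l = load_factor(r, bfr, n_buckets)
    if l > split_above:
        return "split"      # too full: split the next bucket
    if l < combine_below:
        return "combine"    # too sparse: recombine the last split bucket
    return "none"           # load is within the desired range


# 95 records, 10 records per bucket, 10 buckets: l = 0.95
print(next_action(95, 10, 10))
```

Keeping a gap between the two thresholds avoids a file that alternates between splitting and combining on every insertion or deletion near a single cutoff.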