11.07.2015 Views

Data Structures and Algorithm Analysis - Computer Science at ...

Sec. 7.7 Binsort and Radix Sort

The first inner for loop initializes array cnt. The second loop counts the number of records to be assigned to each bin. The third loop sets the values in cnt to their proper indices within array B. Note that the index stored in cnt[j] is the last index for bin j; bins are filled from high index to low index. The fourth loop assigns the records to the bins (within array B). The final loop simply copies the records back to array A to be ready for the next pass. Variable rtoi stores r^i for use in bin computation on the i'th iteration. Figure 7.19 shows how this algorithm processes the input shown in Figure 7.17.

This algorithm requires k passes over the list of n numbers in base r, with Θ(n + r) work done at each pass. Thus the total work is Θ(nk + rk). What is this in terms of n? Because r is the size of the base, it might be rather small. One could use base 2 or 10. Base 26 would be appropriate for sorting character strings. For now, we will treat r as a constant value and ignore it for the purpose of determining asymptotic complexity. Variable k is related to the key range: it is the maximum number of digits that a key may have in base r. In some applications we can determine k to be of limited size and so might wish to consider it a constant. In this case, Radix Sort is Θ(n) in the best, average, and worst cases, making it the sort with the best asymptotic complexity that we have studied.

Is it a reasonable assumption to treat k as a constant? Or is there some relationship between k and n?
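Before taking up that question, it may help to see the five loops described at the start of this section written out. What follows is only a minimal Python sketch, not the book's own listing: the function name radix_sort, the restriction to non-negative integer keys, and the choice to decrement cnt[d] before each store (so that cnt[j] starts one past the last slot of bin j) are all assumptions made for illustration. The names A, B, cnt, and rtoi follow the text.

```python
def radix_sort(A, k, r):
    """Sort list A of non-negative integers in place.

    k is the maximum number of base-r digits in any key (assumed known).
    """
    n = len(A)
    B = [0] * n        # auxiliary array holding all bins back to back
    cnt = [0] * r      # cnt[j]: count for bin j, then a position within B
    rtoi = 1           # r**i for the current pass i

    for i in range(k):
        # Loop 1: initialize cnt.
        for j in range(r):
            cnt[j] = 0
        # Loop 2: count the records that fall into each bin.
        for x in A:
            cnt[(x // rtoi) % r] += 1
        # Loop 3: prefix sums; cnt[j] is now one past the last slot of bin j.
        for j in range(1, r):
            cnt[j] += cnt[j - 1]
        # Loop 4: place records into bins, filling each bin from its high
        # index down. Scanning A backwards keeps the pass stable.
        for x in reversed(A):
            d = (x // rtoi) % r
            cnt[d] -= 1
            B[cnt[d]] = x
        # Loop 5: copy back to A for the next pass.
        for j in range(n):
            A[j] = B[j]
        rtoi *= r
    return A


if __name__ == "__main__":
    data = [340, 7, 82, 915, 82, 0, 56, 401]
    print(radix_sort(data, k=3, r=10))
```

Each pass touches every record a constant number of times and every bin counter twice, which is where the Θ(n + r) cost per pass comes from.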
If the key range is limited and duplicate key values are common, there might be no relationship between k and n. To make this distinction clear, use N to denote the number of distinct key values used by the n records. Thus, N ≤ n. Because it takes a minimum of log_r N base r digits to represent N distinct key values, we know that k ≥ log_r N.

Now, consider the situation in which no keys are duplicated. If there are n unique keys (n = N), then it requires n distinct code values to represent them. Thus, k ≥ log_r n. Because it requires at least Ω(log n) digits (within a constant factor) to distinguish between the n distinct keys, k is in Ω(log n). This yields an asymptotic complexity of Ω(n log n) for Radix Sort to process n distinct key values.

It is possible that the key range is much larger; log_r n digits is merely the best case possible for n distinct values. Thus, the log_r n estimate for k could be overly optimistic. The moral of this analysis is that, for the general case of n distinct key values, Radix Sort is at best an Ω(n log n) sorting algorithm.

Radix Sort can be much improved by making base r be as large as possible. Consider the case of an integer key value. Set r = 2^i for some i. In other words, the value of r is related to the number of bits of the key processed on each pass. Each time the number of bits is doubled, the number of passes is cut in half. When processing an integer key value, setting r = 256 allows the key to be processed one byte at a time. Processing a 32-bit key requires only four passes. It is not unreasonable on most computers to use r = 2^16 = 64K, resulting in only two passes for
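The pass counts quoted for power-of-two bases follow from dividing the key width by the number of bits consumed per pass and rounding up. A small sketch of that arithmetic, assuming a fixed-width key; the helper name radix_passes and its parameters are hypothetical, chosen only for this illustration:

```python
def radix_passes(key_bits, bits_per_pass):
    """Passes needed when each pass consumes bits_per_pass bits,
    i.e. when the base is r = 2**bits_per_pass."""
    # Integer ceiling division: round up when the width is not a multiple.
    return -(-key_bits // bits_per_pass)


if __name__ == "__main__":
    for bits_per_pass in (1, 4, 8, 16):
        r = 2 ** bits_per_pass
        passes = radix_passes(32, bits_per_pass)
        print(f"r = {r:>5}: {passes} passes for a 32-bit key")
```

The trade-off is that each pass costs Θ(n + r), so a larger base means fewer passes but a larger cnt array to initialize and scan on every pass.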
