
N. Wirth. Algorithms and Data Structures. Oberon version

In order to analyze the performance of Quicksort, we need to investigate the behavior of the partitioning process first. After having selected a bound x, it sweeps the entire array. Hence, exactly n comparisons are performed. The number of exchanges can be determined by the following probabilistic argument.

With a fixed bound u, the expected number of exchange operations is equal to the number of elements in the left part of the partition, namely u, multiplied by the probability that such an element reached its place by an exchange. An exchange had taken place if the element had previously been part of the right partition; the probability for this is (n-u)/n. The expected number of exchanges is therefore the average of these expected values over all possible bounds u:

M = [Su: 0 ≤ u ≤ n-1 : u*(n-u)] / n²
  = n*(n-1)/2n - (2n² - 3n + 1)/6n
  = (n - 1/n)/6
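The middle step can be verified by splitting the sum and using the standard closed forms for the sum of the first n-1 integers and of their squares:

[Su: 0 ≤ u ≤ n-1 : u*(n-u)] = n*(n*(n-1)/2) - (n-1)*n*(2n-1)/6

Dividing by n² gives the two terms shown above, and combining them over the common denominator 6n yields (n² - 1)/6n = (n - 1/n)/6.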

Assuming that we are very lucky and always happen to select the median as the bound, then each partitioning process splits the array into two halves, and the number of necessary passes to sort is log(n). The resulting total number of comparisons is then n*log(n), and the total number of exchanges is n*log(n)/6.

Of course, one cannot expect to hit the median all the time. In fact, the chance of doing so is only 1/n. Surprisingly, however, the average performance of Quicksort is inferior to the optimal case by a factor of only 2*ln(2), if the bound is chosen at random.
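This factor can be made plausible as follows: with a randomly chosen bound the expected number of comparisons is about 2*n*ln(n), whereas the optimal case requires n*log(n) with the logarithm taken to base 2. Since ln(n) = ln(2)*log(n), the ratio of the two is

2*n*ln(n) / (n*log(n)) = 2*ln(2) ≈ 1.386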

But Quicksort does have its pitfalls. First of all, it performs only moderately well for small values of n, as do all advanced methods. Its advantage over the other advanced methods lies in the ease with which a straight sorting method can be incorporated to handle small partitions. This is particularly advantageous when considering the recursive version of the program.
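As an illustration of such a combination, small partitions may be handed over to straight insertion. The following is only a sketch; the cutoff constant M = 12 and the exact formulation are illustrative choices, not the program developed in this chapter:

  CONST M = 12;  (* illustrative cutoff: shorter partitions are sorted by straight insertion *)

  PROCEDURE sort (VAR a: ARRAY OF INTEGER; L, R: INTEGER);
    VAR i, j, w, x: INTEGER;
  BEGIN
    IF R - L < M THEN
      (* straight insertion for the small partition *)
      FOR i := L+1 TO R DO
        x := a[i]; j := i;
        WHILE (j > L) & (a[j-1] > x) DO a[j] := a[j-1]; DEC(j) END;
        a[j] := x
      END
    ELSE
      (* partition as before, with the middle element as bound *)
      i := L; j := R; x := a[(L+R) DIV 2];
      REPEAT
        WHILE a[i] < x DO INC(i) END;
        WHILE x < a[j] DO DEC(j) END;
        IF i <= j THEN
          w := a[i]; a[i] := a[j]; a[j] := w; INC(i); DEC(j)
        END
      UNTIL i > j;
      sort(a, L, j); sort(a, i, R)
    END
  END sort;

The initial call would be sort(a, 0, n-1); for partitions shorter than M the overhead of further partitioning is avoided.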

Still, there remains the question of the worst case. How does Quicksort perform then? The answer is unfortunately disappointing, and it unveils the one weakness of Quicksort. Consider, for instance, the unlucky case in which each time the largest value of a partition happens to be picked as comparand x. Then each step splits a segment of n items into a left partition with n-1 items and a right partition with a single element. The result is that n (instead of log(n)) splits become necessary, and that the worst-case performance is of the order n².
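The quadratic order follows directly by adding up the sweeps: the successive segments have lengths n, n-1, ..., 2, and each requires a full sweep, hence

n + (n-1) + ... + 2 = n*(n+1)/2 - 1

comparisons, a number growing with the square of n.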

Apparently, the crucial step is the selection of the comparand x. In our example program it is chosen as the middle element. Note that one might almost as well select either the first or the last element. In these cases, the worst case is the initially sorted array; Quicksort then shows a definite dislike for the trivial job and a preference for disordered arrays. In choosing the middle element, this strange characteristic of Quicksort is less obvious, because the initially sorted array becomes the optimal case. In fact, the average performance is also slightly better if the middle element is selected. Hoare suggests that the choice of x be made at random, or by selecting it as the median of a small sample of, say, three keys [2.12 and 2.13]. Such a judicious choice hardly influences the average performance of Quicksort, but it improves the worst-case performance considerably. It becomes evident that sorting on the basis of Quicksort is somewhat like a gamble in which one should be aware of how much one may afford to lose if bad luck were to strike.
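A median-of-three selection of the bound might be sketched as follows; the procedure name Med3 is chosen here for illustration only and does not occur in the program text:

  PROCEDURE Med3 (x, y, z: INTEGER): INTEGER;
    VAR m: INTEGER;
  BEGIN (* median of the three keys x, y, z *)
    IF x < y THEN
      IF y < z THEN m := y ELSIF x < z THEN m := z ELSE m := x END
    ELSE
      IF x < z THEN m := x ELSIF y < z THEN m := z ELSE m := y END
    END;
    RETURN m
  END Med3;

The partitioning step would then use x := Med3(a[L], a[(L+R) DIV 2], a[R]) instead of x := a[(L+R) DIV 2]; an initially sorted array can then no longer trigger the degenerate splits described above.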

There is one important lesson to be learned from this experience; it concerns the programmer directly. What are the consequences of the worst-case behavior mentioned above for the performance of Quicksort? We have realized that each split results in a right partition of only a single element; the request to sort this partition is stacked for later execution. Consequently, the maximum number of requests, and therefore the total required stack size, is n. This is, of course, totally unacceptable. (Note that we fare no better, in fact even worse, with the recursive version, because a system allowing recursive activation of procedures will have to store the values of local variables and parameters of all procedure activations automatically, and it will use an implicit stack for this purpose.)
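A common safeguard, indicated here only as a sketch and not as the program given above, is to treat the smaller of the two partitions recursively and the larger one by iteration; the number of simultaneously pending requests is then bounded by log(n) even in the unlucky case:

  PROCEDURE sort (VAR a: ARRAY OF INTEGER; L, R: INTEGER);
    VAR i, j, w, x: INTEGER;
  BEGIN
    WHILE L < R DO
      i := L; j := R; x := a[(L+R) DIV 2];
      REPEAT
        WHILE a[i] < x DO INC(i) END;
        WHILE x < a[j] DO DEC(j) END;
        IF i <= j THEN
          w := a[i]; a[i] := a[j]; a[j] := w; INC(i); DEC(j)
        END
      UNTIL i > j;
      IF j - L < R - i THEN
        sort(a, L, j); L := i   (* recurse on the smaller left part, iterate on the right *)
      ELSE
        sort(a, i, R); R := j   (* recurse on the smaller right part, iterate on the left *)
      END
    END
  END sort;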
