25.11.2014 Views

Algorithms and Data Structures


N. Wirth. Algorithms and Data Structures. Oberon version

  ELSE (*delete p^*)
    q := p;
    IF q.right = NIL THEN p := q.left; h := TRUE
    ELSIF q.left = NIL THEN p := q.right; h := TRUE
    ELSE
      del(q.left, h);
      IF h THEN balanceL(p, h) END
    END
  END
END delete

Fortunately, deletion of an element in a balanced tree can also be performed with O(log n) operations in the worst case. An essential difference between the behaviour of the insertion and deletion procedures must not be overlooked, however. Whereas insertion of a single key may result in at most one rotation (of two or three nodes), deletion may require a rotation at every node along the search path. Consider, for instance, deletion of the rightmost node of a Fibonacci-tree. In this case the deletion of any single node leads to a reduction of the height of the tree; in addition, deletion of its rightmost node requires the maximum number of rotations. This therefore represents the worst choice of node in the worst case of a balanced tree, a rather unlucky combination of chances. How probable are rotations, then, in general?
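
The Fibonacci-tree case can be checked with a small sketch. It is written in Python rather than Oberon, tracks subtree heights instead of the balance flags and the h parameter used in the procedures above, and omits keys because removing the rightmost node needs no comparisons. All names (fib_tree, delete_rightmost, demo) are assumptions made for this illustration, not part of the original program.

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right
        self.height = 1 + max(height(left), height(right))

def height(n):
    return n.height if n else 0

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def fib_tree(h):
    """Left-heavy Fibonacci tree of height h: subtrees of heights h-1 and h-2."""
    if h <= 0:
        return None
    return Node(fib_tree(h - 1), fib_tree(h - 2))

def rotate_right(n):
    """Lift the left child of n; return the new subtree root."""
    l = n.left
    n.left, l.right = l.right, n
    update(n); update(l)
    return l

def rotate_left(n):
    """Lift the right child of n; return the new subtree root."""
    r = n.right
    n.right, r.left = r.left, n
    update(n); update(r)
    return r

def delete_rightmost(n):
    """Remove the rightmost node; return (new subtree root, rotations used)."""
    if n.right is None:
        return n.left, 0
    n.right, rotations = delete_rightmost(n.right)
    update(n)
    # Only the right side can have shrunk, so only a left-heavy imbalance arises.
    if height(n.left) - height(n.right) > 1:
        if height(n.left.right) > height(n.left.left):
            n.left = rotate_left(n.left)      # first half of a double rotation
        n = rotate_right(n)
        rotations += 1                        # single or double counts once
    return n, rotations

def demo(h):
    """Return (depth of rightmost node, rotations, height after deletion)."""
    root = fib_tree(h)
    depth, n = 0, root
    while n.right:
        depth, n = depth + 1, n.right
    root, rotations = delete_rightmost(root)
    return depth, rotations, height(root)

print(demo(10), demo(11))
```

Under this construction the number of rotations equals the depth of the deleted node, that is, one rotation at every ancestor on the search path, and the height of the tree drops by one.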

The surprising result of empirical tests is that whereas one rotation is invoked for approximately every two insertions, only one is required for every five deletions. Deletion in balanced trees is therefore about as easy, or as complicated, as insertion.
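
An experiment of this kind might be set up as in the following sketch. It is a Python approximation, not the Oberon program of this chapter: it stores subtree heights rather than balance factors, counts a single or a double rotation as one rotation, and uses hypothetical names (rebalance, experiment, true_height). The counts it prints depend on the random key sequence, so no particular ratio is claimed here.

```python
import random

class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1

def height(n):
    return n.height if n else 0

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def rotate_right(n):
    l = n.left
    n.left, l.right = l.right, n
    update(n); update(l)
    return l

def rotate_left(n):
    r = n.right
    n.right, r.left = r.left, n
    update(n); update(r)
    return r

def rebalance(n, stats):
    """Restore the AVL condition at n; stats[0] counts rotations."""
    update(n)
    diff = height(n.left) - height(n.right)
    if diff > 1:
        if height(n.left.right) > height(n.left.left):
            n.left = rotate_left(n.left)      # double rotation
        n = rotate_right(n)
        stats[0] += 1
    elif diff < -1:
        if height(n.right.left) > height(n.right.right):
            n.right = rotate_right(n.right)   # double rotation
        n = rotate_left(n)
        stats[0] += 1
    return n

def insert(n, key, stats):
    if n is None:
        return Node(key)
    if key < n.key:
        n.left = insert(n.left, key, stats)
    elif key > n.key:
        n.right = insert(n.right, key, stats)
    return rebalance(n, stats)

def delete(n, key, stats):
    if n is None:
        return None                           # key not in tree
    if key < n.key:
        n.left = delete(n.left, key, stats)
    elif key > n.key:
        n.right = delete(n.right, key, stats)
    elif n.right is None:
        return n.left
    elif n.left is None:
        return n.right
    else:
        m = n.left                            # rightmost node of left subtree
        while m.right:
            m = m.right
        n.key = m.key
        n.left = delete(n.left, m.key, stats)
    return rebalance(n, stats)

def true_height(n):
    """Recompute the height from scratch and assert the AVL condition."""
    if n is None:
        return 0
    hl, hr = true_height(n.left), true_height(n.right)
    assert abs(hl - hr) <= 1, "tree is not balanced"
    return 1 + max(hl, hr)

def experiment(count, seed=0):
    """Insert count random keys, then delete them all in random order."""
    rng = random.Random(seed)
    keys = rng.sample(range(10 * count), count)   # distinct keys
    root, ins, dele = None, [0], [0]
    for k in keys:
        root = insert(root, k, ins)
    true_height(root)
    rng.shuffle(keys)
    for i, k in enumerate(keys):
        root = delete(root, k, dele)
        if i == count // 2:
            true_height(root)                     # spot check mid-way
    return ins[0], dele[0], root

ins, dele, _ = experiment(1000)
print("rotations during insertion:", ins, "during deletion:", dele)
```

Because an insertion triggers at most one rotation, the first count can never exceed the number of keys inserted; the deletion count has no such per-operation bound.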

4.6 Optimal Search Trees

So far our consideration of organizing search trees has been based on the assumption that the frequency of access is equal for all nodes, that is, that all keys are equally likely to occur as a search argument. This is probably the best assumption if one has no idea of access distribution. However, there are cases (they are the exception rather than the rule) in which information about the probabilities of access to individual keys is available. These cases usually have the characteristic that the keys always remain the same, i.e., the search tree is subjected neither to insertion nor to deletion, but retains a constant structure. A typical example is the scanner of a compiler which determines for each word (identifier) whether or not it is a keyword (reserved word). Statistical measurements over hundreds of compiled programs may in this case yield accurate information on the relative frequencies of occurrence, and thereby of access, of individual keys.

Assume that in a search tree the probability with which node $i$ is accessed is

$$\Pr\{x = k_i\} = p_i, \qquad \sum_{i=1}^{n} p_i = 1$$

We now wish to organize the search tree in a way that the total number of search steps, counted over sufficiently many trials, becomes minimal. For this purpose the definition of path length is modified by (1) attributing a certain weight to each node and by (2) assuming the root to be at level 1 (instead of 0), because it accounts for the first comparison along the search path. Nodes that are frequently accessed become heavy nodes; those that are rarely visited become light nodes. The (internal) weighted path length is then the sum of all paths from the root to each node weighted by that node's probability of access.

$$P = \sum_{i=1}^{n} p_i h_i$$

$h_i$ is the level of node $i$. The goal is now to minimize the weighted path length for a given probability distribution. As an example, consider the set of keys 1, 2, 3, with probabilities of access $p_1 = 1/7$, $p_2 = 2/7$ and $p_3 = 4/7$. These three keys can be arranged in five different ways as search trees (see Fig. 4.36).
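
The five arrangements and their weighted path lengths can be reproduced by brute force. The sketch below is in Python rather than Oberon, and the helper names (trees, weighted_path_length) are assumptions for this illustration. It simply enumerates every search tree over the keys and is only practical for very small key sets.

```python
from fractions import Fraction

def trees(lo, hi):
    """Yield every binary search tree over keys lo..hi-1 as nested tuples
    (key, left, right); the empty tree is None."""
    if lo >= hi:
        yield None
        return
    for root in range(lo, hi):
        for left in trees(lo, root):
            for right in trees(root + 1, hi):
                yield (root, left, right)

def weighted_path_length(tree, p, level=1):
    """Sum of p[key] * level over all nodes, with the root at level 1."""
    if tree is None:
        return 0
    key, left, right = tree
    return (p[key] * level
            + weighted_path_length(left, p, level + 1)
            + weighted_path_length(right, p, level + 1))

# Access probabilities from the example: p1 = 1/7, p2 = 2/7, p3 = 4/7.
p = {1: Fraction(1, 7), 2: Fraction(2, 7), 3: Fraction(4, 7)}

results = sorted(((weighted_path_length(t, p), t) for t in trees(1, 4)),
                 key=lambda pair: pair[0])
for cost, tree in results:
    print(cost, tree)
```

For these probabilities the smallest value is 11/7, reached by the degenerate tree with key 3 at the root, key 2 below it and key 1 at the bottom; the perfectly balanced tree with key 2 at the root gives 12/7.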
