Algorithms and Data Structures
N. Wirth. Algorithms and Data Structures. Oberon version. 163
equal probability - expect an average improvement in the search path length of at most 39%. Emphasis is to be put on the word average, for the improvement may of course be very much greater in the unhappy case in which the generated tree had completely degenerated into a list, which, however, is very unlikely to occur. In this connection it is noteworthy that the expected average path length of the random tree grows also strictly logarithmically with the number of its nodes, even though the worst case path length grows linearly.

The figure of 39% imposes a limit on the amount of additional effort that may be spent profitably on any kind of reorganization of the tree's structure upon insertion of keys. Naturally, the ratio between the frequencies of access (retrieval) of nodes (information) and of insertion (update) significantly influences the payoff limits of any such undertaking. The higher this ratio, the higher is the payoff of a reorganization procedure. The 39% figure is low enough that in most applications improvements of the straight tree insertion algorithm do not pay off unless the number of nodes and the access vs. insertion ratio are large.
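The 39% figure can be checked empirically: a tree built by straight insertion of keys in random order has an expected average search path length about a factor 2 ln 2 = 1.386 above that of the perfectly balanced tree. The following Python sketch (not part of the text; key range, seed, and trial count are arbitrary choices) measures this ratio for trees of 1023 nodes:

```python
import random

def avg_search_depth(keys):
    """Build a plain binary search tree by straight insertion and
    return the average node depth (the root counts as depth 1)."""
    root = None                        # a node is a list [key, left, right]
    total = 0
    for k in keys:
        depth, parent, side, node = 1, None, 0, root
        while node is not None:
            parent = node
            side = 1 if k < node[0] else 2
            node = parent[side]
            depth += 1
        if parent is None:
            root = [k, None, None]
        else:
            parent[side] = [k, None, None]
        total += depth
    return total / len(keys)

random.seed(1)                         # fixed seed so the run is repeatable
n = 1023                               # 2^10 - 1: the perfect tree has exactly 10 levels
# Average depth in the perfectly balanced tree: level d holds 2^(d-1) nodes.
perfect_depth = sum(d * 2 ** (d - 1) for d in range(1, 11)) / n
# Average over several random insertion orders.
trials = 20
avg_random = sum(avg_search_depth(random.sample(range(10 ** 6), n))
                 for _ in range(trials)) / trials
ratio = avg_random / perfect_depth     # approaches 2 ln 2 = 1.386 for large n
```

For n = 1023 the measured ratio comes out somewhat below the asymptotic 1.386, consistent with the "at most 39%" phrasing above.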
4.5 Balanced Trees<br />
From the preceding discussion it is clear that an insertion procedure that always restores the tree's structure to perfect balance has hardly any chance of being profitable, because the restoration of perfect balance after a random insertion is a fairly intricate operation. Possible improvements lie in the formulation of less strict definitions of balance. Such imperfect balance criteria should lead to simpler tree reorganization procedures at the cost of only a slight deterioration of average search performance. One such definition of balance has been postulated by Adelson-Velskii and Landis [4-1]. The balance criterion is the following:

A tree is balanced if and only if for every node the heights of its two subtrees differ by at most 1.

Trees satisfying this condition are often called AVL-trees (after their inventors). We shall simply call them balanced trees because this balance criterion appears a most suitable one. (Note that all perfectly balanced trees are also AVL-balanced.)
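The criterion translates directly into a short recursive check. The sketch below (an illustration, not from the text) represents a tree as nested tuples (key, left, right) with None for the empty subtree; this representation is an assumption made for the example:

```python
def is_avl_balanced(node):
    """Check the AVL balance criterion for a tree given as nested
    tuples (key, left, right), with None denoting an empty subtree.
    Returns (balanced, height)."""
    if node is None:
        return True, 0
    left_ok, hl = is_avl_balanced(node[1])
    right_ok, hr = is_avl_balanced(node[2])
    # Every node must satisfy |height(left) - height(right)| <= 1.
    return left_ok and right_ok and abs(hl - hr) <= 1, 1 + max(hl, hr)

leaf = lambda k: (k, None, None)
# AVL-balanced, though not perfectly balanced (subtree sizes 3 vs. 1):
avl_tree = (4, (2, leaf(1), leaf(3)), leaf(5))
# A degenerate tree (a linear list), the worst case of straight insertion:
degenerate = (1, None, (2, None, (3, None, leaf(4))))
```

Note that `avl_tree` satisfies the height criterion at every node even though the numbers of nodes in the root's subtrees differ by two, illustrating how AVL balance is weaker than perfect balance.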
The definition is not only simple, but it also leads to a manageable rebalancing procedure and an average search path length practically identical to that of the perfectly balanced tree. The following operations can be performed on balanced trees in O(log n) units of time, even in the worst case:

1. Locate a node with a given key.
2. Insert a node with a given key.
3. Delete the node with a given key.
These statements are direct consequences of a theorem proved by Adelson-Velskii and Landis, which guarantees that a balanced tree will never be more than 45% higher than its perfectly balanced counterpart, no matter how many nodes there are. If we denote the height of a balanced tree with n nodes by h_b(n), then

    log(n+1) < h_b(n) < 1.4404*log(n+2) - 0.328
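The upper bound can be checked numerically. The balanced tree of height h with the fewest nodes consists of a root and minimal subtrees of heights h-1 and h-2, giving N(0) = 0, N(1) = 1, N(h) = N(h-1) + N(h-2) + 1. The Python sketch below is not from the text; note that the printed constants are rounded (1.4404 is a truncation of 1/log2 of the golden ratio), so the check is kept to moderate heights, where the rounding leaves a little slack:

```python
import math

def min_nodes(h):
    """Fewest nodes in an AVL-balanced tree of height h:
    N(0) = 0, N(1) = 1, N(h) = N(h-1) + N(h-2) + 1
    (a root plus minimal subtrees of heights h-1 and h-2)."""
    a, b = 0, 1                  # N(0), N(1)
    for _ in range(h - 1):
        a, b = b, a + b + 1
    return b if h > 0 else 0

# The worst-case (fewest-node) sizes grow like Fibonacci numbers:
sizes = [min_nodes(h) for h in range(1, 8)]   # 1, 2, 4, 7, 12, 20, 33
```

The verification then asserts log2(n+1) <= h < 1.4404*log2(n+2) - 0.328 for these worst-case trees; the upper bound holds with only a few thousandths to spare, which is why the theorem's constant is essentially tight.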
The optimum is of course reached if the tree is perfectly balanced for n = 2^k - 1. But what is the structure of the worst AVL-balanced tree? In order to find the maximum height h of all balanced trees with n nodes, let us consider a fixed height h and try to construct the balanced tree with the minimum number of nodes. This strategy is recommended because, as in the case of the minimal height, the value can be attained only for certain specific values of n. Let this tree of height h be denoted by T_h. Clearly, T_0 is the empty tree, and T_1 is the tree with a single node. In order to construct the tree T_h for h > 1, we will provide the root with two subtrees which again have a minimal number of nodes. Hence, the subtrees are also T's. Evidently, one subtree must have height h-1, and the other is then allowed to have a height of one less, i.e. h-2. Figure 4.30 shows the trees with height 2, 3, and 4. Since their composition principle very