Algorithms and Data Structures


N. Wirth. Algorithms and Data Structures. Oberon version

Fig. 4.41. Binary tree subdivided into pages

The saving in the number of disk accesses can be considerable, since each page access now involves a disk access. Assume that we choose to place 100 nodes on a page (this is a reasonable figure); then the million-item search tree will on average require only $\log_{100}(10^6)$ (i.e. about 3) page accesses instead of 20. But, of course, if the tree is left to grow at random, then the worst case may still be as large as $10^4$. It is plain that a scheme for controlled growth is almost mandatory in the case of multiway trees.
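The figures quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and reading the worst case of 10^4 as a degenerate chain of 10^6 / 100 pages is an assumption about what the text intends.

```python
import math

ITEMS = 10**6          # size of the search tree in the text
NODES_PER_PAGE = 100   # page capacity assumed in the text

# Balanced binary tree, one node per access: about log2(N) accesses.
binary_accesses = math.log2(ITEMS)

# Well-filled multiway tree, 100 nodes per page: about log100(N) page accesses.
paged_accesses = math.log(ITEMS, NODES_PER_PAGE)

# Assumed reading of the worst case: the pages degenerate into a single
# chain, so a search may have to visit N / 100 pages.
worst_case_accesses = ITEMS // NODES_PER_PAGE

print(round(binary_accesses), round(paged_accesses), worst_case_accesses)
```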

4.7.1 Multiway B-Trees

If one is looking for a controlled growth criterion, the one requiring a perfect balance is quickly eliminated because it involves too much balancing overhead. The rules must clearly be somewhat relaxed. A very sensible criterion was postulated by R. Bayer and E.M. McCreight [4.2] in 1970: every page (except one) contains between n and 2n nodes for a given constant n. Hence, in a tree with N items and a maximum page size of 2n nodes per page, the worst case requires $\log_n N$ page accesses; and page accesses clearly dominate the entire search effort. Moreover, the important factor of store utilization is at least 50%, since pages are always at least half full. With all these advantages, the scheme involves comparatively simple algorithms for search, insertion, and deletion. We will subsequently study them in detail.
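To see where a logarithmic worst case comes from, one can count the fewest items a tree with a given number of levels can hold. The sketch below assumes that the root holds at least one item, that every other page holds at least n items, and that a non-leaf page with m items has m+1 descendants. Under these assumptions a tree with h levels holds at least 2(n+1)^(h-1) - 1 items. The function names `min_items` and `max_levels` are hypothetical and chosen only for illustration.

```python
def min_items(n: int, levels: int) -> int:
    """Fewest items in a tree of order n with the given number of levels."""
    if levels == 0:
        return 0
    # Root: 1 item, 2 descendants. Every other page: n items, n+1 descendants.
    return 2 * (n + 1) ** (levels - 1) - 1


def max_levels(n: int, items: int) -> int:
    """Largest number of levels (page accesses) a tree with `items` items can have."""
    levels = 0
    while min_items(n, levels + 1) <= items:
        levels += 1
    return levels


# Pages of at most 100 nodes (n = 50) holding one million items.
print(max_levels(50, 10**6))
```

For n = 50 and a million items this gives 4 levels, in line with the $\log_n N$ estimate ($\log_{50} 10^6$ is roughly 3.5).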

The underlying data structures are called B-trees, and have the following characteristics; n is said to be the order of the B-tree.

1. Every page contains at most 2n items (keys).

2. Every page, except the root page, contains at least n items.

3. Every page is either a leaf page, i.e. has no descendants, or it has m+1 descendants, where m is the number of keys on the page.

4. All leaf pages appear at the same level.
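The four characteristics can be turned into a small structural check. The following is a minimal sketch, not the book's implementation: the `Page` class and the `is_btree` function are hypothetical names, pages are kept in memory rather than on disk, and the check covers only the four rules above (it does not verify that keys are ordered). The example tree uses the keys of Fig. 4.42.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    keys: list                                     # items stored on this page
    children: list = field(default_factory=list)   # empty for a leaf page


def is_btree(root: Page, n: int) -> bool:
    """Check rules 1 to 4 for a tree of order n (key ordering is not checked)."""
    leaf_levels = set()

    def visit(page: Page, level: int, is_root: bool) -> bool:
        m = len(page.keys)
        if m > 2 * n:                        # rule 1: at most 2n items
            return False
        if not is_root and m < n:            # rule 2: at least n items, root exempt
            return False
        if not page.children:                # leaf page: remember its level
            leaf_levels.add(level)
            return True
        if len(page.children) != m + 1:      # rule 3: m+1 descendants
            return False
        return all(visit(c, level + 1, False) for c in page.children)

    # rule 4: all leaf pages were found at one and the same level
    return visit(root, 1, True) and len(leaf_levels) == 1


def leaf(*keys: int) -> Page:
    return Page(list(keys))


# The tree of Fig. 4.42 (order 2, three levels).
fig_4_42 = Page([25], [
    Page([10, 20], [leaf(2, 5, 7, 8), leaf(13, 14, 15, 18), leaf(22, 24)]),
    Page([30, 40], [leaf(26, 27, 28), leaf(32, 35, 38), leaf(41, 42, 45, 46)]),
])

print(is_btree(fig_4_42, 2))
```

The same tree fails the check for order 3, because several pages then hold fewer than n items.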

Level 1 (root): 25

Level 2: 10 20 | 30 40

Level 3 (leaves): 2 5 7 8 | 13 14 15 18 | 22 24 | 26 27 28 | 32 35 38 | 41 42 45 46

Fig. 4.42. B-tree of order 2

Figure 4.42 shows a B-tree of order 2 with 3 levels. All pages contain 2, 3, or 4 items; the exception is the root, which is allowed to contain a single item only. All leaf pages appear at level 3. The keys appear in
