Left-leaning Red-Black Trees - Princeton University

More documents

Recommendations

Info

Properties of LLRB trees built from random keysBy design, the worst-case cost of a search in an LLRB tree with N nodes is 2 lg N. In practical applications,however, the cost of a typical search is half that value, not perceptibly different fromthe cost of a search in a perfectly balanced tree. Since searches are far more common than insertsin typical symbol-table applications, the usual first step in studying a symbol-table algorithm is toassume that a table is built from random keys (precisely, a random permutation of distinct keys)and then study the cost of searchers. For standard BSTs and other methods, mathematical modelsbased on this assumption have been developed and validated with experimental results and practicalexperience. The development of a corresponding mathematical model for balanced trees is oneof the outstanding problems in the analysis of algorithms.In this paper, we present experimental results that may help guide the development of such amodel, using a modified form of a plot format suggested by Tufte [12]. Specifically, we use• a gray dot to depict the result of each experiment• a red dot to depict the average value of the experiments for each tree size• black line segments to depict the standard deviation of the experiments for each tree size, oflength and spaced above and below the red dotsWhile sometimes difficult to distinguish individually, the gray dots help illustrate the extent andthe dispersion of the experimental results. The plots at right each represent the results of 50,000experiments, each involving building a2-3 tree from a random permutationof distinct keys.average successful search cost ( ipl / N )18comparesAverage path length. What is the costlg N − .5of a typical search? That is the questionof most interest in practice. In typical10large-scale applications, most searchestree size Nare successful and bias towards specifickeys is relatively insignificant, so100050000the measuring the average length to anode in a tree constructed from randomkeys is a reasonable estimate. As2421.7shown in our first plot, this measure isextremely close to the optimal value lg1000 experiments per sizeN − .5 that would be found in a fullybalanced tree. The plots for top-down2 ln N2-3-4 trees and other types of redblacktrees are indistinguishable fromtree size N10100050000this one.Height. What is the expected worstcasesearch cost? This question is primarilyof academic interest, but may shed some light on the structure of the trees. Though thedispersion is much higher than the average, our second plot shows that the height is close to 2 lnN, the same value as the average cost of a search in a BST (!). However, this precise value is pureconjecture: for example, experiments for standard BSTs would suggest the average height 3 lg N ,but the actual value of the coefficient is known to be slightly less than 3.worst-case search cost ( tree height )heightExperimental results for LLRB trees built from random keys18.51000 experiments per size
Distribution. The first step to developing a mathematical model that explains these results is tounderstand the distribution of the probability p kthat the root is of rank k, when a LLRB (2-3)tree is built from random keys. We know this probability to be 0 for small k and for large k, and4tree size N50010000 experiments per sizek/NLLRB tree root rank distribution10.5p k (offset)0we expect it to be high when k isnear N/2. The figure at left showsthe result of computing the distributionexactly for small N andestimating its shape for intermediatevalues of N. Following theformat introduced in [10], thecurves are normalized on the xaxis and slightly separated onthe y axis, so that convergence toa distribution can be identified.The irregularities in the curvesare primarily (but not completely)due to expected variations inthe experimental results. (Thesecurves are the result of building10000 trees for each size, and aresmoother than the curves basedAverage root rank in LLRB treeson a smaller number of experiments). Ideally, we would like to see convergence at the bottom tosome distribution (whose properties we can analyze) for large N. Though it suggests the possibilityof eventual convergence to a distribution that can be suitable approximated, this figure also exhibitsan oscillation that may complicate such analysis. At right is shown a Tufte plot of the average forthis distribution for a large number of experiments. This figure clearly illustrates a log-oscillatorybehavior that is often found in the analysis of algorithms, and also shows that the dispersion is significantand does not seem to be decreasing.<strong>Red</strong> path length. How many red nodes are on the search path, on the average? This question wouldseem to be the key to understanding LLRB trees. The figure below shows that this varies (even10tree size N500.3200 experiments per sizerank / N.7Tree ATree Baverage number of nodes per path after each insertionred nodesredsblacks2 6 20 44 902 6 19 36Path lengths in two random LLRB trees
Page 1 and 2: Left-leaning Red-Black TreesRobert
Page 3 and 4: Rotations and color flipsOne way to
Page 5 and 6: public class LLRB{private static fi
Page 7: private Node moveRedLeft(Node h){co

Left-leaning Red-Black Trees - Princeton University

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?