13.07.2015 Views

Improvements on the kd-tree



László Szécsi and Balázs Benedek / Improvements on the kd-tree

Figure 1: Mapping of a tree into an array using cache-line-sized sub-trees.

Figure 2: Mapping of an unbalanced tree into an array using pointers on the last level. Darkened nodes are leaves.

The number of these pointers could be reduced further if we make use of the fact that leaves on the last level do not have children. That way, a node on the last level is either a leaf or a reference to the actual position of the child node. This representation allows a non-balanced tree, together with all the needed pointers, to be mapped into a single, pre-allocable array. This structure, which is also compatible with the cache-line mapping, is depicted in Figure 3.

Figure 3: kd-tree using the minimal number of pointers. Dark nodes are leaves, hatched nodes are unused.

Naturally, we need a good estimate of the number of nodes to be able to allocate memory in advance. Fortunately, we know that a kd-tree uses at most 6n splitting planes. This also means a maximum of 6n leaves. Adding the worst-case number of pointers, which is exactly the number of nodes on the last level, we conclude that an array with 24n elements suffices.

A node itself has to be as small as possible. The above structure assumes that the description of a splitting plane for a non-leaf node, a pointer to the list of objects for a leaf, and a pointer to another node for a redirect node all fit into one element of the array. As the plane is described by a not necessarily precise floating-point number, all of these take up only a few bytes. We need one extra bit to distinguish between leaves and non-leaf nodes.

4.3. Estimation of the number of necessary pointers

The above figure for the memory requirement can be decreased further if we do not account for the worst case and use somewhat less memory. Should the tree exceed its predefined node count, we will have to terminate the build. As compromising the tree construction algorithm will prove to be very costly during traversal, a safe size should be chosen, with practically zero chance of overflow. However, if we make use of the fact that no pointers are needed for leaves, a lower figure for the number of pointers may be found. Obviously, if the lowest level of the tree contained only pointers, as the previously given upper bound suggests, every single node above would be referenced. This is impossible for two reasons: the nodes that belong to the tree originating from the root are not referenced, and, more significantly, leaves are never referenced.

To derive an exact number, let us introduce the following nomenclature. Let X be the number of pointers on the last level, and N the number of pre-allocated nodes. If the array is not full, the number of pointers is irrelevant. We are only interested in the case when every node is used as a cut, a leaf or a pointer. Therefore, the number of leaves L, the number of cuts C, and the number of pointers X add up to the size of the array.

$$L + C + X = N \tag{7}$$

The number of leaves and the number of cuts are equal.

$$2C + X = N \tag{8}$$

Pointers may only reference cut nodes, and no node can be referenced more than once:

$$C \ge X \tag{9}$$

Substituting this back, we get:

$$3X \le N \tag{10}$$

Therefore, in the worst case not half of the nodes are needed for pointers, only one third. This way, the upper bound for
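As a sanity check on equations (7) to (10), the following Python sketch enumerates every integer split of a full array of N nodes into cuts, leaves and pointers, and reports the largest feasible pointer count. It is only an illustration: the function name `max_last_level_pointers` is hypothetical, and the code adopts the text's simplification that the number of leaves equals the number of cuts.

```python
def max_last_level_pointers(n_nodes: int) -> int:
    """Largest X with 2C + X = N and X <= C, for non-negative integers.

    Assumes L = C, as in equation (8) of the text.
    """
    best = 0
    for cuts in range(n_nodes // 2 + 1):
        pointers = n_nodes - 2 * cuts   # equation (8): X = N - 2C
        if 0 <= pointers <= cuts:       # equation (9): X <= C
            best = max(best, pointers)
    return best


if __name__ == "__main__":
    for n in (24, 240, 2400):
        x = max_last_level_pointers(n)
        # Equation (10): pointers never take more than a third of the array.
        print(n, x, 3 * x <= n)
```

When N is divisible by three the bound is attained exactly (C = L = X = N/3); for other N the largest feasible X stays strictly below N/3.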
