Cost-Based Optimization of Integration Flows - Datenbanken ...
Algorithm 6.4 Update via Partial PlanOptTree Replacement (A-PPR)
Require: set of new partial PlanOptTrees PPOT, invalid optimality conditions OC
1: for all oc ∈ OC do // remove invalid optimality conditions
2:   oc.op1.ocnodes.remove(oc)
3:   if oc.op1.|ocnodes| = 0 and oc.op1.|csnodes| = 0 then
4:     rremoveNodes(oc.op1) // recursive bottom-up
5:   oc.op2.ocnodes.remove(oc)
6:   if oc.op2.|ocnodes| = 0 and oc.op2.|csnodes| = 0 then
7:     rremoveNodes(oc.op2) // recursive bottom-up
8: clearStrata12()
9: for all ppot ∈ PPOT do // merge new partial PlanOptTrees
10:   mergePPOT(root, ppot) // see A-IPC lines 11-22
starting from the root with the same concept of removing nodes without any children.
Finally, we apply the merge algorithm (line 10) from Subsection 6.2.2.<br />
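
The removal phase of Algorithm 6.4 (lines 1-7) can be illustrated with a minimal Python sketch. It makes several assumptions that are not taken from the thesis: the classes Node and OptCond, the parent/children links, and the helper link are hypothetical, and the reading that rremoveNodes detaches a node and then walks upward while ancestors become empty is our interpretation of "recursive bottom-up". clearStrata12 and mergePPOT are defined elsewhere and are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality, so list.remove() hits the exact object
class Node:
    """Hypothetical PlanOptTree node; ocnodes/csnodes mirror the pseudocode names."""
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    ocnodes: List["OptCond"] = field(default_factory=list)  # conditions reading this node
    csnodes: List["Node"] = field(default_factory=list)     # derived statistics reading this node


@dataclass(eq=False)
class OptCond:
    """Hypothetical optimality condition over two operand nodes, e.g. op1 <= op2."""
    op1: Node
    op2: Node

    def __post_init__(self) -> None:
        # Register the condition with both operands.
        self.op1.ocnodes.append(self)
        self.op2.ocnodes.append(self)


def link(parent: Node, child: Node) -> Node:
    """Attach child below parent and return the child (helper for building trees)."""
    child.parent = parent
    parent.children.append(child)
    return child


def rremove_nodes(node: Node) -> None:
    """Detach node, then continue upward while ancestors have nothing left below them."""
    parent = node.parent
    if parent is None:  # the root, or a node that was already detached
        return
    parent.children.remove(node)
    node.parent = None
    if not parent.children and not parent.ocnodes and not parent.csnodes:
        rremove_nodes(parent)


def remove_invalid_conditions(invalid: List[OptCond]) -> None:
    """Lines 1-7 of A-PPR: drop each invalid condition and prune unused operands."""
    for oc in invalid:
        for op in (oc.op1, oc.op2):
            if oc in op.ocnodes:
                op.ocnodes.remove(oc)
            if not op.ocnodes and not op.csnodes:
                rremove_nodes(op)
```

In this sketch, an operand that is still referenced by another optimality condition or by a derived statistic stays in the tree, so only the parts that no remaining condition depends on are pruned.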
With the aim of reuse, we could index plans and PlanOptTrees created over time by
their optimality constraints. This could avoid redundant directed re-optimization and
merging of PlanOptTrees, but we would still need to copy statistics. Due to the risk of (1)
maintenance overhead (e.g., plans, PlanOptTrees), (2) a potentially large search space, as
well as (3) low remaining optimization potential, we do not reuse plans. However, future
work might investigate this by combining on-demand re-optimization with (progressive)
parametric query optimization (PPQO) [BBD09], iteratively creating possible plans of
the search space according to the optimality conditions and subsequently reusing already
created plans.
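
To make the idea of indexing plans by their optimality conditions concrete, the following Python sketch shows one possible shape of such an index. It is purely illustrative and not part of the presented approach, which does not reuse plans: the class PlanIndex, its methods, the representation of a condition as a named predicate over a statistics dictionary, and the plan identifiers are all assumptions.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional

Stats = Dict[str, float]            # current statistics, keyed by a hypothetical name
Condition = Callable[[Stats], bool]  # an optimality condition as a predicate


class PlanIndex:
    """Hypothetical index of previously created plans, keyed by their conditions."""

    def __init__(self) -> None:
        self._conditions: Dict[str, Condition] = {}
        self._plans: Dict[FrozenSet[str], str] = {}

    def register(self, name: str, condition: Condition) -> None:
        """Give an optimality condition a name so plans can refer to it."""
        self._conditions[name] = condition

    def put(self, condition_names: Iterable[str], plan: str) -> None:
        """Store a plan under the set of conditions for which it is optimal."""
        self._plans[frozenset(condition_names)] = plan

    def lookup(self, stats: Stats) -> Optional[str]:
        """Return a stored plan whose conditions all hold under stats, if any."""
        for names, plan in self._plans.items():
            if all(self._conditions[n](stats) for n in names):
                return plan
        return None  # no reusable plan: re-optimization would be required
```

The linear scan in lookup hints at the concerns listed above: the index has to be maintained, and the number of stored condition sets can grow with the search space.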
To summarize, we have shown how to use the PlanOptTree for directed re-optimization<br />
and how a PlanOptTree can be incrementally updated after successful re-optimization.<br />
All algorithms presented rely on the extension of the optimizer interface by returning
partial PlanOptTrees or by directly rearranging the referenced PlanOptTree. Either way,<br />
existing optimization techniques require modifications. In the following, we will explain<br />
these modifications using selected optimization techniques from previous chapters.<br />
6.4 Optimization Techniques
In order to illustrate the applicability of the PlanOptTree and the necessary modifications
of optimization techniques, we use examples to show how their optimality conditions can
be expressed with our approach. First, we describe the on-demand re-optimization for
common rewriting techniques (see Chapter 3). Second, we show how the concept of
on-demand re-optimization can be applied for cost-based vectorization (see Chapter 4) and
multi-flow optimization (see Chapter 5) as well.
6.4.1 Control-Flow- and Data-Flow-Oriented Techniques<br />
For control-flow- and data-flow-oriented rewriting techniques, their already presented optimality<br />
conditions can be reused as they are. In this subsection, we discuss the on-demand<br />
re-optimization for the common data-flow-oriented example optimization techniques join<br />
enumeration, eager group-by, and set operations with distinctness.<br />