Cost-Based Optimization of Integration Flows - Datenbanken ...
6.2 Plan Optimality Trees
• Optimality: The optimality of a plan is represented by the PlanOptTree. If and only
if any optimality condition is violated, we will find a plan with lower cost during
re-optimization. Thus, there is no need to trigger re-optimization until we detect
any violation (break-even point between optimality of different plans).
• Transitivity: If a statistic is included in multiple optimality conditions, we can leverage
the transitivity of the comparison operators θ. Thus, the total number of required
optimality conditions can be reduced.
• Directed Optimization: If an optimality condition is violated, we are able to easily
determine the involved operators and the related optimization technique that produced
this condition. Then, we only need to directly re-optimize those operators.
These properties hold for a complete PlanOptTree, while a set of partial PlanOptTrees
(here, partial is defined as a subset of optimality conditions) might include redundancy,
which stands in conflict with minimal monitoring and transitivity. We will revisit this
issue and explain its relevance for further optimization later on.
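The transitivity property can be illustrated with a minimal sketch. It assumes, for simplicity, that every optimality condition is a pair (a, b) read as a ≤ b over named statistics, whereas the text allows general comparison operators θ. The function name reduce_conditions and the statistic names are hypothetical and serve only as an illustration.

```python
def reduce_conditions(conditions):
    """Drop every condition (a, b), read as a <= b, that already follows
    from a chain of the remaining conditions (assumption: one operator)."""
    conds = set(conditions)

    def implied(src, dst, edges):
        # Depth-first search: is dst reachable from src via the given edges?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(b for a, b in edges if a == node)
        return False

    for cond in sorted(conds):
        rest = conds - {cond}
        if implied(cond[0], cond[1], rest):
            conds = rest  # cond is redundant and need not be monitored
    return conds


# Hypothetical statistics: the third condition follows from the first two.
conditions = [
    ("sel(o2)", "sel(o3)"),
    ("sel(o3)", "sel(o4)"),
    ("sel(o2)", "sel(o4)"),
]
reduced = reduce_conditions(conditions)
```

In this sketch, only two of the three conditions remain to be monitored, since sel(o2) ≤ sel(o4) is implied by the other two.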
6.2.2 Creating PlanOptTrees
During the initial deployment of an integration flow, the full cost-based optimization is executed
once. There, the complete plan search space is evaluated and an initial PlanOptTree
is created. From this point, the PlanOptTree is used for incremental and directed re-optimization
only. In this subsection, we explain how to create this initial PlanOptTree.
Our standard (transformation-based) optimization algorithm A-PMO recursively iterates
over the hierarchy of sequences of atomic and complex operators (internal representation
of a plan) and changes the current plan by applying relevant optimization techniques
according to the specific types of operators. In contrast, for on-demand re-optimization, we
changed the optimizer interface. Now, the optimizer not only changes the current plan;
each applied optimization technique also returns a partial PlanOptTree
that represents the optimality conditions for the subplan that was considered by this technique.
This extension of optimization techniques is straightforward because the existing
cost functions and optimality conditions can be reused. For example, the technique WD4:
Early Selection Application creates a partial PlanOptTree when considering two operators.
This technique constructs the partial PlanOptTree using ONodes, SNodes, a specialized
CSNode Selectivity, and an OCNode.
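The following minimal sketch shows how such a partial PlanOptTree could be represented. Only the node type names (ONode, SNode, CSNode, OCNode) and the technique label WD4 come from the text. The fields, the selectivity formula (output cardinality divided by input cardinality), the condition sel(first) ≤ sel(second), and the function early_selection_pot are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SNode:
    """Monitored statistic of one operator (names are illustrative)."""
    name: str
    value: float


@dataclass
class CSNode:
    """Complex statistic derived from SNodes, here a selectivity."""
    name: str
    inputs: List[SNode]
    compute: Callable[[List[float]], float]

    @property
    def value(self) -> float:
        return self.compute([s.value for s in self.inputs])


@dataclass
class OCNode:
    """Optimality condition 'left <= right' plus the technique that made it."""
    left: CSNode
    right: CSNode
    technique: str

    def holds(self) -> bool:
        return self.left.value <= self.right.value


@dataclass
class ONode:
    """Operator node holding its statistics as [input card., output card.]."""
    operator: str
    stats: List[SNode] = field(default_factory=list)


def selectivity(op: ONode) -> CSNode:
    # Assumption: selectivity = output cardinality / input cardinality.
    card_in, card_out = op.stats
    return CSNode(f"sel({op.operator})", [card_in, card_out],
                  lambda v: v[1] / v[0])


def early_selection_pot(first: ONode, second: ONode):
    """Hypothetical return value of a technique: operators and one condition."""
    condition = OCNode(selectivity(first), selectivity(second), "WD4")
    return [first, second], condition


o2 = ONode("o2", [SNode("in", 1000.0), SNode("out", 100.0)])
o3 = ONode("o3", [SNode("in", 100.0), SNode("out", 50.0)])
roots, condition = early_selection_pot(o2, o3)
still_optimal = condition.holds()  # 0.1 <= 0.5 with the values above
```

Because the CSNode recomputes its value from the referenced SNodes, updating a monitored statistic is enough to make a violated condition visible, and the technique field identifies what would have to be re-optimized.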
The use of the fine-grained partial PlanOptTrees at the optimizer interface is advantageous
because during directed re-optimization, only subplans are considered and hence,
only partial PlanOptTrees can be returned. Thus, the solution to the challenge of creating
the initial PlanOptTree is to merge all partial PlanOptTrees to a minimal representation.
Example 6.4 (Merging Partial PlanOptTrees). Recall plan P5 and assume the two partial
PlanOptTrees (created for the operators o2 and o3) shown in Figures 6.6(a) and 6.6(b).
When merging those two partial PlanOptTrees, we see that operator o3 and its selectivity
(as a CSNode) are used by both partial PlanOptTrees. Hence, we add only o4 and all
of its child nodes from POT2 to POT1. When doing so, the dangling reference from the
new optimality condition to sel(o3) of POT2 is modified to refer to the existing selectivity
measure sel(o3) of POT1. Finally, we obtain the PlanOptTree shown in Figure 6.6(c).
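The merge step of this example can be sketched as follows. This is not the algorithm of the text but a simplified illustration. The classes Stat, Condition, and PartialPOT, the function merge, and the assumed contents of the two partial trees (one condition each, sharing sel(o3)) are hypothetical, since the figures are not reproduced here.

```python
class Stat:
    """A statistic node, e.g. the selectivity of one operator."""
    def __init__(self, name):
        self.name = name


class Condition:
    """An optimality condition referencing two statistic nodes."""
    def __init__(self, left, right):
        self.left, self.right = left, right


class PartialPOT:
    def __init__(self):
        self.operators = {}   # operator name -> {statistic name -> Stat}
        self.conditions = []

    def stat(self, operator, name):
        # Return the existing statistic node or create it on first use.
        stats = self.operators.setdefault(operator, {})
        if name not in stats:
            stats[name] = Stat(name)
        return stats[name]


def merge(target, source):
    """Merge source into target, reusing nodes that target already has."""
    kept = {}  # id(node in source) -> node that survives in target
    for op, stats in source.operators.items():
        existing = target.operators.setdefault(op, {})
        for name, node in stats.items():
            kept[id(node)] = existing.setdefault(name, node)
    for cond in source.conditions:
        # Re-point references that would otherwise dangle into source.
        cond.left, cond.right = kept[id(cond.left)], kept[id(cond.right)]
        duplicate = any(c.left is cond.left and c.right is cond.right
                        for c in target.conditions)
        if not duplicate:
            target.conditions.append(cond)
    return target


pot1, pot2 = PartialPOT(), PartialPOT()
pot1.conditions.append(Condition(pot1.stat("o2", "sel"), pot1.stat("o3", "sel")))
pot2.conditions.append(Condition(pot2.stat("o3", "sel"), pot2.stat("o4", "sel")))
merged = merge(pot1, pot2)
```

After the merge, both conditions reference the same sel(o3) node, so this statistic is monitored only once and o4 is the only operator newly added to the first tree.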
Algorithm 6.1 describes the creation of a PlanOptTree in detail. We iterate over all
operators of a given subplan (lines 2-22). If an operator contains a subplan, we recursively