Cost-Based Optimization of Integration Flows - Datenbanken ...
6.2 Plan Optimality Trees
We execute a full optimization only once, during the initial deployment of an integration
flow. The optimizer contract is changed so that it returns the set of optimality
conditions. Thus, the resulting research challenge is how to organize these optimality conditions
for efficient statistic monitoring, condition evaluation, and directed re-optimization.
In contrast to existing passive structures such as matrix views [BMM+04], which are used
and maintained by the re-optimizer for selective operators, we propose an active structure,
the so-called Plan Optimality Tree (PlanOptTree), a data structure that models the
optimality of a plan. It indexes operators and their related statistics that are included
in optimality conditions. As a result, we maintain only the required statistics and we can
continuously evaluate the optimality conditions. Re-optimization is actively triggered only if
necessary; in this case, it is guaranteed that we will find a plan with lower costs. Here,
directed re-optimization is applied only to operators included in violated conditions.
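The idea of an active structure can be illustrated with a minimal sketch. It is not the PlanOptTree itself (which is defined formally in Section 6.2.1); the names `Condition`, `ActiveMonitor`, `register`, and `update` are hypothetical and chosen only for this illustration. The sketch assumes that each condition is a predicate over a dictionary of statistics and that a statistic update returns the operators of all conditions it violates.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Condition:
    """Hypothetical optimality condition over monitored statistics."""
    name: str
    operators: Set[str]                         # operators covered by the condition
    holds: Callable[[Dict[str, float]], bool]   # predicate over current statistics


@dataclass
class ActiveMonitor:
    """Illustrative active structure (an assumption, not the thesis code):
    only statistics used by some condition are maintained, and every
    update re-checks the conditions that depend on the updated statistic."""
    stats: Dict[str, float] = field(default_factory=dict)
    deps: Dict[str, List[Condition]] = field(default_factory=dict)

    def register(self, cond: Condition, stat_keys: List[str]) -> None:
        # Index the condition under every statistic it reads.
        for key in stat_keys:
            self.deps.setdefault(key, []).append(cond)

    def update(self, key: str, value: float) -> Set[str]:
        """Store a new statistic and return the operators that need
        directed re-optimization (empty set if all conditions hold)."""
        if key not in self.deps:
            return set()  # statistic is not required by any condition
        self.stats[key] = value
        violated: Set[str] = set()
        for cond in self.deps[key]:
            if not cond.holds(self.stats):
                violated |= cond.operators
        return violated


# Usage with made-up selectivities for two Selection operators.
monitor = ActiveMonitor(stats={"sel(o2)": 0.2, "sel(o3)": 0.5})
oc1 = Condition("oc1", {"o2", "o3"}, lambda s: s["sel(o2)"] <= s["sel(o3)"])
monitor.register(oc1, ["sel(o2)", "sel(o3)"])

still_optimal = monitor.update("sel(o2)", 0.3)   # condition still holds
to_reoptimize = monitor.update("sel(o2)", 0.7)   # condition violated
ignored = monitor.update("W(o9)", 5.0)           # not indexed, not stored
```

In this sketch, re-optimization would be triggered only when `update` returns a non-empty set, and only for the operators in that set.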
Example 6.3 (PlanOptTree POT(P5)). Figure 6.4 shows the PlanOptTree of plan P5.
[Figure 6.4: PlanOptTree of Plan P5. The tree indexes the Selection operators o2, o3, and o4 with their statistics |ds_in1| and |ds_out1| and the derived selectivities sel(o2), sel(o3), and sel(o4), as well as the operators o5, o6, and o7 with their execution times W, the expression times W(expr_A) and W(expr_B), and the path probabilities P(A) and P(B). Comparison nodes (≤) represent the optimality conditions oc1, oc2, and oc4.]
It includes two optimality conditions (oc1, oc2) that express the order of the Selection
operators o2, o3, and o4 (see Figure 6.3(c)) according to their selectivities (oc3 of Example
6.1 is omitted due to the transitivity of conditions) and one condition (oc4) regarding
branch prediction of the Switch operator o5 (see footnote 19) according to the weighted path probabilities.
In the rest of this chapter, we explain in detail how to create, update, and use these
PlanOptTrees in order to enable the vision of on-demand re-optimization.
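The omission of oc3 by transitivity can be sketched as follows. The sketch assumes that oc1 and oc2 have the form sel(o2) ≤ sel(o3) and sel(o3) ≤ sel(o4); the exact form is given in Example 6.1, which is not reproduced here. The function names and the selectivity values are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Tuple


def adjacent_conditions(order: List[str]) -> List[Tuple[str, str]]:
    """Return pairs (a, b), each meaning sel(a) <= sel(b), for neighbours
    in the chosen operator order. Conditions between non-neighbours
    follow by transitivity and therefore need not be stored."""
    return list(zip(order, order[1:]))


def violated_conditions(order: List[str],
                        sel: Dict[str, float]) -> List[Tuple[str, str]]:
    """Return the adjacent conditions that the current selectivities break."""
    return [(a, b) for a, b in adjacent_conditions(order) if sel[a] > sel[b]]


# Assumed order of the Selection operators in plan P5.
order = ["o2", "o3", "o4"]
conditions = adjacent_conditions(order)

# Made-up selectivities: o4 has become more selective than o3.
broken = violated_conditions(order, {"o2": 0.1, "o3": 0.4, "o4": 0.3})
```

For n ordered operators, this keeps n - 1 conditions instead of one per operator pair, which is why only oc1 and oc2 appear for the three Selection operators.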
6.2 Plan Optimality Trees<br />
In this section, we formally define the PlanOptTree and show how to create a PlanOptTree<br />
for a given plan during the initial optimization of an integration flow. Further, we explain
how to use it for statistic maintenance and when to trigger re-optimization.<br />
6.2.1 Formal Foundation<br />
A PlanOptTree, whose general structure is shown in Figure 6.5, models the optimality conditions
of a plan and is defined as follows:
Definition 6.1 (PlanOptTree). Let P denote the optimal plan with regard to the current
statistics. Further, let m denote the number of operators and let s denote the maximum
number of statistic types per operator. Then, the PlanOptTree is defined as a graph of
five strata representing all optimality conditions of P:
19 For the Switch operator, multiple versions of statistics are monitored. This includes the total execution
time W(o_i) as well as the different execution times of all expression evaluations W(expr_i).