Cost-Based Optimization of Integration Flows - Datenbanken ...
6.4 Optimization Techniques
Figure 6.13: Example PlanOptTree of Multi-Flow Optimization
operators, we monitor the execution time. In addition, the message rate R is monitored
for the first operator of this plan. We use oc1 to express the ordering of the partition
tree. In contrast, the validity condition requires a hierarchy of complex statistic
nodes. First, we determine the cost components W⁻(P′) and W⁺(P′) according to the
defined cost model extension. Then, ∆tw is computed from these cost components and the
message rate. Furthermore, we compute the latency time and the execution time using
the determined waiting time ∆tw. Finally, we use the optimality conditions oc2-oc5 to
express the validity condition, where two additional complex statistic nodes
(the latency constraint lc and the constant 0) serve as constant-value operands.
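The hierarchy of statistic nodes and optimality conditions described above can be illustrated with a minimal sketch. All class and function names (`Stat`, `ComplexStat`, `OptCond`, `build_conditions`) are hypothetical, and the formulas for ∆tw, the execution time, and the latency time are placeholders chosen only to make the example runnable; they are assumptions and not the definitions of the cost model extension. Likewise, the three conditions shown are illustrative and do not claim to reproduce oc2-oc5 exactly.

```python
import operator
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union


@dataclass
class Stat:
    """Atomic statistic node holding a monitored (or constant) value."""
    name: str
    value: float

    def get(self) -> float:
        return self.value


@dataclass
class ComplexStat:
    """Complex statistic node computed on demand from its child nodes."""
    name: str
    fn: Callable[..., float]
    children: Sequence["Node"]

    def get(self) -> float:
        return self.fn(*(child.get() for child in self.children))


Node = Union[Stat, ComplexStat]


@dataclass
class OptCond:
    """Optimality condition: a binary comparison of two statistic nodes."""
    name: str
    left: Node
    right: Node
    op: Callable[[float, float], bool] = operator.ge

    def holds(self) -> bool:
        return self.op(self.left.get(), self.right.get())


def build_conditions(rate: Stat, w_minus: Stat, w_plus: Stat, lc: float) -> List[OptCond]:
    """Wire up a small tree; every formula below is an assumed placeholder."""
    zero = Stat("0", 0.0)      # constant-value operand
    lc_node = Stat("lc", lc)   # latency constraint as constant-value operand
    # Placeholder: waiting time from the cost components and the message rate.
    dtw = ComplexStat("dtw", lambda wm, wp, r: wm + wp * r, [w_minus, w_plus, rate])
    # Placeholder: execution time of a plan instance for R * dtw messages.
    exec_time = ComplexStat(
        "W", lambda wm, wp, r, d: wm + wp * r * d, [w_minus, w_plus, rate, dtw]
    )
    # Placeholder: latency time as waiting time plus execution time.
    latency = ComplexStat("T_L", lambda d, w: d + w, [dtw, exec_time])
    return [
        OptCond("exec_nonneg", exec_time, zero),
        OptCond("wait_covers_exec", dtw, exec_time),
        OptCond("latency_ok", lc_node, latency),
    ]


def violated(conds: List[OptCond]) -> List[str]:
    """Return the names of all conditions that currently do not hold."""
    return [c.name for c in conds if not c.holds()]
```

Because complex nodes are evaluated on demand, updating a monitored value such as `rate.value` and calling `violated(conds)` again re-checks every condition against the current statistics without rebuilding the tree.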
Once re-optimization has been triggered, we can also use the PlanOptTree for directed
re-optimization. We can exploit the PlanOptTree in three ways. First, we
use the violated optimality conditions for the directed reordering of partitioning attributes,
similarly to the ordering of selective operators. Second, with regard to the different cases of
computing ∆tw (minimum, default, exceeded latency), we can directly derive the case from
the violated optimality conditions and compute the waiting time ∆tw accordingly. Third,
after each executed partitioned plan instance, we determine the current (continuously
adapted) waiting time ∆tw by querying the corresponding complex statistic node. Most
importantly, this allows for workload adaptation with almost none of the adaptation delay
that was introduced by the optimization interval ∆t. We thereby also solve the problem
that the maximum message latency time cannot be guaranteed if workload characteristics
change abruptly and long optimization intervals are used.
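The first of these uses, directed reordering of partitioning attributes, can be sketched as follows. The function name `directed_reorder` and the attribute names are hypothetical, and the sketch assumes that each ordering condition compares the monitored selectivities of two neighboring attributes with ≥; the direction of the comparison is an assumption. Only neighbors whose condition is violated are swapped, so an already valid ordering causes no rewriting work.

```python
from typing import Dict, List, Sequence, Tuple


def directed_reorder(order: Sequence[str], sel: Dict[str, float]) -> Tuple[List[str], int]:
    """Swap neighboring attributes until no ordering condition is violated.

    Assumed condition between neighbors: sel[order[i]] >= sel[order[i + 1]].
    Returns the new ordering and the number of swaps performed.
    """
    result = list(order)
    swaps = 0
    changed = True
    while changed:
        changed = False
        for i in range(len(result) - 1):
            # A violated condition directs the rewrite to exactly this pair.
            if not sel[result[i]] >= sel[result[i + 1]]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swaps += 1
                changed = True
    return result, swaps
```

For example, `directed_reorder(["ba1", "ba2", "ba3"], {"ba1": 0.2, "ba2": 0.5, "ba3": 0.1})` swaps only the first pair, since the condition between the remaining neighbors still holds afterwards.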
6.4.4 Discussion<br />
We generalize the main findings from the given example optimization techniques. First,
even when considering techniques with a large search space, only a few optimality conditions
are required because we do not model the complete plan search space but only the conditions
of the optimal plan. Hence, all conditions are binary decisions on subplans, and they
exploit the fact that the costs of operators before and after a subplan are independent
of the rewriting decision. Second, only one optimality condition per dependency (plus one
condition for binary operators) is required to model plan optimality. As a result, those
conditions can be seamlessly integrated into existing memo structures that are typically
used during query optimization in DBMSs (for an example, see the Cascades framework in
MS SQL Server [BN08]). Third, because optimality conditions can be arbitrarily complex,
all kinds of optimization techniques can be integrated. For example, our optimizer
uses (1) reordering techniques (e.g., switch path reordering, early selection, early