Cost-Based Optimization of Integration Flows - Datenbanken ...
6 On-Demand Re-Optimization
The overall cost-based optimization framework used so far relies on incremental statistics
maintenance and periodical re-optimization to meet the high performance demands of integration
flows and to overcome the problems of unknown statistics and changing workload
characteristics. However, the strict separation of optimization, execution, and statistics
monitoring causes two potential problems: (1) many unnecessary re-optimization steps and
(2) missed optimization opportunities due to adaptation delays. In order to overcome these
major drawbacks of periodical re-optimization, we introduce the novel concept of on-demand
re-optimization in this chapter.
Our aim is to reduce the overhead for statistics monitoring and re-optimization and,
at the same time, to adapt to changing workload characteristics as fast as possible. We
achieve this by extending the optimizer interface of our overall cost-based re-optimization
framework such that the optimality of a plan is modeled by its optimality conditions
and re-optimization is triggered only if workload changes violate these conditions. First,
we define the Plan Optimality Tree (PlanOptTree) and describe how to create such a
PlanOptTree for a given plan in order to model the optimality of this plan by its optimality
conditions rather than by considering the complete search space. We explain how to use it for
statistics maintenance and how it triggers re-optimization if optimality conditions are
violated. Second, we exploit these violated conditions for search space reductions during
re-optimization. In detail, we explain directed re-optimization and the update of
PlanOptTrees after successful re-optimization. Finally, we describe how common optimization
techniques are extended in order to enable on-demand re-optimization, and we
present experimental evaluation results that compare periodical re-optimization with
this novel on-demand re-optimization approach. According to the experimental evaluation,
we achieve improvements concerning re-optimization overhead as well as adaptation
sensitivity and thus reduce the total execution time.
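The core idea of triggering re-optimization only when an optimality condition is violated can be illustrated with a minimal Python sketch. It is not the PlanOptTree structure defined later in this chapter: the class names `OptimalityCondition` and `PlanOptimalityMonitor`, the statistic keys `sel_a` and `sel_b`, and the example condition (a plan that applies selection A before selection B is assumed to stay optimal while A is at least as selective as B, at equal operator costs) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Stats = Dict[str, float]


@dataclass
class OptimalityCondition:
    """One condition under which the current plan is assumed to stay optimal."""
    name: str
    holds: Callable[[Stats], bool]


@dataclass
class PlanOptimalityMonitor:
    """Hypothetical monitor: checks conditions whenever a statistic changes."""
    conditions: List[OptimalityCondition]
    stats: Stats = field(default_factory=dict)

    def update(self, key: str, value: float) -> List[str]:
        """Record a new statistic and return the names of violated conditions."""
        self.stats[key] = value
        return [c.name for c in self.conditions if not c.holds(self.stats)]


# Assumed example plan: selection A is executed before selection B.
monitor = PlanOptimalityMonitor(
    conditions=[
        OptimalityCondition("sel_a <= sel_b", lambda s: s["sel_a"] <= s["sel_b"])
    ],
    stats={"sel_a": 0.2, "sel_b": 0.6},
)

# A workload change that keeps the condition intact triggers nothing.
violated_small = monitor.update("sel_a", 0.4)

# A change that violates the condition would trigger re-optimization,
# and the violated condition tells the optimizer where to look.
violated_large = monitor.update("sel_a", 0.7)
```

In this sketch, the first update returns no violated conditions, so no re-optimization would be started, whereas the second update returns the violated condition, which could then be used to restrict re-optimization to the affected part of the plan.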
6.1 Motivation and Problem Description
The efficiency of integration flows is ensured by cost-based optimization in order to (1) exploit
the full optimization potential and (2) adapt to changing workload characteristics
[MSHR02, IHW04, BMM+04, DIR07, LSM+07, CC08, NRB09] such as varying cardinalities,
selectivities, and execution times (caused, for example, by unpredictable workloads
of external systems or temporal variations of network properties). With regard to a low
risk of re-optimization overhead and good optimization opportunities, the state-of-the-art
cost-based optimization models of integration flows are (1) periodical re-optimization
(see Chapter 3) and (2) the optimize-always optimization model [SVS05, SMWM06]. Within
the optimize-always model, optimization is triggered for each plan instance. This fails in
the case of many plan instances with rather small amounts of data per instance because
the optimization time can be even higher than the execution time of a single plan instance.
As mentioned in Subsection 2.2.4, there are fundamental differences to other system