25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


6 On-Demand Re-Optimization

The overall cost-based optimization framework used so far relies on incremental statistics
maintenance and periodical re-optimization to meet the high performance demands of integration
flows and to overcome the problems of unknown statistics and changing workload
characteristics. However, the strict separation of optimization, execution, and statistics
monitoring causes two potential problems: (1) many unnecessary re-optimization steps and
(2) missed optimization opportunities due to adaptation delays. In order to overcome these
major drawbacks of periodical re-optimization, we introduce in this chapter the novel
concept of on-demand re-optimization.

Our aim is to reduce the overhead for statistics monitoring and re-optimization and,
at the same time, to adapt to changing workload characteristics as fast as possible. We
achieve this by extending the optimizer interface of our overall cost-based re-optimization
framework: we model the optimality of a plan by its optimality conditions
and trigger re-optimization only if workload changes violate these conditions. First,
we define the Plan Optimality Tree (PlanOptTree) and describe how to create such a
PlanOptTree for a given plan in order to model the optimality of this plan by its optimality
conditions rather than by considering the complete search space. We explain how to use it for
statistics maintenance and how this triggers re-optimization if optimality conditions are
violated. Second, we exploit these violated conditions for search space reductions during
re-optimization. In detail, we explain the directed re-optimization and the update of
PlanOptTrees after successful re-optimization. Finally, we describe how common optimization
techniques are extended in order to enable on-demand re-optimization, and we
present experimental evaluation results that compare periodical re-optimization with
this novel on-demand re-optimization approach. According to the experimental evaluation,
we achieve improvements concerning both re-optimization overhead and adaptation
sensibility and thus reduce the total execution time.
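As a rough illustration of the triggering idea described above, the following minimal Python sketch checks optimality conditions whenever a statistic is updated and calls the optimizer only when a condition is violated. It is not the PlanOptTree structure defined later in this chapter: the names `OptimalityCondition`, `OnDemandMonitor`, and `swap_selections`, the flat list of conditions, and the example of two selection operators ordered by selectivity are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class OptimalityCondition:
    """Hypothetical condition: the current plan stays optimal while left <= right."""
    left: str   # name of a monitored statistic (assumed naming)
    right: str  # name of a monitored statistic (assumed naming)

    def holds(self, stats: Dict[str, float]) -> bool:
        return stats[self.left] <= stats[self.right]


@dataclass
class OnDemandMonitor:
    """Hypothetical monitor that couples statistics maintenance with triggering."""
    stats: Dict[str, float]
    conditions: List[OptimalityCondition]
    reoptimize: Callable[[List[OptimalityCondition]], None]
    triggered: int = 0

    def update(self, name: str, value: float) -> List[OptimalityCondition]:
        """Store a new statistic and check only the conditions that depend on it."""
        self.stats[name] = value
        violated = [c for c in self.conditions
                    if name in (c.left, c.right) and not c.holds(self.stats)]
        if violated:
            # Re-optimization is triggered on demand, and the violated
            # conditions are handed over so the optimizer can focus on them.
            self.triggered += 1
            self.reoptimize(violated)
        return violated


def swap_selections(violated: List[OptimalityCondition]) -> None:
    """Stand-in for directed re-optimization: reorder the two operators."""
    for cond in violated:
        cond.left, cond.right = cond.right, cond.left


# Assumed example: selection A runs before selection B, which is taken to be
# optimal as long as the selectivity of A does not exceed that of B.
monitor = OnDemandMonitor(
    stats={"sel_a": 0.2, "sel_b": 0.6},
    conditions=[OptimalityCondition("sel_a", "sel_b")],
    reoptimize=swap_selections,
)
monitor.update("sel_a", 0.3)  # condition still holds, no re-optimization
monitor.update("sel_a", 0.8)  # condition violated, re-optimization triggered
```

In this sketch the first update leaves the plan untouched, while the second one violates the single condition and triggers exactly one re-optimization, after which the condition is rewritten to reflect the new operator order.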

6.1 Motivation and Problem Description

The efficiency of integration flows is ensured by cost-based optimization in order (1) to exploit
the full optimization potential and (2) to adapt to changing workload characteristics
[MSHR02, IHW04, BMM+04, DIR07, LSM+07, CC08, NRB09] such as varying cardinalities,
selectivities, and execution times (caused, for example, by unpredictable workloads
of external systems or temporal variations of network properties). With regard to the low
risk of re-optimization overhead and good optimization opportunities, the state-of-the-art
cost-based optimization models of integration flows are (1) the periodical re-optimization
(see Chapter 3) or (2) the optimize-always optimization model [SVS05, SMWM06]. Within
the optimize-always model, optimization is triggered for each plan instance. This model fails in
the case of many plan instances with rather small amounts of data per instance because
the optimization time can be even higher than the execution time of a single plan instance.
As mentioned in Subsection 2.2.4, there are fundamental differences to other system
