25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


6.2 Plan Optimality Trees

• Optimality: The optimality of a plan is represented by the PlanOptTree. If and only if any optimality condition is violated, we will find a plan with lower cost during re-optimization. Thus, there is no need to trigger re-optimization until we detect a violation (the break-even point between the optimality of different plans).

• Transitivity: If a statistic is included in multiple optimality conditions, we can leverage the transitivity of the comparison operators θ. Thus, the total number of required optimality conditions can be reduced.

• Directed Optimization: If an optimality condition is violated, we can easily determine the involved operators and the optimization technique that produced this condition. Then, we only need to re-optimize those operators directly.

These properties hold for a complete PlanOptTree, while a set of partial PlanOptTrees (here, a partial PlanOptTree covers only a subset of the optimality conditions) might include redundancy, which conflicts with minimal monitoring and transitivity. We will revisit this issue and explain its relevance for further optimization later on.
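The transitivity property can be illustrated with a small sketch. It assumes, purely for illustration, that every optimality condition has the form `a <= b` over two named statistics and that the conditions form no cycle; the function name `reduce_conditions` and the statistic names are hypothetical and not taken from the original text. A condition is dropped when a chain of other conditions already implies it.

```python
def reduce_conditions(conditions):
    """Drop every condition a <= c that a chain of other conditions implies.

    `conditions` is a set of (a, b) pairs, each read as "a <= b".
    Assumption: the conditions are acyclic.
    """
    edges = set(conditions)
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)

    def reachable(src, dst, skip):
        # Depth-first search from src to dst that ignores the edge `skip`.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            for nxt in succ.get(node, ()):
                if (node, nxt) == skip or nxt in seen:
                    continue
                if nxt == dst:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    return {e for e in edges if not reachable(e[0], e[1], skip=e)}


conditions = {
    ("sel(o2)", "sel(o3)"),
    ("sel(o3)", "sel(o4)"),
    ("sel(o2)", "sel(o4)"),  # implied by the two conditions above
}
minimal = reduce_conditions(conditions)
```

In this sketch only two of the three conditions need to be monitored, because the third follows from the other two by transitivity.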

6.2.2 Creating PlanOptTrees

During the initial deployment of an integration flow, the full cost-based optimization is executed once. There, the complete plan search space is evaluated and an initial PlanOptTree is created. From this point, the PlanOptTree is used for incremental and directed re-optimization only. In this subsection, we explain how to create this initial PlanOptTree.

Our standard (transformation-based) optimization algorithm A-PMO recursively iterates over the hierarchy of sequences of atomic and complex operators (the internal representation of a plan) and changes the current plan by applying relevant optimization techniques according to the specific types of operators. For on-demand re-optimization, in contrast, we changed the optimizer interface. Now, the optimizer not only changes the current plan; each applied optimization technique additionally returns a partial PlanOptTree that represents the optimality conditions for the subplan considered by this technique. This extension of optimization techniques is straightforward because the existing cost functions and optimality conditions can be reused. For example, the technique WD4: Early Selection Application creates a partial PlanOptTree when considering two operators. This technique constructs the partial PlanOptTree using ONodes, SNodes, a specialized CSNode Selectivity, and an OCNode.
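The extended optimizer interface can be pictured with the following minimal sketch. It is not the implementation from the thesis: the class fields, the function `apply_early_selection`, the cardinality values, and the assumption that selectivity is output cardinality divided by input cardinality are all chosen here for illustration only. The sketch orders two selections so that the one with the lower selectivity value runs first and returns the new order together with a partial PlanOptTree made of ONodes, SNodes, Selectivity nodes, and one OCNode.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class SNode:
    """Atomic statistic of one operator (here: a cardinality)."""
    name: str
    value: float


@dataclass
class Selectivity:
    """Specialized CSNode; assumed to be output / input cardinality."""
    card_in: SNode
    card_out: SNode

    def value(self):
        return self.card_out.value / self.card_in.value


@dataclass
class OCNode:
    """Optimality condition `left theta right` over two complex statistics."""
    left: Selectivity
    right: Selectivity
    theta: callable = operator.le

    def holds(self):
        return self.theta(self.left.value(), self.right.value())


@dataclass
class ONode:
    """Operator node that owns the statistics monitored for this operator."""
    op_id: str
    stats: list = field(default_factory=list)


def apply_early_selection(first, second, cards):
    """Return (operator order, partial PlanOptTree) for two selections.

    `cards` maps an operator id to (input cardinality, output cardinality).
    """
    def build(op_id):
        c_in, c_out = cards[op_id]
        s_in = SNode(f"|in({op_id})|", c_in)
        s_out = SNode(f"|out({op_id})|", c_out)
        return ONode(op_id, [s_in, s_out]), Selectivity(s_in, s_out)

    (n1, sel1), (n2, sel2) = build(first), build(second)
    if sel1.value() > sel2.value():
        # Put the operator with the lower selectivity value first.
        (n1, sel1), (n2, sel2) = (n2, sel2), (n1, sel1)
    condition = OCNode(sel1, sel2)  # sel(first) <= sel(second)
    return [n1.op_id, n2.op_id], {"onodes": [n1, n2], "oc": condition}


order, pot = apply_early_selection("o2", "o3", {"o2": (100, 80), "o3": (100, 20)})
holds_initially = pot["oc"].holds()

# A monitored statistic changes: o3 now passes 90 of 100 messages.
pot["onodes"][0].stats[1].value = 90
holds_after_update = pot["oc"].holds()
```

In this sketch the returned condition holds right after optimization and is violated once the monitored output cardinality of o3 changes, which is the kind of event that would trigger directed re-optimization of just these two operators.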

The use of the fine-grained partial PlanOptTrees at the optimizer interface is advantageous because during directed re-optimization, only subplans are considered and hence, only partial PlanOptTrees can be returned. Thus, the solution to the challenge of creating the initial PlanOptTree is to merge all partial PlanOptTrees to a minimal representation.

Example 6.4 (Merging Partial PlanOptTrees). Recall plan P5 and assume the two partial PlanOptTrees (created for the operators o2 and o3) shown in Figures 6.6(a) and 6.6(b). When merging those two partial PlanOptTrees, we see that operator o3 and its selectivity (as a CSNode) are used by both partial PlanOptTrees. Hence, we add only o4 and all of its child nodes from POT2 to POT1. When doing so, the dangling reference from the new optimality condition to sel(o3) of POT2 is modified to refer to the existing selectivity measure sel(o3) of POT1. Finally, we obtain the PlanOptTree shown in Figure 6.6(c).
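The merge step of this example can be sketched as follows. This is an illustrative simplification, not Algorithm 6.1: the classes `Stat`, `Condition`, and `PartialPOT`, the function `merge`, and the direction of the two conditions are assumptions made for this sketch. Statistics are identified by a key such as `sel(o3)`; when a key already exists in the target tree, the reference of the incoming condition is redirected to the existing node instead of adding a duplicate.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Stat:
    """Statistic node identified by a key such as 'sel(o3)'."""
    key: str


@dataclass(eq=False)
class Condition:
    """Optimality condition `left <= right` referencing two Stat objects."""
    left: Stat
    right: Stat


@dataclass
class PartialPOT:
    stats: dict = field(default_factory=dict)       # key -> Stat
    conditions: list = field(default_factory=list)

    def stat(self, key):
        return self.stats.setdefault(key, Stat(key))

    def add_condition(self, left_key, right_key):
        self.conditions.append(Condition(self.stat(left_key), self.stat(right_key)))


def merge(target, other):
    """Merge `other` into `target`, reusing statistics `target` already has.

    Note: the Condition objects of `other` are modified and moved over.
    """
    known = {(c.left.key, c.right.key) for c in target.conditions}
    for cond in other.conditions:
        pair = (cond.left.key, cond.right.key)
        if pair in known:
            continue  # identical condition already present
        # Redirect references to existing nodes; adopt nodes that are new.
        cond.left = target.stats.setdefault(cond.left.key, cond.left)
        cond.right = target.stats.setdefault(cond.right.key, cond.right)
        target.conditions.append(cond)
        known.add(pair)
    return target


pot1, pot2 = PartialPOT(), PartialPOT()
pot1.add_condition("sel(o2)", "sel(o3)")   # partial tree covering o2 and o3
pot2.add_condition("sel(o3)", "sel(o4)")   # partial tree covering o3 and o4
merged = merge(pot1, pot2)
```

After the merge, the tree in this sketch holds three statistics and two conditions, and both conditions share the same `sel(o3)` node, so this statistic is monitored only once.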

Algorithm 6.1 describes the creation of a PlanOptTree in detail. We iterate over all operators of a given subplan (lines 2-22). If an operator contains a subplan, we recursively
