Cost-Based Optimization of Integration Flows - Datenbanken ...
6.3 Re-Optimization
Algorithm 6.3 PlanOptTree Trigger Re-Optimization (A-PTR)
Require: invalid optimality conditions OC
1: clear memo
2: for all oc ∈ OC do // for each OCNode
3: for all oc1 ∈ oc.op1.ocnodes do // for each OCNode of operand 1
4: if oc1.θ = oc.θ and oc1.op2 = oc.op1 then
5: if ¬oc1.isOptimal(oc1.op1.agg, oc.op2.agg) then
6: OC ← OC ∪ oc1
7: rCheckTransitivity(oc, oc1.op1, left)
8: for all oc2 ∈ oc.op2.ocnodes do // for each OCNode of operand 2
9: if oc2.θ = oc.θ and oc2.op1 = oc.op2 then
10: if ¬oc2.isOptimal(oc2.op2.agg, oc.op1.agg) then
11: OC ← OC ∪ oc2
12: rCheckTransitivity(oc, oc2.op2, right)
13: PPOT ← optimizePlan(ptid, OC) // apply directed re-optimization
14: A-PPR(PPOT, OC) // update PlanOptTree
incrementally updated (line 14). It is important to note that the directed re-optimization
of operators involved in violated optimality conditions is equivalent to full re-optimization.
The directed re-optimization relies on monotonic cost functions. This ensures that no
local suboptima exist and that the global optimum is found. The Picasso project
[RH05] showed that this assumption holds for complete cost diagrams over multiple alternative
plans of most queries [HDH07, HDH08, DBDH08]. In contrast, it always holds
for our cost model of integration flows with regard to a single plan (see Subsection 3.2.2).
However, because optimality conditions can be arbitrarily complex, even without
this property we could still guarantee that full re-optimization is triggered if a better plan exists.
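
To make the transitivity check of Algorithm 6.3 more concrete, the following Python sketch mirrors only the left-hand part (lines 3-7); the right-hand part (lines 8-12) would be handled analogously. Everything beyond the names that appear in the algorithm is an assumption made for illustration: the classes `Operand` and `OCNode`, the string encoding of θ, the function names `check_left` and `trigger`, the `visited` guard against cycles, and in particular the choice to recurse only when the transitive condition is violated, since the extracted listing does not preserve the original nesting. The sketch also iterates over the initial set of invalid conditions only and omits the memo, `optimizePlan`, and A-PPR.

```python
import operator
from dataclasses import dataclass, field

# Assumed encoding of the comparison operator θ of an optimality condition.
CMP = {"<=": operator.le, ">=": operator.ge}


@dataclass(eq=False)  # identity-based equality, so nodes can be compared safely
class Operand:
    name: str
    agg: float                                   # aggregated statistic
    ocnodes: list = field(default_factory=list)  # conditions using this operand


@dataclass(eq=False)
class OCNode:
    op1: Operand
    theta: str
    op2: Operand

    def __post_init__(self):
        # Register the condition with both of its operands.
        self.op1.ocnodes.append(self)
        self.op2.ocnodes.append(self)

    def is_optimal(self, left_agg, right_agg):
        return CMP[self.theta](left_agg, right_agg)


def check_left(oc, operand, violated, visited=None):
    """Follow conditions 'x θ operand' and test the transitive 'x θ oc.op2'."""
    visited = set() if visited is None else visited
    if id(operand) in visited:  # hypothetical guard against cyclic conditions
        return
    visited.add(id(operand))
    for oc1 in operand.ocnodes:
        if oc1.theta == oc.theta and oc1.op2 is operand:
            if not oc1.is_optimal(oc1.op1.agg, oc.op2.agg):
                if oc1 not in violated:
                    violated.append(oc1)
                # Assumed nesting: continue to the left only after a violation.
                check_left(oc, oc1.op1, violated, visited)


def trigger(invalid):
    """Extend the invalid conditions by transitively violated ones."""
    violated = list(invalid)
    for oc in invalid:
        check_left(oc, oc.op1, violated)
    return violated


# Chain d <= c <= a <= b, where new statistics violate a <= b (4 > 2).
d, c, a, b = (Operand(n, v) for n, v in
              [("d", 1.0), ("c", 3.0), ("a", 4.0), ("b", 2.0)])
oc_dc = OCNode(d, "<=", c)
oc_ca = OCNode(c, "<=", a)
oc_ab = OCNode(a, "<=", b)

names = [(n.op1.name, n.op2.name) for n in trigger([oc_ab])]
print(names)  # [('a', 'b'), ('c', 'a')]
```

In this example, the transitive condition c ≤ b is violated (3 > 2), so the condition c ≤ a is added, whereas d ≤ b still holds (1 ≤ 2) and the walk stops there.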
Theorem 6.2 (Directed Re-Optimization). The directed re-optimization for all operators
$o' \in P$ that have been identified by violated optimality conditions $oc'$ of a PlanOptTree is
equivalent to the full re-optimization of all operators $o \in P$.
Proof. Assume all dependencies between operators $o$ of plan $P$ to be a directed graph
$G = (V, A)$ of vertexes (operators) and arcs (dependencies). Then, the re-optimization of
$P$ is a graph homomorphism $f : G \to H$. In order to prove Theorem 6.2, we show that

$$\forall o_i \notin o' : \left( v_{\mathit{pre}(o_i)} \in G \equiv v_{\mathit{pre}(o_i)} \in H \right) \wedge \left( v_{\mathit{suc}(o_i)} \in G \equiv v_{\mathit{suc}(o_i)} \in H \right), \tag{6.2}$$

where $v_{\mathit{pre}(o_i)}$ denotes the set of predecessors of operator $o_i$ and $v_{\mathit{suc}(o_i)}$ denotes the set of
successors of $o_i$. (1) If there exists a homomorphism $f : G \to H$ such that

$$v_j \prec o_i \in G \;\wedge\; o_i \prec v_j \in H, \tag{6.3}$$

then, the order $v_j \prec o_i$ is represented by an optimality condition $oc$ with $o_i, v_j \in oc$ or by a
transitive optimality condition $toc$ with $o_i, v_j \in toc$. The same is true for successors of $o_i$.
(2) All used cost functions are known to be monotonically non-decreasing w.r.t. the input
statistics. Hence, during re-optimization, $f : G \to H$, the globally optimal solution will
be found. (3) Further, all operators $o'$ included in violated optimality conditions $\forall o_i \in oc'$
or transitive optimality conditions $\forall o_i \in toc'$ are used by $f : G \to H$. As a result,

$$\nexists \left( o_i \notin o' \wedge \left( \left( v_{\mathit{pre}(o_i)} \in G \neq v_{\mathit{pre}(o_i)} \in H \right) \vee \left( v_{\mathit{suc}(o_i)} \in G \neq v_{\mathit{suc}(o_i)} \in H \right) \right) \right), \tag{6.4}$$