Cost-Based Optimization of Integration Flows - Datenbanken ...
3 Fundamentals of Optimizing Integration Flows
[Figure: join trees of plans P and P′ over the inputs R, S, T, V, and W, annotated with intermediate result sizes of 600,000 and 1,000]
(a) Plan P (before reordering)
(b) Plan P′ (after reordering)
Figure 3.10: Heuristic Join Reordering Example
the order of these two operators as shown in Figure 3.10. For example, the costs of the
Join operator ∗ ⋈ W are independent of the order of the previous join operators because the
size of the intermediate result (1,000) before this operator is constant; hence, the costs
of this operator are constant as well. This concept of binary cost-based reordering decisions
takes the cardinality and selectivity of join operators into account and thus generalizes the
existing ranking functions minSel, minCard, and minSize [BGLJ10].
Algorithm 3.3 illustrates the heuristic join reordering in detail. Essentially, it uses a
transformation-based approach, where the inputs of join operators are reordered. First,
we select the input data set with the smallest cardinality (line 1) and reorder it with
the existing first join operand (line 2). Second, we incrementally reorder subsequent join
operands by iterating over all joins (line 3) and comparing the costs of the overall plan
under the hypothesis that we reorder the join operand with a subsequent operand. Assuming
variable selectivities and cardinalities, the cost comparison of subplans for arbitrary
left-deep join trees is specified as follows. First, fix two input data sets with the
commutativity optimality condition of |R| ≤ |S|. Second, executing R ⋈ S before ∗ ⋈ T
is optimal if the following optimality condition holds:
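\begin{equation}
\begin{aligned}
|R| + |R| \cdot |S| + f_{R,S} \cdot |R| \cdot |S| + f_{R,S} \cdot |R| \cdot |S| \cdot |T|
&\leq |R| + |R| \cdot |T| + f_{R,T} \cdot |R| \cdot |T| + f_{R,T} \cdot |R| \cdot |T| \cdot |S| \\
|S| + f_{R,S} \cdot |S| + f_{R,S} \cdot |S| \cdot |T|
&\leq |T| + f_{R,T} \cdot |T| + f_{R,T} \cdot |T| \cdot |S|.
\end{aligned}
\tag{3.8}
\end{equation}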
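To make condition (3.8) concrete, the following minimal Python sketch evaluates both of its forms for one set of numbers. The function names and the sample cardinalities and selectivities are hypothetical and chosen only for illustration; the formulas are transcribed directly from the two inequalities. The second inequality follows from the first by subtracting |R| on both sides and dividing by |R| (assuming |R| > 0), so both forms should always agree.

```python
import math


def full_costs(card_r, card_s, card_t, f_rs, f_rt):
    """Return (lhs, rhs) of the first inequality in (3.8)."""
    lhs = (card_r + card_r * card_s
           + f_rs * card_r * card_s
           + f_rs * card_r * card_s * card_t)
    rhs = (card_r + card_r * card_t
           + f_rt * card_r * card_t
           + f_rt * card_r * card_t * card_s)
    return lhs, rhs


def simplified_costs(card_s, card_t, f_rs, f_rt):
    """Return (lhs, rhs) of the second inequality in (3.8); |R| has cancelled out."""
    lhs = card_s + f_rs * card_s + f_rs * card_s * card_t
    rhs = card_t + f_rt * card_t + f_rt * card_t * card_s
    return lhs, rhs


def s_before_t(card_r, card_s, card_t, f_rs, f_rt):
    """True if executing R join S before (* join T) satisfies condition (3.8)."""
    lhs, rhs = simplified_costs(card_s, card_t, f_rs, f_rt)
    return lhs <= rhs


# Hypothetical example values (not taken from the text).
card_r, card_s, card_t = 100, 1000, 2000
f_rs, f_rt = 0.01, 0.001

full_lhs, full_rhs = full_costs(card_r, card_s, card_t, f_rs, f_rt)
simp_lhs, simp_rhs = simplified_costs(card_s, card_t, f_rs, f_rt)

# Both forms differ only by the constant |R| and the factor |R|.
assert math.isclose(full_lhs - card_r, card_r * simp_lhs)
assert math.isclose(full_rhs - card_r, card_r * simp_rhs)

print(s_before_t(card_r, card_s, card_t, f_rs, f_rt))
```

With these sample values, T is larger than S but far more selective, so the condition does not hold and joining T first is the cheaper choice. This illustrates why the decision cannot be made on cardinality or selectivity alone.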
Algorithm 3.3 Heuristic Join Reordering (A-HJR)
Require: set of input data sets R (with n = |R|)
 1: R_k ← min_{R_i ∈ R} |R_i|               // determine smallest input
 2: reorder(R_1, R_k)
 3: for i ← 3 to n do                        // for each input
 4:   for j ← i − 1 to 2 do                  // for each predecessor input
 5:     if ¬connected(R_i, R_{j−1}) then     // is reordering impossible
 6:       break
 7:     if ¬optimal(R_i, R_j) then           // is reordering not meaningful
 8:       break
 9:     reorder(R_j, R_i)
10: return R
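The control flow of Algorithm 3.3 can be sketched in Python as follows. This is an illustrative reading, not the original implementation, and it rests on several assumptions: each input carries a single hypothetical selectivity `sel` with respect to the join prefix, `connected` is modeled as a lookup in a hypothetical set of join-graph edges, `optimal` is taken to be the negation of the simplified form of condition (3.8), and the operand under consideration is moved toward the front by adjacent swaps. The sample inputs are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Input:
    name: str
    card: float  # cardinality |R_i|
    sel: float   # assumption: selectivity of joining this input with the prefix


def keep_order(s, t):
    """Simplified condition (3.8): True if joining s before t is at least as cheap."""
    lhs = s.card + s.sel * s.card + s.sel * s.card * t.card
    rhs = t.card + t.sel * t.card + t.sel * t.card * s.card
    return lhs <= rhs


def heuristic_join_reorder(inputs, edges):
    """Sketch of A-HJR; edges is a set of frozenset({name_a, name_b}) pairs."""
    order = list(inputs)
    # Lines 1-2: move the smallest input to the first position.
    k = min(range(len(order)), key=lambda idx: order[idx].card)
    order[0], order[k] = order[k], order[0]
    # Lines 3-9: try to move each later input toward the front.
    for i in range(2, len(order)):       # 0-based index 2 corresponds to R_3
        pos = i                          # current position of the moving input
        for j in range(i - 1, 0, -1):    # predecessors down to R_2
            moving, pred = order[pos], order[j]
            # Line 5: stop if the moving input cannot join with R_{j-1}.
            if frozenset((moving.name, order[j - 1].name)) not in edges:
                break
            # Line 7: stop if the current order is already the cheaper one.
            if keep_order(pred, moving):
                break
            # Line 9: swap the moving input with its predecessor.
            order[j], order[pos] = order[pos], order[j]
            pos = j
    return order


# Hypothetical inputs and join graph (V and T share no join predicate).
inputs = [
    Input("R", 1000, 0.01),
    Input("S", 2000, 0.1),
    Input("T", 50, 0.01),
    Input("V", 3000, 0.001),
]
edges = {frozenset(p) for p in [("T", "R"), ("T", "S"), ("R", "S"),
                                ("R", "V"), ("S", "V")]}

result = [x.name for x in heuristic_join_reorder(inputs, edges)]
print(result)
```

In this example the smallest input T moves to the front, and the selective input V advances past S but stops behind R because it is not connected to T. The nested loops resemble an insertion sort, so the sketch performs a quadratic number of pairwise comparisons in the number of inputs.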