25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


3 Fundamentals of Optimizing Integration Flows

Figure 3.10: Heuristic Join Reordering Example. (a) Plan P (before reordering); (b) Plan P′ (after reordering). Node labels in both panels: R, S, T, V, W; annotated intermediate result sizes: 600,000 and 1,000.

the order of these two operators as shown in Figure 3.10. For example, the costs of the join operator ∗ ⋈ W are independent of the order of the previous join operators: the size of the intermediate result (1,000) before this operator is constant, and hence these operator costs are constant as well. This concept of binary cost-based reordering decisions takes into account both the cardinality and the selectivity of join operators and thus generalizes the existing ranking functions minSel, minCard, and minSize [BGLJ10].

Algorithm 3.3 illustrates the heuristic join reordering in detail. Essentially, it uses a transformation-based approach in which the inputs of join operators are reordered. First, we select the input data set with the smallest cardinality (line 1) and reorder it with the existing first join operand (line 2). Second, we incrementally reorder subsequent join operands by iterating over all joins (line 3) and comparing the costs of the overall plan under the hypothesis that we reorder the join operand with a subsequent operand. Assuming variable selectivities and cardinalities, the cost comparison of subplans for arbitrary left-deep join trees is specified as follows. First, fix two input data sets such that the commutativity optimality condition |R| ≤ |S| holds. Second, executing R ⋈ S before ∗ ⋈ T is optimal if the following condition holds:

\[
\begin{aligned}
|R| + |R| \cdot |S| + f_{R,S} \cdot |R| \cdot |S| + f_{R,S} \cdot |R| \cdot |S| \cdot |T|
  &\le |R| + |R| \cdot |T| + f_{R,T} \cdot |R| \cdot |T| + f_{R,T} \cdot |R| \cdot |T| \cdot |S| \\
|S| + f_{R,S} \cdot |S| + f_{R,S} \cdot |S| \cdot |T|
  &\le |T| + f_{R,T} \cdot |T| + f_{R,T} \cdot |T| \cdot |S|.
\end{aligned}
\tag{3.8}
\]
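As a quick numeric check of condition (3.8), the following minimal sketch evaluates both forms of the inequality. The function names and the example cardinalities and selectivities are illustrative assumptions, not taken from the source. The second form follows from the first by subtracting |R| on both sides and dividing by |R|, so both should give the same answer for any |R| > 0.

```python
def full_condition(r: float, s: float, t: float, f_rs: float, f_rt: float) -> bool:
    """First line of (3.8): compare R join S first against R join T first."""
    lhs = r + r * s + f_rs * r * s + f_rs * r * s * t
    rhs = r + r * t + f_rt * r * t + f_rt * r * t * s
    return lhs <= rhs


def simplified_condition(s: float, t: float, f_rs: float, f_rt: float) -> bool:
    """Second line of (3.8): the same comparison after cancelling |R|."""
    lhs = s + f_rs * s + f_rs * s * t
    rhs = t + f_rt * t + f_rt * t * s
    return lhs <= rhs


# Hypothetical example values (assumptions for illustration only).
r, s, t = 10, 1000, 100
f_rs, f_rt = 0.01, 0.5

join_s_first = simplified_condition(s, t, f_rs, f_rt)
join_s_first_full = full_condition(r, s, t, f_rs, f_rt)
print(join_s_first, join_s_first_full)
```

In this example S is larger than T, yet joining S first satisfies the condition because its selectivity is much lower. This is the sense in which the condition weighs cardinality and selectivity together.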

Algorithm 3.3 Heuristic Join Reordering (A-HJR)

    Require: set of input data sets R (with n = |R|)
     1: R_k ← min_{R_i ∈ R} |R_i|                // determine smallest input
     2: reorder(R_1, R_k)
     3: for i ← 3 to n do                        // for each input
     4:     for j ← i − 1 to 2 do                // for each predecessor input
     5:         if ¬connected(R_i, R_{j−1}) then // is reordering impossible
     6:             break
     7:         if ¬optimal(R_i, R_j) then       // is reordering not meaningful
     8:             break
     9:         reorder(R_j, R_i)
    10: return R
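The listing above can be sketched in Python as follows. This is one possible reading, not the author's implementation. It assumes that `reorder` swaps two positions in the join order, and that the inner loop moves the current input towards the front one position at a time, as in an insertion sort. The predicates `connected` and `optimal` are passed in as hypothetical callables. In the example, `optimal` applies the simplified form of condition (3.8) with one assumed selectivity per input, and every pair of inputs is assumed to be connected.

```python
from typing import Callable, Dict, List


def heuristic_join_reorder(
    inputs: List[str],
    card: Dict[str, int],
    connected: Callable[[str, str], bool],
    optimal: Callable[[str, str], bool],
) -> List[str]:
    order = list(inputs)
    # Lines 1-2: move the input with the smallest cardinality to the front.
    k = min(range(len(order)), key=lambda idx: card[order[idx]])
    order[0], order[k] = order[k], order[0]
    # Line 3: each input from the third one onwards (index 2, 0-based).
    for i in range(2, len(order)):
        pos = i  # current position of the input being moved
        # Line 4: predecessors from i-1 down to the second input.
        for j in range(i - 1, 0, -1):
            # Line 5: stop if the input cannot be joined one step earlier.
            if not connected(order[pos], order[j - 1]):
                break
            # Line 7: stop if joining it before its predecessor does not pay off.
            if not optimal(order[pos], order[j]):
                break
            # Line 9: swap the input with its predecessor.
            order[j], order[pos] = order[pos], order[j]
            pos = j
    return order


# Hypothetical statistics (assumptions for illustration only).
CARD = {"R": 10, "S": 1000, "T": 100, "V": 500}
SEL = {"S": 0.01, "T": 0.5, "V": 0.001}  # selectivity when joined to the left-deep prefix


def always_connected(a: str, b: str) -> bool:
    return True


def optimal_by_3_8(a: str, b: str) -> bool:
    # Simplified form of condition (3.8): is joining a before b at least as cheap?
    lhs = CARD[a] + SEL[a] * CARD[a] + SEL[a] * CARD[a] * CARD[b]
    rhs = CARD[b] + SEL[b] * CARD[b] + SEL[b] * CARD[b] * CARD[a]
    return lhs <= rhs


result = heuristic_join_reorder(["S", "R", "T", "V"], CARD, always_connected, optimal_by_3_8)
print(result)
```

Passing the predicates as parameters keeps the sketch independent of any particular join graph or cost model. A real optimizer would derive both from the plan's join predicates and monitored statistics.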
