25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


3 Fundamentals of Optimizing Integration Flows

(a) Plan $P_6$: Fork ($o_1$) with three parallel Assign/Invoke branches ($o_2$, $o_3$: service s3, out: msg2; $o_4$, $o_5$: service s4, out: msg4; $o_6$, $o_7$: service s5, out: msg6), followed by Setoperation $o_8$ (UNION DISTINCT, in: msg2, msg4, out: msg7), Setoperation $o_9$ (UNION DISTINCT, in: msg7, msg6, out: msg8), Assign $o_{10}$ (in: msg8, out: msg9), and Invoke $o_{11}$ (service s6, in: msg9).

(b) Plan $P_6'$: the same operators, with Orderby $o_{12}$, $o_{13}$, and $o_{16}$ inserted on msg2, msg4, and msg6, and both Setoperation operators marked (Merge).

Figure 3.18: Example Setoperation Type Selection

the techniques orderby insertion and setoperation type selection, we created the rewritten plan $P_6'$ shown in Figure 3.18(b). Here, we use the efficient merge algorithm for both Setoperation operators and hence must sort all three input data sets. Sorting the result of the first Setoperation operator is not required because the output of the merge algorithm is already ordered. Consider two cases with different statistics for the input and output cardinalities of the Setoperation operator $o_8$. Figure 3.19 (left) shows the abstract costs of the two possible subplans, $P_6$: $(o_8)$ versus $P_6'$: $(o'_{12}, o'_{13}, o'_8)$, in both cases.

| Case | $\lvert ds_{in1}(o_8) \rvert$ | $\lvert ds_{in2}(o_8) \rvert$ | $\lvert ds_{out}(o_8) \rvert$ | $C(o_8)$ | $C(o'_{12}, o'_{13}, o'_8)$ |
|------|------|------|------|------|------|
| 1 | 1,000 | 1,000 | 1,000 | 501,000 | 21,932 |
| 2 | 1,000 | 10 | 1,000 | 6,000 | 11,009 |

Figure 3.19: Example Setoperation Cost Comparison

We observe that the optimality of these subplans depends on the current workload characteristics, while the subplan $(o'_{12}, o'_{13}, o'_8)$ is more robust⁶ over arbitrary statistic ranges than the subplan $(o_8)$, as shown in Figure 3.19 (right).
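The cost figures of the two cases can be reproduced with a few lines of Python. The following is a minimal sketch, not the cost model of this thesis: the three formulas are assumptions inferred from the numbers in Figure 3.19 (left), and all function names are hypothetical. The helper `break_even_in2` additionally holds the output cardinality fixed, which is a simplification.

```python
import math

# ASSUMPTION: these abstract cost formulas are inferred from the numbers in
# Figure 3.19 (left); they are not quoted from the cost model itself.

def cost_unsorted_setop(n_in1, n_in2, n_out):
    """UNION DISTINCT on unsorted inputs: |in1| + |in2| * |out| / 2."""
    return n_in1 + n_in2 * n_out / 2

def cost_orderby(n):
    """Sorting n records: n * log2(n)."""
    return n * math.log2(n) if n > 1 else 0.0

def cost_merge_setop(n_in1, n_in2):
    """Merge-based UNION DISTINCT on sorted inputs: |in1| + |in2|."""
    return n_in1 + n_in2

def cost_sorted_subplan(n_in1, n_in2):
    """Two Orderby operators followed by the merge-based Setoperation."""
    return (cost_orderby(n_in1) + cost_orderby(n_in2)
            + cost_merge_setop(n_in1, n_in2))

def break_even_in2(n_in1, n_out, limit=10_000):
    """Smallest |in2| for which the sorted subplan is cheaper (|out| held fixed)."""
    for n_in2 in range(1, limit + 1):
        if cost_sorted_subplan(n_in1, n_in2) < cost_unsorted_setop(n_in1, n_in2, n_out):
            return n_in2
    return None

# (|in1|, |in2|, |out|) for the two cases of Figure 3.19 (left)
cases = {"case 1": (1000, 1000, 1000), "case 2": (1000, 10, 1000)}
for name, (n_in1, n_in2, n_out) in cases.items():
    print(name,
          round(cost_unsorted_setop(n_in1, n_in2, n_out)),
          round(cost_sorted_subplan(n_in1, n_in2)))
```

Under these assumed formulas, the cost of the unsorted variant grows with the product of the second input and the output cardinality, whereas the sorted subplan grows only with n log n. This is consistent with the observation that the sorted subplan is less sensitive to the input statistics.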

Finally, this optimization technique is also applicable to other operators, such as Projection with duplicate elimination, or for forcing a merge-based join algorithm (WD9). Thus, in general, the technique WD8 (orderby insertion) should be applied before selecting different physical types of an operator.
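To illustrate why the merge variant needs sorted inputs but no additional sort between the two Setoperation operators, the following minimal sketch implements a merge-based UNION DISTINCT over two ascending lists. It is an illustration under stated assumptions, not the implementation used in this thesis: the function name is hypothetical, and plain integers stand in for the records of the messages.

```python
def union_distinct_merge(left, right):
    """Merge two ascending-sorted lists into one sorted, duplicate-free list.

    Runs in a single pass over both inputs, i.e., in O(|left| + |right|).
    """
    result = []
    i = j = 0
    while i < len(left) or j < len(right):
        # Take the smaller head element; fall back to the non-exhausted input.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            candidate = left[i]
            i += 1
        else:
            candidate = right[j]
            j += 1
        # The output is built in ascending order, so a duplicate can only
        # be equal to the element appended last.
        if not result or result[-1] != candidate:
            result.append(candidate)
    return result

# Hypothetical stand-ins for the sorted messages msg2, msg4, and msg6.
msg2 = sorted([5, 1, 3, 3])
msg4 = sorted([4, 3, 2])
msg6 = sorted([6, 1, 5])

msg7 = union_distinct_merge(msg2, msg4)  # first Setoperation
msg8 = union_distinct_merge(msg7, msg6)  # second Setoperation; msg7 is not re-sorted
print(msg8)
```

Because each merge emits its result in ascending order, the intermediate result msg7 can be passed directly to the second merge, which mirrors the three (rather than four) Orderby operators of the rewritten plan.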

To summarize, we presented selected control-flow- and data-flow-oriented optimization<br />

⁶ Robustness is an alternative optimization objective, which is beyond the scope of this thesis. However, in contrast to existing work [ABD+10], we would identify these robust plans (plans that are insensitive to input statistics) by simply choosing one of the plans with the lowest asymptotic time complexity with regard to the overall abstract cost functions of all plan operators.
