Cost-Based Optimization of Integration Flows - Datenbanken ...
3 Fundamentals of Optimizing Integration Flows
plan with Orderby and Setoperation (UNION DISTINCT MERGE) operators. The
sorted output order between the two Setoperation operators was exploited by the
optimizer in order to reduce the number of required Orderby operators. Finally, the
Setoperation and Orderby operators were parallelized by WC2.
• P7: Similar to P6, this plan was also affected by WC1, which rescheduled the
subflows of the existing Fork operator. Furthermore, the techniques WD10 and WD8
changed the Join operators o14, o15, and o16 from nested loop joins to subplans of
Orderby and merge join operators. Finally, note that join enumeration did not
result in a new join order.
• P8: This plan was mainly affected by control-flow-oriented optimization techniques.
In detail, the operator sequence (o3-o9) was rewritten into two parallel subflows of a
Fork operator. The last operator o10 was not included because both o9 and o10 are
writing interactions to the same external system. Finally, the technique WC1
was applied once again for rescheduling the created subflows.
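The change from nested loop joins to Orderby and merge join subplans described for P7 can be illustrated with a minimal sketch. It is not the optimizer's implementation: the function names, the tuple-based rows, and the in-memory `sorted` calls standing in for the Orderby operators are all assumptions made for illustration.

```python
from typing import Any, Callable, List, Sequence, Tuple

Row = Tuple[Any, ...]
Key = Callable[[Row], Any]


def nested_loop_join(left: Sequence[Row], right: Sequence[Row],
                     key_left: Key, key_right: Key) -> List[Tuple[Row, Row]]:
    """Baseline equi-join: compares every left row with every right row."""
    return [(l, r) for l in left for r in right if key_left(l) == key_right(r)]


def sort_merge_join(left: Sequence[Row], right: Sequence[Row],
                    key_left: Key, key_right: Key) -> List[Tuple[Row, Row]]:
    """Equi-join as a subplan of two sorts followed by a single merge pass."""
    # The two sorts play the role of the Orderby operators (assumption:
    # both inputs fit in memory and the join keys are totally ordered).
    ls = sorted(left, key=key_left)
    rs = sorted(right, key=key_right)

    out: List[Tuple[Row, Row]] = []
    i = j = 0
    while i < len(ls) and j < len(rs):
        kl, kr = key_left(ls[i]), key_right(rs[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # Find the run of right rows that share the current key.
            j_end = j
            while j_end < len(rs) and key_right(rs[j_end]) == kl:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(ls) and key_left(ls[i]) == kl:
                for r in rs[j:j_end]:
                    out.append((ls[i], r))
                i += 1
            j = j_end
    return out
```

Both functions return the same set of row pairs, but the merge variant replaces the quadratic comparison loop with two sorts and one linear pass. If an input is already sorted on the join key, the corresponding sort can be dropped.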
In addition to these consistent optimization benefits, Figure 3.22(b) shows the required
cumulative optimization time. The significant differences between the optimization times
of different plans are caused by two facts. First, the total execution time differs between
plans, which influences the number of re-optimization steps required in this scenario because
these optimization steps are triggered periodically. Second, different techniques (with different
time complexity) are applied according to the specific operator types used in the concrete
plan. For example, plans P4 and P7 are dominated by the costs for join enumeration,
where we did not find different join orders due to ensuring semantic correctness (P4)
and the chain query type (P7).
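The first of these two facts can be made concrete with a small back-of-the-envelope sketch. It assumes a fixed re-optimization interval and a constant cost per re-optimization step. Both are simplifications introduced here: as noted above, the cost per step actually depends on the techniques applied to the concrete plan. The function names and parameters are hypothetical.

```python
def reoptimization_steps(total_execution_time_ms: float, interval_ms: float) -> int:
    """Number of periodic re-optimization steps triggered during a run.

    Assumption: a step fires each time a full interval has elapsed.
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return int(total_execution_time_ms // interval_ms)


def cumulative_optimization_time_ms(total_execution_time_ms: float,
                                    interval_ms: float,
                                    cost_per_step_ms: float) -> float:
    """Cumulative optimization time under a constant per-step cost (assumption)."""
    return reoptimization_steps(total_execution_time_ms, interval_ms) * cost_per_step_ms
```

Under these assumptions, a plan whose workload runs three times as long triggers three times as many re-optimization steps, so its cumulative optimization time grows accordingly even if each step costs the same.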
Putting it all together, we can conclude that execution time reductions are possible,
while only a fairly low overhead is required by periodical re-optimization.
Scalability<br />
In addition to the presented comparison of optimized and unoptimized execution, scalability
is one of the most important aspects. Hence, we conducted a series of experiments
that examine scalability with regard to an increasing number of operators as well as
an increasing input data size.
Figure 3.23: Speedup of Rewriting Sequences to Parallel Flows