Cost-Based Optimization of Integration Flows - Datenbanken ...
5.5 Experimental Evaluation
data-driven integration flow use cases (plans P1, P2, P5, and P7), which have been described
in Section 2.4. Furthermore, we used the following scale factors: the number of
messages |M|, the message rate R, the selectivity sel with respect to the partitioning
attribute, the batch size k′, the message rate distribution function D, the maximum latency
constraint lc, and the data size d of input messages (in units of 100 kB).
End-to-End Comparison and Optimization Benefits
First, we investigate the end-to-end optimization benefit achieved by multi-flow optimization
(MFO) and the related optimization overhead. We compared multi-flow optimization
with unoptimized execution, with all other optimization techniques disabled. Similar
to the use case comparisons in Sections 3.5 and 4.6, we executed 20,000 plan instances for
each asynchronous, data-driven example plan (P1, P2, P5, and P7) and for both execution
models. We reused the same workload configuration as already presented (without correlations
and without workload changes). Furthermore, we fixed the cardinality of input
data sets to d = 1 (100 kB messages) and used an optimization interval of ∆t = 5 min, a sliding
window size of ∆w = 5 min, and EMA as the workload aggregation method. With regard to
multi-flow optimization, we did not use the computed waiting time but directly restricted
the batch size to k′ = 10 in order to achieve comparable results across the different plans.
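To make the workload aggregation step more concrete, the following is a minimal sketch of a textbook exponential moving average over execution statistics observed in one sliding window. It is not taken from the evaluated system: the function name, the smoothing factor `alpha`, and the sample values are hypothetical, and the actual parameterization of the EMA used in the experiments is not stated here.

```python
from typing import Iterable, List


def exponential_moving_average(samples: Iterable[float], alpha: float = 0.5) -> List[float]:
    """Return the running EMA of `samples`.

    `alpha` is a hypothetical smoothing factor in (0, 1]; larger values
    weight recent observations more strongly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    current = None
    for x in samples:
        # The first observation initializes the average; later ones are blended in.
        current = x if current is None else alpha * x + (1.0 - alpha) * current
        smoothed.append(current)
    return smoothed


# Hypothetical per-instance execution times (ms) observed inside one sliding window.
window = [10.0, 20.0, 20.0, 40.0]
ema = exponential_moving_average(window, alpha=0.5)
```

With these illustrative inputs, the last aggregated value lies between the older and the most recent observations, which is the smoothing behavior an optimizer would rely on when re-estimating costs at each optimization interval.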
Figure 5.15: Use Case Comparison of Multi-Flow Optimization. (a) Cumulative Execution Time, (b) Cumulative Optimization Time.
Figure 5.15(a) shows the resulting total execution times. We consistently observe
significant execution time reductions, which are achieved as follows:
• P1: The plan P1 benefits from MFO in several ways. First, the Switch operator
o2 is executed only once per message batch because the switch expression attribute
/material/type is used as the only partitioning attribute. Furthermore, the Assign
operators o4, o6, and o8 are also executed only once because their results are exclusively
used by the partition-aware Invoke operators o7 and o9. These writing Invoke operators
show an additional benefit because a single operator instance is used to process
all messages of a batch. Overall, this achieves a throughput improvement of 62%.
• P2: The plan P2 mainly benefits from executing the Invoke operator o3 and its
predecessor Assign operator o2 only once for a whole partition. There, the predicate
part /resultsets/resultset/row/A1 Custkey is used as the partitioning attribute.
Additional benefit is achieved by the final Assign and Invoke operators o5 and o6.
In total, an improvement of 53% has been achieved.
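The source of these savings can be illustrated with a small sketch that contrasts evaluating a routing decision once per message with evaluating it once per partition of a batch. This is only one possible reading of the mechanism under stated assumptions: the message layout, the `route` function standing in for a Switch operator, and the helper names are hypothetical and are not part of the evaluated system.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple

Message = Dict[str, str]


def partition_batch(batch: List[Message],
                    key: Callable[[Message], Hashable]) -> Dict[Hashable, List[Message]]:
    """Group the messages of one batch by their partitioning-attribute value."""
    partitions: Dict[Hashable, List[Message]] = defaultdict(list)
    for msg in batch:
        partitions[key(msg)].append(msg)
    return dict(partitions)


def route(material_type: str) -> str:
    """Hypothetical stand-in for a Switch operator: pick a branch by type."""
    return "branch_a" if material_type == "A" else "branch_b"


def run_per_message(batch: List[Message]) -> Tuple[int, List[Tuple[str, Message]]]:
    """Baseline: the routing decision is evaluated for every single message."""
    calls, routed = 0, []
    for msg in batch:
        calls += 1
        routed.append((route(msg["type"]), msg))
    return calls, routed


def run_per_partition(batch: List[Message]) -> Tuple[int, List[Tuple[str, Message]]]:
    """Batched: the routing decision is evaluated once per partition value."""
    calls, routed = 0, []
    for value, msgs in partition_batch(batch, key=lambda m: m["type"]).items():
        calls += 1  # a single evaluation covers all messages of the partition
        branch = route(value)
        routed.extend((branch, m) for m in msgs)
    return calls, routed


# Hypothetical batch of ten messages with two distinct partitioning values.
batch = [{"id": str(i), "type": "A" if i % 2 == 0 else "B"} for i in range(10)]
per_message_calls, routed_single = run_per_message(batch)
per_partition_calls, routed_batched = run_per_partition(batch)
```

Both variants route every message to the same branch. Only the number of operator evaluations differs, and it shrinks with the number of messages that share a partitioning-attribute value.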