Cost-Based Optimization of Integration Flows - Datenbanken ...
approach was designed for multi- and many-core systems, in general, it can be extended to the distributed case as well. There, extensions with regard to the communication costs between several nodes as well as heterogeneous hardware (different execution times on different server nodes) would be required. However, in this distributed setting, the cost-based vectorization would be even more important because the number of involved server nodes could be reduced without sacrificing the degree of parallelism.
4.6 Experimental Evaluation
In this section, we provide experimental evaluation results for both the full vectorization and the cost-based vectorization of integration flows. The major perspectives of this evaluation are (1) the performance in terms of message throughput, (2) the influence on the latency of single messages, and (3) the optimization overhead and the influence of parameterization. In general, the evaluation shows that:
• Significant performance improvements in the sense of increased message throughput are achievable. The benefit of vectorization grows with the number of operators, the data size, and the number of plan instances.
• The latency of single messages is moderately increased by vectorization. However, due to Little's Law [Lit61], under a high message load the total latency (including waiting time) is reduced by vectorization.
• The deployment and optimization overhead imposed by vectorization is moderate as well. In addition to the parameters of periodical optimization, the cost-constraint parameter λ strongly influences the resulting performance. Typically, the default setting of λ = 0 yields the highest performance.
• The cost-based optimization typically finds the globally optimal plan with regard to the number of execution buckets. Thus, the number of buckets should not be restricted.
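The latency argument in the second point can be made concrete with a back-of-the-envelope application of Little's Law. The following sketch uses purely hypothetical numbers (not taken from the evaluation) to show why a moderate increase in per-message processing latency can still reduce total latency under high load:

```python
# Back-of-the-envelope illustration of Little's Law (L = lambda * W);
# all numbers are hypothetical, not taken from the evaluation.

def little_l(arrival_rate: float, time_in_system: float) -> float:
    """Little's Law: average number of messages in the system L = lambda * W."""
    return arrival_rate * time_in_system

# Instance-based execution: the whole plan (say, 10 ms) runs per message,
# so throughput is capped at 100 msg/s.
instance_throughput = 1 / 0.010

# Vectorized execution: the plan is split into 4 pipeline stages of ~3 ms
# each; per-message processing latency rises to ~12 ms, but throughput is
# bounded only by the slowest stage, i.e., ~333 msg/s.
vectorized_throughput = 1 / 0.003

# At an offered load of 200 msg/s, the instance-based engine saturates and
# waiting time grows without bound, while the vectorized engine keeps up:
assert vectorized_throughput > 200 > instance_throughput

# If the vectorized engine holds W at ~12 ms under that load, only about
# 2.4 messages are in the system on average (L = 200 * 0.012).
assert round(little_l(200, 0.012), 1) == 2.4
```

In other words, once the arrival rate exceeds the instance-based throughput, the waiting time dominates the total latency, and the vectorized variant wins despite its higher processing latency.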
Finally, we can state that the cost-based vectorization achieves significant throughput improvements, while accepting a moderate additional latency for single messages. In conclusion, this concept can be applied by default if there is a high load of plan instances and a moderate latency is acceptable. The theoretical optimality and latency guarantees were also confirmed by the experimental evaluation.
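To make the notion of execution buckets more tangible, consider the following simplified sketch (our own illustration; it ignores the constraint λ and the actual cost model): a sequence of operator costs is partitioned into k contiguous buckets such that the most expensive bucket, which forms the pipeline bottleneck, is as cheap as possible.

```python
# Simplified, hypothetical sketch of operator-to-bucket assignment: exhaustively
# place k-1 cuts between operators and minimize the bottleneck bucket cost.
from itertools import combinations

def bottleneck(costs, cuts):
    """Cost of the most expensive bucket for a given set of cut positions."""
    bounds = [0, *cuts, len(costs)]
    return max(sum(costs[a:b]) for a, b in zip(bounds, bounds[1:]))

def best_partition(costs, k):
    """Best contiguous partition of the operator costs into k buckets."""
    return min(
        (bottleneck(costs, cuts), cuts)
        for cuts in combinations(range(1, len(costs)), k - 1)
    )

# Example: five operators with hypothetical execution costs.
costs = [4, 2, 3, 7, 1]
assert best_partition(costs, 1)[0] == 17   # one bucket: no pipelining
assert best_partition(costs, 2)[0] == 9    # e.g., [4, 2, 3] | [7, 1]
assert best_partition(costs, 4)[0] == 7    # bounded by the costliest operator
```

The example shows why restricting the number of buckets can hurt: the bottleneck cost only stops improving once it is bounded by the single most expensive operator.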
The evaluation is structured as follows. First, we present an end-to-end comparison of unoptimized and vectorized execution using our running example plans. Second, we use a set of additional template plans in order to evaluate throughput improvement, latency, and optimization overhead in more detail on plans with a variable number of operators. Third, we present evaluation results with regard to multiple deployed plans.
Experimental Setting<br />
We implemented the presented approach within our WFPE (workflow process engine). In general, the WFPE uses compiled plans and an instance-based execution model. We then integrated components for the full vectorization (VWFPE) and for the cost-based vectorization (CBVWFPE). For this purpose, new deployment functionality was introduced, and several changes to the runtime environment were required because vectorized plans are executed in an interpreted fashion.
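As a rough analogy for the two execution models (a sketch of the general idea only, not the actual WFPE/VWFPE implementation), the instance-based model runs the whole plan for each message, whereas the vectorized model runs each operator as its own thread, with operators connected by message queues:

```python
# Illustrative contrast of instance-based vs. vectorized (pipelined) plan
# execution; this is NOT the WFPE implementation, just the general pattern.
import queue
import threading

def make_stage(op, inbox, outbox):
    """One pipeline stage: a thread applying a single operator to messages."""
    def run():
        while True:
            msg = inbox.get()
            if msg is None:          # poison pill: propagate and shut down
                outbox.put(None)
                return
            outbox.put(op(msg))
    return threading.Thread(target=run)

def vectorized_execute(operators, messages):
    """Vectorized model: one thread per operator, queues in between."""
    queues = [queue.Queue() for _ in range(len(operators) + 1)]
    stages = [make_stage(op, queues[i], queues[i + 1])
              for i, op in enumerate(operators)]
    for s in stages:
        s.start()
    for m in messages:
        queues[0].put(m)
    queues[0].put(None)
    results = []
    while (r := queues[-1].get()) is not None:
        results.append(r)
    for s in stages:
        s.join()
    return results

def instance_execute(operators, messages):
    """Instance-based model: run the whole plan per message, one at a time."""
    out = []
    for m in messages:
        for op in operators:
            m = op(m)
        out.append(m)
    return out

ops = [lambda x: x + 1, lambda x: x * 2]
assert vectorized_execute(ops, [1, 2, 3]) == instance_execute(ops, [1, 2, 3])
```

In the pipelined variant, different operators process different messages concurrently, which is the source of the throughput gains discussed above.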