25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...



approach was designed for multi- and many-core systems, in general, it can be extended to the distributed case as well. There, extensions with regard to the communication costs between several nodes as well as heterogeneous hardware (different execution times on different server nodes) would be required. However, in this distributed setting, the cost-based vectorization would be even more important because the number of involved server nodes could be reduced without sacrificing the degree of parallelism.

4.6 Experimental Evaluation

In this section, we provide experimental evaluation results for both the full vectorization and the cost-based vectorization of integration flows. The major perspectives of this evaluation are (1) the performance in terms of message throughput, (2) the influence on latency times of single messages, as well as (3) the optimization overhead and influences of parameterization. In general, the evaluation shows that:

• Significant performance improvements in the sense of increased message throughput are achievable. The benefit of vectorization increases with an increasing number of operators, with increasing data size, and with an increasing number of plan instances.

• The latency of single messages is moderately increased by vectorization. However, due to Little's Law [Lit61], under a high message load, the total latency time (including waiting time) is reduced by vectorization.

• The deployment and optimization overhead imposed by vectorization is moderate as well. In addition to the parameters of periodical optimization, the cost-constraint parameter λ also has a high influence on the resulting performance. Typically, the default setting of λ = 0 leads to the highest performance.

• The cost-based optimization typically finds the globally optimal plan with respect to the number of execution buckets. Thus, the number of buckets should not be restricted.
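The latency claim in the second bullet can be illustrated with a small back-of-the-envelope model. The following is a minimal sketch, not the model used in the evaluation: it assumes a burst of messages that all arrive at time 0, a single worker for the unoptimized case, and an idealized pipeline with one thread per operator for the vectorized case. The function names and the numbers (10 ms per message, 5 operators, 0.5 ms queuing overhead per stage) are hypothetical.

```python
def instance_based_latencies(n_messages, service_time):
    """Total latency (waiting + processing) per message when a single
    worker processes the messages one after another (assumed model)."""
    return [(i + 1) * service_time for i in range(n_messages)]


def vectorized_latencies(n_messages, service_time, n_operators, queue_overhead):
    """Total latency per message in an idealized pipeline with one thread
    per operator; each stage pays a hypothetical queuing overhead."""
    stage_time = service_time / n_operators + queue_overhead
    first_done = n_operators * stage_time  # time for one message to pass all stages
    # After the pipeline is filled, one message completes every stage_time.
    return [first_done + i * stage_time for i in range(n_messages)]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    for n in (1, 100):
        inst = instance_based_latencies(n, service_time=10.0)
        vec = vectorized_latencies(n, service_time=10.0, n_operators=5,
                                   queue_overhead=0.5)
        print(n, mean(inst), mean(vec))
```

Under these assumptions, a single message takes 12.5 ms in the pipeline instead of 10 ms, which matches the moderate increase in single-message latency. For a burst of 100 messages, the mean total latency drops from 505 ms to 136.25 ms, because the higher throughput shortens the waiting time of every queued message.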

Finally, we can state that the cost-based vectorization achieves significant throughput improvements, while accepting moderate additional latency for single messages. In conclusion, this concept can be applied by default if a high load of plan instances exists and moderate latency time is acceptable. The theoretical optimality and latency guarantees also hold under experimental performance evaluation.

The evaluation is structured as follows. First, we present the end-to-end comparison of unoptimized and vectorized execution using our running example plans. Second, we use a set of additional template plans in order to evaluate the aspects of throughput improvement, latency time, and optimization overhead in more detail on plans with a variable number of operators. Third, we present evaluation results with regard to multiple deployed plans.

Experimental Setting

We implemented the presented approach within our WFPE (workflow process engine). In general, the WFPE uses compiled plans and an instance-based execution model. Then, we integrated components for the full vectorization (VWFPE) and for the cost-based vectorization (CBVWFPE). For this purpose, new deployment functionalities have been introduced and several changes in the runtime environment were required because those plans are executed in an interpreted fashion.
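To make the difference between the execution models concrete, the following minimal sketch contrasts an instance-based run with a pipelined run in which groups of operators ("execution buckets") are each driven by their own thread and connected by queues. It is an illustration under assumptions, not the WFPE implementation: the function names, the representation of operators as plain Python callables, and the mapping of one thread per bucket are all hypothetical.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a bucket thread to shut down


def run_instance_based(operators, messages):
    """Instance-based model: each message passes through all operators in turn."""
    results = []
    for msg in messages:
        for op in operators:
            msg = op(msg)
        results.append(msg)
    return results


def run_vectorized(buckets, messages):
    """Pipelined model (assumed): one thread per bucket, FIFO queues in between."""
    queues = [queue.Queue() for _ in range(len(buckets) + 1)]

    def worker(bucket, inbox, outbox):
        while True:
            msg = inbox.get()
            if msg is _STOP:
                outbox.put(_STOP)  # propagate shutdown to the next bucket
                return
            for op in bucket:
                msg = op(msg)
            outbox.put(msg)

    threads = [
        threading.Thread(target=worker, args=(b, queues[i], queues[i + 1]))
        for i, b in enumerate(buckets)
    ]
    for t in threads:
        t.start()
    for msg in messages:
        queues[0].put(msg)
    queues[0].put(_STOP)

    results = []
    while True:
        msg = queues[-1].get()
        if msg is _STOP:
            break
        results.append(msg)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Three hypothetical operators standing in for plan operators.
    ops = [lambda m: m + 1, lambda m: m * 2, lambda m: m - 3]
    msgs = list(range(5))
    full = [[op] for op in ops]            # one bucket per operator
    cost_based = [ops[:2], ops[2:]]        # operators merged into fewer buckets
    print(run_instance_based(ops, msgs))
    print(run_vectorized(full, msgs))
    print(run_vectorized(cost_based, msgs))
```

In this sketch, using one bucket per operator corresponds to the fully vectorized case, while merging operators into fewer buckets mimics the cost-based variant that uses fewer threads. Because each bucket has a single thread and the queues are FIFO, the message order is preserved and all three runs produce the same results.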
