Cost-Based Optimization of Integration Flows - Datenbanken ...
4.6 Experimental Evaluation
the relative improvement of vectorization increases with an increasing number of operators.

Figure 4.21(e) shows the impact of the time interval t between the initiation of two
plan instances. For that, we fixed d = 1, m = 5, n = 250, q = 50 and varied t
from 10 ms to 70 ms. There is almost no difference between full vectorization and
cost-based vectorization. However, the absolute improvement of the vectorized approaches
over instance-based execution decreases slightly with increasing t. The explanation is
that the time interval has no impact on the instance-based execution, whereas the
vectorized approach depends on t because the highest improvement is achieved with full
pipeline utilization.
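
The link between t and pipeline utilization can be illustrated with a deliberately simple model. The sketch below is an assumption made for illustration, not the cost model or the measurement setup used in the experiments: it treats a vectorized plan as a linear pipeline with one thread per operator, unbounded queues, and fixed per-operator costs, and applies the standard pipeline recurrence. The names `pipeline_finish_times` and `bottleneck_utilization` are hypothetical.

```python
def pipeline_finish_times(op_costs, n, t):
    """Finish time of each of n messages in a linear pipeline.

    op_costs: fixed processing cost per operator (assumed constant).
    t: time interval between the initiation of two plan instances.
    Toy model: one thread per operator, unbounded queues between them.
    """
    prev = [0] * len(op_costs)  # when each operator finished the previous message
    finish = []
    for i in range(n):
        ready = i * t  # message i enters the pipeline at time i * t
        row = []
        for j, cost in enumerate(op_costs):
            # An operator starts once the message has arrived from upstream
            # and the operator itself is free again.
            start = max(ready, prev[j])
            ready = start + cost
            row.append(ready)
        prev = row
        finish.append(ready)
    return finish


def bottleneck_utilization(op_costs, n, t):
    """Fraction of the total elapsed time the slowest operator is busy."""
    makespan = pipeline_finish_times(op_costs, n, t)[-1]
    return n * max(op_costs) / makespan
```

In this toy model, the slowest operator stays close to fully busy as long as t is smaller than its cost; once t exceeds that cost, the pipeline idles between messages and the benefit of pipelining shrinks, which is consistent with the trend described above.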
Further, we analyze the influence of the number of instances n, as illustrated in Figure
4.21(c). Here, we fixed d = 1, m = 5, t = 0, q = 50 and varied n with
n ∈ {10, 100, 200, 300, 400, 500, 600, 700}. We observe that the relative improvement
of vectorized over instance-based execution increases with increasing n, due
to the parallelism of plan instances. Interestingly, the fully vectorized
solution performs slightly better for small n. For larger n, however, the cost-based
vectorized approach performs best because the maximum queue constraint q is
reached and we observe the influence of the already mentioned convoy effect.
Figure 4.21(f) illustrates the influence of the maximum queue size q, which we varied
from 10 to 70. Here, we fixed d = 1, m = 5, t = 0, and n = 250. An increasing q slightly
decreases the execution time of vectorized plans. The reasons are (1) fewer request
notifications of waiting threads and (2) better load balancing by the thread scheduler.
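
To make the role of q concrete, the following minimal sketch shows one way a maximum queue constraint can be realized between operator threads. It is an illustrative assumption, not the implementation evaluated here: each operator runs in its own thread, a bounded `queue.Queue(maxsize=q)` sits in front of each operator, and a producer blocks whenever the queue of its successor is full. The names `run_vectorized` and `_operator_worker` are hypothetical.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each operator thread to shut down


def _operator_worker(fn, inbox, outbox):
    """Apply fn to every message from inbox and forward the result."""
    while True:
        msg = inbox.get()  # blocks while the inbox is empty
        if msg is _STOP:
            outbox.put(_STOP)  # propagate shutdown downstream
            return
        outbox.put(fn(msg))  # blocks while a bounded outbox is full


def run_vectorized(operators, messages, q):
    """Push messages through a pipeline of operator threads.

    q is the maximum queue size in front of each operator (q >= 1).
    The final result queue is left unbounded so that the last operator
    never blocks while the caller is still feeding messages.
    """
    queues = [queue.Queue(maxsize=q) for _ in operators] + [queue.Queue()]
    threads = [
        threading.Thread(target=_operator_worker,
                         args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(operators)
    ]
    for th in threads:
        th.start()
    for msg in messages:
        queues[0].put(msg)  # blocks once q messages are already waiting
    queues[0].put(_STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for th in threads:
        th.join()
    return results
```

With a small q, producers hit the full-queue condition more often and must wait to be woken up again; a larger q reduces how often that happens, which matches reason (1) above in spirit.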
Message Latency<br />
Vectorization is a trade-off between improved message throughput and increased latency
of single messages. Therefore, we now investigate the latency of single messages.
Figure 4.22 illustrates the differences among the three execution models, instance-based
(WFPE), vectorized (VWFPE), and cost-based vectorized (CBVWFPE), with regard to the latency
of single messages (including inbound waiting time) and the execution time of single messages
(without inbound waiting time). For this purpose, we fixed d = 1, t = 0, q = 50 and
varied the number of operators m, similar to Figure 4.21(a). All results are shown as
error bars using the minimum, median (50% quantile), and maximum latency/execution
time, respectively. In this experiment, all n = 250 messages arrive simultaneously in the
system. The latency includes the waiting time and the execution time, in the sense of
end-to-end latency. In contrast, the execution time shows how long it takes to process a
single message, without waiting time at the server's inbound message queues.
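
The distinction between the two metrics can be sketched as follows. This is a minimal illustration under assumed names: `MessageTrace` with its `arrival`, `start`, and `end` timestamps is a hypothetical record, not a structure taken from the evaluated system. Latency is measured from arrival at the inbound queue, execution time from the start of processing, and both are summarized by minimum, median, and maximum as in the error bars.

```python
from statistics import median
from typing import NamedTuple


class MessageTrace(NamedTuple):
    """Hypothetical per-message timestamps (same time unit throughout)."""
    arrival: float  # message enters the inbound queue
    start: float    # processing of the message begins
    end: float      # processing of the message completes


def summarize(values):
    """Return (minimum, median, maximum) of the given values."""
    values = list(values)
    return min(values), median(values), max(values)


def latency_and_execution(traces):
    """Summarize end-to-end latency and pure execution time."""
    latency = summarize(tr.end - tr.arrival for tr in traces)  # incl. waiting
    execution = summarize(tr.end - tr.start for tr in traces)  # excl. waiting
    return latency, execution
```

For example, three messages that arrive simultaneously and are processed one after another, each taking two time units, all have the same execution time, while their latencies grow with the time spent waiting.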
First, we observe that the instance-based execution allows for the lowest minimum latency
(first processed message), while both the vectorized and the cost-based vectorized
execution require a higher initial time for processing the first messages. This is caused by
queue management and synchronization between threads. It is important to note that
the cost-based vectorized model exhibits lower initial latencies than the fully vectorized
model. Further, we see that the median and maximum latencies are higher for the
instance-based execution model because latency is directly influenced by the achieved
throughput.
Second, it is important to note that the execution time of a single message is much smaller
with the instance-based execution model than with the vectorized model. The reason
is that the vectorized execution is dominated by the most time-consuming operator and
requires additional effort for thread queue management and synchronization. In addition,
for the chosen queue size of q = 50, most messages wait directly in front of this