25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


4.6 Experimental Evaluation

the relative improvement of vectorization increases with an increasing number of operators.

Figure 4.21(e) shows the impact of the time interval t between the initiation of two plan instances. For this experiment, we fixed d = 1, m = 5, n = 250, q = 50 and varied t from 10 ms to 70 ms. There is almost no difference between full vectorization and cost-based vectorization. However, the absolute improvement of the vectorized approaches over the instance-based approach decreases slightly with increasing t. The explanation is that the time interval has no impact on the instance-based execution. In contrast, the vectorized approach depends on t because the highest improvement is achieved with full pipeline utilization.
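The dependence on t can be illustrated with a small deterministic model. This is only a sketch under assumptions that are not taken from the evaluation: the instance-based model runs one plan instance at a time, the vectorized model runs one stage per operator with unbounded queues, and the per-operator costs are made-up values. Only the direction of the trend is meaningful, not the magnitudes.

```python
def pipeline_makespan(n, t, costs):
    """Completion time of the last of n messages in an operator pipeline.

    Message i arrives at time i * t. Each operator is its own stage and
    handles one message at a time; queues are unbounded in this toy model.
    """
    stage_free = [0] * len(costs)  # time at which each stage becomes idle
    done = 0
    for i in range(n):
        ready = i * t  # arrival time of message i
        for j, cost in enumerate(costs):
            start = max(ready, stage_free[j])  # wait for message and stage
            ready = start + cost
            stage_free[j] = ready
        done = ready
    return done


def serial_makespan(n, t, costs):
    """Completion time when one instance at a time runs all operators."""
    per_message = sum(costs)
    free = 0  # time at which the single executor becomes idle
    for i in range(n):
        free = max(i * t, free) + per_message
    return free


if __name__ == "__main__":
    costs = [20] * 5  # hypothetical: m = 5 operators, 20 ms each
    for t in (10, 70):
        gain = serial_makespan(250, t, costs) - pipeline_makespan(250, t, costs)
        print(f"t = {t} ms: absolute improvement = {gain} ms")
```

With these hypothetical costs, the serial makespan is the same for t = 10 and t = 70 because the executor is always backlogged, while the pipeline runs partly idle at t = 70, so the absolute improvement shrinks.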

Further, we analyze the influence of the number of instances n, as illustrated in Figure 4.21(c). Here, we fixed d = 1, m = 5, t = 0, q = 50 and varied n with n ∈ {10, 100, 200, 300, 400, 500, 600, 700}. We observe that the relative improvement of vectorized over instance-based execution increases with increasing n, due to parallelism of plan instances. It is interesting to note that the fully vectorized solution performs slightly better for small n. With increasing n, however, the cost-based vectorized approach performs best because the maximum queue constraint q is then reached and we observe the influence of the already mentioned convoy effect.
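The growth of the relative improvement with n can be made plausible with a back-of-the-envelope calculation. This is only a sketch under assumptions that are not taken from the evaluation itself: t = 0, unbounded queues (so the queue constraint q and the convoy effect are ignored), one plan instance at a time in the instance-based model, and fixed hypothetical per-operator costs c_1, ..., c_m.

```latex
\begin{align*}
T_{\text{inst}}(n) &= n \cdot C, \qquad C = \sum_{j=1}^{m} c_j \\
T_{\text{vect}}(n) &= C + (n-1) \cdot c_{\max}, \qquad c_{\max} = \max_{j} c_j \\
1 - \frac{T_{\text{vect}}(n)}{T_{\text{inst}}(n)}
  &= \left(1 - \frac{1}{n}\right)\left(1 - \frac{c_{\max}}{C}\right)
\end{align*}
```

Under these assumptions, the relative improvement grows monotonically with n and approaches 1 - c_max / C. The effects of q described above lie outside this simplified model.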

Figure 4.21(f) illustrates the influence of the maximum queue size q, which we varied from 10 to 70. Here, we fixed d = 1, m = 5, t = 0 and n = 250. An increasing q slightly decreases the execution time of vectorized plans. The reasons are (1) fewer request notifications of waiting threads and (2) better load balancing by the thread scheduler.
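To make the role of the maximum queue size concrete, the following sketch runs one thread per operator and connects the threads with queues bounded by q. It is an illustration only, not the engine evaluated here: the names `run_pipeline` and `_stage` are hypothetical, and Python threads stand in for whatever threading model the real system uses.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


def _stage(op, inbox, outbox):
    """Apply op to every message from inbox and forward the result."""
    while True:
        msg = inbox.get()
        if msg is _STOP:
            outbox.put(_STOP)  # propagate shutdown downstream
            return
        outbox.put(op(msg))  # blocks while outbox already holds q messages


def run_pipeline(ops, messages, q):
    """Run messages through one thread per operator, queues bounded by q."""
    queues = [queue.Queue(maxsize=q) for _ in range(len(ops) + 1)]
    workers = [
        threading.Thread(target=_stage, args=(op, queues[i], queues[i + 1]))
        for i, op in enumerate(ops)
    ]

    def feed():
        # Blocks whenever the first queue is full (back-pressure).
        for msg in messages:
            queues[0].put(msg)
        queues[0].put(_STOP)

    feeder = threading.Thread(target=feed)
    for th in workers + [feeder]:
        th.start()

    results = []
    while True:  # drain concurrently so the bounded queues cannot deadlock
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)

    for th in workers + [feeder]:
        th.join()
    return results
```

A call such as `run_pipeline([lambda x: x + 1, lambda x: x * 2], range(250), q=50)` returns the messages in arrival order because each stage is a single FIFO consumer. A smaller q makes producers block, and be woken up again, more often, which is the kind of notification overhead the text refers to.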

Message Latency<br />

Vectorization is a trade-off between improved message throughput and increased latency of single messages. Therefore, we now investigate the latency of single messages.

Figure 4.22 illustrates the differences between the three execution models, instance-based (WFPE), vectorized (VWFPE) and cost-based vectorized (CBVWFPE), with regard to the latency of single messages (including inbound waiting time) and the execution time of single messages (without inbound waiting time). For this purpose, we fixed d = 1, t = 0, q = 50 and varied the number of operators m, similar to Figure 4.21(a). All results are illustrated as error bars using the minimum, median (50% quantile) and maximum latency or execution time, respectively. In this experiment, all n = 250 messages arrive simultaneously in the system. The latency includes the waiting time and the execution time in the sense of end-to-end latency. In contrast, the execution time shows how long it takes to process a single message, without waiting time at the server's inbound message queues.
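The two metrics and their minimum/median/maximum summary can be sketched as follows. The function names and the timestamps in the usage note are hypothetical and only illustrate the definitions given above; they are not measurements from Figure 4.22.

```python
from statistics import median


def summarize(values):
    """Return (minimum, median, maximum), as used for the error bars."""
    return min(values), median(values), max(values)


def latency_and_execution(arrivals, starts, ends):
    """Summarize per-message latency and execution time.

    arrivals: time each message enters the inbound queue
    starts:   time processing of each message begins
    ends:     time processing of each message finishes
    """
    # Latency is end-to-end: it includes the inbound waiting time.
    latency = [end - arr for arr, end in zip(arrivals, ends)]
    # Execution time excludes the inbound waiting time.
    execution = [end - start for start, end in zip(starts, ends)]
    return summarize(latency), summarize(execution)
```

For example, if five messages arrive simultaneously at time 0 and are processed one after another in 10 ms each (`arrivals = [0] * 5`, `starts = [0, 10, 20, 30, 40]`, `ends = [10, 20, 30, 40, 50]`), the execution time is 10 ms for every message, while the latency ranges from 10 ms to 50 ms with a median of 30 ms. This shows how latency under simultaneous arrival depends on throughput even when the per-message execution time is constant.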

First, we observe that the instance-based execution allows for the lowest minimum latency (first processed message), while both the vectorized and the cost-based vectorized execution require a higher initial time for processing the first messages. This is caused by queue management and synchronization between threads. It is important to note that the cost-based vectorized model exhibits lower initial latencies than the fully vectorized model. Further, we see that the median and maximum latencies are higher for the instance-based execution model because they are directly influenced by the achieved throughput.

Second, it is important that the execution time of a single message is much smaller when using the instance-based execution model rather than the vectorized model. The reason is that the vectorized execution is dominated by the most time-consuming operator and requires additional effort for thread queue management and synchronization. In addition, for the chosen queue size of q = 50, most messages wait just in front of this

