25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...

4 Vectorizing Integration Flows

Figure 4.21: Scalability Comparison with Different Influencing Factors. Panels: (a) Scalability with d, (b) Variance with d, (c) Scalability with n, (d) Scalability with m, (e) Scalability with t, (f) Scalability with q.

In Figure 4.21(a), we scaled the data size d of the XML input messages from 100 kB to
700 kB and measured the execution time (elapsed time) of 250 plan instances (n = 250)
needed by the three different execution models. We fixed m = 5, t = 0, n = 250 and
q = 50. We observe that all three execution models scale linearly with the data size
and that significant improvements can be achieved with vectorization. The absolute
improvement increases with increasing data size. Further, in Figure 4.21(b), we
illustrate the variance over all 20 repetitions of this sub-experiment. The variance of the
instance-based execution is minimal, while the variance of both vectorized models is higher
due to the unpredictable influence of thread scheduling by the operating system. Cost-based
vectorization exhibits a significantly lower variance than full vectorization because
we use fewer threads and therefore reduce the thread scheduling influence.
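The measurement procedure described above (elapsed time per run, summarized over repeated runs) can be sketched as follows. This is only an illustrative sketch, not the harness used for the reported experiments: the function names, the `run_plan_instances` callable, and the choice of sample variance (divisor n - 1) are assumptions, since the text does not state which variance estimator was used.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_elapsed(run_plan_instances: Callable[[], None]) -> float:
    """Return the wall-clock time in seconds of one sub-experiment run.

    `run_plan_instances` is a hypothetical callable that executes all
    plan instances of one run (for example, n = 250 instances).
    """
    start = time.perf_counter()
    run_plan_instances()
    return time.perf_counter() - start


def summarize_repetitions(timings: List[float]) -> Dict[str, float]:
    """Mean and variance of the elapsed times over all repetitions.

    Assumption: sample variance (divisor n - 1) is used here.
    """
    return {
        "mean": statistics.mean(timings),
        "variance": statistics.variance(timings),
    }


def run_experiment(
    run_plan_instances: Callable[[], None], repetitions: int = 20
) -> Dict[str, float]:
    """Repeat one sub-experiment and summarize its elapsed times."""
    timings = [measure_elapsed(run_plan_instances) for _ in range(repetitions)]
    return summarize_repetitions(timings)
```

Calling `run_experiment` once per execution model with the same workload would yield one mean and one variance per model, which is the kind of comparison shown in Figures 4.21(a) and 4.21(b).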

Now, we fix d = 1 (lowest improvement in 4.21(a)), t = 0, n = 250 and q = 50 in order
to investigate the influence of m. In Figure 4.21(d), we vary m from 5 to 35 operators, as
already mentioned for the experimental setup. Interestingly, not only the absolute but also
