25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


4.6 Experimental Evaluation

Plan with Restricted k

In order to reveal the characteristics of vectorizing multiple plans, we further evaluated the influence of restricting the number of execution buckets. This restriction is applied if the number of execution buckets chosen by cost-based vectorization exceeds the computed maximum number of execution buckets. All other aspects of vectorization for multiple plans are a combination of already presented effects.

Figure 4.25: Restricting k with Different Numbers of Operators and Data Sizes. (a) Number of Operators m. (b) Input Data Size d.

The first sub-experiment analyzes the influence of the number of execution buckets on the execution time of a message sequence with regard to a varying number of plan operators m. We fixed d = 1, t = 0, q = 50 and explicitly varied the number of execution buckets k. Figure 4.25(a) shows the resulting execution time for a message sequence of n = 250 messages. We observe that, up to a certain point, an increasing number of execution buckets leads to decreasing execution time. Beyond that point, a further increase of the number of execution buckets leads to increasing execution time. As a result, there is an optimal number of execution buckets, which increases with the number of operators. We annotated with k1, k2 and k3 the numbers of execution buckets that our A-CPV computed without restricting k. Note that for m = 5, at most k = 5 execution buckets can be used.
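
The trade-off described above can be illustrated with a toy model. The following Python sketch is not the cost model used by A-CPV. The function `toy_execution_time`, its parameters `op_cost` and `bucket_overhead`, and the assumption that the slowest bucket bounds pipeline throughput are hypothetical and chosen only to reproduce the qualitative shape of the curve (first decreasing, then increasing execution time).

```python
def toy_execution_time(m, k, n=250, op_cost=1.0, bucket_overhead=0.3):
    """Hypothetical cost of processing n messages with m operators in k buckets.

    Assumptions (not taken from the source): the slowest bucket holds
    ceil(m / k) operators and bounds the pipeline throughput, and every
    bucket adds a fixed per-message overhead for queueing and threading.
    """
    if not 1 <= k <= m:
        raise ValueError("k must satisfy 1 <= k <= m")
    slowest_bucket = -(-m // k) * op_cost   # ceil(m / k) operators per bucket
    overhead = k * bucket_overhead          # grows linearly with k
    return n * (slowest_bucket + overhead)


def best_k(m, **kwargs):
    """Exhaustively sweep k = 1..m and return the k with minimal toy cost."""
    return min(range(1, m + 1),
               key=lambda k: toy_execution_time(m, k, **kwargs))


if __name__ == "__main__":
    for m in (5, 10, 20):
        k_opt = best_k(m)
        print(m, k_opt, toy_execution_time(m, k_opt))
```

Under these assumed constants, the sweep for m = 20 selects an interior k, that is, neither a single bucket nor one bucket per operator. The constraint `1 <= k <= m` mirrors the note that for m = 5 at most k = 5 execution buckets can be used.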

In addition, we analyzed the influence of the number of execution buckets on the execution time with regard to different data sizes. Here, we used a plan with m = 20 operators, fixed t = 0 and q = 50, and varied the data size d ∈ {1, 4, 7} (in 100 kB). Figure 4.25(b) shows the results. We observe that the first additional execution bucket significantly decreases the execution time, while after that, the execution time varies only slightly for an increasing number of execution buckets. However, there is still an optimal point, and the optimal number of execution buckets decreases with increasing data size due to increased cache displacement. Again, we annotated the resulting number of execution buckets of our A-CPV.

We might use randomized algorithms as heuristics for determining the number of buckets k and for assigning operators to buckets. However, the presented experiments (Figure 4.25(a) and Figure 4.25(b)) show an interesting characteristic that prohibits such randomized heuristics. While our cost-based vectorization approach finds a near-optimal solution, a randomly chosen k might significantly decrease the performance, and the influence of determining the best k increases with an increasing number of operators.

In conclusion, cost-based vectorization typically computes schemes where k is close to the optimal number of execution buckets with regard to the minimal execution time of a message sequence. Thus, we recommend using cost-based vectorization without restricting k. However, for multiple deployed plans, the maximum constraint must be enforced.
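
As a minimal sketch of enforcing such a maximum constraint, the following Python snippet caps the bucket count computed for each plan. The function name `restrict_buckets`, the plan names, and all numeric values are hypothetical. How the maximum `k_max` is actually derived is not covered in this section, so it is treated here as a given input.

```python
def restrict_buckets(k_computed, k_max):
    """Cap a plan's computed number of execution buckets at k_max.

    Both arguments are assumed to be positive integers; k_max is a
    hypothetical, externally supplied maximum.
    """
    if k_computed < 1 or k_max < 1:
        raise ValueError("bucket counts must be positive")
    return min(k_computed, k_max)


# Illustrative values only: bucket counts computed per deployed plan.
computed = {"P1": 3, "P2": 7, "P3": 12}
restricted = {plan: restrict_buckets(k, k_max=8) for plan, k in computed.items()}
print(restricted)
```

Only plans whose computed k exceeds the maximum are affected. All other plans keep the k chosen by cost-based vectorization.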
