25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


4.3 Cost-Based Vectorization

b_i with i ∈ [1, k] according to the constrained optimization objective

$$\phi_c = \min_{k=1}^{m} l_k \;\Big|\; \forall i \in [1, k] : \left( \sum_{j=1}^{|b_i|} W(o_j) \right) \le W(o_{\max}) + \lambda, \tag{4.8}$$

where λ (with λ ≥ 0) is a user-defined absolute parameter to control the cost constraint. Here, λ = 0 leads to the highest meaningful degree of parallelism, while higher values of λ lead to a decrease of parallelism.

Essentially, one can configure the absolute parameter λ in order to influence the number of threads. The higher the value of λ, the more operators are assigned to single execution buckets, and thus, the lower the number of buckets and the lower the number of required threads. However, the decision on merging operators is still made in a cost-based manner.
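The effect of λ on the number of buckets can be illustrated with a small sketch. It is not the thesis's algorithm: it assumes a plain sequence of operators, assumes that the objective amounts to minimizing the number of buckets k, and uses a simple greedy packing under the bound W(o_max) + λ from (4.8). The function name `min_buckets` and the cost values are hypothetical.

```python
from typing import List, Sequence


def min_buckets(costs: Sequence[int], lam: int = 0) -> List[List[int]]:
    """Greedily pack a sequence of operators into contiguous buckets.

    Assumptions (not from the source): costs[j] stands for W(o_j), the
    operators form a sequence whose order must be preserved, and each
    bucket may hold a total cost of at most W(o_max) + lam.
    """
    if lam < 0:
        raise ValueError("lambda must be non-negative")
    if not costs:
        return []
    cap = max(costs) + lam      # W(o_max) + lambda
    buckets = [[0]]             # bucket contents as operator indices
    load = costs[0]             # total cost of the current bucket
    for j in range(1, len(costs)):
        if load + costs[j] <= cap:
            # Merging o_j into the current bucket keeps the constraint.
            buckets[-1].append(j)
            load += costs[j]
        else:
            # Constraint would be violated: open a new bucket.
            buckets.append([j])
            load = costs[j]
    return buckets


if __name__ == "__main__":
    w = [1, 4, 2, 1, 6, 3]      # hypothetical operator costs
    for lam in (0, 3, 100):
        print(lam, min_buckets(w, lam))
```

With these hypothetical costs, λ = 0 yields four buckets, λ = 3 yields two, and a very large λ collapses the plan into a single bucket, which matches the described trend of fewer buckets and threads for higher λ.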

Based on the defined cost-based vectorization problems, we now investigate the resulting search space. Essentially, both problems exhibit the same search space because they differ only in the optimization objective φ. The worst-case time complexity depends on the structure of the given plan. Figure 4.12 illustrates the best case and the worst case.

[Figure 4.12: Plan-Dependent Search Space. (a) Best-Case Plan P_b: a sequence of operators o_1, ..., o_7. (b) Worst-Case Plan P_w: a Fork operator o_1 with the operators o_2, ..., o_7.]

The best case from a computational complexity perspective is the sequence of operators (see Figure 4.12(a)), where each operator has a data dependency on its predecessor. Here, the order of operators must be preserved when assigning operators to execution buckets. In contrast, the worst case is a set of operators without any dependencies between operators (see Figure 4.12(b)) because there, we could create arbitrary combinations of operators.

We use an example to illustrate the resulting plan search space for the best case.

Example 4.6 (Operator Distribution Across Buckets). Assume a plan P with a sequence of four operators (m = 4). Table 4.2 shows the possible plans for the different numbers of buckets k.

Table 4.2: Example Operator Distribution

Plan    |b|    b_1                  b_2            b_3        b_4
Plan 1  k = 1  o_1, o_2, o_3, o_4   -              -          -
Plan 2  k = 2  o_1                  o_2, o_3, o_4  -          -
Plan 3  k = 2  o_1, o_2             o_3, o_4       -          -
Plan 4  k = 2  o_1, o_2, o_3        o_4            -          -
Plan 5  k = 3  o_1                  o_2            o_3, o_4   -
Plan 6  k = 3  o_1                  o_2, o_3       o_4        -
Plan 7  k = 3  o_1, o_2             o_3            o_4        -
Plan 8  k = 4  o_1                  o_2            o_3        o_4

We can distinguish eight different (2^(4−1) = 8) plans, where Plan 1 is the special case of an instance-based plan and Plan 8 is the special case of a fully vectorized plan.
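The plan count for a sequence can be checked by enumeration. The following sketch assumes that a plan for a sequence of m operators is fully described by where the sequence is cut into contiguous buckets: choosing k − 1 of the m − 1 gaps between operators gives a plan with k buckets. The function name `sequence_plans` is hypothetical and used only for illustration.

```python
from itertools import combinations
from typing import Iterator, List, Sequence


def sequence_plans(ops: Sequence[str]) -> Iterator[List[List[str]]]:
    """Yield every split of an operator sequence into contiguous buckets.

    Plans are produced by increasing number of buckets k. The order of
    operators is preserved, as required for the best-case (sequence) plan.
    """
    m = len(ops)
    for k in range(1, m + 1):
        # Pick k-1 cut positions out of the m-1 gaps between operators.
        for cuts in combinations(range(1, m), k - 1):
            bounds = (0, *cuts, m)
            yield [list(ops[a:b]) for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    plans = list(sequence_plans(["o1", "o2", "o3", "o4"]))
    for number, plan in enumerate(plans, start=1):
        print(f"Plan {number} (k = {len(plan)}): {plan}")
    print(len(plans))  # 2**(4-1) plans in total
```

For m = 4 the enumeration produces the eight plans in the same order as Table 4.2, from the single-bucket plan to the plan with one operator per bucket.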
