Cost-Based Optimization of Integration Flows - Datenbanken ...
4.3 Cost-Based Vectorization
b_i with i ∈ [1, k] according to the constrained optimization objective

\[
\phi_c = \min_{k=1}^{m} \; l_k \;\Big|\; \forall i \in [1, k] : \left( \sum_{j=1}^{|b_i|} W(o_j) \right) \le W(o_{\max}) + \lambda, \tag{4.8}
\]

where λ (with λ ≥ 0) is a user-defined absolute parameter that controls the cost constraint.
Here, λ = 0 yields the highest meaningful degree of parallelism, while higher values of
λ reduce parallelism.
Essentially, one can configure the absolute parameter λ to influence the number
of threads. The higher the value of λ, the more operators are assigned to a single execution
bucket, and thus the lower the number of buckets and the lower the number of required
threads. However, the decision on merging operators is still made in a cost-based manner.
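The effect of λ on the number of buckets can be illustrated with a minimal sketch. It covers only the simple case of a pure operator sequence and assumes the goal is to use as few buckets as possible while every bucket's summed cost stays within W(o_max) + λ. The function name `min_buckets_sequence` and the example costs are hypothetical and not taken from the text; the greedy left-to-right scan is a stand-in technique for illustration, not the algorithm described in this work.

```python
from typing import List


def min_buckets_sequence(costs: List[int], lam: int = 0) -> List[List[int]]:
    """Greedily pack a sequence of operator costs into contiguous buckets.

    Assumption: operators form a pure sequence, so buckets must be contiguous.
    Each bucket's total cost must stay <= max(costs) + lam, i.e. W(o_max) + lambda.
    Returns the buckets as lists of 0-based operator indices.
    """
    if lam < 0:
        raise ValueError("lambda must be non-negative")
    if not costs:
        return []
    cap = max(costs) + lam  # right-hand side of the cost constraint
    buckets = [[0]]         # the first operator opens the first bucket
    load = costs[0]         # summed cost of the currently open bucket
    for j in range(1, len(costs)):
        if load + costs[j] <= cap:
            # operator j still fits: merge it into the open bucket
            buckets[-1].append(j)
            load += costs[j]
        else:
            # constraint would be violated: open a new bucket
            buckets.append([j])
            load = costs[j]
    return buckets


# Hypothetical operator costs W(o_1), ..., W(o_6)
example_costs = [1, 4, 2, 2, 1, 3]
for lam in (0, 2, 100):
    plan = min_buckets_sequence(example_costs, lam)
    print(f"lambda={lam}: k={len(plan)} buckets {plan}")
```

With these example costs, increasing λ from 0 to 2 reduces the number of buckets from four to three, and a very large λ collapses everything into a single bucket, which mirrors the qualitative behavior described above.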
Based on the defined cost-based vectorization problems, we now investigate the resulting
search space. Essentially, both problems exhibit the same search space because they only
differ in the optimization objective φ. The worst-case time complexity depends on
the structure of a given plan. Figure 4.12 illustrates the best case and the worst case.
[Figure 4.12: Plan-Dependent Search Space. (a) Best-Case Plan P_b: a sequence of operators o_1 to o_7. (b) Worst-Case Plan P_w: a Fork operator o_1 with operators o_2 to o_7.]
The best case from a computational complexity perspective is a sequence of operators
(see Figure 4.12(a)), where each operator has a data dependency on its predecessor. Here,
the order of operators must be preserved when assigning operators to execution buckets. In
contrast, the worst case is a set of operators without any dependencies between them
(see Figure 4.12(b)), because there we could create arbitrary combinations of operators.
We use an example to illustrate the resulting plan search space for the best case.
Example 4.6 (Operator Distribution Across Buckets). Assume a plan P with a sequence
of four operators (m = 4). Table 4.2 shows the possible plans for the different numbers of
buckets k.
Table 4.2: Example Operator Distribution
Plan    |b|    b_1                 b_2            b_3       b_4
Plan 1  k = 1  o_1, o_2, o_3, o_4  -              -         -
Plan 2  k = 2  o_1                 o_2, o_3, o_4  -         -
Plan 3  k = 2  o_1, o_2            o_3, o_4       -         -
Plan 4  k = 2  o_1, o_2, o_3       o_4            -         -
Plan 5  k = 3  o_1                 o_2            o_3, o_4  -
Plan 6  k = 3  o_1                 o_2, o_3       o_4       -
Plan 7  k = 3  o_1, o_2            o_3            o_4       -
Plan 8  k = 4  o_1                 o_2            o_3       o_4
We can distinguish eight different plans (2^{4−1} = 8), where Plan 1 is the special case of an
instance-based plan and Plan 8 is the special case of a fully vectorized plan.
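The count of eight plans can be checked by enumeration. A sequence of m operators has m − 1 gaps between neighbors, and a plan with k buckets corresponds to choosing k − 1 of these gaps as bucket boundaries, which gives C(m − 1, k − 1) plans per k and 2^{m−1} plans in total. The following sketch enumerates them; the function name `contiguous_plans` is hypothetical and serves only as an illustration of the best-case search space.

```python
from itertools import combinations
from typing import Iterator, List


def contiguous_plans(m: int) -> Iterator[List[List[int]]]:
    """Yield every split of operators 1..m into ordered, contiguous buckets."""
    ops = list(range(1, m + 1))
    for k in range(1, m + 1):
        # choose k-1 cut positions among the m-1 gaps between neighbors
        for cuts in combinations(range(1, m), k - 1):
            bounds = [0, *cuts, m]
            # slice the operator sequence at consecutive boundaries
            yield [ops[a:b] for a, b in zip(bounds, bounds[1:])]


plans = list(contiguous_plans(4))
for n, plan in enumerate(plans, start=1):
    print(f"Plan {n}  k = {len(plan)}  {plan}")
print(f"total plans: {len(plans)}")
```

For m = 4 the enumeration produces the plans in the same order as Table 4.2, with one, three, three, and one plans for k = 1 to 4.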