25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


4 Vectorizing Integration Flows

order to prove the theorem, we need to prove the two individual claims $W(P'') \le W(P)$ and $W(P'') \le W(P')$.

For the proof of $W(P'') \le W(P)$, assume the worst case, where $\forall o_i : R_o(o_i) = 1$. If we vectorize this to $P''$, we need to compute the costs by $W(b''_i) = \frac{R_o(b''_i)}{R_e(b''_i)} \cdot W(o_i)$ with $R_e(b''_i) = 1/|b|$. Due to the vectorized execution, $W(P'') = \max_{i=1}^{m} W(b''_i)$, while $W(P) = \sum_{i=1}^{m} W(o_i)$. Hence, we can write $W(P'') = W(P)$ if the condition $\forall o_i : R_o(o_i) = 1$ holds. This is the worst case. For each $R_o(o_i) < 1$, we get $W(P'') < W(P)$.

In order to prove $W(P'') \le W(P')$, we fix $\lambda = 0$. If we merge two buckets $b_i$ and $b_{i+1}$, we see that $R_e(b''_i)$ is increased from $1/|b|$ to $1/(|b|-1)$. Thus, we re-compute the costs $W(b''_i)$ as mentioned before. In the worst case, $W(b''_i) = W(b'_i)$, which is true iff $R_e(b'_i) = R_o(b'_i)$ because then we also have $R_e(b''_i) = R_e(b'_i)$. Due to $W(P'') = \max_{i=1}^{m} W(b''_i)$, we can state $W(P'') \le W(P)$. Hence, Theorem 4.4 holds.

In conclusion, we cannot guarantee that the result of the A-CPV is the global optimum because we cannot efficiently evaluate the effective resource consumption. However, we can guarantee that each merging of execution buckets when solving the P-CPV with $\lambda = 0$ (where the costs of each bucket are lower than or equal to the highest operator costs) improves the performance of the plan $P$.
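The constraint in parentheses can be made concrete with a small sketch. The function below is a hypothetical illustration, not the thesis's algorithm: it assumes a plain operator sequence with known costs and greedily merges adjacent operators into a bucket as long as the summed bucket cost stays within the highest operator cost plus λ. The name `merge_buckets` and the sample costs are assumptions made for this example only.

```python
def merge_buckets(costs, lam=0.0):
    """Greedily merge adjacent operators of a sequence into buckets.

    Illustrative assumption: a bucket's cost is the sum of its operator
    costs, and a bucket may grow only while that sum stays within
    max(costs) + lam. `costs` must be non-empty.
    """
    limit = max(costs) + lam
    buckets = [[costs[0]]]
    for w in costs[1:]:
        if sum(buckets[-1]) + w <= limit:
            buckets[-1].append(w)   # merging keeps the bucket within the limit
        else:
            buckets.append([w])     # start a new bucket
    return buckets


costs = [2.0, 1.0, 5.0, 1.0, 3.0, 1.0]   # hypothetical operator costs
buckets = merge_buckets(costs)            # lambda = 0
print(buckets)                            # [[2.0, 1.0], [5.0], [1.0, 3.0, 1.0]]
print(max(sum(b) for b in buckets))       # 5.0, equal to the highest operator cost
```

In this example six operators collapse into three buckets while the most expensive bucket still costs no more than the most expensive single operator, which is the situation the guarantee above refers to.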

4.3.3 Cost-Based Vectorization with Restricted Number of Buckets

Due to dynamically changing workload characteristics, we recommend using the cost-based vectorization approach. However, there might exist scenarios where an explicit restriction of $k$, and thus of the number of threads, is advantageous. Hence, in this subsection, we discuss the necessary changes to the exhaustive and heuristic computation approaches when using this constraint.

Exhaustive Computation Approach<br />

With regard to the exhaustive cost-based computation approach (see Subsection 4.3.2), only minor changes are required when restricting $k$. Due to the restricted number of execution buckets, $k = |b|$, the search space is smaller than for the previously described P-CPV. As already stated, for an operator sequence (best case), there are

$$|P''|_k = \prod_{i=1}^{k-1} \frac{m-i}{i} \tag{4.16}$$

different possibilities, while for sets of operators, there are

$$|P''|_k = \frac{1}{k!} \sum_{j=0}^{k} (-1)^{k-j} \binom{k}{j} j^m \tag{4.17}$$

possibilities to distribute the $m$ operators of plan $P$ across $k$ buckets. Hence, the enumeration of candidate distribution schemes can be reused by simply invoking the recursive Algorithm 4.2 only once for the given $k$. In addition, we change the optimality condition for evaluating those candidates to

$$\phi = \min\left( \max_{i=1}^{|b|=k} \left( \sum_{j=1}^{l_{b_i}} W(o_j) \right) \right), \tag{4.18}$$
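The counting formulas (4.16) and (4.17) and the min-max condition (4.18) can be checked on small inputs with the following sketch. It is a brute-force illustration under stated assumptions, not a reimplementation of Algorithm 4.2: all function names are hypothetical, the plan is assumed to be a plain operator sequence, a bucket's cost is taken as the sum of its operator costs, and the sample costs are invented.

```python
from itertools import combinations
from math import comb, factorial


def count_sequence_schemes(m, k):
    """Eq. (4.16): prod_{i=1}^{k-1} (m - i) / i, which equals C(m-1, k-1)."""
    return comb(m - 1, k - 1)


def count_set_schemes(m, k):
    """Eq. (4.17): Stirling number of the second kind S(m, k)."""
    total = sum((-1) ** (k - j) * comb(k, j) * j ** m for j in range(k + 1))
    return total // factorial(k)


def best_sequence_scheme(costs, k):
    """Enumerate all splits of a sequence into k contiguous, non-empty
    buckets and return (phi, buckets) minimizing the maximum bucket
    cost, in the spirit of Eq. (4.18)."""
    m = len(costs)
    best = None
    for cuts in combinations(range(1, m), k - 1):
        bounds = (0, *cuts, m)
        buckets = [costs[a:b] for a, b in zip(bounds, bounds[1:])]
        phi = max(sum(b) for b in buckets)
        if best is None or phi < best[0]:
            best = (phi, buckets)
    return best


print(count_sequence_schemes(5, 3))   # 6
print(count_set_schemes(5, 3))        # 25
costs = [2, 1, 5, 1, 3, 1]            # hypothetical operator costs
print(best_sequence_scheme(costs, 3)) # (5, [[2, 1], [5], [1, 3, 1]])
```

The exhaustive loop visits exactly `count_sequence_schemes(len(costs), k)` candidates, so it is only practical for small $m$ and $k$.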
