25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


4 Vectorizing Integration Flows

[Figure 4.17: Heuristic Multiple Plan Vectorization. Panel (a): Cost-Based Plan Vectorization. Panel (b): Restricted Cost-Based Plan Vectorization. Operator weights shown: P'a with W(o1)=4, W(o2)=3, W(o3)=1; P'b with W(o1)=3, W(o2)=2, W(o3)=5, W(o4)=4, W(o5)=3; P'c with W(o1)=1, W(o2)=9, W(o3)=4, W(o4)=2. Annotations in (a): k=2 | K_a=1, k=4 | K_b=1, k=3 | K_c=3. Annotations in (b): K_a=2 with k=2, K_b=2 with k=2.]

Finally, we execute the restricted cost-based plan vectorization with K_i as a parameter. Figure 4.17(b) shows the result of this step, where K_a = 2 was used, whereas K_b was limited to 2 because, for P_b, the local constraint exceeded the number of free buckets. As a result, we ensured that the global maximum constraint was not exceeded.

This is a heuristic computation because (1) we assign free buckets directly (independent of relative cost improvements) and (2) the order in which plans are considered might influence the resulting distribution. However, it often solves this problem adequately. An exact computation approach would use a while loop and redistribute free buckets until no more changes are made. After this, we would restrict k explicitly. However, an exact computation approach would require using the exhaustive cost-based computation.
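The bucket assignment described above can be illustrated with a minimal sketch. It assumes that each plan comes with its unrestricted bucket count k and an initial local constraint K, and that a pool of free buckets is handed out in plan order. The function name, the dictionary layout, and the pool size of 2 are illustrative assumptions (the pool size is chosen so that the result matches the values annotated in Figure 4.17(b)); this is not the algorithm's actual definition.

```python
def distribute_free_buckets(k, K, free, order=None):
    """Greedily raise local constraints K[i] toward k[i] using a pool of free buckets.

    k, K:  dicts mapping plan name -> bucket count (hypothetical layout).
    free:  number of unassigned buckets (assumed input).
    order: sequence in which plans are considered; defaults to k's key order.
    """
    K = dict(K)  # do not mutate the caller's constraints
    for plan in (order or list(k)):
        wanted = k[plan] - K[plan]      # buckets the plan could still use
        if wanted <= 0:
            continue
        granted = min(wanted, free)     # never hand out more than is free
        K[plan] += granted
        free -= granted
    return K, free


k = {"Pa": 2, "Pb": 4, "Pc": 3}   # unrestricted bucket counts, Figure 4.17(a)
K = {"Pa": 1, "Pb": 1, "Pc": 3}   # initial local constraints, Figure 4.17(a)

print(distribute_free_buckets(k, K, free=2))
# ({'Pa': 2, 'Pb': 2, 'Pc': 3}, 0)
print(distribute_free_buckets(k, K, free=2, order=["Pb", "Pa", "Pc"]))
# ({'Pa': 1, 'Pb': 3, 'Pc': 3}, 0)
```

The second call considers P_b first and ends with a different distribution, which illustrates why the order of considered plans makes this a heuristic rather than an exact computation.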

Our heuristic algorithm has a time complexity of O(h · m) because we call the A-CPV h times, and each call has a time complexity of O(m) (see Theorem 4.3). Furthermore, we call the heuristic restricted plan vectorization algorithm at most h times, and it also has a linear time complexity of O(m).
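The bound can be spelled out as a short derivation, assuming that these two phases account for all non-constant work and using h and m as in the text:

```latex
T(h, m) \;\le\; \underbrace{h \cdot O(m)}_{\text{A-CPV calls}}
        \;+\; \underbrace{h \cdot O(m)}_{\text{restricted vectorization calls}}
        \;=\; O(2 \cdot h \cdot m) \;=\; O(h \cdot m)
```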

In conclusion, multiple plan vectorization takes the workload characteristics into account in order to restrict the maximum number of execution buckets used. Plans with high load are preferred and are assigned more execution buckets. At the same time, this algorithm applies cost-based vectorization and hence can ensure near-optimal performance even for large numbers of plans.

4.5 Periodical Re-Optimization

We have shown how to compute cost-based vectorized plans for both single and multiple deployed plans. These exhaustive and heuristic algorithms rely on the continuous gathering of execution statistics and on periodical re-optimization in order to be aware of changing workload characteristics. Finally, the whole vectorization approach is embedded as a specific optimization technique into our general cost-based optimization framework. Note that our transformation-based optimization algorithm (A-PMO, Subsection 3.3.1) applies the cost-based vectorization after all other rewriting techniques, where techniques for rewriting patterns to parallel flows are not used if vectorization is enabled.
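The ordering rule can be sketched as follows. This is only an illustration under assumed names: the technique labels, the `is_parallel_rewrite` flag, and the function `plan_technique_order` are hypothetical and do not reflect the actual interface of A-PMO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    name: str                          # hypothetical label
    is_parallel_rewrite: bool = False  # rewrites patterns to parallel flows


VECTORIZATION = Technique("cost-based vectorization")


def plan_technique_order(rewrites, vectorization_enabled):
    """Return the techniques in the order they would be applied."""
    if not vectorization_enabled:
        return list(rewrites)
    # Skip parallel-flow rewrites and apply vectorization last.
    kept = [t for t in rewrites if not t.is_parallel_rewrite]
    return kept + [VECTORIZATION]


rewrites = [
    Technique("example rewrite A"),
    Technique("example parallel-flow rewrite", is_parallel_rewrite=True),
]
print([t.name for t in plan_technique_order(rewrites, vectorization_enabled=True)])
# ['example rewrite A', 'cost-based vectorization']
```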

With regard to the whole feedback loop of our general cost-based optimization framework, there are two additional challenges that need to be addressed for vectorized plans. First, if vectorization produces a plan that differs from the currently deployed plan, we have to evaluate the re-optimization potential in the sense of the benefit of exchanging plans. Second, if an exchange is beneficial or one of the other optimization techniques
