25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


3.4 Optimization Techniques

One of the core concepts is to leverage parallelism of operator execution in order to minimize the execution time. Due to typically low CPU utilization caused by (1) single-threaded, instance-based plan execution, (2) significant waiting times for external systems, and (3) IO bottlenecks for message persistence, these decisions are made with cost-based optimality conditions rather than statically during the initial deployment.

Rescheduling the Start of Parallel Flows

The technique WC1: Rescheduling the Start of Parallel Flows rewrites existing Fork operators. The execution time of the Fork operator $o$ is determined by its most time-consuming subflow $r_i$ with

$$
W(o) = \max_{i=1}^{|r|} \left( \sum_{j=1}^{m_i} W(o_{i,j}) + i \cdot W(\text{Start Thread}) \right) \tag{3.16}
$$

because the individual subflows are started in sequence and temporally joined (synchronized) at the end of this operator.
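Equation (3.16) can be illustrated with a minimal Python sketch. The function name `fork_cost`, its parameters, and the sample values are assumptions made for illustration only; they do not come from the original system.

```python
from typing import Sequence


def fork_cost(subflows: Sequence[Sequence[float]], start_thread: float) -> float:
    """Execution time of a Fork operator following Eq. (3.16).

    subflows[i-1] holds the operator costs W(o_{i,j}) of subflow r_i, listed
    in start order; start_thread is the constant W(Start Thread).
    """
    # Subflow i (1-based) can only begin after i threads have been started,
    # so it finishes at i * start_thread plus the sum of its operator costs.
    return max(
        sum(ops) + i * start_thread
        for i, ops in enumerate(subflows, start=1)
    )


# Hypothetical Fork with three subflows (values chosen for illustration).
print(fork_cost([[10, 20], [50], [5, 5]], start_thread=2))  # 54
```

The second subflow dominates here: it costs 50 and is started second, so it finishes at 50 + 2 · 2 = 54.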

The core principle of this technique is to reduce waiting time by rewriting the concurrent subflows such that they are started in descending order of their execution time. In other words, the start sequence of parallel subflows is optimal if the condition $\hat{W}(r_i) \geq \hat{W}(r_{i+1})$ holds for all adjacent subflows, where $\hat{W}(r_i) = \sum_{j=1}^{m_i} \hat{W}(o_{i,j})$. The execution time can be reduced by up to $(|r| - 1) \cdot W(\text{Start Thread})$, where $W(\text{Start Thread})$ denotes the constant costs for creating and starting a thread.

The rewriting algorithm essentially consists of two steps. First, for each subflow, we recursively sum up the costs of all individual operators. Second, we check if the optimality condition holds and order the subflows according to the estimated costs if required. The time complexity of this algorithm is given by $O(|r| \cdot \log |r|)$.
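The two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the original implementation: the `Operator` class and the function names are hypothetical, and nested operators are simply summed, which ignores any special cost formula a nested Fork would need.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    """Hypothetical plan node: an atomic operator or a container of children."""
    name: str
    cost: float = 0.0  # estimated own cost in ms
    children: List["Operator"] = field(default_factory=list)


def estimated_cost(op: Operator) -> float:
    # Step 1: recursively sum up the costs of all operators below this node.
    return op.cost + sum(estimated_cost(c) for c in op.children)


def is_optimal(subflows: List[Operator]) -> bool:
    # Optimality condition: each subflow costs at least as much as its successor.
    costs = [estimated_cost(r) for r in subflows]
    return all(a >= b for a, b in zip(costs, costs[1:]))


def reschedule(subflows: List[Operator]) -> List[Operator]:
    # Step 2: reorder only if the optimality condition is violated.
    if is_optimal(subflows):
        return subflows
    # Sorting dominates the reordering step with O(|r| log |r|) comparisons.
    return sorted(subflows, key=estimated_cost, reverse=True)


flows = [
    Operator("r1", children=[Operator("a", 5), Operator("b", 7)]),    # 12 ms
    Operator("r2", children=[Operator("c", 10), Operator("d", 20)]),  # 30 ms
    Operator("r3", children=[Operator("e", 15)]),                     # 15 ms
]
print([r.name for r in reschedule(flows)])  # ['r2', 'r3', 'r1']
```

Python's `sorted` is stable, so subflows with equal estimated costs keep their original relative order.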

Example 3.9 (Rescheduling Parallel Flows). Assume our example plan $P_7$ and the monitored execution times shown in Figure 3.12(a). Furthermore, assume the cost of starting a parallel subflow (thread) to be $W(\text{Start Thread}) = 3$ ms. Because the most time-consuming subflow (230 ms) is started second, the estimated costs of the Fork operator are given by $\hat{W}(o_2) = 2 \cdot 3\,\text{ms} + 230\,\text{ms} = 236\,\text{ms}$. Rescheduling the parallel flows yields the alternative plan shown in Figure 3.12(b). Using this plan, in which that subflow is started first, the costs are reduced to $\hat{W}(o_2) = 3\,\text{ms} + 230\,\text{ms} = 233\,\text{ms}$.
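Both estimates can be checked by evaluating Eq. (3.16) over all four subflows, using the subflow totals from Figure 3.12 (205 ms, 230 ms, 97 ms, and 209 ms in the original start order) and a thread start cost of 3 ms. For the original plan:

$$
\hat{W}(o_2) = \max(205 + 1 \cdot 3,\; 230 + 2 \cdot 3,\; 97 + 3 \cdot 3,\; 209 + 4 \cdot 3)\,\text{ms} = \max(208,\, 236,\, 106,\, 221)\,\text{ms} = 236\,\text{ms}
$$

For the rescheduled plan, with the subflows started in descending order of cost:

$$
\hat{W}(o_2) = \max(230 + 1 \cdot 3,\; 209 + 2 \cdot 3,\; 205 + 3 \cdot 3,\; 97 + 4 \cdot 3)\,\text{ms} = \max(233,\, 215,\, 214,\, 109)\,\text{ms} = 233\,\text{ms}
$$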

(a) Plan P7: Fork (o2) with four subflows, in start order
  1. Assign (o3) 5 ms, Invoke (o4) 130 ms, Translation (o5) 70 ms; total 205 ms
  2. Assign (o6) 10 ms, Invoke (o7) 150 ms, Translation (o8) 70 ms; total 230 ms
  3. Assign (o9) 7 ms, Invoke (o10) 90 ms; total 97 ms
  4. Assign (o11) 4 ms, Invoke (o12) 120 ms, Translation (o13) 85 ms; total 209 ms

(b) Optimized Plan P′7: Fork (o2) with four subflows, in start order
  1. Assign (o6) 10 ms, Invoke (o7) 150 ms, Translation (o8) 70 ms; total 230 ms
  2. Assign (o11) 4 ms, Invoke (o12) 120 ms, Translation (o13) 85 ms; total 209 ms
  3. Assign (o3) 5 ms, Invoke (o4) 130 ms, Translation (o5) 70 ms; total 205 ms
  4. Assign (o9) 7 ms, Invoke (o10) 90 ms; total 97 ms

Figure 3.12: Example Rescheduling Parallel Flows

Clearly, the benefit of this optimization technique is limited, but its potential grows with the number of subflows. This technique should be used after WC2 and WC3
