Cost-Based Optimization of Integration Flows - Datenbanken ...
3.4 Optimization Techniques
One of the core concepts is to leverage parallelism of operator execution in order to
minimize the execution time. CPU utilization is typically low because of (1) single-threaded,
instance-based plan execution, (2) significant waiting times for external systems,
and (3) I/O bottlenecks for message persistence. For this reason, these decisions are made with
cost-based optimality conditions rather than statically during the initial deployment.
Rescheduling the Start of Parallel Flows
The technique WC1: Rescheduling the Start of Parallel Flows rewrites existing Fork operators.
The execution time of the Fork operator $o$ is determined by its most time-consuming
subflow $r_i$ with

\[
W(o) = \max_{i=1}^{|r|} \left( \sum_{j=1}^{m_i} W(o_{i,j}) + i \cdot W(\text{Start Thread}) \right) \tag{3.16}
\]

because the individual subflows are started in sequence and temporally joined (synchronized)
at the end of this operator.
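Equation 3.16 can be evaluated directly once the operator costs of each subflow are known. The following is a minimal sketch, not the original implementation: the function name `fork_cost`, the list-of-lists representation, and the sample numbers are all assumptions made for illustration.

```python
from typing import Sequence


def fork_cost(subflows: Sequence[Sequence[float]], start_thread_cost: float) -> float:
    """Evaluate Equation 3.16 for subflows listed in their start order.

    subflows[i-1] holds the operator costs W(o_{i,j}) of subflow r_i.
    The i-th subflow is delayed by i thread starts because the
    subflows are started one after another.
    """
    return max(
        sum(operator_costs) + position * start_thread_cost
        for position, operator_costs in enumerate(subflows, start=1)
    )


# Hypothetical costs in ms (not taken from the text).
subflows = [[20.0, 30.0], [40.0, 50.0], [10.0]]
print(fork_cost(subflows, start_thread_cost=2.0))  # 94.0
```

The second subflow dominates here (90 ms of operator cost plus two thread starts), which illustrates why the start position of the longest subflow matters. The sketch assumes at least one subflow; `max` raises a `ValueError` on an empty sequence.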
The core principle of this technique is to reduce waiting time by rewriting the concurrent
subflows such that the subflows are started in descending order of their execution time.
In other words, the start sequence of parallel subflows is optimal if the condition
$\hat{W}(r_i) \geq \hat{W}(r_{i+1})$ holds, where $\hat{W}(r_i) = \sum_{j=1}^{m_i} \hat{W}(o_{i,j})$.
The execution time can be reduced by up to $(|r| - 1) \cdot W(\text{Start Thread})$, where
$W(\text{Start Thread})$ denotes the constant costs for creation and start of a thread.
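The size of this reduction can be read as an upper bound that follows from Equation 3.16. The short derivation below is a sketch under the assumption that $W(\text{Start Thread}) \geq 0$; the symbol $\hat{W}_{\max}$ is introduced here only for illustration.

```latex
\[
\hat{W}_{\max} = \max_{i=1}^{|r|} \hat{W}(r_i)
\]
% The longest subflow is started at some position i >= 1, and no subflow
% is started later than position |r|, so for every start order:
\[
\hat{W}_{\max} + W(\text{Start Thread})
\;\leq\; \hat{W}(o) \;\leq\;
\hat{W}_{\max} + |r| \cdot W(\text{Start Thread})
\]
% Hence the costs of any two start orders differ by at most
\[
\bigl(\hat{W}_{\max} + |r| \cdot W(\text{Start Thread})\bigr)
- \bigl(\hat{W}_{\max} + W(\text{Start Thread})\bigr)
= (|r| - 1) \cdot W(\text{Start Thread})
\]
```

The full amount is only reached when the longest subflow was originally started last and still determines the cost after it has been moved to the first position.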
The rewriting algorithm essentially consists of two steps. First, for each subflow, we
recursively sum up the costs of all individual operators. Second, we check if the optimality
condition holds and order the subflows according to the estimated costs if required. The
time complexity of this algorithm is given by $O(|r| \cdot \log |r|)$.
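The two steps above can be sketched as follows. This is an illustrative sketch rather than the original algorithm: the `Operator` class, its fields, and the function names are hypothetical, and nested operators are simply summed, which is a simplifying assumption (a nested Fork, for instance, would need its own cost formula).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    """Hypothetical plan node: own estimated cost plus nested operators."""
    name: str
    cost: float = 0.0  # estimated cost in ms
    children: List["Operator"] = field(default_factory=list)


def estimated_cost(op: Operator) -> float:
    """Step 1 (per operator): recursively sum up the estimated costs."""
    return op.cost + sum(estimated_cost(child) for child in op.children)


def subflow_cost(subflow: List[Operator]) -> float:
    """Step 1 (per subflow): total estimated cost of all its operators."""
    return sum(estimated_cost(op) for op in subflow)


def reschedule(subflows: List[List[Operator]]) -> List[List[Operator]]:
    """Step 2: reorder only if the optimality condition is violated."""
    costs = [subflow_cost(s) for s in subflows]
    # Optimality condition: each cost >= the cost of the next subflow.
    if all(a >= b for a, b in zip(costs, costs[1:])):
        return subflows
    # Sorting dominates the running time: O(|r| log |r|).
    order = sorted(range(len(subflows)), key=lambda i: costs[i], reverse=True)
    return [subflows[i] for i in order]
```

Checking the condition first avoids rewriting a plan that is already optimal. Because Python's sort is stable, subflows with equal estimated costs keep their original relative order.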
Example 3.9 (Rescheduling Parallel Flows). Assume our example plan $P_7$ and the monitored
execution times shown in Figure 3.12(a). Furthermore, assume the cost of starting
a parallel subflow (thread) to be $W(\text{Start Thread}) = 3$ ms. The estimated costs of the
Fork operator are then given by $\hat{W}(o_2) = 2 \cdot 3$ ms $+ 230$ ms $= 236$ ms. Rescheduling the
parallel flows yields the alternative plan shown in Figure 3.12(b). Using this plan, the
costs are reduced to $\hat{W}(o_2) = 3$ ms $+ 230$ ms $= 233$ ms.
[Figure 3.12: Example Rescheduling Parallel Flows.
(a) Plan $P_7$: Fork (o2) with four subflows in start order:
  1. Assign (o3) 5 ms, Invoke (o4) 130 ms, Translation (o5) 70 ms; total 205 ms
  2. Assign (o6) 10 ms, Invoke (o7) 150 ms, Translation (o8) 70 ms; total 230 ms
  3. Assign (o9) 7 ms, Invoke (o10) 90 ms; total 97 ms
  4. Assign (o11) 4 ms, Invoke (o12) 120 ms, Translation (o13) 85 ms; total 209 ms
(b) Optimized Plan $P'_7$: the same subflows in start order o6-o8 (230 ms), o11-o13 (209 ms), o3-o5 (205 ms), o9-o10 (97 ms).]
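The numbers of Example 3.9 can be replayed with a few lines of code. The sketch below uses the execution times from Figure 3.12(a) and the 3 ms thread start cost from the example; the helper name `fork_cost` and the dictionary layout are assumptions made for illustration.

```python
START_THREAD_MS = 3  # W(Start Thread) from Example 3.9

# Operator costs in ms per subflow, in the start order of plan P7.
plan_p7 = {
    "o3-o5": [5, 130, 70],
    "o6-o8": [10, 150, 70],
    "o9-o10": [7, 90],
    "o11-o13": [4, 120, 85],
}


def fork_cost(subflow_costs, start_thread_cost):
    """Equation 3.16: the i-th started subflow pays i thread starts."""
    return max(
        sum(costs) + position * start_thread_cost
        for position, costs in enumerate(subflow_costs, start=1)
    )


original = list(plan_p7.values())
# Start the subflows in descending order of their summed costs.
optimized = sorted(original, key=sum, reverse=True)

print(fork_cost(original, START_THREAD_MS))   # 236
print(fork_cost(optimized, START_THREAD_MS))  # 233
```

In the original plan the 230 ms subflow is started second and pays two thread starts. After rescheduling it is started first and pays only one, which accounts for the 3 ms difference.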
Clearly, the benefit of this optimization technique is limited, but the potential increases
with an increasing number of subflows. This technique should be used after WC2 and WC3