Cost-Based Optimization of Integration Flows - Datenbanken ...
3 Fundamentals of Optimizing Integration Flows
current sliding window used for cost estimation. Increasing the sliding window size ∆w
results in slower adaptation because statistics are computed over longer time intervals.
For example, with average-based aggregation methods, a long sliding time window
adapts slowly because the most recent statistic tuples have little influence on the
overall estimate. The parameterization is therefore a trade-off between robustness of estimates
and fast adaptation to changing workload characteristics. A short time window gives
single outliers a high influence (fast adaptation, but not robust), while a long time window
takes long histories into account (robust, but slow adaptation).
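The trade-off can be made concrete with a minimal sketch. It assumes a plain moving average as the aggregation method and uses made-up execution times; the function names `ma_forecasts` and `steps_to_adapt`, the window sizes, and the tolerance are illustrative assumptions, not part of the estimator described here.

```python
from statistics import mean

def ma_forecasts(series, window):
    """One-step-ahead forecasts: mean of the last `window` observations."""
    return [mean(series[max(0, t - window):t]) for t in range(1, len(series) + 1)]

def steps_to_adapt(forecasts, shift_at, target, tol=1.0):
    """Observations needed after the shift until the forecast is within `tol` of `target`."""
    for k, f in enumerate(forecasts[shift_at:], start=1):
        if abs(f - target) <= tol:
            return k
    return None

# Hypothetical execution times (ms): the level shifts from 10 to 20 at index 10.
shifted = [10.0] * 10 + [20.0] * 10
short_steps = steps_to_adapt(ma_forecasts(shifted, window=2), 10, 20.0)
long_steps = steps_to_adapt(ma_forecasts(shifted, window=10), 10, 20.0)

# Hypothetical series with a single outlier as the most recent observation.
outlier = [10.0] * 9 + [100.0]
short_outlier = ma_forecasts(outlier, window=2)[-1]
long_outlier = ma_forecasts(outlier, window=10)[-1]

print(short_steps, long_steps)        # adaptation speed
print(short_outlier, long_outlier)    # outlier influence
```

With these inputs, the short window reaches the new level after 2 observations and the long window after 9, while the single outlier moves the short-window forecast to 55 ms and the long-window forecast only to 19 ms.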
Optimization Interval ∆t: Increasing ∆t also slows adaptation because
no re-optimization (and thus no re-estimation) is initiated during the optimization
interval. The longer the optimization interval, the longer we rely on historic estimates.
This parameter affects only the number of estimation points, not the
estimation itself. Thus, it is a trade-off between the costs of re-optimization and
fast adaptation to changing workload characteristics.
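As a rough illustration of this effect, the sketch below re-estimates only at multiples of ∆t and holds the estimate constant in between. The function `held_estimates`, the moving-average estimator, the window of 2, and the input series are all assumptions made for the example; time is measured in plan instances for simplicity.

```python
def held_estimates(series, delta_t, window=2):
    """Re-estimate only at multiples of `delta_t`; hold the estimate in between."""
    estimates, points, current = [], 0, None
    for t in range(len(series)):
        if t % delta_t == 0:  # re-optimization (and thus re-estimation) point
            # Use the last `window` observations before t; at t = 0 fall back
            # to the first observation (bootstrap assumption of this sketch).
            history = series[max(0, t - window):t] or series[:1]
            current = sum(history) / len(history)
            points += 1
        estimates.append(current)
    return estimates, points

# Hypothetical execution times (ms) with a level shift at index 6.
series = [10.0] * 6 + [20.0] * 6
est_1, points_1 = held_estimates(series, delta_t=1)
est_4, points_4 = held_estimates(series, delta_t=4)

print(points_1, est_1)
print(points_4, est_4)
```

The estimator is identical in both runs. Only the number of estimation points differs (12 versus 3), so with ∆t = 4 the stale estimate of 10 ms is used for two steps after the shift, up to the next re-optimization point.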
Workload Aggregation Method (method used to aggregate statistics over the sliding
window): The choice of the workload aggregation method also influences the adaptation
sensitivity. For workload aggregation over a sliding time window of length ∆w that
contains statistics (equidistant time series) of n plan instances, our statistic estimator uses
the following four aggregation methods to compute the one-step-ahead forecast
at timestamp t; we assume that this estimate stays constant for the next optimization
interval. As an example, we illustrate the aggregation of operator execution times W(o_j):
• Moving Average (MA):
\[ MA_t = \frac{1}{n} \sum_{i=1}^{n} W_i(o_j) \tag{3.10} \]
• Weighted Moving Average (WMA):
\[ WMA_t = \frac{\sum_{i=1}^{n} \left( w_i \cdot W_i(o_j) \right)}{\sum_{i=1}^{n} w_i} = \frac{\sum_{i=1}^{n} \left( w_i \cdot W_i(o_j) \right)}{\frac{n \cdot (n+1)}{2}} \quad \text{with } w_i = i \tag{3.11} \]
• Exponential Moving Average (EMA):
\[ EMA_1 = W_i(o_j), \qquad EMA_t = EMA_{t-1} + \alpha \cdot \left( W_i(o_j) - EMA_{t-1} \right) \quad \text{with } \alpha = 0.05,\ 1 \le i \le n \tag{3.12} \]
• Linear Regression (LR):
\[ LR_t = a + b \cdot x \quad \text{with } x = n + 1 \]
\[ a = \frac{1}{n} \sum_{i=1}^{n} W_i(o_j) - b \cdot \frac{1}{n} \sum_{i=1}^{n} i = \frac{1}{n} \sum_{i=1}^{n} W_i(o_j) - b \cdot \frac{n+1}{2} \]
\[ b = \frac{n \sum_{i=1}^{n} \left( i \cdot W_i(o_j) \right) - \sum_{i=1}^{n} i \cdot \sum_{i=1}^{n} W_i(o_j)}{n \sum_{i=1}^{n} i^2 - \left( \sum_{i=1}^{n} i \right)^2} = \frac{\sum_{i=1}^{n} \left( i \cdot W_i(o_j) \right) - \frac{n+1}{2} \sum_{i=1}^{n} W_i(o_j)}{\frac{1}{12} (n^3 - n)}. \tag{3.13} \]
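The four aggregation methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimator's actual implementation: the function names are hypothetical, the input list is assumed to be ordered from oldest (i = 1) to newest (i = n), the EMA is assumed to be seeded with the first observation in the window, and the single-observation fallback in the regression is an added guard against division by zero.

```python
def moving_average(w):
    """MA (3.10): unweighted mean over the window."""
    return sum(w) / len(w)

def weighted_moving_average(w):
    """WMA (3.11): weights w_i = i, so newer observations count more."""
    n = len(w)
    weighted_sum = sum(i * w_i for i, w_i in enumerate(w, start=1))
    return weighted_sum / (n * (n + 1) / 2)

def exponential_moving_average(w, alpha=0.05):
    """EMA (3.12): seeded with the first observation (assumption of this sketch)."""
    ema = w[0]
    for w_i in w[1:]:
        ema += alpha * (w_i - ema)
    return ema

def linear_regression_forecast(w):
    """LR (3.13): least-squares line over i = 1..n, evaluated at x = n + 1."""
    n = len(w)
    if n < 2:
        return w[0]  # guard added here: the slope is undefined for n = 1
    total = sum(w)
    weighted_sum = sum(i * w_i for i, w_i in enumerate(w, start=1))
    b = (weighted_sum - (n + 1) / 2 * total) / ((n**3 - n) / 12)
    a = total / n - b * (n + 1) / 2
    return a + b * (n + 1)

# Hypothetical operator execution times (ms) with a steady upward trend.
times = [10.0, 12.0, 14.0, 16.0]
ma = moving_average(times)
wma = weighted_moving_average(times)
ema = exponential_moving_average(times)
lr = linear_regression_forecast(times)
print(ma, wma, ema, lr)
```

For this trending input, MA yields 13, WMA yields 14, and LR extrapolates the trend to 18, while EMA with α = 0.05 stays close to the first observation. This matches the ordering of adaptation sensitivity one would expect from the formulas.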