25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...

3 Fundamentals of Optimizing Integration Flows

current sliding window used for cost estimation. Increasing the sliding window size ∆w results in slower adaptation because statistics are computed over longer time intervals. For example, with average-based aggregation methods, a long sliding time window adapts slowly because the most recent statistic tuples have little overall influence. As a result, the parameterization is a trade-off between the robustness of estimates and fast adaptation to changing workload characteristics. A short time window gives single outliers high influence (fast adaptation but not robust), while a long time window takes long histories into account (robust but slow adaptation).
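The trade-off can be made concrete with a small Python sketch. It is only an illustration under assumptions: the helper `windowed_average` and the execution times are hypothetical, and the window is counted in observations rather than in time units.

```python
from statistics import mean


def windowed_average(series, window):
    """Average of the last `window` observations (hypothetical helper)."""
    return mean(series[-window:])


# Hypothetical execution times: stable at 10 ms, then a shift to 20 ms.
shifted = [10.0] * 20 + [20.0] * 5
short_after_shift = windowed_average(shifted, window=5)    # only post-shift values
long_after_shift = windowed_average(shifted, window=25)    # dominated by old values

# A single outlier has the opposite effect: it distorts the short window more.
noisy = [10.0] * 24 + [100.0]
short_with_outlier = windowed_average(noisy, window=5)
long_with_outlier = windowed_average(noisy, window=25)

print(short_after_shift, long_after_shift)
print(short_with_outlier, long_with_outlier)
```

In this made-up series the short window has fully adapted to the new level while the long window has barely moved, and the outlier pulls the short-window estimate much further from the typical value than the long-window estimate.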

Optimization Interval ∆t: Increasing ∆t also causes slower adaptation because no re-optimization (and thus no re-estimation) is initiated during the optimization interval. The longer the optimization interval, the longer we rely on historic estimates. This parameter affects only the number of estimation points, not the estimation itself. Thus, it is also a trade-off between the costs of re-optimization and fast adaptation to changing workload characteristics.
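The effect of the optimization interval can be sketched as follows. This is a simplified illustration, not the estimator described here: the function `held_estimates`, the step-based clock, and the sample series are all assumptions, and a plain average stands in for the aggregation method.

```python
def held_estimates(series, delta_t, window):
    """Re-estimate every `delta_t` steps and hold the value in between.

    Hypothetical helper: time is counted in steps, and the estimate is a
    plain average over the last `window` observations.
    """
    estimates = []
    current = None
    reoptimizations = 0
    for t in range(len(series)):
        if t % delta_t == 0:
            recent = series[max(0, t - window + 1): t + 1]
            current = sum(recent) / len(recent)
            reoptimizations += 1
        estimates.append(current)
    return estimates, reoptimizations


# Hypothetical workload that shifts from 10 ms to 20 ms at step 10.
series = [10.0] * 10 + [20.0] * 10
est_fast, n_fast = held_estimates(series, delta_t=2, window=4)
est_slow, n_slow = held_estimates(series, delta_t=8, window=4)

print(n_fast, est_fast[15])
print(n_slow, est_slow[15])
```

With the same window, the longer interval triggers fewer re-optimizations but still reports the pre-shift estimate at step 15, which mirrors the point that ∆t changes only when estimates are taken and not how they are computed.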

Workload Aggregation Method (method used to aggregate statistics over the sliding window): The choice of the workload aggregation method also influences the adaptation sensitivity. For workload aggregation over a sliding time window of length ∆w, which contains statistics (equi-distant time series) of n plan instances, our statistic estimator uses the following four aggregation methods to compute the one-step-ahead forecast at timestamp t; we assume that this estimate stays constant for the next optimization interval. As an example, we illustrate the aggregation of operator execution times W(o_j):

• Moving Average (MA):

$$\mathit{MA}_t = \frac{1}{n} \sum_{i=1}^{n} W_i(o_j) \qquad (3.10)$$

• Weighted Moving Average (WMA):

$$\mathit{WMA}_t = \frac{\sum_{i=1}^{n} \left( w_i \cdot W_i(o_j) \right)}{\sum_{i=1}^{n} w_i} = \frac{2 \sum_{i=1}^{n} \left( w_i \cdot W_i(o_j) \right)}{n \cdot (n+1)} \quad \text{with } w_i = i \qquad (3.11)$$

• Exponential Moving Average (EMA):

$$\mathit{EMA}_1 = W_i(o_j)$$

$$\mathit{EMA}_t = \mathit{EMA}_{t-1} + \alpha \cdot \left( W_i(o_j) - \mathit{EMA}_{t-1} \right) \quad \text{with } \alpha = 0.05,\ 1 \le i \le n \qquad (3.12)$$

• Linear Regression (LR):

$$\mathit{LR}_t = a + b \cdot x \quad \text{with } x = n + 1$$

$$a = \frac{1}{n} \sum_{i=1}^{n} W_i(o_j) - b \cdot \frac{1}{n} \sum_{i=1}^{n} i = \frac{1}{n} \sum_{i=1}^{n} W_i(o_j) - b \cdot \frac{n+1}{2}$$

$$b = \frac{n \sum_{i=1}^{n} \left( i \cdot W_i(o_j) \right) - \sum_{i=1}^{n} i \cdot \sum_{i=1}^{n} W_i(o_j)}{n \sum_{i=1}^{n} i^2 - \left( \sum_{i=1}^{n} i \right)^2} = \frac{\sum_{i=1}^{n} \left( i \cdot W_i(o_j) \right) - \frac{n+1}{2} \sum_{i=1}^{n} W_i(o_j)}{\frac{1}{12} \left( n^3 - n \right)} \qquad (3.13)$$
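The four aggregation methods can be sketched in Python as below. This is a minimal illustration rather than the estimator's actual implementation: the function names are hypothetical, the input `w` is assumed to hold the n observed execution times W_1(o_j), ..., W_n(o_j) from oldest to newest, the EMA is assumed to start from the oldest observation in the window, and the fallback for n = 1 in the regression is an added assumption because the slope is undefined there.

```python
def moving_average(w):
    """MA: plain mean over the window."""
    return sum(w) / len(w)


def weighted_moving_average(w):
    """WMA with weights w_i = i, so the weight sum is n * (n + 1) / 2."""
    n = len(w)
    weighted = sum(i * x for i, x in enumerate(w, start=1))
    return weighted / (n * (n + 1) / 2)


def exponential_moving_average(w, alpha=0.05):
    """EMA, started from the oldest observation (an assumption)."""
    ema = w[0]
    for x in w[1:]:
        ema += alpha * (x - ema)
    return ema


def linear_regression_forecast(w):
    """LR: least-squares line over positions 1..n, evaluated at x = n + 1."""
    n = len(w)
    if n < 2:
        return w[0]  # assumption: with one point, fall back to that value
    total = sum(w)
    weighted = sum(i * x for i, x in enumerate(w, start=1))
    b = (weighted - (n + 1) / 2 * total) / ((n ** 3 - n) / 12)
    a = total / n - b * (n + 1) / 2
    return a + b * (n + 1)


# Hypothetical window with a steady upward trend.
window = [10.0, 12.0, 14.0, 16.0]
ma = moving_average(window)
wma = weighted_moving_average(window)
ema = exponential_moving_average(window)
lr = linear_regression_forecast(window)
print(ma, wma, ema, lr)
```

On this trending window the methods order themselves by how strongly they follow recent values: the EMA with α = 0.05 stays closest to the oldest observation, the MA sits at the window mean, the WMA leans toward newer values, and only the LR extrapolates the trend beyond the last observation.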
