Cost-Based Optimization of Integration Flows - Datenbanken ...
3 Fundamentals of Optimizing Integration Flows
period is longer than the sliding time window size (∆t ≥ ∆w). In detail, we aggregated
700,000 statistics (execution times W(o_i) only) and observed that all statistics were
aggregated in less than 20 ms. The individual aggregation methods differ only slightly in
execution time, with MA being the fastest.
If ∆t < ∆w or no ∆w is used, incremental statistics maintenance is required. Thus, we
repeated the experiment with our incremental aggregation methods. Comparing full
and incremental maintenance, the incremental methods are a factor of 1.5 to
3 slower than the full methods because they require additional computation to
produce valid intermediate results and incur many method invocations. EMA is the fastest
incremental method because it is incremental by definition. Our Estimator comprises all of these
aggregation methods and some additional infrastructure functionality, and it uses the
incremental EMA as the default aggregation method. The maintenance of all three statistics
(|ds_in|, |ds_out|, and W(o_i)) for all plan instances of the test set (2,100,000 statistic tuples)
using our Estimator is shown as Estimator (EMA). In conclusion, the overhead for
statistics maintenance during the full comparison scenario was 106 ms. This is negligible
(roughly 0.001%) compared to the cumulative execution time of 140 min in the optimized case.
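The difference between full and incremental maintenance can be illustrated with a minimal Python sketch. It is not the Estimator implementation: the names `full_ma`, `IncrementalMA`, and `IncrementalEMA`, the smoothing factor `alpha`, and the count-based window (used here instead of a sliding time window ∆w) are all assumptions made for illustration.

```python
from collections import deque


def full_ma(values):
    """Full aggregation: a single pass over all statistics of the window."""
    return sum(values) / len(values)


class IncrementalMA:
    """Moving average over the last `size` values (hypothetical count-based window)."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.total = 0.0

    def update(self, value):
        # Add the new statistic and evict the oldest one if the window is full.
        self.window.append(value)
        self.total += value
        if len(self.window) > self.size:
            self.total -= self.window.popleft()
        # Every call returns a valid intermediate result.
        return self.total / len(self.window)


class IncrementalEMA:
    """Exponential moving average; only the current estimate is stored."""

    def __init__(self, alpha):
        self.alpha = alpha      # assumed smoothing factor, 0 < alpha <= 1
        self.estimate = None

    def update(self, value):
        if self.estimate is None:
            self.estimate = value
        else:
            self.estimate = self.alpha * value + (1 - self.alpha) * self.estimate
        return self.estimate
```

The full method computes one result per window, whereas each incremental method is invoked once per statistic and must return a usable value every time, which is consistent with the extra per-call work described above. The EMA keeps no window at all, so its update is a single arithmetic step.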
Workload Adaptation<br />
Because workload characteristics change, the sensitivity of workload adaptation is highly
important. According to Subsection 3.3.3, there are three ways to influence the
sensitivity of workload adaptation: (1) the workload sliding time window size ∆w, (2) the
optimization period ∆t, and (3) the workload aggregation method Agg. We evaluated
their influence in the following series of experiments.
Figure 3.27: Workload Adaptation Delays
Figure 3.27 shows the results of an experiment in which we executed n = 20,000 instances
of plan P_3 and a modified plan P_3′ (with eager group-by) with disabled periodical reoptimization.
After n = 5,000 and n = 15,000 instances, we changed the cardinality of
one of the two input data sets (workload changes WC1 and WC2). While the eager group-by
was most efficient in the first part, the simple join and group-by performed better after
WC1. We fixed a sliding window size of ∆w = 5,000 s and used MA as the workload aggregation
method. It took 2,100 plan instances to adapt to the workload shift, at which point the plan changed
(PC1, at the break-even point between the estimated plan costs). This adaptation delay depends
on the sliding time window size ∆w and the aggregation method used.
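Why the delay depends on the window size and the aggregation method can be sketched with a small simulation. All values below are synthetic and do not reproduce the experiment: the helper names (`adaptation_delay`, `make_ma`, `make_ema`), the count-based window, the smoothing factor `alpha`, and the single break-even threshold on one monitored statistic are assumptions made for illustration.

```python
from collections import deque


def make_ma(window_size):
    """Moving average over the last `window_size` observations (assumed count-based)."""
    window = deque()
    total = 0.0

    def update(value):
        nonlocal total
        window.append(value)
        total += value
        if len(window) > window_size:
            total -= window.popleft()
        return total / len(window)

    return update


def make_ema(alpha):
    """Exponential moving average with an assumed smoothing factor `alpha`."""
    estimate = None

    def update(value):
        nonlocal estimate
        if estimate is None:
            estimate = value
        else:
            estimate = alpha * value + (1 - alpha) * estimate
        return estimate

    return update


def adaptation_delay(update, old_value, new_value, break_even, warmup, limit=100_000):
    """Number of instances after a step change until the estimate reaches break_even."""
    # Fill the estimator with the statistic observed before the workload change.
    for _ in range(warmup):
        update(old_value)
    # Feed the changed statistic and count instances until the plan would switch.
    for k in range(1, limit + 1):
        if update(new_value) >= break_even:
            return k
    return None  # no switch within `limit` instances


if __name__ == "__main__":
    for size in (1000, 2000):
        delay = adaptation_delay(make_ma(size), 100.0, 200.0, 150.0, warmup=size)
        print(f"MA  window={size}: delay={delay}")
    delay = adaptation_delay(make_ema(0.01), 100.0, 200.0, 150.0, warmup=1000)
    print(f"EMA alpha=0.01: delay={delay}")
```

With these synthetic values, the moving average reaches the break-even point only after half of the window (500 of 1,000 observations) has been replaced by post-change values, and doubling the window doubles the delay. The EMA weights recent observations more heavily and therefore crosses the same threshold after far fewer instances.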