Cost-Based Optimization of Integration Flows - Datenbanken ...
5 Multi-Flow Optimization<br />
decreasing relative improvement because it mainly benefits from the reduced costs for the<br />
Switch operator. However, this operator is executed after several Selection operators<br />
that significantly reduced the amount of input data, which led to this almost constant<br />
absolute improvement. Similarly, plan P7 also shows a decreasing relative improvement<br />
because the size of external data was not changed. In conclusion, the scalability with<br />
increasing data size strongly depends on how a plan benefits from partitioning.<br />
Second, we investigated the scalability with increasing batch size k′. We reused<br />
our example plans P1, P2, P5, and P7 and the workload configuration from the previous<br />
experiment. For each example plan, we compared multi-flow optimization with no optimization<br />
while varying the batch size k′ ∈ {1, 10, 20, 30, 40, 50, 60, 70}. Figure 5.17 shows<br />
the results of this experiment. Essentially, we make three major observations. First, the<br />
overhead for executing single-message partitions is marginal. Although MFO<br />
theoretically cannot decrease the performance, there is some overhead due to horizontal<br />
partitioning at the inbound side and additional message abstraction layers. However, the<br />
experiments show that this overhead is negligible. As a result, multi-flow optimization<br />
is robust in the sense that it ensures predictable performance even in special cases where<br />
we do not benefit from partitioning. Second, the theoretically derived properties, namely<br />
that the total execution time is monotonically non-increasing with increasing batch size and<br />
that a lower bound of the total execution time exists, also hold under experimental evaluation.<br />
Third, we observe that the lower this lower bound (i.e., the higher the optimization potential),<br />
the larger the batch size required before the execution time asymptotically approaches this lower bound<br />
(e.g., compare Figure 5.17(a) and Figure 5.17(c)). This effect is explained by the higher<br />
relative amount of time that is logically shared among messages.<br />
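The second and third observations can be illustrated with a minimal Python sketch. It assumes a simple linear cost model that is not taken from the experiments: every message partition pays a fixed cost once (`shared_cost`, the time logically shared among its messages), and every message pays `per_message_cost`. The function names, parameters, and the model itself are hypothetical and only serve to show why the total time cannot increase with k′ and why it is bounded from below.

```python
import math


def total_execution_time(n_messages, batch_size, shared_cost, per_message_cost):
    """Total time for n_messages processed in partitions of at most batch_size.

    Assumed (hypothetical) linear model: each partition pays shared_cost once,
    and each message pays per_message_cost regardless of the partition size.
    """
    if n_messages < 0 or batch_size < 1:
        raise ValueError("need n_messages >= 0 and batch_size >= 1")
    # The number of partitions shrinks (or stays equal) as batch_size grows.
    num_partitions = math.ceil(n_messages / batch_size)
    return num_partitions * shared_cost + n_messages * per_message_cost


def lower_bound(n_messages, per_message_cost):
    """Cost that cannot be shared in the assumed model, whatever the batch size."""
    return n_messages * per_message_cost


if __name__ == "__main__":
    # Batch sizes as in the experiment; the cost values are arbitrary placeholders.
    for k in (1, 10, 20, 30, 40, 50, 60, 70):
        print(k, total_execution_time(700, k, shared_cost=8, per_message_cost=2))
```

Under this assumed model, a larger `shared_cost` relative to `per_message_cost` means a lower bound that is further below the unpartitioned time, and the shared term `ceil(n / k′) * shared_cost` stays noticeable up to larger k′, which is in line with the third observation.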
Execution Time<br />
Until now, we have evaluated the optimization benefit and scalability of multi-flow optimization<br />
using restricted batch sizes k′. In this subsection, we investigate in detail the<br />
interdependencies between arbitrary message arrival rates R, waiting times ∆tw, partitioning<br />
attribute selectivities sel, and the resulting batch size k′. In addition, we evaluate the<br />
effects of the resulting batch size k′ on the plan execution time W(P′, k′).<br />
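To make the relationship between these parameters concrete, the following Python sketch uses a first-order estimate that is an assumption rather than a formula stated in this section: if messages arrive at rate R, are collected for a waiting time ∆tw, and a fraction sel of them falls into one partition, then roughly R · ∆tw · sel messages end up in that partition. The function name, the units (messages per second, seconds), and the clamping to at least one message are hypothetical choices for illustration.

```python
def expected_batch_size(arrival_rate, waiting_time, selectivity):
    """Rough estimate of the batch size k' of one message partition.

    Assumptions (not taken from the source): arrival_rate is in messages per
    second, waiting_time is in seconds, and selectivity is the fraction of
    arriving messages that share the partitioning attribute value.
    """
    if arrival_rate < 0 or waiting_time < 0:
        raise ValueError("arrival_rate and waiting_time must be non-negative")
    if not 0 <= selectivity <= 1:
        raise ValueError("selectivity must be in [0, 1]")
    estimate = arrival_rate * waiting_time * selectivity
    # An executed partition contains at least one message, so clamp at 1.
    return max(1.0, estimate)
```

For example, `expected_batch_size(10, 2, 0.5)` estimates a partition of ten messages; doubling the waiting time doubles the estimate, while a very low arrival rate falls back to single-message partitions.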
Figure 5.18: Execution Time W(P′2, k′) with Varying Batch Size k′. Panels: (a) Execution Time W(P′2, k′); (b) Relative Execution Time W(P′2, k′)/k′.<br />
First, we evaluated the execution time of the resulting plan instance for message partitions<br />
compared to the unoptimized execution. We executed instances of plan P2 with<br />
varying batch size k′ ∈ [1, 20], measured the total execution time W(P′2, k′)<br />
(Figure 5.18(a)), and computed the relative execution time W(P′2, k′)/k′ (Figure 5.18(b)).<br />