Cost-Based Optimization of Integration Flows - Datenbanken ...

5 Multi-Flow Optimization

decreasing relative improvement because it mainly benefits from the reduced costs for the Switch operator. However, this operator is executed after several Selection operators that significantly reduce the amount of input data, which led to this almost constant absolute improvement. Similarly, plan P_7 also shows a decreasing relative improvement because the size of the external data was not changed. In conclusion, the scalability with increasing data size strongly depends on how a plan benefits from partitioning.

Second, we investigated the scalability with increasing batch sizes k′. Here, we reused our example plans P_1, P_2, P_5, and P_7 and the workload configuration from the previous experiment. For each example plan, we compared multi-flow optimization with no optimization, varying the batch size k′ ∈ {1, 10, 20, 30, 40, 50, 60, 70}. Figure 5.17 shows the results of this experiment. Essentially, we make three major observations. First, the overhead for executing single-message partitions is marginal. Although MFO theoretically cannot decrease performance, there is some overhead due to horizontal partitioning at the inbound side and additional message abstraction layers. However, the experiments show that this overhead is negligible. As a result, multi-flow optimization is robust in the sense that it ensures predictable performance even in special cases where we do not benefit from partitioning. Second, the theoretical results, namely that the total execution time is monotonically non-increasing with increasing batch size and that it has a lower bound, also hold under experimental evaluation. Third, we observe that the lower this lower bound (i.e., the higher the optimization potential), the larger the batch size required before the execution time asymptotically approaches it (e.g., compare Figure 5.17(a) and Figure 5.17(c)). This effect is caused by the higher relative amount of time that is logically shared among messages.
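
The second and third observations can be illustrated with a small sketch. It assumes a simple linear cost model that is not taken from the evaluation itself: each plan instance pays a cost `shared_cost` once per batch plus `per_message_cost` for every message in the batch. All function and parameter names are hypothetical.

```python
import math


def total_time(n, shared_cost, per_message_cost, k):
    """Total time to process n messages in batches of at most k messages.

    Assumed (hypothetical) cost model: every batch pays shared_cost once,
    and every message pays per_message_cost.
    """
    num_batches = (n + k - 1) // k  # integer ceiling of n / k
    return num_batches * shared_cost + n * per_message_cost


def min_batch_size_within(shared_cost, per_message_cost, eps):
    """Smallest k whose per-message time shared_cost / k + per_message_cost
    is at most (1 + eps) times the lower bound per_message_cost."""
    return max(1, math.ceil(shared_cost / (eps * per_message_cost)))


if __name__ == "__main__":
    batch_sizes = [1, 10, 20, 30, 40, 50, 60, 70]
    for k in batch_sizes:
        print(k, total_time(700, shared_cost=10, per_message_cost=1, k=k))
    # A larger shared share (lower relative bound) needs a larger batch size.
    print(min_batch_size_within(10, 1, eps=0.5))
    print(min_batch_size_within(90, 1, eps=0.5))
```

Under this assumed model, the total time never increases with k′ and never drops below `n * per_message_cost`, and the batch size needed to come within a given factor of that bound grows with the ratio of shared to per-message cost.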

Execution Time

Until now, we have evaluated the optimization benefit and scalability of multi-flow optimization using restricted batch sizes k′. In this subsection, we investigate in detail the interdependencies between arbitrary message arrival rates R, waiting times Δt_w, partitioning attribute selectivities sel, and the resulting batch size k′. In addition, we evaluate the effects of the resulting batch size k′ on the plan execution time W(P′, k′).
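
As a rough illustration of how these parameters could interact, the sketch below estimates the batch size as the product of arrival rate, waiting time, and selectivity. This relation is an assumption made for illustration only (constant arrival rate, uniformly distributed partitioning attribute values); the function name and the example values are hypothetical and not measurements from the experiments.

```python
def expected_batch_size(arrival_rate, waiting_time, selectivity):
    """Estimated number of messages collected in one partition.

    Assumption (not from the source): messages arrive at a constant rate
    arrival_rate (messages per second), a partition is collected for
    waiting_time seconds, and a fraction `selectivity` of all messages
    shares the same partitioning attribute value.
    """
    return arrival_rate * waiting_time * selectivity


if __name__ == "__main__":
    # Hypothetical values: 10 msg/s, 2 s waiting time, selectivity 0.5.
    print(expected_batch_size(10, 2, 0.5))
```

Under this assumption, doubling the waiting time or the arrival rate doubles the estimated batch size, while a more selective partitioning attribute (smaller sel) shrinks it.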

Figure 5.18: Execution Time W(P′_2, k′) with Varying Batch Size k′. Panels: (a) Execution Time W(P′_2, k′); (b) Relative Execution Time W(P′_2, k′)/k′.

First, we evaluated the execution time of the resulting plan instance for message partitions compared to the unoptimized execution. We executed instances of plan P_2 with varying batch size k′ ∈ [1, 20], measuring the total execution time W(P′_2, k′) (Figure 5.18(a)) and computing the relative execution time W(P′_2, k′)/k′ (Figure 5.18(b)).
