25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


5.3 Periodical Re-Optimization

benefit from a partitioning attribute (e.g., for writing interactions of an Invoke operator). However, in some cases (e.g., operations on externally loaded data), all operators can benefit from partitioning as well because they are inherently executed only once and just expanded to the batch size if required (e.g., by the first binary operator that receives a message partition as one of its inputs). For example, when we load data once for a partition of messages, we can also execute any subsequent transformation of this loaded data only once.

For operators that do not benefit from partitioning, the abstract costs are computed by $C(o'_i, k') = C(o_i) \cdot k'$, and the execution time can be computed by $W(o'_i, k') = W(o_i) \cdot k'$ or, equivalently, by $W(o'_i, k') = W(o_i) \cdot C(o'_i, k') / C(o_i)$. Finally, if $k' = 1$, we get the instance-based costs with $C(o'_i, k') = C(o_i)$ and $W(o'_i, k') = W(o_i)$. Thus, instance-based execution is a special case of the execution of horizontally partitioned message batches. As a result, in theory, partitioning cannot decrease the performance of an operator.
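The two formulas above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the `Operator` class, its field names, and the `benefits` flag are hypothetical, and the cost of a benefiting operator is assumed to stay constant in $k'$, as Example 5.7 does for $o_1$, $o_2$, and $o_3$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Hypothetical container for the per-message statistics of one operator."""
    name: str
    cost: float     # abstract cost C(o_i) for a single message (must be > 0)
    time: float     # monitored execution time W(o_i) for a single message, in seconds
    benefits: bool  # True if the operator is executed only once per partition


def partition_cost(op: Operator, k: int) -> float:
    """Abstract cost C(o'_i, k') for a partition of k messages."""
    if k < 1:
        raise ValueError("a partition contains at least one message")
    # Assumption: a benefiting operator keeps its single-message cost,
    # every other operator scales linearly with the partition size.
    return op.cost if op.benefits else op.cost * k


def partition_time(op: Operator, k: int) -> float:
    """Execution time W(o'_i, k') = W(o_i) * C(o'_i, k') / C(o_i)."""
    return op.time * partition_cost(op, k) / op.cost


if __name__ == "__main__":
    # Hypothetical abstract cost of 2.0; the time is W(o_4) from Figure 5.11.
    o4 = Operator("o4", cost=2.0, time=0.055, benefits=False)
    for k in (1, 10):
        print(k, partition_time(o4, k))
```

For $k' = 1$ both functions return the instance-based values, which mirrors the observation that instance-based execution is a special case of partitioned execution.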

In addition to the operators mentioned above, there are further operators that might also benefit from partitioning. Examples of these are the Join, Selection, and Groupby operators. However, because we partition complete messages (with tree-structured data), partitioning applies to these operators only in specific cases, where a message has only a single tuple (to which the value of the partitioning attribute refers). Hence, we do not consider these operators because the possible benefit is strongly limited. Nevertheless, these operators could be included with benefit if streaming of message parts (e.g., a part for each tuple) [PVHL09a, PVHL09b] is applied because we could then execute Join, Selection, and Groupby operators efficiently on whole batches of these parts. We use our example plan $P_2$ to illustrate the overall cost estimation in detail.

Example 5.7 (Extended Cost Estimation). Recall the rewritten plan $P'_2$ (Figure 5.5) and assume $k'$ messages per message partition. Using the extended cost model, we can estimate the execution time $W(P'_2, k')$. The monitored average execution times $W(o_i)$ are shown in the table in Figure 5.11. Now, we compute $W(P'_2, k')$ as follows:

\[
\begin{aligned}
W(P'_2, k') &= \sum_{i=1}^{m} W(o'_i, k') = W(o_1) + W(o_2) + W(o_3) + W(o_4) \cdot k' + W(o_5) \cdot k' + W(o_6) \cdot k' \\
&= W(o_1) + W(o_2) + W(o_3) + \bigl(W(o_4) + W(o_5) + W(o_6)\bigr) \cdot k'
\end{aligned}
\]

The operators $o_1$, $o_2$, and $o_3$ benefit from partitioning; hence, we assign them costs that are independent of $k'$, while the costs of operators $o_4$, $o_5$, and $o_6$ increase linearly with $k'$. Using this cost function of $P_2$ we can estimate the execution time for an arbitrary number

Operator $o_i$    Execution Time $W(o_i)$
$o_1$             0.01 s
$o_2$             0.015 s
$o_3$             0.3 s
$o_4$             0.055 s
$o_5$             0.02 s
$o_6$             0.13 s
$P$               0.53 s

Figure 5.11: Relative Execution Time $W(P'_2, k')/k'$
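As a worked illustration, the monitored times from the table can be substituted into the cost function of Example 5.7. The partition size $k' = 10$ below is chosen only for illustration and does not come from the text.

\[
\begin{aligned}
W(P'_2, k') &= (0.01 + 0.015 + 0.3)\,\text{s} + (0.055 + 0.02 + 0.13)\,\text{s} \cdot k' = 0.325\,\text{s} + 0.205\,\text{s} \cdot k' \\
\frac{W(P'_2, k')}{k'} &= \frac{0.325\,\text{s}}{k'} + 0.205\,\text{s} \\
k' = 1: \quad \frac{W(P'_2, 1)}{1} &= 0.53\,\text{s} \\
k' = 10: \quad \frac{W(P'_2, 10)}{10} &= 0.0325\,\text{s} + 0.205\,\text{s} = 0.2375\,\text{s}
\end{aligned}
\]

For $k' = 1$ the result equals the total of 0.53 s listed for $P$ in the table. As $k'$ grows, the relative execution time approaches 0.205 s, the combined time of the operators that do not benefit from partitioning.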
