25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


3 Fundamentals of Optimizing Integration Flows

into optimization. Finally, the Action operator exhibits abstract costs of the form $|ds_{in}| + |ds_{out}|$. However, the Action operator is not included in any rewriting (except for parallelism) because it executes arbitrary code snippets and is thus treated as a black box by the optimizer.

2: Execution Times: In a second step, we monitor statistics (e.g., execution times and cardinalities) in order to weight the abstract costs of interaction- and dataflow-oriented operators. To estimate the costs of a newly created plan $P'$, we aggregate the abstract costs $C(o'_i)$ and $C(o_i)$ of the single operators, weighted with the execution statistics $W(o_i)$ of the current plan $P$. Thus, we estimate missing statistics with

$$\hat{W}(o'_i) = \frac{C(o'_i)}{C(o_i)} \cdot W(o_i). \tag{3.1}$$
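
As a minimal sketch of Equation (3.1), the following Python function scales the monitored statistic of an existing operator by the ratio of abstract costs. The function name, the parameter names, and the numbers in the usage example are hypothetical and serve illustration only; they are not taken from the thesis.

```python
def estimate_weighted_cost(abstract_cost_new, abstract_cost_old, weighted_cost_old):
    """Estimate the missing statistic of a rewritten operator o'_i, Eq. (3.1).

    abstract_cost_new -- C(o'_i), abstract cost of the operator in the new plan P'
    abstract_cost_old -- C(o_i), abstract cost of the operator in the current plan P
    weighted_cost_old -- W(o_i), monitored execution statistic of o_i in P
    """
    if abstract_cost_old <= 0:
        # The ratio C(o'_i) / C(o_i) is undefined for a non-positive denominator.
        raise ValueError("abstract cost of the monitored operator must be positive")
    return abstract_cost_new / abstract_cost_old * weighted_cost_old


# Hypothetical example: the monitored operator has abstract cost 200 and a
# measured execution time of 50.0 ms; the rewritten operator has abstract cost 100.
estimate = estimate_weighted_cost(100, 200, 50.0)
print(estimate)  # 25.0
```

Halving the abstract cost halves the estimated execution time, which is exactly the proportionality that Equation (3.1) assumes.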

For control-flow-oriented operators, we directly estimate the execution time. The costs for the complex control-flow-oriented Switch operator can be computed by

$$\sum_{i=1}^{n} \left( P(path_i) \cdot \left( \sum_{j=1}^{i} W(expr_{path_j}) + \sum_{k=1}^{m_i} W(o_{i,k}) \right) \right), \tag{3.2}$$

where we require the switch path probabilities $P(path_i)$ (relative frequencies) for all $n$ paths, the weighted costs $W(expr_{path_j})$ of path expression evaluation, because evaluating these expressions (e.g., XPath) can be cost-intensive, and the weighted costs $W(o_{i,k})$ of the $m_i$ operators of each path. Here, the second summation runs only up to $j = i$ because the evaluation is aborted as soon as a condition evaluates to true, owing to the if-elseif-else semantics of this operator. Similarly, the costs of the complex Fork operator (concurrent subflows of arbitrary operators) are determined by the most time-consuming subflow:

$$\max_{i=1}^{n} \left( \sum_{j=1}^{m_i} W(o_{i,j}) + i \cdot W(\text{Start Thread}) \right), \tag{3.3}$$

where $W(\text{Start Thread})$ denotes a constant that represents the time required to create and start a thread. When computing the costs for the Iteration operator, with

$$r \cdot \sum_{i=1}^{n} W(o_i), \tag{3.4}$$

the average number of loop iterations $r$ is required as well. Further, the waiting time of the Delay operator is also taken into account. Finally, the costs of the Signal operator (needed for raising an exception) are represented as a constant.
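
The cost formulas for the Switch, Fork, and Iteration operators, Equations (3.2) to (3.4), can be sketched in Python as follows. This is an illustrative sketch rather than the implementation used in the thesis: all function names, the tuple layout of a switch path, and the sample numbers are assumptions. In particular, the else path is modeled here with an expression cost of zero.

```python
def switch_cost(paths):
    """Expected cost of a Switch operator, Eq. (3.2).

    paths -- list of (probability, expr_cost, operator_costs) tuples in
             evaluation order (hypothetical layout, not from the thesis).
    """
    total = 0.0
    expr_prefix = 0.0
    for probability, expr_cost, operator_costs in paths:
        # Path i is only reached after expressions 1..i have been evaluated
        # (if-elseif-else semantics), hence the running prefix sum.
        expr_prefix += expr_cost
        total += probability * (expr_prefix + sum(operator_costs))
    return total


def fork_cost(subflows, start_thread_cost):
    """Cost of a Fork operator, Eq. (3.3): the most time-consuming subflow.

    Subflow i (1-based) is charged i thread starts, as in the formula.
    """
    return max(
        sum(operator_costs) + i * start_thread_cost
        for i, operator_costs in enumerate(subflows, start=1)
    )


def iteration_cost(avg_loops, operator_costs):
    """Cost of an Iteration operator, Eq. (3.4): r times the body costs."""
    return avg_loops * sum(operator_costs)


# Hypothetical weighted costs in milliseconds.
paths = [
    (0.5, 2.0, [10.0]),   # if
    (0.25, 2.0, [20.0]),  # elseif
    (0.25, 0.0, [4.0]),   # else (no expression to evaluate, an assumption)
]
print(switch_cost(paths))                                 # 14.0
print(fork_cost([[5.0, 5.0], [8.0], [1.0, 2.0]], 1.0))    # 11.0
print(iteration_cost(3.0, [2.0, 4.0]))                    # 18.0
```

In the Switch example, the elseif path pays for both expressions (4.0 ms) before its operators run, which is why rarely taken paths late in the evaluation order contribute their full expression prefix to the expected cost.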

Putting it all together, this cost model has several fundamental properties. Some of these properties are used in different chapters of this thesis.

• Self-Adjustment: Due to the weighting with monitored execution times, the cost model is self-adjusting with regard to the behavior of different operators under changing workload characteristics. Thus, the cost model adjusts itself to the present environment (hardware platform, behavior of external systems). In particular, an empirical cost model that is based only on cardinalities could not take into account the behavior of external systems, the different queries issued to these systems, and thus also the network properties.
