25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...

4.2 Plan Vectorization

[Figure 4.3: Temporal Aspects of Instance-Based and Vectorized Plans. Panel (a), Instance-Based Execution of Plan P: the plan instances p1 and p2 each run the operators o1 to o6 along the time axis t, with the time points ordered t0(p1), t1(p1), t0(p2), t1(p2). Panel (b), Fully Vectorized Execution of Plan P′: the time points are ordered t0(p1), t0(p2), t1(p1), t1(p2), and the figure is annotated with "possible improvement due to vectorization".]

Definition 4.1 (Plan Vectorization Problem (P-PV)). Let $P$ denote a plan, and let $p_i \in \{p_1, p_2, \ldots, p_n\}$ denote the plan instances with $P \Rightarrow p_i$. Further, let the plan $P$ contain a sequence of atomic or complex operators $o_i \in \{o_1, o_2, \ldots, o_m\}$. For serialization purposes, the plan instances are executed in sequence, where the end time $t_1$ of a plan instance is no later than the start time $t_0$ of the subsequent plan instance, i.e., $t_1(p_i) \le t_0(p_{i+1})$. Then, the P-PV describes the search for the vectorized plan $P'$ (with data flow semantics) that exhibits the highest degree of parallelism for the plan instances $p'_i$ such that the constraint conditions $(t_1(p'_i, o_i) \le t_0(p'_i, o_{i+1})) \wedge (t_1(p'_i, o_i) \le t_0(p'_{i+1}, o_i))$ hold and the semantic correctness (see Definition 3.1) is ensured.
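
The two constraint conditions of Definition 4.1 can be illustrated with a small scheduling sketch. It is not the rewriting algorithm of this chapter: it only computes the earliest start and end times that satisfy both constraints, under the assumptions that every operator processes one plan instance at a time and that the queues between operators are unbounded. The function names `vectorized_schedule` and `satisfies_constraints` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Times = List[List[float]]


def vectorized_schedule(n: int, costs: List[float]) -> Tuple[Times, Times]:
    """Earliest start times t0[i][j] and end times t1[i][j] of operator j
    in plan instance i (both zero-based).

    Assumptions (not from the source): unbounded queues, one plan
    instance per operator at a time, execution starts at time 0.
    """
    m = len(costs)
    t0 = [[0.0] * m for _ in range(n)]
    t1 = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Same instance: the previous operator must have finished.
            prev_operator_end = t1[i][j - 1] if j > 0 else 0.0
            # Same operator: the previous instance must have finished.
            prev_instance_end = t1[i - 1][j] if i > 0 else 0.0
            t0[i][j] = max(prev_operator_end, prev_instance_end)
            t1[i][j] = t0[i][j] + costs[j]
    return t0, t1


def satisfies_constraints(t0: Times, t1: Times) -> bool:
    """Check both constraint conditions of Definition 4.1."""
    n, m = len(t0), len(t0[0])
    for i in range(n):
        for j in range(m):
            # t1(p'_i, o_j) <= t0(p'_i, o_{j+1})
            if j + 1 < m and t1[i][j] > t0[i][j + 1]:
                return False
            # t1(p'_i, o_j) <= t0(p'_{i+1}, o_j)
            if i + 1 < n and t1[i][j] > t0[i + 1][j]:
                return False
    return True


# Scenario of Figure 4.3: two plan instances, six operators, unit costs.
t0, t1 = vectorized_schedule(n=2, costs=[1.0] * 6)
print(satisfies_constraints(t0, t1))
print(t0[1][0], t1[0][5])  # p2 starts before p1 has ended
```

In this scenario the second plan instance starts its first operator while the first instance is still running, which is the overlap shown in panel (b) of Figure 4.3.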

The same rules of ensuring semantic correctness as used for inter-operator parallelism in Chapter 3 also apply to vectorized plans. For example, this requires synchronization of writing interactions. However, we assume independence of plan instances, which holds for typical data-propagating integration flows. This means that we synchronize, for example, a reading interaction with a subsequent writing interaction of plan instance $p_1$, but we allow executing the reading interaction of $p_2$ in parallel to the writing interaction of $p_1$. Nevertheless, monotonic reads and writes are ensured. We will revisit this issue of intra-instance synchronization when discussing the rewriting algorithm.

Based on the P-PV, we now present the static cost analysis of the best case (full pipelines), where cost denotes the total execution time. Let $P$ include an operator sequence $o$ with constant operator costs $W(o_i) = 1$; then the costs of $n$ plan instances are

\[
\begin{aligned}
W(P) &= n \cdot m && \text{(instance-based)} \\
W(P') &= n + m - 1 && \text{(fully vectorized)} \\
\Delta(W(P) - W(P')) &= (n - 1) \cdot (m - 1),
\end{aligned}
\tag{4.1}
\]

where $m$ denotes the number of operators. This is an idealized model only used for illustration purposes. In practice, the improvement depends on the most time-consuming operator $o_{max}$ with $W(o_{max}) = \max_{i=1}^{m} W(o_i)$ of a vectorized plan $P'$ because the work cycle of the whole data flow graph depends on this operator due to filled queues (with maximum constraints) in front of this operator. We will revisit this effect when discussing the cost-based vectorization in Section 4.3. The costs are then specified by:

\[
\begin{aligned}
W(P) &= n \cdot \sum_{i=1}^{m} W(o_i) && \text{(instance-based)} \\
W(P') &= (n + m - 1) \cdot W(o_{max}). && \text{(fully vectorized)}
\end{aligned}
\tag{4.2}
\]
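
As a quick numeric check, the cost formulas of Equation 4.2 can be evaluated directly; with unit operator costs they reduce to the idealized model of Equation 4.1. The following is a minimal sketch: the function names `cost_instance_based` and `cost_fully_vectorized` and the example cost values are assumptions made for illustration and do not come from the source.

```python
from typing import Sequence


def cost_instance_based(n: int, costs: Sequence[float]) -> float:
    """W(P) = n * sum_i W(o_i), see Equation 4.2."""
    if n < 1 or not costs:
        raise ValueError("need at least one plan instance and one operator")
    return n * sum(costs)


def cost_fully_vectorized(n: int, costs: Sequence[float]) -> float:
    """W(P') = (n + m - 1) * W(o_max), see Equation 4.2."""
    if n < 1 or not costs:
        raise ValueError("need at least one plan instance and one operator")
    m = len(costs)
    return (n + m - 1) * max(costs)


# Unit costs, n = 100 plan instances, m = 6 operators (Equation 4.1).
n, unit = 100, [1] * 6
saving = cost_instance_based(n, unit) - cost_fully_vectorized(n, unit)
print(saving == (n - 1) * (len(unit) - 1))

# Hypothetical skewed costs: one operator dominates the work cycle.
skewed = [1, 2, 1, 4, 1, 1]
print(cost_instance_based(n, skewed), cost_fully_vectorized(n, skewed))
```

Under this model the benefit shrinks as the operator costs become more skewed, and for very few plan instances the vectorized estimate can even exceed the instance-based one (for example, $n = 1$ with the skewed costs above gives 24 versus 10).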
