Cost-Based Optimization of Integration Flows - Datenbanken ...
Cost-Based Optimization of Integration Flows - Datenbanken ...
Cost-Based Optimization of Integration Flows - Datenbanken ...
- No tags were found...
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
4 Vectorizing <strong>Integration</strong> <strong>Flows</strong><br />
Algorithm 4.1 Plan Vectorization (A-PV)<br />
Require: operator sequence o<br />
1: B ← ∅, D ← ∅, Q ← ∅<br />
2: for i ← 1 to |o| do // for each operator o i<br />
3: // Part 1: Dependency Analysis<br />
4: for j ← i to |o| do // for each following operator o j<br />
5:<br />
δ<br />
if ∃o j → oi then<br />
6: Q ← Q ∪ q with q ← createQueue()<br />
7: D ← D ∪ d〈o j , q, o i 〉 with d〈o j , q, o i 〉 ← createDependency()<br />
8: // Part 2: Graph Creation<br />
9: if o i ∈ Switch, Iteration, Validate, Signal, Savepoint, Invoke then<br />
10: // see rules on rewriting context-specific operators<br />
11: else<br />
12: b(o i ) ← createBucket(o i )<br />
13: for k ← 1 to |D| do // for each dependency d<br />
14: d〈o y , q, o x 〉 ← d k<br />
15: if o i ≡ o x then<br />
16: b(o i ).addOutputQueue(q)<br />
17: else if o i ≡ o y then<br />
18: b(o i ).addInputQueue(q)<br />
19: if b(o i ).countOutputQueues() ≥ 2 then<br />
20: createCopyOperatorAfter(b(o i ), B, D, Q)<br />
21: else if b(o i ).countOutputQueues() = 0 then<br />
22: createLogicalQueue(b(o i−1 ), b(o i ), B, D, Q)<br />
23: B ← B ∪ b(o i )<br />
24: return B<br />
create the graph <strong>of</strong> execution buckets. This part contains the following three steps. First,<br />
we create an execution bucket for each operator. Second, we connect each operator with<br />
the referenced input queues. Clearly, each queue is referenced by exactly one operator,<br />
but each operator can reference multiple queues. Third, we connect each operator with<br />
the referenced output queues. If one operator must be connected to n output queues with<br />
n ≥ 2 (its results are used by multiple following operators), we insert a Copy operator<br />
after this operator in order to send the message to all dependency sources. Within the<br />
createCopyOperatorAfter function the new operator, execution bucket and queue are<br />
created as well as the dependencies and queue references are changed accordingly. Furthermore,<br />
if an operator is connected to zero input queues, we insert an artificial dummy queue<br />
(where empty message objects are passed) between this operator and its former temporal<br />
predecessor. <strong>Based</strong> on this basic rewriting concept, additional rewriting rules for contextspecific<br />
operators (e.g., Switch, Iteration) and for serialization and recoverability are<br />
required (line 10). We use an example to illustrate the A-PV in more detail.<br />
Example 4.2 (Automatic Plan Vectorization). Recall our example plan P 2 as well as the<br />
vectorized plan P 2 ′ <strong>of</strong> Example 4.1. Figure 4.5(a) and 4.5(b) illustrate the core aspects<br />
when rewriting P 2 to P 2 ′ . First, we determine all direct data dependencies δ and add those<br />
to the set <strong>of</strong> dependencies D. If an operator is referenced by multiple dependencies (e.g.,<br />
operator o 1 in this example), we need to insert a Copy operator just after it. Furthermore,<br />
94