25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


4 Vectorizing Integration Flows

execution, where each incoming message initiates a new plan instance. All operators of one instance are executed before the next instance is started.

[Figure 4.1: Example Instance-Based Execution of Plan P2. (a) Example Plan P2, with the operators Receive (o1) [service: s5, out: msg1], Assign (o2) [in: msg1, out: msg2], Invoke (o3) [service: s4, in: msg2, out: msg3], Join (o4) [in: msg1, msg3, out: msg4], Assign (o5) [in: msg4, out: msg5], and Invoke (o6) [service: s3, in: msg5]. (b) Instance-Based Plan Execution of P2, showing the message queue and the plan instances pid=1, pid=2, and pid=3 over time t.]
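The instance-based model can be illustrated with a minimal Python sketch. It is not the implementation from the source: the operator bodies and the payload fields (`payload`, `lookup`, `result`) are hypothetical stand-ins, and the external services s5, s4, and s3 are simulated by plain functions. Only the operator names and message names follow plan P2.

```python
from collections import deque

# Hypothetical operator bodies; the real operators would call external services.
def assign_o2(msg1):
    return {"payload": msg1["payload"], "prepared": True}

def invoke_o3(msg2):  # stands in for the call to service s4
    return {"lookup": msg2["payload"] * 10}

def join_o4(msg1, msg3):
    return {**msg1, **msg3}

def assign_o5(msg4):
    return {"result": msg4["payload"] + msg4["lookup"]}

def run_instance(pid, msg1, trace):
    """Execute all operators of a single plan instance, one after another."""
    trace.append((pid, "o1"))  # Receive: msg1 was taken from the message queue
    msg2 = assign_o2(msg1)
    trace.append((pid, "o2"))
    msg3 = invoke_o3(msg2)
    trace.append((pid, "o3"))
    msg4 = join_o4(msg1, msg3)
    trace.append((pid, "o4"))
    msg5 = assign_o5(msg4)
    trace.append((pid, "o5"))
    trace.append((pid, "o6"))  # Invoke: msg5 would be sent to service s3
    return msg5

def run_instance_based(messages):
    """Start a new plan instance per message; instances never overlap."""
    message_queue = deque(messages)
    trace, outputs = [], []
    pid = 0
    while message_queue:
        pid += 1
        outputs.append(run_instance(pid, message_queue.popleft(), trace))
    return outputs, trace
```

Calling `run_instance_based` with three messages yields a trace in which all six operators of pid=1 appear before any operator of pid=2, which is the serialization shown in panel (b).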

In contrast, Figure 4.2 shows the fully vectorized plan, where each operator is executed within an execution bucket. Note that we also emphasized the changed operator parameters.

[Figure 4.2: Example Fully Vectorized Execution of Plan P′2. The vectorized plan P′ consists of the message queue and the operators Copy (oc) [in: msg1, out: msg1], Assign (o2) [in: msg1, out: msg2], Invoke (o3) [service: s4, in: msg2, out: msg3], Join (o4) [in: msg1, msg3, out: msg4], Assign (o5) [in: msg4, out: msg5], and Invoke (o6) [service: s3, in: msg5]. Legend: inter-bucket message queue; execution bucket bi (thread).]

We can leverage pipeline parallelism (within a single pipeline) and parallel pipelines. In this model, each edge of a data flow graph includes a message queue for inter-operator communication. Dashed arrows represent dequeue (read) operations, while normal arrows represent enqueue (write) operations. Additional operators (e.g., the Copy operator for data flow splits) are required, while the Receive operator is not needed anymore.
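A minimal sketch of this execution model, assuming Python threads as execution buckets and `queue.Queue` objects as inter-bucket message queues, might look as follows. The operator bodies, the payload fields, the `STOP` end-of-stream marker, and the queue names are hypothetical and only serve the illustration. The sketch also assumes that every queue is FIFO and every bucket is a single thread, so that the Join operator can pair the msg1 and msg3 of the same plan instance by position.

```python
import queue
import threading

STOP = object()  # hypothetical end-of-stream marker, not part of the source model

def bucket(fn, inputs, outputs):
    """Run one operator in its own thread: dequeue, apply fn, enqueue."""
    def loop():
        while True:
            msgs = [q.get() for q in inputs]   # dequeue (read) operations
            if any(m is STOP for m in msgs):
                for q in outputs:
                    q.put(STOP)                # forward shutdown downstream
                return
            result = fn(*msgs)
            for q in outputs:
                q.put(result)                  # enqueue (write) operations
    thread = threading.Thread(target=loop)
    thread.start()
    return thread

def run_vectorized(messages):
    # One queue per edge of the data flow graph of the vectorized plan.
    q_in, q_c2, q_c4, q_23, q_34, q_45, q_56 = (queue.Queue() for _ in range(7))
    sent = []  # stands in for the messages delivered to service s3
    threads = [
        # Copy (oc): data flow split. Operators here never mutate their
        # inputs, so both branches can safely share the same message object.
        bucket(lambda m1: m1, [q_in], [q_c2, q_c4]),
        bucket(lambda m1: {"payload": m1["payload"], "prepared": True},
               [q_c2], [q_23]),                                    # Assign (o2)
        bucket(lambda m2: {"lookup": m2["payload"] * 10},
               [q_23], [q_34]),                                    # Invoke (o3)
        bucket(lambda m1, m3: {**m1, **m3}, [q_c4, q_34], [q_45]),  # Join (o4)
        bucket(lambda m4: {"result": m4["payload"] + m4["lookup"]},
               [q_45], [q_56]),                                    # Assign (o5)
        bucket(sent.append, [q_56], []),                           # Invoke (o6)
    ]
    for m in messages:
        q_in.put(m)  # no Receive operator: messages enter the first queue directly
    q_in.put(STOP)
    for t in threads:
        t.join()
    return sent
```

With several messages in flight, different buckets can work on different plan instances at the same time, while the FIFO queues keep the order in which results reach the last operator.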

Major challenges have to be tackled when transforming P into P′ in order to preserve the control-flow semantics and prevent the external behavior from being changed. Based on the mentioned requirement of ensuring semantic correctness in the form of serialized external behavior, we now formally define the plan vectorization problem. Figure 4.3(a) illustrates the temporal aspects of the example instance-based plan (assuming a sequence of operators). In this case, different instances of this plan are serialized in incoming order. Such an instance-based plan is the input of our vectorization problem. In contrast to this, Figure 4.3(b) shows the temporal aspects of a vectorized plan for the best case. Here, only the external behavior (according to the start time t0 and the end time t1 of plan and operator instances) must be serialized. Such a vectorized plan is the output of the vectorization problem. The plan vectorization problem is then defined as follows.
