25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


2 Preliminaries and Existing Techniques

From the perspective of structural representation, we distinguish four major types.
First, there are directed graphs, where the integration flow is modeled as G = (V, A) with
a set of vertices V (operators) and a set of arcs A (dependencies between operators).
Typically, such a graph is restricted to a directed acyclic graph (DAG), and XMI (XML
Metadata Interchange) [OMG07] is used as the technical representation. Common examples
of this type are UML (Unified Modeling Language) activity diagrams [OMG03] and BPMN
(Business Process Modeling Notation) process specifications [BMI06]. However, many
proprietary representations exist as well. Van der Aalst et al. identified common workflow
patterns for imperative workflows with directed graph structures in order to classify the
expressive power of concrete workflow languages [vdABtHK00, vdAtHKB03]. Second,
hierarchies of sequences are a common representation. There, the flow is modeled as a
sequence of operators, where each operator is either atomic or complex (a composite of
subsequences). Thus, arbitrary hierarchies can be modeled. XML is used as the technical
representation because of its hierarchical nature. This is the most common type for
representing integration flows because the specification is more restrictive, which allows
automatic compilation into an executable form, and because all algorithms working on this
representation can be realized as simple recursive algorithms. Examples are BPEL [OAS06]
processes and XPDL (XML Process Definition Language) [WfM05] process definitions.
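The two representations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the representation used by any particular platform: the names `is_acyclic`, `Operator`, and `count_atomic` are hypothetical, and Kahn's algorithm is just one possible way to check the DAG restriction on G = (V, A). The second half shows why algorithms on a hierarchy of sequences tend to reduce to simple recursion.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def is_acyclic(vertices: set[str], arcs: set[tuple[str, str]]) -> bool:
    """Check the DAG restriction on G = (V, A) with Kahn's algorithm."""
    indegree = {v: 0 for v in vertices}
    for _, target in arcs:
        indegree[target] += 1
    # Vertices without incoming arcs can be removed first.
    ready = [v for v, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for source, target in arcs:
            if source == v:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    # If a cycle exists, its vertices never reach indegree 0.
    return removed == len(vertices)


@dataclass
class Operator:
    """Hypothetical operator: atomic if `sequences` is empty, else complex."""
    name: str
    sequences: list[list[Operator]] = field(default_factory=list)


def count_atomic(sequence: list[Operator]) -> int:
    """Count atomic operators by recursing into complex operators."""
    total = 0
    for op in sequence:
        if op.sequences:
            total += sum(count_atomic(sub) for sub in op.sequences)
        else:
            total += 1
    return total


# Directed graph: operators as vertices, dependencies as arcs.
V = {"Receive", "Selection", "Write"}
A = {("Receive", "Selection"), ("Selection", "Write")}
print(is_acyclic(V, A))

# Hierarchy of sequences: a complex operator holding two subsequences.
flow = [
    Operator("Receive"),
    Operator("Switch", [[Operator("Selection")], [Operator("Selection")]]),
    Operator("Write"),
]
print(count_atomic(flow))
```

The graph form needs an explicit validity check because arbitrary arcs can be added, whereas the nested form is well-structured by construction.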

Example 2.1. To make the distinction between directed graphs and hierarchies of sequences
more concrete, Figure 2.4 shows an example integration flow in both flow structures.
Figure 2.4(a) illustrates an example plan with a directed graph structure, where we receive
a message, execute two filters depending on an expression evaluation, and finally send the
result to an external system. Figure 2.4(b) illustrates the same example with a
hierarchy-of-sequences structure. When comparing both, we see that the directed graph uses
only atomic operators and allows for arbitrary temporal dependencies between operators,
while the hierarchy of sequences requires complex operators (e.g., the Switch operator)
that recursively contain other operators and is therefore more restrictive.

[Figure 2.4: Integration Flow Modeling with Directed Graphs and Hierarchies of Sequences. Panels: (a) Directed Graph, (b) Hierarchy of Sequences. Operator labels in each panel: Receive, Switch, Selection, Write.]
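To relate the two panels of Example 2.1, the following sketch writes the flow as a hierarchy of sequences and derives the arcs of a corresponding directed graph by recursion. Several details are assumptions made for illustration only: the labels `Selection1` and `Selection2`, the choice of one Selection per Switch branch, the tuple encoding of complex operators, and the helper `to_arcs`. The exact shape of Figure 2.4 may differ.

```python
from __future__ import annotations

# Atomic operators are strings; a complex operator is (name, [subsequence, ...]).
flow = [
    "Receive",
    ("Switch", [["Selection1"], ["Selection2"]]),
    "Write",
]


def to_arcs(sequence: list, entries: list[str]) -> tuple[set, list[str]]:
    """Derive directed-graph arcs from a hierarchy of sequences.

    `entries` are the vertices that precede the sequence. The result is the
    set of arcs plus the vertices whose successor is whatever follows.
    """
    arcs: set[tuple[str, str]] = set()
    current = list(entries)
    for op in sequence:
        if isinstance(op, str):
            # Atomic operator: connect all open predecessors to it.
            arcs |= {(pred, op) for pred in current}
            current = [op]
        else:
            # Complex operator: becomes one vertex that fans out to branches.
            name, branches = op
            arcs |= {(pred, name) for pred in current}
            current = []
            for branch in branches:
                branch_arcs, exits = to_arcs(branch, [name])
                arcs |= branch_arcs
                current += exits
    return arcs, current


arcs, _ = to_arcs(flow, [])
vertices = {v for arc in arcs for v in arc}
for source, target in sorted(arcs):
    print(source, "->", target)
```

The derivation only works in this direction without further analysis: every hierarchy of sequences maps to a graph, but a graph with arbitrary temporal dependencies does not necessarily map back to a nesting of sequences, which is the sense in which the hierarchy is more restrictive.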

In contrast to directed graphs and hierarchies of sequences, there is a third and a fourth
type, both of which are less common nowadays. Some integration platforms allow integration
flows to be modeled (programmed) at the source code level using specific APIs. Examples of
this type are JPD (Java Process Definition) [BEA08], BPELJ (BPEL extension for Java)
[BEA04], and script-based integration platforms such as pygrametl [TP09]. The advantage is
that arbitrary custom code can be used; the drawback is that the flow designer faces high
modeling and optimization effort. Finally, some platforms use fixed integration flows. A
fairly simple example is the concept of so-called message channels, where messages are
received, transformed, and finally sent to a single external system. Concrete integration
flows are modeled by configuring these fixed flows. Although this structure allows for
