Cost-Based Optimization of Integration Flows - Datenbanken ...
2 Preliminaries and Existing Techniques
From the perspective of structural representation, we distinguish four major types.
First, there are directed graphs, where the integration flow is modeled as G = (V, A) with
a set of vertices V (operators) and a set of arcs A (dependencies between operators). Typically,
such graphs are restricted to directed acyclic graphs (DAGs), and XMI (XML Metadata
Interchange) [OMG07] is used as the technical representation. Common examples of this type
are UML (Unified Modeling Language) activity diagrams [OMG03] and BPMN (Business
Process Modeling Notation) process specifications [BMI06]. However, many proprietary
representations exist as well. Van der Aalst et al. identified common workflow patterns
for imperative workflows with directed graph structures in order to classify the expressive
power of concrete workflow languages [vdABtHK00, vdAtHKB03]. Second, hierarchies
of sequences are a common representation. There, the flow is modeled as a sequence of
operators, where each operator is either atomic or complex (a composite of subsequences).
Thus, arbitrary hierarchies can be modeled. XML is used as the technical representation
because it is inherently hierarchical. This is the most common type for
representing integration flows because it is more restrictive: it allows
automatic compilation into an executable form, and all algorithms working on this
representation can be realized as simple recursive algorithms. Examples are BPEL [OAS06]
processes and XPDL (XML Process Definition Language) [WfM05] process definitions.
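The two representations described above can be sketched as small data structures. The following Python fragment is only an illustration under stated assumptions: the names `is_dag`, `Operator`, and `count_atomic` are hypothetical and do not come from any of the cited languages or platforms. It shows a DAG check for a graph G = (V, A) and one example of the "simple recursive algorithms" that a hierarchy of sequences permits.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter, CycleError  # Python 3.9+


# --- Type 1: directed graph G = (V, A) ---
def is_dag(vertices, arcs):
    """Return True if the graph given by vertices and arcs has no cycle."""
    predecessors = {v: set() for v in vertices}
    for source, target in arcs:
        predecessors[target].add(source)
    try:
        # Consuming static_order() raises CycleError if a cycle exists.
        list(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False


# --- Type 2: hierarchy of sequences ---
@dataclass
class Operator:
    name: str
    # Atomic operators have no subsequences; complex operators have some.
    subsequences: list = field(default_factory=list)


def count_atomic(sequence):
    """Recursively count the atomic operators in a sequence of operators."""
    total = 0
    for op in sequence:
        if op.subsequences:
            total += sum(count_atomic(sub) for sub in op.subsequences)
        else:
            total += 1
    return total
```

In this sketch the graph form needs an explicit cycle check, whereas the hierarchical form is acyclic by construction and can be processed by plain recursion over nested sequences.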
Example 2.1. To make the distinction between directed graphs and hierarchies
of sequences more concrete, Figure 2.4 shows an example integration flow in both
flow structures. Figure 2.4(a) illustrates an example plan with a directed graph
structure, where we receive a message, execute two filters depending on an expression
evaluation, and finally send the result to an external system. Figure 2.4(b) illustrates the
same example with a hierarchy-of-sequences structure. Comparing both, we see that
the directed graph uses only atomic operators and allows for arbitrary temporal dependencies
between operators, while the hierarchy of sequences requires complex operators (e.g., the
Switch operator) that recursively contain other operators and is therefore more restrictive.
[Figure 2.4 has two panels, each showing the operators Receive, Switch, Selection (four times), and Write.]
(a) Directed Graph
(b) Hierarchy of Sequences
Figure 2.4: Integration Flow Modeling with Directed Graphs and Hierarchies of Sequences
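As a rough illustration of Example 2.1, the sketch below writes the flow as a hierarchy of sequences and flattens it into the arcs of an equivalent directed graph. The exact shape is an assumption read off the figure labels (a Switch with two paths, each holding two Selection operators). The numbered operator names and the helper `to_arcs` are hypothetical.

```python
# Hierarchy of sequences for Example 2.1 (assumed shape, see lead-in).
# An atomic operator is a string; a complex operator is a tuple
# (name, [subsequence, ...]).
flow = [
    "Receive",
    ("Switch", [["Selection1", "Selection2"], ["Selection3", "Selection4"]]),
    "Write",
]


def to_arcs(sequence, entry=None):
    """Flatten a sequence into graph arcs; return (arcs, exit vertices)."""
    arcs = []
    exits = [entry] if entry is not None else []
    for op in sequence:
        if isinstance(op, str):  # atomic operator
            arcs += [(e, op) for e in exits]
            exits = [op]
        else:  # complex operator: recurse into each subsequence
            name, subsequences = op
            arcs += [(e, name) for e in exits]
            exits = []
            for sub in subsequences:
                sub_arcs, sub_exits = to_arcs(sub, entry=name)
                arcs += sub_arcs
                exits += sub_exits
    return arcs, exits


arcs, exits = to_arcs(flow)
for source, target in arcs:
    print(source, "->", target)
```

Every hierarchy of sequences can be flattened into a directed graph in this way, while an arbitrary directed graph need not have such a nesting, which is one way to read the statement that the hierarchy is the more restrictive structure.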
In contrast to directed graphs and hierarchies of sequences, there is a third and a fourth
type, both of which are less common nowadays. Third, some integration platforms allow
integration flows to be modeled (programmed) at the source-code level using specific APIs. Examples
of this type are JPD (Java Process Definition) [BEA08], BPELJ (BPEL extension for
Java) [BEA04], and script-based integration platforms such as pygrametl [TP09]. The
advantage is that arbitrary custom code can be used; the drawback is that the flow designer faces
high modeling and optimization effort. Fourth, some platforms use fixed integration flows.
A fairly simple example is the concept of so-called message channels, where messages are
received, transformed, and finally sent to a single external system. Concrete integration
flows are modeled by configuring these fixed flows. Although this structure allows for