Cost-Based Optimization of Integration Flows - Datenbanken ...
3 Fundamentals of Optimizing Integration Flows<br />
data properties of external systems, we use the described lightweight concept for explicitly<br />
taking into account conditional probabilities and correlation at the same time, using an<br />
incremental reordering approach that is tailor-made for integration flows.<br />
3.4 Optimization Techniques<br />
In this section, we discuss specific optimization techniques that are used within the described<br />
core optimization algorithm. For selected techniques, we present the core idea, the<br />
optimality conditions, the achievable execution time reduction, the rewriting algorithm and its<br />
time complexity, as well as possible side effects on other techniques.<br />
Cost-Based Techniques<br />
Data Flow:<br />
(WD1) Reordering of Switch-Paths<br />
(WD2) Merging of Switch-Paths<br />
(WD3) Execution Pushdown to External Systems<br />
(WD4) Early Selection Application<br />
(WD5) Early Projection Application<br />
(WD6) Early GroupBy Application<br />
(WD7) Materialization Point Insertion<br />
(WD8) Orderby Insertion / Removal<br />
(WD9) Join-Type Selection<br />
(WD10) Join Enumeration<br />
(WD11) Setoperation-Type Selection<br />
(WD12) Splitting / Merging of Operators<br />
(WD13) Precomputation of Values<br />
(WD14) Early Translation Application<br />
Control Flow:<br />
(WC1) Rescheduling Start of Parallel Flows<br />
(WC2) Rewriting Sequences to Parallel Flows<br />
(WC3) Rewriting Iterations to Parallel Flows<br />
(WC4) Merging Parallel Flows<br />
Further cost-based techniques:<br />
(WM1) Message Indexing<br />
(WM2) Recycling Locally Created Intermediates<br />
(WM3) Recycling Externally Loaded Intermediates<br />
(Vect) Cost-Based Vectorization, see Chapter 4<br />
(MFO) Multi-Flow Optimization, see Chapter 5<br />
Rule-Based Techniques<br />
Data Flow:<br />
(RD1) Double Variable Assignment Removal<br />
(RD2) Unnecessary Variable Assignment Removal<br />
(RD3) Unnecessary Variable Declaration Removal<br />
(RD4) Two Sibling Translation Operation Merging<br />
(RD5) Two Sibling Validation Merging<br />
(RD6) Unnecessary Switch-Path Elimination<br />
(RD7) Algebraic Simplification<br />
Control Flow:<br />
(RC1) Redundant Control Flow Rewriting<br />
(RC2) Unreachable Subgraph Elimination<br />
(RC3) Local Subprocess Inline Compilation<br />
(RC4) Static Node Compilation<br />
Further rule-based techniques:<br />
(HLB) Heterogeneous Load Balancing<br />
Figure 3.11: Cost-Based Optimization Techniques<br />
Figure 3.11 classifies the cost-based optimization techniques that we use into control-flow-oriented<br />
and data-flow-oriented optimization techniques and emphasizes (in bold font)<br />
the techniques that we describe in detail. All optimization techniques presented in<br />
this section follow the optimization objective of minimizing the average execution time<br />
of a plan. In addition to these techniques, we discuss in detail the cost-based<br />
vectorization in Chapter 4 and the multi-flow optimization in Chapter 5, which both follow<br />
the optimization objective of maximizing the message throughput. Apart from these<br />
techniques, we refer the interested reader to our details on message indexing [BHW+07,<br />
BHLW08d] and on rule-based optimization techniques [BHW+07], which are omitted in this<br />
thesis. The latter include, for example, relational algebraic simplifications as described<br />
by Dadam [Dad96].<br />
3.4.1 Control-Flow-Oriented Techniques<br />
Control-flow-oriented optimization techniques address the interaction- and control-flow-oriented<br />
operators of our flow meta model. These techniques try to exploit the specific<br />
characteristics of operators such as alternatives (Switch operator), loops (Iteration operator),<br />
and parallel subflows (Fork operator, constrained by Invoke operators).<br />
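To make the flavor of such a control-flow-oriented rewriting concrete, the following minimal sketch compares the estimated cost of a sequence of subflows with the estimated cost of the same subflows executed in parallel, in the spirit of WC2 (Rewriting Sequences to Parallel Flows). It is an illustration under simplifying assumptions, not the cost model of this thesis: the names `Subflow`, `sequential_cost`, `parallel_cost`, `should_parallelize`, and the per-subflow `fork_overhead_ms` parameter are hypothetical, the subflows are assumed to have no data dependencies among each other, and enough threads are assumed to be available to run all of them concurrently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Subflow:
    """Hypothetical independent subflow with an estimated execution time in ms."""
    name: str
    est_time_ms: float


def sequential_cost(subflows: Sequence[Subflow]) -> float:
    """Estimated cost of executing all subflows one after another."""
    return sum(s.est_time_ms for s in subflows)


def parallel_cost(subflows: Sequence[Subflow], fork_overhead_ms: float) -> float:
    """Estimated cost of executing all subflows concurrently.

    Assumption: the slowest subflow dominates, and every forked subflow
    adds a fixed overhead for thread start and synchronization.
    """
    if not subflows:
        return 0.0
    slowest = max(s.est_time_ms for s in subflows)
    return slowest + fork_overhead_ms * len(subflows)


def should_parallelize(subflows: Sequence[Subflow], fork_overhead_ms: float) -> bool:
    """Apply the rewriting only if it reduces the estimated execution time."""
    return parallel_cost(subflows, fork_overhead_ms) < sequential_cost(subflows)


if __name__ == "__main__":
    flows = [Subflow("A", 40.0), Subflow("B", 60.0), Subflow("C", 30.0)]
    print(sequential_cost(flows), parallel_cost(flows, 5.0))
    print(should_parallelize(flows, 5.0))
```

Under these assumptions, the rewriting pays off only when the saved waiting time exceeds the forking overhead. With a large overhead or a single subflow, the sequence remains the cheaper plan, which is why such a decision is made on the basis of cost estimates rather than applied unconditionally.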