Cost-Based Optimization of Integration Flows - Datenbanken ...
2 Preliminaries and Existing Techniques

2.5 Summary and Discussion

To summarize, we classified existing work on specifying integration tasks, where we mainly
distinguish query-based, integration-flow-based, and user-interface-oriented approaches.
Due to the emerging requirements of complex integration tasks that (1) stretch beyond
simple read-only applications, (2) involve many types of heterogeneous systems and applications,
and (3) require fairly complex procedural aspects, imperative integration flows are
increasingly used. Hence, we further classified the modeling, execution, and optimization
of these integration flows in detail according to a generalized reference system architecture
of an integration platform for integration flows. Typically, an integration flow is modeled
as a hierarchy of sequences with control-flow semantics. The control-flow semantics also
subsume implicit data-flow semantics by using instance-local, materialized intermediates
in the form of variables. With regard to the optimization of such integration flows, we
can summarize that mainly rule-based optimization approaches (optimize-once) have been
proposed so far. This optimization model has two major drawbacks. First, adaptation to
changing workload characteristics is impossible because the flow is optimized only once
during the initial deployment. Second, many cost-based optimization decisions cannot be
made statically in a rule-based fashion.

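The two drawbacks can be illustrated with a small worked example. The following sketch is not taken from the thesis: it assumes a hypothetical flow with two filter operators, each described by a per-message cost and a selectivity, and it uses the standard expected-cost formula for a pipeline of filters. The names `Filter`, `expected_cost`, and `best_order`, as well as all statistics, are illustrative assumptions.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Filter:
    """Hypothetical filter operator: cost per input message and fraction passed on."""
    name: str
    cost: float
    selectivity: float


def expected_cost(order):
    """Expected cost per input message of running the filters in the given order."""
    total, passing = 0.0, 1.0
    for f in order:
        total += passing * f.cost   # only messages that survived so far reach f
        passing *= f.selectivity
    return total


def best_order(filters):
    """Exhaustively pick the cheapest order (acceptable for a handful of operators)."""
    return min(permutations(filters), key=expected_cost)


# Statistics observed at deployment time (assumed values).
at_deploy = [Filter("check_region", 1.0, 0.9), Filter("check_status", 1.0, 0.2)]
# Statistics after the workload has shifted (assumed values).
later = [Filter("check_region", 1.0, 0.1), Filter("check_status", 1.0, 0.8)]

deploy_plan = [f.name for f in best_order(at_deploy)]
later_plan = [f.name for f in best_order(later)]
print(deploy_plan)
print(later_plan)
```

With the deployment statistics, the cheapest plan evaluates `check_status` first; with the later statistics, the opposite order is cheaper. A plan that is fixed once at deployment would keep running the more expensive order, and no static rule can pick the right order without knowing the selectivities.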
In contrast to the rule-based optimization of integration flows, there are numerous approaches
to adaptive query processing in different application areas. However, these approaches
are tailor-made for specific system types and their underlying assumptions about
execution characteristics. For example, plan-based adaptation in DBMS is based on the
assumption of long-running queries over finite data sets, while continuous-query-based
adaptation in DSMS relies on the assumption of continuous queries over infinite tuple
streams. In contrast to these system types, integration flows exhibit the specific characteristics
of being deployed once and executed many times, where many independent
instances (with rather small amounts of data per instance) are executed over time. In
conclusion, the major research question is whether we can exploit context knowledge of integration
flows in order to design a tailor-made optimization approach that takes into account
these specific characteristics of integration flows.

As a formal foundation, we defined the basic notation in the form of a meta model
for integration flows, including a message meta model that covers all static data aspects
and a flow meta model that precisely defines the plan execution characteristics as well as
interaction-, control-flow-, and data-flow-oriented operators. This meta model reflects the
common modeling and execution semantics of integration flows as well as their specific
transactional requirements; thus, all results of this thesis can be seamlessly applied to
other meta models as well. Furthermore, we specified example integration flows within the
context of the two major use cases of horizontal and vertical integration. These example
flows represent the main characteristics and different facets of integration flows; hence,
they are used as running examples throughout the thesis.

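To make the modeling idea concrete, the following is a minimal sketch of a flow as a hierarchy of sequences whose operators exchange data through instance-local variables. It is an illustration under stated assumptions and not the meta model defined in the thesis: the classes `Message`, `Operator`, and `Sequence`, the `execute` function, and the example operators are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Message:
    """Hypothetical message: a payload plus free-form header attributes."""
    payload: object
    header: Dict[str, str] = field(default_factory=dict)


@dataclass
class Operator:
    """Atomic operator that reads input variables and writes one output variable."""
    name: str
    kind: str                      # "interaction" or "data-flow" (assumed labels)
    inputs: List[str]
    output: str
    fn: Callable[..., Message]


@dataclass
class Sequence:
    """Control-flow operator: an ordered list of operators or nested sequences."""
    name: str
    children: List[Union["Sequence", Operator]]


def execute(node, variables):
    """Run a node against the instance-local variable store of one flow instance."""
    if isinstance(node, Sequence):
        for child in node.children:                  # control flow: strictly in order
            execute(child, variables)
    else:
        args = [variables[v] for v in node.inputs]   # implicit data flow via variables
        variables[node.output] = node.fn(*args)      # materialized intermediate
    return variables


flow = Sequence("plan", [
    Operator("receive", "interaction", [], "msg1",
             lambda: Message({"id": 7, "qty": 3})),
    Sequence("transform_and_send", [
        Operator("translate", "data-flow", ["msg1"], "msg2",
                 lambda m: Message({"order_id": m.payload["id"],
                                    "quantity": m.payload["qty"]})),
        Operator("invoke", "interaction", ["msg2"], "msg3",
                 lambda m: Message({"status": "sent",
                                    "order_id": m.payload["order_id"]})),
    ]),
])

result = execute(flow, {})   # each instance starts with its own empty variable store
print(sorted(result))
print(result["msg3"].payload)
```

Each call to `execute` with a fresh dictionary corresponds to one independent flow instance, which matches the deploy-once, execute-many-times characteristic described above.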
Putting it all together, there are existing approaches for query-based, integration-flow-based,
and UI-oriented integration. From the perspective of optimization, there exist
tailor-made techniques for adaptive query processing. In contrast, the optimization of
integration flows is mainly rule-based. Thus, the focus and novelty of this thesis is the
cost-based optimization of integration flows, which is strongly required in order to address
the high performance demands when executing integration flows.