Cost-Based Optimization of Integration Flows - Datenbanken ...
5 Multi-Flow Optimization
Similar to the vectorization of integration flows, multi-flow optimization [BHL10, BHL11],
which we introduce in this chapter, is a data-flow-oriented optimization technique that is
tailor-made for integration flows. This technique tackles the problem of expensive external
system access and exploits the optimization potential that arises when equivalent work (e.g.,
the same queries to external systems) is done multiple times. The core idea is to horizontally
partition inbound message queues and to execute plan instances for message batches rather
than for individual messages. Therefore, this technique is applicable to asynchronous
data-driven integration flows, where message queues are used at the inbound side of the
integration platform. As a result, the message throughput is increased by reducing the
amount of work (external system access and local processing steps) done by the integration
platform. We call this technique multi-flow optimization because sequences of messages
that would initiate multiple plan instances are processed together.
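The core idea can be illustrated with a minimal sketch. It is not the implementation described in this chapter: the names `partition_by_key`, `run_batched`, `query_external`, and `process` are hypothetical, messages are assumed to be plain dictionaries, and the sketch ignores waiting times and message ordering guarantees. It only shows that grouping queued messages by a partitioning attribute lets one external query serve a whole batch.

```python
from collections import defaultdict


def partition_by_key(messages, key):
    """Horizontally partition queued messages: one batch per distinct key value.

    `key` is a hypothetical function extracting the partitioning attribute.
    """
    batches = defaultdict(list)
    for msg in messages:
        batches[key(msg)].append(msg)
    return dict(batches)


def run_batched(messages, key, query_external, process):
    """Execute one plan instance per batch instead of one per message.

    `query_external` stands in for an expensive external system access and is
    called once per batch; `process` is the local per-message processing step.
    """
    results = []
    for key_value, batch in partition_by_key(messages, key).items():
        looked_up = query_external(key_value)  # shared by all messages in the batch
        results.extend(process(msg, looked_up) for msg in batch)
    return results
```

With three queued messages of which two share the same attribute value, `query_external` would be called twice rather than three times. Note that the results come back grouped by batch, not in arrival order.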
In order to enable multi-flow optimization, in Section 5.1, we introduce the batch creation
via horizontal message queue partitioning. Essentially, two major challenges arise
in the context of multi-flow optimization. In Section 5.2, we discuss the challenge of plan
execution on batches of messages. Furthermore, in Section 5.3, we describe how this optimization
technique is embedded within the periodical re-optimization framework, and
we address the challenge of computing the optimal waiting time with regard to message
throughput maximization. In addition, we provide formal analysis results such as optimality
and latency guarantees in Section 5.4. Finally, the experimental evaluation
presented in Section 5.5 shows that multi-flow optimization achieves significant performance
improvements in terms of increased message throughput.
5.1 Motivation and Problem Description
In the context of integration platforms, especially in scenarios with huge numbers of plan
instances, the major optimization objective is throughput maximization [LZL07] rather
than the execution time minimization of single plan instances. The goal is (1) to maximize
the number of messages processed per time period, or synonymously in our context, (2) to
minimize the total execution time of a sequence of plan instances. Here, depending on the
application area, moderate latency times of single messages, on the order of seconds to
minutes, are acceptable [UGA+09]. When addressing this general optimization objective,
the following concrete problems have to be considered:
Problem 5.1 (Expensive External System Access). External system access can be very
time-consuming due to network latency (minimal roundtrip time), external query processing,
network traffic, and message transformations from external formats into internal
structures. Depending on the involved external systems and on the present infrastructure,
the share of each of these influences in the required total access time may vary significantly.
However, in particular when accessing custom applications and services, data