25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


6.2 Plan Optimality Trees

We execute a full optimization only once, during the initial deployment of an integration flow. The optimizer contract is changed such that it returns the set of optimality conditions. Thus, the resulting research challenge is how to organize these optimality conditions for efficient statistic monitoring, condition evaluation, and directed re-optimization.

In contrast to existing passive structures such as matrix views [BMM+04] that are used and maintained by the re-optimizer for selective operators, we propose an active structure, the so-called Plan Optimality Tree (PlanOptTree), which is a data structure that models the optimality of a plan. It indexes operators and their related statistics that are included in optimality conditions. As a result, we maintain only required statistics and we can continuously evaluate optimality conditions. Re-optimization is actively triggered only if necessary; in this case, it is guaranteed that we will find a plan with lower costs. Here, directed re-optimization is applied only to operators included in any violated condition.
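The idea of an active structure can be illustrated with a small Python sketch. It is not the PlanOptTree itself, only a toy index under stated assumptions: the class names `Condition` and `ConditionIndex`, the statistic keys such as `"sel(o2)"`, and the example condition are all hypothetical. The sketch shows the three properties named above: only statistics that occur in a condition are maintained, conditions are re-evaluated when one of their statistics changes, and a violation yields exactly the operators to which directed re-optimization would be applied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Condition:
    """One optimality condition over named statistics (hypothetical layout)."""
    name: str
    operators: Set[str]                         # operators covered by the condition
    stats: List[str]                            # statistics the condition reads
    holds: Callable[[Dict[str, float]], bool]   # True while the plan stays optimal


@dataclass
class ConditionIndex:
    """Toy active structure: each statistic indexes the conditions that read it."""
    values: Dict[str, float] = field(default_factory=dict)
    by_stat: Dict[str, List[Condition]] = field(default_factory=dict)

    def register(self, cond: Condition) -> None:
        for stat in cond.stats:
            self.by_stat.setdefault(stat, []).append(cond)

    def update(self, stat: str, value: float) -> Set[str]:
        """Record a statistic and return the operators of violated conditions."""
        if stat not in self.by_stat:
            return set()  # not part of any condition, so it is not maintained
        self.values[stat] = value
        affected: Set[str] = set()
        for cond in self.by_stat[stat]:
            # Evaluate only once every statistic of the condition is known.
            known = all(s in self.values for s in cond.stats)
            if known and not cond.holds(self.values):
                affected |= cond.operators
        return affected


# Illustrative usage with an assumed condition sel(o2) <= sel(o3).
index = ConditionIndex()
index.register(Condition("oc1", {"o2", "o3"}, ["sel(o2)", "sel(o3)"],
                         lambda v: v["sel(o2)"] <= v["sel(o3)"]))
r1 = index.update("sel(o2)", 0.2)   # condition not yet evaluable
r2 = index.update("sel(o3)", 0.5)   # condition holds
r3 = index.update("sel(o2)", 0.7)   # condition violated
r4 = index.update("W(o9)", 3.0)     # statistic unused by any condition
```

In this sketch, a non-empty return value of `update` plays the role of the re-optimization trigger; an actual implementation would additionally have to guarantee that a violated condition implies a cheaper plan, which the toy example does not attempt.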

Example 6.3 (PlanOptTree POT(P_5)). Figure 6.4 shows the PlanOptTree of plan P_5.

[Figure 6.4: PlanOptTree of Plan P_5. Node labels: operators o_2, o_3, o_4, each with statistics |ds_in1| and |ds_out1| and selectivities sel(o_2), sel(o_3), sel(o_4); operators o_6, o_5, o_7, each with W; W(expr_A), P(A), W(expr_B), P(B); ≤ comparisons for the conditions (oc1), (oc2), (oc4).]

It includes two optimality conditions (oc_1, oc_2) that express the order of the Selection operators o_2, o_3, and o_4 (see Figure 6.3(c)) according to their selectivities (oc_3 of Example 6.1 is omitted due to the transitivity of conditions) and one condition (oc_4) regarding branch prediction of the Switch¹⁹ operator o_5 according to the weighted path probabilities. In the rest of this chapter, we explain in detail how to create, update, and use these PlanOptTrees in order to enable the vision of on-demand re-optimization.
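Why a condition can be dropped due to transitivity may be easier to see in a short Python sketch. The function names, the operator order, and the selectivity values below are assumptions made purely for illustration (the actual order in P_5 is given by Figure 6.3(c), and the direction of the comparison is assumed to be ascending selectivity). For n ordered operators, the n-1 conditions between neighbours imply all n(n-1)/2 pairwise conditions, so monitoring the neighbours suffices to detect any violation.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def all_pair_conditions(order: List[str]) -> List[Pair]:
    """Every condition sel(a) <= sel(b) with a before b: n*(n-1)/2 pairs."""
    return list(combinations(order, 2))


def adjacent_conditions(order: List[str]) -> List[Pair]:
    """Only neighbouring pairs: n-1 conditions; the rest follow by transitivity."""
    return list(zip(order, order[1:]))


def violated(conds: List[Pair], sel: Dict[str, float]) -> List[Pair]:
    """Return the conditions sel(a) <= sel(b) that no longer hold."""
    return [(a, b) for a, b in conds if not sel[a] <= sel[b]]


order = ["o2", "o3", "o4"]                     # assumed operator order
full = all_pair_conditions(order)              # three pairwise conditions
reduced = adjacent_conditions(order)           # two neighbour conditions

sel_ok = {"o2": 0.1, "o3": 0.4, "o4": 0.8}     # assumed selectivities
sel_bad = {"o2": 0.1, "o3": 0.4, "o4": 0.05}   # o4 became more selective

ok_full, ok_reduced = violated(full, sel_ok), violated(reduced, sel_ok)
bad_full, bad_reduced = violated(full, sel_bad), violated(reduced, sel_bad)
```

With the assumed values, the reduced set reports the violation between o3 and o4 as soon as the full set reports any violation, so the omitted pair (o2, o4) adds no detection power.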


6.2 Plan Optimality Trees<br />

In this section, we formally define the PlanOptTree and show how to create a PlanOptTree for a given plan during the initial optimization of an integration flow. Further, we explain how to use it for statistic maintenance and when to trigger re-optimization.

6.2.1 Formal Foundation<br />

A PlanOptTree, whose general structure is shown in Figure 6.5, models the optimality conditions of a plan and is defined as follows:

Definition 6.1 (PlanOptTree). Let P denote the optimal plan with regard to the current statistics. Further, let m denote the number of operators and let s denote the maximum number of statistic types per operator. Then, the PlanOptTree is defined as a graph of five strata representing all optimality conditions of P:

¹⁹ For the Switch operator, multiple versions of statistics are monitored. This includes the total execution time W(o_i) as well as the different execution times of all expression evaluations W(expr_i).
