Cost-Based Optimization of Integration Flows - Datenbanken ...
6 On-Demand Re-Optimization
[Figure 6.5: General Structure of a PlanOptTree. The figure shows five strata: stratum 1 with the root node (RNode); stratum 2 with the operator nodes (ONode, each identified by its nid); stratum 3 with the statistic nodes (SNode) holding the monitored statistics; stratum 4 with the complex statistic nodes (CSNode); and stratum 5 with the optimality condition nodes (OCNode). A global memo structure is attached to the tree.]
1. RNode: The single root node refers to m′ operator nodes (ONode), where 1 ≤ m′ ≤ m.
2. ONode: An operator node is identified by a node identifier nid and refers to s′
statistic nodes (SNode), where 1 ≤ s′ ≤ s and s denotes the maximum number of atomic
statistic types.
3. SNode: A statistic node represents one of the s atomic statistic types, where a single
type must not occur more than once for one operator o_i. Further, each SNode contains
a list of statistic tuples monitored for o_i, a single aggregate, as well as references to
a list of CSNodes and a list of OCNodes.
4. CSNode: A complex statistic node is a mathematical expression using all referenced
parent SNodes or CSNodes as operands, where a CSNode can refer to SNodes of
different operators. Further, it refers to a list of complex statistic nodes (CSNode)
and a list of optimality condition nodes (OCNode). Hence, arbitrary hierarchies
of complex statistics are possible. In addition, CSNodes can be used to represent
constant values or externally loaded values.
5. OCNode: An optimality condition node is defined as a boolean expression op_1 θ op_2,
where θ denotes an arbitrary binary comparison operator and the operands op_1 and
op_2 each refer to a CSNode or an SNode. The optimality condition is defined
as violated if the expression evaluates to false.
The nodes of strata 1 and 2 are reachable over unidirectional references, while the nodes of
strata 3-5 are connected by bidirectional references (children and parents).
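The node types listed above can be sketched as a small set of Python classes. This is only an illustrative sketch under assumptions, not the implementation described in the source: the class layout, the method names (`add_operator`, `add_stat`, `value`, `violated`), and the statistic type name `"selectivity"` are hypothetical. Strata 1 and 2 hold only downward references, while strata 3-5 register themselves with their parents so that both directions can be followed.

```python
import operator


class SNode:
    """Stratum 3: one atomic statistic type of a single operator."""

    def __init__(self, stat_type, aggregate=0.0):
        self.stat_type = stat_type
        self.tuples = []        # statistic tuples monitored for the operator
        self.aggregate = aggregate
        self.cs_children = []   # CSNodes that use this statistic
        self.oc_children = []   # OCNodes that use this statistic

    def value(self):
        return self.aggregate


class CSNode:
    """Stratum 4: expression over all parent SNodes/CSNodes (or a constant)."""

    def __init__(self, expr, parents):
        self.expr = expr
        self.parents = list(parents)
        self.cs_children = []
        self.oc_children = []
        for parent in self.parents:     # bidirectional reference
            parent.cs_children.append(self)

    def value(self):
        return self.expr(*(p.value() for p in self.parents))


class OCNode:
    """Stratum 5: boolean expression op1 theta op2."""

    def __init__(self, op1, theta, op2):
        self.op1, self.theta, self.op2 = op1, theta, op2
        for parent in (op1, op2):       # bidirectional reference
            parent.oc_children.append(self)

    def violated(self):
        # Violated if and only if the expression evaluates to false.
        return not self.theta(self.op1.value(), self.op2.value())


class ONode:
    """Stratum 2: operator node identified by nid."""

    def __init__(self, nid):
        self.nid = nid
        self.snodes = {}                # stat_type -> SNode

    def add_stat(self, stat_type, aggregate=0.0):
        if stat_type in self.snodes:    # a type must not occur twice
            raise ValueError(f"duplicate statistic type: {stat_type}")
        self.snodes[stat_type] = SNode(stat_type, aggregate)
        return self.snodes[stat_type]


class RNode:
    """Stratum 1: single root node referring to the operator nodes."""

    def __init__(self):
        self.onodes = []

    def add_operator(self, nid):
        onode = ONode(nid)
        self.onodes.append(onode)
        return onode


# Hypothetical example: the condition sel(o1) / sel(o2) <= 1.
root = RNode()
o1, o2 = root.add_operator(1), root.add_operator(2)
sel1 = o1.add_stat("selectivity", aggregate=0.2)
sel2 = o2.add_stat("selectivity", aggregate=0.5)
ratio = CSNode(lambda a, b: a / b, [sel1, sel2])  # spans two operators
one = CSNode(lambda: 1.0, [])                     # constant value
cond = OCNode(ratio, operator.le, one)

before = cond.violated()   # ratio is 0.4, so the condition holds
sel1.aggregate = 0.7       # new monitored aggregate for operator 1
after = cond.violated()    # ratio now exceeds 1, so the condition is violated
```

In this sketch a constant is modeled as a CSNode without parents, which matches the remark that CSNodes can represent constant values.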
Although the PlanOptTree is a graph, we call it a tree because, from the viewpoint of
statistic maintenance, only the tree from strata 1 to 3 is relevant, while from the viewpoint
of directed optimization, each optimality condition is the root of a tree from strata 5 to
3. All references to children and parents are maintained as sorted lists ordered by their
identifier. Conceptually, known index structures can be used instead of lists. Furthermore,
each node of stratum 4 and stratum 5 is reachable over multiple paths. For this reason,
a PlanOptTree includes a MEMO structure in order to mark subgraphs that have already
been evaluated. Finally, we are able to exploit the following four fundamental properties:
• Minimal Monitoring: The PlanOptTree includes only operators and statistics that
are referenced by at least one optimality condition. Thus, we can easily determine the
relevant statistics for minimal statistics monitoring (given by stratum 2 and stratum 3).
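The minimal monitoring property can be illustrated with a short sketch. It assumes, purely for illustration, that each optimality condition is given as the list of `(nid, stat_type)` pairs it ultimately depends on. The function name `relevant_statistics` and the statistic type names are hypothetical and not taken from the source.

```python
from collections import defaultdict


def relevant_statistics(conditions):
    """Collect, per operator nid, the statistic types used by any condition.

    `conditions` is an iterable of optimality conditions, each given as a
    list of (nid, stat_type) pairs (an assumed, simplified representation).
    The result corresponds to strata 2 and 3: operators and their statistics.
    """
    relevant = defaultdict(set)
    for operands in conditions:
        for nid, stat_type in operands:
            relevant[nid].add(stat_type)
    return {nid: sorted(types) for nid, types in sorted(relevant.items())}


# Hypothetical conditions over operators 1, 2 and 3.
conditions = [
    [(1, "selectivity"), (2, "selectivity")],
    [(2, "selectivity"), (3, "cardinality")],
]
to_monitor = relevant_statistics(conditions)


def should_monitor(nid, stat_type):
    # A statistic is monitored only if some optimality condition uses it.
    return stat_type in to_monitor.get(nid, [])
```

With these inputs, a statistic such as the hypothetical `"execution_time"` of operator 1, or any statistic of an operator that no condition mentions, would not be monitored.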