Roxana - Gabriela HORINCAR Refresh Strategies and Online ... - LIP6
eaches (and possibly exceeds) the size of its publication window:<br />
Div(s, t, Tr) ≥ Ws.<br />
In consequence, we distinguish between the total number of items published by a<br />
source since the last refresh (stream divergence) and the number of those items that are<br />
still available in its publication window (window divergence).<br />
Definition 2.1.5. Stream divergence relevant to query q<br />
Let F (s, t) be the publication stream of source feed s. Let function new(F (s, t)) be the<br />
sequence of new items generated by s since its last refresh moment Tr. We define the<br />
stream divergence as:<br />
DivF (s, q, t, Tr) = |q(new(F (s, t)))|<br />
Note that the divergence function introduced before (Definition 2.1.2) corresponds to the<br />
stream divergence.<br />
Definition 2.1.6. Window divergence relevant to query q<br />
Let A(s, t) be the publication window of source feed s. Let function new(A(s, t)) return<br />
the sequence of new items published since time moment Tr and still available at source s<br />
at time moment t. We define the window divergence as:<br />
DivA(s, q, t, Tr) = |q(new(A(s, t)))|<br />
Observe that stream and window divergence functions relevant to query q are equal if the<br />
source s is not yet saturated at time moment t:<br />
if |new(A(s, t))| = |new(F (s, t))| < Ws, then DivA(s, q, t, Tr) = DivF (s, q, t, Tr)<br />
Otherwise, as source s becomes saturated, stream divergence exceeds the window divergence<br />
value:<br />
if |new(A(s, t))| = Ws, |new(F (s, t))| > Ws, then DivA(s, q, t, Tr) < DivF (s, q, t, Tr)<br />
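The two definitions and the saturation conditions above can be checked on a small example. The following is only a minimal sketch under stated assumptions: items are modelled as integer sequence numbers, a query is modelled as a per-item filter predicate, and the names new_stream, new_window, divergence, WS and REFRESHED are hypothetical and do not come from the original text.

```python
from typing import Callable, List, Sequence

Item = int  # assumption: an item is identified by its publication sequence number
Query = Callable[[Item], bool]  # assumption: a query is a per-item filter predicate


def new_stream(published: Sequence[Item], n_at_refresh: int) -> List[Item]:
    """new(F(s, t)): every item published since the last refresh Tr."""
    return list(published[n_at_refresh:])


def new_window(published: Sequence[Item], n_at_refresh: int, ws: int) -> List[Item]:
    """new(A(s, t)): items published since Tr that are still in the window of size Ws."""
    window_start = max(len(published) - ws, 0)  # older items have been evicted
    return list(published[max(n_at_refresh, window_start):])


def divergence(items: Sequence[Item], query: Query) -> int:
    """|q(items)|: number of items selected by the query."""
    return sum(1 for item in items if query(item))


union: Query = lambda item: True            # sel = 1
evens: Query = lambda item: item % 2 == 0   # filtering query, sel < 1

WS = 5          # publication window size Ws
REFRESHED = 2   # number of items already published at the last refresh Tr

before = list(range(6))    # 4 new items < WS: source not yet saturated
after = list(range(10))    # 8 new items > WS: source saturated

for label, published in (("not saturated", before), ("saturated", after)):
    f_items = new_stream(published, REFRESHED)
    a_items = new_window(published, REFRESHED, WS)
    print(label,
          "DivF(union) =", divergence(f_items, union),
          "DivA(union) =", divergence(a_items, union),
          "DivF(evens) =", divergence(f_items, evens),
          "DivA(evens) =", divergence(a_items, evens))
```

In this sketch the two divergences coincide before saturation, and the window divergence falls below the stream divergence once items have been evicted. For a filtering query the gap appears only if at least one evicted item matched the query, which is the case in the example data.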
We introduce an example in Figure 2.2 to illustrate how the stream and window divergence<br />
functions evolve over time for different types of aggregation queries:<br />
a simple union q1, with sel(q1) = 1, and a filtering query q2, with sel(q2) < 1.<br />
We consider that source s becomes saturated at time instant Tsat. At that instant, both the<br />
stream and window divergence functions relevant to query q1 reach the capacity of the<br />
publication window Ws, and those relevant to query q2 reach a value estimated near<br />
sel(q2) · Ws. In Figure 2.2 the saturation point is marked by red lines.<br />
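The evolution described for Figure 2.2 can be reproduced numerically. The sketch below is illustrative only and is not the thesis' experimental setup: it assumes one item published per time tick, a last refresh at tick 0, and a filtering query that keeps even-numbered items, so that sel(q2) = 0.5. The names simulate, WS, T_SAT, q1 and q2 are hypothetical.

```python
from typing import Callable, List, Tuple


def simulate(ws: int, ticks: int,
             query: Callable[[int], bool]) -> List[Tuple[int, int, int]]:
    """Return (t, stream divergence, window divergence) for t = 1..ticks.

    Assumptions: one item is published per tick, items are numbered 1..t,
    and the last refresh happened at t = 0.
    """
    rows = []
    for t in range(1, ticks + 1):
        stream = range(1, t + 1)                   # all items since the refresh
        window = range(max(1, t - ws + 1), t + 1)  # last ws items still available
        rows.append((t,
                     sum(1 for i in stream if query(i)),
                     sum(1 for i in window if query(i))))
    return rows


WS = 10      # publication window size Ws
T_SAT = WS   # with one item per tick, the source saturates at tick Ws

q1 = lambda item: True            # union query, sel(q1) = 1
q2 = lambda item: item % 2 == 0   # filtering query, sel(q2) = 0.5

rows_q1 = simulate(WS, 15, q1)
rows_q2 = simulate(WS, 15, q2)

for name, rows in (("q1", rows_q1), ("q2", rows_q2)):
    for t, div_f, div_a in rows:
        marker = "  <- Tsat" if t == T_SAT else ""
        print(f"{name} t={t:2d} DivF={div_f:2d} DivA={div_a:2d}{marker}")
```

Under these assumptions both divergences for q1 equal Ws at Tsat and both divergences for q2 equal sel(q2) · Ws = 5. After Tsat the stream divergence keeps growing while the window divergence stays bounded by the window size.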