Chapter 06 - Changing Education Paradigm
28 CHAPTER 6. MINING ASSOCIATION RULES IN LARGE DATABASES<br />
1-var Constraint | Anti-Monotone | Succinct<br />
S θ v, θ ∈ {=, ≤, ≥} | yes | yes<br />
v ∈ S | no | yes<br />
S ⊇ V | no | yes<br />
S ⊆ V | yes | yes<br />
S = V | partly | yes<br />
min(S) ≤ v | no | yes<br />
min(S) ≥ v | yes | yes<br />
min(S) = v | partly | yes<br />
max(S) ≤ v | yes | yes<br />
max(S) ≥ v | no | yes<br />
max(S) = v | partly | yes<br />
count(S) ≤ v | yes | weakly<br />
count(S) ≥ v | no | weakly<br />
count(S) = v | partly | weakly<br />
sum(S) ≤ v | yes | no<br />
sum(S) ≥ v | no | no<br />
sum(S) = v | partly | no<br />
avg(S) θ v, θ ∈ {=, ≤, ≥} | no | no<br />
(frequency constraint) | (yes) | (no)<br />
Table 6.3: Characterization of 1-variable constraints: anti-monotonicity and succinctness.<br />
either. This property is used at each iteration of the Apriori algorithm to reduce the number of candidate itemsets<br />
examined, thereby reducing the search space for association rules.<br />
Other examples of anti-monotone constraints include "min(J.price) ≥ 500" and "S.year = 1998". Any itemset<br />
which violates either of these constraints can be discarded, since adding more items to such an itemset can never make it satisfy<br />
the constraints. A constraint such as "avg(I.price) ≤ 100" is not anti-monotone. For a given set that does not satisfy<br />
this constraint, a superset created by adding some (cheap) items may result in satisfying the constraint. Hence,<br />
pushing this constraint inside the mining process will not guarantee completeness of the data mining query response.<br />
A list of 1-variable constraints, characterized on the notion of anti-monotonicity, is given in the second column of<br />
Table 6.3.<br />
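The pruning argument above can be sketched in a few lines of Python. This is only an illustrative sketch: the item names, the prices, and the helper names `min_price_at_least` and `levelwise` are assumptions made up for the example, and support counting is left out so that only the constraint-based pruning is visible.

```python
from itertools import combinations

# Hypothetical item prices, invented for illustration only.
PRICES = {"tv": 900, "camera": 650, "phone": 500, "cable": 20}

def min_price_at_least(itemset, threshold=500):
    """Anti-monotone constraint: min(S.price) >= threshold."""
    return min(PRICES[i] for i in itemset) >= threshold

def levelwise(items, constraint, max_size):
    """Grow itemsets level by level, discarding every set that violates the constraint.

    Because the constraint is anti-monotone, a violating set has no satisfying
    superset, so discarding it early loses no answers.
    """
    level = [frozenset([i]) for i in items if constraint([i])]
    result = list(level)
    for size in range(2, max_size + 1):
        survivors = set(level)
        # Join surviving sets of the previous level into candidates of the next size.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Keep a candidate only if all of its (size-1)-subsets survived
        # and the candidate itself satisfies the constraint.
        level = [c for c in candidates
                 if all(frozenset(s) in survivors for s in combinations(c, size - 1))
                 and constraint(c)]
        result.extend(level)
    return result

satisfying = levelwise(PRICES, min_price_at_least, max_size=3)
```

In this sketch the item "cable" is dropped at the first level, so no larger set containing it is ever generated. The same early discard would not be safe for a constraint such as "avg(I.price) ≤ 100", for the reason given in the paragraph above.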
"What other kinds of constraints can we use for pruning the search space?" Apriori-like algorithms deal with other<br />
constraints by first generating candidate sets and then testing them for constraint satisfaction, thereby following a<br />
generate-and-test paradigm. Instead, is there a kind of constraint for which we can somehow enumerate all and only<br />
those sets that are guaranteed to satisfy the constraint? This property of constraints is called succinctness. If a rule<br />
constraint is succinct, then we can directly generate precisely those sets that satisfy it, even before support counting<br />
begins. This avoids the substantial overhead of the generate-and-test paradigm. In other words, such constraints are<br />
pre-counting prunable. Let's study an example of how succinct constraints can be used in mining association rules.<br />
Example 6.8 Based on Table 6.3, the constraint "min(J.price) ≤ 500" is succinct. This is because we can explicitly<br />
and precisely generate all the sets of items satisfying the constraint. Specifically, such a set must contain at least<br />
one item whose price is no more than $500. It is of the form S1 ∪ S2, where S1 ≠ ∅ is a subset of the set of all those<br />
items with prices of at most $500, and S2, possibly empty, is a subset of the set of all those items with prices > $500.<br />
Because there is a precise "formula" to generate all the sets satisfying a succinct constraint, there is no need to<br />
iteratively check the rule constraint during the mining process.<br />
What about the constraint "min(J.price) ≥ 500", which occurs in Example 6.7? This is also succinct, since we<br />
can generate all sets of items satisfying the constraint. In this case, we simply do not include items whose price is<br />
less than $500, since they cannot be in any set that would satisfy the given constraint. □<br />
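The two generation "formulas" of Example 6.8 can be written out directly. The following Python sketch is illustrative only: the price table and the function names `powerset`, `sets_with_min_at_most`, and `sets_with_min_at_least` are hypothetical, and the item set is kept tiny so that full enumeration is feasible.

```python
from itertools import chain, combinations

# Hypothetical item prices, invented for illustration only.
PRICES = {"tv": 900, "camera": 650, "phone": 500, "cable": 20}

def powerset(items, min_size=0):
    """All subsets of items with at least min_size elements, as tuples."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, k) for k in range(min_size, len(items) + 1))

def sets_with_min_at_most(prices, v):
    """Enumerate exactly the itemsets with min(S.price) <= v, as S1 union S2."""
    cheap = [i for i, p in prices.items() if p <= v]  # S1 is drawn from here, non-empty
    rest = [i for i, p in prices.items() if p > v]    # S2 is drawn from here, may be empty
    return [frozenset(s1) | frozenset(s2)
            for s1 in powerset(cheap, min_size=1)
            for s2 in powerset(rest)]

def sets_with_min_at_least(prices, v):
    """Enumerate exactly the itemsets with min(S.price) >= v by excluding cheaper items."""
    allowed = [i for i, p in prices.items() if p >= v]
    return [frozenset(s) for s in powerset(allowed, min_size=1)]

at_most_500 = sets_with_min_at_most(PRICES, 500)
at_least_500 = sets_with_min_at_least(PRICES, 500)
```

Neither function ever tests a generated set against the constraint: the sets are correct by construction, which is the sense in which a succinct constraint avoids the generate-and-test paradigm.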
Note that a constraint such as "avg(I.price) ≤ 100" could not be pushed into the mining process, since it is<br />
neither anti-monotone nor succinct according to Table 6.3.<br />
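A small numeric check makes the failure of anti-monotonicity concrete. The two items and their prices below are assumptions chosen for illustration, and `avg_at_most` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical prices, invented for illustration only.
prices = {"laptop": 150, "cable": 20}

def avg_at_most(itemset, v=100):
    """Constraint avg(S.price) <= v."""
    return mean(prices[i] for i in itemset) <= v

small = {"laptop"}            # average price 150
bigger = {"laptop", "cable"}  # average price (150 + 20) / 2 = 85

violates_small = not avg_at_most(small)
satisfies_bigger = avg_at_most(bigger)
```

Here the set {laptop} violates the constraint while its superset {laptop, cable} satisfies it. Discarding {laptop} early, as one would for an anti-monotone constraint, would therefore lose a valid answer.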
Although optimizations associated with succinctness (or anti-monotonicity) cannot be applied to constraints like<br />
"avg(I.price) ≤ 100", heuristic optimization strategies are applicable and can often lead to significant pruning.