10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee



Table 3.4. An example database for FP-growth

Tid   Transaction        Ordered Transaction
100   {a, b, d, e, f}    {b, d, f, a, e}
200   {b, f, g}          {b, f, g}
300   {d, g, h, i}       {d, g}
400   {a, c, e, g, j}    {g, a, e}
500   {b, d, f}          {b, d, f}

Fig. 3.2. FP-tree of the example database
[Figure not reproduced. Header table (item, support, node-link): b 3, d 3, f 3, g 3, a 2, e 2. Tree nodes under the root: d:2, f:2, b:3, f:1, g:1, d:1, g:1, g:1, a:1, e:1, a:1, e:1.]

consists of the set of prefix paths in the FP-tree co-occurring with the suffix pattern), then constructing its conditional FP-tree and performing mining recursively on such a tree. The pattern growth is achieved by the concatenation of the suffix pattern with the frequent patterns generated from a conditional FP-tree.

Example 2. Let our example database be the database shown in Table 3.4, with min_sup = 2. First, the supports of all items are accumulated and all infrequent items are removed from the database. The items in each transaction are then reordered by support in descending order, resulting in the transformed database shown in the Ordered Transaction column of Table 3.4. The FP-tree for this database is shown in Figure 3.2. The pseudo code of the FP-growth algorithm is presented in Algorithm 3.3 [107].

Although the authors of the FP-growth algorithm [107] claim that their algorithm does not generate any candidate itemsets, some works (e.g., [102]) have shown that the algorithm actually generates many candidate itemsets, since it essentially uses the same candidate generation technique as Apriori but without its prune step. Another issue with the FP-tree is that constructing the frequent pattern tree is time-consuming.

Algorithm 3.3: FP-growth
Input: A transaction database D, a frequent pattern tree FP-tree, a user-specified threshold min_sup
Output: Frequent itemsets F
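The preprocessing and tree construction described in Example 2 can be sketched in Python as follows. This is a minimal illustration, not the pseudo code of Algorithm 3.3: the names FPNode, order_transactions, build_fp_tree and conditional_pattern_base are hypothetical, the node-links of the header table are approximated by plain Python lists, and ties in support are assumed to be broken alphabetically, which happens to reproduce the ordering in Table 3.4.

```python
from collections import Counter

# Transactions from Table 3.4 (Tid -> items); the threshold is that of Example 2.
transactions = {
    100: ["a", "b", "d", "e", "f"],
    200: ["b", "f", "g"],
    300: ["d", "g", "h", "i"],
    400: ["a", "c", "e", "g", "j"],
    500: ["b", "d", "f"],
}
MIN_SUP = 2


def order_transactions(db, min_sup):
    """Count supports, drop infrequent items, sort each transaction."""
    support = Counter()
    for items in db.values():
        support.update(set(items))  # each item counts once per transaction
    frequent = {i: s for i, s in support.items() if s >= min_sup}
    # Assumption: descending support, ties broken alphabetically.
    ordered = {
        tid: sorted((i for i in set(items) if i in frequent),
                    key=lambda i: (-frequent[i], i))
        for tid, items in db.items()
    }
    return frequent, ordered


class FPNode:
    """One FP-tree node: an item, its count, a parent and children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_db):
    """Insert each ordered transaction as a path starting at the root."""
    root = FPNode(None)
    header = {}  # item -> list of nodes (stand-in for node-links)
    for items in ordered_db.values():
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def conditional_pattern_base(header, item):
    """Prefix paths that co-occur with the suffix item, with their counts."""
    base = []
    for node in header.get(item, []):
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((tuple(reversed(path)), node.count))
    return base


frequent, ordered = order_transactions(transactions, MIN_SUP)
root, header = build_fp_tree(ordered)
for tid, items in ordered.items():
    print(tid, items)
print("prefix paths for suffix e:", conditional_pattern_base(header, "e"))
```

Run on the example database, the sketch yields the same item supports as the header table of Figure 3.2 and the same ordered transactions as Table 3.4. The recursive mining step on conditional FP-trees is deliberately left out, since the body of Algorithm 3.3 is not available here.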
