10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee


3 Algorithms and Techniques

Method: call FP-growth(FP-tree, null)
Procedure FP-growth(Tree, α):
  If Tree contains a single prefix-path then
    Let P be the single prefix-path part of Tree;
    Let Q be the multipath part with the top branching node replaced by a null root;
    For each combination β of the nodes in P do
      Generate pattern β ∪ α with support = minimum support of nodes in β;
      Let freq_pattern_set(P) be the set of patterns generated;
    end
  else
    Let Q be Tree;
  end
  For each item a_i ∈ Q do
    Generate pattern β = a_i ∪ α with support = a_i.support;
    Construct β’s conditional pattern-base and then β’s conditional FP-tree Tree_β;
    If Tree_β ≠ ∅ then call FP-growth(Tree_β, β);
    Let freq_pattern_set(Q) be the set of patterns generated;
  end
  return (freq_pattern_set(P) ∪ freq_pattern_set(Q) ∪ (freq_pattern_set(P) × freq_pattern_set(Q)));

3.1.3 Sequential Pattern Mining

The sequential mining problem was first introduced in [11]; two examples of sequential patterns are: “80% of the people who buy a television also buy a video camera within a day”, and “Every time Microsoft stock drops by 5%, then IBM stock will also drop by at least 4% within three days”. Such patterns can be used to determine the efficient use of shelf space for customer convenience, or to properly plan the next step during an economic crisis. Sequential pattern mining is also very important for analyzing biological data [18, 86], in which a very small alphabet (i.e., 4 for DNA sequences and 20 for protein sequences) and long patterns with a typical length of a few hundred or even thousands frequently appear.

Sequence discovery can be thought of as essentially an association discovery over a temporal database.
While association rules [9, 138] discern only intra-event patterns (itemsets), sequential pattern mining discerns inter-event patterns (sequences). There are many other important tasks related to association rule mining, such as correlations [42], causality [228], episodes [176], multi-dimensional patterns [154, 132], max-patterns [24], partial periodicity [105], and emerging patterns [78]. A thorough study of the sequential pattern mining problem should help in finding efficient solutions to the other research problems listed above.

Efficient sequential pattern mining methodologies have been studied extensively in many related problems, including general sequential pattern mining [11, 232, 269, 202, 14], constraint-based sequential pattern mining [95], incremental sequential pattern mining [200], frequent episode mining [175], approximate sequential pattern mining [143], partial periodic pattern mining [105], temporal pattern mining in data streams [242], and maximal and closed sequential pattern mining [169, 261, 247]. In this section, due to space limitations, we focus on introducing the general sequential pattern mining algorithm, which is the most basic one because all the others can benefit from the strategies it employs, i.e., Apriori heuristic and
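
The difference between intra-event patterns (itemsets) and inter-event patterns (sequences) can be made concrete with a small Python sketch. It is an illustration only, not the algorithm of [11]. It assumes each customer history is an ordered list of events, each event is a set of items, and the support of a sequential pattern is the fraction of histories that contain it in order. The function names and the toy database are hypothetical.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]
EventSequence = Sequence[Itemset]


def contains(sequence: EventSequence, pattern: EventSequence) -> bool:
    """Return True if the pattern's itemsets occur, in order, in distinct events."""
    pos = 0
    for wanted in pattern:
        # Greedily advance to the earliest event that includes this itemset.
        while pos < len(sequence) and not wanted <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern element must match a strictly later event
    return True


def sequence_support(database: List[EventSequence], pattern: EventSequence) -> float:
    """Inter-event support: fraction of sequences that contain the pattern."""
    hits = sum(1 for seq in database if contains(seq, pattern))
    return hits / len(database)


def itemset_support(database: List[EventSequence], items: Itemset) -> float:
    """Intra-event support: fraction of single events that include all items."""
    events = [event for seq in database for event in seq]
    return sum(1 for event in events if items <= event) / len(events)


# Hypothetical toy database: one ordered purchase history per customer.
database: List[EventSequence] = [
    [frozenset({"tv"}), frozenset({"camera"})],
    [frozenset({"tv", "cable"}), frozenset({"camera"})],
    [frozenset({"camera"}), frozenset({"tv"})],
    [frozenset({"tv"}), frozenset({"radio"})],
]

tv_then_camera = [frozenset({"tv"}), frozenset({"camera"})]
camera_then_tv = [frozenset({"camera"}), frozenset({"tv"})]

print(sequence_support(database, tv_then_camera))
print(sequence_support(database, camera_then_tv))
print(itemset_support(database, frozenset({"tv"})))
```

Order matters here: the pattern "tv, then camera" is contained in two of the four histories, while the reversed pattern is contained in only one, although both use the same items. An itemset measure that looks at events in isolation cannot express that distinction.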
