10.11.2016 Views

Learning Data Mining with Python


Recommending Movies Using Affinity Analysis

The Apriori algorithm

The Apriori algorithm is part of our affinity analysis and deals specifically with finding frequent itemsets within the data. The basic procedure of Apriori builds up new candidate itemsets from previously discovered frequent itemsets. These candidates are tested to see if they are frequent, and then the algorithm iterates as explained here:

1. Create initial frequent itemsets by placing each item in its own itemset. Only items with at least the minimum support are used in this step.

2. Create new candidate itemsets from the most recently discovered frequent itemsets by finding supersets of the existing frequent itemsets.

3. Test all candidate itemsets to see if they are frequent. If a candidate is not frequent, discard it. If there are no new frequent itemsets from this step, go to the last step.

4. Store the newly discovered frequent itemsets and go to the second step.

5. Return all of the discovered frequent itemsets.
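The five steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the book's implementation: the function name `apriori`, the representation of the data as a list of item sets, and the treatment of `min_support` as an absolute count of transactions are all assumptions made for the example.

```python
from collections import defaultdict


def apriori(transactions, min_support):
    """Map each frequent itemset (a frozenset) to the number of
    transactions that contain it.

    Hypothetical helper: `min_support` is assumed to be an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Step 1: place each item in its own itemset and keep only the
    # itemsets that reach the minimum support.
    counts = defaultdict(int)
    for transaction in transactions:
        for item in transaction:
            counts[frozenset([item])] += 1
    current = {s: c for s, c in counts.items() if c >= min_support}

    all_frequent = {}
    while current:
        # Step 4: store the most recently discovered frequent itemsets.
        all_frequent.update(current)

        # Step 2: candidates are supersets, one item larger, of the
        # frequent itemsets found in the previous round.
        counts = defaultdict(int)
        for transaction in transactions:
            candidates = {
                itemset | {item}
                for itemset in current
                if itemset <= transaction
                for item in transaction - itemset
            }
            # Each candidate is counted once per transaction.
            for candidate in candidates:
                counts[candidate] += 1

        # Step 3: discard candidates that are not frequent. An empty
        # result ends the loop, which corresponds to the last step.
        current = {s: c for s, c in counts.items() if c >= min_support}

    # Step 5: return everything discovered.
    return all_frequent


# Example data (made up for illustration).
baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent = apriori(baskets, min_support=3)
```

With these five example baskets and a minimum support of 3, every single item and every pair is frequent, while the triple `{a, b, c}` appears in only two baskets and is discarded. Only supersets of already-frequent itemsets are ever counted, which is what keeps the search from enumerating every possible combination.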

This process is outlined in the following workflow:

