08.06.2015 Views

Building Machine Learning Systems with Python - Richert, Coelho

Chapter 8

Refer to the following code:

    def rules_from_itemset(itemset, dataset):
        # Assumes each transaction in dataset is a set (or frozenset)
        # and that the itemset occurs in the dataset at least once
        itemset = frozenset(itemset)
        nr_transactions = float(len(dataset))
        for item in itemset:
            # Each item in turn is the consequent; the rest is the antecedent
            consequent = frozenset([item])
            antecedent = itemset - consequent
            # base: consequent count, turned into the base rate below
            base = 0.0
            # acount: antecedent count
            acount = 0.0
            # ccount: count of antecedent and consequent together
            ccount = 0.0
            for d in dataset:
                if item in d: base += 1
                if d.issuperset(itemset): ccount += 1
                if d.issuperset(antecedent): acount += 1
            base /= nr_transactions
            p_y_given_x = ccount / acount
            lift = p_y_given_x / base
            print('Rule {0} -> {1} has lift {2}'
                  .format(antecedent, consequent, lift))

This code runs slowly: for every item in the itemset, it iterates over the whole dataset again. A better implementation would cache the counts for speed. You can download such an implementation from the book's website, and it does indeed run much faster.
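As a rough illustration of what caching the counts could look like, the sketch below collects every count the rules need in a single pass over the dataset and then derives each lift from the cached values. It is not the implementation from the book's website; the name rules_from_itemset_cached, the returned list of tuples, and the toy dataset are assumptions made for this example.

```python
def rules_from_itemset_cached(itemset, dataset):
    """Return (antecedent, consequent, lift) for each single-item consequent.

    Hypothetical helper: assumes every transaction is a set or frozenset
    and that the itemset occurs in the dataset at least once.
    """
    itemset = frozenset(itemset)
    nr_transactions = float(len(dataset))

    # Every set whose count is needed: the full itemset, each single
    # item (consequent), and the itemset without that item (antecedent).
    needed = {itemset}
    for item in itemset:
        consequent = frozenset([item])
        needed.add(consequent)
        needed.add(itemset - consequent)

    # Single pass over the dataset, caching all counts.
    counts = dict.fromkeys(needed, 0)
    for d in dataset:
        for s in needed:
            if d.issuperset(s):
                counts[s] += 1

    # Derive the rules from the cached counts only.
    rules = []
    for item in itemset:
        consequent = frozenset([item])
        antecedent = itemset - consequent
        base = counts[consequent] / nr_transactions
        p_y_given_x = counts[itemset] / float(counts[antecedent])
        rules.append((antecedent, consequent, p_y_given_x / base))
    return rules


# Toy data, made up for illustration
toy_dataset = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 2, 3}, {4}]
for antecedent, consequent, lift in rules_from_itemset_cached([1, 2, 3], toy_dataset):
    print('Rule {0} -> {1} has lift {2}'.format(
        sorted(antecedent), sorted(consequent), lift))
```

The dataset is read once per itemset instead of once per item. A fuller version might share one cache across many itemsets, which this sketch does not attempt.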

Some of the results are shown in the following table:

Antecedent          Consequent   Consequent count   Antecedent count   Antecedent and consequent count   Lift
1378, 1379, 1380    1269         279 (0.3%)         80                 57                                225
48, 41, 976         117          1026 (1.1%)        122                51                                35
48, 41, 16011       16010        1316 (1.5%)        165                159                               64

Each count is the number of transactions that contain the following:

• The consequent alone (that is, the base rate at which that product is bought)<br />

• All the items in the antecedent<br />

• All the items in the antecedent and the consequent<br />
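To make the link between these three counts and the lift explicit, here is a minimal sketch of the arithmetic. The function name lift_from_counts is hypothetical, and the numbers in the usage line are made up for illustration; they are not taken from the table, since the total number of transactions is not stated here.

```python
def lift_from_counts(consequent_count, antecedent_count, both_count,
                     nr_transactions):
    """Combine the three transaction counts into a lift value."""
    # Base rate: fraction of all transactions containing the consequent
    base = consequent_count / float(nr_transactions)
    # Fraction of antecedent transactions that also contain the consequent
    p_y_given_x = both_count / float(antecedent_count)
    return p_y_given_x / base


# Made-up counts: 1000 transactions, consequent in 50 of them,
# antecedent in 20, both together in 10
example_lift = lift_from_counts(50, 20, 10, 1000)
print('lift = {0:.1f}'.format(example_lift))
```

A lift well above 1 means the consequent shows up far more often alongside the antecedent than its base rate alone would suggest.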
