08.06.2015 Views

Building Machine Learning Systems with Python - Richert, Coelho


Chapter 8

We are now going to use this binary matrix to make predictions of movie ratings. The general algorithm will be (in pseudocode) as follows:

1. For each user, rank every other user in terms of closeness. For this step, we will use the binary matrix and use correlation as the measure of closeness (interpreting the binary matrix as zeros and ones allows us to perform this computation).

2. When we need to estimate a rating for a user-movie pair, we look at the neighbors of the user sequentially (as defined in step 1). When we first find a rating for the movie in question, we report it.
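The two steps above can be sketched in a few lines of NumPy. This is only an illustration under assumptions the text does not state: `reviews` is taken to be a dense users-by-movies array in which 0 means "not rated", and the helper name `estimate_rating` is hypothetical.

```python
import numpy as np

def estimate_rating(reviews, user_id, movie_id):
    """Hypothetical helper: report the rating of the closest neighbor
    who has rated the movie. Assumes 0 means "not rated"."""
    binary = (reviews > 0).astype(float)

    # Step 1: correlate this user's rated/not-rated pattern with every user.
    # A user whose pattern is constant yields NaN, so silence that warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        corrs = np.array(
            [np.corrcoef(binary[user_id], other)[0, 1] for other in binary])
    corrs = np.where(np.isnan(corrs), -np.inf, corrs)
    corrs[user_id] = -np.inf  # the user is not their own neighbor

    # Step 2: walk the neighbors from most to least similar and
    # report the first rating found for the movie in question.
    for neighbor in np.argsort(corrs)[::-1]:
        if neighbor != user_id and reviews[neighbor, movie_id] > 0:
            return reviews[neighbor, movie_id]
    return None  # nobody else has rated this movie

# Made-up ratings for three users and four movies (illustrative data only).
reviews = np.array([
    [5, 3, 0, 0],
    [4, 2, 4, 0],
    [0, 0, 1, 5],
])
prediction = estimate_rating(reviews, user_id=0, movie_id=2)
```

Here user 1 shares the most rated movies with user 0, so user 1's rating for movie 2 is the one reported.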

To implement this, we first write a simple NumPy function. NumPy ships with np.corrcoef, which computes correlations. This is a very generic function: given two vectors, it returns the full 2x2 correlation matrix even when only a single, traditional correlation is needed. Therefore, to compute the correlation between two users, we take the off-diagonal entry of the result:

    corr_between_user1_and_user2 = np.corrcoef(user1, user2)[0,1]
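To see why the `[0,1]` index is needed, it can help to look at what `np.corrcoef` returns for two vectors. The vectors below are made-up binary "has rated" patterns, used only for illustration.

```python
import numpy as np

# Two made-up binary "has rated" vectors (illustrative data only).
user1 = np.array([1, 0, 1, 1, 0])
user2 = np.array([1, 0, 0, 1, 0])

matrix = np.corrcoef(user1, user2)
print(matrix.shape)  # (2, 2)

# The diagonal holds each vector's correlation with itself; the two
# off-diagonal entries are equal and hold the value we actually want.
corr_between_user1_and_user2 = matrix[0, 1]
print(corr_between_user1_and_user2)
```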

In fact, we will want to compute the correlation between a user and all the other users. We will use this operation a few times, so we wrap it in a function named all_correlations:

import numpy as np

def all_correlations(bait, target):
    '''
    corrs = all_correlations(bait, target)

    corrs[i] is the correlation between bait and target[i]
    '''
    return np.array(
        [np.corrcoef(bait, c)[0,1]
            for c in target])
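A short usage example may make the calling convention concrete: `bait` is one user's row and `target` is the whole matrix. The data is made up, and `all_correlations_vectorized` is a hypothetical alternative (not from the text) that should give the same numbers without a Python-level loop, assuming no row is constant.

```python
import numpy as np

def all_correlations(bait, target):
    # Same loop-based function as in the text.
    return np.array([np.corrcoef(bait, c)[0, 1] for c in target])

def all_correlations_vectorized(bait, target):
    """Hypothetical alternative: center the rows once, then use one
    matrix-vector product. Assumes no row of target is constant."""
    bait = np.asarray(bait, dtype=float)
    target = np.asarray(target, dtype=float)
    b = bait - bait.mean()
    t = target - target.mean(axis=1, keepdims=True)
    return np.dot(t, b) / (np.linalg.norm(t, axis=1) * np.linalg.norm(b))

# Made-up binary matrix: one row per user, one column per movie.
binary = np.array([
    [1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 1, 0],
], dtype=float)

corrs = all_correlations(binary[0], binary)
corrs_fast = all_correlations_vectorized(binary[0], binary)
```

Note that `corrs[0]` is the correlation of user 0 with themselves, so code that ranks neighbors has to skip that entry.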
