Learning Data Mining with Python

movie_name_data.columns = ["MovieID", "Title", "Release Date",
                           "Video Release", "IMDB", "<UNK>", "Action", "Adventure",
                           "Animation", "Children's", "Comedy", "Crime", "Documentary",
                           "Drama", "Fantasy", "Film-Noir",
                           "Horror", "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller",
                           "War", "Western"]

Chapter 4
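Assigning to columns, as in the listing above, replaces every header at once, so the list has to supply exactly one name per column. A minimal sketch on a toy three-column frame; the rows and the shortened name list are made up for illustration and are not taken from the dataset:

```python
import io
import pandas as pd

# Hypothetical pipe-delimited rows with no header line (illustrative only).
raw = io.StringIO("1|Movie A|01-Jan-1995\n2|Movie B|01-Jan-1996\n")
toy_movies = pd.read_csv(raw, delimiter="|", header=None)

# Without a header line, pandas labels the columns 0, 1, 2.
default_labels = list(toy_movies.columns)

# A list with the wrong number of names is rejected with a ValueError.
try:
    toy_movies.columns = ["MovieID", "Title"]
    length_checked = False
except ValueError:
    length_checked = True

# One name per column, in file order.
toy_movies.columns = ["MovieID", "Title", "Release Date"]
print(toy_movies)
```

The length check is a cheap safeguard: a dropped or extra name in a long list shows up immediately as an error rather than as silently shifted headers.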

Getting the movie title is important, so we will create a function that will return a movie's title from its MovieID, saving us the trouble of looking it up each time. Let's look at the code:

def get_movie_name(movie_id):

We look up the movie_name_data DataFrame for the given MovieID and return only the title column:

    title_object = movie_name_data[movie_name_data["MovieID"] == movie_id]["Title"]

We use the values attribute to get the actual value (and not the pandas Series object that is currently stored in title_object). We are only interested in the first value, since there should only be one title for a given MovieID anyway!

    title = title_object.values[0]

We end the function by returning the title as needed. Let's look at the code:

    return title
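Assembled in one piece, the function reads as below. This is a minimal sketch run against a toy stand-in for movie_name_data; the two rows are made up for illustration only.

```python
import pandas as pd

# Toy stand-in for movie_name_data (hypothetical rows, illustrative only).
movie_name_data = pd.DataFrame({
    "MovieID": [1, 2],
    "Title": ["Movie A", "Movie B"],
})

def get_movie_name(movie_id):
    # The boolean mask keeps only the rows whose MovieID matches.
    title_object = movie_name_data[movie_name_data["MovieID"] == movie_id]["Title"]
    # title_object is a pandas Series; .values exposes the underlying array.
    title = title_object.values[0]
    return title

print(get_movie_name(2))
```

One consequence of taking values[0]: a MovieID that is not in the frame leaves the Series empty, so the lookup raises an IndexError instead of returning a placeholder.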

In a new IPython Notebook cell, we adjust our previous code for printing out the top rules to also include the titles:

for index in range(5):
    print("Rule #{0}".format(index + 1))
    (premise, conclusion) = sorted_confidence[index][0]
    # Look up a title for every MovieID in the premise
    premise_names = ", ".join(get_movie_name(idx) for idx in premise)
    conclusion_name = get_movie_name(conclusion)
    print("Rule: If a person recommends {0} they will also recommend {1}".format(premise_names, conclusion_name))
    print(" - Confidence: {0:.3f}".format(confidence[(premise, conclusion)]))
    print("")
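To try the loop without the full dataset, the sketch below runs the same logic on toy data. The shapes of confidence and sorted_confidence are assumptions inferred from how the loop indexes them (a dictionary keyed by (premise, conclusion), and a list of its items sorted by value); the titles and confidence values are made up, and get_movie_name is replaced by a dictionary-backed stand-in.

```python
from operator import itemgetter

# Hypothetical titles and confidence values, for illustration only.
titles = {1: "Movie A", 2: "Movie B", 3: "Movie C"}

def get_movie_name(movie_id):
    # Stand-in for the DataFrame-based lookup.
    return titles[movie_id]

# Assumed shape: (premise, conclusion) -> confidence,
# where premise is a frozenset of MovieIDs.
confidence = {
    (frozenset([1]), 2): 0.9,
    (frozenset([1, 3]), 2): 0.75,
    (frozenset([2]), 3): 0.5,
}
sorted_confidence = sorted(confidence.items(), key=itemgetter(1), reverse=True)

lines = []
# Guard against having fewer than five rules.
for index in range(min(5, len(sorted_confidence))):
    (premise, conclusion) = sorted_confidence[index][0]
    # Sorting the premise makes the printed order deterministic.
    premise_names = ", ".join(get_movie_name(idx) for idx in sorted(premise))
    conclusion_name = get_movie_name(conclusion)
    lines.append(
        "Rule #{0}: If a person recommends {1} they will also recommend {2} "
        "(confidence {3:.3f})".format(
            index + 1, premise_names, conclusion_name,
            confidence[(premise, conclusion)]))
print("\n".join(lines))
```

Two small departures from the loop above are deliberate: min() keeps the loop from running past the end of a short rule list, and sorted(premise) fixes the order of titles, since a frozenset does not guarantee one.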
