Building Machine Learning Systems with Python - Richert, Coelho

Chapter 7

Fortunately, scikit-learn makes it very easy to do the right thing: it has classes named LassoCV, RidgeCV, and ElasticNetCV, all of which encapsulate a cross-validation check for the inner parameter. The code is almost identical to the previous one, except that we do not need to specify any value for alpha:

    import numpy as np
    from sklearn.linear_model import ElasticNetCV
    from sklearn.model_selection import KFold

    met = ElasticNetCV(fit_intercept=True)  # alpha is chosen by inner cross-validation
    kf = KFold(n_splits=10)
    err = 0.0
    for train, test in kf.split(data):
        met.fit(data[train], target[train])
        p = met.predict(data[test])  # predict all held-out rows at once
        e = p - target[test]
        err += np.dot(e, e)  # accumulate the squared error over the folds
    rmse_10cv = np.sqrt(err / len(target))

This results in a lot of computation, so you may want to get some coffee while you are waiting (depending on how fast your computer is).

Rating prediction and recommendations

If you have used any commercial online system in the last 10 years, you have probably seen these recommendations. Some are like Amazon's "customers who bought X also bought Y." These will be dealt with in the next chapter under the topic of basket analysis. Others are based on predicting the rating of a product, such as a movie.

This last problem was made famous by the Netflix Challenge, a million-dollar public machine learning challenge run by Netflix. Netflix (well-known in the U.S. and U.K., but not available elsewhere) is a movie rental company. Traditionally, you would receive DVDs in the mail; more recently, the business has focused on online streaming of videos. From the start, one of the distinguishing features of the service was that it gave every user the option of rating the films they had seen and then used these ratings to recommend other films. In this mode, you not only have the information about which films the user saw, but also their impression of them (including negative impressions).

In 2006, Netflix made available a large number of customer ratings of films in its database, and the goal was to improve on their in-house algorithm for ratings prediction.
Whoever was able to beat it by 10 percent or more would win 1 million dollars. In 2009, an international team named BellKor's Pragmatic Chaos was able to beat that mark and take the prize. They did so just 20 minutes before another team, The Ensemble, passed the 10 percent mark as well! It was an exciting photo finish for a competition that lasted several years.
Unfortunately, for legal reasons, this dataset is no longer available (although the data was anonymous, there were concerns that it might be possible to discover who the clients were and reveal the private details of movie rentals). However, we can use an academic dataset with similar characteristics. This data comes from GroupLens, a research laboratory at the University of Minnesota.

Machine learning in the real world

Much has been written about the Netflix Prize, and you may learn a lot by reading up on it (this book will have given you enough to start to understand the issues). The techniques that won were a mix of advanced machine learning with a lot of work in the preprocessing of the data. For example, some users like to rate everything very highly, while others are always more negative; if you do not account for this in preprocessing, your model will suffer. Other, less obvious normalizations were also necessary for a good result: how old the film is, how many ratings it received, and so on. Good algorithms are a good thing, but you always need to "get your hands dirty" and tune your methods to the properties of the data you have in front of you.

We can formulate this as a regression problem and apply the methods that we learned in this chapter. It is not a good fit for a classification approach. We could certainly attempt to learn a five-class classifier, with one class for each possible grade. There are two problems with this approach:

• Errors are not all the same. For example, mistaking a 5-star movie for a 4-star one is not as serious a mistake as mistaking a 5-star movie for a 1-star one.
• Intermediate values make sense. Even if our inputs are only integer values, it is perfectly meaningful to say that the prediction is 4.7. We can see that this is a different prediction than 4.2.

These two factors together mean that classification is not a good fit for the problem. The regression framework is more meaningful.
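To make the first point concrete, here is a minimal sketch comparing how a classification-style 0/1 loss and a regression-style squared error score the two mistakes described above. The numbers are illustrative only and are not taken from the dataset.

```python
import numpy as np

true_rating = 5.0
predictions = np.array([4.0, 1.0])  # a near miss and a bad miss

# Classification view: the 0/1 loss only asks whether the class was right.
zero_one_loss = (predictions != true_rating).astype(float)

# Regression view: the squared error grows with the distance from the truth.
squared_error = (predictions - true_rating) ** 2

print(zero_one_loss)   # both mistakes are charged the same
print(squared_error)   # the 1-star guess is charged far more
```

Under the 0/1 loss, both predictions count as one error each, whereas the squared error separates the near miss from the bad one.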
We have two choices: we can build movie-specific or user-specific models. In our case, we are going to first build user-specific models. This means that, for each user, we take the ratings that user has given to movies as our target variable. The inputs are the ratings given to the same movies by the other users. The resulting model will assign a high weight to users who are similar to our user (or a negative weight to users who like more or less the same movies that our user dislikes).

The system is just an application of what we have developed so far. You will find a copy of the dataset and code to load it into Python on the book's companion website. There you will also find pointers to more information, including the original MovieLens website.
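The layout of a user-specific model can be sketched as follows. This is only an illustration under stated assumptions, not the book's code: the ratings matrix is random toy data, a 0 is assumed to mean "not rated", the helper names fit_user_model and predict_user are hypothetical, and a closed-form ridge regression in NumPy stands in for the scikit-learn penalized regressors discussed in this chapter.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are movies, 0 means "not rated".
rng = np.random.default_rng(0)
n_users, n_movies = 8, 30
ratings = rng.integers(1, 6, size=(n_users, n_movies)).astype(float)
ratings[rng.random((n_users, n_movies)) < 0.3] = 0.0


def fit_user_model(ratings, user, alpha=1.0):
    """Ridge regression predicting one user's ratings from everyone else's."""
    rated = np.flatnonzero(ratings[user] > 0)          # movies this user rated
    others = np.delete(np.arange(ratings.shape[0]), user)
    X = ratings[others][:, rated].T                    # rows: movies, columns: other users
    y = ratings[user, rated]                           # target: this user's ratings
    X_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - X_mean
    # Closed-form ridge on centred data: w = (Xc^T Xc + alpha * I)^-1 Xc^T (y - mean)
    A = Xc.T @ Xc + alpha * np.eye(Xc.shape[1])
    weights = np.linalg.solve(A, Xc.T @ (y - y_mean))
    intercept = y_mean - X_mean @ weights
    return others, weights, intercept


def predict_user(ratings, movie, model):
    """Predict the modelled user's rating of `movie` from the other users' ratings."""
    others, weights, intercept = model
    return float(ratings[others, movie] @ weights + intercept)


model = fit_user_model(ratings, user=0)
unseen = np.flatnonzero(ratings[0] == 0)               # movies user 0 has not rated
predictions = [predict_user(ratings, m, model) for m in unseen]
```

Each entry of weights corresponds to one of the other users: a large positive value marks a user whose ratings move together with the target user's, and a negative value marks one whose tastes run the other way. Treating unrated movies as zeros in the inputs is a simplification made here for brevity.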