08.06.2015 Views

Building Machine Learning Systems with Python - Richert, Coelho


Classification – Detecting Poor Answers

>>> clf.fit(X, y)
>>> print(np.exp(clf.intercept_), np.exp(clf.coef_.ravel()))
[ 0.09437188] [ 1.80094112]
>>> def lr_model(clf, X):
...     return 1 / (1 + np.exp(-(clf.intercept_ + clf.coef_*X)))
...
>>> print("P(x=-1)=%.2f\tP(x=7)=%.2f"%(lr_model(clf, -1), lr_model(clf, 7)))
P(x=-1)=0.05    P(x=7)=0.85

You might have noticed that scikit-learn exposes the first coefficient through the special field intercept_.
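To make the link between the two printed numbers and the probabilities concrete, here is a minimal sketch that uses only the values shown in the session above. It assumes the usual reading of a one-feature logistic model: exp(intercept) is the odds at x = 0, and exp(coefficient) is the factor by which the odds are multiplied for each unit increase in x. The names EXP_INTERCEPT, EXP_COEF, probability, and probability_sigmoid are illustrative, not part of scikit-learn.

```python
import math

# Values copied from the printed session output:
# np.exp(clf.intercept_) and np.exp(clf.coef_.ravel()).
EXP_INTERCEPT = 0.09437188  # odds of class 1 at x = 0
EXP_COEF = 1.80094112       # odds multiplier per unit increase in x


def probability(x):
    """Probability of class 1 computed from the odds."""
    odds = EXP_INTERCEPT * EXP_COEF ** x
    return odds / (1 + odds)


def probability_sigmoid(x):
    """Same probability, written as the logistic function of the log-odds."""
    intercept = math.log(EXP_INTERCEPT)
    coef = math.log(EXP_COEF)
    return 1 / (1 + math.exp(-(intercept + coef * x)))


for x in (-1, 7):
    print("P(x=%d)=%.2f" % (x, probability(x)))
```

Both functions describe the same model; the odds form simply makes it easier to see why the exponentiated values were printed.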

If we plot the fitted model, we see that it makes perfect sense given the data:

Applying logistic regression to our post classification problem

Admittedly, the example in the previous section was created to show the beauty of logistic regression. How does it perform on our extremely noisy data?

Comparing it to the best nearest neighbour classifier (k = 90) as a baseline, we see that it performs a bit better, but it also won't change the situation a whole lot:

Method            mean(scores)  stddev(scores)
LogReg C=0.1      0.6310        0.02791
LogReg C=100.00   0.6300        0.03170
LogReg C=10.00    0.6300        0.03170
LogReg C=0.01     0.6295        0.02752
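A comparison of this kind could be produced along the following lines. This is only a sketch under stated assumptions: the data here is synthetic and deliberately noisy (it is not the post data used in the chapter), the 10-fold split is an assumption, and the numbers it prints will not match the table above. It uses scikit-learn's LogisticRegression, KNeighborsClassifier, KFold, and cross_val_score.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): three features, with labels that
# depend weakly on the first feature and are dominated by noise.
rng = np.random.RandomState(0)
n_samples = 400
X = rng.normal(size=(n_samples, 3))
y = (X[:, 0] + rng.normal(scale=2.0, size=n_samples) > 0).astype(int)

# Assumed evaluation protocol: 10-fold cross-validation with shuffling.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

results = []
for C in (0.01, 0.1, 10.0, 100.0):
    scores = cross_val_score(LogisticRegression(C=C), X, y, cv=cv)
    results.append(("LogReg C=%.2f" % C, scores.mean(), scores.std()))

# Nearest neighbour baseline with k = 90, as in the text.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=90), X, y, cv=cv)
results.append(("90NN", scores.mean(), scores.std()))

# Print the methods sorted by mean accuracy, best first.
print("%-16s %-13s %s" % ("Method", "mean(scores)", "stddev(scores)"))
for name, mean, std in sorted(results, key=lambda r: -r[1]):
    print("%-16s %-13.4f %.5f" % (name, mean, std))
```

In scikit-learn, C is the inverse of the regularization strength, so smaller values of C mean stronger regularization.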
