08.06.2015 Views

Building Machine Learning Systems with Python - Richert, Coelho



Chapter 5

Method          mean(scores)    stddev(scores)
LogReg C=1.00   0.6290          0.03270
90NN            0.6280          0.02777

We have seen the accuracy for the different values of the regularization parameter C. With it, we can control the model complexity, similar to the parameter k for the nearest neighbor method. Smaller values for C result in a higher penalty on large weights, that is, they make the model less complex.
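To make the effect of C concrete, here is a minimal pure-Python sketch, not the code used in this chapter. It assumes the convention in which C multiplies the data loss (so C is the inverse of the regularization strength), and it uses a hypothetical toy dataset of two overlapping Gaussian blobs. The helper names `make_data`, `fit_logreg`, and `weight_norm` are made up for this illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def make_data(n, seed=0):
    # Hypothetical toy data: two overlapping 2-D Gaussian blobs.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 1.0 if label == 1 else -1.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y

def fit_logreg(X, y, C, steps=300, lr=0.1):
    # Plain gradient descent on ||w||^2 / (2*C*n) + mean(log-loss).
    # A small C makes the penalty term on the weights large.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w = [wj / (C * n) for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def weight_norm(w):
    return math.sqrt(sum(wj * wj for wj in w))

X, y = make_data(200)
for C in (0.01, 1.0, 100.0):
    w, _ = fit_logreg(X, y, C)
    print("C=%-6g ||w||=%.3f" % (C, weight_norm(w)))
```

The weights learned with a small C should come out much closer to zero than those learned with a large C, which is what a stronger penalty and a simpler model look like in practice.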

A quick look at the bias-variance chart for our best candidate, C = 0.1, shows that our model has high bias: the test and train error curves approach each other closely but stay at unacceptably high values. This indicates that logistic regression with the current feature space is under-fitting and cannot learn a model that captures the data correctly.
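The high-bias pattern described above can be reproduced with a small pure-Python sketch. It is not the chapter's chart code, and it swaps in a different setup: a nearest-centroid classifier (deliberately too simple) on hypothetical synthetic data where class 1 lies inside a circle and class 0 outside it. All function names are illustrative assumptions.

```python
import random

def make_data(n, rng):
    # Hypothetical toy data: label 1 inside a circle, 0 outside.
    X, y = [], []
    for _ in range(n):
        a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
        X.append((a, b))
        y.append(1 if a * a + b * b < 0.5 else 0)
    return X, y

def fit_centroids(X, y):
    # One mean point per class: a model too simple for this data.
    centroids = {}
    for label in (0, 1):
        pts = [x for x, t in zip(X, y) if t == label]
        n = max(len(pts), 1)  # guard against an empty class
        centroids[label] = (sum(p[0] for p in pts) / n,
                            sum(p[1] for p in pts) / n)
    return centroids

def predict(centroids, x):
    # Assign the class whose centroid is closest to x.
    def dist2(c):
        return (x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2
    return min(centroids, key=lambda label: dist2(centroids[label]))

def error(centroids, X, y):
    wrong = sum(1 for x, t in zip(X, y) if predict(centroids, x) != t)
    return wrong / len(X)

def learning_curve(sizes, seed=0):
    # Train on growing samples; evaluate on the sample and a held-out set.
    rng = random.Random(seed)
    X_test, y_test = make_data(1000, rng)
    rows = []
    for n in sizes:
        X_train, y_train = make_data(n, rng)
        model = fit_centroids(X_train, y_train)
        rows.append((n, error(model, X_train, y_train),
                     error(model, X_test, y_test)))
    return rows

for n, train_err, test_err in learning_curve([50, 200, 1000]):
    print("n=%4d  train error=%.3f  test error=%.3f" % (n, train_err, test_err))
```

With more training data the two error values should move close together while both remain high. More data does not help in that situation; a richer model or better features are needed.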

So what now? We switched the model and tuned it as much as we could with our current state of knowledge, but we still have no acceptable classifier.

It seems more and more that either the data is too noisy for this task or that our set of features is still not appropriate to discriminate the classes well enough.

