08.06.2015 Views

Building Machine Learning Systems with Python - Richert, Coelho


Chapter 1

Although we cannot look into the future, we can and should simulate a similar effect by holding out a part of our data. Let us remove, for instance, a certain percentage of the data and train on the remainder. Then we use the hold-out data to calculate the error. As the model has been trained without seeing the hold-out data, we should get a more realistic picture of how the model will behave in the future.
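The hold-out procedure described above can be sketched as follows. This is a minimal illustration, not the book's own listing: it assumes NumPy, a 30 percent hold-out fraction, and synthetic quadratic data standing in for the real measurements. The helper names `holdout_split` and `squared_error` are hypothetical.

```python
import numpy as np


def squared_error(model, x, y):
    """Sum of squared differences between the model's predictions and y."""
    return float(np.sum((model(x) - y) ** 2))


def holdout_split(x, y, test_fraction=0.3, seed=0):
    """Randomly set aside a fraction of the samples as hold-out data."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(x))
    n_test = int(test_fraction * len(x))
    test_idx = np.sort(shuffled[:n_test])
    train_idx = np.sort(shuffled[n_test:])
    return x[train_idx], y[train_idx], x[test_idx], y[test_idx]


# Synthetic stand-in data (an assumption, not the data set used in the text).
rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 200)
y = 3.0 * x**2 + 2.0 * x + rng.normal(scale=5.0, size=x.size)

x_train, y_train, x_test, y_test = holdout_split(x, y)

# Fit each polynomial on the training part only, then measure the error
# on the hold-out part that the model never saw.
test_errors = {}
for degree in (1, 2, 3):
    model = np.poly1d(np.polyfit(x_train, y_train, degree))
    test_errors[degree] = squared_error(model, x_test, y_test)
    print(f"Error d={degree}: {test_errors[degree]:f}")
```

The key point is that `np.polyfit` only ever receives the training samples, so the error computed on `x_test` and `y_test` approximates how the model would do on data that arrives later.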

The test errors for the models trained only on the data after the inflection point now show a completely different picture.

Error d=1: 7,917,335.831122

Error d=2: 6,993,880.348870

Error d=3: 7,137,471.177363

Error d=10: 8,805,551.189738

Error d=100: 10,877,646.621984

The result can be seen in the following chart:

It seems we finally have a clear winner. The model with degree 2 has the lowest test error, that is, the error measured on data that the model did not see during training. This is what lets us trust that we won't get bad surprises when future data arrives.
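As a small illustration of that selection step, the test errors reported above can be put into a dictionary and the degree with the smallest value picked out. The variable names are hypothetical; the numbers are the ones listed in the text.

```python
# Test errors per polynomial degree, copied from the values reported above.
reported_test_errors = {
    1: 7917335.831122,
    2: 6993880.348870,
    3: 7137471.177363,
    10: 8805551.189738,
    100: 10877646.621984,
}

# The preferred model is the one with the lowest error on the hold-out data.
best_degree = min(reported_test_errors, key=reported_test_errors.get)
print(f"Lowest test error: d={best_degree}")
```

Note that the error grows again for degrees 10 and 100, which is the pattern one would expect when the more complex models fit the training data too closely.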
