
The only options we have in this case are to get more features, make the model more complex, or change the model.

Chapter 5

Fixing high variance

If, on the contrary, we suffer from high variance, our model is too complex for the data. In this case, we can only try to get more data or decrease the complexity. For a nearest neighbor classifier, this means increasing k so that more neighbors are taken into account, or removing some of the features.
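
As a minimal sketch of the effect of k, the following pure-Python example compares train and test error for several values of k. The dataset, the helper names (`make_data`, `knn_predict`, `error_rate`), and the chosen values of k are assumptions made for illustration; they are not the data or code used in this chapter.

```python
import random

def make_data(n, rng):
    """Hypothetical two-class data: only the first feature carries signal."""
    points = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        label = 1 if x[0] + rng.gauss(0, 0.3) > 0 else 0
        points.append((x, label))
    return points

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (ties go to class 0)."""
    nearest = sorted(
        train,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)),
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def error_rate(train, points, k):
    """Fraction of `points` that a kNN model built on `train` gets wrong."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in points)
    return wrong / len(points)

rng = random.Random(0)
train, test = make_data(200, rng), make_data(200, rng)

gaps = {}
for k in (1, 5, 15, 45):
    train_err = error_rate(train, train, k)
    test_err = error_rate(train, test, k)
    gaps[k] = test_err - train_err
    print(f"k={k:2d}  train={train_err:.3f}  test={test_err:.3f}  gap={gaps[k]:.3f}")
```

With k=1, every training point is its own nearest neighbor, so the train error is zero and the entire test error shows up as the gap. On data like this, one would expect the gap to shrink as k grows, although the exact numbers depend on the random seed.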

High bias or low bias

To find out what our problem actually is, we simply have to plot the train and test errors over the data size.
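
The numbers behind such a plot could be computed roughly as follows. This is a sketch under stated assumptions: the synthetic data, the helper names (`make_data`, `knn_error`, `learning_curve`), and the dataset sizes are hypothetical, and the errors are printed rather than plotted to keep the example free of dependencies.

```python
import random

def make_data(n, rng):
    """Hypothetical noisy two-class data; not the dataset used in this chapter."""
    points = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        y = 1 if x[0] + rng.gauss(0, 0.3) > 0 else 0
        points.append((x, y))
    return points

def knn_error(train, points, k=5):
    """Fraction of `points` misclassified by a k-nearest-neighbor majority vote."""
    wrong = 0
    for x, y in points:
        nearest = sorted(
            train,
            key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
        )[:k]
        predicted = 1 if 2 * sum(label for _, label in nearest) > k else 0
        wrong += predicted != y
    return wrong / len(points)

def learning_curve(data, test, sizes, k=5):
    """Return (size, train_error, test_error) for growing prefixes of `data`."""
    curve = []
    for n in sizes:
        subset = data[:n]
        curve.append((n, knn_error(subset, subset, k), knn_error(subset, test, k)))
    return curve

rng = random.Random(42)
data, test = make_data(200, rng), make_data(200, rng)
curve = learning_curve(data, test, sizes=[20, 50, 100, 200])
for n, train_err, test_err in curve:
    print(f"n={n:4d}  train error={train_err:.3f}  test error={test_err:.3f}")
```

Each row of `curve` gives one x-position (the dataset size) and the two y-values (train and test error), so the list can be handed to any plotting library to draw the two curves.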

High bias is typically revealed by the test error decreasing a bit at the beginning, but then settling at a very high value, with the train error approaching it as the dataset size grows. High variance is recognized by a big gap between both curves.
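
The reading rule above can be written down as a small helper applied to the errors at the largest dataset size. This is only a rough sketch: the function name `diagnose` and both thresholds (`high_error`, `big_gap`) are illustrative assumptions, and sensible values depend on the problem at hand.

```python
def diagnose(train_error, test_error, high_error=0.3, big_gap=0.1):
    """Rough reading of the right-hand end of a learning curve.

    The thresholds are hypothetical defaults, not values from the text.
    """
    gap = test_error - train_error
    if gap > big_gap:
        # Train error is much lower than test error: the curves stay far apart.
        return "high variance"
    if test_error > high_error:
        # Both curves have met, but at a high error level.
        return "high bias"
    return "no clear problem"

print(diagnose(0.05, 0.40))  # large gap between the curves
print(diagnose(0.38, 0.42))  # curves close together at a high error
print(diagnose(0.05, 0.08))  # curves close together at a low error
```

The gap is checked first because a large gap is the defining sign of high variance, regardless of how high the test error itself is.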

Plotting the errors for different dataset sizes for 5NN shows a big gap between the train and test error, hinting at a high variance problem. Refer to the following graph:
