
110 MACHINE LEARNING

[Two plots: "Error versus weight updates (example 1)" and "Error versus weight updates (example 2)", each showing training set error and validation set error against the number of weight updates.]

FIGURE 4.9
Plots of error E as a function of the number of weight updates, for two different robot perception tasks. In both learning cases, error E over the training examples decreases monotonically, as gradient descent minimizes this measure of error. Error over the separate "validation" set of examples typically decreases at first, then may later increase due to overfitting the training examples. The network most likely to generalize correctly to unseen data is the network with the lowest error over the validation set. Notice in the second plot that one must be careful not to stop training too soon when the validation set error begins to increase.

this variation for two fairly typical applications of BACKPROPAGATION. Consider first the top plot in this figure. The lower of the two lines shows the monotonically decreasing error E over the training set, as the number of gradient descent iterations grows. The upper line shows the error E measured over a different validation set of examples, distinct from the training examples. This line measures the generalization accuracy of the network, that is, the accuracy with which it fits examples beyond the training data.
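The procedure the figure motivates, keeping the weights that achieved the lowest validation-set error while gradient descent continues to drive down training error, can be sketched as follows. This is a minimal illustration on an assumed synthetic linear task with a plain gradient-descent loop, not the network or data from the text:

```python
import numpy as np

# Assumed synthetic regression task: y = X w_true + noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=200)

# Separate "validation" set, distinct from the training examples.
X_tr, y_tr = X[:150], y[:150]
X_va, y_va = X[150:], y[150:]

def error(w, X, y):
    """Mean squared error E over a set of examples."""
    return np.mean((X @ w - y) ** 2)

w = np.zeros(5)
lr = 0.01
best_w, best_val = w.copy(), error(w, X_va, y_va)

for step in range(5000):
    # Gradient of E over the training set; training error
    # decreases monotonically under these updates.
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)
    w -= lr * grad

    # Validation error may later rise due to overfitting, so we
    # remember the weights with the lowest validation error seen.
    val = error(w, X_va, y_va)
    if val < best_val:
        best_val, best_w = val, w.copy()

# best_w is the weight vector most likely to generalize to unseen data.
```

In practice one also stops the loop once validation error has failed to improve for some number of updates (with the patience tuned generously, per the figure's warning about stopping too soon).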
