
ARTIFICIAL NEURAL NETWORKS

Figure 6.14 Learning with momentum

accelerate descent in the steady downhill direction, and to slow down the process when the learning surface exhibits peaks and valleys.

Figure 6.14 represents learning with momentum for the Exclusive-OR operation. A comparison with the pure back-propagation algorithm shows that the momentum term reduced the number of epochs from 224 to 126.
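The momentum-augmented weight update can be sketched as below. This is a minimal one-dimensional illustration, not the book's program: β = 0.95 is a commonly used momentum constant, while α = 0.1 and the toy error surface E(w) = w² are assumptions made purely for the example.

```python
def momentum_step(w, prev_dw, grad, alpha=0.1, beta=0.95):
    """One weight update of the generalised delta rule with momentum:
    delta_w(p) = beta * delta_w(p-1) - alpha * grad(p)."""
    dw = beta * prev_dw - alpha * grad
    return w + dw, dw

# Toy one-dimensional error surface E(w) = w**2, gradient dE/dw = 2*w.
# While the gradient keeps pointing the same way, the momentum term
# accumulates speed; near the minimum it damps the zig-zagging.
w, dw = 5.0, 0.0
for _ in range(300):
    w, dw = momentum_step(w, dw, grad=2 * w)
# w has now converged close to the minimum at w = 0
```

Setting β = 0 recovers the plain generalised delta rule, which descends the same surface more slowly.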

In the delta and generalised delta rules, we use a constant and rather small value for the learning rate parameter, α. Can we increase this value to speed up training?

One of the most effective means to accelerate the convergence of back-propagation learning is to adjust the learning rate parameter during training. A small learning rate parameter, α, causes small changes to the weights in the network from one iteration to the next, and thus leads to a smooth learning curve. On the other hand, if the learning rate parameter, α, is made larger to speed up the training process, the resulting larger changes in the weights may cause instability and, as a result, the network may become oscillatory.

To accelerate the convergence and yet avoid the danger of instability, we can apply two heuristics (Jacobs, 1988):

• Heuristic 1. If the change of the sum of squared errors has the same algebraic sign for several consecutive epochs, then the learning rate parameter, α, should be increased.

• Heuristic 2. If the algebraic sign of the change of the sum of squared errors alternates for several consecutive epochs, then the learning rate parameter, α, should be decreased.
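The two heuristics above can be sketched as a simple adaptation rule applied once per epoch. The window length and the multiplicative factors (1.05 to increase, 0.7 to decrease) are illustrative assumptions for this sketch, not values taken from Jacobs (1988):

```python
def adapt_learning_rate(alpha, sse_history, window=4, up=1.05, down=0.7):
    """Adjust the learning rate from the recent sum-of-squared-errors history.

    Heuristic 1: if the change in SSE keeps the same algebraic sign over
    `window` consecutive epochs, increase alpha (steady descent).
    Heuristic 2: if the sign of the change alternates over those epochs,
    decrease alpha (the network is oscillating).
    """
    if len(sse_history) < window + 1:
        return alpha                      # not enough epochs observed yet
    recent = sse_history[-(window + 1):]
    signs = [1 if b - a > 0 else -1 for a, b in zip(recent, recent[1:])]
    if all(s == signs[0] for s in signs):
        return alpha * up                 # Heuristic 1: speed up training
    if all(signs[i] != signs[i + 1] for i in range(len(signs) - 1)):
        return alpha * down               # Heuristic 2: restore stability
    return alpha                          # mixed trend: leave alpha alone

# A steadily falling SSE raises alpha; an oscillating SSE lowers it.
steady = adapt_learning_rate(0.1, [10, 9, 8, 7, 6])        # increased
oscillating = adapt_learning_rate(0.1, [10, 12, 9, 11, 8])  # decreased
```

In practice the adapted α would be fed back into the weight-update rule at the start of the next training epoch.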
