AI - a Guide to Intelligent Systems.pdf - Member of EEPIS


WILL A NEURAL NETWORK WORK FOR MY PROBLEM?

Another problem is overfitting. The greater the number of hidden neurons, the greater the ability of the network to recognise existing patterns. However, if the number of hidden neurons is too big, the network might simply memorise all the training examples. This may prevent it from generalising, that is, from producing correct outputs when presented with data that was not used in training. For instance, an overfitted character recognition network trained on Helvetica-font examples might not be able to recognise the same characters in the Times New Roman font.

The practical approach to preventing overfitting is to choose the smallest number of hidden neurons that yields good generalisation. Thus, an experimental study could begin with as few as two neurons in the hidden layer. In our example, we will examine the system's performance with 2, 5, 10 and 20 hidden neurons and compare the results.
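Such a sweep over hidden-layer sizes can be sketched as follows. This is a minimal NumPy illustration under assumed details: plain batch gradient descent on a toy two-pattern task, not the book's actual network or data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_sse(n_hidden, X, Y, epochs=500, eta=0.5, seed=0):
    """Train a one-hidden-layer network by batch gradient descent
    and return the final sum of squared errors on the training set."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-0.5, 0.5, (X.shape[1], n_hidden))
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, Y.shape[1]))
    for _ in range(epochs):
        H = sigmoid(X @ W1)                 # hidden-layer outputs
        O = sigmoid(H @ W2)                 # network outputs
        err = Y - O
        dO = err * O * (1 - O)              # output-layer error gradients
        dH = (dO @ W2.T) * H * (1 - H)      # back-propagated to hidden layer
        W2 += eta * H.T @ dO
        W1 += eta * X.T @ dH
    O = sigmoid(sigmoid(X @ W1) @ W2)
    return float(np.sum((Y - O) ** 2))

# Toy 2x2 "bitmaps" of two patterns, with one-hot target vectors
X = np.array([[1, 0, 0, 1], [0, 1, 1, 0]], dtype=float)
Y = np.array([[1, 0], [0, 1]], dtype=float)
for n in (2, 5, 10, 20):
    print(n, "hidden neurons -> SSE", train_sse(n, X, Y))
```

In practice the comparison would use the generalisation error on held-out examples rather than the training SSE, in line with the "smallest network that generalises well" rule above.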

The architecture of a neural network (with five neurons in the hidden layer) for the character recognition problem is shown in Figure 9.20. Neurons in the hidden and output layers use a sigmoid activation function. The neural network is trained with the back-propagation algorithm with momentum; the momentum constant is set to 0.95. The input and output training patterns are shown in Table 9.2. The binary input vectors representing the bit maps of the respective digits are fed directly into the network.
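The sigmoid activation and the momentum weight update can be sketched as follows. The momentum constant 0.95 is the one given in the text; the learning rate is an illustrative assumption.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation used by the hidden and output neurons."""
    return 1.0 / (1.0 + np.exp(-x))

def momentum_update(w, grad, prev_delta, eta=0.1, beta=0.95):
    """Generalised delta rule with momentum:
    delta_w(p) = beta * delta_w(p-1) + eta * grad(p).
    Returns the updated weight and the new delta (kept for the next step)."""
    delta = beta * prev_delta + eta * grad
    return w + delta, delta
```

The momentum term adds a fraction of the previous weight change to the current one, which smooths the trajectory and typically speeds up convergence.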

The network's performance in our study is measured by the sum of squared errors. Figure 9.21 demonstrates the results. As can be seen from Figure 9.21(a), a neural network with two neurons in the hidden layer cannot converge to a solution, while the networks with 5, 10 and 20 hidden neurons learn relatively fast. In fact, they converge in less than 250 epochs (each epoch represents an entire pass through all the training examples). Also note that the network with 20 hidden neurons shows the fastest convergence.

Once the training is complete, we must test the network with a set of test examples to see how well it performs.
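The sum of squared errors itself is straightforward to compute over a test set; a minimal sketch (the sample target and output values below are made up for illustration):

```python
import numpy as np

def sum_squared_errors(desired, actual):
    """Sum of squared errors over all outputs of all examples."""
    return float(np.sum((np.asarray(desired) - np.asarray(actual)) ** 2))

# One-hot desired targets vs network outputs on two unseen test patterns
desired = [[1, 0], [0, 1]]
actual  = [[0.9, 0.1], [0.2, 0.8]]
print(sum_squared_errors(desired, actual))  # approximately 0.1
```

A small error on the test set, not just on the training set, is what indicates that the network has generalised rather than memorised.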

Figure 9.20 Neural network for printed digit recognition
