modeling the plasma arc cutting process using ann - Revtn.ro
Nonconventional Technologies Review – no. 4/2011
(response variable). For training and testing the created ANN, the whole experimental data set (N_tot = 96) was randomly divided into a subset for training (N_1 = 67) and a subset for testing the ANN (N_2 = 29): approximately two-thirds of the data were employed for training and one-third for testing the trained ANN. Since there are 67 data points available for training, different small- and large-scale ANN architectures could be developed.
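The random two-thirds/one-third split described above can be sketched as follows (a Python/NumPy illustration; the paper itself works in MATLAB, and the seed here is an arbitrary choice for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # arbitrary seed, for reproducibility

n_total = 96                  # whole experimental data set
indices = rng.permutation(n_total)
train_idx = indices[:67]      # N_1 = 67 samples for training
test_idx = indices[67:]       # N_2 = 29 samples for testing
```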
However, ANNs are prone to overfitting and overtraining, which can limit their generalization capability. Overfitting usually occurs in ANNs with many degrees of freedom (a large number of neurons); when overtrained, the ANN merely memorizes the training set and loses its ability to generalize to new data. In both cases the performance on the training data set improves while the performance on the validation data set degrades. This is the well-known bias-variance problem, and the goal is to find the simplest ANN model whose total error is acceptably low. To this end, in searching for an ANN that generalizes well, different ANN architectures were developed. It was found that the ANN architecture with 3 hidden neurons (Figure 2) represents the optimal solution (after this trade-off).
Prior to ANN training, the initial values of the weights were set according to the Nguyen-Widrow method. In ANN modeling it is often advisable to perform and analyze the training of a given ANN model starting from different initial weights, rather than changing the ANN architecture or adding more hidden neurons. MATLAB's Neural Network Toolbox software package was used for training and testing the ANN models.
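The Nguyen-Widrow scheme scales randomly chosen initial weights so that the hidden neurons' active regions are spread over the input range. A minimal NumPy sketch of the core idea is given below; this is an illustration only, and MATLAB's own `initnw` implementation includes further refinements (e.g. structured bias placement):

```python
import numpy as np

def nguyen_widrow_init(n_inputs, n_hidden, rng=None):
    """Nguyen-Widrow-style initial weights for one tanh hidden layer (sketch)."""
    rng = np.random.default_rng(rng)
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)             # scale factor
    w = rng.uniform(-1.0, 1.0, size=(n_hidden, n_inputs))
    w *= beta / np.linalg.norm(w, axis=1, keepdims=True)  # row norms -> beta
    b = rng.uniform(-beta, beta, size=n_hidden)           # biases within +/- beta
    return w, b
```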
The backpropagation (BP) algorithm with momentum was used for ANN training (the "traingdm" procedure in MATLAB). After some preliminary investigation, a learning rate of 0.3 and a momentum constant of 0.7 were chosen for ANN training. The ANN's performance during training was measured by the mean squared error (MSE), the average squared difference between the network outputs and the target (experimental) values. Training was initially set to terminate after a maximum number of epochs (10000), but it was stopped at 5000 iterations since no further improvement in the MSE was achieved. As depicted in Figure 3, the prediction error, measured by the MSE, is low (0.0379163).
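The MSE criterion and one gradient-descent-with-momentum weight update can be sketched as below. The update is written in the classical momentum form with the learning rate (0.3) and momentum constant (0.7) quoted above; MATLAB's internal `traingdm` update may differ in its exact parameterization:

```python
import numpy as np

def mse(outputs, targets):
    """Mean squared error between network outputs and target values."""
    return np.mean((np.asarray(outputs) - np.asarray(targets)) ** 2)

def momentum_update(w, grad, velocity, lr=0.3, mc=0.7):
    """One weight update: blend the previous step with the new gradient."""
    velocity = mc * velocity - lr * grad
    return w + velocity, velocity
```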
Fig. 2. The selected ANN architecture for modeling the PAC process
For all developed ANN models, a linear transfer function was used in the output layer and a tangent sigmoid transfer function in the hidden layer. In order to stabilize and enhance ANN training, the data were normalized to the range [-1, 1].
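The [-1, 1] normalization and the tangent-sigmoid-hidden / linear-output forward pass can be sketched as follows (a NumPy illustration; in the toolbox these roles are played by `mapminmax` and the `tansig`/`purelin` transfer functions):

```python
import numpy as np

def scale_to_pm1(x):
    """Linearly map each column of x to [-1, 1]."""
    x = np.asarray(x, dtype=float)
    xmin, xmax = x.min(axis=0), x.max(axis=0)
    return 2.0 * (x - xmin) / (xmax - xmin) - 1.0

def ann_forward(x, w1, b1, w2, b2):
    """Tangent sigmoid hidden layer, linear output layer."""
    h = np.tanh(w1 @ x + b1)   # hidden layer (3 neurons in the selected model)
    return w2 @ h + b2         # linear output
```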
Fig. 3. ANN training and test performance graph
Another performance measure of network efficiency is the correlation coefficient (R). The correlation coefficient is a statistical measure of the strength of the correlation between actual and predicted values. For example, a value of +1 indicates perfect correlation. In
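The correlation coefficient between actual and ANN-predicted values can be computed directly (a NumPy sketch):

```python
import numpy as np

def correlation_coefficient(actual, predicted):
    """Pearson R between measured and ANN-predicted responses."""
    return np.corrcoef(actual, predicted)[0, 1]
```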