Modeling and Multivariate Methods - SAS

Chapter 10: Creating Neural Networks
Overview of Neural Networks

Table 10.4 Activation Functions (Continued)

Linear
The identity function. The linear combination of the X variables is not transformed.
The Linear activation function is most often used in conjunction with one of the non-linear activation functions. In this case, the Linear activation function is placed in the second layer, and the non-linear activation functions are placed in the first layer. This is useful if you want to first reduce the dimensionality of the X variables, and then have a nonlinear model for the Y variables.
For a continuous Y variable, if only Linear activation functions are used, the model for the Y variable reduces to a linear combination of the X variables. For a nominal or ordinal Y variable, the model reduces to a logistic regression.

Gaussian
The Gaussian function. Use this option for radial basis function behavior, or when the response surface is Gaussian (normal) in shape. The Gaussian function is:
e^(–x²)
where x is a linear combination of the X variables.
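As a rough illustration of how these two activation functions behave, the following NumPy sketch defines the identity and Gaussian activations and composes a Linear stage (reducing the X variables to a few untransformed linear combinations) with a nonlinear TanH stage. This is not JMP or SAS code; the data, array shapes, and random weights are hypothetical and chosen only to show the flow of computation.

```python
import numpy as np

def linear(x):
    # Identity function: the linear combination of the X variables is not transformed.
    return x

def gaussian(x):
    # Radial basis behavior: exp(-x^2), where x is a linear combination of the X variables.
    return np.exp(-x ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))        # 100 rows, 10 X variables (hypothetical data)

# Linear nodes act on X first, reducing the 10 inputs to 3 untransformed linear
# combinations (dimensionality reduction); TanH nodes then give a nonlinear model for Y.
W_reduce = rng.normal(size=(10, 3))
W_out = rng.normal(size=(3, 1))

reduced = linear(X @ W_reduce)        # Linear activation: combinations left untransformed
y_hat = np.tanh(reduced @ W_out)      # nonlinear stage modeling the Y variable

# The Gaussian activation applied to the same linear combinations (radial basis behavior).
rbf_features = gaussian(X @ W_reduce)
```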

Boosting

Boosting is the process of building a large additive neural network model by fitting a sequence of smaller models. Each of the smaller models is fit on the scaled residuals of the previous model. The models are combined to form the larger final model. The process uses validation to assess how many component models to fit, not exceeding the specified number of models.

Boosting is often faster than fitting a single large model. However, the base model should be a one- or two-node, single-layer model, or else the benefit of faster fitting can be lost if a large number of models is specified.

Use the Boosting panel in the Model Launch to specify the number of component models and the learning rate. Use the Hidden Layer Structure panel in the Model Launch to specify the structure of the base model.

The learning rate r must satisfy 0 < r ≤ 1. Learning rates close to 1 result in faster convergence on a final model, but also have a higher tendency to overfit the data. Use learning rates close to 1 when a small Number of Models is specified.

As an example of how boosting works, suppose you specify a base model consisting of one layer and two nodes, with the number of models equal to eight. The first step is to fit a one-layer, two-node model. The predicted values from that model are scaled by the learning rate, then subtracted from the actual values to form a scaled residual. The next step is to fit a different one-layer, two-node model on the scaled residuals of the previous model. This process continues until eight models are fit, or until the addition of another model fails to improve the validation statistic. The component models are combined to form the final, large model. In this example, if six models are fit before stopping, the final model consists of one layer and 2 × 6 = 12 nodes.
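To make the residual-fitting sequence concrete, here is a minimal NumPy sketch of such a boosting loop. It is not the JMP implementation: fit_base_model is a hypothetical stand-in for fitting a one-layer, two-node model, and the early-stopping check against a validation statistic is only indicated in a comment.

```python
import numpy as np

_rng = np.random.default_rng(0)

def fit_base_model(X, y):
    # Hypothetical stand-in for a one-layer, two-node base model: two random TanH
    # nodes whose output weights are found by least squares, just to keep the
    # sketch runnable.
    W = _rng.normal(size=(X.shape[1], 2))
    H = np.tanh(X @ W)
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return lambda Xnew: np.tanh(Xnew @ W) @ beta

def boost(X, y, n_models=8, learning_rate=0.5):
    # Fit each component model to the scaled residuals of the previous ones:
    # predictions are scaled by the learning rate and subtracted from the
    # working residuals before the next small model is fit.
    models, residual = [], y.astype(float).copy()
    for _ in range(n_models):
        m = fit_base_model(X, residual)
        residual = residual - learning_rate * m(X)
        models.append(m)
        # A real implementation would stop early here if adding another model
        # failed to improve the validation statistic.
    # The component models combine into one large additive model.
    return lambda Xnew: sum(learning_rate * m(Xnew) for m in models)

# Eight one-layer, two-node component models; if all eight are kept, the final
# model has one layer with 2 x 8 = 16 nodes.
X = _rng.normal(size=(200, 4))
y = np.sin(X[:, 0]) + 0.1 * _rng.normal(size=200)
final_model = boost(X, y, n_models=8, learning_rate=0.5)
predictions = final_model(X)
```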
