
On the one hand, if individual pixels are dropped, the missing pixels can probably be easily filled with the values of the adjacent pixels. On the other hand, if a full channel is dropped (in an RGB image), the color changes (good luck figuring out the values for the missing channel!).

The figure below illustrates the effect of both regular and two-dimensional dropout procedures on an image of our dataset.

Figure 6.9 - Dropping channels with nn.Dropout2d

Sure, in deeper layers, there is no correspondence between channel and color anymore, but each channel still encodes some feature. By randomly dropping some channels, two-dimensional dropout achieves the desired regularization.
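
A minimal sketch of the difference, using a made-up tensor shaped like a single RGB image (the shape and probability below are illustrative, not taken from our dataset): regular dropout zeroes individual values scattered across every channel, while two-dimensional dropout zeroes entire channels at once.

import torch
import torch.nn as nn

torch.manual_seed(17)

# A made-up mini-batch with one "RGB image": shape (N=1, C=3, H=4, W=4)
image = torch.rand(1, 3, 4, 4)

# Regular dropout zeroes individual values, independently across channels
regular_dropout = nn.Dropout(p=0.5)
# Two-dimensional dropout zeroes entire channels at once
channel_dropout = nn.Dropout2d(p=0.5)

# Both modules are in training mode by default, so dropout is active
print(regular_dropout(image)[0])            # scattered zeros inside every channel
print(channel_dropout(image).sum(dim=(2, 3)))  # some channels sum to exactly zero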

Now, let’s make it a bit harder for our model to learn by setting its dropout probability to 30% and observing how it fares…

Model Configuration

The configuration part is short and straightforward: We create a model, a loss function, and an optimizer.

The model will be an instance of our CNN2 class with five filters and a dropout probability of 30%. Our dataset has three classes, so we’re using nn.CrossEntropyLoss() (which will take the three logits produced by our model).
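
A minimal sketch of this configuration step; CNN2 is the model class built earlier in the chapter, and the constructor argument names used below (n_feature, p) are assumptions about its signature.

import torch.nn as nn

# Assumed constructor: number of filters and dropout probability
model_cnn2 = CNN2(n_feature=5, p=0.3)

# Three classes -> three logits per sample -> cross-entropy loss
multi_loss_fn = nn.CrossEntropyLoss(reduction='mean')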

Optimizer

Regarding the optimizer, let’s ditch SGD and use Adam for a change. Stochastic gradient descent is simple and straightforward, as we’ve learned in Chapter 0, but it is also slow. So far, the training speed of SGD has not been an issue because our problems were quite simple. But, as our models grow a bit more complex, we can benefit from choosing a different optimizer.
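
A sketch of the swap, assuming the model instance created above; Adam adapts the step size per parameter using running averages of the gradients and their squares, which typically speeds up convergence compared to plain SGD. The learning rate here is an assumption, not necessarily the value used in the book.

import torch.optim as optim

# Adam instead of SGD; the learning rate is illustrative
optimizer_cnn2 = optim.Adam(model_cnn2.parameters(), lr=3e-4)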
