output_train / spaced_points

Output

tensor([0., 2., 0., 2., 0., 2., 2., 2., 2., 0., 2.])
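For context, the spaced_points tensor and the dropping_model come from the preceding pages. Below is a minimal sketch that reproduces the same kind of pattern; the definitions used here (eleven evenly spaced points, a model containing only an nn.Dropout(p=0.5) layer, and the seed) are assumptions for illustration, not the book's exact code:

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(42)  # illustrative seed; the exact 0/2 pattern depends on it

# Assumed setup: eleven evenly spaced points and a dropout-only model
spaced_points = torch.linspace(.1, 1.1, 11)
dropping_model = nn.Sequential(nn.Dropout(p=0.5))

dropping_model.train()  # dropout is only active in training mode
output_train = dropping_model(spaced_points)
# Dividing output_train by spaced_points reveals the pattern above:
# surviving elements were multiplied by 1/(1-p) = 2, dropped ones are zero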

"Why?"

This adjustment has the purpose of preserving (or at least trying to) the overall

level of the outputs in the particular layer that’s "suffering" the dropout. So, let’s

imagine that these inputs (after dropping) will feed a linear layer and, for

educational purposes, that all their weights are equal to one (and bias equals zero).

As you already know, a linear layer will multiply these weights by the (dropped)

inputs and sum them up:

F.linear(output_train, weight=torch.ones(11), bias=torch.tensor(0))

Output

tensor(9.4000)

The sum is 9.4. It would have been half of this (4.7) without the adjusting factor.
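Since the weights are all ones and the bias is zero, that F.linear call is just a sum of the dropped (and scaled) values; undoing the 1/(1-p) = 2 factor halves it. A quick check, using the same output_train:

# With unit weights and zero bias, F.linear reduces to a plain sum
output_train.sum()        # tensor(9.4000), same as the F.linear call above

# Removing the 1/(1-p) = 2 scaling gives the unadjusted sum
(output_train / 2).sum()  # tensor(4.7000)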

"OK, so what? Why do I need to preserve the level of the outputs

anyway?"

Because there is no dropping in evaluation mode! We’ve talked about it briefly in

the past—the dropout is random in nature, so it would produce slightly (or maybe

not so slightly) different predictions for the same inputs. You don’t want that,

that’s bad business. So, let’s set our model to eval mode (and that’s why I chose to

make it a model instead of using functional dropout) and see what happens there:

dropping_model.eval()

output_eval = dropping_model(spaced_points)

output_eval
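As a side note on the functional route mentioned above: F.dropout has no eval mode of its own, so you would have to manage the training flag explicitly on every call. A minimal sketch, assuming the same spaced_points tensor:

# Functional dropout: the training flag must be passed explicitly
F.dropout(spaced_points, p=0.5, training=True)   # random dropping plus 1/(1-p) scaling
F.dropout(spaced_points, p=0.5, training=False)  # no dropping: returns the inputs unchanged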

