
batch_normalizer = nn.BatchNorm2d(
    num_features=3, affine=False, momentum=None
)
normed1 = batch_normalizer(batch1[0])
print(normed1.mean(axis=[0, 2, 3]),
      normed1.var(axis=[0, 2, 3], unbiased=False))

Output

(tensor([ 2.3171e-08, 3.4217e-08, -2.9616e-09]),
 tensor([0.9999, 0.9999, 0.9999]))

As expected, the pixel values in each channel of the output have zero mean and unit standard deviation.
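To make that connection explicit, the sketch below standardizes a batch by hand and compares the result to the layer's output. It is self-contained: since batch1 was built earlier in the chapter, a random tensor with a made-up shape (an assumption, not the book's data) stands in for it.

import torch
import torch.nn as nn

# batch1 was created earlier in the chapter; a random stand-in is used here
dummy_images = torch.randn(16, 3, 28, 28)

bn = nn.BatchNorm2d(num_features=3, affine=False, momentum=None)
normed = bn(dummy_images)

# Standardizing by hand: per-channel mean and (biased) variance over N, H, W
mean = dummy_images.mean(axis=[0, 2, 3], keepdim=True)
var = dummy_images.var(axis=[0, 2, 3], unbiased=False, keepdim=True)
manual = (dummy_images - mean) / (var + bn.eps).sqrt()

# The layer's output matches the manual standardization
print(torch.allclose(normed, manual, atol=1e-6))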

Other Normalizations

Batch normalization is certainly the most popular kind of normalization, but it's not the only one. If you check PyTorch's documentation on normalization layers, you'll see many alternatives, like nn.SyncBatchNorm, for instance. But, just like the batch renormalization technique, they are beyond the scope of this book.
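Just to give you a sense of their interface, here is a minimal sketch (the shapes and the particular layers chosen are assumptions for illustration) showing that alternatives such as nn.GroupNorm, nn.InstanceNorm2d, and nn.LayerNorm plug in the same way nn.BatchNorm2d does:

import torch
import torch.nn as nn

dummy_images = torch.randn(16, 3, 28, 28)

# Normalizes over groups of channels (one group per channel here,
# which makes it behave like instance normalization)
group_norm = nn.GroupNorm(num_groups=3, num_channels=3)

# Normalizes each sample and channel independently of the mini-batch
instance_norm = nn.InstanceNorm2d(num_features=3)

# Normalizes over the trailing dimensions given by normalized_shape
layer_norm = nn.LayerNorm(normalized_shape=[3, 28, 28])

for layer in [group_norm, instance_norm, layer_norm]:
    print(layer.__class__.__name__, layer(dummy_images).shape)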

Small Summary

This was probably the most challenging section in this book so far. It goes over a lot of information while only scratching the surface of this topic. So, I am organizing a small summary of the main points we've addressed:

• During training time, batch normalization computes statistics (mean and variance) for each individual mini-batch and uses these statistics to produce standardized outputs.

• The fluctuations in the statistics from one mini-batch to the next introduce randomness into the process and thus have a regularizing effect.

• Due to the regularizing effect of batch normalization, it may not work well if combined with other regularization techniques (like dropout).

• During evaluation time, batch normalization uses a (smoothed) average of the statistics computed during training (see the sketch after this list).
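The sketch below illustrates the last two points, contrasting training and evaluation modes. It uses randomly generated batches (an assumption, not the book's data); since momentum=None, the running statistics are a simple cumulative average of the mini-batch statistics rather than an exponentially smoothed one.

import torch
import torch.nn as nn

torch.manual_seed(42)
bn = nn.BatchNorm2d(num_features=3, affine=False, momentum=None)

# Training mode: each forward pass standardizes using the current mini-batch's
# own statistics and updates the running averages
bn.train()
for _ in range(10):
    _ = bn(torch.randn(16, 3, 28, 28) * 2 + 1)

print(bn.running_mean, bn.running_var)

# Evaluation mode: the stored running statistics are used instead of the
# mini-batch's own, so the output no longer depends on the batch's makeup
bn.eval()
out = bn(torch.randn(16, 3, 28, 28) * 2 + 1)
print(out.mean(axis=[0, 2, 3]), out.var(axis=[0, 2, 3], unbiased=False))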

