normed1 = batch_normalizer(batch1[0])

batch_normalizer.state_dict()

Output

OrderedDict([('running_mean', tensor([0.8443, 0.8810])),
             ('running_var', tensor([1.0726, 1.0774])),
             ('num_batches_tracked', tensor(1))])
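For reference, those running statistics can be recomputed from the mini-batch itself. The snippet below is a minimal sketch: it assumes batch1[0] holds the mini-batch's feature tensor (64 points, two features) and that the layer was created with momentum=None, so after a single forward pass the running statistics are simply that batch's own statistics:

# sketch only: batch1[0] is assumed to be the 64x2 feature tensor fed above
batch1[0].mean(axis=0)   # should match running_mean
batch1[0].var(axis=0)    # unbiased (n - 1) variance; should match running_var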

Great, it matches the statistics we computed before. The resulting values should be standardized by now, right? Let’s double-check it:

normed1.mean(axis=0), normed1.var(axis=0)

Output

(tensor([ 3.3528e-08, -9.3132e-09]), tensor([1.0159, 1.0159]))

"This looks a bit off … shouldn’t the variance be exactly one?"

Yes, and no. I confess I find this a bit annoying too: the running variance is unbiased, but the actual standardization of the data points of a mini-batch uses a biased variance.

"What’s the difference between the two?"

The difference lies in the denominator only:

Equation 7.6 - Biased variance

$$\sigma^2_{biased}(X) = \frac{1}{n}\sum_{i=1}^{n}{(x_i - \bar{X})^2}$$

The unbiased estimator divides by n - 1 instead of n.
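To make the distinction concrete, here is a minimal, self-contained sketch (the variable names, the random seed, and the 64-point batch size are assumptions for illustration, not the book's actual setup). It verifies that the normalization itself divides by the biased standard deviation while the stored running_var is the unbiased estimate, and it shows where the 1.0159 above comes from: calling var() with its default unbiased estimator on points that were standardized with the biased one returns roughly n / (n - 1), and 64 / 63 ≈ 1.0159.

import torch
import torch.nn as nn

# hypothetical standalone example (not the book's dummy dataset)
torch.manual_seed(42)
points = torch.randn(64, 2)   # assumed: 64 points, 2 features

bn = nn.BatchNorm1d(num_features=2, affine=False, momentum=None)
normed = bn(points)           # training mode: normalizes with batch statistics

# 1) the normalization uses the BIASED variance (denominator n)
biased_var = points.var(axis=0, unbiased=False)
manual = (points - points.mean(axis=0)) / (biased_var + bn.eps).sqrt()
print(torch.allclose(normed, manual, atol=1e-6))          # True

# 2) the running variance kept in state_dict is the UNBIASED one (n - 1)
print(torch.allclose(bn.state_dict()['running_var'],
                     points.var(axis=0, unbiased=True)))  # True

# 3) so var() (unbiased by default) over the normalized points is ~ n/(n-1)
print(normed.var(axis=0))     # roughly 64/63 = 1.0159 for each feature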

