Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub

You may go bananas with the value of T trying in vain to approach infinity, but 20 periods is more than enough to make a point:

import numpy as np

alpha = 1/3; T = 20
t = np.arange(1, T + 1)
# weighted sum of the ages t, each weighted by alpha * (1 - alpha)**(t - 1)
age = alpha * sum((1 - alpha)**(t - 1) * t)
age

Output

2.9930832408241015

That’s three-ish enough, right? If you’re not convinced, try using 93 periods (or more).
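To see the average age settle as more periods are included, the same computation can be wrapped in a small function and evaluated for a few values of T. This is only a sketch: the function name `ewma_average_age` is made up for illustration and is not part of the book's code.

```python
import numpy as np

def ewma_average_age(alpha, T):
    """Average age of an EWMA truncated at T periods (hypothetical helper)."""
    t = np.arange(1, T + 1)
    # weight given to the data point that is t periods old
    weights = alpha * (1 - alpha) ** (t - 1)
    return float(np.sum(weights * t))

# the truncated sum approaches 1 / alpha = 3 as T grows
for T in (20, 93, 500):
    print(T, ewma_average_age(1 / 3, T))
```

The more periods included, the closer the result gets to one over alpha, which is three in this case.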

Now that we know how to compute the average age of an EWMA given its alpha, we can figure out which (simple) moving average has the same average age:

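$$
\text{average age}_{\text{EWMA}} = \frac{1}{\alpha} = \frac{\text{periods} + 1}{2} = \text{average age}_{\text{MA}} \implies \alpha = \frac{2}{\text{periods} + 1}
$$

Equation 6.7 - Alpha vs. periods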

There we go, an easy and straightforward relationship between the value of alpha and the number of periods of a moving average. Guess what happens if you plug in the value one-third for alpha? You get the corresponding number of periods: five. An EWMA using an alpha equal to one-third corresponds to a five-period moving average.
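The conversion between alpha and the number of periods can be sketched in a couple of lines. The sketch assumes the relationship is alpha = 2 / (periods + 1), which matches every example in the text; the function names are hypothetical.

```python
def alpha_from_periods(periods):
    """Alpha of the EWMA with the same average age as a moving average
    over `periods` periods (assumes alpha = 2 / (periods + 1))."""
    return 2 / (periods + 1)

def periods_from_alpha(alpha):
    """Inverse of the relationship above: periods = 2 / alpha - 1."""
    return 2 / alpha - 1

print(periods_from_alpha(1 / 3))  # roughly five periods
print(alpha_from_periods(5))      # roughly one-third
```

Going back and forth between the two functions should return the value you started with, up to floating-point rounding.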

It also works the other way around: If we’d like to compute the EWMA equivalent to a 19-period moving average, the corresponding alpha would be 0.1. And, if we’re using the EWMA’s formula based on beta (that is, one minus alpha), that would be 0.9. Similarly, to compute the EWMA equivalent to a 1999-period moving average, alpha and beta would be 0.001 and 0.999, respectively.

These choices are not random at all: It turns out that Adam uses these two values for its betas (one for the moving average of gradients, the other for the moving average of squared gradients).
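As a quick sanity check, the two beta values can be translated back into the number of periods of the equivalent moving average. This is a sketch under two assumptions: beta equals one minus alpha, and alpha = 2 / (periods + 1). The names `adam_betas` and `equivalent_periods` are made up for illustration.

```python
# the two beta values mentioned above (also PyTorch's default betas for Adam)
adam_betas = (0.9, 0.999)

def equivalent_periods(beta):
    """Periods of the moving average with the same average age as an
    EWMA using this beta (assumes beta = 1 - alpha)."""
    alpha = 1 - beta
    return 2 / alpha - 1

for beta in adam_betas:
    # rounding hides the tiny floating-point error in 1 - beta
    print(beta, round(equivalent_periods(beta)))
```

In other words, the moving average of gradients looks back over a short window, while the moving average of squared gradients looks back over a much longer one.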

458 | Chapter 6: Rock, Paper, Scissors
