
Think of it as each parameter having its value updated by a constant value, eta (the learning rate), with this constant weighted by how much that parameter contributes to minimizing the loss (its gradient).

Honestly, I believe this way of thinking about the parameter update makes more sense. First, you decide on a learning rate that specifies your step size, while the gradients tell you the relative impact (on the loss) of taking a step for each parameter. Then, you take a given number of steps that's proportional to that relative impact: more impact, more steps.
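Written as an update rule, this is the standard gradient descent step (a minimal sketch, assuming the loss computed in the earlier steps is the mean squared error, MSE):

$$ b \leftarrow b - \eta \, \frac{\partial \text{MSE}}{\partial b} \qquad w \leftarrow w - \eta \, \frac{\partial \text{MSE}}{\partial w} $$

where $\eta$ (eta) is the learning rate.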

"How do you choose a learning rate?"

That is a topic on its own and beyond the scope of this section as

well. We’ll get back to it later on.

In our example, let's start with a value of 0.1 for the learning rate (which is a relatively high value, as far as learning rates are concerned).

Step 4

# Sets learning rate - this is "eta" ~ the "n"-like Greek letter
lr = 0.1
print(b, w)

# Step 4 - Updates parameters using gradients and the learning rate
b = b - lr * b_grad
w = w - lr * w_grad

print(b, w)

Output

[0.49671415] [-0.1382643]
[0.80119529] [0.04511107]
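As a quick sanity check, the printed values are consistent with the update rule above. The gradient values in this sketch are inferred from the two printed lines (in the actual code they come from Step 3, where the gradients are computed), so treat them as approximations:

import numpy as np

# Values before the update, taken from the first printed line
b = np.array([0.49671415])
w = np.array([-0.1382643])

# Gradients implied by the printed results (approximate); in the
# actual example they are computed in Step 3
b_grad = np.array([-3.0448114])
w_grad = np.array([-1.8337537])

lr = 0.1
b = b - lr * b_grad  # 0.49671415 - 0.1 * (-3.0448114) = 0.80119529
w = w - lr * w_grad  # -0.1382643 - 0.1 * (-1.8337537) = 0.04511107

print(b, w)  # matches the second printed line above

Notice that both gradients are negative, so both parameters increase: the update moves each parameter in the direction opposite to its gradient.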
