
one matrix for each data point, each matrix containing a grid of errors.

The only step missing is to compute the mean squared error. First, we take the square of all errors. Then, we average the squares over all data points. Since our data points are in the first dimension, we use axis=0 to compute this average:

all_losses = (all_errors ** 2).mean(axis=0)
all_losses.shape

Output

(101, 101)
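
For context, here is a minimal, self-contained sketch of how all_errors could have been built. The synthetic dataset, the random seed, and the ranges chosen for b and w are illustrative assumptions, not the book's exact values; the key point is the broadcasting that produces one (101, 101) grid of errors per data point:

import numpy as np

# Illustrative synthetic linear data (an assumption, not the book's exact dataset)
true_b, true_w = 1, 2
np.random.seed(42)
x = np.random.rand(100, 1)
y = true_b + true_w * x + 0.1 * np.random.randn(100, 1)

# Grids of candidate parameters; the ranges here are assumptions
b_range = np.linspace(-2, 4, 101)
w_range = np.linspace(-1, 5, 101)
bs, ws = np.meshgrid(b_range, w_range)  # each of shape (101, 101)

# Broadcasting yields one (101, 101) grid of predictions per data point
all_predictions = bs + ws * x.reshape(-1, 1, 1)  # shape (100, 101, 101)
all_errors = all_predictions - y.reshape(-1, 1, 1)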

The result is a grid of losses, a matrix of shape (101, 101), each loss corresponding to a different combination of the parameters b and w.

These losses are our loss surface, which can be visualized in a 3D plot, where the vertical axis (z) represents the loss values. If we connect the combinations of b and w that yield the same loss value, we'll get an ellipse. Then, we can draw this ellipse in the original b x w plane (in blue, for a loss value of 3). This is, in a nutshell, what a contour plot does. From now on, we'll always use the contour plot, instead of the corresponding 3D version.

Figure 0.4 - Loss surface
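
A contour plot like the one in Figure 0.4 can be drawn with Matplotlib. This is just a sketch, reusing the bs, ws, and all_losses arrays from the snippet above (not necessarily how the book's figure was produced):

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 6))
# Lines of constant loss (the ellipses)
cs = ax.contour(bs, ws, all_losses, levels=20)
ax.clabel(cs, inline=True, fontsize=8)
# Highlight the ellipse for a loss value of 3, in blue
ax.contour(bs, ws, all_losses, levels=[3], colors='blue')
ax.set_xlabel('b')
ax.set_ylabel('w')
ax.set_title('Loss Surface')
plt.show()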

In the center of the plot, where the parameters (b, w) have values close to (1, 2), the loss is at its minimum value. This is the point we're trying to reach using gradient descent.
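
As a sanity check, we can locate the minimum directly on the grid (again assuming the arrays from the sketches above):

# Index of the smallest loss on the (101, 101) grid
min_idx = np.unravel_index(all_losses.argmin(), all_losses.shape)
bs[min_idx], ws[min_idx]  # should come out close to (1, 2)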

