22.02.2024 Views

Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub

Let’s apply this filter to our image so we can use the resulting image in our next operation:

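# Zero-pad one pixel on each side: (left, right, top, bottom)
padded = F.pad(image, (1, 1, 1, 1), mode='constant', value=0)
# Convolve the padded image with the edge filter
conv_padded = F.conv2d(padded, kernel_edge, stride=1)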

Pooling

Now we’re back in the business of shrinking images. Pooling is different from the previous operations: it splits the image into tiny chunks, performs an operation on each chunk (one that yields a single value), and puts those values together as the resulting image. Again, an image is worth a thousand words.

Figure 5.15 - Max pooling

In the image above, we’re performing max pooling with a kernel size of two. Even though this is not quite the same as the filters we’ve already seen, it is still called a kernel.

In this example, the stride is assumed to be the same size as the kernel.
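To see what the stride assumption means in practice, here is a minimal sketch using PyTorch's functional max pooling. The random 6x6 image and the variable names are assumptions made for illustration; they are not the image used in the figure.

```python
import torch
import torch.nn.functional as F

# Hypothetical single-channel 6x6 image in (N, C, H, W) layout
image = torch.rand(1, 1, 6, 6)

# When stride is omitted, F.max_pool2d uses the kernel size as the stride,
# so the chunks do not overlap
default_stride = F.max_pool2d(image, kernel_size=2)
explicit_stride = F.max_pool2d(image, kernel_size=2, stride=2)

# A stride smaller than the kernel makes the chunks overlap,
# which shrinks the image less
overlapping = F.max_pool2d(image, kernel_size=2, stride=1)

print(default_stride.shape)   # 3x3 spatial size
print(overlapping.shape)      # 5x5 spatial size
```

Leaving the stride out is the common choice for pooling, since non-overlapping chunks give the largest reduction in size for a given kernel.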

Our input image is split into nine chunks, and we perform a simple max operation (hence, max pooling) on each chunk; that is, we just take the largest value in each chunk. Then, these values are put together, in order, to produce a smaller resulting image.
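The chunk-by-chunk description can be checked against PyTorch directly. The sketch below assumes a hypothetical 6x6 image filled with the values 0 to 35 (not the image from the figure) and compares `F.max_pool2d` with a manual version that cuts the image into nine 2x2 chunks and takes the largest value of each.

```python
import torch
import torch.nn.functional as F

# Hypothetical 6x6 single-channel image in (N, C, H, W) layout, values 0..35
image = torch.arange(36, dtype=torch.float32).view(1, 1, 6, 6)

# Built-in max pooling with a 2x2 kernel (stride defaults to the kernel size)
pooled = F.max_pool2d(image, kernel_size=2)

# Manual version: slide a window of size 2 with step 2 over the rows,
# then over the columns, which yields nine non-overlapping 2x2 chunks
chunks = image.unfold(2, 2, 2).unfold(3, 2, 2)  # shape (1, 1, 3, 3, 2, 2)

# Take the largest value inside each chunk
manual = chunks.amax(dim=(-2, -1))              # shape (1, 1, 3, 3)

print(pooled.shape)
print(torch.equal(pooled, manual))
```

Both versions should produce the same 3x3 result; the manual one is only there to make the "split, reduce, reassemble" steps explicit.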

The larger the pooling kernel, the smaller the resulting image.
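A quick way to confirm this relationship is to pool the same image with several kernel sizes and compare the output shapes. The 6x6 random image and the chosen kernel sizes below are assumptions for illustration, and the stride is left at its default (equal to the kernel size).

```python
import torch
import torch.nn.functional as F

# Hypothetical single-channel 6x6 image in (N, C, H, W) layout
image = torch.rand(1, 1, 6, 6)

# Pool with increasingly large kernels and record the spatial size
output_sizes = {}
for kernel_size in (1, 2, 3, 6):
    pooled = F.max_pool2d(image, kernel_size=kernel_size)
    output_sizes[kernel_size] = tuple(pooled.shape[-2:])

for kernel_size, size in output_sizes.items():
    print(f"kernel {kernel_size} -> output {size}")
```

With no padding and a stride equal to the kernel size, each spatial dimension is divided by the kernel size (rounded down), so a kernel as large as the image collapses it to a single value.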

364 | Chapter 5: Convolutions
