
reduced to 1/8 or less of its original dimensions (and thus 1/64 of its total number of pixels). The number of channels / filters produced by each block, though, usually increases as more blocks are added.

After the sequence of blocks, the image gets flattened: Hopefully, at this stage, no information is lost by treating each value in the flattened tensor as a feature on its own.

Once the features are dissociated from pixels, it becomes a fairly standard problem, like the ones we've been handling in this book: The features feed one or more hidden layers, and an output layer produces the logits for classification.
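To make this concrete, here is a minimal sketch of the typical pattern described above (illustrative shapes and sizes only, not the book's actual model): three convolutional blocks that halve the image dimensions while increasing the number of channels, followed by flattening, a hidden layer, and an output layer that produces the logits.

import torch
import torch.nn as nn

model = nn.Sequential(
    # block 1: 3@32x32 -> 16@16x16
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    # block 2: 16@16x16 -> 32@8x8
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    # block 3: 32@8x8 -> 64@4x4
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    # flatten: each of the 64 * 4 * 4 = 1024 values becomes a feature on its own
    nn.Flatten(),
    # hidden layer and output layer producing the logits for ten classes
    nn.Linear(64 * 4 * 4, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

logits = model(torch.randn(1, 3, 32, 32))  # shape: [1, 10]

After three blocks, a 32x32 input is reduced to 4x4, that is, 1/8 of its original dimensions and 1/64 of its pixels, matching the figures above.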

If you think about it, what those typical convolutional blocks do is akin to pre-processing images and converting them into features. Let's call this part of the network a featurizer (the one that generates features).

The classification itself is handled by the familiar and well-known hidden and output layers.

In transfer learning, which we'll see in Chapter 7, this will become even clearer.
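Here is a sketch of the same idea with the two roles made explicit; the names featurizer and classifier below are ours for illustration, not a PyTorch convention:

import torch.nn as nn

# the featurizer: turns images into features
featurizer = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
)
# the classifier: handles the actual classification
classifier = nn.Sequential(
    nn.Linear(32 * 8 * 8, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
model = nn.Sequential(featurizer, classifier)

In transfer learning, one typically reuses a pretrained featurizer and trains only a new classifier on top of it.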

LeNet-5

LeNet-5 is a seven-level convolutional neural network developed by Yann LeCun in 1998 to recognize hand-written digits in 28x28 pixel images, the famous MNIST dataset! That's when it all started (kinda). In 1989, LeCun himself used backpropagation (chained gradient descent, remember?) to learn the convolution filters, as we discussed above, instead of painstakingly developing them manually. His network had the architecture depicted next.
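As a rough modern PyTorch sketch of that architecture (an approximation, not LeCun's original implementation: the original used tanh-style activations, trainable subsampling layers, and an RBF output layer, replaced here by ReLU, plain average pooling, and a linear output):

import torch.nn as nn

lenet5 = nn.Sequential(
    # C1: 1@28x28 -> 6@28x28 (padding=2 mimics the original 32x32 input)
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.ReLU(),
    # S2: 6@28x28 -> 6@14x14
    nn.AvgPool2d(kernel_size=2),
    # C3: 6@14x14 -> 16@10x10
    nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(),
    # S4: 16@10x10 -> 16@5x5
    nn.AvgPool2d(kernel_size=2),
    # C5: 16@5x5 -> 120@1x1
    nn.Conv2d(16, 120, kernel_size=5), nn.ReLU(),
    # F6 and output: 120 -> 84 -> 10 logits, one per digit
    nn.Flatten(),
    nn.Linear(120, 84), nn.ReLU(),
    nn.Linear(84, 10),
)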
