The training data had 1.2 million images belonging to 1,000 categories, just like the 2012 edition.

VGG

The architecture developed by Karen Simonyan and Andrew Zisserman from Oxford's Visual Geometry Group (VGG) is pretty much an even larger or, better yet, deeper model than AlexNet (and now you know the origin of yet another architecture name). Their goal is made crystal clear in their model's description:

…we explore the effect of the convolutional network (ConvNet) depth on its accuracy.

Source: Results (ILSVRC2014) [115]

VGG models are massive, so we're not paying much attention to them here. If you want to learn more about them, their paper is called "Very Deep Convolutional Networks for Large-Scale Image Recognition." [116]
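
Just how massive are they? A quick way to find out is to compare parameter counts using torchvision's model zoo. Here is a minimal sketch, assuming torchvision is installed (the models are created with random weights, since we only care about their sizes):

import torch
from torchvision import models

def n_params(model):
    # total number of parameters in the model
    return sum(p.numel() for p in model.parameters())

alexnet = models.alexnet()  # roughly 61 million parameters
vgg16 = models.vgg16()      # roughly 138 million parameters

print(f'AlexNet: {n_params(alexnet):,}')
print(f'VGG16:   {n_params(vgg16):,}')

VGG16 has more than twice as many parameters as AlexNet, most of them sitting in its fully connected layers.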

Inception (GoogLeNet Team)

The Inception architecture is probably the one with the best meme of all: "We need to go deeper." The authors, Christian Szegedy et al., like the VGG team, wanted to train a deeper model. But they came up with a clever way of doing it:

Additional dimension reduction layers based on embedding learning intuition allow us to increase both the depth and the width of the network significantly without incurring significant computational overhead.

Source: Results (ILSVRC2014) [117]

If you want to learn more about it, the paper is called "Going Deeper with Convolutions." [118]

"What are these dimension-reduction layers?"

No worries, we’ll get back to it in the "Inception Modules" section.
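
As a sneak peek, here is a minimal sketch of the underlying idea (the channel counts and image size are made up for illustration): a one-by-one convolution reduces the number of channels, so the more expensive convolution that follows operates on a much smaller input.

import torch
import torch.nn as nn

# a dummy mini-batch: one image, 256 channels, 28x28 pixels
dummy = torch.randn(1, 256, 28, 28)

# 1x1 convolution: shrinks 256 channels down to 64, pixel by pixel
reduce = nn.Conv2d(256, 64, kernel_size=1)
# the expensive 3x3 convolution now runs on 64 channels instead of 256
conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)

out = conv3(reduce(dummy))
print(out.shape)  # torch.Size([1, 128, 28, 28])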

ILSVRC-2015

The 2015 edition [119] popularized residual connections in its aptly named architecture: Res(idual) Net(work). The training data used in the competition was the same as in the previous editions: 1.2 million images belonging to 1,000 categories.
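
In a nutshell, a residual (or skip) connection adds a layer's input back to its output, so the layers in between only need to learn the difference, that is, the residual. Here is a minimal sketch of such a block (a simplified version for illustration; real ResNet blocks also include batch normalization):

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # two convolutions that preserve the number of channels
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        identity = x                      # the skip connection
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        # the residual connection: add the input back to the output
        return self.relu(out + identity)

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 28, 28)).shape)  # torch.Size([1, 64, 28, 28])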
