
Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub


image_rgb = np.stack([image_r, image_g, image_b], axis=2)

Let’s see what those same matrices represent once we treat them as

channels (unfortunately, the visual impact is completely lost in print).

Before moving on with our classification problem, we need to address the shape

issue: Different frameworks (and Python packages) use different conventions for

the shape of the images.
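
To make the shape issue concrete, here is a minimal sketch. It assumes three hypothetical 4x6 single-channel arrays standing in for image_r, image_g, and image_b (the sizes and the zero values are placeholders, not the matrices used above). Stacking on axis=2 puts the channels last, while stacking on axis=0 puts them first:

```python
import numpy as np

# Hypothetical placeholders for the three single-channel matrices
# (height=4, width=6; the real values are not reproduced here)
image_r = np.zeros((4, 6), dtype=np.uint8)
image_g = np.zeros((4, 6), dtype=np.uint8)
image_b = np.zeros((4, 6), dtype=np.uint8)

# Channels last: (height, width, channels)
image_hwc = np.stack([image_r, image_g, image_b], axis=2)

# Channels first: (channels, height, width)
image_chw = np.stack([image_r, image_g, image_b], axis=0)

print(image_hwc.shape)  # (4, 6, 3)
print(image_chw.shape)  # (3, 4, 6)
```

The pixel values are the same in both arrays; only the position of the channels axis differs.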

Shape (NCHW vs NHWC)

"What do these acronyms stand for?"

It’s quite simple, actually:

• N stands for the Number of images (in a mini-batch, for instance).

• C stands for the number of Channels (or filters) in each image.

• H stands for each image’s Height.

• W stands for each image’s Width.

Thus the acronyms indicate the expected shape of the mini-batch:

• NCHW: (number of images, channels, height, width)

• NHWC: (number of images, height, width, channels)
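
Converting between the two layouts is just a matter of reordering axes. The sketch below assumes a hypothetical mini-batch of 8 RGB images, 32 pixels high and 24 pixels wide (these sizes are chosen only for illustration), and uses NumPy's transpose to move the channels axis:

```python
import numpy as np

# Hypothetical mini-batch in NHWC layout: 8 images, 32x24 pixels, 3 channels
batch_nhwc = np.zeros((8, 32, 24, 3), dtype=np.float32)

# NHWC -> NCHW: move the channels axis (3) right after the batch axis (0)
batch_nchw = batch_nhwc.transpose(0, 3, 1, 2)
print(batch_nchw.shape)  # (8, 3, 32, 24)

# NCHW -> NHWC: move the channels axis (1) back to the end
restored = batch_nchw.transpose(0, 2, 3, 1)
print(restored.shape)  # (8, 32, 24, 3)
```

Note that transpose returns a view with reordered axes rather than a copy, so the underlying data is not rearranged in memory.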

Basically, everyone agrees that the number of images comes first, and that height

and width are an inseparable duo. It all comes down to the channels (or filters):

