
segmentation_model = UNetWrapper(
    in_channels=7,
    n_classes=1,
    depth=3,
    wf=4,
    padding=True,
    batch_norm=True,
    up_mode='upconv',
)
augmentation_model = SegmentationAugmentation(**self.augmentation_dict)
# ... line 154
return segmentation_model, augmentation_model

For input into UNet, we’ve got seven input channels: 3 + 3 context slices, and 1 slice that is the focus for what we’re actually segmenting. We have one output class indicating whether this voxel is part of a nodule. The depth parameter controls how deep the U goes; each downsampling operation adds 1 to the depth. Using wf=4 means the first layer will have 2**wf == 16 filters, which doubles with each downsampling. We want the convolutions to be padded so that we get an output image the same size as our input. We also want batch normalization inside the network after each activation function, and our upsampling function should be an upconvolution layer, as implemented by nn.ConvTranspose2d (see util/unet.py, line 123).
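
To make those parameters concrete, here is a small illustrative sketch (standalone code, not taken from util/unet.py): it computes the per-level filter counts implied by depth and wf, and shows that a padded 3 × 3 convolution preserves the slice size while a stride-2 nn.ConvTranspose2d doubles it. The tensor shapes are made-up examples.

import torch
import torch.nn as nn

# How depth and wf translate into filter counts, assuming the filter
# count doubles at each downsampling step as described above.
depth, wf = 3, 4
filters_per_level = [2 ** (wf + i) for i in range(depth)]
print(filters_per_level)                 # [16, 32, 64]

# With padding, a 3x3 convolution keeps the spatial size unchanged, so
# the segmentation output lines up with the input slice pixel for pixel.
conv = nn.Conv2d(in_channels=7, out_channels=16, kernel_size=3, padding=1)
x = torch.randn(1, 7, 512, 512)          # one 7-channel slice (hypothetical size)
print(conv(x).shape)                     # torch.Size([1, 16, 512, 512])

# up_mode='upconv' means a learned upsampling layer such as
# nn.ConvTranspose2d with stride 2, which doubles height and width.
upconv = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
print(upconv(torch.randn(1, 64, 64, 64)).shape)  # torch.Size([1, 32, 128, 128])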

13.6.2 Using the Adam optimizer

The Adam optimizer (https://arxiv.org/abs/1412.6980) is an alternative to using SGD when training our models. Adam maintains a separate learning rate for each parameter and automatically updates that learning rate as training progresses. Due to these automatic updates, we typically won’t need to specify a non-default learning rate when using Adam, since it will quickly determine a reasonable learning rate by itself. Here’s how we instantiate Adam in code.

Listing 13.23 training.py:156, .initOptimizer

def initOptimizer(self):
    return Adam(self.segmentation_model.parameters())
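
On its own the optimizer does nothing until it is wired into the training loop with the usual zero_grad/backward/step pattern. The following is a generic sketch with a stand-in model and made-up data, not the book’s actual training loop:

import torch
from torch.optim import Adam

model = torch.nn.Linear(10, 1)           # stand-in for the segmentation model
optimizer = Adam(model.parameters())     # default learning rate, as in listing 13.23

x = torch.randn(4, 10)                   # made-up batch
target = torch.randn(4, 1)

optimizer.zero_grad()                    # clear gradients from the previous step
loss = torch.nn.functional.mse_loss(model(x), target)
loss.backward()                          # compute new gradients
optimizer.step()                         # Adam updates each parameter with its own rate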

It’s generally accepted that Adam is a reasonable optimizer to start most projects with.13 There is often a configuration of stochastic gradient descent with Nesterov momentum that will outperform Adam, but finding the correct hyperparameters to use when initializing SGD for a given project can be difficult and time consuming.

There have been a large number of variations on Adam (AdaMax, RAdam, Ranger, and so on) that each have strengths and weaknesses. Delving into the details of those is outside the scope of this book, but we think that it’s important to know that those alternatives exist. We’ll use Adam in chapter 13.

13 See http://cs231n.github.io/neural-networks-3.
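
For comparison, if we did want to try the SGD-with-Nesterov-momentum route, the setup might look like the sketch below. The method name initOptimizerSGD and the specific lr and momentum values are illustrative placeholders, and tuning them well is exactly the effort described above.

from torch.optim import SGD

def initOptimizerSGD(self):              # hypothetical alternative to initOptimizer
    return SGD(
        self.segmentation_model.parameters(),
        lr=0.001,                        # unlike Adam, SGD needs an explicit learning rate
        momentum=0.99,
        nesterov=True,                   # Nesterov momentum, as mentioned above
    )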
