
CHAPTER 11 Training a classification model to detect suspected tumors

– Loop over each batch of validation data (in a manner very similar to the training loop).
– Load the relevant batch of validation data (again, in the background worker process).
– Classify the batch, and compute the loss.
– Record information about how well the model performed on the validation data.
– Print out progress and performance information for this epoch.
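
Put together, those steps might look something like the following minimal sketch. The names model, val_dl, loss_fn, and device are placeholders standing in for objects we build up over the course of the chapter; this is not the chapter's actual code.

import torch

def do_validation(model, val_dl, loss_fn, device):
    # Evaluation mode: dropout and batchnorm behave deterministically.
    model.eval()
    total_loss = 0.0
    total_samples = 0
    # No gradients are needed when we're only measuring performance.
    with torch.no_grad():
        # The DataLoader's worker processes load batches in the background.
        for inputs, labels in val_dl:
            inputs = inputs.to(device)
            labels = labels.to(device)
            logits = model(inputs)          # classify the batch
            loss = loss_fn(logits, labels)  # compute the loss
            total_loss += loss.item() * labels.size(0)
            total_samples += labels.size(0)
    return total_loss / total_samples

Since we never call backward() during validation, skipping gradient tracking with torch.no_grad() saves both time and memory.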

As we go through the code for the chapter, keep an eye out for two main differences between the code we're producing here and what we used for a training loop in part 1. First, we'll put more structure around our program, since the project as a whole is quite a bit more complicated than what we did in earlier chapters. Without that extra structure, the code can get messy quickly. And for this project, we will have our main training application use a number of well-contained functions, and we will further separate code for things like our dataset into self-contained Python modules.
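
As a rough illustration of the kind of separation we mean, consider the hypothetical skeleton below. The class and method names are illustrative placeholders, not the chapter's actual code, and the dataset would live in its own module.

# training.py -- the main application, broken into well-contained methods.
# The Dataset subclass would live separately in its own module (e.g., dsets.py).
class TrainingApp:
    def __init__(self, sys_argv=None):
        # In the real application: parse arguments, then build the model,
        # optimizer, and data loaders.
        self.epochs = 1

    def do_training(self, epoch_ndx):
        pass  # loop over training batches, backpropagate, step the optimizer

    def do_validation(self, epoch_ndx):
        pass  # loop over validation batches, record performance metrics

    def main(self):
        for epoch_ndx in range(1, self.epochs + 1):
            self.do_training(epoch_ndx)
            self.do_validation(epoch_ndx)

if __name__ == '__main__':
    TrainingApp().main()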

Make sure that for your own projects, you match the level of structure and design to the complexity of the project. Too little structure, and it will become difficult to perform experiments cleanly, troubleshoot problems, or even describe what you're doing! Conversely, too much structure means you're wasting time writing infrastructure that you don't need, and most likely slowing yourself down by having to conform to it after all that plumbing is in place. Plus, it can be tempting to spend time on infrastructure as a procrastination tactic rather than digging into the hard work of making actual progress on your project. Don't fall into that trap!

The other big difference between this chapter's code and part 1 will be a focus on collecting a variety of metrics about how training is progressing. Accurately determining the impact of changes on training is impossible without good metrics logging. Without spoiling the next chapter, we'll also see how important it is to collect not just metrics, but the right metrics for the job. We'll lay the infrastructure for tracking those metrics in this chapter, and we'll exercise that infrastructure by collecting and displaying the loss and the percentage of samples correctly classified, both overall and per class. That's enough to get us started, but we'll cover a more realistic set of metrics in chapter 12.
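
As a taste of the metrics side, here is a minimal sketch of how the fraction of correctly classified samples could be computed both overall and per class. It assumes we've already collected 1-D tensors of predicted and true class indices over the whole validation set; the function name and signature are illustrative, not the chapter's actual code.

import torch

def classification_accuracy(preds, labels, num_classes=2):
    correct = (preds == labels)
    overall = correct.float().mean().item()
    per_class = []
    for cls in range(num_classes):
        mask = (labels == cls)
        if mask.any():
            # Fraction correct among samples whose true class is cls.
            per_class.append(correct[mask].float().mean().item())
        else:
            per_class.append(float('nan'))  # no samples of this class
    return overall, per_class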

11.2 The main entry point for our application

One of the big structural differences from earlier training work we've done in this book is that part 2 wraps our work in a fully fledged command-line application. It will parse command-line arguments, have a full-featured --help command, and be easy to run in a wide variety of environments. All this will allow us to easily invoke the training routines from both Jupyter and a Bash shell.¹

¹ Any shell, really, but if you're using a non-Bash shell, you already knew that.
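
As a bare-bones sketch of what such an entry point can look like, the following uses Python's standard argparse module; the specific arguments shown are hypothetical stand-ins, not the chapter's actual flags.

import argparse
import sys

def main(sys_argv=None):
    # When called from Jupyter, pass the argument list explicitly;
    # when run from the shell, fall back to sys.argv.
    if sys_argv is None:
        sys_argv = sys.argv[1:]
    parser = argparse.ArgumentParser(
        description="Train the classification model.")
    parser.add_argument('--epochs', type=int, default=1,
                        help='number of epochs to train for')
    parser.add_argument('--num-workers', type=int, default=8,
                        help='number of background worker processes for data loading')
    cli_args = parser.parse_args(sys_argv)
    print(cli_args)

if __name__ == '__main__':
    main()

Because argparse generates the --help output for us, and because main() accepts an explicit argument list, the same code can be invoked from a Jupyter cell (main(['--epochs', '5'])) or from the shell without modification.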
