reasonable solutions. That attention on the problem has also resulted in a lot of high-quality papers and open source projects, which are a great source of inspiration and ideas. This will be a huge help once we conclude part 2 of the book, if you are interested in continuing to improve on the solution we create. We'll provide some links to additional information in chapter 14.

This part of the book will remain focused on the problem of detecting lung tumors, but the skills we'll teach are general. Learning how to investigate, preprocess, and present your data for training is important no matter what project you're working on. While we'll be covering preprocessing in the specific context of lung tumors, the general idea is that this is what you should be prepared to do for your project to succeed. Similarly, setting up a training loop, getting the right performance metrics, and tying the project's models together into a final application are all general skills that we'll employ as we go through chapters 9 through 14.

NOTE While the end result of part 2 will work, the output will not be accurate enough to use clinically. We're focusing on using this as a motivating example for teaching PyTorch, not on employing every last trick to solve the problem.

9.2 Preparing for a large-scale project

This project will build off of the foundational skills learned in part 1. In particular, the content covering model construction from chapter 8 will be directly relevant. Repeated convolutional layers followed by a resolution-reducing downsampling layer will still make up the majority of our model. We will use 3D data as input to our model, however. This is conceptually similar to the 2D image data used in the last few chapters of part 1, but we will not be able to rely on all of the 2D-specific tools available in the PyTorch ecosystem.
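To make the shift from 2D to 3D concrete, here is a sketch of the kind of block just described: convolutional layers followed by a resolution-reducing downsampling layer, using PyTorch's 3D variants. The channel counts, kernel sizes, and input shape are illustrative placeholders, not the architecture we will design in chapter 11.

import torch
import torch.nn as nn

class ConvBlock3d(nn.Module):
    # Two 3D convolutions followed by a downsampling step.
    # Illustrative only: all sizes here are placeholders.
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2, stride=2),  # halves each spatial dimension
        )

    def forward(self, x):
        # x has shape (N, C, depth, height, width) -- one more spatial axis than 2D
        return self.block(x)

out = ConvBlock3d(1, 8)(torch.zeros(1, 1, 32, 48, 48))
print(out.shape)  # torch.Size([1, 8, 16, 24, 24])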

The main differences between the work we did with convolutional models in chapter 8 and what we'll do in part 2 are related to how much effort we put into things outside the model itself. In chapter 8, we used a provided, off-the-shelf dataset and did little data manipulation before feeding the data into a model for classification. Almost all of our time and attention were spent building the model itself, whereas now we're not even going to begin designing the first of our two model architectures until chapter 11. That is a direct consequence of having nonstandard data without prebuilt libraries ready to hand us training samples suitable to plug into a model. We'll have to learn about our data and implement quite a bit ourselves.
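The mechanism PyTorch uses to hand training samples to a model is the Dataset/DataLoader pair, and since no prebuilt dataset exists for our data, implementing a Dataset subclass is part of what we will do ourselves. The sketch below shows the general shape of such an implementation; the class name is hypothetical, and the random tensors stand in for the CT-loading code we will actually write, purely so the sketch runs stand-alone.

import torch
from torch.utils.data import Dataset, DataLoader

class CandidateDataset(Dataset):
    # Minimal sketch of a custom Dataset (hypothetical name).
    # The real implementation will load and crop CT data; here we
    # fabricate random volumes so the example is self-contained.
    def __init__(self, num_samples=10):
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, ndx):
        volume = torch.randn(1, 32, 48, 48)  # stand-in for a CT crop
        label = torch.tensor(ndx % 2)        # stand-in for an is-tumor label
        return volume, label

loader = DataLoader(CandidateDataset(), batch_size=4, shuffle=True)
volumes, labels = next(iter(loader))
print(volumes.shape, labels.shape)  # torch.Size([4, 1, 32, 48, 48]) torch.Size([4])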

Even when that's done, this will not end up being a case where we convert the CT to a tensor, feed it into a neural network, and have the answer pop out the other side. As is common for real-world use cases such as this one, a workable approach will be more complicated, to account for confounding factors such as limited data availability, finite computational resources, and limitations on our ability to design effective models. Please keep that in mind as we build toward a high-level explanation of our project architecture.

Speaking of finite computational resources, part 2 will require access to a GPU to achieve reasonable training speeds, preferably one with at least 8 GB of RAM. Trying to train without one will be impractically slow.
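A quick way to check whether a suitable GPU is visible to PyTorch is sketched below; the 8 GB threshold simply mirrors the recommendation above.

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, {props.total_memory / 2**30:.1f} GB")
    if props.total_memory < 8 * 2**30:
        print("Warning: less than the recommended 8 GB of GPU RAM.")
else:
    print("No CUDA GPU found; training on the CPU will be very slow.")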
