20.03.2021 Views

Deep-Learning-with-PyTorch

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

262 CHAPTER 10 Combining data sources into a unified dataset

Listing 10.5

candidateInfo_list.sort(reverse=True)

return candidateInfo_list

The ordering of the tuple members in noduleInfo_list is driven by this sort. We’re

using this sorting approach to help ensure that when we take a slice of the data, that

slice gets a representative chunk of the actual nodules with a good spread of nodule

diameters. We’ll discuss this more in section 10.5.3.

10.3 Loading individual CT scans

dsets.py:80, def getCandidateInfoList

This means we have all of the actual nodule

samples starting with the largest first, followed

by all of the non-nodule samples (which don’t

have nodule size information).

Next up, we need to be able to take our CT data from a pile of bits on disk and turn it

into a Python object from which we can extract 3D nodule density data. We can see this

path from the .mhd and .raw files to Ct objects in figure 10.4. Our nodule annotation

information acts like a map to the interesting parts of our raw data. Before we can follow

that map to our data of interest, we need to get the data into an addressable form.

TIP Having a large amount of raw data, most of which is uninteresting, is a

common situation; look for ways to limit your scope to only the relevant data

when working on your own projects.

CT Files

.MHD

.RAW

ANnotations

.CSV

CT

ARray

xyz2irc

Transform

1000

0 0

0 1 0 0

0 0 1 0

0001

0 0 Candidate

at

COordinate

(X,Y,Z)

Training

candidate

location

[(I,R,C),

(I,R,C),

(I,R,C),

...

]

Sample tuple

sample aRrayay

( ,

is nodule?

T/F,

Series_uid

“1.2.3”,

candidate

location

(I,R,C))

Figure 10.4 Loading a CT scan produces a voxel array and a transformation from

patient coordinates to array indices.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!