
TIP You can run the JITed and exported PyTorch model without keeping the source. However, we always want to establish a workflow where we automatically go from source model to installed JITed model for deployment. If we do not, we will find ourselves in a situation where we would like to tweak something in the model but have lost the ability to modify and regenerate it. Always keep the source, Luke!

15.2.3 Our server with a traced model

Now is a good time to iterate our web server to what is, in this case, our final version. We can export the traced CycleGAN model as follows:

python3 p3ch15/cyclegan.py data/p1ch2/horse2zebra_0.4.0.pth data/p3ch15/traced_zebra_model.pt
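
If you are curious what the export script boils down to, the core is simply tracing the generator and saving the result. Here is a minimal sketch, with a stand-in module in place of the pretrained CycleGAN generator (the real p3ch15/cyclegan.py loads the weights and traces that model instead):

import torch
import torch.nn as nn

# StandInGenerator is a placeholder for the CycleGAN ResNetGenerator with
# its pretrained weights loaded; it only exists to keep this sketch runnable.
class StandInGenerator(nn.Module):
    def forward(self, x):
        return torch.tanh(x)  # the real generator maps horse images to zebra images

netG = StandInGenerator().eval()              # tracing records eval-mode behavior
dummy_input = torch.randn(1, 3, 256, 256)     # example input of the expected shape
traced_model = torch.jit.trace(netG, dummy_input)
traced_model.save('data/p3ch15/traced_zebra_model.pt')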

Now we just need to replace the call to get_pretrained_model with torch.jit.load in our server (and drop the now-unnecessary import of get_pretrained_model). This also means our model runs independently of the GIL (Python's global interpreter lock), which is what we wanted our server to achieve here. For your convenience, we have put the small modifications in request_batching_jit_server.py. We can run it with the traced model file path as a command-line argument.
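
The server-side change amounts to something like the following sketch; request_batching_jit_server.py wires this into the request-batching logic, and the argument handling here is only illustrative:

import sys
import torch

# Load the traced model from the command-line path instead of building the
# Python model via get_pretrained_model.
model = torch.jit.load(sys.argv[1], map_location='cpu')
model.eval()

with torch.no_grad():
    output = model(torch.randn(1, 3, 256, 256))  # same call signature as the Python model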

Now that we have had a taste of what the JIT can do for us, let’s dive into the details!

15.3 Interacting with the PyTorch JIT

Debuting in PyTorch 1.0, the PyTorch JIT is at the center of quite a few recent innovations around PyTorch, not least of which is providing a rich set of deployment options.

15.3.1 What to expect from moving beyond classic Python/PyTorch

Quite often, Python is said to lack speed. While there is some truth to this, the tensor operations we use in PyTorch are usually large enough in themselves that the Python slowness between them is not a big issue. For small devices like smartphones, the memory overhead that Python brings might be more important. So keep in mind that frequently, the speedup gained by taking Python out of the computation is 10% or less.

Another immediate speedup from not running the model in Python only appears in multithreaded environments, but then it can be significant: because the intermediates are not Python objects, the computation is not affected by the menace of all Python parallelization, the GIL. This is what we had in mind earlier and realized when we used a traced model in our server.
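
To make that concrete, here is a toy sketch (the path, input shape, and thread count are illustrative): several Python threads push batches through the same traced model, and because TorchScript execution releases the GIL, the forward passes can actually overlap:

import threading
import torch

model = torch.jit.load('data/p3ch15/traced_zebra_model.pt')

def infer(batch):
    # TorchScript execution releases the GIL, so these calls can run concurrently.
    with torch.no_grad():
        model(batch)

threads = [threading.Thread(target=infer, args=(torch.randn(1, 3, 256, 256),))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()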

Moving away from the classic PyTorch way of executing one operation before looking at the next gives PyTorch a holistic view of the calculation: that is, it can consider the calculation in its entirety. This opens the door to crucial optimizations and higher-level transformations. Some of those apply mostly to inference, while others can also provide a significant speedup in training.
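
One way to see this holistic view is to script a small function and look at the graph the JIT records; optimization passes such as operator fusion work on graphs like this rather than on one operation at a time (a toy example, not taken from the book's code):

import torch

@torch.jit.script
def scaled_add(a: float, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    return a * x + y

# The whole computation is now captured as a TorchScript graph that the JIT
# can inspect and transform as a unit.
print(scaled_add.graph)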
