
15.2 Exporting models

So far, we have used PyTorch from the Python interpreter. But this is not always desirable: the GIL is still potentially blocking our improved web server, or we might want to run on embedded systems where Python is too expensive or unavailable. This is when we export our model. There are several ways we can go about this. We might leave PyTorch entirely and move to more specialized frameworks. Or we might stay within the PyTorch ecosystem and use the JIT, a just-in-time compiler for a PyTorch-centric subset of Python. Even when we then run the JITed model in Python, we might be after two of its advantages: sometimes the JIT enables nifty optimizations, or, as in the case of our web server, we just want to escape the GIL, which JITed models do. Finally (but we take some time to get there), we might run our model under libtorch, the C++ library PyTorch offers, or with the derived Torch Mobile.
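Before we get to the JIT in detail, here is a small taste of what that route looks like: a toy model is compiled to TorchScript, saved to disk, and loaded back. This is our own sketch, not a listing from the book; the class name and file name are made up for illustration.

import torch

class TwoLayerNet(torch.nn.Module):  # hypothetical toy model
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(4, 8)
        self.fc2 = torch.nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TwoLayerNet().eval()
scripted = torch.jit.script(model)   # compile the model to TorchScript
scripted.save("two_layer.pt")        # self-contained archive, loadable without the Python class

loaded = torch.jit.load("two_layer.pt")
print(loaded(torch.randn(1, 4)))     # inference runs through the TorchScript interpreter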

15.2.1 Interoperability beyond PyTorch with ONNX

Sometimes we want to leave the PyTorch ecosystem with our model in hand, for example, to run on embedded hardware with a specialized model deployment pipeline. For this purpose, the Open Neural Network Exchange (ONNX) provides an interoperability format for neural networks and machine learning models (https://onnx.ai). Once exported, the model can be executed using any ONNX-compatible runtime, such as ONNX Runtime,⁶ provided that the operations in use in our model are supported by the ONNX standard and the target runtime. It is, for example, quite a bit faster on the Raspberry Pi than running PyTorch directly. Beyond traditional hardware, a lot of specialized AI accelerator hardware supports ONNX (https://onnx.ai/supported-tools.html#deployModel).
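As a sketch of what running an exported model with ONNX Runtime looks like, assuming a previously exported file named model.onnx with a single image-shaped input (the file name and the input shape are placeholders, not from the book):

import numpy as np
import onnxruntime as ort

# Load the exported model on the CPU execution provider.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name          # ONNX inputs are named

# A dummy batch; shape and dtype must match what was exported.
batch = np.random.randn(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: batch})   # None requests all outputs
print(outputs[0].shape)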

In a way, a deep learning model is a program with a very specific instruction set, made of granular operations like matrix multiplication, convolution, relu, tanh, and so on. As such, if we can serialize the computation, we can reexecute it in another runtime that understands its low-level operations. ONNX is a standardization of a format describing those operations and their parameters.
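To make this "program with a specific instruction set" view concrete, the onnx Python package can load an exported file and print the graph of operations it contains. This is a sketch of our own; model.onnx stands in for whatever file you exported.

import onnx

model = onnx.load("model.onnx")                    # parse the serialized graph
onnx.checker.check_model(model)                    # verify the file is well formed
print(onnx.helper.printable_graph(model.graph))    # lists nodes such as Conv, Relu, Gemm, ...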

Most of the modern deep learning frameworks support serialization of their computations to ONNX, and some of them can load an ONNX file and execute it (although this is not the case for PyTorch). Some low-footprint ("edge") devices accept an ONNX file as input and generate low-level instructions for the specific device. And some cloud computing providers now make it possible to upload an ONNX file and see it exposed through a REST endpoint.

In order to export a model to ONNX, we need to run a model with a dummy input: the values of the input tensors don't really matter; what matters is that they are the correct shape and type. By invoking the torch.onnx.export function, PyTorch traces the operations the model performs on the dummy input and serializes them into an ONNX file.
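A minimal sketch of such an export, using a torchvision ResNet as a stand-in for whatever model you want to ship (this is our illustration, not the book's listing; the file and tensor names are arbitrary):

import torch
import torchvision

# Any nn.Module in eval mode works the same way; random weights are fine for the sketch.
model = torchvision.models.resnet18().eval()

# The dummy input only needs the right shape and dtype; its values are irrelevant.
dummy_input = torch.randn(1, 3, 224, 224)

torch.onnx.export(model, dummy_input, "resnet18.onnx",
                  input_names=["input"], output_names=["output"])

The resulting resnet18.onnx file can then be handed to ONNX Runtime or any other ONNX-compatible runtime, as shown earlier.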

⁶ The code lives at https://github.com/microsoft/onnxruntime, but be sure to read the privacy statement! Currently, building ONNX Runtime yourself will get you a package that does not send things to the mothership.
