
# Continued from the top of the listing (previous page), which imports json,
# sys, numpy as np, torch, and Flask's request/jsonify helpers, creates the
# Flask app, and instantiates the classifier as `model`.
model.load_state_dict(torch.load(sys.argv[1],
                                 map_location='cpu')['model_state'])
model.eval()

def run_inference(in_tensor):
    with torch.no_grad():    # No autograd for us.
        # LunaModel takes a batch and outputs a tuple (scores, probs)
        out_tensor = model(in_tensor.unsqueeze(0))[1].squeeze(0)
    probs = out_tensor.tolist()
    out = {'prob_malignant': probs[1]}
    return out

# We expect a form submission (HTTP POST) at the "/predict" endpoint.
@app.route("/predict", methods=["POST"])
def predict():
    meta = json.load(request.files['meta'])    # Our request will have one file called meta.
    blob = request.files['blob'].read()
    in_tensor = torch.from_numpy(np.frombuffer(    # Converts our data from a binary blob to a torch tensor
        blob, dtype=np.float32))
    in_tensor = in_tensor.view(*meta['shape'])
    out = run_inference(in_tensor)
    return jsonify(out)    # Encodes our response content as JSON

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8000)
    print(sys.argv[1])

Run the server as follows:

python3 -m p3ch15.flask_server data/part2/models/cls_2019-10-19_15.48.24_final_cls.best.state

We prepared a trivial client at cls_client.py that sends a single example. From the code directory, you can run it as

python3 p3ch15/cls_client.py
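If you would rather see the moving parts inline, here is a minimal sketch of such a client using the requests library. The all-zeros array, its shape, and the part filenames are placeholder assumptions for illustration; the actual cls_client.py in the book's code sends a real candidate crop and may be structured differently.

import json

import numpy as np
import requests

# Placeholder input: a float32 array standing in for a candidate crop.
# The real client sends actual CT data; the shape here is only an assumption.
arr = np.zeros((1, 32, 48, 48), dtype=np.float32)

files = {
    'meta': ('meta.json', json.dumps({'shape': list(arr.shape)})),  # shape lets the server .view() the flat buffer
    'blob': ('blob.bin', arr.tobytes()),                            # raw float32 bytes, matching np.frombuffer on the server
}
response = requests.post('http://localhost:8000/predict', files=files)
print(response.json())    # a JSON object like {'prob_malignant': ...}

The key detail is that meta and blob travel as file parts of a multipart POST, which is exactly where request.files on the server picks them up.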


It should tell you that the nodule is very unlikely to be malignant. Clearly, our server takes inputs, runs them through our model, and returns the outputs. So are we done? Not quite. Let's look at what could be better in the next section.

15.1.2 What we want from deployment

Let's collect some things we desire for serving models.3 First, we want to support modern protocols and their features. Old-school HTTP is deeply serial, which means that when a client wants to send several requests over the same connection, the next request will only be sent after the previous one has been answered. Not very efficient if you want to send a batch of things. We will partially deliver here: our upgrade to Sanic certainly moves us to a framework that has the ambition to be very efficient.
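To make the serialization problem concrete, here is a hedged sketch (not from the book) of a client that keeps several requests in flight at once by issuing them concurrently over separate connections with asyncio and aiohttp. This only stops the client from serializing its own requests; it does not, by itself, let the server batch them, which is the part the later Sanic-based server addresses. The endpoint URL, shapes, and helper names are assumptions for illustration.

import asyncio
import json

import aiohttp
import numpy as np

PREDICT_URL = 'http://localhost:8000/predict'    # assumes the Flask server above is running

def make_form(arr):
    # Same wire format /predict expects: a 'meta' file part with the shape
    # and a 'blob' file part with the raw float32 bytes.
    form = aiohttp.FormData()
    form.add_field('meta', json.dumps({'shape': list(arr.shape)}),
                   filename='meta.json', content_type='application/json')
    form.add_field('blob', arr.tobytes(),
                   filename='blob.bin', content_type='application/octet-stream')
    return form

async def query(session, arr):
    async with session.post(PREDICT_URL, data=make_form(arr)) as response:
        return await response.json()

async def main(batch):
    # asyncio.gather keeps all requests in flight at once instead of
    # waiting for each answer before sending the next request.
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(query(session, arr) for arr in batch))

batch = [np.zeros((1, 32, 48, 48), dtype=np.float32) for _ in range(8)]
print(asyncio.run(main(batch)))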

3 One of the earliest public talks discussing the inadequacy of Flask serving for PyTorch models is Christian Perone's "PyTorch under the Hood," http://mng.bz/xWdW.
