
CHAPTER 3 ■ TCP

$ python tcp_deadlock.py client 1073741824
Sending 1073741824 bytes of data, in chunks of 16 bytes
734816 bytes sent

Why have both client and server been brought to a halt? The answer is that the server's output buffer and the client's input buffer have both finally filled, and TCP has used its window adjustment protocol to signal this fact and stop the socket from sending more data that would have to be discarded and later re-sent.

Consider what happens as each block of data travels. The client sends it with sendall(). Then the server accepts it with recv(), processes it, and then transmits its capitalized version back out with another sendall() call. And then what? Well, nothing! The client is never running any recv() calls—not while it still has data to send—so more and more capitalized data backs up, until the operating system is not willing to accept any more.
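
For reference, here is a minimal sketch of the two loops at fault. It is not the chapter's tcp_deadlock.py listing itself, and the payload and connection details are placeholders, but it shows the shape of the problem: the client loop contains no recv() at all.

import socket

def deadlocking_client(host, port, bytecount):
    # Pour data into the socket 16 bytes at a time, never reading back
    # the capitalized copies that the server keeps sending in reply.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    sent = 0
    while sent < bytecount:
        sock.sendall(b'x' * 16)   # blocks once the outgoing buffers fill
        sent += 16
        print('%d bytes sent' % sent)

def server_loop(connection):
    # Echo each chunk back capitalized; this sendall() is the call that
    # blocks first, once the client's unread replies have piled up.
    while True:
        data = connection.recv(1024)
        if not data:
            break
        connection.sendall(data.upper())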

During the run shown previously, about 600KB was buffered by the operating system in the client's incoming queue before the network stack decided that it was full. At that point, the server blocks in its sendall() call, and is paused there by the operating system until the logjam clears and it can send more data. With the server no longer processing data or running any more recv() calls, it is now the client's turn to have data start backing up. The operating system seems to have placed a limit of around 130KB on the amount of data it would queue up in that direction, because the client got roughly another 130KB into producing the stream before finally being brought to a halt as well.

On a different system, you will probably find that different limits are reached. So the foregoing numbers are arbitrary and based on the mood of my laptop at the moment; they are not at all inherent in the way TCP works.

And the point of this example is to teach you two things—besides, of course, showing that recv(1024) indeed returns fewer bytes than 1,024 if a smaller number are immediately available!

First, this example should make much more concrete the idea that there are buffers sitting inside the TCP stacks on each end of a network connection. These buffers can hold data temporarily so that packets do not have to be dropped and eventually re-sent if they arrive at a moment that their reader does not happen to be inside of a recv() call. But the buffers are not limitless; eventually, a TCP routine trying to write data that is never being received or processed is going to find itself no longer able to write, until some of the data is finally read and the buffer starts to empty.
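
If you want to see that these per-socket buffers really exist, you can ask the operating system about them. This brief sketch queries the send and receive buffer sizes of a freshly created TCP socket; the numbers it prints vary by platform, just as the figures above did, and modern kernels may grow them automatically while a connection runs.

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print('SO_SNDBUF: %d bytes' % sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print('SO_RCVBUF: %d bytes' % sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
sock.close()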

Second, this example makes clear the dangers involved in protocols that do not alternate lock-step between the client requesting and the server acknowledging. If a protocol is not strict about the server reading a complete request until the client is done sending, and then sending a complete response in the other direction, then a situation like that created here can cause both of them to freeze without any recourse other than killing the program manually, and then rewriting it to improve its design!
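
One way to restore that lock-step discipline, sketched here as an illustration rather than as the book's own revised listing, is for the client to collect each chunk's capitalized echo in full before sending the next chunk, so that neither direction's buffers can ever back up. The helper name and payload are placeholders.

import socket

def lockstep_client(host, port, bytecount, chunk=16):
    # Strict alternation: one chunk out, then its full reply back in,
    # before the next chunk is sent.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    sent = 0
    while sent < bytecount:
        sock.sendall(b'x' * chunk)
        reply = b''
        while len(reply) < chunk:           # recv() may return fewer bytes
            more = sock.recv(chunk - len(reply))
            if not more:
                raise EOFError('socket closed before the full reply arrived')
            reply += more
        sent += chunk
    sock.close()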

But how, then, are network clients and servers supposed to process large amounts of data without entering deadlock? There are, in fact, two possible answers. First, they can use socket options to turn off blocking, so that calls like send() and recv() return immediately if they find that they cannot send any data yet. We will learn more about this option in Chapter 7, where we look in earnest at the possible ways to architect network server programs.
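
As a rough illustration of that first approach, and not the Chapter 7 material itself: after setblocking(False), a send() that cannot be accommodated raises an error flagged EWOULDBLOCK (or EAGAIN) instead of pausing the program. The address here is a placeholder.

import errno
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('127.0.0.1', 1060))       # placeholder host and port
sock.setblocking(False)                 # send() and recv() now never pause
try:
    sock.send(b'x' * 16)
except socket.error as e:
    if e.errno in (errno.EWOULDBLOCK, errno.EAGAIN):
        pass                            # outgoing buffer is full; try again later
    else:
        raise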

Second, the programs can use one of several techniques to process data from several inputs at a time, either by splitting into separate threads or processes—one tasked with sending data into a socket, perhaps, and another tasked with reading data back out—or by running operating system calls like select() or poll() that let them wait on busy outgoing and incoming sockets at the same time, and respond to whichever is ready.
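
To make that second idea concrete, here is a hedged sketch (the function name and parameters are mine, not from the book's listings) that uses select() to watch a single socket for both directions at once, reading replies whenever they arrive so that they can never pile up while data remains to be sent.

import select
import socket

def send_and_receive(sock, outgoing):
    # Watch the one socket for readability and writability at the same
    # time, and service whichever direction is ready on each pass.
    sock.setblocking(False)
    received = b''
    while outgoing:
        readable, writable, _ = select.select([sock], [sock], [])
        if readable:
            data = sock.recv(4096)
            if not data:
                break                    # the peer closed the connection
            received += data
        if writable:
            n = sock.send(outgoing[:16])
            outgoing = outgoing[n:]      # send() may accept only part of it
    # A real program would keep reading here until the last replies arrive.
    return received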

Finally, note carefully that the foregoing scenario cannot ever happen when you are using UDP! This is because UDP does not implement flow control. If more datagrams are arriving than can be processed, then UDP can simply discard some of them, and leave it up to the application to discover that they went missing.

