Foundations of Python Network Programming 978-1-4302-3004-5
CHAPTER 3 ■ TCP

$ python tcp_deadlock.py client 1073741824
Sending 1073741824 bytes of data, in chunks of 16 bytes
734816 bytes sent
Why have both client and server been brought to a halt?

The answer is that the server's output buffer and the client's input buffer have both finally filled, and TCP has used its window adjustment protocol to signal this fact and stop the socket from sending more data that would have to be discarded and later re-sent.
Consider what happens as each block of data travels. The client sends it with sendall(). Then the server accepts it with recv(), processes it, and then transmits its capitalized version back out with another sendall() call. And then what? Well, nothing! The client is never running any recv() calls, not while it still has data to send, so more and more capitalized data backs up until the operating system is not willing to accept any more.
During the run shown previously, about 600KB was buffered by the operating system in the client's incoming queue before the network stack decided that it was full. At that point, the server blocks in its sendall() call, and is paused there by the operating system until the logjam clears and it can send more data. With the server no longer processing data or running any more recv() calls, it is now the client's turn to have data start backing up. The operating system seems to have placed a limit of around 130KB on the amount of data it would queue up in that direction, because the client got roughly another 130KB into producing the stream before finally being brought to a halt as well.
On a different system, you will probably find that different limits are reached. So the foregoing numbers are arbitrary and based on the mood of my laptop at the moment; they are not at all inherent in the way TCP works.
And the point of this example is to teach you two things, besides, of course, showing that recv(1024) indeed returns fewer bytes than 1,024 if a smaller number are immediately available!
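You can see that short-read behavior in isolation with a small sketch of my own (not one of the book's listings), using socket.socketpair() as a stand-in for a real client/server connection:

```python
import socket

# A connected pair of sockets, standing in for a real client and server.
a, b = socket.socketpair()
b.sendall(b'hi')        # only two bytes are in flight

data = a.recv(1024)     # ask for up to 1,024 bytes...
print(len(data))        # ...but receive only the 2 that were available

a.close()
b.close()
```

The call does not wait around hoping that 1,022 more bytes will arrive; it returns whatever the incoming buffer already holds.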
First, this example should make much more concrete the idea that there are buffers sitting inside the TCP stacks on each end of a network connection. These buffers can hold data temporarily so that packets do not have to be dropped and eventually re-sent if they arrive at a moment when their reader does not happen to be inside of a recv() call. But the buffers are not limitless; eventually, a TCP routine trying to write data that is never being received or processed is going to find itself no longer able to write, until some of the data is finally read and the buffer starts to empty.
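You can watch those buffers fill without writing a whole client and server. The following sketch (my own illustration, with hypothetical names) builds a TCP connection across the loopback interface, never reads from one end, and keeps writing until a one-second timeout on send() proves that the kernel's buffers are full:

```python
import socket

# Build a connected TCP pair over the loopback interface; nothing
# ever reads from `receiver`, so the buffers can only fill up.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
sender = socket.create_connection(listener.getsockname())
receiver, _ = listener.accept()
listener.close()

sender.settimeout(1.0)  # give up instead of blocking forever
total = 0
try:
    while True:
        total += sender.send(b'x' * 1024)
except socket.timeout:
    pass

print('the kernel buffered %d bytes before send() stalled' % total)
sender.close()
receiver.close()
```

The exact count you see depends, as noted above, on your operating system's buffer sizes, not on anything inherent in TCP.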
Second, this example makes clear the dangers involved in protocols that do not alternate lock-step between the client requesting and the server acknowledging. If a protocol is not strict about the server reading a complete request until the client is done sending, and then sending a complete response in the other direction, then a situation like that created here can cause both of them to freeze without any recourse other than killing the program manually, and then rewriting it to improve its design!
But how, then, are network clients and servers supposed to process large amounts of data without entering deadlock? There are, in fact, two possible answers. Either they can use socket options to turn off blocking, so that calls like send() and recv() return immediately if they find that they cannot send any data yet. We will learn more about this option in Chapter 7, where we look in earnest at the possible ways to architect network server programs.
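As a quick sketch of that first option (not Chapter 7's actual code), a socket placed in non-blocking mode with setblocking(False) raises BlockingIOError instead of pausing when no data is ready:

```python
import select
import socket

a, b = socket.socketpair()   # stand-in for a real TCP connection
a.setblocking(False)         # make a's calls return immediately

blocked_first = False
try:
    a.recv(1024)             # nothing has arrived yet...
except BlockingIOError:
    blocked_first = True     # ...so the call refuses to wait

b.sendall(b'hello')
select.select([a], [], [], 1.0)   # pause until data has actually arrived
data = a.recv(1024)               # now recv() succeeds at once
print(blocked_first, data)

a.close()
b.close()
```

The program is then free to do other work between attempts, instead of sitting inside a blocked call.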
Or, the programs can use one of several techniques to process data from several inputs at a time, either by splitting into separate threads or processes (one tasked with sending data into a socket, perhaps, and another tasked with reading data back out) or by running operating system calls like select() or poll() that let them wait on busy outgoing and incoming sockets at the same time, and respond to whichever is ready.
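Here is a tiny sketch of the select() technique, again using a socketpair as a stand-in for two connected network peers:

```python
import select
import socket

a, b = socket.socketpair()   # stand-in for two connected TCP peers
b.sendall(b'ping')           # data is now waiting at socket `a`

# Instead of blocking inside recv() on one particular socket, ask the
# operating system which of several sockets is ready, then service it.
readable, writable, _ = select.select([a, b], [a, b], [], 1.0)
message = None
for sock in readable:
    message = sock.recv(1024)
    print('ready to read:', message)

a.close()
b.close()
```

Because select() watches every socket at once, a program built this way can keep sending on one connection even while it waits for data to arrive on another, which is exactly what the deadlocked client above failed to do.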
Finally, note carefully that the foregoing scenario cannot ever happen when you are using UDP! This is because UDP does not implement flow control. If more datagrams are arriving than can be processed, then UDP can simply discard some of them, and leave it up to the application to discover that they went missing.