
Foundations of Python Network Programming 978-1-4302-3004-5


CHAPTER 3

TCP

The Transmission Control Protocol (TCP) is the workhorse of the Internet. First defined in 1974, it lets applications send one another streams of data that, if they arrive at all (that is, unless a connection dies because of a network problem), are guaranteed to arrive intact, in order, and without duplication.

Protocols that carry documents and files nearly always ride atop TCP, including HTTP and all the major ways of transmitting e-mail. It is also the foundation of choice for protocols that carry on long conversations between people or computers, like SSH and many popular chat protocols.

When the Internet was younger, it was sometimes possible to squeeze a little more performance out of a network by building your application atop UDP and choosing the size and timing of each individual packet yourself. But modern TCP implementations tend to be very smart, having benefited from more than 30 years of improvement, innovation, and research, and these days even very performance-critical applications like message queues (Chapter 8) often choose TCP as their medium.

How TCP Works

As we learned in Chapter 2, real networks are fickle things that sometimes drop the packets you transmit across them, occasionally create extra copies of a packet instead, and are also known to deliver packets out of order. With a bare-packet facility like UDP, your own application code has to worry about whether messages arrived, and have a plan for recovering if they did not. But with TCP, the packets themselves are hidden and your application can simply stream data toward its destination, confident that it will be re-transmitted until it finally arrives.
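
To make the contrast with UDP concrete, here is a minimal sketch of streaming bytes over a TCP connection on the loopback interface. The helper names `receive_all` and `stream_over_loopback` are invented for this illustration, and the example assumes the machine allows local TCP connections. Notice that the code never mentions packets, ordering, or retries.

```python
import socket
import threading

def receive_all(listener, result):
    """Accept one connection and read until the peer closes it."""
    conn, _ = listener.accept()
    with conn:
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:          # an empty read means the sender closed the stream
                break
            chunks.append(chunk)
    result.append(b"".join(chunks))

def stream_over_loopback(payload):
    """Send payload through a loopback TCP connection and return what arrived."""
    result = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))    # port 0 lets the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=receive_all, args=(listener, result))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(payload)        # the stack decides how to packetize this
        worker.join()
    return result[0]

payload = bytes(range(256)) * 400          # 102,400 bytes, far more than one packet
received = stream_over_loopback(payload)
```

The receiver may see the data in chunks of any size, but joined together they should match what was sent, byte for byte.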

The classic definition of TCP is RFC 793 from 1981, though many subsequent RFCs have detailed extensions and improvements.

How does TCP provide a reliable connection? It starts by combining two mechanisms that we discussed in Chapter 2. There, we had to implement them ourselves because we were using UDP. But with TCP they come built in, and are performed by the operating system’s network stack without your application even being involved.

First, every packet is given a sequence number, so that the system on the receiving end can put the packets back together in the right order, notice any that are missing from the sequence, and ask that they be re-transmitted.
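
The idea can be sketched in a few lines of Python. This is only an illustration of what sequence numbers make possible, not how a real network stack is written; the function name `reorder` and the simple 1, 2, 3 numbering are assumptions made for the example.

```python
def reorder(received, expected_count):
    """Return (ordered_payloads, missing_numbers) for packets numbered 1..expected_count."""
    by_number = {}
    for number, payload in received:
        by_number.setdefault(number, payload)   # keep the first copy, ignore duplicates
    missing = [n for n in range(1, expected_count + 1) if n not in by_number]
    ordered = [by_number[n] for n in sorted(by_number)]
    return ordered, missing

# Packets arrive out of order, packet 3 arrives twice, and packet 2 never arrives.
arrivals = [(3, b"C"), (1, b"A"), (3, b"C"), (4, b"D")]
ordered, missing = reorder(arrivals, expected_count=4)
```

With the numbers attached, the receiver can restore the original order, discard the duplicate, and see that packet 2 is the one to ask for again.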

Instead of using sequential integers (1, 2, …) to mark packets, TCP uses a counter that counts the number of bytes transmitted. So a 1,024-byte packet with a sequence number of 7,200 would be followed by a packet with a sequence number of 8,224. This means that a busy network stack does not have to remember how it broke a data stream up into packets; if asked for a re-transmission, it can break the stream up into packets some other way (which might let it fit more data into a packet if more bytes are now waiting for transmission), and the receiver can still put the packets back together.
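
The following sketch shows why byte-based numbering makes re-segmentation harmless. The helpers `segment` and `reassemble` are hypothetical names used only here, and the example ignores details a real stack must handle, such as sequence numbers wrapping around.

```python
def segment(stream, first_seq, size):
    """Split stream into (sequence_number, payload) pairs of at most size bytes."""
    return [(first_seq + offset, stream[offset:offset + size])
            for offset in range(0, len(stream), size)]

def reassemble(segments, first_seq, total_length):
    """Place each payload at the byte offset that its sequence number implies."""
    buffer = bytearray(total_length)
    for seq, payload in segments:
        start = seq - first_seq
        buffer[start:start + len(payload)] = payload
    return bytes(buffer)

stream = bytes(range(256)) * 12                    # 3,072 bytes of data
first_try = segment(stream, 7200, 1024)            # three 1,024-byte packets
sequence_numbers = [seq for seq, _ in first_try]

# Suppose only the first packet arrived. The sender re-transmits the rest,
# this time cutting the remaining bytes into larger packets.
resend = segment(stream[1024:], 8224, 1536)
arrived = [first_try[0]] + list(reversed(resend))  # and they arrive out of order
rebuilt = reassemble(arrived, 7200, len(stream))
```

Because each sequence number is a byte position rather than a packet count, the receiver needs no knowledge of how either transmission was divided.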

The initial sequence number, in good TCP implementations, is chosen randomly so villains cannot assume that every connection starts at byte zero and easily craft forged packets by guessing how far a transmission that they want to interrupt has proceeded.
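
As a rough illustration, an unpredictable starting point could be chosen as shown below. In practice the operating system picks the initial sequence number, not your application; the function names are invented for this sketch, which relies only on the fact that TCP sequence numbers are 32-bit values that wrap around.

```python
import secrets

SEQUENCE_SPACE = 2 ** 32    # TCP sequence numbers are 32-bit values

def pick_initial_sequence_number():
    """Choose an unpredictable starting point somewhere in the sequence space."""
    return secrets.randbelow(SEQUENCE_SPACE)

def next_sequence_number(seq, payload_length):
    """Advance by the number of bytes sent, wrapping around at 2**32."""
    return (seq + payload_length) % SEQUENCE_SPACE

isn = pick_initial_sequence_number()
after_first_packet = next_sequence_number(isn, 1024)
```

An attacker who wants to forge a packet now has to guess a value from roughly four billion possibilities instead of counting up from zero.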
