Load Balancing and Proxies
Event-driven servers take a single process and thread of control and make it serve as many clients as it possibly can; once every moment of its time is being spent on clients that are ready for data, a process really can do no more. But what if one thread of control is simply not enough for the load your network service needs to meet?
The answer, obviously, is to run several instances of your service and to distribute clients among them. This requires a key piece of software: a load balancer that runs on the port to which all of the clients will be connecting, and which then turns around and gives each of the running instances of your service the data being sent by some fraction of the incoming clients. The load balancer thus serves as a proxy: to network clients it looks like your server, but to your server it looks like a client, and often neither side knows the proxy is even there.
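To make the idea concrete, here is a minimal sketch of a round-robin TCP proxy built with nothing but the Standard Library. The back-end address list and the listening port are invented for illustration, and a real balancer would add error handling, health checks, and connection cleanup; but even this toy version is invisible to both sides in exactly the way just described.

import itertools
import socket
import threading

BACKENDS = [('127.0.0.1', 8001), ('127.0.0.1', 8002)]  # hypothetical back ends
next_backend = itertools.cycle(BACKENDS)

def pipe(source, destination):
    # Copy bytes in one direction until the sender closes its end.
    while True:
        data = source.recv(4096)
        if not data:
            break
        destination.sendall(data)
    try:
        destination.shutdown(socket.SHUT_WR)
    except socket.error:
        pass

def handle(client):
    # Pick the next back end and splice the two sockets together,
    # running one thread for each direction of traffic.
    backend = socket.create_connection(next(next_backend))
    threading.Thread(target=pipe, args=(client, backend)).start()
    threading.Thread(target=pipe, args=(backend, client)).start()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(('', 9000))
listener.listen(20)
while True:
    client, address = listener.accept()
    handle(client)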
Load balancers are such critical pieces of infrastructure that they are often built directly into network hardware, like that sold by Cisco, Barracuda, and f5. On a normal Linux system, you can run software like HAProxy or delve into the operating system’s firewall rules and construct quite efficient load balancing using the Linux Virtual Server (LVS) subsystem.
In the old days, it was common to spread load by simply giving a single domain name several different IP addresses; clients looking up the name would be spread randomly across the various server machines. The problem with this, of course, is that clients suffer when the server to which they are assigned goes down; modern load balancers, by contrast, can often recover when a back-end server goes down by moving its live connections over to another server without the client even knowing.
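You can watch this older mechanism at work from Python: getaddrinfo() returns every address record attached to a name, and a round-robin DNS deployment simply publishes several. The hostname below is only an example; whether you actually see more than one address depends entirely on how the name’s owner has configured it.

import socket

infolist = socket.getaddrinfo('www.python.org', 80, 0, socket.SOCK_STREAM)
for family, socktype, proto, canonname, sockaddr in infolist:
    print(sockaddr)  # each (address, port) pair is a separate DNS record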
The one area in which DNS has retained its foothold as a load-balancing mechanism is geography. The largest service providers on the Internet often resolve hostnames to different IP addresses depending on the continent, country, and region from which a particular client request originates. This allows them to direct traffic to server rooms that are within a few hundred miles of each customer, rather than requiring their connections to cross the long and busy data links between continents.
So why am I mentioning all of these possibilities before tackling the ways that you can move beyond a single thread of control on a single machine with threads and processes?
The answer is that I believe load balancing should be considered up front in the design of any network service because it is the only approach that really scales. True, you can buy servers these days with more than a dozen cores, mounted in machines that support massive network channels; but if, someday, your service finally outgrows a single box, then you will wind up doing load balancing. And if load balancing can help you distribute load between entirely different machines, why not also use it to help you keep several copies of your server active on the same machine?
Threading and forking, it turns out, are merely limited special cases of load balancing. They take advantage of the fact that the operating system will load-balance incoming connections among all of the threads or processes that are running accept() against a particular socket. But if you are going to have to run a separate load balancer in front of your service anyway, then why go to the trouble of threading or forking on each individual machine? Why not just run 20 copies of your simple single-threaded server on 20 different ports, and then list them in the load balancer’s configuration?
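A few lines of code are enough to watch the kernel do this balancing itself. The sketch below (the port number is arbitrary) starts four threads that all block in accept() on one listening socket; connect to it several times with telnet or netcat and you will see successive connections answered by different workers.

import socket
import threading

def worker(listener, number):
    # All four threads sit in accept(); the operating system wakes
    # exactly one of them for each incoming connection.
    while True:
        client, address = listener.accept()
        client.sendall(b'Hello from worker %d\r\n' % number)
        client.close()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(('127.0.0.1', 1060))
listener.listen(5)
for n in range(4):
    threading.Thread(target=worker, args=(listener, n)).start()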
Of course, you might know ahead of time that your service will never expand to run on several machines, and might want the simplicity of running a single piece of software that can by itself use several processor cores effectively to answer client requests. But you should keep in mind that a multi-threaded or multi-process application is, within a single piece of software, doing what might more cleanly be done by configuring a proxy standing outside your server code.
Threading and Multi-processing<br />
The essential idea of a threaded or multi-process server is that we take the simple and straightforward server that we started out with—the one way back in Listing 7–2, the one that waits repeatedly on a