CHAPTER 7 ■ SERVER ARCHITECTURE

Load Balancing and Proxies

Event-driven servers take a single process and thread of control and make it serve as many clients as it possibly can; once every moment of its time is being spent on clients that are ready for data, a process really can do no more. But what if one thread of control is simply not enough for the load your network service needs to meet?

The answer, obviously, is to run several instances of your service and to distribute clients among them. This requires a key piece of software: a load balancer that runs on the port to which all of the clients will be connecting, and which then turns around and gives each of the running instances of your service the data being sent by some fraction of the incoming clients. The load balancer thus serves as a proxy: to network clients it looks like your server, but to your server it looks like a client, and often neither side knows the proxy is even there.
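
To make the proxy idea concrete, here is a minimal sketch, not taken from the book, of a round-robin TCP load balancer in Python: it accepts clients on a front-end port and shuttles bytes between each client and one of several back-end server addresses. The port numbers and the two-entry back-end list are assumptions made for illustration.

import itertools, socket, threading

BACKENDS = [('127.0.0.1', 8001), ('127.0.0.1', 8002)]  # assumed back-end instances
next_backend = itertools.cycle(BACKENDS)

def pipe(source, destination):
    # Copy bytes one way until the source closes or errors out, then close the far side.
    try:
        while True:
            data = source.recv(4096)
            if not data:
                break
            destination.sendall(data)
    except OSError:
        pass
    finally:
        destination.close()

def handle(client):
    # Pick the next back end and shuttle data in both directions.
    backend = socket.create_connection(next(next_backend))
    threading.Thread(target=pipe, args=(client, backend)).start()
    threading.Thread(target=pipe, args=(backend, client)).start()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(('', 8000))
listener.listen(5)
while True:
    client, address = listener.accept()
    handle(client)

A production proxy such as HAProxy does far more than this, of course: it watches back-end health, times out dead connections, and uses an event loop or a small worker pool rather than two threads per client.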

Load balancers are such critical pieces of infrastructure that they are often built directly into network hardware, like that sold by Cisco, Barracuda, and f5. On a normal Linux system, you can run software like HAProxy or delve into the operating system’s firewall rules and construct quite efficient load balancing using the Linux Virtual Server (LVS) subsystem.

In the old days, it was common to spread load by simply giving a single domain name several different IP addresses; clients looking up the name would be spread randomly across the various server machines. The problem with this, of course, is that clients suffer when the server to which they are assigned goes down; modern load balancers, by contrast, can often recover when a back-end server goes down by moving its live connections over to another server without the client even knowing.
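
The old DNS trick is easy to observe from Python: a name that publishes several address records comes back from the resolver as several candidate endpoints, and a naive client simply picks one of them. The hostname below is only a placeholder.

import random, socket

# getaddrinfo() returns one entry per address record that the name publishes.
infos = socket.getaddrinfo('www.example.com', 80, proto=socket.IPPROTO_TCP)
addresses = sorted({info[4][:2] for info in infos})
print(addresses)

# A client doing the old-style "balancing" itself just connects to one at random.
connection = socket.create_connection(random.choice(addresses))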

The one area in which DNS has retained its foothold as a load-balancing mechanism is geography. The largest service providers on the Internet often resolve hostnames to different IP addresses depending on the continent, country, and region from which a particular client request originates. This allows them to direct traffic to server rooms that are within a few hundred miles of each customer, rather than requiring their connections to cross the long and busy data links between continents.

So why am I mentioning all of these possibilities before tackling the ways that you can move beyond a single thread of control on a single machine with threads and processes?

The answer is that I believe load balancing should be considered up front in the design of any network service because it is the only approach that really scales. True, you can buy servers these days with more than a dozen cores, mounted in machines that support massive network channels; but if, someday, your service finally outgrows a single box, then you will wind up doing load balancing. And if load balancing can help you distribute load between entirely different machines, why not also use it to help you keep several copies of your server active on the same machine?

Threading and forking, it turns out, are merely limited special cases of load balancing. They take advantage of the fact that the operating system will load-balance incoming connections among all of the threads or processes that are running accept() against a particular socket. But if you are going to have to run a separate load balancer in front of your service anyway, then why go to the trouble of threading or forking on each individual machine? Why not just run 20 copies of your simple single-threaded server on 20 different ports, and then list them in the load balancer’s configuration?
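
That kernel behavior, with several processes all blocking in accept() on one listening socket, is easy to sketch. The following forking example is illustrative rather than one of this chapter's listings, and the port number and worker count are arbitrary.

import os, socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(('', 1060))
listener.listen(5)

# Fork several workers; each inherits the listening socket and blocks in
# accept(), and the kernel hands every new connection to exactly one of them.
for i in range(4):
    if os.fork() == 0:          # child process: serve connections forever
        while True:
            connection, address = listener.accept()
            connection.sendall(b'handled by worker %d\n' % os.getpid())
            connection.close()

os.wait()   # the parent simply waits; the four workers never exit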

Of course, you might know ahead of time that your service will never expand to run on several machines, and might want the simplicity of running a single piece of software that can by itself use several processor cores effectively to answer client requests. But you should keep in mind that a multi-threaded or multi-process application is, within a single piece of software, doing what might more cleanly be done by configuring a proxy standing outside your server code.

Threading and Multi-processing

The essential idea of a threaded or multi-process server is that we take the simple and straightforward server that we started out with—the one way back in Listing 7–2, the one that waits repeatedly on a

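
Listing 7–2 itself is not reproduced in this excerpt, but the general shape of a threaded version can be sketched with the standard library's socketserver module (Python 3 naming); the echo-style handler and the port number here are assumptions for illustration, not the book's own code.

import socketserver

class Handler(socketserver.BaseRequestHandler):
    def handle(self):
        # Echo back whatever the client sends, one request per connection.
        data = self.request.recv(1024)
        self.request.sendall(data)

# ThreadingTCPServer spawns a new thread for every accepted connection,
# so one slow client no longer blocks the rest.
server = socketserver.ThreadingTCPServer(('', 1060), Handler)
server.serve_forever()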
