13.07.2015 Views

Software Engineering for Internet Applications - Student Community


useful for the particular Web application that you're building, keep it in the Web server as a shared procedure.

7.3 Documentation

"As we enter the 21st century we find that rifle marksmanship has been largely lost in the military establishments of the world. The notion that technology can supplant incompetence is upon us in all sorts of endeavors, including that of shooting."
-- Jeff Cooper in The Art of the Rifle (1997; Paladin Press)

Given a system with 1000 procedures and no documentation, the typical manager will lay down an edict to the programmers: you must write a "doc string" for every procedure saying what inputs it takes, what outputs it generates, and how it transforms those inputs into outputs. Virtually every programming environment going back to the 1960s has support for this kind of thinking. The fancier "doc string" systems will even parse through directories of source code, extract the doc strings, and print a nice-looking manual of 1000 doc strings.

How useful are doc strings? Useful but not sufficient. The programmer new to a system won't have any idea which of the 1000 procedures and corresponding doc strings are most important. The new programmer won't have any idea why these procedures were built, what problem they solve, whether the whole system has been deprecated in favor of newer software from another source. Certainly the 1000 doc strings aren't going to win over any programmers to adopting a piece of software. It is much more important to present clear English prose that demonstrates the quality of your thinking and design work in attacking a real problem. The prose does not have to be more than a few pages long but it needs to be carefully crafted.

7.4 Separating the Designers and the Programmers

Criticism and requests for changes will come in proportion to the number of people who understand that part of the system being criticized. Very few people are capable of data modeling or interaction design. Although these are the only parts of the system that deeply affect the user experience or the utility of an information system to its operators, you will thus very seldom be required to entertain a suggestion in this area. Only someone with years of relevant experience is likely to propose that a column be added to an SQL table or that five tables can be replaced with three tables. A much larger number of people are capable of writing Web scripts. So you'll sometimes be derided for your choice of programming environment, regardless of what it is or how state of the art it was supposed to be at the time you adopted it. Virtually every human

11.3.2 Load Balancing Among the Front-End Machines

Circa 1995 a popular strategy for high-volume Web sites was round-robin DNS. Each front-end machine was assigned a unique publicly routable IP address. The Domain Name System (DNS) server for the Web site was programmed to give different answers when asked for a translation of the Web server's hostname. For example, www.cnn.com was using round-robin DNS. They had a central NFS file server containing the content of the site and a rack of small front-end machines, each of which was a Web server and an NFS client. This architecture enabled CNN to update their site consistently by touching only one machine, i.e., the central NFS server.

How was the CNN system experienced by users? When a user at the MIT Laboratory for Computer Science requested http://www.cnn.com/TECH/, his or her desktop machine would ask the local name server for a translation of the hostname www.cnn.com into a 32-bit IP address. (Remember that all Internet communication is machine-to-machine and requires numeric IP addresses; alphanumeric hostnames such as "www.amazon.com" or "web.mit.edu" are used only for user interface.) The MIT name server would contact the InterNIC registry to learn the IP addresses of the name servers for the cnn.com domain. The MIT name server would then contact CNN's name servers and learn that "www.cnn.com" was available at the IP address 207.25.71.5. Subsequent users within the same subnetwork at MIT would, for a period of time designated by CNN, get the same answer of 207.25.71.5 without the MIT name server going back to the CNN name servers.

Where is the load balancing in this system? Suppose that a user at environmentaldefense.org requested http://www.cnn.com/HEALTH/. Environmental Defense's name server would also contact CNN name servers to learn the translation of "www.cnn.com". This time, however, the CNN server would provide a different answer: 207.25.71.20, leading that user, and subsequent users within environmentaldefense.org's network, to a different front-end server than the machine providing pages to users at MIT.

Round-robin DNS is not a very popular load balancing method today. For one thing, it is not very balanced. Suppose that the CNN name server tells America Online's name server that www.cnn.com is reachable at 207.25.71.29. AOL is perfectly free to provide that translation to all 30+ million of its customers. Another problem with round-robin DNS is the impact on users when a front-end machine dies. If the box at 207.25.71.29 were to fail, none of AOL's customers would be able to reach www.cnn.com until the expiration time on the
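
The round-robin scheme described in this section, and the reason it balances poorly, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class names RoundRobinDNS and CachingResolver, the simulate function, the one-hour expiration time, and the user counts are all hypothetical and chosen for illustration. Only the three IP addresses come from the text. It is not a description of how CNN's name servers were actually implemented.

```python
from collections import Counter
from itertools import cycle


class RoundRobinDNS:
    """Hypothetical authoritative name server that rotates through front-end IPs."""

    def __init__(self, addresses, ttl):
        self._rotation = cycle(addresses)
        self.ttl = ttl  # seconds a downstream resolver may cache an answer

    def resolve(self):
        # Each new query receives the next address in the rotation.
        return next(self._rotation)


class CachingResolver:
    """Hypothetical site-local name server (one per campus or ISP)."""

    def __init__(self, authoritative):
        self.authoritative = authoritative
        self._cached = None
        self._expires = 0.0

    def lookup(self, now):
        # Go back to the authoritative server only when the cache is empty or expired.
        if self._cached is None or now >= self._expires:
            self._cached = self.authoritative.resolve()
            self._expires = now + self.authoritative.ttl
        return self._cached


def simulate(user_counts, addresses, ttl=3600):
    """Count requests per front-end IP when every user makes one request at time 0."""
    dns = RoundRobinDNS(addresses, ttl)
    load = Counter({ip: 0 for ip in addresses})
    for users in user_counts.values():
        resolver = CachingResolver(dns)  # one caching resolver per site
        for _ in range(users):
            load[resolver.lookup(now=0.0)] += 1
    return load


if __name__ == "__main__":
    # Made-up, scaled-down user counts; not measurements of real traffic.
    sites = {"mit": 100, "environmentaldefense": 50, "aol": 30_000}
    front_ends = ["207.25.71.5", "207.25.71.20", "207.25.71.29"]
    for ip, requests in simulate(sites, front_ends).items():
        print(ip, requests)
```

With these made-up counts, the rotation is perfectly even at the level of name servers (one answer each), yet the third front-end machine receives all 30,000 of the stand-in AOL requests. Because the load is distributed per caching resolver rather than per user, one large resolver can swamp a single machine.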
