useful for the particular Web application that you're building, keep it in the Web server as a shared procedure.

7.3 Documentation

"As we enter the 21st century we find that rifle marksmanship has been largely lost in the military establishments of the world. The notion that technology can supplant incompetence is upon us in all sorts of endeavors, including that of shooting."
-- Jeff Cooper in The Art of the Rifle (1997; Paladin Press)

Given a system with 1000 procedures and no documentation, the typical manager will lay down an edict to the programmers: you must write a "doc string" for every procedure saying what inputs it takes, what outputs it generates, and how it transforms those inputs into outputs. Virtually every programming environment going back to the 1960s has support for this kind of thinking. The fancier "doc string" systems will even parse through directories of source code, extract the doc strings, and print a nice-looking manual of 1000 doc strings.

How useful are doc strings? Useful but not sufficient. The programmer new to a system won't have any idea which of the 1000 procedures and corresponding doc strings are most important. The new programmer won't have any idea why these procedures were built, what problem they solve, or whether the whole system has been deprecated in favor of newer software from another source. Certainly the 1000 doc strings aren't going to win over any programmers to adopting a piece of software. It is much more important to present clear English prose that demonstrates the quality of your thinking and design work in attacking a real problem. The prose does not have to be more than a few pages long but it needs to be carefully crafted.

7.4 Separating the Designers and the Programmers

Criticism and requests for changes will come in proportion to the number of people who understand that part of the system being criticized.
Very few people are capable of data modeling or interaction design. Although these are the only parts of the system that deeply affect the user experience or the utility of an information system to its operators, you will thus very seldom be required to entertain a suggestion in this area. Only someone with years of relevant experience is likely to propose that a column be added to an SQL table or that five tables can be replaced with three tables. A much larger number of people are capable of writing Web scripts. So you'll sometimes be derided for your choice of programming environment, regardless of what it is or how state of the art it was supposed to be at the time you adopted it. Virtually every human

11.3.2 Load Balancing Among the Front-End Machines

Circa 1995 a popular strategy for high-volume Web sites was round-robin DNS. Each front-end machine was assigned a unique publicly routable IP address. The Domain Name System (DNS) server for the Web site was programmed to give different answers when asked for a translation of the Web server's hostname. For example, www.cnn.com was using round-robin DNS. They had a central NFS file server containing the content of the site and a rack of small front-end machines, each of which was a Web server and an NFS client. This architecture enabled CNN to update their site consistently by touching only one machine, i.e., the central NFS server.

How was the CNN system experienced by users? When a user at the MIT Laboratory for Computer Science requested http://www.cnn.com/TECH/, his or her desktop machine would ask the local name server for a translation of the hostname www.cnn.com into a 32-bit IP address. (Remember that all Internet communication is machine-to-machine and requires numeric IP addresses; alphanumeric hostnames such as "www.amazon.com" or "web.mit.edu" are used only for user interface.)
The MIT name server would contact the InterNIC registry to learn the IP addresses of the name servers for the cnn.com domain. The MIT name server would then contact CNN's name servers and learn that "www.cnn.com" was available at the IP address 207.25.71.5. Subsequent users within the same subnetwork at MIT would, for a period of time designated by CNN, get the same answer of 207.25.71.5 without the MIT name server going back to the CNN name servers.

Where is the load balancing in this system? Suppose that a user at environmentaldefense.org requested http://www.cnn.com/HEALTH/. Environmental Defense's name server would also contact CNN name servers to learn the translation of "www.cnn.com". This time, however, the CNN server would provide a different answer: 207.25.71.20, leading that user, and subsequent users within environmentaldefense.org's network, to a different front-end server than the machine providing pages to users at MIT.

Round-robin DNS is not a very popular load balancing method today. For one thing, it is not very balanced. Suppose that the CNN name server tells America Online's name server that www.cnn.com is reachable at 207.25.71.29. AOL is perfectly free to provide that translation to all 30+ million of its customers. Another problem with round-robin DNS is the impact on users when a front-end machine dies. If the box at 207.25.71.29 were to fail, none of AOL's customers would be able to reach www.cnn.com until the expiration time on the
It seems reasonable to expect that hardware engineers will continue to deliver substantial performance improvements and that fashions in software development and business complexity will continue to rob users of any enjoyment of those improvements. So stick to 10 requests per second per CPU until you've got your own application-specific benchmarks that demonstrate otherwise.

11.3 Load Balancing

As noted earlier in this chapter, an Internet service with 100 CPUs spread among 15 physical computers isn't going to be very reliable if all 100 CPUs must be working for the overall service to function. We need to develop a strategy for load balancing so that (1) user requests are divided more or less evenly among the available CPUs, (2) when a piece of hardware fails it doesn't result in too many errors returned to users, and (3) we can reconfigure hardware and network without breaking users' bookmarks and links from other sites.

We will start by positing a two-tier server farm with a single multi-CPU machine running the RDBMS and multiple single-CPU front-end machines, each of which runs the Web server program, interprets page scripts, performs SSL encryption, and generally does any computation not being performed within the RDBMS.

**** insert drawing of our example server farm ****

Figure 1: A typical server configuration for a medium-to-high volume Internet application. A powerful multi-CPU server supports the relational database management system. Multiple small 1-CPU machines run the HTTP server program.

11.3.1 Load Balancing In the Persistence Layer

Our persistence layer is the multi-CPU computer running the RDBMS. The RDBMS itself is typically a multi-process or multi-threaded application. For each database client the RDBMS spawns a separate process or thread. In this case each front-end machine presents itself to the RDBMS as one or more database clients.
If we assume that the load of user requests is spread among the front-end machines, the load of database work will be spread among the multiple CPUs of the RDBMS server by the operating system process or thread scheduler.

being on the planet, however, understands that mauve looks different from fuchsia and that Helvetica looks different from Times Roman. Thus the largest number of suggestions for changes to a Web application will be design-related. Someone wants to add a new logo to every page on the site. Someone wants to change the background color in the discussion forum section. Someone wants to make a headline larger on a particular page. Someone wants to add a bit of whitespace here and there.

Suppose that you've built your Web application in the simplest and most direct manner. For each URL there is a corresponding script, which contains SQL statements, some procedural code in the scripting language (IF statements, basically), and static strings of HTML that will be combined with the values returned from the database to form the completed page. If you break down what is inside a Visual Basic Active Server Page or a Java Server Page or a Perl CGI script, you always find these three items: SQL, IF statements, HTML.

Development of an application with this style of programming is easy. You can see all the relevant code for a page in one text editor buffer. Maintenance is also straightforward. If a user sends in a bug report saying "There is a spelling error on http://www.yourcommunity.org/foo/bar" you know that you need only look in one file in the file system (/foo/bar.asp or /foo/bar.jsp or /foo/bar.pl or whatever) and you are guaranteed to find the source of the user's problem. This goes for SQL and procedural programming errors as well.

What if people want site-wide changes to fonts, colors, headers and footers? This could be easy or hard depending on how you've crafted the system.
Suppose that default colors are read from a configuration parameter system and headers, footers, and per-page navigation aids are generated by the page script calling shared procedures. In this happy circumstance making site-wide changes might take only a few minutes.

What if people want to change the wording of some annotation in the static HTML for a page? Or make a particular headline on one page larger? Or add a bit of white space in one place on one page? This will require a programmer because the static HTML strings associated with that page are embedded in a file that contains SQL and procedural language code. You don't want someone to bring a section of the service down because of a botched attempt to fix a typo or add a hint.
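
To make the "happy circumstance" concrete, here is a minimal sketch in Python of a page script that reads its colors from a configuration parameter table and calls shared procedures for the header and footer. Everything named here (SITE_CONFIG, page_header, page_footer, render_bar_page) is a hypothetical illustration rather than part of any particular Web server API, and the rows argument stands in for values that a real script would fetch with SQL.

```python
from html import escape

# Hypothetical site-wide configuration parameters; the names are illustrative.
SITE_CONFIG = {
    "background_color": "white",
    "text_color": "black",
    "site_name": "yourcommunity.org",
}


def page_header(title: str) -> str:
    """Shared procedure: every page script calls this for its opening HTML."""
    return (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f'<body bgcolor="{SITE_CONFIG["background_color"]}" '
        f'text="{SITE_CONFIG["text_color"]}">\n'
        f"<h2>{escape(title)}</h2>\n"
    )


def page_footer() -> str:
    """Shared procedure: every page script calls this for its closing HTML."""
    site_name = escape(SITE_CONFIG["site_name"])
    return f"<hr>\n<address>{site_name}</address>\n</body></html>\n"


def render_bar_page(rows: list[str]) -> str:
    """Page script for a single URL: procedural code plus static HTML.

    `rows` stands in for values returned from the database.
    """
    if rows:
        items = "".join(f"<li>{escape(row)}</li>\n" for row in rows)
        body = f"<ul>\n{items}</ul>\n"
    else:
        # This static string still lives inside the script, so changing
        # its wording means editing a file that also contains code.
        body = "<p>Nothing to show yet.</p>\n"
    return page_header("Bar") + body + page_footer()
```

Changing SITE_CONFIG["background_color"] alters every page that calls page_header, which is the few-minute site-wide change. The "Nothing to show yet." string, by contrast, is embedded in the page script, so rewording it still means touching a file full of procedural code.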