it, then copy the dev code over to the production directory and restart.

What's wrong with the two-server plan? Nothing if the development and testing teams are the same, in which case there is no possibility of simultaneous development and testing. For a complex site, however, the publisher may wish to spend a week testing before launching a revision. It isn't acceptable to idle authors and developers while a handful of testers bangs away at the development server.

The addition of a staging server, rooted at /web/foobar-staging/ (Server 3), allows development to proceed while testers are preparing for the public launch of a new version.

Here's how the three servers are used:

1. developers work continuously in /web/foobar-dev/
2. when the publisher is mostly happy with the development site, a named version or branch is created and installed at /web/foobar-staging/
3. the testers bang away at the /web/foobar-staging/ server, checking fixes back into the version control repository but only into the staging branch
4. when the testers and publishers sign off on the staging server's performance, the site is released to /web/foobar/ (production)
5. any fixes made to the staging branch of the code that have not already been fixed by the development team are merged back into the development branch in the version control repository

Item 2: Two or Three RDBMS Users/Tablespaces

Suppose that the publisher has a working production site running version 1.0 of the software. One could connect the development server rooted at /web/foobar-dev/ to the production database. After all, the raison d'être of the RDBMS is concurrency control. It will be happy to handle eight simultaneous connections from a production Web server plus two or three from a development server. The fly in this ointment is that one of the developers might get sloppy and write a program that sends drop table users rather than drop table users_experimental_extra_table to the database. Or, less dramatically, a junior developer might leave out a WHERE clause in an SQL statement and inadvertently request a result set of 10^9 rows, thus slowing down the production site.

So it would seem that this publisher will need at least one new database. Here are the steps:

1. create a new database user and tablespace; if this is on a separate physical computer from your production RDBMS server it will protect your production server's performance from inadvertent denial-of-service attacks by sloppy development SQL statements
2. export the production database into a file system file, which is a good periodic practice in any case as it will verify the integrity of the database
3. import the database export into the new development database
4. every time that a developer alters a table, adds a table, or populates a new table, record the operation in a "patches.sql" file
5. when ready to move code from staging to production, hastily apply all the data model modifications from patches.sql to the production RDBMS

Should there be three databases, i.e., one for dev, one for staging, and one for production? Not necessarily. Unless one expects radical data model evolution it may be acceptable to use the same database for development and staging. Keep in mind that adding a column to a relational database table seldom breaks old queries. This was one of the objectives set forth by E.F. Codd in 1970 in "A Relational Model of Data for Large Shared Data Banks" (http://www.acm.org/classics/nov95/toc.html) and certainly modern implementations of the relational model have lived up to Codd's hopes in this respect.

Item 3: One Version Control Repository

The function of the version control repository is to

• remember what all the previous checked-in versions of a file contained
• show the difference between what's in a checked-out tree and what's in the repository
• help merge changes made simultaneously by multiple authors who might have been unaware of each other's work
• group a snapshot of currently checked-in versions of files as "Release 2.1" or "JuneIssue"

absquatulate     612
bedizen          36, 9211
cryptogenic      9
dactylioglyph    7214
exheredate       57, 812, 4010
feuilleton       87, 349, 1203
genetotrophic    5000
hartebeest       710
inspissate       549, 21, 3987
...
samoyed          17, 91, 1000, 3492
sesquipedalian   723
the              1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,...
uberous          6, 800
velutinous       45, 2307
widdershins      7300
xenial           3611
ypsiliform       5607
zibeline         4782

If we build this as a hash table, we have O[1] access to a row in the table. If we merely keep the rows in sorted order, we have O[log W] access to any row in the table, where W is the number of words in our vocabulary. Performance does not vary with the number of documents in the collection... or does it? Just about every English document will contain the word "the" and therefore simply returning the value of the document_ids column for the word "the" will take O[N] time, where N is the number of documents in the corpus. This row isn't useful anyway because it isn't selective, i.e., we could get the same information almost as fast with a sequential scan of the documents table, collecting all the document IDs. While indexing a document, a full-text search system will refer to a list of stopwords, words that are too common to be worth indexing. For standard English, the stopword list includes such words as "a", "and", "as", "at", "for", "or", "the", etc.

Inserting a new document into the collection will be slow. We'll have to go through the document, word by word, and update as many rows in the index as there are distinct words in the document. But that extra work at insertion time pays off in a reduction in query time from O[N] to O[1].

Given a data structure of the preceding form, we can quickly find all documents containing the word "running". We can also quickly find all documents containing the word "shoes". We can intersect these result sets quickly, giving us the documents that contain both "running" and "shoes". With some fancier indexing data structures we can restrict our search to documents that contain the contiguous phrase "running shoes" as opposed to documents where those words appear separately. But suppose that there are 1000 documents in the collection containing these two words. Which are the most relevant to the user's query of "running shoes"?

We need a new data structure: the word-frequency histogram. This will tell us which words occur in a document and how frequently they occur in a way that is easily adjusted for the total length of a document.

Here's a word-frequency histogram for the first sentence of Anna Karenina:

Word       Count  Frequency
all        1      1/16
another    1      1/16
but        1      1/16
each       1      1/16
families   1      1/16
family     1      1/16
happy      1      1/16
in         1      1/16
is         1      1/16
its        1      1/16
one        1      1/16
own        1      1/16
resemble   1      1/16
unhappy    2      2/16
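To make the two search data structures discussed in this section concrete, here is a minimal Python sketch of a word-to-document-IDs index with stopword filtering, a multi-word query answered by intersecting result sets, and a word-frequency histogram. The function names (tokenize, build_index, search_all, word_frequencies), the sample documents, and the abbreviated stopword list are assumptions made for illustration; they are not taken from any particular full-text search system. A Python dict is a hash table, so looking up one word's row is expected O[1].

```python
import re
from collections import Counter
from fractions import Fraction

# Abbreviated, illustrative stopword list built from the examples in the text.
STOPWORDS = {"a", "and", "as", "at", "for", "or", "the"}


def tokenize(text):
    """Lowercase the text and return its words in order."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(documents):
    """Map each non-stopword to the set of IDs of documents containing it.

    `documents` is a dict of {document_id: text}. Insertion touches one
    row of the index per distinct word in the document.
    """
    index = {}
    for doc_id, text in documents.items():
        for word in set(tokenize(text)):
            if word not in STOPWORDS:
                index.setdefault(word, set()).add(doc_id)
    return index


def search_all(index, query):
    """Return IDs of documents containing every non-stopword in the query."""
    words = [w for w in tokenize(query) if w not in STOPWORDS]
    if not words:
        return set()
    # Intersect the per-word result sets.
    return set.intersection(*(index.get(w, set()) for w in words))


def word_frequencies(text):
    """Return {word: (count, count / total words)} for one document."""
    words = tokenize(text)
    total = len(words)
    return {w: (n, Fraction(n, total)) for w, n in Counter(words).items()}


if __name__ == "__main__":
    docs = {
        1: "Running shoes for trail running",
        2: "Dress shoes",
        3: "Running a marathon",
    }
    index = build_index(docs)
    print(search_all(index, "running shoes"))
    sentence = ("All happy families resemble one another, "
                "but each unhappy family is unhappy in its own way.")
    print(word_frequencies(sentence)["unhappy"])
```

Dividing each count by the total number of words is what makes the histogram comparable across documents of different lengths. Fraction reduces 2/16 to 1/8, which is the same value shown in the table. This sketch stores document IDs in unordered sets and does not record word positions, so it cannot answer contiguous-phrase queries such as "running shoes" as a phrase.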