Software Engineering for Internet Applications - Student Community


• User #21 contributed Comment #37 on Article #529
• User #192 asked Question #512
• User #451 posted Answer #3 to Question #924
• User #1392 has read Article #456
• User #8923 is interested in being alerted when a change is made to Article #223
• User #8923 is interested in being alerted when an answer to Question #9213 is posted

We are careful to record authorship because attributed content contributes to our chances of building a real community. To offer users the service of email notifications when someone responds to a question or comments on an article, it is necessary to record authorship.

Why record the fact that a particular user has read, or at least downloaded, a particular document? Consider an online learning community of professors and students at a university. It is necessary to record readership if one wishes to write a robot that sends out messages like the following:

    To: Sam Student
    From: Community Nag Robot
    Date: Friday, 4:30 pm
    Subject: Your Lazy Bones

    Sam,

    I notice that you have four assignments due on Monday and that you
    have not even looked at two of them. I hope that you aren't planning
    to go to a fraternity party tonight instead of studying.

    Very truly yours,

    Some SQL Code

Once an online learning community is recording the act of readership, it is natural to consider recording whether or not the act of reading proved worthwhile. In general collaborative filtering is the last refuge of those too cowardly to edit. However, recording "User

Chapter 16
User Activity Analysis

This chapter looks at ways that you can monitor user activity within your community and how that information can be used to personalize a user's experience.

16.1 Step 1: Ask the Right Questions

Before considering what is technically feasible, it is best to start with a wishlist of the questions about user activity that have relevance for your client's application.
Here are some starter questions:

• What are the URLs that are producing server errors? [answer leads to action: fix broken code]
• How many users requested non-existent files and where did they get the bad URLs? [answer leads to action: fix bad links]
• Are at least 50 percent of users visiting /foobar/, our newest and most important section? [answer leads to action: maybe add more pointers to the new section from other areas of the site]
• How popular are the voice and wireless interfaces to the application? [answer leads to action: invest more effort in popular interfaces]
• Which pages are causing users to get stuck and abandon their sessions? I.e., what are the typical last pages viewed before a user disappears for the day? [answer leads to action: clarify user interface or annotation on those pages]
• Suppose that we operate an ecommerce site and that we've purchased advertisements on Google and www.nytimes.com. How likely are visitors from those two sources to buy something? How do the dollar amounts compare? [answer leads to action: buy more ads from the place that sends high-profit users]

16.2 Step 2: Look at What's Easily Available

Every HTTP server program can be configured to log its actions. Typically the server will write two logs: (1) the "access log", containing one line corresponding to every user request, and (2) the "error log", containing complete information about what went wrong during those requests that resulted in program errors. A "file not
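As a rough illustration of the first two starter questions in 16.1, the sketch below tallies server errors and bad links from access log lines. It assumes the server writes the widely used NCSA "combined" layout (host, timestamp, request line, status, bytes, referrer, user agent); a real server may be configured to log differently. The sample lines, the URLs /buggy and /old-page, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: NCSA "combined" log layout. The referrer and user-agent
# fields are optional here so that plain "common" format lines also parse.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one access log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count server errors per URL and not-found requests per (URL, referrer)."""
    server_errors = Counter()  # which URLs are producing server errors?
    bad_links = Counter()      # which missing files were requested, and from where?
    for line in lines:
        hit = parse_line(line)
        if hit is None:
            continue  # skip lines that do not fit the assumed layout
        status = int(hit["status"])
        if status >= 500:
            server_errors[hit["url"]] += 1
        elif status == 404:
            bad_links[(hit["url"], hit["referer"] or "-")] += 1
    return server_errors, bad_links


# Hypothetical sample data, made up for this sketch.
SAMPLE_LINES = [
    '10.0.0.1 - - [28/Mar/2003:16:30:00 -0500] "GET /foobar/ HTTP/1.1" 200 5120 "http://example.com/" "Mozilla/4.0"',
    '10.0.0.2 - - [28/Mar/2003:16:31:10 -0500] "GET /buggy HTTP/1.1" 500 312 "-" "Mozilla/4.0"',
    '10.0.0.3 - - [28/Mar/2003:16:32:45 -0500] "POST /buggy HTTP/1.1" 500 312 "-" "Mozilla/4.0"',
    '10.0.0.4 - - [28/Mar/2003:16:33:02 -0500] "GET /old-page HTTP/1.1" 404 208 "http://example.com/links" "Mozilla/4.0"',
]

if __name__ == "__main__":
    errors, missing = summarize(SAMPLE_LINES)
    print("Server errors by URL:", errors.most_common())
    print("Bad links by (URL, referrer):", missing.most_common())
```

Keeping the referrer alongside each missing URL is what lets the second question be answered: the count alone says how many users hit a bad URL, while the referrer says which page needs its link fixed.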
15.18 Time and Motion

The team should work together with the client to develop the ontology. These discussions and the initial documentation should require 2 to 3 hours. Designing the metadata data model may be a simple copy/paste operation for teams building with Oracle, but in any case should require no more than an hour. Generating the DDL statements and drop tables script should take about two hours of work by one programmer. Building out the system pages, Exercises 5 through 10, should require 8 to 12 programmer-hours. This part can be divided to an extent, but it's probably best to limit the programming to two individuals working together closely since the exercises build upon one another. Finally, the writeups at the end should take one to two hours total.

#7241 really liked Article #2451" opens up interesting possibilities for personalization.

Consider a corporate knowledge management system. At the beginning the database is empty and there are only a few users. Scanning the titles of all contributed content would take only a few minutes. After 5 years, however, the database contains 100,000 documents and the 10,000 active users are contributing several hundred new documents every day (keep in mind that a question or answer in a discussion forum is a "document" for the purpose of this discussion). If Jane User wants to see what her coworkers have been up to in the last 24 hours, it might take her 30 minutes to scan the titles of the new content. Jane User may well abandon an online learning community that, when smaller, was very useful to her.

Suppose now that the database contains 100 entries of the form "Jane liked this article" and 100 entries of the form "Jane did not like this article". Before Jane has arrived at work, a batch job can compare every new article in the system to the 100 articles that Jane liked and the 100 articles that Jane said she did not like.
This comparison can be done using most standard full-text search software, which will take two documents and score them for similarity based on words used. Each new document is given a score of the form

    avg(similarity(:new_doc, all_docs_marked_as_liked_by_user(:user_id)))
    - avg(similarity(:new_doc, all_docs_marked_as_disliked_by_user(:user_id)))

The new documents are then presented to Jane ranked by descending score. If you're an Intel stockholder you'll be pleased to consider the computational implications of this personalization scheme. Every new document must be compared to every document previously marked by a user. Perhaps that is 200 comparisons. If there are 10,000 users, this scoring operation must be repeated 10,000 times. So that is 2,000,000 comparisons per day per new document in the system. Full-text comparisons generally are quite slow as they rely on looking up each word in a document to find its occurrence frequency in standard written English. A comparison of two documents can take 1/10th of a second of CPU time. We're thus looking at about 200,000 seconds of CPU time per new document added to the system, plus the insertion of 10,000 rows in the database, each row containing the personalization score of that document for a particular user.
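The scoring formula and the cost arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the similarity function here is a plain word-count cosine, a stand-in for whatever the full-text search software actually computes, and it ignores the word-frequency weighting the text mentions. All function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter
from statistics import mean


def word_counts(text):
    """Lowercase a document and count its words."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def similarity(doc_a, doc_b):
    """Assumed stand-in for full-text similarity: cosine of word-count vectors."""
    a, b = word_counts(doc_a), word_counts(doc_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def personalization_score(new_doc, liked_docs, disliked_docs):
    """avg(similarity to liked docs) minus avg(similarity to disliked docs)."""
    liked_avg = mean(similarity(new_doc, d) for d in liked_docs) if liked_docs else 0.0
    disliked_avg = (
        mean(similarity(new_doc, d) for d in disliked_docs) if disliked_docs else 0.0
    )
    return liked_avg - disliked_avg


def rank_new_documents(new_docs, liked_docs, disliked_docs):
    """Return (score, document) pairs ordered by descending score."""
    scored = [(personalization_score(d, liked_docs, disliked_docs), d) for d in new_docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def cpu_seconds_per_new_document(marked_per_user=200, users=10_000,
                                 seconds_per_comparison=0.1):
    """Cost estimate from the text: comparisons per user x users x time per comparison."""
    return marked_per_user * users * seconds_per_comparison


if __name__ == "__main__":
    # Hypothetical documents, made up for this sketch.
    liked = ["oracle sql tuning tips", "sql index design"]
    disliked = ["fraternity party photos"]
    new = ["party photos from friday", "sql query tuning"]
    for score, doc in rank_new_documents(new, liked, disliked):
        print(f"{score:+.3f}  {doc}")
    print("CPU seconds per new document:", cpu_seconds_per_new_document())
```

With the default arguments, the cost function reproduces the figures in the text: 200 comparisons per user across 10,000 users is 2,000,000 comparisons, which at 1/10th of a second each comes to about 200,000 seconds of CPU time for every new document.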
