13.07.2015 Views

Software Engineering for Internet Applications - Student Community




at 9 hours 11 minutes 59 seconds past midnight in a timezone 5 hours behind Greenwich Mean Time (06/Mar/2003:09:11:59 -0500), requested the file /dogs/george using the GET method of the HTTP/1.1 protocol. The file was found by the server and returned normally (status code of 200) but it was returned by an ill-behaved script that did not give the server information about how many bytes were written, hence the 0 after the status code. This user followed a link to this URL from http://www.photo.net/ (the referer header) and is using a browser that first falsely identifies itself as Netscape 4.0 (Mozilla 4.0) but then explains that it is actually merely compatible with Netscape and is really Microsoft Internet Explorer 5.0 on Windows NT (MSIE 5.0; Windows NT). On a lightly used service we might have configured the server to use nslookup and log the hostname of stargate.fs.uni-lj.si rather than the IP address, in which case we'd have been able to glance at the log and see that it was someone at a university in Slovenia.

That's a lot of information in one line but consider what is missing. If this user previously logged in and presented a user_id cookie, we can't tell and we don't have that user ID. On an ecommerce site we might be able to infer that the user purchased something by the presence of a line showing a successful request for a "complete purchase" URL.
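
The fields walked through above can be pulled out of a log line mechanically. What follows is a minimal sketch in Python, assuming the server writes the common "combined" access log layout. The sample line is reconstructed from the description: the IP address 192.0.2.1 is a placeholder and the exact user-agent string is a guess. The names `LOG_PATTERN` and `parse_log_line` are illustrative only.

```python
import re
from datetime import datetime

# Assumed layout ("combined" log format):
# host ident authuser [time] "method path protocol" status bytes "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Timestamps look like 06/Mar/2003:09:11:59 -0500 (offset from GMT).
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # Some servers log "-" when the byte count is unknown; treat it as 0.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields


# Reconstructed sample: the placeholder IP and the user-agent string are
# assumptions, not copied from a real log.
SAMPLE = (
    '192.0.2.1 - - [06/Mar/2003:09:11:59 -0500] '
    '"GET /dogs/george HTTP/1.1" 200 0 '
    '"http://www.photo.net/" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"'
)

entry = parse_log_line(SAMPLE)
```

With entries parsed this way, spotting a successful request for a hypothetical "complete purchase" URL is a simple filter on `path` and `status`, which is about as far as the standard log can take us.
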
However we won't see the dollar amount of that purchase and surely a $1000 purchase is much more interesting than a $10 purchase.

16.3 Step 3: Figure Out What Extra Information You Need to Record

If your client is unhappy with the kind of information available from the standard logs there are three basic alternatives:

• configure the HTTP server program to add cookie header contents to the standard access log
• augment your software to log additional user activity into the RDBMS and construct ad hoc query pages in the site administrator area of the service
• construct a full dimensional data warehouse of user activity

If all that you need is the user ID for every request it is often a simple matter to configure the HTTP server program, e.g., Apache or Microsoft Internet Information Server, to append the contents of the entire cookie header or just one named cookie to each line in the access log.

Chapter 4
Software Structure

Before embarking on a development project it is a good idea to sketch the overall structure of the system to be built.

4.1 Gross Anatomy

Any good online learning community will have roughly the same core structure:

1. user database
2. content database
3. user/content map
4. user/user map

As used above, "database" is an abstract term. The user database, for example, could be implemented as a set of SQL tables within a relational database management system. The tables for the user database need not be separated in any way from tables used to implement other modules, i.e., they would all be owned by the same user and reside within the same tablespace. On the other hand, the user database might be external to the online learning community's core source of persistence.
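
For the first case, where all four core structures live as ordinary tables side by side in one relational database, a minimal sketch might look like the following. It uses Python's built-in sqlite3 module as a stand-in for a full RDBMS. Every table and column name here is hypothetical; the text names only the four abstract structures, not a schema.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    email    TEXT NOT NULL UNIQUE
);
CREATE TABLE content (
    content_id  INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    body        TEXT
);
-- user/content map: relates a user to a content item
CREATE TABLE user_content_map (
    user_id       INTEGER NOT NULL REFERENCES users(user_id),
    content_id    INTEGER NOT NULL REFERENCES content(content_id),
    relationship  TEXT NOT NULL,
    PRIMARY KEY (user_id, content_id, relationship)
);
-- user/user map: relates one user to another
CREATE TABLE user_user_map (
    from_user_id  INTEGER NOT NULL REFERENCES users(user_id),
    to_user_id    INTEGER NOT NULL REFERENCES users(user_id),
    relationship  TEXT NOT NULL,
    PRIMARY KEY (from_user_id, to_user_id, relationship)
);
"""


def create_core_schema(conn):
    """Create all four core structures in a single database."""
    conn.executescript(SCHEMA)


def core_table_names(conn):
    """List the tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_core_schema(conn)
```

Nothing in this layout separates the user tables from the rest, which is the point of the first case. Moving the user database outside would mean replacing the `users` table with lookups against some other service.
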
A common case in which the user database can become external is that of a corporation's knowledge management system where employees are authenticated by checking a central LDAP server.

A more modern example of how these core databases might become split up would be in the world of Web services. Microsoft Hailstorm, for example, offers to provide user database services to the rest of the Internet. A university might set up complementary communities, one for high school students and one for colleagues at other schools, both anchored by the same database of genomics content. The genomics content database might be running on a physically separate computer from the online communities and advertise its services via WSDL and provide those services via SOAP.

4.2 User Database

At a bare minimum the user database has to record the real name and email address of the user. Remember that the more identified, authenticated, and accountable people are, the better the opportunity for building a community out of an aggregate. An environment where anonymous users shout at each other from behind screen names
