13.07.2015 Views

Software Engineering for Internet Applications - Student Community


    create table content_raw (
        content_id          integer primary key,
        content_type        varchar(100) not null,
        refers_to           references content_raw,
        creation_user       not null references users,
        creation_date       date not null,
        release_time        date,
        expiration_time     date,
        -- some of our content is geographically specific
        zip_code            varchar(5),
        -- a lot of our readers will appreciate Spanish
        -- versions
        language            char(2) references language_codes,
        mime_type           varchar(100) not null,
        one_line_summary    varchar(200) not null,
        -- let's use BLOB in case this is a Microsoft Word doc
        -- or JPEG; a BLOB can also hold HTML or plain text
        body                blob,
        editorial_status    varchar(30)
            check (editorial_status in
                   ('submitted','rejected','approved','expired'))
    );

If this table is to contain 7 versions of an article with a Content ID of 5657, that will violate the primary key constraint on the content_id column. What if we remove the primary key constraint? In Oracle this prevents us from establishing referential integrity constraints pointing to this ID. With no integrity constraints, we will be running the risk, for example, that our database will contain comments on content items that have been deleted.

With multiple rows for each content item our pointers become ambiguous. The statement "User 739 has read Article 5657" points from a specific row in the users table into a set of rows in the content_raw table. Should we try to be more specific? Do we want a comment on an article to refer to a specific version of that article? Do we want to know that a reader has read a specific version of an article? Do we want to know that an editor has approved a specific version of an article? It depends. For some purposes we probably do want to point to a version, e.g., for approval, and at other times we want to point to the article in the abstract.
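
The key collision described above is easy to reproduce. The sketch below is an illustration only: it uses Python's sqlite3 module with an in-memory SQLite database rather than Oracle, trims the table to two columns, and introduces a hypothetical second table, content_versions, keyed on (content_id, version_number) as one possible way to let several versions of article 5657 coexist.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; only the key behavior is illustrated.
conn = sqlite3.connect(":memory:")

# Single-column key, as in the table definition above (trimmed to two columns).
conn.execute("""
    create table content_raw (
        content_id        integer primary key,
        one_line_summary  varchar(200) not null
    )""")
conn.execute("insert into content_raw values (5657, 'version 1')")

try:
    # A second version of the same article reuses content_id 5657.
    conn.execute("insert into content_raw values (5657, 'version 2')")
    second_version_accepted = True
except sqlite3.IntegrityError:
    # The primary key constraint rejects the duplicate content_id.
    second_version_accepted = False

# Hypothetical table: the pair (content_id, version_number) is the key.
conn.execute("""
    create table content_versions (
        content_id        integer,
        version_number    integer,
        one_line_summary  varchar(200) not null,
        primary key (content_id, version_number)
    )""")
for version in range(1, 8):
    conn.execute(
        "insert into content_versions values (5657, ?, ?)",
        (version, f"version {version}"),
    )

version_count = conn.execute(
    "select count(*) from content_versions where content_id = 5657"
).fetchone()[0]

print(second_version_accepted, version_count)
```

With the single-column key the second insert is rejected with an integrity error; with the two-column key all seven rows are stored, at the cost that a bare content_id no longer identifies exactly one row.
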
If we add a version_number column, this becomes relatively straightforward.

    create table content_raw (
        -- the combination of these two is the key
        content_id          integer,
        version_number      integer,
        ...
        primary key (content_id, version_number)
    );

If you've been requiring registration to view discussions, for example, those discussions won't be indexed by Google unless your software is smart enough to recognize that it is Google behind the request and make an exception. How to recognize Google? Here's a one-line snippet from the photo.net access log (newlines inserted for readability):

    216.239.46.48 - - [19/Mar/2002:03:36:56 -0500]
    "GET /minolta/dimage-7/ HTTP/1.0"
    200 18881
    "" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

Notice the user-agent header at the end: Googlebot/2.1 (+http://www.googlebot.com/bot.html). Because some search engines archive what they index, you would not want to provide registration-free access to content that is truly private to members. In theory a robots META tag with a NOARCHIVE directive placed in the HEAD of your HTML documents would prevent search engines from archiving the page, but robots are not guaranteed to follow such directives.

Some search engines allow you to provide indexing hints and hints for presentation once a user is looking at a search results page. For example, the table of contents page for this book carries "keywords" and "description" META tags in the HEAD. The "keywords" tag adds some words that are relevant to the document but not present in the visible text. This would help someone who decided to search for "MIT 6.171 textbook", for example. The "description" tag can be used by a search engine when summarizing a page. If it isn't present, a search engine may show the first 20 words on the page or follow some heuristics to build a reasonable summary.

These tags have been routinely abused.
A publisher might add popular search terms such as "sex" to a site that is unrelated to those terms, in hopes of capturing more readers. A company might add the names of its competitors as keywords.
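
Returning to the question of how to recognize Google: the access-log line quoted earlier can be picked apart with a short script. The sketch below assumes the log is in the common "combined" format (host, identity, user, timestamp, request, status, size, referer, user-agent); the names LOG_PATTERN, user_agent, and claims_to_be_googlebot are hypothetical helpers, not part of any server. Keep in mind that the user-agent header is supplied by the client and can be forged, so a match only shows that the request claims to come from Googlebot.

```python
import re

# Assumed layout: combined log format, one request per line.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def user_agent(log_line):
    """Return the user-agent field of a log line, or None if it does not parse."""
    match = LOG_PATTERN.match(log_line)
    return match.group("agent") if match else None


def claims_to_be_googlebot(agent):
    """True if the user-agent string mentions Googlebot (case-insensitive)."""
    return agent is not None and "googlebot" in agent.lower()


# The photo.net log line from the text, rejoined into a single line.
line = (
    '216.239.46.48 - - [19/Mar/2002:03:36:56 -0500] '
    '"GET /minolta/dimage-7/ HTTP/1.0" 200 18881 '
    '"" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"'
)

agent = user_agent(line)
print(agent)
print(claims_to_be_googlebot(agent))
```

A site that waives registration for crawlers would run a check like this before deciding whether to show the login page, and would still withhold anything that is truly private to members.
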
