02.11.2014 Views

untangling_the_web


DOCID: 4046925

UNCLASSIFIED//FOR OFFICIAL USE ONLY

• define (search): AWSP has a much more robust set of search options, syntax, and APIs than other search engines and also permits the use of stored (canned) queries; the AWSP "data store" contains text, html, music, video, images, and more types of files.

• process: users can search the entire Alexa data store and "are able to process both the raw content and the metadata extracted by Alexa's internal processes."

• publish: the output of the search can be anything from one result to an entirely new vertical search engine, for example a new video search engine or a new search engine for automotive parts. Quite literally, "by making use of these utilities, a user might introduce a great new search service to the world with nothing more than a home computer."69

The costs are modest and are based on consumption (you pay for what you use and not for a subscription or service contract):

$1 per CPU hour ($0.50 for reserved but unused hours)

$1 per GB/year of user storage

$1 per 50 GB processed

$1 per GB uploaded/downloaded

$1 for every 4,000 user-published web service requests
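
To make the consumption-based pricing concrete, here is a minimal sketch of how a bill might be estimated from the rates listed above. The `Usage` class, its field names, and `estimate_cost` are hypothetical names invented for this illustration, and the sketch assumes charges are prorated linearly (for example, 25 GB processed costs $0.50), which the price list does not actually state.

```python
from dataclasses import dataclass

# Rates in US dollars, taken from the price list above.
RATE_CPU_HOUR = 1.00              # per CPU hour actually used
RATE_RESERVED_UNUSED_HOUR = 0.50  # per CPU hour reserved but not used
RATE_STORAGE_GB_YEAR = 1.00       # per GB of user storage per year
RATE_TRANSFER_GB = 1.00           # per GB uploaded or downloaded
GB_PROCESSED_PER_DOLLAR = 50      # $1 buys 50 GB of processing
REQUESTS_PER_DOLLAR = 4000        # $1 buys 4,000 published service requests


@dataclass
class Usage:
    """Hypothetical container for one billing period's consumption."""
    cpu_hours: float = 0.0
    reserved_unused_hours: float = 0.0
    storage_gb_years: float = 0.0
    processed_gb: float = 0.0
    transferred_gb: float = 0.0
    published_requests: int = 0


def estimate_cost(usage: Usage) -> float:
    """Estimate the charge in dollars, assuming linear proration (an assumption)."""
    total = (
        usage.cpu_hours * RATE_CPU_HOUR
        + usage.reserved_unused_hours * RATE_RESERVED_UNUSED_HOUR
        + usage.storage_gb_years * RATE_STORAGE_GB_YEAR
        + usage.processed_gb / GB_PROCESSED_PER_DOLLAR
        + usage.transferred_gb * RATE_TRANSFER_GB
        + usage.published_requests / REQUESTS_PER_DOLLAR
    )
    return round(total, 2)


# Example: a small vertical search project.
example = Usage(
    cpu_hours=10,
    reserved_unused_hours=4,
    storage_gb_years=20,
    processed_gb=500,
    transferred_gb=5,
    published_requests=8000,
)
print(f"Estimated charge: ${estimate_cost(example):.2f}")  # Estimated charge: $49.00
```

In this example the $49 breaks down as $10 for CPU, $2 for reserved but unused hours, $20 for storage, $10 for processing, $5 for transfer, and $2 for published requests.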

In case you're curious, Alexa has a long history. Now owned by Amazon, Alexa was created by Bruce Gilliat and Brewster Kahle (of Internet Archive fame), and until now has been both famous and infamous as the technology behind the controversial web traffic and website statistics "What's Related" toolbar feature in both Netscape and Internet Explorer. The new AWSP is actually integrated into Amazon's web services platform, something no one has done before.70

Simply stated, Alexa/Amazon are "renting" their huge database ("data store") to any and all takers for a remarkably reasonable price and, what is more, offering detailed

69 Alexa Web Search Platform User Guide, Introduction: What Can I Do with the Platform? (17 January 2007).

70 There is one example of something similar, which came to my and some others' minds. If you are familiar with IBM's WebFountain and its proprietary implementations for specific customers, you may see some similarities. WebFountain also spidered the web and then let IBM's customers run queries against that data set in more sophisticated ways than simple querying (something akin to datamining). However, the problem with WebFountain and its progeny was that IBM had to write the programs, and thereby hangs a tale of woe. For more, I recommend Jeff Dalton's blog entry on this topic (I think he nails it). Jeff Dalton, "Alexa Web Search Platform: IBM WebFountain 2.0," Jeff's Search Cafe,

