02.11.2014 Views

untangling_the_web


DOCID: 4046925<br />

UNCLASSIFIED//FOR OFFICIAL USE ONLY<br />

Yahoo Site Explorer: The Yahoo Site Explorer website is still in beta; its goal is to<br />

help users learn detailed information about a specific website:<br />

The Yahoo search database contains detailed information about the structure of<br />

the web. In addition to the web pages themselves, the database stores<br />

information about links among pages, and uses that information (as well as<br />

additional algorithms) to gauge the popularity of a given page.<br />

Site Explorer gives you access to this information so you can learn about a site.<br />

To explore a site, you submit a URL using a search box, just as you would for a<br />

normal web search. You can then click links on the results page to see detailed<br />

information.<br />

The Yahoo Site Explorer will reveal all the pages in a specific domain, all the pages<br />

in any subdirectory of a domain, and all the links to a domain. The main purpose of<br />

the Site Explorer is to help webmasters improve the rankings of their sites, as<br />

evidenced by the capability for sites to submit missing URLs, the fact that Site Explorer<br />

provides 50 results by default, its web services APIs, and its ability to export the data<br />

to a tab-separated (TSV) file for further analysis. The initial response from the search<br />

community has been lukewarm, but I like this new tool because it simplifies learning<br />

about a site and, unlike Google, Site Explorer provides all the links to a site (which<br />

Yahoo calls "inlinks") instead of a limited subset of links.<br />
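As an illustration of the "further analysis" a TSV export makes possible, here is a minimal Python sketch that counts how many inlinks come from each linking host. The column names ("title", "url"), the sample rows, and the function name are assumptions made up for this example; they are not the documented layout of a Site Explorer export.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical sample of an exported TSV file. The header names and the
# rows are invented for illustration, not taken from a real export.
SAMPLE_TSV = (
    "title\turl\n"
    "Health news\thttp://news.example.org/health/report.html\n"
    "Travel advice\thttp://travel.example.com/advice.html\n"
    "More health news\thttp://news.example.org/health/update.html\n"
)


def count_linking_hosts(tsv_text, url_column="url"):
    """Return a Counter mapping each linking host to its number of inlinks."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hosts = Counter()
    for row in reader:
        host = urlsplit(row[url_column]).hostname
        if host:  # skip rows whose URL has no recognizable host
            hosts[host] += 1
    return hosts


if __name__ == "__main__":
    # Print hosts from most to least frequent, one per line.
    for host, count in count_linking_hosts(SAMPLE_TSV).most_common():
        print(f"{host}\t{count}")
```

Grouping by host rather than by full URL is one possible choice; it gives a quick view of which sites link to a target most often.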

Let's examine Site Explorer from the point of view of a researcher instead of a<br />

developer. Here's the Site Explorer result page for the URL [http://www.who.int]<br />

showing all the pages in all subdomains of that website. The order is by the most<br />

visited pages at the domain according to Yahoo's records about the page:<br />

[Screenshot: Site Explorer results page for http://www.who.int, with the option "Show pages from: All subdomains | Only this domain"]<br />
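The distinction between "All subdomains" and "Only this domain", together with the subdirectory view mentioned earlier, can be sketched as a simple URL filter. This is only an illustration of the idea, not how Site Explorer itself worked; the function name, its parameters, and the sample paths are hypothetical.

```python
from urllib.parse import urlsplit


def in_scope(url, domain, include_subdomains=True, path_prefix="/"):
    """Return True if url belongs to domain (optionally any of its subdomains)
    and its path starts with path_prefix."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    domain = domain.lower()
    # Exact host match, or a true subdomain such as "apps.who.int".
    host_ok = host == domain or (
        include_subdomains and host.endswith("." + domain)
    )
    path = parts.path or "/"
    return host_ok and path.startswith(path_prefix)


if __name__ == "__main__":
    # The paths below are made up for the example.
    urls = [
        "http://www.who.int/en/",
        "http://apps.who.int/data/page.html",
        "http://who.int.example.com/",
    ]
    for u in urls:
        print(u, in_scope(u, "who.int"))
```

Matching on a leading dot ("." + domain) keeps look-alike hosts such as who.int.example.com out of scope.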

UNCLASSIFIED//FOR OFFICIAL USE ONLY 105
