02.11.2014 Views

untangling_the_web


DOCID: 4046925

UNCLASSIFIED//FOR OFFICIAL USE ONLY

Gigablast

The Gigablast search engine, which has been around since 2002, is still not quite in the same league as powerhouses Google, Yahoo, and Live Search, but it is well on its way to becoming one of the best search engines. That's something of a surprise given Gigablast's humble origins and unique status among major search engines. In case you're not familiar with Gigablast, it is different from its major competitors most notably because it is still owned and largely run by the guy who first wrote its C++ code in 2000. Matt Wells is still the very hands-on proprietor of Gigablast. Its database now indexes over 2 billion pages, up from 650 million in late 2004. While this falls short of the size of the Google, Yahoo, and Live Search databases, it's not bad, especially considering a lot of the "stuff" in those databases is dross and the numbers are not verified independently.

How does Gigablast stack up to the big boys? Gigablast has some very nice features, some of which are unique to it, such as the IP range search (something AlltheWeb once offered).

Gigablast
http://www.gigablast.com/

Strengths

~ simple interface
~ cached copies with date indexed [archived copies]
~ cached copies of webpages without images [stripped]
~ links to Internet Archives [older copies]
~ clusters results by default (can be turned off)
~ no limit on number of search terms
~ file types indexed include Microsoft Word, Excel, and PowerPoint, as well as PDF, PostScript, HTML, and text; syntax is:
  o type:pdf for Adobe Acrobat PDFs
  o type:doc for Microsoft Word documents
  o type:ppt for PowerPoint presentations
  o type:xls for Excel spreadsheets
  o type:ps for PostScript files
  o type:text for ASCII text files
  o type:html for HTML Web pages
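
The type: syntax listed above can be combined with ordinary search terms. The following is a minimal Python sketch of how such a query string might be assembled and URL-encoded. The function names build_query and build_url are hypothetical, and the "/search" path and "q" parameter are assumptions made for illustration; only the type: keywords themselves come from the list above.

```python
from urllib.parse import urlencode

# File-type keywords taken from the list above.
FILE_TYPES = {"pdf", "doc", "ppt", "xls", "ps", "text", "html"}


def build_query(terms, file_type=None):
    """Join search terms and optionally append a type: restriction."""
    query = " ".join(terms)
    if file_type is not None:
        if file_type not in FILE_TYPES:
            raise ValueError(f"unsupported file type: {file_type}")
        query = f"{query} type:{file_type}".strip()
    return query


def build_url(query, base="http://www.gigablast.com/search"):
    """URL-encode the query.

    Assumption: the "/search" path and the "q" parameter name are
    illustrative and are not confirmed by the text above.
    """
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    q = build_query(["annual", "report"], file_type="pdf")
    print(q)
    print(build_url(q))
```

Rejecting keywords outside the listed set keeps a mistyped type (for example, "docx") from silently producing a query that the syntax above does not cover.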

