untangling_the_web


DID: 4046925

UNCLASSIFIED//FOR OFFICIAL USE ONLY

Tip 19: Use Wildcards to Maximize Effectiveness

I love to use wildcards in searching. Unfortunately, few search engines permit wildcard searches, which means the researcher must enter the term in a variety of forms for a thorough search. Here is how the major search engines handle wildcards.

Google: one limited wildcard (*), which can replace only a single whole term with white space on either side (e.g., ["what a * web"] will find "what a tangled web" and "what a coiled web"). It cannot be used within or at the end of a search term (for example, to pluralize a term).

Yahoo: Yahoo does not support wildcard searching. The old cheat of using a "small" word, such as "a", no longer works in Yahoo.

Live Search: no truncation, no wildcard. A search for [cat] finds cat, not cats.

Exalead: a search on [child*] returns pages with children highlighted as a search result. The wildcard can also be used inside a search term, e.g., [kazak*stan]. This search accurately finds kazakhstan, kazakh, and kazak as well as kazakstan.
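
The two wildcard styles described above can be modeled with regular expressions. The following is only a rough sketch for reasoning about what each style can match, not a description of how any search engine is implemented. The function names phrase_wildcard_regex and truncation_wildcard_regex are hypothetical. The simple truncation model matches kazakhstan and kazakstan for [kazak*stan]; it does not reproduce the additional kazakh and kazak results reported for Exalead.

```python
import re


def phrase_wildcard_regex(query: str) -> re.Pattern:
    """Whole-term wildcard: each standalone * stands for exactly one word."""
    parts = [r"\w+" if term == "*" else re.escape(term) for term in query.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def truncation_wildcard_regex(query: str) -> re.Pattern:
    """In-term wildcard: * stands for zero or more word characters."""
    pieces = [re.escape(piece) for piece in query.split("*")]
    return re.compile(r"\b" + r"\w*".join(pieces) + r"\b", re.IGNORECASE)


if __name__ == "__main__":
    phrase = phrase_wildcard_regex("what a * web")
    print(bool(phrase.search("Oh what a tangled web we weave")))
    print(bool(phrase.search("what a very tangled web")))

    stem = truncation_wildcard_regex("child*")
    print(bool(stem.search("the children played")))
```

Under this model, the whole-term wildcard cannot pluralize a word, while the truncation wildcard can, which is the practical difference the list above describes.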

Tip 20: Examine Page Source Code

In addition to often revealing the webpage's language encoding, page source can provide other helpful details, including names, dates, email addresses, type of software used to create the page, etc. However, experienced webpage designers have learned by now that putting these sorts of details into the source code is an open invitation to spammers to harvest them, so finding useful bits of information in source code is less likely now than in the past. Still, many people are not experienced Internet users and have not yet learned to keep this information out of the source code, so it is worth a look. Below is a very good example of how analyzing page source code helped one company track down the person responsible for a fabricated television interview that spread potentially libelous information about the company. The page source contained an email address that ultimately led to the person responsible for the false information.

Daniel Janal, "How to Deal with Lies About Your Company (and You) on the Internet," Scambusters.org, http://www.scambusters.org/Scambusters29.html
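
For readers who want to scan page source systematically, here is a minimal sketch that pulls out the kinds of details mentioned in this tip: declared encoding, meta tags such as author or generator, HTML comments, and email addresses. It uses only Python's standard library. The names SourceDetails, inspect_source, and SAMPLE are hypothetical, the sample markup is invented for illustration, and the email pattern is a deliberately loose approximation rather than a full address validator.

```python
import re
from html.parser import HTMLParser

# Loose pattern for things that look like email addresses (an approximation).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class SourceDetails(HTMLParser):
    """Collects <meta> tag values and HTML comments from page source."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.comments = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if "charset" in attrs:
            self.meta["charset"] = attrs["charset"]
        name = attrs.get("name") or attrs.get("http-equiv")
        if name and attrs.get("content"):
            self.meta[name.lower()] = attrs["content"]

    def handle_comment(self, data):
        self.comments.append(data.strip())


def inspect_source(html: str) -> dict:
    """Return meta values, comments, and email-like strings found in html."""
    parser = SourceDetails()
    parser.feed(html)
    parser.close()
    return {
        "meta": parser.meta,
        "comments": parser.comments,
        "emails": sorted(set(EMAIL_RE.findall(html))),
    }


# Invented sample page source, for illustration only.
SAMPLE = """<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
<meta name="generator" content="ExampleEditor 1.0">
<meta name="author" content="J. Doe">
<!-- last edited by jdoe@example.com -->
</head><body><a href="mailto:webmaster@example.com">Contact</a></body></html>"""

if __name__ == "__main__":
    for key, value in inspect_source(SAMPLE).items():
        print(key, value)
```

The email search runs over the raw source rather than the parsed text, so it also catches addresses that appear only in comments or mailto links, which are the places a casual page author is most likely to leave them.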

Tip 21: Ask for Help

One of the hardest questions to answer is, "how do you know when you aren't finding something because you're not searching correctly or because it's not there to find?" The best rule of thumb I can think of is to ask a more experienced Internet researcher for advice and assistance if you are hitting a brick wall. Experienced researchers generally have developed a pretty good sense of where to look for different kinds of information and, most importantly, the types of information you are not likely to find on the Internet.
