
Six Articles on Electronic - Craig Ball



Craig Ball © 2007

Unlocking Keywords

By Craig Ball

[Originally published in Law Technology News, January 2007]

The notion that words hold mythic power has been with us as long as language. We know we don't need to ward off evil spirits, but we still say, "Gesundheit!" when someone sneezes. Can't hurt.

But misplaced confidence in the power of word searches can seriously hamper electronic data discovery. Perhaps because keyword searching works so well in the regimented realm of automated legal research, lawyers and judges embrace it in EDD with little thought given to its effectiveness as a tool for exploring less structured information. Too bad, because the difference between keyword searches that get the goods and those that fail hinges on thoughtful preparation and precaution.

Text Translation

Framing effective searches starts with understanding that most of what we think of as textual information isn't stored as text. Brilliant keywords won't turn up anything if the data searched isn't properly processed.

Take Microsoft Outlook e-mail. The message we see isn't a discrete document so much as a report assembled on the fly from a database. As with any database, the way information is stored little resembles the way we see it onscreen after our e-mail program works its magic by decompressing, decoding, and decrypting messages.

Lots of evidence we think of as textual isn't stored as text, including fax transmissions, .tiff or PDF documents, PowerPoint word art, CAD/CAM blueprints, and zip archives. For each, the search software must process the data to ensure content is accessible as searchable text.

Be certain the search tool you or your vendor employ can access and interpret all of the data that should be seen as text.
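
To make the point concrete, here is a minimal Python sketch (not part of the original article) that uses only the standard library. It builds a small zip archive in memory, then compares a keyword search of the raw stored bytes with a search of the extracted content. The memo text, the file name `memo.txt`, and the function names are hypothetical, chosen purely for illustration.

```python
import io
import zipfile

KEYWORD = b"merger"  # hypothetical search term


def build_sample_zip() -> bytes:
    """Create an in-memory zip holding one compressed text memo (made-up content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("memo.txt", "The merger agreement was signed on Friday. " * 20)
    return buf.getvalue()


def raw_search(data: bytes, keyword: bytes) -> bool:
    """Search the bytes exactly as they sit on disk, with no processing."""
    return keyword in data


def processed_search(data: bytes, keyword: bytes) -> bool:
    """Open the archive first, then search each member's decompressed bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return any(keyword in zf.read(name) for name in zf.namelist())


archive = build_sample_zip()
raw_hit = raw_search(archive, KEYWORD)
processed_hit = processed_search(archive, KEYWORD)
print("raw bytes hit:", raw_hit)
print("processed hit:", processed_hit)
```

The raw search misses because the archive holds the memo in compressed form; the keyword surfaces only once the tool opens the container. A real collection needs a comparable handler for every format it contains, and image-based formats such as faxes and .tiff files additionally require optical character recognition, which this sketch does not attempt.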

Recursion

Reviewing a box of documents that contains envelopes within folders, you'd open everything to ensure you saw everything.

Computers store data within data such that an Outlook file can hold an e-mail transmitting a zip archive containing a PowerPoint with an embedded .tiff image.

It's the electronic equivalent of Russian nesting dolls. If the text you seek is inside that .tiff, the search tool must drill down through each nested item, opening each with appropriate software to ensure all content is searched. This is called recursion, and it's an essential feature of competent search. Be sure your search tool can dig down as deep as the evidence.
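
The drill-down described above can be sketched in a few lines of Python. This is an illustrative simplification, not how any particular search tool works: it assumes the only container type is a zip archive and that everything else can be searched as plain bytes. Real tools would need further handlers for Outlook stores, presentations, and OCR of images. The function name `search_nested`, the file names, and the keyword are all hypothetical.

```python
import io
import zipfile


def search_nested(data: bytes, keyword: bytes, path: str) -> list[str]:
    """Return the nested path of every item whose content contains keyword.

    Assumption: zip is the only container format handled here.
    """
    hits: list[str] = []
    if zipfile.is_zipfile(io.BytesIO(data)):
        # A container: open it and recurse into each member.
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for name in zf.namelist():
                hits.extend(search_nested(zf.read(name), keyword, f"{path}/{name}"))
    elif keyword in data:
        # A leaf item: search its bytes directly.
        hits.append(path)
    return hits


def make_zip(members: dict[str, bytes]) -> bytes:
    """Build an in-memory zip from a name -> content mapping (sample data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return buf.getvalue()


# Made-up evidence: a text file nested inside a zip that is nested inside another zip.
inner = make_zip({"slides/notes.txt": b"Project Falcon pricing discussion, draft two."})
outer = make_zip({
    "cover.txt": b"Please see the attached archive for details.",
    "attachment.zip": inner,
})

hits = search_nested(outer, b"Falcon", "evidence.zip")
print(hits)
```

Reporting the full nested path of each hit, rather than a bare yes or no, shows how deep the tool had to dig and where the responsive item actually lives.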

Exceptions

Even when search software opens wide and digs deep, it will encounter items it can't read: password-protected files, proprietary formats, and poor optical character recognition. When that
