Six Articles on Electronic - Craig Ball
Craig Ball © 2007
for entire operating systems and common applications like Microsoft Windows or Intuit’s
Quicken, culls huge chunks of patently irrelevant files from consideration without risk of
overlooking relevant information excluded based on location or file extension. Hashing thwarts
efforts to hide files by name change or relocation because hash-matching flushes out a file’s
true nature--so long, that is, as the contents of the file haven’t changed.
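To make the culling idea concrete, here is a minimal Python sketch. It assumes SHA-256 as the hash algorithm and uses made-up file names and contents; the function names and the tiny "known" set are hypothetical stand-ins for a real reference list of operating system and application hashes.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def cull_known_files(collected: dict[str, bytes], known_hashes: set[str]) -> dict[str, bytes]:
    """Keep only files whose content hash is absent from the known-file set."""
    return {
        name: data
        for name, data in collected.items()
        if sha256_hex(data) not in known_hashes
    }


# Hypothetical reference set: hashes of stock application files.
known_hashes = {sha256_hex(b"stock application bytes")}

# Hypothetical collection. Names and locations are ignored; only contents count.
collected = {
    "Windows/notepad.exe": b"stock application bytes",     # known, culled
    "Documents/vacation.jpg": b"stock application bytes",  # known file renamed, still culled
    "Windows/system32/driver.sys": b"hidden ledger",       # system-like name, unknown contents, kept
    "Documents/budget.xls": b"quarterly figures",          # ordinary user file, kept
}

remaining = cull_known_files(collected, known_hashes)
```

Because the comparison looks only at contents, a user file parked in a system folder survives the cull, while a stock file given a new name does not.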
Bates Numbering<br />
Hashing’s ability to uniquely identify e-documents makes it a candidate to replace traditional
Bates numbering in electronic production. Though hash values don’t fulfill the sequencing
function of Bates numbering, they’re excellent unique identifiers and enjoy an advantage over
Bates numbers because they eliminate the possibility that the same number might attach to<br />
different documents. An electronic document’s hash value derives from its contents, so it will,
for all practical purposes, never conflict with that of another document unless the two are identical.
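A short Python sketch of a content-derived identifier follows. SHA-256 is an assumed choice of algorithm, and document_id and the sample text are hypothetical names used only for illustration.

```python
import hashlib


def document_id(contents: bytes) -> str:
    """Derive an identifier from a document's contents alone (SHA-256 assumed)."""
    return hashlib.sha256(contents).hexdigest()


original = b"Agreement dated 1 March"
copy_elsewhere = bytes(original)      # an identical copy stored somewhere else
revised = b"Agreement dated 2 March"  # a single character differs

same_id = document_id(original) == document_id(copy_elsewhere)
different_id = document_id(original) != document_id(revised)
```

Identical copies share one identifier, and the revised draft gets a different one. Unlike Bates numbers, the identifiers carry no sequence, so any ordering has to be tracked separately.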
Authentication
I regularly use hashing to establish that a forensically sound duplicate of a hard drive faithfully<br />
reflects every byte of the source and to prove that my work hasn’t altered the original evidence.<br />
As e-discovery gravitates to native production, concern about intentional or inadvertent
alteration requires lawyers to have a fast, reliable method to authenticate electronic documents.
Hashing neatly fills this bill. In practice, a producing party simply calculates and records the<br />
hash values for the items produced in native format. Once these hash values are established,<br />
the slightest alteration of the data would be immediately apparent when the items are hashed again.
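The record-then-verify practice described above might look like the following Python sketch. It assumes SHA-256, and build_manifest, find_altered and the sample items are hypothetical, with file contents held in memory to keep the example short.

```python
import hashlib


def build_manifest(items: dict[str, bytes]) -> dict[str, str]:
    """Record a hash value for each produced item, keyed by item name."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in items.items()}


def find_altered(items: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """List items that are missing or whose current hash differs from the record."""
    altered = []
    for name, recorded in manifest.items():
        data = items.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != recorded:
            altered.append(name)
    return sorted(altered)


# Hypothetical production set, hashed at the time of production.
produced = {
    "memo.doc": b"Meeting moved to Friday.",
    "ledger.xls": b"1000,2000,3000",
}
manifest = build_manifest(produced)

# Later, one item turns up with a single character changed.
received = dict(produced)
received["memo.doc"] = b"Meeting moved to Friday!"

altered = find_altered(received, manifest)
```

Here altered names only memo.doc; the untouched ledger still matches its recorded value.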
De-duplication
In e-discovery, vast volumes of identical data are burdensome and pose a significant risk of<br />
conflicting relevance and privilege assessments. Hashing flags identical documents, permitting
one review of an item that might otherwise have cropped up hundreds of times. This is de-duplication,
and it drastically cuts review costs.<br />
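As a rough illustration, de-duplication by hash amounts to grouping items under their hash values and reviewing one item per group. The Python sketch below assumes SHA-256; group_duplicates and the sample items are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_duplicates(items: dict[str, bytes]) -> list[list[str]]:
    """Group item names by the hash of their contents."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, data in items.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in groups.values()]


# Hypothetical collection: the same memo was saved in two places.
items = {
    "a/memo.txt": b"Please call me about the contract.",
    "b/memo.txt": b"Please call me about the contract.",
    "c/notes.txt": b"Draft notes, not yet circulated.",
}

groups = group_duplicates(items)
review_count = len(groups)  # one review per group rather than per file
```

Three files collapse into two review items, and whatever relevance or privilege call is made on the memo applies to both copies.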
But because even the slightest difference triggers different hash values, insignificant variations
between files (e.g., different Internet paths taken by otherwise identical e-mail) may frustrate de-duplication
when hashing an entire e-document. An alternative is to hash relevant segments of
e-documents to assess their relative identicality, a practice called “near de-duplication.”
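One way to picture segment hashing for e-mail is sketched below in Python. Which segments count as relevant is a judgment call; this sketch assumes the sender, recipient, subject, date and body, assumes simple single-part messages, and uses SHA-256. The segment_hash function, the addresses and the relay names are all hypothetical.

```python
import hashlib
from email import message_from_string

# Assumed choice of headers to include; routing headers such as Received are left out.
SEGMENTS = ("From", "To", "Subject", "Date")


def segment_hash(raw_message: str) -> str:
    """Hash only the chosen headers and the body of a single-part message."""
    msg = message_from_string(raw_message)
    parts = [msg.get(header, "") for header in SEGMENTS]
    parts.append(msg.get_payload())  # a string for single-part messages
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Two copies of one message that travelled through different relays.
common = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: Lunch\n"
    "Date: Mon, 5 Nov 2007 12:00:00 -0600\n"
    "\n"
    "Noon works for me.\n"
)
copy_a = "Received: from relay1.example.com\n" + common
copy_b = "Received: from relay2.example.net\n" + common

whole_file_match = (
    hashlib.sha256(copy_a.encode("utf-8")).hexdigest()
    == hashlib.sha256(copy_b.encode("utf-8")).hexdigest()
)
segment_match = segment_hash(copy_a) == segment_hash(copy_b)
```

Hashing the whole message treats the two copies as different because of the routing line, while the segment hash treats them as the same item.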
Here’s to You, Math Geeks<br />
So this Thanksgiving, raise a glass to the brilliant mathematicians who dreamed up hash<br />
algorithms. They’re making electronic discovery and computer forensics a whole lot easier and
less expensive.<br />