21.03.2013 Views

Problem - Kevin Tafuro


See Also<br />

Recipe 11.11<br />

11.16 Compressing Data with Entropy into a Fixed-Size Seed<br />

<strong>Problem</strong><br />

You are collecting data that may contain entropy, and you will need to output a<br />

fixed-size seed that is smaller than the input. That is, you have a lot of data that has a<br />

little bit of entropy, yet you need to produce a fixed-size seed for a pseudo-random<br />

number generator. At the same time, you would like to remove any statistical biases<br />

(patterns) that may be lingering in the data, to the extent possible.<br />

Alternatively, you have data that you believe contains one bit of entropy per bit of<br />

data (which is generally a bad assumption to make, even if it comes from a hardware<br />

generator; see Recipe 11.19), but you’d like to remove any patterns in the data that<br />

could facilitate analysis if you’re wrong about how much entropy is there. The process<br />

of removing patterns is called whitening.<br />

<strong>Solution</strong><br />

You can use a cryptographic hash function such as SHA1 to process data into a<br />

fixed-size seed. It is generally a good idea to process data incrementally, so that you<br />

do not need to buffer potentially arbitrary amounts of data with entropy.<br />
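The incremental approach can be sketched as follows. This is a minimal illustration in Python rather than code from the recipe: it assumes the standard hashlib module, and the helper name compress_to_seed and the sample inputs are hypothetical.<br />

```python
import hashlib


def compress_to_seed(chunks):
    """Hash an iterable of byte chunks into one fixed-size SHA-1 digest.

    Hypothetical helper for illustration. Each chunk is absorbed into the
    hash state as it arrives, so nothing needs to be buffered.
    """
    h = hashlib.sha1()
    for chunk in chunks:
        h.update(chunk)  # the chunk can be discarded once absorbed
    return h.digest()    # always 20 bytes for SHA-1


# Made-up samples standing in for data gathered from an entropy source.
seed = compress_to_seed([b"timing sample 1", b"timing sample 2"])
```

Feeding the chunks one at a time yields the same digest as hashing their concatenation, so the incremental form costs nothing in output quality.<br />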

<strong>Discussion</strong><br />

Be sure to estimate entropy conservatively. (See Recipe 11.19.)<br />

It is a good idea to use a cryptographic algorithm to compress the data from the<br />

entropy source into a seed of the right size. This helps preserve entropy in the data,<br />

up to the output size of the message digest function. If you need fewer bytes for a<br />

seed than the digest function produces, you can always truncate the output. In addition,<br />

cryptographic processing effectively removes any patterns in the data (assuming<br />

that the hash function is a pseudo-random function). Patterns in the data can<br />

facilitate breaking an entropy source (in part or in full), particularly when that<br />

source does not actually produce as much entropy as was believed.<br />
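The truncation step and the "up to the output size" limit can be sketched as follows. This is a hedged illustration in Python, not code from the recipe: the helper names truncate_seed and seed_entropy_bound are hypothetical, and the bound is only an upper limit that assumes the hash behaves like a pseudo-random function.<br />

```python
import hashlib


def truncate_seed(digest, seed_len):
    """Return the first seed_len bytes of a digest (hypothetical helper)."""
    if seed_len > len(digest):
        raise ValueError("seed length exceeds the digest size")
    return digest[:seed_len]


def seed_entropy_bound(estimated_bits, seed_len):
    """Upper bound, in bits, on the entropy a seed of seed_len bytes can hold.

    A seed cannot carry more entropy than the conservative input estimate,
    nor more than its own length in bits.
    """
    return min(estimated_bits, 8 * seed_len)


# Made-up input; a 20-byte SHA-1 digest truncated to a 16-byte seed.
digest = hashlib.sha1(b"samples from an entropy source").digest()
seed = truncate_seed(digest, 16)
```

For example, if the input is conservatively estimated to hold 300 bits of entropy, a 16-byte seed can hold at most 128 of them. If the estimate is only 40 bits, the seed holds at most 40, no matter how long it is.<br />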


Copyright © 2007 O’Reilly & Associates, Inc. All rights reserved.
