
In CBC, CFB, and OFB modes, encryption can't really be parallelized because the ciphertext for a block is necessary to create the ciphertext for the next block; thus, we can't compute ciphertext out of order. However, for CBC and CFB, when we decrypt, things are different. Because we only need the ciphertext of a block to decrypt the next block, we can decrypt the next block before we decrypt the first one.
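This out-of-order property can be sketched in C. The `toy_cipher` below is a hypothetical stand-in for a real block cipher (a single XOR with a key byte, chosen only so the example is self-contained; it is not secure), and the 8-byte `BLOCK` size is likewise illustrative:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 8
#define KEYBYTE 0x5A

/* Toy stand-in for a block cipher: XOR with a key byte (illustration only,
   not secure). For XOR, encryption and decryption are the same operation. */
static void toy_cipher(const uint8_t in[BLOCK], uint8_t out[BLOCK]) {
    for (int j = 0; j < BLOCK; j++) out[j] = in[j] ^ KEYBYTE;
}

/* CBC encryption is inherently sequential: each ciphertext block feeds
   into the computation of the next one. */
static void cbc_encrypt(const uint8_t *pt, uint8_t *ct, size_t nblocks,
                        const uint8_t iv[BLOCK]) {
    const uint8_t *prev = iv;
    for (size_t i = 0; i < nblocks; i++) {
        uint8_t tmp[BLOCK];
        for (int j = 0; j < BLOCK; j++) tmp[j] = pt[i * BLOCK + j] ^ prev[j];
        toy_cipher(tmp, ct + i * BLOCK);
        prev = ct + i * BLOCK;
    }
}

/* CBC decryption of block i needs only ciphertext blocks i-1 and i, so
   any block can be decrypted independently of the others, in any order. */
static void cbc_decrypt_block(const uint8_t *ct, const uint8_t iv[BLOCK],
                              size_t i, uint8_t out[BLOCK]) {
    const uint8_t *prev = (i == 0) ? iv : ct + (i - 1) * BLOCK;
    uint8_t tmp[BLOCK];
    toy_cipher(ct + i * BLOCK, tmp);
    for (int j = 0; j < BLOCK; j++) out[j] = tmp[j] ^ prev[j];
}
```

Because `cbc_decrypt_block` takes only the shared ciphertext and a block index, any subset of indices could be handed to a separate thread.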

There are two reasonable strategies for parallelizing the work. When a message shows up all at once, you might divide it roughly into equal parts and handle each part separately. Alternatively, you can take an interleaved approach, where alternating blocks are handled by different threads. That is, the actual message is separated into two different plaintexts, as shown in Figure 5-5.

Figure 5-5. Encryption through interleaving (the original message M1, M2, M3, M4, M5 is split into a 1st plaintext M1, M3, M5 and a 2nd plaintext M2, M4)
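The split pictured in Figure 5-5 amounts to a simple de-interleaving of the message blocks. As a sketch (the `deinterleave` name and 8-byte `BLOCK` size are illustrative, not part of any real API):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 8

/* Split a message of nblocks BLOCK-byte blocks into two interleaved
   streams, as in Figure 5-5: even-indexed blocks (M1, M3, M5, ...) go
   to stream s0, odd-indexed blocks (M2, M4, ...) to stream s1. */
static void deinterleave(const uint8_t *msg, size_t nblocks,
                         uint8_t *s0, uint8_t *s1) {
    size_t n0 = 0, n1 = 0;
    for (size_t i = 0; i < nblocks; i++) {
        if (i % 2 == 0)
            memcpy(s0 + (n0++) * BLOCK, msg + i * BLOCK, BLOCK);
        else
            memcpy(s1 + (n1++) * BLOCK, msg + i * BLOCK, BLOCK);
    }
}
```

Each stream can then be handed to its own thread, and a thread never has to wait for more than one block to arrive before it can do useful work.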

If done correctly, both approaches will result in the correct output. We generally prefer the interleaved approach, because all threads can do work with just a little bit of data available. This is particularly true in hardware, where buffers are small.

With a noninterleaved approach, you must wait at least until the length of the message is known, which is often when all of the data is finally available. Even if the message length is known in advance, you must wait for a large percentage of the data to show up before the second thread can be launched.

Even the interleaved approach is much easier when the size of the message is known in advance, because that makes it easier to get the message all in one place. If you need the whole message to come in before you know the length, parallelization may not be worthwhile: in many cases, waiting for an entire message to arrive before beginning work can introduce enough latency to thwart the benefits of parallelization.

If you aren't generally going to get an entire message all at once, but you can determine the largest message you might receive, another reasonably easy approach is to allocate a result buffer big enough to hold the largest possible message.

For the sake of simplicity, let's assume that the message arrives all at once and you want to process it with two parallel threads. The following code provides an example API that can handle CTR mode encryption and decryption in parallel (remember that encryption and decryption are the same operation in CTR mode).
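To see why CTR mode parallelizes so naturally, note that keystream block i depends only on the key, the nonce, and the index i, so workers can simply take alternating indices. The sketch below (not the book's actual API) uses a hypothetical `toy_keystream` as a stand-in for encrypting the counter with a real block cipher, and runs the "workers" sequentially; each `ctr_worker` call could equally be a thread:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 8

/* Toy keystream generator standing in for E_k(nonce || counter)
   (illustration only, not secure). */
static void toy_keystream(uint64_t nonce, uint64_t ctr, uint8_t ks[BLOCK]) {
    uint64_t v = nonce ^ (ctr * 0x9E3779B97F4A7C15ULL);
    for (int j = 0; j < BLOCK; j++) ks[j] = (uint8_t)(v >> (8 * j));
}

/* In CTR mode, block i depends only on (key, nonce, i), so any block can
   be processed independently. Encryption and decryption are the same XOR. */
static void ctr_process_block(uint64_t nonce, const uint8_t *in,
                              uint8_t *out, size_t i) {
    uint8_t ks[BLOCK];
    toy_keystream(nonce, (uint64_t)i, ks);
    for (int j = 0; j < BLOCK; j++)
        out[i * BLOCK + j] = in[i * BLOCK + j] ^ ks[j];
}

/* Interleaved split: worker w of nworkers handles blocks w, w + nworkers,
   w + 2*nworkers, ... Each call could run in its own thread, since the
   workers touch disjoint blocks and share no mutable state. */
static void ctr_worker(uint64_t nonce, const uint8_t *in, uint8_t *out,
                       size_t nblocks, size_t w, size_t nworkers) {
    for (size_t i = w; i < nblocks; i += nworkers)
        ctr_process_block(nonce, in, out, i);
}
```

Two workers encrypting alternating blocks produce exactly the same ciphertext as one worker processing every block in order, which is what lets a parallel implementation stay wire-compatible with a serial one.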

Parallelizing Encryption and Decryption in Modes That Allow It (Without Breaking Compatibility) | 209

Copyright © 2007 O'Reilly & Associates, Inc. All rights reserved.

