12.06.2015 Views

The Annoyance Filter.pdf - Fourmilab


52 BASE64 MIME DECODER ANNOYANCE-FILTER §52

52. Read the encoded input stream and return the next non-white-space character. This code does not verify whether the characters it returns are valid within a base64 stream; that is up to the caller to determine once the character is returned.

〈 Get next significant character from input stream 52 〉 ≡

    while (true) {
        c = -1;
        while (ip < inputLine.length()) {
            if (inputLine[ip] > ' ') {
                c = inputLine[ip++];
                break;
            }
            ip++;
        }
        if (c >= 0) {
            break;
        }
        if (!getNextEncodedLine()) {
            break;
        }
    }

This code is used in section 51.
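The scan above depends on state defined in other sections (inputLine, ip, and getNextEncodedLine). As a rough, self-contained sketch of the same idea, the following wraps that state in a hypothetical EncodedInput struct whose line source is simply a vector of strings; the struct, its members lines and lineNo, and the method nextSignificantChar are assumptions made for illustration and are not part of the original program. It also compares characters as unsigned char, which differs slightly from the original comparison on plain char.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the decoder's input state: a list of
// encoded lines plus a cursor into the line currently being scanned.
struct EncodedInput {
    std::vector<std::string> lines;  // all encoded lines, in order
    std::size_t lineNo = 0;          // index of the next line to load
    std::string inputLine;           // line currently being scanned
    std::size_t ip = 0;              // scan position within inputLine

    // Load the next line into inputLine; returns false when none remain.
    bool getNextEncodedLine() {
        if (lineNo >= lines.size()) return false;
        inputLine = lines[lineNo++];
        ip = 0;
        return true;
    }

    // Return the next character greater than the space character,
    // or -1 once every line has been consumed.
    int nextSignificantChar() {
        for (;;) {
            assert(ip <= inputLine.length());
            while (ip < inputLine.length()) {
                unsigned char ch = static_cast<unsigned char>(inputLine[ip++]);
                if (ch > ' ') return ch;
            }
            if (!getNextEncodedLine()) return -1;
        }
    }
};
```

Calling nextSignificantChar repeatedly on the lines "TW \t", "" and "  Fu" yields 'T', 'W', 'F', 'u' and then -1, since blank lines and embedded white space are skipped without any special casing.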

53. An end of file indication (due to encountering the MIME part separator sentinel) is valid only after a whole number of complete four-character encoded sequences, that is, when no sequence is partially read (i is zero). Validate this and report any errors accordingly. If an unexpected end of file is encountered, any incomplete encoded sequence is discarded.

〈 Check for end of file in base64 stream 53 〉 ≡

    if (c == EOF) {
        if (i > 0) {
            nDecodeErrors++;
            mf->reportParserDiagnostic("Unexpected end of file in Base64 decoding.");
        }
        return -1;
    }

This code is used in section 51.
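The end-of-file rule can be illustrated in isolation. The sketch below is not taken from the program: GroupScan and scanGroups are hypothetical names, and the whole encoded body is assumed to be available as one string. It counts complete four-character groups and flags a partial trailing group, which is the condition the check above detects as i > 0 at end of file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Result of scanning an encoded body for four-character groups.
struct GroupScan {
    std::size_t completeGroups;  // number of full four-character groups
    bool truncated;              // true if input ended inside a group
};

// Hypothetical helper: count groups while ignoring white space, and
// report whether the input stopped part-way through a group.
GroupScan scanGroups(const std::string &encoded) {
    std::size_t i = 0;        // characters collected in the current group
    std::size_t groups = 0;   // complete groups seen so far
    for (unsigned char ch : encoded) {
        if (ch <= ' ') continue;  // skip white space and control characters
        if (++i == 4) {
            ++groups;
            i = 0;
        }
    }
    assert(i < 4);
    return GroupScan{groups, i > 0};
}
```

For example, scanGroups("TWFu TWE=") reports two complete groups and no truncation, while scanGroups("TWFuTW") reports one complete group with truncated set, the case in which the decoder would count an error and discard the two leftover characters.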

54. Once we’ve decoded four characters from the input stream, we have four six-bit fields in the b array. Now we extract, shift, and OR these fields together to form three 8-bit bytes. One subtlety arises at the end of file. The last one or two characters of an encoded four-character field may be replaced by equal signs to indicate that the final field encodes only one or two source bytes. If this is the case, the number of bytes placed onto the decodedBytes queue is reduced to the correct value.

〈 Assemble the decoded bits into bytes and place on decoded queue 54 〉 ≡

    o[0] = (b[0] << 2) | (b[1] >> 4);
    o[1] = (b[1] << 4) | (b[2] >> 2);
    o[2] = (b[2] << 6) | b[3];
    j = a[2] == '=' ? 1 : (a[3] == '=' ? 2 : 3);
    for (k = 0; k < j; k++) {
        decodedBytes.push_back(o[k]);
    }


This code is used in section 50.
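To make the bit assembly and the handling of equal signs concrete, here is a minimal stand-alone sketch. The functions sixBitValue and decodeGroup are hypothetical and not part of the original program; the sketch assumes the standard base64 alphabet (A to Z, a to z, 0 to 9, '+', '/'), an ASCII-compatible character set, and that a pad character contributes zero bits. Unlike a strict decoder, it does not reject an equal sign in the first two positions of a group.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Map one base64 alphabet character to its six-bit value, or -1 if the
// character is not in the alphabet. Pad characters are handled by the caller.
int sixBitValue(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decode one four-character group a[0..3] and append 1, 2, or 3 bytes
// to decodedBytes. Returns false on a wrong length or invalid character.
bool decodeGroup(const std::string &a, std::vector<std::uint8_t> &decodedBytes) {
    if (a.size() != 4) return false;
    int b[4];
    for (int k = 0; k < 4; k++) {
        b[k] = (a[k] == '=') ? 0 : sixBitValue(a[k]);  // pad counts as zero bits
        if (b[k] < 0) return false;
    }
    std::uint8_t o[3];
    // The casts keep only the low 8 bits of each assembled value.
    o[0] = static_cast<std::uint8_t>((b[0] << 2) | (b[1] >> 4));
    o[1] = static_cast<std::uint8_t>((b[1] << 4) | (b[2] >> 2));
    o[2] = static_cast<std::uint8_t>((b[2] << 6) | b[3]);
    // Trailing '=' signs reduce the number of bytes emitted.
    int j = a[2] == '=' ? 1 : (a[3] == '=' ? 2 : 3);
    assert(j >= 1 && j <= 3);
    for (int k = 0; k < j; k++) {
        decodedBytes.push_back(o[k]);
    }
    return true;
}
```

With this sketch, the group "TWFu" appends the three bytes of "Man", "TWE=" appends the two bytes of "Ma", and "TQ==" appends the single byte "M".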
