12.06.2015 Views

The Annoyance Filter.pdf - Fourmilab


68 SHIFT-JIS DECODER ANNOYANCE-FILTER §77

77. We test whether the first byte we've read lies in one of the ranges that denote a two-byte character. If so, we read the second byte of the character, validate that it is within the ranges permitted for second bytes, and assemble the 16-bit character from the two bytes.

〈 Check for Shift-JIS two byte character and assemble as required 77 〉 ≡
  if (((c1 ≥ #81) ∧ (c1 ≤ #9F)) ∨ ((c1 ≥ #E0) ∧ (c1 ≤ #EF)) ∨ ((c1 ≥ #F0) ∧ (c1 ≤ #FC))) {
    int c2 = getNextEncodedByte( );
    if (c2 ≡ −1) {
      ostringstream os;
      os ≪ name( ) ≪ "_MBCSdecoder:␣Premature␣end␣of␣line␣in␣two␣byte␣character.";
      reportDecoderDiagnostic(os);
      return −1;
    }
    if (¬(((c2 ≥ #40) ∧ (c2 ≤ #7E)) ∨ ((c2 ≥ #80) ∧ (c2 ≤ #FC)))) {
      ostringstream os;
      os ≪ name( ) ≪ "_MBCSdecoder:␣Invalid␣second␣byte␣in␣two␣byte␣character:␣" "0x" ≪
          setiosflags(ios::uppercase) ≪ hex ≪ c1 ≪ "␣" ≪ "0x" ≪ c2 ≪ ".";
      reportDecoderDiagnostic(os);
      return −1;
    }
    return (c1 ≪ 8) | c2;
  }

This code is used in section 76.
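The range tests in section 77 can be tried in isolation with a small stand-alone sketch. It is an illustration only, not the program's decoder class: `ByteSource`, `isLeadByte`, `isTrailByte`, and `decodeOne` are hypothetical names introduced here, and diagnostics are omitted, so both a truncated and an invalid two-byte character simply yield −1, as does end of input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the decoder's input: a cursor over a byte string.
struct ByteSource {
    std::string bytes;
    std::size_t pos;

    ByteSource(const std::string &b) : bytes(b), pos(0) {}

    // Returns the next byte as 0..255, or -1 when the input is exhausted.
    int next() {
        if (pos >= bytes.size()) return -1;
        return static_cast<unsigned char>(bytes[pos++]);
    }
};

// First-byte ranges, written as in section 77.
bool isLeadByte(int c1) {
    return ((c1 >= 0x81) && (c1 <= 0x9F)) ||
           ((c1 >= 0xE0) && (c1 <= 0xEF)) ||
           ((c1 >= 0xF0) && (c1 <= 0xFC));
}

// Second-byte ranges, written as in section 77.
bool isTrailByte(int c2) {
    return ((c2 >= 0x40) && (c2 <= 0x7E)) ||
           ((c2 >= 0x80) && (c2 <= 0xFC));
}

// Decodes one character. Returns the byte itself for a single-byte
// character, (c1 << 8) | c2 for a valid two-byte character, and -1 at
// end of input or when the second byte is missing or out of range.
int decodeOne(ByteSource &src) {
    int c1 = src.next();
    if (c1 < 0 || !isLeadByte(c1)) return c1;
    int c2 = src.next();
    if (c2 < 0 || !isTrailByte(c2)) return -1;
    return (c1 << 8) | c2;
}
```

For example, the byte pair 0x82 0xA0 assembles to 0x82A0, while 0x82 followed by 0x20 is rejected because 0x20 lies outside both second-byte ranges.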

78. To permit expansion of Macintosh-specific characters into multiple-character replacements, we can store the balance of a multiple-character sequence in the pending string. If it holds any characters, we return them before obtaining another character from the input stream.

〈 Check for pending characters and return if so 78 〉 ≡
  if (¬pending.empty( )) {
    int pc = pending[0];
    pending = pending.substr(1);
    return pc;
  }

This code is used in section 76.
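How the pending string interacts with ordinary input can be sketched as follows. This is a minimal illustration under stated assumptions: the class `ExpandingDecoder` and its members other than `pending` are hypothetical, and the mapping of byte 0xC9 to three periods is chosen purely for demonstration rather than taken from the program's tables. Bytes are cast through `unsigned char` so that the sketch never returns a negative value for a real character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical decoder that may expand one input byte into several
// output characters, queueing the surplus in `pending`.
class ExpandingDecoder {
public:
    explicit ExpandingDecoder(const std::string &input)
        : input_(input), pos_(0) {}

    // Returns the next output character, or -1 at end of input.
    int next() {
        // Drain queued replacement characters first, as in section 78.
        if (!pending.empty()) {
            int pc = static_cast<unsigned char>(pending[0]);
            pending = pending.substr(1);
            return pc;
        }
        if (pos_ >= input_.size()) return -1;
        int c = static_cast<unsigned char>(input_[pos_++]);
        // Illustrative expansion only (an assumption, not the real table):
        // byte 0xC9 becomes "...", returning the first period now and
        // queueing the remaining two.
        if (c == 0xC9) {
            pending = "..";
            return '.';
        }
        return c;
    }

    std::string pending;  // balance of a multiple-character replacement

private:
    std::string input_;
    std::size_t pos_;
};
```

Checking `pending` before touching the input keeps the replacement characters in order and means the caller never needs to know that an expansion took place.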
