12.06.2015 Views

The Annoyance Filter.pdf - Fourmilab


§75 ANNOYANCE-FILTER SHIFT-JIS DECODER 67

75. Shift-JIS decoder.

Shift-JIS is used to encode Japanese characters on MS-DOS, Windows, and the Macintosh (which adds four additional one-byte characters, which we support here). The encoding uses code points #21–#7E for ASCII/JIS-Roman single-byte characters, code points #A1–#DF for single-byte half-width katakana, plus two-byte characters introduced by first bytes in the ranges #81–#9F, #E0–#EF, and, for user-defined characters, #F0–#FC. The second byte of a valid two-byte character will always be in one of the ranges #40–#7E and #80–#FC.
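The byte ranges listed above can be sketched as a small classifier. This is only an illustration and is not taken from the program itself: the names `SjisByteKind`, `classifyFirstByte`, and `isValidSecondByte` are hypothetical, and the hexadecimal constants simply restate the ranges given in the text.

```cpp
// Hypothetical helper names, introduced for illustration only;
// they do not appear in the annoyance-filter source.
enum class SjisByteKind {
    SingleByte,          // #21-#7E: ASCII/JIS-Roman
    HalfWidthKatakana,   // #A1-#DF: single-byte half-width katakana
    TwoByteLead,         // #81-#9F, #E0-#EF: first byte of a two-byte character
    UserDefinedLead,     // #F0-#FC: first byte of a user-defined character
    Other                // anything outside the ranges named in the text
};

// Classify a byte found in the first position of a character.
inline SjisByteKind classifyFirstByte(unsigned char b)
{
    if (b >= 0x21 && b <= 0x7E) return SjisByteKind::SingleByte;
    if (b >= 0xA1 && b <= 0xDF) return SjisByteKind::HalfWidthKatakana;
    if ((b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xEF))
        return SjisByteKind::TwoByteLead;
    if (b >= 0xF0 && b <= 0xFC) return SjisByteKind::UserDefinedLead;
    return SjisByteKind::Other;
}

// A valid second byte lies in #40-#7E or #80-#FC.
inline bool isValidSecondByte(unsigned char b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}
```

Note that the first-byte ranges do not overlap, so the order of the tests does not matter; the second-byte ranges exclude only #7F within #40–#FC.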

〈 Class definitions 10 〉 +≡
class Shift_JIS_MBCSdecoder : public MBCSdecoder {
protected:
    string pending;

public:
    Shift_JIS_MBCSdecoder() : pending("") { }

    virtual ~Shift_JIS_MBCSdecoder() { }

    virtual string name(void)
    {
        return "Shift_JIS";
    }

    virtual int getNextDecodedChar(void);    /* Get next decoded byte */
};

76. Decode the next logical character. We return −1 when the end of the encoded line is encountered. An invalid second byte of a two-byte character terminates processing of the line, as it’s likely to be gibberish from then on.

〈 Class implementations 11 〉 +≡
int Shift_JIS_MBCSdecoder::getNextDecodedChar(void)
{
    〈 Check for pending characters and return if so 78 〉;

    int c1 = getNextEncodedByte();

    if (c1 >= 0) {
        〈 Check for Shift-JIS two byte character and assemble as required 77 〉;
        〈 Check for Macintosh-specific single byte characters and translate 79 〉;
    }
    return c1;
}
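The control flow described in section 76 can be sketched as a stand-alone class. This is a rough illustration under stated assumptions, not the program's implementation: the class `SjisLineSketch` and its members are hypothetical, it reads from an in-memory string rather than through `getNextEncodedByte`, it omits the pending-character and Macintosh-specific steps, and it assumes the two bytes are combined as `(c1 << 8) | c2`, which may differ from what section 77 actually does.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-alone sketch; sections 77-79 of the program are not
// reproduced here and may behave differently.
class SjisLineSketch {
    std::string line;        // one encoded line
    std::size_t pos = 0;     // index of the next unread byte
    bool abandoned = false;  // set once an invalid second byte is seen

    // Return the next raw byte as 0..255, or -1 at end of line.
    int nextByte()
    {
        return pos < line.size() ? static_cast<unsigned char>(line[pos++]) : -1;
    }

public:
    explicit SjisLineSketch(const std::string &s) : line(s) { }

    // Return the next logical character, or -1 at end of line.
    int nextChar()
    {
        if (abandoned) return -1;
        int c1 = nextByte();
        if (c1 < 0) return -1;

        // First bytes #81-#9F and #E0-#FC introduce a two-byte character.
        bool lead = (c1 >= 0x81 && c1 <= 0x9F) || (c1 >= 0xE0 && c1 <= 0xFC);
        if (!lead) return c1;

        int c2 = nextByte();
        bool valid = (c2 >= 0x40 && c2 <= 0x7E) || (c2 >= 0x80 && c2 <= 0xFC);
        if (!valid) {
            abandoned = true;   // stop processing the rest of the line
            return -1;
        }
        return (c1 << 8) | c2;  // assumption: pack both bytes into one int
    }
};
```

Once an invalid second byte is seen, every later call returns −1, which mirrors the decision to abandon the remainder of the line rather than try to resynchronise.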
