12.06.2015 Views

The Annoyance Filter.pdf - Fourmilab


173. Token parser.

A tokenParser reads lines from a mailFolder and returns tokens as defined by its active tokenDefinition. Separate tokenDefinitions can be supplied for parsing regular text and binary byte streams, respectively. A tokenParser can also save the lines parsed from a message in a messageQueue, permitting subsequent analysis. Note that what is saved is “what the parser saw”, that is, the text after MIME decoding or elision of ignored parts.
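As a rough illustration of the behaviour described above, the following stand-alone sketch reads lines from a stream, returns tokens whose length falls within the configured limits, and optionally keeps each line it has read in a queue. It is not the program's own class: SimpleTokenDefinition, SimpleTokenParser, setSaveMessage, nextToken and tokenize are hypothetical names, a std::istream stands in for the mailFolder, a token is assumed to be a run of alphanumeric characters, and MIME decoding, HTML handling, byte stream parsing and phrase assembly are all omitted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for tokenDefinition: a token is a run of
// alphanumeric characters whose length lies within [minLen, maxLen].
struct SimpleTokenDefinition {
    std::size_t minLen = 1, maxLen = 64;

    void setLengthLimits(std::size_t lMin, std::size_t lMax) {
        assert(lMin <= lMax);
        minLen = lMin;
        maxLen = lMax;
    }
    bool isTokenChar(char c) const {
        return std::isalnum(static_cast<unsigned char>(c)) != 0;
    }
    bool lengthOK(const std::string &t) const {
        return t.size() >= minLen && t.size() <= maxLen;
    }
};

// Hypothetical stand-in for tokenParser, reading from any std::istream.
class SimpleTokenParser {
    std::istream *source = nullptr;
    SimpleTokenDefinition *td = nullptr;
    std::string cl;                     // line currently being scanned
    std::string::size_type clp = 0;     // scan position within cl
    bool saveMessage = false;

public:
    std::list<std::string> messageQueue;    // lines exactly as the parser read them

    void setSource(std::istream &is) {
        source = &is;
        cl.clear();
        clp = 0;
        saveMessage = false;
        messageQueue.clear();
    }
    void setTokenDefinition(SimpleTokenDefinition &t) { td = &t; }
    void setSaveMessage(bool s) { saveMessage = s; }

    // Store the next acceptable token in `token`; return false at end of input.
    bool nextToken(std::string &token) {
        assert(source != nullptr && td != nullptr);
        for (;;) {
            // Skip separator characters in the current line.
            while (clp < cl.size() && !td->isTokenChar(cl[clp])) ++clp;
            if (clp >= cl.size()) {
                // Current line exhausted: fetch the next one, if any.
                if (!std::getline(*source, cl)) return false;
                clp = 0;
                if (saveMessage) messageQueue.push_back(cl);
                continue;
            }
            // Collect one run of token characters.
            std::string::size_type start = clp;
            while (clp < cl.size() && td->isTokenChar(cl[clp])) ++clp;
            token = cl.substr(start, clp - start);
            if (td->lengthOK(token)) return true;   // otherwise discard and keep scanning
        }
    }
};

// Convenience wrapper for demonstration: tokenize a string, optionally
// returning the saved lines through `saved`.
std::vector<std::string> tokenize(const std::string &text,
                                  std::size_t lMin, std::size_t lMax,
                                  std::list<std::string> *saved = nullptr) {
    std::istringstream is(text);
    SimpleTokenDefinition def;
    def.setLengthLimits(lMin, lMax);
    SimpleTokenParser p;
    p.setSource(is);
    p.setTokenDefinition(def);
    p.setSaveMessage(saved != nullptr);
    std::vector<std::string> out;
    std::string t;
    while (p.nextToken(t)) out.push_back(t);
    if (saved != nullptr) *saved = p.messageQueue;
    return out;
}
```

With limits of 2 and 4 characters, calling tokenize on the two lines "Buy now!!" and "A free offer" yields the tokens Buy, now and free, and the saved queue holds both input lines unchanged. As in setSource in the class below, the sketch resets the save flag whenever a new source is attached, so saving has to be requested again for each source.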

〈Class definitions 10〉 +≡
class tokenParser {
protected:
    mailFolder *source;
    string cl;
    string::size_type clp;
    bool atEnd, inHTML, inHTMLcomment;
    tokenDefinition *td;            /* Token definition for text mode */
    tokenDefinition *btd;           /* Token definition for byte stream parsing */
    bool saveMessage;               /* Save current message in messageQueue? */
    bool assemblePhrases;           /* Are we assembling phrases? */
    deque<string> phraseQueue;      /* Phrase assembly queue */
    deque<string> pendingPhrases;   /* Queue of phrases awaiting return */

public:
    list<string> messageQueue;      /* Current message */

    tokenParser()
    {
        td = NULL;
        btd = NULL;     /* Also clear btd, so the assert in setTokenLengthLimits is meaningful */
    }

    void setSource(mailFolder &mf)
    {
        source = &mf;
        cl = "";
        clp = 0;
        atEnd = inHTML = inHTMLcomment = false;
        saveMessage = false;
        messageQueue.clear();
        phraseQueue.clear();
        pendingPhrases.clear();
        〈Check phrase assembly parameters and activate if required 179〉;
    }

    void setTokenDefinition(tokenDefinition &t, tokenDefinition &bt)
    {
        td = &t;
        btd = &bt;
    }

    void setTokenLengthLimits(unsigned int lMax, unsigned int lMin = 1,
                              unsigned int blMax = 1, unsigned int blMin = 1)
    {
        assert(td != NULL);
        td->setLengthLimits(lMin, lMax);
        assert(btd != NULL);
        btd->setLengthLimits(blMin, blMax);
    }
