12.06.2015 Views

The Annoyance Filter.pdf - Fourmilab


§185 ANNOYANCE-FILTER CLASSIFY MESSAGE

185. The classifyThis method reads the next message from the mail folder and returns the probability that it is junk. If the end of the mail folder is encountered, −1 is returned.

〈 Class implementations 11 〉 +≡
  double classifyMessage::classifyThis(bool createTranscript)
  {
    dictionaryWord dw;
    double junkProb = −1;

    if (createTranscript ∨ (transcriptFilename ≠ "")) {
      mf→setTranscriptList(&messageTranscript);
      if (Annotate('p') ∨ Annotate('d')) {
        saveParserDiagnostics = true;
      }
    }
    〈 Build set of unique tokens in message 187 〉;
    〈 Classify message tokens by probability of significance 188 〉;
    〈 Compute probability message is junk from most significant tokens 189 〉;
    if (tp.getSaveMessage( )) {
      〈 Add classification diagnostics to parser diagnostics queue 190 〉;
      ofstream mdump(pDiagFilename.c_str( ));
      tp.writeMessageQueue(mdump);
      mdump.close( );
    }
    if (createTranscript ∨ (transcriptFilename ≠ "")) {
      〈 Add annotation to message transcript 191 〉;
      if (transcriptFilename ≠ "") {
        mf→writeMessageTranscript(transcriptFilename);
      }
    }
    return junkProb;
  }
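Because −1 doubles as the end-of-folder signal, a caller has to test for it before treating the return value as a probability. The following is a minimal sketch of such a loop, not code from the program itself: Classifier, classifyAll, and replay are hypothetical names introduced for illustration, and the classifier is passed in as a callable so the sketch does not depend on the real classifyMessage class. Any negative value is treated as the sentinel, which is an assumption that avoids an exact floating-point comparison with −1.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for classifyMessage::classifyThis: each call returns
// the junk probability of the next message, or -1 at the end of the folder.
using Classifier = std::function<double()>;

// Collect the junk probability of every message, stopping at the
// end-of-folder sentinel. Valid probabilities lie in [0, 1], so any
// negative value is taken to be the sentinel (an assumption of this sketch).
std::vector<double> classifyAll(const Classifier &classifyNext)
{
    std::vector<double> probabilities;
    for (;;) {
        double p = classifyNext();
        if (p < 0) {
            break;                      // end of the mail folder
        }
        probabilities.push_back(p);
    }
    return probabilities;
}

// Hypothetical test helper: a classifier that replays canned probabilities
// and then returns -1, imitating a folder with a fixed number of messages.
Classifier replay(std::vector<double> canned)
{
    std::size_t next = 0;
    return [canned, next]() mutable {
        return (next < canned.size()) ? canned[next++] : -1.0;
    };
}
```

For example, classifyAll(replay({0.9, 0.1, 0.5})) yields a vector of three probabilities, and an empty folder yields an empty vector.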

186. Just one more thing.... We need to define an absolute value function for floating-point quantities. Make it so.

〈 Class definitions 10 〉 +≡
#ifdef OLDWAY
  double abs(double x)
  {
    return (x < 0) ? (−(x)) : x;
  }
#endif
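The OLDWAY guard means this definition is compiled only on request; the text does not say what replaces it otherwise. One plausible reading, offered here as an assumption, is that the standard library's own function is used instead. The sketch below puts the conditional form next to std::fabs from <cmath>; absoluteValue and agreesWithLibrary are hypothetical names, chosen so that nothing collides with the library's abs.

```cpp
#include <cmath>

// The conditional form from the section above, under a hypothetical name.
double absoluteValue(double x)
{
    return (x < 0) ? -x : x;
}

// True when the hand-written version and std::fabs give equal results for x.
bool agreesWithLibrary(double x)
{
    return absoluteValue(x) == std::fabs(x);
}
```

The two agree on ordinary values. One small difference is that the conditional form returns −0.0 unchanged, since −0.0 < 0 is false, whereas std::fabs clears the sign bit; the results still compare equal.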

187. Read the next message from the mail folder and build the set utokens of unique tokens in the message. Insertion into a set automatically discards duplicates, so a token which appears more than once is stored only once.

〈 Build set of unique tokens in message 187 〉 ≡
  set〈string〉 utokens;

  while (tp.nextToken(dw)) {
    utokens.insert(dw.get( ));
  }

This code is used in section 185.
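The effect of set insertion on repeated tokens can be seen in isolation with a small sketch. It is not the program's tokeniser: splitting on whitespace with std::istringstream is a simplified stand-in for tp.nextToken, and uniqueTokens is a hypothetical name.

```cpp
#include <set>
#include <sstream>
#include <string>

// Split text on whitespace (a simplified stand-in for tp.nextToken) and
// return the set of distinct tokens it contains.
std::set<std::string> uniqueTokens(const std::string &text)
{
    std::set<std::string> utokens;
    std::istringstream in(text);
    std::string token;

    while (in >> token) {
        utokens.insert(token);    // no effect if the token is already present
    }
    return utokens;
}
```

Given the text "buy now buy cheap now", the resulting set holds three tokens: buy, cheap, and now. Since std::set keeps its elements ordered, later passes over utokens visit the tokens in sorted order rather than in order of appearance.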
