12.06.2015 Views

The Annoyance Filter.pdf - Fourmilab

§31 ANNOYANCE-FILTER DICTIONARY

31. Exporting a dictionary to a binary file, or importing one from it, is largely a matter of iterating through the dictionary and delegating the work to each individual word. One detail we must deal with, however, is adding a pseudo-word at the head of the dictionary to record the number of mail and junk messages which contributed the words to the dictionary. These counts are needed to subsequently recompute the probability for each word.

When loading a dictionary with importFromBinaryFile this pseudo-word is recognised and the values it contains are added to the messageCount for each category. Note that importing a file is logically an addition to an existing dictionary: you may import any number of binary dictionary files, just as you can add mail folders with the --mail and --junk options.

#define pseudoCountsWord " COUNTS "

〈 Class implementations 11 〉 +≡

void dictionary::exportToBinaryFile(ostream &os)
{
    if (verbose) {
        cerr << "Exporting dictionary to binary file." << endl;
    }

    // Write the pseudo-word carrying the message counts first
    dictionaryWord pdw;

    pdw.set(pseudoCountsWord, messageCount[dictionaryWord::Mail],
            messageCount[dictionaryWord::Junk], -1);
    pdw.exportToBinaryFile(os);

    // Then let each word in the dictionary write itself
    for (dictionary::iterator p = begin(); p != end(); p++) {
        p->second.exportToBinaryFile(os);
    }
}

void dictionary::importFromBinaryFile(istream &is)
{
    if (verbose) {
        cerr << "Importing dictionary from binary file." << endl;
    }

    dictionaryWord dw;

    // The first record must be the pseudo-word with the message counts
    if (dw.importFromBinaryFile(is)) {
        assert(dw.get() == pseudoCountsWord);
        messageCount[dictionaryWord::Mail] += dw.n_mail();
        messageCount[dictionaryWord::Junk] += dw.n_junk();

        // Every following record is an ordinary word
        while (dw.importFromBinaryFile(is)) {
            include(dw);
        }
    }
}
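
The additive behaviour of importing can be tried out with a small self-contained sketch. Everything in it is a stand-in made for illustration: SketchWord, SketchDictionary, sketchCountsWord, and the line-oriented text format are assumptions, whereas the program itself writes a binary format through dictionaryWord. The sketch also assumes that including a word which is already present merges its counts.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Assumed name for the pseudo-word; spaces keep it distinct from real words.
const char *const sketchCountsWord = " COUNTS ";

// Simplified stand-in for dictionaryWord: one record per line of text.
struct SketchWord {
    std::string text;
    unsigned nMail = 0, nJunk = 0;

    void exportTo(std::ostream &os) const {
        os << nMail << ' ' << nJunk << ' ' << text << '\n';
    }

    bool importFrom(std::istream &is) {
        if (!(is >> nMail >> nJunk)) {
            return false;               // End of input or malformed record
        }
        is.get();                       // Skip the single separator space
        return static_cast<bool>(std::getline(is, text));
    }
};

// Simplified stand-in for dictionary.
struct SketchDictionary {
    std::map<std::string, SketchWord> words;
    unsigned messageCount[2] = {0, 0};  // [0] = mail, [1] = junk

    void exportTo(std::ostream &os) const {
        SketchWord header;              // Pseudo-word goes first
        header.text = sketchCountsWord;
        header.nMail = messageCount[0];
        header.nJunk = messageCount[1];
        header.exportTo(os);
        for (const auto &entry : words) {
            entry.second.exportTo(os);
        }
    }

    void importFrom(std::istream &is) {
        SketchWord w;
        if (w.importFrom(is)) {
            assert(w.text == sketchCountsWord);
            messageCount[0] += w.nMail; // Totals accumulate across imports
            messageCount[1] += w.nJunk;
            while (w.importFrom(is)) {
                SketchWord &slot = words[w.text];
                slot.text = w.text;
                slot.nMail += w.nMail;  // Assumed merge of existing counts
                slot.nJunk += w.nJunk;
            }
        }
    }
};
```

Exporting a SketchDictionary to a std::ostringstream and importing the result twice into an empty one doubles both the message totals and the per-word counts, which is the sense in which an import is an addition rather than a replacement.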
