12.06.2015 Views

The Annoyance Filter.pdf - Fourmilab


§19 ANNOYANCE-FILTER DICTIONARY

19. Dictionary.

A dictionary is a collection of dictionaryWord objects, organised for rapid look-up. For convenience and efficiency, we derive dictionary from the STL map container, thereby making all of its core functionality accessible to the user. It would be more efficient and cleaner to use a set, but objects in a set cannot be modified; values in a map can.

〈 Class definitions 10 〉 +≡
class dictionary : public map<string, dictionaryWord> {
public:
    unsigned int memoryRequired;

    void add(dictionaryWord w, dictionaryWord::mailCategory category);
    void include(dictionaryWord &w);
    void exportCSV(ostream &os = cout);
    void importCSV(istream &is = cin);
    void computeJunkProbability(unsigned int nMailMessages, unsigned int nJunkMessages,
                                double mailBias = 2, unsigned int minOccurrences = 5);
    void purge(unsigned int occurrences = 0);
    void resetCat(dictionaryWord::mailCategory category);
    void printStatistics(ostream &os = cout) const;
#ifdef HAVE_PLOT_UTILITIES
    void plotProbabilityHistogram(string fileName, unsigned int nBins = 20) const;
#endif
    void exportToBinaryFile(ostream &os);
    void importFromBinaryFile(istream &is);

    unsigned int estimateMemoryRequirement(void) const
    {
        return memoryRequired;
    }

    dictionary()
        : memoryRequired(0) { }
};
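The trade-off between set and map described above can be illustrated with a minimal sketch. The Word struct and the bumpInMap function are hypothetical stand-ins invented for this example, not part of the program; the point is only that elements of a std::set are reached through const iterators, while the mapped value of a std::map may be changed in place.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for dictionaryWord: a word plus a single counter.
struct Word {
    std::string text;
    unsigned int count;
    bool operator<(const Word &other) const { return text < other.text; }
};

// With a set, the element itself is the key, so it is const once inserted.
// The commented-out line below would be rejected by the compiler.
inline void cannotBumpInSet(std::set<Word> &words) {
    words.insert(Word{"offer", 1});
    // words.begin()->count++;   // error: elements of a std::set are const
}

// With a map, only the key is const; the mapped value can be updated in place.
// Returns the updated count, or 0 if the word is not present.
inline unsigned int bumpInMap(std::map<std::string, Word> &words,
                              const std::string &text) {
    std::map<std::string, Word>::iterator p = words.find(text);
    if (p == words.end()) {
        return 0;
    }
    return ++p->second.count;
}
```

The price of the map is that each word's text is stored twice, once as the key and once inside the value, which is presumably the inefficiency the text alludes to.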

20. The add method looks up a dictionaryWord in the dictionary. If the word is already present, its number of occurrences in the given category is incremented. Otherwise, the word is added to the dictionary with the occurrence count for the category initialised to 1.

〈 Class implementations 11 〉 +≡
void dictionary::add(dictionaryWord w, dictionaryWord::mailCategory category)
{
    dictionary::iterator p;

    if ((p = find(w.get())) != end()) {
        p->second.add(category);
    }
    else {
        insert(make_pair(w.get(), w)).first->second.add(category);
        memoryRequired += w.estimateMemoryRequirement();
    }
}
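To see the find-or-insert behaviour of add in isolation, here is a self-contained sketch. SketchWord and SketchDictionary are hypothetical, pared-down stand-ins for dictionaryWord and dictionary; the category names Mail and Junk and the occurrences array are assumptions made for the example, and the memory accounting is omitted.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for dictionaryWord: text plus per-category counts.
struct SketchWord {
    enum Category { Mail = 0, Junk = 1 };   // assumed category names

    std::string text;
    unsigned int occurrences[2];

    explicit SketchWord(const std::string &s) : text(s) {
        occurrences[Mail] = occurrences[Junk] = 0;
    }
    const std::string &get() const { return text; }
    void add(Category c) { occurrences[c]++; }
};

// Hypothetical stand-in for dictionary, derived from std::map as in the text.
class SketchDictionary : public std::map<std::string, SketchWord> {
public:
    void add(const SketchWord &w, SketchWord::Category category) {
        iterator p = find(w.get());
        if (p != end()) {
            // Word already present: bump its count for this category.
            p->second.add(category);
        } else {
            // New word: insert a copy, then count the first occurrence
            // on the stored copy via the iterator that insert returns.
            insert(std::make_pair(w.get(), w)).first->second.add(category);
        }
    }
};
```

Adding the same word twice as Junk and once as Mail should leave a single entry whose Junk count is 2 and whose Mail count is 1. Note that the count is applied to the copy held in the map, not to the argument passed in.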
