The Annoyance Filter.pdf - Fourmilab
19. Dictionary.
A dictionary is a collection of dictionaryWord objects, organised for rapid look-up. For convenience and
efficiency, we derive dictionary from the STL map container, thereby making all of its core functionality
accessible to the user. It would be more efficient and cleaner to use a set, but objects in a set cannot be
modified; values in a map can.
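The set-versus-map trade-off described above can be illustrated with a minimal sketch. `WordCount`, `bumpInMap`, `bumpInSet`, and `setVersusMapDemo` are hypothetical names introduced only for this illustration; they are not part of the program. The sketch assumes a standard library in which set iterators give read-only access to elements, as is the case in current C++.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for dictionaryWord: a word plus one occurrence count.
struct WordCount {
    std::string text;
    unsigned int count;
    bool operator<(const WordCount &other) const { return text < other.text; }
};

// Map version: the key is const, but the mapped value is mutable,
// so the count can be incremented in place through the iterator.
unsigned int bumpInMap(std::map<std::string, WordCount> &m, const std::string &key)
{
    std::map<std::string, WordCount>::iterator p = m.find(key);
    if (p == m.end()) {
        return 0;                   // word not present
    }
    return ++p->second.count;
}

// Set version: elements are read-only through the iterator, so
// "++p->count" would not compile. The element has to be copied,
// erased, and reinserted instead.
unsigned int bumpInSet(std::set<WordCount> &s, const std::string &key)
{
    WordCount probe = { key, 0 };
    std::set<WordCount>::iterator p = s.find(probe);
    if (p == s.end()) {
        return 0;                   // word not present
    }
    WordCount updated = *p;         // copy the element out
    ++updated.count;
    s.erase(p);                     // remove the read-only original
    s.insert(updated);              // put the modified copy back
    return updated.count;
}

// Small self-check exercising both variants.
inline void setVersusMapDemo()
{
    WordCount w = { "offer", 1 };

    std::map<std::string, WordCount> m;
    m.insert(std::make_pair(w.text, w));
    assert(bumpInMap(m, "offer") == 2);

    std::set<WordCount> s;
    s.insert(w);
    assert(bumpInSet(s, "offer") == 2);
    assert(s.size() == 1);
}
```

The map stores each word twice (once as key, once inside the value), which is the inefficiency the text alludes to; in exchange, an update is a single look-up followed by an in-place increment.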
〈 Class definitions 10 〉 +≡
class dictionary : public map<string, dictionaryWord> {
public:
    unsigned int memoryRequired;

    void add(dictionaryWord w, dictionaryWord::mailCategory category);
    void include(dictionaryWord &w);
    void exportCSV(ostream &os = cout);
    void importCSV(istream &is = cin);
    void computeJunkProbability(unsigned int nMailMessages, unsigned int nJunkMessages,
                                double mailBias = 2, unsigned int minOccurrences = 5);
    void purge(unsigned int occurrences = 0);
    void resetCat(dictionaryWord::mailCategory category);
    void printStatistics(ostream &os = cout) const;
#ifdef HAVE_PLOT_UTILITIES
    void plotProbabilityHistogram(string fileName, unsigned int nBins = 20) const;
#endif
    void exportToBinaryFile(ostream &os);
    void importFromBinaryFile(istream &is);

    unsigned int estimateMemoryRequirement(void) const
    {
        return memoryRequired;
    }

    dictionary() : memoryRequired(0) { }
};
20. The add method looks up a dictionaryWord in the dictionary. If the word is already present, its
number of occurrences in the given category is incremented. Otherwise, the word is added to the dictionary
with the occurrence count for the category initialised to 1.
〈 Class implementations 11 〉 +≡
void dictionary::add(dictionaryWord w, dictionaryWord::mailCategory category)
{
    dictionary::iterator p;

    if ((p = find(w.get())) != end()) {
        p->second.add(category);
    }
    else {
        insert(make_pair(w.get(), w)).first->second.add(category);
        memoryRequired += w.estimateMemoryRequirement();
    }
}
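
The look-up-or-insert behaviour of add can be tried in isolation with the following sketch. `SketchWord` and `SketchDictionary` are hypothetical, simplified stand-ins for dictionaryWord and dictionary. The category names `Mail` and `Junk`, the `count` accessor, and the memory estimate formula are assumptions made for the example and are not taken from the program.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for dictionaryWord.
class SketchWord {
public:
    enum mailCategory { Mail = 0, Junk = 1 };   // assumed category names

    explicit SketchWord(const std::string &s) : text(s)
    {
        occurrences[Mail] = occurrences[Junk] = 0;
    }

    const std::string &get() const { return text; }
    void add(mailCategory category) { ++occurrences[category]; }
    unsigned int count(mailCategory category) const { return occurrences[category]; }

    // Rough size estimate; the formula is an assumption for this sketch.
    unsigned int estimateMemoryRequirement() const
    {
        return static_cast<unsigned int>(sizeof(*this) + text.size());
    }

private:
    std::string text;
    unsigned int occurrences[2];
};

// Hypothetical stand-in for dictionary, derived from std::map as in the text.
class SketchDictionary : public std::map<std::string, SketchWord> {
public:
    unsigned int memoryRequired;

    SketchDictionary() : memoryRequired(0) { }

    void add(const SketchWord &w, SketchWord::mailCategory category)
    {
        iterator p = find(w.get());
        if (p != end()) {
            // Word already present: bump its count for this category.
            p->second.add(category);
        } else {
            // New word: insert a copy, then count its first occurrence.
            insert(std::make_pair(w.get(), w)).first->second.add(category);
            memoryRequired += w.estimateMemoryRequirement();
        }
    }
};

// Small self-check showing the intended behaviour of the sketch.
inline void sketchDictionaryDemo()
{
    SketchDictionary d;
    d.add(SketchWord("offer"), SketchWord::Junk);
    d.add(SketchWord("offer"), SketchWord::Junk);
    d.add(SketchWord("meeting"), SketchWord::Mail);
    assert(d.size() == 2);          // size() is inherited from std::map
    assert(d.find("offer")->second.count(SketchWord::Junk) == 2);
}
```

Because insert returns a pair whose first member is an iterator to the stored element, the new entry's count is incremented on the copy held in the map rather than on the argument w, which is why a freshly added word ends up with a count of 1.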