12.06.2015 Views

The Annoyance Filter.pdf - Fourmilab

156 CLASSIFY MESSAGE ANNOYANCE-FILTER §183

183. Classify message.

The classifyMessage class reads input from a mailFolder and returns the junk probability for successive messages. The input mailFolder may contain only a single message.

〈 Class definitions 10 〉 +≡
class classifyMessage {
public:
    mailFolder *mf;
    tokenParser tp;
    unsigned int nExtremal;
    dictionary *d;
    fastDictionary *fd;
    double unknownWordProbability;

    classifyMessage(mailFolder &m, dictionary &dt, fastDictionary *fdt = NULL,
                    unsigned int nExt = 15, double uwp = 0.2);
    double classifyThis(bool createTranscript = false);

protected:
    void addSignificantWordDiagnostics(list<string> &l, list<string>::iterator where,
                                       multimap<double, string> &rtokens, string endLine = "");
};
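The body of classifyThis is not part of this section, so how nExtremal and unknownWordProbability enter the calculation is not shown here. As a rough illustration only, the sketch below assumes a common naive-Bayes scheme: each token is looked up in a probability table, tokens absent from the table receive unknownWordProbability, and the nExtremal probabilities farthest from 0.5 are multiplied together. The names ProbabilityTable and combineExtremal are hypothetical and do not appear in the program; repeated tokens are counted each time they occur, which the real classifier may or may not do.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the program's dictionary: token -> junk probability.
// Probabilities are assumed to lie strictly between 0 and 1.
typedef std::map<std::string, double> ProbabilityTable;

// Combine the nExtremal token probabilities that lie farthest from 0.5.
// Tokens missing from the table are assigned unknownWordProbability.
double combineExtremal(const std::vector<std::string> &tokens,
                       const ProbabilityTable &table,
                       std::size_t nExtremal = 15,
                       double unknownWordProbability = 0.2)
{
    // Key: distance from the neutral value 0.5; value: the probability itself.
    std::multimap<double, double> ranked;
    for (std::size_t i = 0; i < tokens.size(); i++) {
        ProbabilityTable::const_iterator p = table.find(tokens[i]);
        double prob = (p == table.end()) ? unknownWordProbability : p->second;
        ranked.insert(std::make_pair(std::fabs(prob - 0.5), prob));
    }

    // Walk from the most extreme entry downward, accumulating both products.
    double junk = 1.0, mail = 1.0;
    std::size_t used = 0;
    for (std::multimap<double, double>::const_reverse_iterator r = ranked.rbegin();
         r != ranked.rend() && used < nExtremal; ++r, ++used) {
        junk *= r->second;
        mail *= 1.0 - r->second;
    }

    // With no tokens at all there is no evidence either way.
    return (used == 0) ? 0.5 : junk / (junk + mail);
}
```

Keying a multimap on the distance from 0.5 keeps the entries sorted by significance, so the most extreme ones can be read off from the end; the rtokens parameter of addSignificantWordDiagnostics has the type multimap<double, string>, which suggests a similar ordering, although that is an inference from the signature alone.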

184. The constructor initialises the classifier for the default parsing of ISO-8859 messages.

〈 Global functions 184 〉 ≡
classifyMessage::classifyMessage(mailFolder &m, dictionary &dt, fastDictionary *fdt,
                                 unsigned int nExt, double uwp)
{
    mf = &m;
    tp.setSource(m);
    tp.setTokenDefinition(isoToken, asciiToken);
    tp.setTokenLengthLimits(maxTokenLength, minTokenLength, streamMaxTokenLength,
                            streamMinTokenLength);
    if (pDiagFilename.length() > 0) {
        tp.setSaveMessage(true);
    }
    d = &dt;
    fd = fdt;
    nExtremal = nExt;
    unknownWordProbability = uwp;
}

See also sections 229, 230, 231, and 242.

This code is used in section 254.
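The constructor's default arguments appear only in the class declaration of section 183, while the definition in section 184 repeats the parameter list without them, as C++ requires for an out-of-class definition. The sketch below shows that pattern in isolation. The empty mailFolder, dictionary, and fastDictionary structs are hypothetical stand-ins for the program's real classes, classifierSketch is an illustrative name, and the tokenParser setup is omitted.

```cpp
#include <cstddef>

// Hypothetical empty stand-ins for the program's real classes.
struct mailFolder {};
struct dictionary {};
struct fastDictionary {};

class classifierSketch {
public:
    mailFolder *mf;
    dictionary *d;
    fastDictionary *fd;
    unsigned int nExtremal;
    double unknownWordProbability;

    // Default arguments are written once, here in the declaration.
    classifierSketch(mailFolder &m, dictionary &dt, fastDictionary *fdt = NULL,
                     unsigned int nExt = 15, double uwp = 0.2);
};

// The out-of-class definition repeats the parameters without the defaults.
classifierSketch::classifierSketch(mailFolder &m, dictionary &dt, fastDictionary *fdt,
                                   unsigned int nExt, double uwp)
{
    mf = &m;      // reference parameters are stored as pointers
    d = &dt;
    fd = fdt;     // stays NULL when no fast dictionary is supplied
    nExtremal = nExt;
    unknownWordProbability = uwp;
}
```

Under these assumptions, `classifierSketch c(folder, dict);` leaves fd as NULL with nExtremal at 15 and unknownWordProbability at 0.2, while `classifierSketch c(folder, dict, &fast, 27, 0.4);` overrides all three.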
