The Annoyance Filter.pdf - Fourmilab
183. Classify message.
The classifyMessage class reads input from a mailFolder and returns the junk probability for
successive messages. The input mailFolder may consist of just a single message.
〈 Class definitions 10 〉 +≡
class classifyMessage {
public:
    mailFolder *mf;
    tokenParser tp;
    unsigned int nExtremal;
    dictionary *d;
    fastDictionary *fd;
    double unknownWordProbability;

    classifyMessage(mailFolder &m, dictionary &dt, fastDictionary *fdt = NULL,
                    unsigned int nExt = 15, double uwp = 0.2);
    double classifyThis(bool createTranscript = false);
protected:
    void addSignificantWordDiagnostics(list<string> &l, list<string>::iterator where,
                                       multimap<double, string> &rtokens, string endLine = "");
};
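The body of classifyThis is not part of this section, so the following is only a minimal sketch of how nExtremal and unknownWordProbability could enter a junk probability calculation. It assumes the usual naive Bayesian combination over the tokens whose probabilities lie furthest from the neutral value 0.5; that formula, the function name junkProbability, and the TokenProbabilities table are all assumptions made for illustration and are not taken from the program.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a dictionary lookup: token -> probability
// that a message containing the token is junk.
using TokenProbabilities = std::map<std::string, double>;

// Combine the nExt most extremal token probabilities into one value.
// Tokens absent from the table receive uwp, playing the role of the
// unknownWordProbability member. ASSUMPTION: the combining rule is the
// common naive Bayesian product; it is not taken from the source.
double junkProbability(const std::vector<std::string> &tokens,
                       const TokenProbabilities &table,
                       unsigned int nExt = 15, double uwp = 0.2)
{
    assert(nExt > 0);

    // Look up each distinct token once.
    std::map<std::string, double> seen;
    for (const std::string &t : tokens) {
        auto it = table.find(t);
        seen[t] = (it != table.end()) ? it->second : uwp;
    }

    // Rank by distance from 0.5, most extremal first.
    std::multimap<double, double, std::greater<double>> ranked;
    for (const auto &entry : seen) {
        ranked.emplace(std::fabs(entry.second - 0.5), entry.second);
    }

    // Multiply the junk and non-junk evidence of the top nExt tokens.
    double junk = 1.0, legit = 1.0;
    unsigned int used = 0;
    for (auto it = ranked.begin(); it != ranked.end() && used < nExt;
         ++it, ++used) {
        junk *= it->second;
        legit *= 1.0 - it->second;
    }

    // No usable evidence either way: report the neutral value.
    if (junk + legit == 0.0) {
        return 0.5;
    }
    return junk / (junk + legit);
}
```

Under these assumptions a message with no tokens scores the neutral 0.5, and a message whose only token is absent from the table scores exactly uwp. Keying a multimap by a double, as the rtokens parameter of addSignificantWordDiagnostics also does, is a convenient way to keep tokens ordered by significance without a separate sort.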
184. The constructor initialises the classifier for the default parsing of ISO-8859 messages.
〈 Global functions 184 〉 ≡
classifyMessage::classifyMessage(mailFolder &m, dictionary &dt, fastDictionary *fdt,
                                 unsigned int nExt, double uwp)
{
    mf = &m;
    tp.setSource(m);
    tp.setTokenDefinition(isoToken, asciiToken);
    tp.setTokenLengthLimits(maxTokenLength, minTokenLength, streamMaxTokenLength,
                            streamMinTokenLength);
    if (pDiagFilename.length() > 0) {
        tp.setSaveMessage(true);
    }
    d = &dt;
    fd = fdt;
    nExtremal = nExt;
    unknownWordProbability = uwp;
}
See also sections 229, 230, 231, and 242.<br />
This code is used in section 254.
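The constructor takes the folder and dictionary by reference but stores their addresses, while the fast dictionary is an optional pointer that defaults to null. The sketch below mirrors only that parameter handling. The types mailFolderStub, dictionaryStub, fastDictionaryStub and classifierSketch are hypothetical stand-ins introduced here; the token parser set-up and the pDiagFilename test are deliberately left out.

```cpp
// Hypothetical empty stand-ins; the real classes are defined elsewhere
// in the program and carry actual state.
struct mailFolderStub {};
struct dictionaryStub {};
struct fastDictionaryStub {};

// Reduced mirror of the constructor's parameter handling only.
class classifierSketch {
public:
    mailFolderStub *mf;
    dictionaryStub *d;
    fastDictionaryStub *fd;
    unsigned int nExtremal;
    double unknownWordProbability;

    // Required collaborators arrive by reference and are kept as
    // pointers; the optional fast dictionary may be left null.
    classifierSketch(mailFolderStub &m, dictionaryStub &dt,
                     fastDictionaryStub *fdt = nullptr,
                     unsigned int nExt = 15, double uwp = 0.2)
        : mf(&m), d(&dt), fd(fdt), nExtremal(nExt),
          unknownWordProbability(uwp) {}
};
```

A caller that is content with the defaults would write something like `classifierSketch c(folder, dict);`, and one that wants different thresholds passes all five arguments. Because only addresses are stored, the folder and dictionary objects have to outlive the classifier that refers to them.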