The Annoyance Filter.pdf - Fourmilab
§191 ANNOYANCE-FILTER CLASSIFY MESSAGE 161
191. If we’re producing a message transcript, then just before writing it we add annotations to the end of
the header which give the junk probability and the classification of the message based on the threshold
settings. After these, any other annotations requested by the --annotate option are appended.
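
As a standalone illustration of the two annotations described above, here is a minimal sketch. The helper names classify and probabilityLine, the prefix argument, and the "X-Example" value used in the usage note are assumptions made for this example; they are not taken from the program itself.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: map a junk probability onto one of three classes.
// Anything between the two thresholds is left undecided.
std::string classify(double junkProb, double junkThreshold, double mailThreshold) {
    if (junkProb >= junkThreshold) {
        return "Junk";
    }
    if (junkProb <= mailThreshold) {
        return "Mail";
    }
    return "Indeterminate";
}

// Hypothetical helper: build the probability annotation line.
// With the default float format and a precision of 3, values below about
// 1e-4 would be written in scientific notation (for example 1.23e-05), so
// very small probabilities are clamped to zero to keep the field easy to parse.
std::string probabilityLine(const std::string &prefix, double junkProb,
                            const std::string &lineEnd) {
    double jp = junkProb;
    if (jp < 0.001) {
        jp = 0;
    }
    std::ostringstream os;
    os << prefix << "-Junk-Probability: " << std::setprecision(3) << jp << lineEnd;
    return os.str();
}
```

With an assumed prefix of "X-Example", probabilityLine("X-Example", 0.98765, "") yields "X-Example-Junk-Probability: 0.988", and a probability of 0.00001234 is written as 0 rather than in exponent form.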
The test for the end of the message header, where we insert the annotations, is a little curious. When
we’re processing a message received from a POP3Proxy server, the transcript will contain the CR from
the CR/LF termination sequences required by POP3. (The final line feed will have been stripped by
getline as the message was read.) Preserving these terminators allows us to use the standard mechanisms
of mailFolder without lots of special flags, so we deem a line the end of the header if it is either zero
length (read from a UNIX mail folder with getline) or contains a single CR (received from a POP3
server). In the latter case, we set transEndl so as to terminate the annotations we add to the transcript with
CR/LF as well.
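
The end-of-header test described above can be tried in isolation with a sketch like the following. The function names findHeaderEnd and addAnnotations, and the sample header lines in the usage note, are hypothetical and exist only for this example. The sketch relies on the standard behaviour of std::list::insert, which inserts before the given position and leaves that iterator valid, so repeated inserts at the same position appear in the order they were made.

```cpp
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: return an iterator to the line that ends the header.
// A zero-length line is the separator in a UNIX mail folder; a line holding a
// single CR is the separator in a message that kept its POP3 CR/LF endings.
// lineEnd is set to "\r" in the latter case so that added lines match.
// If no separator is found, the end iterator is returned.
std::list<std::string>::iterator findHeaderEnd(std::list<std::string> &lines,
                                               std::string &lineEnd) {
    lineEnd = "";
    std::list<std::string>::iterator p = lines.begin();
    for (; p != lines.end(); ++p) {
        if (p->empty()) {
            break;
        }
        if (*p == "\r") {
            lineEnd = "\r";
            break;
        }
    }
    return p;
}

// Hypothetical helper: insert each annotation just before the separator line,
// or at the end of the message if there is no separator.
void addAnnotations(std::list<std::string> &lines,
                    const std::vector<std::string> &annotations) {
    std::string lineEnd;
    std::list<std::string>::iterator p = findHeaderEnd(lines, lineEnd);
    for (const std::string &a : annotations) {
        lines.insert(p, a + lineEnd);
    }
}
```

For a transcript holding "From: a\r", "Subject: b\r", "\r" and "body\r", adding two annotations places them, each with a trailing CR, between the Subject line and the lone-CR separator.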
〈 Add annotation to message transcript 191 〉 ≡
  ostringstream os;
  list〈string〉::iterator p;
  string transEndl = "";
  /∗ Find the end of the header in the message. If this fails, simply append
     the annotations to the end of the message. ∗/
  for (p = messageTranscript.begin( ); p ≠ messageTranscript.end( ); p++) {
    if (p→length( ) ≡ 0) {
      break;
    }
    if (∗p ≡ "\r") {
      transEndl = "\r";
      break;
    }
  }
  double jp = junkProb;
  /∗ If the probability is so small that it would be edited in scientific
     notation, force it to zero so it’s easier to parse. ∗/
  if (jp < 0.001) {
    jp = 0;
  }
  os ≪ Xfile ≪ "-Junk-Probability:␣" ≪ setprecision(3) ≪ jp ≪ transEndl;
  messageTranscript.insert(p, os.str( ));
  os.str("");
  os ≪ Xfile ≪ "-Classification:␣";
  if (junkProb ≥ junkThreshold) {
    os ≪ "Junk";
  }
  else if (junkProb ≤ mailThreshold) {
    os ≪ "Mail";
  }
  else {
    os ≪ "Indeterminate";
  }
  os ≪ transEndl;
  messageTranscript.insert(p, os.str( ));
  if (Annotate('w')) {
    addSignificantWordDiagnostics(messageTranscript, p, rtokens, transEndl);
  }
  if (Annotate('p') ∨ Annotate('d')) {
    while (¬parserDiagnostics.empty( )) {
      ostringstream os;