The Annoyance Filter.pdf - Fourmilab
§256 ANNOYANCE-FILTER DEVELOPMENT LOG 215
COMPRESSED_FILES was not defined. We now unconditionally define isc in the mailFolder class
definition.
With these fixes, the makew32.bat build on Win32 now works once again.
Added a testw32.bat file which runs a rudimentary test of the Win32 build similar to the check
target in Makefile.in. I added this file to both the dist and winarch archive generation targets in
Makefile.in.
Modified Makefile.in to replace the hard-coded /ftp/annoyance-filter destination with a PUBDEST
declaration at the top of the file which defaults to the same directory. This permits overriding the default
publication destination for use at another site or for nondestructive testing of new releases simply by
editing the Makefile. Some day, it might make sense to permit overriding this with an option at
./configure time, but this is not that day.
Release 0.1-RC4.
2002 October 11
Integrated the application string parsers for Flash and PDF formats, which were developed in a
separate stand-alone test program. These include the classes applicationStringParser (mother of all
application parsers), flashStream, flashTextExtractor, and pdfTextExtractor, the latter compiled
in only if all the utilities it needs to decode PDF via a pipe to pdftotext are present. At the moment,
these aren’t hooked up to the mail folder, but are merely exercised by code in the --jig.
Integrated Knuth and Levy’s CWEB version 3.64 in the cweb directory. The CWEAVE and CTANGLE
programs are built with a change file, common-bigger.ch, which increases the input line length limit to
400 characters, as I did in the earlier 3.63 release.
Added plumbing to invoke Flash and PDF parsers for attachments with those application types. Thanks
to the inability to take a class member function as an unqualified function pointer, this is somewhat
tacky, requiring a pointer to the mailFolder to obtain decoded data.
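The C++ restriction mentioned above can be illustrated with a minimal sketch. All names here (Folder, ByteSource, folderByteSource, readAll) are hypothetical and are not taken from the program's source; the sketch only shows why a consumer that accepts a plain function pointer also needs a pointer to the object that supplies the data.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a folder class that hands out decoded bytes.
class Folder {
public:
    explicit Folder(std::string data) : data_(std::move(data)), pos_(0) {}

    // A member function has type int (Folder::*)(), which does not
    // convert to the plain function pointer type int (*)().
    int nextByte() {
        return pos_ < data_.size()
                   ? static_cast<unsigned char>(data_[pos_++])
                   : -1;  // -1 signals end of data
    }

private:
    std::string data_;
    std::string::size_type pos_;
};

// Plain callback type a parser might accept, plus an opaque context.
typedef int (*ByteSource)(void *context);

// Trampoline: an ordinary function that forwards to the member function
// through the object pointer carried in the context argument.
static int folderByteSource(void *context) {
    return static_cast<Folder *>(context)->nextByte();
}

// Hypothetical parser that knows only the callback and its context.
std::string readAll(ByteSource source, void *context) {
    std::string out;
    for (int c; (c = source(context)) != -1; ) {
        out.push_back(static_cast<char>(c));
    }
    return out;
}
```

Calling `readAll(folderByteSource, &folder)` on a `Folder` object drains it one byte at a time. The object pointer has to travel alongside the function pointer, which is the extra coupling the entry describes as tacky.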
2002 October 12
Added decoders and interpreters for Shift-JIS and Unicode (UCS-2, UTF-8, and UTF-16 encodings).
These are used to decode and interpret these character sets in Flash animations whose fonts are so
tagged.
Added logic to invoke the new Unicode UTF-8 decoder when a MIME part’s charset= designates it so
encoded.
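As a rough illustration of what a UTF-8 decoder has to do, here is a minimal sketch. The function name decodeUtf8 and the policy of substituting U+FFFD for malformed input are assumptions made for this example, not the program's actual decoder. For brevity the sketch does not reject overlong encodings or encoded surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into Unicode code points.
// Malformed or truncated sequences yield U+FFFD (an assumed policy).
std::vector<std::uint32_t> decodeUtf8(const std::string &in) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i++]);
        std::uint32_t cp;
        int extra;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else {
            out.push_back(0xFFFD);  // stray continuation or invalid lead byte
            continue;
        }

        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            // Each continuation byte must have the bit pattern 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;
                break;  // leave the offending byte to be re-examined
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i++]) & 0x3F);
        }
        out.push_back(ok ? cp : 0xFFFD);
    }
    return out;
}
```

For instance, the bytes `41 C3 A9 E2 82 AC` decode to the code points U+0041, U+00E9, and U+20AC.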
2002 October 13
In the process of testing UTF-8 decoding of Unicode messages, I stumbled over a bug in ignoring HTML
comments embedded within tokens, a common trick in junk mail to evade naïve filters, for example,
“remove yourself”. (Yes, I know a valid HTML comment is supposed to contain a
space after the initial and before the final sentinel, but junk mail often violates this rule, counting on
sloppy browsers not to enforce the standard, so we must comply in the interest of “seeing what the user
would”.) HTML comments are now completely discarded, even when embedded within tokens.
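The effect of discarding comments inside tokens can be sketched as follows. The function stripHtmlComments is a hypothetical helper written for this illustration and is not the program's tokeniser. It assumes that the sentinels need no adjacent spaces, in line with the lenient behaviour described above, and that an unterminated comment swallows the rest of the input.

```cpp
#include <string>

// Remove every "<!-- ... -->" span, so that text on either side of a
// comment is rejoined into a single token.
std::string stripHtmlComments(const std::string &in) {
    std::string out;
    std::string::size_type pos = 0;
    for (;;) {
        std::string::size_type open = in.find("<!--", pos);
        if (open == std::string::npos) {
            out.append(in, pos, std::string::npos);  // no more comments
            break;
        }
        out.append(in, pos, open - pos);  // keep text before the comment
        std::string::size_type close = in.find("-->", open + 4);
        if (close == std::string::npos) {
            break;  // assumed policy: unterminated comment discards the rest
        }
        pos = close + 3;  // resume just after the closing sentinel
    }
    return out;
}
```

With this helper, an input such as `remo<!--xyz-->ve yourself` (an invented example) becomes `remove yourself`, so a filter scanning the result sees the same words a reader would.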
The dist target in Makefile.in failed to clean the cweb directory before including it in the source
archive, which could leave objects and binaries incompatible with the system on
which the user is installing. I modified the target to descend into the cweb directory and make clean.
This promptly ran into another problem because the CWEB Makefile deletes the C source for CWEAVE,
using the bootstrapped CTANGLE to re-build it. This is clean, but runs afoul of my rebuilding both
programs directly in the outer Makefile. I saved the original CWEB makefile as Makefile.ORIG and