The Annoyance Filter.pdf - Fourmilab
80 APPLICATION STRING PARSERS ANNOYANCE-FILTER §98<br />
98. Application string parsers.<br />
An application string parser reads files in application-defined formats (for example, word processor<br />
documents, spreadsheets, page description languages, etc.) and returns strings included in the file.<br />
Unlike tokenParser in “byte stream” mode, there is nothing heuristic in the operation of an application<br />
string parser—it must understand the structure of the application data file in order to identify and<br />
extract strings within it.<br />
The applicationStringParser class is the virtual parent of all specific application string parsers. It<br />
provides common services to derived classes and defines the external interface. When initialising an<br />
applicationStringParser, the caller must supply a pointer to the mailFolder from which it will be<br />
invoked, through which the folder’s nextByte method will be called to return decoded binary bytes of<br />
the application file. It would be much cleaner if we could simply supply an arbitrary function which<br />
returned the next byte of the stream we’re decoding, but that runs afoul of C++’s rules for taking the<br />
address of class members. Consequently, we’re forced to make applicationStringParser co-operate with<br />
mailFolder to obtain decoded bytes.<br />
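The restriction mentioned above can be illustrated with a minimal sketch. ByteSource, ByteReader, callThroughMember and the convention of returning −1 at end of data are hypothetical names and assumptions made for this illustration; only the idea of a nextByte method comes from the text. The address of a member function is not an ordinary function pointer, so it cannot be passed where a plain "next byte" function is expected, and the reader must instead hold a pointer to the object that supplies the bytes.<br />

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for mailFolder: serves bytes from a string and
// returns -1 once the data are exhausted (an assumed convention).
class ByteSource {
    std::string data;
    std::size_t pos;
public:
    explicit ByteSource(const std::string &d) : data(d), pos(0) {}
    int nextByte() {
        return (pos < data.size()) ? static_cast<unsigned char>(data[pos++]) : -1;
    }
};

// A plain function pointer carries no object with it...
typedef int (*PlainByteFn)();
// ...whereas a pointer to a member function is a distinct type that can
// only be called together with an object.
typedef int (ByteSource::*MemberByteFn)();

// PlainByteFn bad = &ByteSource::nextByte;  // does not compile: incompatible types

// Calling through a pointer to member requires the object as well.
int callThroughMember(ByteSource &src, MemberByteFn fn) {
    return (src.*fn)();
}

// The co-operating alternative: the reader keeps a pointer to the source
// and calls its nextByte method directly.
class ByteReader {
    ByteSource *src;
public:
    explicit ByteReader(ByteSource *s) : src(s) { assert(s != NULL); }
    int next() { return src->nextByte(); }
};
```

The second form is the shape the class below adopts: it stores the folder pointer once and asks that object for each byte.<br />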
〈 Class definitions 10 〉 +≡<br />
class applicationStringParser {<br />
protected:<br />
bool error , eof ; /∗ Error and end of file indicators ∗/<br />
mailFolder ∗mf ;<br />
virtual unsigned char get8 (void);<br />
virtual void get8n (unsigned char ∗buf , const int n)<br />
{ /∗ Store next n bytes into buf ∗/<br />
for (int i = 0; (¬eof ) ∧ (i < n); i++) {<br />
buf [i] = get8 ( );<br />
}<br />
}<br />
public:<br />
applicationStringParser(mailFolder ∗f = Λ) : error (false ) , eof (false ), mf (Λ)<br />
{<br />
setMailFolder (f);<br />
}<br />
virtual ∼applicationStringParser( )<br />
{ }<br />
virtual string name (void) const = 0;<br />
void setMailFolder (mailFolder ∗f)<br />
{<br />
mf = f;<br />
}<br />
virtual bool nextString (string &s) = 0;<br />
virtual void close (void)<br />
{<br />
error = eof = false ;<br />
}<br />
bool isError (void) const<br />
{<br />
return error ;<br />
}<br />
bool isEOF (void) const<br />
{<br />
return eof ;<br />
}<br />
bool isOK (void) const<br />
{<br />
return (¬isEOF ( )) ∧ (¬isError ( ));<br />
}<br />
} ;
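<br />
As a rough sketch of how a derived parser might use this interface, the example below defines a toy format in which each string is stored as a one-byte length followed by that many bytes. Everything here is an assumption made for illustration: the stub mailFolder, the body of get8 (which this section declares but does not define), the −1 end-of-data convention, and the lengthPrefixedStringParser class itself. The base class is restated in abridged form, in plain C++ notation, only so that the example is self-contained.<br />

```cpp
#include <cassert>
#include <cstddef>
#include <string>
using std::string;

// Hypothetical stub for mailFolder: serves bytes from a string and
// returns -1 when exhausted (assumed convention, not from the source).
class mailFolder {
    string data;
    std::size_t pos;
public:
    explicit mailFolder(const string &d) : data(d), pos(0) {}
    int nextByte() {
        return (pos < data.size()) ? static_cast<unsigned char>(data[pos++]) : -1;
    }
};

// Abridged restatement of the base class, with an ASSUMED get8 body.
class applicationStringParser {
protected:
    bool error, eof;
    mailFolder *mf;
    virtual unsigned char get8() {
        assert(mf != NULL);
        int c = mf->nextByte();
        if (c < 0) {            // assumed: a negative value marks end of data
            eof = true;
            return 0;
        }
        return static_cast<unsigned char>(c);
    }
public:
    applicationStringParser(mailFolder *f = NULL) : error(false), eof(false), mf(f) {}
    virtual ~applicationStringParser() {}
    virtual string name() const = 0;
    virtual bool nextString(string &s) = 0;
    bool isError() const { return error; }
    bool isEOF() const { return eof; }
    bool isOK() const { return !isEOF() && !isError(); }
};

// Hypothetical toy format: a one-byte length, then that many bytes.
class lengthPrefixedStringParser : public applicationStringParser {
public:
    lengthPrefixedStringParser(mailFolder *f = NULL) : applicationStringParser(f) {}
    virtual string name() const { return "LengthPrefixed"; }
    virtual bool nextString(string &s) {
        s.clear();
        unsigned char len = get8();
        if (!isOK()) {
            return false;       // clean end of file before a new string
        }
        for (unsigned int i = 0; i < len; i++) {
            unsigned char c = get8();
            if (eof) {          // data ended inside a string: structural error
                error = true;
                return false;
            }
            s += static_cast<char>(c);
        }
        return true;
    }
};
```

A caller would typically loop on nextString until it returns false and then consult isError to tell a clean end of file from a truncated one. Because the parser knows the structure of its format, it can report truncation as an error rather than guessing, which is the non-heuristic behaviour described at the start of this section.<br />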