The Annoyance Filter.pdf - Fourmilab
§119 ANNOYANCE-FILTER FLASH TEXT EXTRACTOR 97
119. The DefineFontInfo tag is crucial to decoding Flash text strings. Text in Flash files is stored as
glyph indices within a font. The font can, in the general case, be defined by an arbitrary stroked path
outline, independent of any standard character set. For fonts which employ standard character sets, the
optional DefineFontInfo tag identifies the character set and provides the mapping from the glyph indices to
characters in the font's character set. We save these in maps indexed by the font ID so we can look
them up when we encounter text in that font.
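As a rough illustration of the lookup described above, the sketch below translates a run of glyph indices into characters through a per-font code table. The names FontCodeTables and decodeGlyphs are hypothetical and do not appear in the program, and the choice to substitute '?' for unknown fonts, out-of-range glyphs, and codes above 255 is an assumption made only to keep the example short.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the saved maps:
// font ID -> (glyph index -> character code).
typedef std::map<unsigned short, std::vector<unsigned short> > FontCodeTables;

// Translate glyph indices into characters using the code table for fontID.
// Unknown fonts and out-of-range glyph indices yield '?'.
std::string decodeGlyphs(const FontCodeTables &tables, unsigned short fontID,
                         const std::vector<unsigned short> &glyphs)
{
    std::string text;
    FontCodeTables::const_iterator fp = tables.find(fontID);

    for (std::size_t i = 0; i < glyphs.size(); i++) {
        unsigned short code = '?';
        if (fp != tables.end() && glyphs[i] < fp->second.size()) {
            code = fp->second[glyphs[i]];
        }
        // Assumption: this sketch emits only 8-bit codes; wider ones become '?'.
        text += (code < 256) ? static_cast<char>(code) : '?';
    }
    return text;
}
```

With a table for font 7 holding the codes for 'H', 'i', and '!', the glyph sequence 0, 1, 2 would decode to "Hi!". The same sequence looked up under a font ID with no table would decode to "???".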
〈 Parse Flash DefineFontInfo tag 119 〉 ≡
  {
#ifdef FLASH_PARSE_DEBUG
    cout << "DefineFontInfo" << endl;
#endif
    unsigned short fontID = get16();
    unsigned int fontNameLen = get8();
    string fontName;

    getString(fontName, fontNameLen);
    if (!textOnly) {
      strings.push(fontName);
    }
    fontFlags fFlags = static_cast<fontFlags>(get8());

    // The glyph count was recorded when the font's DefineFont tag was parsed.
    map<unsigned short, unsigned short>::iterator fp = fontGlyphCount.find(fontID);
    if (fp == fontGlyphCount.end()) {
      // Without a glyph count the code table cannot be read.
      if (verbose) {
        cerr << "DefineFontInfo for font ID " << fontID <<
            " without previous DefineFont." << endl;
      }
      ignoreTag(4);
    }
    else {
      unsigned nGlyphs = fp->second;
      vector<unsigned short> *v = new vector<unsigned short>(nGlyphs);

      fontMap.insert(make_pair(fontID, v));
      fontInfoBits.insert(make_pair(fontID, fFlags));
      // Read one character code per glyph: 16 bits if wide codes are flagged, else 8 bits.
      for (unsigned int g = 0; g < nGlyphs; g++) {
        if (fFlags & fontWideCodes) {
          (*v)[g] = get16();
        }
        else {
          (*v)[g] = get8();
        }
      }
    }
  }
This code is used in section 115.
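
The same steps can be tried in isolation with the minimal sketch below, which parses a DefineFontInfo tag body from a byte buffer. It is not the program's own reader: ByteReader, FontInfo, parseFontInfo, and kWideCodes are hypothetical names. The sketch assumes little-endian 16-bit fields and a wide-codes flag in the low bit of the flags byte, which is how the SWF format is generally documented, and it takes the glyph count as a parameter in place of the fontGlyphCount lookup.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Assumed flag value: wide-codes flag in the low bit of the flags byte.
const unsigned char kWideCodes = 0x01;

struct FontInfo {
    unsigned short fontID;
    std::string fontName;
    unsigned char flags;
    std::vector<unsigned short> codes;   // glyph index -> character code
};

// Bounds-checked little-endian reader over a byte buffer (hypothetical helper).
class ByteReader {
public:
    explicit ByteReader(const std::vector<unsigned char> &b) : buf(b), pos(0) {}

    unsigned int get8() {
        if (pos >= buf.size()) {
            throw std::out_of_range("truncated DefineFontInfo");
        }
        return buf[pos++];
    }

    unsigned int get16() {
        unsigned int lo = get8();        // low byte comes first
        return lo | (get8() << 8);
    }

private:
    const std::vector<unsigned char> &buf;
    std::size_t pos;
};

// Parse a tag body; nGlyphs must come from the font's earlier DefineFont tag.
FontInfo parseFontInfo(const std::vector<unsigned char> &body, unsigned int nGlyphs)
{
    ByteReader in(body);
    FontInfo info;

    info.fontID = static_cast<unsigned short>(in.get16());
    unsigned int nameLen = in.get8();
    for (unsigned int i = 0; i < nameLen; i++) {
        info.fontName += static_cast<char>(in.get8());
    }
    info.flags = static_cast<unsigned char>(in.get8());

    info.codes.reserve(nGlyphs);
    for (unsigned int g = 0; g < nGlyphs; g++) {
        // One code per glyph: 16 bits when wide codes are flagged, else 8 bits.
        info.codes.push_back(static_cast<unsigned short>(
            (info.flags & kWideCodes) ? in.get16() : in.get8()));
    }
    return info;
}
```

Returning the table by value avoids the manual ownership that a `new vector` stored in a map implies. This is a choice made for the sketch and says nothing about how the program manages fontMap elsewhere.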