17.05.2014 Views

PDFlib TET PDF IFilter 4.0 Manual


Table 2.3 Examples for the fold option

| description and option list | before folding | after folding |
|---|---|---|
| Replace all characters in a Unicode set with a specific character | | |
| Space folding: map all variants of Unicode space characters to U+0020: fold={{[:blank:] U+0020}} | U+00A0 | U+0020 |
| Dashes folding: map all variants of Unicode dash characters to U+002D: fold={{[:Dash:] U+002D}} | U+2011 | U+002D |
| Replace all unassigned characters (i.e. Unicode code points to which no character is assigned) with U+FFFD: fold={{[:Unassigned:] U+FFFD}} | U+03A2 | U+FFFD |
| Special handling for individual characters | | |
| Preserve all hyphen characters at line breaks while keeping the remaining default foldings. Since these characters are identified internally in TET (as opposed to having a fixed Unicode property) the keyword _dehyphenation is used to identify the folding’s domain: fold={{_dehyphenation preserve}} | U+002D | U+002D |
| Preserve Arabic Tatweel characters (which are removed by default): fold={{[U+0640] preserve}} | U+0640 | U+0640 |
| Replace various punctuation characters with their ASCII counterparts: fold={ {[U+2018] U+0027} {[U+2019] U+0027} {[U+201C] U+0022} {[U+201D] U+0022}} | U+201C | U+002D U+0022 |
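The replacement foldings in Table 2.3 can be approximated outside of TET to see what they do to a string. The following is a minimal Python sketch, not TET code: the function names are hypothetical, and the Unicode general categories used here (Zs, Pd, Cn) only approximate the [:blank:], [:Dash:] and [:Unassigned:] sets that TET evaluates.

```python
import unicodedata

# Illustration only: these helpers are hypothetical and are not part of TET.
# They mimic the effect of the replacement foldings listed in Table 2.3.


def fold_spaces(text: str) -> str:
    """Map TAB and all space separators (category Zs) to U+0020.

    Approximates fold={{[:blank:] U+0020}}.
    """
    return "".join(
        " " if ch == "\t" or unicodedata.category(ch) == "Zs" else ch
        for ch in text
    )


def fold_dashes(text: str) -> str:
    """Map dash punctuation (category Pd) to U+002D.

    Approximates fold={{[:Dash:] U+002D}}; the Unicode Dash property
    covers a few characters outside Pd, which this sketch ignores.
    """
    return "".join(
        "-" if unicodedata.category(ch) == "Pd" else ch for ch in text
    )


def fold_unassigned(text: str) -> str:
    """Map unassigned code points (category Cn) to U+FFFD.

    Approximates fold={{[:Unassigned:] U+FFFD}}.
    """
    return "".join(
        "\uFFFD" if unicodedata.category(ch) == "Cn" else ch for ch in text
    )


# Mirrors fold={ {[U+2018] U+0027} {[U+2019] U+0027}
#                {[U+201C] U+0022} {[U+201D] U+0022}}
PUNCTUATION_FOLDING = str.maketrans(
    {"\u2018": "'", "\u2019": "'", "\u201C": '"', "\u201D": '"'}
)


def fold_punctuation(text: str) -> str:
    """Replace typographic quotes with their ASCII counterparts."""
    return text.translate(PUNCTUATION_FOLDING)
```

Each helper takes the "before folding" character of its table row and yields the replacement character named in the corresponding option list. The preserve foldings (_dehyphenation, Tatweel) are not modeled, since they depend on TET's internal defaults.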

2.4 Unicode Postprocessing 29
