PDFlib 8 Windows COM/.NET Tutorial
6.4.3 Complex Script Shaping

The shaping process selects appropriate glyph forms depending on whether a character
is located at the start, middle, or end of a word, or in a standalone position. Shaping is a
crucial component of Arabic and Hindi text formatting. Shaping may also replace a sequence
of two or more characters with a suitable ligature. Since the shaping process determines
the appropriate character forms automatically, explicit ligatures and Unicode
presentation forms (e.g. Arabic Presentation Forms-A, starting at U+FB50) must not be used as input
characters.
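As an illustration of the rule above, the following minimal Python sketch checks input text for presentation forms before it is handed to a shaping engine. It is not part of PDFlib; the function names and the choice of blocks are assumptions made for this example. The NFKC normalization used for cleanup also rewrites other compatibility characters (e.g. fullwidth letters), so it may change more than the presentation forms.

```python
import unicodedata

# Unicode ranges that hold precomposed ligatures and positional forms.
# The selection of ranges is an assumption made for this sketch.
PRESENTATION_BLOCKS = (
    (0xFB00, 0xFB4F),  # Alphabetic Presentation Forms (Latin ligatures, Hebrew forms)
    (0xFB50, 0xFDFF),  # Arabic Presentation Forms-A
    (0xFE70, 0xFEFC),  # Arabic Presentation Forms-B (U+FEFF, the BOM, is excluded)
)


def find_presentation_forms(text):
    """Return (index, character) pairs for every presentation-form code point."""
    return [
        (i, ch)
        for i, ch in enumerate(text)
        if any(lo <= ord(ch) <= hi for lo, hi in PRESENTATION_BLOCKS)
    ]


def to_base_characters(text):
    """Map presentation forms back to base characters via NFKC normalization."""
    return unicodedata.normalize("NFKC", text)
```

A caller could reject the input when `find_presentation_forms` returns a non-empty list, or pass the text through `to_base_characters` first so that the shaping process starts from base characters.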
Since complex scripts require multiple different glyph forms per character and additional
rules for selecting and placing these glyphs, shaping for complex scripts does not
work with all kinds of fonts, but requires suitable fonts which contain the necessary information.
Shaping works for TrueType and OpenType fonts which contain the required
feature tables (see »Requirements for shaping«, page 166, for detailed requirements).

Shaping can only be done for characters in the same font because the shaping information
is specific to a particular font. As it doesn’t make sense, for example, to form ligatures
across different fonts, complex script shaping cannot be applied to a word which
contains characters from different fonts.
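The single-font restriction can be pictured with a small Python sketch. It assumes that each character of a word is already paired with the name of the font it will be set in; this data model and the font names are hypothetical and do not reflect how PDFlib stores text internally.

```python
from itertools import groupby


def font_runs(chars_with_fonts):
    """Group (character, font_name) pairs into maximal same-font runs."""
    return [
        (font, "".join(ch for ch, _ in group))
        for font, group in groupby(chars_with_fonts, key=lambda pair: pair[1])
    ]


def word_is_shapeable(chars_with_fonts):
    """A word can be shaped as one unit only if all characters share one font."""
    return len({font for _, font in chars_with_fonts}) <= 1
```

For a word whose characters are split over two fonts, `font_runs` yields two runs and `word_is_shapeable` returns `False`, which corresponds to the case where shaping cannot be applied to the word as a whole.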
Override shaping behavior. In some cases users may want to override the default
shaping behavior. PDFlib supports several Unicode formatting characters for this purpose.
For convenience, these formatting characters can also be specified with entity
names (see Table 6.4).
Table 6.4 Unicode control characters for overriding the default shaping behavior

formatting character   entity name   Unicode name            function
U+200C                 ZWNJ          ZERO WIDTH NON-JOINER   prevent the two adjacent characters from forming a cursive connection
U+200D                 ZWJ           ZERO WIDTH JOINER       force the two adjacent characters to form a cursive connection
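The following Python sketch shows how the two formatting characters from Table 6.4 might be placed into a text string before it is passed on for output. The `&ZWNJ;` and `&ZWJ;` notation and the helper name `expand_entities` are assumptions made for this example; the code only mimics the idea of entity names and is not PDFlib's own entity processing.

```python
import re

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER: prevents a cursive connection
ZWJ = "\u200D"   # ZERO WIDTH JOINER: forces a cursive connection

# Entity names from Table 6.4 mapped to their Unicode characters.
ENTITY_TO_CHAR = {"ZWNJ": ZWNJ, "ZWJ": ZWJ}


def expand_entities(text):
    """Replace &ZWNJ; and &ZWJ; (assumed notation) with the control characters."""
    return re.sub(
        r"&(ZWNJ|ZWJ);",
        lambda match: ENTITY_TO_CHAR[match.group(1)],
        text,
    )


# Two Arabic letters that would normally join, kept apart by ZWNJ.
separated = expand_entities("\u0645\u06CC&ZWNJ;\u062E")
```

The resulting string contains U+200C between the second and third letter, so a shaping engine that honors the character would not connect them.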
6.4.4 Bidirectional Formatting

Cookbook: A full code sample can be found in the Cookbook topic complex_scripts/bidi_formatting.

For right-to-left text (especially Arabic and Hebrew, but also some other scripts) it is
very common to have nested sequences of left-to-right Latin text, e.g. an address or a
quote in another language. These mixed sequences of text require bidirectional (Bidi)
formatting. Since numerals are always written from left to right, the Bidi problem affects
even text which is completely written in Arabic or Hebrew. PDFlib implements bidirectional
text reordering according to the Unicode Bidi algorithm as specified in Unicode
Standard Annex #9 (see footnote 1). Bidi processing does not have to be enabled with an option,
but will automatically be applied as part of the shaping process if text in a right-to-left
script with an appropriate script option is encountered.
1. See www.unicode.org/unicode/reports/tr9/
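To see why even pure Arabic or Hebrew text is affected, it can help to look at the Bidi classes that the Unicode Bidi algorithm starts from. The Python sketch below only inspects these classes with the standard `unicodedata` module; it does not reorder text and is not PDFlib's implementation. The function names are chosen for this example.

```python
import unicodedata

# Strong right-to-left classes: "R" (e.g. Hebrew letters), "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def bidi_classes(text):
    """Return the Unicode Bidi class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


def needs_bidi(text):
    """True if text contains at least one strong right-to-left character."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)
```

For a Hebrew or Arabic letter followed by a European digit, `bidi_classes` reports a right-to-left class for the letter and the number class `EN` for the digit, which is the kind of mixed sequence that the reordering has to resolve.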