17.05.2014 Views

PDFlib 8 Windows COM/.NET Tutorial


6.4.3 Complex Script Shaping

The shaping process selects appropriate glyph forms depending on whether a character
is located at the start, middle, or end of a word, or in a standalone position. Shaping is a
crucial component of Arabic and Hindi text formatting. Shaping may also replace a sequence
of two or more characters with a suitable ligature. Since the shaping process determines
the appropriate character forms automatically, explicit ligatures and Unicode
presentation forms (e.g. the Arabic Presentation Forms-A block starting at U+FB50) must
not be used as input characters.
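The input rule above can be checked before text is handed to a shaping engine. The following Python sketch is not part of PDFlib; it is a minimal illustration that flags characters from the Unicode presentation-form blocks. The range list and the function names `find_presentation_forms` and `to_base_characters` are assumptions made for this example, and the check works at block level only, so it may also flag a few symbols that happen to live in these blocks.

```python
import unicodedata

# Assumed block ranges for this sketch:
#   U+FB00-U+FB4F  Alphabetic Presentation Forms (e.g. Latin ligatures such as U+FB01)
#   U+FB50-U+FDFF  Arabic Presentation Forms-A
#   U+FE70-U+FEFC  Arabic Presentation Forms-B (U+FEFF, the byte order mark, is left out)
PRESENTATION_RANGES = [
    (0xFB00, 0xFB4F),
    (0xFB50, 0xFDFF),
    (0xFE70, 0xFEFC),
]


def find_presentation_forms(text):
    """Return (index, 'U+XXXX') pairs for characters inside the ranges above."""
    hits = []
    for index, ch in enumerate(text):
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in PRESENTATION_RANGES):
            hits.append((index, f"U+{cp:04X}"))
    return hits


def to_base_characters(text):
    """Map presentation forms back to base characters via NFKC normalization.

    Caution: NFKC also folds other compatibility characters (full-width
    forms, superscripts, ...), so this is only suitable where that is acceptable.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    shaped_input = "\uFEB3\uFEFC\uFEE1"   # pre-shaped forms, unsuitable as input
    print(find_presentation_forms(shaped_input))
    print([f"U+{ord(c):04X}" for c in to_base_characters(shaped_input)])
```

Text written with the base Arabic letters (U+0600 block) passes the check unchanged; only pre-shaped input is reported.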

Since complex scripts require multiple different glyph forms per character and additional
rules for selecting and placing these glyphs, shaping for complex scripts does not
work with all kinds of fonts, but requires suitable fonts which contain the necessary information.
Shaping works for TrueType and OpenType fonts which contain the required
feature tables (see »Requirements for shaping«, page 166, for detailed requirements).

Shaping can only be done for characters in the same font because the shaping information
is specific to a particular font. As it doesn’t make sense, for example, to form ligatures
across different fonts, complex script shaping cannot be applied to a word which
contains characters from different fonts.
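The single-font constraint can be pictured as splitting text into runs that each use one font, with shaping applied inside a run only. The sketch below is a plain Python illustration, not PDFlib code: the `(character, font name)` pair representation, the font names, and the helpers `font_runs` and `can_shape_as_one_word` are all hypothetical.

```python
from itertools import groupby


def font_runs(chars_with_fonts):
    """Group consecutive (char, font_name) pairs into (font_name, text) runs."""
    return [
        (font, "".join(ch for ch, _ in group))
        for font, group in groupby(chars_with_fonts, key=lambda pair: pair[1])
    ]


def can_shape_as_one_word(word_chars_with_fonts):
    """A word is a candidate for shaping only if all its characters share one font."""
    return len({font for _, font in word_chars_with_fonts}) <= 1


if __name__ == "__main__":
    # Hypothetical font assignment: the last letter comes from a different font.
    word = [("\u0633", "FontA"), ("\u0644", "FontA"), ("\u0627", "FontA"), ("\u0645", "FontB")]
    print(font_runs(word))
    print(can_shape_as_one_word(word))
```

In the example the word falls into two runs, so joining forms and ligatures could not be computed across the boundary between them.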

Override shaping behavior. In some cases users may want to override the default
shaping behavior. PDFlib supports several Unicode formatting characters for this purpose.
For convenience, these formatting characters can also be specified with entity
names (see Table 6.4).

Table 6.4 Unicode control characters for overriding the default shaping behavior

formatting character  entity name  Unicode name           function
U+200C                ZWNJ         ZERO WIDTH NON-JOINER  prevent the two adjacent characters from forming a cursive connection
U+200D                ZWJ          ZERO WIDTH JOINER      force the two adjacent characters to form a cursive connection
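To make the two formatting characters in Table 6.4 concrete, the following Python sketch replaces the entity names with the corresponding Unicode characters. It only emulates the idea in plain Python and does not call PDFlib; it assumes the entities are written in the `&name;` form, and the helper `expand_formatting_entities` is a hypothetical name introduced here.

```python
import re
import unicodedata

# Entity names and code points taken from Table 6.4.
FORMATTING_ENTITIES = {
    "ZWNJ": "\u200C",  # ZERO WIDTH NON-JOINER: prevents a cursive connection
    "ZWJ": "\u200D",   # ZERO WIDTH JOINER: forces a cursive connection
}

# Assumption for this sketch: entities are written as &ZWNJ; and &ZWJ;
_ENTITY_RE = re.compile(r"&(ZWNJ|ZWJ);")


def expand_formatting_entities(text):
    """Replace &ZWNJ; and &ZWJ; with the corresponding Unicode characters."""
    return _ENTITY_RE.sub(lambda m: FORMATTING_ENTITIES[m.group(1)], text)


if __name__ == "__main__":
    # Persian text in which the first two letters must not join the third.
    source = "\u0645\u06CC&ZWNJ;\u062E\u0648\u0627\u0647\u0645"
    expanded = expand_formatting_entities(source)
    for ch in expanded:
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Both characters are invisible and have zero width; their only effect is on how the neighboring letters are joined during shaping.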

6.4.4 Bidirectional Formatting

Cookbook: A full code sample can be found in the Cookbook topic complex_scripts/bidi_formatting.

For right-to-left text (especially Arabic and Hebrew, but also some other scripts) it is
very common to have nested sequences of left-to-right Latin text, e.g. an address or a
quote in another language. These mixed sequences of text require bidirectional (Bidi)
formatting. Since numerals are always written from left to right, the Bidi problem affects
even text which is completely written in Arabic or Hebrew. PDFlib implements bidirectional
text reordering according to the Unicode Bidi algorithm as specified in Unicode
Standard Annex #9 (see note 1). Bidi processing does not have to be enabled with an option,
but will automatically be applied as part of the shaping process if text in a right-to-left
script with an appropriate script option is encountered.
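The Unicode Bidi algorithm works on the bidirectional class of each character. The Python sketch below uses the standard `unicodedata` module to show these classes and to test whether a string contains right-to-left characters at all. It does not reorder text and is not a Bidi implementation, nor does it reflect how PDFlib decides internally; the helper names `bidi_classes` and `needs_bidi` are assumptions for this example.

```python
import unicodedata

# Strong right-to-left classes: R (e.g. Hebrew) and AL (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def bidi_classes(text):
    """Return the Unicode bidirectional class of every character."""
    return [unicodedata.bidirectional(ch) for ch in text]


def needs_bidi(text):
    """True if the text contains at least one strong right-to-left character."""
    return any(cls in RTL_CLASSES for cls in bidi_classes(text))


if __name__ == "__main__":
    samples = {
        "latin only": "Hello 123",
        "hebrew with latin": "\u05E9\u05DC\u05D5\u05DD abc",
        "arabic with digits": "\u0627\u0644\u0639\u0627\u0645 2014",
    }
    for label, text in samples.items():
        print(label, needs_bidi(text), bidi_classes(text))
```

The third sample contains no Latin letters, yet its digits carry the class `EN` (European number) rather than `AL`. This is the situation described above: numerals run left to right even inside text that is otherwise entirely Arabic or Hebrew.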

1. See www.unicode.org/unicode/reports/tr9/

