17.05.2014

PDFlib TET PDF IFilter 4.0 Manual


Note TET PDF IFilter does not apply any language-specific processing beyond
language detection. It is up to the IFilter client to use the LCID information.
While some IFilter clients (e.g. SharePoint, SQL Server) include sophisticated
LCID treatment, other IFilter clients may completely ignore the LCID information.

Table 2.1 Common LCID values and the corresponding primary and secondary language

LCID    Primary language                    Secondary language (country)
0x0000  Neutral locale language             Neutral sublanguage
0x0401  Arabic (ar)                         Saudi Arabia (SA)
0x0404  Chinese (zh)                        Traditional (Hant)
0x0407  German (de)                         Germany (DE)
0x0409  English (en)                        United States (US)
0x040c  French (fr)                         France (FR)
0x0410  Italian (it)                        Italy (IT)
0x0411  Japanese (ja)                       Japan (JP)
0x0413  Dutch (nl)                          Netherlands (NL)
0x0419  Russian (ru)                        Russia (RU)
0x0804  Chinese (zh)                        Simplified (Hans)
0x0c0a  Spanish (es)                        Spain (ES)
0x0800  System default locale language
0x1000  Unspecified custom locale language  Unspecified custom sublanguage
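The primary and secondary language columns of Table 2.1 correspond to bit fields inside the LCID. The following minimal sketch decomposes an LCID according to the general Windows language identifier layout (low 10 bits: primary language, next 6 bits: sublanguage, bits 16-19: sort order). This layout is a Windows convention and is not described in this manual; the function name split_lcid is chosen for illustration only.

```python
from typing import Tuple


def split_lcid(lcid: int) -> Tuple[int, int, int]:
    """Split an LCID into (sort_id, sublanguage_id, primary_language_id).

    Assumes the usual Windows layout; not taken from the TET PDF IFilter manual.
    """
    lang_id = lcid & 0xFFFF        # low 16 bits: language identifier
    sort_id = (lcid >> 16) & 0xF   # bits 16-19: sort order identifier
    primary = lang_id & 0x3FF      # low 10 bits: primary language
    sublang = lang_id >> 10        # high 6 bits: sublanguage (country/script)
    return sort_id, sublang, primary


if __name__ == "__main__":
    # A few values from Table 2.1
    for lcid in (0x0409, 0x0407, 0x0804, 0x0C0A):
        _, sublang, primary = split_lcid(lcid)
        print(f"0x{lcid:04x}: primary=0x{primary:02x} sublanguage=0x{sublang:02x}")
```

Under this layout, 0x0404 and 0x0804 share the primary language value 0x04 (Chinese) and differ only in the sublanguage field.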

XML configuration for LCIDs. LCIDs for overriding or supplementing automatic LCID
detection can be specified in the LocaleId element of the XML configuration file.

The detection attribute accepts the values auto, disabled, and script; the default
value is auto. If detection=disabled, all other attributes except default are ignored.
The script setting activates script analysis, but disables statistical analysis.

The default attribute can be used to specify a global LCID setting which will be used
for all text if detection=disabled. If this attribute is missing, the system locale will be
used.
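The interaction between the detection and default attributes can be sketched as follows. This is an illustration of the rules stated above, not the filter's actual implementation: resolve_lcid, system_lcid, and the detect callback are hypothetical names, and the sketch does not model the difference between auto and script.

```python
from typing import Callable, Optional

VALID_DETECTION = ("auto", "disabled", "script")


def resolve_lcid(
    detection: str,
    default: Optional[int],
    system_lcid: int,
    detect: Callable[[str], int],
    text: str,
) -> int:
    """Pick the LCID reported for a piece of text (illustrative only)."""
    if detection not in VALID_DETECTION:
        raise ValueError(f"unsupported detection value: {detection!r}")
    if detection == "disabled":
        # The global default applies to all text; without a default
        # attribute the system locale is used instead.
        return default if default is not None else system_lcid
    # "auto" or "script": defer to the detector (a hypothetical stand-in
    # for the filter's internal language detection).
    return detect(text)


if __name__ == "__main__":
    always_french = lambda _text: 0x040C  # dummy detector for the demo
    print(hex(resolve_lcid("disabled", 0x0407, 0x0409, always_french, "Text")))
    print(hex(resolve_lcid("disabled", None, 0x0409, always_french, "Text")))
    print(hex(resolve_lcid("auto", 0x0407, 0x0409, always_french, "Text")))
```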

For all attributes except detection a numeric value in decimal or hexadecimal syntax
can be specified. Hexadecimal values must start with 0x. Table 2.2 lists the supported
script attributes and their default values. LCIDs for text in all other scripts will be assigned
automatically.
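The numeric syntax described above can be illustrated with a small parsing helper. This is a sketch under the stated rule only (decimal, or hexadecimal with a leading 0x); parse_lcid_value is a hypothetical name, and the filter's own handling of malformed values is not documented here.

```python
def parse_lcid_value(value: str) -> int:
    """Parse an attribute value given in decimal or 0x-prefixed hexadecimal."""
    if value.startswith("0x"):
        return int(value, 16)   # hexadecimal, e.g. "0x0409"
    return int(value, 10)       # decimal, e.g. "1033"


if __name__ == "__main__":
    # Both spellings denote the same LCID (English, United States)
    print(parse_lcid_value("0x0409") == parse_lcid_value("1033"))
```

A value that is neither valid decimal nor valid 0x-prefixed hexadecimal raises ValueError in this sketch.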

2.2 Automatic Language Detection 25
