17.05.2014

PDFlib TET PDF IFilter 4.0 Manual


Table 4.1 XML elements and attributes in the configuration file

Each entry names the element and its parent, followed by the description of the element and its attributes.

DocOptions (parent: Tet)
    (May appear zero or one time) The value contains an option list for TET_open_document() in the TET kernel.

Filtering (parent: TetPdfIFilterConfig)
    (May appear zero or one time) Specify details of the PDF filtering process. Supported attributes:

    indexNestedPdf
        (Boolean; optional) Process PDF attachments recursively (see Section 2.1, »PDF Document Domains«, page 19). Default: true

    metadataHandling
        (Choice; optional) Select the type of metadata handling (see Section 3.6, »Indexing Metadata Properties as Text«, page 45). Default: property

        ignore
            Drop all metadata properties. This may be useful for debugging or performance optimization in situations where metadata is not required.
        property
            Treat metadata as properties.
        propertyAndPrefixedText
            In addition to treating metadata as properties, prepend the prefix specified in textIndexPrefix (if present) for custom properties and the prefixes according to Table 3.2, page 45, for predefined properties, and treat the result as plain text.
        propertyAndText
            In addition to treating metadata as properties, treat metadata as plain text.

    useIdentifier
        (Boolean; optional) Specify whether identifier or friendlyName will be used to identify properties if both of these attributes for the Property element are present. Default: true

LocaleId (parent: Filtering)
    (May appear zero or one time) Configure locale ID detection (see Section 2.2, »Automatic Language Detection«, page 24). Supported attributes:

    arabic
        (LCID; optional) LCID for Arabic text. Default: 0x0401 Arabic (SA)

    chinese
        (LCID; optional) LCID for Chinese text. Default: 0x0804 Chinese (People's Republic of China)

    cyrillic
        (LCID; optional) LCID for Cyrillic text. Default: 0x0419 Russian (RU)

    default
        (LCID; optional) Global LCID which will be used for all text chunks if detection is disabled. Default: 0x0800 (system-locale)

    detection
        (Choice; optional) Control automatic LCID detection. Default: auto

        auto
            Determine LCID based on script and statistical language analysis.
        disabled
            Disable LCID detection; all other attributes except default and useCatalogLang will be ignored.
        script
            (TET PDF IFilter 4.0) Determine LCID based on script.

    latin
        (LCID; optional) LCID for Latin text. Default: 0x0409 English (US)

    useCatalogLang
        (Boolean; optional; TET PDF IFilter 4.0) Specify whether the Lang entry in the document’s catalog will be evaluated. If true, TET PDF IFilter checks the Lang entry in the PDF document catalog. If present, the Lang entry will be converted to an LCID. If the conversion is successful the LCID overrides the value of the LocaleId/@default attribute; if the LCID belongs to one of the Arabic, Chinese, Cyrillic, or Latin scripts it overrides the value of the corresponding attribute of the LocaleId element. Default: true

Metadata (parent: TetPdfIFilterConfig)
    (May appear zero or one time) Specify metadata properties (see Section 3.4, »Custom Metadata Properties«, page 42). If present, this element must appear after Filtering and Tet.
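
As an illustration of how the elements and attributes in Table 4.1 nest, the following fragment is a minimal sketch and not a verified configuration. It uses only element names, attribute names, and choice values listed in the table. Several points are assumptions: the real schema may require a namespace declaration on the root element, which is omitted here; LCID values are assumed to be written in the hexadecimal notation the table uses for its defaults; and 0x0407 (German, Germany) is chosen only as an example. The Tet and DocOptions elements are left out because the table does not show how the option list value is written.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical fragment; any namespace the real schema requires is omitted. -->
<TetPdfIFilterConfig>
  <!-- Skip nested PDF attachments; index metadata as properties and as plain text. -->
  <Filtering indexNestedPdf="false"
             metadataHandling="propertyAndText"
             useIdentifier="true">
    <!-- Script-based detection; Latin text is tagged as German (0x0407)
         instead of the default 0x0409. A Lang entry in the document
         catalog, if present and convertible, still takes precedence. -->
    <LocaleId detection="script"
              latin="0x0407"
              useCatalogLang="true"/>
  </Filtering>
  <!-- A Metadata element, if used, must appear after Filtering and Tet. -->
</TetPdfIFilterConfig>
```

Attributes that are not specified keep the defaults given in the table, so arabic, chinese, cyrillic, and default would remain 0x0401, 0x0804, 0x0419, and 0x0800 in this sketch.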

66 Chapter 4: XML Configuration File
