17.05.2014 Views

PDFlib TET PDF IFilter 4.0 Manual


3.4 Custom Metadata Properties

Custom metadata properties are additional properties beyond the predefined ones. They are defined to meet specific requirements within an enterprise, organization, industry, etc. TET PDF IFilter gives you full control over custom properties: they can be specified in the configuration file so that they are generated by TET PDF IFilter and indexed by the search engine.

Planning custom metadata properties. In order to specify custom properties you must consider the following aspects (see »Property identification and GUIDs«, page 40, for details on GUIDs, identifiers, and friendly names):

> You can group one or more properties in a property set. Each property set needs a unique 128-bit identifier called the GUID.

> The property identifier is a unique integer which identifies the property within its property set. Property identifiers in a set start at the value 2. With some IFilter clients the identifier can be replaced with a friendly name. You can override predefined properties by assigning the corresponding GUID+ID combination.

> The friendly name for a property is optional if an identifier is available, and required otherwise. It can be an arbitrary name, but it must be unique within the configuration file. Some IFilter clients accept the friendly name instead of the identifier, but friendly names do not work with all IFilter clients.

> Property source: properties can be populated from document metadata or general PDF information according to Section 3.1, »Sources of Metadata in PDF«, page 37.

> The data type of the property: Int32 (32-bit integer), Double (floating point number with double precision), Boolean (true/false), DateTime (specification of a point in time), and String.

> The precedence rule: if there is more than one data source for the property you can specify whether the first available non-empty data source will have precedence (i.e. subsequent sources will be ignored), or whether data from all non-empty sources will be collected.

> Vector emission: you can specify whether the property will be emitted as a vector, i.e. multiple values will be handed to the IFilter interface in an array structure instead of a flat value (see Section 3.5, »Multivalued Properties«, page 44).

> A prefix which will be prepended to the property name if properties are indexed as part of the full text (see Section 3.6, »Indexing Metadata Properties as Text«, page 45).
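The planning rules above can be checked before any configuration is written. The following Python sketch is only an illustration of those rules; it is not part of TET PDF IFilter. The names `PropertyPlan`, `check_plan`, `resolve`, and `collect_all`, as well as the source names used in the usage example, are assumptions made up for this sketch.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional

# Data types named in the list above.
DATA_TYPES = {"Int32", "Double", "Boolean", "DateTime", "String"}


@dataclass
class PropertyPlan:
    """Hypothetical planning record for one custom property (not a product API)."""
    sources: List[str]                  # metadata source names, highest priority first
    data_type: str = "String"
    identifier: Optional[int] = None    # unique within its property set, 2 or higher
    friendly_name: Optional[str] = None
    collect_all: bool = False           # False: the first non-empty source wins


def check_plan(property_sets: Dict[str, List[PropertyPlan]]) -> None:
    """Raise ValueError if the plan violates one of the rules listed above."""
    seen_names = set()                  # friendly names are unique per configuration file
    for guid, props in property_sets.items():
        uuid.UUID(guid)                 # raises ValueError for a malformed GUID
        seen_ids = set()                # identifiers are unique per property set
        for p in props:
            if p.data_type not in DATA_TYPES:
                raise ValueError(f"unknown data type: {p.data_type}")
            if p.identifier is None and not p.friendly_name:
                raise ValueError("friendly name is required without an identifier")
            if p.identifier is not None:
                if p.identifier < 2 or p.identifier in seen_ids:
                    raise ValueError(f"bad or duplicate identifier: {p.identifier}")
                seen_ids.add(p.identifier)
            if p.friendly_name:
                if p.friendly_name in seen_names:
                    raise ValueError(f"duplicate friendly name: {p.friendly_name}")
                seen_names.add(p.friendly_name)


def resolve(plan: PropertyPlan, found: Dict[str, str]) -> List[str]:
    """Apply the precedence rule to the values found in a document."""
    values = [found[s] for s in plan.sources if found.get(s)]
    return values if plan.collect_all else values[:1]
```

For example, a plan with the two hypothetical sources `xmp_title` and `info_title` and `collect_all=False` yields only the value of the first source that is non-empty in `found`; with `collect_all=True` it yields the values of all non-empty sources in the listed order.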

XML configuration for custom properties. One or more custom properties can be specified in the PropertySet element, where each Property element describes a property in the set.
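As a rough sketch of this nesting, the following Python snippet builds one PropertySet element containing two Property elements. Only the element names PropertySet and Property are taken from the text; the attribute names (`guid`, `identifier`, `friendlyName`), the all-zero GUID, and the example property names are assumptions for illustration and may not match the actual configuration schema.

```python
import xml.etree.ElementTree as ET

# Element names come from the text above. All attribute names and values
# are placeholders for illustration, not the documented schema.
property_set = ET.Element(
    "PropertySet", {"guid": "00000000-0000-0000-0000-000000000000"}
)
ET.SubElement(
    property_set, "Property", {"identifier": "2", "friendlyName": "ExampleTitle"}
)
ET.SubElement(
    property_set, "Property", {"identifier": "3", "friendlyName": "ExampleAuthor"}
)

# Serialize the element tree to a string for inspection.
xml_text = ET.tostring(property_set, encoding="unicode")
print(xml_text)
```

Generating the fragment programmatically keeps it well-formed; the attribute spelling should be checked against the configuration reference before use.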


Multiple PDF sources can be mapped to the same Windows property. The presence of a Property element automatically enables processing for the specified property. However, handling of all predefined and custom metadata properties can be completely disabled with the metadataHandling attribute of the Filtering element.
