17.05.2014 Views

PDFlib Text Extraction Toolkit (TET) Manual


tected PDF (after the search engine indexed the contents and the hit list contained a link to the PDF), the document’s internal permission settings will protect the document as usual when accessed by the user.

The shrug feature for protected documents. TET offers a feature which can be used to extract text and images from protected documents, provided the TET user accepts responsibility for respecting the document author’s rights. This feature is called shrug and works as follows: by supplying the shrug option to TET_open_document(), the user asserts that he or she will not violate any document author’s rights. PDFlib GmbH’s terms and conditions require that TET customers respect PDF permission settings.

If all of the following conditions are true, the shrug feature will be enabled:
> The shrug option has been supplied to TET_open_document().
> The document requires a master password but it has not been supplied to TET_open_document().
> If the document requires a user (open) password, it must have been supplied to TET_open_document().
> Text extraction is not allowed in the document’s permission settings, i.e. nocopy=true.
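The four conditions above combine with a logical AND, where the user-password condition only applies if the document has a user password at all. The following is a minimal sketch of that rule as a pure function. It is a model for illustration only: ShrugConditions, shrugEnabled, and all parameter names are hypothetical and not part of the TET API, which evaluates these conditions internally.

```java
/** Illustrative model of the conditions listed above; NOT part of the TET API. */
final class ShrugConditions {
    private ShrugConditions() {}

    /**
     * Returns true if the shrug feature would be enabled according to the
     * four documented conditions. All names here are hypothetical.
     */
    static boolean shrugEnabled(boolean shrugOptionSupplied,
                                boolean requiresMasterPassword,
                                boolean masterPasswordSupplied,
                                boolean requiresUserPassword,
                                boolean userPasswordSupplied,
                                boolean nocopy) {
        // Condition 2: a master password is required but was not supplied.
        boolean masterMissing = requiresMasterPassword && !masterPasswordSupplied;
        // Condition 3: a user password, if required, must have been supplied.
        boolean userPasswordOk = !requiresUserPassword || userPasswordSupplied;
        // Conditions 1 and 4 are plain flags; all four must hold.
        return shrugOptionSupplied && masterMissing && userPasswordOk && nocopy;
    }

    public static void main(String[] args) {
        // Protected document, no user password, shrug option given.
        System.out.println(shrugEnabled(true, true, false, false, false, true));
        // Same document, but the master password was supplied: shrug is not needed.
        System.out.println(shrugEnabled(true, true, true, false, false, true));
    }
}
```

Note that supplying the master password makes the second condition false, so shrug stays disabled in that case.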

The shrug feature will have the following effects:
> Extracting content from the document is allowed despite nocopy=true. The user is responsible for respecting the document author’s rights.
> The pCOS pseudo object shrug will be set to true/1.
> pCOS runs in full mode (instead of restricted mode), i.e. the pcosmode pseudo object will be set to 2.

The shrug pseudo object can be used according to the following idiom to determine whether the contents can be made available to the user directly, or should only be used for indexing and similar indirect purposes:

    int doc = tet.open_document(filename, "shrug");
    ...
    if ((int) tet.pcos_get_number(doc, "shrug") == 1)
    {
        /* only indexing allowed */
    }
    else
    {
        /* content may be delivered to the user */
    }
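The decision in the idiom above can be isolated into a small helper, so that the rest of an application deals with a named policy instead of a raw number. The following is a minimal sketch, assuming the caller passes in the value obtained from pcos_get_number(doc, "shrug"); ContentPolicy, Usage, and usageFor are hypothetical names and not part of TET.

```java
/** Illustrative helper around the shrug idiom; NOT part of the TET API. */
final class ContentPolicy {
    private ContentPolicy() {}

    /** What the application may do with the extracted content. */
    enum Usage { INDEX_ONLY, DELIVER_TO_USER }

    /**
     * Maps the value of the pCOS pseudo object "shrug" to a usage policy.
     * A value of 1 means shrug is active, so content is for indexing only.
     */
    static Usage usageFor(double shrugValue) {
        return ((int) shrugValue == 1) ? Usage.INDEX_ONLY : Usage.DELIVER_TO_USER;
    }
}
```

With a TET object at hand, this would be called along the lines of ContentPolicy.usageFor(tet.pcos_get_number(doc, "shrug")).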

Chapter 5: Configuration
