17.05.2014 Views

PDFlib Text Extraction Toolkit (TET) Manual


PDF packages and portfolios. Acrobat 8 (PDF 1.7) introduced the concept of PDF packages, which are file attachments with additional properties. Acrobat 9 (PDF 1.7 extension level 3) extends this concept with the introduction of PDF portfolios.

> How to display with Acrobat 8/9: Acrobat presents the cover sheet of the package/portfolio and the constituent PDF documents with dedicated user interface elements for PDF packages.

> How to search a single PDF package with Acrobat 8/9: Edit, Search and in the Look In: pull-down select In the Entire PDF Package

> How to search multiple PDF packages with Acrobat 8/9: not available

> Sample code for the TET library: get_attachments mini sample

> TETML element: /TET/Document/Attachments/Attachment/Document
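The TETML path /TET/Document/Attachments/Attachment/Document implies that each attached PDF appears as a nested Document element, which may in turn carry its own attachments. The following is a minimal sketch of walking that nesting with Python's standard xml.etree module. It does not use the TET library itself. The sample XML is a hypothetical, heavily simplified fragment: the namespace URI, the filename and name attributes, and the file names are assumptions made for illustration, not actual TET output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TETML fragment. The namespace URI and the
# "filename"/"name" attributes are placeholders, not real TET output.
SAMPLE = """<TET xmlns="urn:example:tetml">
  <Document filename="portfolio.pdf">
    <Attachments>
      <Attachment name="report.pdf">
        <Document filename="report.pdf">
          <Attachments>
            <Attachment name="appendix.pdf">
              <Document filename="appendix.pdf"/>
            </Attachment>
          </Attachments>
        </Document>
      </Attachment>
      <Attachment name="notes.pdf">
        <Document filename="notes.pdf"/>
      </Attachment>
    </Attachments>
  </Document>
</TET>"""


def local(tag):
    """Return an element's tag name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def children(elem, name):
    """Return the direct children of elem whose local tag name is name."""
    return [child for child in elem if local(child.tag) == name]


def attached_documents(doc, depth=1):
    """Yield (depth, attachment name, Document element) for every attached
    PDF below doc, descending into attachments of attachments."""
    for attachments in children(doc, "Attachments"):
        for attachment in children(attachments, "Attachment"):
            for sub_doc in children(attachment, "Document"):
                yield depth, attachment.get("name"), sub_doc
                yield from attached_documents(sub_doc, depth + 1)


root = ET.fromstring(SAMPLE)
top_document = children(root, "Document")[0]
for depth, name, _ in attached_documents(top_document):
    print("  " * depth + str(name))
```

Matching on local tag names keeps the sketch independent of the exact TETML namespace, which should be checked against the TETML schema shipped with the TET version in use.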

PDF properties. This domain does not explicitly contain text, but is used as a pseudo domain which collects various intrinsic properties of a PDF document, e.g. PDF/X and PDF/A status, Tagged PDF status, etc.

> How to display with Acrobat 8: Acrobat 8 does not directly display standards conformance information, but you can find relevant entries in File, Properties..., Custom or File, Properties..., Additional Metadata... You can also use the free PDFlib custom XMP panel [1] for ISO standards to explicitly display conformance information for the PDF/A-1, PDF/X-4, PDF/X-5, and PDF/E-1 standards.

> How to display with Acrobat 9: View, Navigation Panels, Standards (only present for standard-conforming PDFs)

> How to search with Acrobat 8/9: not available

> Sample code for the TET library: dumper mini sample

> TETML elements and attributes: /TET/Document/@pdfa, /TET/Document/@pdfx
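Since the conformance status is exposed as the pdfa and pdfx attributes of /TET/Document, it can be read from TETML output with any XML parser. The following is a minimal sketch using Python's standard xml.etree module rather than the TET library. The sample fragment is hypothetical: the namespace URI, the filename attribute, and the attribute values shown are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TETML fragment. The namespace URI and the
# attribute values are placeholders, not real TET output.
SAMPLE = """<TET xmlns="urn:example:tetml">
  <Document filename="invoice.pdf" pdfa="PDF/A-1b:2005" pdfx="none"/>
</TET>"""


def conformance(tetml_text):
    """Return the pdfa and pdfx attributes of the first Document element
    directly below the root, or None if there is no such element."""
    root = ET.fromstring(tetml_text)
    for child in root:
        # Compare the local tag name so the namespace does not matter.
        if child.tag.rsplit("}", 1)[-1] == "Document":
            return {"pdfa": child.get("pdfa"), "pdfx": child.get("pdfx")}
    return None


status = conformance(SAMPLE)
print(status)
```

An attribute that is absent from the Document element comes back as None, so callers can distinguish "not reported" from an explicit value.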

1. See www.pdflib.com/developer/xmp-metadata/xmp-panels<br />

