17.05.2014 Views

PDFlib Text Extraction Toolkit (TET) Manual


restricted pCOS mode if nocopy=false or plainmetadata=true, and bookmarks[...]/Title as well as all paths starting with pages[...]/annots[...]/ can be retrieved in restricted pCOS mode if nocopy=false.

This function assumes that strings retrieved from the PDF document are text strings. String objects which contain binary data should be retrieved with TET_pcos_get_stream( ) instead, which does not modify the data in any way.

Bindings

C language binding: The string will be returned in UTF-8 format (on zSeries and i5/iSeries: EBCDIC-UTF-8) without BOM. The returned strings will be stored in a ring buffer with up to 10 entries. If more than 10 strings are queried, buffers will be reused, which means that clients must copy the strings if they want to access more than 10 strings in parallel. For example, up to 10 calls to this function can be used as parameters for a printf( ) statement since the return strings are guaranteed to be independent if no more than 10 strings are used at the same time.
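
The effect of the ring buffer can be illustrated with a small stand-alone simulation. This is only a sketch and not the TET implementation: ring_get_string( ), first_result_survives( ), and the slot size of 64 bytes are hypothetical names and values chosen for the illustration; only the count of 10 entries is taken from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RING_SLOTS 10   /* number of entries stated in the description above */
#define SLOT_SIZE  64   /* arbitrary slot size, an assumption of this sketch */

/* Hypothetical stand-in for a call which returns a pointer into an
 * internal ring buffer. Each call fills the next slot and wraps around
 * after RING_SLOTS calls. */
static const char *ring_get_string(int value)
{
    static char slots[RING_SLOTS][SLOT_SIZE];
    static unsigned next = 0;

    char *slot = slots[next];
    next = (next + 1) % RING_SLOTS;
    snprintf(slot, SLOT_SIZE, "string #%d", value);
    return slot;
}

/* Queries `count` strings in a row and returns 1 if the first returned
 * pointer still refers to its original contents afterwards, 0 otherwise. */
static int first_result_survives(int count)
{
    const char *first;
    char saved[SLOT_SIZE];
    int i;

    assert(count >= 1);

    first = ring_get_string(0);
    strcpy(saved, first);            /* private copy for later comparison */

    for (i = 1; i < count; i++)
        (void) ring_get_string(i);   /* later calls may reuse the first slot */

    return strcmp(first, saved) == 0;
}
```

In this simulation first_result_survives(10) returns 1, while first_result_survives(11) returns 0 because the eleventh call reuses the slot of the first one. A client which needs more than 10 results at the same time would have to copy each string into its own memory, as is done with the saved buffer here.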

C++ language binding: The string will be returned as wstring in the default wstring configuration of the C++ wrapper. In string compatibility mode on zSeries and i5/iSeries the result will be returned in EBCDIC-UTF-8 without BOM.

Java and .NET bindings: The result will be provided as Unicode string. If no more text is available a null object will be returned.

Perl, PHP and Python language bindings: The result will be provided as UTF-8 string. If no more text is available a null object will be returned.

RPG language binding: The result will be provided as EBCDIC-UTF-8 string.

C++: const unsigned char *pcos_get_stream(int doc, int *length, string optlist, wstring path)

C#, Java: final byte[ ] pcos_get_stream(int doc, String optlist, String path)

Perl, PHP: string pcos_get_stream(int doc, string optlist, string path)

VB, RB: Function pcos_get_stream(doc as Long, optlist As String, path As String)

C: const unsigned char *TET_pcos_get_stream(TET *tet, int doc, int *length, const char *optlist, const char *path, ...)

Get the contents of a pCOS path with type stream, fstream, or string.<br />

doc A valid document handle obtained with TET_open_document*( ).

length (C and C++ language bindings only) A pointer to a variable which will receive the length of the returned stream data in bytes.

optlist An option list specifying stream retrieval options according to Table 10.19.

path A full pCOS path for a stream or string object.

Additional parameters (C language binding only) A variable number of additional parameters can be supplied if the path parameter contains corresponding placeholders (%s for strings or %d for integers; use %% for a single percent sign). Using these parameters will save you from explicitly formatting complex paths containing variable numerical or string values. The client is responsible for making sure that the number and type of the placeholders match the supplied additional parameters.
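
How such placeholders expand can be sketched with standard C formatting. The helper format_pcos_path( ) below is hypothetical and not part of TET; it only shows the kind of path string that results when the placeholders are combined with the additional parameters. With the C binding the placeholders and values would be passed to the TET function directly, so no such helper is needed. The path used in the example is illustrative only.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a TET function): expands a path template
 * containing %s, %d, and %% placeholders into buf.
 * Returns 0 on success, or -1 if formatting failed or the expanded
 * path did not fit into buf. */
static int format_pcos_path(char *buf, size_t size, const char *path, ...)
{
    va_list args;
    int n;

    assert(buf != NULL && path != NULL && size > 0);

    va_start(args, path);
    n = vsnprintf(buf, size, path, args);   /* same placeholder rules as printf */
    va_end(args);

    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

For example, format_pcos_path(buf, sizeof buf, "pages[%d]/annots[%d]/%s", 2, 0, "Contents") stores pages[2]/annots[0]/Contents in buf. As with printf( ), a mismatch between the placeholders and the supplied values is not detected, which is why the number and type of the parameters must be checked by the client.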

