PDFlib Text Extraction Toolkit (TET) Manual
restricted pCOS mode if nocopy=false or plainmetadata=true, and bookmarks[...]/Title as
well as all paths starting with pages[...]/annots[...]/ can be retrieved in restricted pCOS
mode if nocopy=false.
This function assumes that strings retrieved from the PDF document are text strings.
String objects which contain binary data should be retrieved with TET_pcos_get_stream()
instead, which does not modify the data in any way.
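The distinction between text strings and binary data matters in C because binary data may contain zero bytes, which terminate an ordinary C string early. The following minimal sketch illustrates why such data needs an explicit byte count (the C binding of TET_pcos_get_stream() reports one through its length parameter). The helper names c_string_length() and bytes_to_hex() are hypothetical and not part of TET.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of TET: what string functions would
 * see if binary data were treated as a C string. strlen() stops at
 * the first zero byte, so the result can be shorter than the data. */
size_t c_string_length(const unsigned char *data)
{
    return strlen((const char *) data);
}

/* Hypothetical helper, not part of TET: write `length` bytes as
 * lowercase hex digits into `out`. The explicit length makes zero
 * bytes harmless. Returns 0 on success, -1 if `out` is too small. */
int bytes_to_hex(const unsigned char *data, int length,
                 char *out, size_t outsize)
{
    static const char digits[] = "0123456789abcdef";
    int i;

    assert(data != NULL && out != NULL && length >= 0);
    if (outsize < (size_t) length * 2 + 1)
        return -1;

    for (i = 0; i < length; i++) {
        out[2 * i]     = digits[data[i] >> 4];
        out[2 * i + 1] = digits[data[i] & 0x0F];
    }
    out[2 * length] = '\0';
    return 0;
}
```

In this sketch, all processing of the binary data is driven by the byte count, never by a terminating zero byte.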
Bindings
C language binding: The string will be returned in UTF-8 format (on zSeries and
i5/iSeries: EBCDIC-UTF-8) without BOM. The returned strings will be stored in a ring
buffer with up to 10 entries. If more than 10 strings are queried, buffers will be reused,
which means that clients must copy the strings if they want to access more than 10
strings in parallel. For example, up to 10 calls to this function can be used as
parameters for a single printf() statement, since the returned strings are guaranteed to
be independent as long as no more than 10 strings are in use at the same time.
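The ring buffer behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: demo_get_string() is a hypothetical stand-in that mimics a function returning strings from a 10-slot ring buffer, and it is not the TET implementation. The slot size and the helper keep_string() are likewise assumptions made for the demonstration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RING_SLOTS 10   /* number of entries, taken from the text above */
#define SLOT_SIZE  64   /* hypothetical slot size, chosen for this demo */

/* Hypothetical stand-in for a library call which returns its result
 * in a ring buffer. Every call overwrites the oldest slot. */
const char *demo_get_string(int n)
{
    static char ring[RING_SLOTS][SLOT_SIZE];
    static int next = 0;
    char *slot = ring[next];

    next = (next + 1) % RING_SLOTS;
    snprintf(slot, SLOT_SIZE, "string %d", n);
    return slot;
}

/* Copy a returned string so that it survives later calls.
 * The caller owns the copy and must free() it. Returns NULL if
 * memory could not be allocated. */
char *keep_string(const char *s)
{
    size_t size;
    char *copy;

    assert(s != NULL);
    size = strlen(s) + 1;
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}
```

In this simulation, a pointer obtained from demo_get_string() refers to different contents once ten further calls have been made, while a copy created with keep_string() remains unchanged until it is freed.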
C++ language binding: The string will be returned as wstring in the default wstring
configuration of the C++ wrapper. In string compatibility mode on zSeries and i5/iSeries
the result will be returned in EBCDIC-UTF-8 without BOM.
Java and .NET bindings: The result will be provided as Unicode string. If no more text is
available a null object will be returned.
Perl, PHP and Python language bindings: The result will be provided as UTF-8 string. If
no more text is available a null object will be returned.
RPG language binding: The result will be provided as EBCDIC-UTF-8 string.
C++       const unsigned char *pcos_get_stream(int doc, int *length, string optlist, wstring path)
C# Java   final byte[] pcos_get_stream(int doc, String optlist, String path)
Perl PHP  string pcos_get_stream(int doc, string optlist, string path)
VB RB     Function pcos_get_stream(doc As Long, optlist As String, path As String)
C         const unsigned char *TET_pcos_get_stream(TET *tet, int doc, int *length,
              const char *optlist, const char *path, ...)
Get the contents of a pCOS path with type stream, fstream, or string.
doc A valid document handle obtained with TET_open_document*().
length (C and C++ language bindings only) A pointer to a variable which will receive
the length of the returned stream data in bytes.
optlist An option list specifying stream retrieval options according to Table 10.19.
path A full pCOS path for a stream or string object.
Additional parameters (C language binding only) A variable number of additional
parameters can be supplied if the path parameter contains corresponding placeholders
(%s for strings or %d for integers; use %% for a single percent sign). These parameters
save you from explicitly formatting complex paths that contain variable numerical or
string values. The client is responsible for making sure that the number and types of
the placeholders match the supplied additional parameters.
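To show what the placeholder mechanism saves, the following sketch formats a path explicitly before it would be passed to the library. The helper format_pcos_path() is hypothetical and not part of TET. It relies on the standard vsnprintf() function, which accepts more conversions than the %s, %d and %% placeholders listed above, so it should be read as an approximation of the behavior and not as the library's own formatting code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of TET: expand printf-style
 * placeholders in `fmt` into `buf`. Returns 0 on success and -1 if
 * the formatted path did not fit into `size` bytes or formatting
 * failed. As with the additional parameters described above, the
 * caller must make sure that the number and types of the arguments
 * match the placeholders. */
int format_pcos_path(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL && size > 0);

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    /* vsnprintf() returns the length the full result would have */
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

For example, the format "bookmarks[%d]/Title" with the argument 3 expands to the path bookmarks[3]/Title. With the C binding the same format string and argument can be supplied directly as the path and an additional parameter, which makes a separate buffer unnecessary.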