17.05.2014 Views

PDFlib Text Extraction Toolkit (TET) Manual



Table 10.17 Options for TET_write_image_file( ) and TET_get_image_data( )

option                      description

preferredtiffcompression    (Keyword) Compression scheme used for most extracted TIFF images (default: flate):
                            lzw    LZW compression (TIFF compression scheme 5)
                            flate  Flate compression, also called Adobe Deflate or zlib compression (TIFF compression scheme 8)

typeonly (1)                (Boolean) The image type will be determined according to the supplied options, but no image file will be written. This is useful for determining the type of image returned by TET_get_image_data( ), which does not return the image type itself. Default: false

(1) Only for TET_write_image_file( )

C++ const char *get_image_data(int doc, size_t *length, int imageid, wstring optlist)

C# Java final byte[ ] get_image_data(int doc, int imageid, String optlist)

Perl PHP string get_image_data(long doc, long imageid, string optlist)

VB RB Function get_image_data(doc As Long, imageid As Long, optlist As String)

C const char *TET_get_image_data(TET *tet, int doc, size_t *length, int imageid, const char *optlist)

Retrieve image data from memory.<br />

doc A valid document handle obtained with TET_open_document*( ).

length (C and C++ language bindings only) C-style pointer to a memory location where the length of the returned data in bytes will be stored.

imageid The pCOS ID of the image. This ID can be retrieved from the imageid field after a successful call to TET_get_image_info( ), or by looping over all entries in the images pCOS array (there are length:images entries in this array).

optlist An option list specifying image-related options according to Table 10.17. The following options can be used: compression, keepxmp

Returns The data representing the image according to the specified options. In case of an error (including images which cannot be extracted) a NULL pointer will be returned in C and C++, and empty data in other language bindings. If an error happens it is recommended to call TET_get_errmsg( ) to find out more details about the error.

Details This function will convert the pixel data for the image with the specified pCOS ID to one of several image formats, and make the data available in memory.

Bindings COM: Most client programs will use the Variant type to hold the image data.

C and C++ language bindings: The returned data buffer can be used until the next call to this function.

REALbasic: the result will be provided as REALbasic string with encoding -1 (binary data). If no more text is available an empty string will be returned.
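Because the C and C++ bindings only guarantee the returned buffer until the next call to this function, a caller that needs several images at once has to copy each result first. The following is a minimal sketch of that pattern, not code from the TET distribution. So that it compiles without the library, it uses stub_get_image_data( ), a hypothetical stand-in that only imitates the return and length conventions described above (NULL on error, length stored through the pointer, one reused buffer). copy_image_data( ) is likewise a helper name invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL stand-in for TET_get_image_data(): it follows the same
 * conventions (NULL on error, length via pointer, buffer reused by the
 * next call) but returns fake bytes instead of real image data. */
static const char *stub_get_image_data(int doc, size_t *length, int imageid,
                                       const char *optlist)
{
    static char buffer[16];     /* overwritten by every call */

    (void)doc;
    (void)optlist;
    if (imageid < 0) {          /* simulate an image that cannot be extracted */
        *length = 0;
        return NULL;
    }
    memset(buffer, 'A' + imageid % 26, sizeof buffer);
    *length = sizeof buffer;
    return buffer;
}

/* Fetch one image and copy it into caller-owned memory.
 * Returns NULL on error or allocation failure; the caller frees the result. */
static char *copy_image_data(int doc, int imageid, const char *optlist,
                             size_t *out_len)
{
    const char *data;
    size_t len = 0;
    char *copy;

    assert(out_len != NULL);
    *out_len = 0;

    data = stub_get_image_data(doc, &len, imageid, optlist);
    if (data == NULL)
        return NULL;            /* real code would consult TET_get_errmsg() here */

    copy = malloc(len > 0 ? len : 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, data, len);    /* the copy survives the next library call */
    *out_len = len;
    return copy;
}
```

With the real library, the stub call would be replaced by TET_get_image_data(tet, doc, &len, imageid, optlist), and a NULL result would be followed by a call to TET_get_errmsg( ) as recommended above.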

184 Chapter 10: TET Library API Reference
