17.05.2014 Views

PDFlib Text Extraction Toolkit (TET) Manual


C++ int open_document_callback(void *opaque, size_t filesize,
        size_t (*readproc)(void *opaque, void *buffer, size_t size),
        int (*seekproc)(void *opaque, long offset),
        string optlist)

C int TET_open_document_callback(TET *tet, void *opaque, size_t filesize,
        size_t (*readproc)(void *opaque, void *buffer, size_t size),
        int (*seekproc)(void *opaque, long offset),
        const char *optlist)

Open a PDF document from a custom data source for content extraction.

opaque A pointer to user data associated with the input PDF document. This pointer is passed as the first parameter of the callback functions and can be used in any way. TET does not use the opaque pointer in any other way.

filesize The size of the complete PDF document in bytes.

readproc A C callback function that copies size bytes to the memory pointed to by buffer. If the end of the document is reached, it may copy fewer bytes than requested. The function must return the number of bytes copied.

seekproc A C callback function that sets the current read position in the document. offset denotes the position relative to the beginning of the document (0 meaning the first byte). The function must return 0 on success and -1 otherwise.

optlist An option list specifying document options according to Table 10.3.

Returns See TET_open_document().

Details See TET_open_document().

Bindings This function is only available in the C and C++ language bindings.
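The callback contract described above can be illustrated with a minimal sketch of a readproc/seekproc pair that serves a PDF held in memory. The struct mem_source and the function names mem_read and mem_seek are hypothetical names chosen for this illustration and are not part of TET; only the two callback signatures are taken from the prototype above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user data passed as "opaque": a PDF in memory plus a cursor. */
typedef struct {
    const unsigned char *data;  /* first byte of the document */
    size_t size;                /* total size in bytes (the "filesize" value) */
    size_t pos;                 /* current read position */
} mem_source;

/* readproc: copy up to "size" bytes into "buffer" and return the number of
 * bytes actually copied, which is smaller than "size" near the end. */
static size_t mem_read(void *opaque, void *buffer, size_t size)
{
    mem_source *src = (mem_source *) opaque;
    size_t avail = src->size - src->pos;

    if (size > avail)
        size = avail;
    memcpy(buffer, src->data + src->pos, size);
    src->pos += size;
    return size;
}

/* seekproc: set the read position relative to the beginning of the
 * document; return 0 on success and -1 for an out-of-range offset. */
static int mem_seek(void *opaque, long offset)
{
    mem_source *src = (mem_source *) opaque;

    if (offset < 0 || (size_t) offset > src->size)
        return -1;
    src->pos = (size_t) offset;
    return 0;
}
```

Assuming a valid TET object tet and a filled-in mem_source src, the document would then presumably be opened with TET_open_document_callback(tet, &src, src.size, mem_read, mem_seek, ""), and the returned handle checked as described for TET_open_document(). A file-backed source would follow the same pattern, with fread() and fseek() in place of the memory copy.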

C++ void close_document(int doc)

C# Java void close_document(int doc)

Perl PHP TET_close_document(resource tet, long doc)

VB Sub close_document(doc As Long)

C void TET_close_document(TET *tet, int doc)

Release a document handle and all internal resources related to that document.

doc A valid document handle obtained with TET_open_document*().

Details Closing a document automatically closes all of its open pages. All open documents and pages will be closed automatically when TET_delete() is called. It is good programming practice, however, to close documents explicitly when they are no longer needed. Closed document handles must no longer be used in any function call.

