17.05.2014 Views

PDFlib Text Extraction Toolkit (TET) Manual


3 <strong>TET</strong> Library Language Bindings<br />

This chapter discusses specifics for the language bindings which are supplied for the<br />

<strong>TET</strong> library. The <strong>TET</strong> distribution contains full sample code for several small <strong>TET</strong> applications<br />

in all supported language bindings.<br />

3.1 Exception Handling<br />

Errors of a certain kind are called exceptions in many languages for good reasons – they<br />

are mere exceptions, and are not expected to occur very often during the lifetime of a<br />

program. The general strategy is to use conventional error reporting mechanisms (read:<br />

special error return codes) for function calls which may go wrong often, and use a<br />

special exception mechanism for those rare occasions which don’t justify cluttering the<br />

code with conditionals. This is exactly the approach that <strong>TET</strong> takes: some operations can be<br />

expected to go wrong rather frequently, for example:<br />

> Trying to open a PDF document for which one doesn’t have the proper password (but<br />

see also the shrug feature described in Section 5.1, »Indexing protected PDF Documents«,<br />

page 49);<br />

> Trying to open a PDF document with a wrong file name;<br />

> Trying to open a PDF document which is damaged beyond repair.<br />

<strong>TET</strong> signals such errors by returning a value of –1 as documented in the API reference.<br />

Other events may be considered harmful, but will occur rather infrequently, e.g.<br />

> running out of virtual memory;<br />

> supplying wrong function parameters (e.g. an invalid document handle);<br />

> supplying malformed option lists;<br />

> a required resource (e.g. a CMap file for CJK text extraction) cannot be found.<br />

When <strong>TET</strong> detects such a situation, an exception will be thrown instead of passing a special<br />

error return value to the caller. In languages which support native exceptions,<br />

the exception is thrown using the standard means supplied by the language<br />

or environment. For the C language binding <strong>TET</strong> supplies a custom exception<br />

handling mechanism which must be used by clients (see Section 3.2, »C Binding«, page<br />

22).<br />

It is important to understand that processing a document must stop once an<br />

exception has occurred. The only methods which can safely be called after an exception are<br />

<strong>TET</strong>_delete( ), <strong>TET</strong>_get_apiname( ), <strong>TET</strong>_get_errnum( ), and <strong>TET</strong>_get_errmsg( ). Calling any<br />

other method after an exception may lead to unexpected results. The exception will<br />

contain the following information:<br />

> A unique error number;<br />

> The name of the API function which caused the exception;<br />

> A descriptive text containing details of the problem.<br />
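C has no native exceptions, so a C library has to provide its own mechanism. The following is a minimal sketch of how such a mechanism could look, assuming it is built on the standard setjmp/longjmp facility; the manual does not say how the TET C binding is implemented, and every name here (exc_ctx, exc_throw, exc_page_count, exc_process) is hypothetical and not part of the TET API. It shows the two points made above: the exception carries an error number, an API name and a descriptive text, and after catching it the code stops processing and only queries that information.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical context for illustration; NOT the real TET object. */
typedef struct {
    int errnum;            /* unique error number */
    const char *apiname;   /* name of the function that threw */
    const char *errmsg;    /* descriptive text */
    int in_try;            /* nonzero while a try block is active */
    jmp_buf env;           /* where to resume when an exception is thrown */
} exc_ctx;

/* Record the error details and jump to the active catch branch. */
static void exc_throw(exc_ctx *ctx, int errnum,
                      const char *apiname, const char *errmsg)
{
    assert(ctx != NULL && ctx->in_try);
    ctx->errnum = errnum;
    ctx->apiname = apiname;
    ctx->errmsg = errmsg;
    longjmp(ctx->env, 1);
}

/* Stand-in for an API call that throws on an invalid document handle. */
static int exc_page_count(exc_ctx *ctx, int doc)
{
    if (doc < 0)
        exc_throw(ctx, 42, "exc_page_count", "invalid document handle");
    return 3;   /* made-up page count for the sketch */
}

/* Returns the page count, or -1 if an exception was caught. */
static int exc_process(exc_ctx *ctx, int doc)
{
    volatile int pages = -1;

    ctx->in_try = 1;
    if (setjmp(ctx->env) == 0) {
        /* "try" branch: normal processing */
        pages = exc_page_count(ctx, doc);
    } else {
        /* "catch" branch: stop processing, only report the error details */
        fprintf(stderr, "Error %d in %s(): %s\n",
                ctx->errnum, ctx->apiname, ctx->errmsg);
        pages = -1;
    }
    ctx->in_try = 0;
    return pages;
}
```

In this sketch, calling exc_process() with a negative handle ends up in the catch branch, where nothing but the stored error number, function name and message is touched.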

Querying the reason for a failed function call. Some <strong>TET</strong> function calls, e.g.<br />

<strong>TET</strong>_open_document( ) or <strong>TET</strong>_open_page( ), can fail without throwing an exception (they<br />

return -1 in case of an error). In this situation the functions <strong>TET</strong>_get_errnum( ), <strong>TET</strong>_get_errmsg( ),<br />

and <strong>TET</strong>_get_apiname( ) can be called immediately after a failed function call in order to<br />

retrieve details about the nature of the problem.<br />
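The return-code pattern described above can be sketched as follows. This is a self-contained illustration with hypothetical stand-in functions (demo_open_document, demo_get_errnum, demo_get_apiname, demo_get_errmsg) and a made-up error number; it mirrors the roles of the TET functions named in the text but does not reproduce their real signatures, which are documented in the API reference.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical context for illustration; NOT the real TET object. */
typedef struct {
    int errnum;            /* number of the most recent error */
    const char *apiname;   /* function which reported it */
    char errmsg[128];      /* descriptive text */
} demo_ctx;

/* Stand-in for an "open" call: returns a handle >= 0, or -1 on failure. */
static int demo_open_document(demo_ctx *ctx, const char *filename)
{
    assert(ctx != NULL);
    if (filename == NULL || filename[0] == '\0') {
        ctx->errnum = 1;   /* made-up error number */
        ctx->apiname = "demo_open_document";
        snprintf(ctx->errmsg, sizeof ctx->errmsg,
                 "cannot open file '%s'", filename ? filename : "");
        return -1;
    }
    return 0;              /* made-up document handle */
}

/* Accessors which are meaningful right after a failed call. */
static int demo_get_errnum(const demo_ctx *ctx) { return ctx->errnum; }
static const char *demo_get_apiname(const demo_ctx *ctx) { return ctx->apiname; }
static const char *demo_get_errmsg(const demo_ctx *ctx) { return ctx->errmsg; }

/* Open a document; on failure, query the details immediately. */
static int demo_open_or_report(demo_ctx *ctx, const char *filename)
{
    int doc = demo_open_document(ctx, filename);

    if (doc == -1) {
        fprintf(stderr, "Error %d in %s(): %s\n",
                demo_get_errnum(ctx), demo_get_apiname(ctx),
                demo_get_errmsg(ctx));
    }
    return doc;
}
```

The point of the design is that a frequent failure such as a wrong file name is handled with an ordinary conditional at the call site, and the error details are read before any further call can overwrite them.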
