PDFlib Text Extraction Toolkit (TET) Manual
3.2 C Binding<br />
Exception handling. The <strong>TET</strong> API provides a mechanism for acting upon exceptions<br />
thrown by the library, compensating for the lack of native exception handling<br />
in the C language. With the <strong>TET</strong>_TRY( ) and <strong>TET</strong>_CATCH( ) macros, client code can be set up<br />
so that a dedicated piece of code is invoked for error handling and cleanup when an<br />
exception occurs. These macros set up two code sections: the try clause, with code that<br />
may throw an exception, and the catch clause, with code that acts upon an exception.<br />
If any API function called in the try block throws an exception, program execution<br />
continues immediately at the first statement of the catch block. The following<br />
rules must be obeyed in <strong>TET</strong> client code:<br />
> <strong>TET</strong>_TRY( ) and <strong>TET</strong>_CATCH( ) must always be paired.<br />
> <strong>TET</strong>_new( ) will never throw an exception; since a try block can only be started with a<br />
valid <strong>TET</strong> object handle, <strong>TET</strong>_new( ) must be called outside of any try block.<br />
> <strong>TET</strong>_delete( ) will never throw an exception, and therefore can safely be called outside<br />
of any try block. It can also be called in a catch clause.<br />
> Special care must be taken with variables that are used in both the try and catch<br />
blocks. Since the compiler doesn’t know about the transfer of control from one block<br />
to the other, it might generate inappropriate code (e.g., register variable optimizations)<br />
in this situation.<br />
Fortunately, there is a simple rule to avoid this kind of problem: Variables used in<br />
both the try and catch blocks must be declared volatile. Using the volatile keyword signals<br />
to the compiler that it must not apply dangerous optimizations to the variable.<br />
> If a try block is left (e.g., with a return statement, thus bypassing the invocation of<br />
the corresponding <strong>TET</strong>_CATCH( )), the <strong>TET</strong>_EXIT_TRY( ) macro must be called before the<br />
return statement to inform the exception machinery.<br />
> As in all <strong>TET</strong> language bindings, document processing must stop once an exception<br />
has been thrown.<br />
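The volatile rule can be illustrated without the library itself. The sketch below is not TET code:<br />
it assumes that the try/catch macros behave like the standard C setjmp( )/longjmp( ) pair, and the<br />
names process_page( ) and run_pages( ) are hypothetical stand-ins invented for this illustration.<br />
It shows a counter that is modified in the "try" part and read in the "catch" part, which is<br />
exactly the situation in which the variable should be declared volatile.<br />

```c
#include <assert.h>
#include <setjmp.h>

/* Jump target shared by the "try" and "throw" sides (illustration only). */
static jmp_buf env;

/* Hypothetical stand-in for a library call; it "throws" on page 3. */
static void process_page(int pageno)
{
    assert(pageno > 0);
    if (pageno == 3)
        longjmp(env, 1);            /* transfer control to the "catch" part */
}

/* Returns the page at which processing stopped, or 0 if all pages succeeded. */
int run_pages(int last_page)
{
    /* Modified after setjmp() and read after longjmp(): must be volatile,
     * otherwise its value in the "catch" part is indeterminate. */
    volatile int pageno = 0;

    if (setjmp(env) == 0) {
        /* "try" part */
        for (pageno = 1; pageno <= last_page; pageno++)
            process_page(pageno);
        return 0;
    }

    /* "catch" part: pageno reliably holds the failing page number */
    return pageno;
}
```

With these stand-ins, run_pages(5) stops at page 3, while run_pages(2) completes and returns 0.<br />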
The following code fragment demonstrates these rules with the typical idiom for dealing<br />
with <strong>TET</strong> exceptions in client code (a full sample can be found in the <strong>TET</strong> package):<br />
volatile int pageno;<br />
...<br />
if ((tet = <strong>TET</strong>_new()) == (<strong>TET</strong> *) 0)<br />
{<br />
printf("out of memory\n");<br />
return(2);<br />
}<br />
<strong>TET</strong>_TRY(tet)<br />
{<br />
for (pageno = 1; pageno