PDFlib Text Extraction Toolkit (TET) Manual
3.5 Java Binding<br />
Installing the <strong>TET</strong> Java edition. <strong>TET</strong> is organized as a Java package with the name<br />
com.pdflib.<strong>TET</strong>. This package relies on a native JNI library; both pieces must be configured<br />
appropriately.<br />
To make the JNI library available, perform the step that applies to your platform:<br />
> On Unix systems the library libtet_java.so (on Mac OS X: libtet_java.jnilib) must be<br />
placed in one of the default locations for shared libraries, or in an appropriately configured<br />
directory.<br />
> On Windows the library pdf_tet.dll must be placed in the Windows system directory,<br />
or a directory which is listed in the PATH environment variable.<br />
The <strong>TET</strong> Java package is contained in the tet.jar file and contains a single class called tet.<br />
To make this package available to your application, add tet.jar to your<br />
CLASSPATH environment variable, add the option -classpath tet.jar to your calls to the<br />
Java compiler, or perform equivalent steps in your Java IDE.<br />
Independently of the class path, the Java VM must be able to locate the native JNI library.<br />
In the JDK you can configure the Java VM to search for native libraries in a given<br />
directory by setting the java.library.path property to the name of the directory, e.g.<br />
java -Djava.library.path=. extractor<br />
You can check the value of this property as follows:<br />
System.out.println(System.getProperty("java.library.path"));<br />
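When the native library is not found, it can help to check each directory in java.library.path for the expected file. The following is a minimal sketch, not part of the TET distribution: the class NativeLibraryCheck and its methods are hypothetical helpers, and the base name "tet_java" is an assumption derived from the Unix file name libtet_java.so mentioned above. The file name that System.mapLibraryName() produces depends on the platform and JDK, so it may differ from the names listed above (for example on Windows or on newer Mac OS X JDKs).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class NativeLibraryCheck {
    /** Splits a path string such as java.library.path into its directories. */
    static List<File> searchDirs(String libraryPath) {
        List<File> dirs = new ArrayList<>();
        if (libraryPath == null) {
            return dirs;
        }
        // File.pathSeparator is ":" on Unix and ";" on Windows.
        for (String entry : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                dirs.add(new File(entry));
            }
        }
        return dirs;
    }

    /** Returns the first regular file called fileName in the path, or null if none exists. */
    static File locate(String libraryPath, String fileName) {
        for (File dir : searchDirs(libraryPath)) {
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        // Assumption: the JNI library's base name is "tet_java".
        String fileName = System.mapLibraryName("tet_java");
        File found = locate(path, fileName);
        System.out.println("java.library.path = " + path);
        System.out.println(found != null
                ? "Found " + found.getAbsolutePath()
                : fileName + " not found in java.library.path");
    }
}
```

Run the check with the same -Djava.library.path setting that you use for your application, so that it inspects the same directories.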
Exception handling. The <strong>TET</strong> language binding for Java will throw native Java exceptions<br />
of the class <strong>TET</strong>Exception. <strong>TET</strong> client code must use standard Java exception syntax:<br />
<strong>TET</strong> tet = null;<br />
try {<br />
    tet = new <strong>TET</strong>(); /* create the <strong>TET</strong> object */<br />
    ...<strong>TET</strong> method invocations...<br />
} catch (<strong>TET</strong>Exception e) {<br />
    System.err.println("<strong>TET</strong> exception occurred:");<br />
    System.err.println("[" + e.get_errnum() + "] " + e.get_apiname() + ": " +<br />
        e.get_errmsg());<br />
} catch (Exception e) {<br />
    System.err.println(e.getMessage());<br />
} finally {<br />
    if (tet != null) {<br />
        tet.delete(); /* delete the <strong>TET</strong> object */<br />
    }<br />
}<br />
Since the <strong>TET</strong> methods declare appropriate throws clauses, client code must either catch all possible<br />
exceptions or declare them in its own throws clauses.<br />
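The two options, declaring the exception or catching it, can be illustrated without the TET library. The sketch below uses a hypothetical stand-in class DemoException, which is not the real TETException. It only mirrors the three accessor names used in the example above (get_errnum, get_apiname, get_errmsg). The error number and the API name "demo_call" are invented for illustration.

```java
class CatchOrDeclareSketch {
    /** Hypothetical stand-in for a checked library exception; NOT the real TETException. */
    static class DemoException extends Exception {
        private final int errnum;
        private final String apiname;

        DemoException(int errnum, String apiname, String errmsg) {
            super(errmsg);
            this.errnum = errnum;
            this.apiname = apiname;
        }

        int get_errnum() { return errnum; }
        String get_apiname() { return apiname; }
        String get_errmsg() { return getMessage(); }
    }

    /** Option 1: declare the checked exception and leave handling to the caller. */
    static void openOrThrow(String filename) throws DemoException {
        if (filename == null || filename.isEmpty()) {
            // Invented error number and API name, for illustration only.
            throw new DemoException(1, "demo_call", "no file name given");
        }
    }

    /** Option 2: catch the exception locally and format it like the example above. */
    static String openAndReport(String filename) {
        try {
            openOrThrow(filename);
            return "ok";
        } catch (DemoException e) {
            return "[" + e.get_errnum() + "] " + e.get_apiname() + ": " + e.get_errmsg();
        }
    }

    public static void main(String[] args) {
        System.out.println(openAndReport("input.pdf"));
        System.out.println(openAndReport(""));
    }
}
```

A method that neither catches DemoException nor lists it in a throws clause would be rejected by the Java compiler. The same rule applies to client code that calls methods with declared throws clauses.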
26 Chapter 3: <strong>TET</strong> Library Language Bindings