17.05.2014

PDFlib Text Extraction Toolkit (TET) Manual


devserver (1)$ ant search
Buildfile: build.xml
compile:
search:
[java] Enter query:
PDFlib
[java] Searching for: pdflib
[java] 5 total matching documents
[java] 1. ../data/PDFlib-datasheet.pdf
[java] Title: PDFlib, PDFlib+PDI, Personalization Server Datasheet
[java] 2. ../data/Whitepaper-PDFA-with-PDFlib-products.pdf
[java] Title: Whitepaper: Creating PDF/A with PDFlib
[java] 3. ../data/FontReporter.pdf
[java] Title: PDFlib FontReporter 1.3 Manual
[java] 4. ../data/TET-PDF-IFilter-datasheet.pdf
[java] Title: PDFlib TET PDF IFilter Datasheet
[java] 5. ../data/Whitepaper-XMP-metadata-in-PDFlib-products.pdf
[java] Title: Whitepaper: XMP Metadata support in PDFlib Products
[java] Press (q)uit or enter number to jump to a page.
q
[java] Enter query:
title:FontReporter
[java] Searching for: title:fontreporter
[java] 1 total matching documents
[java] 1. ../data/FontReporter.pdf
[java] Title: PDFlib FontReporter 1.3 Manual
[java] Press (q)uit or enter number to jump to a page.
q
[java] Enter query:
BUILD SUCCESSFUL
Total time: 57 seconds

Two queries have been performed: one for the word PDFlib in the text, and another one for the word FontReporter in the title field. Note that q must be entered to leave the result paging mode before the next query can be started.

All paths and filenames in the Ant build.xml file are defined via properties so that the file can be used in different environments. Properties can be overridden on the command line, in a file build.properties, or in the platform-specific files windows.properties or unix.properties. For example, to run the sample with a Lucene JAR file which is installed under /tmp you can invoke Ant as follows:

ant -Dlucene.jar=/tmp/lucene-core-2.4.0.jar index
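The same override can be kept in build.properties instead of being typed on every invocation. The fragment below is a minimal sketch, assuming the file uses the standard Java properties syntax (key=value, with # for comments) and that lucene.jar is the only property that needs to change; the property name and path are taken from the command line above, and nothing else about the file is implied.

```properties
# build.properties (illustrative sketch)
# Location of the Lucene JAR; same value as -Dlucene.jar on the command line.
lucene.jar=/tmp/lucene-core-2.4.0.jar
```

With such a file in place, running ant index without the -D option should pick up the JAR from /tmp. A property supplied with -D on the command line still takes precedence over a value read from a properties file.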

Testing TET and Lucene with the demo Web application. The Lucene demo Web application can be deployed on any Java servlet container such as Tomcat or GlassFish. The required steps are described in the HTML documentation that comes with Lucene, also available online at lucene.apache.org/java/2_4_0/demo3.html.

Note the Configuration step on that page: you must make the location of the index known to the Web application by entering it in the file configuration.jsp. The path to add here would be /bind/lucene/index if Ant was run without overriding the property for the location of the Lucene index.

38 Chapter 4: TET Connectors
