PDFlib Text Extraction Toolkit (TET) Manual
unsupported types 87<br />
XMP metadata 82<br />
inch 64<br />
index (XSLT sample) 102<br />
installing TET 7<br />
J<br />
Java binding 26<br />
L<br />
license key 8<br />
ligatures 62<br />
list values in option lists 122<br />
Lucene search engine 37<br />
M<br />
MediaWiki 46<br />
millimeters 64<br />
mini samples 13<br />
N<br />
.NET binding 27<br />
number of pages 105<br />
O<br />
optimizing performance 54<br />
option lists 121<br />
Oracle Text 41<br />
P<br />
packages 60<br />
page boxes 64<br />
page size 106<br />
pCOS 105<br />
  API functions 153<br />
  data types 107<br />
  encryption 119<br />
  path syntax 110<br />
  pseudo objects 112<br />
pCOS Cookbook 14<br />
pCOS interface 105<br />
PDF Reference Manual 105<br />
PDF versions 11<br />
performance optimization 54<br />
Perl binding 28<br />
PHP binding 29<br />
placed images 81<br />
points 64<br />
portfolios 60<br />
post-processing for Unicode values 68<br />
prerotated glyphs 67<br />
protected documents 49<br />
PUA (Private Use Area) 68<br />
Python binding 31<br />
R<br />
raw text extraction (XSLT sample) 103<br />
rectangles in option lists 122<br />
replacement character 69<br />
resource configuration 51<br />
resourcefile parameter 53<br />
response file 17<br />
roadmap to documentation and samples 13<br />
RPG binding 32<br />
S<br />
schema 96<br />
searching for font usage (XSLT sample) 102<br />
searchpath 52<br />
sequences 62<br />
shadow removal 73<br />
shrug feature 49<br />
small image removal 86<br />
Solr search server 40<br />
surrogates 63, 65<br />
T<br />
table detection 74<br />
table extraction (XSLT sample) 103<br />
TET command-line tool 15<br />
TET connector 35<br />
  for Lucene 37<br />
  for MediaWiki 46<br />
  for Microsoft products 44<br />
  for Oracle 41<br />
  for Solr 40<br />
TET Cookbook 14<br />
TET features 11<br />
TET Markup Language (TETML) 89<br />
TET plugin for Adobe Acrobat 35<br />
tet.upr 53<br />
TET_CATCH() 127<br />
TET_close_document() 133<br />
TET_close_page() 140<br />
TET_create_pvf() 125<br />
TET_delete() 123<br />
TET_delete_pvf() 126<br />
TET_EXIT_TRY() 22, 127<br />
TET_get_apiname() 127<br />
TET_get_char_info() 143<br />
TET_get_errmsg() 127<br />
TET_get_errnum() 127<br />
TET_get_image_data() 148<br />
TET_get_image_info() 145<br />
TET_get_text() 142<br />
TET_get_xml_data() 150<br />
TET_new() 123<br />
TET_open_document() 129<br />
TET_open_document_callback() 133<br />
TET_open_page() 134<br />
TET_pcos_get_number() 153<br />