PDFlib Text Extraction Toolkit (TET) Manual
Small image filtering. TET ignores very small images if many of them are present on the
page. Since the image merging process often combines many small images into a larger
image, small image removal is performed after image merging. Only images which cannot
be merged to form a larger image are candidates for small image removal. In addition,
they must satisfy the conditions for size and count which can be specified in the
maxarea and maxcount suboptions of the smallimages option of TET_open_page() and
TET_process_page().
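The rule described above can be pictured with a small sketch. This is not TET's implementation; it is one plausible reading of the text, assuming that an image counts as small when its area is at most maxarea, and that small images are dropped only when more than maxcount of them remain after merging. The Image type, its merged flag, and the function name are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Image:
    width: float
    height: float
    merged: bool = False  # hypothetical flag: True if this image is the result of merging


def filter_small_images(images: List[Image], maxarea: float, maxcount: int) -> List[Image]:
    """Illustrative only: drop small, unmerged images if there are too many of them."""
    # Candidates are images that were not merged and whose area is at most maxarea
    # (assumed interpretation of the size condition).
    small = [img for img in images
             if not img.merged and img.width * img.height <= maxarea]

    # Assumed interpretation of the count condition: removal only applies
    # when the number of small images exceeds maxcount.
    if len(small) <= maxcount:
        return list(images)

    small_ids = {id(img) for img in small}
    return [img for img in images if id(img) not in small_ids]
```

With five 2x2 images and one large merged image, a call with maxarea=10 and maxcount=3 would keep only the large image under these assumptions, while maxcount=5 would keep all six.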
In order to completely disable small image removal, use the following page option:
imageanalysis={smallimages={disable}}
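As a minimal sketch, the option string can also be assembled programmatically before it is passed as a page option list. The helper below is hypothetical and not part of TET. It assumes that maxarea and maxcount are nested inside smallimages in the same way as disable in the option above, and that suboptions are separated by spaces. The numeric values are arbitrary and are not TET defaults.

```python
def smallimages_optlist(maxarea=None, maxcount=None, disable=False):
    """Build a page option list string for small image filtering (hypothetical helper)."""
    if disable:
        # Exactly the option shown in the manual text.
        return "imageanalysis={smallimages={disable}}"

    subopts = []
    if maxarea is not None:
        subopts.append(f"maxarea={maxarea}")
    if maxcount is not None:
        subopts.append(f"maxcount={maxcount}")
    if not subopts:
        raise ValueError("specify maxarea, maxcount, or disable=True")

    # Assumed nesting: suboptions inside smallimages, inside imageanalysis.
    return "imageanalysis={smallimages={" + " ".join(subopts) + "}}"


# Arbitrary illustrative values, not TET defaults.
print(smallimages_optlist(maxarea=100, maxcount=10))
print(smallimages_optlist(disable=True))
```

The resulting string would then be supplied as the option list argument of TET_open_page() or TET_process_page().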
Chapter 7: Image Extraction