Abstract book (pdf) - ICPR 2010


15:50-16:10, Paper MoBT6.2

Document Segmentation using Pixel-Accurate Ground Truth

An, Chang, Lehigh Univ.

Yin, Dawei, Lehigh Univ.

Baird, Henry, Lehigh Univ.

We compare methodologies for trainable document image content extraction using three ground-truth policies: loose, tight, and pixel-accurate. The goal is to achieve pixel-accurate segmentation of document images. Which ground-truth policy is best has been debated. "Loose" truth is obtained by sweeping rectangles to enclose entire text blocks, etc., and can be an efficient manual task. "Tight" truth requires more care, and more time, to enclose individual text lines. Pixel-accurate truth, in which only foreground pixels are labeled, can be obtained by applying the PARC PixLabeler tool; in our experience this tool was as quick to use as loose truthing. We have compared the accuracy of all three truthing policies, and report that tight truth supports higher accuracy than loose truth, and pixel-accurate truth yields the highest accuracy. We have also experimented with morphological expansion of pixel-accurate truth, in which sets of foreground pixels are expanded morphologically, and report that expanded pixel-accurate truth supports higher accuracy than unexpanded pixel-accurate truth.
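To illustrate what morphological expansion of a pixel-accurate truth mask could look like, here is a minimal sketch of binary dilation on a small 0/1 mask. The abstract does not state the structuring element or the amount of expansion, so the square neighbourhood, the `radius` parameter, and the function name `dilate` are assumptions made for illustration only, not the authors' procedure.

```python
def dilate(mask, radius=1):
    """Binary dilation of a 2-D 0/1 mask.

    Assumption: a square (2*radius + 1) x (2*radius + 1) structuring
    element; the abstract does not say which element was used.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Mark every pixel in the neighbourhood of a foreground pixel,
                # clipping the neighbourhood at the image border.
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = 1
    return out


# Hypothetical pixel-accurate truth: a single labeled foreground pixel.
truth = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

expanded = dilate(truth, radius=1)
for row in expanded:
    print("".join(str(v) for v in row))
```

With `radius=1` the single labeled pixel grows into a 3 x 3 block, so the expanded truth also covers the immediate surroundings of each foreground pixel.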

16:10-16:30, Paper MoBT6.3

An Adaptive Script-Independent Block-Based Text Line Extraction<br />

Ziaratban, Majid, Amirkabir Univ. of Technology<br />

Faez, Karim, Amirkabir Univ. of Technology<br />

In this paper, a novel script-independent block-based text line extraction technique is proposed for multi-skewed document images. Three parameters are defined to adapt the method to various writings. Extensive experiments on different datasets demonstrate that the proposed algorithm outperforms previous methods.

16:30-16:50, Paper MoBT6.4

Automated Quality Assurance for Document Logical Analysis<br />

Meunier, Jean-Luc, XRCE<br />

We consider here the general problem of converting documents available in print-ready or image format into a structured format that reflects the logical structure of the document. One aspect of the problem involves reconstructing conventional constructs such as titles, headings, captions, footnotes, etc. In practice, another important aspect involves putting in place some automated Quality Assessment (QA) method. We propose here a method to automate the QA in the case of a homogeneous collection by considering multiple documents at once instead of focusing only on the document being processed.

16:50-17:10, Paper MoBT6.5

The PAGE (Page Analysis and Ground-Truth Elements) Format Framework<br />

Pletschacher, Stefan, Univ. of Salford<br />

Antonacopoulos, Apostolos, Univ. of Salford<br />

There is a plethora of established and proposed document representation formats, but none that can adequately support individual stages within an entire sequence of document image analysis methods (from document image enhancement to layout analysis to OCR) and their evaluation. This paper describes PAGE, a new XML-based page image representation framework that records information on image characteristics (image borders, geometric distortions and corresponding corrections, binarisation, etc.) in addition to layout structure and page content. The suitability of the framework for the evaluation of entire workflows as well as individual stages has been extensively validated by using it in high-profile applications, such as public contemporary and historical ground-truthed datasets and the ICDAR Page Segmentation competition series.

MoBT7 Dolmabahçe Hall C<br />

Computer Aided Detection and Diagnosis Regular Session<br />

Session chair: Unal, Gozde (Sabanci Univ.)<br />

