Abstract book (pdf) - ICPR 2010

09:20-09:40, Paper ThAT7.2

HMM-Based Word Spotting in Handwritten Documents using Subword Models

Fischer, Andreas, Univ. of Bern

Keller, Andreas, Univ. of Bern

Frinken, Volkmar, Univ. of Bern

Bunke, Horst, Univ. of Bern

Handwritten word spotting aims at making document images amenable to browsing and searching by keyword retrieval.

In this paper, we present a word spotting system based on Hidden Markov Models (HMM) that uses trained subword models

to spot keywords. With the proposed method, arbitrary keywords can be spotted that do not need to be present in the

training set. Also, no text line segmentation is required. On the modern IAM off-line database and the historical George

Washington database we show that the proposed system outperforms a standard template matching approach based on dynamic

time warping (DTW).
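
The DTW baseline mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify features, local distance, or normalization, so the one-dimensional feature vectors, the Euclidean local cost, and the helper names `dtw_distance` and `rank_candidates` are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors.

    Assumption: Euclidean local cost, no path-length normalization,
    no warping-window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned to one more frame of b
                cost[i][j - 1],      # b[j-1] aligned to one more frame of a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def rank_candidates(template, candidates):
    """Sort hypothetical candidate word images (name -> sequence) by DTW cost."""
    return sorted(candidates, key=lambda name: dtw_distance(template, candidates[name]))


# Toy data: a keyword template and two candidate word images (hypothetical).
template = [(0.0,), (1.0,), (2.0,)]
candidates = {
    "stretched_match": [(0.0,), (1.0,), (1.0,), (2.0,)],
    "different_word": [(2.0,), (2.0,), (0.0,)],
}
ranking = rank_candidates(template, candidates)
```

Template matching of this kind needs a sample image of each keyword, which is the limitation the subword HMM approach is stated to remove.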

09:40-10:00, Paper ThAT7.3

A Content Spotting System for Line Drawing Graphic Document Images<br />

Luqman, Muhammad Muzzamil, Univ. François Rabelais de Tours, France; CVC Barcelona

Brouard, Thierry, Univ. François Rabelais de Tours, France

Ramel, Jean-Yves, Univ. François Rabelais de Tours<br />

Llados, Josep, Computer Vision Center<br />

We present a content spotting system for line drawing graphic document images. The proposed system is sufficiently

domain-independent and takes keyword-based information retrieval for graphic documents one step further, to Query

By Example (QBE) and focused retrieval. In offline learning mode, we vectorize the documents in the repository,

represent them by attributed relational graphs, extract regions of interest (ROIs) from them, convert each ROI to a fuzzy

structural signature, cluster similar signatures to form ROI classes, and build an index for the repository. In online

querying mode, a Bayesian network classifier recognizes the ROIs in the query image, and the corresponding documents

are fetched by looking them up in the repository index. Experimental results are presented for synthetic images of architectural

and electronic documents.
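
The final lookup step described in the abstract above, fetching documents through a repository index keyed by ROI class, can be sketched as a simple inverted index. This is only an illustration under stated assumptions: the abstract does not describe the index structure or the ranking, so the vote-count ranking, the function names `build_index` and `lookup`, and the toy labels are hypothetical. The vectorization, graph representation, fuzzy signatures, clustering, and Bayesian network classifier are outside the sketch; it starts from ROI class labels that are assumed to be already recognized.

```python
from collections import defaultdict


def build_index(repository):
    """Offline step: map each ROI class label to the documents containing it.

    `repository` is assumed to be a dict: document id -> list of ROI class labels.
    """
    index = defaultdict(set)
    for doc_id, roi_classes in repository.items():
        for label in roi_classes:
            index[label].add(doc_id)
    return index


def lookup(index, query_classes):
    """Online step: rank documents by how many query ROI classes they contain."""
    votes = defaultdict(int)
    for label in set(query_classes):
        for doc_id in index.get(label, ()):
            votes[doc_id] += 1
    # Most votes first; ties broken alphabetically for a stable order.
    return sorted(votes, key=lambda doc_id: (-votes[doc_id], doc_id))


# Toy repository with hypothetical document ids and ROI class labels.
repository = {
    "plan_a": ["door", "window"],
    "plan_b": ["door"],
    "circuit_c": ["resistor"],
}
index = build_index(repository)
hits = lookup(index, ["door", "window"])
```

Here `hits` lists `plan_a` before `plan_b`, because `plan_a` contains both query classes and `plan_b` only one.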

10:00-10:20, Paper ThAT7.4

Toward Massive Scalability in Image Matching<br />

Moraleda, Jorge, Ricoh Innovations Inc.<br />

Hull, Jonathan, Ricoh<br />

A method for image matching from partial blurry images is presented that leverages existing text retrieval algorithms to<br />

provide a solution that scales to hundreds of thousands of images. As an initial application, we present a document image<br />

matching system in which the user supplies a query image of a small patch of a paper document taken with a cell phone<br />

camera, and the system returns a label identifying the original electronic document if found in a previously indexed collection.<br />

Experimental results show that a retrieval rate of over 70% is achieved on a collection of nearly 500,000 document<br />

pages.<br />

10:20-10:40, Paper ThAT7.5<br />

Learning Image Anchor Templates for Document Classification and Data Extraction<br />

Sarkar, Prateek, Palo Alto Res. Center<br />

Image anchor templates are used in document image analysis for document classification, data localization, and other<br />

tasks. Current tools allow human operators to mark out small sub-images from documents to act as anchor templates.<br />

However, this requires time and expertise, because operators have to make informed decisions based on the behavior of the

template matching algorithms and the expected degradation patterns in documents. We propose learning templates for a

task automatically and quickly from a few training examples. Document classification or data localization can be done<br />

more robustly by combining evidence from many more discriminating templates (e.g., hundreds) than would be practicable<br />

for operators to specify.<br />
