
WWW/Internet - Portal do Software Público Brasileiro



ISBN: 978-972-8939-25-0 © 2010 IADIS

Figure 4. Monthly process counts. (NHocr on Maggie)

These results show that there are a lot of applications of OCR in which privacy does not matter. Statistics of processed image types can be found in our previous report (Goto 2007).

3.2 Applications of WeOCR

The WeOCR platform has been used in various ways.

For example, majority logic is known to be useful for combining multiple classifiers and improving the accuracy of character recognition (Miyao 2004, Tabaru 1998). We also investigated the accuracy improvement based on majority logic using the WeOCR (OCRGrid) platform (Goto 2006). In the experiments, some private OCR engines were deployed on a local WeOCR platform in our laboratory. A client program sends each character image to multiple WeOCR servers, collects the recognition results from the servers, and chooses the most popular character as the top character candidate. A WeOCR server itself can also become a client. Since the methods based on majority logic require a lot of OCR engines with different characteristics, the Grid-based approach should be quite useful.

People in the “Seeing with Sound – The vOICe –” project (Meijer 1996) have made an e-mail interface for mobile camera phones. The server receives an image from the mobile phone and sends the recognized text data back to the user. An OCR engine on the WeOCR platform is used as a back-end server. The system was originally developed to help visually disabled people read text on signboards, etc. Any OCR developer can help those people through the WeOCR platform.

Table 2 shows the WeOCR application software/servers known to the author so far. These applications use the WeOCR platform instead of a built-in OCR library. All the applications except the last one were made independently by developers having no special relationship to our laboratory. In addition, some other experimental WeOCR client programs and web services have been spotted on the Internet. Thus, our WeOCR platform has been accepted as a useful building block by many developers, and has created new OCR application designs and usage styles suited to the Grid/Cloud Computing era. In fact, any developer can add OCR functions to their programs quite easily using the WeOCR platform.

A potential future application of the WeOCR (OCRGrid) platform is multilingual processing. Thousands of languages exist all over the world. ABBYY FineReader, one of the world’s leading OCR packages, can handle around 180 languages so far. However, spell checking is supported for only about 40 languages. It seems impossible to have a large number of languages and dialects supported by OCR engines from only a couple of companies. The WeOCR platform can provide a lot of community-supported OCR servers with localized dictionaries. We will be able to obtain better recognition results, since we can expect that the servers for a language are better maintained in the countries where the language is used. In addition, having a lot of OCR servers for various languages is very useful for research on multilingual OCR systems.
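
The majority-logic client described in Section 3.2 can be illustrated with a minimal Python sketch. It is not the code used in the experiments: the names `Recognizer`, `majority_vote` and `recognize_by_majority` are hypothetical, and each recognizer is assumed to be a callable that wraps one request to an OCR server and returns a single character (or `None` on failure). The actual WeOCR request format is not shown here. Ties are broken in favor of the character that was returned first, which is also an assumption.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical type: a function that sends one character image to one
# OCR server and returns the recognized character, or None on failure.
Recognizer = Callable[[bytes], Optional[str]]


def majority_vote(candidates: Iterable[Optional[str]]) -> List[Tuple[str, int]]:
    """Rank characters by how many recognizers returned them.

    Empty results (None or "") are ignored. Ties keep first-seen order,
    because Counter preserves insertion order and sorted() is stable.
    """
    counts = Counter(c for c in candidates if c)
    return sorted(counts.items(), key=lambda item: -item[1])


def recognize_by_majority(image: bytes,
                          recognizers: Iterable[Recognizer]) -> Optional[str]:
    """Query every recognizer and return the most popular character."""
    results: List[Optional[str]] = []
    for recognize in recognizers:
        try:
            results.append(recognize(image))
        except Exception:
            # Assumption: an unreachable server simply does not vote.
            results.append(None)
    ranked = majority_vote(results)
    return ranked[0][0] if ranked else None


if __name__ == "__main__":
    # Stand-in recognizers; real ones would contact OCR servers.
    engines = [lambda img: "O", lambda img: "0", lambda img: "O"]
    print(recognize_by_majority(b"fake-image-bytes", engines))
```

Because `recognize_by_majority` has the same shape as a single recognizer once the server list is fixed, a voting client can itself be passed in as one of the recognizers, which mirrors the remark that a WeOCR server can also become a client.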
