
WWW/Internet - Portal do Software Público Brasileiro


IADIS International Conference WWW/Internet 2010

WIKLANG – A DEFINITION ENVIRONMENT FOR MONOLINGUAL AND BILINGUAL DICTIONARIES TO SHALLOW-TRANSFER MACHINE TRANSLATION

Aléssio Miranda Júnior and Laura S. García
C3SL, Computer Department, Federal University of Paraná – UFPR, Curitiba-Pr, Brazil

ABSTRACT

In a time when the most successful development efforts in Machine Translation (MT) are based on closed software, Apertium has become a mature, interesting, open-source alternative. However, one of the main obstacles to improving its results and to its popularization is the absence of a specific interface for managing its linguistic knowledge; the resulting difficulty reduces the number of potential collaborators in the development of language pairs. In the present paper, we propose an interaction-interface environment that abstracts the concepts of the system and the textual process available for the development of bilingual or monolingual dictionaries. In addition, it can capture and organize knowledge in a simple manner for non-experts in Computing, leading to the growth and development of new language pairs for MT.

KEYWORDS

Apertium, Machine Translation Systems, WiKLaTS, interfaces, knowledge management, WiKLang.

1. INTRODUCTION

Machine Translation (MT) (Hutchins, 1992) is the traditional term for the semi- or fully automated process whereby a text or utterance in one natural language (the source language) is translated into another natural language (the target language), resulting in an intelligible text that preserves certain features of the source text, such as style, cohesion and meaning.

In recent years, the successful instances of MT have always been sets of closed software and knowledge bases, distributed as static products with a commercial purpose. This model has a major disadvantage: it makes it difficult to study improvements to the architecture and techniques, and even to develop language pairs that lack a financial incentive.

This situation hinders the development of open-source software able to transform this technology into something open and democratic. In this sense, creating opportunities for the community to contribute even more to the evolution and development of this interdisciplinary area, which involves Computer Science and Linguistics, has become a relevant challenge.

Among the different types of MT software that have appeared following the open-source model and focusing on the above-mentioned objectives, we have chosen the "Opentrad Apertium Project" (Apertium) (Forcada, 2008), a consolidated free/open-source MT platform that is in constant evolution and has significant results in the literature (Tyers, 2009 and Forcada, 2006).

Apertium is a shallow-transfer machine translator based on superficial syntactic rules that uses its own knowledge (data) base, with an open and flexible structure in XML. It provides an engine and toolbox that allow users to build their own machine translation systems by writing only the data. The data consist, on a basic level, of three dictionaries and transfer rules.

Shallow-transfer systems such as Apertium, based on superficial syntactic rules, use superficial information drawn from the structure of the sentence and avoid the in-depth semantic details within that structure. In general, this type of MT is more efficient and can stand in for a complete syntactic analysis with satisfactory results.
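To make the "dictionaries plus transfer rules" idea concrete, the following is a minimal sketch in Python and not Apertium's actual engine. It assumes a heavily simplified bilingual dictionary fragment in the style of Apertium's XML format (entries made of <e>, <p>, <l>, <r> and <s n="..."/> elements), and a single hypothetical reordering rule (adjective + noun becomes noun + adjective). The sample entries, the function names load_bidix and transfer, and the pre-analysed input tokens are all illustrative assumptions; gender agreement and morphological analysis are deliberately left out.

```python
import xml.etree.ElementTree as ET

# Simplified bilingual dictionary fragment (illustrative, not from the paper).
BIDIX = """<dictionary>
  <sdefs>
    <sdef n="n"/>
    <sdef n="adj"/>
  </sdefs>
  <section id="main" type="standard">
    <e><p><l>house<s n="n"/></l><r>casa<s n="n"/></r></p></e>
    <e><p><l>white<s n="adj"/></l><r>branco<s n="adj"/></r></p></e>
  </section>
</dictionary>"""


def side(elem):
    """Return (lemma, tags) for one <l> or <r> element."""
    lemma = elem.text or ""
    tags = tuple(s.get("n") for s in elem.findall("s"))
    return lemma, tags


def load_bidix(xml_text):
    """Build a source -> target lookup table from the <p> pairs."""
    root = ET.fromstring(xml_text)
    table = {}
    for pair in root.iter("p"):
        table[side(pair.find("l"))] = side(pair.find("r"))
    return table


def transfer(tokens, table):
    """Lexical transfer followed by one hypothetical structural rule.

    tokens: list of (lemma, tags) pairs, assumed already analysed.
    """
    # Lexical transfer: unknown tokens pass through unchanged.
    out = [table.get(tok, tok) for tok in tokens]
    # Structural transfer (assumed rule): adj + n -> n + adj.
    i = 0
    while i < len(out) - 1:
        if out[i][1] == ("adj",) and out[i + 1][1] == ("n",):
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out


if __name__ == "__main__":
    table = load_bidix(BIDIX)
    print(transfer([("white", ("adj",)), ("house", ("n",))], table))
    # prints [('casa', ('n',)), ('branco', ('adj',))]
```

The rule inspects only part-of-speech tags on adjacent words, which is the sense in which the transfer is "shallow": no parse tree or semantic representation is built. A real language pair would also need the monolingual dictionaries to analyse the input and to generate correctly inflected output (here, "branca" rather than "branco").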
