25.08.2013 Views

PDF (Online Text) - EURAC


such as German in this special case, are just some of the problems Rumantsch and
other small languages have to face in order to propose acceptable terminology and
preserve the language at the same time. The project on the ‘Welsh National Terminology
Database’ reflects the need to find a middle ground between the accepted terminology
standards used for bigger languages (the ISO 704 and ISO 860 norms) and language
preservation. This project takes advantage of the similarities between terminology and
lexicography, as existing lexicographical resources and applications are used to enrich
the terminology database.

Another central topic that lesser used languages have in common is the usability
of available data. On the one hand, we find the contribution on Judeo-Spanish, in which
Roussi & Stulic describe how to transliterate and annotate texts written in Hebrew
characters and, at the same time, allow users to add their own interpretations and
comments. On the other hand, Uchechukwu explains in his contribution the problems
related to appropriate font programmes and software compatibility. Taking the Igbo
language as an example, he describes what happens when the amount of data is
considerable but not usable (because it is not available in an accepted format).

Issues of data sparseness and usability shape linguistic research, especially
during the data pre-processing phases, as well as the amount of time linguists must
invest in addressing linguistic research questions. Uemlianin proposes using
SpeechCluster so that linguists can concentrate on linguistic analyses
rather than dissipating their efforts on formatting or other time-consuming manual
processing.

Trosterud emphasises the importance of open-source technology for projects
on lesser used languages, so as to avoid wasting time and technology that would
otherwise have to be reinvented for every small language. The same point of
view is shared by Stuflesser and Streiter, who present their intention to use XNLRDF,
a free software package for NLP. Their contribution introduces the existing prototype
and outlines future strategies.

A similar aim is pursued by the invited keynote speaker Oliver Streiter, who focuses
on this topic, providing a detailed overview of available resources and underlining the
importance of mutual support within the research community through data sharing
in standard formats, so as to make the data usable and accessible to everybody. One of
the instruments cited and used most often for data sharing is the Internet, as it allows
online storage of data such as dictionaries, language games or terminology databases
(Jones & Prys). This medium is used by Canolfan Bedwyr to publish its web-based
word games for Welsh, as well as by the Ladin institutions to disseminate their online
