25.08.2013 Views

PDF (Online Text) - EURAC

Implementing NLP-Projects for Small Languages: Instructions for Funding Bodies, Strategies for Developers

Oliver Streiter

This research starts from the assumption that ‘Small Language’ Projects (SLPs) and ‘Big Language’ Projects (BLPs) are conducted under different conditions. These differences have far-reaching consequences that go beyond the material conditions of projects. We therefore try to identify strategies and techniques for handling the problems that result. A central idea we put forward is pooling the resources to be developed with other, similar Open Source resources. We elaborate the expected advantages of this approach and suggest that it is of such crucial importance that funding organisations should make it a condicio sine qua non of the project contract.

1. Introduction: Small Language & ‘Big Language’ Projects - An Analysis of their Differences

Implementing NLP-projects for Small Languages: Is this an issue that requires special attention? Are Small Language Projects (SLPs) different from ‘Big Language’ Projects (BLPs)? What might happen if SLPs are handled in the same way as BLPs? What are the risks? How can they be reduced? Can we formulate general guidelines so that such projects might be conducted more safely? Although the processing of minority languages and other Small Languages has been the subject of a series of workshops, these questions have barely been tackled as such. While most contributions discuss specific achievements (e.g. an implementation, or the transfer of a technique from Big Languages to Small Languages), only a few articles move to a higher level of reflection on how Small Language Projects might be conducted in general.

In this contribution, we compare SLPs and BLPs at an abstract, schematic level. This comparison reveals differences that affect, among other things, the status of the researcher, the research paradigm to be chosen, the attractiveness of the research for young researchers, and the persistence and availability of the elaborated data - all to the disadvantage of Small Languages. We advance one far-reaching solution that overcomes some of these problems inherent to SLPs, namely, to pool the developed resources with other similar Open Source resources and make
