
Natural Language Processing for Less Privileged Languages


[Figure 2 (image not reproduced). Dimension labels: Linguistic Study, Computerization, NLP / CL, Other Privileges (Finance, Human Resources, Equipment, Socio-Political Support, etc.). Scale: Less Privileged to More Privileged. Endpoints: Source, Destination.]

Figure 2: The four dimensions of the problem: The Source is where we come from and Destination is where we are going. The problem is to go from the Source to the Destination and the solution is non-trivial.

3.5 Other Privileges

One of the major reasons why building language resources and providing language processing capabilities for the L-languages is going to be a very difficult task is the fact that these languages lack the privileges which make it possible to build language resources and NLP/CL tools. By 'privileges' we mean the availability of finance, equipment, human resources, and even political and social support for reducing the lack of computing and language processing support for the L-languages. The lack of such 'privileges' may be the single biggest reason which is holding back the progress towards providing computing and language processing support for these languages.

4 Some (Partially) Successful Efforts

The problem seems to be insurmountable, but there has been some progress. More importantly, the urgency of solving this problem (even if partially) is being realized by more and more people.
Some recent events or efforts which tried to address the problem and which have had some impact in improving the situation are:

• The LREC conferences and workshops 2.
• Workshop on "Shallow Parsing in South Asian Languages", IJCAI-07, India.
• EMELD and the Digital Tools Summit in Linguistics, 2006, USA.
• Workshop on Language Resources for European Minority Languages, 1998, Spain.
• Projects supported by ELRA on the Basic Language Resource Kit (BLARK), which targets the specification of a minimal kit for each language to support NLP tools development 3.
• There is also a corresponding project at LDC (the Less Commonly Taught Languages 4).
• The IJCNLP Workshop on Named Entity Recognition for South and South East Asian Languages 5.

This list is, of course, not exhaustive. There are many papers relevant to the theme of this workshop at the IJCNLP 2008 main conference 6, as at some previous major conferences. There is also a very relevant tutorial (Mihalcea, 2008) at the IJCNLP 2008 conference about building resources and tools for languages with scarce resources.

Even the industry is realizing the importance of providing computing support for some of the L-languages. In the last few years there have been many announcements about the addition of some

2 www.lrec-conf.org
3 http://www.elda.org/blark
4 http://projects.ldc.upenn.edu/LCTL
5 http://ltrc.iiit.ac.in/ner-ssea-08/
6 http://ijcnlp2008.org
