12.09.2013 Views

Programme booklet (pdf)



PRESENTATION ABSTRACTS

Im chattin :-) u wanna NLP it: Analyzing Reduction in Chat

Abstract

van Halteren, Hans 1 and Martell, Craig 2 and Du, Caixia 3 and Gu, Yan 3 and Johan, Kobben 3 and Panjaitan, Leequisach 3 and Schubotz, Louise 3 and Vasylenko, Kateryna 4

1 Radboud University Nijmegen
2 Naval Postgraduate School, Monterey
3 ReMa L&C, RUN/UvT
4 ReMa L&C

Modern NLP research attempts to cover the whole spectrum from written to spoken text. Right in the middle we find chat text, a written text type that has many similarities with spoken text. One of these is spelling variation, often reduction, e.g. nite instead of night. If we ever want to analyze or generate chat text, we have to understand the factors behind this spelling behavior, whether that is user experience with SMS, peer group identification by speech spelling, or something else.

This paper contributes by studying spelling reduction in chat text. We investigated cases of reduction in the NPS Chat Corpus. After identifying various types in 2000 posts from the publicly available part of the corpus, we focused on four frequent phenomena: a) wanna (want to) and gonna (going to), b) ya and u (you), c) g-drop in present participles, e.g. findin for finding, and d) apostrophe drop in enclitics, e.g. hes for he’s. For these, we automatically extracted all occurrences of both reduced and full forms in 1Mw from the complete corpus. For each, we also determined features that could influence the choice between the alternating forms, such as the poster’s age group and the immediate context in the post. On this basis, we built regression models to find out which of the features show a significant influence.
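
The extraction step described above could look roughly like the following minimal sketch. It is an illustration only, not the authors' procedure: the pattern table `PHENOMENA`, the helper names `count_forms` and `reduction_rate`, and the sample posts are all hypothetical. The g-drop case is left out because telling findin apart from words such as again would need a word list, and the pattern for going to also matches locative uses, which a real study would have to filter.

```python
import re
from collections import Counter

# Hypothetical pattern table: phenomenon -> (reduced-form regex, full-form regex).
# Only a few of the forms named in the abstract are covered.
PHENOMENA = {
    "wanna": (r"\bwanna\b", r"\bwant\s+to\b"),
    "gonna": (r"\bgonna\b", r"\bgoing\s+to\b"),  # also matches locative "going to"
    "you": (r"\b(?:ya|u)\b", r"\byou\b"),
    "apostrophe": (r"\bhes\b", r"\bhe['’]s\b"),
}


def count_forms(posts):
    """Count reduced and full forms per phenomenon over an iterable of posts."""
    counts = {name: Counter() for name in PHENOMENA}
    for post in posts:
        text = post.lower()
        for name, (reduced, full) in PHENOMENA.items():
            counts[name]["reduced"] += len(re.findall(reduced, text))
            counts[name]["full"] += len(re.findall(full, text))
    return counts


def reduction_rate(counter):
    """Share of reduced forms among all occurrences, or None if there are none."""
    total = counter["reduced"] + counter["full"]
    return counter["reduced"] / total if total else None


if __name__ == "__main__":
    # Made-up posts for illustration; these are not taken from the NPS Chat Corpus.
    sample_posts = [
        "u wanna chat?",
        "Do you want to go",
        "im gonna sleep, hes gone",
        "he's going to call ya",
    ]
    for name, counter in count_forms(sample_posts).items():
        print(name, dict(counter), reduction_rate(counter))
```

Per-occurrence records (form chosen, age group, neighbouring words) rather than totals would be needed as input for the regression models mentioned above; the counts here only show how the alternating forms can be located.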

In the paper, we present the main findings and relate them to the factors identified in the literature as being active in spoken text.

Corresponding author: hvh@let.ru.nl
