
Programme booklet (pdf)


CLIN 21 – CONFERENCE PROGRAMME

Using easy distributed computing for data-intensive processing

Van den Bogaert, Joachim

Centre for Computational Linguistics, K.U. Leuven

Abstract

Given the large amounts of data involved in deriving useful information from large corpora, and the difficulty and cost of running parallel code with traditional parallel computing, we will present different frameworks that may be used to facilitate easy distributed computing. Using string-to-tree alignment (GHKM), frequent subtree mining, and distributed Moses decoding as example cases, we will demonstrate how applications and algorithms may be scaled up and scaled out with these frameworks. We will consider both the creation of an embarrassingly parallel solution and the redesign of an existing algorithm to fit the MapReduce paradigm.

Corresponding author: joachim@ccl.kuleuven.be
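
The two approaches named in the abstract can be illustrated with a minimal sketch. It is not taken from the talk and does not use any of the frameworks presented there: bigram counting is a hypothetical stand-in for the actual example cases, a local thread pool stands in for the nodes of a cluster, and all function names are assumptions made for illustration. The map step is embarrassingly parallel because each sentence is processed independently; the shuffle and reduce steps then combine the partial results by key.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import chain


def map_sentence(sentence):
    """Map step: emit a (bigram, 1) pair for every adjacent token pair."""
    tokens = sentence.lower().split()
    return [((a, b), 1) for a, b in zip(tokens, tokens[1:])]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the partial counts collected for one key."""
    return key, sum(values)


def count_bigrams(corpus, workers=4):
    # Each map call depends only on its own sentence, so the calls can run
    # in any order on any worker. The thread pool is only a local stand-in
    # (an assumption of this sketch) for distributed worker nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = pool.map(map_sentence, corpus)
        pairs = list(chain.from_iterable(mapped))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


# Toy corpus, invented for this example.
corpus = [
    "the cat sat on the mat",
    "the cat ate",
    "on the mat",
]
counts = count_bigrams(corpus)
```

Here `counts[("the", "cat")]` is 2, since the bigram occurs in the first two sentences. Keeping the map and reduce functions free of shared state is what would allow the same logic to be handed to a distributed framework; the grouping done by `shuffle` is the part such a framework would normally take over.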
