18.07.2013 Views

The Corpus Thread - Det Danske Sprog- og Litteraturselskab

Chapter 9

Corpus specifications

The ingredients

Deliverables concerned

D18 Final version of corpus: Final version of POS-tagged corpus of 45 million words available for the DK-CLARIN repository and accessible through a web-based (or other) concordance tool. Outcome: Resource with documentation.

Outline

9.1 Corpus composition

9.2 Text material

9.2.1 Wikipedia

9.3 Corpus access

This chapter describes the composition of the corpus, the text material included, and how the corpus can be accessed.

9.1 Corpus composition

The following table shows the sources from which the text material included in the DK-CLARIN corpus was drawn.
