18.07.2013 Views

The Corpus Thread - Det Danske Sprog- og Litteraturselskab



3.3. Filling in the header

Legal values: Legal values follow the TEI specifications (see footnote 35).

Value         Description
nil           Info has not been determined yet
empty         Info is irrelevant, non-existent, or undeterminable
original      Original, untranslated version of the text (default)
translation   The text is a translation
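
Because the value set above is closed, a header value can be checked mechanically before it is written. The following is a minimal sketch in Python, not part of the corpus tooling described here: the function name `resolve_value` is hypothetical, and falling back to `original` when no value is supplied is an assumption based only on the "default" remark in the table.

```python
from typing import Optional

# Legal values exactly as listed in the table above.
LEGAL_VALUES = {
    "nil": "Info has not been determined yet",
    "empty": "Info is irrelevant, non-existent, or undeterminable",
    "original": "Original, untranslated version of the text",
    "translation": "The text is a translation",
}

# The table marks "original" as the default value.
DEFAULT_VALUE = "original"


def resolve_value(raw: Optional[str]) -> str:
    """Return a legal value from the closed set.

    Assumption: a missing value (None) falls back to the default.
    Anything outside the closed set is rejected.
    """
    if raw is None:
        return DEFAULT_VALUE
    value = raw.strip()
    if value not in LEGAL_VALUES:
        raise ValueError(
            f"illegal value {value!r}; expected one of {sorted(LEGAL_VALUES)}"
        )
    return value
```

Rejecting unknown strings instead of silently mapping them to `nil` keeps the set closed in practice; `nil` then remains reserved for information that has not been determined yet.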

⊲ tdDomain

The domain the text is associated with.

Properties

Value set type: enumerated, closed
XML name: vs_tdDomain.xml
Legal values: The full set of 66 DDOC domain values is used, as experiments using it for automatic domain classification were promising; see Asmussen (2005) (footnote 36). The 66 values can be looked up in the following XML document: DDOC domain values.
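
A closed value set stored in an XML file can be loaded once and then used for membership checks. The sketch below is illustrative only: the layout of vs_tdDomain.xml is not shown in this section, so the element name `value`, the attribute `name`, and the placeholder entries in `SAMPLE` are all assumptions, and the function `load_value_set` is hypothetical. The real 66 domain values are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the content of a value set file.
# The real file layout and the real domain values may differ.
SAMPLE = """<valueSet name="tdDomain" type="enumerated" closed="true">
  <value name="placeholder-domain-1"/>
  <value name="placeholder-domain-2"/>
</valueSet>"""


def load_value_set(xml_text: str) -> frozenset:
    """Collect the 'name' attribute of every <value> element (assumed layout)."""
    root = ET.fromstring(xml_text)
    return frozenset(
        el.get("name") for el in root.iter("value") if el.get("name")
    )


def is_legal(value: str, value_set: frozenset) -> bool:
    """A closed value set admits only the listed values."""
    return value in value_set
```

With the real file, the text passed to `load_value_set` would be read from vs_tdDomain.xml, and the element and attribute names adjusted to match its actual structure.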

⊲ tdDomainDiscourse

Describes whether the discourse is domain-specific, i.e. whether the language used in the text can be categorized as language for general purposes or language for specific purposes.

Properties

Value set type: enumerated, closed
XML name: vs_tdDomainDiscourse.xml

35 http://www.tei-c.org/release/doc/tei-p5-doc/html/ref-derivation.html

36 http://korpus.dsl.dk/staff/ja/papers/cl2005_asmussen.latex.pdf
