18.07.2013 Views

The Corpus Thread - Det Danske Sprog- og Litteraturselskab



3.3. Filling in the header

Legal values: Four-digit date. If the year of text creation is not known, textCreationYear is set to the same value as publDate.
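The fallback rule for textCreationYear can be sketched as follows. This is only an illustrative sketch, not part of the CTB tooling: the function name `resolve_text_creation_year` is hypothetical, an unknown year is assumed to be represented as `None` or an empty string, and publDate is assumed to already be a four-digit value.

```python
import re
from typing import Optional

# Exactly four ASCII digits, e.g. "2009".
FOUR_DIGIT_YEAR = re.compile(r"[0-9]{4}")


def resolve_text_creation_year(text_creation_year: Optional[str], publ_date: str) -> str:
    """Return textCreationYear, falling back to publDate when the year is unknown.

    Hypothetical helper: None or "" stands for "year of text creation not known".
    """
    # Rule from the header description: unknown creation year -> use publDate.
    value = text_creation_year if text_creation_year else publ_date
    # Legal values: a four-digit date.
    if not FOUR_DIGIT_YEAR.fullmatch(value):
        raise ValueError(f"not a four-digit value: {value!r}")
    return value
```

For instance, `resolve_text_creation_year(None, "2009")` yields the publDate value, while a known year such as `"1998"` is returned unchanged.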

⊲ textFileName

Name of the source file from which this text is drawn, usually the name of the file in which the text was delivered. The organization that collected the text is responsible for keeping a copy of the source file in an archive if it wants to enable future corrections or modifications of the CTB version of the text that depend on information contained only in the source file.

Properties
Value set type: descriptive
XML name: n/a
Legal values: Any legal (path and) filename pointing to the source file in the archive.

⊲ textId

Unique text identifier.

Properties
Value set type
  system: descriptive
  prefixes listed below: enumerated, open
XML name
  system: n/a
  prefixes: vs_textId.xml

Legal values: Values for textId of textIdType “ctb” (cf. below): Specified 10-digit integer. Identifiers of this type are composed as follows: The first two digits (from the left) indicate the project framework within which the texts were collected (which may be a framework other than DK-CLARIN). The first two digits can thus be viewed as a kind of prefix. The following set of prefixes of textIdType “ctb” is used:
