The Corpus Thread - Det Danske Sprog- og Litteraturselskab
3.3. Filling in the header
Legal values     Four-digit date. If the year of text creation is not known,
                 textCreationYear is set to the same value as publDate.
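
The fallback rule for textCreationYear can be sketched as follows. This is only an illustration, not part of the CTB tooling: the function name resolve_text_creation_year is hypothetical, and the sketch assumes that publDate is itself available as a four-digit year string.

```python
import re
from typing import Optional

# Hypothetical helper, not part of any CTB tool: applies the rule that
# textCreationYear falls back to publDate when the creation year is unknown.
FOUR_DIGITS = re.compile(r"\d{4}")


def resolve_text_creation_year(text_creation_year: Optional[str],
                               publ_date: str) -> str:
    """Return a legal textCreationYear value (a four-digit string).

    Assumption: publ_date is already a four-digit year string.
    """
    # Unknown creation year (None or empty string): reuse publDate.
    value = text_creation_year if text_creation_year else publ_date
    if not FOUR_DIGITS.fullmatch(value):
        raise ValueError(f"not a four-digit date: {value!r}")
    return value
```

With these assumptions, resolve_text_creation_year(None, "2009") yields "2009", while a known creation year such as "1987" is returned unchanged.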
⊲ textFileName
Name of the source file from which this text is drawn; this is usually
the name of the file in which the text was delivered. The organization that
collected the text is responsible for keeping a copy of its source file
in an archive if it wants to enable future corrections or modifications
of the CTB version of the text that depend on information contained
only in the source file.
Properties
Value set type   descriptive
XML name         n/a
Legal values     Any legal (path and) filename pointing to the source
                 file in the archive.
⊲ textId
Unique text identifier.
Properties
Value set type   system: descriptive
                 prefixes listed below: enumerated, open
XML name         system: n/a
                 prefixes: vs_textId.xml
Legal values     Values for textId of textIdType “ctb” (cf. below): Specified
                 10-digit integer. Identifiers of this type are composed as follows:
                 The first two digits (from the left) indicate the project framework
                 within which the texts were collected (which may be a framework
                 other than DK-CLARIN). Thus, the first two digits can be viewed as
                 a kind of prefix. The following set of prefixes of textIdType “ctb”
                 is used: