The Corpus Thread - Det Danske Sprog- og Litteraturselskab
3.3. Filling in the header
and 99999999 [26] in the case of integers and dates, to indicate that this
particular information is missing. It should be added if it exists or, if it
turns out that the information definitely does not exist, be marked as
non-existent. To sum up, the following constant symbols are used as values
for header elements and attributes, unless otherwise stated further below in
this section [27]:
Symbol     Type                   Meaning
empty      String                 Info is non-existent
anonymous  Names                  Person is unknown
0          Integer                Info is non-existent
1000       Date/Year              Info is non-existent
nil        String                 Info has not been determined yet
99999999   Integer and Date/Year  Info has not been determined yet
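
The table of constant symbols can be read as a lookup from a (type, symbol) pair to a status. The following Python fragment is a minimal sketch of that reading, not part of the corpus tooling: the names `Status`, `SENTINELS` and `classify`, and the type labels "string", "name", "integer" and "date", are assumptions made for illustration. It also assumes that integer and date values have already been converted to Python integers.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical labels for the meanings listed in the table."""
    NON_EXISTENT = "info is non-existent"
    UNKNOWN_PERSON = "person is unknown"
    UNDETERMINED = "info has not been determined yet"
    AVAILABLE = "info is available"


# (type label, constant symbol) -> meaning, transcribed from the table.
# The type labels are illustrative assumptions, not names used by TEI.
SENTINELS = {
    ("string", "empty"): Status.NON_EXISTENT,
    ("name", "anonymous"): Status.UNKNOWN_PERSON,
    ("integer", 0): Status.NON_EXISTENT,
    ("date", 1000): Status.NON_EXISTENT,
    ("string", "nil"): Status.UNDETERMINED,
    ("integer", 99999999): Status.UNDETERMINED,
    ("date", 99999999): Status.UNDETERMINED,
}


def classify(kind, value):
    """Return the status of a header value of the given type.

    Any value that is not one of the constant symbols for its type
    is treated as real, available information.
    """
    return SENTINELS.get((kind, value), Status.AVAILABLE)
```

For example, `classify("date", 1000)` yields `Status.NON_EXISTENT`, while `classify("date", 1999)` yields `Status.AVAILABLE`. Keying the lookup on the type as well as the symbol matters because the same symbol need not be a constant for every type: 1000 marks a non-existent date but is an ordinary integer.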
In all other cases, that is, where the desired information is available, the
values listed in Section 3.3.2.1 are used to replace the header variables
indicated in the full header template above. For each of these variables a
description is given, followed by an overview of its properties and, in the
case of enumerated sets, a list of legal values. Where these lists are too
long to print, they are replaced by a link to an XML version of them. All
value sets are also accessible as XML files and may be referenced
automatically or manually when filling in headers. All value set files are
found under the path http://korpus.dsl.dk/clarin/corpus-doc/text-header/.
The filenames themselves are given below [28]. The structure of the XML value
set files is as shown in the following extract. The structure has been
designed for this specific purpose (i.e. it is not TEI) and should be fairly
self-explanatory:
[26] In former versions of the documentation the ‘undetermined’ value was -1 (minus one).
However, TEI does not allow negative values for some of its integer datatypes, which is
why it has been replaced.
[27] In cases where TEI does not allow the undetermined/non-existent values defined here,
the elements of the value sets are restricted to those that are accepted by TEI. This is the case
for the following attributes: cert in, sex in, mode in, type
in, level in.
[28] As these are XML files, a web browser may not display them well formatted. Viewing
them as page source may help, though.