18.07.2013 Views

The Corpus Thread - Det Danske Sprog- og Litteraturselskab

3.3. Filling in the header

and 99999999 [26] in the case of integers and dates, to indicate that this particular information is missing and should be added if it exists or, if it turns out that the information definitely does not exist, marked as non-existent. To sum up, the following constant symbols are used as values for header elements and attributes, unless otherwise stated further below in this section: [27]

Symbol     Type                    Meaning
empty      String                  Info is non-existent
anonymous  Names                   Person is unknown
0          Integer                 Info is non-existent
1000       Date/Year               Info is non-existent
nil        String                  Info has not been determined yet
99999999   Integer and Date/Year   Info has not been determined yet
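The constant symbols listed above can be read as a small lookup from type to sentinel value. The following Python sketch illustrates that reading only. The dictionary names, the "kind" labels and the function `classify` are hypothetical and do not come from the corpus documentation. It also assumes that a value of type Names is only ever compared against `anonymous`.

```python
# Sentinel values copied from the table of constant symbols. The dictionary
# names, the "kind" labels and classify() itself are illustrative assumptions.
NON_EXISTENT = {"string": "empty", "integer": 0, "date": 1000}
UNDETERMINED = {"string": "nil", "integer": 99999999, "date": 99999999}
UNKNOWN_PERSON = "anonymous"  # applies to values of type Names


def classify(value, kind):
    """Classify a header value as a sentinel or as real information."""
    if kind == "name":
        # Assumption: names are only checked against the 'anonymous' symbol.
        return "unknown person" if value == UNKNOWN_PERSON else "available"
    if kind not in NON_EXISTENT:
        raise ValueError(f"unsupported kind: {kind!r}")
    if value == NON_EXISTENT[kind]:
        return "non-existent"
    if value == UNDETERMINED[kind]:
        return "undetermined"
    return "available"


print(classify("nil", "string"))      # undetermined
print(classify(1000, "date"))         # non-existent
print(classify(2013, "date"))         # available
print(classify("anonymous", "name"))  # unknown person
```

Note that the same number can mean different things depending on the type: 1000 marks a non-existent date or year, but it would be an ordinary value for an integer field, which is why the sketch keys the lookup on the type.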

In all other cases, that is, where the desired information is available, the values listed in Section 3.3.2.1 are used to replace the header variables indicated in the full header template above. For each of these variables a description is given, followed by an overview of its properties and, in the case of enumerated sets, a list of legal values. Where these lists are too long, they are replaced by a link to an XML version of them. All value sets are also accessible as XML files and may be referenced automatically or manually when filling in headers. All value set files are found under the path http://korpus.dsl.dk/clarin/corpus-doc/text-header/. The filenames themselves are given below. [28] The structure of the XML value set files is as shown in the following extract. The structure has been designed for this specific purpose (i.e. it is not TEI) and it should be fairly self-explanatory:


[26] In former versions of the documentation the ‘undetermined’ value was -1 (minus one). However, TEI does not allow negative values for some of its integer datatypes, which is why it has been replaced.

[27] In cases where TEI does not allow the undetermined/non-existent values defined here, the elements of the value sets are restricted to those that are accepted by TEI. This is the case for the following attributes: cert in, sex in, mode in, type in, level in.

[28] As these are XML files, a web browser may not display them well formatted. Viewing them as HTML source may help, though.
