25.08.2013 Views

PDF (Online Text) - EURAC


own tags. Since verbal derivation is written conjunctively (like word formation in European languages), a single ‘verb’ tag (V) proved sufficient (cf. Table 4). As with parts of tense morphology and with word formation in European languages, an analysis of Northern Sotho verbal derivations is left to a separate tool (e.g. to a morphological analyser; see the discussion in Taljard & Bosch 2005).

Other tags cover invariable lexical items:

• adverbs (ADV) and numerals (NUM);
• tense/mood/aspect markers for present tense (PRES), future (FUT), and progressive (PROG);
• auxiliaries (AUX) and copulative verbs (VCOP);
• ideophones (IDEO); and,
• different (semantically defined) kinds of particles that mark a hortative (HORT), questions (QUE), as well as agentive (PAAGEN), connective (PACON), copulative (PACOP), instrumental (PAINS), locative (PALOC) and temporal (PATEMP) constructs.
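The tag inventory listed above can be written down as a simple lookup table, which may be convenient when checking tagger output. The following is only a minimal sketch: the tag labels are taken from the list, while the short glosses, the name `INVARIABLE_TAGS` and the helper `is_particle_tag` are hypothetical and not part of the tagset description itself.

```python
# Tags for invariable lexical items, transcribed from the list above.
# The dictionary name and the glosses are illustrative assumptions.
INVARIABLE_TAGS = {
    "ADV": "adverb",
    "NUM": "numeral",
    "PRES": "present tense marker",
    "FUT": "future marker",
    "PROG": "progressive marker",
    "AUX": "auxiliary",
    "VCOP": "copulative verb",
    "IDEO": "ideophone",
    "HORT": "hortative particle",
    "QUE": "question particle",
    "PAAGEN": "agentive particle",
    "PACON": "connective particle",
    "PACOP": "copulative particle",
    "PAINS": "instrumental particle",
    "PALOC": "locative particle",
    "PATEMP": "temporal particle",
}

# The eight particle tags named in the last bullet of the list.
PARTICLE_TAGS = {
    "HORT", "QUE", "PAAGEN", "PACON", "PACOP", "PAINS", "PALOC", "PATEMP",
}


def is_particle_tag(tag: str) -> bool:
    """Return True if the tag is one of the particle tags (hypothetical helper)."""
    return tag in PARTICLE_TAGS
```

A call such as `is_particle_tag("PALOC")` returns `True`, whereas `is_particle_tag("ADV")` returns `False`.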

In principle, our approach to the design of tagsets for nouns and verbs is similar to that of Van Rooy and Pretorius (2003) for Setswana, but it is much less complex. In the case of verbs, we agree on the allocation of a single tag for verb stem plus suffix(es) as well as on separate tags for verbal prefixes:

“[…] verbs are preceded by a number of prefixes, which are regarded as separate tokens for the purposes of tagging. The verb stem, containing the root and a number of suffixes (as well as the reflexive prefix) receives a single tag.” (Van Rooy & Pretorius 2003:211)

Likewise, for nouns, we agree that at this stage in the development of tagsets, certain subclassifications such as the separate identification of deverbatives should be excluded (cf. Van Rooy & Pretorius 2003:210). Our approach differs from that of Van Rooy and Pretorius, among others, in that a much smaller tagset is compiled for both verbs and nouns. In the case of verbs, we do not consider modal categories, and in the case of nouns, we honour subclasses but not divisions in terms of relational nouns and proper names. Consider the following examples, which illustrate basic differences between the approaches and in the complexity of the tags:

(1) Nouns

a) Mosadi ‘woman’

Tswana (Van Rooy & Pretorius 2003:217):
