26.12.2013 Views

A computational grammar and lexicon for Maltese



as the i/j switching depending on preceding words are not handled at all due to limitations in GF.

• Sandhi rules, which determine stem changes when enclitic pronouns are suffixed to words, are encoded, but there are known exceptions which have not been handled.

• The free word order of Maltese is not captured at all; clauses of a particular type have a fixed word order.

• Relative clauses have not been extensively tested.

• The current version of the grammar does not pass all the treebank tests created during development (see table 2.20).

Evaluation

In addition to the known coverage issues above, it is also relevant to mention limitations in the evaluation process itself.

• The treebanks are developed manually to test known critical areas of the grammar, but there is no measure of how completely they cover the actual Maltese language. Coverage of the RGL abstract syntax and coverage of the target language are two different things.

• The treebank score is defined very simply as the percentage of successful linearisations. This means that the scores are skewed by the number of test trees written to cover each feature of the language. It is likely that in the current treebank set, verb inflections are covered more extensively than any other aspect of Maltese.

• A typical way to test a resource grammar qualitatively is to use it in an application grammar for some particular use case. This was not carried out as part of this work for lack of time.
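The skew in the treebank score described above can be illustrated with a small sketch. The feature labels, the counts, and the function name `treebank_scores` are hypothetical and are not taken from the actual treebanks; the per-feature average is shown only as a contrast with the overall score and is not a measure used in this work.

```python
from collections import defaultdict

def treebank_scores(results):
    """results: iterable of (feature, passed) pairs, one per test tree.

    Returns (overall %, {feature: %}, unweighted mean of the per-feature %).
    """
    counts = defaultdict(lambda: [0, 0])  # feature -> [passed, total]
    for feature, passed in results:
        counts[feature][1] += 1
        if passed:
            counts[feature][0] += 1

    passed_total = sum(p for p, _ in counts.values())
    tree_total = sum(t for _, t in counts.values())

    # Score as defined in the text: percentage of successful linearisations.
    overall = 100.0 * passed_total / tree_total

    # Per-feature percentages, and their unweighted mean for comparison.
    by_feature = {f: 100.0 * p / t for f, (p, t) in counts.items()}
    macro = sum(by_feature.values()) / len(by_feature)
    return overall, by_feature, macro

# Hypothetical treebank: 90 verb trees (all pass), 10 relative-clause trees (5 pass).
results = (
    [("verb", True)] * 90
    + [("relative_clause", True)] * 5
    + [("relative_clause", False)] * 5
)
overall, by_feature, macro = treebank_scores(results)
print(overall, by_feature, macro)
```

With these made-up numbers the overall score is 95%, even though only half of the relative-clause trees linearise correctly; the heavily tested feature dominates the result.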

4.2.2 Lexicon limitations

Database

• As mentioned in section 3.2, the chosen database engine does not allow sorting by a custom collation, which in the case of Maltese means that results shown to the user might not be sorted as they would expect.

• Despite the use of database indexes on searchable fields, open regex searches can become slow to execute, depending on the search term used. To alleviate this, all regex searches received from the web application are prefixed with the beginning-of-string anchor ^. This slightly reduces the search capabilities offered to the user.

• The database engine keeps no versioning or audit trail of any kind, which means that any changes made are irreversible. This introduces a risk of data corruption, which is mitigated only by taking regular database backups as data dumps.
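The effect of the beginning-of-string anchor mentioned above can be sketched as follows. This is a minimal illustration in Python rather than the web application's actual code: the names `anchor_pattern` and `search` are hypothetical, and an in-memory word list stands in for the database.

```python
import re

def anchor_pattern(user_pattern: str) -> str:
    """Prefix a user-supplied regex with ^ unless it is already anchored.

    Note: with a plain prefix, a top-level alternation such as "a|b"
    becomes "^a|b", where only the first alternative is anchored.
    """
    if user_pattern.startswith("^"):
        return user_pattern
    return "^" + user_pattern

def search(words, pattern):
    """Return the words in which the pattern matches anywhere (re.search)."""
    return [w for w in words if re.search(pattern, w)]

# Small stand-in lexicon (hypothetical sample entries).
words = ["kiteb", "kitba", "ktieb", "miktub"]

open_hits = search(words, "kt")                      # open search
anchored_hits = search(words, anchor_pattern("kt"))  # prefix search only
print(open_hits, anchored_hits)
```

Here the open search finds both "ktieb" and "miktub", while the anchored search returns only "ktieb". This is the reduction in search capability noted above: a term can no longer match in the middle or at the end of an entry.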
