A computational grammar and lexicon for Maltese
as the i/j switching depending on preceding words are not handled at all due to limitations
in GF.
• Sandhi rules, which determine stem changes when enclitic pronouns are suffixed to words,
are encoded, but there are known exceptions which have not been handled.
• Maltese's free word order is not captured at all; clauses of a particular type have a fixed
word order.
• Relative clauses have not been extensively tested.
• The current version of the grammar does not pass all the treebank tests created during
the development (see table 2.20).
Evaluation<br />
In addition to the known coverage issues above, it is also relevant to mention limitations in the
evaluation process itself.
• The treebanks are developed manually to test known critical areas of the grammar, but
there is no measure of how completely they cover the actual Maltese language. Coverage
of the RGL abstract syntax and coverage of the target language are two different things.
• The calculation of treebank scores is very simply defined as the percentage of successful
linearisations. This means that these scores are essentially skewed by the number of test
trees written to cover each feature of the language. It is likely that in the current treebank
set, verb inflections are covered more extensively than any other aspect of Maltese.
• A typical way to qualitatively test a resource grammar is by using it in an application
grammar for some particular use case. This was not carried out at all as part of this work
for lack of time.
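The skew described for the treebank scores can be illustrated with a small sketch. The function name `treebank_scores`, the feature labels and all counts below are hypothetical and are not taken from the actual treebanks. The first value returned is the score as defined above (percentage of successful linearisations over all trees). The second is a per-feature average, shown only as a possible point of comparison; it is not a measure used in this work.

```python
from collections import defaultdict

def treebank_scores(results):
    """Compute two scores from (feature, passed) pairs, one pair per test tree.

    Returns (overall, per_feature_mean), both as percentages.
    """
    counts = defaultdict(lambda: [0, 0])  # feature -> [passed, total]
    for feature, passed in results:
        counts[feature][1] += 1
        if passed:
            counts[feature][0] += 1

    total = sum(t for _, t in counts.values())
    passed = sum(p for p, _ in counts.values())

    # Score as defined in the text: successful linearisations over all trees.
    overall = 100.0 * passed / total
    # Alternative for comparison only: every feature weighs the same,
    # regardless of how many trees were written for it.
    per_feature_mean = 100.0 * sum(p / t for p, t in counts.values()) / len(counts)
    return overall, per_feature_mean

# Hypothetical data: many verb-inflection trees, few relative-clause trees.
example = (
    [("verb inflection", True)] * 81 + [("verb inflection", False)] * 9
    + [("relative clause", True)] * 2 + [("relative clause", False)] * 8
)
overall, per_feature_mean = treebank_scores(example)
print(f"overall: {overall:.1f}%  per-feature mean: {per_feature_mean:.1f}%")
```

With these invented counts the heavily tested feature dominates the overall figure, while the per-feature average gives the poorly performing feature equal weight.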
4.2.2 Lexicon limitations<br />
Database<br />
• As mentioned in section 3.2, the chosen database engine does not allow sorting by a custom
collation, which in the case of Maltese means that results shown to the user might
not be sorted as they would expect.
• Despite using database indexes on searchable fields, open regex searches can become
a little slow to execute, depending on the search term used. To alleviate this, all regex
searches received from the web application are prefixed with the beginning-of-string anchor
^. This slightly reduces the search capabilities offered to the user.
• No versioning or auditing of any kind is kept by the database engine, which means that
any changes made are irreversible. This introduces a risk of data corruption which is only
mitigated by having regular database backups as data dumps.
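The effect of anchoring regex searches, as described in the list above, can be sketched as follows. This is an illustration in plain Python rather than the web application's actual code: the helper names `anchor` and `search` and the four-word sample list are assumptions made for the example, and skipping patterns that are already anchored is likewise an assumption.

```python
import re

def anchor(pattern: str) -> str:
    # Prefix the beginning-of-string anchor; leaving an already anchored
    # pattern untouched is an assumption, not documented behaviour.
    return pattern if pattern.startswith("^") else "^" + pattern

def search(pattern: str, lemmas: list[str]) -> list[str]:
    # Return every lemma in which the pattern matches anywhere.
    rx = re.compile(pattern)
    return [w for w in lemmas if rx.search(w)]

# Hypothetical sample of lexicon entries.
lemmas = ["kiteb", "ktieb", "kittieb", "miktub"]

open_hits = search("kt", lemmas)              # open search: match anywhere
anchored_hits = search(anchor("kt"), lemmas)  # anchored: match at start only
print(open_hits, anchored_hits)
```

The anchored search no longer finds entries where the pattern occurs mid-word, which is the reduction in search capability noted above. The likely benefit is that a pattern with a fixed prefix can usually be answered from a sorted index instead of scanning every entry, although whether this holds depends on the database engine.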