A computational grammar and lexicon for Maltese
Chapter 3
Computational lexicon
This chapter begins by presenting a web application designed for collecting the heterogeneous
lexical resources available for Maltese into a single database. After explaining the setup
and implementation of this collection, we go on to describe how it is combined with the
resource grammar from the previous chapter to produce a full-form computational lexicon.
3.1 Method
3.1.1 Sources
The approach adopted in this work for constructing a computational lexicon for Maltese is
to first build a platform where all existing lexical resources can be gathered into a single collection.
While there are some large, high-quality print dictionaries available for Maltese (see
section 1.2.1), the number and size of computational resources are only a fraction of what exists in print. Nevertheless,
the hope is that an open platform for hosting and searching through resources from
heterogeneous sources will be useful in its own right, and will even attract
new lexical resources that may become available in the future. The sources available at the time of
writing were:
• An exhaustive list of all 4,142 root-and-pattern verbs (including hypothetical forms), from
the verbal roots database (Camilleri & Spagnol, 2013).
• A corpus of 654 broken plurals for both nouns and adjectives (Mayer et al., 2013).
• A list of over 2,500 verbal nouns listed in the Aquilina dictionary and other sources (Ellul,
2013).
• A Basic English-Maltese dictionary containing some 5,454 English entries (Falzon, 2012).
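To make the idea of a single collection concrete, the following is a minimal sketch of how entries from the four sources might sit side by side. It is an illustration under stated assumptions, not the implementation described in this work: the field names (`source`, `lemma`, `root`, `plural`, `verbal_noun`, `gloss`) and the helper functions are hypothetical, and the example words are common Maltese words chosen for illustration rather than entries copied from the cited resources.

```python
from collections import Counter

# Hypothetical record layout: every entry is a plain dict tagged with its
# source, and carries only the fields that its source happens to provide.
# The words below are illustrative, not taken from the cited resources.
entries = [
    {"source": "Camilleri & Spagnol, 2013", "lemma": "kiteb", "pos": "verb", "root": "k-t-b"},
    {"source": "Mayer et al., 2013", "lemma": "ktieb", "pos": "noun", "plural": "kotba"},
    {"source": "Ellul, 2013", "lemma": "kitba", "pos": "noun", "verbal_noun": True},
    {"source": "Falzon, 2012", "lemma": "ktieb", "gloss": "book"},
]


def search(collection, lemma):
    """Return every entry for a lemma, whichever source it came from."""
    return [entry for entry in collection if entry.get("lemma") == lemma]


def count_by_source(collection):
    """Count how many entries each source contributes to the collection."""
    return Counter(entry["source"] for entry in collection)
```

Here `search(entries, "ktieb")` returns two entries, one carrying a broken plural and one carrying an English gloss. This is the kind of cross-source lookup a shared collection is meant to allow, even though the two records do not have the same fields.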
3.1.2 Heterogeneous data
Traditional relational databases work with a strict schema system, whereby the structure of all
data is fixed at design time and all entries in the database necessarily conform to this schema.
In this work, however, we are dealing with lexical resources from distinctly different sources,