Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...
Chapter 3. Related Work
knowledge about the world and common-sense knowledge, which can be defined as
knowledge that an ordinary person is expected to know. An important feature of
common-sense knowledge bases is that they are typically very formalised, which
eases their representation as ontologies and provides reasoning capabilities.
Popular LKBs in numbers<br />
To give an idea of the size and contents of the popular LKBs, Tables 3.1
and 3.2 present quantitative information about them. Table 3.1 shows the number of
lexical items included in each LKB, according to their POS. Table 3.2 indicates the
core structure of each LKB, the number of instances of that structure, the number
of different types of relation that may connect two core structures, and the number
of instances of those relations. Where a table cell has no value, the information is
not applicable. For WordNet, the number of relation types includes the direct and the
inverse relations, because they have different names. MindNet, on the other hand,
identifies direct and inverse relations by a directed arrow (e.g. ←Hyp and Hyp→,
respectively for hypernymy and hyponymy), so the number of MindNet relation types
is followed by '×2'.
Due to its different structure, MindNet, which was created automatically, cannot be
compared directly with the other LKBs, all of which are handcrafted. The only figure
that shows that MindNet is larger is the number of relation instances (713k), which is
significantly higher than in WordNet 3.0 (285k). Moreover, as the product of an
automatic approach, MindNet can also grow by processing more text. Richardson et al.
(1998) report that, after processing the Microsoft Encarta 98 encyclopedia, 220k
additional headwords were collected for MindNet. Also, the MindNet website currently
reports a total of 45 different relation types, which is more than the 32 reported in 1998.
These numbers also show that LKBs are much smaller than knowledge bases such as
DBpedia (more than 2.5M concepts and 250M relations) and Freebase (Bollacker
et al., 2008), a collaborative knowledge base (more than 20M concepts and 300M
relations). This is mainly because LKBs are restricted to lexical knowledge,
while the others are much broader and contain a wide range of world knowledge
facts.
Resource             Lexical items
                     Nouns    Verbs   Adjectives  Adverbs  Other  Total
WordNet 3.0 (2006)   117,097  11,488  22,141      4,601    -      155,327
MindNet (1998)       -        -       -           -        -      159,000 headwords
FrameNet (2012)      5,136    4,819   2,268       -        378    12,601
VerbNet (2012)       -        3,769   -           -        -      3,769
Table 3.1: Comparison of LKBs according to included lexical items.
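As a quick sanity check on Table 3.1, the per-POS counts of each resource can be summed and compared with the reported total. The Python sketch below is only illustrative: the dictionary layout and the names `TABLE_3_1`, `pos_sum` and `check_totals` are assumptions made here, and MindNet is left out because the table gives no POS breakdown for it.

```python
# Figures transcribed from Table 3.1. None marks a cell the table leaves empty.
# The structure and all names here are hypothetical, chosen for illustration.
TABLE_3_1 = {
    "WordNet 3.0 (2006)": {
        "nouns": 117_097, "verbs": 11_488, "adjectives": 22_141,
        "adverbs": 4_601, "other": None, "total": 155_327,
    },
    "FrameNet (2012)": {
        "nouns": 5_136, "verbs": 4_819, "adjectives": 2_268,
        "adverbs": None, "other": 378, "total": 12_601,
    },
    "VerbNet (2012)": {
        "nouns": None, "verbs": 3_769, "adjectives": None,
        "adverbs": None, "other": None, "total": 3_769,
    },
}


def pos_sum(row):
    """Sum every per-POS count in a row, skipping empty cells and the total."""
    return sum(v for k, v in row.items() if k != "total" and v is not None)


def check_totals(table):
    """Map each resource to True when its POS counts add up to its total."""
    return {name: pos_sum(row) == row["total"] for name, row in table.items()}


if __name__ == "__main__":
    for name, ok in check_totals(TABLE_3_1).items():
        print(f"{name}: {'consistent' if ok else 'mismatch'}")
```

With the figures as transcribed, the totals of all three resources agree with the sums of their per-POS counts.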
Resource             Core structure                 Relations
                     Type         Instances         Unique types  Instances
WordNet 3.0 (2006)   synset       117k+             20            285k
MindNet (1998)       word entry   191k definitions  32 (×2)       713k
FrameNet (2012)      frame        1,674             8             -
VerbNet (2012)       class entry  274               -             -
Table 3.2: Comparison of LKBs according to core structure and relations.
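The '×2' convention and the relation figures in Table 3.2 can be made concrete with a little arithmetic. The Python sketch below is only illustrative: the dictionary layout, the `inverse_counted` flag and the function names are assumptions introduced here, not part of WordNet or MindNet.

```python
# Figures transcribed from Table 3.2 (relation instances in thousands).
# "inverse_counted" is a hypothetical flag: True when the listed number of
# relation types already counts direct and inverse relations separately.
RELATIONS = {
    "WordNet 3.0": {"listed_types": 20, "inverse_counted": True, "instances_k": 285},
    "MindNet": {"listed_types": 32, "inverse_counted": False, "instances_k": 713},
}


def directed_types(entry):
    """Number of relation types when each direction counts as its own type."""
    factor = 1 if entry["inverse_counted"] else 2
    return entry["listed_types"] * factor


def instance_ratio(larger, smaller):
    """Ratio between the relation instance counts of two resources."""
    return RELATIONS[larger]["instances_k"] / RELATIONS[smaller]["instances_k"]


if __name__ == "__main__":
    for name, entry in RELATIONS.items():
        print(f"{name}: {directed_types(entry)} directed relation types")
    ratio = instance_ratio("MindNet", "WordNet 3.0")
    print(f"MindNet / WordNet 3.0 relation instances: {ratio:.2f}")
```

Counted this way, MindNet's 32 listed types correspond to 64 directed types against WordNet's 20, and MindNet holds roughly 2.5 times as many relation instances as WordNet 3.0.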