24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...

Chapter 3. Related Work

knowledge about the world and common-sense knowledge, which can be defined as knowledge that an ordinary person is expected to know. An important feature of common-sense knowledge bases is that they are typically very formalised, which eases their representation as ontologies and provides reasoning capabilities.

Popular LKBs in numbers

To give an idea of the size and contents of the popular LKBs, Tables 3.1 and 3.2 contain quantitative information about them. Table 3.1 shows the number of lexical items included in each LKB, according to their POS. Table 3.2 indicates the core structure of each LKB, the number of instances of that structure, the number of unique types of relation that may connect two core structures, and the number of instances of the latter relations. A missing value in a table cell means that the information is not applicable. For WordNet, the number of relation types includes both the direct and the inverse relations, because they have different names. MindNet, on the other hand, identifies direct and inverse relations by a directed arrow (e.g. ←Hyp and Hyp→, respectively for hypernymy and hyponymy). We therefore added '×2' next to the number of MindNet relation types.

Due to its different structure, MindNet, which was created automatically, cannot be compared directly with the other LKBs, all of which are handcrafted. The only figure suggesting that MindNet is larger is the number of relation instances (713k), which is about 2.5 times the number in WordNet 3.0 (285k). Moreover, as the result of an automatic approach, MindNet can also grow by processing more text. Richardson et al. (1998) report that, after processing the Microsoft Encarta 98 encyclopedia, 220k additional headwords were collected for MindNet. Also, the MindNet website currently reports a total of 45 different relation types, which is more than the 32 reported in 1998.

These numbers also show that LKBs are much smaller than knowledge bases such as DBPedia (more than 2.5M concepts and 250M relations) and Freebase (Bollacker et al., 2008), a collaborative knowledge base (more than 20M concepts and 300M relations). This is mainly because LKBs are restricted to lexical knowledge, while the others are much broader and contain a wide range of world knowledge facts.

                      Lexical items
Resource              Nouns     Verbs    Adjectives   Adverbs   Other   Total
WordNet 3.0 (2006)    117,097   11,488   22,141       4,601     -       155,327
MindNet (1998)        -         -        -            -         -       159,000 headwords
FrameNet (2012)       5,136     4,819    2,268        -         378     12,601
VerbNet (2012)        -         3,769    -            -         -       3,769

Table 3.1: Comparison of LKBs according to included lexical items.
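
As a quick sanity check on Table 3.1, the per-category counts can be summed and compared with the reported totals. The sketch below is illustrative only: the names `TABLE_3_1`, `pos_sum` and `check_totals` are hypothetical, the figures are simply transcribed from the table, and MindNet is left out because the table gives only its headword total.

```python
# Counts transcribed from Table 3.1, in the order
# nouns, verbs, adjectives, adverbs, other. None marks a cell shown as "-".
# All names here are hypothetical and only serve this illustration.
TABLE_3_1 = {
    "WordNet 3.0 (2006)": {
        "counts": [117_097, 11_488, 22_141, 4_601, None],
        "total": 155_327,
    },
    "FrameNet (2012)": {
        "counts": [5_136, 4_819, 2_268, None, 378],
        "total": 12_601,
    },
    "VerbNet (2012)": {
        "counts": [None, 3_769, None, None, None],
        "total": 3_769,
    },
}


def pos_sum(counts):
    """Sum the per-category counts, skipping cells that are not applicable."""
    return sum(c for c in counts if c is not None)


def check_totals(table):
    """Map each resource to True when its counts add up to the reported total."""
    return {
        name: pos_sum(row["counts"]) == row["total"]
        for name, row in table.items()
    }


if __name__ == "__main__":
    for name, ok in check_totals(TABLE_3_1).items():
        print(f"{name}: {'consistent' if ok else 'mismatch'}")
```

For the three resources with per-category figures, the counts add up to the totals given in the table.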

                      Core structure                   Relations
Resource              Type          Instances          Unique types   Instances
WordNet 3.0 (2006)    synset        117k+              20             285k
MindNet (1998)        word entry    191k definitions   32 (×2)        713k
FrameNet (2012)       frame         1,674              8              -
VerbNet (2012)        class entry   274                -              -

Table 3.2: Comparison of LKBs according to core structure and relations.
