
Proceedings of the 12th European Conference on Knowledge ...


Taha Osman, Dhavalkumar Thakker and Matt Nathan

on the analysis of these requirements we can, for instance, make a critical decision that proactive intelligent browsing or faceted search (Suominen, 2007) is a more appropriate method for semantic retrieval as opposed to explicit query-based search, and we can also strike a balance between automated bootstrapping of our knowledgebase with information from public semantic datasets and the manual verification of machine-learned facts for enriching the knowledgebase.

The rest of the paper is organised as follows: section 2 introduces the topic of semantically driven knowledge management, followed by an explanation of the case study that motivates this research in section 3. Section 4 discusses the requirement analysis for the investigated commercial semantic retrieval application, while section 5 explores our proposed roadmap for integrating information extraction, ontology engineering, and knowledge management technologies to address these requirements. Finally, section 6 concludes the paper and presents future challenges.

2. Overview of semantic knowledge management

What mainly distinguishes semantic-based from traditional knowledgebase systems is the requirement for a meta-data layer that provides for machine-level understanding of the knowledge representation. This layer, known in Semantic Web terminology as the ontology, comprises a taxonomy of entity types, their attributes, and the relationships between them, described using an open standard (formalism). Hence the ontology represents a schema that is used to semantically annotate domain entities and their interrelations. These annotations represent the formal body of knowledge about the entities, traditionally known as a knowledgebase (KB). Software agents can reason autonomously over the machine-understandable semantic knowledgebase in order to map storage and retrieval queries to the annotated information.
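The relationship between the ontology (schema) and the knowledgebase (annotations) can be illustrated with a minimal sketch. It is not the formalism used in this work: the class names, property names, and identifiers below are hypothetical, and a plain Python set of subject-predicate-object triples stands in for an open standard. The only reasoning shown is subclass inference over the taxonomy.

```python
# Hypothetical mini-ontology: class names are illustrative assumptions only.
# Each entry maps an entity type to its parent type in the taxonomy.
SUBCLASS_OF = {
    "Footballer": "Person",
    "Politician": "Person",
    "Stadium": "Place",
}

# Knowledgebase: (subject, predicate, object) annotations that use the schema.
# All identifiers are made up for this sketch.
TRIPLES = {
    ("img:001", "depicts", "ent:PlayerA"),
    ("img:002", "depicts", "ent:MinisterB"),
    ("img:003", "depicts", "ent:GroundC"),
    ("ent:PlayerA", "type", "Footballer"),
    ("ent:MinisterB", "type", "Politician"),
    ("ent:GroundC", "type", "Stadium"),
}


def is_a(cls, ancestor):
    """Walk the taxonomy upwards to test whether cls equals or specialises ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def images_depicting(entity_type):
    """Return sorted image ids whose depicted entity belongs to entity_type."""
    typed = {s for (s, p, o) in TRIPLES if p == "type" and is_a(o, entity_type)}
    return sorted(s for (s, p, o) in TRIPLES if p == "depicts" and o in typed)
```

A query for `images_depicting("Person")` matches images annotated with footballers and politicians even though no annotation mentions `Person` directly, which is the kind of inference a free-text search over the same annotations could not make.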

The standardisation of the Semantic Web formalism provides for linking to and using knowledge held at other information repositories to bootstrap and/or enrich the knowledgebase. A prime example of the latter is the direct linking, at the ontology (schema) and data (information instances) levels, between semantic datasets in the Linked Open Data Cloud project (Hausenblas, 2009). These advantages have encouraged the utilisation of semantic technologies for knowledgebase management in a variety of fields. For instance, the SEKT project at British Telecom investigates the utilisation of semantic knowledgebases and semantic search for improving the end-user experience while browsing scientific and business articles (Warren, 2005). A similar effort was reported in (Zhou, 2004), where semantics-based computer-aided manufacturing knowledge management systems were deployed to support knowledge-intensive manufacturing processes. Semantic manufacturing knowledge management provides a new solution for meeting the requirements of such environments, such as knowledge accuracy, lower knowledge cost, knowledge timeliness and knowledge unification.

A review of semantic-based knowledge management would be incomplete without drawing attention to the problem of semantic annotation. Semantic annotation is a labour-intensive process, the cost of which can be especially prohibitive for organisations that regularly add large amounts of data to their repositories. Hence, information extraction (IE) technologies are often utilised to automate the process of annotating unstructured textual data (documents, image captions, web pages, etc.) with semantic descriptions (Osman, 2007).
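As a rough illustration of automated annotation, the sketch below tags entity mentions in a caption using a gazetteer lookup, one of the simplest IE techniques. It is an assumption made for illustration and not the IE pipeline used in this work: the gazetteer entries, entity identifiers, and class names are hypothetical.

```python
import re

# Hypothetical gazetteer: surface form -> (entity id, ontology class).
# Identifiers and class names are assumptions made for this sketch.
GAZETTEER = {
    "wembley stadium": ("ent:WembleyStadium", "Stadium"),
    "london": ("ent:London", "City"),
}


def annotate(caption):
    """Return (entity id, class, matched text) tuples in order of appearance."""
    found = []
    for phrase, (entity_id, cls) in GAZETTEER.items():
        # Whole-word, case-insensitive match of the gazetteer phrase.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, caption, re.IGNORECASE):
            found.append((match.start(), entity_id, cls, match.group(0)))
    found.sort()  # order by position in the caption
    return [(entity_id, cls, text) for (_, entity_id, cls, text) in found]
```

Calling `annotate("Fans arrive at Wembley Stadium in London")` yields one annotation per recognised mention. A lookup of this kind cannot disambiguate mentions or discover entities missing from the gazetteer, which is why machine-extracted annotations typically still need some manual verification.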

3. Case study: utilising semantic knowledge management in information retrieval applications

This study is motivated by the authors’ involvement in a project aiming to utilise semantic web technologies to improve the image browsing experience for customers of PA Images, the photography arm of the Press Association (PA), the UK’s leading multimedia news and information provider and supplier of business-to-business media services.

The Press Association’s current digital archive holds about 7.5 million images and is growing by an average of 35,000 new images per week. The images are supplied by the company’s own photographers and subcontracted photo agencies around the world. However, PA Images’ customers can currently interact with the image repository only via an on-line free-text search of manually annotated image captions, which does not deliver the level of recall and accuracy that does justice to PA’s rich repository of images. Hence, PA Images is investing in the utilisation of semantic web technologies in a bid to improve the image retrieval experience for its customers. The core component of any semantic-based intelligent retrieval mechanism is the semantic knowledgebase, which is the focal point of this investigation. The objective of the documented research in this paper is

