
Semantic Web-Based Information Systems: State-of-the-Art ...


Semantic Web of Language Engineering

for annotation; (3) the management of ontological tiers, which can be annotated with language profile terms and, therefore, corresponding ontological terms; and (4) storing OntoELAN annotation documents in XML format based on multimedia and domain ontologies. To the best of our knowledge, OntoELAN is the first audio/video annotation tool in the linguistic domain that provides support for ontology-based annotation. It is expected that the availability of such a tool will greatly facilitate the creation of linguistic multimedia repositories as islands of the Semantic Web of language engineering.

Introduction

The Semantic Web (Lu, Dong, & Fotouhi, 2002; Berners-Lee, Hendler, & Lassila, 2001) is the next-generation Web, in which information is structured with well-defined semantics, enabling better cooperation of machine and human effort. The Semantic Web is not a replacement for, but an extension of, the current Web, and its development relies heavily on the availability of ontologies and powerful annotation tools. Ontology development and annotation management are two challenges in the development of the Semantic Web, as we discussed in Chebotko, Lu, and Fotouhi (2004). In this chapter, although we use the general multimedia ontology we developed as the framework, and the GOLD ontology developed at the University of Arizona as an example ontology for ontology-based annotation of linguistic multimedia data, our focus is on the second challenge: the development of an ontology-based multimedia annotator, OntoELAN, for the Semantic Web of language engineering.

Recently, there has been increasing interest in, and effort toward, preserving and documenting endangered languages (Lu et al., 2004; The National Science Foundation, 2004). Many languages are in serious danger of being lost, and if nothing is done to prevent it, half of the world’s approximately 6,500 languages will disappear in the next 100 years. The death of a language entails the loss of a community’s traditional culture, for the language is a unique vehicle for its traditions and culture.

In the linguistic domain, much language data is collected as audio and video recordings, which poses a challenge for document indexing and retrieval. Annotation of multimedia data provides an opportunity to make the semantics explicit and facilitates the searching of multimedia documents. However, different annotators might use different vocabularies to annotate multimedia, which causes low recall and precision in search and retrieval. In this chapter, we propose an ontology-based annotation approach, in which a linguistic ontology is used so that the terms and their relationships are formally defined. In this way, annotators will use the same vocabulary to annotate multimedia, so that ontology-driven search engines will retrieve multimedia data with greater recall and precision. We believe that even

Copyright © 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.
